79

I know that arXiv have their own identifier system, but considering how widely adopted DOI is, why do they not use DOI perhaps as a supplement to their own identifier?

Thomas Arildsen

3 Answers

62

At least in mathematics, the arXiv is a pre-print server: most papers are eventually published, and receive DOIs then. In fact, the arXiv encourages authors to add these DOIs to the arXiv metadata when they become available.

I think it could be quite confusing for papers to end up with two DOIs. Given that the arXiv numbering scheme works quite well, and in practice everyone knows how to resolve handles of the form arXiv:NNNN.MMMMM, why add the complication?

Scott Morrison
  • 30
    (I'm on the mathematics advisory board for the arXiv, so I'm quite interested in this question, and very happy to consider other positions. It's certainly something we could think about doing, if it made sense.) – Scott Morrison Jan 30 '16 at 05:52
  • 2
    'Avoid duplication' is a key point. The DOI wasn't intended to be a unique identifier, but it inevitably gets used as one, and duplicate assignment complicates things. – Andrew is gone Jan 30 '16 at 11:16
  • 12
    @Andrew But the Object represented by the Digital Object Identifier, i.e. the published paper, is not the same Object represented by the arXiv identifier, i.e. the preprint. So "a paper" (really, a preprint and a published paper) having two distinct identifiers is not duplication, it's correct bookkeeping. –  Jan 30 '16 at 15:38
  • 4
    @NajibIdrissi this is indeed correct, but if we provide two DOIs people will probably find it tricky to maintain that distinction. Having a separate type of identifier avoids a bit of the confusion. – Andrew is gone Jan 30 '16 at 16:36
  • 10
    To the extent that the preprint and the published paper have distinct identities, remember they already have distinct, globally unique identifiers (an arxiv id and a DOI). – Scott Morrison Jan 30 '16 at 21:32
  • A proposal: any arXiv paper with a handle of the form arXiv:NNNN.MMMMM could be assigned a unique DOI of the form 10.XXX/NNNN.MMMMM (this could be done a posteriori and include revisions). This would make these papers much more accessible to the academic world (and, in practice, to current bibliography management tools). The DOIs would still be unique and would not complicate anything, since the conversion from one to the other is easy and unambiguous. – meduz Oct 07 '20 at 08:13
  • "in practice everyone knows how to resolve handles of the form arXiv:NNNN.MMMMM" This must be field-specific, because I've never had cause to resolve a handle like that. – Azor Ahai -him- Feb 22 '22 at 18:21
  • @meduz That's basically what they opted for: 10.48550/arXiv.NNNN.MMMMM. – Anyon Feb 24 '22 at 00:36
42

DOIs have a technical purpose and a bolted-on social purpose.

The technical purpose for DOIs is to be an actionable identifier for intellectual works (such as articles) that outlives technology changes, domain-name changes, business-model failures, mergers and acquisitions, and all the other stuff that makes ordinary URLs 404. (Thinking of it as a URL-indirection layer is not a bad way to get your head around it.) The thing is, DOIs are not the only scheme that accomplishes this technical goal. (In fact, technically? DOIs are actually handles.) arXiv appears to have rolled its own scheme with underlying infrastructure to match.

The bolted-on social purpose? In the early days of web-accessible journals, it wasn't always obvious what was a legitimate journal and what was woolly-wild-Web content. (Not, obviously, that such things are exactly clear as crystal now! However.) Because nobody making woolly-wild-Web content bothered to buy DOIs, DOIs became a convenient heuristic for determining whether online content belonged to a journal.

In so doing, they accreted Mystical (but let me assure you, wholly imaginary) Powers of Reputability in the eyes of many people who really ought to know better... to the extent that anything without a DOI started to look fishy.

So. Where does that leave arXiv? With an adequate technical solution to the 404 problem, but without the Mystical Powers of Reputability that DOIs are (erroneously) thought to confer. I hypothesize that arXiv doesn't think it needs to pay for Mystical Powers of Reputability... and it's flourishing, so if that is indeed what arXiv is thinking, arXiv appears to be correct.

D.Salo
  • 10
    In my field (where arXiv is widely used) I don't think DOIs are generally perceived as granting any reputability factor (and arXiv only a minimal amount). In fact I often prefer not to include DOIs in my references because I think it makes my references more cluttered and ugly. – Kimball Jan 30 '16 at 03:41
  • 14
    @Kimball Are references there to look pretty, or to enable readers to find the cited work as easily and quickly as possible? As a reader, being able to simply click on the DOI number and instantly get the article is very useful... –  Feb 04 '16 at 13:23
  • 3
    @NajibIdrissi Form and function are not entirely separate. But if links are included, one can also click on the arXiv id to go directly to the paper. – Kimball Feb 04 '16 at 13:54
  • 3
    The mystical thing is something Crossref is trying to counter, e.g. http://blog.crossref.org/2013/09/dois-unambiguously-and-persistently-identify-published-trustworthy-citable-online-scholarly-literature-right.html . DOIs also provide another function: a cross-publisher metadata store in which you can look up information about publications regardless of who published it or where. – Joe Feb 23 '16 at 17:31
  • 4
    I would like to emphasize what D.Salo wrote: DOIs are more about intellectual property than about the journal vs. preprint distinction. For example, the preprint system of the Open Science Framework automatically assigns a DOI to new preprints and new projects, and depositing these preprints is not subject to any review, nor (luckily) even to screening or filtering, unlike on arXiv. – pglpm Jan 29 '18 at 14:29
17

This behavior changed in January 2022. arXiv now assigns DOIs in addition to arXiv IDs. The change was supposedly made to improve discoverability and to "help arXiv meet the ‘FAIR Guiding Principles for scientific data management and stewardship’".

Starting in January 2022, arXiv began registering DOIs and submitting associated article metadata to DataCite on behalf of (and at no cost to) arXiv authors. The first articles to receive DOIs are those with 2201.NNNNN identifiers, with all new articles receiving DOIs going forward. Following the successful launch, we will begin minting “arXiv DOIs” for the approximately 2M articles in arXiv’s corpus published between 1991 and 2021. The article abstract (/abs) pages are also now updated to display the arXiv DOIs following the registrations.

Why add DOIs when there are arXiv identifiers? Are arXiv identifiers going away?

The arXiv identifier has existed for more than 30 years; we will continue supporting it and you may use it in your citations as an alternative to the arXiv DOI. We are issuing DOIs for several reasons:

  • Making article metadata available in DataCite’s centralized location allows research outputs to be more discoverable and harvestable.
  • Some funding agencies require DOIs for the research they are supporting.

The DOI is constructed from the arXiv ID as follows:

An author can determine their article’s DOI by using the DOI prefix https://doi.org/10.48550/ followed by the arXiv ID (replacing the colon with a period). For example, the arXiv ID arXiv:2202.01037 will translate to the DOI link https://doi.org/10.48550/arXiv.2202.01037
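The rule quoted above can be sketched in a few lines of Python. This is only an illustration, not anything arXiv publishes: the function names are made up here, it handles only new-style IDs of the form NNNN.MMMMM, and dropping a trailing version suffix such as `v2` is an assumption on my part, since the quoted text does not say how versions are treated.

```python
import re

# DOI prefix taken from the arXiv text quoted above.
ARXIV_DOI_PREFIX = "10.48550"

# New-style IDs only (e.g. 2202.01037), with an optional "arXiv:" scheme
# and an optional version suffix such as "v2".
_NEW_STYLE_ID = re.compile(r"^(?:arxiv:)?(\d{4}\.\d{4,5})(v\d+)?$", re.IGNORECASE)


def arxiv_id_to_doi(arxiv_id: str) -> str:
    """Return the arXiv DOI for a new-style arXiv ID (hypothetical helper)."""
    match = _NEW_STYLE_ID.match(arxiv_id.strip())
    if match is None:
        raise ValueError(f"not a new-style arXiv ID: {arxiv_id!r}")
    # Assumption: any version suffix is dropped before building the DOI.
    return f"{ARXIV_DOI_PREFIX}/arXiv.{match.group(1)}"


def arxiv_id_to_doi_url(arxiv_id: str) -> str:
    """Return the https://doi.org/ link for a new-style arXiv ID."""
    return f"https://doi.org/{arxiv_id_to_doi(arxiv_id)}"


print(arxiv_id_to_doi_url("arXiv:2202.01037"))
# prints https://doi.org/10.48550/arXiv.2202.01037
```

Old-style IDs (those with a subject-class prefix) are deliberately rejected with a `ValueError` rather than guessed at.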

Anyon