35

I have a database or data set and want to grant access to it to a limited group of experts in the field (not general access). I don't want others using it for their own publications, nor do I want to see the information offered for free on the Internet.

What is the best approach to deposit the information somewhere to make sure, in a hypothetical future, that I can claim copyright over it?

Basically, I want to protect myself upfront against the situation described in "My research work stolen and published as his own by the co-author without my consent".

Quora Feans
  • 6
    I'm having trouble understanding your first sentence. To me it sounds like you're saying that you both do and don't want to share a certain set of data. Can you clarify? – Pete L. Clark Jun 13 '14 at 17:59
  • 9
    I didn't really understand the question for the reasons provided by @PeteL.Clark but depending on the size, you may be interested in Proof of existence. – Trylks Jun 13 '14 at 18:01
  • @Trylks That's fascinating. – xLeitix Jun 13 '14 at 18:05
  • Trylks: you are right on track. However, I still would need to know whether it would have any weight and a real copyright claim. – Quora Feans Jun 13 '14 at 18:16
  • 7
    @QuoraFeans It would help if you stated clearly what your purpose is. Do you want to make money selling access to the data? Do you want credit for being the one to collect the data? Do you want to keep them secret for a short while to buy yourself time to conduct more research on them before they go public? Do you simply hate the world and not want it to see them? – Federico Poloni Jun 14 '14 at 18:37
  • 4
    I don't think there should be a possibility to do this. In effect, you want to claim rights to some mysterious results you got, without others being able to know what they are. Then, in case someone else gets the same results (maybe after years of work, and since yours are mysterious they could not have known the results already existed), you would like to pull yours out of your back pocket and wave them at them, destroying all their hard work while still keeping the gain away from public knowledge? No. I don't think this is how science works. Nor is it how anything should work. – skymningen May 08 '17 at 06:45

6 Answers

46

Basically, I want to protect myself upfront against "My research work stolen and published as his own by the co-author without my consent"

What you need is to convince the research community that this is your work, so that if anyone tries to steal it, then it will be considered professional misconduct, they won't be able to publish their theft, etc. I'm not convinced copyright law is the right tool for this. Sure, being able to sue someone for copyright violation could be useful in certain circumstances, but often it won't actually settle the academic issues. For example, collaboration. I might claim to have collaborated with you on the research contained in the database, in which case I would be entitled to be a coauthor on academic publications, and you would be considered to be acting unethically if you denied me coauthorship. There's no way to defend against this using copyright registration, cryptographic time-stamping, etc. You might be able to prove that you already had a copy of the database in the past, but it's much harder to prove that I didn't somehow contribute to it, except in extreme cases such as having had a copy before I first studied this field.

In practice, people often deal with this difficulty by telling more people. If you tell just one person about your work, then they can steal it, and it's your word against theirs. If you tell ten people, then it's much harder for a thief to get away with it, since there are nine other witnesses. If you tell a hundred, then it becomes really difficult to steal your work. Unless someone immediately tries hard to steal it, it will become impossible: the community will react by saying "Wait, Quora Feans told all of us about this database last year. If it was your work, why didn't you say anything back then?"

Whether more publicity is a viable solution depends on your circumstances, but there's a fundamental trade-off here. Ultimately, academia cares about credit for the ideas and research, not just who owns the copyright. (If I write a paper about your work, then I own the copyright to my words, but I don't deserve credit for the ideas.) The more you keep your work secret, the harder it is to prove anything about who deserves the intellectual credit.

Anonymous Mathematician
  • 12
    Good. Your points are too often not at all understood by (especially) younger, nervous people. – paul garrett Jun 13 '14 at 22:50
  • 1
    I would add: telling via email may be especially good, as it leaves a trace. – Piotr Migdal Jun 13 '14 at 23:33
  • @PiotrMigdal Cryptographically signed email might be even better. If practical, including hashes of the data normalized to some useful format might further make it possible to prove you had access to the data at that time. – user Jun 14 '14 at 18:22
  • 2
    @MichaelKjörling My point was to supplement details for a social, not technological, solution. – Piotr Migdal Jun 14 '14 at 20:29
25

See this question on StackOverflow: Is there a way to digitally sign documents to prove they existed at a certain point in time.

The answer is yes. Several solutions were suggested; my answer involves an Internet time-stamping service which signs a cryptographic hash of your document. So if you keep that version of the document, you can prove that it existed on the date in question; but otherwise, nobody ever sees the document except you.
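As a rough illustration of the hashing step only, here is a minimal Python sketch that computes the SHA-256 digest of a file. The function name `sha256_of_file` and the chunk size are assumptions made for this example; it does not contact any time-stamping service, it only produces the digest you would submit to one.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of the file at `path` (hypothetical helper)."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in fixed-size chunks so large data sets need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Only the digest leaves your machine, so the document itself stays private. The digest is useful later only if you keep the exact file that produced it.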

However, if you are actually expecting copyright or IP challenges, I cannot promise that this will stand up legally; you should consult a lawyer.

Nate Eldredge
  • 4
    I don't believe the "sealed envelope to yourself" method would stand up in any court. It's simply too easy to fake it. – Quora Feans Jun 13 '14 at 18:49
  • 33
    @Quora: If what you care about is standing up in court, then I'll repeat myself: consult a lawyer, not us random clowns from the Internet. – Nate Eldredge Jun 13 '14 at 18:56
5

You can deposit the data in a research data repository that allows you to restrict access to the data while making the metadata available and citable, e.g. with a DOI. This could be your institutional research data repository (if your institution/institute has one and it has these features), or you could use Zenodo, run by CERN, which is open to everyone for registering and depositing data and publications.

Restricted Access: Users may deposit restricted files with the ability to share access with others if certain requirements are met. These files will not be made publicly available and sharing will be made possible only by the approval of depositor of the original file. Source

FuzzyLeapfrog
3

A good institutional data repository should be able to handle this for you. As an example, The Dataverse Network allows you to deposit data, gives that data a persistent DOI, and then allows you to manage permissions. So you could, for example, deposit the data to make it citable and immediately establish your authorship but then only make it available to specified users (who would need to create Dataverse accounts). In the future, you could make the data more broadly available if you so desired.

Other institutional data repositories should be able to handle this process as well.

Thomas
3

Use ProofOfExistence.com to put a hash of your data on the Bitcoin blockchain. Blocks in the blockchain are time-stamped.

By putting a hash of your data on the blockchain, you are not publishing your data or making it in any way publicly available.* In the future, when you release the data, anyone can perform a SHA256 hash of it and verify that it matches the hash in the time-stamped block on the blockchain, thereby proving your priority.

Cf. the ¶ "Demonstrating data ownership without revealing actual data" of ProofOfExistence.com's about page.

*ProofOfExistence.com does client-side hashing (using a JavaScript library), so even that website doesn't ever access your data-set itself.
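The later verification step can be sketched locally in Python. This is a minimal sketch under stated assumptions: the function name `matches_recorded_digest` is hypothetical, and `recorded_hex` stands for the hex digest you would read back from the time-stamped record; looking that record up on the blockchain is not shown.

```python
import hashlib
import hmac


def matches_recorded_digest(data: bytes, recorded_hex: str) -> bool:
    """Check whether `data` hashes to the previously recorded SHA-256 digest."""
    recomputed = hashlib.sha256(data).hexdigest()
    # Normalise whitespace and case so a copy-pasted digest still compares equal.
    return hmac.compare_digest(recomputed, recorded_hex.strip().lower())
```

A single changed byte produces a different digest, so the released data set must be byte-for-byte identical to the version that was hashed.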

Geremia
  • I've edited my answer with more explanation. I think not understanding how ProofOfExistence.com is the answer to the asker's question might be why my answer was down-voted. – Geremia May 07 '17 at 21:45
  • It just proves you were in possession of the data (or at least the checksum of it) at a certain point in time, not that you are the owner or copyright holder. – BlackJack Mar 26 '18 at 15:37
  • @BlackJack It proves priority. – Geremia Mar 26 '18 at 16:03
  • Maybe I have the wrong notion of „priority“ here, but if I steal a database or document from you and enter a hash of it into the bitcoin blockchain, does this really prove priority? It just proves that I was the first one to do this, not that I didn't steal the data or document from someone. – BlackJack Mar 26 '18 at 16:15
  • @BlackJack "if I steal a database or document from you and enter a hash of it into the bitcoin blockchain, does this really prove priority?" If you were the first to do so and if I can't prove I produced the document before the timestamp of the hash, how could I substantiate my case for priority? – Geremia Mar 27 '18 at 18:24
  • Well, you can't. And that's where I wonder if I got the meaning of „priority“ right. But apart from that, timestamping with this technique seems useless to me in this case, as it can't be used to prove the data belongs to/originates from a specific person, but just who used the blockchain method first. – BlackJack Mar 27 '18 at 22:30
  • @BlackJack That's all that matters from the OP's perspective: They can prove that they used the blockchain method first. – user2768 Feb 20 '19 at 08:03
  • This answer could be expanded beyond blockchain, e.g., a hash could be added to a website and that website could be cached by archive.org (which is now legally recognised---in at least one jurisdiction---as a timestamp). – user2768 Feb 20 '19 at 08:05
  • @Geremia It allows you to prove that you had this data before the thief published it (and by implication before they claim to possess it). To make use of this, you likely have to have been loudly claiming to the relevant community that the data was yours, so that the thief has to claim that you've never been involved and never had the data before publication. Once they've done that, you can prove that they are lying.

    If they admit in advance that you were involved in the data before publication, then they have to acknowledge you as co-author, etc.

    – Brondahl Jun 19 '20 at 19:03
  • ProofOfExistence.com is too expensive, since it costs 0.00025 BTC, which amounts to $8.36 per proof. This cost can be reduced tremendously, since one can use a blockchain with lower transaction fees (it does not matter too much which blockchain you use, since the proof of one blockchain's existence will eventually be posted on other blockchains), and one can also combine an unlimited number of timestamps into one proof using Merkle roots. – Joseph Van Name May 23 '21 at 13:22
  • @JosephVanName Yeah, you don't need ProofOfExistence.com; you can do your own. – Geremia May 24 '21 at 09:49
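To illustrate the Merkle-root idea mentioned in the comments above, here is a minimal Python sketch that folds many document hashes into one root, so that a single time-stamped value covers all of them. The function name `merkle_root` is hypothetical, and duplicating the last node on odd-sized levels is an assumed convention for this example, not something the thread specifies.

```python
import hashlib
from typing import List


def _sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(documents: List[bytes]) -> str:
    """Return the hex Merkle root over the SHA-256 hashes of `documents`."""
    if not documents:
        raise ValueError("need at least one document")
    level = [_sha256(doc) for doc in documents]  # leaf hashes
    while len(level) > 1:
        if len(level) % 2 == 1:
            # Assumed convention: pair an unmatched last node with itself.
            level.append(level[-1])
        # Hash each adjacent pair to build the next level up.
        level = [_sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0].hex()
```

Only the root would need to be time-stamped. Proving that one particular document was included would additionally require keeping the sibling hashes along its path to the root, which this sketch does not store.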
2

This is country dependent and more of a legal question, but basically you have to register your work.

Most of the people on Academia SE seem to be from the USA, so you can check the process for the USA. I was not aware of this site before, but it ranks well in Google and seems properly informative.

Your country should provide some similar mechanism. Mechanisms of this kind are the best option if you want to be sure. Otherwise, you can use another kind of mechanism, such as Proof of Existence or CKAN, and then use that as proof when registering the copyright if needed (but you may as well register with the office up front instead of worrying about doing something else first and registering later).

Trylks
  • I wonder whether we need country-specific procedures here. If someone gets copyright granted in the US, this copyright should be valid everywhere. It's not like a patent, which indeed is country-specific. Maybe we should just file for copyright in countries that aren't too bureaucratic to work with and that maybe let you do everything online. – Quora Feans Jun 13 '14 at 18:55
  • @QuoraFeans: But the concepts of copyright differ a lot, e.g. between the US and the European Union, particularly with respect to inalienable authorship rights, and also with respect to the treatment of databases. – cbeleites unhappy with SX Jun 15 '14 at 13:32
  • @QuoraFeans some countries completely disregard copyright. People who want to enforce copyright in one of those countries would require their own army (and an invasion). In the end you can prove that you had something at some certain date, but not much more. – Trylks Jun 16 '14 at 10:10