59

I have just received a decision letter for my manuscript submitted to an Elsevier journal. It was a revise and resubmit. However, one of the reviewers asked for an executable file in order to check my results. (I sensed distrust in the comment...)

This is regarding a computer science paper on testing the efficiency of an algorithm on a set of instances from the literature. I compared the results of the algorithm with those of other authors.

eykanal
Marvel LePont
  • 31
    To clarify: the paper talks about some software, and the reviewer wants to be able to run the software?

    I don't know how common that is (I'm not a computer scientist), but so long as any license on the software permits you to give it to the reviewer, it does sound like a reasonable request.

    – Flyto Apr 27 '17 at 21:40
  • 3
    Well if you publish an algorithm you need to somehow publish the actual code. It sounds like your manuscript doesn't include it at all? –  Apr 27 '17 at 22:17
  • 2
    If the algorithm is going to be published anyway, I don't see "why not"! – The Guy Apr 27 '17 at 22:38
  • 18
    @DSVA: "If you publish an algorithm you need to somehow publish the actual code"... what percentage of published algorithms do you think come with source code? – user541686 Apr 28 '17 at 05:47
  • 89
    @Mehrdad "what percentage of published algorithms do you think come with source code" a lot less than the percentage that should come with source code. Imho if you can't verify the claim without implementing the algorithm, then it should have source code attached, and the source code should be reviewable, or else it's not good science. No reviewer in their right mind would accept a paper with "magic happens here" in the middle of the proof so "closed source software happens here" shouldn't be allowed either. – Sumyrda - remember Monica Apr 28 '17 at 06:42
  • 40
    A distrustful reviewer is a good thing, it means they are going to give your paper a stern test, and if there are problems with the paper, they are more likely to find them, which is to your long term benefit. I very much doubt they distrust your honesty, just the result. – Dikran Marsupial Apr 28 '17 at 09:25
  • 4
    As someone with little-to-no experience publishing software (or algorithms), wouldn't providing them with a compiled executable only provide them with a way to run your code, not view it? – Pants Apr 28 '17 at 12:34
  • 2
    @Sumyrda "Imho if you can't verify the claim without implementing the algorithm, ..., it's not good science" Unfortunately, things are often not that simple for algorithms. Many algorithms can be proven correct, yet still be entirely non-trivial to implement. This is the reason why algorithm engineering is a serious field. One such case is Chazelle's triangulation algorithm. His proof is most likely correct, but there was no implementation for a long time (as far as I know), even though it is an often 'used' algorithm! – Discrete lizard Apr 28 '17 at 13:29
  • 2
    In general science terms, only a genuine peer-reviewer is liable to do such a thing. An excessive percentage are liable to wave it away as too hard. There was a day (I'm told) when peer review meant REVIEW! Someone else was confident that you knew your stuff and would put their name (albeit anonymously) to the conclusion. Such people are of course very annoying, but they are part of the foundation on which real science is built. Or used to be. – Russell McMahon Apr 28 '17 at 13:41
  • 6
    What other way do you suggest he use to verify your results? – PlasmaHH Apr 28 '17 at 14:29
  • 1
    @Discretelizard "Many algorithms can be proven correct, yet still be entirely non-trivial to implement." But that's exactly what I said. You either show pseudocode and prove that it is correct, or if you can't do that for some reason, especially when you claim that your algorithm is faster than some other algorithm, then you should have to show your code - or provide some other way to verify your claim if you can think of something. – Sumyrda - remember Monica Apr 28 '17 at 18:46
  • Please clarify: Is the paper about an algorithm? Or a program? Or the results a specific implementation produced on specific input? – Hagen von Eitzen Apr 29 '17 at 12:32
  • You should never distribute executables, but always the source code. Nobody wants to execute a binary program without compiling it from source. It could do anything on your system. One can try to work around this with a sandbox, but why not distribute the source, if it is part of your scientific work? – Jonas Stein Apr 29 '17 at 18:11
  • @JonasStein I think the only reason I would want the source code would be so I could examine it for errors, not because I'm worried another researcher is going to bug my system. Especially one who didn't volunteer such executable but instead is producing it on request. – corsiKa Apr 30 '17 at 01:51
  • 3
    @PhdStudent "The good thing about science is that it's true whether or not you believe in it." I certainly have trust in our scientists, but that trust is earned through verification and peer review. When we publish science without verification, we open ourselves to pseudo-science at best and willful deception at worst. – corsiKa Apr 30 '17 at 01:56
  • 1
    I observe the use of male pronouns to refer to a presumably anonymous reviewer. I think this is our implicit biases working against us. – Greg Martin Apr 30 '17 at 03:46
  • @Sumyrda There's a difference between source code and pseudo-code. Source code IS an implementation, just not in executable form yet. Pseudo-code is not source code, it is only a method to describe algorithms. But if you meant that at least some sort of verifiable (by the reader) argument for the results of an algorithm must be present, then I of course agree. – Discrete lizard Apr 30 '17 at 10:06
  • @Discretelizard Exactly, it has to be verifiable. Whether it's verifiable by going through the pseudo code alongside your reasoning or by reviewing your source code doesn't matter, but one or the other has to be possible. I'm torn about verifying the claim by running tests against a closed source executable - there's still too much "magic happens here" in there for my taste - but it's better than nothing. – Sumyrda - remember Monica Apr 30 '17 at 14:43

8 Answers

174

I don't know if it's normal, but it should be normal for all reviewers to make reasonable efforts to verify that the claims authors make are correct, so to the extent that it's not normal, I can only commend the reviewer for being willing to make an effort that other reviewers don't make. What you sense as "distrust" is the reviewer doing their job, nothing more or less (and it is probably somewhat accurate to say that a reviewer's job is to distrust the author's claims, so I don't see the idea of being distrusted by a reviewer as something to be ashamed of or offended by).

By the way, it should also be normal for authors to make available any software (including source code whenever possible) needed to replicate and verify their results. So if you are unhappy with the reviewer coming back to you with annoying requests that delay the decision on your paper, next time around you can preempt such issues by releasing your source code (or at least submitting it to the journal) alongside your manuscript. I am sure the reviewer would be much happier and ultimately everyone would benefit, including you.

Dan Romik
  • 4
    Thank you for your comment; perhaps I felt offended because it is my first journal paper. However, in my research field it is not common for authors to make their code available.
    Anyway, I will send the executable to the reviewer because he stated that without this check the paper cannot be accepted. Thank you again.
    – Marvel LePont Apr 27 '17 at 23:12
  • 96
    As someone who works in the industry and reads papers as "an outsider", it has always struck me as very odd that there's a tendency to "hide the code". My feeling is that code, just like a paper, should be out there and available. I'm absolutely with Dan here, it should be done more. To me at least, not publishing code always has a bitter aftertaste of there being something dodgy going on. – SBI Apr 28 '17 at 07:38
  • 2
    Unfortunately, most of the algorithms and numerical methods described in journals like Journal of Computational Physics (a very reputable journal!) are implemented in in-house closed source codes. I did some review there and I just had to trust the results. – Vladimir F Героям слава Apr 28 '17 at 11:18
  • It would be nice for authors to make software available even when it isn't needed to replicate or verify their results. I can think of one algorithm paper which I once tried to implement, and I gave up because it didn't seem to type-check. A reference implementation would have helped understand the paper. – Peter Taylor Apr 28 '17 at 11:53
  • Agree about the proper framing of the feeling of "distrust" - better to interpret it as healthy scientific skepticism, which every reviewer should have. I put a level of trust in peer-reviewed papers (especially for topics I'm not intimately familiar with) precisely because they have satisfied a panel of reviewers' skepticism. – Nuclear Hoagie Apr 28 '17 at 13:05
  • 12
    @SBI It's not odd at all. If you invest 4 years of work into a code, you need it to pay off in enough papers/citations/attention to get you a promotion. One paper will not do that, and if you release your code, you'll need to spend another lengthy period of time coding up a new publishable result while your rivals use your code to scoop you. It's not so much "publish or perish" as "publish more than alternative hires or perish". (Personally, I did release my code; I also perished.) – Xerxes Apr 28 '17 at 13:41
  • The answer assumes that the reviewer has only the best intentions at heart, along with an idealistic tendency to improve the scientific culture. While I'd commend those things instantly, unfortunately we live in the real world. One needs to consider that the reviewer is an anonymous person with clear expertise in the field, asking for the core part of a research process, and with a consequent potential interest in it. I'd advise carefully considering such requests also from the perspective that there are individuals who could try to exploit their reviewer anonymity and privilege. – user3209815 Apr 28 '17 at 14:49
  • 4
    Another red flag that catches my attention is that they request an "executable file". The reviewer has no way to verify that the executable doesn't just print the results with no algorithm behind it, aside from reverse-engineering it (without going into too much detail here). Such a request could mean that the reviewer is trying to lull the author into a false sense of security by intentionally not requesting the source code. – user3209815 Apr 28 '17 at 14:53
  • @user3209815: They may be catering to Xerxes' point: releasing the source code may not be acceptable because other authors could then scoop the OP. Sending only an executable means they'd have to decompile it... which may not be viable. Although, as you say, there's no way to verify that such an executable doesn't just print out the required results. – Peter K. Apr 28 '17 at 15:19
  • 23
    @Xerxes How can your idea be publishable in a paper though, if it depends so much on closed source code? I mean, you lay open your entire idea anyway. I've written code based on papers countless times. Usually, code is more a proof of concept rather than holding more information than what you publish anyway. The other way around though, it becomes trickier. Making claims without showing they actually work is... difficult. – SBI Apr 28 '17 at 15:32
  • @SBI Of course, the paper must be such that anyone can indeed implement the algorithm! But that can be a lot of hard, boring software engineering work. – Vladimir F Героям слава Apr 28 '17 at 17:32
  • 1
    @Xerxes Ideally your work would be adequately supported by some funding agency (usually public). In return, you open source your code. Practically, I understand that you do what you have to do. – emory Apr 29 '17 at 00:22
  • 1
    @Xerxes that is why code should be released under licenses that require attribution. If people are going to take your code and publish a paper based on it, they would have to attribute you as the original author of the code, just like they'd have to cite you if they used results from your published papers. – Lie Ryan Apr 30 '17 at 16:32
  • 2
    @VladimirF: "I just had to trust the results". No, you didn't. If results are produced by software, then you should be able to verify the results, and if you can't verify them, then you don't accept it. There's the saying "pictures, or it hasn't happened". – gnasher729 Apr 30 '17 at 22:26
  • I'd add that the executable (compiled code) by itself, while it would allow replication of effort, wouldn't be particularly useful in the grand scheme of things. The source code, as others have said, is the more important thing. Either it's commonly available software (in which case, why couldn't the reviewer(s) download it themselves?), or it's a rare or proprietary code, in which case the source code is the more important detail. It's the method that needs to be replicated, and the binary is a black box that hides the code, which is the method. – jvriesem May 01 '17 at 16:43
  • 1
    Reviewers are bound by confidentiality—you can provide code to reviewers without releasing it. Yes, reviewers cheating happens sometimes—though seldom. There are multiple movements towards reviewers trying out code; see http://www.artifact-eval.org/ for one approach. – Blaisorblade May 01 '17 at 16:56
  • @gnasher729 Do you trust experiments even when the authors did not make their facility available to you to re-do their experiments? What if that numerical method was implemented in a CFD code to which the authors don't have full redistribution rights? The computation may take a lot of CPU time on a supercomputer, and the reviewers will not re-compute the results anyway. I wouldn't have for that paper I reviewed. The analysis of the results might also be time consuming. – Vladimir F Героям слава May 01 '17 at 19:47
  • @Xerxes No, you did not spend 4 years on your code. You spent four years on your algorithm, which you are already letting the world know about through your paper. The code is simply an implementation of it that you could have written in a fraction of the time, had you started on it after you had settled on the algorithm. – Jasper May 02 '17 at 14:49
63

I come from a different field, in which the code we use isn't a major output. But if a referee asked for the code, we would happily provide it. Most of our work is done in Python, so an executable wouldn't be usual; the source would be (also true for MATLAB).

In fact the only thing I find slightly odd here is the use of executable rather than source.

Don't be offended by the request for a couple of reasons: It's not the reviewers' job to trust you; it's their job to check your paper. If a reviewer takes enough interest in your work to want to run your code, they haven't dismissed your paper out of hand.

Chris H
  • 13
    I like the final point. It would even seem a bit of an honour to me if the referee were willing to try out my software :) – Džuris Apr 28 '17 at 08:32
  • 4
    @Džuris indeed, it means they are taking your paper very seriously. – Dikran Marsupial Apr 28 '17 at 09:42
  • 8
    Thank you for your second paragraph! No one here seems to see this. When I read the title of the question, I assumed the OP found it odd that someone would request an executable rather than the source code, because that indicates they don't care enough to check the source code for errors / mistakes / deliberate mistakes, and in fact are too stupid or lazy to compile the source code (which I assumed was provided) themselves. – UTF-8 Apr 28 '17 at 12:59
  • 2
    @UTF-8 It's also possible they don't have a compiler (maybe can't because of licensing/platform issues) or don't have the skill in that language to make sense of the source. Even then I'd still want the code – Chris H Apr 28 '17 at 13:05
  • 4
    Upvoted and a ++ for the fact that it is weird they want an executable and not the source. Heck when I grade homework, I want the source and anything else required to get it to compile, including compiler commands/statements/options/arguments if it is anything more complex than "g++ foo.cpp -o foo" – ivanivan Apr 28 '17 at 21:36
15

To summarise the situation with your data:

1) You came up with an algorithm on paper/Matlab/whatever.

2) You implemented that algorithm in some programming language.

3) You built a set of test data to exercise your algorithm, and came up with some results for what it should do in theory.

4) You put that test data through the code and came out with some results for what it does in practice.

In this process there are various places where things can go wrong with your methodology. Your code may not correctly reflect your algorithm. Your test data may have been worked backwards from the code instead of forwards from the algorithm. Your test data for your algorithm and your test data for your code may not be the same.
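
The cross-check implied by steps 3) and 4) can be sketched as a small harness that runs the implementation over every published test instance and reports any disagreement with the theory's predictions. Here is a minimal sketch in Python; the algorithm, the instances, and the expected values are all hypothetical stand-ins, since none of them appear in the question:

```python
# Hypothetical cross-check: run the implementation on each published test
# instance and compare against the theoretically expected result.

def implemented_algorithm(instance):
    # Stand-in for the implementation under review; here it just sums
    # the instance, purely as a placeholder.
    return sum(instance)

# Test instances paired with the results the theory predicts for them.
expected = {
    (1, 2, 3): 6,
    (10, -4): 6,
    (): 0,
}

def cross_check(algo, cases):
    """Return the instances where implementation and theory disagree."""
    return [inst for inst, want in cases.items() if algo(inst) != want]

mismatches = cross_check(implemented_algorithm, expected)
print("all instances agree" if not mismatches else f"disagreements: {mismatches}")
```

A reviewer holding the source, the instances, and the expected outputs can run exactly this kind of check; with only an executable, the comparison still works, but the first failure mode above (code not reflecting the algorithm) stays invisible.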

Unless the reviewer has the algorithm, the source code, all the test data for both, and all the output data for both, they cannot verify that your work is sound and your conclusions are valid. This is not subject to dispute - without all of those, properly reviewing your work is logically impossible. Anything else means making assumptions which may not be valid.

I have personally been affected by this situation, when my company bought some control theory IP from a researcher. He'd written papers on how this was supposed to work and the theory behind it, and then he'd built some electronics to implement his theory. His papers covered the theory, and also included schematics for the electronics. When I read this to work out how to implement his theory in software, I found that the schematic had an extra filter in it. The action of this filter turned out to be critical to the system being stable or even effective, but it was not documented at any point anywhere in his work. It wasn't until we had a phone call with him that we found out what the purpose of the filter was, and how we were supposed to tune it.

This was in a paper which theoretically had been peer reviewed when it was published. Clearly it hadn't been peer reviewed thoroughly enough! His results showed that given the same data, the implementation output was pretty close to the theoretical expected output, and the effect of the filter was at a different place in the response. Still though, the implementation flatly would never have worked without this filter present, and it wouldn't have been at all hard to include this in the theoretical model. He could even have said "this filter is required for these reasons, but can be ignored in this area of the response we're looking at for these reasons" and he would have been covered. What is not acceptable is what he did, which is to fail to mention it at all, because the end result of that is that someone trying to implement his work would be unable to.

Like I said, he still got his paper published, and no-one complained at the time. It should have been spotted by his original reviewers though. In your case, your reviewer should be looking for discrepancies like this - it's the whole point of peer review. So if people are asking you for things you haven't made available, (a) it's a good sign they're checking thoroughly, and (b) you should have made it available in the first place as best practice.

Graham
6

Artifact submissions are a thing in CS. What I've seen is that you prepare a virtual machine in which your software is already set up and ready for running experiments. So the reviewer may be referring to an official artifact-submission procedure at the journal. Alternatively, some authors just make the source code of their tools and benchmarks available via services like GitHub, and the reviewer may be suggesting you should also do this. Regarding the distrust: computing people are naturally wary about benchmarks and tool comparisons, as the final figures may depend a lot on how your experiment is set up (e.g., if you compare to your own implementation of an existing algorithm, did you implement it correctly?). It could also be that the numbers you give in the paper seem a bit odd, but then the reviewer would have pointed to what exactly doesn't look right to them.

Alexey B.
3

Submitting an executable isn't the same as submitting source code. An executable doesn't really give the recipient any access to your original code (as a computer science student should already know, of course). I don't see a problem with this request.

Sod Almighty
  • 5
    I see a major problem: the executable run on the same test cases is going to produce the same results, even if those results are wrong due to an error in the program. – jamesqf Apr 28 '17 at 05:20
  • 8
    I'm curious how you would send an "executable" of, say, Python code, without giving access to your original code? Are you expected to obfuscate it? – user541686 Apr 28 '17 at 05:48
  • 4
    @jamesqf They could try other cases, not covered by the authors. – Captain Emacs Apr 28 '17 at 06:06
  • 1
    Any executable can be decompiled to produce functionally the same code. Any code that compiles to JVM or .NET can be decompiled to something relatively close to the original, and even machine code can be torn apart with enough work. While I deeply believe source code should be published with publications like this, if you refuse to do so, you can't offer an executable as if it will hide what you wanted to hide from your source code. – prosfilaes Apr 30 '17 at 01:06
  • @CaptainEmacs but they can't do that if they can't see what the program actually does. It might as well have data embedded in the executable or do something else than what is claimed. – mathreadler Apr 30 '17 at 15:41
  • @mathreadler Of course, it could all be unlucky. However, I have had exactly this situation during my PhD, and the authors sent me an executable and I could try my examples with it. If there is bad coding (hardcoded constants), sometimes it is still possible to edit the binary to fix that (I have done that, too, once). – Captain Emacs Apr 30 '17 at 21:28
1

Given my personal experience with open source communities, and assuming the paper includes the entirety of the algorithm in question, sending the source code or a compiled build of the software wouldn't produce many negative effects.

This would allow the reviewer to verify the results and claims made by the paper's author. The key issue the reviewer might be looking for is whether you correctly implemented the algorithm in source code and are not mistakenly relying on a feature of the programming language, OS, or hardware to make claims about its running time or other properties.

Off the top of my head: in I/O-bound cases it is easy to mistake, for example, JavaScript's ability to make almost every function call asynchronous for an efficient algorithm. Of course, this is mostly seen in I/O-bound operations rather than compute-heavy loops. The efficiency measured is then not that of the algorithm as formally specified; instead it reflects a language-specific feature.
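
This kind of measurement pitfall can be made concrete with a small timing sketch (in Python rather than JavaScript, with trivial stand-in workloads, since no code appears in the original): for an I/O-bound task, wall-clock time is dominated by waiting while CPU time stays near zero, so quoting the wrong clock credits the "algorithm" with behaviour that belongs to the runtime.

```python
import time

def measure(task):
    """Return (wall-clock seconds, CPU seconds) spent running task()."""
    w0, c0 = time.perf_counter(), time.process_time()
    task()
    return time.perf_counter() - w0, time.process_time() - c0

def io_bound_task():
    time.sleep(0.2)  # stand-in for waiting on network or disk I/O

def cpu_bound_task():
    sum(i * i for i in range(200_000))  # stand-in for a real computation

io_wall, io_cpu = measure(io_bound_task)
cpu_wall, cpu_cpu = measure(cpu_bound_task)

# The I/O-bound task shows a large wall-clock time but almost no CPU time:
# a benchmark quoting wall-clock here measures the waiting, not the work.
print(f"io:  wall={io_wall:.3f}s cpu={io_cpu:.3f}s")
print(f"cpu: wall={cpu_wall:.3f}s cpu={cpu_cpu:.3f}s")
```

The point is only the gap between the two clocks for the I/O-bound task; the absolute numbers are machine-dependent.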

The salient point is that there are many cases in which the formal algorithm and the implementation can diverge from representing each other faithfully. When the conclusion rests on empirical metrics such as running time, an improper implementation can then attest to an incorrect conclusion.

1

Source code can have bugs, and to truly effectively review an algorithm, a prose description of the method alone may well be insufficient. Sharing something beyond the text is beneficial; a good paper with the actual source code (+sample inputs) is the gold standard for reproducibility.

One fun wrinkle: depending on where your reviewer is, you might not be allowed to give them a binary. E.g., some code uses proprietary libraries that are licensed freely in academia, but someone in industry might require a separate license to even use an existing binary, much less compile it. (This happened to me once, though not as part of peer review.)

abought
-1

It is an idiotic request on his part.

  1. He could catch a virus.
  2. There is no realistic way he can check that the executable implements what is described in your paper, ergo no way the request has any scientific value whatsoever.

He should be asking for the source code, and that is all you should agree to give him.

user207421
  • 10
    #2 is often false. A significant number of problems have the characteristic that checking a solution is much easier than finding a solution. In such a case, a reviewer can verify that a black box does indeed produce correct solutions and measure the runtime complexity. This would be particularly valuable if, for example, the reviewer noticed that all the examples used in the paper had particular characteristics that made them easier to solve than the general case, as he could formulate his own test cases. – Ben Voigt Apr 29 '17 at 03:52
  • 2
    @BenVoigt But what black box? How can the reviewer know he has the black box implementing what the paper claims? – user207421 Apr 29 '17 at 06:03
  • 1
    There is a valid point that use of a black box doesn't assure that the paper contains an accurate description and explanation of the method used - but the reviewer may be much less worried about the risk of a faked description, given that a novel method is proven to exist, compared to the risk of faking both the method and the description. – Ben Voigt Apr 29 '17 at 06:08
  • 1
    @BenVoigt I cannot make head or tail of that after the word 'but'. A black box does not prove anything about the claims in the paper. It only proves that a black box exists that produces the results claimed, somehow. – user207421 Apr 29 '17 at 09:42
  • 5
    The reviewer could run the executable with a different set of parameters to reproduce a known result. The OP doesn't give us enough information to know whether that is the case. As for the part about the virus, a Linux user might feel safer than a Windows user. –  Apr 29 '17 at 12:03
  • 1
    @Magicsowon I would like to sell you the Brooklyn Bridge. I have the title deeds. I won't show them to you but I can send you an .exe that will print 'yes' every time you ask it whether I own it. – user207421 Apr 30 '17 at 10:36
  • 4
    @Magicsowon “a Linux user might feel safer than a Win user” Linux user here. I don't agree at all. You could easily write a program that damages personal documents or sends them over a network connection on both platforms, if the recipient is kind enough to run it for you. Linux is safer against "widespread" malware that is not targeting you specifically, also because more of it is made for Windows. – Andrea Lazzarotto Apr 30 '17 at 17:06
  • @AndreaLazzarotto The keyword is "if". Linux users who depend on their machines for research are often paranoid enough to have their data backed up in 3 different places and run third party software only where it can't damage anything. That doesn't rule out people like me who have never lost data in the way you describe and simply feel safe because of it. –  May 01 '17 at 06:59
  • @EJP I will do it as soon as my very distant aunt from Liberia is sending me my inheritance so I can pay you. –  May 01 '17 at 07:03