36

I copied (and failed to cite) two lines of code from the OpenJDK source for an undergraduate Data Structures project. Yet, the code comparison shows an alarming amount (40%) of similarity. Here is the side-by-side comparison with my file.

Based on these grounds, my professor wants to give me a -100% on the assignment, which would bring down my overall grade by 15% total, probably causing me to not make the C-wall (depending on how well I do on the final exam). For this reason (and my conscience), I decided to appeal.

However, I believe that most of the similarity in the report comes from my copying of lines 142-152:

static int hash(int h) {
    h ^= (h >>> 20) ^ (h >>> 12);
    return h ^ (h >>> 7) ^ (h >>> 4);
}

I did not cite these two lines, but I did intend to delete them later.

In fact, this whole function can be removed without affecting the program at all, which results in this file comparison.
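For readers unfamiliar with this kind of function, here is a minimal sketch of what a supplemental hash of this shape does. It assumes a table indexed with a power-of-two mask (`h & (length - 1)`), which is not how the snippet further down indexes its buckets; the class name and `indexFor` helper are hypothetical. The point is only that such a step changes which bucket a key lands in, not whether lookups are correct.

```java
// Illustrative only: the class name and mask-based indexing are assumptions.
final class SpreadHashSketch {
    // The function quoted above, unchanged: folds high bits into low bits.
    static int hash(int h) {
        h ^= (h >>> 20) ^ (h >>> 12);
        return h ^ (h >>> 7) ^ (h >>> 4);
    }

    // Bucket index for a table whose length is a power of two.
    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    public static void main(String[] args) {
        int a = 0x10000, b = 0x20000; // identical in their low 4 bits

        // Without spreading, both keys share a bucket: prints "0 0"
        System.out.println(indexFor(a, 16) + " " + indexFor(b, 16));

        // With spreading, they separate: prints "1 2"
        System.out.println(indexFor(hash(a), 16) + " " + indexFor(hash(b), 16));
    }
}
```

With modulo indexing instead of a mask, the high bits already influence the bucket, so the effect of this step can be much smaller.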

Then, only lines 114-126 are a problem:

MyEntry<K, V>[] newArr = new MyEntry[newSize];
// Copy
for (int i = 0; i < data.length; i++) {
    MyEntry<K, V> e = data[i];
    if (e != null) {
        data[i] = null;
        do {
            MyEntry<K, V> next = e.next;
            int j = e.hash % newSize;
            e.next = newArr[j];
            newArr[j] = e;
            e = next;
        } while (e != null);

However, this snippet is my own. I wrote these lines without referencing HashMap.java, and this is a common algorithm for chaining that I can explain thoroughly, and have known about for years.
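As an illustration of how generic this technique is, a self-contained sketch of resizing a separately chained hash table might look like the following. All names (`ChainRehashSketch`, `Node`, `put`, `resize`, `contains`, `size`) are hypothetical; this is neither the project code nor HashMap.java, and it omits duplicate-key handling for brevity.

```java
// Hypothetical names throughout; a sketch of the general technique only.
final class ChainRehashSketch {
    // Singly linked bucket node that caches a non-negative hash of its key.
    static final class Node {
        final String key;
        final int hash;
        Node next;

        Node(String key, Node next) {
            this.key = key;
            this.hash = key.hashCode() & 0x7fffffff; // clear sign bit so % stays non-negative
            this.next = next;
        }
    }

    // Insert at the head of the key's chain (no duplicate check).
    static void put(Node[] table, String key) {
        Node n = new Node(key, null);
        int i = n.hash % table.length;
        n.next = table[i];
        table[i] = n;
    }

    // Relink every node from the old array into a new array of newSize buckets.
    static Node[] resize(Node[] old, int newSize) {
        Node[] fresh = new Node[newSize];
        for (int i = 0; i < old.length; i++) {
            Node e = old[i];
            old[i] = null;                 // detach the old chain
            while (e != null) {
                Node next = e.next;        // remember the rest of the chain
                int j = e.hash % newSize;  // bucket in the new array
                e.next = fresh[j];         // push onto the head of the new chain
                fresh[j] = e;
                e = next;
            }
        }
        return fresh;
    }

    static boolean contains(Node[] table, String key) {
        int i = (key.hashCode() & 0x7fffffff) % table.length;
        for (Node e = table[i]; e != null; e = e.next) {
            if (e.key.equals(key)) return true;
        }
        return false;
    }

    // Total number of nodes across all chains.
    static int size(Node[] table) {
        int n = 0;
        for (Node head : table) {
            for (Node e = head; e != null; e = e.next) n++;
        }
        return n;
    }
}
```

Because nodes are relinked rather than copied, the old array ends up empty and no new nodes are allocated. Any implementation of this idea walks each chain and pushes nodes onto the new buckets, so independently written versions tend to look alike.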

Yes, I know the fact I copied the other two lines compromises my integrity and made the 20% into 40% to begin with, so how can I prove this?

I'm not sure how to defend this whole case before a Student Conduct Board that knows very little about programming. My board hearing is in a month. Do these two snippets of code constitute plagiarism of my entire project? Is -15% to my overall grade fair?

Sidenotes:

  • Our projects are pretty extensive since we aren't allowed to use java.util.* (like 1k+ lines for each project in 8 days), and I did not copy any other code. I'd say the actual data structure implementation is only meant to take about 1/5 of our time spent per project.
  • Over 30% of the class has been reported for academic integrity violations on projects over the semester, and the newly graduated professor doesn't seem to think that he or his assignments are the problem. I should have caught on to the warning before this last project of the semester...
aeismail
TheSmartWon
  • Comments are not for extended discussion; this conversation has been moved to chat. – ff524 Dec 07 '17 at 15:57
  • 5
    This is the type of mess that made me leave academia. – T. Sar Dec 07 '17 at 17:27
  • 2
    I want to point out that this piece in dispute is verbatim here and here. – Gryph Dec 07 '17 at 17:53
  • 1
    @Gryph There is a bigger dispute, see the chat referenced above. – Patricia Shanahan Dec 07 '17 at 19:06
  • Some of the newer comments suggest that you haven't picked up on the cultural norms regarding plagiarism in the West. So I dunno if this might be completely off-base, but one reason someone might not know the norms here is if they're from Asia, e.g. an international student or an immigrant who's returned to school as an adult. Someone who grew up in Asia could understandably have picked up some of their cultural norms about plagiarism during their early education, which are very different from Western norms. This wouldn't get you off-the-hook, but it might be a mitigating factor. – Nat Dec 09 '17 at 14:22
  • VTC as too specific. We're not here to evaluate file diffs. – Azor Ahai -him- Nov 13 '20 at 17:54

10 Answers

60

It appears strongly likely that you did plagiarise, and if this is the case you would do better to acknowledge your mistake than to continue arguing.

Looking at your code, there is a great deal that appears to have been taken from the HashMap.java file. I strongly suspect that you began with this code and then modified it to produce your code, or closely followed it as a template when writing your own. The highlighted code sections have picked up some of this, but other sections closely resemble the other file too, although with slightly altered comments and ordering.

If this is the case, then the code you have shown us is not your own original work but actually a plagiarised version of HashMap.java. Leaving in the two lines which are unconnected is a smoking gun of the link that is likely to be conclusive to the panel. Your point that they are unnecessary will actually count against you because you will be unable to explain why they are there at all. And whether this was your intent or not, it is likely that the panel (and your professor) will see your changed comments and ordering as an attempt to cover up your tracks and conceal your plagiarism from automated tools.

The panel is more likely to show leniency towards a student who appears to at least be contrite, and recognises that they've made a mistake, than a student who denies wrong-doing in the face of apparently very strong evidence.

Jack Aidley
  • 1
    Oh, I'm not denying using it as a reference, which is allowed in the syllabus. My claim is that I forgot to cite/didn't see the need to cite, because I am new to academia, the assignment lent itself to it, and I'm used to treating assignments like this as hobbyist projects that can borrow freely from GPL code, not academic ones that forbid its use. The claims my professor brought against me only pointed to the MOSS, though, not the file as a whole, so that is why I only ask about the two lines. – TheSmartWon Dec 07 '17 at 18:14
  • 1
    I do actually have a good reason for writing those lines to begin with. We had to hash a lot of two character strings (i.e. "AB"). Without the auxiliary hash function, there were a lot of collisions, and I was afraid my code would time out in the automated grading, which would delay development a LOT based on how our automated Jenkins grader works. I included it because once I learned how it worked I thought it was cool, and my project could use it. Yes, I feel contrite now, realizing that academia has different standards, that really weren't made clear to me. – TheSmartWon Dec 07 '17 at 18:18
  • 14
    @TheSmartWon, I would recommend that in the future, even for hobbyist projects where you aren't intending to disseminate the code, you should still cite code you have borrowed from elsewhere. If you decide later to publish or sell your code, without these citations, you might forget what was borrowed and from where (and run the risk of violating the license). – PersonX Dec 07 '17 at 19:32
  • 7
    @TheSmartWon 1) The MOSS identified a lot more than just two lines. 2) you should know that copying code you are meant to write yourself is cheating on an assignment, that's just common sense, 3) you did not write those lines, and 4) I don't believe you actually were afraid of it timing out. To me that statement smells like more attempted covering up. – user253751 Dec 07 '17 at 22:24
  • @immibis 1) Look at the second MOSS in the OP. 2) I did write it myself -- using a source, and making some of my own modifications. 3) Agreed. 4) I have no reason to lie here. I think like an engineer (I thought the function was cool and I had good use for it!), so I used it, not like a student trying to turn in "their own work." – TheSmartWon Dec 07 '17 at 23:13
  • 2
    @TheSmartWon 1) The same one that identifies 3 further sections? 2) "using a source and making some modifications" is not "writing it yourself". 4) Your reason to lie is to try to avoid punishment. It is a normal response to do everything you can to make it look like you should not be punished. An engineer also would not use a function just because s/he "thought it looked cool". You also already know that assignments need to be your own work; if you are unable to remember not to hand in other people's work then you have a problem. – user253751 Dec 07 '17 at 23:54
  • 1) Yes. Talk about them if you wish. 2) I agree. We are encouraged to use the Java API as a template, though. 4) If I was trying to avoid punishment, I wouldn't have made this post. I want what's fair in the eyes of academia. 4.2) You failed to recognize my second clause "...and I [understood the function] and had good use for it." Engineers include other people's optimizations all the time, especially in OSS! – TheSmartWon Dec 08 '17 at 00:13
  • 1
    Here is where intent is such a difficult issue to contend with when it comes to plagiarism. Did the defendant intend to plagiarize? That is, did they know that an idea, quote, or piece of code belonged to another entity and, despite knowing the rules, still decide to use it without citation? If the class never stressed code citation (and based on the question and comments from OP thus far, it would appear that such conventions were never taught but still enforced), how would it be fair to punish students who don't understand or follow the rule? – Bluebird Dec 08 '17 at 00:25
  • 1
    @TheSmartWon I don't think serious OSS programmers are anywhere near as casual as you about intellectual property. I am both a member of the Apache Software Foundation, and a committer on a couple of ASF projects. To become a committer, I had to sign a legal agreement that says, among other things "You represent that each of Your Contributions is Your original creation", and "You represent that Your Contribution submissions include complete details of any third-party license or other restriction...of which you are personally aware and which are associated with any part of Your Contributions." – Patricia Shanahan Dec 08 '17 at 07:43
  • @FrankFYC “it would appear that such conventions were never taught but still enforced” the concept of “don't cheat by copying other people's work” is not a “convention” and it does not need to be taught. It is obvious (actually, it is found in pretty much any academic/university rule book). – Andrea Lazzarotto Dec 08 '17 at 22:19
  • @AndreaLazzarotto I don't disagree with the statement that plagiarism is wrong, unethical, and should be punished (in some cases, severely). But perhaps my words were more confusing than enlightening. If the claim here is plagiarism with the following parts: knowledge - did the defendant know what plagiarism is, intent - and still decided to plagiarize then the jury (student board) can find OP guilty as charged. The defense that OP can put forth is that the current definition of plagiarism doesn't work well when it comes to computer programming, as well as the fact that (continued) – Bluebird Dec 08 '17 at 22:36
  • A commonly used mechanism to allow students to cite their sources (a bibliography for example) was not demanded by the professor. Another argument would be that the professor had another faculty member present during the grade reviews, a possible violation of FERPA. Another argument is the rather high number of academic misconduct claims in the class, which may signal that the professor has not put forth effort in educating his class on plagiarism, specifically about programming citations. – Bluebird Dec 08 '17 at 22:38
  • @FrankFYC I do not agree on the defense, it is quite moot. I myself studied CS and we were never given specific direction not to plagiarize. We knew it was wrong and it applies just fine to code as well. Granted, I am in a different country and here plagiarism is a criminal offense so the situation might be a bit different than the US. – Andrea Lazzarotto Dec 09 '17 at 01:54
  • @AndreaLazzarotto Exactly my point. Plagiarism is a serious charge. Determining whether or not a person has committed a crime is not an easy matter to make. In this case, the professor only used a single test to determine whether or not student plagiarized. Would that be fair? In addition, the professor allegedly broke US privacy laws by allowing another faculty member to see the student's grade. The professor also has an absurdly high rate of academic integrity cases in the class. Wouldn't you wonder how the professor is teaching the class (continued) – Bluebird Dec 09 '17 at 02:44
  • may be affecting the student's decision-making? If you set impossible goals, no matter how well-intentioned, without a means to achieve it ethically, how many would decide to deviate and 'cheat'? Having just one person who has the unilateral power to determine who has cheated is potentially rife with abuse. Lastly, and I wholeheartedly agree with you, plagiarism is a serious issue. But that doesn't mean someone is automatically at fault without evidence and a thorough investigation. Innocent until proven guilty - that's the American justice system (imperfect at times, but it's all we've got). – Bluebird Dec 09 '17 at 02:48
  • 1
    @PersonX It's also really helpful to have those cites if you're looking back at your code a year from now. It's pretty easy to just copy-paste the link you're looking at into a comment. It also has the added benefit of giving others something to read from if you want them to learn from your code. – Byte11 Dec 09 '17 at 18:37
  • 1
    @FrankFYC: depending on the class size, that rate is not "absurdly high". I'm a grader for a class of usually a dozen people, and most years I've had 3 cases of obvious plagiarism (those where students copied each other's sentences and mistakes in basic arithmetic). If next year one extra student decides to split their efforts with a classmate, suddenly it'd be an extraordinary event? (We sit those students down, we don't throw the full book at them the first time around, and invariably their justification is that they have too many assignments from other classes that don't leave enough time.) – nengel Dec 11 '17 at 08:00