
I did my thesis using corporate data (I used to work at the company; they don't give away or sell their information). The data covers 2009 to 2018.

I collected the data and analyzed it, and everything went normally. However, after my thesis was approved and submitted, I saw that the company had modified some of its data from 2009 to 2018. Historical data was changed (they have valid reasons), but I am worried that if someone tries to verify the source of my information, they will find different data and think that I committed fraud or data manipulation.

My supervisor and many other people told me that nothing bad will happen, because my research covered a specific period of time when the data was presented like that. They also told me that no one verifies the data source after a thesis is approved, and that such verification is done before approval.

However I am still worried, because it is not common for a company to modify data from the past.

My thesis won't be published because the data used is private, so they are allowing me not to publish it.

Any ideas of what I should do?

Wrzlprmft
Carlos Varas Tello
  • 109
    Nothing. Adviser is correct. No one is going to check your data. It might be interesting to see if your results match up on the new data set, but no one is going to care. In the absolute worst case that someone does care you can simply point out that they revised the data. – JoshuaZ Jan 27 '19 at 21:38
  • 4
    I did the experiment again and my results match up (hypothesis, discussion and conclusions still the same). – Carlos Varas Tello Jan 27 '19 at 21:40
  • 23
    Go find another project to worry about - that one is dead and sorted... – Solar Mike Jan 27 '19 at 21:50
  • 50
    You can always attach a short note saying what you just wrote here, namely that the analysis was done on a data set that was later corrected by the company, but that this has no implication for your conclusions. These kinds of anomalies are only a problem when a researcher changes the data on purpose. – Greg Jan 28 '19 at 04:21
  • But the problem is that I can explain that the company updated their data, but people may not believe that; they might believe that I manipulated data. – Carlos Varas Tello Jan 28 '19 at 12:09
  • 14
    Don't trust random people on the internet over your advisor... Especially in this case, when he is correct ;) – Lot Jan 28 '19 at 12:39
  • 6
    The fact that you are still concerned about this after your supervisor’s and our assurances makes me concerned for your anxiety levels. If you are having other symptoms of anxiety I would encourage you to seek out mental health resources. – Dawn Jan 28 '19 at 17:32
  • 4
    @Dawn, I'm sorry, but that is the worst kind of armchair psychology, and almost certainly quite uninvited. You are really saying that Carlos is making you uncomfortable. As far as his case, I think it quite sensible in fact that he looks into this carefully, especially in this non-privacy, high focus era. – narration_sd Jan 28 '19 at 20:25
  • 1
    Carlos, the thought occurs to me to get your advisor/s to put their judgement into writing for you -- that will give you something credible to attach as a note to anyone you need to show the thesis to for credit. Best fortune, and I like that you care, for your results, as basis to caring about how they are believed. – narration_sd Jan 28 '19 at 20:29
  • @narration_sd no one will read the thesis because it won't be published. The University and the Organization that regulates the degrees in my country are just keeping the thesis as validation that I did research, nothing more. – Carlos Varas Tello Jan 28 '19 at 20:41
  • @carlos, fine enough and understood. I would think it comfortable to have that 'certificate' available, but it's up to you. Again, good fortune, and soon your vita should show experience which means any remaining importance much diminishes. – narration_sd Jan 28 '19 at 21:41
  • what do you mean by 'certificate'? – Carlos Varas Tello Jan 28 '19 at 21:56
  • 1
    Unless your degree was in parapsychology or meteorology, nobody expects you to be able to predict the future. – David Schwartz Jan 29 '19 at 17:54
  • 2
    @Lot — Where in the question does it say the advisor uses male pronouns? – Reid Jan 29 '19 at 22:41
  • @Lot I don't understand your question – Carlos Varas Tello Jan 29 '19 at 23:14
  • Probably everyone has this nightmare at one time or another: "What if someone finds an error in my doctoral dissertation? Can they revoke my doctorate?" The answer is no unless they can prove deliberate dishonesty. – user247327 Jan 30 '19 at 15:08
  • Does it apply to an unpublished thesis? – Carlos Varas Tello Jan 30 '19 at 16:07
  • @narration_sd I think you might consider reading the three questions OP has posted on this and all his comments. Then note that I say "If you are having other symptoms..." This question does not make me uncomfortable at all. I feel sympathetic to OP and want to kindly point out a concern I have based on substantial lived experience. – Dawn Jan 31 '19 at 21:46
  • @dawn I understand you on this, then - thanks. I've made some further approaches on this as a response below. Take care. – narration_sd Feb 02 '19 at 09:52

7 Answers

93

Revoking a degree is rarely done and, then, only for the most extreme reasons such as explicit dishonesty and such. Any results in any thesis are subject to revision as new information becomes available that was not present in the past. It doesn't mean that the work was wrongly done, but only that what is known has advanced.

Since you re-did the experiment and came to the same conclusion, you may be able to publish something based on the new results and, when citing your unpublished thesis, mention that the conclusions there were verified with new data.

Buffy
19

Rest assured that few people read theses (PhD or otherwise), and as Buffy says, degrees never get revoked for outdated data sources so your degree is safe.

However, if you are concerned about people reading your thesis and not being able to reproduce your results, publish an updated version (say, on arXiv if you're in a rush, or in a journal/conference if you want the paper to be peer-reviewed), explicitly referencing the thesis and emphasizing the fact that your results are based on updated data. This is actually good practice that would help future researchers who may care about your work, and save them the trouble of trying to recover your result.

einpoklum
Spark
11

I did my thesis using corporate data

Remember, though, that your research was not the data, but its analysis and conclusions you drew. And since you "did the experiments again and the results match up" - then the thesis is perfectly valid not just at its time of publication, but now as well.

but I am worried that if someone tries to verify the source of my information, they will find different data and think that I committed fraud or data manipulation.

This should not be an issue if your citations were accurate. You should have included the physical document, or the on-line resource, you obtained the data from, and indicated a date of publication (at least a month of publication anyway). If the company now presents different data - it published this data at a later date. edit: If the company does not indicate when publication happened, or if the data is not available from a single source, then you should have included a paragraph (or a small section) in your thesis explaining how you obtained/collected your data - and when. Specifically, dates of downloading data from internet URLs.

Like others suggest, an addendum/errata to your thesis and/or any paper you've published based on the data, mentioning the change, is a good idea. If the changes had affected your conclusions, then it would have been very important (IMHO).

My supervisor and many people told me ... However I am still worried,

You'll have far worse things to worry about in life - so don't worry about this minor issue :-)

because it is not common for a company to modify data from the past.

How do you know that? I'm not sure that's true. Also, it's very common for companies to not archive past data, and in that case as well, a researcher's work may not be thoroughly-verified.

My thesis won't be published because the data used is private, so they are allowing me not to publish it.

Definitely publish it! Academic work should, and I might venture to say must, be shared and made public. The least you could do is censor out the private data and keep all the other parts of your work.

Any ideas of what I should do?

Take a deep breath and relax!

einpoklum
  • I included a screenshot of how the data was presented during my research. I compared it to the current data and it shows an obvious update. Is that valid? – Carlos Varas Tello Jan 28 '19 at 18:46
  • I would say the most important thing is an explanation about when, how and why the data changed; and the claim that you have verified that the same analysis and conclusions apply; beyond that it's all a courtesy to the reader/viewer. – einpoklum Jan 28 '19 at 19:01
  • There won't be any reader/viewer. The thesis is not going to be published. – Carlos Varas Tello Jan 28 '19 at 19:03
  • @CarlosVarasTello: Oh, you mean in your thesis... well, in that case, I'm not sure I would add something like a screenshot - that begs for "where is the data this shot is based on?" - sometimes less information is better. But whatever your advisor thinks would work, should work. – einpoklum Jan 28 '19 at 19:05
  • the screenshot was taken from the company system, I included it in order to show that my source is valid and I am not fabricating anything. – Carlos Varas Tello Jan 28 '19 at 19:12
  • "If the company now presents different data - it published this data at a later date." This is possibly optimistic; outside of academia things often aren't "published" in the same way with a trail of breadcrumbs where you can see definitively what changed over time and when and why. Question suggests the company just modified existing material in a way that a new visitor would not necessarily understand as having been changed since it was originally made available. Otherwise good answer. – Lightness Races in Orbit Jan 29 '19 at 00:46
  • @LightnessRacesinOrbit : See edit – einpoklum Jan 29 '19 at 07:06
  • @einpoklum Perfect; spot on now :) – Lightness Races in Orbit Jan 29 '19 at 10:56
  • @LightnessRacesinOrbit can you be more specific in your answer please, I didn't understand you – Carlos Varas Tello Feb 01 '19 at 20:52
  • @CarlosVarasTello I haven't written an answer. – Lightness Races in Orbit Feb 02 '19 at 13:54
6

In addition to @Buffy's answer:

  • Revoking a degree is such a serious matter that the revoking committee would have to prove positively that you did falsify data. I.e., in your case, they'd need to show that the data were not as you said they were at the time of your thesis writing.
    Which, obviously, they cannot.

  • Nevertheless, it is a good idea (take home message for future) to keep a copy of your raw data if at all feasible*. If you are not the owner of the data, you can

    • ask the owner for permission to keep a copy for the explicit purpose of being able to show original data for your thesis/paper should anything be questioned.
    • if they don't agree, ask them to keep a copy available in case such a request comes in.

    IMHO, such points should ideally be part of the thesis contract between you, company and university. But from the science point of view, while of course public open data is nicer, granting access to the data only after an NDA is signed does satisfy requirements to be able to answer questions about your work and possible further questions about the original data.

  • Also, I think @Greg's comment about a note about the correction occurring afterwards is a good idea where practicable: you won't change already deposited copies at the library - but if you have your thesis online, I'd note it there. If you write a publication now, I'd probably use the corrected data and say that thesis [citation] was done on an earlier version of the data set discussed in the paper.


* Not even all scientific institutes allow that when you leave them. I have a former employer (research institute) that insists that leaving employees do not take any data with them without written permission (which is not given freely). They do promise to take care of archiving the original data and paper lab books, though.

cbeleites unhappy with SX
  • 1
    Another case where it might be difficult to keep a copy of the data is when there are privacy concerns. – Anyon Jan 28 '19 at 15:04
  • @Anyon: good point. The paperwork that is necessary to allow the student/researcher to work with the data at all would be a good point where also the decision who archives and how could be included. Approaching data management plans, are we? (I'm trying to generate/write down here advice for future me ...) – cbeleites unhappy with SX Jan 28 '19 at 15:12
  • update: my thesis was only to get an engineering degree; it is not a PhD or master's. My thesis won't be published. Any advice now? – Carlos Varas Tello Jan 28 '19 at 15:25
  • @CarlosVarasTello: still all apply kind-of. However, I'd not make too much hassle: ask them politely (email leaves an almost-paper trail ;-) ) whether you should keep a copy in order to have your thesis archive complete, or whether they prefer to archive the original data. If they say no to both, and anyone ever asks, show the email exchange: that shows that you really did all you could legally do - if more records should have been kept, that's not your fault. As for (3), put a yellow sticker with the note into your copy (in case you hand it to someone at some point). Enjoy your degree! – cbeleites unhappy with SX Jan 28 '19 at 15:34
  • @cbeleites I only have a screenshot of the dataset (it is included in the thesis) when it was used by me and another screenshot of the current dataset (which is not included in the thesis), if you compare, it reveals that the old dataset was updated. Would that be valid?

    Anyways, I have this doubt: Is my case unique? or Could it be the situation of anyone who works with a corporation's data?

    – Carlos Varas Tello Jan 28 '19 at 15:44
  • @CarlosVarasTello: If your thesis even includes the original data as it was, you're fine! Yes, you can just put a loose sheet with the updated table. I don't think this is as rare as you think. More knowledge is gained, replacing old data. And in some fields, reality evolves (think epidemiology, frequencies and distributions of species, all kinds of data related to society etc.). On a side note: reminds me of a research stay shortly after my Diplom (master), PI asked me pretty much first thing "What mistakes have you already found in your thesis?" and went on to explain that this is natural. – cbeleites unhappy with SX Jan 28 '19 at 16:12
  • @cbeleites, I know that information is always updated; my worry is that it is not common to have data from the past updated. Anyways, my advisor and the engineering faculty secretary told me that no one verifies source info and that the company is free to modify data as they want. Any final opinions? – Carlos Varas Tello Jan 28 '19 at 16:33
  • 1
    Re: when an institution can or can't revoke a degree, yes it is a serious matter, but the rules of what an academic institution needs to do in order to revoke a degree are going to vary. – De Novo Jan 28 '19 at 21:26
1

I think you're worrying too much. Add a footnote under the data and/or somewhere else saying "Data as supplied at time of writing" (or as of XX/YY/ZZ) and leave it at that.

0

If it'll help, imagine that you'd graduated 20 years ago doing a thesis on, say, IBM's sales data. IBM sees there was a problem with their historical data in archives, and publishes a corrected set of sales data. Do you think the university would revoke your degree from 20 years ago?

No. Of course not. Because when you wrote your thesis, you were using the information you had at the time, that by all appearances was correct. And the degree was awarded because your thesis came to a justifiable conclusion with the dataset you worked with. The data may have changed 20 years later, but that has no bearing on the work/skill/etc that was put into the thesis.

Kevin
0

Carlos, I am going to ask you a simple question, because it is very confusing that you continue asking for something, for 'advice now'.

At the same time, I've thought that an amount of misunderstanding here has been because of different native languages. You are writing in English, but I think you are from Peru.

So, is the problem you ask about actually that your thesis 'won't be published', as you've said several times? In other words, are you looking for a way it can be published?

A native speaker would understand [won't] to mean absolutely not. If you said [can't], then the understanding would be that if a barrier were removed, it could be published.

If you wanted advice to remove a barrier, then you would say 'can't be published because of [the barrier], so please suggest how I could remove [the barrier]'.

If what I describe is not your question, then another possibility is that you are worried that someone will attack your thesis, if they don't like something you might say or do because of what you learned in writing it, the analysis. Thus you worry the attack could have some consequence on your accomplishment, your degree.

This is where the note I suggested could give you a position of better confidence, where your professors write as you indicate they say, that your thesis is valid, on data you used that was valid at its time, and sign it.

I called the note a 'certificate', only because it verifies truth, not because it is any special printed kind of paper from any government. It's just truth anyone would understand.

I hope that answers your question about what I meant in my answer above.

As far as what would adequately protect you if there is a challenge, you must surely use your judgement there. In any culture, there are things which can be risky in some way to do. Then we decide, and certainly may decide not to do them, if the risk is too high.

In that case, we move on in our life, and take up another task which is more appropriate to do, no?

I wish you well, Carlos, and hope you come soon to a path that will relax you about this -- whether we answering here understand or not isn't the first importance.

narration_sd
  • yeah you're correct. In my country Peru nobody cares about theses after they are submitted (except for plagiarism). Also, it is logical that the company can do whatever they want with their data. I must feel happy that God gave me the degree. I think that I need to do more exercise, hang out, visit family, etc. – Carlos Varas Tello Feb 02 '19 at 16:16
  • great, that sounds very good. When we work very hard to accomplish something, then it's not unusual that our vision gets narrowed in. Going back out in the world of family, friends, nature, and all of spirit fixes that -- and we learn another thing ;) You enjoy, Carlos, and others here smile also – narration_sd Feb 02 '19 at 20:47