This question is purely out of interest.

If we take the MD5 hash of every integer from 1 to n, what is the first n whose MD5 hash collides with that of some earlier integer?

Has anyone ever calculated and proved the smallest integer at which such a collision occurs?

If not, are there considerations about MD5's pseudo-randomness that make a collision more likely for consecutive integers than for cryptographically random strings (birthday bound: about $2^{64}$)?

This might be useful to know, for example, when trying to hide real sequential order numbers.
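
To make the experiment concrete, here is a minimal Python sketch of the search described above, run against a truncated MD5 so that it finishes quickly. It assumes each integer is hashed as its decimal ASCII string, which is only one possible convention; the names `truncated_md5` and `first_truncated_collision` and the `bits` parameter are made up for illustration. A result for a truncated hash says nothing definitive about the full 128-bit case.

```python
import hashlib


def truncated_md5(n, bits):
    """Top `bits` bits of MD5 over the decimal ASCII string of n (assumed encoding)."""
    digest = hashlib.md5(str(n).encode("ascii")).digest()
    return int.from_bytes(digest, "big") >> (128 - bits)


def first_truncated_collision(bits, limit=1_000_000):
    """Return (n, m) for the first n in 1..limit whose truncated hash
    equals that of an earlier m < n, or None if no collision is found."""
    seen = {}  # truncated hash -> first integer that produced it
    for n in range(1, limit + 1):
        value = truncated_md5(n, bits)
        if value in seen:
            return n, seen[value]
        seen[value] = n
    return None


if __name__ == "__main__":
    for bits in (16, 24, 32):
        print(bits, first_truncated_collision(bits))
```

If the truncated hash behaves like a random function, the first collision for `bits` output bits should typically show up around $2^{bits/2}$ inputs, which is the same birthday reasoning that gives $2^{64}$ for the full hash.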

  • From the very last input? It'd be the same as the chance of any two MD5 hashes colliding. – forest Nov 30 '18 at 05:42
  • I mean from the very first. If we try to hide order numbers, like McDonalds in London, by hashing them with MD5, how many orders can we take before the first order whose hashed number matches one we have already had? For example: "Damn, our latest order 99974367 has the same hash as order 43342425 ten years ago!" :) – Rauli Rajande Nov 30 '18 at 05:50
  • We don't know. You would expect a collision after about $2^{64}$ orders because of the birthday problem. So that's in the same ballpark as the work Google needed to find a collision in SHA-1. Presumably they didn't have to create a rainbow table with $2^{64}$ entries though :P – Maarten Bodewes Nov 30 '18 at 05:52
  • @fgrieu I saw this, but didn't realise that it is the same question, as that question doesn't talk about integers and the answers were entirely about long strings and the endianness in which they should be interpreted. Sorry about my question, should I delete it? – Rauli Rajande Nov 30 '18 at 17:48
  • That other question's last line "What is the minimal $b\in\mathbb N$ such that there exists $a\in[0,b)$ with $\mathbf{MD5}(a)=\mathbf{MD5}(b)$" differs from your statement only by starting from 0 where you start from 1 (and that's unlikely to change the outcome). Thus yes, I think this is a dupe, and it is reasonable to remove it (no pressure). MD5 hashes bitstrings, not integers, so how the conversion from integer to bitstring is made matters to the exact value, and that's why the answers discuss it. AFAIK, we do not know the exact value for any definition of that conversion. – fgrieu Nov 30 '18 at 18:33
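
The point in the last comment, that MD5 hashes bitstrings rather than integers, can be illustrated with a short Python sketch. The three encodings below (decimal ASCII, 8-byte big-endian, 8-byte little-endian) are assumptions chosen for illustration, as are the function names; any real system would have to fix one convention, and the "first colliding integer" would differ for each.

```python
import hashlib


def md5_decimal(n):
    """MD5 of n written as a decimal ASCII string, e.g. 42 -> b"42"."""
    return hashlib.md5(str(n).encode("ascii")).hexdigest()


def md5_be64(n):
    """MD5 of n as a fixed-width 8-byte big-endian value."""
    return hashlib.md5(n.to_bytes(8, "big")).hexdigest()


def md5_le64(n):
    """MD5 of n as a fixed-width 8-byte little-endian value."""
    return hashlib.md5(n.to_bytes(8, "little")).hexdigest()


if __name__ == "__main__":
    n = 1
    # The same integer gives three unrelated digests.
    print(md5_decimal(n))
    print(md5_be64(n))
    print(md5_le64(n))
```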

1 Answer

MD5 has a 128-bit output, so if it behaves like a random function, the first collision is expected after about $2^{64}$ hashes. This follows from the birthday problem in probability theory.

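With $n$ orders the collision probability is roughly $1 - e^{-n^2/2^{129}}$, so it stays negligible while $n$ is far below $$2^{64} = 18,446,744,073,709,551,616$$ and reaches about 39% at that value.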

But that shouldn't be a problem for you: even though this number might not look large, it is huge.

An example:

If every human ($\approx 7,000,000,000$ people) placed an order every second, it would take $${2^{64} \over 7,000,000,000 \cdot 3600 \cdot 24 \cdot 365} \approx 83.6$$ years to reach $2^{64}$ orders.
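
The arithmetic in this example is easy to check with a few lines of Python. This is only a sketch of the same calculation; the helper name `years_to_reach` is made up, and the year is taken as 365 days, as in the formula.

```python
SECONDS_PER_YEAR = 3600 * 24 * 365  # 365-day year, matching the formula


def years_to_reach(total_orders, orders_per_second):
    """Years needed to place `total_orders` at a constant rate."""
    return total_orders / (orders_per_second * SECONDS_PER_YEAR)


if __name__ == "__main__":
    # 7 billion people, one order per person per second.
    print(years_to_reach(2 ** 64, 7_000_000_000))
```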

So, as long as MD5 behaves like a random function on these inputs, you can be extremely confident that no two hashed order numbers will collide.
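
For a sense of how the birthday estimate scales, here is a small Python sketch of the standard approximation $p \approx 1 - e^{-n(n-1)/(2N)}$ with $N = 2^{128}$. It assumes MD5 outputs are uniformly random and independent, which is an idealisation and not a proven property of MD5 on consecutive integers; the function name `collision_probability` is made up for illustration.

```python
import math


def collision_probability(n, output_bits=128):
    """Approximate probability that n uniformly random hashes of
    `output_bits` bits contain at least one collision."""
    space = 2 ** output_bits
    exponent = n * (n - 1) / (2 * space)
    # expm1 keeps precision when the probability is extremely small.
    return -math.expm1(-exponent)


if __name__ == "__main__":
    for log2_n in (32, 48, 60, 64):
        print(log2_n, collision_probability(2 ** log2_n))
```

Under this idealised model the probability is vanishingly small for any realistic number of orders and only becomes substantial as $n$ approaches $2^{64}$.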

AleksanderCH
  • I know this general answer. It is well known that MD5 is not really well randomised. My question is more about whether anyone has ever calculated and proved the smallest order number at which the collision occurs. – Rauli Rajande Nov 30 '18 at 17:50