MD5 implementation

Question

As a personal project, I want to implement MD5 on an FPGA, but I have some doubts about the specifics of the implementation. My first source on how the algorithm works was RFC 1321, whose pseudocode describes round 1 as follows:

 /* Round 1. */
 /* Let [abcd k s i] denote the operation
      a = b + ((a + F(b,c,d) + X[k] + T[i]) <<< s). */
 /* Do the following 16 operations. */
 [ABCD  0  7  1]  [DABC  1 12  2]  [CDAB  2 17  3]  [BCDA  3 22  4]
 [ABCD  4  7  5]  [DABC  5 12  6]  [CDAB  6 17  7]  [BCDA  7 22  8]
 [ABCD  8  7  9]  [DABC  9 12 10]  [CDAB 10 17 11]  [BCDA 11 22 12]
 [ABCD 12  7 13]  [DABC 13 12 14]  [CDAB 14 17 15]  [BCDA 15 22 16]
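Read as code, the bracket notation is one operation applied sixteen times with its arguments rotated. A minimal Python sketch of that reading follows; the names `rotl32`, `step` and `first_two_steps` are illustrative and not from the RFC, and `T` is assumed to be indexed from 1 as in the RFC's table.

```python
MASK = 0xFFFFFFFF  # all arithmetic is modulo 2**32


def rotl32(x, s):
    """Rotate the 32-bit value x left by s bits (the RFC's <<<)."""
    x &= MASK
    return ((x << s) | (x >> (32 - s))) & MASK


def F(b, c, d):
    """Round-1 auxiliary function from RFC 1321: (b AND c) OR (NOT b AND d)."""
    return ((b & c) | (~b & d)) & MASK


def step(a, b, c, d, x_k, t_i, s):
    """One [abcd k s i] operation: returns the new value of the word in role a."""
    return (b + rotl32(a + F(b, c, d) + x_k + t_i, s)) & MASK


def first_two_steps(A, B, C, D, X, T):
    """Illustrative helper: the first two entries of the round-1 table."""
    A = step(A, B, C, D, X[0], T[1], 7)    # [ABCD 0 7 1]
    D = step(D, A, B, C, X[1], T[2], 12)   # [DABC 1 12 2], needs the new A
    return A, B, C, D
```

The second call takes the freshly computed A as its b argument, which is the data dependency between consecutive steps.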

Ok. Fair enough. It dashed my hopes of parallelizing the algorithm, since each step updates one variable (A, B, C, or D) and needs the values produced by the previous steps.

So, looking for a way to parallelize at least part of the algorithm, I found this chapter of a book:

Hardware Implementation of Hash Functions

Table 2.1 on page 31(5) gives another formula for the algorithm. Sure, it is similar, but it is not the same. According to this book, the only value ever updated is B. Clearly, this differs from the description in RFC 1321.

My questions are:

Is this another form of the original implementation with the exact same results? (I suppose so, but as the book does not explain the steps that led to the equations, I want to be sure)
And how are those new equations derived?

Is there some reason you chose MD5? It is known to be a broken hash function (at least for collision resistance, but breaks in other properties will likely follow later). If you can choose, use a more modern hash function. — Paŭlo Ebermann, Feb 11 '13 at 18:29
There is no special reason. I will not use it for anything serious, I just wanted a simple cryptographic hash function to implement on an FPGA for the purpose of learning, and MD5 seemed appropriate. Nevertheless, I may follow your advice (maybe SHA-256?). But anyway, I am still curious about this "discrepancy" — Fackelmann, Feb 11 '13 at 18:37
Actually, if you're looking at a parallelizable hash, you might want to look at SHA-3. — poncho, Feb 11 '13 at 19:16
Yeah, Keccak is probably a good idea, it has been designed for parallel operation (at least the primitive itself). You can try a smaller permutation for testing in a "restrained environment". — Maarten Bodewes, Feb 12 '13 at 00:58
Thank you very much for the hint about SHA-3. I have been looking into it and it seems interesting. I think I'll give it a try. — Fackelmann, Feb 13 '13 at 14:07

score 4 · Accepted Answer · edited Oct 07 '21 at 06:47

The description in RFC 1321 is correct. So is the one in the book sample. The difference amounts to notation. What the book denotes A, B, C, D corresponds to the first, second, third and fourth arguments of the [abcd k s i] notation used in RFC 1321 and in the question.

The book accesses the same variables at each step. The RFC's variable access pattern is different, saving three 32-bit assignments per step, which would have a sizable cost in software. In detail (following a comment):

Starting with the book's $A,B,C,D$ matching the RFC's A, B, C, D, in the first step, noted [ABCD 0 7 1] in the RFC:
- the book moves the former $D$ to the new $A$, the former $C$ to the new $D$, the former $B$ to the new $C$, computes $B_\text{new}$ and assigns it to the new $B$.
- the RFC computes the same value $B_\text{new}$ (which the RFC denotes a), assigns it to A, and leaves B, C and D unchanged.
Now the book's $A,B,C,D$ match the RFC's D, A, B, C. In the second step, noted [DABC 1 12 2] in the RFC:
- the book moves the former $D$ to the new $A$, the former $C$ to the new $D$, the former $B$ to the new $C$, computes $B_\text{new}$ and assigns it to the new $B$.
- the RFC computes the same value $B_\text{new}$ (which the RFC denotes a), assigns it to D, and leaves A, B and C unchanged.
At that point the book's $A,B,C,D$ match the RFC's C, D, A, B.

This rotation by one position continues at every step. After step $i$ with $i$ a multiple of $4$, including after the final step of the last round, the book's $A,B,C,D$ match the RFC's A, B, C, D.
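The equivalence can also be checked numerically. The following is a minimal Python sketch, not a reference implementation: it runs the 64 steps once with the RFC's access pattern and once with the book's, and compares both with the RFC 1321 test vector for "abc". The names `f_and_k`, `rounds_rfc`, `rounds_book` and `md5_one_block` are hypothetical, the constants are computed from the sine formula rather than copied from the table, and only messages of at most 55 bytes (a single block) are handled.

```python
import math
import struct

MASK = 0xFFFFFFFF
T = [int(abs(math.sin(i + 1)) * 2**32) & MASK for i in range(64)]  # 0-indexed
S = [7, 12, 17, 22] * 4 + [5, 9, 14, 20] * 4 + [4, 11, 16, 23] * 4 + [6, 10, 15, 21] * 4
INIT = (0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476)


def rotl32(x, s):
    return ((x << s) | (x >> (32 - s))) & MASK


def f_and_k(i, b, c, d):
    """Auxiliary function value and message word index for step i (0..63)."""
    if i < 16:
        return (b & c) | (~b & d), i
    if i < 32:
        return (b & d) | (c & ~d), (5 * i + 1) % 16
    if i < 48:
        return b ^ c ^ d, (3 * i + 5) % 16
    return c ^ (b | (~d & MASK)), (7 * i) % 16


def rounds_rfc(state, X, steps=64):
    """RFC pattern: the roles a,b,c,d rotate; only the word in role a is written."""
    v = list(state)  # v[0..3] are the RFC's A, B, C, D
    for i in range(steps):
        ia = (-i) % 4  # role a is A, D, C, B, A, ...
        a, b, c, d = (v[(ia + j) % 4] for j in range(4))
        f, k = f_and_k(i, b, c, d)
        v[ia] = (b + rotl32((a + f + X[k] + T[i]) & MASK, S[i])) & MASK
    return tuple(v)


def rounds_book(state, X, steps=64):
    """Book pattern: always compute a new B, then shift the other words."""
    A, B, C, D = state
    for i in range(steps):
        f, k = f_and_k(i, B, C, D)
        b_new = (B + rotl32((A + f + X[k] + T[i]) & MASK, S[i])) & MASK
        A, B, C, D = D, b_new, B, C
    return (A, B, C, D)


def md5_one_block(msg, rounds):
    """MD5 of a message that fits in one padded block (at most 55 bytes)."""
    assert len(msg) <= 55
    block = msg + b"\x80" + b"\x00" * (55 - len(msg)) + struct.pack("<Q", 8 * len(msg))
    X = struct.unpack("<16I", block)  # sixteen little-endian 32-bit words
    out = [(s + r) & MASK for s, r in zip(INIT, rounds(INIT, X))]
    return struct.pack("<4I", *out).hex()


print(md5_one_block(b"abc", rounds_rfc))   # 900150983cd24fb0d6963f7d28e17f72
print(md5_one_block(b"abc", rounds_book))  # same digest
```

Stopping both loops early with the `steps` parameter shows the rotation directly: after one step the book's tuple equals the RFC's (D, A, B, C), after two steps (C, D, A, B), and so on. For an FPGA the book's form is probably the more natural one, since every step then has the same datapath and the "moves" are just wiring.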

The only variation I know in MD5 is getting the byte order wrong in the 32-bit words. That happened in the first publication giving an MD5 collision, which was promptly corrected.
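For illustration, here is a small Python sketch of what that byte-order choice means. MD5 reads each group of four message bytes as a little-endian 32-bit word; the sample bytes are arbitrary.

```python
import struct

chunk = b"abc\x80"  # four consecutive message bytes

# MD5 convention: the first byte is the least significant one.
little = struct.unpack("<I", chunk)[0]

# The wrong reading for MD5: the first byte is the most significant one.
big = struct.unpack(">I", chunk)[0]

print(hex(little))  # 0x80636261
print(hex(big))     # 0x61626380
```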

Ok, I am officially stupid. Now I understand it. Thank you very much. It was so simple, yet I thought there was some sort of derivation along the way, thank you! — Fackelmann, Feb 11 '13 at 19:05
I came here with the same question but I am none the wiser after reading your answer. The way I understood it, the RFC uses a, b, c and d to denote the arguments of the operation [abcd k s i], so for a step expressed as [ABCD ...] in the RFC, a is A, b is B, c is C, and d is D -- with the implication that the step assigns A, the first 32-bit word; with a step [DABC ...] a is D, b is A, c is B, and d is C -- the step assigns D (the fourth 32-bit word). I am not getting how the first step could be considered equivalent to assigning B, which is the second 32-bit word? — Armen Michaeli, Mar 07 '21 at 14:36
