As a personal project, I want to implement MD5 on an FPGA, but I have some doubts about the specifics of the implementation. My first reference for the algorithm was RFC 1321, whose pseudocode describes round 1 as follows:
/* Round 1. */
/* Let [abcd k s i] denote the operation
a = b + ((a + F(b,c,d) + X[k] + T[i]) <<< s). */
/* Do the following 16 operations. */
[ABCD 0 7 1] [DABC 1 12 2] [CDAB 2 17 3] [BCDA 3 22 4]
[ABCD 4 7 5] [DABC 5 12 6] [CDAB 6 17 7] [BCDA 7 22 8]
[ABCD 8 7 9] [DABC 9 12 10] [CDAB 10 17 11] [BCDA 11 22 12]
[ABCD 12 7 13] [DABC 13 12 14] [CDAB 14 17 15] [BCDA 15 22 16]
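To make the bracket notation concrete, here is a minimal Python sketch of round 1 as I read the RFC pseudocode. The names `rotl`, `round1_rfc`, `ORDER`, and `SHIFTS` are my own and do not come from the RFC; the table `T` is built from the RFC's definition T[i] = floor(2^32 * abs(sin(i))).

```python
import math

MASK = 0xFFFFFFFF  # all arithmetic is modulo 2^32

def rotl(x, s):
    """Rotate the 32-bit value x left by s bits (the RFC's <<< operator)."""
    x &= MASK
    return ((x << s) | (x >> (32 - s))) & MASK

def F(b, c, d):
    """Round 1 auxiliary function from RFC 1321: F(X,Y,Z) = XY v not(X)Z."""
    return ((b & c) | (~b & d)) & MASK

# T[1..64] as defined in RFC 1321; index 0 is unused so indices match the RFC.
T = [0] + [int(abs(math.sin(i)) * 2**32) & MASK for i in range(1, 65)]

# Register order and shift amount repeat every four steps in round 1.
ORDER = ["ABCD", "DABC", "CDAB", "BCDA"]
SHIFTS = [7, 12, 17, 22]

def round1_rfc(state, X):
    """Apply the 16 round-1 operations [abcd k s i].

    state: dict with keys 'A', 'B', 'C', 'D' holding 32-bit words.
    X: sequence of 16 32-bit message words.
    """
    r = dict(state)
    for step in range(16):
        a, b, c, d = ORDER[step % 4]          # which register plays which role
        k, s, i = step, SHIFTS[step % 4], step + 1
        r[a] = (r[b] + rotl(r[a] + F(r[b], r[c], r[d]) + X[k] + T[i], s)) & MASK
    return r
```

Each iteration writes exactly one of the four registers and reads the one written in the previous iteration, which is the dependency chain visible in the pseudocode.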
Ok. Fair enough. This dashed my hopes of parallelizing the algorithm, since each step updates one variable (A, B, C, or D) and needs the result of the previous step.
So, looking for some way to parallelize some part of the algorithm I found this chapter of a book:
Hardware Implementation of Hash Functions
Table 2.1 on page 31(5) gives another formulation of the algorithm. It is similar to the RFC's, but not identical: according to this book, the only value that is ever computed is B. At first glance this does not match RFC 1321, where A, B, C, and D are each updated in turn.
My questions are:
- Is this just another form of the original algorithm that produces exactly the same results? (I suppose so, but since the book does not explain the steps that lead to its equations, I want to be sure.)
- If so, how are those new equations derived?
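One way to test the first question empirically is the sketch below. It assumes the book's table is the common "rotating register" formulation, in which every step computes a new B and shifts the other words along (A takes D, C takes the old B, D takes the old C); I have not reproduced the table itself, so that reading is an assumption. The names `md5_rotating`, `f_and_k`, `T`, and `S` are mine. If this version produces the standard MD5 digests, the two forms are equivalent.

```python
import math
import struct

MASK = 0xFFFFFFFF
# Sine-derived constants (0-based here) and per-step shift amounts from RFC 1321.
T = [int(abs(math.sin(i + 1)) * 2**32) & MASK for i in range(64)]
S = [7, 12, 17, 22] * 4 + [5, 9, 14, 20] * 4 + [4, 11, 16, 23] * 4 + [6, 10, 15, 21] * 4

def rotl(x, s):
    """Rotate the 32-bit value x left by s bits."""
    x &= MASK
    return ((x << s) | (x >> (32 - s))) & MASK

def f_and_k(i, b, c, d):
    """Return the round function value and the message word index for step i."""
    if i < 16:
        return (b & c) | (~b & d), i                 # F
    if i < 32:
        return (d & b) | (~d & c), (5 * i + 1) % 16  # G
    if i < 48:
        return b ^ c ^ d, (3 * i + 5) % 16           # H
    return c ^ (b | (~d & MASK)), (7 * i) % 16       # I

def md5_rotating(message: bytes) -> str:
    """MD5 where each step only computes a new B and rotates the registers."""
    bit_len = (8 * len(message)) & 0xFFFFFFFFFFFFFFFF
    padded = message + b"\x80"
    padded += b"\x00" * ((56 - len(padded)) % 64)
    padded += struct.pack("<Q", bit_len)
    h = [0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476]
    for off in range(0, len(padded), 64):
        X = struct.unpack("<16I", padded[off:off + 64])
        a, b, c, d = h
        for i in range(64):
            f, k = f_and_k(i, b, c, d)
            new_b = (b + rotl(a + f + X[k] + T[i], S[i])) & MASK
            # Same data flow as the RFC step; only the register names rotate.
            a, b, c, d = d, new_b, b, c
        h = [(x + y) & MASK for x, y in zip(h, (a, b, c, d))]
    return struct.pack("<4I", *h).hex()
```

The reasoning behind it: RFC step [ABCD ...] writes A from B, C, D, and the next step [DABC ...] writes D from A, B, C. Renaming the registers after every step so that the freshly written word is always called B gives the same sequence of values, and after four steps the names line up with the RFC's again. The output can be compared against `hashlib.md5(message).hexdigest()` for any message.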