What padding algorithms are MD-compliment, so that two different inputs don't pad to the same thing, which would cause a collision in a hash function?

Nat
  • Please define "MD-compliment". Even if I guess Merkle-Damgård-compliant, what that means is not trivial. In particular, there's a little more than "two different inputs don't pad to the same thing" in the MD padding as practiced, for good reasons; see this. – fgrieu Jul 27 '17 at 21:22
  • @fgrieu I would like to know more about collision-resistant padding functions; Merkle-Damgård padding was one example, but I could not find details of how it works. – Nat Jul 27 '17 at 21:29

1 Answer

There are at least two different criteria for message encoding into fixed-size blocks:

  1. no input has its encoding equal to the encoding of any other input;
  2. no input has its encoding at the tail end of the encoding of any other input.
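
To make the two criteria concrete, here is a minimal Python sketch that checks them by brute force over a small set of inputs. The helper names (`violates_criterion_1`, `violates_criterion_2`, `zero_pad`) and the 4-byte block size are assumptions made for illustration; `zero_pad` is a deliberately broken encoding, not one of the schemes discussed in this answer.

```python
from itertools import combinations, permutations

BLOCK = 4  # toy block size in bytes (assumption, for illustration only)

def zero_pad(msg: bytes) -> bytes:
    """Deliberately bad encoding: fill the last block with zero bytes."""
    return msg + b"\x00" * (-len(msg) % BLOCK)

def violates_criterion_1(encode, inputs):
    """Return a pair of distinct inputs with equal encodings, or None."""
    for a, b in combinations(inputs, 2):
        if a != b and encode(a) == encode(b):
            return a, b
    return None

def violates_criterion_2(encode, inputs):
    """Return (a, b), a != b, where encode(a) is a tail of encode(b), or None."""
    for a, b in permutations(inputs, 2):
        if a != b and encode(b).endswith(encode(a)):
            return a, b
    return None

samples = [b"", b"a", b"a\x00", b"ab", b"abcd", b"abcda"]
print(violates_criterion_1(zero_pad, samples))
print(violates_criterion_2(zero_pad, samples))
```

Note that equal encodings are a special case of one encoding being a tail of the other, so any pair that breaks criterion 1 also breaks criterion 2.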

Criterion 1 is often used together with block ciphers for encryption or Message Authentication Codes. Usual padding techniques meeting it:

  • Bit padding, also known as ISO/IEC 9797-1 Padding Method 2. We append a single 1 bit to the message, then as many 0 bits as needed (possibly none) to fill the block.
  • Various forms of byte padding, applicable to messages already formatted as bytes (octets); most often, if the message is $n$ bytes long and a block is $b$ bytes, we append $p=b-(n\bmod b)$ byte(s) (thus $0<p\le b$), the last of which has value $p$; there are various conventions for the other added bytes: $0$ (ANSI X.923), $p$ (PKCS#7), random, or just unspecified.
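
The two techniques above can be sketched in Python as follows. This is a minimal illustration, assuming byte-aligned messages (so the single 1 bit followed by 0 bits becomes the byte `0x80` followed by `0x00` bytes) and a pad length $p$ chosen so that $0<p\le b$ and $n+p$ is a multiple of $b$. The function names are made up for this example and do not come from any particular library.

```python
def bit_pad(msg: bytes, block: int) -> bytes:
    """Bit padding for byte-aligned messages: 0x80, then 0x00 up to a block boundary."""
    padded = msg + b"\x80"
    return padded + b"\x00" * (-len(padded) % block)

def pkcs7_pad(msg: bytes, block: int) -> bytes:
    """Append p bytes, each of value p, with 0 < p <= block."""
    if not 1 <= block <= 255:
        raise ValueError("block size must fit in one byte")
    p = block - (len(msg) % block)
    return msg + bytes([p]) * p

def pkcs7_unpad(padded: bytes, block: int) -> bytes:
    """Inverse of pkcs7_pad; rejects malformed padding."""
    if not padded or len(padded) % block != 0:
        raise ValueError("length is not a positive multiple of the block size")
    p = padded[-1]
    if not 1 <= p <= block or padded[-p:] != bytes([p]) * p:
        raise ValueError("invalid padding")
    return padded[:-p]

for m in (b"", b"abc", b"12345678"):
    assert pkcs7_unpad(pkcs7_pad(m, 8), 8) == m
```

Because padding is always added, even when the message already fills a whole number of blocks, the last byte unambiguously tells where the message ends; that is what makes the encoding meet criterion 1.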

Criterion 2 is stronger than criterion 1. It is sometimes used for hashing, especially when iterating a one-way compression function, where this criterion strengthens the hash against some attacks (see this answer). A common implementation is:

  • Append a single 1 bit to the message (this step is customary, but has no justification AFAIK).
  • Append just as many 0 bits (possibly none) as needed so that exactly $s$ unused bits remain in the last block, for some fixed parameter $s$ (the message must have size less than $2^s$ bits); common choices are $s=64$ (MD5, SHA-1, SHA-256) and $s=128$ (SHA-512).
  • Append $s$ bits (thus filling the last block) encoding the length of the message in bits (before padding), under some prescribed endianness convention; MD5 uses little-endian, SHA-1 and SHA-2 use big-endian.
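
The three steps above can be sketched in Python as follows. This is a minimal illustration, assuming byte-aligned messages and a length field that is a whole number of bytes; the function name `md_pad` and its parameters (`block`, `s_bytes`, `byteorder`) are made up for this example. It only builds the padded message and does not compute any hash.

```python
def md_pad(msg: bytes, block: int = 64, s_bytes: int = 8,
           byteorder: str = "big") -> bytes:
    """Pad msg with a 1 bit, 0 bits, then the bit length on s_bytes bytes."""
    bit_len = 8 * len(msg)
    if bit_len >= 1 << (8 * s_bytes):
        raise ValueError("message too long for the length field")
    padded = msg + b"\x80"                      # the single 1 bit, then seven 0 bits
    # 0 bytes until exactly s_bytes bytes remain free in the last block
    padded += b"\x00" * (-(len(padded) + s_bytes) % block)
    return padded + bit_len.to_bytes(s_bytes, byteorder)

sha256_style = md_pad(b"abc")                               # s = 64, big-endian
md5_style = md_pad(b"abc", byteorder="little")              # s = 64, little-endian
sha512_style = md_pad(b"abc", block=128, s_bytes=16)        # s = 128, big-endian
assert len(sha256_style) == 64 and len(sha512_style) == 128
```

With the defaults, `b"abc"` yields a single 64-byte block ending in the 64-bit big-endian value 24, the message length in bits. A 56-byte message no longer leaves room for the length field in its first block, so the result spills into a second block.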

I have failed to track down the original reference for criterion 2 (much less its common realization); it does not seem to be quite this way in: Ivan Bjerre Damgård, A Design Principle for Hash Functions, in the proceedings of Crypto 1989; nor in its reference: Ralph C. Merkle, One Way Hash Functions and DES, in the proceedings of Crypto 1989.

fgrieu