What is the exact purpose of length padding in Merkle–Damgård hash functions?

Question

Is the length padding technique in a hash function used to avoid length extension attacks?

Clearly including the length of the message in the padding in Merkle-Damgaard hashes does not prevent length extension attacks since they are vulnerable to this attack. — CodesInChaos, Sep 03 '14 at 11:07
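To make the comment above concrete, here is a minimal sketch of a length extension against a toy Merkle–Damgård hash. Everything named here (`toy_md`, `compress`, `padding`, `extend`, the all-zero IV) is a hypothetical construction for illustration, not any real hash function; the compression function is simply SHA-256 applied to chaining value plus block. The point is only that the digest is the internal state, so an attacker who knows the digest and the message length can keep hashing even though the length is in the padding.

```python
import hashlib
import struct

BLOCK = 64        # block size in bytes (assumption for this toy hash)
IV = bytes(32)    # hypothetical all-zero initial chaining value

def compress(state: bytes, block: bytes) -> bytes:
    """Toy compression function: SHA-256 over chaining value || block."""
    return hashlib.sha256(state + block).digest()

def padding(total_len: int) -> bytes:
    """0x80, zeros, then the 64-bit big-endian message length in bits."""
    zeros = -(total_len + 1 + 8) % BLOCK
    return b"\x80" + b"\x00" * zeros + struct.pack(">Q", 8 * total_len)

def iterate(state: bytes, data: bytes) -> bytes:
    """Feed whole blocks of data through the compression function."""
    for i in range(0, len(data), BLOCK):
        state = compress(state, data[i:i + BLOCK])
    return state

def toy_md(msg: bytes) -> bytes:
    """Toy Merkle-Damgard hash with length padding."""
    return iterate(IV, msg + padding(len(msg)))

def extend(digest: bytes, msg_len: int, suffix: bytes):
    """Forge the hash of msg || glue || suffix knowing only digest and msg_len."""
    glue = padding(msg_len)                      # padding the victim's hash already absorbed
    new_len = msg_len + len(glue) + len(suffix)  # length of the extended message
    return glue, iterate(digest, suffix + padding(new_len))

secret = b"key=hidden;user=alice"
glue, forged = extend(toy_md(secret), len(secret), b";admin=true")
assert forged == toy_md(secret + glue + b";admin=true")
```

The forged digest matches without `extend` ever seeing `secret`, which is why length padding by itself is not a defence against this attack.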

ddddavidee · Answer 1 · 2014-09-03T11:19:38.290

-1

Hash functions are defined to work on an arbitrary integer number of blocks. The minimal quantity of data that can be processed by a hash function is a single block.

So, if the message size is not an integer multiple of the block size, one has to pad it to the right size.

One also needs the message + padding to have one and only one interpretation, otherwise it would be simple to create collisions. Therefore adding the padding length gives a unique way to interpret the message+padding pair.

As an example: let's imagine that one pads by adding just zeros to reach the right length. Then message $m$ and message $m||0$ have the same hash image (whenever $m$ does not already end on a block boundary).
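The ambiguity in this example can be sketched in a few lines. The 8-byte block size and the helper name `zero_pad` are assumptions chosen only for illustration; the padding works on whole bytes rather than bits.

```python
BLOCK = 8  # hypothetical block size in bytes, chosen only for illustration

def zero_pad(msg: bytes, block: int = BLOCK) -> bytes:
    """Pad with zero bytes up to the next multiple of the block size."""
    return msg + b"\x00" * (-len(msg) % block)

m = b"abc"
# m and m || 0 produce identical padded input, so any hash built on
# this padding maps both messages to the same digest.
print(zero_pad(m) == zero_pad(m + b"\x00"))  # True
```

Since the padded inputs are identical, the collision exists regardless of how strong the compression function is.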

edited Sep 03 '14 at 11:19

answered Sep 03 '14 at 11:07

ddddavidee


But that doesn't explain why the padding contains the length of the message. At least that's how I understand the question, but it's so vague that your interpretation of "why is padding used" might also be what the OP wants. – CodesInChaos Sep 03 '14 at 11:08
You don't need to put the padding length in the padding; you could add $10\dots 0$ to reach the right size, with the rule that if the message size is an integer multiple of the block size you add an entire block as padding. – ddddavidee Sep 03 '14 at 11:14
MD hashes do put the length into the padding. I interpret the question as asking why they use the length in the padding instead of a simpler padding, like the one you suggest. My favourite hashes (Skein and Blake2) use even simpler padding (just zeros) and use a different mechanism of signaling the end of the message. – CodesInChaos Sep 03 '14 at 11:22
Ok, I've probably misunderstood the question. (Now I'm intrigued about the Skein and Blake techniques... do you have a good pointer?) – ddddavidee Sep 03 '14 at 11:24
MD has a security reduction for the collision resistance of the whole hash to the collision resistance of the compression function. This reduction relies on the message length being part of the padding. For details, check the question I linked as duplicate of this one. – CodesInChaos Sep 03 '14 at 11:28
Normal compression functions have two inputs, the chaining value and the message block. Skein uses a tweakable compression function which has a third input. It uses this input to signal the end of the message, preventing length extension attacks. It also uses it to pass a kind of block counter to each compression, so a unique compression function is used for each position in the hash, which improves second pre-image resistance compared to MD hashes. Take a look at the Skein paper, it's pretty readable. – CodesInChaos Sep 03 '14 at 11:37
BLAKE uses a similar technique, but combines it with a traditional padding. In Blake2 we replaced the traditional padding with zeros since that's simpler, still secure and reduces the number of compression function calls if the message length is an integral number of blocks (important for tree hashing). – CodesInChaos Sep 03 '14 at 11:38
Voted down as bit padding - which is usually performed for hash functions - is fully reversible. You don't need to encode the length to achieve the same thing as you describe (as long as you always pad). – Maarten Bodewes Sep 03 '14 at 21:02
Look at my comments. I wrote the same. – ddddavidee Sep 03 '14 at 21:05
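The two padding rules discussed in the comments above can be sketched as follows, working on whole bytes. The function names are hypothetical; the 64-byte block and the big-endian 64-bit bit length follow the layout used by SHA-1 and SHA-256 (MD5 stores the length little-endian instead). `bit_pad` is the reversible $10\dots 0$ padding; `md_pad` additionally appends the message length, which is what the collision-resistance reduction mentioned above relies on.

```python
import struct

BLOCK = 64  # bytes; the block size of MD5, SHA-1 and SHA-256

def bit_pad(msg: bytes) -> bytes:
    """Append 0x80, then zeros; always adds at least one byte."""
    return msg + b"\x80" + b"\x00" * (-(len(msg) + 1) % BLOCK)

def bit_unpad(padded: bytes) -> bytes:
    """Reverse bit_pad by cutting at the last 0x80 byte."""
    return padded[:padded.rindex(b"\x80")]

def md_pad(msg: bytes) -> bytes:
    """Bit padding followed by the 64-bit big-endian message length in bits."""
    zeros = -(len(msg) + 1 + 8) % BLOCK
    return msg + b"\x80" + b"\x00" * zeros + struct.pack(">Q", 8 * len(msg))

# Unlike zero padding, bit padding keeps m and m || 0 distinct.
assert bit_pad(b"abc") != bit_pad(b"abc\x00")
# A message that fills a block exactly still gets a full block of padding.
assert len(bit_pad(b"a" * BLOCK)) == 2 * BLOCK
```

Both functions are injective, so either one avoids the trivial collisions of plain zero padding; the length field in `md_pad` serves the security proof rather than unambiguous decoding.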
