Given sha1(pad(A) || pad(B)), where B is known, can I calculate sha1(pad(A))? Here pad(A) means A padded so that its length is exactly 1 block (64 bytes for SHA-1).

If yes, for which other hash functions would it work too?

Would it work vice versa when A is known?


Why do I think it's possible?

If you take a look at the SHA-1 source, the 16-int block is expanded to 80 ints, and then in each of the 80 rounds one of those ints is mixed in using invertible operations. When you have the block in plaintext, you can calculate the expanded block and revert the hash round by round until you get the IV of the block, i.e. the hash of the previous block. Do you understand what I mean?

Smit Johnth
  • This sounds like a homework question. Also, what do you mean by 'pad(A)'? – pg1989 May 30 '13 at 19:28
  • pad(X) means it has length of 1 block (20 bytes in case of SHA1). "This sounds like a homework question." - even if, is that bad? – Smit Johnth May 30 '13 at 19:33
  • @SmitJohnth: I personally don't have problems with homework questions, but I think it is common courtesy to specify if it is, e.g. with the [homework] tag. – Reid May 30 '13 at 19:36
  • @Reid no, it's not, but what would it change? – Smit Johnth May 30 '13 at 19:36
  • @Smit Johnth: last time I checked, SHA-1 had an (input) block size of 512 bits, that is 64 bytes, not 20 bytes. Regardless of how pad() pads: no, I do not think that you can do what you ask for. – fgrieu May 30 '13 at 19:43
  • Yes, you are right - 16 ints / 64 bytes. 20 bytes is hash length.
    I think it will work - if you have current block data, you can generate expanded buffer and then reverse the hash generation to beginning of the block.
    – Smit Johnth May 30 '13 at 19:53
  • @SmitJohnth: Some people prefer to give hints to homework questions instead of full-blown answers; this is extremely common on math.se, for instance. Others might consider it dishonest to ask a site like this a homework question without advance notice. I'm ambivalent on the whole matter, personally. – Reid May 31 '13 at 00:59
  • @Smit Johnth: with current block data, here pad(B), yes you can "generate expanded buffer"; but, even knowing the final hash, your intuition that you can "reverse the hash generation to beginning of the block" seems wrong to me, based on the fact that the round function $S_{j+1}=F(S_j,M_j)$ has the structure $F(S,M)=E_M(S)\boxplus S$ where $E$ is a block cipher, and $\boxplus$ is some hybrid between addition and XOR. – fgrieu May 31 '13 at 05:02

1 Answer

SHA-1's compression function (like those of MD5 and SHA-2) is built from a (custom-made) block cipher in the Davies-Meyer construction. (The cipher's rounds are the "invertible operations" you describe in the question.)

The basic idea is to use the message block as the key for the cipher to "encrypt" the previous state:

                    message block
                          |
                          ↓
                     .---------.
previous state ----> | Encrypt | ----> new state
                     '---------'

If this were all there was to it, then we could indeed (knowing a block of the message and the new state) run the cipher backwards to calculate the previous state, just as you sketched in your question.
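As a rough illustration of that "cipher only" picture, here is a minimal Python sketch of the 80 SHA-1 rounds used as a keyed permutation, together with their inverse. The function names (`schedule`, `round_fk`, `encrypt`, `decrypt`) are made up for this example, and the feed-forward step is deliberately left out, so this is not the real compression function.

```python
import struct

MASK = 0xFFFFFFFF
IV = (0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0)

def rotl(x, n):
    """Rotate a 32-bit word left by n bits."""
    return ((x << n) | (x >> (32 - n))) & MASK

def schedule(block):
    """Expand a 64-byte block (16 words) into the 80-word message schedule."""
    w = list(struct.unpack(">16I", block))
    for t in range(16, 80):
        w.append(rotl(w[t - 3] ^ w[t - 8] ^ w[t - 14] ^ w[t - 16], 1))
    return w

def round_fk(t, b, c, d):
    """Return the round function value and round constant for round t."""
    if t < 20:
        return (b & c) | ((b ^ MASK) & d), 0x5A827999
    if t < 40:
        return b ^ c ^ d, 0x6ED9EBA1
    if t < 60:
        return (b & c) | (b & d) | (c & d), 0x8F1BBCDC
    return b ^ c ^ d, 0xCA62C1D6

def encrypt(state, block):
    """Run the 80 rounds forward; the message block acts as the key."""
    w = schedule(block)
    a, b, c, d, e = state
    for t in range(80):
        f, k = round_fk(t, b, c, d)
        a, b, c, d, e = (rotl(a, 5) + f + e + k + w[t]) & MASK, a, rotl(b, 30), c, d
    return (a, b, c, d, e)

def decrypt(state, block):
    """Undo the 80 rounds, last round first, using the same block."""
    w = schedule(block)
    a, b, c, d, e = state
    for t in reversed(range(80)):
        # Undo the register shuffle (rotating left by 2 undoes rotating left by 30).
        pa, pb, pc, pd = b, rotl(c, 2), d, e
        f, k = round_fk(t, pb, pc, pd)
        # Solve the round's addition for the old value of e.
        pe = (a - rotl(pa, 5) - f - k - w[t]) & MASK
        a, b, c, d, e = pa, pb, pc, pd, pe
    return (a, b, c, d, e)

block = bytes(range(64))          # any known 64-byte block
new_state = encrypt(IV, block)
recovered = decrypt(new_state, block)
```

With the block known, `decrypt` walks back from `new_state` to the state it started from, which is exactly the backtracking idea from the question.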

A compression function with this property is not what we call a "one-way" compression function, and it would not be a good choice as the compression function of a hash function. Therefore this is not what is actually done; the Davies-Meyer construction has one more step:

                    message block
                          |
                          ↓
                     .---------.
previous state ----> | Encrypt | --⊕---> new state
                 \   '---------'   ↑
                  \                /
                   \--------------/

This combining of the "ciphertext" with the "plaintext" (drawn as an XOR here; SHA-1 itself uses word-wise addition modulo 2^32) has the effect that we don't know the ciphertext output of the block cipher, and so can't do our backtracking. This is actually enough to make the compression function one-way, if the block cipher is secure (for some formal notions of one-way and secure).
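To make the extra step concrete, here is a small Python sketch of the full construction: the same 80 rounds, followed by adding the previous state back in. The names `encrypt`, `compress` and `pad_single_block` are chosen for this example only, and `pad_single_block` implements SHA-1's standard padding for short messages, which is not necessarily the pad() from the question. The result is compared against `hashlib` as a sanity check.

```python
import hashlib
import struct

MASK = 0xFFFFFFFF
IV = (0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0)

def rotl(x, n):
    """Rotate a 32-bit word left by n bits."""
    return ((x << n) | (x >> (32 - n))) & MASK

def encrypt(state, block):
    """The 80 invertible rounds, keyed by the 64-byte message block."""
    w = list(struct.unpack(">16I", block))
    for t in range(16, 80):
        w.append(rotl(w[t - 3] ^ w[t - 8] ^ w[t - 14] ^ w[t - 16], 1))
    a, b, c, d, e = state
    for t in range(80):
        if t < 20:
            f, k = (b & c) | ((b ^ MASK) & d), 0x5A827999
        elif t < 40:
            f, k = b ^ c ^ d, 0x6ED9EBA1
        elif t < 60:
            f, k = (b & c) | (b & d) | (c & d), 0x8F1BBCDC
        else:
            f, k = b ^ c ^ d, 0xCA62C1D6
        a, b, c, d, e = (rotl(a, 5) + f + e + k + w[t]) & MASK, a, rotl(b, 30), c, d
    return (a, b, c, d, e)

def compress(state, block):
    """Davies-Meyer step: add the previous state to the cipher output."""
    out = encrypt(state, block)
    return tuple((s + o) & MASK for s, o in zip(state, out))

def pad_single_block(msg):
    """Standard SHA-1 padding for a message that fits in one block (<= 55 bytes)."""
    assert len(msg) <= 55
    return msg + b"\x80" + b"\x00" * (55 - len(msg)) + struct.pack(">Q", 8 * len(msg))

msg = b"abc"
digest = struct.pack(">5I", *compress(IV, pad_single_block(msg)))
matches_hashlib = digest == hashlib.sha1(msg).digest()
```

An attacker who sees only the output of `compress` and the block would need the output of `encrypt` to run the rounds backwards, but that value is hidden behind the addition of the unknown previous state.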

If yes, for which other hash functions would it work too?

As explained, no hash function with your backtracking property can be deemed secure, so we can conclude that no modern hash function should have this property.

Would it work vice versa when A is known?

If A is known, you also know sha1(pad(A)), i.e. the state after hashing the first block, which is the input to the second block.

Looking at the diagram above, we now know both the plaintext input and the ciphertext output of the block cipher. Deriving the key (or any matching key) from this is known as a known-plaintext key-recovery attack (or chosen-plaintext, if the attacker can actually influence A and see hash(pad(A)||pad(B))). If the block cipher is any good, it should not be feasible to retrieve any useful information about the key pad(B) (other than by trying examples, i.e. brute forcing).
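The "trying examples" route can be sketched in a few lines of Python. This is only an illustration under strong assumptions: the helper name `find_second_block` is invented here, the blocks are plain 64-byte strings, and the candidate set is tiny (256 guesses), whereas a real unknown block has far too many possibilities to enumerate.

```python
import hashlib

def find_second_block(a_block, target_digest, candidates):
    """Return the candidate B with sha1(A || B) == target_digest, or None."""
    # Hash the known first block once; this fixes the state entering block two.
    after_a = hashlib.sha1(a_block)
    for b in candidates:
        h = after_a.copy()        # reuse the state after A for every guess
        h.update(b)
        if h.digest() == target_digest:
            return b
    return None

a_block = b"A" * 64
secret_b = b"B" * 63 + bytes([7])                   # pretend this is unknown
target = hashlib.sha1(a_block + secret_b).digest()

guesses = (b"B" * 63 + bytes([i]) for i in range(256))
found = find_second_block(a_block, target, guesses)
```

Knowing A saves the work of rehashing the first block for each guess, but it does nothing to shrink the set of candidates for B.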

Paŭlo Ebermann