9

In the FIPS 202 specification, the padding required for SHA-3 is not clearly described. We therefore analyzed the NIST test vectors for SHA-3, which append "0x06" to the message (rather than the 1, followed by j zeros, followed by 1 that FIPS 202 specifies). This looks like a contradiction of the specification.

The links to the NIST test vectors are:

http://csrc.nist.gov/groups/ST/toolkit/examples.html#aHashing

http://csrc.nist.gov/groups/ST/toolkit/documents/Examples/SHA3-224_Msg0.pdf

Could you clarify the padding mechanism for SHA-3?

Biv
Vani

3 Answers

10

In the FIPS 202 specification, the padding required for SHA-3 is not clearly described.

I beg to differ. From FIPS 202, Section B.2:

For most applications, the message is byte-aligned, i.e., $len(M) = 8m$ for a nonnegative integer $m$.

In this case, the total number of bytes, denoted by $q$, that are appended to the message is determined as follows by $m$ and the rate $r$: $$q = \frac{r}{8} - \left(m \bmod \frac{r}{8}\right)$$

The value of $q$ determines the hexadecimal form of these bytes in this case according to the conversion functions specified in Sec. B.1. The padded messages that result are summarized in Table 6:

+-------------------------+--------------------------------+
|                         |                                |
| Number of padding bytes |       Padded message           |
|                         |                                |
+-------------------------+--------------------------------+
|                         |                                |
|         q = 1           |  M || 0x86                     |
|                         |                                |
|         q = 2           |  M || 0x0680                   |
|                         |                                |
|         q > 2           |  M || 0x06 || 0x00... || 0x80  |
|                         |                                |
+-------------------------+--------------------------------+

The notation 0x00... indicates the string that consists of $q - 2$ "zero" bytes.

Now look at your test vector, for $\operatorname{SHA3-224}$. Here $224$ is the digest length in bits, and the capacity is twice that, so we have $c = 224 \times 2 = 448$ and $r = 1600 - c = 1152$.

Because the message is empty, $len(M) = 0$, therefore $m = 0$. Hence $$q = \frac{1152}{8} - \left(0 \bmod \frac{1152}{8}\right) = 144 - 0 = 144$$

Therefore we are in the third case ($q > 2$).

The padding is 0x06 || 142 zero bytes || 0x80 (since 144 - 2 = 142), so the data to be absorbed is:

06 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 80 

That is 06, followed by 284 zero hex digits (142 bytes of 00), followed by 80. The 00 bytes that follow this block in the test vector are there to show that the capacity part of the state does not change during absorption.
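As a rough illustration of Table 6, here is a minimal Python sketch that builds the padding bytes for a byte-aligned message. The function name `sha3_byte_padding` and its parameters are made up for this example; only the three cases of the table and the formula for $q$ come from FIPS 202.

```python
def sha3_byte_padding(message_len_bytes: int, rate_bits: int) -> bytes:
    """Padding bytes appended to a byte-aligned message (FIPS 202, Table 6).

    Hypothetical helper for illustration; assumes rate_bits is a multiple of 8.
    """
    rate_bytes = rate_bits // 8
    # q = r/8 - (m mod r/8): number of padding bytes, always between 1 and r/8
    q = rate_bytes - (message_len_bytes % rate_bytes)
    if q == 1:
        return bytes([0x86])
    if q == 2:
        return bytes([0x06, 0x80])
    # q > 2: 0x06, then q - 2 zero bytes, then 0x80
    return bytes([0x06]) + bytes(q - 2) + bytes([0x80])


# SHA3-224: c = 448, so r = 1600 - 448 = 1152 bits (144 bytes); empty message
padding = sha3_byte_padding(0, 1152)
print(len(padding), padding[:1].hex(), padding[-1:].hex())  # 144 06 80
```

For the empty message this reproduces the 144-byte block shown above.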

Biv
5

Your confusion comes from the SHA-3 domain-separation bits. The padding is 10*1, as specified; however, when you prepend the domain bits to the padding you get:

domain    result         used in
  01     M||0110*1     SHA3-224 to 512
  11     M||1110*1     RawSHAKE128 or 256
 1111    M||111110*1   SHAKE128 or 256
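To see where 0x06 comes from, the bit strings in the table can be packed into bytes. The sketch below assumes a byte-aligned message and the FIPS 202 convention (Sec. B.1) that the first bit of the string is the least significant bit of the byte; the helper name `pad_with_domain` is invented for this example.

```python
def pad_with_domain(message_bits: int, rate_bits: int, domain: str) -> bytes:
    """Domain bits followed by pad10*1, packed into bytes LSB-first.

    Hypothetical helper; assumes message_bits and rate_bits are multiples of 8,
    so the appended bits start and end on a byte boundary.
    """
    suffix = [int(b) for b in domain]
    # pad10*1: 1 || 0^j || 1, with j chosen so the total is a multiple of the rate
    j = (-(message_bits + len(suffix)) - 2) % rate_bits
    bits = suffix + [1] + [0] * j + [1]
    # First bit of each group of 8 is the least significant bit of the byte
    return bytes(
        sum(bit << k for k, bit in enumerate(bits[i:i + 8]))
        for i in range(0, len(bits), 8)
    )


# Empty message; rate 1152 for SHA3-224, 1344 for RawSHAKE128 and SHAKE128
for name, rate, domain in [("SHA3-224", 1152, "01"),
                           ("RawSHAKE128", 1344, "11"),
                           ("SHAKE128", 1344, "1111")]:
    p = pad_with_domain(0, rate, domain)
    print(name, f"{p[0]:#04x}", f"{p[-1]:#04x}")
# SHA3-224 0x06 0x80
# RawSHAKE128 0x07 0x80
# SHAKE128 0x1f 0x80
```

So 0x06 is just the bits 0, 1, 1 (domain 01 plus the first 1 of the padding) read with the least significant bit first.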
Gregor y
0

In Section 5, the padding function pad10*1, P = 1 || 0^j || 1, serves two purposes: it is the form used in the examples, and it defines the multi-rate padding for a bit-aligned message.

However, messages in SHA-3 are typically byte-aligned, so the multi-rate padding is given again in byte form in Appendix B.2.

tyskin