
Section B.2 of the FIPS 202 specification describes the Hexadecimal Form of Padding Bits, which is basically the translation of the bit padding 0110*1 into hex. However, at the beginning they assume that the message is "byte-aligned, i.e., len(M) = 8m", so what should I do if it is not (e.g. when the number of hex digits is odd)? In section B.1 they also assume the hexadecimal string to be 2m digits long, but what if its length is odd?

If we have "abcd" we pad with ||060*80 so that the first line of the Keccak's state is $[00\ ..\ 00\ 06\ cd\ ab]$ and that's ok.

However, if we have "abc" and do the same, we end up with abc||060*80, which gives a state whose first line is $[00\ ..\ 00\ 06\ c0\ ab]$, which is not the right padding. I also tried to append just 60*80 and then swap the 6 with the last digit of the original string, so \begin{equation*} ab\,c \rightarrow ab\,c6\,0*80 \rightarrow ab\,6c\,0*80 \rightarrow [00 \ ..\ 00\ 00\ 6c\ ab] \end{equation*} but this also gives the wrong hash compared to an online SHA3 calculator. So what is the right way to do it?

IAmUser

1 Answer


The bit padding of the KECCAK family

Actually, the padding is defined at the bit level, for the general case, in section 5.1:

Algorithm 9: pad10*1(x, m)

Input: positive integer x; non-negative integer m.

Output: string P such that m + len(P) is a positive multiple of x.

Steps:

1. Let j = (-m - 2) mod x.
2. Return P = 1 || 0^j || 1.

and also noted as

Therefore this is at least a 2-bit padding. If the message length satisfies $m \equiv 0$ or $m \equiv -1 \pmod r$, then a new block will be required.

Here $x$ is set to the rate: $x = r$.
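Algorithm 9 can be sketched in a few lines of Python. This is only an illustration: the function name `pad10star1` and the choice to represent a bit string as a list of 0/1 integers are assumptions made here, not something FIPS 202 prescribes.

```python
def pad10star1(x, m):
    """Sketch of Algorithm 9: return padding bits P (as a list of 0/1)
    such that m + len(P) is a positive multiple of x."""
    if x <= 0 or m < 0:
        raise ValueError("x must be positive and m non-negative")
    j = (-m - 2) % x          # Python's % is already non-negative for x > 0
    return [1] + [0] * j + [1]


# The padding is never shorter than 2 bits.
print(len(pad10star1(1152, 1150)))  # 2
# m = 0 (mod r) and m = -1 (mod r) are the cases that need a new block.
print(len(pad10star1(1152, 0)))     # 1152
print(len(pad10star1(1152, 1151)))  # 1153
```

The rate 1152 used in the calls is the SHA3-224 rate (1600 - 448).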


The suffix and padding for $\operatorname{SHA3-224}$

Keccak uses domain separation between the SHA3, RawSHAKE, and SHAKE series. For SHA3-x, the message is first suffixed with 01 and then the padding is applied (section 6.1):

$$M||\underbrace{01}_{\text{suffix}}||\underbrace{10^*1}_{\text{padding bits}}$$

$$\operatorname{SHA3-224}(M) = \operatorname{KECCAK}[448] (M || 01, 224);$$

In this case, the minimum number of added bits is 4: 2 from the suffix and at least 2 from the padding.
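As a rough check of that count, the sketch below returns the bits that would be appended to a message of a given bit length: the 01 suffix followed by the 10*1 padding. The helper name `sha3_appended_bits` and the default rate of 1152 bits (SHA3-224, i.e. 1600 - 448) are choices made for this illustration.

```python
def sha3_appended_bits(message_bit_len, rate=1152):
    """Bits appended to a SHA3 message: suffix 01, then pad10*1.
    The default rate 1152 corresponds to SHA3-224 (assumption of this sketch)."""
    suffix = [0, 1]
    m = message_bit_len + len(suffix)   # length seen by pad10*1
    j = (-m - 2) % rate
    return suffix + [1] + [0] * j + [1]


# Best case: exactly 4 free bits are left in the block.
print(len(sha3_appended_bits(1148)))  # 4
# One bit longer and the padding spills into a new block.
print(len(sha3_appended_bits(1149)))  # 1155
```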

Byte padding fix for messages with an odd number of hex digits

The asker commented: "but to do a bit padding I first need to convert the input from hex to binary. To do so I could use Algorithm 10 in section B1 but this works only with hex string of even length so it cannot be used so that's why I'm asking for detail about the hex padding".

Since the encoding translates $\texttt{abc||060*80}$ to $[80\ ..\ 00\ 06\ c0\ ab]$, the obvious solution is to use $\texttt{ab6c||0*80}$, which is translated to $[80\ ..\ 00\ 00\ 6c\ ab]$ as required.

The confusion comes from the fact that NIST defines the message as

"message: A bit string of any length that is the input to a SHA-3 function."

So the bit padding is naturally applied to a bit string. A hex-valued string must first be converted with Algorithm 10: h2b(H, n), but the document implicitly defines the byte padding as being applied before the conversion. After the conversion, it is still the bit padding.
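The following sketch shows how the bit-level route produces the $\texttt{ab6c}$ byte. It follows the bit ordering of Algorithms 10 and 11 in Appendix B.1 (least significant bit of each byte first), but the function names, the list-of-bits representation, and above all the reading of the odd-length string "abc" as the 12-bit message h2b("ab0c", 12), with the trailing digit c in the low nibble of the last byte, are assumptions made here to match the answer. It only builds the padded block; it does not compute a hash.

```python
def h2b(hex_string, n):
    """Hex string (even length) to the first n bits, LSB of each byte first."""
    data = bytes.fromhex(hex_string)
    bits = [(byte >> j) & 1 for byte in data for j in range(8)]
    return bits[:n]


def b2h(bits):
    """Bit list back to hex, zero-filling the last byte if needed."""
    padded = bits + [0] * (-len(bits) % 8)
    out = bytearray()
    for i in range(0, len(padded), 8):
        out.append(sum(bit << j for j, bit in enumerate(padded[i:i + 8])))
    return out.hex()


def sha3_padded_hex(hex_string, n, rate=1152):
    """Append the SHA3 suffix 01 and pad10*1 to an n-bit message.
    rate=1152 is the SHA3-224 rate (assumption of this sketch)."""
    bits = h2b(hex_string, n) + [0, 1]
    j = (-len(bits) - 2) % rate
    bits += [1] + [0] * j + [1]
    return b2h(bits)


# Byte-aligned message "abcd": padding starts with 06 and ends with 80.
print(sha3_padded_hex("abcd", 16)[:6])   # abcd06
# 12-bit message, "abc" read as ab0c: suffix and first pad bit share the byte.
print(sha3_padded_hex("ab0c", 12)[:4])   # ab6c
print(sha3_padded_hex("ab0c", 12)[-2:])  # 80
```

If the trailing digit is instead meant as the high nibble of the last byte, the 12-bit message is a different bit string and the padded block changes accordingly, so the convention has to be fixed before comparing against any reference implementation.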


Note: the $\phantom{a}^*$ represents the Kleene star, which simply means zero or more repetitions.

kelalaka
  • I got this part but how does this translate with hex input? – IAmUser Nov 25 '20 at 20:54
  • You mean the message is hex? You see it has at least 4 bits. You can align them to the multiple of $r$ by increasing the 0s. – kelalaka Nov 25 '20 at 21:04
  • Also, you can use the test vectors – kelalaka Nov 25 '20 at 21:28
  • but to do a bit padding I first need to convert the input from hex to binary. To do so I could use Algorithm 10 in section B1 but this works only with hex string of even length so it cannot be used so that's why I'm asking detail about the hex padding – IAmUser Nov 25 '20 at 21:31
  • So, you want to use the byte-aligned form but you are not byte-aligned! The standard (Hexadecimal Form of Padding Bits) doesn't apply then. You don't need to convert from hex to binary, you only need to build the padding and convert the result into hex. – kelalaka Nov 25 '20 at 21:43
  • @Alessio what is the website? I tested one and 0A and A are same! so, how reliable? – kelalaka Nov 25 '20 at 22:06
  • ok so if the input is not byte aligned I am forced to convert to binary and then pad. The fact is that for the moment our program works fine with any text input and even hex input by doing hex padding, so I thought there was a solution for odd length hex input without having to convert to binary. – IAmUser Nov 26 '20 at 07:16
  • at first we used this site since it shows every state (and the padding also) but we think sometimes the padding is wrong. We also saw the one you sent and it seems correct but sadly it just shows the result so it's a bit tricky to understand how it works – IAmUser Nov 26 '20 at 07:21
  • "ok so if the input is not byte aligned I am forced to convert to binary and then pad" the input is already in binary if everything is normal. You've hopefully just chosen to represent that binary using an (odd) number of hexadecimal digits. Note that hexadecimals are just there to make the binary easier to interpret by us humans - any good implementation of a primitive only contains hexadecimals for constants (yes, including those for padding if the input is byte aligned) and only prints out hexadecimals for debugging purposes, if at all. – Maarten Bodewes Nov 26 '20 at 09:13
  • @Alessio you need to test with a certified code, see from the Keccak team's software – kelalaka Nov 26 '20 at 09:16
  • ok I'll look at the certified implementation and maybe working in binary straight from the start is the best, thanks – IAmUser Nov 27 '20 at 16:46
  • @Alessio let us know the result. Could we close the question by accepting? – kelalaka Nov 27 '20 at 18:03
  • @kelalaka yes. My conclusion is that you simply can neither convert to binary nor pad a hex string of odd length, so the version I'm implementing will do like here, which lets you enter binary, text and only even length hex strings. I still don't get what other online tools do (I'm pretty sure this is wrong, but this one is still a mystery). I'll check the certified code you sent in more detail. Thank you both for your time – IAmUser Nov 28 '20 at 07:36