> Are the results of these two scenarios the same?
For $\mathrm{SHAKE256}$, both cases result in exactly the same final internal state.
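This is easy to check in Python's `hashlib`, which exposes $\mathrm{SHAKE256}$ directly (a minimal sketch):

```python
import hashlib

# One-shot absorption of the full message.
one_shot = hashlib.shake_256(b"ABC")

# Incremental absorption of the same byte stream.
incremental = hashlib.shake_256()
incremental.update(b"A")
incremental.update(b"B")
incremental.update(b"C")

# Both objects have absorbed identical bytes, so their internal
# states are identical & every squeeze length yields identical output.
assert one_shot.digest(64) == incremental.digest(64)
```

The sponge construction only sees the concatenated byte stream; it has no notion of how many `update` calls delivered it.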
> Are there any quirks I should know about?
Hash & XOF functions should map distinct inputs to distinct internal states & output digests, reflecting each input's canonical meaning. However, if the encoding of the inputs is ambiguous, two conceptually different inputs can produce the identical byte stream, & therefore the identical digest.
If $A$, $B$ and $C$ are truly different conceptual things, then the typical way to avoid canonicalization issues is to do:
$$ \mathrm{L} = \mathrm{len}((A, B, C)) $$
$$ \mathrm{\bf{xof}} = \mathrm{SHAKE256}(L \space || \space \mathrm{len}(A) \space || \space A \space || \space \mathrm{len}(B) \space || \space B \space || \space \mathrm{len}(C) \space || \space C) $$
$$ \mathrm{result} = \mathrm{\bf{xof}}{\mathrm{.digest}}(x) $$
which is exactly the same as:
$$ \mathrm{\bf{xof}} = \mathrm{SHAKE256}(\mathrm{len}((A, B, C))) $$
$$ \mathrm{\bf{xof.}}\mathrm{update}(\mathrm{len}(A) \space || \space A) $$
$$ \mathrm{\bf{xof.}}\mathrm{update}(\mathrm{len}(B) \space || \space B) $$
$$ \mathrm{\bf{xof.}}\mathrm{update}(\mathrm{len}(C) \space || \space C) $$
$$ \mathrm{result} = \mathrm{\bf{xof}}{\mathrm{.digest}}(x) $$
Prepending each item's length metadata & the number of items in the whole collection removes the ambiguity. A canonical encoding is one which can be uniquely, completely & unambiguously decoded back into its original meaning. This is similar to what's done in the TupleHash construction from NIST SP 800-185.
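A minimal Python sketch of such an encoder. The fixed 8-byte big-endian length fields are an arbitrary choice for illustration; TupleHash itself uses a different, variable-length encoding:

```python
import hashlib

def canonicalize(*items: bytes) -> bytes:
    # Prepend the item count, then length-prefix each item
    # (8-byte big-endian lengths), so the resulting byte stream
    # decodes back to exactly one tuple of items.
    out = len(items).to_bytes(8, "big")
    for item in items:
        out += len(item).to_bytes(8, "big") + item
    return out

# Without canonicalization, different splits produce the same bytes:
assert b"AB" + b"C" == b"A" + b"BC"

# With it, each split absorbs a distinct byte stream:
d1 = hashlib.shake_256(canonicalize(b"AB", b"C")).digest(32)
d2 = hashlib.shake_256(canonicalize(b"A", b"BC")).digest(32)
assert d1 != d2
```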
Another related quirk, which arises when using hash & XOF objects as entropy pools, is temporal & caller context confusion. For instance, say there's an XOF object like this, which is shared across multiple threads:
$$\mathrm{\bf{xof}} = \mathrm{SHAKE256}(\textrm{sufficient_initial_entropy})$$
And there are three threads using that object, $\mathrm{thread}_A, \mathrm{thread}_B, \mathrm{thread}_C$. Each thread seeds $\mathrm{\bf{xof}}$ prior to retrieving a digest to use as entropic material. They each expect the following to happen:
$$ \mathrm{\bf{xof.}}\mathrm{update}(\mathrm{seed}_A) \tag{$\mathrm{thread}_A$ sees} $$
$$ \mathrm{entropy}_A = \mathrm{\bf{xof}}{\mathrm{.digest}}(x) $$
$$ \mathrm{\bf{xof.}}\mathrm{update}(\mathrm{seed}_B) \tag{$\mathrm{thread}_B$ sees} $$
$$ \mathrm{entropy}_B = \mathrm{\bf{xof}}{\mathrm{.digest}}(x) $$
$$ \mathrm{\bf{xof.}}\mathrm{update}(\mathrm{seed}_C) \tag{$\mathrm{thread}_C$ sees} $$
$$ \mathrm{entropy}_C = \mathrm{\bf{xof}}{\mathrm{.digest}}(x) $$
But, the real execution order looks like this:
$$ \mathrm{\bf{xof.}}\mathrm{update}(\mathrm{seed}_B) \tag{$\mathrm{thread}_B$ sees} $$
$$ \mathrm{\bf{xof.}}\mathrm{update}(\mathrm{seed}_A) \tag{$\mathrm{thread}_A$ sees} $$
$$ \mathrm{\bf{xof.}}\mathrm{update}(\mathrm{seed}_C) \tag{$\mathrm{thread}_C$ sees} $$
$$ \mathrm{entropy}_A = \mathrm{\bf{xof}}{\mathrm{.digest}}(x) \tag{$\mathrm{thread}_A$ sees} $$
$$ \mathrm{entropy}_C = \mathrm{\bf{xof}}{\mathrm{.digest}}(x) \tag{$\mathrm{thread}_C$ sees} $$
$$ \mathrm{entropy}_B = \mathrm{\bf{xof}}{\mathrm{.digest}}(x) \tag{$\mathrm{thread}_B$ sees} $$
Each thread intended to perform incremental absorption & squeeze out its own distinct intermediate digest. But it turns out, they wound up doing all-at-once absorption & each squeezed out the identical final digest. Yikes.
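The failure mode is easy to reproduce sequentially with `hashlib` (a sketch; real code would interleave these calls via actual threads):

```python
import hashlib

xof = hashlib.shake_256(b"sufficient_initial_entropy")

# All three seeds happen to be absorbed before any thread squeezes:
xof.update(b"seed_A")
xof.update(b"seed_B")
xof.update(b"seed_C")

# hashlib's digest() does not advance the sponge state, so every
# thread squeezes the same bytes from the same final state.
entropy_A = xof.digest(32)
entropy_C = xof.digest(32)
entropy_B = xof.digest(32)
assert entropy_A == entropy_B == entropy_C
```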
There are various mitigations for this, such as:
- Making a copy of $\mathrm{\bf{xof}}$ prior to seeding & retrieval within each calling thread.
- Including with each seed temporal & caller metadata as domain separators.
- Using different domain constants when updating the global $\mathrm{\bf{xof}}$ from when updating its copies.
Here's an example:
$$ \mathrm{\bf{xof}}_A = \mathrm{\bf{xof.}}\mathrm{copy}() \tag{$\mathrm{thread}_A$ sees} $$
$$ \mathrm{seed}_{A^*} = \mathrm{\bf{canonicalize}}( \textrm{thread_id}_A, \space \textrm{monotonic_counter}(), \space \textrm{time_now_ns}(), \space \mathrm{seed}_A ) $$
$$ \mathrm{\bf{xof.}}\mathrm{update}(\textrm{\bf{canonical_pad_to_blocksize}}(\text{$``$xof-global-update''}, \space \mathrm{seed}_{A^*})) $$
$$ \mathrm{\bf{xof}}_A.\mathrm{update}(\textrm{\bf{canonical_pad_to_blocksize}}(\text{$``$xof-copy-update''}, \space \mathrm{seed}_{A^*})) $$
$$ \mathrm{entropy}_A = \mathrm{\bf{xof}}_A{\mathrm{.digest}}(x) $$
Since $\mathrm{\bf{xof}}_A$ is $\mathrm{thread}_A$'s independent copy of the global entropy pool, as long as its seed is unique, it is overwhelmingly likely that its internal state will differ from every other thread's. Here, the four values $( \textrm{thread_id}_A,$ $\space \textrm{monotonic_counter}(),$ $\space \textrm{time_now_ns}(),$ $\space \mathrm{seed}_A )$, & the domain (customization) strings, provide ample unique temporal, caller & function context to mitigate the issue described above. However, the specifics of what's needed to uniquely identify a calling context can vary across applications & environments.
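The steps above can be sketched in Python. The helper names (`canonicalize`, `canonical_pad_to_blocksize`, `fresh_entropy`), the 8-byte length fields, & the choice of 136 bytes (the SHAKE256 rate) as the padding block size are all illustrative assumptions; real multithreaded code would also need a lock around the shared pool's `update`:

```python
import hashlib
import itertools
import threading
import time

_counter = itertools.count()  # process-wide monotonic counter (hypothetical)

def canonicalize(*items: bytes) -> bytes:
    # Prepend the item count, then length-prefix each item
    # (8-byte big-endian lengths), as described earlier.
    out = len(items).to_bytes(8, "big")
    for item in items:
        out += len(item).to_bytes(8, "big") + item
    return out

def canonical_pad_to_blocksize(domain: bytes, payload: bytes, block: int = 136) -> bytes:
    # Hypothetical helper: canonicalize the domain string & payload, then
    # zero-pad to a multiple of the SHAKE256 rate (136 bytes) so each
    # update fills whole sponge blocks.
    enc = canonicalize(domain, payload)
    return enc + b"\x00" * (-len(enc) % block)

def fresh_entropy(global_xof, seed: bytes, n: int = 32) -> bytes:
    # Bind the seed to unique temporal & caller context.
    seed_star = canonicalize(
        threading.get_ident().to_bytes(8, "big"),
        next(_counter).to_bytes(8, "big"),
        time.monotonic_ns().to_bytes(8, "big"),
        seed,
    )
    # The thread's independent copy of the global entropy pool.
    xof_local = global_xof.copy()
    # Distinct domain strings for the global pool & the local copy.
    global_xof.update(canonical_pad_to_blocksize(b"xof-global-update", seed_star))
    xof_local.update(canonical_pad_to_blocksize(b"xof-copy-update", seed_star))
    return xof_local.digest(n)
```

Because the counter & timestamp differ on every call, even two calls with the same `seed` from the same thread squeeze from different states.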