
Assuming that $m$ is a multiset of bitstrings where all bitstrings have the same length, let $D(m)$ denote the number of distinct elements in $m$. That is, $D(m)$ is equal to the dimension of $m$. For example, if $$m = \{00, 10, 11, 10, 11\},$$ then $D(m)=3$.
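As a minimal sketch of this definition (the function name `D` mirrors the notation above, and representing the multiset as a Python list of strings is an assumption made purely for illustration):

```python
from collections import Counter

def D(m):
    """Number of distinct elements in the multiset m (any iterable of bitstrings)."""
    # Counter keeps one key per distinct element, so its length is D(m).
    return len(Counter(m))

m = ["00", "10", "11", "10", "11"]
print(D(m))  # prints 3
```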

Let $F(x) = \text{Keccak-}f[1600](x)$, the block permutation function of SHA-3 (for $64$-bit words). We can define the following notation: $$\begin{array}{l} {F^0(x)} = x,\\ {F^1(x)} = F(x),\\ {F^2(x)} = F(F(x)),\\ {F^3(x)} = F(F(F(x))),\\ \ldots \end{array}$$

Assuming that $A$ and $B$ are two different natural numbers greater than or equal to $0$, let $G_{A, B}(x)$ denote a function defined as $$G_{A, B}(x) = F^A(x) \oplus F^B(x),$$

where $x$ denotes a $1600$-bit input and $\oplus$ denotes an XOR operation.

Assuming that $L = 2^{1600}$, let $S_i$ denote the $i$-th bitstring in the lexicographically ordered list of all possible $1600$-bit inputs:
$$\begin{array}{l} S_1 = 0^{1600},\\ S_2 = 0^{1599}1,\\ \ldots,\\ S_{L-1} = 1^{1599}0,\\ S_L = 1^{1600}.\\ \end{array}$$

Let $A$ and $B$ denote two arbitrarily large, but different natural numbers (one of them is allowed to be equal to $0$). For example, $$A = 0, B = 1$$ or $$A = 2^{3456789}, B = 9^{876543210}$$ are valid pairs.

Then

$$\begin{array}{l} S_{A, B}[i] = G_{A, B}(S_i),\\ C_{A, B} = \{S_{A, B}[1], S_{A, B}[2], \ldots, S_{A, B}[L-1], S_{A, B}[L]\}.\\ \end{array}$$

The question: can we assume that $D(C_{A, B})$ is expected to be approximately equal to $$(1-1/e) \times 2^{1600} \approx 2.810560755\ldots \times 10^{481}$$ for all (or almost all) pairs of $A$ and $B$?
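The magnitude of this target value can be checked with arbitrary-precision arithmetic. The following is only a sketch: the helper name `expected_distinct` is hypothetical, and the 60-digit working precision is an arbitrary choice.

```python
from decimal import Decimal, getcontext

getcontext().prec = 60  # working precision, well beyond the digits quoted above

def expected_distinct(n_bits):
    """Return (1 - 1/e) * 2**n_bits as a Decimal."""
    one_minus_inv_e = 1 - 1 / Decimal(1).exp()
    return one_minus_inv_e * (Decimal(2) ** n_bits)

value = expected_distinct(1600)
# Scientific notation with nine digits after the decimal point.
print(f"{value:.9E}")
```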

lyrically wicked
  • $F$ is a permutation, so you can use $y = F(x)$ and simplify $G$ using $G'(y) = y \oplus F(y)$. Because $F$ is bijective, the number of possible $x$ is the same as the number of possible $y$. – Future Security Jun 14 '18 at 17:09
  • What does the notation $\text{Keccak-}f[1600]$ mean? – kodlu Jun 15 '18 at 00:30
  • @FutureSecurity: Of course, $F(x)$ and $G(x)$ have equal number of possible inputs (they operate on 1600-bit blocks). Basically, we are xoring a 1600-bit block with another (almost independently pseudo-random) 1600-bit block. I think that this leads to $(1-1/e)\times 2^{1600}$ different 1600-bit blocks, so I am asking the question to verify this. – lyrically wicked Jun 15 '18 at 04:22
  • @kodlu: $\text{Keccak-}f[1600]$ is the underlying function of SHA-3. It transforms any 1600-bit input to a 1600-bit output. – lyrically wicked Jun 15 '18 at 04:25
  • Perhaps you should define it as $G_{A,B}(x)$, and then you are asking $|G_{A,B}(\cdot)|$, in other words, how many unique $G$ functions are there? – MotiNK Aug 20 '18 at 09:59
  • @MotiN: the question is about the number of different outputs that any function $G_{A, B}(x)$ has. – lyrically wicked Aug 21 '18 at 06:38

1 Answer


Let $\pi$ and $\sigma$ be two independent uniform random permutations of $n$-bit strings, and $f$ a uniform random function on the same domain. Writing $+$ for bitwise XOR, the best advantage of any $q$-query algorithm to distinguish $\pi + \sigma$ from $f$ is bounded by $(q/2^n)^{1.5}$ [1]. In this case, the expected fraction of distinct outputs of $\pi + \sigma$ can't be too far from the expected fraction of distinct outputs of $f$, which is $1 - e^{-1} \approx 63\%$.
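The $1 - e^{-1}$ figure follows from a short calculation, sketched here under the assumption that $f$ is a uniform random function on $N = 2^n$ points. A fixed output value is missed by every one of the $N$ inputs with probability $(1 - 1/N)^N$, so by linearity of expectation the expected number of distinct outputs $D$ is $$\mathbb{E}[D] = N\left(1 - \left(1 - \frac{1}{N}\right)^{N}\right) \approx N\left(1 - e^{-1}\right) \quad \text{for large } N.$$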

What about $\sigma = \pi^2$, or $\sigma = \pi^k$ for $k > 2$? Then $\pi$ and $\sigma$ are not independent. Nevertheless, it would be rather surprising if this situation were substantially different.
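A toy experiment can make this plausible, though it proves nothing about Keccak itself. The sketch below assumes a stand-in for $\pi$: a pseudorandom permutation of 16-bit values produced by Python's `random.shuffle`, not $\text{Keccak-}f[1600]$. The helper name `distinct_fraction` is hypothetical.

```python
import math
import random

def distinct_fraction(n_bits, k, seed=0):
    """Fraction of distinct values of pi(x) XOR pi^k(x) for a toy permutation pi."""
    size = 1 << n_bits
    rng = random.Random(seed)
    pi = list(range(size))
    rng.shuffle(pi)  # toy stand-in for a uniform random permutation
    outputs = set()
    for x in range(size):
        z = x
        for _ in range(k):  # z becomes pi^k(x)
            z = pi[z]
        outputs.add(pi[x] ^ z)
    return len(outputs) / size

# Compare the observed fraction with 1 - 1/e for a few exponents k.
for k in (2, 3, 5):
    print(k, distinct_fraction(16, k), 1 - math.exp(-1))
```

At this scale the observed fractions should sit close to $1 - e^{-1}$; the experiment says nothing rigorous about the 1600-bit case.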

What about $\pi^{2^{3456789}} + \pi^{2^{987654321}}$ instead of $\pi + \pi^2$? This has the same outputs as $\pi + \pi^{2^{987654321} - 2^{3456789} + 1}$, because substituting $y = \pi^{2^{3456789} - 1}(x)$ only relabels the inputs. It's not clear why you would be worried about uncomputably large exponents like this unless you were flailing around without principle trying to make a design that looks complicated.
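The relabeling argument can be checked on a small example. Since $\pi^A$ is a bijection, the multiset of values $\pi^A(x) \oplus \pi^B(x)$ over all $x$ coincides with that of $y \oplus \pi^{B-A}(y)$ over all $y$ (for $A < B$). The sketch below assumes a toy 10-bit permutation in place of $\pi$; the helper names `power` and `outputs` are hypothetical.

```python
import random

def power(pi, k):
    """Return the permutation pi^k as a list, by repeated composition."""
    result = list(range(len(pi)))  # pi^0 is the identity
    for _ in range(k):
        result = [pi[v] for v in result]
    return result

def outputs(pi, a, b):
    """Sorted multiset of pi^a(x) XOR pi^b(x) over all inputs x."""
    pa, pb = power(pi, a), power(pi, b)
    return sorted(u ^ v for u, v in zip(pa, pb))

rng = random.Random(1)
pi = list(range(1 << 10))
rng.shuffle(pi)  # toy stand-in for the permutation

# Shifting both exponents by the same amount leaves the output multiset unchanged.
print(outputs(pi, 3, 7) == outputs(pi, 0, 4) == outputs(pi, 1, 5))  # prints True
```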

Squeamish Ossifrage
  • $\pi \oplus \pi^2$ is what we called single-permutation EDMD. We conjectured it is about as indistinguishable as the two-permutation case, but that is much harder to show. The $\pi + \sigma$ case is expected to have slightly more collisions than a random function: $\pi(x) \oplus \sigma(x) = \pi(y) \oplus \sigma(y)$ implies $\pi(x) \oplus \pi(y) = \sigma(x) \oplus \sigma(y)$, neither side of which can be 0. But this does not move the needle in any significant way. – Samuel Neves May 24 '19 at 00:08