
This is a follow-up to the question "How efficient are the generic attacks regarding near-collision-resistance?"

Let $H:\{0,1\}^*\to\{0,1\}^n$ be a cryptographically secure hash function, and let $k\in \mathbb{N}$ with $0 \leq k \leq n$. Without further details, how much effort is needed to find two strings $x_1,x_2$ such that $\Delta(H(x_1),H(x_2))\leq k$ holds? Here $\Delta(\cdot,\cdot)$ denotes the Hamming distance.

The accepted answer states that it takes $\displaystyle \sqrt{\frac{1}{p}}$ hashes, where $p=\displaystyle \frac{\sum_{i=0}^{k}{n\choose i}}{2^n}$, to obtain a near-collision pair with probability $1/2$. Why is that?

I tried to work out the math using the birthday paradox. I get that the minimum number of hashes one has to compute is something like:

$$\frac{2^{n/2}}{\sqrt{\sum_{i=0}^k{n\choose i}\cdot 2^i}}$$

Where did I go wrong?

Maarten Bodewes
DiamondDuck

1 Answer


OK, so let's go from start to finish on this.

Note that for a hash function we can model the output bits as independent, uniformly distributed binary variables, so the number of positions in which two digests differ follows a binomial distribution. The probability that exactly $k$ bits don't match is $\binom{n}{k}2^{-(n-k)}2^{-k}=\binom{n}{k}2^{-n}$ (intuitively, we aggregate all allowed mismatch patterns here; each pattern is one fixed $n$-bit difference with probability $2^{-n}$, so there is no additional factor of $2^i$).

Now we want at most $k$ bits not to match, so we sum over $i=0,\dots,k$ and pull out the constant factor $2^{-n}$:

$$p=\sum_{i=0}^{k}\binom{n}{i}2^{-n}=2^{-n}\sum_{i=0}^{k}\binom{n}{i}=\frac{\sum_{i=0}^{k}\binom{n}{i}}{2^n}$$

Finally we can use said birthday argument to find the number of messages needed to be

$$\sqrt{\frac1p}=\sqrt{\frac{2^n}{\sum_{i=0}^{k}\binom{n}{i}}}=\frac{2^{n/2}}{\sqrt{\sum_{i=0}^{k}\binom{n}{i}}}$$

which is the expected result.
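To make the counting concrete, here is a minimal Python sketch (not part of the original argument) that evaluates $p$ and $\sqrt{1/p}$ and, for small $n$, checks $p$ by enumerating every $n$-bit string against a fixed reference digest. The function names are purely illustrative, and the brute-force check assumes the idealized model above in which every digest is equally likely.

```python
from math import comb, sqrt

def near_collision_probability(n, k):
    """p = sum_{i=0}^{k} C(n, i) / 2^n: chance that two uniform n-bit
    digests differ in at most k positions."""
    return sum(comb(n, i) for i in range(k + 1)) / 2 ** n

def birthday_estimate(n, k):
    """Rough number of hashes, sqrt(1/p), from the birthday argument."""
    return sqrt(1 / near_collision_probability(n, k))

def brute_force_probability(n, k):
    """Count the n-bit strings within Hamming distance k of a fixed
    reference (all zeros) and divide by 2^n. Only feasible for small n."""
    reference = 0
    close = sum(
        1 for y in range(2 ** n)
        if bin(reference ^ y).count("1") <= k
    )
    return close / 2 ** n

if __name__ == "__main__":
    n, k = 8, 2
    print(near_collision_probability(n, k))
    print(brute_force_probability(n, k))
    print(birthday_estimate(n, k))
```

For $n=8$ and $k=2$ the Hamming ball contains $1+8+28=37$ strings, so both probability functions agree on $37/256$; a count with an extra factor of $2^i$ per term would not match the enumeration.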

Note that, as mentioned in this answer, this result can also be proven more rigorously, which was done in "Memoryless Near-Collisions via Coding Theory" by Mario Lamberger, Florian Mendel, Vincent Rijmen and Koen Simoens (PDF).

SEJPM