
I have some questions about the birthday attack chapter in Introduction to Modern Cryptography.

When $q=\Theta(2^{l/2})$ the probability of this collision is roughly $1/2$

What is the meaning of $\Theta(\cdot)$ and $\Theta(2^{l/2})$, and why is the probability of this collision roughly $1/2$ when $q=\Theta(2^{l/2})$?

Thanks in advance.

kelalaka

2 Answers


We say $f(n)=\Theta(g(n))$ if there are constants $0<c\leq C<\infty$ such that $$ cg(n)\leq f(n)\leq Cg(n) $$ as $n\rightarrow \infty.$
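As a small worked example (the function $f$ below is made up purely for illustration), take $f(\ell)=3\cdot 2^{\ell/2}+5$. Since $5\leq 2^{\ell/2}$ once $\ell\geq 5$, $$ 3\cdot 2^{\ell/2}\leq f(\ell)\leq 4\cdot 2^{\ell/2}\quad\text{for all }\ell\geq 5, $$ so $f(\ell)=\Theta(2^{\ell/2})$ with $c=3$ and $C=4$. The additive constant and the leading factor do not matter; only the growth rate $2^{\ell/2}$ does.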

Apply the $\ell$-bit hash function to $k$ randomly chosen inputs. Let $n=2^{\ell}.$

The chance that two picked values are distinct is ${n- 1 \over n}$, because when picking the second value only $n-1$ unused values are left in the range. Repeating this argument, the chance of picking $k$ distinct values is:

$${n - 1 \over n} \times {n- 2 \over n} \times \cdots \times {n- (k- 1) \over n}.$$

This is exactly the same product as in the birthday paradox, where $n=365.$

Now $1-x \leq e^{-x}$ and hence $1-(v/n) \leq e^{-v/n}$ for $1\leq v\leq n,$ and thus the probability of no collisions is at most $$e^{-{1 \over n}} \times e^{-{2 \over n}} \times \cdots \times e^{-{k - 1 \over n}} =\exp\left\{-{1 + 2 + \cdots + (k-1) \over n}\right\}$$ which equals $$e^{-{k(k-1)/2 \over n}} = e^{-{k(k-1) \over 2n}} $$

The chance of a collision is therefore at least 1 minus this quantity. Plugging in $k=\sqrt{n}=\sqrt{2^{\ell}}=2^{\ell/2}$ gives, for large $n$, a lower bound of about $1-e^{-1/2}\approx 0.39,$ which is not too far from $1/2.$
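To make the bound concrete, here is a minimal Python sketch that compares the exact no-collision product with the exponential upper bound at $k=\sqrt{n}$. The function names and the choice $\ell=20$ are illustrative assumptions, not part of the book or the derivation above.

```python
import math


def no_collision_probability(k: int, n: int) -> float:
    """Exact probability that k uniform samples from n values are all distinct."""
    p = 1.0
    for v in range(1, k):
        p *= (n - v) / n  # factor (n - v)/n from the product above
    return p


def no_collision_upper_bound(k: int, n: int) -> float:
    """Upper bound exp(-k(k-1)/(2n)) obtained from 1 - x <= e^{-x}."""
    return math.exp(-k * (k - 1) / (2 * n))


ell = 20                 # illustrative hash length in bits (an assumption)
n = 2 ** ell
k = math.isqrt(n)        # k = sqrt(n) = 2^(ell/2)

exact = no_collision_probability(k, n)
bound = no_collision_upper_bound(k, n)

print(f"exact collision probability:   {1 - exact:.4f}")
print(f"lower bound from the estimate: {1 - bound:.4f}")
```

Both printed values should come out close to $1-e^{-1/2}$, which shows that the inequality loses very little when $k$ is small compared with $n$.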

kodlu
  • Appreciate your accurate answer. – Simon Hu Aug 26 '19 at 07:45
  • BTW, forgive my limited math knowledge: how do you determine the value (i.e., $k=\sqrt{n}$) that yields a probability not too far from 1/2? Can you show me or give me some relevant links about that question? – Simon Hu Aug 26 '19 at 08:05

@kodlu gave an accurate answer; I will try to give one with less math. The $\Theta$ notation says that, asymptotically (i.e., for large inputs), two functions behave the same: each is bounded above and below by a constant multiple of the other. You may be more familiar with big O notation, which gives only an upper bound. (This page also defines Theta notation.) Informally, you can think of it as a way of saying "approximately" or "on the order of".

Throwing balls into $n$ bins, the probability of getting a collision passes 50% after approximately $\sqrt{n}$ balls. That statement is informal, though. For an $l$-bit hash function it means that the probability of a collision reaches one half after sampling on the order of $2^{l/2}$ values.

Formally, this happens after $\Theta({2^{l/2}})$ samples, which means there are constants $c_l$ and $c_h$ such that, for sufficiently large $l$, the probability reaches one half after more than $c_l \cdot 2^{l/2}$ and fewer than $c_h \cdot 2^{l/2}$ samples.

If you are just seeking a good approximation, use $\sqrt{2\cdot \ln(2) \cdot n}\approx 1.18\sqrt{n}.$
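As a quick sanity check of that approximation, the following Python sketch computes the exact collision probability at $k=\lceil\sqrt{2\ln(2)\,n}\rceil$ for a few hash lengths. The helper names and the tested bit lengths are assumptions chosen for illustration.

```python
import math


def collision_probability(k: int, n: int) -> float:
    """Exact probability of at least one collision among k uniform samples from n values."""
    p_distinct = 1.0
    for v in range(1, k):
        p_distinct *= (n - v) / n
    return 1.0 - p_distinct


def samples_for_half(n: int) -> int:
    """Approximate sample count for a 50% collision chance: sqrt(2 * ln(2) * n), rounded up."""
    return math.ceil(math.sqrt(2 * math.log(2) * n))


for ell in (16, 20, 24):  # illustrative hash lengths in bits
    n = 2 ** ell
    k = samples_for_half(n)
    ratio = k / math.sqrt(n)  # the constant hidden inside Theta(2^(l/2))
    print(f"l={ell}: k={k}, k/sqrt(n)={ratio:.3f}, "
          f"P(collision)={collision_probability(k, n):.4f}")
```

The ratio $k/\sqrt{n}$ stays near the same constant as $l$ grows, which is what $q=\Theta(2^{l/2})$ expresses: the threshold scales like $2^{l/2}$ up to a constant factor.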

Meir Maor