I'm now familiar with a lower bound for the birthday problem, as stated in Theorem A.16 of the Katz and Lindell book (alternatively, see this webpage).

Denote by $C(q,N)$ the probability of at least one collision when drawing $q$ elements independently and uniformly at random from a set of size $N$. Assuming $q \le \sqrt{2N}$, the bound is:

$C(q,N) \ge \frac{q(q-1)}{4N}$

However, the bound I was given in my class is (without the assumption on $q$):

$\forall N \in \mathbb{N}.\; C(q,N) \ge \frac{(q-1)^2}{2N}$

How can I prove that this bound is correct?

user1868607
  • Could you write the proof from your class? – kelalaka Dec 05 '18 at 10:35
  • @kelalaka unfortunately (to me) this was just stated and written to me by e-mail, i have no proof whatsoever, so what i'm expecting is to have a counterexample here. i will ask however more details myself – user1868607 Dec 05 '18 at 10:37
  • Note that the bound from class is weaker (that is the lower bound is higher) than the one from the book. – SEJPM Dec 05 '18 at 10:44
  • @SEJPM in that sense is a "better" lower bound right? that's why i ask – user1868607 Dec 05 '18 at 10:49

2 Answers

The question's $\displaystyle C(q,N) \ge \frac{(q-1)^2}{2N}$ bound is wrong. It is incompatible with the well-known fact that for $N=q^2$ with $q$ large, $C(q,N)\approx1-e^{-1/2}\approx39.3\%$, whereas the claimed bound then tends to $1/2$.
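A quick numerical check makes the counterexample concrete. This is only a sketch: the function names `collision_probability` and `class_bound` are mine, and the exact probability is computed from the standard product formula $1-\prod_{i=1}^{q-1}(1-i/N)$.

```python
import math

def collision_probability(q: int, n: int) -> float:
    """Exact probability that q independent uniform draws from n values collide."""
    # Probability that all q draws are distinct: prod_{i=1}^{q-1} (1 - i/n).
    no_collision = math.prod(1 - i / n for i in range(1, q))
    return 1 - no_collision

def class_bound(q: int, n: int) -> float:
    """The claimed lower bound (q-1)^2 / (2N) from the question."""
    return (q - 1) ** 2 / (2 * n)

# (q, N) pairs: two with N = q^2, and one with q = N + 1 (pigeonhole case).
for q, n in [(10, 100), (100, 10_000), (4, 3)]:
    exact = collision_probability(q, n)
    claimed = class_bound(q, n)
    print(f"q={q} N={n} exact={exact:.4f} claimed_lower_bound={claimed:.4f}")
```

Already for $q=10$, $N=100$ the exact probability is about $0.372$ while the claimed lower bound is $0.405$. For $q=4$, $N=3$ the claimed bound is $1.5$, which cannot be a lower bound on a probability.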

A correct upper bound is $\displaystyle C(q,N) \le \frac{(q-1)\,q}{2N}$, valid for all $q$ and $N\ge1$, and tight when $q\ll\sqrt N$.
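For reference, this follows from a union bound over the $\binom q2$ pairs of draws. In the sketch below, $x_1,\dots,x_q$ denote the draws (a name not used in the question):

$$C(q,N)=\Pr\Big[\bigcup_{1\le i<j\le q}\{x_i=x_j\}\Big]\le\sum_{1\le i<j\le q}\Pr[x_i=x_j]=\binom q2\frac1N=\frac{(q-1)\,q}{2N}.$$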

Another correct bound is $\displaystyle C(q,N) \ge \frac{(q-1)^2}{4N}$ when $1\le q\le\sqrt{2N}$, which follows from the Katz and Lindell bound quoted in the question, since $q(q-1)\ge(q-1)^2$.

See this answer for some derivations, but beware that the notation there is $(n,k)$ and $p_n$ where the question has $(q,N)$ and $C(q,N)$.

fgrieu

A larger lower bound is better. The Katz and Lindell book gives a correct bound. It is $$ (1-e^{-1})\frac{q(q-1)}{2N} \approx 0.316060 \frac{q(q-1)}{N}, $$ which they weaken further to $$ \frac{q(q-1)}{4N} $$ for simplicity.
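One way to see where the $(1-e^{-1})$ factor comes from is sketched below. It uses the standard inequalities $1-x\le e^{-x}$ and, by concavity, $1-e^{-x}\ge(1-e^{-1})\,x$ for $0\le x\le1$. I have not checked it against the book's exact wording, and the shorthand $x$ is mine.

$$1-C(q,N)=\prod_{i=1}^{q-1}\Big(1-\frac iN\Big)\le\prod_{i=1}^{q-1}e^{-i/N}=e^{-q(q-1)/(2N)}$$

$$C(q,N)\ge1-e^{-x}\ge(1-e^{-1})\,x\quad\text{with}\quad x=\frac{q(q-1)}{2N}\le1\ \text{ whenever }\ q\le\sqrt{2N}$$

Since $1-e^{-1}>\frac12$, the simplified bound $\frac{q(q-1)}{4N}$ follows.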

The bound you ask about, $$ 0.5 \frac{(q-1)^2}{N}, $$ is not weaker but actually stronger than even the stronger Katz and Lindell bound with the $(1-e^{-1})$ factor as $N,q$ grow, and I don't see how it can be correct, regardless of the value of $q.$
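To illustrate the comparison, here is a small sketch (the function names are mine) computing the ratio of the two lower bounds. Since $N$ cancels, the ratio is $\frac{q-1}{(1-e^{-1})\,q}$, which exceeds $1$ from $q=3$ onward and tends to $1/(1-e^{-1})\approx1.58$.

```python
import math

def katz_lindell_bound(q: int, n: int) -> float:
    """Lower bound (1 - 1/e) * q(q-1) / (2N), stated for q <= sqrt(2N)."""
    return (1 - math.exp(-1)) * q * (q - 1) / (2 * n)

def class_bound(q: int, n: int) -> float:
    """The bound asked about: (q-1)^2 / (2N)."""
    return (q - 1) ** 2 / (2 * n)

def ratio(q: int) -> float:
    """class_bound / katz_lindell_bound; N cancels out of the quotient."""
    return (q - 1) / ((1 - math.exp(-1)) * q)

for q in (2, 3, 10, 1000):
    # A ratio above 1 means the class bound claims more than the book's bound.
    print(q, round(ratio(q), 4))
```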

Yehuda Lindell
kodlu