PRF and pairwise independent hash function

Question

I'm confused about the concepts of a pairwise independent hash function and a pseudorandom function (PRF). They seem identical to me.

A family of hash functions $H=\{ h:U \to [m] \}$ is $k$-independent if for any $k$ distinct keys $(x_1, \dots, x_k) \in U^k$ and any $k$ hash codes (not necessarily distinct) $(y_1, \dots, y_k) \in [m]^k$, we have:

\begin{equation}\Pr_{h \in H} \left[ h(x_1)=y_1 \land \cdots \land h(x_k)=y_k \right] = m^{-k}\end{equation}

This definition is equivalent to the following two conditions:

1. For any fixed $x \in U$, as $h$ is drawn randomly from $H$, $h(x)$ is uniformly distributed in $[m]$.
2. For any fixed, distinct keys $x_1, \dots, x_k \in U$, as $h$ is drawn randomly from $H$, $h(x_1), \dots, h(x_k)$ are independent random variables.

These properties are exactly the properties of a PRF, aren't they? So what is the difference between these two definitions?
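For concreteness, the definition above can be checked by brute force on a toy family. The sketch below assumes the textbook family $h_{a,b}(x) = (ax + b) \bmod p$ with $U = [m] = \mathbb{Z}_p$ for a small prime $p$; the family, the names `h` and `is_pairwise_independent`, and the choice $p = 5$ are illustrative assumptions, not part of the question.

```python
from fractions import Fraction
from itertools import product


def h(a, b, x, p):
    """Member h_{a,b} of the assumed family: x -> (a*x + b) mod p."""
    return (a * x + b) % p


def is_pairwise_independent(p):
    """Check Pr_h[h(x1)=y1 and h(x2)=y2] == p**-2 for all distinct x1, x2."""
    keys = list(product(range(p), repeat=2))  # every (a, b), drawn uniformly
    target = Fraction(1, p * p)
    for x1, x2 in product(range(p), repeat=2):
        if x1 == x2:
            continue  # the definition only constrains distinct keys
        for y1, y2 in product(range(p), repeat=2):
            hits = sum(
                1 for a, b in keys
                if h(a, b, x1, p) == y1 and h(a, b, x2, p) == y2
            )
            if Fraction(hits, len(keys)) != target:
                return False
    return True


if __name__ == "__main__":
    print(is_pairwise_independent(5))
```

The check relies on $p$ being prime: for a composite modulus, $x_1 - x_2$ may not be invertible, and the same function reports that the family is not pairwise independent.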

score 5 · Answer 1 · answered Sep 04 '15 at 13:30

You are right; these are the properties of a PRF. In fact, a $k$-wise independent hash function has exactly the same distribution as a truly random function, as long as you only see its values on at most $k$ points. The difference is this: a pseudorandom function has to be indistinguishable from a random function to an observer who sees any polynomial number of samples. This requirement is both weaker and stronger than $k$-wise independence: a PRF is only computationally indistinguishable from random (weaker), but it maintains this property for any polynomial number of samples (stronger).
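To illustrate what goes wrong beyond $k$ points, here is a rough sketch of a three-query distinguisher against a pairwise independent family. It assumes the affine family $h_{a,b}(x) = (ax + b) \bmod p$ as the pairwise independent example; the helper names and the prime $p = 101$ are hypothetical choices for illustration.

```python
import random

P = 101  # assumed small prime; domain and range are both Z_P


def sample_pairwise_hash(rng, p=P):
    """Draw h_{a,b}(x) = (a*x + b) mod p with a, b uniform in Z_p."""
    a, b = rng.randrange(p), rng.randrange(p)
    return lambda x: (a * x + b) % p


def sample_random_function(rng, p=P):
    """Draw a truly random function Z_p -> Z_p as a lookup table."""
    table = [rng.randrange(p) for _ in range(p)]
    return lambda x: table[x]


def looks_affine(f, p=P):
    """Three-query test: accept iff f(0), f(1), f(2) lie on a line mod p."""
    return (f(2) - f(1)) % p == (f(1) - f(0)) % p


if __name__ == "__main__":
    rng = random.Random(0)
    trials = 1000
    hash_hits = sum(looks_affine(sample_pairwise_hash(rng)) for _ in range(trials))
    rand_hits = sum(looks_affine(sample_random_function(rng)) for _ in range(trials))
    print(hash_hits, rand_hits)
```

Every member of the affine family passes the test, while a truly random function passes with probability $1/p$. Any two values of $h_{a,b}$ are perfectly random, but the third is already determined by them, so the family is not a PRF.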

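Note also that $k$-wise independent hash functions become inefficient for large $k$: the standard polynomial construction, for example, needs $k$ random coefficients as its key and about $k$ multiplications per evaluation. So they are typically most useful when you only need pairwise independence or a small $k$.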
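As a rough illustration of that cost, the sketch below uses the usual polynomial construction: a uniformly random polynomial of degree less than $k$ over $\mathbb{Z}_p$ ($p$ prime, $U = [m] = \mathbb{Z}_p$) is $k$-wise independent. The function name `sample_kwise_hash` and the parameters are assumptions made for the example.

```python
import random


def sample_kwise_hash(k, p, rng):
    """Draw h(x) = c_0 + c_1*x + ... + c_{k-1}*x^(k-1) mod p.

    The key is the list of k coefficients, so key size grows linearly in k.
    """
    coeffs = [rng.randrange(p) for _ in range(k)]

    def h(x):
        acc = 0
        # Horner's rule: k multiplications and additions per evaluation.
        for c in reversed(coeffs):
            acc = (acc * x + c) % p
        return acc

    return h, coeffs


if __name__ == "__main__":
    rng = random.Random(0)
    h, key = sample_kwise_hash(k=4, p=101, rng=rng)
    print(len(key), h(7))
```

With $k = 2$ this reduces to the affine family $ax + b$. A PRF, by contrast, keeps a fixed-size key no matter how many samples are taken.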
