Difference between strongly universal and $\delta$-universal hash functions

Question

Define a family of strongly universal hash functions as:

$$\forall x_1, x_2 \in \{0,1\}^n, \forall y_1, y_2 \in \{0,1\}^m, ~ x_1\ne x_2: ~~ \Pr_{h\in H} [h(x_1) = y_1 ~\text{and}~ h(x_2) = y_2] \le \frac{1}{2^{2m}}$$

where $h: \{0,1\}^n \rightarrow \{0,1\}^m$, and define a family of $\delta$-universal hash functions as: $\forall x_1, x_2 \in \{0,1\}^n$ where $x_1 \neq x_2$: $$\Pr_{h \in H} [h(x_1) = h(x_2)] \leq \delta$$

(Definitions according to Moni Naor's slides.)

I see why strongly universal implies $2^{-m}$-universal (simply pick $y_2 = y_1 = h(x_1)$), but according to Moni Naor's slides, $\delta$-universal does not imply strongly universal.

Since I don't fully understand the counterexample on the slides ($h(x) = x$), I am looking for another counterexample and an intuitive description of the difference between the two definitions.

score 5 · Accepted Answer · answered Oct 01 '19 at 03:53

There are two orthogonal questions here:

Universal vs. strongly universal. A universal hash family has bounded collision probability: for any inputs $x \ne y$, the probability that they collide under a random hash function $H$ is bounded by $1/t$, where $t$ is the number of possible hash values: $$\Pr[H(x) = H(y)] \leq 1/t.$$ (If hash values are $m$-bit strings, then $t = 2^m$.) However, although the probability of collision may be bounded by $1/t$, knowledge of $H(x)$ may inform you about $H(y)$ for certain pairs of $x$ and $y$.

In a strongly universal hash family, sometimes called pairwise independent, this does not happen because the hash values of any two inputs $x \ne y$ are independent uniform random variables: $$\Pr[H(x) = u, H(y) = v] = \Pr[H(x) = u] \cdot \Pr[H(y) = v] = 1/t^2,$$ for any hash values $u$ and $v$. It is called pairwise independent because the guarantee covers any two hash values at a time and says nothing about three or more; for any positive integer $k$, there's a corresponding notion of $k$-wise independence.

Obviously any pairwise-independent hash family is universal (proof: exercise), but the converse does not hold. For example, fix a prime $p$, and define $H_1(x) = a x$ on $\mathbb Z/p\mathbb Z$ for uniform random $a \in \mathbb Z/p\mathbb Z$. Then $H_1(x) = H_1(y)$ means $ax = ay$, an event which, for $x \ne y$, happens if and only if $a = 0$, so $$\Pr[H_1(x) = H_1(y)] = \Pr[a = 0] = 1/p.$$ But if $x \ne 0$ and $H_1(x) = a x = u$, there is only one possible value of $H_1(y) = a y = v$, namely $v = y u/x$, so

\begin{equation*} \Pr[H_1(x) = u, H_1(y) = v] = \begin{cases} 1/p, & \text{if $x v = y u$;} \\ 0, & \text{otherwise.} \end{cases} \end{equation*}

Hence $H_1$ is not pairwise independent, because for certain values of $x$, $y$, $u$, and $v$, $\Pr[H_1(x) = u, H_1(y) = v] = 1/p$ is far above the bound of $1/p^2$. Equivalently, conditioned on $H_1(x) = u$, the event $H_1(y) = y u/x$ has probability $1$ rather than $1/p$. That is, if you know $x \ne 0$ and $y$, and you learn $u = H_1(x)$, you can perfectly predict what $v$ will be even if you didn't know a priori what the secret hash key $a$ was.

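To make the $H_1$ counterexample concrete, here is a minimal brute-force sketch. The prime $p = 7$ and the helper names `collision_prob` and `joint_prob` are chosen only for illustration; they are not part of the original construction. It enumerates every key $a$ to check that $H_1$ is universal but not pairwise independent.

```python
from fractions import Fraction
from itertools import product

P = 7  # small prime, an assumption made only so exhaustive search is cheap


def collision_prob(x, y, p):
    """Pr over uniform a in Z/pZ that a*x == a*y (mod p)."""
    hits = sum(1 for a in range(p) if (a * x) % p == (a * y) % p)
    return Fraction(hits, p)


def joint_prob(x, y, u, v, p):
    """Pr over uniform a in Z/pZ that a*x == u and a*y == v (mod p)."""
    hits = sum(1 for a in range(p)
               if (a * x) % p == u and (a * y) % p == v)
    return Fraction(hits, p)


# Universal: every pair x != y collides with probability exactly 1/p.
is_universal = all(collision_prob(x, y, P) == Fraction(1, P)
                   for x, y in product(range(P), repeat=2) if x != y)

# Not pairwise independent: for x = 1, y = 2 the largest joint
# probability over all (u, v) is 1/p, well above 1/p^2.
worst_joint = max(joint_prob(1, 2, u, v, P)
                  for u, v in product(range(P), repeat=2))

# Knowing u = H_1(1) pins down v = H_1(2): one feasible v per u.
feasible_v = {u: [v for v in range(P) if joint_prob(1, 2, u, v, P) > 0]
              for u in range(P)}

print(is_universal, worst_joint, Fraction(1, P * P))
```

The exhaustive search is only feasible because $p$ is tiny; for a cryptographic-size prime the same facts follow from the algebra above rather than from enumeration.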
Universal vs. $\delta$-universal. This is just a generalization of the concept: you replace the bound $1/t$, for a hash family taking on $t$ possible hash values, by the bound $\delta$, which is usually a small multiple of $1/t$. There is a corresponding notion of a $\delta^2$-strongly-universal hash family. A $1/t$-universal hash family is just universal. A $1/t^2$-strongly-universal hash family is just strongly universal.

For example, fix a prime $p$, say $2^{130} - 5$, and let $r \in \mathbb Z/p\mathbb Z$ be uniform random. For a polynomial $f$ over $\mathbb Z/p\mathbb Z$ of degree at most $\ell$, define $H_2(f) = f(r)$. We can encode a message as the coefficients of the polynomial $f$. If $H_2(f) = H_2(g)$ for polynomials $f \ne g$, then $f(r) = g(r)$, so $r$ is a root of the nonzero polynomial $f - g$. But $f - g$ has degree at most $\ell$, so it has at most $\ell$ roots. Since every $r$ has probability $1/p$, we have

\begin{equation*} \Pr[H_2(f) = H_2(g)] = \Pr[r \mathrel{\text{is a root of}} f - g] \leq \ell/p. \end{equation*}

Thus, $H_2$ is $\ell/p$-universal. Here $\delta = \ell/p$ is a small multiple of $1/p$, the reciprocal of the number $p$ of possible outputs of $H_2$. This hash family $H_2$ is noteworthy in cryptography because it is the basis of the Poly1305 message authentication code, which is a contender (along with GHASH) for the most popular MAC in the world. (Some background on the history and role of universal hashing in message authentication codes in cryptography.)
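The $\ell/p$ bound can also be checked by brute force on a toy instance. The sketch below assumes $p = 5$ and $\ell = 2$, and the helper names `poly_eval` and `collision_prob` are illustrative. It is not Poly1305, which adds message encoding and a masking step on top of the polynomial evaluation.

```python
from fractions import Fraction
from itertools import product

P, ELL = 5, 2  # toy prime and degree bound, assumed for illustration


def poly_eval(coeffs, r, p):
    """Evaluate a polynomial (coefficients lowest degree first) at r mod p
    using Horner's rule."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * r + c) % p
    return acc


def collision_prob(f, g, p):
    """Pr over uniform r in Z/pZ that f(r) == g(r) (mod p)."""
    hits = sum(1 for r in range(p)
               if poly_eval(f, r, p) == poly_eval(g, r, p))
    return Fraction(hits, p)


# Every polynomial of degree at most ELL, as a tuple of ELL + 1 coefficients.
polys = list(product(range(P), repeat=ELL + 1))

# Largest collision probability over all pairs of distinct polynomials.
worst = max(collision_prob(f, g, P)
            for f in polys for g in polys if f != g)

print(worst, Fraction(ELL, P))
```

On this toy instance the bound is tight: for example, $f(x) = x^2 - x$ and $g = 0$ agree at exactly the two points $r = 0$ and $r = 1$.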

score 1 · Answer 2 · answered Oct 01 '19 at 01:04

Welcome to Crypto Stackexchange! This is a good question.

Strongly universal hash functions have the property that the probability of two hash values being equal is bounded by $\frac{1}{2^{2m}}$. For $\delta$-universal hash functions, however, the bound is $\delta$, which may be any function.

So, to say that a function is strongly universal is essentially saying, "we have a function where the probability of a collision is bounded above by $\frac{1}{2^{2m}}$".

To say that a function is $\delta$-universal is saying, "we have a hash function where the probability of a collision is bounded by some function $\delta$".

So it is clear that having a hash function for which the probability of a collision is bounded by $\frac{1}{2^{2m}}$ therefore implies that the probability of a collision for that function is bounded. This means that strongly universal implies $\delta$ universal.

Note, however, that the reverse statement is not true. A function which is $\frac{2}{2^{2m}}$ universal is not strongly universal. Saying the probability of a collision being bounded by $\frac{2}{2^{2m}}$ does not imply that it is bounded also by $\frac{1}{2^{2m}}$.

Only when $\delta \leq \frac{1}{2^{2m}}$ will $\delta$ universal imply strongly universal. In general, this will not always be the case.
