6

Let $H = \{h_r : U \rightarrow [m]\}$. What are the most efficient algorithms currently known such that $H$

  • is a universal family and
  • fulfils the homomorphic XOR operation property $\forall h \in H \forall x,y \in U: h(x \oplus y) = h(x) \oplus h(y)$?
Martin Kromm

2 Answers

6

I believe that the internal GHASH function from GCM would meet those criteria (if you trim off the length word, and require universality only with equal length inputs [1]); it can be defined as:

$$\operatorname{GHASH}_k( M_n, M_{n-1}, \ldots, M_0 ) = \sum_{i=0}^{n} k^i M_i$$

Here the input $M_n, M_{n-1}, \ldots, M_0$ is the message divided into 128-bit blocks, $k$ is the universal hash key, and the arithmetic (both the additions and the multiplications) is done over the field $\operatorname{GF}(2^{128})$.

It meets the criteria:

  • It is universal (for equal length messages); for random $k$ and any two distinct equal length messages $M, M'$, we have $\operatorname{GHASH}_k(M) = \operatorname{GHASH}_k(M')$ with probability $\le |M| / 2^{128}$, where $|M|$ is the number of 128-bit blocks in the message.

  • It meets your homomorphic requirement; this is because addition in $\operatorname{GF}(2^{128})$ is exclusive-or, and we have $k^i M_i \oplus k^i M'_i = k^i( M_i \oplus M'_i)$.

  • It is quite efficient (especially with AES-NI instructions); I can't say that it's the most efficient possible...
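
As a rough illustration of the homomorphic property, here is a minimal Python sketch of the polynomial hash defined above. It assumes the plain bit ordering (bit $i$ is the coefficient of $x^i$) with the reduction polynomial $x^{128} + x^7 + x^2 + x + 1$, so it is not bit-compatible with GCM's GHASH, and it is neither fast nor constant-time. The names `gf_mul` and `poly_hash` are made up for this example.

```python
import secrets

# Assumption: reduction polynomial x^128 + x^7 + x^2 + x + 1 in the plain
# "bit i = coefficient of x^i" convention. Real GCM uses a reflected bit
# order, so outputs will not match an actual GHASH implementation.
POLY = (1 << 128) | 0x87


def gf_mul(a: int, b: int) -> int:
    """Multiply two elements of GF(2^128), given as ints below 2^128.

    Shift-and-add; branches on secret bits, so NOT constant-time.
    """
    result = 0
    while b:
        if b & 1:
            result ^= a      # addition in characteristic 2 is XOR
        b >>= 1
        a <<= 1              # multiply a by x
        if a >> 128:         # degree reached 128: reduce modulo POLY
            a ^= POLY
    return result


def poly_hash(key: int, blocks: list) -> int:
    """Evaluate sum_i key^i * M_i by Horner's rule.

    blocks is [M_n, ..., M_0], each an int below 2^128.
    """
    acc = 0
    for block in blocks:
        acc = gf_mul(acc, key) ^ block
    return acc


# Usage: two equal-length messages of four random blocks each.
key = secrets.randbits(128)
x = [secrets.randbits(128) for _ in range(4)]
y = [secrets.randbits(128) for _ in range(4)]
x_xor_y = [a ^ b for a, b in zip(x, y)]

# The hash of the XOR equals the XOR of the hashes.
assert poly_hash(key, x_xor_y) == poly_hash(key, x) ^ poly_hash(key, y)
```

The check at the end holds for every key because field multiplication distributes over XOR, which is exactly the argument in the second bullet.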


[1]: You cannot get both the homomorphic properties and the universality (across messages of different lengths) to hold simultaneously. The homomorphic property requires that $h_k(0) = 0$ and that $h_k(00) = 0$, hence we have two different messages $0$ and $00$ which hash to the same value with high probability (actually, 1), thus $h_k$ is not a universal hash family.
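
To spell out why the homomorphic property forces every all-zero message to hash to zero (a short derivation, using only the property stated in the question): for any message $x$, the string $x \oplus x$ is the all-zero string of the same length as $x$, so

$$h_k(x \oplus x) = h_k(x) \oplus h_k(x) = 0.$$

Applying this to a one-block $x$ and to a two-block $x$ gives $h_k(0) = 0$ and $h_k(00) = 0$ for every key $k$.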

Squeamish Ossifrage
poncho
  • Surely you mean CLMUL, not AES-NI? – Squeamish Ossifrage Jun 04 '19 at 16:01
  • @SqueamishOssifrage Even though technically CLMUL isn't contained in the AES-NI, they usually appear together, so many people consider CLMUL to be part of AES-NI for all practical intents and purposes... – SEJPM Jun 04 '19 at 16:02
4

Any polynomial evaluation hash or polynomial division hash, without length padding, has the property you seek:

  • Polynomial evaluation. If $H_r(m) = m(r)$ where $m$ is a polynomial of zero constant term and degree $\ell$ over some field and $r$ is an element of the field, then we have $$H_r(m) = m_1 r^\ell + m_2 r^{\ell-1} + \cdots + m_{\ell-1} r^2 + m_\ell r,$$ so clearly $H_r(m + m') = H_r(m) + H_r(m')$. Standard examples of this form are Poly1305 and GHASH. If the field has characteristic 2, as in GHASH, then $+$ is xor. This obviously generalizes to multivariate polynomials too, e.g. the dot product $H_{r_1,r_2}(m_1 \mathbin\| m_2) = m_1 r_1 + m_2 r_2$ (which naturally attains a lower collision probability).

  • Polynomial division. If $H_f(m) = (m \cdot x^n) \bmod f$ where $m, f \in \operatorname{GF}(p)[x]$, and where $f$ is irreducible and of degree $n$, then clearly

    \begin{align} H_f(m + m') &= \bigl[(m + m') \cdot x^n\bigr] \bmod f \\ &= (m \cdot x^n) \bmod f + (m' \cdot x^n) \bmod f \\ &= H_f(m) + H_f(m'). \end{align}

    Polynomial division hashes are related to CRCs and Rabin fingerprints. When $p = 2$, $+$ is xor.
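
The polynomial division hash with $p = 2$ can be sketched in a few lines of Python. This is only an illustration under some assumptions: polynomials in $\operatorname{GF}(2)[x]$ are packed into integers (bit $i$ is the coefficient of $x^i$), the key is fixed to the degree-8 irreducible polynomial $x^8 + x^4 + x^3 + x + 1$ rather than drawn at random from the irreducibles of degree $n$, and the helper names `poly_mod` and `division_hash` are invented here.

```python
import secrets


def poly_mod(a: int, f: int) -> int:
    """Remainder of a divided by f in GF(2)[x] (bit i = coefficient of x^i)."""
    deg_f = f.bit_length() - 1
    # Cancel the leading term of a until its degree drops below deg(f).
    while a.bit_length() - 1 >= deg_f:
        a ^= f << (a.bit_length() - 1 - deg_f)
    return a


def division_hash(f: int, m: int) -> int:
    """Compute (m * x^n) mod f, where n is the degree of f."""
    n = f.bit_length() - 1
    return poly_mod(m << n, f)


# Assumption for the demo: a fixed small key, x^8 + x^4 + x^3 + x + 1.
F = 0x11B

m1 = secrets.randbits(64)
m2 = secrets.randbits(64)

# Addition in GF(2)[x] is XOR of the coefficient vectors, so the hash of
# the XOR equals the XOR of the hashes.
assert division_hash(F, m1 ^ m2) == division_hash(F, m1) ^ division_hash(F, m2)
```

Packing messages into integers drops leading zero coefficients, so this sketch identifies messages that differ only in length, which is in line with the restriction to unpadded, equal-length inputs.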

Beware that multiplication in fields of characteristic 2 is generally not efficient in software, and that the most efficient software is riddled with timing side channels—unless you can fruitfully organize your computation to simultaneously compute a batch of (say) 64 instances of it in parallel using bitslicing.

Squeamish Ossifrage
  • Martin specifically asked it to be homomorphic over XOR; hence you're stuck with a field with characteristic 2. – poncho Jun 04 '19 at 15:45
  • @poncho Yes. Just wanted to make sure that Martin is aware that characteristic 2 is dangerous in software! – Squeamish Ossifrage Jun 04 '19 at 15:46
  • Actually I think that one could get say 64 parallel instances going using what poncho outlined in this older answer. – SEJPM Jun 04 '19 at 16:04
  • @SEJPM Yes, but you need your message to be at least $8k$ bytes long to get at most a factor of $k$ improvement, and it's not a priori clear where the performance cutoff will be between a leaky table-driven implementation and a safe bitsliced implementation. My point is just that characteristic 2 can be dangerous for software because it requires you to do this analysis and tempts you into security-damaging performance tradeoffs. – Squeamish Ossifrage Jun 04 '19 at 16:14