Are there cryptographic hash functions with homomorphic properties?

Question

Are there cryptographic hash functions that have homomorphism-like properties?

E.g., one satisfying the relation $h(a || b) = h(a) · h(b)$, where $h(x)$ is the hash function itself, $x || y$ is concatenation, and $x · y$ is some hash-specific combination function that produces a single hash from two of them. If it makes any difference, it can also be assumed that $a$ and $b$ are of equal length, and $h(x)$ is expected to produce hashes of the same length regardless of $x$.

I can think of a basic, non-cryptographic example: $h(x)$ can be the plain arithmetic sum of the character codes modulo 256, with $x · y := (x + y) \bmod 256$, so

$$ h(\texttt{"foo"}) = (102 + 111 + 111) \bmod 256 = 68 $$
$$ h(\texttt{"bar"}) = (98 + 97 + 114) \bmod 256 = 53 $$
$$ h(\texttt{"foobar"}) = (102 + 111 + 111 + 98 + 97 + 114) \bmod 256 = 121 $$
$$ h(\texttt{"foo"}) · h(\texttt{"bar"}) = (68 + 53) \bmod 256 = 121 $$
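The toy example can be checked with a few lines of Python. This is only a sketch of the sum-of-bytes hash described above; the function names `h` and `combine` are chosen here for illustration, and the last assertion shows why it offers no collision resistance.

```python
def h(data: bytes) -> int:
    """Toy hash from the example: sum of the byte values modulo 256."""
    return sum(data) % 256


def combine(x: int, y: int) -> int:
    """The combination operation x · y for this toy hash."""
    return (x + y) % 256


# The homomorphic relation h(a || b) = h(a) · h(b) holds.
assert h(b"foo") == 68
assert h(b"bar") == 53
assert combine(h(b"foo"), h(b"bar")) == h(b"foobar") == 121

# It is not collision resistant: reordering the bytes gives the same hash.
assert h(b"foobar") == h(b"barfoo")
```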

So, is there something similar, but with high collision resistance?

The normal definition of "distributive" is that two operations + and * share it if A*(B+C) = (A*B) + (A*C), for all A, B, C. You might want to clarify what property you are looking for. Also, I suspect that any such property would make it easier to find hash collisions; what is the problem you're actually trying to solve? — poncho, Sep 10 '14 at 17:56
@poncho, yes, distributive properties like the ones you describe will likely reduce the collision security level (in bits) by a factor of 2 or so. — Dmitry Khovratovich, Sep 11 '14 at 11:56
We already have questions here that provide lots of information on this subject. Simply searching on "homomorphic hash" in the search bar at the upper right turned up the following: http://crypto.stackexchange.com/q/6497/351, http://crypto.stackexchange.com/q/8074/351, http://crypto.stackexchange.com/q/12719/351. In the future, you might want to do a bit of research before asking, to help you ask a more informed question. — D.W., Sep 17 '14 at 04:28
$SL_2$ homomorphic hash functions: worst case to average case reduction and short collision search. Des. Codes Cryptogr. (2016) — Blanco, Jun 16 '20 at 09:17

score 6 · Answer 1 · edited Apr 13 '17 at 12:48

My understanding is that, for the even more special case where a and b are not only of equal length but each is some power of two times a fixed block size, all hash tree systems (also called Merkle trees or binary hash chains) meet your criteria.

E.g. satisfying following relation h(a || b) = h(a) · h(b), where h(x) is hash function itself, x || y is concatenation and x · y is some hash-specific combination function making single hash given two of them.

In particular, the hash specified by the Tree Hash EXchange format (THEX) spec uses the hash-specific combination function x · y == SHA1( 0x00 || x || y ) whenever the underlying pieces of text a and b are the same length and are both some power of two times a fixed block size.

When c and d are exactly one block in size, the tree hash T() used in THEX is defined something like

T(c) == SHA1( 0x01 || c ) # only for 'c' exactly 1 block long
T(d) == SHA1( 0x01 || d ) # only for 'd' exactly 1 block long
T( c || d ) == SHA1( 0x00 || T(c) || T(d) )
            == SHA1( 0x00 || SHA1( 0x01 || c ) || SHA1( 0x01 || d ) )
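A minimal Python sketch of the definition above, using `hashlib.sha1` and following the notation of this answer. The names `leaf`, `combine`, `tree_hash` and the constant `BLOCK` are illustrative, and the sketch only accepts inputs that are exactly a power of two blocks long, which is the special case discussed here.

```python
import hashlib

BLOCK = 1024  # block size in bytes (assumed; see the note on block size)


def leaf(block: bytes) -> bytes:
    """T(c) for a single block: SHA1(0x01 || c)."""
    return hashlib.sha1(b"\x01" + block).digest()


def combine(x: bytes, y: bytes) -> bytes:
    """The combination function x · y: SHA1(0x00 || x || y)."""
    return hashlib.sha1(b"\x00" + x + y).digest()


def tree_hash(data: bytes) -> bytes:
    """T() for data that is exactly 2**k blocks long (k >= 0)."""
    n = len(data) // BLOCK
    if len(data) % BLOCK != 0 or n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two times BLOCK")
    if n == 1:
        return leaf(data)
    half = len(data) // 2
    return combine(tree_hash(data[:half]), tree_hash(data[half:]))


a = bytes(range(256)) * 8  # 2048 bytes = 2 blocks
b = b"\xff" * 2048         # 2048 bytes = 2 blocks

# h(a || b) = h(a) · h(b) for equal-length, power-of-two-sized inputs.
assert tree_hash(a + b) == combine(tree_hash(a), tree_hash(b))
```

The relation holds by construction: the root of the four-block tree is computed from the roots of its two halves, and those halves are exactly a and b.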

Typically a block has a size of 1024 bytes; Dan Williams and Emin Gün Sirer have written a paper on picking an optimal block size.

There are apparently two common ways to avoid the easy collisions described by "What is the purpose of using different hash functions for the leaves and internals of a hash tree?":

Some Merkle trees -- such as the THEX described above -- use one hash function for the leaves, and a different hash function for the internal nodes.
Other Merkle trees -- such as the one used by BitTorrent -- keep track of both the file length and the root SHA1 hash value, and files are considered "the same" only if both match. This allows them to use the same unmodified SHA1 hash function for both the leaves and the internal nodes. (Some people think of this as a single "tree hash value" that includes two parts: the file length and the cryptographic hash value.)
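To make the "easy collision" concrete, here is a hedged Python sketch. It assumes a naive tree that uses plain SHA1 for both leaves and internal nodes and that accepts a final leaf shorter than one block; the variable names are illustrative. It then shows how each of the two fixes above separates the colliding inputs.

```python
import hashlib


def H(data: bytes) -> bytes:
    return hashlib.sha1(data).digest()


BLOCK = 1024
c = b"c" * BLOCK
d = b"d" * BLOCK

# Naive tree: the same H for leaves and internal nodes.
naive_root = H(H(c) + H(d))        # root of the two-block file c || d
forged = H(c) + H(d)               # a 40-byte "file" made of the two leaf hashes
naive_root_forged = H(forged)      # hashed as a single short leaf
assert naive_root == naive_root_forged  # two different files, same root

# Fix 1 (THEX-style): different prefixes for leaves and internal nodes.
left, right = H(b"\x01" + c), H(b"\x01" + d)
prefixed_root = H(b"\x00" + left + right)
prefixed_forged = H(b"\x01" + left + right)  # forged file hashed as a leaf
assert prefixed_root != prefixed_forged

# Fix 2 (length alongside the root): compare (length, root) pairs.
assert (len(c + d), naive_root) != (len(forged), naive_root_forged)
```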

Merkle trees can handle files whose size is not a power of two (see "How does a "Tiger Tree Hash" handle data whose size isn't a power of two?"), but if the first file isn't a power of two times the fixed block size, concatenating two files doesn't give the nice relationship you wanted between the hashes of the two smaller files and the hash of the bigger combined file.
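The following Python sketch illustrates that last point. It assumes that an unpaired node is promoted unchanged to the next level, which is my understanding of how THEX-style trees treat odd block counts; the helper names `leaf`, `node` and `tree_hash` are illustrative, and SHA1 is used to match the notation above.

```python
import hashlib

BLOCK = 1024


def leaf(block: bytes) -> bytes:
    return hashlib.sha1(b"\x01" + block).digest()


def node(x: bytes, y: bytes) -> bytes:
    return hashlib.sha1(b"\x00" + x + y).digest()


def tree_hash(data: bytes) -> bytes:
    """Tree hash for any length; an unpaired node is promoted unchanged (assumed)."""
    blocks = [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)] or [b""]
    level = [leaf(blk) for blk in blocks]
    while len(level) > 1:
        nxt = [node(level[i], level[i + 1]) for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            nxt.append(level[-1])  # odd node moves up without rehashing
        level = nxt
    return level[0]


# Two blocks each: the relation h(a || b) = h(a) · h(b) holds.
a2, b2 = b"a" * (2 * BLOCK), b"b" * (2 * BLOCK)
assert tree_hash(a2 + b2) == node(tree_hash(a2), tree_hash(b2))

# Three blocks each: the six-block tree pairs the leaves differently,
# so the root is no longer node(T(a), T(b)).
a3, b3 = b"a" * (3 * BLOCK), b"b" * (3 * BLOCK)
assert tree_hash(a3 + b3) != node(tree_hash(a3), tree_hash(b3))
```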
