Hash which can be used to verify one of multiple inputs?

Question

Is there a hash function H1 which, when given inputs a, b, etc., produces output Z, and an H2 which can verify any one of a, b, etc. when given Z like this:

H1(a, b) => Z
H2(Z, a) => true
H2(Z, b) => true
H2(Z, c) => false

And where Z is (currently thought to be) cryptographically secure (i.e., modifying Z in any knowable way makes H2 return false for any of the provided inputs)?

Edit: An additional condition which I forgot to articulate originally: a cannot be derived from b and Z, nor b from a and Z. Without this condition, we could just use Shamir's Secret Sharing with a k value of 1, a randomly generated secret Z, and a, b, etc. as our n values.

I suspect there is no established way of doing this, but I don't see any theoretical reason it can't be done. All I need is an answer from Adi Shamir! — Nathan, Jan 15 '15 at 09:31
If you want to keep your assumption, that $H_1$ and $H_2$ are hash functions, then this is very unlikely to exist. Because in this case, $H_2$ could somehow identify partial-preimages of $Z$... Then either $H_1$ is not a cryptographically secure hash function or $H_2$ is not computationally feasible. — tylo, Jan 15 '15 at 14:08
@tylo While I agree that this is quite unlikely to be a secure hash function in the "standard" meaning of the term, I feel that such a thing is somewhat similar to a hash function: the goal is comparable in that it is very hard to obtain a preimage, but it is easy to check whether a given message is a preimage. In this case, the only difference is that it is also easy to verify that something was part of the preimage (which is a set), but only if you already know that thing. — yyyyyyy, Jan 15 '15 at 15:59
That has nothing to do with the term hash function, even without the property cryptographically secure. Because what you need to achieve something like that is a trapdoor one-way function. — tylo, Jan 16 '15 at 12:49
@tylo, what I meant was something "somewhat similar to a hash function" in the way yyyyyyy describes. I didn't mean something which conforms to any technical definition. Sorry if my choice of words caused confusion. – Nathan, Jan 16 '15 at 13:57
This should not end up in a discussion of sorts, but using correct terms is quite important. If an accumulator fits your idea, then that's fine. — tylo, Jan 16 '15 at 15:32

cygnusv · Accepted Answer · 2015-01-16T09:41:58.470

What you are looking for reminds me of one-way accumulator functions [1]. Basically, a one-way accumulator lets you prove that an element $x$ belongs to a set $S$ without revealing the actual members of the set. See for example this definition from [2]:

The most common form of one-way accumulator is defined by starting with a “seed” value $y_0$, which signifies the empty set, and then defining the accumulation value incrementally from $y_0$ for a set of elements $S = \{x_1,\ldots,x_n\}$, so that $y_i = f(y_{i-1},x_i)$, where $f$ is a one-way function whose final value does not depend on the order of the $x_i$'s. [...] Because of the commutative nature of $f$, a source can digitally sign the value of $y_n$ so as to enable a third party to produce a short proof for any element $x_i$ belonging to $S$ [...]. A well-known example of a one-way accumulator function $f$ is the exponential accumulator $exp(y,x) = y^x \bmod N$, for suitably-chosen values of the seed $y_0$ and modulus $N$. In particular, choosing $N = pq$ for two strong primes $p$ and $q$ makes the accumulator function $exp$ as difficult to break as RSA cryptography [1].

Update: Let's see an example using the $exp$ function defined above. Suppose $a$, $b$, and $c$ are inputs, and that $y_0$ and $N$ are chosen properly. Then the accumulator algorithm $H_1$ will do the following:

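Compute $s = a \cdot b \cdot c$ and $y = exp(y_0,s) = y_0^s = y_0^{abc}$ (all exponentiations are taken modulo $N$).
For each element $i$ in the input, compute $y_i = y_0^{s/i}$, that is, the accumulation of every element except $i$ (formally $y_i = y^{1/i}$). For example, $y_a = y_0^{bc}$.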
Output $z = (y,y_a,y_b,y_c)$.

Now, if you want to check whether some element $x$ belongs to the original collection, you just have to check whether the equation $y = exp(y_i,x)$ holds for any of the partial results $y_i$. For example, if you try with $a$, then the equation $y = exp(y_a,a)$ holds, since $exp(y_a,a) = (y_a)^a = (y_0^{bc})^a = y_0^{abc} = y$. On the contrary, if you try with some $d$ different from $a$, $b$ and $c$, then none of the equations holds.
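The steps above can be sketched in Python as follows. This is only a toy illustration, not a secure implementation: the modulus `N = 23 * 47` and the seed `Y0 = 4` are assumptions picked so the arithmetic is easy to follow, and the names `h1` and `h2` simply mirror the question's $H_1$ and $H_2$. In practice $N$ would be a large RSA-style modulus and the inputs would first be mapped to suitable values such as primes.

```python
from math import prod

# Toy parameters (assumptions for illustration only): a product of two tiny
# safe primes and a seed that is a quadratic residue modulo N.
N = 23 * 47
Y0 = 4

def h1(elements):
    """Accumulate the inputs and return z = (y, [y_i for each input i])."""
    s = prod(elements)                      # s = a * b * c
    y = pow(Y0, s, N)                       # y = y_0^s mod N
    # y_i = y_0^(s / i): the accumulation of every element except i.
    witnesses = [pow(Y0, s // x, N) for x in elements]
    return y, witnesses

def h2(z, x):
    """Return True if some partial result y_i satisfies y == y_i^x mod N."""
    y, witnesses = z
    return any(pow(w, x, N) == y for w in witnesses)

z = h1([5, 7, 13])
print(h2(z, 5), h2(z, 7), h2(z, 13), h2(z, 17))
```

With these toy values the membership checks for 5, 7 and 13 succeed and the check for 17 fails. Note that `z` carries one partial result per input.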

The problem with this solution is that output $z$ grows linearly with the size of the set of members.

References:

[1] Benaloh, J., & De Mare, M. (1994, January). One-way accumulators: A decentralized alternative to digital signatures. In Advances in Cryptology—EUROCRYPT’93 (pp. 274-285)

[2] Goodrich, M. T., Tamassia, R., & Hasić, J. (2002). An Efficient Dynamic and Distributed Cryptographic Accumulator. In Information Security (pp. 372-388).

To me this is reminiscent of Bloom filters, but I have no idea how to provide security in that case. – Fractalice, Jan 15 '15 at 20:01
The problem with Bloom filters is that false positives are possible. — cygnusv, Jan 15 '15 at 21:44
And that you cannot realistically reduce the false positive probability to something that's cryptographically acceptable. – DrLecter, Jan 16 '15 at 09:46

Fractalice · Answer 2 · 2015-01-20T12:50:53.360

score 1

I can propose this (I just invented it on paper). Here $h$ is some hash function such as SHA1, $||$ is concatenation, and $+$ is the usual sum (xor is ok too):

H1(a, b) = (h(a) + h(b)) || h(h(a)) || h(h(b))

For H2, starting from either $a$ or $b$ we can derive all of $h(a), h(h(a)), h(b), h(h(b))$ and check the hash: given $a$, compute $h(a)$, subtract it from the sum to get $h(b)$, then hash both values and compare them with the two trailing parts.

Also, if we assume that the input pair is unordered, then to avoid a trivial forgery by swapping $h(h(a))$ and $h(h(b))$ we can force an order on them (e.g. require $h(a) \le h(b)$, so that $H1(a, b) = H1(b, a)$).

Intuition about security: from $h(h(a))$ one can't derive information about $h(a)$ (same for $b$). So an attacker can't replace either $h(a)$ or $h(b)$ in the sum with his own hash.
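A minimal Python sketch of this construction might look as follows. It rests on several assumptions that are not in the answer: SHA-256 stands in for $h$, the sum is a plain integer addition without modular reduction, the output is a tuple rather than a concatenated string, and the helper names `h`, `hh`, `h1` and `h2` are purely illustrative.

```python
import hashlib

def h(data: bytes) -> int:
    """Hash an input to an integer (SHA-256 is an assumption of this sketch)."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big")

def hh(value: int) -> bytes:
    """Second hash layer, i.e. h(h(x)), applied to an already-hashed value."""
    return hashlib.sha256(value.to_bytes(32, "big")).digest()

def h1(a: bytes, b: bytes):
    # Sort the inner hashes so that h1(a, b) == h1(b, a).
    lo, hi = sorted((h(a), h(b)))
    return lo + hi, hh(lo), hh(hi)

def h2(z, x: bytes) -> bool:
    total, tag_lo, tag_hi = z
    hx = h(x)
    other = total - hx          # candidate for the hash of the other input
    if not 0 <= other < 2**256:
        return False
    lo, hi = sorted((hx, other))
    return (hh(lo), hh(hi)) == (tag_lo, tag_hi)

z = h1(b"a", b"b")
print(h2(z, b"a"), h2(z, b"b"), h2(z, b"c"))
```

The verifier only needs one of the two inputs: it recomputes that input's hash, derives the other hash from the sum, and checks both against the double-hash tags.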

UPDATE: I realized it has a simple flaw. Given $H1(a,b)$, $H1(b,c)$ and $H1(a,c)$, we can easily recover $h(a)$, $h(b)$ and $h(c)$ from the sums and use them to forge hashes.
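The recovery step can be sketched as follows, assuming the sum is an ordinary integer addition and using SHA-256 as a stand-in for $h$ (both are assumptions of this sketch, as is the function name `recover_inner_hashes`). The XOR variant is not covered here.

```python
import hashlib

def h(data: bytes) -> int:
    """Stand-in for h; SHA-256 is an assumption of this sketch."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big")

def recover_inner_hashes(s_ab: int, s_bc: int, s_ac: int):
    """Solve the three pairwise sums for h(a), h(b), h(c)."""
    # s_ab + s_ac - s_bc = 2 * h(a), so the division below is exact.
    ha = (s_ab + s_ac - s_bc) // 2
    hb = s_ab - ha
    hc = s_ac - ha
    return ha, hb, hc

# The attacker only sees the sum parts of H1(a,b), H1(b,c), H1(a,c).
s_ab = h(b"a") + h(b"b")
s_bc = h(b"b") + h(b"c")
s_ac = h(b"a") + h(b"c")
print(recover_inner_hashes(s_ab, s_bc, s_ac) == (h(b"a"), h(b"b"), h(b"c")))
```

Once the inner hashes are known, the attacker can assemble a valid-looking output for any pair without ever learning the inputs themselves.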

edited Jan 20 '15 at 12:50

answered Jan 15 '15 at 19:52


Though $a$ can't be derived from $b$ and $H1(a, b)$, the attacker can derive $h(a)$ which allows him to forge $H1(a, x)$ for any $x$. This attack may be undesirable. – Fractalice Jan 15 '15 at 19:58
I also considered posting a similar answer to this, but it doesn't stand up to the original requirements. It's trivial for an attacker to modify the output such that one of a or b is validated successfully, but the other is not. – Stephen Touset Jan 15 '15 at 20:06
@StephenTouset how would you do it? if you change $h(h(a))$ then (from first part) $h(b)$ will be corrupted and $h(h(b))$ will not match. – Fractalice Jan 16 '15 at 18:06
Ah, actually, you're right. The attack is actually the one you described, where if you know one of a or b, you can forge a verifier for the other. But I don't think that's actually avoidable in the general case: anyone can create H1(x,y) for arbitrary x, y. If an attacker knows one of those values, regardless of algorithm, they can forge a verifier for the other. If you need it to be authenticated, replacing the hashes with an HMAC is probably necessary. – Stephen Touset Jan 16 '15 at 21:21

score 1 · Answer 3 · answered Jan 19 '15 at 20:52

Alternatively, let $H$ be a hash function (in the ordinary sense) from the input space to $[0,2^N)$ for some $N$, and for $0 \leq x < 2^N$ let $f(x)$ be the smallest prime greater than $2^N + x$. (Here $N$ is determined by the security parameter. You can use a 'probable prime' instead, where the precise definition of that will also depend on the security parameter.) Then let $H_1(a,b) = f(H(a)) \cdot f(H(b))$, and let $H_2(Z,a)$ be true if and only if $f(H(a))$ divides $Z$.

Again, this suffers from the drawback that the size of the output grows linearly with the number of inputs.
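As a rough sketch of this scheme in Python: the parameter `N_BITS = 32`, the use of truncated SHA-256 for $H$, and the trial-division primality test are all assumptions chosen to keep the example small and fast, not recommendations. A realistic parameter size would need a probabilistic primality test instead.

```python
import hashlib

N_BITS = 32  # toy security parameter, small enough for trial division

def is_prime(n: int) -> bool:
    """Deterministic trial division; only practical for small toy values."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    d = 3
    while d * d <= n:
        if n % d == 0:
            return False
        d += 2
    return True

def H(data: bytes) -> int:
    """Ordinary hash into [0, 2^N), here SHA-256 reduced to N_BITS bits."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % (1 << N_BITS)

def f(x: int) -> int:
    """Smallest prime strictly greater than 2^N + x."""
    candidate = (1 << N_BITS) + x + 1
    while not is_prime(candidate):
        candidate += 1
    return candidate

def h1(*inputs: bytes) -> int:
    """Product of the primes assigned to each input."""
    z = 1
    for item in inputs:
        z *= f(H(item))
    return z

def h2(z: int, item: bytes) -> bool:
    """True if and only if the prime assigned to item divides z."""
    return z % f(H(item)) == 0

z = h1(b"a", b"b")
print(h2(z, b"a"), h2(z, b"b"), h2(z, b"c"))
```

Each additional input multiplies `z` by another prime of roughly `N_BITS + 1` bits, which is where the linear growth comes from.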

Alec Edgington


Very clever, except surely it'd be simple enough to brute-force for false positives - any value which hashes to a value between the smallest prime greater than and the biggest prime less than $2^N + x$. – Nathan Jan 20 '15 at 09:23
Given two hashes $v_{ab} = H_1(a, b), v_{ac} = H_1(a, c)$, attacker can compute $f(H(a)) = gcd(v_{ab}, v_{ac}), f(H(b)) = v_{ab} / f(H(a)), f(H(c)) = v_{ac} / f(H(a))$ and make hashes of any pairs containing $a, b$ or $c$. – Fractalice Jan 20 '15 at 11:59
True. Good point! – Alec Edgington Jan 20 '15 at 19:28
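The gcd attack from the comment above comes down to a few lines of arithmetic. In this sketch the small primes 101, 103 and 107 are hypothetical stand-ins for $f(H(a))$, $f(H(b))$ and $f(H(c))$, and the function name `split_pair_hashes` is illustrative only.

```python
from math import gcd

def split_pair_hashes(v_ab: int, v_ac: int):
    """Recover the three prime factors from two outputs sharing one input."""
    p_a = gcd(v_ab, v_ac)            # the prime common to both products
    return p_a, v_ab // p_a, v_ac // p_a

# Hypothetical stand-ins for f(H(a)), f(H(b)), f(H(c)).
p_a, p_b, p_c = 101, 103, 107
v_ab, v_ac = p_a * p_b, p_a * p_c

print(split_pair_hashes(v_ab, v_ac))
```

With the individual primes in hand, the product of any two of them is a valid output for the corresponding pair.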
