Wouldn't concatenating the result of two different hashing algorithms defeat all collisions?

Question

Let's say I have three messages: A B C

And I run each of these through two different Hashing algorithms: MD5 and SHA1 for this example

MD5(A) = X
MD5(B) = Y
MD5(C) = Y
SHA1(A) = N
SHA1(B) = N
SHA1(C) = M

Notice the MD5 hash of B and C collide. And the SHA hash of A and B collide.

If I simply concatenate the digests, however, the results would be unique:

Combined Digest of A:  XN
Combined Digest of B:  YN
Combined Digest of C:  YM

The underlying principle would be that whatever pair of messages could be found or constructed to form a collision with one hashing algorithm, wouldn't also form a collision with another hashing algorithm.

The combined digest length (for MD5/SHA1) would be 288 bits (128+160) -- but unless I'm missing something, this would be significantly more secure than a single hashing algorithm with a 288-bit digest.

Granted, in the example above I'm using MD5 and SHA1 which are both known to be effectively broken, but I'm hoping an answer exists that applies more conceptually to the premise than simply the choice of algorithms.

i.e., In a situation where collision resistance is critical, wouldn't the combination of SHA2-256 + SHA3-256 concatenated be more secure than a single iteration of SHA2-512, or SHA3-512?

FYI, the closest question I found that is similar is this one: https://crypto.stackexchange.com/questions/81634/security-implication-of-concatenating-two-hashes-using-different-algorithms But it didn't quite ask the exact premise of my question. — Eddie, Mar 01 '23 at 18:52
The short answer is no: a collision for MD5 || SHA-1 is almost as easy as a collision for SHA-1 alone — much worse than SHA-256 despite SHA-256 being slightly shorter, but unbroken. — Gilles 'SO- stop being evil', Mar 01 '23 at 19:25
@Gilles'SO-stopbeingevil' Thanks for the insight, Gilles. This would help in the specific case of md5 and sha1, but not in the conceptual, underlying case. — Eddie, Mar 01 '23 at 20:22
No, it's not specific to MD5 and SHA1: the fact that the attack on the concatenation is basically only as hard as the attack on the longest hash is mostly generic. — Gilles 'SO- stop being evil', Mar 01 '23 at 20:41
@Gilles'SO-stopbeingevil' Would you be willing to expand on that in an answer =). Specifically why the "difficulty" to find a collision wouldn't increase multiplicatively ? — Eddie, Mar 02 '23 at 00:05
What leads you to believe "whatever pair of messages could be found or constructed to form a collision with one hashing algorithm, wouldn't also form a collision with another hashing algorithm"? — Vaelus, Mar 02 '23 at 04:33
@Gilles'SO-stopbeingevil' : your "almost as easy" is a stretch of "wouldn't be that much more expensive" in the accepted answer, which proposes a strategy where the cost is increased by a factor $64=2^6$, and adds a parallel collision search for the MD5 part. — fgrieu, Mar 02 '23 at 07:32
You cannot defeat all collisions for a hash function with finite codomain. — kodlu, Mar 02 '23 at 10:40
@Vaelus Admittedly, I don't know the true inner workings of many hashing algorithms. But my theory was based on the internal mechanisms being different, which would make finding a collision for two algorithms that "go about creating a digest" differently exponentially more difficult. However... you're answer has clarified that my theory was incorrect. Thank you. — Eddie, Mar 02 '23 at 15:43
The pigeon-hole principle alone tells us that collisions are still possible. — JimmyJames, Mar 02 '23 at 16:16

score 4 · Answer 1 · answered Mar 01 '23 at 19:25

4

No, concatenating two hashes gives you at least the collision resistance of either but in many practical cases it will give you little more.

This is especially truely for MD hash functions where we know how to convert collisions into many way multi collisions. We can make 2^64 multi way sha1 collision and expect one will collide also in MD5.

answered Mar 01 '23 at 19:25

Meir Maor

11,835
1
23
54

but in many practical cases it will give you little more -- could you expand on this a bit? And specifically speak to why it wouldn't give you an additional "the full collision resistance" of the other hashing algorithm? – Eddie Mar 01 '23 at 20:23
try this question: https://crypto.stackexchange.com/questions/103630/does-using-multiple-hashes-to-check-if-a-file-has-been-spoofed-reduce-collisio – Meir Maor Mar 02 '23 at 11:00

score 4 · Accepted Answer · answered Mar 02 '23 at 04:55

No, concatenating the result of two different hashing algorithms does not defeat all collisions. You've overlooked the case where $\text{MD5}(A)=\text{MD5}(B)=X$ and $\text{SHA1}(A)=\text{SHA1}(B)=N$. In English, that's when a pair of inputs collides for both hash functions.

Furthermore, assuming a hash function's output is truly uniformly distributed for any given set of inputs (this isn't actually true, but for our purposes, it's close enough to true for modern cryptographic has functions), the collision resistance of $\text{HASH}^P_\text{256-bit}(A) +\text{HASH}^Q_\text{256-bit}(A)$ is exactly equal to the collision resistance of $\text{HASH}_\text{512-bit}(A)$.

Again, assuming a uniform distribution, the chance two inputs collide for an N-bit hash is $\left(\frac{1}{2}\right)^N$, or one in two for each bit of output. Assuming the chance that a pair of inputs collides for hash $P$ is independent of the chance of collision of the pair for hash $Q$, the chance a pair collides for both is the product of the chance it collides for each hash individually. Given this, it's clear the chance of collision is identical either way, since $\left(\frac{1}{2}\right)^{256}\cdot\left(\frac{1}{2}\right)^{256}=\left(\frac{1}{2}\right)^{512}$.

Perhaps some people are sufficiently paranoid that they would prefer not to assume a uniform distribution -- e.g. if in the future it turns out that one of the hashing algorithms contains an exploitable flaw but the other algorithm does not, then using the concatenation of the two algorithms' output could be more resistant to exploitation than just using a single algorithm to generate twice the number of hash-bits? — Jeremy Friesner, Mar 02 '23 at 06:14
This was the confirmation I was seeking. Thank you. Follow on question, is there a reason why doing a single 512-bit hash is "better" than concatenating two different 256-bit hashes (faster, more secure, simpler, lower cpu, anything)? — Eddie, Mar 02 '23 at 15:44
@Eddie This will depend entirely on the specific hash functions. For example, some processors implement hardware acceleration for specific hash functions, so one method may be significantly faster than the other on some hardware. In general though, assuming none of the hash functions have a known weakness, I don't think there's a strong technical reason to choose one or the other. That said, I think many people would find the decision to concatenate two cryptographic hashes perplexing in the absence of a strong technical reason, since it presumably increases implementation complexity. — Vaelus, Mar 02 '23 at 23:41

Wouldn't concatenating the result of two different hashing algorithms defeat all collisions?

2 Answers2