Hash-Then-Encrypt or Encrypt-Then-Hash on Keyed Hash Functions

Question

I have seen other answers here on Stack Exchange regarding MAC-Then-Encrypt vs. Encrypt-Then-MAC (and this article regarding MAC-Then-Encrypt padding oracle attacks on SSL) as well as generic Hash-Then-Encrypt vs. Encrypt-Then-Hash, but in this case I am seeking insights on the security aspects of a specific authentication protocol employing a keyed hash function:

$$\text{Alice} \xrightarrow{\; m \,\|\, h_k(m) \;} \text{Bob}$$

In this setup, where $m$ denotes the transmitted message from Alice to Bob and $k$ represents the shared secret key, the message travels over a public channel susceptible to modification and message insertion by attackers.

Assume we utilize the encryption function $E_k$ of a one-key cipher and a hash function $h$. The cipher is deemed secure, and $h$ possesses the weak collision resistance property and is one-way.

Given that $m$ is public and the hash function $h$ satisfies weak collision resistance, which of the following keyed hash functions provides greater security?

$$h_k(m) = h(E_k(m))$$
$$h_k(m) = E_k(h(m))$$

My understanding is that Hash-Then-Encrypt might offer similar security to Encrypt-Then-Hash due to the computational infeasibility of an attacker finding another $m'$ such that $h(m) = h(m')$. Can someone confirm or provide additional insights on this?

Oh yeah, it acts as a MAC in this case, yes. Would MAC-Then-Encrypt be less secure in this case for the exact reason that it is mathematically possible to find a collision? Even if it's computationally infeasible and basically the same as winning in the lottery? — Hero, Oct 31 '23 at 11:05
See HMAC proof or KMAC constructions. This recovers a proof-based guarantee since no known attacks compromise the pseudorandomness of the compression function, and it also helps explain the resistance to attack that HMAC has shown even when implemented with hash functions whose (weak) collision resistance is compromised — kelalaka, Oct 31 '23 at 11:09
So, from the HMAC proof, both could be said to be resistant to attacks, even if collisions exist? — Hero, Oct 31 '23 at 11:17
Do I understand correctly that the keyed hash (i.e. MAC) function consists of encrypting the message and then hashing it using a normal, unkeyed hash, or encrypting an unkeyed hash over the message? — Maarten Bodewes, Oct 31 '23 at 17:44
@MaartenBodewes yes, the hash functions, h, are unkeyed in both cases - but it is assumed that they are well-designed (weak collision resistance and one-way). — Hero, Oct 31 '23 at 17:58
@MaartenBodewes the key, k, used in the one-key cipher, $E_k$, is kept secret between Bob and Alice. The key is assumed to be exchanged securely beforehand. The specific cipher used can be assumed to be known by the attacker. But, the attacker can see both $m$ as well as either of the two keyed hash function outputs, so $E_k(h(m))$ or $h(E_k(m))$. So the attacker can compute $h(m)$ themselves. — Hero, Nov 01 '23 at 04:51
Neither construction is particularly secure. Normally I'd go for 1. except for small messages (why?). For 2 imagine what happens if the encryption is a stream cipher - imagine that the attacker knows the plaintext message and another message, they obviously can calculate the hash for those... Note that the cipher itself doesn't offer authenticity. — Maarten Bodewes, Nov 01 '23 at 10:47
Yeah, 2 is of course more efficient for longer messages due to the fixed hash size, but the main question is “which of the two keyed hash functions is more secure if used in the given theoretical authentication protocol”. There is of course the mathematical possibility of an attacker finding another $m'$ such that $h(m) = h(m')$, but I’d like to know whether one would normally deem them both to have the same level of security in a theoretical real world scenario. — Hero, Nov 01 '23 at 15:05
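The stream-cipher concern raised in the comments can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `keystream` is a hypothetical toy stream cipher built from SHA-256 (not a real cipher), the keystream is the same for every tag (no fresh nonce), and $h$ is taken to be SHA-256. The attacker sees one pair $(m, E_k(h(m)))$ and never learns $k$.

```python
import hashlib
import hmac


def keystream(key: bytes, n: int) -> bytes:
    """Toy keystream (assumption, illustration only): SHA-256(key || counter)."""
    out = b""
    counter = 0
    while len(out) < n:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:n]


def xor(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def tag_hash_then_encrypt(key: bytes, m: bytes) -> bytes:
    """Construction 2: E_k(h(m)), with E_k modelled as the toy stream cipher."""
    digest = hashlib.sha256(m).digest()
    return xor(digest, keystream(key, len(digest)))


def forge(m: bytes, tag: bytes, m_new: bytes) -> bytes:
    """Attacker side: uses only the public message and its tag, not the key."""
    recovered_keystream = xor(tag, hashlib.sha256(m).digest())
    return xor(hashlib.sha256(m_new).digest(), recovered_keystream)


key = b"shared secret key"  # known only to Alice and Bob
m = b"pay Bob 10"
tag = tag_hash_then_encrypt(key, m)

m_forged = b"pay Eve 1000"
forged_tag = forge(m, tag, m_forged)

# Bob recomputes the tag for the forged message and compares.
forgery_accepted = hmac.compare_digest(
    forged_tag, tag_hash_then_encrypt(key, m_forged)
)
print(forgery_accepted)  # prints True
```

Because $m$ is public, XORing the tag with $h(m)$ reveals the keystream, and no collision in $h$ is needed at all. A block cipher used as $E_k$ would not fail in this particular way, which is why the outcome depends on more than "the cipher is secure".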

score 0 · Accepted Answer · answered Nov 09 '23 at 11:21

After talking to some professors at my university, I have come to the following answer, which I'd like to share:

First of all, what is called "weak collision resistance" here is also what's often referred to as the 2nd pre-image resistance property.

Now, while the 2nd pre-image resistance / weak collision resistance property is usually defined as it being "computationally infeasible" to compute another $m' \neq m$ such that $h(m') = h(m)$, it is still (although very unlikely) mathematically possible to find such an $m'$. It's like winning the lottery: not very likely, but still technically possible.
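To put a rough number on the lottery comparison, here is a back-of-the-envelope bound. It assumes (this is not stated in the question) that $h$ behaves like a random function with $n$-bit outputs and that the attacker tries $q$ distinct candidates $m'_1, \dots, m'_q$:

$$\Pr\big[\exists\, i \le q : h(m'_i) = h(m)\big] = 1 - \left(1 - 2^{-n}\right)^{q} \le q \cdot 2^{-n}$$

For example, with $n = 256$ and $q = 2^{80}$ guesses, the bound is $2^{-176}$.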

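And since this question simply asks "which is more secure", Encrypt-Then-Hash (1) is the more secure construction of the two. With Hash-Then-Encrypt (2), anyone can compute $h$ without the key, so an attacker who finds an $m' \neq m$ with $h(m') = h(m)$ can simply reuse the observed tag, because $E_k(h(m')) = E_k(h(m))$. With (1), the hash input $E_k(m)$ cannot be computed without $k$, so the same search cannot be carried out offline. In a practical setting, though, one could argue that they both provide the same level of security.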
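A small sketch of why a second pre-image breaks construction (2) without touching the key. Everything here is a stand-in chosen for illustration: `weak_h` is SHA-256 truncated to 16 bits so that the search finishes quickly, and `toy_E` is a hypothetical deterministic keyed function playing the role of $E_k$; neither is meant as a real primitive.

```python
import hashlib
from itertools import count


def weak_h(m: bytes) -> bytes:
    """Deliberately weak hash (assumption): SHA-256 truncated to 16 bits."""
    return hashlib.sha256(m).digest()[:2]


def toy_E(key: bytes, block: bytes) -> bytes:
    """Hypothetical deterministic stand-in for E_k on a short block."""
    pad = hashlib.sha256(key).digest()[: len(block)]
    return bytes(b ^ p for b, p in zip(block, pad))


def tag_hash_then_encrypt(key: bytes, m: bytes) -> bytes:
    """Construction 2: E_k(h(m))."""
    return toy_E(key, weak_h(m))


def second_preimage(m: bytes) -> bytes:
    """Offline search for m' != m with weak_h(m') == weak_h(m); no key used."""
    target = weak_h(m)
    for i in count():
        candidate = b"forged-%d" % i
        if candidate != m and weak_h(candidate) == target:
            return candidate


key = b"shared secret key"  # known only to Alice and Bob
m = b"pay Bob 10"
tag = tag_hash_then_encrypt(key, m)

m_prime = second_preimage(m)  # attacker's work, done without the key

# Bob's check: the old tag verifies for the new message.
accepted = tag_hash_then_encrypt(key, m_prime) == tag
print(accepted)  # prints True
```

The same search does not carry over to construction (1) as is, since the attacker would have to evaluate $E_k$ on each candidate to learn what gets hashed. With a full-length hash the search above is of course infeasible, which is the "practical setting" caveat.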

However, in the context of AEAD/AE, you want a collision-resistant MAC for the scheme to be committing. Therefore, weak collision resistance doesn't provide the same level of security as it enables vulnerabilities that full collision resistance does not. — samuel-lucas6, Nov 09 '23 at 13:23
