Is the E_k(m)||H(m) form of MAC secure?

Question

Would this encryption and signing scenario be considered tamper-proof?

plaintext = "hello world"
password = "secret key"
hash = sha3_512(plaintext)
ciphertext = encrypt(plaintext,password)
output = ciphertext+hash

I'm computing the hash of the plaintext and then appending it to the ciphertext output. I know this isn't exactly a MAC or HMAC, but does this scheme work? Is it secure? From my perspective, the only way to forge the plaintext is to know the key; otherwise, any change will be reflected in the hash. Assume that the cipher is non-malleable. I believe this is the scheme used by Mcrypt for file encryption. Thanks.

This is not a duplicate because it would be infeasible to brute-force the plaintext from the hash as described in the possible duplicate question, scenario 3.

Yes and no. My scenario is the third one described in the question, which is noted as insecure. The data I'm encrypting are large files, so it would be impossible to recover the whole file just by brute-forcing the hash, right? If the file is 2KB, that would require 2^(2048*8) guesses, which is totally infeasible. – Evan Su, Jan 26 '21 at 23:20
It is not about the size of the file or brute force; it is about the set of possible files that you may send. An attacker who knows a lot about you may collect all the files you have and then keep their hashes. In the end, security is assessed against your risks. – kelalaka, Jan 26 '21 at 23:23
But that would require the attacker to have plaintext copies of my data. If I only send encrypted ones, that would mitigate this risk, am I correct? – Evan Su, Jan 26 '21 at 23:27
I don't know anything about your attacker, your habits, or your files. The point is that there are many cases in which files you have sent may leak information about the message. What if the attacker is one of the three-letter agencies that collects all open messages (of all people) and checks them against the hashes seen in transmission? Can you consider yourself safe from this? – kelalaka, Jan 26 '21 at 23:34
Good point. I think you've explained it well, so you can add an answer and I'll accept it. Thanks. – Evan Su, Jan 26 '21 at 23:54
Beware that for large files the file size is often enough to identify the file. For larger file sizes there may be a quickly diminishing number of files that have the same file size in bytes. So in that case the ciphertext size itself may also leak information about the (possible) files used. In that case the hash would confirm any suspicions. – Maarten Bodewes, Jan 27 '21 at 13:36
Confidentiality does not depend on file size, of course: there are plenty of fully public files out there that are not confidential at all, and single bits of information (true/false) that are entirely confidential. – Maarten Bodewes, Jan 27 '21 at 13:37
Great point, I didn't think of that. If I never send the unencrypted "possibilities", then this scenario would never be usable, I think. – Evan Su, Jan 27 '21 at 14:11
Yes, if the message is hard to guess then you've got a bit more room. However, if you think of message m as being secret then you have a problem with length extension attacks, and you'd still not be secure. Use a MAC instead and read the link I provided under Yehuda's answer. By now, "don't roll your own crypto" may be dawning on you :) – Maarten Bodewes, Jan 27 '21 at 17:20

kelalaka · Accepted Answer · 2021-01-27T00:27:58.650

The point (also see this answer) is that anybody can compute the hash, and by Kerckhoffs's principle we assume that your method is known. Anybody can calculate the hash of any information, and this may leak the encrypted message.

In cryptography, we consider attackers to be computationally bounded, but otherwise free to use any method to their advantage.

If the possible attacker is one of the three-letter agencies, then we can assume that they collect all public information where available, including your personal information, your networks, your habits, etc. They can then keep the hash of each item in their humongous database (like the Utah Data Center), and once they see a message transmitted as $E_k(m)||H(m)$ they will check whether $H(m)$ exists in it. A hit today will reveal the message you sent. They will not stop there if they fail: they will store the message and regularly check the hash against the updated database.
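The lookup described above can be sketched in a few lines. This is only an illustration: the corpus `known_documents`, its file names, and the helper `identify_plaintext` are made up for the example, and the ciphertext is modelled as opaque random bytes because the attacker never needs to decrypt it.

```python
import hashlib
import os
from typing import Optional

HASH_LEN = 64  # SHA3-512 digest size in bytes

# Hypothetical corpus of documents the attacker has collected over time.
known_documents = {
    "press_release.txt": b"Quarterly results will be announced on Friday.",
    "memo.txt": b"hello world",
    "readme.txt": b"Installation instructions for the tool.",
}

# Precomputed index: SHA3-512 digest -> document name. No key is involved.
hash_index = {
    hashlib.sha3_512(body).digest(): name
    for name, body in known_documents.items()
}

def identify_plaintext(transmission: bytes) -> Optional[str]:
    """Return the name of a known document whose hash was appended, if any."""
    appended_hash = transmission[-HASH_LEN:]
    return hash_index.get(appended_hash)

# Intercepted E_k(m) || H(m): the ciphertext part is opaque to the attacker.
opaque_ciphertext = os.urandom(11)
transmission = opaque_ciphertext + hashlib.sha3_512(b"hello world").digest()

print(identify_plaintext(transmission))  # memo.txt
```

A miss is not final either: the attacker can keep the intercepted hash and retry whenever the index grows.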

Therefore this is not a good way to authenticate messages. In modern cryptography, we have Authenticated Encryption (AE) modes like AES-GCM and ChaCha20-Poly1305 that provide confidentiality, integrity, and authentication in one bundle. Remember that nothing is perfect: each mode has requirements that must be met for it to stay secure, such as never reusing a nonce under the same key with AES-GCM.

score 4 · Answer 2 · answered Jan 27 '21 at 08:28

The answer given by @kelalaka is 100% correct; this breaks the security of encryption and so shouldn't be used. However, I want to add that this doesn't even guarantee integrity. In particular, integrity should hold even if the attacker knows the message. Assume that the attacker knows $m$ and wishes to change the first bit. This change can be easily made (in both CBC and CTR modes), and then the attacker can just replace $H(m)$ with $H(m')$ where $m'$ is just $m$ with the first bit flipped (or whatever other modification desired).

You might say that this only works if the attacker knows $m$, but as I said, integrity should hold in that case as well.
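To make the forgery concrete, here is a minimal sketch for the CTR case. The `keystream` function is a toy stand-in built from SHA3-512 (an assumption for illustration, not Serpent or any real cipher), and `seal`, `unseal`, and `forge` are hypothetical names. The point is only that `forge` never touches the key.

```python
import hashlib
import os

HASH_LEN = 64  # SHA3-512 digest size in bytes

def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy CTR-style keystream (illustrative stand-in for a real cipher)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        block = hashlib.sha3_512(key + nonce + counter.to_bytes(8, "big"))
        out += block.digest()
        counter += 1
    return bytes(out[:length])

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def seal(key: bytes, nonce: bytes, plaintext: bytes) -> bytes:
    """The scheme from the question: E_k(m) || H(m)."""
    ciphertext = xor(plaintext, keystream(key, nonce, len(plaintext)))
    return ciphertext + hashlib.sha3_512(plaintext).digest()

def unseal(key: bytes, nonce: bytes, output: bytes) -> bytes:
    """Receiver: decrypt, then compare against the appended hash."""
    ciphertext, appended = output[:-HASH_LEN], output[-HASH_LEN:]
    plaintext = xor(ciphertext, keystream(key, nonce, len(ciphertext)))
    if hashlib.sha3_512(plaintext).digest() != appended:
        raise ValueError("hash mismatch")
    return plaintext

def forge(output: bytes, known: bytes, wanted: bytes) -> bytes:
    """Attacker: needs the original plaintext, but not the key."""
    assert len(known) == len(wanted)
    ciphertext = output[:-HASH_LEN]
    # XORing in (m xor m') turns an encryption of m into one of m'.
    forged_ciphertext = xor(ciphertext, xor(known, wanted))
    return forged_ciphertext + hashlib.sha3_512(wanted).digest()

key, nonce = os.urandom(32), os.urandom(16)
original = seal(key, nonce, b"hello world")
forged = forge(original, b"hello world", b"jello world")
print(unseal(key, nonce, forged))  # b'jello world'
```

The receiver's check passes because the appended hash is recomputed by the attacker over the modified message.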

answered Jan 27 '21 at 08:28

Yehuda Lindell


But wouldn't that require knowledge of the key as well? I'm using the Serpent cipher in CTR. – Evan Su Jan 27 '21 at 14:07
No, because the hash does not depend on the key. If you want that, you should use a MAC, and then you get into this discussion. – Maarten Bodewes Jan 27 '21 at 17:17
Thanks, that discussion is very detailed and useful! – Evan Su Jan 27 '21 at 17:56
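For contrast with the unkeyed hash, here is a rough sketch of a keyed tag using Python's standard `hmac` module, computed over the ciphertext (encrypt-then-MAC). The separate `mac_key`, the helper names, and the random bytes standing in for the ciphertext are all assumptions made for the example.

```python
import hashlib
import hmac
import os

def tag_ciphertext(mac_key: bytes, ciphertext: bytes) -> bytes:
    """Keyed tag over the ciphertext using HMAC with SHA3-512."""
    return hmac.new(mac_key, ciphertext, hashlib.sha3_512).digest()

def verify(mac_key: bytes, ciphertext: bytes, tag: bytes) -> bool:
    """Constant-time comparison of the expected and received tags."""
    return hmac.compare_digest(tag_ciphertext(mac_key, ciphertext), tag)

mac_key = os.urandom(32)      # assumed independent of the encryption key
ciphertext = os.urandom(11)   # stand-in for the real cipher output
tag = tag_ciphertext(mac_key, ciphertext)

# Attacker flips the first bit and recomputes a plain, unkeyed hash.
tampered = bytes([ciphertext[0] ^ 1]) + ciphertext[1:]
attacker_tag = hashlib.sha3_512(tampered).digest()

print(verify(mac_key, ciphertext, tag))         # True
print(verify(mac_key, tampered, tag))           # False
print(verify(mac_key, tampered, attacker_tag))  # False
```

Because the tag depends on `mac_key`, an attacker cannot recompute it after modifying the data, and because it covers the ciphertext rather than the plaintext, it cannot be matched against hashes of candidate files.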
