Can I jettison MAC if I already have SHA1(M)?

Question

I'm currently using SSL with AES-CBC and HMAC for a file transfer containing string M. Now suppose Alice already knows SHA1(M) (and the adversary does not), and she downloads M from Bob using only AES-CBC without a MAC.

Normally this would be ill-advised, but since she can check SHA1(M) after the download completes, isn't this still secure?

If you want to go this route, I'd use an encryption mode that's immune to padding oracles. With your current choice it's far too easy to make a subtle mistake that compromises confidentiality. – CodesInChaos, Jul 29 '12 at 18:51
Your alternate protocol seems a bit underspecified. Are you still using a TLS handshake? Does Bob's response consist only of M? What padding do you use? Is the message split into several blocks, or is it a single continuous stream with an IV at the beginning and the padding at the end? Do you send multiple messages through the same channel? – CodesInChaos, Jul 29 '12 at 20:50
I know this is old, but there is some controversy here in the answers. What do you mean by "secure"? — mikeazo, Feb 20 '15 at 03:10

D.W. · Answer 1 · 2012-07-29T22:10:51.900

The short answer: No. It is not secure.

Details. To answer the question properly, we first have to decide what we mean by "secure". In this case, I assume security means confidentiality plus integrity. So let's talk about each separately.

Integrity: yes, this provides integrity, under your assumptions. @poncho explained why.
Confidentiality: no, this does not provide strong confidentiality protection. There are chosen-ciphertext attacks that may allow an active attacker to learn partial information about M.

The details and feasibility of such an attack will depend upon things like exactly what padding scheme you use, whether known plaintext is available, and so on. However, the basic concept is: the attacker replaces Bob's transmission with a carefully constructed ciphertext, and watches Alice's reaction. In many systems, Alice's reaction will reveal whether the decrypted message matched Alice's hash. Thus, the attacker learns 1 bit of information from Alice's reaction. And, in certain situations, an attacker can leverage this to learn partial information about M, by choosing his ciphertext cleverly.

Read about padding oracle attacks and that sort of thing to learn more.

Summary. Use a message authentication code in encrypt-then-authenticate mode, or use authenticated encryption. This stuff is tricky and subtle. Don't try to take shortcuts. It is too easy to open up a non-obvious security weakness.
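As a rough illustration of what encrypt-then-authenticate means in practice, here is a minimal sketch of the verify-before-decrypt ordering, using only Python's standard `hmac` and `hashlib` modules. The function names `tag_ciphertext` and `check_then_release`, the separate `mac_key`, and the choice of HMAC-SHA256 are assumptions made for illustration; the cipher itself is left out, so `ciphertext` is treated as opaque bytes.

```python
import hashlib
import hmac
from typing import Optional


def tag_ciphertext(mac_key: bytes, iv: bytes, ciphertext: bytes) -> bytes:
    """Sender side: compute the MAC over the IV and ciphertext, not the plaintext."""
    return hmac.new(mac_key, iv + ciphertext, hashlib.sha256).digest()


def check_then_release(mac_key: bytes, iv: bytes, ciphertext: bytes,
                       tag: bytes) -> Optional[bytes]:
    """Receiver side: hand the ciphertext on for decryption only if the tag verifies.

    Returns None for any modified IV or ciphertext, before padding is ever inspected.
    """
    expected = hmac.new(mac_key, iv + ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, tag):
        return None
    return ciphertext
```

With this ordering, a substituted ciphertext is rejected before any decryption happens, so Alice's reaction does not depend on what the forged ciphertext would have decrypted to.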

More details: (added 7/29) I see some folks don't believe that the confidentiality attacks are possible. The misplaced confidence in the impossibility of such attacks is an interesting comment on how people's convictions about what is secure are not necessarily correlated to what is actually secure, but I guess I should give further references to demonstrate the possibility of such attacks, as they are non-obvious, counter-intuitive, and quite surprising. To learn more, see, e.g.,

Attacking the IPsec Standards in Encryption-only Configurations, Jean Paul Degabriele and Kenneth G. Paterson, IEEE Security and Privacy 2007.
Problem areas for the IP security protocols, Steven M. Bellovin, Usenix Security 1996, Section 3.8.
Password Interception in a SSL/TLS Channel, Brice Canvel et al, CRYPTO 2003.
Intercepting Mobile Communications: The Insecurity of 802.11, Nikita Borisov et al, MOBICOM 2001, Sections 4.4.2, 4.5.

Pro tip: you might want to make sure you understand the attacks in those papers before feeling too confident that the scheme proposed here can definitely resist those kinds of attacks.

An example attack: For those who have read the papers, here's a sketch of an example attack against the scheme proposed here.

Let's make some assumptions to keep the example simple. Suppose that, before encryption, the message M is padded out to a multiple of 16 bytes by appending arbitrary bytes and then a single byte indicating how many bytes of padding should be removed. Suppose for simplicity of exposition that the message M is always exactly 1 byte long. This means M will be padded before encryption out to 16 bytes by appending 14 random bytes and then 1 byte with the value 0x0F. Suppose that the attacker has previously captured plenty of known plaintext and its corresponding encryption, and now the attacker has intercepted the encryption of M and wants to learn something about M.

Now, suppose the attacker has a guess G at the value of the message (G is a single byte), and the attacker wants to confirm whether M=G or not. Here's how the attacker can confirm his guess.

Let $IV$ denote the IV used when encrypting M; note that $IV$ is known to the attacker. The attacker looks through his collection of known input-output pairs $(P_i,C_i)$ for AES (each one block wide); these can be derived from the known plaintext and ciphertext for a CBC-mode encryption. The attacker looks for a pair $(P_i,C_i)$ such that $IV \oplus P_i$ has G as its first byte and 0x0F as its last byte. The attacker replaces the ciphertext that Bob sent with the ciphertext $(IV, C_i)$ and waits to see Alice's reaction. Alice decrypts $C_i$ to $P_i$ and XORs it with $IV$, so she sees the one-byte message G followed by 15 bytes of padding. If Alice rejects this message, then the attacker learns that $M\ne G$. If Alice accepts this message, the attacker learns that $M=G$.
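The guess-confirmation attack can be simulated in a few lines. The sketch below is a toy under loudly stated assumptions: Python's standard library has no AES, so a small hash-based Feistel network (`toy_encrypt_block` / `toy_decrypt_block`) stands in for the block cipher, and instead of searching a large collection of known pairs, the hypothetical helper `known_pair_for_guess` manufactures the one pair the attacker would have found. All function names are illustrative.

```python
import hashlib
import os

BLOCK = 16  # block width in bytes, as for AES


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def _round(key: bytes, r: int, half: bytes) -> bytes:
    # Round function of the toy Feistel cipher (NOT AES; demo stand-in only).
    return hashlib.sha256(key + bytes([r]) + half).digest()[:8]


def toy_encrypt_block(key: bytes, block: bytes) -> bytes:
    left, right = block[:8], block[8:]
    for r in range(4):
        left, right = right, xor(left, _round(key, r, right))
    return left + right


def toy_decrypt_block(key: bytes, block: bytes) -> bytes:
    left, right = block[:8], block[8:]
    for r in reversed(range(4)):
        left, right = xor(right, _round(key, r, left)), left
    return left + right


def bob_encrypt(key: bytes, m: bytes):
    # One-byte message, 14 random padding bytes, then the length byte 0x0F.
    assert len(m) == 1
    padded = m + os.urandom(14) + b"\x0f"
    iv = os.urandom(BLOCK)
    return iv, toy_encrypt_block(key, xor(padded, iv))  # single-block CBC


def alice_accepts(key: bytes, expected_digest: bytes, iv: bytes, c: bytes) -> bool:
    # Alice decrypts, strips the padding, and compares SHA1 with the value she knows.
    padded = xor(toy_decrypt_block(key, c), iv)
    n = padded[-1]
    if n < 1 or n > BLOCK:
        return False
    return hashlib.sha1(padded[:BLOCK - n]).digest() == expected_digest


def known_pair_for_guess(key: bytes, iv: bytes, guess: int):
    # Shortcut for the demo: stands in for finding, in a large known-plaintext
    # collection, a pair (P_i, C_i) where IV xor P_i starts with G and ends in 0x0F.
    target = bytes([guess]) + os.urandom(14) + b"\x0f"
    p_i = xor(iv, target)
    return p_i, toy_encrypt_block(key, p_i)


def recover_message(key: bytes, expected_digest: bytes, iv: bytes):
    # The attacker only observes accept/reject; the key is used solely by the shortcut.
    for guess in range(256):
        _, c_i = known_pair_for_guess(key, iv, guess)
        if alice_accepts(key, expected_digest, iv, c_i):
            return bytes([guess])
    return None
```

Calling `bob_encrypt` to obtain an IV and then `recover_message` with that IV recovers the one-byte message using nothing but Alice's accept/reject behaviour, which is the one bit of leakage described above.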

Therefore, the possibility of this attack means that the scheme proposed in this question does not provide semantic security: it is not IND-CCA2 secure. Put in non-technical terms, it does not provide the level of confidentiality protection that we have come to expect from an encryption scheme. In practice, such a vulnerability might be quite severe and might be readily exploitable by bad guys.

If you've read the research literature that I cited above, none of this is anything especially novel or new. Of course, if you're not familiar with that research literature, this stuff is thoroughly surprising and unexpected. So, take this as a lesson about the dangers of departing from accepted practice, even if everything seems fine to you. If you're not an expert in this area, you probably aren't qualified to judge whether your scheme is secure -- and if you are an expert, I suspect you'll probably be especially leery of departing from accepted practice, given the number of surprising attacks that have been discovered in the past.

I don't agree with your short answer. It's secure if implemented correctly. But with this combination of encryption and authentication it's very easy to make an implementation mistake. — CodesInChaos, Jul 29 '12 at 18:47
@CodesInChaos, I understand that the existence of the kind of attacks I described is surprising and counter-intuitive, but it turns out it doesn't matter whether you agree or not; what matters is that the attack is possible. — D.W., Jul 29 '12 at 19:31
I believe those attacks can be avoided if you're really careful in the implementation. But I'd avoid such a scheme because they're so hard to get right, while offering no real advantage over modes which do not suffer from these issues. — CodesInChaos, Jul 29 '12 at 20:05
"I believe those attacks can be avoided if you're really careful in the implementation" - Nope. Your belief is misplaced. They cannot. There are some protocols where, no matter how careful you are in implementation, if you implement the specified protocol, you will be vulnerable. If you read the papers I cited, you will find examples. Or, read the detailed example I added to the end of my answer. — D.W., Jul 29 '12 at 20:18
I disagree with this short answer and also with the advice in this "Summary". I disagree with the short answer because it assumes that the padding scheme used is amenable to that sort of chosen-plaintext attack. In fact there are padding schemes that, when combined with implementations free of timing and error-handling side channels, prevent all such attacks. (I.e. padding schemes in which there is a bijection from message to padded-message, and therefore there is no ciphertext block other than the original ciphertext block which will result in a plaintext that matches the hash. … — Zooko, May 02 '14 at 03:32
I disagree with the advice in the summary because the proposed alternative — to use a message authentication code or authenticated encryption — has weaker security properties than the original proposal to use the secure hash of the plaintext has. Whether those reduced security properties would entail actual risk to the users can't be determined without more information about the way this protocol is used. — Zooko, May 02 '14 at 03:35
However, I should add that the short answer is likely to be right, in that the scheme and its implementation are likely to expose the sort of vulnerability that D.W. described. Likewise, the advice to use authenticated encryption might be perfect, depending on the needs of the user. So these answers are not wrong, but they overstate the case when they say that the proposal is definitely not secure. It might be. — Zooko, May 02 '14 at 03:36

score 6 · Answer 2 · answered Jul 21 '12 at 16:28

It is easy to see that this is secure, in the sense that the attacker cannot cause Alice to accept any download except for the file that Bob originally sent.

This remains true even if the attacker knows the encryption (CBC) key (alternatively, Alice and Bob don't bother to encrypt the message at all), and if the attacker also knows the correct $SHA1(M)$ value, as long as the attacker cannot choose the file $M$.

This is because if the attacker can cause Alice to accept another message $M' \ne M$, then that's because $SHA1(M) = SHA1(M')$, and so the attacker has effectively found a second preimage for the message $M$. As far as we know, this is infeasible to do with SHA1.

Personally, I would suggest using (say) SHA256 rather than SHA1; that way, we don't have to even make the assumption that the attacker cannot influence $M$.
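For concreteness, Alice's check might look like the following sketch. It assumes she holds the expected digest as a hex string obtained out of band; the function name `download_matches` and the SHA-256 default are illustrative choices, not part of the original scheme.

```python
import hashlib
import hmac


def download_matches(data: bytes, expected_hex: str, algorithm: str = "sha256") -> bool:
    """Return True only if the digest of the downloaded bytes equals the known value."""
    digest = hashlib.new(algorithm, data).hexdigest()
    # compare_digest avoids leaking, through timing, where the two strings first differ.
    return hmac.compare_digest(digest, expected_hex.lower())
```

Passing `algorithm="sha1"` reproduces the check from the question; this addresses integrity only and says nothing about confidentiality.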

Now, you might ask "if this works, why do we bother with MACs"? Well, I believe that a large part of it is that it is fairly unusual that Alice knows the value of $SHA1(M)$ independently of the download; in most cases, the receiver has no a priori information about the message contents.

poncho

Thanks for the reply. Your "secure" above works even without encryption(!). My intent was to preserve privacy even in some "reasonable" attack model, meaning the adversary cannot decrypt C=E_K(M) even with access to a decryption oracle and subject to the usual complexity-theoretic limits. – Fixee Jul 21 '12 at 20:10
@Fixee: not requiring encryption for security is not surprising; I was specifically talking about security from integrity attacks (that is, attacks whose goal is to modify the received message); in general, you don't need encryption for that. However, if you have a separate security requirement of privacy, then of course you need encryption to address that. I am a bit confused about your "reasonable" attack model that includes a decryption oracle; if the attacker had a decryption oracle, he could just ask the oracle for the decryption of C. – poncho Jul 23 '12 at 13:11
It's standard in CCA security to give the adversary a decryption oracle (that's what every definition of CCA security does, that I've seen). You of course don't give the adversary credit for decrypting a message he's encrypted with the corresponding encryption oracle. (http://en.wikipedia.org/wiki/Ciphertext_indistinguishability#Indistinguishability_under_chosen_ciphertext_attack.2Fadaptive_chosen_ciphertext_attack_.28IND-CCA.2C_IND-CCA2.29) – Fixee Jul 24 '12 at 23:12
Poncho, @Fixee was on the right track. In fact, there are some tricky attacks on confidentiality. Therefore, I do not think I would call this scheme secure. (Your answer considers only integrity, but integrity is only half of the story.) See my answer for elaboration. – D.W. Jul 29 '12 at 07:06

CodesInChaos · Answer 3 · 2015-02-19T14:01:09.330

3

You can be sure that the attacker did not manipulate the file. It's preferable to use a hash function that's collision resistant, even if it doesn't seem to be strictly necessary in your application.

But confidentiality against an active attack is problematic, since the authentication happens inside the encryption (MAC-then-encrypt). A careless implementation that leaks whether the padding was valid will allow an attacker to recover the whole plaintext via a padding oracle. A careful implementation can reduce the leak, but that's pretty tricky to do and might still allow some attacks. I don't see any attack that works, but why would you take such an unnecessary risk? If you want/need to rely on hash verification, use CTR or OFB mode, which are far easier to reason about than CBC or CFB.

Finally, with your method Alice can only reject or accept the data once the download has completed. This requires a full re-download on error, and prevents streaming use. This can be avoided with tree-hashing or packet-wise MACs.
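One simple way to avoid the all-or-nothing check is a flat list of per-chunk digests, a simpler relative of tree hashing. The sketch below assumes Alice already holds a trusted SHA-256 digest for each chunk; the name `verified_chunks` and the error handling are illustrative.

```python
import hashlib
import hmac
from typing import Iterable, Iterator, Sequence


def verified_chunks(chunks: Iterable[bytes],
                    expected: Sequence[bytes]) -> Iterator[bytes]:
    """Yield each chunk only after its SHA-256 digest matches the trusted value."""
    count = 0
    for index, chunk in enumerate(chunks):
        if index >= len(expected):
            raise ValueError("more chunks than expected digests")
        actual = hashlib.sha256(chunk).digest()
        if not hmac.compare_digest(actual, expected[index]):
            # Only this chunk needs to be fetched again, not the whole file.
            raise ValueError(f"chunk {index} failed verification")
        count += 1
        yield chunk
    if count != len(expected):
        raise ValueError("download ended before all chunks arrived")
```

Because each chunk is released as soon as it verifies, the receiver can stream the data and re-request a single bad chunk instead of repeating the whole download.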

edited Feb 19 '15 at 14:01

answered Jul 21 '12 at 14:23

CodesInChaos

Thanks. Actually in my application there is a digest for every 4MiB, so the resend-problem isn't a problem. I wrote my question in a simplified form to focus on the essential issue. – Fixee Jul 21 '12 at 15:46
I thought about padding oracles, by the way, but any adversarial message is overwhelmingly likely to just be rejected (the same effect a MAC would cause). Another concern is extension attacks, but I think I've ruled those out as well. – Fixee Jul 21 '12 at 20:08
@Fixee If you validate the padding first, and only then hash, an attacker might be able to use that to decrypt the message, even if he can't replace it with his own. – CodesInChaos Jul 21 '12 at 20:54
Yeah, I know the padding attacks well. – Fixee Jul 22 '12 at 02:35
@Fixee- "any adversarial message is overwhelmingly likely to just be rejected" - I believe this is incorrect. It may be true for a randomly constructed ciphertext, but an adversary can be smarter than that. – D.W. Jul 29 '12 at 07:08
@D.W. My assertion that adversarial messages will (almost certainly) be rejected is based on the assumption that a string whose SHA1 digest is fixed and known is effectively immutable: any perturbation to it will be detected. Can you give an example where this is false (choose any padding scheme you like)? – Fixee Jul 29 '12 at 18:44
@Fixee, sure, the ciphertext may be different but the decrypted message may be the same (once you remove padding). Thus, there is no perturbation to the "message" but there is a perturbation to the ciphertext. Thus, the attacker's ciphertext might still be accepted, even though it is different from what Bob sent. Remember, you're hashing the message, not the ciphertext, so your scheme does not ensure that the ciphertext is effectively immutable. A detailed example is too long to fit in a comment box, but read about padding oracle attacks and other chosen-ciphertext reaction attacks. – D.W. Jul 29 '12 at 19:28
@D.W. I'm not sure why I would care about perturbing the ciphertext if the underlying message is immutable; padding attacks rely on being able to change the message (in order to decrypt it based on padding-validity rules). Anyway, thanks for your input. – Fixee Jul 29 '12 at 20:20
@Fixee, See my answer (which I edited recently to elaborate) for explanation why. A full explanation is too long to fit into this comment box. "padding attacks rely on being able to change the message" - This is not correct. – D.W. Jul 29 '12 at 20:23
