Why is OTP not vulnerable to brute-force attacks?

Question

I saw this question in the book Understanding Cryptography.

At first glance it seems as though an exhaustive key search is possible against an OTP system. Given is a short message, let’s say 5 ASCII characters represented by 40 bit, which was encrypted using a 40-bit OTP. Explain exactly why an exhaustive key search will not succeed even though sufficient computational resources are available.

It puzzles me: since the key has only $40$ bits, I could try all $2^{40}$ possible keys and XOR each one with the ciphertext to recover the message. Is there something that I am missing? How can it not work if we assume the attacker has the computational power to do the exhaustive search?

If an OTP is applied, you can brute force as much as you want to, you won't learn anything new about the plaintext. See also this related answer. — SEJPM, Mar 03 '16 at 13:01
RevK did a nice article and video about this some time ago - might be worth a read — Dezza, Mar 03 '16 at 15:48
OTP is not vulnerable to brute-force because a dictionary attack against an OTP yields the dictionary itself. — Mindwin Remember Monica, Mar 03 '16 at 17:19
It's the same problem as with the Library of Babel — Vandermonde, Mar 05 '16 at 03:49
It's important to remember that in an OTP the key must be at least as long as the text you are encoding. If the key is repeated to encode a longer text then it starts to become possible to brute force. — Eborbob, Mar 06 '16 at 21:44
Relevant: Bunnie Huang, in his free ebook "Hacking the Xbox", describes how he brute-forced all possible keys and histogrammed the decrypted byte values. "If the key did not match, the output should be statistically “white.” In other words, a histogram of the output should show that all values are roughly equally probable for a non-matching key. However, if the key was the correct one, the histogram should be biased, with some values being significantly more popular than all the other values." // Bunnie did this for RC4. (Continued) — Krazy Glew, Mar 07 '16 at 22:38
I.e. if you can tell by looking at a candidate decrypted text whether it is legit or not, then you can brute force, even an OTP. If you cannot tell, e.g. if the plaintext is a random number, then you cannot tell whether brute force has succeeded. And in a true XOR OTP, every possible output string of the same length is a candidate, so you will not have learned anything. — Krazy Glew, Mar 07 '16 at 22:40
The core of the problem is that OTP doesn't give you any oracle to know whether an output is correct. In contrast, with AES-CBC for instance, the key generation gives you additional information that can be used to test an output. — shumy, Feb 04 '20 at 11:36

score 38 · Accepted Answer · edited Mar 04 '16 at 10:42

Brute-forcing an OTP yields every possible message of that length, meaningful or not.

For example, suppose you have a 4-character encrypted text: weaw. Brute-forcing it produces all sorts of messages, meaningful and meaningless alike, such as:

erwe
hell
road
....

Now, which one was the real message? That is not merely difficult to guess; it is impossible.
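To make this concrete, here is a minimal Python sketch. It assumes the four characters of weaw are ASCII bytes and that the pad is applied with XOR (the answer above does not specify the encoding), and the helper names `xor_bytes` and `key_for` are made up for illustration.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("operands must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def key_for(ciphertext: bytes, wanted_plaintext: bytes) -> bytes:
    """Return the pad under which `ciphertext` decrypts to `wanted_plaintext`."""
    return xor_bytes(ciphertext, wanted_plaintext)


ciphertext = b"weaw"  # assumption: the ciphertext is raw ASCII bytes
candidates = [b"erwe", b"hell", b"road"]

# Each candidate has its own perfectly valid key, so none can be ruled out.
decryptions = {}
for guess in candidates:
    key = key_for(ciphertext, guess)
    decryptions[guess] = xor_bytes(ciphertext, key)
    print(guess.decode(), "<- key", key.hex())
```

Every guess comes with a key that "works", so finding a key that produces a sensible message proves nothing about what was actually sent.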

edited Mar 04 '16 at 10:42

Andrew T.

answered Mar 03 '16 at 12:53

Abhisheietk

I got it. So, if the message were bigger, it would be easy to distinguish the real message (because random texts are unlikely to form English sentences), but in this case, the brute-force attack would be harder... – Vladmostov Mar 03 '16 at 13:13
@Vladmostov No, even for longer messages each message has the same probability. As long as the text has the same size, any message is equally possible (when looking at the binary representation anyway). "Tralalala!" is equally likely as "Shakespear" etc. – Maarten Bodewes Mar 03 '16 at 13:18
@MaartenBodewes: replace "same probability" with "same probability as it would have had you not been given any ciphertext". Because me writing Polish ASCII is much less likely than writing English ASCII, and so two strings wouldn't have "the same" probability. – SEJPM Mar 03 '16 at 13:20
@MaartenBodewes thanks, I just read the link in SEJPM's comment (in the question). – Vladmostov Mar 03 '16 at 13:25
So, one must not have a checksum in the plaintext? – Kijewski Mar 03 '16 at 19:19
Why would that matter? If you have 16 bytes of plaintext and a 4-byte checksum, you only "rule out" decryptions where the checksum is incorrect. But there's an equally plausible decryption with a correct checksum for each of the $2^{128}$ possible plaintexts. That said, one should always MAC after encryption and not before (for reasons I won't go into here). – Stephen Touset Mar 03 '16 at 20:15
Does it just reduce the number of possible decryptions to those which have a valid checksum in it? – Kevin Panko Mar 03 '16 at 20:27
@KevinPanko, a 32-bit OTP-encrypted checksum does mean that only one out of every four billion decryptions has a valid checksum, but the added length means there are four billion times as many decryptions. The two factors cancel out, providing the attacker with no useful information. – Mark Mar 03 '16 at 21:29
That applies to non-OTP as well. – micheal65536 Mar 03 '16 at 21:42
I'm not sure that's the case. It's at least not trivially so. – Stephen Touset Mar 05 '16 at 00:07
@MichealJohnson: That's not true. For OTP, a four-byte checksum lengthens the key by 32 bits (and therefore enlarges the search space by a factor of 2^32). For any possible plaintext you can pick a version of the plaintext with the correct checksum and it will have the same likelihood. For traditional symmetric ciphers, the key size will stay the same, so you can use the checksum to eliminate possible plaintexts. However, this doesn't happen if the checksum occurs after encryption. – Dietrich Epp Mar 05 '16 at 01:14
"No, even for longer messages each message has the same probability." No they don't. That message you intercepted from the U.S. military is far more likely to say "bomb Baghdad" than "bomb Chicago" or "jeeowkvupbq". – David Richerby Mar 05 '16 at 09:01
@David Sure, but the same is true before you brute force the message: the point is that, with an OTP, brute forcing the message will never tell you anything new. All the information you have is the length of the message, that’s it (and that could well be padded). – Konrad Rudolph Mar 05 '16 at 15:12
@KonradRudolph Agreed. But that's a statement about knowledge, not about probability. – David Richerby Mar 05 '16 at 16:14
@David For the purpose of the discussion, knowledge = probability. In particular, brute-forcing an OTP cypher does not shift the prior probabilities of any given message. You’re right that messages don’t have equal probability (due to priors), but that’s irrelevant in a discussion about cracking a cypher. All we care about is whether the act of cracking the message changes these probabilities. – Konrad Rudolph Mar 05 '16 at 16:36
For a brute force attack to be useful, it has to reveal information you didn't already have before it was conducted. For OTP, brute forcing the keystream yields nothing. – Fixee Mar 05 '16 at 19:03

score 22 · Answer 2 · edited Mar 21 '16 at 15:18

What you are missing is the fact that every resulting message is equally possible. There is no way to verify that any of the resulting messages was indeed the message that was sent.

If you have $P_1P_2P_3P_4 \oplus K_1K_2K_3K_4 = C_1C_2C_3C_4$, where each $P_i$, $K_i$ and $C_i$ is one bit, then $C_1C_2C_3C_4$ can take any possible value.

Now assume your brute force tries $A_1A_2A_3A_4$ as the key; then $C_1C_2C_3C_4 \oplus A_1A_2A_3A_4 = Z_1Z_2Z_3Z_4$ can take any value as well. There is no way to test whether $Z_1Z_2Z_3Z_4 = P_1P_2P_3P_4$, though. As there is no relationship at all between the different bits, every $Z$ value is equally likely.

That's why an OTP is perfectly secure for messages of a particular size. Modern ciphers such as AES do have a (very complex) relationship between the bits, so it is possible to check, with some degree of certainty, whether you have the correct plaintext for a given key. With an OTP, the chance that you get the plaintext bit back is exactly $0.5$ per bit, i.e. you don't know whether you guessed right or not.
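The 4-bit case is small enough to enumerate completely. The following Python sketch tries every key $A_1A_2A_3A_4$ against one ciphertext and counts how often each candidate $Z_1Z_2Z_3Z_4$ turns up; the ciphertext value `0b1011` is an arbitrary choice for illustration.

```python
from collections import Counter

N_BITS = 4
ciphertext = 0b1011  # arbitrary example value for C_1C_2C_3C_4

# Try every 4-bit key A and record the resulting candidate Z = C xor A.
candidates = Counter(ciphertext ^ key for key in range(2 ** N_BITS))

# Every 4-bit value shows up exactly once, so no candidate stands out.
for z in sorted(candidates):
    print(f"{z:04b} appears {candidates[z]} time(s)")
```

Changing `ciphertext` to any other 4-bit value only permutes which key produces which candidate; the set of candidates stays the same.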

Steve Jessop · Answer 3 · 2016-03-04T01:40:20.893

First you have to understand why it is possible to do exhaustive key searches on other systems.

Suppose you have a plaintext of length n, ciphertext of the same length n, and a key of length k (all in bits). Then by trying all possible keys we obtain at most 2^k candidate plain texts. If the system has some kind of validation or message integrity built into it then it might be rather less than 2^k. It has to be at least 1, and it might only be 1, in which case exhaustive search always works against that system (of course, if k is big enough we don't care whether exhaustive search works or not, but that's beyond the scope of the question).

But supposing the system itself doesn't tell us which key is correct (which of course OTP does not): if k is much smaller than n, then only a very small proportion of all possible length-n messages will be represented in our exhaustive search. One of them is the correct plaintext, and the rest are not. How do we normally know which one is right? The answer is that normally the others will be garbage[*], because if you pseudo-randomly choose 2^k strings of length n, for k significantly smaller than n, then with very high probability all of them will be garbage. It's only because what we start with is known to be an encrypted message that we have any right to expect any of the outputs to make sense.

So, normally speaking if we find a candidate key that produces sense, we're fairly confident that we've broken the message. We still might not know for sure. For example, perhaps by chance the system has two different keys, one of which deciphers the given ciphertext to "attack at dawn", and the other deciphers it to "attack at dusk". But for cryptosystems that are subject to exhaustive search this must be very unlikely, and so as soon as we find a message that makes sense we have far more confidence than the sender of the message is comfortable with us having, that it is indeed the message they sent. If these are the only two candidate plaintexts that make sense, we've already learned way more about the message than the sender would like. Furthermore suppose (as is often the case for ciphers other than OTP) the sender uses the same key more than once, and the same key produces sense for multiple different ciphertexts. This almost cannot happen by chance, so we are now very confident that we have brute-forced the key.
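As a rough illustration of why exhaustive search works when k is much smaller than n, the Python sketch below uses a deliberately weak toy cipher (every byte XORed with the same single key byte, so k = 8 while n = 112). The cipher, the key value `0x5A`, and the helper names are assumptions made up for this example, and "non-garbage" is approximated as "only lowercase letters and spaces".

```python
import string

ALLOWED = set((string.ascii_lowercase + " ").encode())


def repeat_key_xor(data: bytes, key_byte: int) -> bytes:
    """Toy cipher: XOR every byte with the same single key byte (k = 8 bits)."""
    return bytes(b ^ key_byte for b in data)


def looks_like_text(candidate: bytes) -> bool:
    """Crude garbage filter: accept only lowercase letters and spaces."""
    return all(b in ALLOWED for b in candidate)


message = b"attack at dawn"  # n = 112 bits
secret_key = 0x5A            # arbitrary example key
ciphertext = repeat_key_xor(message, secret_key)

# Exhaustive search over all 2^8 keys, keeping only non-garbage candidates.
survivors = [
    (key, repeat_key_xor(ciphertext, key))
    for key in range(256)
    if looks_like_text(repeat_key_xor(ciphertext, key))
]
print(survivors)
```

With only 256 candidates drawn from a space of $2^{112}$ strings, the filter leaves just the true key and message here, which is exactly the situation described above.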

Now, what about OTP? Then k = n, so even if the outputs were pseudo-random we'd expect many candidates that make sense. What's even worse, exhaustively trying every single key generates every single text of length n as a candidate plaintext. Specifically, the message M is generated by the key M XOR C, where C is the ciphertext. It is guaranteed that we will find a key that deciphers the message to "attack at dawn", and another that deciphers to "attack at dusk", and another that deciphers to "mine's a pint!", and so on for every message of that length.

So if we do our exhaustive search, all it will tell us is that "the plaintext could be any message of length n". Which we knew already. We can still rule out the garbage, but doing so leaves us with every single non-garbage message of the correct length.

The exhaustive search tells us nothing.

[*] "garbage" is not a technical term here, but what I mean is that if the plaintext message is believed to be in English, then most outputs generated will not be English. If it's believed to be a .png file, then most outputs generated will not have the correct .png file header. And so on. For many cryptosystems it's an advantage for the attacker to have a "crib" when doing an exhaustive key search; for OTP it is not.

Why is this an exclusive property of the OTP? Isn't this true for all ciphers where the key length is equal to the message length? — AdHominem, May 03 '16 at 09:19
@AdHominem: I don't know that it is an exclusive property of OTP. However, it's not true for all ciphers whose key length is equal to the message length, that for any given ciphertext and given plaintext there exists a key that encrypts that plain to that cipher. It's pretty easy to invent a "toy" (not useful) cryptosystem that has extremely large keys and does not have this property. — Steve Jessop, May 03 '16 at 09:29
... in fact for any key size you can invent/modify a cryptosystem so that it spectacularly fails to have this property. Simply append an HMAC over the plain text, using the key, to the ciphertext. Then for any given ciphertext there are very few key / plaintext combinations that successfully match it (you rather hope only one), that's the whole point of HMAC. — Steve Jessop, May 03 '16 at 09:32
Could you name an example where this property is not existent? I thought if you have a ciphertext of length n and apply any possible bit mask of length n to it, you should be able to get any possible plaintext. — AdHominem, May 03 '16 at 09:33
@AdHominem: "apply any bitmask of length n" -- I don't understand, are you asking whether this is a property of every cipher that uses XOR with the key, or are you asking whether this is an exclusive property of OTP? Other than OTP there possibly are no well-known named ciphers where the key is the same size as the message, so "naming one" is a rather different issue from whether a perhaps-very-weak cipher exists with huge keys and without this property. — Steve Jessop, May 03 '16 at 09:37
"you rather hope only one" -- actually that was a silly aside, of course there's more than one because the HMAC itself is smaller than the message. What I should have said is, "you rather hope only the minimum proportion dictated by the sizes involved". — Steve Jessop, May 03 '16 at 09:39
No I just meant that if you have a cipher text of size n and an equally sized key, you can generate ANY possible plain text of size n from this in general and for all given cipher texts. Is that true? — AdHominem, May 03 '16 at 10:38
@AdHominem: no, it is false. Consider for example the null cipher (that maps plain text to cipher text unchanged), then modify it to require a key of length n that actually has no effect on the result. — Steve Jessop, May 03 '16 at 10:42
@AdHominem the reason other well-defined cryptosystems, even if they used a key as long as the message, aren't as "secure" as the OTP is that almost everywhere else a given key is used to encrypt multiple messages. With the OTP, even if you knew the message was "valid English", there could be lots of candidates and no way to know which was correct. If you have two messages that are "valid English", your search space is now cut down to the keys that produce a valid message in both cases. The reuse of the key leaks information. — iheanyi, May 23 '17 at 21:30
+1, this should be the accepted answer. It's the only one that breaks down the logic sufficiently to convince doubters: "First you have to understand why it is possible to do exhaustive key searches on other systems." — Wildcard, Jun 12 '17 at 17:23

score 7 · Answer 4 · answered Mar 03 '16 at 17:31

The bottom line answer is this: every possible 5-character ASCII string is equiprobable. Therefore, if you try all possible keys (which is practical, as you noticed), then you will certainly see the correct plaintext string at some point. But you will have no way to know that the correct string is the correct string.

To make this painfully clear, consider OTP with only a 1-bit message. In other words, the plaintext is 0 or 1. Now there is a secret key bit that is 0 or 1 which is XORed with the plaintext.

The ciphertext is therefore 0 or 1, equiprobably. And you can trivially brute force the 1-bit key. But this gives you zero information about which value the plaintext has.

This same logic works for 40 bits too.
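The 1-bit case can be checked by enumeration. This Python sketch tabulates the probability of each ciphertext bit given each plaintext bit, assuming (as an OTP requires) that the key bit is chosen uniformly at random; the variable names are illustrative only.

```python
from fractions import Fraction
from itertools import product

half = Fraction(1, 2)  # probability of each value of the uniform key bit

# P(ciphertext = c | plaintext = m), accumulated over both key bits.
cond = {(m, c): Fraction(0) for m, c in product((0, 1), repeat=2)}
for m, k in product((0, 1), repeat=2):
    cond[(m, m ^ k)] += half

for (m, c), p in sorted(cond.items()):
    print(f"P(c={c} | m={m}) = {p}")
```

All four entries come out the same, so observing the ciphertext bit does not favour either plaintext bit.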

Fixee

And if you're thinking "well, what if the message has a checksum?" Every plaintext with its valid checksum is a possible decrypt, and they're all still equiprobable, even if they're vastly outweighed by the possible decrypts with invalid checksums. – hobbs Mar 04 '16 at 05:17
"every possible 5-character ASCII string is equiprobable" Nonsense. Suppose you just intercepted an OTP-encrypted message from the U.S. military. Do you think it's equally likely to say "bomb Baghdad" as "bomb Chicago"? No, because the U.S. military doesn't generate messages by flipping fair coins. The point is not that they're equiprobable but that there are keystreams that encrypt each of those messages (and "order pizzas") to whatever cyphertext you found. – David Richerby Mar 05 '16 at 08:58
@DavidRicherby The point is that the computer has no way of knowing which of the 2^40 messages is the original. And you, yourself, don't know because even though most of the messages don't spell out English words, there are enough 5 letter English words that you have no way of knowing which was the original one. – CJ Dennis Mar 05 '16 at 09:49
@CJDennis Agreed. But that's a statement about knowledge, not probability. – David Richerby Mar 05 '16 at 16:15
@DavidRicherby It's not nonsense. It's math. The point is that both "bomb Baghdad" and "bomb Chicago" will appear as candidate plaintexts for a 12-character OTP ciphertext, but you have no additional information as to which it is. Or as a cryptographer would put it, "the information you have about the plaintext after seeing the ciphertext is the same as what you had before seeing the ciphertext." You say, "bomb Baghdad" is more likely, but you didn't learn that from the ciphertext; you're relying on information you already had. – Fixee Mar 05 '16 at 18:59
@Fixee It's not math at all. You're making statements about probability in a situation where you don't even have a probability distribution. You're completely correct that the cyphertext provides no information but that's not a statement about probability. Perhaps what you mean is that, assuming the OTP is chosen uniformly at random then, regardless of the plaintext, each cyphertext is equiprobable? – David Richerby Mar 05 '16 at 19:19
@DavidRicherby Agreed. A brute-force OTP crack is guaranteed to give you no extra information about the original message. And that's why it's useless. – CJ Dennis Mar 06 '16 at 04:06
@DavidRicherby If $D$ is the probability distribution on plaintexts before the ciphertext $C$ is known, then we'd say $\Pr[D] = \Pr[D \mid C]$. In other words, the distribution on plaintexts is unchanged by revelation of the ciphertext. – Fixee Mar 07 '16 at 15:55
@Fixee I think you mean "$\Pr_D[X]=\Pr_D[X\mid C]$, where $X$ is the event that the plaintext takes some particular value" -- it doesn't make sense to write $\Pr[D]$ when $D$ is a distribution. Your answer still claims that "every possible 5-character ASCII string is equiprobable." This claim is still false, and is very different from the one you tried to make in your comment. – David Richerby Mar 07 '16 at 16:04
@DavidRicherby, so let's see your answer to this question, then. – Wildcard Jun 12 '17 at 17:27
@Wildcard Pointing out inaccuracies in an answer in no way obliges me to write an answer of my own. – David Richerby Jun 12 '17 at 17:36
@DavidRicherby I think you are conflating two probabilities: 1) what is the probability of a given 5-character ASCII string, vs 2) what is the probability that the given string is a meaningful plaintext. To your point, "bomb Chicago" might be less probable as the plaintext than "bomb Baghdad", which is a fair claim, but it leaves out that the message could equally be "bomb XXXXXXX" where the X's are replaced with all city names of that length, and possibly shorter ones when padded with spaces. You end up with a list of all possible cities as your target; you might as well guess. – Kelly S. French Jun 13 '17 at 15:51

htdawoud · Answer 5 · 2016-03-05T00:20:09.753

While previous answers/comments explained the basic idea, it could help to contrast OTP with pseudo-OTP. In pseudo-OTP: $Enc_s(m) := G(s) \oplus m$, $Dec_s(c) := G(s) \oplus c$, where $G$ is a pseudorandom generator. (This is OTP except that the key is replaced by the output of $G$, seeded by a shorter key $s$.)

The basic idea: brute-forcing OTP doesn't give the attacker any additional information about the plaintext that she didn't already know. Formally, let the plaintext space be the set $P$. Given a ciphertext $c = k \oplus m$ encrypted under OTP, decrypt $c$ with all possible keys $k$, and let the set of generated plaintext values be $Q$. Then $Q = P$. You've learned nothing new.

How is that different with pseudo-OTP? Decrypt $c = G(s) \oplus m$ with all possible keys $s$. Now $Q \subset P$ strictly. Why? Because $|s| < |G(s)| = |m|$, there are only $2^{|s|}$ seeds but $2^{|m|}$ messages in $P$. Thus you've learned something new: the encrypted message is not in $P - Q$.
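The difference between $Q = P$ and $Q \subset P$ can be seen on a toy scale. In this Python sketch the generator `G` is a hypothetical stand-in (the first two bytes of a SHA-256 digest of an 8-bit seed), not a recommended construction; the seed `0x42`, the pad `0xBEEF` and the message `b"hi"` are likewise arbitrary example values.

```python
import hashlib

MSG_BYTES = 2  # |m| = 16 bits, so P has 2**16 elements
SEED_BITS = 8  # |s| = 8 bits


def G(seed: int) -> int:
    """Hypothetical toy PRG: stretch an 8-bit seed to 16 bits via SHA-256."""
    digest = hashlib.sha256(bytes([seed])).digest()
    return int.from_bytes(digest[:MSG_BYTES], "big")


message = int.from_bytes(b"hi", "big")

# Pseudo-OTP: decrypt the ciphertext with every possible seed s.
c_pseudo = G(0x42) ^ message
Q_pseudo = {c_pseudo ^ G(s) for s in range(2 ** SEED_BITS)}

# True OTP: decrypt the ciphertext with every possible 16-bit key k.
c_otp = 0xBEEF ^ message  # 0xBEEF stands in for a truly random pad
Q_otp = {c_otp ^ k for k in range(2 ** (8 * MSG_BYTES))}

print(len(Q_pseudo), "candidates under pseudo-OTP")
print(len(Q_otp), "candidates under OTP")
```

Under the pseudo-OTP at most 256 of the 65536 possible messages survive, so an attacker can discard the rest; under the OTP nothing is discarded.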

score 2 · Answer 6 · answered Mar 04 '16 at 22:48

Think of it this way: Assume you have intercepted a transmission with ciphertext $C$ of length $N$ bits, and you happen to know that it was encrypted using OTP with an $N$-bit key.

For every cleartext $Y_x$ (of length $N$ bits) there exists an $N$-bit key $K_x$ such that $Y_x = C \oplus K_x$.

That means, for example, using 8-bit ASCII characters, if an $8 \times 40 = 320$-bit ciphertext $C$ was received, you can derive any 40-character phrase from the ciphertext. In fact, it is trivial to find the key $K_x$ that generates the cleartext $Y_x$:

$C \oplus K_x = Y_x \rightarrow K_x = C \oplus Y_x $

Now think about it in terms of probability theory. Before receiving the ciphertext $C$, there is some probability distribution $P(Y)$ across all possible cleartext messages that you might intercept. This is your prior probability distribution.

The question is, how is this distribution affected by receiving the ciphertext $C_0$?

Bayes' theorem says

$ P(Y_x|C_0)= \frac{P(C_0|Y_x) \times P(Y_x)}{P(C_0)} $

You're interested in finding the probability for each $Y_x$. Note, $P(C_0)$ on the bottom there doesn't matter, because it doesn't depend on $Y_x$.

$P(C_0|Y_x) = 0$ for all $Y_x$ whose length is not $N$ bits. If we assume all keys $K_x$ are equally likely, then $P(C_0|Y_x)=\frac{1}{2^N}$ for all $Y_x$ which are $N$ bits. After re-normalizing the probability distribution, you find that $P(Y_x|C_0)$ is just your prior $P(Y_x)$ conditioned on the message being $N$ bits long.

In short, after eliminating cleartexts that are the wrong number of bits, your probability distribution is just a scaled version of your prior distribution!
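The Bayes computation can be carried out exactly for a tiny case. This Python sketch uses $N = 2$ and a made-up, deliberately non-uniform prior over the four 2-bit cleartexts; the prior values and the intercepted ciphertext `c0` are assumptions chosen only for illustration.

```python
from fractions import Fraction

N = 2  # message length in bits

# Hypothetical, deliberately non-uniform prior over the 2-bit cleartexts.
prior = {
    0b00: Fraction(1, 2),
    0b01: Fraction(1, 4),
    0b10: Fraction(1, 8),
    0b11: Fraction(1, 8),
}

c0 = 0b10  # the intercepted ciphertext (arbitrary example value)

# P(C_0 | Y_x): fraction of the 2^N equally likely keys that map Y_x to C_0.
likelihood = {
    y: Fraction(sum(1 for k in range(2 ** N) if y ^ k == c0), 2 ** N)
    for y in prior
}

# Bayes' theorem, with P(C_0) obtained by summing over all cleartexts.
evidence = sum(likelihood[y] * prior[y] for y in prior)
posterior = {y: likelihood[y] * prior[y] / evidence for y in prior}

for y in prior:
    print(f"{y:0{N}b}: prior={prior[y]}, posterior={posterior[y]}")
```

Because the likelihood is the same constant for every cleartext, it cancels against the evidence term and the posterior matches the prior entry for entry.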

score -4 · Answer 7 · answered Mar 03 '16 at 21:42

I originally wrote this thinking that it was an answer, but then I realised that it's not strictly related to OTP in itself but rather the ways in which OTP is commonly used in computer security (I have more experience in computer security than in pure cryptography). I'll post it anyway though as it's an interesting consideration when analysing everyday OTP systems e.g. for user authentication (2FA etc.).

OTP keys are often valid for only a limited period of time or number of attempts. After that, the system re-generates the OTP or moves on to the next key in a pre-shared list of keys, leaving the attacker with a finite length of time or number of attempts to complete the attack, so the key will almost always change before the attack can be completed.

Nevertheless, in the case of an OTP-encrypted message (e.g. between two spies working together) that has been intercepted, the attacker has an infinite length of time and number of attempts to decrypt it; in situations like this the benefit of OTP is that recovery of one key does not compromise all prior or subsequent messages.

micheal65536

This answer seems confused. OTP keys are not "valid for only a limited period of time". I think you are confusing the one-time pad with one-time authenticators (e.g., challenge-response authentication). The question is asking about the one-time pad, not one-time authenticators or challenge-response authentication. – D.W. Mar 04 '16 at 00:37
@D.W. in crypto OTP is always Pad, but in computer security more broadly it is also https://en.wikipedia.org/wiki/One-time_password . Of which challenge-response is one form. – dave_thompson_085 Mar 04 '16 at 10:03
In computer security OTP can be a pad. In some high-security situations, lists of passwords or cryptographic keys are used to authenticate on a computer system, with each password or key being used only once. In time-limited systems, each password or key is generated from a combination of the current time and some other information (usually a secret passphrase) so the authentication system and the user's key generator are synchronised. – micheal65536 Mar 04 '16 at 12:05
"OTP keys are often valid for only a limited period of time or number of attempts" By definition, a one-time pad (or password) is only valid once. – Kevin Mar 04 '16 at 19:00
Well, once then. But an OTP used for authentication is often re-generated after a fixed period of time, so may be valid for more than one attempt. Likewise, the same OTP may be used twice to avoid excess inconvenience for the user. – micheal65536 Mar 05 '16 at 10:21
If you use the key twice, by definition it is no longer a 'one-time' pad. In other words, what you say may be true as a compromise in practicality, but it destroys the cryptographic guarantees of the OTP. Also, an OTP cannot be re-generated, because you need to securely distribute the pad to both ends of the communication beforehand. The more I type the more your answer confuses me; what definition of OTP are you using? – Kelly S. French Jun 13 '17 at 15:58
