10

First, some premises:

  • Sufficient hardware generated one time pad key material.
  • No pad reuse.
  • Messages of 160 characters length (think Twitter).
  • 28 characters only in use (A-Z, space and full stop alphabet. Think Morse).
  • The very vast majority of messages based on English grammatical form.
  • The messages would be typed on a computer and read via computer.

Would it be possible to authenticate a one time pad message by hand, using say pen and paper? Perhaps even a spreadsheet, but certainly no programming languages. Hence the typical HMAC is out. The character set and message length are not unmanageable with some determination.

But most importantly, the authentication only has to be manually provable to work. The user will not be authenticating every message, just those he chooses to test. That means that the user has to be able to verify to himself that the algorithm works on the messaging system. So he can take an arbitrary message, authenticate it and satisfy himself of its correctness. The purpose is to create confidence, in a manually verifiable way, that the system is working correctly and is not a fraud or the victim of a Man in the Middle attack.

High security is of secondary importance, subservient to the need for provable manual verification of authentication. Whilst a one time pad is totally secure, the authentication algorithm needn't be. Any degree of authentication will be of interest for this question. It would be nice if any proposed solution could have the security level quantified in bits.

I appreciate that there is a spectrum of levels of security for the authentication mechanism. Clearly it ranges from no security (no authentication) to very secure (HMAC calculation). I think that unfortunately we're looking at the lower end of the scale towards the no security end, as the overriding criterion is human computation.

The most basic and brute force approach that occurs is simply sending the same message in multiples. So m ⊕ k1 | m ⊕ k2 | ... and so on. I believe that you'd get 4.8 bits of security for every concatenated message, but I'm probably wrong.

David Cary
Paul Uszak
  • Sending the message in multiples is only bad if the attacker has meta data of the cypher (message length, time it was sent). If instead of 1 message you had a continuous stream (with random data filling the gaps between messages, or offline) then an observer wouldn't be able to do much in the middle. – daniel May 30 '17 at 23:40
  • @daniel This isn't a hypothetical question. UK Investigatory Powers Act 2016 means that all the meta data will be captured. I expect port hopping won't defeat the act. I'm not sure yet how much meta data can be hidden by SSL, but I think that it's not much. – Paul Uszak May 31 '17 at 00:15
  • Since comments are not for extended discussion; I moved my OTP transport chit-chat to a chatroom. After all, the Q here assumes "Sufficient hardware generated one time pad key material" to be available… most probably at both sides (sender/receiver). – e-sushi May 31 '17 at 09:47
  • what about xor-ing a known secret with a summation of the message, say the start or perhaps every 5th letter, appending that to the end of the plaintext before encrypting, separated by a reserved char. – dandavis Jun 01 '17 at 04:18
  • Hm, how about sending the first 10 and last 10 characters twice "tthhiiss. Iiss. aa tteesstt". Can however be forged once you know the pad - but that might be a desirable deniability feature. – eckes Jun 02 '17 at 06:36
  • @eckes the problems come when the attacker knows your system, so sending characters twice is pretty much the same thing as sending the whole message twice. – daniel Jun 02 '17 at 07:49
  • Yes it is the same (just with less overhead) but it is not a problem of knowing the system, the attacker does not know the OTP stream so he cannot forge it. – eckes Jun 02 '17 at 07:50
  • @eckes but for a MitM with partially known plaintext and no other mechanisms, it's easy to change a value. Example: "attack at 7 pm" can be changed to "attack at 8 pm", so the attacker can split the 2 generals up just by adding 1 to the ciphertext, if they know most of the plaintext. They can do the same if the message is repeated in some way they also know. – daniel Jun 02 '17 at 09:01
  • Yes, forget my suggestion the OTP does not protect from that. – eckes Jun 02 '17 at 09:19

8 Answers

6

I'll use $m\boxplus k$ for combining characters $m$ and $k$ by addition (modulo $n=28$) within the character set, or strings of individual characters of equal-length, rather than " m ⊕ k " in the question. I consider alphabetical ordering of $n$ (e.g. 28) characters with letters first, such that $\;\mathtt{'A'}\boxplus\mathtt{'A'}\;=\;\mathtt{'A'}\;$ and $\;\mathtt{'B'}\boxplus\mathtt{'C'}\;=\;\mathtt{'D'}\;$.

The question's method of sending $M\boxplus K_1\;\|\;M\boxplus K_2\;\|\;\dots$ is extremely insecure: replacing each ciphertext character $x$ by $x\boxplus\mathtt{'B'}$ is not detected, but correspondingly alters the message. There is a partial fix: for each character $m$ of the message, append to the message the character $m\boxplus m\boxplus m$, then encipher with OTP; since $\gcd(3,28)=1$, this makes altering a plaintext character precisely as hard (or easy) as guessing it. Even with that improvement, if the message (or a segment thereof) is known, it is possible to change it (or that segment) to anything desired, with certainty that the change goes undetected. That's unsatisfactory.
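The undetected shift against the repeated-message scheme can be checked with a short sketch. It assumes the ordering A=0 … Z=25, space=26, full stop=27 described above; the helper names (`box_add`, `box_sub`, `otp_encrypt`, `otp_decrypt`, `random_pad`) and the sample message are illustrative only.

```python
import secrets

# Assumed ordering: letters first, then space, then full stop (n = 28).
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ ."
N = len(ALPHABET)

def box_add(a, b):
    """Combine two characters by addition modulo n."""
    return ALPHABET[(ALPHABET.index(a) + ALPHABET.index(b)) % N]

def box_sub(a, b):
    """Inverse of box_add: subtract b from a modulo n."""
    return ALPHABET[(ALPHABET.index(a) - ALPHABET.index(b)) % N]

def otp_encrypt(message, pad):
    return "".join(box_add(m, k) for m, k in zip(message, pad))

def otp_decrypt(ciphertext, pad):
    return "".join(box_sub(c, k) for c, k in zip(ciphertext, pad))

def random_pad(length):
    return "".join(secrets.choice(ALPHABET) for _ in range(length))

message = "ATTACK AT SEVEN."
k1, k2 = random_pad(len(message)), random_pad(len(message))

# Sender transmits the message twice, each copy under fresh pad material.
ciphertext = otp_encrypt(message, k1) + otp_encrypt(message, k2)

# Attacker adds 'B' (value 1) to every ciphertext character, pad unknown.
tampered = "".join(box_add(c, "B") for c in ciphertext)

half = len(message)
copy1 = otp_decrypt(tampered[:half], k1)
copy2 = otp_decrypt(tampered[half:], k2)

print(copy1 == copy2)    # True: the two copies still agree
print(copy1 != message)  # True: yet the plaintext has been altered
```

Comparing the two decrypted copies therefore tells the receiver nothing about this kind of tampering.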

Goals

  1. Integrity check should be based on the plaintext, thus catching decryption errors (accidental or not) introduced by whatever deciphers, and avoiding the need to keep the full ciphertext for integrity check.
  2. It shall remain convenient to decipher first (including while receiving), postponing integrity check.
  3. Same unconditional plaintext confidentiality as OTP in ciphertext-only attacks, and only practically negligible leak under an attack model where the MitM gets to know if an attempted forgery succeeds or not. The receiver does not know the message length in advance. We posit integrity of the random pad, that it is synchronized between sender and receiver at the beginning of each message, and is never reused by sender or receiver.
  4. In order to keep the integrity check feasible by hand, we restrict ourselves to the same $\boxplus$ as used for encryption, public latin square(s), random characters supplied by the random Pad (after those used by en/decryption), and a moderate number of character (or integer) variables (perhaps selection among these, and simple decimal arithmetic). Paper and pencil is OK (even if plaintext-leaking material gets written on it), a calculator is not.
  5. The amount of work for integrity check should be minimized, as well as the transmission overhead, and drain of the random pad.
  6. Quantifiable level of security against forgery, including with known plaintext (which makes sense if we do not fully trust whatever deciphers, see 1).

Circular buffer with latin square and addition (2017-06-03b)

We spare the first $s\ge2$ characters of ciphertext for integrity check, computed from the plaintext and the first $t\ge s$ characters of the random pad (or stream generator); think of $s=4$, $t=10$. The $s$ integrity check characters are computed by the sender, and optionally recomputed and verified by the receiver. All the rest is standard One Time Pad (or stream generator) and we do not further discuss encryption and decryption.

The system uses a single public $n\times n$ latin square; assume for now a random one.

For manual computation, the first $s$ characters of the random pad are written down (with - for space and + for the extra character). For each character of the plaintext, and then for each of the $t-s$ next characters of the random pad:

  • fetch the latin square at the line defined by that character, and the column defined by the last character that was written;
  • combine the result using $\boxplus$ (addition modulo $n$ as used in the OTP) with the first character not struck out (which is also the $s^\text{th}$ from the end), and strike that character out;
  • write the resulting character following the $s-1$ characters not struck out.

The $s$ characters remaining are the check characters.
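The hand procedure above can be sketched as follows. This is an illustration under stated assumptions, not the answer's reference implementation: the latin square built by the hypothetical `placeholder_latin_square` is only a stand-in (three seeded permutations combined around addition modulo $n$) whose resistance to forgery has not been assessed, and the names `check_characters`, `SQUARE` and the sample key are made up for the example.

```python
import random
from collections import deque

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ ."
N = len(ALPHABET)

def placeholder_latin_square(seed=2017):
    """Build an n x n latin square as L[r][c] = P[(Q[r] + R[c]) mod n].

    P, Q, R are fixed pseudo-random permutations. ASSUMPTION: this is a
    stand-in for the public square, not a vetted choice.
    """
    rng = random.Random(seed)
    p, q, r = (rng.sample(range(N), N) for _ in range(3))
    return [[p[(q[row] + r[col]) % N] for col in range(N)] for row in range(N)]

SQUARE = placeholder_latin_square()

def check_characters(plaintext, key, s=4):
    """Return the s check characters for plaintext under a t-character key."""
    if not 2 <= s <= len(key):
        raise ValueError("need 2 <= s <= t")
    # Write down the first s pad characters.
    buf = deque(ALPHABET.index(ch) for ch in key[:s])
    # Process the plaintext, then the remaining t - s pad characters.
    for ch in plaintext + key[s:]:
        row = ALPHABET.index(ch)   # line defined by the current character
        col = buf[-1]              # column defined by the last character written
        oldest = buf.popleft()     # first character not struck out; strike it
        buf.append((SQUARE[row][col] + oldest) % N)
    return "".join(ALPHABET[v] for v in buf)

key = "QWERTYUIOP"  # t = 10 pad characters, used once only
tag = check_characters("MEET AT NOON.", key)
print(tag)
```

The `deque` plays the role of the paper strip: `popleft` strikes the oldest character and `append` writes the new one, so exactly $s$ characters are live at every step.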

(In)Security:

  • Each step combines the newly added character, leftmost and rightmost check characters, into a new check character, in a manner such that knowing 3 out of these 4 characters the other can be uniquely determined. It follows that the $s$ check characters remain uniformly distributed during the whole computation, for constant plaintext. No information about the plaintext can leak from the check characters, and information leak from acceptance/rejection is limited by the forgery rate.
  • The tight alternation of the latin square and $\boxplus$ combiners quickly and non-linearly diffuse changes; this is critical (forgery is trivial if we replace the latin square with $\boxplus$).
  • There is a forgery with $(n-1)^{-2}$ odds of success ($2\log_2(n-1)\approx9.5$-bit security): randomly alter two consecutive characters, then a third with $n-2$ unaltered characters in between. Slightly better strategies tailored to the latin square (and known/chosen plaintext if applicable) are possible. For the desired $n=28$ it is possible to objectively rank latin squares for their resistance to such attacks.
  • Security can't be better than $s\log_2(n)$-bit (forgery by random choice of check characters); nor than $(t-s)\log_2(n)$-bit for known plaintext (guess of the $t-s$ characters at end of computation).

That's calling for future work.


Methods in Mark N. Wegman and J. Lawrence Carter's New Hash Functions and Their Use in Authentication and Set Equality, in Journal of Computer and System Sciences 22 (1981) would be great, but I did not find anything easy to perform without a computer.

fgrieu
  • I assume you mean "where $k$ is" instead of "where $k$ if" a few times in your answer. – kodlu May 31 '17 at 22:20
  • I've been sketching this out on my whiteboard (which I got for an anniversary - but I needn't have added that). Would you characterise this as a n character wide hash of the message which is then encrypted by n further characters from the OTP? And the Latin Square provides the non linear function just as a derangement might? Are we looking at 28^28 (3e40) or 28! (3e29) possible values for the check sequence then? From this can we then calculate the security level in bits? At least 49 bits for the 28! calculation? – Paul Uszak May 31 '17 at 23:35
  • My board markers ran out @ ver.48! Materially though, this is still a 4 char wide hash as evidenced by your categorisation of s as check characters? I just ask as you've not explicitly named this technique... – Paul Uszak Jun 05 '17 at 12:51
  • 1
    @Paul Uszak: My system with $s=4$, $t=10$ is a 4-char wide Message Authenticator with 10-char one-time key. Contrary to a full-blown MAC, but just as in Wegman-Carter authentication / universal hashing, the key must not be reused. That, and the moderate security (especially against local alterations), is a price for simple computation, and seems to match "High security is of secondary importance". I plan to publish a latin square with quantified security against the attack in the third bullet by next week. Apologies for the premature wear of your board's paraphernalia :-) – fgrieu Jun 05 '17 at 14:15
  • Yes, absolutely. Hand computation must trump all in this scenario. I'll look forward to your plan and buy more markers in preparation. – Paul Uszak Jun 05 '17 at 16:40
  • 3
    I'm interested in seeing an example of this in action. i'm trying to make sense of it, but struggling to understand it fully. A simple example with the message "THE QUICK BROWN FOX JUMPS OVER THE LAZY DOG" and OTP key would be awesome. Perhaps a pastebin? – Aaron Toponce Sep 07 '18 at 15:57
  • 1
    @fgrieu I am very interested in seeing an example of this in action. No one that I can find on the internet has given a clear example of this. Claims of good security have been made by others, but no one has followed through and shown a good example. – Patriot Jul 23 '19 at 12:27
  • 1
    @fgrieu Please do that "future work"! I am sure several readers are waiting to see it. – Patriot Jul 23 '19 at 12:51
  • 1
    @Patriot it appears Aaron Toponce has worked out an example at https://aarontoponce.org/wiki/crypto/authentication – lily Oct 15 '19 at 07:34
  • @LilyChung Thank you so much. I am sure that many readers will be delighted to see this. – Patriot Oct 15 '19 at 11:49
3

I've recently been interested in adding message integrity and authentication to hand ciphers, as I belong to a group of hobbyist classical cipher puzzle solvers, so discussions in this vein get people thinking.

I put together a wiki page on several approaches to adding message integrity, as well as a couple of ideas for adding authentication, including the answer by @fgrieu. Probably the most practical approach without complicating the hand cipher too much is to:

  1. Use the socialist millionaire protocol for authentication in the plaintext.
  2. Calculate message integrity with something like the Damm algorithm.
  3. Encrypt the message.

Notice that this is a MAC-then-Encrypt approach.

The recipient then:

  1. Decrypts the message.
  2. Looks for the authentication secrets.
  3. If the secrets do not exist or are unknown, the message may be fraudulent.
  4. Calculate the checksum.
  5. If the calculated checksum does not match the sent checksum, the message might have been modified.
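For the checksum step, a minimal sketch of the Damm algorithm follows. The usual Damm scheme works on decimal digits, so this assumes the message has already been converted to a digit string (that conversion is not shown and is an assumption, not part of the steps above). The table is the order-10 quasigroup commonly published with the algorithm; the function names are illustrative.

```python
# Order-10 totally anti-symmetric quasigroup commonly used for Damm checks.
DAMM_TABLE = [
    [0, 3, 1, 7, 5, 9, 8, 6, 4, 2],
    [7, 0, 9, 2, 1, 5, 4, 8, 6, 3],
    [4, 2, 0, 6, 8, 7, 1, 3, 5, 9],
    [1, 7, 5, 0, 9, 8, 3, 4, 2, 6],
    [6, 1, 2, 3, 0, 4, 5, 9, 7, 8],
    [3, 6, 7, 4, 2, 0, 9, 5, 8, 1],
    [5, 8, 6, 9, 7, 2, 0, 1, 3, 4],
    [8, 9, 4, 5, 3, 6, 2, 0, 1, 7],
    [9, 4, 3, 8, 6, 1, 7, 2, 0, 5],
    [2, 5, 8, 1, 4, 3, 6, 7, 9, 0],
]

def damm_check_digit(digits):
    """Walk the table: row = running value, column = next digit."""
    interim = 0
    for d in digits:
        interim = DAMM_TABLE[interim][int(d)]
    return str(interim)

def damm_is_valid(digits_with_check):
    """A digit string with its check digit appended must end at 0."""
    return damm_check_digit(digits_with_check) == "0"

print(damm_check_digit("572"))  # 4
print(damm_is_valid("5724"))    # True
```

On its own this is an unkeyed checksum: it catches accidental errors, and any protection against deliberate changes comes only from the encryption applied afterwards.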

The goal is to produce an authentication and integrity solution that is at least as difficult to break as the security of the hand cipher. Just remember that Bruce Schneier said it best:

There are two kinds of cryptography in this world: cryptography that will stop your kid sister from reading your files, and cryptography that will stop major governments from reading your files.

This answer is about the former.

David Cary
Aaron Toponce
3

Create an "alphabet" containing every single possible tweet character exactly once, and construct a tabula recta based on this "alphabet".

Let a plaintext character be a checksum. Calculate the checksum with your tabula recta like this:

Find the first plaintext character in the first row of the table,

  • go down until you find the second character,
  • left or right until you find the third,
  • up or down until you find the fourth,
  • left or right until you find the fifth,
  • up or down until you find the sixth, etc.

When you hit the last character, make a 90 degree turn and keep going until you hit the edge of the tabula recta. The character you land on is the checksum.

Insert the checksum at the same position in the message as the position of the first key character in your "alphabet".

If the receiver calculates the checksum of the message (ignoring the character at the position indicated by the first key character, of course) and finds it matches the decrypted checksum, he can be {100-[(1 ÷ alphabet length) × 100]}% sure the message was not modified by an adversary.
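The snaking walk can be sketched in code, with two loudly labeled assumptions: the table is a plain (unshuffled) tabula recta over the question's 28-character alphabet, and the final 90 degree turn runs towards the bottom or right edge, since the description does not say which edge to pick. The name `snake_checksum` is illustrative.

```python
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ ."
N = len(ALPHABET)

# Plain tabula recta: each row is the alphabet rotated one step further left.
TABLE = [ALPHABET[r:] + ALPHABET[:r] for r in range(N)]

def snake_checksum(message):
    """Follow the message through the table, alternating vertical and
    horizontal moves, then turn once more and read the edge character."""
    row = 0
    col = TABLE[0].index(message[0])  # first character, found in the first row
    vertical = True                   # direction of the next move
    for ch in message[1:]:
        if vertical:
            # Move up or down within the current column.
            row = next(r for r in range(N) if TABLE[r][col] == ch)
        else:
            # Move left or right within the current row.
            col = TABLE[row].index(ch)
        vertical = not vertical
    # Final 90 degree turn. ASSUMPTION: run to the bottom or right edge.
    if vertical:
        return TABLE[N - 1][col]
    return TABLE[row][N - 1]

print(snake_checksum("HELLO"))
```

Because every row and column of the table contains each character exactly once, each move has a unique destination, and changing any single message character changes the checksum.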

To give credit where it is due, I got the idea of this snaking operation on the tabula recta from prgomez.com.

Meler Lawler
  • 1
    Very interesting. Is your formulae correct now? It's gone from 96% to 4% certainty. – Paul Uszak Oct 09 '18 at 18:59
  • 1
    And, can there be more than one tabula recta for any alphabet size? – Paul Uszak Oct 09 '18 at 19:00
  • Haha thanks for catching my silly mistake, I corrected it. Yes, there can be more than one tabula recta for any alphabet length. In fact, if the adversary doesn't know what tabula recta you use, you can keep the checksum as the last character of every message. Since even if he knows where the checksum is, he won't know how to change it to make it agree with his modifications to the message. The simplest way to generate a new tabula recta for a given alphabet is to shuffle the alphabet, then write it on 26 rows, moving the leftmost character to the right end every time you write a row. – Meler Lawler Oct 09 '18 at 19:23
  • The longer the alphabet, the greater the confidence. The shuffling of the alphabet doesn't matter. But the confidence increases more slowly the longer the alphabet gets. Going from 2 characters to 26 characters adds 46 percentage points, but going from 26 to 52 only adds 2 percentage points. Using a double-character checksum with a 26-character alphabet would offer 99.9% confidence (I think!) but I'm not sure how to make a double-character checksum with a tabula recta. I recommend contacting the author of the link I posted, no one knows pen and paper encryption better than him. – Meler Lawler Oct 09 '18 at 20:50
  • Soz, my fault. I meant using several different tables. So eg. 3 of your shuffled tables would produce 3 separate check digits. Is it possible to calculate the %age confidence in such a situation? – Paul Uszak Oct 09 '18 at 22:49
  • 1
    Your alphabet is 28 characters long so the odds of a successful forgery would be 1 against 28^3, or 0.005%. So 99.995% confidence of authenticity, assuming my math is correct. – Meler Lawler Oct 09 '18 at 22:54
  • 1
    @MelerLawler Strictly speaking, you have verified the integrity of the plaintext to a quantifiable degree, but not its authenticity, correct? We take it that the message is authentic if our one-time pad works for decryption, and especially if it works with a pre-understood value for Russian Copulation. But do we in fact know the message is authentic? That is, signed by the sender we think. Is using the OTP the same as signing? – Patriot Jul 07 '19 at 01:55
  • 1
    Correct. I think all symmetric encryption (e.g. OTP) is inherently authenticated. So there's no reason to verify that the sender is who you think they are, just that their message hasn't been edited by someone else before it reached you. – Meler Lawler Jul 10 '19 at 23:47
2

You could use a slight variant of Encrypt-last-block CBC-MAC (ECBC-MAC) with a random single-character permutation plus a one-time pad character per MAC tag "digit". ECBC-MAC is easy to compute by hand and you can make MAC tags arbitrarily large — I'll explain how below.

Generating MAC Keys

You and your partner agree to use one-time pads (OTPs) to communicate. Plaintext and ciphertext use the same alphabet $A$. (In your example, $|A|=28$.)

For each OTP key, generate and attach a MAC key:

  1. Decide how many symbols ("digits") the MAC tag should have. Call that number $N$. Each symbol increases confidence but also increases the amount of work you have to do; thus:

    MAC Size (Symbols)    Probability a Forgery Succeeds    When $|A|=28$
    1                     $1/|A|$                           3.5714%
    2                     $1/|A|^2$                         0.1276%
    3                     $1/|A|^3$                         0.0046%
    ...                   ...                               ...
    $N$                   $1/|A|^N$                         $1/28^N$
  2. For each MAC symbol $n$ in $1,...,N$, generate:

    1. A random permutation of all $A$ alphabet symbols. All permutations must be equally likely — consider a Fisher-Yates shuffle, which you can do by hand, perhaps with the aid of dice or playing cards.
    2. A random symbol from $A$, chosen uniformly at random, which will function like an OTP key.
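As an illustration of the key generation steps, here is a sketch of a Fisher-Yates shuffle plus one extra random symbol per MAC tag symbol. By hand, dice or cards would supply the randomness; this sketch uses Python's `secrets` module instead, and the names `fisher_yates` and `generate_mac_key` are hypothetical.

```python
import secrets

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ ."

def fisher_yates(symbols):
    """Return a uniformly random permutation of the given symbols."""
    items = list(symbols)
    # Walk from the last position down, swapping each item with a
    # uniformly chosen item at or before it.
    for i in range(len(items) - 1, 0, -1):
        j = secrets.randbelow(i + 1)
        items[i], items[j] = items[j], items[i]
    return "".join(items)

def generate_mac_key(tag_length):
    """One (permutation, extra random symbol) pair per MAC tag symbol."""
    return [(fisher_yates(ALPHABET), secrets.choice(ALPHABET))
            for _ in range(tag_length)]

mac_key = generate_mac_key(2)
for permutation, extra in mac_key:
    print(repr(permutation), repr(extra))
```

Drawing `j` from the whole range `0..i` (inclusive) is what keeps all permutations equally likely; restricting it to `0..i-1` would not.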

Example MAC Key

Example MAC key for your alphabet ($|A|=28$) and MAC length $N=2$:

MAC Symbol Number    Permutation                       Extra Random Symbol
1                    "QCKJGEYMTBHPVZ DRL.UWIOXSFNA"    T
2                    "MFR.QVOETHIUKSBLAC DZWXYJGNP"    G

MAC Key Requirements

These MAC keys share some OTP key requirements — namely:

  1. MAC keys must never be reused. Every authenticated message will need its own randomly-generated MAC key.
  2. Generate MAC keys uniformly at random.
  3. Keep MAC keys absolutely secret.
  4. Securely distribute MAC keys.
  5. Destroy MAC keys after using them. If you're verifying a message's MAC tag, you MUST destroy the MAC key afterwards regardless of whether verification succeeds or fails. (If you reject the message but keep the key, you could inadvertently give the adversary an oracle as well as permit replay attacks.)

Calculating MACs

MAC Plaintext or Ciphertext?

You can MAC plaintext or ciphertext — there's no difference in security because you'll essentially OTP-encrypt the MAC tag.

  • MACing plaintext will help message recipients detect encryption and decryption errors but force them to decrypt messages first.
  • MACing ciphertext won't catch decryption errors, but recipients can verify messages before decrypting.

Pick one and make sure your recipients know which to use.

Calculating a MAC

Let's assume (for demonstration purposes) that you MAC ciphertext. Encrypt your message using a OTP key; then compute the ciphertext's MAC using the OTP key's associated MAC key thus:

  1. Check the length $N$ of the MAC the key generates. For each $n$ in $1,...,N$, do the following:
    1. Set $x \leftarrow 0$.
    2. For each ciphertext symbol $s_i$ (reading the ciphertext left-to-right):
      1. Let $f_n(x)$ be the symbol at zero-based index $x$ in the $n$th permutation. Set $x \leftarrow f_n(x + s_i \mod |A|)$. This assumes that the first symbol in your alphabet has numerical value $0$, the second $1$, and so on.
    3. Set $x \leftarrow x + e_n \mod |A|$, where $e_n$ is the $n$th extra random symbol from the MAC key. This effectively encrypts $x$ using a OTP.
    4. $x$ is now the $n$th symbol of the encrypted message's MAC tag.
  2. Destroy the MAC key along with the message's OTP key — it's a one-time key.

Message verification uses the same process.

Example

Let's say your ciphertext is AUTXQ, $A$ is the English alphabet plus space and full stop ($|A|=28$), and the MAC key is the example one above ($N=2$). Then:

  • For the first symbol in the MAC tag ($n=1$), the permutation is QCKJGEYMTBHPVZ DRL.UWIOXSFNA and the extra random symbol $e_1$ is 'T'.
    1. Set $x \leftarrow 0$.
    2. $s_0 = 0$ (the symbol 'A'). Set $x \leftarrow f_1(x + s_0 \mod 28) = f_1(0) = 16$ (the symbol 'Q').
    3. $s_1 = 20$ (the symbol 'U'). Set $x \leftarrow f_1(x + s_1 \mod 28) = f_1(8) = 19$ (the symbol 'T').
    4. $s_2 = 19$ (the symbol 'T'). Set $x \leftarrow f_1(x + s_2 \mod 28) = f_1(10) = 7$ (the symbol 'H').
    5. $s_3 = 23$ (the symbol 'X'). Set $x \leftarrow f_1(x + s_3 \mod 28) = f_1(2) = 10$ (the symbol 'K').
    6. $s_4 = 16$ (the symbol 'Q'). Set $x \leftarrow f_1(x + s_4 \mod 28) = f_1(26) = 13$ (the symbol 'N').
    7. Finally, set $x \leftarrow x + e_1 \mod 28 = 13 + 19\mod28 = 4$, which is the symbol 'E'. Thus the first symbol of the MAC tag is 'E'.
  • Do a similar thing for the second MAC tag symbol ($n=2$). The permutation is MFR.QVOETHIUKSBLAC DZWXYJGNP and the extra random symbol $e_2$ is 'G'.
    1. Set $x \leftarrow 0$.
    2. $s_0 = 0$ (the symbol 'A'). Set $x \leftarrow f_2(x + s_0 \mod 28) = f_2(0) = 12$ (the symbol 'M').
    3. $s_1 = 20$ (the symbol 'U'). Set $x \leftarrow f_2(x + s_1 \mod 28) = f_2(4) = 16$ (the symbol 'Q').
    4. $s_2 = 19$ (the symbol 'T'). Set $x \leftarrow f_2(x + s_2 \mod 28) = f_2(7) = 4$ (the symbol 'E').
    5. $s_3 = 23$ (the symbol 'X'). Set $x \leftarrow f_2(x + s_3 \mod 28) = f_2(27) = 15$ (the symbol 'P').
    6. $s_4 = 16$ (the symbol 'Q'). Set $x \leftarrow f_2(x + s_4 \mod 28) = f_2(3) = 27$ (the symbol '.').
    7. Finally, set $x \leftarrow x + e_2 \mod 28 = 27 + 6\mod28 = 5$, which is the symbol 'F'. Thus the second symbol of the MAC tag is 'F'.
  • The complete MAC tag is "EF".
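The worked example can be reproduced with a short sketch, useful for checking hand calculations. It assumes the numbering A=0 … Z=25, space=26, full stop=27; the function name `mac_tag` is illustrative.

```python
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ ."

def mac_tag(text, mac_key):
    """Compute one tag symbol per (permutation, extra symbol) pair."""
    n = len(ALPHABET)
    tag = []
    for permutation, extra in mac_key:
        x = 0
        for symbol in text:
            # x <- f_n((x + s_i) mod |A|), where f_n looks up the permutation.
            index = (x + ALPHABET.index(symbol)) % n
            x = ALPHABET.index(permutation[index])
        # Mask the final value with the extra random symbol (one-time pad).
        x = (x + ALPHABET.index(extra)) % n
        tag.append(ALPHABET[x])
    return "".join(tag)

example_key = [
    ("QCKJGEYMTBHPVZ DRL.UWIOXSFNA", "T"),
    ("MFR.QVOETHIUKSBLAC DZWXYJGNP", "G"),
]
print(mac_tag("AUTXQ", example_key))  # EF
```

Verification is the same call on the received text, followed by a comparison with the received tag and destruction of the key.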

Security

Each of the $N$ symbols of a MAC tag is an instance of ECBC-MAC using a random one-symbol permutation as a block cipher and encrypting the last "block" (the final $x$ value) using a OTP. MAC keys are used once, then destroyed. Recipients destroy their MAC keys regardless of whether verification succeeds or fails.

  1. There are $(|A|\cdot |A|!)^N$ unique MAC keys and $|A|^N$ unique MAC tags.
  2. Encrypting the MAC tag using an OTP provides perfect secrecy for the tag. Adversaries cannot know the tag you calculated prior to encrypting it with the OTP just by looking at it; therefore, they have to rely on structural weaknesses in CBC-MAC to guess MAC tags.
  3. However, several papers (see 1 and 2 for examples) proved that:
    1. if the block cipher used in CBC-MAC is a pseudorandom function, then CBC-MAC is a pseudorandom function; and
    2. the advantage a computationally unbounded adversary has for forging a CBC-MAC that uses a random function as a block cipher over guessing that random function is less than or equal to $3q^{2} m^{2}/2^{l+1}$, where $q$ is the number of MAC-generating oracle queries the adversary makes, $m$ is the message length (in blocks), and $l$ is the number of bits in the MAC tag. (A similar bound applies if the random function is a random permutation, as in our construction.) $m$ is the length of our encrypted message (because our "block cipher" is a one-symbol permutation) and $l$ is very small in our construction, $\log_2(|A|)$, so this seems like a terrible MAC to use. But note that the advantage is proportional to $q$.
  4. However, because we encrypt our CBC-MAC using a OTP and use our MAC keys exactly once, destroying them after use (regardless of whether verification succeeds or fails), adversaries cannot make oracle queries: $q = 0$. (Readers might object that eavesdropping on a message provides one oracle query — one example of a message with a valid MAC tag — but OTP-encrypting the MAC tag hides the CBC-MAC value, in effect denying the adversary even one oracle query.) Adversaries can try to forge a valid message, but they get only one chance because the recipient always destroys his/her OTP and MAC keys.
  5. Therefore, adversaries have no advantage over simply guessing MAC keys or MAC tags. MAC keys are larger than MAC tags, so intelligent adversaries would try to guess MAC tags; thus the probability that an adversary can forge a valid message in a man-in-the-middle attack is $1/|A|^N$.
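Since the question asks for the security level in bits, the forgery probability $1/|A|^N$ converts to $N\log_2|A|$ bits. A small sketch of that arithmetic (the function name is illustrative):

```python
import math

def forgery_security_bits(alphabet_size, tag_length):
    """Bits of security when a forgery succeeds with probability 1/|A|^N."""
    return tag_length * math.log2(alphabet_size)

for n in (1, 2, 3, 4):
    print(n, round(forgery_security_bits(28, n), 2))
```

With $|A|=28$ each tag symbol contributes about 4.8 bits, so a two-symbol tag gives roughly 9.6 bits against a single forgery attempt.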

If any of the assumptions above or MAC key requirements outlined earlier is violated, these guarantees no longer hold.

Also, you cannot use this scheme if multiple recipients share a MAC key. If there are $R>1$ recipients, an adversary could treat $R-1$ of them like oracles (if the recipients act in ways that reveal whether forgeries succeed or fail) and possibly forge a valid message for recipient $R$.

UPDATE: If $|A|=2$, then this scheme reduces to bit-by-bit XOR, which allows adversaries to arbitrarily permute ciphertext symbols without affecting MAC tags. If $|A|=2$, create a two-bit block permutation (block cipher) — a permutation of $\{0,1,2,3\}$ — and process ciphertext bits two at a time, adding padding if necessary.

Update: Using OTP to encrypt the CBC-MAC tag achieves two things: It prevents adversaries from cryptanalyzing the tag directly or using it to cryptanalyze CBC-MAC (perfect secrecy) and it prevents message extension attacks by transforming CBC-MAC into ECBC-MAC.

A Note on Practicality

Large $|A|$ (such as $28$ as in the OP's question) makes generating and using random permutations by hand more difficult. Transforming plaintexts to and from a decimal code (even A = $0$, B = $1$, and so on) reduces $|A|$ to $10$, creating MAC keys that are easier to use.


Update: Possible Improvements

Generating Only One Permutation

I thought of a possible improvement: Instead of generating $N$ random permutations, generate one, then choose $N$ symbols from $A$ uniformly at random (with replacement). Call each of these random symbols $v_n$ for $n$ in $1,...,N$. You still need to generate a OTP of length $N$ as before. Do this MAC calculation instead (a slight variation on the original):

  1. Check the length $N$ of the MAC the key generates. For each $n$ in $1,...,N$, do the following:
    1. Set $x \leftarrow v_n$.
    2. For each ciphertext symbol $s_i$ (reading the ciphertext left-to-right):
      1. Let $f(x)$ be the symbol at zero-based index $x$ in the random permutation. Set $x \leftarrow f(x + s_i \mod |A|)$. This assumes that the first symbol in your alphabet has numerical value $0$, the second $1$, and so on.
    3. $x$ is now the $n$th symbol of the encrypted message's MAC tag.
  2. Encrypt the MAC tag with the MAC key's OTP.
  3. Destroy the MAC key along with the message's OTP key — it's a one-time key.
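A sketch of this single-permutation variant, assuming the same numbering of the alphabet as before and that "encrypting" the tag means adding the OTP symbols modulo $|A|$; the function name and the sample key values are hypothetical.

```python
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ ."

def single_permutation_mac(text, permutation, start_symbols, otp):
    """One shared permutation; each tag symbol starts from its own v_n
    and is finally masked with its own OTP symbol."""
    n = len(ALPHABET)
    tag = []
    for v, pad in zip(start_symbols, otp):
        x = ALPHABET.index(v)
        for symbol in text:
            x = ALPHABET.index(permutation[(x + ALPHABET.index(symbol)) % n])
        # Encrypt this tag symbol with the MAC key's OTP.
        tag.append(ALPHABET[(x + ALPHABET.index(pad)) % n])
    return "".join(tag)

permutation = "QCKJGEYMTBHPVZ DRL.UWIOXSFNA"
print(single_permutation_mac("AUTXQ", permutation, "AK", "TG"))
```

With a start symbol of 'A' (value 0) a tag symbol is computed exactly as in the original scheme, which gives a convenient cross-check against the earlier worked example.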

This scheme has $|A|^{2N} \cdot |A|!$ unique keys — high, but far shorter than the original scheme's keys. The security analysis still applies: The CBC-MAC "block cipher" is a truly random single-symbol permutation. Adversaries cannot query a MAC-generating oracle (due to random one-time MAC keys and OTP-encryption of MAC tags), so they're forced to guess either keys or MAC tags. There are still fewer possible tags than there are keys, so adversaries will still guess tags: The probability of a successful forgery is still $1/|A|^N$.

Generating a Random Function

Consider the improvement above (generate only one permutation of $A$). CBC-MAC has similar security bounds (with advantage over guessing random functions proportional to $q$) if the underlying block cipher is a random function; therefore, if $|A|$ is such that picking $|A|$ symbols of $A$ uniformly at random (with replacement) is faster than permuting $A$, the random permutation can instead be a random function $f:A\rightarrow A$.

This is especially nice when $|A|=10$: Roll a 10-sided die ten times to generate the random function.
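For the decimal case, the ten die rolls and the resulting chaining step can be sketched as below. The die is simulated with `secrets.randbelow`; `random_function` and `chain` are illustrative names, and the final OTP masking of the tag is left out.

```python
import secrets

def random_function(size=10):
    """Simulate rolling a size-sided die size times: f[i] is the image of i.
    Unlike a permutation, values may repeat."""
    return [secrets.randbelow(size) for _ in range(size)]

def chain(digits, f, start):
    """CBC-MAC style chaining over decimal digits: x <- f((x + d) mod |A|)."""
    x = start
    for d in digits:
        x = f[(x + int(d)) % len(f)]
    return x

f = random_function()
print(f, chain("31415", f, start=0))
```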

In this case, the number of unique keys is $|A|^{|A|+2N}$. This is a much larger keyspace than the single-permutation scheme, but it provides similar security.

user103480
2

I didn't think it has to be this complicated, but the complexity is slowly creeping up.

We make every message the same length: 160 characters, from your question. My method adds an extra message (the header) into the sender's message, plus the message again, so in transit the message is 327 characters. The sender will pad from the end of their text with white space. This may eat your key faster if you typically send 1 or 2 character messages, but it stops someone fingerprinting you by matching message lengths and allows the rest to work. Spaces at the end of the message will not be seen by the receiver.

Now before sending each message we build the header that needs to include an identifying string, a message checksum of some type, the entire message encrypted again using more of the OTP, and a new random start index of the message. Now begin the message at the index and wrap around, and place this header at a random location in the message. You would have your receiving program take out this header, then fix the messages order. Something like:

original message:"this is the message but one six zero chars"
padded message:"this is the message but one six zero chars                                                                                                                      "
identifying string:"zzzz"
checksum (my guess at the LRC character):"f"
random header position(0 to 160+7-1):7
random start index (1 to 28^2): 2 : "ab"
header:"zzzzfab"
next 160 OTP characters XORed with the message:"(160 rnd chars)"
message in transit:"(160 rnd chars) this izzzzfabs the message but one six zero chars                                                                                                                     "
message at receive:"this is the message but one six zero chars"

The identifying string should be strange and long enough not to appear in your messages. When initially transferring the OTP key also agree on this identifying string. The sender should avoid sending the identifying string in their plain message as it would goof up the receiver (but there are methods to deal with this).

The checksum, random header position and random message start index are there to limit the effectiveness of MITM attacks by forcing the attacker to guess at values. The checksum is the LRC of the other 326 characters in the message (it mainly serves to stop a MITM adding 1 to each character). The way I would implement the message start index is to have at both ends a table with 28^2 = 784 different start locations, but you could make it more complex by changing the direction of text or in other ways. For example: "aa" means start at 1 (no change), "ab" means start at 2, "ft" means start at 160, "zz" means start at 86. This table does not have to be secret; the added security comes from the index being sent with the message.

If you want to decrypt and verify, then skip 160 chars of your OTP, decrypt the whole message, skip back to the start of your OTP and decrypt the first half of the message. To check whether it is authenticated, make sure the values match. If you do not need to validate, you can just throw away the first half and skip along in the OTP.

Re-encrypting the whole message has a cost: it eats 3 bits of your OTP for each effective bit of message you send, and the message is now twice as long as the information you are sending.

Because of the scrambled order, a MITM has a 1/167 chance of guessing the character position if they attempt to change one character in the first 160 and the matching character in the remainder.

daniel
  • 2
    The verification is only useful, if the message itself is secret. If the message is not secret, it is trivial to reproduce the otp and replace the message with another. If the checksum is weak it may be possible to know and change only part of the message. – Meir Maor May 30 '17 at 09:51
  • WRT "random looking" - think OTPed Twitter. So yes whilst sending random A-Z characters might conceivably happen, vast majority of messages would be what passes for language in the Twittersphere. – Paul Uszak May 30 '17 at 10:02
  • To make progress, we have to assume that the OTPs are secret and the messages are secret. This is the classical use case for historical OTP codes. I think that known plain text attacks will have to be considered as part of a second phase feasibility assessment... – Paul Uszak May 30 '17 at 12:38
  • "MitM just needs to change so they don't work" - I've never considered this! Thank you. The usual case is to substitute the MitM's information. It never occurred that a successful attack would also be to simply disrupt the message. Is this where MitM attacks become akin to signal jamming? I wonder if the community makes this a formal distinction? – Paul Uszak Jun 02 '17 at 09:58
  • If that is sarcasm you are part of a real crummy community. You started this by talking about "messages based on English" I didn't know we are sending the launch codes, I thought we were just sending text messages. – daniel Jun 02 '17 at 10:59
2

Authentication in one-time pad systems was a problem in WW2. How it was solved historically, in the SOE at least: an agent had a pre-agreed error pattern (like making a spelling mistake in the seventh word). If he or she omitted to do this, it meant that the agent had been captured and was sending under duress. The captor presumably did not know this pattern, and would not suspect a correctly sent message. So the authentication key was, essentially, the pattern of errors.

Henno Brandsma
  • 1
    Excellent! Do you have any references to the error authentication? Can I infer that Man in the Middle situations were not catered for? And that MitM attacks are not probable or realistic most of the time. – Paul Uszak Jun 04 '17 at 22:09
  • 3
    It's in "Between Silk and Cyanide", a book on the history of cryptography in the SOE. They used radio directly. MITM was not considered a threat there. @PaulUszak – Henno Brandsma Jun 04 '17 at 22:18
  • They didn't all use radio. What about pigeons? Could they be intercepted and subjected to MITM men? – Paul Uszak Jun 20 '17 at 00:33
  • Pigeons were rarely used, from what I recall from the book. A well-trained one would be hard to intercept, I gather. – Henno Brandsma Jun 21 '17 at 19:41
  • This is interesting. What authentication systems were actually used? What about the VIC cipher? – Patriot Jul 23 '19 at 12:31
  • 1
    @Patriot the VIC cipher had no authentication AFAIK. – Henno Brandsma Jul 23 '19 at 12:46
  • @Henno Brandsma It does in a naive kind of way, right? Look at how the agent number was used, and look at how that random five-letter group was embedded, and how it appears in the ciphertext among the final five-letter groups. We would not call this authentication, but to them it did serve that purpose it seems. What do you think? – Patriot Jul 23 '19 at 13:45
  • 1
    @Patriot I’d say that’s just the system description as a whole, plus key information. If you knew all that it had to be the real agent, they probably thought. But there is no check as such, or a duress codeword. – Henno Brandsma Jul 23 '19 at 13:48
  • @Henno Brandsma I agree. And no duress code... unless no one has figured it out yet! – Patriot Jul 23 '19 at 14:04
  • @HennoBrandsma Duress codes can be hidden well. My favorite one is from the DPRK: if it is written in pen, it is true; written in pencil, false. Or was it the other way around? I am for "written in pen" is false. – Patriot Jul 23 '19 at 14:09
  • @Patriot doesn’t really work for radio communication. – Henno Brandsma Jul 23 '19 at 14:10
  • @HennoBrandsma I would like to look into this issue of WWII and early Cold War authentication much further. I hope you can too. – Patriot Jul 23 '19 at 14:25
2

Use a secret permutation of the alphabet as the top and side of the tabula recta used to encrypt and decrypt messages.
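One way to read this suggestion, sketched under assumptions: the body of the tabula recta stays in the standard order, and only the row and column labels are replaced by the secret permutation. The 28-symbol alphabet order and the names `encrypt`, `decrypt` and `labels` are hypothetical; the answer does not spell out these details.

```python
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ ."  # assumed ordering of the 28 symbols
N = len(ALPHABET)


def encrypt(plain, key, labels):
    """Look up each plaintext symbol on the top labels and each key symbol
    on the side labels; the cell holds the ciphertext symbol."""
    out = []
    for p, k in zip(plain, key):
        col = labels.index(p)
        row = labels.index(k)
        out.append(ALPHABET[(row + col) % N])
    return "".join(out)


def decrypt(cipher, key, labels):
    """Find the row by key symbol, locate the ciphertext symbol in that row,
    and read the plaintext symbol off the top labels."""
    out = []
    for c, k in zip(cipher, key):
        row = labels.index(k)
        col = (ALPHABET.index(c) - row) % N
        out.append(labels[col])
    return "".join(out)
```

With `labels` equal to `ALPHABET` this reduces to the ordinary table. Both ends would need the same secret `labels` string, agreed when the pad is exchanged.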

Meler Lawler
1

Encrypt then MAC. We will compute a MAC of $N$ characters from the ciphertext. Pad the message to a multiple of $N$ characters, if needed (since all transmitted messages are $160$ characters long, just choose $N$ to be an integer such that there exists an integer $k$ where $N \times (k + 1) = 160$).

Divide the ciphertext into groups of $N$ characters. Set the MAC to an arbitrary initial value, such as a group of $N$ copies of the character A. For each group of ciphertext, use the group as if it were one time pad key material to encrypt the MAC, making the result the new MAC, until all groups have been used in this way. Then use $N$ characters of actual one time pad key material to encrypt the MAC, and this is the final MAC. Append the MAC to the message.

Weakness: A change in one character of the ciphertext will only change one character of the MAC.

Attack: Changes can be made to particular characters in the ciphertext, which will balance out and produce an unchanged MAC.
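The MAC and the balancing attack can be sketched as follows, assuming "encrypt" means character-wise addition modulo 28 over the alphabet A-Z, space, full stop. The helper names `add` and `mac` are hypothetical, and the toy inputs in the note are far shorter than a real 160-character message.

```python
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ ."  # assumed ordering of the 28 symbols
N_SYMBOLS = len(ALPHABET)


def add(group, key):
    """Encrypt `group` with `key` by character-wise addition mod 28."""
    return "".join(
        ALPHABET[(ALPHABET.index(g) + ALPHABET.index(k)) % N_SYMBOLS]
        for g, k in zip(group, key)
    )


def mac(ciphertext, pad_key, n):
    """Fold each n-character ciphertext group into the tag, then encrypt
    the tag with n characters of real one time pad key material."""
    if len(ciphertext) % n != 0 or len(pad_key) != n:
        raise ValueError("ciphertext must be a multiple of n; pad_key must be n long")
    tag = "A" * n  # arbitrary initial value
    for i in range(0, len(ciphertext), n):
        tag = add(tag, ciphertext[i:i + n])
    return add(tag, pad_key)
```

For example, with `n = 4` the ciphertexts `"QWERTYUI"` and `"RWERSYUI"` give the same tag: the first character of the first group went up by one and the first character of the second group went down by one, so the changes cancel. Changing only one character, as in `"RWERTYUI"`, alters exactly one character of the tag.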

AleksanderCH