17

I'm very uneducated when it comes to cryptography. I have tried to find an answer to my question, but what I've read doesn't quite cover what I'm asking.

I have thought up my own encryption algorithm (which I'm sure is nothing new) and I would like to know why this sort of algorithm is not used in terms of security, efficiency and any other reasons.

It stems from the Caesar cipher. But instead of shifting every letter by the same fixed amount (13, as in ROT13), you shift each character (using its computer character value) by a different value each time.

So, let's say I had the sentence:

My dog's name is Rover.

And I have an array of numbers:

1, 4, 3, 2

I would shift M up by 1 character, y by 4, [space] by 3, d by 2 and then continue with the rest of the sentence, looping through the number array. So, the sentence would be converted into N}#fpk*u!rdof$lu!Vrxfv1.

Let's call the array your "key". Only someone with the key could decipher the message, unless someone figures it out. The more items in your array (key), the safer you are from someone deciphering your message.
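For readers who want to try the scheme, here is a minimal Python sketch of what is described above. The function names `shift_encrypt` and `shift_decrypt` are made up for illustration, and the sketch assumes shifts are applied to Unicode code points with no wrap-around, since the question does not say what should happen at the top of the character range.

```python
def shift_encrypt(plaintext, key):
    """Shift each character up by the next key value, cycling through the key."""
    return "".join(
        chr(ord(ch) + key[i % len(key)]) for i, ch in enumerate(plaintext)
    )


def shift_decrypt(ciphertext, key):
    """Undo shift_encrypt by shifting each character back down."""
    return "".join(
        chr(ord(ch) - key[i % len(key)]) for i, ch in enumerate(ciphertext)
    )


message = "My dog's name is Rover."
key = [1, 4, 3, 2]  # the example key from the question

ciphertext = shift_encrypt(message, key)
print(ciphertext)                      # N}#fpk*u!rdof$lu!Vrxfv1
print(shift_decrypt(ciphertext, key))  # My dog's name is Rover.
```

The printed ciphertext matches the one in the question, which suggests this is the intended reading of the algorithm.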

Apologies if my explanation is poor. To summarize my question:

  • Why is this bad (or, not used)?

  • Is it inefficient/insecure compared to other encryption methods?

  • Is there anything else about this that should be considered?

Thanks.

tylo
user1575550
  • 33
    This should be (up to naming) a Vigenère cipher, one of the standard examples of classical cryptography (and of cryptography that can't be trivially brute-forced but is still "easy" to break using proper methods). – SEJPM Jun 07 '17 at 20:14
  • 9
    Creating crypto algorithms in general isn't bad, but assuming they're secure without spending time analyzing them or researching the history of cryptography is a big risk. Any "home-brewed" algorithm shouldn't be used in practice, or shared with others who might use it. If you're going to keep making algorithms in this field, please clearly label them as "toy ciphers" until you know how to prove their security. Learn to prove yourself wrong before trying to prove yourself to others. This is just my two cents. – floor cat Jun 07 '17 at 21:17
  • 1
    An extension to this might be if the shift numbers were 0 - 25 (for letters A-Z) and the length of the key was the same as the length of the plain text. Then you'd really be onto something... – Paul Uszak Jun 07 '17 at 21:39
  • 4
    If your key is as long as your message, you'd be much more secure using a one-time-pad with random data (key) and the xor operation, while still having the same problems with key distribution as you currently have. If your key is shorter than your plain text, it has to repeat, which opens you up to all the statistical analysis methods mentioned in the other answers already. – JesseM Jun 07 '17 at 23:53
  • I second @floorcat 's sentiments. The only way to get into cryptography is to try and invent algorithms. However, one should always remember a quote by Bruce Schneier, one of the greats in cryptography: "Any fool can invent an encryption so strong that he/she cannot break it." Also, with regard to questions like these, it's worth remembering the story of the young cryptographer and his professor. The cryptographer kept coming up with "unbreakable" codes and the professor kept breaking them. Finally, the cryptographer came up with a code that the professor did not simply break. – Cort Ammon Jun 08 '17 at 01:07
  • Instead, he showed the young cryptographer two envelopes. He said, "In these envelopes are two ways to break your algorithm. When you can come back and show me that this algorithm is broken, I'll give you the envelopes and you can see for yourself." – Cort Ammon Jun 08 '17 at 01:08
  • If the key is random and is as long as the message, this is a one-time pad: perfectly secure (of course, the key can't ever be re-used for a second message). If the key is shorter than the message, you have serious problems. – Nicholas Wilson Jun 08 '17 at 08:30
  • 2
    See https://crypto.stackexchange.com/q/43272/2430 – Martin Schröder Jun 08 '17 at 10:12
  • Given that this is a question from a beginner, I don't think any comments regarding OTP and implications of perfect secrecy are useful. It is quite possible, that they actually confuse the OP. – tylo Jun 08 '17 at 11:06
  • When I was a teenager 40+ years ago, I read lots of books and magazine articles (e.g. Scientific American) on codes and code breaking, and this cypher was described there. Do people interested in codes no longer have access to resources like those? – Barmar Jun 08 '17 at 21:11
  • @Barmar Access is surely not the problem today - but actually filtering all the available information. I think in almost any introduction to cryptography Vigenere is at least mentioned - as SEJPM said it is the most prominent example of a classical cipher, which was considered secure for centuries but isn't any more. – tylo Jun 09 '17 at 08:29

5 Answers

41

Edit: how to break Vigenère

Yes, this cipher (which, as SEJPM points out, is Vigenère) is vulnerable to frequency analysis. Vigenère resists it a bit, because a common letter isn't always enciphered into the same ciphertext character. The unavoidable vulnerability stems from the repeating nature of the key.

Let's say your message is $N$ characters long and your key is $k$ characters long. You may think that by repeating the key until you reach $N$ characters that you gain some "units of security" ("it's $\lfloor N/k \rfloor$ times more secure!"). But if I know your message is enciphered in this way, I have learned a great deal about your plaintext.

If I can estimate your key length $k$ (and I can, with the great Kasiski's help), then I can take every $k$th ciphertext character and group them together (for example, if $k=5$, group together the first, sixth, eleventh, etc. characters, then group the second, seventh, twelfth, etc. into another block, and so on). Every character within a block is enciphered with the same shift, namely one character of your key. At this point, a straightforward frequency analysis comes into play; breaking each block is no harder than breaking the Caesar cipher you started with.

I do this for each "group" and now I have recovered your key.
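Here is a minimal Python sketch of that grouping step, under some loud assumptions: the key length is already known, the plaintext is ordinary English in which the space is the most frequent character, and shifts never run past the end of the character range. The names `shift_encrypt` and `recover_key` are made up for illustration.

```python
from collections import Counter


def shift_encrypt(plaintext, key):
    """The cipher from the question: shift each character by the repeating key."""
    return "".join(
        chr(ord(ch) + key[i % len(key)]) for i, ch in enumerate(plaintext)
    )


def recover_key(ciphertext, key_length, most_common_plain=" "):
    """Guess each shift by assuming the most frequent character in every
    group is the encryption of `most_common_plain` (an assumption about
    the plaintext, not something the ciphertext tells us)."""
    key = []
    for offset in range(key_length):
        group = ciphertext[offset::key_length]  # every key_length-th character
        top_char, _count = Counter(group).most_common(1)[0]
        key.append(ord(top_char) - ord(most_common_plain))
    return key


# A repetitive sample text; real messages need to be long enough for the
# frequency counts in each group to be meaningful.
text = "the quick brown fox jumps over the lazy dog. " * 8
secret = shift_encrypt(text, [1, 4, 3, 2])
print(recover_key(secret, 4))  # [1, 4, 3, 2]
```

On short or unusual messages the "most frequent character is a space" guess can fail for some groups, in which case one would compare each group against a full letter-frequency table instead.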


It is cool that you came up with the idea of a polyalphabetic cipher on your own. But as the comment and other answer have pointed out, this cipher is insecure and can be broken quite reliably. Seriously: only use this for fun or pedagogical reasons.

In 1863, Kasiski even knew how to guess the length of the key for this cipher! That forces you to make your key longer / harder to remember.
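A rough Python sketch of the idea behind Kasiski's method: when the same stretch of plaintext happens to line up with the same part of the key, the ciphertext repeats too, so the distances between repeated ciphertext fragments tend to be multiples of the key length. The function names are made up for illustration, and taking a plain gcd of all distances is a simplification; a single accidental repeat throws it off, so a careful analysis would look at the most common factors instead.

```python
from functools import reduce
from math import gcd


def repeat_distances(ciphertext, n=3):
    """Distances between consecutive occurrences of each repeated n-gram."""
    last_seen = {}
    distances = []
    for i in range(len(ciphertext) - n + 1):
        gram = ciphertext[i:i + n]
        if gram in last_seen:
            distances.append(i - last_seen[gram])
        last_seen[gram] = i
    return distances


def guess_key_length(ciphertext, n=3):
    """Simplified guess: the gcd of all repeat distances (None if no repeats)."""
    distances = repeat_distances(ciphertext, n)
    return reduce(gcd, distances) if distances else None


# Toy ciphertext in which "QWE" recurs at positions 0, 8 and 20.
toy = "QWEabcdeQWEfghijklmnQWE"
print(repeat_distances(toy))  # [8, 12]
print(guess_key_length(toy))  # 4
```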

It's very efficient; you can encipher text quickly with it. But that's a double-edged sword: I can attempt to decipher it quickly as well.

15

Edit:

I think the edit to the question makes it a Vigenère cipher, which invalidates my answer below. @galvatron's answer explains why Vigenère is not secure.

The old answer below (applies only to substitution)

Basically this is a simple substitution cipher, where each letter is mapped to another letter (i.e. the shift). The answers to your questions:

Why is this bad (or, not used)

Because each language has known letter-frequency patterns. For example, in the English language e is the most used letter. So, if an adversary has a long enough piece of text encrypted with your algorithm, he can easily work out the shift you used for each letter.

Is it inefficient/insecure compared to other encryption methods?

Yes, it is not secure and can be broken easily by statistical analysis. You can also use brute force, with a worst case of $26!$ keys to try.

Is there anything else about this that should be considered?

A better algorithm than yours is called the Vigenere cipher but it is also insecure. See the link (https://en.wikipedia.org/wiki/Vigen%C3%A8re_cipher) for more information.

NuminousName
  • 2
    "d" is the most used letter? It's not "e"? (as in etaoin shrdlu...?) –  Jun 07 '17 at 21:24
  • 1
    https://www.math.cornell.edu/~mec/2003-2004/cryptography/subs/frequencies.html – Paul Uszak Jun 07 '17 at 21:31
  • @galvatron Judging from QWERTY, I'm guessing it's a typo. – Sam Estep Jun 07 '17 at 23:43
  • 3
    "where each letter is mapped to another letter" - but if the key length is equal to the length of the message, how can a frequency analysis be helpful? Every "e" would be represented by a different symbol and thus you couldn't look for a "symbol" which appears the most. Regardless, the biggest issue in terms of implementation is that your key size would have to be at least as large as the message, which isn't very practical. – noahnu Jun 08 '17 at 00:19
  • @SamEstep You know, I use Dvorak, where QWERTY "d" = Dvorak "e", so I can buy that! –  Jun 08 '17 at 00:21
  • 1
    @user43088 Then you'd need to gather multiple messages before you could start using frequency analysis, but could then proceed normally. That's the reason why you can never reuse a one-time-pad. (which is effectively what this would become if the key length is equal to the message length). – Ray Jun 08 '17 at 00:22
  • 7
    The given algorithm is the Vigenere cipher, which invalidates almost everything you've said. It isn't secure, but not for the given reasons, and obviously suggesting that the Vigenere cipher is better (than itself) is out of place. – Eric Tressler Jun 08 '17 at 02:13
  • 4
    It can be broken with statistical analysis easily enough, but 26! isn't going to be brute-forced. That's over 2^88 possible keys. Also, 26! is for an alphabetic substitution, which is not the cipher the OP was describing; they were using an offset into the full "computer character" (presumably ASCII) set, with possible repeats. So 256^keylen possible keys. Which for short keys might actually be brute-forceable, although frequency analysis is still clearly the way to go. – Ray Jun 08 '17 at 02:30
  • 1
    @noahnu even with a short key the mapping of "e" will differ. For example, "Free refugees" with key 12 becomes "Gtfg sggwhgfu", so it is still very hard to decrypt. – asmgx Jun 08 '17 at 04:28
  • 1
    @asmgx The Vigenere cipher (that's the name of the cipher in the question) is broken - Cort Ammon already quoted the so-called Schneier's law in a comment to the question. An example can never show that something is secure or even difficult to break. – tylo Jun 08 '17 at 11:28
  • 14
    The answer is wrong: Even if the encryption method isn't described clearly in the question, an ASCII table and the example clearly show that it is used as a Vigenere cipher (every symbol is shifted, including the spaces). It is not a simple substitution cipher and the number of keys isn't $26!$. The upvotes here and the lengthy discussion are surprising - since SEJPM already wrote the correct answer in the very first comment. – tylo Jun 08 '17 at 11:49
  • 2
    As of now the post has +15 votes. I've learned more from the comment chain than the post itself (which as @tylo says is possibly misleading). Can the post be corrected / a disclaimer added? – noahnu Jun 08 '17 at 14:56
11

Encryption is naïvely viewed as a way to send messages from A to B that cannot be deciphered (at least in practice) by an adversary. Sure, encryption does do that, but modern ciphers do so much more...

A common attack scenario is the known plaintext attack (KPA). Of course, if the adversary already knows the entire plaintext, there's not much to be gained by encrypting it. However, a secure cipher remains secure even if the adversary knows the plaintext of a previous message or merely parts of the plaintext of the current one.

Assume a message was encrypted by your algorithm, and the ciphertext begins with the following bytes.

Pksqr,)Rkus|!

Looks scrambled, so far so good. However, suppose the attacker knows that the recipient is Jenny and that the sender would most likely greet her with

Hello, Jenny!

Subtracting the code points of the plaintext from the code points of the ciphertext gives the following result.

   80 107 115 113 114  44  41  82 107 117 115 124  33
-  72 101 108 108 111  44  32  74 101 110 110 121  33
-----------------------------------------------------
    8   6   7   5   3   0   9   8   6   7   5   3   0

With absolutely no effort, statistical analysis or cryptanalysis skills, the attacker can guess the key to be 8 6 7 5 3 0 9 and decrypt the rest of the message.
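The subtraction is easy to automate. A small Python sketch, with the helper names `key_stream` and `shortest_period` made up for illustration; it assumes the guessed plaintext lines up with the start of the ciphertext and covers at least one full repetition of the key.

```python
def key_stream(ciphertext, known_plaintext):
    """Per-character shifts implied by a ciphertext/plaintext pair."""
    return [ord(c) - ord(p) for c, p in zip(ciphertext, known_plaintext)]


def shortest_period(stream):
    """Shortest prefix that reproduces the whole stream when repeated."""
    for period in range(1, len(stream) + 1):
        if all(stream[i] == stream[i % period] for i in range(len(stream))):
            return stream[:period]
    return stream


stream = key_stream("Pksqr,)Rkus|!", "Hello, Jenny!")
print(stream)                   # [8, 6, 7, 5, 3, 0, 9, 8, 6, 7, 5, 3, 0]
print(shortest_period(stream))  # [8, 6, 7, 5, 3, 0, 9]
```

If the known plaintext is shorter than the key, this only recovers part of the key, but even a partial key decrypts the matching positions of every later block.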

Thus, KPA resistance is a requirement for ciphers to be considered secure nowadays. All modern ciphers satisfy this requirement and – if properly used – can securely encrypt billions of plaintexts, each containing billions of characters.

Dennis
  • 2
    While looking at modern security definitions, I would suggest replacing known plaintext attacks by chosen plaintext attacks as the absolute minimum today (both for symmetric and asymmetric encryption schemes). If some scheme is resistant to KPA but not CPA, no one would consider that scheme to be secure. – tylo Jun 09 '17 at 08:40
  • Sure, but showing that a cipher is vulnerable to KPA means that it is even weaker. – Dennis Jun 09 '17 at 14:31
4

Sari's answer is very good at explaining why this method isn't particularly good. However, I'd like to add that if your array is drawn from a very good random number source (think radioactive decay, the LSB of radio noise etc.), is at least as long as your plaintext, and is never used to encrypt more than a single message, then what you have is a one-time pad. It is worth reading about them, as they are theoretically unbreakable, but you have to keep the keys secure and can never reuse them.
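As a rough illustration, here is a Python sketch of the same shifting scheme used as a one-time pad. The function names are made up, the shifts wrap around modulo 256 so the result stays within byte values, and the standard-library `secrets` module (the operating system's cryptographic random generator) stands in for the physical randomness sources mentioned above, which is an assumption of this sketch rather than a true equivalent.

```python
import secrets


def make_pad(length):
    """One random shift (0-255) per message byte; never reuse a pad."""
    return [secrets.randbelow(256) for _ in range(length)]


def pad_encrypt(message: bytes, pad):
    """Shift each byte by its own pad value, wrapping around at 256."""
    if len(pad) < len(message):
        raise ValueError("pad must be at least as long as the message")
    return bytes((m + k) % 256 for m, k in zip(message, pad))


def pad_decrypt(ciphertext: bytes, pad):
    """Shift each byte back down by the same pad value."""
    return bytes((c - k) % 256 for c, k in zip(ciphertext, pad))


message = b"My dog's name is Rover."
pad = make_pad(len(message))
ciphertext = pad_encrypt(message, pad)
print(pad_decrypt(ciphertext, pad))  # b"My dog's name is Rover."
```

The hard part is not this code but everything around it: generating, exchanging and protecting a pad as long as all the messages you will ever send.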

dkaeae
MikeS159
  • 6
    Sari's answer failed to realize that the OP actually described a Vigenere cipher and not a monoalphabetic substitution cipher. – tylo Jun 08 '17 at 11:32
  • 5
    The OP mentions "looping through the number array". How would that ever be an OTP? – dkaeae Jun 08 '17 at 11:39
  • @dkaeae if the array is bigger than your plain text, (and you don't reuse the same array for other messages) then you only use it once. However, that's not the implication given by the OP, so this is really more of a comment as answer, by someone who couldn't comment. – Rick Jun 08 '17 at 17:20
  • @dkaeae The OP may mean "looping through the number array" as going through the array multiple times. Or he may mean he has a loop that processes each character of the plain text - in the process iterating through the array, but only once. I don't see enough context in the OP to decide one way or the other. – user2460798 Jun 08 '17 at 20:44
1

The reasons why this is not particularly secure have been well explained here. On the off chance that you're interested in pursuing these matters a little further, here are a few useful resources to develop your interest and understanding:

  • Helen Gaines, Cryptanalysis (introductory-level text on elementary ciphers and how they can be broken, available from Dover) (this will show you how to systematically break your cipher)
  • Neal Stephenson, Cryptonomicon (fiction, but thought-provoking)
  • David Kahn, The Code Breakers (classic study of the development of codes in politics and war)
  • Dan Boneh, Crypto I (Coursera course, starts to get into the mathematical bases for modern cryptographic systems - if you're willing to struggle a bit, there's a lot of meat on those bones)

If you make it through those, there are deeper resources out there and you'll be well equipped to find the ones you need next. Expect to need some math fundamentals; number theory and advanced algebra, for example, will be important. Good luck, and have fun!

Jon Kiparsky