
Scenario

I have some unusual requirements for encrypting text:

  1. symmetric encryption;

  2. the encrypted text must itself be valid text (that is, made of words that humans could read, not just “gibberish”);

  3. I cannot change the key from time to time (that means no temporary or one-time key algorithms);

  4. each word must always be encrypted to the same word (and only to one word).

These needs are important to support the following use cases:

  1. a third party will get the encrypted text and will be able to index it for searches (effective indexing needs basic morphology, which is why need number 2 is important);
  2. when someone sends this third party one or more encrypted words, it will be able to find the texts that are associated with these words (this is why needs number 3 and 4 are important).

But I also have some flexibility in the system:

  1. I can always add more words to a dictionary (if needed in the encryption / decryption process);

  2. both sender and receiver have access to this dictionary (always synced).

Example scheme

So basically I can use some sort of dictionary that maps each word to an encrypted word, and replace each word in the text with its encrypted word (like a Caesar cipher, but for words).

Every time I encounter a word I can add it to the dictionary as both original word and encrypted word.

For example:

  • Word: “I” Encrypted word: “Cat”
  • Word: “Love” Encrypted word: “Dog”
  • Word: “You” Encrypted word: “Home”

Now the text “I Love You” will be encrypted to “Cat Dog Home”. This fits all the needs that I have. But as you probably know, this is very weak encryption.
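A minimal Python sketch of the mapping above, to make the weakness concrete. The table and function names are only illustrative, and the second half assumes a made-up longer sample text. It shows that word counts carry over unchanged from plaintext to ciphertext:

```python
from collections import Counter

# Illustrative table taken from the example above.
ENCRYPT = {"I": "Cat", "Love": "Dog", "You": "Home"}
DECRYPT = {cipher: plain for plain, cipher in ENCRYPT.items()}


def encrypt(text: str) -> str:
    """Replace every word with its fixed substitute."""
    return " ".join(ENCRYPT[word] for word in text.split())


def decrypt(text: str) -> str:
    """Invert the substitution word by word."""
    return " ".join(DECRYPT[word] for word in text.split())


print(encrypt("I Love You"))  # Cat Dog Home

# Each word always maps to the same word, so the frequency profile survives.
plain = "You Love You I Love You"
cipher = encrypt(plain)
print(Counter(plain.split()).most_common())   # [('You', 3), ('Love', 2), ('I', 1)]
print(Counter(cipher.split()).most_common())  # [('Home', 3), ('Dog', 2), ('Cat', 1)]
```

Anyone who knows typical word frequencies of the language can line up the two rankings, which is the same attack that breaks a Caesar cipher on letters.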

To make it stronger I was thinking about randomly inserting words that I know are fake (no mapping to any real word); this (maybe?) prevents statistical analysis of the language (common words, etc.).

For example:

  • Encrypted word: “Boy” is fake.
  • Encrypted word: “Test’s” is fake.

Now the text “I Love You” will be encrypted to “Cat Test’s Boy Dog Test’s Home Boy”. This fits all the needs that I have. But I believe this is still weak, so I’m looking for something better.
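A minimal sketch of the fake-word idea in Python, assuming the same illustrative table as in the example. The names `DECOYS`, `encrypt_with_decoys`, `decrypt_dropping_decoys` and the `decoy_rate` parameter are made up for this sketch:

```python
import random

# Illustrative table from the example; decoys have no plaintext mapping.
ENCRYPT = {"I": "Cat", "Love": "Dog", "You": "Home"}
DECRYPT = {cipher: plain for plain, cipher in ENCRYPT.items()}
DECOYS = ["Boy", "Test's"]


def encrypt_with_decoys(text, rng, decoy_rate=0.5):
    """Substitute each word, then append a random number of decoy words."""
    out = []
    for word in text.split():
        out.append(ENCRYPT[word])
        # Keep adding decoys while the coin flip succeeds.
        while rng.random() < decoy_rate:
            out.append(rng.choice(DECOYS))
    return " ".join(out)


def decrypt_dropping_decoys(text):
    """Drop every word that is not a known ciphertext word, then invert."""
    return " ".join(DECRYPT[word] for word in text.split() if word in DECRYPT)


rng = random.SystemRandom()  # OS-provided randomness, not a seeded PRNG
ciphertext = encrypt_with_decoys("I Love You", rng)
print(ciphertext)
print(decrypt_dropping_decoys(ciphertext))  # I Love You
```

The real words are still mapped one-to-one, so their frequencies relative to each other are unchanged; the decoys only add noise around them.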

Symmetric ciphers

I’ve read about symmetric encryption algorithms like Blowfish, Twofish and AES-256, and from my (limited) understanding they are all algorithms that replace one byte with another, and they are considered to be strong encryption methods.

Is it possible to implement a symmetric cipher, but instead of operating on bytes it will operate on words?
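To make the question concrete, here is a minimal Python sketch of what a keyed, word-level substitution could look like. It is not AES or any other standard cipher. It assumes a fixed shared word list (`WORDS`, made up here) and uses HMAC-SHA256 only to derive a key-dependent ordering of that list. All names are illustrative:

```python
import hashlib
import hmac

# Hypothetical shared dictionary; both sides must hold exactly the same list.
WORDS = ["cat", "dog", "home", "i", "love", "you"]


def build_tables(key: bytes, words):
    """Pair the sorted word list with a key-dependent reordering of itself."""
    ordered = sorted(set(words))
    shuffled = sorted(
        ordered,
        key=lambda w: hmac.new(key, w.encode("utf-8"), hashlib.sha256).digest(),
    )
    enc = dict(zip(ordered, shuffled))          # plaintext word -> cipher word
    dec = {c: p for p, c in enc.items()}        # cipher word -> plaintext word
    return enc, dec


def translate(text: str, table) -> str:
    """Apply a word table to whitespace-separated, lowercased text."""
    return " ".join(table[word] for word in text.lower().split())


enc, dec = build_tables(b"shared secret key", WORDS)
ciphertext = translate("I love you", enc)
print(ciphertext)
print(translate(ciphertext, dec))  # i love you
```

Every output is a real dictionary word and the same key always gives the same mapping, but this is still a one-to-one substitution, so word frequencies leak exactly as before. In this particular construction, adding a word to the list can also re-pair existing words, so the list would have to be fixed up front.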


NOTE

After further research I've come to the understanding that this question is more misleading than helpful.

The question asks about encryption but really leads toward obfuscation, and so may some of the answers.

I believe that noobs (like me) are more likely to be distracted and misled by this than to gain anything from it.

Patriot
Omri
  • Do the resulting words need to be pronounceable? (i.e. XKCD is not pronounceable, but INFOSEC is) Do the resulting words need to be short? – 700 Software Nov 10 '16 at 14:14
  • Can a chaining algorithm be used to increase security? For example, if "I love you" becomes "Cat Dog Home"; is it OK if "Love you" becomes something other than "Dog Home", to increase security? – 700 Software Nov 10 '16 at 14:16
  • No, a chaining algorithm would break need number 4, since the word Love won't be consistent. –  Nov 10 '16 at 14:39
  • The resulting words can be anything as long as the word is a real word in the English language. –  Nov 10 '16 at 14:40
  • 1) Security of this will not be great. 2) Why the requirement to use proper words for the ciphertext? That doesn't follow from your search requirements. – CodesInChaos Nov 10 '16 at 16:27
  • IMHO it is barely practical/sensible to modify any well-accepted scheme like AES for special purposes. You have a dictionary of words that can be indexed by integers in a certain range. With a given key you can, by shuffling with the key as the seed of an appropriate PRNG, bijectively map that natural index range to another one, resulting in what your examples showed. But if one key is used to encrypt a sufficiently large amount of material, the security would unavoidably suffer. Maybe you could examine the practical feasibility of using different keys for different sets of material. – Mok-Kong Shen Nov 12 '16 at 11:44
  • Hi Omri. Sorry to contact you after so much time. To be honest, I think this is the perfect question of a starting cryptographer. You've indicated a wish to delete it because the proposed scheme is not secure and just obfuscation, but this gets pretty well explained in the answers, so I don't think it will lead anybody in a trap; quite the opposite in fact. Please comment below if you have any issues with the Q/A or flag for moderator assistance. – Maarten Bodewes Feb 28 '20 at 16:25