3

Is it possible to use format-preserving encryption (FPE) to encrypt names and ordinary words like 'Bob', 'the', 'tree', preserving both length and format (for example, keeping every character within the range A-Z a-z)?

The only way I can think of at the moment is to reduce each character to its decimal value and encrypt that. But that doesn't seem very secure (wouldn't each character map to exactly the same value each time?).

erotavlas

1 Answer

3

The obvious way to FPE strings of $N$ characters of [A-Za-z] is to treat the string as a base-52 value (with each character being a digit, say, A=0, B=1, ..., y=50, z=51); do a base conversion of that to an integer between 0 and $52^N-1$; use a standard FPE technique to encrypt that value into another integer between 0 and $52^N-1$, and do a base conversion back into a string of $N$ characters of [A-Za-z].
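The two base conversions can be sketched in Python as follows. This is a minimal illustration, not part of any FPE library: the names `rank` and `unrank` are hypothetical, and the alphabet ordering (A=0, ..., Z=25, a=26, ..., z=51) is the one assumed above.

```python
import string

# Alphabet ordering assumed above: A=0, ..., Z=25, a=26, ..., z=51.
ALPHABET = string.ascii_uppercase + string.ascii_lowercase
BASE = len(ALPHABET)  # 52
DIGIT = {ch: i for i, ch in enumerate(ALPHABET)}


def rank(s: str) -> int:
    """Interpret s as a base-52 number, most significant character first."""
    n = 0
    for ch in s:
        # Horner's rule; raises KeyError for characters outside [A-Za-z].
        n = n * BASE + DIGIT[ch]
    return n


def unrank(n: int, length: int) -> str:
    """Inverse of rank: write n as exactly `length` base-52 digits."""
    if not 0 <= n < BASE ** length:
        raise ValueError("n is out of range for this length")
    chars = []
    for _ in range(length):
        n, d = divmod(n, BASE)  # peel off the least significant digit
        chars.append(ALPHABET[d])
    return "".join(reversed(chars))
```

With this ordering, `rank("ABC")` is 54 and `unrank(31415, 3)` is `"LgH"`. Passing the length to `unrank` explicitly matters: leading `A` characters are leading zeros, so the integer alone does not determine $N$.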

Because you're encrypting the string as a whole, you don't have the data leakage you would have if you encrypted each character individually.
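To see why neighbouring plaintexts end up unrelated, here is a toy keyed permutation on the integers $0 \ldots M-1$. It is only a sketch under stated assumptions: it is not a standardized FPE mode and should not be used to protect real data, and the names `toy_fpe`, `_feistel` and `_round` are made up for this example. It uses a balanced Feistel network over HMAC-SHA256, with cycle walking to stay inside the range.

```python
import hashlib
import hmac


def _round(key: bytes, rnd: int, half: int, half_bits: int) -> int:
    """Round function: HMAC-SHA256 of (round index, half), cut to half_bits."""
    msg = bytes([rnd]) + half.to_bytes(8, "big")
    digest = hmac.new(key, msg, hashlib.sha256).digest()
    return int.from_bytes(digest, "big") & ((1 << half_bits) - 1)


def _feistel(key: bytes, x: int, half_bits: int, rounds: int, decrypt: bool) -> int:
    """Balanced Feistel permutation on integers of 2 * half_bits bits."""
    mask = (1 << half_bits) - 1
    left, right = x >> half_bits, x & mask
    order = range(rounds - 1, -1, -1) if decrypt else range(rounds)
    for rnd in order:
        if decrypt:
            # Undo (L, R) -> (R, L ^ F(R)).
            left, right = right ^ _round(key, rnd, left, half_bits), left
        else:
            left, right = right, left ^ _round(key, rnd, right, half_bits)
    return (left << half_bits) | right


def toy_fpe(key: bytes, x: int, modulus: int, decrypt: bool = False, rounds: int = 8) -> int:
    """Toy permutation of range(modulus). Illustration only, NOT secure."""
    if not 0 <= x < modulus:
        raise ValueError("x must lie in range(modulus)")
    half_bits = max(1, ((modulus - 1).bit_length() + 1) // 2)
    while True:
        # Cycle walking: re-apply until the value lands back inside the range.
        x = _feistel(key, x, half_bits, rounds, decrypt)
        if x < modulus:
            return x


key = b"demo key"  # placeholder key for the example
M = 52 ** 3        # three characters of [A-Za-z]
c1, c2 = toy_fpe(key, 54, M), toy_fpe(key, 55, M)
print(c1, c2)      # the two outputs bear no obvious relation to each other
print(toy_fpe(key, c1, M, decrypt=True))  # recovers 54
```

The point is that the permutation acts on the whole integer, so changing the last character of the plaintext changes the ciphertext integer unpredictably, and therefore potentially every character of the ciphertext string after conversion back to base 52.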

poncho
  • 1
    Exactly. The formal name for this construction is rank-then-encipher, and was first treated in Bellare et al.'s paper on the subject: link – pg1989 Mar 02 '15 at 19:52
  • 1
    Interestingly, this technique can be extended to build permutations over regular languages and context-free grammars: link – pg1989 Mar 02 '15 at 19:54
  • I'm not sure I see how this is better; it seems like you are encrypting each character individually anyway, just as an integer. – erotavlas Apr 29 '15 at 20:08
  • 1
    @erotavlas: nope. Consider the case where we encrypt the strings "ABC" and "ABD". "ABC" is translated into the integer 54, while "ABD" is translated into the integer 55. Those integers are both encrypted as integers between 0 and 140607 (that is, $52^3 - 1$), and the mapping acts as if it were chosen randomly; 54 might encrypt as 31415, while 55 might encrypt as 27182. So, 31415 would translate back into "LgH", while 27182 would translate into "KCm". So, by changing one character, all three characters of the ciphertext are affected. – poncho Apr 29 '15 at 20:14
  • @poncho I guess the missing piece is how 'ABC' becomes 54 (I understand that the range 0 to 140607 comes from 0 to $52^N - 1$). – erotavlas Apr 29 '15 at 20:18
  • @erotavlas: "ABC" -> $ 0 \cdot 52^2 + 1 \cdot 52^1 + 2 \cdot 52^0 = 54$, as $A=0, B=1, C=2$ – poncho Apr 29 '15 at 21:14
  • @poncho What number does the exponent represent? Is there a general formula? – erotavlas Apr 29 '15 at 21:23
  • 1
    @erotavlas: review http://en.wikipedia.org/wiki/Positional_notation ; this is base-52 – poncho Apr 29 '15 at 21:29
  • @poncho Are these integers between 0 and $52^N - 1$ in base 10? Also, does the base conversion using positional notation always yield an integer in base 10? (Is it called decimal expansion?) I think I understand and have done everything correctly; I just want to verify. – erotavlas Jul 02 '15 at 22:16
  • @erotavlas: integers are not inherently in any particular base; they can be represented in any base. And yes, you can represent any integer in base 10 if you want; however, I would question the utility of doing so in this case. – poncho Jul 03 '15 at 01:40
  • @poncho I understand; however, I am just trying to clarify terminology for documentation purposes, and I'm unclear as to what base I am working in at each step. For instance, when I first create the number from the string it is in base 52. Does performing the expansion then result in a base 10 number? If so, then the range 0 to $52^N - 1$ is in base 10. If I pass this to the FPE function, it should return another base 10 number, which I then convert back to a base 52 number and then to a string. Is this correct? I have done the procedure and it works. I just want to verify. – erotavlas Jul 03 '15 at 11:42
  • @erotavlas: Typically, FPE functions deal with base 2 numbers (or, more generally, base $2^k$). So, what I would do is convert from base 52 into base 2, pass that base 2 number to the FPE function, and then take the base 2 number it gives you and convert it back into base 52. – poncho Jul 03 '15 at 13:09
  • @poncho How would I know if a particular implementation is using base 2 or not? The one I am using seems to be base 10 but I am guessing. http://botan.randombit.net/manual/fpe.html – erotavlas Jul 03 '15 at 14:40