If you have a file and XOR every bit with a random bit, can you extract any information?

Question

Say you have a file that is not random, and you XOR every bit with a random bit (truly random, not pseudorandom). Can someone who sees only the result extract any information from it? Obviously it won't be 100% accurate, but I imagine some statistical analysis could give a vague idea. If yes, how? If no, is there a mathematical proof?

Adding a small note in addition to Reid's answer… remember to never reuse the random key-stream. (Just in case you weren't aware of related issues if you do.) – e-sushi, Jun 25 '15 at 02:16
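To illustrate the key reuse warning above, here is a minimal Python sketch. The helper `xor_bytes` and the two sample messages are made up for this illustration, and `secrets.token_bytes` (the OS CSPRNG) only stands in for a truly random pad. XOR-ing two ciphertexts that share a pad cancels the pad completely:

```python
import secrets


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


# Two hypothetical messages of equal length, encrypted with the SAME pad.
m1 = b"attack at dawn"
m2 = b"retreat at six"
pad = secrets.token_bytes(len(m1))  # stand-in for a truly random pad

c1 = xor_bytes(m1, pad)
c2 = xor_bytes(m2, pad)

# The pad cancels out: c1 XOR c2 equals m1 XOR m2, with no randomness left.
leak = xor_bytes(c1, c2)

# Anyone who knows (or guesses) one message recovers the other directly.
recovered_m2 = xor_bytes(leak, m1)
```

Here `leak` depends only on the two plaintexts, so the attacker no longer faces random noise and ordinary statistical or dictionary attacks become possible.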

score 4 · Answer 1 · answered Jun 25 '15 at 00:51


This cipher is called a one-time pad. It is unbreakable ("perfect secrecy") assuming that:

The pad (the collection of random bits) is truly random
The pad is never reused to encrypt other messages

So, no information can be extracted from $\text{file} \oplus \text{random bits}$.
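A sketch of why this holds, using notation that is not in the original answer (assumed here: $M$ is the $n$-bit file, $K$ is the pad drawn uniformly from $\{0,1\}^n$ independently of $M$, and $C = M \oplus K$ is the result). For any fixed $m$ and $c$:

$$\Pr[C = c \mid M = m] = \Pr[K = m \oplus c] = 2^{-n}$$

This value does not depend on $m$, so $\Pr[C = c] = 2^{-n}$ as well, and Bayes' rule gives

$$\Pr[M = m \mid C = c] = \frac{\Pr[C = c \mid M = m]\,\Pr[M = m]}{\Pr[C = c]} = \Pr[M = m]$$

In other words, whatever the attacker believed about the file before seeing $c$ is exactly what they should believe after seeing it.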

The basic idea of the proof is that an attacker can try every possible key, but they have no way of knowing which of the resulting plaintexts is the correct one. If I encrypt "attack" with a one-time pad, then any six-character string could just as well have been the plaintext: for each one there is a pad that produces exactly the same ciphertext.
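The "attack" example can be checked directly. The following is a small Python sketch, not part of the original answer: `xor_bytes` and `explaining_pad` are hypothetical helper names, the candidate words are arbitrary, and `secrets.token_bytes` stands in for a truly random pad.

```python
import secrets


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def explaining_pad(ciphertext: bytes, candidate: bytes) -> bytes:
    """Return the pad under which `candidate` encrypts to `ciphertext`."""
    return xor_bytes(ciphertext, candidate)


plaintext = b"attack"
pad = secrets.token_bytes(len(plaintext))  # stand-in for a truly random pad
ciphertext = xor_bytes(plaintext, pad)

# Every six-byte candidate is consistent with the ciphertext under some pad,
# so the ciphertext alone cannot single out the real plaintext.
candidates = [b"attack", b"defend", b"retire"]
consistent = [
    xor_bytes(candidate, explaining_pad(ciphertext, candidate)) == ciphertext
    for candidate in candidates
]
```

Every entry of `consistent` is `True`, and since each pad is equally likely to have been chosen, trying all keys gives the attacker nothing to rank the candidates by.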


Reid


Thanks, I think the 'if never reused' bit is what I was looking for. – not sure Jun 25 '15 at 00:54

You do leak the length of the plaintext, unless you use some sort of padding. – SAI Peregrinus Jun 25 '15 at 03:13

score -2 · Answer 2 · answered Jun 25 '15 at 10:15


If the file has been crafted deliberately to survive this form of damage, then yes, you should be able to recover your data.

There are many fairly simple methods, ranging from adding CRCs to replicating the data multiple times.

There are other possible routes to recovery. If, for example, the file was an ASCII text file, then it may be possible to recover something close to the original data by reasoning and dictionary work.


OldCurmudgeon


The "damage" done by XOR-ing with a truly random source (any bit has 50% independent chance of being 0 or 1) is too much for any recovery scheme. No amount of statistical analysis or combining of known repeated elements will give you a better than 50% guesswork on any individual bit, and a correct guess at any bit value gives you no advantage on guessing any other bit value. Any CRCs would be equally mangled and not recoverable. – Neil Slater Jun 25 '15 at 11:32
If you changed the source to have some bias such as $p(0)=0.4, p(1)=0.6$, then enough repetition or suitably robust error correction codes could in theory work. – Neil Slater Jun 25 '15 at 11:36
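A rough simulation of that last point, as a sketch under stated assumptions: the pad bits are biased with $p(1)=0.6$ as in the comment, each plaintext bit is simply repeated 1001 times (a crude repetition code chosen for illustration), and all function names and the seed are made up here.

```python
import random


def biased_pad(n: int, p_one: float, rng: random.Random) -> list[int]:
    """Draw n independent pad bits, each equal to 1 with probability p_one."""
    return [1 if rng.random() < p_one else 0 for _ in range(n)]


def encrypt_repeated(bits, copies, p_one, rng):
    """Repeat every plaintext bit `copies` times, then XOR with a biased pad."""
    repeated = [b for b in bits for _ in range(copies)]
    pad = biased_pad(len(repeated), p_one, rng)
    return [b ^ k for b, k in zip(repeated, pad)]


def recover(cipher_bits, copies, p_one):
    """Majority-vote each block of copies, correcting for the known bias."""
    out = []
    for i in range(0, len(cipher_bits), copies):
        block = cipher_bits[i:i + copies]
        majority = 1 if sum(block) * 2 > len(block) else 0
        # With p_one > 0.5 most copies were flipped, so invert the majority.
        out.append(majority ^ 1 if p_one > 0.5 else majority)
    return out


rng = random.Random(2015)  # fixed seed so the run is repeatable
message = [1, 0, 1, 1, 0, 0, 1, 0]
cipher = encrypt_repeated(message, copies=1001, p_one=0.6, rng=rng)
recovered = recover(cipher, copies=1001, p_one=0.6)
```

With this much repetition the majority vote recovers `message` with overwhelming probability. With `p_one=0.5` the vote carries no information about the plaintext bit, however many copies are used, which is the one-time pad case.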
