Well, your first couple of bytes will probably be zero, since both compressed streams likely start with the same fixed ("golden") header bytes, which cancel each other out when XORed. That would almost certainly confirm that the same algorithm was used on both messages. I'm not sure whether that is of any use, though.
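A minimal sketch of that header cancellation, assuming Python's zlib as a stand-in compressor (fp8's own header layout is not assumed here) and two made-up messages:

```python
import zlib

def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings, truncating to the shorter one."""
    return bytes(x ^ y for x, y in zip(a, b))

# Hypothetical plaintexts; zlib stands in for the real compressor.
msg_a = b"Attack at dawn. " * 20
msg_b = b"Retreat at dusk, quietly. " * 20

comp_a = zlib.compress(msg_a)
comp_b = zlib.compress(msg_b)

mixed = xor_bytes(comp_a, comp_b)

# Both streams were produced with the same settings, so they share the
# same 2-byte zlib header, and those bytes XOR to zero.
print(comp_a[:2].hex(), comp_b[:2].hex(), mixed[:2].hex())
```

Seeing a run of zero bytes at the start of the XOR is the tell-tale: it only says the headers matched, not what either message contains.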
The attacker's problem with compression is that some of it is very good indeed. fp8 will compress to within 0.1% of the theoretical Shannon limit, which means that the compressed file will be almost perfectly random. For example, a large fp8-compressed file comfortably passes both the ent and FIPS-140 tests for randomness. A typical file compressed with fp8 will easily achieve 7.999837 bits/byte of entropy as measured by ent.
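For a rough idea of what a bits/byte figure like that means, here is a small sketch that computes the Shannon entropy of a byte histogram. As far as I know this is the same kind of byte-frequency measure ent reports, but the function name and sample inputs are my own and it is not ent's actual code:

```python
import math
import os
from collections import Counter

def entropy_bits_per_byte(data: bytes) -> float:
    """Shannon entropy of the byte-value histogram, in bits per byte (max 8.0)."""
    if not data:
        return 0.0
    n = len(data)
    counts = Counter(data)
    # Sum of p * log2(1/p) over every byte value that occurs.
    return sum((c / n) * math.log2(n / c) for c in counts.values())

print(entropy_bits_per_byte(b"aaaaaaaa"))           # a single repeated byte
print(entropy_bits_per_byte(bytes(range(256))))     # every byte value once
print(entropy_bits_per_byte(os.urandom(1 << 16)))   # OS randomness, varies per run
```

Note that this only looks at single-byte frequencies, so a high score is necessary but not sufficient for real randomness.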
The end is where it gets interesting. You mention misuse. It might be that the two messages compress to two very different lengths. If these were then XORed without anyone noticing, the tail of the longer one would be left as original, unmasked compressed data. Only a few people know how fp8 works, but it's feasible that you might be able to recover fragments in less time than a brute-force search would take. The attacker would only be fighting against the compression algorithm itself, and that's more Kerckhoffs than probability theory.
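The unequal-length case can be sketched like this, again assuming zlib in place of fp8 and a hypothetical "careless" XOR that simply passes the longer stream's leftover bytes through:

```python
import zlib

def xor_keep_tail(a: bytes, b: bytes) -> bytes:
    """XOR over the common length, then append the longer input's leftover bytes unchanged."""
    n = min(len(a), len(b))
    longer = a if len(a) >= len(b) else b
    return bytes(x ^ y for x, y in zip(a, b)) + longer[n:]

# Hypothetical messages chosen so the compressed lengths differ a lot.
short_comp = zlib.compress(b"short note")
long_comp = zlib.compress(bytes(range(256)))

mixed = xor_keep_tail(short_comp, long_comp)

# Everything past the shorter stream's length is raw compressed data.
exposed = mixed[len(short_comp):]
print(len(short_comp), len(long_comp), len(exposed))
print(exposed == long_comp[len(short_comp):])
```

The exposed tail is still only a fragment of a compressed stream, so turning it back into plaintext depends entirely on how much the attacker knows about the compressor.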
If they both end up exactly the same length before XORing, the problem is hard. If you have no idea what the messages could possibly be, you have 99.9979625% true randomness and 0.0020375% file format (from my example compression, 7.999837 out of a possible 8 bits/byte). The author's creativity in writing each original message forms a seed. The compressor acts as a true randomness extractor with a 0.0020375% output error. If internal blocks overlap, the file format gets destroyed and the error decreases very substantially. Tricky. NSA guys, what do you think?
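For anyone checking the percentages, they are just the measured entropy divided by the 8 bits/byte maximum. A quick sketch of the arithmetic (variable names are mine):

```python
measured = 7.999837   # bits per byte, the ent figure quoted above
maximum = 8.0         # a byte can carry at most 8 bits

random_fraction = measured / maximum
structure_fraction = 1.0 - random_fraction

# Print both as percentages with seven decimal places.
print(f"randomness:  {random_fraction:.7%}")
print(f"file format: {structure_fraction:.7%}")
```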