I am trying to analyze symmetric block ciphers like DES, 3DES and AES, using Cryptool 2. I want to do a frequency analysis on each of these ciphers, in order to comment whether this is an effective way for cryptanalysis or not.

However, the ciphertext these ciphers produce comes in hex form: [screenshot of the hex ciphertext]

My question is - when performing frequency analysis on such ciphers, do I input the hex ciphertext or do I need to transform it into text? How can I comment on that (based on letter frequency or bit occurrence)?

Modal Nest
C-Bk
  • @kelalaka I read that before asking the question, but I didn't quite understand it. I'm testing a random plaintext in English. From what I understand, I can perform frequency analysis in hex mode, and I need to find the general frequencies of English text in hex mode in order to compare? – C-Bk Jul 02 '20 at 14:35
  • Frequency analysis on a block cipher can work at the block level. The data (plaintext) is in general byte encoded. If one encrypts only one byte, then byte-level frequency analysis works too. Of course, that requires non-randomized padding as well. In any case, one needs to compare 8-byte blocks for DES and 16-byte blocks for AES ciphertexts to observe a frequency, which is possible because of the insecure ECB mode. If any other mode is used, like CBC or CTR, then the mode has standard IND-CPA security and the attacker has no luck there – kelalaka Jul 02 '20 at 14:38
  • @kelalaka So in order to try to decrypt the ciphertext, I need to compare the results of block-level frequency analysis of my ciphertext with block-level frequency analysis of general English text? Is there something like that? I mean, the letter "e" has the highest occurrence in English texts; are there statistics like those for block-level frequencies? – C-Bk Jul 02 '20 at 14:44
  • One doesn't exactly decrypt the ciphertext, one finds the value. The standard definition of decryption requires the key. Here, we cannot say how the plaintexts are formed. They can be a date or a name, as in the example in the link. That is hidden in your question. If one knows the distribution of the plaintext and ECB mode is used, one can determine the plaintext with great probability. See the articles at the bottom of the linked answer. – kelalaka Jul 02 '20 at 14:50
  • @kelalaka: For you it is obvious that the frequencies of all symbols for these algorithms will be almost equal. But look at the goal in the OP: in order to comment whether this is an effective way for cryptanalysis or not. In my opinion, this is a reasonable task for a student: not to trust what is said in a book, but to check whether it really holds in a particular case. – mentallurg Jul 02 '20 at 20:36
  • @mentallurg to do that the OP needs the frequency of the message space, or at least some prior information. Then, divide the ciphertext into 8- or 16-byte blocks (HEX is better) and sort them. After that, they can decide whether some information about the plaintext can be found or not. I think the linked answer contains enough information for that. At least the linked articles do. – kelalaka Jul 02 '20 at 20:45
  • @kelalaka: You are right in the sense that this is what someone educated should do :) But, based on the question about the frequency of the letter "e" in English, I believe the goal is much simpler. I suppose that the teacher showed how frequency analysis can be applied to trivial ciphers like Caesar. As the next step the teacher wanted to show that there are more complex ciphers where "naive" frequency analysis is useless because all symbols have more or less equal frequencies. – mentallurg Jul 02 '20 at 21:16
  • @mentallurg are you the lecturer :P. Then the goal is not perfect. Anyway, without the full data, I can say you may be right. – kelalaka Jul 02 '20 at 21:19
  • @kelalaka: I am not a lecturer, but I often need to explain technical things to customers with little (or no) technical background :) If you don't mind, I'd suggest moving it to SO. There are many more people there who would be happy to suggest ready tools or code examples to implement the logic needed in this case. Actually, even conversion from hex to bytes is not needed. Based on the screenshot, this is just a hex representation of the byte stream, which is understandable, since not all of the bytes are printable symbols. – mentallurg Jul 02 '20 at 21:26
  • @mentallurg no problems. Can you flag it? – kelalaka Jul 02 '20 at 21:27
  • @kelalaka: Done. As I learned a couple of weeks ago from SEJPM, the only correct migration way is to migrate it first to "crypto.meta.stackexchange.com", so that the people at the SE company see the necessity of additional migration targets like SO or Information Security. – mentallurg Jul 02 '20 at 21:32
  • @mentallurg did you see this https://crypto.meta.stackexchange.com/q/1152/18298 – kelalaka Jul 02 '20 at 21:36
  • @kelalaka: No, I have not seen that. I usually flagged such questions as needing moderator attention, because the list contained no proper target site for migration. SEJPM told me that to convince the SE company to add more migration targets we need to gather more statistics, and for this we should report all such cases to Crypto Meta. I would prefer having the sites you have listed there. – mentallurg Jul 03 '20 at 00:48
  • Answered regardless of the close votes, as I think the answer about the format of the ciphertext can at least be explained. – Maarten Bodewes Jul 05 '20 at 11:56

1 Answer

My question is - when performing frequency analysis on such ciphers, do I input the hex ciphertext or do I need to transform it into text? How can I comment on that (based on letter frequency or bit occurrence)?

First of all, what you are looking at is a hexadecimal representation of the ciphertext. The ciphertext itself does not consist of hexadecimal characters; it is binary. Converting it to text makes very little sense; if any analysis has to be done, it has to be performed over the binary data without any conversion to text.
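As a rough illustration of working on the binary data rather than on text, here is a minimal Python sketch that decodes a hex string back to bytes and counts byte values. The function name `byte_frequencies` and the sample hex string are made up for this example; they are not taken from the question or from CrypTool.

```python
from collections import Counter


def byte_frequencies(hex_ciphertext: str) -> Counter:
    """Decode a hex string to raw bytes and count each byte value (0-255)."""
    # bytes.fromhex skips whitespace between byte pairs, so a
    # space-separated hex dump can be decoded directly.
    raw = bytes.fromhex(hex_ciphertext)
    # Iterating over a bytes object yields integers, one per byte.
    return Counter(raw)


# Hypothetical hex string, only for demonstration.
sample_hex = "3A 7F 00 3A C4 9E 3A 7F"
freq = byte_frequencies(sample_hex)

for value, count in freq.most_common(3):
    print(f"0x{value:02X}: {count}")
```

For output of a secure mode you would expect such a histogram to look roughly flat once the ciphertext is long enough, which is exactly why letter-style frequency tables are of no help here.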

We know that DES and AES are permutations over blocks of 8 and 16 bytes respectively. What you are showing is DES or AES used with a mode of operation. Analysis should only be performed on the combination of the two. And you can actually find e.g. duplicate blocks when DES or AES is used in ECB mode, because identical plaintext blocks encrypt to identical ciphertext blocks under the same key. With any correctly used secure cipher mode the ciphertext should of course be indistinguishable from random (IND-CPA/IND-CCA).
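Looking for duplicate blocks can be sketched in a few lines of Python. This is only an illustration under the assumption that you already have the raw ciphertext bytes; `repeated_blocks` is a hypothetical helper name, and the demo ciphertext is invented rather than produced by a real cipher.

```python
from collections import Counter


def repeated_blocks(ciphertext: bytes, block_size: int) -> dict:
    """Return every block that occurs more than once, mapped to its count."""
    if len(ciphertext) % block_size != 0:
        raise ValueError("ciphertext length is not a multiple of the block size")
    # Split into consecutive blocks: 8 bytes for DES/3DES, 16 bytes for AES.
    blocks = [ciphertext[i:i + block_size]
              for i in range(0, len(ciphertext), block_size)]
    counts = Counter(blocks)
    return {block: n for block, n in counts.items() if n > 1}


# Invented DES-sized example: the first and third blocks are identical.
demo = bytes.fromhex("0011223344556677" "8899aabbccddeeff" "0011223344556677")
dups = repeated_blocks(demo, 8)

for block, n in dups.items():
    print(block.hex(), "occurs", n, "times")
```

Repeats found this way hint at ECB (or at key/IV reuse), whereas with CBC or CTR and a fresh IV you would not expect any repeated blocks in ciphertexts of ordinary length.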

It doesn't make much sense to apply general letter frequency analysis to the ciphertext, as the output does not even consist of characters. Generally, analysis should not be performed willy-nilly; it is performed while keeping the characteristics of the cipher under investigation in mind. And this kind of leads to the answer...

Maarten Bodewes