
Are there any requirements for the additional data used in the GCM block cipher mode? And is there any "classic" information that is worthwhile to include? Could, for example, a username or the name and surname of a person be suitable additional data when encrypting a message with AES-GCM?

M-elman
  • You really should stuff everything into the AAD that you can think of that is relevant to the connection / packet at hand. – SEJPM Dec 06 '16 at 19:29
  • Isn't there any information that is recommended to include or, conversely, that shouldn't be used as additional data? – M-elman Dec 06 '16 at 20:17

1 Answer


There are few requirements for the additional data in the GCM document of NIST.

There are some limitations on the size of the AAD: the maximum is $\mathrm{len}(A) \leq 2^{64}-1$ bits.

Furthermore, the following recommendation about size applies for 128-bit tags:

Therefore, the total number of blocks of plaintext and AAD that are protected by invocations of the authenticated encryption function during the lifetime of the key should be limited. A reasonable limit for most applications would be $2^{64}$, consistent with the requirement on the number of invocations in Sec. 8.3.

and for shorter authentication tags Appendix C applies further restrictions on the length of ciphertext (or equivalently plaintext) and AAD combined.

Those are all the restrictions that apply according to the NIST document.

I'll however add another restriction: AAD is defined in bytes. Whatever you do, for a successful authentication, the AAD must be encoded to the same bytes. That is: the encoding of the AAD must be a canonical encoding. Obviously if you encode a sequence number in little endian and later in big endian then you're in trouble. Same thing if it is unclear where one field ends and the other starts.
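To make the two pitfalls above concrete, here is a minimal sketch using only the Python standard library. The field values (`alice`, `bob`, the sequence number `1`) are made up for illustration; nothing here is specific to any GCM implementation.

```python
import struct

# Naive concatenation: two different field pairs collapse to the same bytes.
naive_a = b"alice" + b"bob"      # fields ("alice", "bob")
naive_b = b"alic" + b"ebob"      # fields ("alic", "ebob")
ambiguous = naive_a == naive_b   # True: both pairs yield identical AAD bytes

# Byte order: the same sequence number gives different bytes per endianness.
seq = 1
big_endian = struct.pack(">Q", seq)     # 8 bytes, most significant byte first
little_endian = struct.pack("<Q", seq)  # 8 bytes, least significant byte first
mismatch = big_endian != little_endian  # True: the two encodings disagree
```

In the first case two distinct inputs authenticate as the same AAD; in the second, sender and receiver would feed different bytes to GCM and the tag check would fail even though both hold the "same" number.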


Is there any "classic" used information worthwhile to be used? Can for example a username or the name and surname of a person be suitable to encrypt a message with AES GCM?

NIST again offers a few options:

The controlling system or protocol may protect against such an event by monitoring for any duplication of the IVs that are presented for authenticated decryption. Alternatively, certain identifying information can be incorporated into the AAD. Examples of such information include a sequential message number or a timestamp.

I'd say it's a good idea to include as much identifying data in your AAD as possible. Note that you do not have to send the AAD data as long as you can produce it at the sender and the receiver. At least you should be able to prevent replay attacks and to distinguish between sender and receiver. If you cannot do this using the key/IV combination then you'd have to use the AAD (or plaintext) for that.

So yes, name and surname are a good idea as they identify the sender. Beware, however, that names are often not unique, and uniqueness is something you'd sorely want in an identification scheme (hence UUIDs etc.).
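As a rough sketch of AAD that is never transmitted but rebuilt on both sides, the following assumes each party is identified by a UUID and that both sides track a message counter. The function name `build_aad`, the field layout, and the two UUID values are hypothetical choices for illustration, not something prescribed by NIST.

```python
import struct
import uuid

def build_aad(sender: uuid.UUID, receiver: uuid.UUID, seq: int) -> bytes:
    """Fixed-width AAD: 16-byte sender, 16-byte receiver, 8-byte big-endian counter."""
    return sender.bytes + receiver.bytes + struct.pack(">Q", seq)

# Hypothetical identifiers, known to both sides out of band.
alice = uuid.UUID("00000000-0000-0000-0000-0000000000a1")
bob = uuid.UUID("00000000-0000-0000-0000-0000000000b2")

sent_aad = build_aad(alice, bob, seq=7)       # sender side, message 7
expected_aad = build_aad(alice, bob, seq=7)   # receiver rebuilds it locally
next_aad = build_aad(alice, bob, seq=8)       # what the receiver expects afterwards
```

Because every field has a fixed width, no delimiters are needed. The resulting bytes would be passed as the associated data argument of whatever AES-GCM API you use; a message 7 replayed when the receiver expects message 8 would then be verified against `next_aad` and fail authentication.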

One thing you do not have to include is the IV; the IV is authenticated by default. And in case you're a greenhorn in cryptography: including the key is a bad idea as well.

dave_thompson_085
Maarten Bodewes
  • "Including the key is a bad idea as well." - is there any treatment on this? Ie should there be any founded fear to include the key (beside it being pointless)? – SEJPM Dec 06 '16 at 20:30
  • @SEJPM That's enough reason not to do it. But in general you should just use a key for one singular purpose. And mind that if you'd ever want to use hardware encryption (or something similar) that the key value may not even be available. – Maarten Bodewes Dec 06 '16 at 20:37
  • You got me curious there, I posted a Q about scientific treatment of this question :) (Even though it's pointless, bad practice and may be straight impossible) – SEJPM Dec 06 '16 at 20:43
  • What is the recommended size of the tag? – M-elman Dec 06 '16 at 21:01
  • That's simple: 128 bits, the maximum size in other words. – Maarten Bodewes Dec 06 '16 at 21:07
  • I'll remember to encode the AAD the same way everywhere. E.g. Base64 should be good, right? And finally another question: so the important thing is to make sure that the set of AAD used is unique? (Name and surname isn't good if two users share the same name and surname, while id+name+surname, where the id is unique, is a good set of AAD? Or is the ID alone enough in this case because it is unique?) – M-elman Dec 06 '16 at 21:08
  • Base64 encodes binary as text. You need to encode to binary; if you already have a byte array then you don't have to encode. As for the ID: it is enough as long as it uniquely identifies the sender to the receiver (for the specific key). It depends on your protocol whether you want to include name / surname & whatever more (you're good at final questions, I presume this was the last one?) – Maarten Bodewes Dec 06 '16 at 21:28
  • It was. My questions often arise from answers because I want to fully understand them; thank you as always. – M-elman Dec 06 '16 at 21:48
  • One thing I'd add to this is that it's often useful to include data that's out-of-band and not attacker-controlled. Things that arise out of the context the data is being used in. For instance, if you're decrypting data for a particular user, that user's database ID might be included in the AAD. Or if there's a sequence number, like TLS. I'd also clarify that if you have multiple fields, those fields need to be explicitly delimited or have their length encoded in a non-ambiguous way. Data shouldn't simply be blindly appended together. – Stephen Touset Dec 07 '16 at 01:57
  • @StephenTouset "I'd say it's a good idea to include as much identifying data in your AAD as possible." and a whole paragraph on canonical encoding (including a point about concatenation) is already in my answer. But feel free to update those sections if they are not clear in your opinion. – Maarten Bodewes Dec 08 '16 at 18:09
  • @M-elman Is there anything missing from this answer? Please indicate what is missing or use the accept button. – Maarten Bodewes Dec 12 '16 at 22:59
  • @maartenbodewes I have a couple of new doubts about this topic: I tried to implement it but I had some trouble with encoding (specifically, it all worked with ISO-8859-1, but not with UTF-8). And then, how do I work with multiple AAD fields? Should I concatenate all of them or is there another way? (Because I read from your answer and from StephenTouset's comment that I have to know where one field ends and the other starts) – M-elman Dec 15 '16 at 15:50
  • The simplest way to create a canonical encoding is probably to prefix each field with its length, encoded in, say, 32 bits (4 bytes). As for the encoding / decoding, you could post on StackOverflow; that's undoubtedly a programming error that has little to nothing to do with an AEAD cipher. – Maarten Bodewes Dec 15 '16 at 22:57
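
The length-prefix suggestion from the last comment could look roughly like the sketch below. The helper name `encode_aad`, the big-endian byte order, and the choice of UTF-8 for text fields are assumptions made for this example; what matters is that sender and receiver make the same choices.

```python
import struct

def encode_aad(*fields: bytes) -> bytes:
    """Prefix each field with its length as a 4-byte big-endian integer."""
    out = bytearray()
    for field in fields:
        out += struct.pack(">I", len(field))
        out += field
    return bytes(out)

# Text fields are turned into bytes with one fixed charset (UTF-8 assumed here).
user_id = "42".encode("utf-8")
name = "Zo\u00eb".encode("utf-8")   # "Zoë": 4 bytes in UTF-8, 3 in ISO-8859-1

aad = encode_aad(user_id, name)
```

With the prefixes in place, `encode_aad(b"ab", b"c")` and `encode_aad(b"a", b"bc")` no longer produce the same bytes. The example name also shows why mixing charsets breaks authentication: the same string encodes to different byte sequences under UTF-8 and ISO-8859-1, so both sides must agree on one.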