6

I have just made my own program to encipher, using AES in counter mode, and have validated it using NIST data. So I know I have done it properly.

I have read that AES-CTR can produce a stream of random numbers. To achieve this, what do I feed in as the plaintext? Will any text do? Does it matter if the plaintext has repetitions?

Cryptographeur
user2256790
  • 2
    "So I know I have done it properly." As long as you're only considering an extremely narrow definition of "properly" that discounts timing attacks, paging virtual memory containing secrets to disk, buffer overflows, and other similar implementation faults. Producing the correct output isn't the only thing required of a cipher implementation. – Stephen Touset May 05 '14 at 04:38
  • Two reactions to your comment Stephen: – user2256790 May 28 '14 at 09:12
  • Two reactions to your comment Stephen: (1) It would be useful to hear not only your critiques but also where solutions can be found. (2) My application is to encipher my own secret messages, which as a matter of principle will be done on a computer that is isolated from the Internet. Thus it seems to me that the risks you mention do not apply -- or at least are so minimal in my case (for example targeted theft of my computer) as to be negligible. Some of us mess with computing for the love of it -- not to earn our living -- so there's no need to drive a square peg into a round hole. – user2256790 May 28 '14 at 09:22
  • 1
    Implementing cryptography right isn't only about producing the correct outputs for all inputs. There are many operating-system-specific, threat-model-specific, and environment-specific concerns (too many to list) that must also be addressed by any cryptographic implementations that intend to be used seriously. If you're going through the effort of performing this cryptography on an offline host (indicating some importance), why take the backwards step of writing your own version instead of using a well-vetted existing tool or library? – Stephen Touset May 28 '14 at 17:24
  • As a rough example, what do you intend to do with these secret messages? Transmit them over a network? If so, an attacker on the wire can trivially flip bits in the plaintext, even if they don't know specifically what that plaintext is. The mantra often repeated here is: GPG for storage, TLS for transmission. They aren't perfect or infallible, but they get details right (well past simple application of AES) that you (or I) overwhelmingly likely wouldn't. – Stephen Touset May 28 '14 at 17:29
  • 1
    I like to write my own software partly for the intellectual satisfaction, partly to learn new things and partly because then I know what is present and, perhaps more importantly, what is not present. My secret messages are my own private data that I store on my computer isolated from the Internet -- and therefore an attacker flipping bits is not going to happen. – user2256790 May 30 '14 at 11:07
  • Stephen -- you write that I should use a 'well-vetted existing tool or library'. What do you suggest? – user2256790 May 30 '14 at 12:01
  • Stephen -- on reading about timing attacks, it appears to me that your comments are a red herring. For the attack to be successful a huge sample is required, much bigger than the total size of the documents I encipher! Your comments may be relevant in the commercial world where v. large amounts of data are enciphered, but they are absurd in my context. – user2256790 May 31 '14 at 15:22
  • 1
    The examples given are merely some types of problems an implementation can have while still producing correct inputs and outputs. It is not an exhaustive list. That said, if you are storing this data on an offline host that you seem to believe will be difficult for any kind of attacker to get a foothold on, why exactly are you bothering to encrypt the data? What is your threat model, and what kinds of attackers are you attempting to thwart? – Stephen Touset May 31 '14 at 19:41
  • Well then, since you raise the subject, let's have an exhaustive list, at least of the risks that may be relevant to my circumstances! I encrypt to guard against the chance, admittedly very small, that my computer will be stolen. If that were to happen perhaps the thief would be smart enough to discover my password from the hard drive but that is piling improbability on unlikelihood. – user2256790 May 31 '14 at 21:15
  • 1
    An exhaustive list would be quite literally impossible. In the spirit of Bruce Schneier's anecdote, I'll give you one concrete attack outside of the scope of implementing AES-CTR correctly. Are you zeroing out the area of the disk that contained the unencrypted data, after you have encrypted it? In the case of an SSD, are you sure you're zeroing out that data (thanks to automated wear-leveling)? – Stephen Touset Jun 02 '14 at 21:08
  • Since I'm feeling generous, I'll throw two more leading questions out there. What is the source of randomness for your cryptographic keys? How are you generating nonces, and how are you ensuring you're not using the same nonce twice? – Stephen Touset Jun 02 '14 at 21:11

3 Answers

4

You can get up to around $2^{64}$ random bits from counter mode simply by running it as is; past the birthday bound, the lack of any collision between output blocks distinguishes the stream from true randomness. If you've got a full implementation of counter mode, the plaintext can be anything you like, because the keystream (i.e. the output of the $E_k(ctr)$ calls) should appear uniformly sampled from the $2^{128}$ possible blocks.
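As a rough illustration of "running it as is", here is a minimal sketch of the counter-mode structure. Python's standard library has no AES, so `block_encrypt` is a hypothetical stand-in (HMAC-SHA256 truncated to 16 bytes) and is not AES; swap in your own validated $E_k$. The name `ctr_stream` and the 64-bit nonce / 64-bit counter split are also assumptions made for the example.

```python
import hashlib
import hmac

BLOCK = 16  # AES block size in bytes


def block_encrypt(key: bytes, block: bytes) -> bytes:
    """Hypothetical stand-in for AES E_k(block). NOT AES: replace with real AES."""
    return hmac.new(key, block, hashlib.sha256).digest()[:BLOCK]


def ctr_stream(key: bytes, nonce: bytes, nbytes: int) -> bytes:
    """Return nbytes of E_k(nonce || ctr) output for ctr = 0, 1, 2, ..."""
    out = bytearray()
    ctr = 0
    while len(out) < nbytes:
        # The counter fills whatever part of the block the nonce leaves free.
        counter_block = nonce + ctr.to_bytes(BLOCK - len(nonce), "big")
        out += block_encrypt(key, counter_block)
        ctr += 1
    return bytes(out[:nbytes])


key = bytes(16)   # demo key only; a real key must be secret and random
nonce = bytes(8)  # assumed layout: 64-bit nonce followed by a 64-bit counter
stream = ctr_stream(key, nonce, 40)
```

No plaintext appears anywhere in `ctr_stream`: the stream depends only on the key, the nonce and the counter.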

Cryptographeur
  • 1
    Updating the nonce won't let you beat the birthday bound. Additionally, the 2^64 bound is when the provable security of CTR becomes worthless; if you want the security to remain strong (say, by limiting an adversary's advantage to 2^{-40}), then you'd need to stop at 2^{44} bits. Not that this is usually a problem! – Seth May 04 '14 at 01:01
  • 1
    @Seth: It doesn't become worthless there, since the output is still indistinguishable from a distribution for which each given value's probability is at most a not-too-large multiple of that value's probability under the uniform distribution. In particular, if an adversary has a negligible probability of success on truly random bits, then its probability of success on CTR output should also be negligible. –  May 04 '14 at 23:13
  • Also, if the generator can either provide bits at its own rate (rather than on demand) or keep an extra 7 bits of state, then one can count output in blocks instead of bits. –  May 04 '14 at 23:13
4

AES-CTR mode can be used to generate a stream of random numbers. For this purpose the plaintext is indeed irrelevant; it can even be all zeros (the NIST-recommended construction uses exactly such a plaintext).

NIST defines how to construct a deterministic random bit generator (DRBG) from CTR mode in NIST SP 800-90A, section 10.2.1.

NIST SP 800-90A covers several high-level functions used in random bit generation:

  • Generating Random Bits
  • Limits on how much output a single seed may produce (with AES-CTR: up to $2^{48}$ requests between reseeds; up to $2^{19}$ bits per request)
  • Health checking (check RNG is operational)
  • Reseeding (without losing any previously obtained entropy)
  • Backtracking resistance (whenever random bits are generated, a new internal state is generated as well)
  • Operating when the entropy input does not provide ideal random bits (a derivation function, Block_Cipher_df(), is defined for this purpose)

From the perspective of generating a stream of random numbers, the generate function is the most essential. In NIST SP 800-90A, the random bit generation function is little more than AES-CTR with the XOR step skipped, i.e. it is equivalent to AES-CTR where the plaintext is all zero bits.
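The equivalence between "skip the XOR" and "encrypt all-zero plaintext" can be checked with a small sketch. As an assumption for illustration, `block_encrypt` is again a hypothetical stand-in built from HMAC-SHA256 (not AES, which Python's standard library lacks), and `keystream` / `ctr_encrypt` are names invented here. The same key and nonce are reused only so the outputs can be compared; never do that in real use.

```python
import hashlib
import hmac

BLOCK = 16  # AES block size in bytes


def block_encrypt(key: bytes, block: bytes) -> bytes:
    """Hypothetical stand-in for AES E_k(block). NOT AES: replace with real AES."""
    return hmac.new(key, block, hashlib.sha256).digest()[:BLOCK]


def keystream(key: bytes, nonce: bytes, nbytes: int) -> bytes:
    """Counter-mode output with the XOR step skipped."""
    out = bytearray()
    ctr = 0
    while len(out) < nbytes:
        out += block_encrypt(key, nonce + ctr.to_bytes(BLOCK - len(nonce), "big"))
        ctr += 1
    return bytes(out[:nbytes])


def ctr_encrypt(key: bytes, nonce: bytes, plaintext: bytes) -> bytes:
    """Ordinary CTR encryption: plaintext XOR keystream."""
    ks = keystream(key, nonce, len(plaintext))
    return bytes(p ^ k for p, k in zip(plaintext, ks))


key, nonce = bytes(range(16)), bytes(8)
text = b"aaaa aaaa aaaa repeated text is fine"

ks = keystream(key, nonce, len(text))
# All-zero plaintext: the ciphertext is the keystream itself.
zero_ct = ctr_encrypt(key, nonce, bytes(len(text)))
# Any other plaintext: XORing it back out leaves the same keystream.
recovered = bytes(c ^ p for c, p in zip(ctr_encrypt(key, nonce, text), text))
```

Both `zero_ct` and `recovered` equal `ks`, which is why repetitions in the plaintext do not affect the underlying stream.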

In your post you say that you have validated your AES-CTR implementation against NIST data. If you want to build the random number generator "the NIST way", you should implement it according to NIST SP 800-90A.

The most significant advantage of the NIST SP 800-90 series is that the documents discuss at length how to obtain entropy (i.e. secure key/seed material) for the RNG. This is a field where prior art has often gone wrong, for example by accepting insufficient entropy.

user4982
2

Any plaintext will do. If you choose 0x000..., then you can skip the XOR step entirely.

One extra step you can take to improve security is to periodically use the stream to choose a new AES key. This provides forward security: if an attacker manages to compromise the system (say, using a heartbleed-type exploit) and learn the current key, he will not be able to figure out what random numbers you used in the past.
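A minimal sketch of that rekeying idea, under stated assumptions: `block_encrypt` is a hypothetical stand-in (HMAC-SHA256 truncated to 16 bytes, not AES), the class name `RekeyingStream` and the `blocks_per_key` parameter are invented for the example, and a 16-byte key is assumed so that one output block can serve as the next key. Python also cannot guarantee that the old key is erased from memory, so this shows only the logic.

```python
import hashlib
import hmac

BLOCK = 16  # AES block size in bytes; also the assumed key size


def block_encrypt(key: bytes, block: bytes) -> bytes:
    """Hypothetical stand-in for AES E_k(block). NOT AES: replace with real AES."""
    return hmac.new(key, block, hashlib.sha256).digest()[:BLOCK]


class RekeyingStream:
    """Counter-mode generator that replaces its key from its own stream."""

    def __init__(self, key: bytes, blocks_per_key: int = 4):
        self.key = key
        self.ctr = 0
        self.blocks_per_key = blocks_per_key

    def _next_block(self) -> bytes:
        block = block_encrypt(self.key, self.ctr.to_bytes(BLOCK, "big"))
        self.ctr += 1
        return block

    def read_block(self) -> bytes:
        out = self._next_block()
        if self.ctr >= self.blocks_per_key:
            # The next stream block becomes the new key and is never output.
            self.key = self._next_block()
            self.ctr = 0
        return out


gen = RekeyingStream(bytes(16))  # demo seed only; use a secret random key
data = b"".join(gen.read_block() for _ in range(10))
```

Because each new key comes from a one-way step, learning the current `gen.key` should not reveal earlier keys or the output they produced, provided the old key really is gone from memory.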

As an aside: It's not very clear from your question, but did you implement AES yourself as well as CTR mode? If so, the fact that you may have correctly implemented AES in the sense that it produces the correct outputs doesn't imply you've implemented AES securely. In particular, your implementation is likely vulnerable to timing attacks (pdf) unless you've taken steps to prevent them (and are qualified to do so).

Seth
  • Any plaintext *independent of the AES key* will do. – fgrieu May 04 '14 at 09:14
  • 1
    @fgrieu any particular attacks for $E(K,K)$ when $E$ is CTR mode encryption? I mean, using the key or anything derived from the key as plaintext is really stupid, but please indicate the attack. – Maarten Bodewes May 04 '14 at 19:32
  • @owlstead: I'm not aware of any attacks for $E_K(K)$, but there's definitely an attack for, say, $E_K(D_K(0))$. – Ilmari Karonen May 04 '14 at 20:45
  • @IlmariKaronen Yeah, reusing any of the "key" streams would be extremely stupid :), but that's a bit different from reusing the key. Oh well, I'll leave it alone. – Maarten Bodewes May 04 '14 at 20:48
  • @owlstead: I was thinking that if as plaintext we chose $E(K,IV)||E(K,IV+1)||\dots$ then the overall RNG would output zero. Also, $(E(K,IV)\oplus K)||E(K,IV+1)||\dots$ would reveal the key. Note: I assume the IV is known; a lot of it is zero in standard AES-CTR anyway. – fgrieu May 05 '14 at 05:07
  • Ah, yes, I agree that using $K$ together with $E$ is not a good idea. Better be safe and not use $K$ at all. – Maarten Bodewes May 06 '14 at 01:23