Can anyone suggest a good (CS)PRNG algorithm that takes advantage of having a very large (ideally arbitrarily large) seed? I'd like to use several kilobytes, perhaps several hundred kilobytes, of random data to generate up to a few dozen megabytes of PRNG output. The PRNG algorithms I'm aware of only take relatively small seeds: I'd like to make the most of the random input data I have.
-
Something I just came up with now: partition the seed into seeds for standard PRNGs, and alternate their outputs. (This would only work if your data pretty much is random; it could fail quite badly if your data merely had min-entropy close to its size.) – Jul 17 '12 at 12:31
-
Using more than a couple of hundred bits of entropy doesn't really increase your security. If you have a big seed, just send it through SHA-512 first, and use that as the actual seed. – CodesInChaos Jul 17 '12 at 12:34
-
Thanks Ricky. No offense, I'm trying to avoid things that people "just came up with" =). I'm hoping there's a standard, peer-reviewed and time-tested technique for doing this. Your idea sounds perfectly reasonable to me, but I'm no expert. Anyway, I think a variant of your idea would be to use Fortuna, or something very Fortuna-like (i.e., set the number of entropy pools high enough to cover all of the input data). – brianmearns Jul 17 '12 at 12:35
-
Alternatively, if anyone can point me to a good hash function which has a configurable, arbitrarily long output size, I could use that with PBKDF2 with a hash size equal (or almost equal) to the amount of input data I have. – brianmearns Jul 17 '12 at 12:37
-
@CodesInChaos: That greatly reduces whatever unconditional guarantees can be given. Also, if something like that were to be done, I would very much recommend a t-resilient strong extractor instead of SHA-512. – Jul 17 '12 at 12:44
-
@CodesInChaos: Can you elaborate on that? How do several thousand bits of true random data not increase security over 512 bits? – brianmearns Jul 17 '12 at 12:52
-
512 bits can't be brute-forced. So your main concern is the cryptographic strength of whatever PRNG you're using. If you're paranoid you can double the size, going with something like Skein1024. You can also increase the number of rounds of the cryptographic primitive, to increase resilience to cryptanalysis. – CodesInChaos Jul 17 '12 at 13:03
-
I'm not worried about brute force: I'm worried about patterns in the output if I have to generate thousands of times more data than what is in the seed. Is that what you mean by the strength of the PRNG? – brianmearns Jul 17 '12 at 13:28
-
Being able to distinguish the output of a PRNG from true random numbers requires either cryptanalysis that breaks the underlying primitive, or a brute-force attack. The brute-force attack is impossible for state and seed sizes of 512 bits, so you'd need to break the algorithm. – CodesInChaos Jul 17 '12 at 14:39
-
@bmearns: You're trying to solve a problem that no halfway-decent PRNG has. So long as the PRNG isn't horribly broken, and so long as an attacker can't guess the seed, the output will look random and pattern-free over many, many terabytes. – David Schwartz Jul 22 '12 at 16:05
-
@DavidSchwartz: Ok, I think I finally get it now =). Just use a good PRNG and I don't have to worry about "making the most" of my random seed. That being said, it probably doesn't do any good to use a seed that's bigger than the PRNG's internal state, right? – brianmearns Jul 27 '12 at 13:16
-
@bmearns: Assuming the seed is perfect, it does no good to introduce more bits of seed than the PRNG's internal state. If the seed is not perfectly random, introducing more bits of seed than the internal state can help ensure the internal state is as random as it can be. You can also introduce more of the seed later on. But for a good PRNG, all that's needed is to introduce enough seed prior to using its output. – David Schwartz Jul 27 '12 at 17:31
4 Answers
The PRNG proposed by Barak and Halevi should be able to meet your needs and provide sufficient security.
Their PRNG has an API of two calls, next and refresh. refresh takes an arbitrarily long string and uses it to update the internal state of the PRNG. next returns some number of random bytes (if more are needed, one can simply call next repeatedly).
They prove some very nice properties of their PRNG (which together define robustness of a PRNG), and give a concrete implementation.
The entire paper is definitely worth a read.
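To make the next/refresh interface concrete, here is a minimal Python sketch. It is not the Barak-Halevi construction and carries none of its proofs: the class name RefreshableGenerator, the use of HMAC-SHA-512 for mixing and expansion, the labels b"refresh", b"out" and b"next", and the length argument to next are all assumptions made for illustration.

```python
import hashlib
import hmac


class RefreshableGenerator:
    """Toy generator with a refresh()/next() interface (illustrative only)."""

    def __init__(self) -> None:
        # Fixed-size internal state; all zeros until refresh() is called.
        self._state = bytes(64)

    def refresh(self, data: bytes) -> None:
        # Fold arbitrarily long input into the fixed-size state.
        self._state = hmac.new(
            self._state, b"refresh" + data, hashlib.sha512
        ).digest()

    def next(self, n: int) -> bytes:
        # Expand the current state into n output bytes.
        out = bytearray()
        counter = 0
        while len(out) < n:
            counter += 1
            block = hmac.new(
                self._state, b"out" + counter.to_bytes(8, "big"), hashlib.sha512
            ).digest()
            out += block
        # Replace the state so later calls produce fresh output.
        self._state = hmac.new(self._state, b"next", hashlib.sha512).digest()
        return bytes(out[:n])
```

Typical use would be one refresh call with the whole large seed, followed by as many next calls as needed. For anything real, follow the concrete implementation given in the paper rather than this sketch.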

What you are doing sounds a lot like what /dev/random and /dev/urandom, or PRNGD, already do on many systems: they take an arbitrarily large sequence of numbers (from a true hardware random number generator if available, or else from environmental noise such as keystroke timing) and feed it into a CSPRNG; the output is made available via /dev/random and /dev/urandom.
Perhaps you could extract the source code of any one of those systems and make a user-level variant that, rather than getting input from environmental sources, gets input only from your large seed.
- FreeBSD and AIX implement /dev/random using the Yarrow algorithm
- OpenBSD implements /dev/random using an algorithm based on RC4 (is this the same as ISAAC?)
- Fortuna
- My understanding is that many other CSPRNG implementations have a way to feed in more bits of entropy.

-
take a look at the paper I link to in my answer. Section 5.2 talks about some issues they see with Fortuna. While all academic, it is an interesting discussion. – mikeazo Jul 25 '12 at 11:54
-
FreeBSD is now moving to Fortuna over Yarrow, so maybe that's preferred for some reason? – felixphew Feb 11 '16 at 09:22
Don't bother. Take any CSPRNG. Feed it a random key (make sure the key length is at least 128 bits). You're done. Once the key length is 128 bits or so, any additional length is pointless overkill.
If you have a thousand-bit seed, just hash it first to get it down to the right size for the key of your CSPRNG -- but make sure you have at least that much entropy in the seed. Or, better yet, simply generate your key using a secure pseudorandom number generator, like /dev/urandom or CryptGenRandom.
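Both options can be sketched in a few lines of Python. The helper name condense_seed, the choice of SHA-512, and the 32-byte key length are assumptions for illustration; secrets.token_bytes draws from the operating system's generator (the same source that backs /dev/urandom on Unix-like systems).

```python
import hashlib
import secrets


def condense_seed(seed: bytes, key_len: int = 32) -> bytes:
    """Hash an arbitrarily long seed down to a fixed-size key."""
    if not 16 <= key_len <= 64:
        raise ValueError("key_len must be between 16 and 64 bytes")
    # SHA-512 yields 64 bytes; keep only as many as the key needs.
    return hashlib.sha512(seed).digest()[:key_len]


# Option 1: derive the key from your own large seed
# (a random stand-in is generated here so the example runs).
large_seed = secrets.token_bytes(4096)
key_from_seed = condense_seed(large_seed)

# Option 2: skip the seed and ask the OS generator for a key directly.
key_from_os = secrets.token_bytes(32)
```

Either key would then be handed to whatever CSPRNG you are using. Hashing cannot add entropy, so option 1 is only as good as the seed that goes in.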

This is probably obvious now (more obvious, at least, than before Keccak won the SHA-3 competition) but what is being asked for here sounds an awful lot like a sponge function. Or a sponge function in duplex mode, if the finite state is such a serious concern.
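As a rough illustration, Python's hashlib exposes SHAKE256, an extendable-output function built on the Keccak sponge: it absorbs input of any length and squeezes out as many bytes as requested. The helper name sponge_expand is an assumption for illustration, and duplex mode is not available in hashlib, so it is not shown.

```python
import hashlib


def sponge_expand(seed: bytes, n_bytes: int) -> bytes:
    """Absorb a seed of any length and squeeze n_bytes of output."""
    return hashlib.shake_256(seed).digest(n_bytes)


# Example: a 200 KiB seed (dummy bytes here) expanded to 1 MiB of output.
seed = bytes(range(256)) * 800
stream = sponge_expand(seed, 1024 * 1024)
```

Note that SHAKE256's internal capacity is 512 bits, so its security level tops out at 256 bits no matter how large the seed is, which fits the point made in the comments that a very long seed does not buy extra strength. Requesting a shorter output gives a prefix of the longer one, so the same seed must not be reused where independent streams are needed.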
