What to do when a noise source for a TRNG isn't perfectly white

Question

I have constructed a quintessential two transistor noise generator typical of what might be found when doing an internet search for such circuits. After laborious tuning I have it adjusted to about the best it's going to get from the analog/circuitry side.

Over relatively short and choice stretches of frequency (say 300kHz to 400kHz), the noise seems reasonably flat/white (-50.5dBV to -48.5dBV). The problem is over the full length of what I can see with my scope, it is - in all actuality - producing what appears to be 1/f noise, AKA pink noise.

That is to say, the physical hardware appears to fundamentally produce pink noise. And there doesn't appear to be anything I can do about that.

I have a healthy level of cryptographic paranoia, so reading that the usual hardware noise sources for cryptography are only ever described as "white noise sources," and knowing that my source isn't strictly white has made me question how to proceed. (I am assuming of course that no further improvements to the circuit can be made, which may not hold true.)

According to the wiki article on white noise however... (emphasis my own)

An infinite-bandwidth white noise signal is a purely theoretical construction. The bandwidth of white noise is limited in practice by the mechanism of noise generation, by the transmission medium and by finite observation capabilities. Thus, random signals are considered "white noise" if they are observed to have a flat spectrum over the range of frequencies that are relevant to the context.

It would seem from the above defining characteristic that my noise generator can be made to work without modification; that the issue isn't whether or not a circuit/hardware can ever be made to produce purely white noise, but rather, the specifics on how one processes the noise.

In other words, how should one process such a noise source while avoiding harm from the potential biases of a pink nature?

We can assume that we are getting the noise into numeric form using an analog-to-digital converter with an as-yet-unknown sample rate and an as-yet-unknown number of bits per sample.

With that, I would also assume that the answer to this question lies in specifying these unknowns, with the possible addition of some whitening.

(Note: The ultimate intention I have is to feed entropy into libhydrogen.)

There are several ways to "whiten" an entropy source. Whitening is a bit of a strange subject as I've seen mentions of things as simple as XOR'ing output against the inversion of the next bit (or one further up in the stream of bits) to using a full fledged DRBG. Obviously the earlier method simply removes a simple bias towards 0's or 1's but does nothing to correct any other issues with the distribution. The other kind-of removes the whole idea of it being a TRNG but will produce a nice distribution (and still allows you to reseed as often as you like). — Maarten Bodewes, Jan 04 '24 at 09:06
(I'd myself err towards the latter as I don't trust that entropy sources provide a good distribution and I'd be worried about some kind of relationship between the bits produced as well as possible I/O errors and the like, but then again I don't see the need for a TRNG at an application level) — Maarten Bodewes, Jan 04 '24 at 09:09
More directly, I would probably filter on a specific frequency and test the properties of that, whiten the result using a simple XOR and then finally use that as entropy source of a slow but secure DRBG. Then applications can use that for seeding their own local DRBG instances. — Maarten Bodewes, Jan 04 '24 at 09:19
Hard question. The adequate solution depends a lot on if we want to extract (nearly) as much entropy per second as possible (in which case we'll indeed use an ADC, at a high sample rate, and we will run into the problem of knowing if the noise is from the source or from the ADC), or if we are content with a small fraction of that. And, in a cryptographic context, if we want to simultaneously test for defect in the noise source and ADC (or whatever simpler thing replaces that ADC). — fgrieu, Jan 04 '24 at 11:36
@fgrieu This is for the function hydro_random_init() in libhydrogen. Looking it over, it appears that it only collects 256 bits of entropy and uses that as seed for hydro_hash_update(). So it appears that "a small fraction of that" is the correct answer and my particular noise source is overkill for this use case. — Charlie, Jan 04 '24 at 16:50

Paul Uszak · Answer 1 · 2024-01-04T13:25:35.727

a noise source for a TRNG isn't perfectly white...

They hardly ever are. Irrelevant as that's the default situation.

The ultimate intention I have is to feed entropy into libhydrogen.

Then output bias is also irrelevant, as is colour or flicker. What matters is de-correlation of the raw output stream as that allows an accurate measurement of the entropy rate. If you can measure it, you can stick the right amount of it into libhydrogen.

In essence you keep slowing down the sample rate, or dropping significant bits of the sample values until the auto-correlation becomes negligible. Like this:-

There's more detail here, including the means of IID determination and entropy measurement. It builds upon the work of Gaspard, Pierre; Wang, Xiao-Jing, Noise, chaos, and (ɛ, τ)-entropy per unit time, available here un-paywalled.
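
A minimal Python sketch of the "keep slowing down the sample rate" idea, under stated assumptions. The helper names (`ar1_noise`, `autocorr`, `find_decimation`), the 0.1 threshold for "negligible", the 8-lag window and the doubling schedule are all illustrative choices, not taken from the linked material. `ar1_noise` is only a synthetic stand-in for real ADC samples (it is correlated, but it is not pink noise). Dropping significant bits, the other option mentioned above, is not covered here.

```python
import random

def ar1_noise(n, phi, seed=1):
    """Synthetic stand-in for ADC samples: a correlated AR(1) sequence.
    phi = 0.0 gives uncorrelated (white) Gaussian samples."""
    rng = random.Random(seed)
    x, out = 0.0, []
    for _ in range(n):
        x = phi * x + rng.gauss(0.0, 1.0)
        out.append(x)
    return out

def autocorr(xs, lag):
    """Sample autocorrelation coefficient of xs at the given lag."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs)
    if var == 0.0:
        return 0.0
    cov = sum((xs[i] - mean) * (xs[i + lag] - mean) for i in range(n - lag))
    return cov / var

def find_decimation(samples, max_lag=8, threshold=0.1, max_step=64):
    """Keep every step-th sample, doubling step until the largest
    |autocorrelation| over lags 1..max_lag drops below threshold.
    Returns (step, worst_autocorr), or None if no step qualifies."""
    step = 1
    while step <= max_step:
        kept = samples[::step]
        if len(kept) <= 10 * max_lag:
            break  # too few samples left to judge the correlation
        worst = max(abs(autocorr(kept, k)) for k in range(1, max_lag + 1))
        if worst < threshold:
            return step, worst
        step *= 2
    return None
```

With real hardware, `samples` would be the raw ADC readings, and the returned step says how much slower to sample before the stream looks de-correlated at these lags. A low autocorrelation is a necessary check, not a proof that the samples are IID.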

Once you have $H_{\infty}$, simply inject enough of it into libhydrogen to satisfy your needs.
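
As a rough illustration of turning $H_{\infty}$ into a sample count, here is a hedged sketch that assumes the samples have already been de-correlated. It uses a most-common-value estimate with an upper confidence bound on the peak probability, in the style of NIST SP 800-90B. The function names, the `safety_factor` of 2 and the 256-bit target (the seed size mentioned in the comments above) are assumptions for the example, not part of libhydrogen.

```python
import math
from collections import Counter

def min_entropy_per_sample(samples):
    """Most-common-value estimate of min-entropy in bits per sample.
    Assumes the samples are (close to) independent; correlated input
    would make this figure too optimistic."""
    n = len(samples)
    p_hat = max(Counter(samples).values()) / n
    # Upper 99% confidence bound on the most likely value's probability.
    p_upper = min(1.0, p_hat + 2.576 * math.sqrt(p_hat * (1.0 - p_hat) / (n - 1)))
    return max(0.0, -math.log2(p_upper))

def samples_needed(target_bits, h_min, safety_factor=2.0):
    """How many samples to collect for target_bits of min-entropy,
    padded by an arbitrary safety_factor (an assumption, not a standard)."""
    if h_min <= 0.0:
        raise ValueError("source shows no measurable entropy")
    return math.ceil(safety_factor * target_bits / h_min)
```

For example, a source measured at 0.5 bits per sample would need `samples_needed(256, 0.5)` samples, i.e. 1024 with the default padding, before hashing them down into the seed.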

Speak up if you're going to create one time pads, as there is another little trick to minimise final bias.

But now a warning if the following looks familiar: https://electronics.stackexchange.com/questions/289058/are-reverse-biased-transistors-stable. No they're not.

My circuit has two stages. First is a carefully reverse biased emitter-base junction of a small signal NPN as noise source. The breakdown voltage is ~6V, so whether Zener or avalanche effect dominates is unknown; I assume a fairly even mix. Extreme care was taken to ensure that the reverse current is comfortably below what might damage the transistor. This results in a lack of signal that is compensated for in the second stage, which is a common emitter amplifier tuned for max gain with minimum non-linearity. Voltage gain is about 25x. — Charlie, Jan 04 '24 at 18:25
@Charlie You can differentiate Zener stuff from Avalanche stuff as the former is Normal whilst the latter is Log Normal. Anything on-line anywhere? — Paul Uszak, Feb 04 '24 at 02:43
