If I have a headless machine (no mouse movements, keyboard presses, or other user input) and no cryptographic CSPRNG APIs, where can I collect entropy from?
-
Is that a theoretical or practical question? Almost all Unix-based systems have /dev/random. Windows typically has CryptGenRandom or RtlGenRandom. "Headless machine" is vague, PIC12, CS-1, OLCF-4? – fgrieu Nov 25 '19 at 16:32
-
Ala Paul's question, what else can you tell us about the environment? Is it x86-64? Modern Intel compatible architecture CPUs have the RDSEED instruction. If you trust that it has all the entropy you need. – Swashbuckler Nov 25 '19 at 18:45
-
As for fgrieu's question: this is really just a theoretical question to see if entropy can be gathered in a system without any hardware dependencies, and without cryptographic APIs. As for Swashbuckler's question, we'll assume it's x86-64, but the previous sentence should explain my question pretty well. A reworded version could probably be: "Are there any reliable entropy sources in an environment without any hardware dependencies (mouse movements to /dev/random to RDSEED) or cryptographic APIs?" – 09182736471890 Nov 25 '19 at 18:52
-
Obviously you need some hardware dependencies. For example, in a virtual machine with software emulation of all clock sources that deterministically increments them by 1 for every machine instruction executed, there is nothing here. Saying ‘without /dev/random’ indicates you're talking about a specific software environment, which doesn't imply anything about what hardware is available. Are you talking about writing software for an application program in some software environment, or are you talking about engineering a system on actual physical hardware? – Squeamish Ossifrage Nov 26 '19 at 02:57
-
"which doesn't imply anything about what hardware is available" my reworded question should explain it: "Are there any reliable entropy sources in an environment without any hardware dependencies (mouse movements to /dev/random to RDSEED) or cryptographic APIs?", so, hardware dependencies do not matter. – 09182736471890 Nov 26 '19 at 03:31
-
How are you getting this device? Does it have persistent state? Are you flashing an OS image onto it that you control? If so, you can tweak the state after you've flashed it to store a seed drawn from /dev/urandom on your laptop, and make sure the device updates the seed at every boot. But this is another kind of hardware dependency; if you categorically reject hardware dependencies then you are categorically denying any possible answers and rendering your question unanswerable, because fundamentally the entropy you're looking for is about unpredictable physical processes. – Squeamish Ossifrage Nov 26 '19 at 05:18
-
Headless doesn't mean that it is without disk timings or network timings. If you want to exclude that kind of thing in your question then you should make that explicit in your question (and not just in the comments). I also worry about the statement "without CSPRNG". That's just software, right? You cannot tell me that you have a machine that cannot run software. Better limit it to entropy. – Maarten Bodewes Nov 26 '19 at 20:30
-
The statement "without CSPRNG" is not what I said, I meant "without cryptographic APIs", since I wanted to avoid answers like "use /dev/random", since I just wanted to not outsource entropy collection to /dev/random, rather I wanted answers on what SOURCES there are. – 09182736471890 Nov 27 '19 at 04:39
-
If it's without software APIs, and it's without hardware dependencies, what is left in your system? – Squeamish Ossifrage Nov 27 '19 at 21:38
-
No, the question is rather: what entropy sources are there? I just don't want to outsource entropy collection (/dev/random or CryptGenRandom), but I want to see what sources there are. I'm not asking how to generate random numbers, rather what entropy sources there are. – 09182736471890 Nov 27 '19 at 21:50
-
Yet you seem to be summarily rejecting all answers about hardware, which makes this question fundamentally unanswerable, because it's necessarily a matter of what physical processes are involved in your system that an adversary cannot predict the outcomes of. – Squeamish Ossifrage Nov 27 '19 at 23:33
-
I have not rejected all answers on hardware, see the accepted answer. The accepted answer successfully answers this question, with mention of hardware. – 09182736471890 Nov 28 '19 at 19:25
-
I realize you have accepted an answer, but it is rather confusing that you first summarily rejected all ‘hardware dependencies’ and then accepted an answer that is entirely about hardware dependencies! – Squeamish Ossifrage Nov 29 '19 at 06:25
2 Answers
I believe that on any CPU you should be able to find at least two sources of noise with physically proven behavior to create a purely random number. The first is oscillator sampling, assuming that you have a fast clock and a slow clock. The second assumes that you have DRAM: through the two-way shot noise in the channel, you should be able to extract random bit flips in DRAM if you can control the timing.
Clock Sampling
The clock in a system is not as precise as you'd hope. Generally, you have a core clock that is generated through a PLL, and you also have some sort of realtime clock at 32.768 kHz, because that counter overflows after bit 15 to give you a 1-second pulse. By sampling the fast clock with the slow clock, you can generate a series of bits; however, there are a few caveats:
- oscillator jitter alone is not enough to create randomness
- you can actually degrade the randomness by trying to make it "more random" if you have a poor circuit model.
As x86 was mentioned (the CPU I know the least about, by the way), the time stamp counter combined with an 8254 on older machines or the HPET on newer machines will give you a fast and a slow clock source. Michael S. McCorquodale's dissertation should have a very good analysis of the clock jitter in both cases, or at least he did at his defense.
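The clock-sampling scheme above can be sketched in software. This is only an illustration, not a vetted entropy source: the stdlib timers `time.time_ns()` and `time.perf_counter_ns()` stand in for the slow RTC and the fast TSC, which a real implementation would read directly.

```python
# Sketch of oscillator sampling: latch the low bit of a "fast" counter
# on each edge of a "slow" clock. time.time_ns()/perf_counter_ns() are
# stand-ins for the RTC and TSC; this is an unaudited illustration.
import time

def sample_clock_bits(n_bits, slow_tick_ns=50_000):
    """Collect n_bits by latching the fast counter's LSB on slow-clock edges."""
    bits = []
    next_edge = time.time_ns() + slow_tick_ns
    while len(bits) < n_bits:
        if time.time_ns() >= next_edge:           # "slow clock" edge
            bits.append(time.perf_counter_ns() & 1)  # latch fast-counter LSB
            next_edge += slow_tick_ns
    return bits

raw = sample_clock_bits(256)
print(len(raw))
```

Per the caveats above, the raw bits would still need a jitter analysis before being trusted; do not treat the LSBs as unbiased.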
DRAM
If you can control the timing of the DRAM, you very likely could get a random noise source, due to the fact that the noise in a semiconductor channel is a function of two-way shot noise (and not classical Johnson noise). Let's limit the scope to a single bit. You would have to put charge on the DRAM capacitor and then gradually delay the timing until that bit gave you a 1 or a 0. The charge amplifier on the DRAM rows would also add to the noise, but I will ignore that component for now. Carver Mead and Rahul Sarpeshkar both use shot noise in their neuromorphic engineering papers as a source of entropy on silicon neurons; circuits people can derive it, but I really don't know of a good external reference. The caveats:
- DRAM timing control
- assume that you are not near "freeze-out", where the shot noise goes away.
I believe that either of these, combined with a hash, could be used as a means to generate a true random number as the noise sources are rooted in physics. You'll need to grab a cryptographer to take what I described to the next step.
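The "combined with a hash" step can be sketched with a standard conditioning function. The raw bits below are a placeholder for output from the clock or DRAM sources described above; SHA-256 is one common (though not the only) choice of conditioner:

```python
# Minimal sketch of conditioning raw, possibly biased noise bits through
# a hash to produce a fixed-size seed. raw_bits is a placeholder for
# bits gathered from a physical noise source; in practice you would
# feed in far more raw bits than the 256-bit output size.
import hashlib

def condition(raw_bits):
    """Compress raw noise bits into a 256-bit conditioned output."""
    raw_bytes = bytes(
        sum(bit << i for i, bit in enumerate(raw_bits[j:j + 8]))
        for j in range(0, len(raw_bits), 8)
    )
    return hashlib.sha256(raw_bytes).digest()

seed = condition([1, 0, 1, 1, 0, 0, 1, 0] * 64)
print(len(seed))  # 32 bytes = 256 bits
```

Note the hash does not create entropy; it only concentrates whatever min-entropy the raw bits actually carried.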

-
Actually, you don't need any form of cryptographic input to extract from your scheme. The numbers simply form as a result of non-cryptographic extraction, such as Toeplitz or von Neumann (in appropriate circumstances). The hard bit is measuring and stabilising the $H_\infty$ rate. – Paul Uszak Nov 27 '19 at 21:49
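The von Neumann extractor mentioned here is simple enough to sketch: look at non-overlapping bit pairs, emit one bit for each unequal pair, and discard equal pairs. It removes bias from independent, identically biased bits at the cost of throughput:

```python
# Von Neumann extractor: for each non-overlapping pair of input bits,
# emit the first bit if the pair is 01 or 10, discard 00 and 11.
# Assumes bits are independent with a fixed bias; correlated input
# defeats this debiasing.
def von_neumann(bits):
    out = []
    for i in range(0, len(bits) - 1, 2):
        a, b = bits[i], bits[i + 1]
        if a != b:
            out.append(a)   # 01 -> 0, 10 -> 1
    return out

print(von_neumann([1, 1, 0, 1, 1, 0, 0, 0]))  # -> [0, 1]
```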
-
You would have to put charge on the DRAM capacitor and then gradually delay the timing until that bit gave you a 1 or 0. – Surprisingly, at least for modern DRAM, different bits have different probabilities for decaying into a given bit, and those probabilities are neither random nor secret. – forest Dec 08 '19 at 03:52
-
@forest I completely agree, but there's some context here. For a "bank", the decays are pretty close, and this is due to how mismatch and lithography mess things up over area. In the scope of a single bit, you can do this. I did some hand-waving over the clocks: when I use DRAM for cache instead of SRAM, I have really good access to the clocks that control timing. If I am using external DRAM, I have much less access to the controller registers. – b degnan Dec 08 '19 at 13:13
-
@forest I edited the answer and limited the scope. Also, as this is esoteric semiconductor stuff, if you think it'd be helpful, I could write up the complete circuits explanation. This is how we'd verify the DRAM for cache to see when it'd give bad data... we just decreased the clocks until failure and added a safety margin. Another note: I hastily answered the question as I didn't like the other answer, which was accepted at the time. – b degnan Dec 08 '19 at 13:18
-
@PaulUszak I disagree, as it's possible with anything with cache, actually; you just have to have access to the timers. You could achieve the same thing with SRAM read delays from the charge amps, as there's a wait time. As someone who makes hardware, I can verify that there are a lot of registers that no one knows about outside the implementers of that hardware. I'm sure if you got the right *unix kernel programmer over for coffee, they'd tell you how to do it, as they are familiar with what is given. – b degnan Dec 09 '19 at 19:40
-
@PaulUszak https://www.intel.com/content/dam/doc/datasheet/5100-memory-controller-hub-chipset-datasheet.pdf 3.9.1 to 3.9.4 have what is needed. Sadly, you'd have to dedicate a whole DIMM to it from the looks of it, which would decrease your memory for the time of creating the random bits. Still, it's feasible in hardware. – b degnan Dec 09 '19 at 19:49
Can entropy be gathered in a system without any hardware dependencies?
No. Not real entropy. There's probably some over-used cliché that applies here, but essentially you can't create Kolmogorov randomness without physical processes.
But there's a but. All computer algorithms execute on hardware, and that hardware is non-deterministic at the pico scale. Hard-to-predict execution paths, indeterminate gate propagation delays and even electron travel times, amongst others, can all produce randomness for harvesting. The haveged algorithm attempts this. Even simply reading the system clock repeatedly in a high-level garbage-collecting language (e.g. Java's System.nanoTime()) is non-deterministic.
It gets even more jittery if you can include a spinning disc drive in the loop somewhere. And don't forget that even virtualised discs are mounted on physical ones.
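The jitter-harvesting idea above can be illustrated with repeated clock reads. This is a rough, unaudited sketch using Python's `time.perf_counter_ns()` in place of the Java call mentioned; the low bits of successive deltas vary with scheduling, cache state, and the rest of the machine:

```python
# Rough illustration of timing-jitter harvesting: read a high-resolution
# clock back-to-back and keep the low bit of each nonzero delta.
# Unaudited sketch only; the bias and correlation of these bits are
# unknown without measurement.
import time

def jitter_bits(n):
    bits = []
    prev = time.perf_counter_ns()
    while len(bits) < n:
        now = time.perf_counter_ns()
        if now != prev:                  # only count ticks that advanced
            bits.append((now - prev) & 1)
            prev = now
    return bits

print(len(jitter_bits(128)))
```

As the answer goes on to argue, the hard part is not collecting such bits but quantifying how much entropy they actually contain.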
Are there any reliable entropy sources?
I suggest not very. All of the algorithms above have to execute alongside other code, some of which may be unknown, and certainly at unknown locations. It is difficult to quantify how much randomness your algorithm may have harvested, and that directly undermines the security confidence. And the concept of code portability/cross-platform operation is at odds with a harvesting algorithm tied to particular hardware. So it's difficult to get 128 bits of Kolmogorov randomness in a firm timescale. You will have to use a large safety factor (say ×100) in estimating min-entropy, but it's doable if speed isn't important. You'll be down to only tens of bits/s on something like an Arduino.
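The min-entropy estimate and safety factor can be made concrete with a back-of-the-envelope calculation. The ×100 factor below is the answer's own rough guess, not a standard value, and counting the most frequent symbol is only the simplest of the min-entropy estimators:

```python
# Naive per-sample min-entropy estimate, H_inf = -log2(p_max), derated
# by a safety factor as the answer suggests. A real design would use a
# proper estimator suite rather than this single-symbol count.
import math
from collections import Counter

def min_entropy_per_sample(samples):
    counts = Counter(samples)
    p_max = max(counts.values()) / len(samples)
    return -math.log2(p_max)

def conservative_bits(samples, safety_factor=100):
    """Usable entropy claimed after derating by the safety factor."""
    return len(samples) * min_entropy_per_sample(samples) / safety_factor

# Example with a visibly biased source: 75% zeros.
samples = [0] * 75 + [1] * 25
print(round(min_entropy_per_sample(samples), 3))  # -log2(0.75) ≈ 0.415
```

At ~0.415 bits per sample derated by 100, you would need roughly 31,000 samples to claim 128 bits, which is why the answer ends up at tens of bits per second on small hardware.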
There are even more exotic techniques, such as reading DRAM state transitions. There's a 2018 summary of DRAM-based cryptographic primitives here, from TRNG and physical unclonable function (PUF) perspectives.
If your machine has some form of remote/comms access, then you can just fetch entropy over the comms port from a device that has it to spare.

-
‘essentially you can't create Kolmogorov randomness without physical processes’—This is not true. Kolmogorov randomness has nothing to do with physical processes (and not much to do with cryptography either); it may be hard to compute but a string's Kolmogorov complexity relative to some language is independent of any particular hardware you used to pick the string or execute the language. – Squeamish Ossifrage Nov 27 '19 at 23:38
-
@SqueamishOssifrage Pretty fundamental to cryptography actually. The reason it’s called Kolmogorov randomness is that it’s upstream of the typical (odd) definitions of (information) entropy here. This always rears its head with TRNGs and ye olde /dev/random thingie. And computational indistinguishability means it’s too easy to abuse, as I show in my RDRAND analysis. It’s 99% certain that’s a high entropy source with low Kolmogorov randomness and thus a PRNG, not a TRNG. That’s why I use much clearer definitions. – Paul Uszak Nov 28 '19 at 01:44
-
The reason it's called Kolmogorov randomness is that it's based on Kolmogorov complexity—which, for a string and a language, is the length of the shortest program in that language that prints that string, an (uncomputable!) mathematical property not related a priori to probability or entropy—and it scratches an apparently common human itch to assign a notion of ‘randomness’ to a particular string—specifically, whether in a specific language there's a program shorter than the string to print it—even though it has nothing to do with uncertainty or adversaries or prediction or random processes. – Squeamish Ossifrage Nov 28 '19 at 03:01
-
@SqueamishOssifrage As ever, I’m sure that I don’t understand what your point is. Perhaps you could explain your problem in a more succinct way? – Paul Uszak Nov 28 '19 at 23:43
-
Cryptography relies on ensuring the adversary doesn't know a secret key. If the adversary knows the key, it doesn't matter how long the shortest, say, Python program to print the key is—the Kolmogorov complexity of the key relative to the Python language—because the cryptography is hopelessly broken anyway. Maybe the shortest Python program is longer than the key itself—so the key has ‘Kolmogorov randomness’—but what's relevant to cryptography is the adversary's state of knowledge, which is a probability distribution on possible keys. – Squeamish Ossifrage Nov 29 '19 at 06:14
-
@SqueamishOssifrage That's grand, but this Q&A is about how to harvest entropy. And I've shown that quite conclusively. – Paul Uszak Nov 29 '19 at 17:12
-
As down votes accumulate, does anyone have any substantive comments other than grammar? – Paul Uszak Dec 09 '19 at 13:25
-
Let me see if I can understand. You are talking about Kolmogorov randomness because without hardware the randomness must come from software and can therefore be described by a short program in the language that generated it. The other comments from Squeamish indicate that this is not Kolmogorov randomness as he understands it. The downvotes seem to hinge mainly on this. The idea that entropy sources cannot be trusted because other processes may influence the entropy gathered seems sound to me (although that is a factor that is not made explicit in the question). – Maarten Bodewes Dec 10 '19 at 01:00
-
There are many other ideas that are kind of hand-wavy. Somehow cross-platform operation is introduced but marginally explained. There is talk about 128 bits of Kolmogorov randomness (unexplained what that might be) and a large safety factor (of possibly 100, but measured against what, perceived entropy vs actual entropy? - if you divide by 100, wouldn't the result be the perceived entropy?). I understand that this is a question with a complex answer, but maybe it can be simplified? – Maarten Bodewes Dec 10 '19 at 01:06
-
Take a look at the last sentence: "If your machine has some form of remote/comms access, then you can just fetch entropy over the comms port from a device that has it to spare." Which undoubtedly means: "hook up a TRNG if you haven't got one". Kind of dangerous advice if the link is not secured, by the way. – Maarten Bodewes Dec 10 '19 at 01:07
-
@Maarten-reinstateMonica I rarely understand what Squeamish is on about :-( I use the term 'Kolmogorov randomness' very carefully. It's important in TRNG design, as it's extremely easy/tempting to fiddle with extraction/whitening algorithms that produce more information entropy out than gets put in. Thus the output is not Kolmogorov random (e.g. RDRAND, /dev/urandom and probably the Entropy Multiplication technique), yet you can't tell by solely measuring, due to computational indistinguishability. E.g. $E_k(time \| salt) = TRNG$ if you can't audit. It's the old "What is entropy" chestnut. – Paul Uszak Dec 10 '19 at 04:41
-
Simply write Java's System.nanoTime() in a loop, to a file without any buffering. Then look at the samples for yourself. Try it as an image; that's clearest. There's lots of entropy, lots of correlation and lots of variance depending on what else you're doing and your kit. That's jitter. And I don't know what else. This means a large safety factor is required, and I'm just guessing at its size. Thus the 'no reliable' sources, as $H_\infty$ can't be stabilised. – Paul Uszak Dec 10 '19 at 04:42