There are three separate reasonable questions here.
1. What does the LRS Estimate mean?
2. Does the software compute it correctly?
3. Are entropy estimation tools like this useful for your goal?
The setting is that we hypothesize specific families of generating processes for the data that came out of a black box. The families of generating processes are all quite simple-minded. For example:
- A gremlin in the black box rolls a 256-sided die with unknown face probabilities $p_i$ independently for each octet in the string.
- A gremlin in the black box steps through the hidden states of a Markov model with unknown transition probabilities $t_{ij}$, emitting an octet in each state with unknown emission probabilities $e_i$.
Each of these models has a min-entropy determined by the choice of parameters, such as the $p_i$. The NIST tool, like any other entropy estimation tool, estimates the min-entropy by fitting the parameters to the data and then evaluating, or estimating, the min-entropy of the specific model with the parameters it fit. Obviously the min-entropy of a single octet is at most 8 bits, so an estimate of 8.0028 must be a slight overestimate. How do we get there?
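To make that recipe concrete before getting to the LRS specifics, here is a minimal Python sketch (mine, not the NIST tool's code) for the first model: fit the die's face probabilities by their observed frequencies, then evaluate the min-entropy of the fitted die. The tool's closest counterpart, the Most Common Value estimate, additionally applies a confidence bound to the fitted probability.

```python
from collections import Counter
import math

def die_model_min_entropy(data: bytes) -> float:
    """Min-entropy per octet under the i.i.d. 256-sided-die model."""
    counts = Counter(data)                    # fit: count each face of the die
    p_max = max(counts.values()) / len(data)  # fitted probability of the likeliest face
    return -math.log2(p_max)                  # min-entropy of the fitted model
```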
To address (1), what the LRS Estimate means, the LRS estimator posits a generating process something like this:
- A gremlin inside the black box has a hat with $d$ strings of varying lengths in it, the $i^{\mathit{th}}$ string having $n_i$ copies. It shuffles the hat and draws a string out to add to the output; then it repeats.
The estimator works by guessing a few strings that might be in the hat based on repeated contiguous substrings in the data, and it uses the frequencies of those strings as estimates of the probabilities that they will occur. (The complete details of the computation are in NIST SP 800-90B, §6.3.6.) From those probabilities it yields an estimate of the min-entropy using the standard $-\log_2 \max_i p_i$ formula.
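As a rough illustration, here is a simplified sketch of that computation (again mine, not NIST's code; the real §6.3.6 procedure prescribes how the substring lengths $u$ to $v$ are chosen from the data and applies a confidence bound to the probability before taking the logarithm):

```python
from collections import Counter
import math

def lrs_sketch_min_entropy(data: bytes, u: int, v: int) -> float:
    """Min-entropy per octet from collision frequencies of length-u..v substrings."""
    p_max = 0.0
    for w in range(u, v + 1):
        n = len(data) - w + 1                  # number of length-w windows
        counts = Counter(data[i:i + w] for i in range(n))
        # Chance that two distinct windows hold the same length-w string.
        p_coll = sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))
        # Treat a length-w string as w symbols and back out a per-symbol probability.
        p_max = max(p_max, p_coll ** (1.0 / w))
    return -math.log2(p_max)                   # assumes some substring repeated (p_max > 0)
```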
It doesn't consider all contiguous substrings in the output (there is a combinatorial explosion of them), so it ignores part of the probability space and thus slightly underestimates the probability of each substring it does consider. Underestimating the probabilities means overestimating the min-entropy, which would explain why it reports an entropy rate of just over eight bits per octet.
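You can watch both effects on input that is uniform by construction. This usage sketch reuses the two functions above; the window lengths 2 and 3 are an arbitrary choice of mine, whereas the real procedure derives them from the data.

```python
import os

data = os.urandom(100_000)                 # uniform by construction: 8 bits/octet
print(die_model_min_entropy(data))         # a shade under 8: noise inflates the top frequency
print(lrs_sketch_min_entropy(data, 2, 3))  # fluctuates around 8, sometimes a shade above
```

A figure like 8.0028 is that kind of fluctuation: the estimated probability of the likeliest substring came in a hair below its true value, so its negative logarithm came out a hair above 8 bits.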
Now, to address (2), does the software implement it correctly? That's a question for codereviews.se, and maybe for stats.se if you want to make sure that the posited model and the estimator for the probabilities are sensible and what NIST intended.
Finally, to address (3), whether this is useful for your goal: the answer is not very. It will detect certain families of patterns in your data, if those patterns appear. High estimates like the LRS Estimate in your sample are not useful: they just mean that this particular pattern didn't appear. Low estimates suggest that someone at NIST who wasn't even thinking about your particular noise source could predict its output with pretty good chances of success. That's usually a bad sign. So don't worry about the slightly-above-8-bits-per-octet LRS Estimate: worry about the ~5-bits-per-octet estimate from the Markov model! Does your device alternate between internal states with somewhat more predictable emission probabilities?
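To make that question concrete, here is a toy illustration (my construction, not a model of your device) in the spirit of the SP 800-90B Markov estimate, §6.3.3: fit first-order transition probabilities, then find the likeliest length-128 path through the fitted chain and convert its probability to bits per symbol. The real procedure differs in several details, and fitted probabilities from a finite sample are noisy, so compare the two printed numbers to each other rather than reading either as ground truth.

```python
import math
import os
import random
from collections import Counter, defaultdict

def markov_sketch_min_entropy(data: bytes, k: int = 128) -> float:
    """Bits/octet of the likeliest length-k path in a fitted first-order chain."""
    rows = defaultdict(Counter)
    for a, b in zip(data, data[1:]):  # fit: count adjacent byte pairs
        rows[a][b] += 1
    logt = {}                         # log2 transition probabilities
    for a, row in rows.items():
        total = sum(row.values())
        logt[a] = {b: math.log2(c / total) for b, c in row.items()}
    # Dynamic program: best log2-probability of a path ending at each byte,
    # letting the path start in any state for free.
    best = dict.fromkeys(logt, 0.0)
    for _ in range(k - 1):
        step = defaultdict(lambda: -math.inf)
        for a, lp in best.items():
            for b, lt in logt.get(a, {}).items():
                step[b] = max(step[b], lp + lt)
        best = step
    return -max(best.values()) / k

def toy_source(n: int) -> bytes:
    """Two hidden states, each emitting its own half of the byte alphabet;
    they swap with probability 0.95, so single-byte frequencies look uniform
    while transitions are quite predictable."""
    state, out = 0, bytearray()
    for _ in range(n):
        out.append(128 * state + random.randrange(128))
        if random.random() < 0.95:
            state = 1 - state
    return bytes(out)

print(markov_sketch_min_entropy(os.urandom(1_000_000)))  # noisy baseline, already below 8
print(markov_sketch_min_entropy(toy_source(1_000_000)))  # visibly lower: the alternation is learnable
```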
What you really need to do is study the physics of the object in question from an adversary's perspective, and try to find the best way to predict the output knowing how your particular noise source works. Then compute the min-entropy of that model. If it's higher than all of the NIST estimates, either you had an unlucky sample or someone at NIST who doesn't even know what you're working on was better at studying your noise source than you are.