8

Suppose I receive a list of 1 million coinflips, and I want to know how likely it is that the list was randomly generated.

My first thought would be to count the number of heads and tails, which should be evenly distributed (around 500,000 each). But even if the counts look right, it's still possible that the list contains patterns or repetitions. For example, the first half of the list may be all heads and the second half all tails. In truly random data, that would be highly unlikely.
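
To make "around 500,000" concrete: for a fair coin, the number of heads in n flips has mean n/2 and standard deviation sqrt(n)/2, which is 500 for a million flips. Below is a minimal Python sketch of that count check, assuming the flips are encoded as 1 for heads and 0 for tails; the function name `heads_z_score` is only illustrative.

```python
import math

def heads_z_score(flips):
    """Return how many standard deviations the heads count lies from n/2.

    Assumes flips is a non-empty list of 1 (heads) and 0 (tails).
    """
    n = len(flips)
    heads = sum(flips)
    # Under a fair coin, heads ~ Binomial(n, 1/2): mean n/2, std sqrt(n)/2.
    return (heads - n / 2) / (math.sqrt(n) / 2)

# The patterned list described above: first half heads, second half tails.
n = 1_000_000
patterned = [1] * (n // 2) + [0] * (n // 2)
z_patterned = heads_z_score(patterned)  # 0.0: the count alone sees nothing wrong
```

The patterned list is perfectly balanced, so a check based only on the counts cannot flag it.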

So how do you calculate the 'randomness' of this list?

Maestro

2 Answers

7

You can never actually prove that a list was generated randomly or pseudorandomly. You can only show, with high probability, that it wasn't. Counting the number of heads and tails is one test. Another is counting runs of consecutive heads or tails. NIST's FIPS 140-2 document includes a suite of statistical tests, which is a good place to start.
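
As a rough illustration of the runs idea, here is a minimal Python sketch. It uses the normal-approximation formula from the runs test as I recall it from NIST SP 800-22, which is an assumption on my part and is not taken from the FIPS 140-2 text; the function name `runs_test_p_value` and the 1/0 encoding of heads/tails are likewise illustrative.

```python
import math

def runs_test_p_value(bits):
    """Approximate p-value for the number of runs in a 0/1 sequence.

    A run is a maximal block of identical consecutive values.
    A small p-value (say below 0.01) suggests the sequence is not random.
    """
    n = len(bits)
    pi = sum(bits) / n  # proportion of ones (heads)
    if pi in (0.0, 1.0):
        return 0.0  # a constant sequence is a single run; treat as non-random
    # One run to start with, plus one more at every position where the value changes.
    runs = 1 + sum(1 for a, b in zip(bits, bits[1:]) if a != b)
    expected = 2 * n * pi * (1 - pi)
    stat = abs(runs - expected) / (2 * math.sqrt(2 * n) * pi * (1 - pi))
    return math.erfc(stat)

# First half heads, second half tails: balanced counts, but only two runs.
half_and_half = [1] * 5000 + [0] * 5000
p_half = runs_test_p_value(half_and_half)

# Strict alternation: balanced counts, but far too many runs.
alternating = [0, 1] * 5000
p_alt = runs_test_p_value(alternating)
```

Both sequences have exactly as many heads as tails, yet both p-values are effectively zero, so this test rejects lists that the simple count would accept. A large p-value only means that this particular test found nothing wrong.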

Having said that, for cryptographic purposes you really need to be sure that you are using a secure random number generator, and there are no tests you can apply to the data itself that sufficiently guarantee it is secure enough.

Travis Mayberry
  • 1
    Can you never prove pseudorandomness, or do we just not know how? The existence of one-way functions would imply the existence of pseudorandom generators, right? It would also imply $P\neq NP$. – mikeazo Oct 17 '14 at 18:24
  • 1
    Given an algorithm, you can prove that its output is pseudorandom based on some computational assumption, but that is as close as you can get. For instance, Blum-Blum-Shub is a PRNG that is pseudorandom if factoring is a hard problem. As you say, the existence of a PRNG implies $P \neq NP$, which is not known. – Travis Mayberry Oct 17 '14 at 19:37
  • You mention that you can only prove with high probability that it wasn't produced randomly or pseudorandomly, but that's not true at all against an intelligent adversary. Any statistical test you perform can be bypassed. – Stephen Touset Oct 23 '14 at 22:08
  • My statement was a bit ambiguous, but I did not mean that with high probability you will be able to prove that something is not random. I meant that all you can hope for is to prove, with a high degree of certainty, that something is non-random. – Travis Mayberry Oct 24 '14 at 23:01
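
For readers unfamiliar with the Blum-Blum-Shub generator mentioned in the comments above, here is a toy Python sketch. It assumes the common formulation in which the state is repeatedly squared modulo M = p·q (with p and q primes congruent to 3 mod 4) and the least significant bit of each state is output. The tiny parameters 11, 23 and seed 3 are purely illustrative and offer no security; a real instance needs large secret primes.

```python
import math

def blum_blum_shub_bits(p, q, seed, count):
    """Generate `count` bits by repeated squaring modulo p*q (toy sketch)."""
    if p % 4 != 3 or q % 4 != 3:
        raise ValueError("p and q should both be congruent to 3 mod 4")
    modulus = p * q
    if math.gcd(seed, modulus) != 1:
        raise ValueError("seed must be coprime to p*q")
    state = seed
    bits = []
    for _ in range(count):
        state = (state * state) % modulus  # x_{i+1} = x_i^2 mod M
        bits.append(state & 1)             # output the least significant bit
    return bits

# Toy parameters: the states visited are 9, 81, 236, 36, 31, 202.
toy_bits = blum_blum_shub_bits(11, 23, 3, 6)  # [1, 1, 0, 0, 1, 0]
```

The security argument rests on the hardness of factoring M, not on anything visible in the output bits themselves.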
1

The NIST statistical test suite for (P)RNGs, provided by the National Institute of Standards and Technology (formerly NBS, the National Bureau of Standards), is available here: http://csrc.nist.gov/groups/ST/toolkit/rng/index.html

The documentation is available from NIST (see above) or here: National Institute of Standards and Technology (formerly NBS, National Bureau of Standards)

ABri