Two hours ago I thought I had this figured out, but now I am doubting myself and would like someone to validate my algorithm.
I want to take a stream of $k$ trusted random bits and convert it into groups of 5 decimal digits, each uniformly distributed over the range 0-99999. To do this I truncate $k$ to a multiple of 32; let's say the result now has $n$ bits, where `n = k - k % (4*8)`. To create one group of 5 digits we load 32 bits into a variable we'll call `var_i`. The result will be stored in `var_e`. Representing a value from 0-99999 takes 17 bits, so we perform the loop:
    var_e = 0;
    for (int i = 0; i < 17; i++)
        var_e += ((var_i >> i) % 2) << i;   /* copy bit i of var_i into var_e */
This way we preserve only the lowest 17 bits of `var_i`. Then we perform `var_e = var_e % 100000` to truncate `var_e` to 5 digits. `var_e` is now one complete block of 5 digits.
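One way to sanity-check the mask-then-modulo step is to run it over every possible 17-bit value and count how often each output comes up. The sketch below does that. The names `low17_mod` and `count_doubled_outputs` are made up for this illustration, and the single AND with `0x1FFFF` is assumed to stand in for the bit-copy loop (it keeps the same 17 low bits).

```c
#include <stdint.h>

#define RANGE  100000u     /* outputs are 0..99999 */
#define MASK17 0x1FFFFu    /* low 17 bits: 2^17 = 131072 possible values */

/* Hypothetical helper: keep the low 17 bits, then reduce modulo 100000. */
static uint32_t low17_mod(uint32_t var_i)
{
    return (var_i & MASK17) % RANGE;
}

/* Feed every 17-bit value through low17_mod() and return how many
   outputs are produced by exactly two different inputs. */
static unsigned count_doubled_outputs(void)
{
    static unsigned char hits[RANGE];
    unsigned doubled = 0;

    for (uint32_t v = 0; v < RANGE; v++)
        hits[v] = 0;
    for (uint32_t x = 0; x <= MASK17; x++)
        hits[low17_mod(x)]++;
    for (uint32_t v = 0; v < RANGE; v++)
        if (hits[v] == 2)
            doubled++;
    return doubled;
}
```

Since 2^17 = 131072, the inputs 100000..131071 wrap around onto 0..31071, so `count_doubled_outputs()` comes out as 31072: those outputs are twice as likely as the remaining ones.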
An alternative algorithm could be to just take a 32-bit number `var_e` and perform `var_i = var_e % 100000`. This would keep just the last 5 decimal digits of the 32-bit number, which fall into the range 0-99999.
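Whether the plain 32-bit modulo is uniform can be checked with a little arithmetic rather than a simulation. The sketch below counts how many 32-bit inputs land on a given output; `preimages_mod32` is a name invented here, not something from the post.

```c
#include <stdint.h>

#define RANGE 100000u   /* outputs are 0..99999 */

/* Number of 32-bit inputs x (0 <= x < 2^32) with x % RANGE == v,
   for any v in 0..RANGE-1. */
static uint64_t preimages_mod32(uint32_t v)
{
    const uint64_t total = UINT64_C(1) << 32;  /* 4294967296 inputs */
    const uint64_t full  = total / RANGE;      /* 42949 complete cycles */
    const uint64_t extra = total % RANGE;      /* 67296 inputs left over */

    /* The leftover inputs cover outputs 0..extra-1 one extra time. */
    return full + (v < extra ? 1u : 0u);
}
```

Outputs 0..67295 have 42950 preimages and outputs 67296..99999 have 42949, so the distribution is not exactly uniform, although the relative difference is only about 1 part in 42949.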
These algorithms both waste nearly 50% of the input data, but it would be fairly easy to change the first one to accept 17 bits of input.
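A rough sketch of what "accept 17 bits of input" could look like is a small bit reader that hands out 17 bits at a time from a byte buffer. Everything here is an assumption made for illustration: the `bit_reader` struct, the `read17` name, and the choice of reading bits least-significant-first within each byte.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit reader over a byte buffer. */
struct bit_reader {
    const uint8_t *buf;   /* input bytes */
    size_t nbits;         /* total number of bits available */
    size_t pos;           /* index of the next unread bit */
};

/* Read the next 17 bits (LSB-first within each byte) into *out.
   Returns 1 on success, or 0 (leaving *out untouched) when fewer
   than 17 bits remain. */
static int read17(struct bit_reader *r, uint32_t *out)
{
    uint32_t v = 0;

    if (r->nbits - r->pos < 17)
        return 0;
    for (int i = 0; i < 17; i++, r->pos++) {
        uint32_t bit = (r->buf[r->pos / 8] >> (r->pos % 8)) & 1u;
        v |= bit << i;
    }
    *out = v;
    return 1;
}
```

This only changes how much input each group consumes. A value read this way still ranges over 0..131071, so it still has to be mapped onto 0..99999 somehow.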
Does `var_i = var_e % 100000` give an unbiased output? For example, 0 is reached when `var_e` is 0 or 100000, but 99999 is reached only when `var_e` is 99999. – fgrieu Apr 15 '13 at 05:59
`while ((output=rnd32()&0x1FFFF)>99999);` is a correct way to get perfectly random input over 0-99999, assuming `rnd32()` returns a 32-bit random value. It is wasteful of input bits, but often that's a non-issue. – fgrieu Apr 18 '13 at 16:58
[...] `/dev/random`, or graphing something with little visual significance to the naked eye, like the values obtained vs. iteration number. You want to count the number of times each value is reached over a number of experiments, like 10 times the number of possible values or more, then graph that count as a function of the value. It is a bit difficult to do that for the range $[0\dots2^{32}-1]$, but it is easy with $[0\dots99999]$ and will reveal the problem with one of the alternatives considered, and perhaps the other with many experiments. – fgrieu Apr 20 '13 at 16:19
[...] `/dev/random`? I was just using `dd` and a pipe to send the data through `stdin`. The graph I am generating is the value of a number vs. the number of times it occurs. – chew socks Apr 21 '13 at 23:09
[...] `/dev/urandom` right in my program, and got the same result. – chew socks Apr 21 '13 at 23:10
[...] `output=(rnd32()&0x1FFFF)%100000`, computed for enough values, you will see a defect clearly. – fgrieu Apr 22 '13 at 07:26
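The rejection loop and the counting experiment suggested in the comments can be put side by side as follows. This is only a sketch: `rnd32()` is not defined anywhere in the thread, so it is passed in as a function pointer, and `counter_rnd32` is a deterministic stand-in used purely to make the histograms reproducible. It is not random and must be replaced by a real source in practice. All other names are invented for this example.

```c
#include <stdint.h>
#include <string.h>

#define RANGE 100000u                 /* outputs are 0..99999 */

typedef uint32_t (*rnd32_fn)(void);   /* assumed shape of rnd32() */

/* Rejection sampling: keep the low 17 bits, redraw until below 100000. */
static uint32_t reject_sample(rnd32_fn rnd32)
{
    uint32_t output;
    while ((output = rnd32() & 0x1FFFFu) > RANGE - 1u)
        ;   /* values 100000..131071 are discarded */
    return output;
}

/* The biased variant: low 17 bits reduced modulo 100000. */
static uint32_t modulo_sample(rnd32_fn rnd32)
{
    return (rnd32() & 0x1FFFFu) % RANGE;
}

/* Test-only stand-in for rnd32(): a plain counter, NOT random. */
static uint32_t counter_state = 0;
static uint32_t counter_rnd32(void)
{
    return counter_state++;
}

/* hist[v] = number of times value v was produced in `draws` samples. */
static void histogram(uint32_t (*sample)(rnd32_fn), rnd32_fn rnd32,
                      uint32_t draws, uint32_t *hist)
{
    memset(hist, 0, RANGE * sizeof hist[0]);
    for (uint32_t i = 0; i < draws; i++)
        hist[sample(rnd32)]++;
}
```

With the counter as input, `reject_sample` gives a flat histogram, while `modulo_sample` over all 131072 masked values hits 0..31071 twice and everything else once. With a real random source the same shape should appear as a step in the graph once enough samples are counted. The cost of rejection is that on average 131072/100000, or about 1.31, draws are needed per output.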