I've generated 100M random numbers in the range 0..255. These numbers fail the Dieharder tests (numbit = 8). However, the tests pass when I combine four bytes into an int. The randomness of the set didn't change, so something must be wrong with the test settings. What could cause it?
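from random import randint

def prep_input(file_name, number_of_iterations, numbit=32):
    # Largest value representable in `numbit` bits
    max_rand = 2 ** numbit - 1
    print(max_rand)
    # Right-justify every value to the width of the largest one
    width = len(str(max_rand))
    # Append one decimal value per line; the file is closed on exit
    with open(file_name, "a") as f:
        for _ in range(number_of_iterations):
            f.write(str(randint(0, max_rand)).rjust(width) + "\n")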
With the header added, the input file looks like this:
#==================================================================
# Dieharder
#==================================================================
type: d
count: 16180337
numbit: 8
255
0
93
....
dieharder -a -g 202 -f input.txt >>result.txt
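For reference, a file with the layout shown above could be produced in one pass along these lines. This is only a sketch: the function name `write_dieharder_ascii` is hypothetical, the header fields are copied from the sample, and the length of the `#` rule lines is assumed to be cosmetic.

```python
from random import randint

def write_dieharder_ascii(file_name, count, numbit=8):
    """Write `count` random values of `numbit` bits, preceded by the header
    fields from the sample above (type, count, numbit)."""
    max_rand = 2 ** numbit - 1
    width = len(str(max_rand))
    with open(file_name, "w") as f:
        # Comment lines; their exact length is assumed not to matter
        f.write("#" + "=" * 66 + "\n")
        f.write("# Dieharder\n")
        f.write("#" + "=" * 66 + "\n")
        f.write("type: d\n")
        f.write(f"count: {count}\n")
        f.write(f"numbit: {numbit}\n")
        # One right-justified decimal value per line
        for _ in range(count):
            f.write(str(randint(0, max_rand)).rjust(width) + "\n")
```

Writing the header and the values together keeps the `count` field in step with the number of lines that follow.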
Report on bytes: Set fails when bytes are used
Report on ints: Set passes tests when ints are used
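For clarity, "combining four bytes into an int" can be sketched as below. The big-endian byte order and the helper name `pack_bytes_to_uint32` are assumptions made for illustration; the post does not say which order was used.

```python
import struct

def pack_bytes_to_uint32(values):
    """Combine each group of four 0..255 values into one unsigned 32-bit
    integer (big-endian). Leftover values that do not fill a group are dropped."""
    usable = len(values) - len(values) % 4
    raw = bytes(values[:usable])
    return [n for (n,) in struct.iter_unpack(">I", raw)]
```

For example, `pack_bytes_to_uint32([1, 2, 3, 4])` yields `[16909060]`, i.e. `0x01020304`. The packed list has a quarter as many entries as the byte list, so the same data gives Dieharder four times fewer samples at `numbit: 32`.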
Update:
- I've tested up to 4 GB of data. This is close to 2^32 bytes, although the ASCII format requires ~4 characters (4 bytes) per byte of actual data.
- When this file is converted to a byte stream, it passes several tests and then reaches the end of the data. When concatenated, it passes the tests better than the ASCII int version.
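The conversion from the ASCII file to a byte stream could look roughly like this. It is a sketch, not the script actually used: the name `ascii_to_bytes` is hypothetical, and it assumes every data line holds a single value in 0..255.

```python
def ascii_to_bytes(ascii_path, binary_path):
    """Convert an ASCII file with one 0..255 value per line into a raw byte file.
    Returns the number of bytes written."""
    written = 0
    with open(ascii_path) as src, open(binary_path, "wb") as dst:
        for line in src:
            token = line.strip()
            # Skip blank lines, '#' comments and header fields such as 'type: d'
            if not token.isdigit():
                continue
            dst.write(bytes([int(token)]))
            written += 1
    return written
```

Each value is stored as three right-justified characters plus a newline, so the ASCII file is four times the size of the raw output: a 4 GB ASCII file holds about 2^30 bytes of actual data, not 2^32.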
When the output of Python's random is fed to dieharder as a binary stream, using this script:
import time
import sys
import os
from random import randint

j = 0
div = 2

while True:
    sys.stdout.write(chr(randint(0, 255)))
    # newFileBytes = [randint(0, 255)]
    # os.write(1, (''.join(chr(i) for i in newFileBytes)).encode('charmap'))
    sys.stdout.flush()
    j += 1

    # Log the running count each time it reaches the next power of two
    if 0 == j % div:
        dh_file = open("progress.log", "a")
        dh_file.write(str(j) + "\n")
        dh_file.close()
        div = div * 2
With
python to_terminal.py | dieharder -a -g 200 >>test_result.dhres
I get 16G (17179869184) as the last record in the log, and out of 75 tests only two are weak; the rest pass. Python passes Dieharder when supplied as a bit stream. Clearly, the problem follows the 8-bit ASCII representation.
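One caveat about the byte stream: if the script runs under Python 3 (an assumption; the version is not stated here), `sys.stdout` is a text stream, so `chr(n)` for `n` above 127 is encoded as more than one byte with the usual UTF-8 default instead of being written as a single raw byte. A minimal sketch of a raw writer follows; the function name `write_random_bytes` and the chunk size are made up for illustration.

```python
from random import randint

def write_random_bytes(out, total, chunk=4096):
    """Write `total` bytes drawn from randint(0, 255) to the binary stream `out`."""
    remaining = total
    while remaining > 0:
        n = min(chunk, remaining)
        # bytes() of an iterable of ints in 0..255 yields exactly n raw bytes
        out.write(bytes(randint(0, 255) for _ in range(n)))
        remaining -= n
```

To pipe this into dieharder, the function would be called in a loop with `sys.stdout.buffer` as `out`, which bypasses the text encoding layer.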
dieharder -a requires much more data (on the order of $2^{32}$ bytes); see citations from the manual there. Also, testing a would-be CS(P)RNG using Diehard is somewhat like testing a bathyscaphe in a pool: not detecting a failure allows one to draw no operational conclusion. – fgrieu Jan 21 '19 at 13:45