
I want to develop a previous question regarding Von Neumann debiasing / randomness extraction.

The typical solution (as posted) is to take pairs of throws and output a bit based on a comparison of the two throws. This is the simplest solution.
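As a minimal sketch of that pairing scheme (the function name is mine): for a pair of throws from the same biased die, P(a < b) = P(a > b) by symmetry, so the comparison bit is unbiased, and ties are discarded.

```python
def von_neumann_bit(a, b):
    """Debias a pair of throws from the same biased die.

    By symmetry P(a < b) = P(a > b), so the output bit is unbiased.
    Tied pairs are discarded and the dice thrown again.
    """
    if a == b:
        return None  # discard the pair, throw again
    return 1 if a > b else 0
```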

My problem is that it's very wasteful of input entropy. The biased die can take values 1–6, but only a single bit is output. So 2 × log2(6) ≈ 5.17 bits go in and 1 bit comes out; it's actually less, since the pair is discarded if the two throws are identical. What happens if the die is a D&D d120 (disdyakis triacontahedron, 120 sides)? You'd be inputting 2 × log2(120) ≈ 13.8 bits of entropy and getting (in practice) less than 1 out. That's incredibly wasteful of precious entropy. Is there a clever variant of Von Neumann that can output more than a single bit at a time?

My particular interest is actually not a bigger die, but more of them. For example, you simultaneously throw six biased regular dice. Can you get more than 1 purely random bit out per pair of throwing rounds?

Paul Uszak

1 Answer


Suppose you are going to throw a 6-sided die $N$ times. You can convert the outcome to an integer in the range $[0,6^N-1]$ by viewing the sequence of $N$ outcomes as the base-6 representation of an integer. This gives you a number chosen uniformly at random from $[0,6^N-1]$.

Now pick a value of $k$ such that $2^k < 6^N$. Let $Y = \lfloor 6^N/2^k \rfloor \times 2^k$. You can now use the following algorithm to output a $k$-bit number (much of the time):

Generate a random number from the range $[0,6^N-1]$ by tossing a 6-sided die $N$ times. Suppose the number you get is $x$. If $x < Y$, then output the value $x \bmod 2^k$ (which can be viewed as $k$ bits). Otherwise, output nothing -- you will have to re-roll.

This produces a uniform distribution on $k$-bit values. Any given attempt produces output with probability $Y/6^N$, so the expected number of attempts is $6^N/Y$, which is only a little more than $1$ (as long as you're not too greedy and don't pick $k$ too large).
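The procedure above can be sketched in Python as follows (function names and the injectable `roll` parameter are my own; `roll` stands in for a physical die and defaults to a simulated fair one):

```python
import random

def dice_to_bits(n, k, roll=lambda: random.randrange(6)):
    """Try to extract k uniform bits from n rolls of a 6-sided die.

    Returns a k-bit integer, or None when x >= Y and the whole
    batch of rolls must be discarded and repeated.
    """
    assert 2 ** k < 6 ** n
    y = (6 ** n // 2 ** k) * 2 ** k  # Y = floor(6^N / 2^k) * 2^k
    x = 0
    for _ in range(n):
        x = 6 * x + roll()  # accumulate base-6 digits into x in [0, 6^N - 1]
    return x % 2 ** k if x < y else None

def extract_bits(n, k):
    """Repeat attempts until one succeeds; expected attempts = 6^N / Y."""
    while True:
        bits = dice_to_bits(n, k)
        if bits is not None:
            return bits
```

For example, with $N=5$ and $k=10$: $6^5 = 7776$, $Y = 7 \times 1024 = 7168$, so an attempt succeeds with probability $7168/7776 \approx 92\%$ and yields 10 bits per 5 rolls.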

It's possible to do a bit better, by saving information about the value of $x-Y$ when you repeat the process, but this might not make a big difference in practice.

See also Generating uniformly distributed random numbers using a coin and Is rejection sampling the only way to get a truly uniform distribution of random numbers? and How to simulate a die given a fair coin and Isn't std::bernoulli_distribution inefficient? Designing a bit-parallel Bernoulli generator.

D.W.