Fastest way to generate random integer from fair dice rolls

Question

Suppose I have a fair $n$-sided die, $n \in \mathbb{N}$, and I want to generate an integer in the range $[0, r)$, $r \in \mathbb{N}$, with every value equally likely. I may roll the die any number of times and may stop at any point after observing the results. What is the best way to do this so that the expected number of rolls is as small as possible?

Intuitively, we can do it greedily: whenever the number of possible outcomes so far is at least $r$, we split off as many full groups of $r$ outcomes as possible and return a result if the current value lands in one of them; otherwise we keep the leftover outcomes and roll again. In code, this looks something like the following (the die faces are zero-indexed):

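import random

def roll_dice(n):
   # One roll of a fair n-sided die, zero-indexed: uniform on {0, ..., n - 1}.
   return random.randrange(n)

def dice_rand(r, n):
   cur_val = 0   # uniform on [0, max_val)
   max_val = 1
   while True:
      cur_val = n * cur_val + roll_dice(n)
      max_val = n * max_val
      d = max_val // r   # number of outcomes mapped to each result (integer division)
      if cur_val < d * r:
         return cur_val // d
      # Rejected: keep the leftover randomness, uniform on [0, max_val - d * r).
      cur_val = cur_val - d * r
      max_val = max_val - d * r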

Now I am having trouble working out the expected number of rolls for the algorithm above. Also, how can I prove that it is the most efficient one?
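As a rough check on the expected number of rolls, the sketch below does two things: it simulates the greedy procedure while counting rolls, and it sums the probability that each successive roll is needed by tracking the leftover range exactly. The names `dice_rand_counted` and `expected_rolls` are made up for this illustration, and `random.randrange` stands in for the physical die.

```python
import random
from fractions import Fraction

def dice_rand_counted(r, n, rng=random):
    """Greedy sampler as described above; returns (value, rolls_used)."""
    cur_val, max_val, rolls = 0, 1, 0
    while True:
        cur_val = n * cur_val + rng.randrange(n)  # zero-indexed die roll
        max_val *= n
        rolls += 1
        d = max_val // r
        if cur_val < d * r:
            return cur_val // d, rolls
        cur_val -= d * r
        max_val -= d * r

def expected_rolls(r, n, terms=200):
    """Partial sum of P(roll i is needed), i = 1..terms, in exact arithmetic."""
    total = Fraction(0)
    leftover = 1         # size of the leftover range before the next roll
    reach = Fraction(1)  # probability that the next roll happens at all
    for _ in range(terms):
        total += reach
        max_val = n * leftover
        new_leftover = max_val % r  # outcomes that do not fit into full groups
        reach *= Fraction(new_leftover, max_val)
        leftover = new_leftover
        if leftover == 0:  # no rejection possible any more
            break
    return total
```

For $n = 23, r = 8$ the partial sum converges to $\frac{115}{88} \approx 1.307$, which is below the $\frac{23}{16}$ of plain rejection. This only evaluates the greedy algorithm; it says nothing about whether a different algorithm could do better.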

Unclear whether the following is always the best approach, since $n, r$ can vary. Best to illustrate with an example. Suppose that $n = 23, r = 8.$ Then, reject all rolls $> 16$, and if the roll is $\leq 16$, use the residue of the roll, $\pmod{8}.$ — user2661923, Jun 11 '21 at 07:22
In my algorithm, I use the quotient rather than the residue, but the probability and the expected number of rolls is still the same as the rejection case is the same, no? — Bimo Adityarahman, Jun 11 '21 at 07:28
Weird, you are using the quotient, and I am using the remainder. For the specific case of $n = 23, r = 8$, please describe, in very clear detail, in English, not computer code, how you would do it. — user2661923, Jun 11 '21 at 07:30
@Breakingnotsobad No, if $n=4, r = 3$ then the roll of $4$ is rejected, just like, when $n = 23, r = 8,$ any rolls $> 16$ were rejected. — user2661923, Jun 11 '21 at 07:32
@user2661923 I forgot to clarify that the die is zero-indexed, so that the numbers line up better. Do the first roll, and if the roll is $< 16$, use the integer part of the roll divided by 8. Otherwise, if the roll is between $16$ and $22$, subtract $16$ from it and reroll, keeping track of the range of values that can now be generated by multiplying the current value by $23$ and adding the result of the second roll. — Bimo Adityarahman, Jun 11 '21 at 07:36
For what it's worth, the acid test of my algorithm being best might be something like $n = 199, r = 100$, where about $(1/2)$ of the rolls are rejected. For $n=199, r = 100$, I can't think of a better approach, but perhaps someone else can. — user2661923, Jun 11 '21 at 07:36
@BimoAdityarahman As your analysis indicates, the fact that the dice is zero-indexed doesn't affect whether one strategy is better than another. Unclear, based on your description, what you intend by if roll $< 16$ use the integer part of the roll, divided by $8$. If, for example, you rolled an $11$, what number would that be recorded as? — user2661923, Jun 11 '21 at 07:40
When $n = 23, r = 8,$ using my approach, I would expect that (only) $\frac{23}{16}$ rolls are needed. — user2661923, Jun 11 '21 at 07:42
Sorry, I mixed it up: it's not the quotient on dividing by $r$. What I mean is that the roll results $0, 1$ go to $0$; $2, 3$ go to $1$; $\ldots$; until $14, 15$ go to $7$. Yeah, as long as the same number of roll results is mapped to each value, the use of quotient or remainder doesn't seem to affect anything. — Bimo Adityarahman, Jun 11 '21 at 07:43
@BimoAdityarahman There is still the issue that if the roll is between $16$ and $22$, I reject the roll, while you (somehow) try to use it. The questions are [1] whether your strategy will still result in $\{0,1,\cdots, 7\}$ with equal probability? and [2] Whether your approach, on average will require fewer than $\frac{23}{16}$ rolls? — user2661923, Jun 11 '21 at 07:46
Yes, that's what I intended to do, in order to make the expected number of rolls fewer than the $\frac{n}{r \lfloor n/r \rfloor}$ of plain rejection. For example, with $n=5, r=3$, in the rejection case of a $3$ or $4$ on the first roll, we subtract $3$ from it to get a current value of $0$ or $1$. Then we merge it with the second roll by multiplying the current value by $n$ and adding the result of the second roll. This creates a value that is equiprobable over the range $[0, 10)$, which we again divide among the $3$ possible results. — Bimo Adityarahman, Jun 11 '21 at 07:52
Some related posts: Simulate a 12-sided fair die with a 10-sided fair die, How to generate a random number between 1 and 10 with a six-sided die?, Simulate repeated rolls of a 7-sided die with a 6-sided die — Jaap Scherphuis, Jun 11 '21 at 07:57
Intuitively, as this increases the number of possible values generated on the second and later rolls, it increases the chance that the generated value can be grouped to obtain $r$ different results, hence fewer rolls are needed. But of course I struggled to answer [2] @user2661923 — Bimo Adityarahman, Jun 11 '21 at 07:58
@JaapScherphuis Thanks for the resources. The algorithm manages to yield the best (known) expected number of rolls for the first two links, but the given algorithm seems to have been discussed in the third link. I'll still keep the question open, though, because I'm interested in a closed-form formula for the expected number of rolls and in how to go about proving its efficiency. My current formula is $\sum_{i=0}^{\infty} \frac{n^i \bmod r}{n^i}$. — Bimo Adityarahman, Jun 11 '21 at 08:37
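A sketch of why the formula in the last comment holds for the greedy algorithm, assuming $r \geq 2$. The symbol $m_i$ is introduced here (it is not in the original post) for the size of the leftover range after the $i$-th roll has been rejected, with $m_0 = 1$.

```latex
\begin{align*}
m_i &= (n\, m_{i-1}) \bmod r = n^i \bmod r, \\
\Pr[\text{roll } i \text{ is rejected} \mid \text{roll } i \text{ happens}]
    &= \frac{m_i}{n\, m_{i-1}}, \\
\Pr[\text{roll } i+1 \text{ happens}]
    &= \prod_{j=1}^{i} \frac{m_j}{n\, m_{j-1}} = \frac{m_i}{n^i}
     = \frac{n^i \bmod r}{n^i}, \\
\mathbb{E}[\text{number of rolls}]
    &= \sum_{i=0}^{\infty} \Pr[\text{roll } i+1 \text{ happens}]
     = \sum_{i=0}^{\infty} \frac{n^i \bmod r}{n^i}.
\end{align*}
```

For $n = 23, r = 8$ the residues alternate between $1$ and $7$, so the sum is $\left(1 + \frac{7}{23}\right) \cdot \frac{529}{528} = \frac{115}{88}$. This only gives the cost of the greedy algorithm; it does not by itself show optimality.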

score 1 · Accepted Answer · answered Jun 11 '21 at 11:35

Any algorithm to roll an $r$-sided die with an $n$-sided die will inevitably "waste" randomness (and run forever in the worst case) unless "every prime number dividing [$r$] also divides [$n$]", according to Lemma 3 in "Simulating a dice with a dice" by B. Kloeckner. In general, the greedy strategy is the best that can be done; another practical strategy is to use rejection sampling to get arbitrarily close to no "waste" of randomness (such as by batching multiple rolls of the $n$-sided die until $n^m$ is "close enough" to a power of $r$).
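To illustrate the batching idea, here is a minimal sketch that computes the amortized number of die rolls per generated value when $m$ rolls are combined into one integer in $[0, n^m)$ and several base-$r$ digits are extracted at once. The function name `batch_cost` and the choice of the largest usable $k$ are assumptions made for this example, and the cost is per output when many outputs are wanted, not the cost of a single output.

```python
from fractions import Fraction

def batch_cost(n, r, m):
    """Expected die rolls per base-r digit when rolling m dice per batch.

    The batch is a uniform integer in [0, n**m). With k the largest exponent
    such that r**k <= n**m and q = n**m // r**k, the batch is accepted when
    the integer is below q * r**k and then yields k uniform base-r digits;
    otherwise all m rolls are discarded and the batch is repeated.
    Returns None when n**m < r (the batch cannot produce even one digit).
    """
    total = n ** m
    k = 0
    while r ** (k + 1) <= total:
        k += 1
    if k == 0:
        return None
    q = total // r ** k
    accept = Fraction(q * r ** k, total)  # probability the batch is accepted
    return Fraction(m, k) / accept        # rolls per batch / digits per batch / P(accept)
```

For $n = 23, r = 8$, one roll per batch costs $\frac{23}{16}$ rolls per value, while two rolls per batch cost $\frac{529}{768} \approx 0.689$ because $23^2 = 529$ is just above $8^3 = 512$. The entropy bound is $\frac{\log 8}{\log 23} \approx 0.663$.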

Take the much more practical case that $n$ is a power of 2 (and any block of random bits is the same as rolling a die with a power of 2 number of faces) and $r$ is arbitrary. In this case, this "waste" and indefinite running time are inevitable unless $r$ is also a power of 2.
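A minimal sketch of the power-of-2 case, assuming the random source is a function returning a block of `k` fair bits (here Python's `random.getrandbits`; the name `rand_below` is made up for this example). It uses plain rejection rather than the greedy reuse of leftover randomness.

```python
import random

def rand_below(r, getrandbits=random.getrandbits):
    """Uniform integer in [0, r) from blocks of fair bits, by rejection."""
    k = max(1, (r - 1).bit_length())  # smallest k >= 1 with 2**k >= r
    while True:
        x = getrandbits(k)  # uniform on [0, 2**k)
        if x < r:
            return x
        # x >= r: discard the block and draw again
```

When $r$ is a power of 2, the test `x < r` always succeeds for $r \geq 2$, so exactly one block is drawn. Otherwise a block is rejected with probability $1 - r/2^k < \frac{1}{2}$, and the loop has no fixed upper bound on its running time.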

See also these questions on Stack Overflow:
