0

Suppose we have a fair $n$-sided die, with faces labeled from 1 to $n$ and an equal probability of rolling any available number. Further suppose that we pick a sum $S$ that we wish to reach by rolling the die an arbitrary number of times and adding the result of each roll. I want to calculate the probability of [the sum of the rolls] hitting $S$ for any values of $S$ and $n$. I know that for large $S$ the probability tends towards $\frac{2}{n+1}$, but I want to calculate the exact value for specific inputs, not just the behavior in the limit.

For example, say we have a six-sided die and we want to reach a target sum of 10. We roll once and get a value of 2, initializing the sum at 2. We roll again and get a value of 6, giving a sum of 8. We roll again and get a value of 3, giving a sum of 11, meaning we missed our target $S$.

To calculate the probability $P$ of any value of $S$ being reached, we can sum up the probability of each possible run of rolls that would produce that sum.

This is straightforward for $S$ less than or equal to $n$, as it is just a sum of

  • [probability of rolling $S$ on the first roll, $\frac{1}{n^1}$]
  • plus [number of ordered ways to get a sum of $S$ from two numbers] × $\frac{1}{n^2}$
  • plus [number of ordered ways to get a sum of $S$ from three numbers] × $\frac{1}{n^3}$
  • [...et cetera, all the way down to...]
  • plus [probability of rolling $S$ ones all in a row, $\frac{1}{n^S}$]

Enumerating all possible permutations of rolls which sum to $S$ (the compositions of $S$) is simple, but gets long very quickly with increasing $S$, as the total number of compositions is $2^{S - 1}$. The first four sets of compositions are:

  • S = 1
    • 1
  • S = 2
    • 2
    • 1 + 1
  • S = 3
    • 3
    • 2 + 1
    • 1 + 2
    • 1 + 1 + 1
  • S = 4
    • 4
    • 3 + 1
    • 2 + 2
    • 2 + 1 + 1
    • 1 + 3
    • 1 + 2 + 1
    • 1 + 1 + 2
    • 1 + 1 + 1 + 1

It is possible to generate the next set ($S$ + 1) by taking a copy of the set for $S$, incrementing the value of the first number in each line by one, and appending another copy of the set for $S$ with a [1 + ] inserted at the beginning of each line.

Adding up the total number of permutations with length $k$ reveals that the counts follow Pascal's triangle (the pattern of binomial coefficients): there is always one permutation of length 1, one permutation of length $S$, and in between the classic symmetric pattern generated recursively.

Taking advantage of the recursion and patterns, in formal notation this reduces to:

$$ P_S = \sum_{i = 1}^{S} {\left(\binom{S - 1}{i - 1} × \frac{1}{n^i}\right)} $$

where $\binom{a}{b} = \frac{a!}{b! × (a - b)!}$

Expanding the sum to a continuous function rather than discrete values works out to $p(S, n) = n^{-S} × (n + 1)^{S - 1}$. (Functions like this are my ultimate goal.)

However, when $S$ is greater than $n$, some of those combinations will contain values that cannot be rolled and which thus must be excluded from the total. For example, if $S = 4$ and $n = 2$, we need to subtract all permutations of values that sum to 4 which include either 3 or 4, since those numbers aren't included on the 2-sided die:

  • $\color{red}{4}$
  • $\color{red}{3 + 1}$
  • $2 + 2$
  • $2 + 1 + 1$
  • $\color{red}{1 + 3}$
  • $1 + 2 + 1$
  • $1 + 1 + 2$
  • $1 + 1 + 1 + 1$

That means subtracting $1 × \frac{1}{2^1}$ and $2 × \frac{1}{2^2}$ from $p(4, 2) = 2^{-4} × (2 + 1)^{4 - 1}$.

This is where I've gotten stuck.

I want to find a function or set of functions which automatically subtract the probabilities of all permutations that include invalid values, rather than having to manually (or programmatically, both of which get slow quickly because of the exponential growth) count how many permutations of each length must be excluded. But I haven't been able to figure out a way to do that.

Is there such a set of functions? If not, can it be proved that there isn't?

This question was inspired by Numberphile's recent video "Pi is Evil".

UPDATE 1

I've made some progress. I figured out that the number of compositions of length $i$ that need to be removed when $n$ is less than $S$ is given by the expression $\frac{i × Γ(S - n)}{Γ(i) Γ(-i + S - n + 1)}$, up to a maximum of $\frac{Γ(S)}{Γ(i) × Γ(S + 1 - i)}$. However, for values of $S$ greater than $2 n$, this removes too much, with the ultimate average tending towards zero instead of what I know to be the correct limit of $\frac{2}{n+1}$. I haven't figured out where the problem lies yet.

UPDATE 1

I realized that $\frac{i × Γ(S - n)}{Γ(i) Γ(-i + S - n + 1)}$ counts, for example, how many 3s are in compositions of length $i$ and not how many compositions of length $i$ contain at least one 3. This means that it double-counts compositions that contain two copies of an invalid number, which only becomes a problem when $S$ goes above $2 n$. When $S$ goes above $3 n$, it is possible for a composition to contain up to three copies of an invalid number and thus be triple-counted, and so on. So I need to find some way to count how many compositions of length $i$ contain at least one 3 specifically.

Lawton
  • 1,759
  • 7
  • 18
  • 2
    The problem of finding the probability of rolling $S$ in a set number $m$ rolls can be framed in terms of generating functions if one takes the coefficient of $x^S$ in $(x + x^2 + \cdots + x^n)^{m}$ and divides by $n^m$. – chirpyboat73 Oct 06 '23 at 22:53
  • 1
    The average increment is $(n+1)/2$, so in the long run you hit $2/(n+1)$ of all possible sums. This is a good estimate for the probability you hit a given sum $S>n$. – Karl Oct 06 '23 at 23:25
  • Just as a note: your attempted closed form for $P_S$ involving binomial coefficients relies on the stars and bars method; however as you pointed out, there is no restriction on the sizes of the "bins," thus leading to impossible dice rolls. Although I can't entirely be sure, I'm doubtful there exists a nice, exact closed form for what you're looking for. – chirpyboat73 Oct 06 '23 at 23:26
  • 1
    Already asked: https://math.stackexchange.com/questions/4758272/keep-rolling-two-dice-until-the-cumulative-sum-hits-1000 https://math.stackexchange.com/questions/9210/probability-of-cumulative-dice-rolls-hitting-a-number – leonbloy Oct 06 '23 at 23:30
  • @Karl I know about the limit behavior, but I want to be exact with specific non-infinite inputs. – Lawton Oct 07 '23 at 00:37

2 Answers2

3

This can be easily computed using recursion.

Let $P_t$ be the probability that our sum hits the value $t$. Then $P_1=\frac{1}{n}$, and in general, $$P_t= \frac{1}{n} \left( P_{t-1} + P_{t-2} + \cdots + P_{t-n}\right)$$

with $P_0=1$ and $P_t=0$ for $t<0$.

A graph for $n=6$

enter image description here

It converges quickly to $P_{\infty}=\frac{2}{n+1}$

Intuitively: each succesive dice hits one value and leaves untouched $0, 1 ... n-1$: in average, $(n-1)/2$. Hence, in average, the proportion of hitted values is $\frac{1}{1+(n-1)/2}=\frac{2}{n+1}$

A more formal proof, here.

leonbloy
  • 63,430
  • I want a formula to calculate the values directly, without resorting to recursion, even if it's more complicated to write down. I specifically would like continuous functions for the intervals of width $n$ after the first one which I give in my question. – Lawton Oct 07 '23 at 00:51
  • As you can see in the answers to the similar questions, there is no "formula to calculate the values directly" – leonbloy Oct 07 '23 at 00:58
  • I have made some progress towards a direct formula, though it has a flaw I'm missing somewhere; I added it to the end of my question. – Lawton Oct 07 '23 at 01:26
  • @leonbloy: I suspect OP would be content with flattening the recursion into an iteration (i.e., summation), but I could be wrong. – Brian Tung Oct 11 '23 at 16:39
0

Based on the answer to this question, the formula I want for the discrete case is

$$ \sum_{k = 1}^{S} \left(\sum_{i = 0}^{\lfloor \frac{S - k}{n} \rfloor} \left((-1)^i × \binom{k}{i} × \binom{S - n × i - 1}{k - 1}\right) × \left(\frac{1}{n}\right)^k\right) $$

Plot of discrete points

The continuous case is a little more complicated. It consists of a piecewise-defined function where the segment $P_j(S)$ is defined as

$$ P_j(S) = \sum_{m = 0}^{j} \left(\frac{S × (n + 1)^{S - m × n - m - 1}}{(S - m × n) × n^{S - m × n}} × \frac{Γ(m + m × n - S)}{Γ(m + 1) × Γ(m × n - S)}\right), \\ j × n + 1 ≤ S < (j + 1) × n + 1, \\ j ∈ \mathbb{N}^0 $$

Plot of piecewise function

The sum with the greatest probability of being rolled is $S = n$, with probability $P = \frac{(n + 1)^{n - 1}}{n^n}$. If you restrict $S$ to values greater than $n$, the sum with the greatest probability of being rolled is to $S = \frac{(n + 1)^{n - 1}}{n^n} + \frac{1}{\ln\left(\frac{n}{n + 1}\right)}$, with probability $P = \frac{-\left(\frac{n}{n + 1}\right)^{n - n^{-n} × (n + 1)^{n + 1}}}{e × (n + 1)^2 × \ln\left(\frac{n}{n + 1}\right)}$.

Lawton
  • 1,759
  • 7
  • 18