1

Scenario in Question

Say we roll $n$ $d$-sided dice. We look at each dice to see if the dice face reads our target number $t$ or higher, and if it does, we count it as a "hit". The result of our roll that we're interested in then is the number of "hits", or $h$.

We then decide that if we roll $e$ or higher $(d≥e>t)$, that we not only count our roll as a "hit", but we get to roll that die again, and if the result on that rerolled die continues to be $e$ or higher, then we once again count the roll as a hit, and roll again. However, this can only occur $r$ times - If a die has shown $e$ or higher on $r$ consecutive rolls, we roll a final time, without the possibility of rerolling again.

What would be the Probability Mass Function of this scenario in terms of $d$, $e$, $t$, $n$, $h$ & $r$?

Prior Work - Unlimited Rerolls

A previous answer has given us a good understanding of a similar scenario with an unlimited reroll. That particular answer assumes that $e=d$, but it doesn't take much work to rewrite the final answers to allow us to choose $e$. We can convert the Probability Generating Function to:

$$F(z^h)=\left (\frac{(e-t)z+(t-1)}{d-(d-e+1)z} \right )^n$$

And convert the Probability Mass Function to:

$$P(H=h)=\frac{(t-1)^n(d-e+1)^h}{d^{n+h}}\sum_{k=0}^{\max(n,h)}\binom{n}{k}\binom{n+h-k-1}{h-k}\left [ \frac{d(e-t)}{(t-1)(d-e+1)} \right ]^k$$

We can consider these useful as these functions should be related to what we're looking for - our functions are effectively the limit of our target functions as $r \to \infty$.

Building a Probability Generating Function for limited Rerolls

Let us look at the situation with a single die. Our first roll has three probabilities: Success (S), Fail (F) and Explode (E), and those probabilities are: \begin{align*} &S_E= \frac{e-t}{d} &F_E= \frac{t-1}{d} &&E_E= \frac{d-e+1}{d} \end{align*}

These remain our probabilities until we reach our final reroll - at that point, we cannot reroll, so instead of three probabilities, we only have two:

\begin{align*} &S_T= \frac{d-t+1}{d} &F_T= \frac{t-1}{d} \end{align*}

We can encode this into a probability generating function by include $r$ as one of it's parameters, as follows:

$$F(z,r)=\begin{cases} \frac{(e-t)z}{d}+\frac{t-1}{d}+\frac{(d-e+1)zF(z,r-1)}{d} & \text{ if } r>0 \\ \frac{(d-t+1)z}{d}+\frac{t-1}{d} & \text{ if } r= 0 \end{cases}$$

We can then unroll the recursion into the following:

$$F(z)=\left ( \frac{(d-e+1)z}{d} \right )^r\left ( \frac{(d-t+1)z+ (t-1))}{d} \right )+\sum_{k=0}^{r-1}\left ( \frac{(d-e+1)z}{d} \right )^k\left ( \frac{(e-t)z+(t-1)}{d} \right )$$

We can convert the sum into a closed form $\frac{d(1-(\frac{(d-e+1)z}{d})^r}{d-(d-e+1)z}$, and after a couple of cancellations gives us the final Probability Generating Function for a single die:

$$F(z)=\left ( \frac{(d-e+1)z}{d} \right )^r\left ( \frac{(d-t+1)z+ (t-1))}{d} \right )+\frac{\left ( 1-\left ( \frac{(d-e+1)z}{d} \right )^r \right )\left ( (e-t)z+(t-1) \right )}{d-(d-e+1)z}$$

Which we then take to the $n$th power for $n$ dice. We can check to ensure that we're on the right track by setting $r$ to infinity and seeing what happens. In this case, each $\left (\frac{(d-e+1)z}{d}\right )^r$ goes to zero, and we indeed end up at the PGF for unlimited rerolls.

First Attempt at a Probability Mass Function

It seems obvious from the PGF that we're going to be looking at something broadly similar to the unlimited PMF, but taking more degrees of freedom into account. The limited PGF suggests that there are 4 outcomes that contribute to $h$, including one we didn't specify before - The final reroll on any given dice needs tracking independently from the other rerolls, as that imposes limits on the terminal outcomes. So, we know that:

$$h = S_E + S_T + E_E + E_T$$

Let's assign $j=S_E$, $k=E_T$ and $l=S_T$. With the above: \begin{align*} h&=E_E + j + k + l\\ E_E&= h-j-k-l \end{align*}

We also know that $E_T = S_T+F_T$, which means:

\begin{align*} k&=l+F_T\\ F_T&= k-l \end{align*}

We also know every die has to terminate on $S_E$,$F_E$,$S_T$ or $F_T$ (ie that these outcomes must equal $n$), thus:

\begin{align*} n&=j + F_E + l + k-l\\ n&= F_E + j + k\\ F_E&= n-j-k \end{align*}

And that means that we know how all the probabilities relate to each other.

At this point, I become a lot more uncertain. I think that we need three binomials to account for all the probabilities: $\binom{n}{j}$, $\binom{n-h-j-l-1}{h-j-l}$ and $\binom{k}{l}$ - the first two for the exploding outcomes, and the third for the terminal outcomes. I thought that the three summation iterations should be:

$$\sum_{j=0}^{\max(n,h)}\sum_{k=0}^{\left \lfloor \frac{h-j}{r}\right \rfloor}\sum_{l=0}^{k}$$

And so the final PMF should be:

$$P(H=h)=\sum_{j=0}^{\max(n,h)}\sum_{k=0}^{\left \lfloor \frac{h-j}{r}\right \rfloor}\sum_{l=0}^{k}\binom{n}{j}\binom{n+h-j-l-1}{h-j-l}\binom{k}{l}\left ( \frac{e-t}{d} \right )^j\left ( \frac{d-t+1}{d} \right )^l\cdot\left ( \frac{t-1}{d} \right )^{n-j-k}\left ( \frac{t-1}{d} \right )^{k-l}\left ( \frac{d-e+1}{d} \right )^{h-j-k-l}\left ( \frac{d-e+1}{d} \right )^k$$

Or, if we simplify to put all the iterators in one place:

$$P(H=h)=\frac{(t-1)^n(d-e+1)^h}{d^{n+h}}\sum_{j=0}^{\max(n,h)}\sum_{k=0}^{\left \lfloor \frac{h-j}{r}\right \rfloor}\sum_{l=0}^{k}\binom{n}{j}\binom{n+h-j-l-1}{h-j-l}\binom{k}{l}\cdot\left [ \frac{d(e-t)}{(t-1)(d-e+1)} \right ]^j\left [ \frac{d(d-t+1)}{(t-1)(d-e+1)} \right ]^l$$

But after coding the function, it doesn't work. It doesn't appear to correctly deal with the terminal case. The theory behind the last two iterators is that, if $h-j$ is $r$ or higher, we need to iterate through the potential number of $E_T$ outcomes, and from those outcomes, iterate over the potential number of $S_T$ outcomes, and that's how I thought the terminal cases needed to be dealt with. If I set r to $\infty$ it does, indeed, revert back to the unlimited form, which is why I thought I was on the right track.

I've been working on this for over a month and honestly, I'm at my wit's end and need someone else's perspective. Where am I going wrong here? Are my iterators wrong? Are the binomials wrong? What is the correct solution here?

  • Are you tied to the PGF formulation? This is 2-3 lines in AnyDice or my own Icepool. – HighDiceRoller Nov 15 '23 at 05:50
  • The goal isn't just to get the probabilities (I have the PGF already, I have the tools to find any answer I want around this distribution), I'm explicitly trying to get a mass function so I have everything related to the distribution. I have mass functions for almost every other dice pool variant, I'd like to "complete the set" if I can. – Lee Davis-Thalbourne Nov 15 '23 at 05:58

2 Answers2

1

I think I have part of the approach. In the unlimited case you had to allocate $x$ explosions among $n$ dice, with duplicates allowed; the coefficient was thus $n$ multichoose $x$, or

$$ \binom{n + x - 1}{x} = \binom{n + x - 1}{n - 1} $$

In this limited case you have the additional restriction that no individual die may be chosen to explode more than $r$ times. This is like the number of possible ways to allocate $x$ balls to $n$ bins with capacity $r$ each. A brief search turned up an inclusion-exclusion-based solution, which is

$$ \sum_{k = 0} \left( -1 \right)^k \binom{n}{k} \binom{n + x - 1 - k \left(r + 1\right)}{n - 1} $$

There are some remaining subtleties accounting for how many dice hit the explosion limit or not, and how many stopped exploding on a success or failure, but hopefully this is on the right path.

  • Interesting. The $k$ and $l$ iterators are attempting to do the same work here - I Identified that the upper limit on $E_T$ outcomes (ie how many dice hit the reroll limit) is always $\lfloor\frac{h-j}{r}\rfloor$, but if n is large, you could well have a lower number. That's what the $k$ iterator is doing here - assigning $E_T$ outcomes, and then the $l$ iterator assigns $S_T$ outcomes. But I'll play with this approach, see if it bears fruit. – Lee Davis-Thalbourne Nov 15 '23 at 10:24
  • 1
    AFAICT the missing parts are a) each of the $k$ dice that reach the reroll limit also brings exactly $r$ $S_E$s along with it, which reduces the number of other successes to allocate b) you have to assign the $E_E$s that resulted from the $n - k$ dice that didn't reach the reroll limit to those dice without allocating more than $r - 1$ to a single die, and c) the $S_E$s aka $j$s -- less those processed in point a -- also have to be chosen from those $n - k$ dice that didn't reach the reroll limit rather than from all $n$. – HighDiceRoller Nov 16 '23 at 00:14
  • Playing with the numbers, a few insights: first, I was quite wrong about k's upper limit. In fact, $\lfloor \frac{E_E+E_T}{r} \rfloor \leq k \leq n$ is the clear range, which means the k iterator needs changing. Also, another insight - $E_E + E_T$ is, obviously, limited by $n$ (in fact, the upper limit is $n(r+1)$), but there's no reference to $n$ in it's exponents - that's definitely causing issues in the overall sum, because it's letting $E_E$ outcomes into sums that it shouldn't be allowed into. Still figuring out the best way to apply that limit. – Lee Davis-Thalbourne Nov 16 '23 at 04:03
1

After a whole lot of additional research and calculations, I think I might have finally answered my own question.

The breakthrough was when I looked at the starting Generating function and realised what should have been obvious:

\begin{align*} F(z)=& \left ( \frac{(d-e+1)z}{d} \right )^{r}\left ( \frac{(d-t+1)z+(t-1)}{d} \right ) + \frac{\left ( 1-\left ( \frac{(d-e+1)z}{d} \right )^r \right )\left ( (e-t)z+(t-1) \right )}{d-(d-e+1)z}\\ =& P(z)M(z) + Q(z)N(z)\\ [F(z)]^n=&[ P(z)M(z) + Q(z)N(z)]^n \end{align*}

Looks suspiciously like a binomial so:

\begin{align*} [F(z)]^n=&[ P(z)M(z) + Q(z)N(z)]^n\\ =&\sum_{j=0}^{n}\binom{n}{j}[ P(z)M(z)]^j[Q(z)N(z)]^{n-j}\\ =& \sum_{j=0}^{n}\binom{n}{j}\left (\left ( \frac{(d-e+1)z}{d} \right )^{r} \left ( \frac{(d-t+1)z-(t-1)}{d} \right ) \right )^{j}\left ( \frac{\left ( 1-\left ( \frac{(d-e+1)z}{d} \right )^r \right )\left ( (e-t)z+(t-1) \right )}{d-(d-e+1)z} \right )^{n-j}\\ =& \sum_{j=0}^{n}\binom{n}{j}\left ( \frac{(d-e+1)z}{d} \right )^{jr} \left ( \frac{(d-t+1)z-(t-1)}{d} \right )^{j}\left ( \frac{d\left ( 1-\left ( \frac{(d-e+1)z}{d} \right )^r \right )}{d-(d-e+1)z} \right )^{n-j}\left (\frac{(e-t)z+(t-1)}{d} \right )^{n-j} \end{align*}

We then look at $M(z)$ and $N(z)$ next: \begin{align*} [M(z)]^j=&\left ( \frac{(d-t+1)z-(t-1)}{d} \right ) ^{j}\\ =& \sum_{k=0}^{j}\binom{j}{k}\left ( \frac{t-1}{d} \right )^{j-k}\left ( \frac{d-t+1}{d} \right ) ^{k}z^{k}\\ [N(z)]^{n-j}=&\left ( \frac{(e-t)z+(t-1)}{d} \right )^{n-j}\\ =&\sum_{l=0}^{n-j}\binom{n-j}{l}\left ( \frac{t-1}{d} \right )^{n-j-l}\left ( \frac{e-t}{d} \right )^{l}z^{l} \end{align*}

I suddenly had a lightbulb moment here, because: \begin{align*}\binom{n}{j}[M(z)]^{j}[N(z)]^{n-j}=& \binom{n}{j}\binom{j}{k}\binom{n-j}{l}\left ( \frac{d-t+1}{d} \right ) ^{k}\left ( \frac{t-1}{d} \right )^{j-k}\left ( \frac{e-t}{d} \right )^l\left ( \frac{t-1}{d} \right )^{n-j-l}\\ =& \binom{n}{k,l,j-k,n-j-l}\left ( \frac{d-t+1}{d} \right ) ^{k}\left ( \frac{t-1}{d} \right )^{j-k}\left ( \frac{e-t}{d} \right )^l\left ( \frac{t-1}{d} \right )^{n-j-l} \end{align*}

$S_E$,$F_E$,$S_T$ and $F_T$, when taken together, form a multinomial distribution, which makes sense because ultimately, every die roll has to terminate on one of these values, so they have to equal $n$, the only question is how many rerolls it takes to get to one of these four outcomes.

That leaves us, finally, with $E_E$ and $E_T$:

\begin{align*} [P(z)]^j =&\left (\left ( \frac{(d-e+1)z}{d} \right )^{r}\right )^{j}\\ =&\left ( \frac{d-e+1}{d} \right )^{rj}z^{rj}\\ [Q(z)]^{n-j}=&\left ( \frac{d\left (1-\left ( \frac{(d-e+1)z}{d} \right )^r \right )}{d-(d-e+1)z} \right )^{n-j}\\ =&\left ( \sum_{k=0}^{r-1}\left (\frac{(d-e+1)z}{d} \right )^k \right )^{n-j}\\ =&\sum_{\alpha=0}^{r-1}\sum_{i=0}^{n-j}\left ( -1 \right )^i\binom{n-j}{i}\binom{(n-j)+\alpha-1-i((r-1)+1)}{(n-j)-1}\left ( \frac{(d-e+1)}{d} \right )^{\alpha}z^\alpha \end{align*}

We know that our (h) must be the sum of each of these exponents, so:

\begin{align*} h=&E_T+S_T+S_E+E_E\\ h=&rj+k+l+\alpha\\ \alpha=&h-rj-k-l \end{align*}

Taking into account HighDiceRoller's excellent contribution, that should be all the pieces we need, and $P(H=h)$ should be:

\begin{align*} P(H=h)=&\sum_{j=0}^{n}\sum_{k=0}^{j}\sum_{l=0}^{n-j}\binom{n}{j}\binom{j}{k}\binom{n-j}{l}\\ \cdot&\left (\sum_{i=0}^{n-j}\left ( -1 \right )^i\binom{n-j}{i}\binom{(n-j)+(h-rj-k-l)-1-ir}{(n-j)-1} \right )\left ( \frac{d-e+1}{d} \right )^{h-rj-k-l}\\ \cdot&\left ( \frac{d-e+1}{d} \right )^{rj}\left ( \frac{d-t+1}{d} \right ) ^{k}\left ( \frac{t-1}{d} \right )^{j-k}\left ( \frac{e-t}{d} \right )^l\left ( \frac{t-1}{d} \right )^{n-j-l}\\ \end{align*} We can then move everything that can be out of the loop out, and simplify everything else down to: \begin{align*} P(H=h)=&\frac{(t-1)^n(d-e+1)^h}{d^{n+h}}\sum_{j=0}^{n} \sum_{k=0}^{j}\sum_{l=0}^{n-j}\binom{n}{j}\binom{j}{k}\binom{n-j}{l}\\ \cdot&\left (\sum_{i=0}^{n-j}\left ( -1 \right )^i\binom{n-j}{i}\binom{(n-j)+(h-r(i+j)-k-l)-1}{(n-j)-1} \right )\\ \cdot&\left [ \frac{d(e-t)}{(t-1)(d-e+1)} \right ]^l\left [ \frac{d(d-t+1)}{(t-1)(d-e+1)} \right ]^k \end{align*}

Which is not nearly as neat as I would have liked it, but does appear to check out. Some sanity checks would be very appreciated, just to make sure no obvious mistakes have crept in.