
Please excuse (or change, if possible) the title if it doesn't make sense.

I have a problem which is something like this:

You have a variable $i$ which starts at $i=1$. You also have a target number $j$ and an $n$-sided die (in my case $j=6$ and $n=6$, but ideally I'd like to be able to solve this generally).

Now you repeatedly roll the $n$-sided die; each time the die lands on a number greater than $i$, you increment $i$ by 1. This stops once $i=j$.

I've already calculated the mean number of rolls until $i=j$ as:

$$\sum_{x=n-(j-1)}^{n-1} \frac{n}{x}$$
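As a quick check of this formula for the case in question, $n=6$ and $j=6$ give

$$\sum_{x=1}^{5} \frac{6}{x} = 6\left(1+\frac12+\frac13+\frac14+\frac15\right) = \frac{137}{10} = 13.7$$

rolls on average.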

Now I'd like to be able to produce a probability distribution of how many rolls it takes until $i=j$. I'm currently only looking at $j=6$ and $n=6$ and have come up with a solution; however, it is very manual.

I can model the probability of raising $i$ from 1 to 2 (assuming 6-sided die) in $x$ rolls as:

$$p_1(x)=\left(\frac 16\right)^{x-1}\times \frac 56 $$

Similarly, the probability of raising $i$ from 2 to 3 in $x$ rolls is

$$p_2(x)=\left(\frac{2}{6}\right)^{x-1}\times \frac 46 $$

(I'll skip the definition of $p_3(x)$ to $p_5(x)$)

Now the probability of raising $i$ from 1 to 6 in just 5 rolls would be

$$p_1(1) \times p_2(1) \times ... \times p_5(1)$$

What I need next is the probability of raising $i$ from 1 to 6 in 6 rolls. I realise I need to do something like:

$$p_1(2) \times p_2(1) \times \cdots \times p_5(1) + p_1(1) \times p_2(2) \times \cdots \times p_5(1) + \cdots + p_1(1) \times \cdots \times p_4(1) \times p_5(2)$$

(Not sure if I removed too much from the above equation, but essentially it is the sum of all the different products of $p_1,\dots,p_5$ where the arguments to the functions add up to 6. Does that make sense?)

And then the same as above, but for all arguments summing to 7, and so on to infinity.
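The "sum of all products whose arguments add up to a given total" described above is a discrete convolution of the stage distributions $p_1,\dots,p_{j-1}$. A minimal sketch of that idea in Python follows, assuming the $p_l(x)$ defined earlier; the function name `rolls_distribution` and the cut-off `max_rolls` are hypothetical choices for illustration, not part of the original problem.

```python
from fractions import Fraction


def rolls_distribution(n, j, max_rolls):
    """Return a list d where d[r] = P(exactly r rolls to raise i from 1 to j).

    Hypothetical helper: convolves the geometric stage distributions
    p_l(x) = (l/n)^(x-1) * (n-l)/n for l = 1 .. j-1, truncated at max_rolls.
    """
    # Before any stage is played, zero rolls have been used with certainty.
    dist = [Fraction(1)] + [Fraction(0)] * max_rolls
    for l in range(1, j):  # stage l raises i from l to l + 1
        fail, success = Fraction(l, n), Fraction(n - l, n)
        # stage[x] = p_l(x); a stage always needs at least one roll.
        stage = [Fraction(0)] + [
            fail ** (x - 1) * success for x in range(1, max_rolls + 1)
        ]
        new = [Fraction(0)] * (max_rolls + 1)
        for a, pa in enumerate(dist):
            if pa == 0:
                continue
            for b in range(1, max_rolls - a + 1):
                new[a + b] += pa * stage[b]
        dist = new
    return dist


dist = rolls_distribution(6, 6, 10)
print(float(dist[5]), float(dist[6]))
```

Because every stage takes at least one roll, truncating at `max_rolls` does not change the values for totals up to `max_rolls`; exact fractions are used only to avoid rounding questions.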

So I'm wondering how to generalise this last step. It seems to me that it should be possible to represent this as a function. Maybe something similar already exists?

Thank you in advance. Let me know if I need to edit the question to clarify anything :)

allouis
  • It seems like you have the sum of j-1 geometric distributions, each with parameter $\frac {n-i}n$? In fact I feel like it is almost an ordered coupon collection problem. – Tony Hellmuth May 30 '18 at 10:34
  • @Tony Hellmuth thanks for your comment. That seems to make sense. When you say the sum of geometric distributions, are you referring to convolution (as described here: https://en.m.wikipedia.org/wiki/Convolution)? Also, for the $\frac{n-1}{n}$ parameter, does that miss out the fraction that is taken to the power of $x-1$? – allouis May 30 '18 at 11:10
  • Am I right in thinking the generalised geometric distributions would look like: $$p_l(x) = \left(\frac l n\right)^{x - 1} \times \frac {n-l}{n} $$ and so I would want to sum from $l=1$ to $l=j$? – allouis May 30 '18 at 11:15
  • I think you are better off understanding the basis of your formula! Have a read of this: http://www.randomservices.org/random/urn/Coupon.html – Tony Hellmuth May 30 '18 at 11:21
  • Thanks I will take a look at that! :) – allouis May 30 '18 at 11:26
  • Awesome, no worries :) I happen to be working on a similar type of problem! (https://math.stackexchange.com/questions/2795397/random-sum-in-coupon-collection). Can I ask how you thought of this? Hopefully we can help each other :P – Tony Hellmuth May 30 '18 at 11:29
  • After reading through that I understand exactly what you mean by it being an “ordered” coupon collection problem, thank you! This is a generalised form of a dice mechanic for “Insanity” in a tabletop RPG, “Cthulhu Dark” (http://catchyourhare.com/files/Cthulhu%20Dark.pdf), inspired by this post on reddit: https://www.reddit.com/r/RPGdesign/comments/8msbxs/stats_question_regarding_cthulhu_dark/?st=JHT20DF6&sh=8fd47248 Yeah, I hope so. I have some reading to do, but I’ll post anything relevant to your question as a comment :) – allouis May 30 '18 at 11:58

1 Answer


Clearly you need $j \le n$.

As Tony Hellmuth says in the comments, this is closely related to the Coupon Collector's problem. You can make the relationship even closer if you start with $i=0$, which has the effect of adding $1$ to the number of throws. Doing that, I believe you can say that the probability that $i$ first reaches $j$ on the $r$th throw is $$ S_2(r-1,j-1) \frac{ n!}{(n-j)! n^r}$$ where $S_2(a,b)$ is a Stirling number of the second kind.
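For anyone wanting to evaluate this formula numerically, here is a minimal Python sketch. The helper names `stirling2` and `prob_first_hit` are hypothetical; the Stirling numbers are built with the standard recurrence $S_2(a,b) = b\,S_2(a-1,b) + S_2(a-1,b-1)$, and $r$ counts throws starting from $i=0$ as in the formula above.

```python
from fractions import Fraction
from math import factorial


def stirling2(a, b):
    """Stirling number of the second kind S_2(a, b), via the usual recurrence."""
    table = [[0] * (b + 1) for _ in range(a + 1)]
    table[0][0] = 1
    for m in range(1, a + 1):
        for k in range(1, min(m, b) + 1):
            table[m][k] = k * table[m - 1][k] + table[m - 1][k - 1]
    return table[a][b]


def prob_first_hit(n, j, r):
    """P(i first equals j on throw r), starting from i = 0 (hypothetical helper)."""
    if r < 1:
        return Fraction(0)
    numerator = stirling2(r - 1, j - 1) * factorial(n)
    return Fraction(numerator, factorial(n - j) * n ** r)


for r in (6, 7, 11):
    print(r, float(prob_first_hit(6, 6, r)))
```

For $n=j=6$ the values at $r=6$ and $r=7$ are exactly $\frac{5}{324}$ and $\frac{25}{648}$, which agree with the first two non-zero rows of the table below.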

For $n=j=6$ it gives probabilities like this table for varying $r$ starting at $i=0$ (if starting from $i=1$ then subtract $1$ from each $r$); in the tail $\mathbb P(R=r+1) \approx \frac56 \mathbb P(R=r)$. The expected value is $14.7$, median is $13$ and mode is $11$.

r       Prob(R=r)

1       0
2       0
3       0
4       0
5       0
6       0.0154321
7       0.0385802
8       0.0600137
9       0.0750171
10      0.0827689
11      0.0843943
12      0.0816093
13      0.0760425
14      0.0689872
15      0.0613674
16      0.0537917
17      0.0466281
18      0.0400747
19      0.0342160
20      0.0290645
21      0.0245899
22      0.0207390
23      0.0174480
24      0.0146506
25      0.0122827
26      0.0102849
27      0.0086036
28      0.0071917
29      0.0060077
30      0.0050162
31      0.0041867
32      0.0034932
33      0.0029139
34      0.0024302
35      0.0020264
36      0.0016896
37      0.0014085
38      0.0011742
39      0.0009787
40      0.0008158
41      0.0006799
42      0.0005667
43      0.0004723
44      0.0003936
45      0.0003280
46      0.0002734
47      0.0002278
48      0.0001899
49      0.0001582
50      0.0001319
Henry
  • You are correct. It looks like it is how long it takes to collect exactly j-1 unique coupons. – Tony Hellmuth May 30 '18 at 11:20
  • Hi @Henry thanks for your answer! I’m having trouble seeing how you got to the answer, I’ve looked at both https://math.stackexchange.com/questions/1609459/coupon-collectors-problem-using-inclusion-exclusion and http://www.mathematik.uni-stuttgart.de/~riedelmo/papers/coupon-stirling.pdf but cannot see the link. As Tony mentioned in his comment this problem seems to be an ordered coupon collection problem, as I have to roll a 2 before any 3’s count toward being collected. Could you expand on your answer some more? – allouis May 30 '18 at 12:40
  • @Henry After reading more on Stirling numbers of the second kind, I can see how they’re related to this problem. If we take $S_2(6, 5)$ that gives us the number of possible combinations to arrange the arguments for the $P_n$ functions in my post? Thanks for giving me something to look at. I’m still trying to work out how to apply those actual numbers to the functions and sum the results, but your answer is beginning to be understood my end :) – allouis May 30 '18 at 13:57
  • @allouis: As I read your question "each time the die the lands on a number greater than $i$ you increment $i$ by $1$" you do not have to roll any $2$s exactly, it is just that to get from $i=1$ to $i=2$ you have to roll more than $1$ and this has probability $\frac{n-1}{n}$. This is the same as in the coupon collector's problem where to get from $1$ coupon type to $2$ coupon types you have probability $\frac{n-1}{n}$ of finding a new type of coupon with each new coupon. – Henry May 30 '18 at 17:17
  • Yes! It’s taken me a good night’s sleep to reconcile this in my head, but you’re right, the order doesn’t matter because we’re getting the same smaller probabilities as we collect coupons. The bit I’m having trouble with now is the use of $S_2$, as on the surface that gives too many combinations. The way I am interpreting it is that the Stirling numbers give the number of combinations of products as described in the last block of LaTeX in my question. However, as we’re only concerned with the length of the sets generated from partitioning, it seems we will have too many combinations. Am I wrong? – allouis May 31 '18 at 03:17
  • Also, something that concerns me is that in my question I say that I’m planning on using $n=6, j=6$, but plugging that into your solution leaves me dividing by $0$. Does your solution only work for $n \gt j$? – allouis May 31 '18 at 03:21
  • @allouis: If $n=j$ then $(n-j)!=0!=1$ so you are not dividing by $0$ – Henry May 31 '18 at 07:49
  • @Henry my b, I parsed that wrong – allouis May 31 '18 at 08:01