Picking balls from a bag until one colour is missing

Question

A bag contains balls of three different colours, namely $A$, $B$, $C$. There are $n$ balls of each colour.

Assume you pick 3 balls at random from the bag every round, without replacement, and you stop once you have picked all the balls of one colour. What is the probability distribution of the number of rounds this takes?

I think we can reduce this to the same question where you pick one ball at a time, but I still cannot handle it.
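A rough Python sketch of that reduction: shuffle the bag, read it off one ball at a time, and record the index $T$ of the draw that first exhausts a colour. A round is three consecutive draws and stopping is checked at the end of a round, so the game ends in round $\lceil T/3\rceil$. The function names, the seed and the trial count are illustrative assumptions, not part of the question.

```python
import random
from collections import Counter

def draws_until_colour_exhausted(n, rng):
    """Single-ball draws until all n balls of some colour have been drawn."""
    bag = ["A"] * n + ["B"] * n + ["C"] * n
    rng.shuffle(bag)                 # a uniform shuffle models drawing without replacement
    seen = Counter()
    for t, ball in enumerate(bag, start=1):
        seen[ball] += 1
        if seen[ball] == n:          # this colour has just run out
            return t
    raise AssertionError("unreachable: some colour always runs out")

def simulate_rounds(n, trials=10_000, seed=0):
    """Empirical distribution of the number of 3-ball rounds (hypothetical helper)."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(trials):
        t = draws_until_colour_exhausted(n, rng)
        counts[-(-t // 3)] += 1      # ceil(t / 3) rounds of three draws each
    return {r: c / trials for r, c in sorted(counts.items())}

if __name__ == "__main__":
    print(simulate_rounds(4))
```

For $n=4$ the single-draw count $T$ lies between $4$ and $10$, so the number of rounds lies between $2$ and $4$.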

Do you replace the balls after every round or is the game deemed to end after $n$ rounds, as there are no balls? — N74, Aug 21 '17 at 19:42
Also, do you mean, "all $n$ balls of any colour over the game", or "all three balls are the same colour in the same round". — Graham Kemp, Aug 22 '17 at 04:02
You may model it via a Markov chain with an absorbing state; the time until absorption then follows a discrete phase-type distribution https://en.wikipedia.org/wiki/Discrete_phase-type_distribution — BGM, Aug 22 '17 at 11:07
@BGM Thanks, but I only know some basic probability so I hope there is a more elementary solution. — , Aug 22 '17 at 15:28

Satish Ramanathan · Answer 1 · 2017-08-27T15:35:28.553

My answer is not as detailed as Marko's, and it works with draws rather than rounds: it gives the probability that the balls of one colour are the first to run out. A simple enumeration of the possible three-ball draws leads to the list below, where each $X_i$ denotes the number of draws that yield the corresponding combination.

$AAA -X_1$

$AAB -X_2$

$AAC -X_3$

$ABB -X_4$

$ACC -X_5$

$ABC -X_6$

$BBB -X_7$

$BBC -X_8$

$BCC -X_9$

$CCC -X_{10}$

Let us analyse ball A:

$3X_1+2X_2+2X_3+X_4+X_5+X_6 = n$

The same kind of equation holds for B and C.

For ball B:

$3X_7+2X_4+2X_8+X_2+X_6+X_9 = n$

and likewise for ball C:

$3X_{10}+2X_5+2X_9+X_3+X_6+X_8 = n$

Adding up the solution counts of all three equations covers each colour being the first to run out.

Now use a generating function to count the solutions of any one of the equations: the coefficient of $x^n$ is the number of ways you can draw all $n$ balls of that type first. Let us assume $n=20$.

Thus $(1+x^3+x^6+x^9+x^{12}+x^{15}+x^{18}+x^{21})(1+x^2+x^4+x^6+x^8+x^{10}+x^{12}+x^{14}+x^{16}+x^{18}+x^{20})^2(\sum_{i=0}^{20}x^i)^3$

The coefficient of $x^{20}$ in the above generating function is $6602$.

The total number of ways all such draws can add up to 60 balls is counted by $3(X_1+X_2+X_3+X_4+X_5+X_6 +X_7+X_8+X_9+X_{10})= 60$, that is,

$(X_1+X_2+X_3+X_4+X_5+X_6 +X_7+X_8+X_9+X_{10})= 20$, and the coefficient of $x^{20}$ in the generating function

$(\sum_{i=0}^{20}x^i)^{10}$ is $10015005$.

Thus the required probability = $\frac{6602\times 3}{10015005}$
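The two coefficients quoted above can be re-derived with a few lines of Python. This is only a sketch that multiplies the truncated polynomials directly; the helper names are made up for illustration, and it checks the arithmetic of the two counts, nothing more.

```python
def poly_mul(p, q, max_deg):
    """Multiply coefficient lists p and q, dropping terms above max_deg."""
    out = [0] * (max_deg + 1)
    for i, a in enumerate(p):
        if a == 0:
            continue
        for k, b in enumerate(q):
            if i + k > max_deg:
                break
            out[i + k] += a * b
    return out

def geometric(step, max_deg):
    """Coefficients of 1 + x^step + x^(2*step) + ... up to degree max_deg."""
    return [1 if d % step == 0 else 0 for d in range(max_deg + 1)]

def coefficient(factors, max_deg):
    """Coefficient of x^max_deg in the product of the given polynomials."""
    prod = [1] + [0] * max_deg
    for f in factors:
        prod = poly_mul(prod, f, max_deg)
    return prod[max_deg]

n = 20
# one factor in x^3, two in x^2, three in x: the equation for a single colour
one_colour = coefficient(
    [geometric(3, n)] + [geometric(2, n)] * 2 + [geometric(1, n)] * 3, n)
# ten factors in x: X_1 + ... + X_10 = 20
all_draws = coefficient([geometric(1, n)] * 10, n)
print(one_colour, all_draws)   # 6602 10015005
```

The second value is also ${29\choose 9}$, the usual stars-and-bars count.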

For a different $n$, you will have to go through this procedure again to obtain the probability.

I hope I have given you a simple solution.

Good luck.

Marko Riedel · Accepted Answer · 2017-08-23T21:32:58.827

Here is a basic contribution, working with a closely related question. We solve the problem where we have $j$ instances of each of $n$ types of coupons and draw without replacement until we have seen all $j$ coupons of some type. Using the notation from the following MSE link we introduce the marked generating function

$$\left(\sum_{k=0}^{j-2} \frac{j!}{(j-k)!} \frac{z^k}{k!} + j w z^{j-1}\right)^n.$$

The coefficient on $[z^m]$ here represents distributions of sequences of $m$ draws from the $n$ types according to probability, where the ones that occur $j-1$ times have been marked. Each of the latter may be augmented to a complete set of some color where the weight is one because $j-1$ coupons have already been drawn. As we only need the count we differentiate with respect to $w$ and set $w=1$, getting

$$n\times \left(\sum_{k=0}^{j-1} \frac{j!}{(j-k)!} \frac{z^k}{k!}\right)^{n-1} \times j z^{j-1}.$$

With the method from the linked post we thus obtain for the probability

$$P[T = m] = \frac{1}{m!} {nj\choose m}^{-1} (m-1)! [z^{m-1}] n j z^{j-1} \left(\sum_{k=0}^{j-1} \frac{j!}{(j-k)!} \frac{z^k}{k!}\right)^{n-1} \\ = \frac{1}{m!} {nj\choose m}^{-1} \times n \times j \times (m-1)! [z^{m-1}] z^{j-1} (-z^j + (1+z)^j)^{n-1}.$$

Extracting the coefficient we find

$${nj-1\choose m-1}^{-1} [z^{m-j}] \sum_{q=0}^{n-1} {n-1\choose q} (-1)^{n-1-q} z^{j(n-1-q)} (1+z)^{qj} \\ = {nj-1\choose m-1}^{-1} \sum_{q=0}^{n-1} [z^{m-j(n-q)}] {n-1\choose q} (-1)^{n-1-q} (1+z)^{qj} \\ = {nj-1\choose m-1}^{-1} \sum_{q=0}^{n-1} {n-1\choose q} (-1)^{n-1-q} {qj\choose m-j(n-q)} \\ = {nj-1\choose m-1}^{-1} \sum_{q=0}^{n-1} {n-1\choose q} (-1)^{n-1-q} {qj\choose nj-m}.$$

Observe that

$${qj\choose nj-m} {nj-1\choose m-1}^{-1} = \frac{(qj)! (m-1)! } {(nj-1)! (m-(n-q)j)! } \\ = {nj-1\choose qj}^{-1} {m-1 \choose m-(n-q)j}.$$

We record for the probabilities the formula

$$\bbox[5px,border:2px solid #00A000]{ P[T=m] = \sum_{q=0}^{n-1} {n-1\choose q} (-1)^{n-1-q} {nj-1\choose qj}^{-1} {m-1 \choose (n-q)j-1}.}$$
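For small parameters the boxed formula can be checked against a direct walk over all draw sequences. The following Python sketch uses exact fractions; the function names are illustrative, and the brute-force routine is only meant for small $n$ and $j$.

```python
from fractions import Fraction
from math import comb

def prob_T(n, j, m):
    """P[T = m] from the boxed formula (n colours, j balls of each)."""
    total = Fraction(0)
    for q in range(n):
        sign = (-1) ** (n - 1 - q)
        total += Fraction(sign * comb(n - 1, q) * comb(m - 1, (n - q) * j - 1),
                          comb(n * j - 1, q * j))
    return total

def prob_T_bruteforce(n, j, m):
    """Same probability by walking every draw sequence (small n, j only)."""
    def walk(left, drawn, prob):
        if 0 in left:                        # some colour is exhausted: stop
            return prob if drawn == m else Fraction(0)
        remaining = n * j - drawn
        acc = Fraction(0)
        for c, k in enumerate(left):
            nxt = left[:c] + (k - 1,) + left[c + 1:]
            acc += walk(nxt, drawn + 1, prob * Fraction(k, remaining))
        return acc
    return walk((j,) * n, 0, Fraction(1))

for m in range(2, 5):
    print(m, prob_T(3, 2, m), prob_T_bruteforce(3, 2, m))
# 2 1/5 1/5
# 3 2/5 2/5
# 4 2/5 2/5
```

With three colours and two balls of each, the first colour runs out after 2, 3 or 4 draws with probabilities $1/5$, $2/5$, $2/5$.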

We now verify that this is a probability distribution. This requires the value of

$$\sum_{m=j}^{n(j-1)+1} {m-1 \choose (n-q)j-1} = \sum_{m=j-1}^{n(j-1)} {m\choose (n-q)j-1} \\ = [z^{(n-q)j-1}] \sum_{m=j-1}^{n(j-1)} (1+z)^m = [z^{(n-q)j}] ( (1+z)^{n(j-1)+1} - (1+z)^{j-1} ).$$

With $0\le q\le n-1$ the second term does not contribute and we may continue with

$$ \sum_{q=0}^{n-1} {n-1\choose q} (-1)^{n-1-q} {nj-1\choose qj}^{-1} {n(j-1)+1\choose (n-q)j} \\ = \sum_{q=0}^{n-1} {n-1\choose q} (-1)^{n-1-q} {nj-1\choose qj}^{-1} {nj+1-n\choose qj+1-n} \\ = \sum_{q=0}^{n-1} {n-1\choose q} (-1)^{n-1-q} \frac{qj}{nj-qj} {nj-1\choose qj-1}^{-1} {nj+1-n\choose qj+1-n} \\ = \sum_{q=1}^{n-1} {n-1\choose q-1} (-1)^{n-1-q} {nj-1\choose qj-1}^{-1} {nj+1-n\choose qj+1-n}.$$

Observe once more that

$${nj-1\choose qj-1}^{-1} {nj+1-n\choose qj+1-n} = \frac{(nj+1-n)! \times (qj-1)!}{(nj-1)! \times (qj+1-n)!} \\ = {nj-1\choose n-2}^{-1} {qj-1\choose n-2}.$$

We thus find for the sum of the probabilities

$${nj-1\choose n-2}^{-1} \sum_{q=1}^{n-1} {n-1\choose q-1} (-1)^{n-1-q} {qj-1\choose n-2} \\ = {nj-1\choose n-2}^{-1} \sum_{q=0}^{n-2} {n-1\choose q} (-1)^{n-q} {qj+j-1\choose n-2} \\ = 1 + {nj-1\choose n-2}^{-1} \sum_{q=0}^{n-1} {n-1\choose q} (-1)^{n-q} {qj+j-1\choose n-2}.$$

The sum vanishes, as in

$$\sum_{q=0}^{n-1} {n-1\choose q} (-1)^{n-q} [z^{n-2}] (1+z)^{qj+j-1} \\ = [z^{n-2}] (1+z)^{j-1} \sum_{q=0}^{n-1} {n-1\choose q} (-1)^{n-q} (1+z)^{qj} \\ = (-1)^n [z^{n-2}] (1+z)^{j-1} (1-(1+z)^j)^{n-1},$$

but $(1-(1+z)^j)^{n-1} = (-1)^{n-1} j^{n-1} z^{n-1} + \cdots$ and there is no contribution. This confirms it being a probability distribution.

Continuing with the expectation we require the value of

$$\sum_{m=j}^{n(j-1)+1} m {m-1 \choose (n-q)j-1} = (n-q)j \sum_{m=j}^{n(j-1)+1} {m \choose (n-q)j} \\ = (n-q)j [z^{(n-q)j}] \sum_{m=j}^{n(j-1)+1} (1+z)^m \\ = (n-q)j [z^{(n-q)j+1}] ((1+z)^{n(j-1)+2} - (1+z)^j) = (n-q)j {n(j-1)+2\choose (n-q)j+1}.$$

The second term did not contribute since we have $(n-q)j+1\gt j.$ We thus have for the expectation

$$j \sum_{q=0}^{n-1} {n-1\choose q} (-1)^{n-1-q} {nj-1\choose qj}^{-1} (n-q) {n(j-1)+2\choose (n-q)j+1} \\ = j \sum_{q=0}^{n-1} {n-1\choose q} (-1)^{n-1-q} {nj-1\choose qj}^{-1} (n-q) {nj+2-n\choose qj+1-n} \\ = j \sum_{q=1}^{n-1} {n-1\choose q} (-1)^{n-1-q} \frac{qj}{nj-qj} {nj-1\choose qj-1}^{-1} (n-q) \\ \times \frac{nj+2-n}{nj-qj+1} {nj+1-n\choose qj+1-n} \\ = j (nj+2-n) \sum_{q=1}^{n-1} q {n-1\choose q} (-1)^{n-1-q} {nj-1\choose qj-1}^{-1} \frac{1}{nj-qj+1} {nj+1-n\choose qj+1-n}.$$

Re-using the earlier factorization we get

$$j (nj+2-n) {nj-1\choose n-2}^{-1} \sum_{q=1}^{n-1} q {n-1\choose q} (-1)^{n-1-q} \frac{1}{nj-qj+1} {qj-1\choose n-2} \\ = j (nj+2-n) (n-1) {nj-1\choose n-2}^{-1} \sum_{q=1}^{n-1} {n-2\choose q-1} (-1)^{n-1-q} \frac{1}{nj-qj+1} {qj-1\choose n-2} \\ = j^2 n (nj+1) {nj+1\choose n-1}^{-1} \\ \times \sum_{q=0}^{n-2} {n-2\choose q} (-1)^{n-q} \frac{1}{nj-qj-j+1} {qj+j-1\choose n-2}.$$

Working with the sum term we have

$${n-2\choose q} (-1)^{n-q} \frac{1}{nj-qj-j+1} {qj+j-1\choose n-2} \\ = \mathrm{Res}_{z=q} \frac{1}{nj-j+1-zj} \prod_{p=0}^{n-3} (zj+j-1-p) \prod_{p=0}^{n-2} \frac{1}{z-p}.$$

Now since $\lim_{R\to\infty} 2\pi R \times R^{n-2}/R/R^{n-1} = 0$ and residues sum to zero we may evaluate this by taking the negative of the residue at $z=n-1+1/j$. This is the computation:

$$-\mathrm{Res}_{z=n-1+1/j} \frac{1}{nj-j+1-zj} \prod_{p=0}^{n-3} (zj+j-1-p) \prod_{p=0}^{n-2} \frac{1}{z-p} \\ = \frac{1}{j} \mathrm{Res}_{z=n-1+1/j} \frac{1}{z-(n-1+1/j)} \prod_{p=0}^{n-3} (zj+j-1-p) \prod_{p=0}^{n-2} \frac{1}{z-p} \\ = \frac{1}{j} \prod_{p=0}^{n-3} (nj-p) \prod_{p=0}^{n-2} \frac{1}{n-1+1/j-p} \\ = \frac{1}{j} \times (n-2)! \times {nj\choose n-2} \times j^{n-1} \prod_{p=0}^{n-2} \frac{1}{nj-j+1-pj}.$$
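The residue evaluation can be cross-checked numerically: the sum over $q$ from the previous step should equal the closed expression just obtained. A minimal Python sketch with exact fractions, valid for $n\ge 2$ (function names are assumptions made for this check):

```python
from fractions import Fraction
from math import comb, factorial

def alternating_sum(n, j):
    """Left side: the sum over q = 0..n-2 that the residue argument evaluates."""
    return sum(
        Fraction((-1) ** (n - q) * comb(n - 2, q) * comb(q * j + j - 1, n - 2),
                 n * j - q * j - j + 1)
        for q in range(n - 1)
    )

def residue_value(n, j):
    """Right side: (1/j) (n-2)! C(nj, n-2) j^(n-1) prod_p 1/(nj - j + 1 - pj)."""
    value = Fraction(factorial(n - 2) * comb(n * j, n - 2) * j ** (n - 1), j)
    for p in range(n - 1):
        value /= n * j - j + 1 - p * j
    return value

print(alternating_sum(3, 2), residue_value(3, 2))   # 4/5 4/5
```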

With

$${nj\choose n-2} {nj+1\choose n-1}^{-1} = \frac{n-1}{nj+1}$$ we finally have the closed form

$$\bbox[5px,border:2px solid #00A000]{ E[T] = n! \times j^n \times \prod_{p=0}^{n-2} \frac{1}{nj-j+1-pj}.}$$
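As a hand-worked sanity check of this closed form, take $n=3$ colours with $j=2$ balls each. Directly, $P[T=2]=1/5$ (the second ball matches the first), $P[T=4]=\frac{4}{5}\cdot\frac{2}{4}=\frac{2}{5}$ (the first three balls all differ), and therefore $P[T=3]=2/5$. Both routes give the same value:

$$E[T] = 2\cdot\frac{1}{5} + 3\cdot\frac{2}{5} + 4\cdot\frac{2}{5} = \frac{16}{5}, \qquad 3! \times 2^3 \times \frac{1}{5} \times \frac{1}{3} = \frac{48}{15} = \frac{16}{5}.$$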

To see what the asymptotics are we use the alternate form

$$\bbox[5px,border:2px solid #00A000]{ E[T] = n \times j \times \frac{\Gamma(n)\Gamma(1+1/j)}{\Gamma(n+1/j)}.}$$

Keeping $j$ fixed and letting $n$ go to infinity yields the asymptotic

$$n\times j\times \Gamma(1+1/j) \times n^{-1/j} = n^{1-1/j} \times j\times \Gamma(1+1/j).$$
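To get a feel for how quickly the asymptotic takes over, here is a small Python sketch that evaluates the exact product, the Gamma form and the leading term side by side. The function names are illustrative; `lgamma` is used only to avoid overflow for large $n$.

```python
from fractions import Fraction
from math import exp, factorial, gamma, lgamma

def expected_T_exact(n, j):
    """E[T] = n! j^n prod_{p=0}^{n-2} 1/(nj - j + 1 - pj), as an exact fraction."""
    value = Fraction(factorial(n) * j ** n)
    for p in range(n - 1):
        value /= n * j - j + 1 - p * j
    return value

def expected_T_gamma(n, j):
    """Gamma-function form n j Gamma(n) Gamma(1+1/j) / Gamma(n+1/j)."""
    return n * j * exp(lgamma(n) + lgamma(1 + 1 / j) - lgamma(n + 1 / j))

def expected_T_asymptotic(n, j):
    """Leading term n^(1-1/j) j Gamma(1+1/j) for fixed j and large n."""
    return n ** (1 - 1 / j) * j * gamma(1 + 1 / j)

for n in (3, 30, 300):
    print(n, float(expected_T_exact(n, 3)),
          expected_T_gamma(n, 3), expected_T_asymptotic(n, 3))
```

The first two columns should agree up to floating-point error, while the third approaches them as $n$ grows.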

There is an enumeration routine that may be compared to the closed forms, both of which were implemented in the following Maple code.

V :=
proc(n, j)
    option remember;
    local L, recurse, results;

    results := 0;

    recurse :=
    proc(LL, sofar, prob)
    local choice, cprob;

        if numboccur(LL, 0) > 0 then
            results := results + prob*u^nops(sofar);
            return;
        fi;

        for choice to nops(LL) do
            cprob := LL[choice]/(n*j-nops(sofar));

            recurse([seq(LL[q], q=1..choice-1),
                     LL[choice]-1,
                     seq(LL[q], q=choice+1..nops(LL))],
                    [op(sofar), choice],
                    prob*cprob);
        od;
    end;

    L := [seq(j, q=1..n)];
    recurse(L, [], 1);

    results;
end;

P := (n, j, m) -> n*j/binomial(n*j, m)/m
*coeftayl(z^(j-1)*(-z^j+(1+z)^j)^(n-1), z=0, m-1);
PGF := (n, j) -> add(P(n,j,m)*u^m, m=j..(j-1)*n+1);

EX := (n, j) -> add(P(n,j,m)*m, m=j..(j-1)*n+1);

P2 := (n, j, m) ->
add(binomial(n-1,q)*(-1)^(n-1-q)/binomial(n*j-1, q*j)
    *binomial(m-1, m-(n-q)*j), q=0..n-1);

PGF2 := (n, j) -> add(P2(n,j,m)*u^m, m=j..(j-1)*n+1);

EX2 := (n, j) -> n!*j^n*mul(1/(n*j-j+1-p*j), p=0..n-2);

EXGAMMA := (n, j) -> n*j*GAMMA(n)*GAMMA(1+1/j)/GAMMA(n+1/j);
EXASYMPT := (n, j) -> n^(1-1/j)*j*GAMMA(1+1/j);

To make sure the calculation presented here rests on the correct interpretation of the problem, the following basic program computes the expectation through simulation. Consult the program for the details of the scenario under investigation. Its output is in fine agreement with the closed form from above.

#include <stdlib.h>
#include <stdio.h>
#include <assert.h>
#include <time.h>
#include <string.h>

int main(int argc, char **argv)
{
  int n = 6 , j = 3, trials = 1000; 

  if(argc >= 2){
    n = atoi(argv[1]);
  }

  if(argc >= 3){
    j = atoi(argv[2]);
  }

  if(argc >= 4){
    trials = atoi(argv[3]);
  }

  assert(1 <= n);
  assert(1 <= j);
  assert(1 <= trials);

  srand48(time(NULL));
  long long data = 0;

  for(int tind = 0; tind < trials; tind++){
    int src[n*j];

    for(int cind = 0; cind < n*j; cind++)
      src[cind] = cind/j;

    int done = 0; int steps = 0; 
    int dist[n];

    for(int cind = 0; cind < n; cind++)
      dist[cind] = 0;


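    while(!done){
      /* pick a uniformly random index among the coupons still in the bag */
      int cpidx = drand48() * (double)(n*j-steps);
      int coupon = src[cpidx];

      /* remove the drawn coupon by shifting the tail left by one */
      for(int cind=cpidx; cind < n*j-steps-1; cind++)
        src[cind] = src[cind+1];

      steps++;
      dist[coupon]++;

      /* stop once all j coupons of some type have been drawn */
      if(dist[coupon] == j)
        done = 1;
    }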

    data += steps;
  }

  long double expt = (long double)data/(long double)trials;
  printf("[n = %d, j = %d, trials = %d]: %Le\n", 
         n, j, trials, expt);

  exit(0);
}

This answers a generalization of the question, for $n$ rather than just three types. Also, $n$ in the question is replaced by $j$ in the answer. — John Bentin, Aug 23 '17 at 07:31
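Following that change of notation, the answer's formula can be mapped back to the original question with a short Python sketch. It assumes that a round of three balls behaves like three successive single draws with the stopping rule checked at the end of the round, so the number of rounds is $\lceil T/3\rceil$. The function and parameter names are illustrative.

```python
from fractions import Fraction
from math import comb

def prob_T(colours, per_colour, m):
    """P[T = m] from the answer's formula, with n = colours and j = per_colour."""
    n, j = colours, per_colour
    return sum(
        Fraction((-1) ** (n - 1 - q) * comb(n - 1, q) * comb(m - 1, (n - q) * j - 1),
                 comb(n * j - 1, q * j))
        for q in range(n)
    )

def rounds_distribution(balls_per_colour, colours=3, per_round=3):
    """Distribution of the number of rounds, assuming rounds = ceil(T / per_round)."""
    j = balls_per_colour
    dist = {}
    for m in range(j, colours * (j - 1) + 2):      # support of T
        r = -(-m // per_round)                     # ceil(m / per_round)
        dist[r] = dist.get(r, Fraction(0)) + prob_T(colours, j, m)
    return dist

print(rounds_distribution(2))   # {1: Fraction(3, 5), 2: Fraction(2, 5)}
```

With two balls of each colour the first round already ends the game unless the three balls all differ, which happens with probability $8/{6\choose 3} = 2/5$, in agreement with the output.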
