
In the linked question, Probability distribution in the coupon collector's problem, the distribution of $N$, the number of coupons needed to complete a set, is derived for the case where all coupons are equiprobable. I want to extend this to the case where the coupon probabilities differ (the probability of collecting the $i$th coupon is $p_i$, where $i \in 1,\dots,n$).


My line of thinking: Ross covers the mean and variance of the general case in his textbook *Introduction to Probability Models* (Chapter 5, Example 5.17). He uses a neat trick: imagine the coupons arriving according to a Poisson process with rate $1$, and let $X$ be the time at which the full set is collected under this process. If $X_j$ is the time at which the $j$th coupon first arrives, then by the splitting property of Poisson processes the type-$j$ arrivals form independent Poisson processes with rates $p_j$, so the $X_j$ are independent with $X_j \sim \text{Exp}(p_j)$. Therefore

$$X = \max_{1\leq j \leq n} X_j$$

and so,

$$P(X<t)=\prod\limits_{j=1}^{n}\left(1-e^{-p_j t}\right)$$
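
For completeness, the step from this CDF to $E(X)$ is the usual survival-function integral:

$$E(X) = \int_0^\infty P(X>t)\,dt = \int_0^\infty \left(1 - \prod\limits_{j=1}^{n}\left(1-e^{-p_j t}\right)\right)dt,$$

which is how $E(X)$ can be obtained from the CDF.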

This was easy enough. But we don't want the CDF of $X$; we want the CDF of $N$. For the mean and variance of $N$, we can first use the CDF of $X$ to get $E(X)$ and $V(X)$; the laws of total expectation and total variance then give $E(N)$ and $V(N)$, respectively (see here). Is there a corresponding way to go from the CDF of $X$ to the CDF of $N$?
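
To make the setup concrete, here is a small Monte Carlo sketch in Python (the probability vector `p` is made up purely for illustration) that simulates $N$ directly and compares its sample mean with $E(X)$ computed from the CDF above. It is only a sanity check of the Poissonization argument, not an answer to the question about the CDF of $N$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical coupon probabilities; any vector summing to 1 works here.
p = np.array([0.5, 0.3, 0.2])

def draw_N(p, rng):
    """Number of coupons drawn until every type has been seen at least once."""
    seen = set()
    draws = 0
    while len(seen) < len(p):
        seen.add(int(rng.choice(len(p), p=p)))
        draws += 1
    return draws

samples = [draw_N(p, rng) for _ in range(10000)]
print("simulated E(N):", np.mean(samples))

# E(X) via the Poissonized CDF: numerically integrate the survival function
# P(X > t) = 1 - prod_j (1 - exp(-p_j * t)).
t = np.linspace(0.0, 200.0, 400001)
survival = 1.0 - np.prod(1.0 - np.exp(-np.outer(t, p)), axis=1)
print("E(X) from the CDF:", np.sum(survival) * (t[1] - t[0]))
```

The two printed values should agree, since the inter-arrival times of the rate-$1$ process have mean $1$ and so $E(N) = E(X)$ by the total-expectation argument above.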
