Could you please explain us something about the "−1" in the formula below? Where does it come from and what it is the intuition behind it?
$$p = \sum^{m}_{i=1} (-1)^{i+1} {m \choose i} \left(\frac{m-1}{m}\right)^k$$
For what it's worth, I am posting this answer because my research into the Coupon Collector's problem has failed to find this exact problem addressed.
The formula shown below can be used to calculate the probability of not collecting all m coupons after k trials in a coupon collector's problem (with m coupons).
I may be misinterpreting the OP's (i.e. original poster's) intent. I am interpreting the problem to be that you have $m$ coupons, and that you randomly select $k$ coupons, one at a time, with replacement.
I am assuming that $k \geq m.$
Also, unless I am mistaken, the correct formula is
$$p = \sum^{m}_{i=1} (-1)^{i+1} {m \choose i} \left(\frac{m-\color{red}{i}}{m}\right)^k.$$
Inclusion Exclusion is used. See this article for an
introduction to Inclusion-Exclusion.
Then, see this answer for an explanation of and justification for the Inclusion-Exclusion formula.
For any set $E$, with a finite number of elements, let $|E|$ denote the number of elements in the set $E$.
For the specific problem, let $S$ denote the collection of all multisets that represent all possible distributions of the $m$ coupons, where each set contains $k$ elements, and a specific element (i.e. collected coupon) can occur more than once.
For $i \in \{1,2,\cdots,m\}$, let $S_i$ denote the subset of $S$ that contains all multisets that are missing coupon $i$.
Then, the desired computation is
$$\frac{|S_1 \cup S_2 \cup \cdots \cup S_m|}{|S|} ~: ~|S| = m^k.$$
Let $T_1$ denote $~\displaystyle \sum_{1 \leq i_1 \leq m} |S_{i_1}|.$
Let $T_2$ denote $~\displaystyle \sum_{1 \leq i_1 < i_2 \leq m} |S_{i_1} \cap S_{i_2}|.$
That is, $T_2$ represents the sum of $~\displaystyle \binom{m}{2}~$ terms.
Similarly, for $~3 \leq r \leq m,~$ let $~T_r~$ denote the $~\displaystyle \binom{m}{r}~$ terms given by
$~\displaystyle \sum_{1 \leq i_1 < i_2 < \cdots < i_r \leq m} |S_{i_1} \cap S_{i_2} \cap \cdots \cap S_{i_r}|.$
Then, in accordance with Inclusion Exclusion Theory,
$$|S_1 \cup S_2 \cup \cdots \cup S_m| = \sum_{r=1}^m (-1)^{r+1} T_r.$$
Thus, the entire problem has been reduced to showing that
$$T_r = \binom{m}{r}\left(m-r\right)^k. \tag1 $$
To enumerate $T_r$, first consider the specific computation of
$|S_1 \cap S_2 \cap \cdots \cap S_r|.$
There are $(m-r)^k$ ways that $k$ coupons can be collected from the subset of coupons represented by $\{r+1, r+2, \cdots, m\},~$ where the sampling is done with replacement.
Further, by considerations of symmetry, for any ordered $r$-tuple
$(i_1, i_2, \cdots, i_r) ~: ~i_1 < i_2 < \cdots < i_r,$
you have that
$|S_{i_1} \cap S_{i_2} \cap \cdots \cap S_{i_r}|$
is also equal to $(m-r)^k.$
So, in computing $T_r$, there are $~\displaystyle \binom{m}{r}~$ ways of selecting the $r$ coupons that will be missing.
So, the formula in (1) above is established.
It is important to note that (for example)
in the subset of distributions represented by
$S_1 \cap S_2 \cap \cdots \cap S_r$,
while the coupons represented by $\{1,2,\cdots,r\}$ are missing in each such distribution, other coupons may also be missing from the represented distributions.
In fact, it is this consideration that :
- Promotes the easy computation of $~\displaystyle \binom{m}{r} \times (m-r)^k.$
- Requires that Inclusion Exclusion be used.
For example, consider the situation where the coupons that are represented by $\{1,2\}$ are missing, and no other coupons are missing.
Such distributions will be counted twice. Once in $|S_1|$ and once in $|S_2|$. Then, these distributions will be deducted once because of the term $|S_1 \cap S_2|$, which is part of the computation of $T_2$.
So, the net effect is that such coupons are counted $~(+2 - 1) = 1~$ time.