This question solves the problem of "how many packets do I need to achieve a certain probability of completing the album?", but it made me think of another, closely related question.

Given an album of $N$ stickers, and given that the packets come with $M=5$ stickers each, I would like to know whether there is a closed formula for the expected number of packets one must buy to complete the album. In the notation of the question linked above, I think what I want is $\mathbb{E}[k \mid n=N]$, where $k$ is the number of packets I have bought and $n$ is the number of different stickers I have.

I wasn't able to derive this answer from that question because in that case, $k$ is not treated as a random variable but as a parameter one fixes to perform the calculation.

You may assume either that a packet cannot contain the same sticker more than once, or that it can. Both results would be of interest.

Tendero
  • It is hard to understand the question, since $k,n$ are not defined. Also, the cases of (possibly) repeated stickers corresponds to $M=1$. Also, the assumption "for simplicity" makes the problem complicated. Please define $k,n$ without reference to the initial post. – dan_fulea May 23 '18 at 23:56
  • @dan_fulea I've edited the question. However, I don't understand this part of your comment: "the cases of (possibly) repeated stickers corresponds to $M=1$". Why is that so? – Tendero May 24 '18 at 00:02
  • If in an envelope with 5 objects the objects may repeat, then it is the same as buying 5 envelopes with one object inside. (So we also have to pass from $N$ to $MN$.) – dan_fulea May 24 '18 at 00:07

1 Answer

The case with repetition allowed within a packet is solved in "Coupon Collector Problem with Batched Selections"; the result (using your variable names) is

$$ \sum_{l=0}^{N-1}(-1)^{N-l+1}\binom Nl\frac1{1-\left(\frac lN\right)^M}\;. $$

With $N=424$ and $M=5$ as in your example, this comes out to about $562.5$ (WolframAlpha calculation).

The case with no repetition allowed within a packet is solved in "Expected number of times a set of 10 integers (selected from 1-100) is selected before all 100 are seen"; the result (using your variable names) is

$$ \sum_{l=0}^{N-1}(-1)^{N-l+1}\binom Nl\frac1{1-\frac{\binom lM}{\binom NM}}\;. $$

With $N=424$ and $M=5$ as in your example, this comes out to about $559.8$ (WolframAlpha calculation).
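
Both sums can also be evaluated without WolframAlpha. The following is a minimal sketch, not part of the linked answers: the function names are made up for illustration, and exact rational arithmetic (`fractions.Fraction`) is used on the assumption that floating point would suffer badly from cancellation in these alternating sums with large binomial coefficients. In both functions the index $l$ runs from $0$ to $N-1$, since the term $l=N$ would have a vanishing denominator.

```python
from fractions import Fraction
from math import comb


def expected_packets_with_repeats(N, M):
    """Expected packets when stickers may repeat within a packet.

    Evaluates sum_{l=0}^{N-1} (-1)^(N-l+1) C(N,l) / (1 - (l/N)^M) exactly.
    (Hypothetical helper name; assumes 1 <= M and N >= 1.)
    """
    total = Fraction(0)
    for l in range(N):
        sign = 1 if (N - l + 1) % 2 == 0 else -1
        # (l/N)^M: probability that one packet stays inside a fixed set of l stickers
        stay = Fraction(l, N) ** M
        total += sign * comb(N, l) / (1 - stay)
    return total


def expected_packets_without_repeats(N, M):
    """Expected packets when the M stickers in a packet are all distinct.

    Evaluates sum_{l=0}^{N-1} (-1)^(N-l+1) C(N,l) / (1 - C(l,M)/C(N,M)) exactly.
    (Hypothetical helper name; assumes 1 <= M <= N.)
    """
    total = Fraction(0)
    for l in range(N):
        sign = 1 if (N - l + 1) % 2 == 0 else -1
        # math.comb(l, M) is 0 when M > l, so small l contribute denominator 1
        stay = Fraction(comb(l, M), comb(N, M))
        total += sign * comb(N, l) / (1 - stay)
    return total
```

For the example in the question one would call `float(expected_packets_with_repeats(424, 5))` and `float(expected_packets_without_repeats(424, 5))` and compare against the values of about $562.5$ and $559.8$ quoted above.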

For comparison, without batches (the standard coupon collector's problem) you'd expect to require

$$ N\sum_{l=1}^N\frac1l $$

stickers, corresponding to

$$ \frac NM\sum_{l=1}^N\frac1l $$

batches. With $N=424$ and $M=5$ as in your example, this comes to about $562.1$ batches (WolframAlpha calculation).
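
The comparison value is easy to reproduce as well. A small sketch, again with illustrative function names of my own choosing, that computes the harmonic-sum expressions above exactly:

```python
from fractions import Fraction


def expected_single_stickers(N):
    """N * H_N: expected number of individually drawn stickers (standard coupon collector)."""
    return N * sum(Fraction(1, l) for l in range(1, N + 1))


def expected_batches_unbatched(N, M):
    """The same quantity expressed in batches of M stickers, i.e. (N / M) * H_N."""
    return expected_single_stickers(N) / M
```

Calling `float(expected_batches_unbatched(424, 5))` gives the figure to compare with the roughly $562.1$ batches quoted above.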

In fact, up to the first $55$ digits that WolframAlpha produces, this is precisely $0.4$ batches, or $2$ stickers, less than in the case with repetition allowed within a packet. That makes sense: If the residue modulo $5$ of the number of stickers required were equidistributed, we would expect to "waste" exactly $2$ stickers in the case with repetition allowed within a packet; and as the probability distribution of the number of stickers required varies slowly across many multiples of $5$, the residue modulo $5$ is indeed very close to being equidistributed.
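
To spell out the "waste" argument: write $T$ for the number of single stickers required (my notation, not used in the linked questions). When repetition within a packet is allowed, the stickers are independent draws, so the number of packets bought is $\lceil T/M\rceil$ and the number of wasted stickers is $W=M\lceil T/M\rceil-T\in\{0,\dots,M-1\}$. Under the assumption that $T \bmod M$ is equidistributed, so is $W$, and

$$ \mathbb{E}[W]=\frac1M\sum_{w=0}^{M-1}w=\frac{M-1}2\;, $$

which for $M=5$ is $2$ stickers, or $2/5=0.4$ batches.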

joriki