2

A young baseball fan wants to collect a complete set of 262 baseball cards. The baseball cards are available in a completely random fashion, one per package of chewing gum.

How many boxes of chewing gum does the fan need to buy in order to have a full set with probability ≥ $0.99$?

I was told I need more info to solve this problem. However, this is all the info given. Is this missing data?

joriki
  • 238,052
  • This is known as the Coupon Collector's Problem. Lots of information about it online. – lulu Apr 23 '16 at 13:20
  • I reopened the question because it has a feature that the one given as a duplicate, Birthday-coverage problem, doesn't have: The desired probability is close to $1$, which allows for a good estimate using expectation values, allowing the result to be calculated without evaluating too many astronomical Stirling numbers. – joriki Apr 23 '16 at 14:14
  • There is an Erdős and Rényi result, that $\operatorname{P}(T < n\log n + cn) \to e^{-e^{-c}}, \text{as} \ n \to \infty$. If $e^{-e^{-c}}=0.99$ then $c=-\log_e(-\log_e(0.99)) \approx 4.6$ and $262 \log_e(262) + 4.6 \times 262 \approx 2664.1$. Perhaps $n$ is not quite big enough here. – Henry Oct 11 '23 at 00:20

3 Answers

5

As lulu noted in a comment, this is the coupon collector's problem. The probability of having a complete set of $m$ coupons after drawing $n$ coupons is

$$ \def\stir#1#2{\left\{#1\atop#2\right\}} \frac{m!}{m^n}\stir nm\;, $$

where $\stir nm$ is a Stirling number of the second kind. For a derivation of this probability, see Probability distribution in the coupon collector's problem.
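For readers who want to evaluate this probability directly, here is a minimal Python sketch. It relies on the standard identity $m!\stir nm=\sum_{k=0}^m(-1)^k\binom mk(m-k)^n$ (the number of surjections from $n$ draws onto $m$ coupons), so no Stirling numbers need to be tabulated. The function name `complete_set_probability` is my own choice for illustration, not something from the linked derivation.

```python
from fractions import Fraction
from math import comb


def complete_set_probability(m: int, n: int) -> Fraction:
    """Exact probability that n uniform draws cover all m coupon types."""
    # m! * S(n, m) counts the surjections from n draws onto m coupons;
    # inclusion-exclusion gives it as an alternating sum of exact integers.
    surjections = sum(
        (-1) ** k * comb(m, k) * (m - k) ** n for k in range(m + 1)
    )
    # Divide by the m**n equally likely draw sequences.
    return Fraction(surjections, m ** n)


if __name__ == "__main__":
    # Convert to float only at the end to avoid cancellation errors.
    for n in (2659, 2660):
        print(n, float(complete_set_probability(262, n)))
```

Working with exact integers is slower than floating point, but it sidesteps the heavy cancellation in the alternating sum.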

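You have $m=262$ and want the probability to be at least $0.99$. Since we can't easily solve the expression above for $n$, we should try to get a good estimate so we don't have to compute too many Stirling numbers. The probability of an incomplete set is at most the expected number of undrawn coupons, and the two nearly coincide when both are small, so the number of coupons you need to draw to have all coupons with probability $0.99$ should be close to the number you need to draw to make the expected number of undrawn coupons $0.01$. Each coupon is still undrawn after $n$ draws with probability $\left(1-\frac1m\right)^n$, so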

$$ m\left(1-\frac1m\right)^n\approx0.01\;, $$

and solving for $n$ yields

$$ n\approx\frac{\log0.01-\log m}{\log\left(1-\frac1m\right)}\approx2660.37\;, $$

and indeed calculating the exact probability for the values adjacent to $n=2660$ shows that, as Byron found, the probability $0.99$ is first reached at this value.
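The estimate is easy to reproduce numerically. The following sketch just evaluates the closed-form expression for $n$ given above; the function name `estimated_draws` and its parameter names are illustrative choices of mine.

```python
from math import log


def estimated_draws(m: int, expected_missing: float) -> float:
    """Solve m * (1 - 1/m)**n = expected_missing for n (natural logs)."""
    return (log(expected_missing) - log(m)) / log(1 - 1 / m)


if __name__ == "__main__":
    # m = 262 coupons, expected number of undrawn coupons 0.01
    print(round(estimated_draws(262, 0.01), 2))
```

This only gives a starting point; the exact probabilities near that value still have to be checked to pin down the threshold.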

joriki
  • 238,052
1

I answered a similar problem here: Birthday-coverage problem

Setting $n=262$ there, I calculate that with $2659$ packages the chance of getting a complete collection is $0.9899956985$, while with $2660$ packages the chance is $0.9900336992$.

So the answer to your question is $2660$.
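As a cross-check that does not use the closed formula, one can track the distribution of the number of distinct cards held after each purchase and stop when the chance of holding all of them first reaches the target. This is a sketch of that idea in floating point, not the method used in the linked answer; the name `packages_needed` is made up for illustration, and it assumes each package contains a uniformly random card, independent of the others.

```python
def packages_needed(m: int, target: float) -> int:
    """Smallest n with P(all m cards seen after n packages) >= target."""
    if not 0.0 < target < 1.0:
        raise ValueError("target must lie strictly between 0 and 1")
    # dist[k] = probability of holding exactly k distinct cards so far
    dist = [0.0] * (m + 1)
    dist[0] = 1.0
    n = 0
    while dist[m] < target:
        new = [0.0] * (m + 1)
        for k, p in enumerate(dist):
            if p == 0.0:
                continue
            # The next card duplicates one of the k cards already held...
            new[k] += p * k / m
            # ...or is one of the m - k cards still missing.
            if k < m:
                new[k + 1] += p * (m - k) / m
        dist = new
        n += 1
    return n


if __name__ == "__main__":
    print(packages_needed(262, 0.99))
```

All terms in the update are non-negative, so there is no cancellation, and the gap between the two probabilities quoted above is far larger than the rounding error.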

0

Technically, you need to know how many boxes of chewing gum were produced. For example, if they only manufactured a total of $262$ boxes, and "random" meant simply that you don't know which card is in which package, then obviously you just need to buy all $262$ boxes, in which case you are assured, with $100$% probability, that you have a complete set. So the one piece of information that's missing, I'd say, is whether the problem is making the tacit assumption that there is, in effect, an infinite supply of boxes of chewing gum.

Barry Cipra
  • 79,832