
I have the following problem, and I was hoping you guys could help me solve it:

Consider a set of $t$ unique, collectable stickers. These make up the whole universe of collectable stickers, i.e., any sticker you buy is one of the $t$ unique stickers. Obviously, you can get repeated stickers (stickers that have the same unique identifier). You start with an album that is initially empty, and you buy $c$ stickers. Assume that each sticker you buy is equally likely to be any of the $t$ different stickers. What is the probability of being able to complete the sticker album with the $c$ stickers you bought?

I know that if $c < t$, then this probability is $0$. I first tried solving the problem in the following manner. Let $f(1), f(2), \ldots, f(t)$ be the number of copies of each of the $t$ stickers among the $c$ stickers bought (i.e., how many times you got each of the $t$ possibilities). The linear equation I tried solving is:

$$f(1)+f(2)+...+f(t)=c \qquad \qquad \qquad (1)$$

I assumed that the problem could be modelled as counting the integer solutions of $(1)$ with every $f(i) \geq 1$ and dividing by the number of integer solutions of $(1)$ with every $f(i) \geq 0$. That gave me $\dfrac{c!(c-1)!}{(c-t)!(c+t-1)!}$, but I soon realised that the problem cannot be modelled this way. For instance, say $t=c=2$. Then the solution $1+1=2$ of $(1)$ doesn't have the same probability as, say, $0+2=2$. Any ideas on how to solve this? Any help is very much appreciated! Thank you!

Gabriel
  • I think your lead-in "Consider a sticker album with $t$ stickers" is misleading. This puts the reader in the frame of mind thinking that you already have $t$ stickers and you bought $c$ more, but you seem to use it as if there are a total of $t$ unique stickers that one could collect. – Travis Bemrose Apr 20 '14 at 01:10
  • Still trying to make sense of your question. Since you reference a mysterious "of $(1)$", but only at the end label something $(1)$, it leads the reader to think you're referencing something in a text somewhere, or a notion you were taught in a class that you think is standard. References to something should always come after that thing has been defined/labeled (or occasionally immediately before). Lastly (and I'm not picking on you, trying to improve your chances of getting a good answer), it's not clear to me what you're labeling $(1)$ and what that paragraph means. – Travis Bemrose Apr 20 '14 at 01:20
  • Thanks for the tips, I'll try to make the text clearer! – Gabriel Apr 20 '14 at 01:24
  • While your formulas are not clear to me, I believe I get the gist of the English, as I've asked myself (and solved) the same problem more than once. I don't recall the answer right now, but I can tell you that as $c \to \infty$, your probability approaches $1$, but never arrives there. – Travis Bemrose Apr 20 '14 at 01:28
  • I'm not sure I made myself clear this time. Could you give it a try again and tell whether my reference to the equation is still obscure? I'm not used to writing formal(ish) mathematics texts, so I'll appreciate your input here. Edit: okay, I'll explain from where I got the formula. – Gabriel Apr 20 '14 at 01:30
  • Part of my confusion is that you've labeled an entire paragraph with your $1$, but you refer to it as if you've labeled a single equation. Also, if I understand correctly, that paragraph is part of your "this is what I tried that didn't work"? If that is the case, may I try my hand at rearranging a bit? If I bungle your intent, I believe you'll have the chance to reject my edit. – Travis Bemrose Apr 20 '14 at 01:35
  • by all means, please! That is indeed the case. I have tried solving it as I expressed in the reference, without success (the problem cannot be modelled like that). – Gabriel Apr 20 '14 at 01:38
  • Take a look at those changes. I don't want to change your meaning, but I think this order will help people understand your meaning better. – Travis Bemrose Apr 20 '14 at 01:46
  • This sounds like what’s often called the “coupon collector’s problem.” More commonly-asked questions than yours are how many stickers do you need to buy to have a particular chance of getting all the stickers/coupons, or how many stickers do you need to buy, on average (one at a time) to get all varieties, but I think if you search for “coupon collector’s problem,” you’ll find some useful approaches to answering your question. – Steve Kass Apr 20 '14 at 01:53
  • It looks like you and I were making edits at the same time at one point. Your latest edit to add $\binom{c-1}{t-1}$ and $\binom{c+t-1}{t-1}$ didn't make it into my edit which you just accepted. – Travis Bemrose Apr 20 '14 at 01:54
  • No need! @SteveKass 's answer was exactly what I needed. Thanks Steve and @Travis! You've been very helpful. – Gabriel Apr 20 '14 at 02:07
  • Did @SteveKass answer your question by referring you to the Coupon Collector's Problem? Steve, or you, or I (or anyone else if he declines) should put it as an answer below, so it can be accepted. – Travis Bemrose Apr 20 '14 at 17:02
  • Thanks for the suggestion, Travis. I copied my comment to an answer. – Steve Kass Apr 20 '14 at 17:14

2 Answers


You appear to be using the stars and bars method.

You effectively have $t$ bins to place $c$ picks. You want the probability that at least one pick is in each bin. (The 'bins' being the identity of the sticker.)

There are ${t+c-1\choose t-1}$ ways to fill the bins in total and ${c-1\choose t-1}$ ways to fill them with at least one pick in each. (See Wikipedia)

$$P=\frac{c-1\choose t-1}{t+c-1\choose t-1}=\frac{c!(c-1)!}{(t+c-1)!(c-t)!}$$

The reason this isn't working is that the picks are not indistinct; they are enumerated by order of occurrence. 'Stars and bars' requires indistinct objects to be placed in distinct bins.
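The mismatch can be checked by brute force for small cases. The sketch below is an illustration added to this answer, not part of the original argument: it enumerates all $t^c$ ordered purchase sequences (each equally likely under the question's assumption, with $t \geq 1$) and compares the fraction that contains every sticker with the stars-and-bars ratio above. The function names `exact_probability` and `stars_and_bars_ratio` are made up for this example.

```python
from fractions import Fraction
from itertools import product
from math import comb


def exact_probability(t: int, c: int) -> Fraction:
    """Fraction of the t**c equally likely ordered purchases that contain all t stickers."""
    complete = sum(
        1 for seq in product(range(t), repeat=c) if len(set(seq)) == t
    )
    return Fraction(complete, t**c)


def stars_and_bars_ratio(t: int, c: int) -> Fraction:
    """The ratio P above, which implicitly treats every count vector as equally likely."""
    if c < t:
        return Fraction(0)
    return Fraction(comb(c - 1, t - 1), comb(t + c - 1, t - 1))


print(exact_probability(2, 2))     # 1/2
print(stars_and_bars_ratio(2, 2))  # 1/3
```

The enumeration grows as $t^c$, so it is only practical for small $t$ and $c$; its purpose is to show that the two counts disagree, not to serve as a general method.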

Graham Kemp
  • Let us assume $t=2$, with stickers $s1$ and $s2$. Also, considering $c=2$, we have $4$ possibilities for the 2 stickers we bought: $\left\{ \left\{ s1, s1 \right\}, \left\{ s1, s2 \right\}, \left\{ s2, s1 \right\}, \left\{ s2, s2 \right\} \right\}$. Since they're all equiprobable, $P(t=2,c=2)=2/4=1/2$ (that is the way I see it, at least). My formula (and yours too, since it's the same) gives $P=1/3$. Isn't that wrong? – Gabriel Apr 20 '14 at 01:52

Reposted from a comment that the OP described as answering the question.


This sounds like what’s often called the “coupon collector’s problem.” More commonly-asked questions than yours are how many stickers do you need to buy to have a particular chance of getting all the stickers/coupons, or how many stickers do you need to buy, on average (one at a time) to get all varieties, but I think if you search for “coupon collector’s problem” (MSE, Google) you’ll find some useful approaches to answering your question.
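One approach such a search usually turns up is inclusion-exclusion over the set of missing stickers. It is not spelled out in this answer, so take the following as a sketch under the question's assumptions (independent draws, each uniform over the $t$ stickers, $t \geq 1$):

$$P(t,c)=\sum_{k=0}^{t}(-1)^{k}\binom{t}{k}\left(1-\frac{k}{t}\right)^{c}$$

A minimal Python version, with the hypothetical function name `completion_probability`, uses exact fractions so that small cases can be compared by hand:

```python
from fractions import Fraction
from math import comb


def completion_probability(t: int, c: int) -> Fraction:
    """P(all t stickers appear among c independent uniform draws), by inclusion-exclusion."""
    return sum(
        (-1) ** k * comb(t, k) * Fraction(t - k, t) ** c
        for k in range(t + 1)
    )


print(completion_probability(2, 2))  # 1/2
```

For $t=c=2$ this gives $1/2$, matching the direct count of the four equally likely outcomes, and it returns $0$ whenever $c<t$.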

Steve Kass