
Suppose you have numbers 1, 2, 3, 4. You can pick each of these numbers with probability 1/4. After n iid selections, what is the probability that you have seen all 4 numbers? Express this as a function of n.

For example, if n=5, and you draw the numbers 1,1,3,2,1 then this is unsuccessful, as 4 was never picked. If you draw 3,4,2,3,4, this is also unsuccessful as 1 was never picked. However, 3,2,2,1,4 is successful.

I am aware you can use a discrete Markov chain and find the n-step transition probability matrix on Wolfram, but is there a way to find this without a calculator and with reasoning alone?
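For reference, the Markov chain approach mentioned above can be sketched in a few lines. This is only an illustration under one assumed modelling choice: the state is the count of distinct numbers seen so far, so from state $k$ the chain stays put with probability $k/4$ and moves to $k+1$ with probability $(4-k)/4$. The function name `prob_all_seen_markov` and the parameter `m` are hypothetical.

```python
from fractions import Fraction

def prob_all_seen_markov(n, m=4):
    """Probability that all m equally likely numbers appear in n iid draws.

    State k = how many distinct numbers have been seen so far (assumed model).
    """
    # dist[k] = probability of being in state k; we start with nothing seen.
    dist = [Fraction(0)] * (m + 1)
    dist[0] = Fraction(1)
    for _ in range(n):
        nxt = [Fraction(0)] * (m + 1)
        for k, p in enumerate(dist):
            if p == 0:
                continue
            # Drawing an already-seen number keeps the state at k.
            nxt[k] += p * Fraction(k, m)
            # Drawing a new number moves the state to k + 1.
            if k < m:
                nxt[k + 1] += p * Fraction(m - k, m)
        dist = nxt
    return dist[m]

if __name__ == "__main__":
    for n in range(3, 9):
        print(n, prob_all_seen_markov(n))
```

Exact fractions are used so the values can be compared directly with any closed form.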

Evan Li
  • 23

2 Answers


Let $S(i)$ be the set of outcomes where $i$ does not appear in $n$ picks.

Let $N(j)$ be the sum, over all ways to choose $j$ of the sets $S(i)$, of the number of outcomes in their intersection. Then

$$ N(j)=\overbrace{\quad\binom{4}{j}\quad}^{\substack{\text{number of}\\\text{ways to}\\\text{choose the}\\\text{$j$ missing}}}\overbrace{\ (4-j)^n\ \vphantom{\binom41}}^{\substack{\text{number of}\\\text{ways to pick}\\\text{$n$ from the}\\\text{remainder}}} $$

According to the Generalized Inclusion-Exclusion Principle, the number of outcomes in none of the $S(i)$ is

$$ \begin{align} &N(0)-N(1)+N(2)-N(3)+N(4)\\ &=\binom{4}{0}(4-0)^n-\binom{4}{1}(4-1)^n+\binom{4}{2}(4-2)^n-\binom{4}{3}(4-3)^n+\binom{4}{4}(4-4)^n\\ &=4^n-4\cdot3^n+6\cdot2^n-4\cdot1^n+0^n \end{align} $$

The total number of outcomes is $4^n$, so the probability that we see all of the possible numbers is

$$ 1-4\ \left(\frac34\right)^n+6\ \left(\frac12\right)^n-4\ \left(\frac14\right)^n+[n=0] $$

Here $[n=0]$ is the Iverson bracket, equal to $1$ if $n=0$ and $0$ otherwise; it comes from the $0^n$ term.
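As a sanity check, the inclusion-exclusion count can be compared with a brute-force enumeration for small $n$. The sketch below is illustrative only; the function names and the parameter `m` (number of equally likely values, defaulting to 4) are hypothetical. It relies on Python evaluating `0 ** 0` as `1`, which plays the role of the $0^n$ term at $n=0$.

```python
from fractions import Fraction
from itertools import product
from math import comb

def prob_all_seen(n, m=4):
    """Inclusion-exclusion: sum over j of (-1)^j * C(m, j) * (m - j)^n, over m^n."""
    count = sum((-1) ** j * comb(m, j) * (m - j) ** n for j in range(m + 1))
    return Fraction(count, m ** n)

def prob_all_seen_brute(n, m=4):
    """Enumerate all m^n sequences and count those containing every number."""
    hits = sum(
        1
        for seq in product(range(1, m + 1), repeat=n)
        if len(set(seq)) == m
    )
    return Fraction(hits, m ** n)

if __name__ == "__main__":
    for n in range(8):
        # Both methods should agree exactly for every n.
        print(n, prob_all_seen(n), prob_all_seen_brute(n))
```

The brute-force version visits $4^n$ sequences, so it is only practical for small $n$; the closed form has no such limit.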

robjohn
  • 345,667

You may just count systematically as follows:

Number of sequences of length $n$ using exactly $1$ distinct digit: $$\color{blue}{\Rightarrow 4}$$

Number of sequences of length $n$ using exactly $2$ distinct digits:

  • Choose two of the digits: $\color{blue}{\binom 42}$
  • Number of sequences of length $n$ using only these two digits (not necessarily both): $\color{blue}{2^n}$
  • Number of sequences of length $n$ using only one of the $2$ selected digits, which must be subtracted: $\color{blue}{2}$ $$\color{blue}{\Rightarrow \binom 42\left(2^n-2\right)}$$

Number of sequences of length $n$ using exactly $3$ distinct digits:

  • Choose three of the digits: $\color{blue}{\binom 43}$
  • Number of sequences of length $n$ using only these three digits (not necessarily all of them): $\color{blue}{3^n}$
  • Number of sequences of length $n$ using only two of the $3$ selected digits, counted once for each pair and subtracted: $\color{blue}{\binom{3}{2}2^n}$
  • Number of sequences of length $n$ using only one of the $3$ selected digits, added back by inclusion-exclusion: $\color{blue}{3}$

$$\color{blue}{\Rightarrow \binom 43\left(3^n- \binom{3}{2}2^n+3\right)}$$

Altogether, for $n\ge 1$, the number of sequences in which at least one digit is missing is $$\color{blue}{4 + \binom 42\left(2^n-2\right) + \binom 43\left(3^n- \binom{3}{2}2^n+3\right)=4-6\cdot 2^n+4\cdot 3^n}$$

Now, the probability you are looking for is

$$\frac{4^n - \left(4-6\cdot 2^n+4\cdot 3^n\right)}{4^n}=\boxed{1-\frac 1{4^{n-1}}+6\frac 1{2^n}-4\left(\frac 34\right)^n}$$
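The three counts used above can be checked by tallying every sequence by the number of distinct digits it uses. This is a minimal sketch with hypothetical function names, and it assumes $n\ge 1$ (for $n=0$ the only sequence is empty and uses no digits, so the case "exactly $1$ digit" contributes $0$ rather than $4$).

```python
from collections import Counter
from itertools import product
from math import comb

def counts_by_distinct(n, m=4):
    """Tally sequences of length n over {1..m} by number of distinct digits used."""
    return Counter(
        len(set(seq)) for seq in product(range(1, m + 1), repeat=n)
    )

def closed_form_counts(n):
    """Counts of sequences using exactly 1, 2, or 3 of the 4 digits (n >= 1)."""
    return {
        1: 4,
        2: comb(4, 2) * (2 ** n - 2),
        3: comb(4, 3) * (3 ** n - comb(3, 2) * 2 ** n + 3),
    }

if __name__ == "__main__":
    for n in range(1, 8):
        tally = counts_by_distinct(n)
        formula = closed_form_counts(n)
        # Print brute-force and closed-form counts side by side for k = 1, 2, 3.
        print(n, [(tally[k], formula[k]) for k in (1, 2, 3)])
```

Summing the three closed-form counts reproduces $4-6\cdot 2^n+4\cdot 3^n$, the number of sequences with at least one digit missing.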