
Suppose we pick a random number from $[1, n]$, with repetition, $k$ times. What is the probability distribution of the number of distinct numbers picked for a given $k$? The number of distinct numbers picked lies in $[1, \min(k, n)]$.

vamsikal
  • Could you precisely define the indicator random variable in question? Also are you picking one number randomly, or $k$ numbers? – Akay Mar 08 '17 at 10:19
  • We are picking k random numbers with repetition from $[1, n]$. – vamsikal Mar 09 '17 at 11:51

1 Answer


Suppose we draw $m$ times (the $k$ of the question) from $n$ possible values and ask about the number $r$ of distinct values that appeared. Classifying all $n^m$ outcomes by $r$, we get from first principles

$$\frac{1}{n^m} \sum_{r=1}^n {n\choose r} \times {m\brace r} \times r!.$$
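A minimal Python sketch that tabulates the individual terms of this sum as exact fractions, one per value of $r$. The helper names `stirling2` and `distinct_distribution` are illustrative assumptions and do not appear in the original answer; the Stirling numbers are computed with the standard recurrence ${m\brace r} = r{m-1\brace r} + {m-1\brace r-1}$.

```python
from fractions import Fraction
from math import comb, factorial


def stirling2(m, r):
    """Stirling number of the second kind via S(m, r) = r*S(m-1, r) + S(m-1, r-1)."""
    # row[j] holds S(i, j) for the current i; start from S(0, 0) = 1.
    row = [1] + [0] * r
    for _ in range(m):
        new = [0] * (r + 1)
        for j in range(1, r + 1):
            new[j] = j * row[j] + row[j - 1]
        row = new
    return row[r]


def distinct_distribution(n, m):
    """Return {r: P(exactly r distinct values)} for m uniform draws from n values."""
    total = Fraction(n) ** m
    return {
        r: Fraction(comb(n, r) * stirling2(m, r) * factorial(r)) / total
        for r in range(1, min(n, m) + 1)
    }
```

For example, `distinct_distribution(3, 2)` gives probability $1/3$ for one distinct value and $2/3$ for two.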

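The term for a given $r$ counts the outcomes with exactly $r$ distinct values: choose the $r$ values, split the $m$ draws into $r$ non-empty blocks, and match blocks to values in $r!$ ways. We may include $r=0$ because the Stirling number ${m\brace 0}$ is zero for $m\ge 1$. This being a sum of probabilities, it should evaluate to one. We get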

$$\frac{1}{n^m} \sum_{r=0}^n {n\choose r} \times m! [z^m] (\exp(z)-1)^r = m! [z^m] \frac{1}{n^m} \sum_{r=0}^n {n\choose r} (\exp(z)-1)^r \\ = m! [z^m] \frac{1}{n^m} \exp(nz) = \frac{1}{n^m} n^m = 1$$

and the sanity check goes through. For the expected number of distinct values we get, using $r{n\choose r} = n{n-1\choose r-1}$,

$$\frac{1}{n^m} \sum_{r=1}^n r {n\choose r} \times m! [z^m] (\exp(z)-1)^r \\ = \frac{1}{n^{m-1}} \sum_{r=1}^n {n-1\choose r-1} \times m! [z^m] (\exp(z)-1)^r \\ = \frac{1}{n^{m-1}} m! [z^m] (\exp(z)-1) \sum_{r=1}^n {n-1\choose r-1} \times (\exp(z)-1)^{r-1} \\ = \frac{1}{n^{m-1}} m! [z^m] (\exp(z)-1) \sum_{r=0}^{n-1} {n-1\choose r} \times (\exp(z)-1)^{r} \\ = \frac{1}{n^{m-1}} m! [z^m] (\exp(z)-1) \exp((n-1)z) = \frac{1}{n^{m-1}} (n^m - (n-1)^m) \\ = n \left(1 - \left(1-\frac{1}{n}\right)^m\right).$$
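As a small worked instance of this closed form, take $n=3$ and $m=2$ (values chosen here purely for illustration). Of the $9$ possible pairs, $3$ show one distinct value and $6$ show two, so a direct count and the formula agree:

$$\frac{3\cdot 1 + 6\cdot 2}{9} = \frac{5}{3} \quad\text{and}\quad 3\left(1 - \left(1-\frac{1}{3}\right)^2\right) = 3 \cdot \frac{5}{9} = \frac{5}{3}.$$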

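The generating-function form of the Stirling numbers used above comes from the species for labeled set partitions, with $\mathcal{U}$ marking blocks, which is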

$$\mathfrak{P}(\mathcal{U}\mathfrak{P}_{\ge 1}(\mathcal{Z}))$$

which yields the generating function

$$G(z, u) = \exp(u(\exp(z)-1)).$$
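Spelled out, extracting the coefficient of $u^r$ from this generating function gives the substitution used in the sums of this answer:

$${m\brace r} = m! [z^m] [u^r] G(z, u) = m! [z^m] \frac{(\exp(z)-1)^r}{r!} \quad\text{so that}\quad {m\brace r} \times r! = m! [z^m] (\exp(z)-1)^r.$$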

We verified the formula for the expectation with the following Maple script, which enumerates all $n^m$ sequences of draws.

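# Brute-force expected number of distinct values over all n^m draw sequences.
ENUM :=
proc(n, m)
    option remember;
    local ind, d, res;

    res := 0;
    # The integers n^m .. 2*n^m-1 have m+1 base-n digits with leading digit 1,
    # so their m low-order digits run through every sequence of m draws.
    for ind from n^m to 2*n^m-1 do
        # Digits are returned least significant first.
        d := convert(ind, base, n);

        # The multiset has one entry per distinct digit.
        res := res +
        nops(convert(d[1..m], `multiset`));
    od;

    res/n^m;
end;

# Closed form for comparison.
X := (n,m)-> n*(1-(1-1/n)^m);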
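For readers without Maple, the same check can be sketched in Python. This is a rough port under the assumption that exhaustive enumeration is feasible (small $n$ and $m$ only); the function names `enum_expected` and `closed_form` are illustrative and not part of the original script.

```python
from fractions import Fraction
from itertools import product


def enum_expected(n, m):
    """Average number of distinct values over all n**m draw sequences."""
    total = sum(len(set(seq)) for seq in product(range(n), repeat=m))
    return Fraction(total, n ** m)


def closed_form(n, m):
    """Closed form n * (1 - (1 - 1/n)**m), evaluated exactly."""
    return n * (1 - (1 - Fraction(1, n)) ** m)
```

Comparing `enum_expected(n, m)` with `closed_form(n, m)` for a few small pairs reproduces the verification; exact fractions avoid any floating-point tolerance.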

Marko Riedel