Probability - Expected number of draws to get all 52 cards at least once drawing in groups of size n

Question

Imagine you have a deck of cards and want to be fairly sure that you draw each card once (with a perfectly fair, complete, and random deck on each draw, of course). You are drawing cards in groups of size n from the deck. What is the expected number of draws such that each card has been drawn at least once?

Similar to the coupon collector's problem, but not quite. How would one go about integrating the math for that algorithm with drawing multiple cards at the same time?

Edit: found some duplicate questions.

How to calculate the expected value of the coupon collector problem if we are collecting the coupons in groups of k?

Coupon Collector Problem with Batched Selections

reasonable certainty is not a mathematical adjective: you may want the "expected number of draws such that each card has been drawn at least once", is it so? — G Cab, Feb 16 '17 at 18:22
Edited per your comment; sounds approximately the same to me, but if it helps others understand what I'm getting at, all suggestions are welcome. — Landon Bland, Feb 16 '17 at 18:29
@LandonBland: Expected number has a clear definition. If you do the experiment many times this is the mean number of draws required. It may leave you with less than $50%$ chance of having them all. "reasonable certainty" to some people will be $90%$, to some a lower number, but I think would always be at least $50%$. The expected number may not be an integer. — Ross Millikan, Feb 16 '17 at 23:25
There is a $\frac{1}{52}$ chance of drawing any card and n cards in each draw. If the deck maintains that, then it may be easier to check the probability of drawing cards you havent already drawn. — Nick Pavini, Feb 17 '17 at 23:26
For trivia, what I was actually trying to estimate was the number of times an online quiz with 10 randomly chosen questions out of a pool of some variable size (usually 15-30 questions) would need to be taken for the expected number of unique questions to be equal to the total number of questions in the pool.
Figured it was better phrased as a card question. — Landon Bland, Feb 19 '17 at 09:12
Notice that the linked questions don't seem to quite correspond to your problem, because you are (I guess) drawing groups of different cards (no replacement - i.e., no possibility of having duplicated cards in a draw).. — leonbloy, Apr 13 '18 at 21:57
@leonbloy: I agree. I've closed this as a duplicate of a different question in which the coupons in each group are distinct. — joriki, May 10 '20 at 00:22

score 4 · Answer 1 · edited Apr 13 '17 at 12:20

We can actually solve some special cases. Suppose we have $n$ types of coupons which are drawn in packets of $q$ coupons, with no duplicates in the packets. We derive closed forms for any packet size and evaluate them for $q=2.$ Now with $T_{m,n,q}$ the number of ways of drawing $m$ packets of $q$-subsets of $[n]$ so that all possible values of $n$ are present we get for the probability of $m$ draws the closed form

$$P[T=m] = {n\choose q}^{-m} \sum_{k=1}^q {n\choose n-k} \times T_{m-1, n-k, q} \times {n-k \choose q-k} \\ = {n\choose q}^{-m} \sum_{k=1}^q {n\choose k} \times T_{m-1, n-k, q} \times {n-k \choose q-k}.$$

This is for $n\gt q$ since the process always halts at the first step when $n=q.$ The sum variable $k$ is the count of the values missing before the draw of the subset at position $m$ or alternatively of the values that appear in draw $m$ for the first time.

To compute the terms in $T$ we have a simple version of the computation at the following MSE link and introduce the generating function

$$[z^q] \prod_{l=1}^n (1+zA_l)$$

which generates the $q$-subsets so that

$$\left([z^q] \prod_{l=1}^n (1+zA_l)\right)^m$$

generates the $m$-sequences of $q$-subsets. We use inclusion-exclusion to remove those terms where some of the $n$ terms are missing. The nodes $P\subseteq A$ in the poset represent terms from the generating function where the elements of $P$ are missing plus possibly some more. This is evidently accomplished by setting the $A_l\in P$ to zero. We set the remaining $A_l$ to one to obtain a count. The contribution for a given $P$ is

$$[z^q] (1+z)^{n-|P|} = {n-|P|\choose q}.$$

Therefore inclusion-exclusion yields

$$\sum_{p=0}^n {n\choose p} (-1)^p {n-p\choose q}^m$$

which is

$$\bbox[5px,border:2px solid #00A000]{ T_{m,n,q} = \sum_{p=0}^n {n\choose p} (-1)^{n-p} {p\choose q}^m.}$$

This is zero when $m=0$ and $n\ge 1.$ Now as a sanity check we should have $T_{m,n,1} = {m\brace n} \times n!$ and indeed we obtain

$$\sum_{p=0}^n {n\choose p} (-1)^{n-p} {p\choose 1}^m = \sum_{p=0}^n {n\choose p} (-1)^{n-p} p^m = {m\brace n} \times n!$$

and the check goes through. The rest is as shown at the following MSE link.

Next let us try to verify that we have a probability distribution. We have

$$\sum_{m\ge 1} P[T=m] = \sum_{k=1}^q {n\choose k} {n-k\choose q-k} \sum_{p=0}^{n-k} {n-k\choose p} (-1)^{n-k-p} \sum_{m\ge 1} {n\choose q}^{-m} {p\choose q}^{m-1} \\ = \sum_{k=1}^q {n\choose k} {n-k\choose q-k} {n\choose q}^{-1} \sum_{p=0}^{n-k} {n-k\choose p} (-1)^{n-k-p} \frac{1}{1-{p\choose q}/{n\choose q}} \\ = \sum_{k=1}^q {n\choose k} {n-k\choose q-k} \sum_{p=0}^{n-k} {n-k\choose p} (-1)^{n-k-p} \left({n\choose q}-{p\choose q}\right)^{-1}.$$

Specializing to $q=2$ we get

$$2\sum_{k=1}^2 {n\choose k} {n-k\choose 2-k} \sum_{p=0}^{n-k} {n-k\choose p} (-1)^{n-k-p} \frac{1}{n(n-1)-p(p-1)} \\ = 2n(n-1) \sum_{p=0}^{n-1} {n-1\choose p} (-1)^{n-1-p} \frac{1}{n(n-1)-p(p-1)} \\ + n(n-1) \sum_{p=0}^{n-2} {n-2\choose p} (-1)^{n-2-p} \frac{1}{n(n-1)-p(p-1)}.$$

We evaluate the two sums by residues, using for the first sum

$$f(z) = \frac{(n-1)!}{n(n-1)-z(z-1)} \prod_{p=0}^{n-1} \frac{1}{z-p} = -\frac{(n-1)!}{(z-n)(z-(1-n))} \prod_{p=0}^{n-1} \frac{1}{z-p}.$$

The residue at infinity is zero and the residues at $n$ and $1-n$ are

$$-\frac{(n-1)!}{2n-1} \frac{1}{n!} - \frac{(n-1)!}{1-2n} \frac{(-1)^n (n-2)!}{(2n-2)!}.$$

We get for the second sum by the same technique

$$-\frac{(n-2)!}{2n-1} \frac{1}{n!} - \frac{(n-2)!}{1-2n} \frac{(-1)^{n-1} (n-2)!}{(2n-3)!}.$$

Negate and add to get

$$n(n-1) \times \\ \left(\frac{1}{2n-1}\left(\frac{2}{n}+\frac{1}{n(n-1)}\right) + \frac{(n-2)!}{1-2n} \frac{(-1)^{n-1} (n-2)!}{(2n-3)!} \left(1-2\frac{n-1}{2n-2}\right)\right) \\ = \frac{1}{2n-1} (2n-2+1) = 1.$$

This confirms it being a probability distribution.

We now give a closed form for the expectation. We find

$$\sum_{m\ge 1} m P[T=m] \\ = \sum_{k=1}^q {n\choose k} {n-k\choose q-k} \sum_{p=0}^{n-k} {n-k\choose p} (-1)^{n-k-p} \sum_{m\ge 1} m {n\choose q}^{-m} {p\choose q}^{m-1} \\ = \sum_{k=1}^q {n\choose k} {n-k\choose q-k} {n\choose q}^{-1} \sum_{p=0}^{n-k} {n-k\choose p} (-1)^{n-k-p} \frac{1}{\left(1-{p\choose q}/{n\choose q}\right)^2}.$$

This is

$$\bbox[5px,border:2px solid #00A000]{ {n\choose q} \sum_{k=1}^q {n\choose k} {n-k\choose q-k} \sum_{p=0}^{n-k} {n-k\choose p} (-1)^{n-k-p} \left({n\choose q}-{p\choose q}\right)^{-2}.}$$

We obtain $nH_n$ when we evaluate this for $q=1$ which is a good check but not exactly surprising since we have already seen this work at the other link (the reader is invited to attempt this computation using the above formula as a starting point, which is easier than what follows). We now try for a closed form for $q=2$ and get

$$2n^2(n-1)^2 \sum_{p=0}^{n-1} {n-1\choose p} (-1)^{n-1-p} \frac{1}{(n(n-1)-p(p-1))^2} \\ + n^2(n-1)^2 \sum_{p=0}^{n-2} {n-2\choose p} (-1)^{n-2-p} \frac{1}{(n(n-1)-p(p-1))^2}.$$

We use residues as before with the function

$$g(z) = \frac{(n-1)!}{(n(n-1)-z(z-1))^2} \prod_{p=0}^{n-1} \frac{1}{z-p} = \frac{(n-1)!}{(z-n)^2(z-(1-n))^2} \prod_{p=0}^{n-1} \frac{1}{z-p}.$$

Note that

$$\left(\prod_{p=0}^{n-1}\frac{1}{z-p}\right)' = -\prod_{p=0}^{n-1}\frac{1}{z-p} \sum_{p=0}^{n-1} \frac{1}{z-p}$$

We get for the residue at $n$

$$(n-1)! \left(-\frac{1}{(2n-1)^2} \frac{1}{n!} H_n -\frac{2}{(2n-1)^3} \frac{1}{n!}\right)$$

The residue at $1-n$ yields

$$(n-1)! \left(\frac{1}{(1-2n)^2} \frac{(-1)^n(n-2)!}{(2n-2)!} (H_{2n-2} - H_{n-2}) \\ - \frac{2}{(1-2n)^3} \frac{(-1)^n(n-2)!}{(2n-2)!} \right)$$

For the second sum we get for the residue at $n$

$$(n-2)! \left(-\frac{1}{(2n-1)^2} \frac{1}{n!} (H_n-1) -\frac{2}{(2n-1)^3} \frac{1}{n!}\right)$$

and for the one at $1-n$

$$(n-2)! \left(\frac{1}{(1-2n)^2} \frac{(-1)^{n-1}(n-2)!}{(2n-3)!} (H_{2n-3} - H_{n-2}) \\ - \frac{2}{(1-2n)^3} \frac{(-1)^{n-1}(n-2)!}{(2n-3)!} \right)$$

Collecting everything we find (observe that the terms on $H_{2n-2}$ and on $H_{n-2}$ and the third term from the residues at $1-n$ cancel the same way as in the computation of the probability that we saw earlier)

$$-\frac{1}{(2n-1)^2} H_n \left(\frac{2}{n} + \frac{1}{n(n-1)}\right) + \frac{1}{(2n-1)^2} \frac{1}{n(n-1)} \\- \frac{2}{(2n-1)^3} \left(\frac{2}{n}+\frac{1}{n(n-1)}\right) \\ + (n-1)! \frac{2}{(1-2n)^2} \frac{(-1)^n(n-2)!}{(2n-2)!} \frac{1}{2n-2}$$

This is

$$-\frac{1}{(2n-1)n(n-1)} H_n - \frac{1}{(2n-1)^2} \frac{1}{n(n-1)} \\ + (n-2)! \frac{1}{(1-2n)^2} \frac{(-1)^n(n-2)!}{(2n-2)!}$$

Flip the sign and multiply by $n^2(n-1)^2$ to obtain the formula

$$\frac{n(n-1)}{2n-1} H_n + \frac{n(n-1)}{(2n-1)^2} + (-1)^{n-1} \frac{n^2(n-1)^2}{(2n-1)^2} \frac{(n-2)!\times(n-2)!}{(2n-2)!}$$

We conclude with the closed form (an exact result) for the case of packets containing two coupons which is

$$\bbox[5px,border:2px solid #00A000]{ \frac{n(n-1)}{2n-1} H_n + \frac{n(n-1)}{(2n-1)^2} + (-1)^{n-1} \frac{n^2}{(2n-1)^2} {2n-2\choose n-1}^{-1}.}$$

This attractive formula obviously motivates further research, possibly into the case $q=3,$ which looks difficult, perhaps requiring a computer algebra system during the simplifications. Observe that the dominant asymptotic is $1/2\times n H_n$ which means $q=2$ is about twice as fast as a single coupon so the effect of there being no packets with duplicate coupons is negligible.

Here is some Maple code to help with further research into these statistics. Not optimized to the limit e.g. the sets could be replaced with hash tables, but sufficiently functional to verify all of the above.

with(combinat);

ENUM :=
proc(m, n, q)
    local ind, setct, d, data, seen, pos, res;

    setct := binomial(n, q);
    data := choose({seq(p, p=1..n)}, q);
    res := 0;

    for ind from setct^m to 2*setct^m-1 do
        d := convert(ind, base, setct);

        seen :=
        `union`(seq(data[d[pos]+1], pos=1..m));

        if nops(seen) = n then
            res := res + 1;
        fi;
    od;

    res;
end;

T := (m, n, q) ->
add(binomial(n,p)*(-1)^p*binomial(n-p,q)^m, p=0..n);


PM :=
proc(m, n, q)
    local ind, setct, d, data, seen, pos, res;

    setct := binomial(n, q);
    data := choose({seq(p, p=1..n)}, q);
    res := 0;

    for ind from setct^m to 2*setct^m-1 do
        d := convert(ind, base, setct);

        seen :=
        `union`(seq(data[d[pos]+1], pos=1..m-1));

        if nops(seen) < n and
        nops(seen union data[d[m]+1]) = n
        then
            res := res + 1;
        fi;
    od;

    res;
end;

PMX := (m, n, q) ->
add(binomial(n,k)*T(m-1,n-k,q)*binomial(n-k,q-k),
    k=1..q);

STATAPPROX :=
proc(n, q, f, mx)
    option remember;
    local res, m, setct;

    res := 0; setct := binomial(n, q);

    for m to mx do
        res := res +
        f(m)*PMX(m, n, q)/setct^m;
    od;

    res;
end;

EXPT := (n, q) ->
binomial(n, q)
*add(binomial(n,k)*binomial(n-k,q-k)
     *add(binomial(n-k,p)*(-1)^(n-k-p)
          *(binomial(n,q)-binomial(p,q))^(-2),
          p=0..n-k), k=1..q);

EXPT2 := n ->
n*(n-1)*harmonic(n)/(2*n-1) + n*(n-1)/(2*n-1)^2
+ (-1)^(n-1)*n^2/(2*n-1)^2*binomial(2*n-2,n-1)^(-1);

The following simple C program can be used to verify the closed form of the expectation by numerical simulation. These were in excellent agreement to several significant figures on all values that were examined. Compiled with the std=gnu99 option.

#include <stdlib.h>
#include <stdio.h>
#include <assert.h>
#include <time.h>
#include <string.h>

long choose(long n, long k)
{
  long num = 1, denom = 1;

  while(k > 0){
    num *= n;
    denom *= k;

    n--; k--;
  }

  return num/denom;
}

void recurse(int n, int q, int *cdata, int *cposrf,
             int nxt, int *sofar, int sf)
{
  if(sf == q){
    memcpy(cdata + q*(*cposrf), sofar, q*sizeof(int));
    (*cposrf)++;

    return;
  }

  if(nxt >= n) return;

  recurse(n, q, cdata, cposrf, nxt+1, sofar, sf);

  sofar[sf++] = nxt;
  recurse(n, q, cdata, cposrf, nxt+1, sofar, sf);
  sf--;
}


int main(int argc, char **argv)
{
  int n = 6 , q = 3, trials = 1000; 

  if(argc >= 2){
    n = atoi(argv[1]);
  }

  if(argc >= 3){
    q = atoi(argv[2]);
  }

  if(argc >= 4){
    trials = atoi(argv[3]);
  }

  assert(1 <= n);
  assert(1 <= q && q < n);
  assert(1 <= trials);

  int allpackets = (int)choose(n, q);
  int cdata[allpackets*q];
  int choice[q], cpos = 0;

  recurse(n, q, cdata, &cpos, 0, choice, 0);

  int pidx;
  for(pidx = 0; pidx < allpackets; pidx++){
    int offs = pidx*q;

    printf("%d", cdata[offs]);
    for(int elidx = 1; elidx < q; elidx++)
      printf(", %d", cdata[offs+elidx]);

    printf("\n");
  }

  srand48(time(NULL));
  long long data = 0;

  for(int tind = 0; tind < trials; tind++){
    int seen = 0; int steps = 0; 
    int dist[n];

    for(int cind = 0; cind < n; cind++){
      dist[cind] = 0;
    }

    while(seen < n){
      int packet = drand48() * (double)allpackets;
      int offs = packet * q;

      steps++;

      for(pidx = 0; pidx < q; pidx++){
        int coupon = cdata[offs + pidx];

        if(dist[coupon] == 0)
          seen++;
        dist[coupon]++;
      }
    }

    data += steps;
  }

  long double expt = (long double)data/(long double)trials;
  printf("[n = %d, q = %d, trials = %d]: %Le\n", 
         n, q, trials, expt);

  exit(0);
}

Wow!! Impressive analysis, thank you. – Masacroso Feb 18 '17 at 05:33 — Masacroso, Feb 18 '17 at 05:33

Masacroso · Accepted Answer · 2017-02-17T09:22:40.447

This is not a solution to the asked question, it is just an approximation

Let a deck with $M$ distinct cards, and each time you draw $n$ cards randomly, put inside again and draw again, etc... We will suppose that the probability to draw some card is the same for all cards, that is, $p=1/M$.

We will work this problem as a Markov chain: suppose you had drawn $k$ distinct cards (no matter in how many draws, ignore this) and you want to know the probability that drawing the next $n$ cards the state of distinct cards will change from $k$ to $k+j$, where $j\in\{0,\ldots,n\}$.

Then if we draw $n$ cards and all are repeated, we have that

$$\Pr[k\to k]=\frac{k}{M}\cdot\frac{k-1}{M-1}\cdots\frac{k-n+1}{M-n+1}=\frac{k^\underline n}{M^\underline{n}}$$

and in general

$$\Pr[k\to k+j]=\binom{n}{j}\frac{k^\underline{n-j}(M-k)^\underline{j}}{M^\underline n}$$

Then the expected change of $k$ from a draw is

$$\mathrm E[\text{change}]=\sum_{j=0}^n j\Pr[k\to k+j]=\frac1{M^\underline n}\sum_{j=0}^n j\binom{n}{j}k^\underline{n-j}(M-k)^\underline{j}\tag{1}$$

The last summation involves a well-known Chu-Vandermonde identity:

$$\sum_{k=0}^n \binom{n}{k}a^\underline k b^\underline{n-k}=(a+b)^\underline n$$

Then using some algebra in (1) we have that

$$\mathrm E[\text{change}]=\frac{n(M-k)}{M^\underline n}\sum_{j=0}^n \binom{n-1}{j-1}(k-1)^\underline{n-j}(M-k)^\underline{j-1}=\frac{n(M-k)}{M^\underline n}(M-1)^\underline{n-1}=n\frac{M-k}M=n\left(1-\frac{k}M\right)$$

The above means that from some draw the expected number of new cards is $n(M-k)/M$ (observe that this quantity is well-defined only when $0\le k\le M$), then (if Im not wrong, what is not sure) the expected number of different cards after $\ell$ draws is the recurrent sum

$$T_\ell:=\sum_{h=1}^\ell N_h\tag{2}$$

where $N_h:=n\left(1-\frac{\sum_{j=1}^{h-1}N_j}M\right) $ and $N_1=n$. I dont know if (2) have a closed form, anyway with different values of $\ell$ you can get an approximation for the minimal number of throws such that $T\ge M$.

EDIT:

It seems that (2) is closable, using some mathematical software I get the solution:

$$T_\ell=(6M\ell+\ell-\ell^3)\frac{n}{6M}$$

But this function for $M=52$ and $n=5$ never gets bigger than $\approx 34$ (this happen when $\ell=10$), so something is very wrong in my interpretation/calculation of the approximation. Probably the fastest way to approximate the expected number of draws is through some numerical modeling software as R.

In the wikipedia article about the Coupon collector's problem is stated that

Wolfgang Stadje has solved the case where the stickers are bought in packets, which contain no duplicates.[3] The results show that for practical applications, say packets of five stickers, the effect of packets is negligible.

Then this problem is practically the same than the original coupon problem.

From page 18 of this document there is an analysis for this case.

Interesting answer (+1), specially the ref. to Wikipedia art., which is understandable: when $n<<M$ then the situation approaches that for $n=1$. — G Cab, Feb 16 '17 at 21:34
Your expectation computation is nice. Here is a somewhat more direct computation via the linearity of expectation. $\mathbf E[\text{change}] = n\mathbf E[\text{a new drawn card is not one of the $k$ previously collected cards}] = n(1-\frac{k}{M})$. — Hans, Feb 17 '17 at 23:59
The above expectation computation holds under weaker conditions since it relies only on the probability of each draw hitting a new card and not on the specific relationship between the draw under consideration and other draws. — Hans, Feb 18 '17 at 04:30
For what it is worth, I believe the formula for k after i draws is $ k = M(-((1-\frac{n}{M})^i -1)) $ . (Thanks to WolframAlpha; I could not for the life of me solve the recursion on paper.) — Landon Bland, Feb 19 '17 at 13:22
Based on my previous comment (if it is correct), $ i = log_{(1-\frac{n}{M})}(1-\frac{k}{M}) $ . You can approximate i by setting $ k = M - 0.5 $ . I get values usably close to the ones in the linked document. — Landon Bland, Feb 19 '17 at 23:28
@LandonBland: Your corollary of Masacroso's result is correct. This again can be deduced directly from the linearity of expectations. For each type of card, the probability of it not being drawn after each drawing is $1-\frac{n}{M}$ and thus the probability of it not being drawn after $i$ mutually independent drawings is $P=\big(1-\frac{n}{M}\big)^i$. The probability of the card being drawn at least once after $i$ drawings is then $1-P$. By the linearity of the expectation and symmetry, the expected number of cards having been drawn at least once is $M(1-P)$. — Hans, Feb 21 '17 at 21:12
Masacroso: I conjecture your approximation is in general less than the true expected number of draws, perhaps due to convexity of a function. Can you prove or disprove this inequality? — Hans, Feb 21 '17 at 22:03
@Hans my approximation $T_\ell$ is wrong, but the formula for $\mathrm E[\text{change}]$ is correct. I dont know how to approximate the value of the expected number of draws correctly. — Masacroso, Feb 22 '17 at 00:58
Have you seen my direct derivation in my previous comments of your $T_l$ suggested by Landon Bland? Do you agree? — Hans, Feb 22 '17 at 08:12

Landon Bland · Answer 3 · 2017-02-23T19:37:45.463

0

M = total cards in the deck
n = number of cards per draw
k = expected unique cards drawn
i = number of draws (of n cards)

I found the following approximations based on the accepted answer's setup. They get close enough to the values on page 21 of this document that I feel that they are mostly correct, and rather simple to use. $$ k=M(−((1−\frac{n}{M})^i−1)) $$ The same equation solved for i to make things easy. $$ i = \log_{(1-\frac{n}{M})}(1-\frac{k}{M}) $$ You can approximate $ i $ by setting $ k = M - .5 $ when solving the equation for $ i $.

edited Feb 23 '17 at 19:37

answered Feb 21 '17 at 00:26

Landon Bland

53

See my comment below yours under Masacroso's answer. But setting $k=M-0.5$ is rather arbitrary. Do you have any justification other than perhaps it being close enough numerically --- even that could be problematic depending on the size of $M$ and $n$ -- to the true solution? In other words, without knowing the true solution, would you be able to select an optimal $k$? – Hans Feb 21 '17 at 21:28
@Hans I do not have justification other than that, and I would not be able to select an optimal k; it was simply easy to implement with a round(). Perhaps a fraction of M? I have trouble with approximating asymptotic functions, clearly. The initial thought in my mind was that where $ k = M - 0.5 $, k is closer to M than any other number and therefore the algorithm can be terminated. That seems more and more wrong the more I think about it. – Landon Bland Feb 23 '17 at 17:52
Exactly. I do not think this is correct, especially as $M$ increases. You will get an answer that is far larger than the true answer because the asymptotics of the function for $k$ is flat. – Hans Feb 23 '17 at 20:23
Your prediction is correct in that it (fairly slowly) gives a larger answer than the actual one (about 4 off when the answer is 120ish draws), but a fraction of M was (significantly) worse in my testing. Not sure how to get a better value. – Landon Bland Feb 26 '17 at 02:15
Precisely because of the flat asymptotics, it is futile to get a good approximation this way. A good asymptotics should be derived from the exact formula. A possible example of how to do this is http://www.jstor.org/stable/1427566?seq=1#page_scan_tab_contents. He stated in the introduction that there was an asymptotics. I have not reached that section yet. You may either use his result directly or be able to use the technique to derive the needed asymptotics yourself. – Hans Feb 27 '17 at 17:48

Probability - Expected number of draws to get all 52 cards at least once drawing in groups of size n

3 Answers3

Linked