I have a problem that looks quite simple, but I can't manage to write it down correctly in mathematical terms.
Imagine I have $N$ persons, and in each round $t=1,\dots,T$ I randomly sample $k$ persons, with replacement between rounds (so a person drawn in one round can be drawn again in a later round). I want to know how many different persons $n_{sampled, t}$ I have sampled after $t$ rounds, and typically for what $t$ I would have sampled all $N$ persons at least once.
At round $t=1$, I know for sure that I have sampled $k$ different persons. But in the next round, I may sample persons that I had already sampled. In the worst case, I would have sampled the exact same persons, so $n_{sampled,2}=k$, and in the best case, I would have sampled completely different persons, so $n_{sampled,2}=2k$.
Thus, starting from $n_{sampled,1}=k$ at the first round, for each subsequent round I need to add the expected number of persons drawn in that round who had not already been sampled.
The probability $P$ that a given person is chosen at round $t$ is:
$$ P=\frac{k}{N} $$
And thus the probability $p_t$ that a given person has not already been sampled at round $t$ would be:
$$p_t = (1 - P)^t = \left(1-\frac{k}{N}\right)^t$$
(Is this correct?)
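One way to sanity-check this expression is a small Monte Carlo experiment. The sketch below is only an illustration under stated assumptions: each round draws $k$ distinct persons uniformly at random, rounds are independent, and the simulated quantity is the probability that one fixed person is missed in all of the first $t$ rounds. The function name and the values of `N`, `k`, `t` are made up for the example.

```python
import random


def estimate_unsampled_probability(N, k, t, trials=20000, seed=0):
    """Estimate the probability that person 0 is drawn in none of the first t rounds.

    Assumption: each round draws k distinct persons uniformly at random,
    and the rounds are independent of each other.
    """
    rng = random.Random(seed)
    population = range(N)
    never_drawn = 0
    for _ in range(trials):
        # Person 0 is "unsampled" only if every one of the t rounds misses them.
        if all(0 not in rng.sample(population, k) for _ in range(t)):
            never_drawn += 1
    return never_drawn / trials


if __name__ == "__main__":
    N, k, t = 20, 4, 3  # illustrative values only
    print("simulated :", estimate_unsampled_probability(N, k, t))
    print("(1-k/N)^t :", (1 - k / N) ** t)
```

Note that the simulation counts rounds $1$ through $t$ inclusive, so whether it matches $p_t$ as I use it later depends on whether "at round $t$" means before or after that round's draw.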
So if we consider a random variable $X_t$ counting the number of persons drawn at round $t$ who had not already been sampled ($0 \leq X_t \leq k$), and assume it follows a binomial distribution with parameters $k$ and $p_t$, we would have: $$ \mathbb{E}(X_t) = k p_t$$
And thus,
$$n_{sampled, T} = k + \sum_{t=2}^{T} \mathbb{E}(X_t)$$
Would this be correct?
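To compare the formula against what actually happens, here is a minimal simulation sketch. It assumes that each round draws $k$ distinct persons uniformly at random and that rounds are independent; `simulate_distinct_count` and `proposed_formula` are hypothetical helper names, and the values of `N`, `k`, `T` are arbitrary. The second function simply evaluates the sum exactly as I wrote it, with $\mathbb{E}(X_t) = k\left(1-\frac{k}{N}\right)^t$.

```python
import random


def simulate_distinct_count(N, k, T, trials=5000, seed=0):
    """Average number of distinct persons seen after T rounds.

    Assumption: each round draws k distinct persons uniformly at random,
    independently of previous rounds.
    """
    rng = random.Random(seed)
    population = range(N)
    total = 0
    for _trial in range(trials):
        seen = set()
        for _round in range(T):
            seen.update(rng.sample(population, k))
        total += len(seen)
    return total / trials


def proposed_formula(N, k, T):
    """Evaluate k + sum_{t=2}^{T} k * (1 - k/N)^t as written in the question."""
    return k + sum(k * (1 - k / N) ** t for t in range(2, T + 1))


if __name__ == "__main__":
    N, k = 20, 4  # illustrative values only
    for T in (1, 2, 5, 10):
        print(T, simulate_distinct_count(N, k, T), proposed_formula(N, k, T))
```

If the two columns drift apart as $T$ grows, that would point to a problem in either the exponent of $p_t$ or the binomial assumption.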