I have a bag filled with different colors of balls. My goal is to determine the number of distinct colors that are in the bag, but I am limited to taking a small sample. From a sample of $N$ balls, I see that there are $X$ different colors. What is the expected number of different colors in the bag?
Some assumptions which need to be made:
- The bag is large enough that the probability of drawing a certain color does not depend on how many balls we have already drawn. (Effectively, we are drawing with replacement.)
- There is an equal number of each color in the bag.
For an example, let's say that I draw $N=17$ balls out of the bag, and I see $X=13$ distinct colors. What is a good estimate for the number of colors in the bag?
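One possible way to get a number for this example is moment matching. It is only a sketch under the two assumptions above, not necessarily the best estimator. Write $C$ for the unknown number of colors (a symbol introduced here for illustration). Any particular color is missing from all $N$ draws with probability $(1 - 1/C)^N$, so the expected number of distinct colors seen is $C\left(1 - (1 - 1/C)^N\right)$. Setting this equal to the observed $X$ and solving for $C$ numerically gives a rough estimate. The function names below are hypothetical.

```python
def expected_distinct(num_colors: float, num_draws: int) -> float:
    """Expected number of distinct colors seen in num_draws uniform draws
    (with replacement) from num_colors equally likely colors."""
    return num_colors * (1.0 - (1.0 - 1.0 / num_colors) ** num_draws)


def estimate_num_colors(num_draws: int, num_seen: int, tol: float = 1e-9) -> float:
    """Moment-matching estimate: the (possibly non-integer) C for which
    expected_distinct(C, num_draws) equals num_seen.

    Assumption: uniform colors, sampling with replacement.
    """
    if not 1 <= num_seen <= num_draws:
        raise ValueError("need 1 <= num_seen <= num_draws")
    if num_seen == num_draws:
        # Every draw was a new color: the matching equation has no finite root.
        return float("inf")

    # expected_distinct is increasing in the number of colors, so bisect.
    lo = float(num_seen)  # there are at least as many colors as were seen
    hi = 2.0 * lo
    while expected_distinct(hi, num_draws) < num_seen:
        hi *= 2.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if expected_distinct(mid, num_draws) < num_seen:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


if __name__ == "__main__":
    # The example from the question: N = 17 draws, X = 13 distinct colors.
    print(estimate_num_colors(17, 13))
```

The result is a real number rather than an integer, so it would still need rounding or a proper likelihood argument to turn it into a final answer.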
So far, I have made little progress towards answering this on my own. I have tried to invert the solution to the coupon collector's problem (so as to solve for the number of colors rather than the number of trials), but I got stuck because it involves the harmonic numbers.
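For concreteness, the coupon-collector relation in question can be written as follows, assuming $C$ equally likely colors ($C$, $T_X$ and $H_n$ are notation introduced here, not part of the setup above). If $T_X$ is the number of draws needed to first see $X$ distinct colors, then waiting for each new color is a geometric trial, and

$$
\mathbb{E}[T_X] \;=\; \sum_{k=0}^{X-1} \frac{C}{C-k} \;=\; C\left(H_C - H_{C-X}\right),
\qquad H_n = \sum_{j=1}^{n} \frac{1}{j}, \quad H_0 = 0 .
$$

Setting $\mathbb{E}[T_X] = N$ and solving for $C$ has no obvious closed form because $C$ appears inside the harmonic numbers, which is where the difficulty arises.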