I was just reading a blog post about determining how many purchases are needed to complete a set of N unique items, given that you know in advance how many items are in the set.
As a modification of this problem, suppose you do not know how many unique items N are in the set (but we can assume each item appears with the same frequency). Your only way to estimate N is to repeatedly sample ("buy") a box and open it to reveal which item is inside. At some point you will buy a repeat, and when that happens is random. For example, let N = 26 and let each letter a-z represent an item type (| marks the end of a trial, when a repeat is encountered):
a, f, i, l, o, f |
c, q, i, e, s, t, l, r, c |
As soon as a repeated purchase is made, the trial ends. I think this could be done easily in a computer simulation, but I'm wondering how difficult an analytical solution is, or whether one even exists. My guess is that this is a "classic" problem that has already been worked out, but my Google search prowess failed to turn up any results.
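For concreteness, here is a minimal sketch of the simulation I have in mind. The function name `draws_until_repeat` and the fixed seed are just illustrative choices, and items are labelled 0 to N-1 rather than a-z. Each trial buys boxes uniformly at random and stops at the first repeat.

```python
import random


def draws_until_repeat(n_items, rng):
    """Buy boxes uniformly at random from n_items equally likely types.

    Returns the number of purchases made up to and including the first repeat.
    """
    seen = set()
    while True:
        item = rng.randrange(n_items)  # item types are labelled 0 .. n_items - 1
        if item in seen:
            return len(seen) + 1  # distinct items seen so far, plus the repeat
        seen.add(item)


rng = random.Random(0)  # fixed seed so that runs are reproducible
lengths = [draws_until_repeat(26, rng) for _ in range(10000)]
print("mean trial length for N = 26:", sum(lengths) / len(lengths))
```

In the two example trials above, the lengths would be 6 and 9 purchases respectively.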
Specifically, is there a way to estimate N(n) with some confidence level X% after n = 1, 2, 3, etc. trials?
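To make the question concrete, one possible approach (a sketch under stated assumptions, not necessarily the classic answer) is maximum likelihood using only the trial lengths. If a trial ends on purchase t, the first t-1 purchases were all distinct and purchase t matched one of them, so P(T = t | N) = (1 - 0/N)(1 - 1/N)...(1 - (t-2)/N) * (t-1)/N. The names `log_likelihood` and `estimate_n` and the search cap `n_max` are my own assumptions. The sketch ignores which items overlap between trials, and it gives only a point estimate, not a confidence level.

```python
import math


def log_likelihood(n_items, lengths):
    """Log-probability of the observed trial lengths given n_items types."""
    total = 0.0
    for t in lengths:
        # The first t - 1 purchases are all distinct ...
        for i in range(t - 1):
            total += math.log(1 - i / n_items)
        # ... and purchase t repeats one of those t - 1 items.
        total += math.log((t - 1) / n_items)
    return total


def estimate_n(lengths, n_max=5000):
    """Grid-search the integer N that maximizes the likelihood.

    n_max is an arbitrary cap on the search range (an assumption).
    """
    # At least max(lengths) - 1 distinct items were seen within one trial.
    n_min = max(lengths) - 1
    return max(range(n_min, n_max + 1),
               key=lambda n: log_likelihood(n, lengths))


# Trial lengths from the two example trials above: 6 and 9 purchases.
print("estimated N:", estimate_n([6, 9]))
```

With so few trials the estimate will be very noisy, which is why I am asking about the confidence level as a function of the number of trials.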