I have a set of unknown size $N$. I can cheaply query a random item from this set. I need to estimate $N$ to within a given confidence range.
I have already set up a structure where I can query an item, determine whether it has occurred before, and update my overlap counter. My issue is how to take a sequence of occurrence counts and infer the size of the set they were sampled from.
To give a real example:
Consider the set $s := \{1, 2, 3, 4, 5, 6\}$.
Eight random samples yielded $2, 2, 4, 5, 6, 2, 3, 1$. This sequence is processed into the following map from item to occurrence count:
2 -> 3,
4 -> 1,
5 -> 1,
6 -> 1,
3 -> 1,
1 -> 1
How could I infer set size from this map?
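For concreteness, the processing step can be sketched as follows, assuming Python; the variable names are illustrative only. Under the assumption that every item is equally likely on each query, the map boils down to two numbers: the total number of draws and the number of distinct items seen.

```python
from collections import Counter

# The eight samples from the example above.
samples = [2, 2, 4, 5, 6, 2, 3, 1]

# Map each item to the number of times it occurred.
counts = Counter(samples)

# Summary statistics of the map (illustrative names).
n_draws = sum(counts.values())   # total number of queries
n_distinct = len(counts)         # number of different items seen
```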
My research has yielded this formula for inferring population size from discrete random samples: $$ P\left(N\ |\ {s}_1,{s}_2,o\right)\propto P\left(\ o\ |\ N,{s}_1,{s}_2\right)\times P(N). $$ It comes from this paper, but the formula involves two separate random samples, and I cannot adapt it to handle a single sample.
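One possible single-sample analogue, offered as a sketch rather than as the method of the paper: if draws are uniform and with replacement, the probability of seeing $d$ distinct items in $n$ draws is proportional (in $N$) to $\frac{N!}{(N-d)!\,N^{n}}$, so $P(N \mid n, d) \propto \frac{N!}{(N-d)!\,N^{n}} \times P(N)$. The code below assumes a flat prior on $N$ between $d$ and a cap `n_max`; both the prior and the cap are assumptions, and the function names are hypothetical.

```python
import math

def log_likelihood(N, n_draws, n_distinct):
    """Log of N! / ((N - d)! * N**n), up to a constant that does not depend on N."""
    if N < n_distinct:
        return float("-inf")
    return (math.lgamma(N + 1) - math.lgamma(N - n_distinct + 1)
            - n_draws * math.log(N))

def posterior(n_draws, n_distinct, n_max):
    """Posterior over N under an ASSUMED flat prior on n_distinct..n_max."""
    sizes = list(range(n_distinct, n_max + 1))
    logs = [log_likelihood(N, n_draws, n_distinct) for N in sizes]
    peak = max(logs)  # subtract the peak before exponentiating, for stability
    weights = [math.exp(v - peak) for v in logs]
    total = sum(weights)
    return {N: w / total for N, w in zip(sizes, weights)}

def credible_interval(post, mass=0.95):
    """Equal-tailed credible interval read off the cumulative posterior."""
    tail = (1.0 - mass) / 2.0
    running = 0.0
    lower = upper = None
    for N in sorted(post):
        running += post[N]
        if lower is None and running >= tail:
            lower = N
        if running >= 1.0 - tail:
            upper = N
            break
    if upper is None:  # guard against floating-point shortfall
        upper = max(post)
    return lower, upper

# The example above: 8 draws, 6 distinct items; n_max = 200 is an arbitrary cap.
post = posterior(n_draws=8, n_distinct=6, n_max=200)
most_likely = max(post, key=post.get)
interval = credible_interval(post)
```

With so few draws the posterior tail decays only like $N^{-(n-d)}$, so the interval depends heavily on the chosen cap and prior; it should tighten as more draws are added.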
I also had difficulty finding answers that gave confidence intervals. As the solution will ultimately be used in an algorithm to determine total size, I plan to simply continue sampling until my confidence interval is narrower than a given width.
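The stopping rule described here might look roughly like the sketch below. It assumes uniform draws with replacement and a flat prior on $N$ up to a cap `n_max`, and it uses a Bayesian credible interval in place of a frequentist confidence interval. The names `draw`, `n_max`, `max_width`, `max_draws` and `check_every` are all hypothetical.

```python
import math
import random

def credible_interval(n_draws, n_distinct, n_max, mass=0.95):
    """Equal-tailed interval for N under an ASSUMED flat prior on n_distinct..n_max."""
    sizes = range(n_distinct, n_max + 1)
    # Log-likelihood of d distinct items in n draws, up to a constant in N.
    logs = [math.lgamma(N + 1) - math.lgamma(N - n_distinct + 1)
            - n_draws * math.log(N) for N in sizes]
    peak = max(logs)
    weights = [math.exp(v - peak) for v in logs]
    total = sum(weights)
    tail = (1.0 - mass) / 2.0
    running = 0.0
    lower = upper = None
    for N, w in zip(sizes, weights):
        running += w / total
        if lower is None and running >= tail:
            lower = N
        if running >= 1.0 - tail:
            upper = N
            break
    if upper is None:  # guard against floating-point shortfall
        upper = n_max
    return lower, upper

def sample_until_tight(draw, n_max, max_width, max_draws=5000, check_every=10):
    """Keep drawing until the interval is at most max_width wide or the budget runs out."""
    seen = set()
    for n_draws in range(1, max_draws + 1):
        seen.add(draw())
        if n_draws % check_every == 0:
            lower, upper = credible_interval(n_draws, len(seen), n_max)
            if upper - lower <= max_width:
                return lower, upper, n_draws
    lower, upper = credible_interval(max_draws, len(seen), n_max)
    return lower, upper, max_draws

# Usage with a simulated hidden set of 50 items (for illustration only).
rng = random.Random(0)
hidden = list(range(50))
lower, upper, used = sample_until_tight(lambda: rng.choice(hidden),
                                        n_max=1000, max_width=5)
```

Rechecking the interval only every `check_every` draws keeps the cost down, since each check sums over all candidate sizes up to `n_max`.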
Any help is appreciated! I'm half sure there's just some statistical law that everyone but me knows that answers this question.