
My goal is to find the probability of matching observations from two draws of different sizes.

Say I have an urn of $N$ balls labelled $1, 2, 3, \ldots, N$. I draw $k$ balls from the urn without replacement and then return them to the urn. I then draw a second sample of $w$ balls, again without replacement, where $w \geq k$. Suppose $v$ of the balls in the second sample also appeared in the first, where $0 \le v \le k$. What is the probability of getting exactly $v$ matches?
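To make the setup concrete, here is a minimal simulation sketch of the two-draw experiment described above. The function name `simulate_matches`, the trial count, and the seed are my own illustrative choices; it only estimates the distribution of $v$ empirically and is not an exact answer.

```python
import random
from collections import Counter


def simulate_matches(N, k, w, trials=100_000, seed=0):
    """Estimate the distribution of the number of matches v by simulation.

    Hypothetical helper: draws k balls without replacement, returns them,
    then draws w balls without replacement and counts the overlap.
    """
    rng = random.Random(seed)          # seeded so runs are repeatable
    urn = range(1, N + 1)              # balls labelled 1..N
    counts = Counter()
    for _ in range(trials):
        first = set(rng.sample(urn, k))    # first draw of k balls
        second = rng.sample(urn, w)        # second draw of w balls
        v = sum(1 for ball in second if ball in first)
        counts[v] += 1
    # Convert raw counts to relative frequencies, keyed by v.
    return {v: counts[v] / trials for v in sorted(counts)}


if __name__ == "__main__":
    print(simulate_matches(N=10, k=3, w=5))
```

The estimated frequencies can then be compared against any candidate closed-form expression for the probability of $v$ matches.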

For the scenario where we draw $k$ balls twice, the probability of $v$ matches is $p = C(k,v)\, C(N-k,k-v) / C(N,k)$, based on (Anselin and Li, 2020).

This answer seems to be the closest to my question, but it doesn't quite get to it.

Additionally, this one seems like it may be useful to some degree.

thus__
  • This is a simple hypergeometric probability. In your notation, the answer is $\dfrac{C(k,v)C(n-k,w-v)}{C(n,w)}$. In mine $\dfrac{{k\choose v}{n-k\choose w-v}}{{n \choose w}}$ – Henry Nov 13 '21 at 16:28
  • Thank you so much! And for the condition that $w < k$ would we return to the original formula $C(k ,v)C(n-k, k-v)/C(n,k)$? – thus__ Nov 13 '21 at 17:03
  • See also here. It doesn't matter if $w \ge k$ or $w \lt k$, you get the same result. Note that you can exchange $w$ and $k$ in Henry's formula and obtain the same value. – Fabius Wiesner Nov 13 '21 at 17:37
  • @thus__ No. If $w < k$ you have exactly the same $\dfrac{{k\choose v}{n-k\choose w-v}}{{n \choose w}}$ formula, but it only produces positive results when $\max(0,k+w-n) \le v \le \min(w,k)$ – Henry Nov 13 '21 at 17:37
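The points made in the comments (the hypergeometric formula, its symmetry in $k$ and $w$, and the range of $v$ with nonzero probability) can be checked numerically. The following is a minimal sketch, assuming the comments' $n$ is the question's $N$; the function name `match_probability` is hypothetical.

```python
from math import comb


def match_probability(N, k, w, v):
    """Hypergeometric probability of exactly v matches.

    First draw: k balls; second draw: w balls; both without replacement
    from the same urn of N balls. Returns 0.0 outside the support
    max(0, k + w - N) <= v <= min(k, w).
    """
    if v < max(0, k + w - N) or v > min(k, w):
        return 0.0
    return comb(k, v) * comb(N - k, w - v) / comb(N, w)


if __name__ == "__main__":
    N, k, w = 10, 3, 5
    for v in range(0, min(k, w) + 1):
        p = match_probability(N, k, w, v)
        q = match_probability(N, w, k, v)  # roles of k and w exchanged
        print(v, p, q)
```

For each $v$, the two printed probabilities should agree up to floating-point rounding, which illustrates that it does not matter whether $w \ge k$ or $w < k$. Setting $w = k$ recovers the equal-size formula quoted in the question.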
