An event occurs on any given trial with probability $p$. What is the number $k$ of independent trials needed for the event to occur no less than $N$ times with probability $q$?

What I've managed so far is to express the probability $q$ in terms of $p$, $k$ and $N$ (is this correct?):

$q = p^{k} + C^{k-1}_{k}p^{k-1}(1-p) + ... + C^{N}_{k}p^{N}(1-p)^{k-N}$

But I would like to know how to express $k$ analytically (I can solve it with numeric methods, of course, but I want a non-iterative formula).

  • Small correction. You are very rarely going to be able to get equality to hold, so you should look for when $q$ is greater than or equal to the expression on the right. Also, you have an $n$ when you mean to have $N$. Finally, and almost completely off topic, when you have a variable that only takes integer values, it is a somewhat standard convention to use one of the variables $i,j,k, \ell, m$, or $n$ (or $N$, in some cases). Seeing $p^x$ and then realizing that $x$ has to be an integer made me twitch for a few seconds. – Aaron Oct 17 '12 at 10:01
  • @Aaron: Thanks, corrected the problems. And it is indeed obvious that matching probabilities with equality operator wouldn't make much sense. – Violet Giraffe Oct 17 '12 at 10:13
  • Small correction to my comment. I wrote greater where I meant less than. I hope no great confusion arose. Also, thank you for changing the $x$'s to $k$'s. The formula looks a lot less confusing when I casually glance at it. – Aaron Oct 17 '12 at 10:48

2 Answers

There's still something wrong with your formula; there's an $x$ that hasn't been introduced. Going by the text, you want

$$ \sum_{j=N}^k\binom kjp^j(1-p)^{k-j}\;. $$

You're unlikely to find a closed form for this, since for $p=1/2$ this is proportional to a sum over the lower argument of a binomial coefficient, for which no closed form is known. See also Asymptotics for a partial sum of binomial coefficients.
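
Even without a closed form, the sum is cheap to evaluate directly. Here is a minimal sketch of a brute-force search for the smallest $k$, assuming the requirement is "probability at least $q$" as suggested in the comments. The names `tail_prob` and `min_trials` are made up for illustration, and the linear search relies on the tail probability being non-decreasing in $k$:

```python
from math import comb

def tail_prob(k, n_min, p):
    """P(at least n_min successes in k independent trials with success probability p)."""
    return sum(comb(k, j) * p**j * (1 - p)**(k - j) for j in range(n_min, k + 1))

def min_trials(n_min, p, q):
    """Smallest k such that tail_prob(k, n_min, p) >= q, found by linear search."""
    if not (0 < p <= 1 and 0 <= q < 1):
        # q = 1 with p < 1 can never be reached, so the loop would not terminate
        raise ValueError("need 0 < p <= 1 and 0 <= q < 1")
    k = n_min  # fewer than n_min trials cannot give n_min successes
    while tail_prob(k, n_min, p) < q:
        k += 1
    return k
```

For example, `min_trials(1, 0.5, 0.9)` looks for the number of fair coin flips needed to see at least one head with probability at least $0.9$.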

joriki

This is not a solution, per se, but rather a heuristic you can use to narrow your search for the smallest number of trials you need. I do not think that there is an analytic solution, but rather just a large scale approximation which is quite useful.

Your trials can be viewed as random variables which take the value $1$ with probability $p$ and the value $0$ with probability $1-p$. Each has expected value $p$ and variance $p(1-p)$. By the central limit theorem, the sum of $k$ independent trials (i.e., the number of successful trials you've had) will be approximately normally distributed, with mean $kp$ and variance $kp(1-p)$. We want the probability that we have at least $N$ successes to be $q$ or greater. We can phrase this in terms of the error function $\operatorname{erf}$.

Let $\mathcal N(\mu,\sigma^2)$ denote a normally distributed random variable with mean $\mu$ and variance $\sigma^2$. We want
$$q=P(\mathcal N(kp,kp(1-p))>N)=P(\mathcal N(0,kp(1-p))>N-kp)=P\left(\mathcal N(0,1)>\frac{N-kp}{\sqrt{kp(1-p)}}\right).$$

However, $P(\mathcal N(0,1)>x)=\frac{1}{2}-\frac{1}{2}\operatorname{erf}(x/\sqrt{2})$, so we need to solve

$$\operatorname{erf}^{-1}(1-2q)=\frac{N-kp}{\sqrt{2kp(1-p)}}.$$

The left hand side is a constant which can be computed easily using many different computer packages. This reduces the problem to something more computationally feasible than computing the sum in the problem for various values of $k$.
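
The equation above can also be solved for $k$ directly, since it is a quadratic in $s=\sqrt k$. Writing $z=\sqrt2\,\operatorname{erf}^{-1}(1-2q)$, it becomes $ps^2+z\sqrt{p(1-p)}\,s-N=0$, which has exactly one positive root. A minimal sketch, assuming $0<p<1$ and $0<q<1$: the function name `approx_trials` is made up, and because Python's standard library has no inverse error function, $z$ is obtained from `statistics.NormalDist().inv_cdf(1 - q)`, which is the same quantity.

```python
from math import sqrt
from statistics import NormalDist

def approx_trials(n_min, p, q):
    """Real-valued k solving (n_min - k*p) / sqrt(k*p*(1-p)) = z (normal approximation)."""
    if not (0 < p < 1 and 0 < q < 1 and n_min > 0):
        raise ValueError("need 0 < p < 1, 0 < q < 1 and n_min > 0")
    # z = sqrt(2) * erfinv(1 - 2q), expressed through the standard normal quantile
    z = NormalDist().inv_cdf(1 - q)
    b = z * sqrt(p * (1 - p))
    # Quadratic in s = sqrt(k): p*s**2 + b*s - n_min = 0; keep the positive root
    s = (-b + sqrt(b * b + 4 * p * n_min)) / (2 * p)
    return s * s
```

The result is only an approximation and is generally not an integer, so it is best treated as a starting point: round it and check the neighbouring integers against the exact binomial sum.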

Aaron