Problem
What is the probability of observing N or more empty buckets when A balls are thrown into B buckets, each ball landing in any bucket independently and with equal probability?
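To make the question concrete, here is a minimal brute-force sketch that enumerates every equally likely placement for a tiny instance. The function name `prob_at_least_empty_bruteforce` is made up for this illustration, and the approach is only feasible when B^A is small.

```python
from fractions import Fraction
from itertools import product


def prob_at_least_empty_bruteforce(balls, buckets, min_empty):
    """Exact probability by enumerating all buckets**balls placements (tiny inputs only)."""
    hits = 0
    # Each placement assigns a bucket index to every ball; all are equally likely.
    for placement in product(range(buckets), repeat=balls):
        empty = buckets - len(set(placement))  # buckets that received no ball
        if empty >= min_empty:
            hits += 1
    return Fraction(hits, buckets ** balls)


# 3 balls, 3 buckets: 6 of the 27 placements fill every bucket, so 21/27 leave one empty.
print(prob_at_least_empty_bruteforce(3, 3, 1))  # 7/9
```

Any formula or simulation for the general case should reproduce values like this one on small inputs.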
Simulations
Python
from random import randrange

TRIALS = 10**5
BALLS = 10
BUCKETS = 12
MIN_EMPTY = 5

success = 0
for _ in range(TRIALS):
    # Throw each ball into a uniformly random bucket and count hits per bucket.
    buckets = [0] * BUCKETS
    for _ in range(BALLS):
        buckets[randrange(BUCKETS)] += 1
    # A trial succeeds when at least MIN_EMPTY buckets received no ball.
    if buckets.count(0) >= MIN_EMPTY:
        success += 1

ans = success / TRIALS
print(ans)
R
TRIALS <- 10^5
BALLS <- 10
BUCKETS <- 12
MIN_EMPTY <- 5
success <- 0
for (i in 1:TRIALS) {
  # b[j] is 1 while bucket j is empty and 0 once any ball lands in it.
  b <- rep(1, BUCKETS)
  b[sample(1:BUCKETS, BALLS, replace = TRUE)] <- 0
  if (sum(b) >= MIN_EMPTY) {
    success <- success + 1
  }
}
success / TRIALS
Computationally difficult numerical solution
Using the equations from a previous question, "What's the probability that there's at least one ball in every bin if 2n balls are placed into n bins?", I put together a numerical solution for the probability:
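$$P(\text{at least } N \text{ empty buckets}) = \frac{1}{B^A} \sum_{k=N}^{B-1} \binom{B}{k}\,(B-k)!\,S(A,B-k)$$
where $S(n,m)$ is the Stirling number of the second kind, i.e. the number of ways to partition $n$ labelled balls into $m$ non-empty groups.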
This solution produces very large intermediate values even for moderate numbers of buckets and balls. Is there a different solution that is friendlier for computers to evaluate across a wider range of values?
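For reference, the sum above can be evaluated without overflow by sticking to exact integers. The following is only a sketch, assuming Python 3.8+ (for `math.comb`) and at least one ball; the helper names `stirling2_row` and `prob_at_least_empty` are made up for this illustration. It sidesteps the overflow, but the intermediate integers still grow quickly, so it does not settle the question of a cheaper method.

```python
from fractions import Fraction
from math import comb, factorial


def stirling2_row(n, m_max):
    """Return [S(n, 0), ..., S(n, m_max)] using S(n, m) = m*S(n-1, m) + S(n-1, m-1)."""
    row = [1] + [0] * m_max  # row for n = 0: S(0, 0) = 1, S(0, m) = 0 for m > 0
    for _ in range(n):
        new = [0] * (m_max + 1)
        for m in range(1, m_max + 1):
            new[m] = m * row[m] + row[m - 1]
        row = new
    return row


def prob_at_least_empty(balls, buckets, min_empty):
    """Exact probability of at least min_empty empty buckets (assumes balls >= 1)."""
    s = stirling2_row(balls, buckets)
    # k runs over the number of empty buckets, from min_empty to buckets - 1.
    favourable = sum(
        comb(buckets, k) * factorial(buckets - k) * s[buckets - k]
        for k in range(min_empty, buckets)
    )
    return Fraction(favourable, buckets ** balls)


# Same parameters as the simulations; convert to float only at the very end.
print(float(prob_at_least_empty(10, 12, 5)))
```

Keeping everything as a `Fraction` until the final conversion avoids floating-point overflow, at the cost of arbitrary-precision arithmetic.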
Other solutions
I also came across this problem under the name "the occupancy problem". See "Probability of finding $m$ or more cells empty" and http://probabilityandstats.wordpress.com/2010/04/04/a-formula-for-the-occupancy-problem/. A caveat: I have not yet been able to get the solution from that blog post to agree with my simulation or with my implementation of the numerical solution above.
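One further route, not taken from either link and offered only as a sketch: track the distribution of the number of occupied buckets one ball at a time. If j buckets are occupied, the next ball lands in an occupied bucket with probability j/B and opens a new one with probability (B-j)/B, so every intermediate value is a probability between 0 and 1. The function name `prob_at_least_empty_dp` is made up for this illustration.

```python
def prob_at_least_empty_dp(balls, buckets, min_empty):
    """Probability of at least min_empty empty buckets, via a recurrence in floats."""
    # occ[j] = probability that exactly j buckets are occupied after the balls thrown so far
    occ = [0.0] * (buckets + 1)
    occ[0] = 1.0
    for _ in range(balls):
        new = [0.0] * (buckets + 1)
        for j, p in enumerate(occ):
            if p == 0.0:
                continue
            new[j] += p * j / buckets  # ball lands in an already occupied bucket
            if j < buckets:
                new[j + 1] += p * (buckets - j) / buckets  # ball occupies a new bucket
        occ = new
    # j occupied buckets means buckets - j empty ones.
    return sum(p for j, p in enumerate(occ) if buckets - j >= min_empty)


# Same parameters as the simulations above, for a cross-check.
print(prob_at_least_empty_dp(10, 12, 5))
```

This takes on the order of A*B floating-point operations, and its output can be compared directly against the simulation estimate.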
Thanks!