
Consider $m$ distinguishable bins of limited capacity $c$ each. After sequentially assigning $n$ indistinguishable balls uniformly (over all bins that are NOT yet full), what is the probability that $k$ out of the $m$ bins are full, i.e. contain exactly $c$ balls?

EDIT: I am considering the mechanism in which balls are launched into the bins sequentially, rather than laid out simultaneously.

From the answer to "The probability of distributing K balls over N boxes of size M with at least Q boxes empty" and with the help of https://www.mathpages.com/home/kmath337/kmath337.htm, I understand that the number of ways to allocate $n$ indistinguishable balls to $m$ distinguishable bins of capacity $c$ is given by $$N(n,m,c)= \sum_{v=0}^{m}\left(-1\right)^{v} {m \choose v} { m +n -v\left(c+1\right)-1\choose n -v\left(c+1\right)}$$

The number of ways to do so with exactly $k$ bins being full is $$N_k(n,m,c)={m \choose k } N(n-k\cdot c, m-k, c-1)$$
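The two counting formulas above can be checked numerically. The following is a minimal sketch, not taken from the cited sources: the function names are made up for illustration, and the edge-case handling (no bins, negative capacity) is an assumption about how the formulas should be read.

```python
from itertools import product
from math import comb


def bounded_compositions(n, m, c):
    """N(n, m, c): ways to put n indistinguishable balls into m
    distinguishable bins of capacity c (inclusion-exclusion sum)."""
    if m == 0:
        return 1 if n == 0 else 0  # assumed convention for zero bins
    if n < 0 or c < 0:
        return 0
    total = 0
    for v in range(m + 1):
        rest = n - v * (c + 1)
        if rest < 0:
            break  # the second binomial vanishes from here on
        total += (-1) ** v * comb(m, v) * comb(m + rest - 1, rest)
    return total


def exactly_k_full(n, m, c, k):
    """N_k(n, m, c): allocations in which exactly k bins hold c balls."""
    return comb(m, k) * bounded_compositions(n - k * c, m - k, c - 1)


def brute_force_counts(n, m, c):
    """Enumerate all load vectors and tally them by number of full bins."""
    counts = [0] * (m + 1)
    for loads in product(range(c + 1), repeat=m):
        if sum(loads) == n:
            counts[sum(x == c for x in loads)] += 1
    return counts


print(bounded_compositions(3, 3, 2))                      # total allocations
print([exactly_k_full(3, 3, 2, k) for k in range(4)])     # split by k
```

For $n=3$, $m=3$, $c=2$ this gives 7 allocations in total, of which 1 has no full bin and 6 have exactly one.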

Contrary to what the answer cited above suggests, however, these ways do not appear to be equally likely, so using $$P(k)=\frac{N_k(n,m,c)}{N(n,m,c)}$$ appears to be incorrect. To see this, consider the special case $n=3$, $m=3$, $c=2$. No bin is full only if the three balls land in three different bins, so the probability that none of the bins is full should be $1\cdot 2/3\cdot 1/3=2/9$, while the probability that exactly one bin is full should be $7/9$. The counting formula would instead give $1/7$ and $6/7$.
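For small cases the sequential mechanism can be evaluated exactly by propagating probabilities over the possible bin loads. This is a minimal sketch under the stated assumption that each ball is assigned uniformly over the bins that are not yet full; the function name is hypothetical.

```python
from collections import defaultdict
from fractions import Fraction


def full_bin_distribution(n, m, c):
    """Exact distribution of the number of full bins after n balls are
    assigned one at a time, each uniformly over the bins not yet full."""
    if n > m * c:
        raise ValueError("more balls than total capacity")
    # State: sorted tuple of bin loads. Bins are exchangeable, so the
    # order of the loads is irrelevant for counting full bins.
    states = {(0,) * m: Fraction(1)}
    for _ in range(n):
        nxt = defaultdict(Fraction)
        for loads, p in states.items():
            open_bins = [i for i, x in enumerate(loads) if x < c]
            for i in open_bins:
                new = list(loads)
                new[i] += 1
                nxt[tuple(sorted(new))] += p / len(open_bins)
        states = nxt
    dist = defaultdict(Fraction)
    for loads, p in states.items():
        dist[sum(x == c for x in loads)] += p
    return dict(dist)


print(full_bin_distribution(3, 3, 2))
```

For $n=3$, $m=3$, $c=2$ this reproduces the probabilities $2/9$ and $7/9$ stated above.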

The question "full bins with limited capacity after throwing balls" addresses this problem for the case of a distribution that is uniform over ALL bins, whereas I am interested in the case in which it is uniform over bins that are still available.

2 Answers


First, let us make clear that

distributing indistinguishable balls into distinguishable bins does not fully specify which stochastic mechanism we are actually considering, and that is frequently a cause of misunderstanding and erroneous conclusions.

Second, please allow me to change the symbols so as to stay consistent with other related posts that I am going to cite.
So let's speak of $s$ indistinguishable balls, put into $m$ distinguishable bins, each with the same maximum capacity $r$.

a) Balls laid into the bins

This is what is considered in the Mathpage article that you cite.

In this case we are looking for $$N_{\,b} (s,r,m) = \text{No}\text{. of solutions to}\;\left\{ \begin{gathered} 0 \leqslant \text{integer }x_{\,j} \leqslant r \hfill \\ x_{\,1} + x_{\,2} + \cdots + x_{\,m} = s \hfill \\ \end{gathered} \right.$$ which is given by the closed sum $$ N_b (s,r,m)\quad \left| {\;0 \leqslant \text{integers }s,m,r} \right.\quad = \sum\limits_{\left( {0\, \leqslant } \right)\,\,k\,\,\left( { \leqslant \,\frac{s}{r+1}\, \leqslant \,m} \right)} {\left( { - 1} \right)^k \binom{m}{k} \binom { s + m - 1 - k\left( {r + 1} \right) } { s - k\left( {r + 1} \right)}\ } $$ as thoroughly explained in this related post.
In particular, note the way the second binomial is expressed, which makes it possible to drop the bounds on the sum.

Also note that the "mechanism" of laying the balls in the bins, when the capacity is unlimited, leads to a total number of ways which is $$ N_b (s,s,m) = \left( \matrix{ s + m - 1 \cr s \cr} \right) $$ i.e. the number of weak compositions of $s$ into $m$ parts. This is also the "Stars&bars" mechanism, and because of that we can say that we "are launching the bins (the separators, the bars) into the balls".

Then your question reduces to computing:
- the number of ways to choose $q$ out of $m$ bins to fill up;
- the number of ways to distribute the remaining $s-qr$ balls into $m-q$ bins, with capacity $r-1$
i.e. $$ \bbox[lightyellow] { N_f (s,r,m,q) = \left( \matrix{ m \cr q \cr} \right)N_b (s - qr,r - 1,m - q) }$$

b) Balls thrown into the bins

Instead, by "launching the balls into the bins" normally it is understood that for each ball we have $m$ choices where to launch it and thus a total of $$m^s$$ equiprobable events, when the capacity is not limited.
That is quite different from the above, and corresponds to the "mechanism" in which the balls are labelled with the launching sequence, and they land and stack one over the other inside each bin. So each bin is either empty or contains a subset of $\{1,2, \cdots, s \}$.

Now, $m^s$ is the number of $s$-tuples $(b_1, b_2, \ldots, b_s)$, with $b_k$ representing the landing bin of the $k$-th ball.
But this representation is not helpful for counting the number of balls in the same bin, and we had better refer to the following splitting of $m^s$ $$ \eqalign{ & m^{\,s} = \sum\limits_{\left( {0\, \le } \right)\,\,k\,\,\left( { \le \,m} \right)} {\left\{ \matrix{ s \cr k \cr} \right\}m^{\,\underline {\,k\,} } } = \cr & = \sum\limits_{\left( {0\, \le } \right)\,\,k\,\,\left( { \le \,m} \right)} {\underbrace {\;\left( \matrix{ m \cr k \cr} \right)\;}_{\matrix{ {{\rm choice}\,k\,} \cr {{\rm non - empty}\,{\rm bins}} \cr } }\underbrace {\;\left\{ \matrix{ s \cr k \cr} \right\}\;}_{\matrix{ {{\rm partition }\left\{ {{\rm 1}{\rm ,} \cdots {\rm ,s}} \right\}} \cr {{\rm into}\,k\,{\rm sub - sets}} \cr } }\underbrace {\,k!\;}_{\matrix{ {{\rm permute}\,{\rm the}} \cr {{\rm }k{\rm subsets(bins)}} \cr } }} \cr} $$ which hinges upon the Stirling numbers of the second kind.
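The splitting of $m^s$ can be verified numerically for small values. The sketch below uses the standard recurrence for the Stirling numbers of the second kind; the function names are illustrative only.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def stirling2(s, k):
    """Stirling number of the second kind: partitions of an s-set
    into exactly k non-empty subsets."""
    if s == k:
        return 1
    if s == 0 or k == 0:
        return 0
    return k * stirling2(s - 1, k) + stirling2(s - 1, k - 1)


def falling_factorial(m, k):
    """m (m - 1) ... (m - k + 1), equal to binomial(m, k) * k!."""
    result = 1
    for i in range(k):
        result *= m - i
    return result


def split_of_power(s, m):
    """Right-hand side of the splitting: sum over the number k of
    non-empty bins of {s, k} * m^(falling k)."""
    return sum(stirling2(s, k) * falling_factorial(m, k) for k in range(m + 1))


for s, m in [(3, 3), (5, 2), (4, 6)]:
    print(s, m, split_of_power(s, m), m ** s)
```

Each printed line should show the same value twice, once from the sum and once from $m^s$ directly.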

Introducing the limitation on the capacity of the bins, i.e. on the size of the sub-sets, we need to bring in the restrained Stirling numbers of the second kind, indicated by $\left\{ \matrix{ s \cr k \cr} \right\}_{\,r}$.

Proceeding, by necessity, very concisely and schematically,
denote as

$ L_{\,b\,} (s,r,m) $
the No. of lists of $m$ sub-sets $ \left[ {\left\{ {S_{\,1} } \right\},\left\{ {S_{\,2} } \right\}, \cdots ,\left\{ {S_{\,m} } \right\}} \right]$
partitioning $\left\{ {1,\,2,\, \cdots ,\,s} \right\}$;
the sub-sets have size $\le r$, and might be also empty, and their order in the list counts.

so that it is

$$ L_{\,b\,} (s,r,m) = \sum\limits_{\left( {0\, \le } \right)\,\,k\,\,\left( { \le \,m} \right)} {\left\{ \matrix{ s \cr k \cr} \right\}_{\,r} m^{\,\underline {\,k\,} } } \;\;:\quad L_{\,b\,} (s,s,m) = m^{\,s} $$

Then, denoting with $c_1, c_2,\ldots, c_m$ the size of the $m$ subsets, these will represent a weak composition of $s$ into $m$ parts not greater than $r$, and the number of ways to compose the $m$ subsets will be $$ \eqalign{ & L_{\,b\,} (s,r,m) = \cr & = \sum\limits_{\left\{ {\matrix{ {0\, \le \,c_{\,j} \, \le \,r} \cr {c_{\,1} + c_{\,2} + \, \cdots + c_{\,m} = s} \cr } } \right.} {\left( \matrix{ s \cr c_{\,1} \cr} \right)\left( \matrix{ s - c_{\,1} \cr \,c_{\,2} \cr} \right) \cdots \left( \matrix{ s - c_{\,1} - \,c_{\,2} - \cdots - c_{\,m - 1} \cr \,c_{\,m} \cr} \right)} = \cr & = \sum\limits_{\left\{ {\matrix{ {0\, \le \,c_{\,j} \, \le \,r} \cr {c_{\,1} + c_{\,2} + \, \cdots + c_{\,m} = s} \cr } } \right.} {\left( \matrix{ s \cr c_{\,1} ,\,c_{\,2} ,\, \cdots ,c_{\,m} \cr} \right)} \cr} $$

Finally we can split $L_b$ according to the exact number ($j$ in the addends below) of the bins saturated at the max capacity $r$ $$ \eqalign{ & L_{\,b\,} (s,r,m) = \cr & = \sum\limits_{\left\{ {\matrix{ {0\, \le \,c_{\,j} \, \le \,r} \cr {c_{\,1} + c_{\,2} + \, \cdots + c_{\,m} = s} \cr } } \right.} {\left( \matrix{ s \cr c_{\,1} ,\,c_{\,2} ,\, \cdots ,c_{\,m} \cr} \right)} = \cr & = \sum\limits_{\left\{ {\matrix{ {0\, \le \,c_{\,1} ,\,c_{\,2} ,\, \cdots ,\,c_{\,m} \, < \,r} \cr {c_{\,1} + c_{\,2} + \, \cdots + c_{\,m} = s} \cr } } \right.} {\left( \matrix{ s \cr c_{\,1} ,\,c_{\,2} ,\, \cdots ,c_{\,m} \cr} \right)} + \cr & + \sum\limits_{\left\{ {\matrix{ {0\, \le \,c_{\,1} ,\,c_{\,2} ,\, \cdots ,\,c_{\,m - 1} \, < \,r} \cr {c_{\,1} + c_{\,2} + \, \cdots + c_{\,m - 1} = s - r} \cr } } \right.} {\left( \matrix{ m \cr 1 \cr} \right)\left( \matrix{ s \cr c_{\,1} ,\,c_{\,2} ,\, \cdots ,c_{\,m - 1} ,r \cr} \right)} + \cr & + \sum\limits_{\left\{ {\matrix{ {0\, \le \,c_{\,1} ,\,c_{\,2} ,\, \cdots ,\,c_{\,m - 2} \, < \,r} \cr {c_{\,1} + c_{\,2} + \, \cdots + c_{\,m - 2} = s - 2r} \cr } } \right.} {\left( \matrix{ m \cr 2 \cr} \right)\left( \matrix{ s \cr c_{\,1} ,\,c_{\,2} ,\, \cdots ,c_{\,m - 2} ,r,r \cr} \right)} + \cr & \quad \quad \quad \quad \quad \quad \quad \quad \vdots \cr & + \sum\limits_{\left\{ {\matrix{ {0\, \le \,c_{\,1} ,\,c_{\,2} ,\, \cdots ,\,c_{\,m - \left\lfloor {s/r} \right\rfloor } \, < \,r} \cr {c_{\,1} + c_{\,2} + \, \cdots + c_{\,m - \left\lfloor {s/r} \right\rfloor } = s - \left\lfloor {s/r} \right\rfloor r} \cr } } \right.} {\left( \matrix{ m \cr \left\lfloor {s/r} \right\rfloor \cr} \right)\left( \matrix{ s \cr c_{\,1} ,\,\, \cdots ,c_{\,m - \left\lfloor {s/r} \right\rfloor } ,\underbrace {r, \cdots ,r}_{\left\lfloor {s/r} \right\rfloor } \cr} \right)} = \cr & = \sum\limits_{0\, \le \,j\, \le \,\left\lfloor {s/r} \right\rfloor } {\left( \matrix{ m \cr j \cr} \right){{s!} \over {\left( {s - j\,r} \right)!\left( {r!} \right)^{\,j} }}L_{\,b\,} (s - j\,r,\;r - 1,\;m - j)} \cr} $$
or, adding the initial conditions so that it can also be used as a recurrence, $$ \bbox[lightyellow] { \eqalign{ & L_{\,b\,} (s,r,m) = \sum\limits_{\left\{ {\matrix{ {0\, \le \,c_{\,j} \, \le \,r} \cr {c_{\,1} + c_{\,2} + \, \cdots + c_{\,m} = s} \cr } } \right.} {\left( \matrix{ s \cr c_{\,1} ,\,c_{\,2} ,\, \cdots ,c_{\,m} \cr} \right)} = \cr & = \left[ {0 = r = s} \right] + \sum\limits_{\left( {0\, \le } \right)\,j\, \le \,\left\lfloor {s/r} \right\rfloor } {\left( \matrix{ m \cr j \cr} \right){{s!} \over {\left( {s - j\,r} \right)!\left( {r!} \right)^{\,j} }}L_{\,b\,} (s - j\,r,\;r - 1,\;m - j)} = \cr & = \left[ {0 = r = s} \right] + \sum\limits_{\left( {0\, \le } \right)\,j\,\left( { \le \,\left\lfloor {s/r} \right\rfloor \, \le \,m} \right)} {\left( \matrix{ m \cr j \cr} \right)\left( \matrix{ s \cr j\,r \cr} \right){{\left( {j\,r} \right)!} \over {\left( {r!} \right)^{\,j} }}L_{\,b\,} (s - j\,r,\;r - 1,\;m - j)} \cr} }$$ where the square brackets at the beginning denote the Iverson bracket.
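The recurrence for $L_b$ can be sketched in code and compared with a direct enumeration of all landing sequences. The function names are hypothetical, and treating the sum as empty when $r=0$ (so that only the Iverson bracket contributes) is my reading of the initial condition rather than something the formula spells out.

```python
from functools import lru_cache
from itertools import product
from math import comb, factorial


@lru_cache(maxsize=None)
def lists_bounded(s, r, m):
    """L_b(s, r, m): ways for s labelled balls to land in m bins of
    capacity r, split by the number j of bins filled to exactly r."""
    if r == 0:
        return 1 if s == 0 else 0  # Iverson bracket [0 = r = s]
    total = 0
    for j in range(min(m, s // r) + 1):
        # Choose the j full bins, then the balls that fill them.
        fill = factorial(s) // (factorial(s - j * r) * factorial(r) ** j)
        total += comb(m, j) * fill * lists_bounded(s - j * r, r - 1, m - j)
    return total


def lists_bounded_brute(s, r, m):
    """Count the tuples (b_1, ..., b_s) of landing bins in which no
    bin receives more than r balls."""
    return sum(
        all(landing.count(b) <= r for b in range(m))
        for landing in product(range(m), repeat=s)
    )


print(lists_bounded(3, 2, 3), lists_bounded_brute(3, 2, 3))
print(lists_bounded(4, 4, 3), 3 ** 4)
```

The first line should print the same count twice, and the second illustrates $L_b(s,s,m)=m^s$.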

G Cab
  • Thank you so much for your answer, your previous posts have already been of tremendous help! There are a number of things I am still confused about, however: Perhaps most importantly, would you mind elaborating on the distinction between balls that are laid or launched? The problem I am dealing with actually corresponds to the case of launching balls, and I am not sure whether and how this is fundamentally different. – user449277 Nov 21 '19 at 19:55
  • This is very helpful, thanks a lot! What I am referring to is indeed the mechanism of "launching balls into bins" - apologies if this was not clear from the way I posed the question. So this implies that with this mechanism balls are always considered to be distinguishable, by virtue of them being launched sequentially? – user449277 Nov 21 '19 at 23:56
  • @user449277: yes, that's so, but pay attention to the fact that inside each bin the balls are arranged with increasing label (order of launch) from bottom up. So you do not have a "full" dist. balls / dist. bins situation. Drawing some simple sketches of the various mechanisms will help to understand the differences. – G Cab Nov 22 '19 at 14:16
  • Thanks! I have tried to follow your argument and implemented it in an answer draft. Do you agree with this way of stating the number of ways in which the balls can land in the bins under the desired mechanism? It is not, however, obvious to me that each of these will be equally likely so that the probability could indeed be obtained as the fraction. Would you mind elaborating on why you think this is the case, or how otherwise you would obtain the probability from your proposed solution? – user449277 Nov 22 '19 at 20:36
  • I have added a counterexample illustrating how counting the ways does not give rise to the correct probabilities. – user449277 Nov 23 '19 at 03:36
  • @user449277: I tried my best, with the limited space, to explain the basics of this problem, recasting and expanding 2nd case – G Cab Nov 25 '19 at 23:55
  • Thank you so much! But then by my last edit to my answer our formulae actually coincide, if you plug in $L_b(s-jr,r-1,m-j)=\sum_k (m-j)^{\underline{k}} a_{r-1} (s-jr,k)$ in your formula. Do you agree? – user449277 Nov 26 '19 at 18:38
  • @user449277: yes, good, now both our formulas give same results ! – G Cab Nov 26 '19 at 22:25
  • Wonderful, that answers the question then! I accepted your answer, thank you very much for your patient help! – user449277 Nov 27 '19 at 15:54
  • @user449277: thanks to you for proposing such an interesting question ! – G Cab Nov 28 '19 at 14:48
1

The number of ways in which $s$ balls can be launched sequentially into $m$ bins of capacity $r$ each, with exactly $f$ bins being full, is given by

$$\begin{eqnarray*} N_f(s,m,r) &=&{s\choose f\cdot r} {m \choose f} f! a_r(f\cdot r,f) \sum_{i=0}^{m-f} {m-f \choose i} i! a_{r-1}(s-f\cdot r,i)\\ &=& {s\choose f\cdot r} {m \choose f} f! \frac{(f\cdot r)!}{f! (r!)^f} \sum_{i=0}^{m-f} {m-f \choose i} i! a_{r-1}(s-f\cdot r,i)\\ &=& {m \choose f} \frac{s!}{(s-f\cdot r)! (r!)^f} \sum_{i=0}^{m-f} (m-f)^{\underline{i}} a_{r-1}(s-f\cdot r,i)\\ &=&{m \choose f}{s \choose f \cdot r}\frac{(f\cdot r)!}{(r!)^{f}}\sum_{i=0}^{m-f}\left(m-f\right)^{\underline{i}}a_{r-1}(s-f\cdot r,i) \end{eqnarray*}$$

where

  • $a_r(s,m)$ is the restrained Stirling number of the second kind according to https://math.stackexchange.com/a/2315280, i.e. the number of set partitions of $\{1,\dots,s\}$ into exactly $m$ non-empty subsets of maximal cardinality $r$
  • $m! a_r(s,m)$ is the number of such partitions with "labelled" subsets
  • ${s\choose f\cdot r}$ is the number of ways to select balls that will be in one of the $f$ full bins
  • ${m\choose f}$ is the number of ways to select $f$ full bins
  • $i$ is the number of non-empty subsets with cardinality smaller than $r$
  • ${m-f \choose i}$ is the number of ways to select these from the $m-f$ non-full bins
  • counting the partitions amounts to counting only the ways where balls within the bins are stacked in increasing order of their label, a requirement pointed out in a comment by @GCab.
  • $a_r(f\cdot r,f)=\frac{(f\cdot r)!}{f! (r!)^f}$
  • $n^{\underline k}$ denotes the falling factorial

In the special case $s< r$ we have $a_r(s,m)=\begin{Bmatrix}s\\m\end{Bmatrix}$ (the Stirling number of the second kind) and this number becomes \begin{eqnarray*} N_f(s,m,r)&=&{s\choose f\cdot r} {m \choose f} f! \begin{Bmatrix}f\cdot r\\f\end{Bmatrix} \sum_{i=0}^{m-f} {m-f \choose i} i! \begin{Bmatrix}s-f\cdot r\\i\end{Bmatrix}\\ &=& {s\choose f\cdot r} m^{\underline f} \begin{Bmatrix}f\cdot r\\f\end{Bmatrix} \sum_{i=0}^{m-f} (m-f)^{\underline i} \begin{Bmatrix}s-f\cdot r\\i\end{Bmatrix} \\ &=& {s\choose f\cdot r} m^{\underline f} \begin{Bmatrix}f\cdot r\\f\end{Bmatrix} (m-f)^{s-f\cdot r} \end{eqnarray*}

The total number of ways is $N(s,m,r)=\sum_{f=0}^m N_f(s,m,r)$. This does not, however, imply that the probability of $f$ bins being full is $$P(f)=\frac{N_f(s,m,r)}{N(s,m,r)}$$ as demonstrated by a counterexample with $m=3$, $r=2$:

  • $s=1$: $N_0(1,3,2)=3$ so that $P(f=0)=1$
  • $s=2$: $N_0(2,3,2)=6$ and $N_1(2,3,2)=3$ so that $P(f=0)=2/3$
  • $s=3$: $N_0(3,3,2)=6$ and $N_1(3,3,2)=18$ so that $P(f=0)=1/4$

With $s=3$ the probability that no bin is full should be $2/3\cdot 1/3=2/9$. While 6 out of 24 cases leave no bin full, the correct probability should be 6 out of $3^3=27$ equiprobable events.
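The counts in the counterexample can be reproduced with a short sketch of the formula for $N_f$. The restrained Stirling numbers are computed here by conditioning on the size of the block that contains the largest element, which is one standard recurrence and not necessarily the one used in the linked post; the function names are illustrative.

```python
from functools import lru_cache
from math import comb, factorial


@lru_cache(maxsize=None)
def restrained_stirling2(s, k, r):
    """a_r(s, k): partitions of {1..s} into k non-empty blocks,
    each of size at most r."""
    if s == 0:
        return 1 if k == 0 else 0
    if k == 0:
        return 0
    # The block holding element s has j further elements, 0 <= j <= r - 1.
    return sum(
        comb(s - 1, j) * restrained_stirling2(s - 1 - j, k - 1, r)
        for j in range(min(r, s))
    )


def falling_factorial(m, k):
    result = 1
    for i in range(k):
        result *= m - i
    return result


def ways_with_full_bins(s, m, r, f):
    """N_f(s, m, r), third line of the derivation: choose the f full
    bins and their balls, then spread the rest over non-full bins."""
    rest = s - f * r
    if rest < 0 or f > m:
        return 0
    prefactor = comb(m, f) * factorial(s) // (factorial(rest) * factorial(r) ** f)
    tail = sum(
        falling_factorial(m - f, i) * restrained_stirling2(rest, i, r - 1)
        for i in range(m - f + 1)
    )
    return prefactor * tail


for s in (1, 2, 3):
    print(s, [ways_with_full_bins(s, 3, 2, f) for f in range(4)])
```

With $m=3$, $r=2$ this prints the counts listed above, including $N_0(3,3,2)=6$ and $N_1(3,3,2)=18$, for a total of 24.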

  • It is not clear how you intend to consider the $3$ ways leading to put the three balls into the same bin. You can cancel them and keep $24$ equi-probable total ways to fill $3$ bins of capacity $2$ . – G Cab Nov 23 '19 at 23:01
  • How did you get the first formula for $N_f$ ?, as far as I could check it is not correct. Actually, and unfortunately, the recursions for the Restrained Stirling are a bit involved and that makes difficult to split the R. St2 according to having $0,1, \cdots, m$ full bins. – G Cab Nov 23 '19 at 23:16
  • In the mechanism I consider, a ball that would have been put into a bin that was full will then be put in one of the other bins, rather than discarded. Therefore the 24 events are not equiprobable. Consider the case in which the first bin contains 2 balls and the other two bins contain none. The third ball will now end up in the second or third bin with probability 1/2 each, not 1/3. Thus there is additional probability mass on the assignment (2,1,0), relative to the equiprobable case. Does this make sense? – user449277 Nov 24 '19 at 19:22
  • What part of the formula looks incorrect to you? I just attempted to count the ways in which s balls can be assigned to $f$ bins containing $r$ balls and $m-f$ bins containing between $0$ and $r-1$ balls. I computed the restricted Stirling numbers using the recursion, and the resulting output of the formula in the example with $m=3$, $r=2$ seem correct, right? – user449277 Nov 24 '19 at 19:35
  • "Does this make sense ?" yes, that is one of the many possible "mechanisms" which make the throwing the balls in the bins not to have a unique interpretation. But with this mechanism, forget that you can arrive at an algebraic close formula. – G Cab Nov 24 '19 at 21:38
  • Concerning the formula, unfortunately it does not work even in the simple mechanism of $24$ equi-prob. events. The difficulty with St2 is that, when you have blocks with equal size, you cannot permute them : re. for instance to this post – G Cab Nov 24 '19 at 21:42
  • "But with this mechanism, forget that you can arrive at an algebraic close formula." This answers my question then (in the negative) - would you like to add a statement explaining this fact to your answer before I accept it? Since I should probably remove my answer. – user449277 Nov 24 '19 at 23:00
  • The second remark I don't understand, I'm afraid. Since under the "launching balls mechanism" balls are labelled by their sequence, shouldn't we be able to permute even blocks of equal size? So you're saying that 24 is not the correct number? – user449277 Nov 24 '19 at 23:07
  • Not exactly. What I meant to say is that St2 does not allow itself to be easily manipulated. But we cannot discuss further in comments: I'll try to expand my answer. – G Cab Nov 24 '19 at 23:32
  • Thank you, that's most appreciated! You're absolutely right - I'm still rather new to the forum. – user449277 Nov 25 '19 at 02:47