12

Define $$S_1 = \sum_{i=1}^n P(A_i), \qquad S_2 =\sum_{1 \le i < j \le n} P(A_i \cap A_j),$$ and in general $$S_k =\sum_{1 \le i_1 < \cdots < i_k \le n} P(A_{i_1} \cap \cdots \cap A_{i_k}).$$ Then for odd $k$ in $\{1,\ldots,n\}$, $$P\left(\bigcup_{i=1}^n A_i\right) \le \sum_{j=1}^{k}(-1)^{j-1} S_j,$$ and for even $k$ in $\{2,\ldots,n\}$, $$P\left(\bigcup_{i=1}^n A_i\right) \ge \sum_{j=1}^{k}(-1)^{j-1}S_j.$$
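As a sanity check (not part of the original question), the bounds can be verified by brute force on a small finite sample space with the uniform measure; the events below are arbitrary illustrative choices:

```python
from itertools import combinations

# Small finite sample space with uniform measure; events chosen arbitrarily.
omega = range(10)
P = lambda E: len(E) / len(omega)
A = [{0, 1, 2, 3}, {2, 3, 4, 5}, {4, 5, 6, 7}, {1, 3, 5, 7, 9}]
n = len(A)

union = set().union(*A)

def S(j):
    """S_j = sum of P(A_{i1} ∩ ... ∩ A_{ij}) over all j-subsets of indices."""
    return sum(P(set.intersection(*[A[i] for i in idx]))
               for idx in combinations(range(n), j))

for k in range(1, n + 1):
    partial = sum((-1) ** (j - 1) * S(j) for j in range(1, k + 1))
    if k % 2 == 1:                       # odd k: partial sum is an upper bound
        assert P(union) <= partial + 1e-12
    else:                                # even k: partial sum is a lower bound
        assert P(union) >= partial - 1e-12
print("All Bonferroni bounds hold; P(union) =", P(union))
```

At $k=n$ the partial sum equals $P\left(\bigcup_i A_i\right)$ exactly, which is the inclusion-exclusion principle.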

More details on the Bonferroni inequalities and Boole's inequality can be found here.

  • 2
    isn't this the inclusion-exclusion principle? With an odd number of terms you overcount the probability; with an even number of terms you undercount. – cactus314 Oct 07 '12 at 15:24

4 Answers

6

A proof is there. The main idea is that this is the integrated version of analogous pointwise inequalities and that, for every $k$, $$ S_k=\mathbb E\left({T\choose k}\right),\qquad T=\sum_{i=1}^n\mathbf 1_{A_i}. $$ Hence the result follows from the stronger inequalities asserting that, for every positive integer $N$, $$ \sum_{i=0}^k(-1)^ia_i,\qquad a_i={N\choose i}, $$ is nonnegative when $k$ is even and nonpositive when $k$ is odd. In turn, this fact follows from the properties that the sequence $(a_i)_{0\leqslant i\leqslant N}$ is unimodal and that $\sum\limits_{i=0}^N(-1)^ia_i=0$.
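The key identity $S_k=\mathbb E{T\choose k}$ can be checked numerically on a small example (the events below are arbitrary, not taken from the answer):

```python
from itertools import combinations
from math import comb

# Arbitrary events over an 8-point sample space with uniform measure.
omega = range(8)
A = [{0, 1, 2}, {1, 2, 3, 4}, {2, 4, 5}, {0, 2, 6}]
n = len(A)

# T(w) = number of events A_i containing the point w
T = {w: sum(w in Ai for Ai in A) for w in omega}

for k in range(1, n + 1):
    # S_k computed directly from the definition
    S_k = sum(len(set.intersection(*[A[i] for i in idx])) / len(omega)
              for idx in combinations(range(n), k))
    # E[C(T, k)] under the uniform measure
    E_binom = sum(comb(T[w], k) for w in omega) / len(omega)
    assert abs(S_k - E_binom) < 1e-12
print("S_k = E[C(T, k)] verified for k = 1 ..", n)
```

This is the pointwise statement in the answer: the number of $k$-subsets of indices whose intersection contains $\omega$ is exactly ${T(\omega)\choose k}$.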

Did
  • 279,727
5

The Bonferroni inequalities are closely related to partial sums of alternating binomial coefficients.


Let's consider an element $w$ in the sample space and literally count it on the left-hand side and right-hand side of the inequality. If $w$ belongs to none of $A_1$ to $A_n$, then it is not counted in $\bigcup_{i=1}^n A_i$, and it's not counted in any $A_i$, any $A_i\cap A_j$, ..., or any $A_{i_1}\cap A_{i_2}\cap\cdots\cap A_{i_k}$.


If $w$, however, is contained in $r$ sets from $\{A_1, A_2, \ldots, A_n\}$, let's just say $w$ lies in $A_1, \ldots , A_r$. Then $w$ is counted exactly once in $\bigcup_{i=1}^n A_i$ (LHS), and counted ${r\choose 1}-{r\choose 2}+ \ldots +(-1)^{k-1}{r\choose k}$ times on the right-hand side, with the convention ${r\choose j}=0$ for $j\gt r$. Now, let's compare the counts on both sides.

  1. If $k=r$, using the binomial theorem to expand $(1-1)^r$, we have $1={r\choose 1}-{r\choose 2}+ \ldots +(-1)^{r-1}{r\choose r}$. That is, $w$ is counted the same number of times on both sides.
  2. If $k<r$, again compare $1$ (LHS) and ${r\choose 1}-{r\choose 2}+ \ldots +(-1)^{k-1}{r\choose k}$ (RHS). Specifically, let's examine the difference $f(k) = 1-\left[{r\choose 1}-{r\choose 2}+ \ldots +(-1)^{k-1}{r\choose k}\right]=\sum_{j=0}^k(-1)^j{r\choose j}$ instead. In fact, $f(k)$ is the partial sum of alternating binomial coefficients and has the closed form $(-1)^k {r-1 \choose k}$. This can be easily proved by induction and Pascal's rule, see here. Now, when $k$ is odd, $f(k)$ is negative and hence $w$ is counted more times on the RHS; when $k$ is even, $f(k)$ is positive and hence $w$ is counted fewer times on the RHS.

In summary, for odd $k$, $w$ is counted equally many or more times on the RHS, and hence the sum of the first $k$ terms on the RHS is an upper bound of $P\left(\bigcup_{i=1}^n A_i\right)$; for even $k$, $w$ is counted equally many or fewer times on the RHS, and hence the sum of the first $k$ terms on the RHS is a lower bound of $P\left(\bigcup_{i=1}^n A_i\right)$. The alternating partial sum of binomial coefficients results in the alternating Bonferroni bounds.
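The closed form for the alternating partial sum used above is easy to verify numerically (a standalone sketch, not part of the answer):

```python
from math import comb

# f(k) = sum_{j=0}^{k} (-1)^j * C(r, j); claimed closed form: (-1)^k * C(r-1, k).
for r in range(1, 12):
    for k in range(r + 1):
        partial = sum((-1) ** j * comb(r, j) for j in range(k + 1))
        assert partial == (-1) ** k * comb(r - 1, k)
print("Closed form verified for all r < 12")
```

Note that $f(r)=(-1)^r{r-1\choose r}=0$, recovering case 1 ($k=r$) above.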

Nicholas
  • 363
4

Here is a self-contained proof that expands on @Did's remarks.

The assertion is that $\Delta_k\le0$ when $k$ is odd, and $\Delta_k\ge0$ when $k$ is even, where $$ \Delta_k:=P\left(\bigcup_{i=1}^n A_i\right) +\sum_{j=1}^k(-1)^j S_j.\tag1 $$ To prove this, first observe that $S_j$ is the expected value of $$ \sum_{i_1 < \cdots <i_j} I(A_{i_1}\cap\cdots\cap A_{i_j}) = {T \choose j}\tag2 $$ where $I(\cdot)$ denotes an indicator random variable and $T$ is the integer-valued random variable $T:=\sum_{i=1}^n I(A_i)$. The reason is that $I(A_{i_1}\cap\cdots\cap A_{i_j})(\omega)=1$ if and only if $\omega$ belongs to each of the $j$ sets $A_{i_1},\ldots,A_{i_j}$. Thus the LHS of (2) counts the number of ways to select $j$ different $A$'s to which $\omega$ belongs, and so does the RHS of (2). (We follow the convention ${a \choose b}=0$ when $a<b$, so (2) holds even when $T<j$.)

From (2) we see that $\Delta_k$ is the expected value of $$ I(\cup A_i) +\sum_{j=1}^k (-1)^j {T\choose j} \stackrel{(3a)}=I(\cup A_i)\left[\sum_{j=0}^k(-1)^j {T\choose j}\right] \stackrel{(3b)}=I(\cup A_i)\left[(-1)^k{T-1\choose k}\right].\tag3 $$ To justify equality (3a), consider the cases $\omega\in\cup A_i$ and $\omega\not\in\cup A_i$. For (3b) we apply (pointwise) an identity about the truncated sum of alternating binomial coefficients. From this last expression we conclude that (3) is a non-positive random variable when $k$ is odd, and a non-negative random variable when $k$ is even, which implies the claimed result.

As a bonus, plug $k=n$ in (3). Since $T\le n$, the bracketed quantity will be zero, which implies $\Delta_n=0$, which is the inclusion-exclusion principle.

grand_chat
  • 38,951
1

First, let us prove a related numerical lemma.

Lemma: Let $n\in \mathbb N$, let $x_1,\dots,x_n$ be real numbers between $0$ and $1$, and let $m$ be a positive integer for which $m\le n$. For integers $k,r$ such that $1\le k\le r$, let $e^r_k$ denote the $k^\text{th}$ elementary symmetric polynomial in $r$ variables evaluated at the first $r$ numbers $x_1,\dots,x_r$. That is, $$ e^r_k=\sum_{1\le i_1<i_2<\dots<i_k\le r}x_{i_1}x_{i_2}\cdots x_{i_k}$$ Furthermore, define $e^r_0=1$, and $e^r_{-1}=0$ for any $r\ge 0$. Then $$(-1)^m\prod_{i=1}^n(1-x_i)\le (-1)^m\sum_{k=0}^m (-1)^k e^n_k$$

Proof: We prove this by induction on $n$. In the base case $n=1$ we must have $m=1$, and both sides equal $-(1-x_1)$, so the inequality holds with equality. For the induction step:

\begin{align}(-1)^m\prod_{i=1}^n (1-x_i) &= (-1)^m\prod_{i=1}^{n-1}(1-x_i)+(-1)^{m-1}x_n\prod_{i=1}^{n-1}(1-x_i)\\ &\le (-1)^m\sum_{k=0}^m (-1)^ke_{k}^{n-1}+(-1)^{m-1}x_n \sum_{k=0}^{m-1}(-1)^ke_{k}^{n-1}\\ &= (-1)^m\sum_{k=0}^m (-1)^k\big(e_{k}^{n-1}+x_n e_{k-1}^{n-1}\big)\\ &= (-1)^m\sum_{k=0}^m (-1)^ke_{k}^{n}\end{align} For the second step, we apply the induction hypothesis twice. In the last step, we use the rule $$e^{n}_k=e^{n-1}_k+x_{n}\cdot e^{n-1}_{k-1},$$ which is analogous to Pascal's rule, and is proven in the same way; take the summands defining $e^n_k$, and split them into groups, based on whether they have $x_n$ as a factor.
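The Pascal-like rule used in the last step can be spot-checked numerically; the helper `e` below is an ad hoc implementation of the elementary symmetric polynomials:

```python
from itertools import combinations
from math import prod
import random

def e(k, xs):
    """k-th elementary symmetric polynomial of the numbers xs (e_0 = 1)."""
    if k == 0:
        return 1.0
    return sum(prod(c) for c in combinations(xs, k))

random.seed(0)
xs = [random.random() for _ in range(6)]     # arbitrary x_i in [0, 1]

# Pascal-like rule: e^n_k = e^{n-1}_k + x_n * e^{n-1}_{k-1}
for k in range(1, len(xs) + 1):
    lhs = e(k, xs)
    rhs = e(k, xs[:-1]) + xs[-1] * e(k - 1, xs[:-1])
    assert abs(lhs - rhs) < 1e-12
print("Recurrence verified")
```

The grouping argument in the proof corresponds exactly to the split in `rhs`: monomials without the factor $x_n$ versus monomials containing it.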


With this lemma, the Bonferroni inequalities are easy to derive. Let $X_i={\bf 1}(A_i)$ be the indicator random variable for $A_i$. From the Lemma, $$ (-1)^m\prod_{i=1}^n (1-X_i)\le (-1)^m \sum_{k=0}^m (-1)^k e^n_k(X_1,\dots,X_n) $$ If we negate both sides of this inequality, then add $(-1)^m$ to both sides, we get $$ (-1)^m\left[1-\prod_{i=1}^n (1-X_i)\right]\ge (-1)^m\sum_{k=\color{red}1}^m(-1)^{k-1}e^n_k(X_1,\dots,X_n), $$ since $e^n_0(X_1,\dots,X_n)=1$. Finally, take the expected value of both sides.

  • On the LHS, note that $\left[1-\prod_{i=1}^n (1-X_i)\right]$ is exactly the indicator random variable for $\bigcup_{i=1}^n A_i$.

  • On the RHS, it is easy to see that the expected value of $e^n_k(X_1,\dots,X_n)$ is just $S_k$.

Thus, we have proved that $$ (-1)^mP\left(\bigcup_{i=1}^n A_i\right)\ge (-1)^m \sum_{k=1}^m (-1)^{k-1} S_k $$ For each $m$, this is exactly the $m^\text{th}$ Bonferroni inequality; the effect of the $(-1)^m$ is to switch the direction of the inequality when $m$ is odd.
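The lemma itself can also be spot-checked on random inputs (a sketch only; the helper `e` is an ad hoc implementation of the elementary symmetric polynomials):

```python
from itertools import combinations
from math import prod
import random

def e(k, xs):
    """k-th elementary symmetric polynomial of the numbers xs (e_0 = 1)."""
    return 1.0 if k == 0 else sum(prod(c) for c in combinations(xs, k))

random.seed(1)
for trial in range(200):
    n = random.randint(1, 7)
    xs = [random.random() for _ in range(n)]     # x_i in [0, 1]
    lhs_prod = prod(1 - x for x in xs)
    for m in range(1, n + 1):
        sign = (-1) ** m
        # Lemma: (-1)^m * prod(1 - x_i) <= (-1)^m * sum_{k=0}^{m} (-1)^k e_k
        alt_sum = sum((-1) ** k * e(k, xs) for k in range(m + 1))
        assert sign * lhs_prod <= sign * alt_sum + 1e-12
print("Lemma verified on random inputs")
```

At $m=n$ the two sides agree, since $\prod_{i=1}^n(1-x_i)=\sum_{k=0}^n(-1)^k e^n_k$.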

Mike Earnest
  • 75,930