Prove $P(A) = \sum_{i=1}^{n}P(A_i) - 2\sum_{i

Question

Let $A$ be the collection of outcomes which belong to only one event among events $A_1, \ldots, A_n$. Prove

\begin{align*} P(A) &= \sum_{i=1}^{n}P(A_i) - 2\sum_{i<j \leq n}P(A_i \cap A_j) + 3\sum_{i<j<k \leq n}P(A_i \cap A_j \cap A_k) - \ldots \\ &\qquad \ldots + (-1)^{n-1} n P(\bigcap_{i=1}^n A_i) \end{align*}

This is an exercise from my course. The expression looks very similar to the inclusion-exclusion formula. I tried induction, but it didn't work.
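Before attempting a proof, the claimed identity can be sanity-checked by brute force on a small example. The sketch below is only an illustration: it assumes a finite sample space with the uniform probability, and the names `prob`, `exactly_one_direct` and `exactly_one_formula` are made up for this check.

```python
from fractions import Fraction
from itertools import combinations


def prob(event, omega):
    """Uniform probability of `event` on the finite sample space `omega` (an assumption)."""
    return Fraction(len(event), len(omega))


def exactly_one_direct(events, omega):
    """P(A), found by listing the outcomes that lie in exactly one event."""
    hits = {w for w in omega if sum(w in e for e in events) == 1}
    return prob(hits, omega)


def exactly_one_formula(events, omega):
    """Right-hand side: sum over r of (-1)^(r-1) * r * (sum of r-fold intersection probabilities)."""
    total = Fraction(0)
    for r in range(1, len(events) + 1):
        s_r = sum(prob(set.intersection(*combo), omega)
                  for combo in combinations(events, r))
        total += (-1) ** (r - 1) * r * s_r
    return total


omega = set(range(6))
events = [{0, 1, 2}, {2, 3}, {3, 4, 5}]
print(exactly_one_direct(events, omega), exactly_one_formula(events, omega))
```

Here the outcomes in exactly one event are $0,1,4,5$, so both values are $2/3$. A check like this is of course not a proof, only evidence that the coefficients $1,-2,3,\ldots$ are the right ones.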

score 2 · Answer 1 · answered Oct 04 '19 at 16:29

The formula given in the question generalizes the identity $$ P(A_1\Delta A_2)=P(A_1)+P(A_2)-2P(A_1\cap A_2) $$ where $\Delta$ is the symmetric difference. To prove it in general, we proceed as follows. Write the indicator function for $A$ as $$ I(A)=\sum_{j=1}^nI(A_j)\left(\prod_{i=1, i\neq j}^n (1-I(A_i))\right).\tag{0} $$ Expand the right hand side and consider what happens when you take the expectation of both sides. For example, the term $\sum_{i=1}^n P(A_i)$ in your sum arises after taking expectations of the terms involving only $I(A_j)$ on the rhs of (0). Moreover, the term $-2\sum_{1\leq i<j\leq n}P(A_i\cap A_j)$ arises after taking expectations of the terms $-I(A_j)I(A_i)$ for $i\neq j$; the factor $2$ appears because each unordered pair $\{i,j\}$ contributes one such term from the $i$-th summand and one from the $j$-th. You can continue to analyze the sum in this way.
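The pointwise identity behind this argument can be checked mechanically. In the sketch below (function names are hypothetical), a tuple `x` of zeros and ones stands for the values $I(A_1)(\omega),\ldots,I(A_n)(\omega)$ at a fixed outcome; the code compares the indicator of "exactly one", the product form (0), and the expanded form in which each product of $r$ distinct indicators carries the coefficient $(-1)^{r-1}r$.

```python
from itertools import combinations, product
from math import prod


def exactly_one_indicator(x):
    """Indicator of 'exactly one coordinate of x equals 1'."""
    return 1 if sum(x) == 1 else 0


def rhs_product_form(x):
    """Right-hand side of (0): sum_j x_j * prod_{i != j} (1 - x_i)."""
    n = len(x)
    return sum(x[j] * prod(1 - x[i] for i in range(n) if i != j)
               for j in range(n))


def rhs_expanded(x):
    """Expanded form: sum_r (-1)^(r-1) * r * (sum of products of r distinct coordinates)."""
    n = len(x)
    return sum((-1) ** (r - 1) * r * sum(prod(c) for c in combinations(x, r))
               for r in range(1, n + 1))


def check(n):
    """True if all three expressions agree on every 0/1 vector of length n."""
    return all(exactly_one_indicator(x) == rhs_product_form(x) == rhs_expanded(x)
               for x in product((0, 1), repeat=n))


print(all(check(n) for n in range(1, 7)))
```

The coefficient $r$ reflects that a product of $r$ distinct indicators shows up once for each of the $r$ possible choices of the distinguished index $j$ in (0).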

score 1 · Answer 2 · edited Oct 15 '19 at 20:10

We have a more general result. Let $A_1,A_2,\ldots,A_n$ be measurable sets in a finite measure space $(\Omega,\mathcal{F},P)$. For an integer $k$, $0\le k\le n$, let $E_k$ denote the event consisting of $x\in \Omega$ such that $x$ belongs to exactly $k$ sets among $A_1,A_2,\ldots,A_n$. Then $E_k\in\mathcal{F}$ and $$P(E_k)=\sum_{r=k}^n(-1)^{r-k}\binom{r}{k}\sum_{1\le i_1<i_2<\ldots<i_r\le n}P\left(A_{i_1}\cap A_{i_2}\cap \ldots \cap A_{i_r}\right).\ \ \ \ \ (0)$$ For $k=0$, the sum $\sum_{1\le i_1<i_2<\ldots<i_r\le n}P\left(A_{i_1}\cap A_{i_2}\cap \ldots \cap A_{i_r}\right)$ for $r=0$ is to be interpreted as $P(\Omega)$.

In particular when $\Omega$ is a finite set and $P$ is the counting measure, (0) can be rewritten as $$|E_k|=\sum_{r=k}^n(-1)^{r-k}\binom{r}{k}\sum_{1\le i_1<i_2<\ldots<i_r\le n}\left|A_{i_1}\cap A_{i_2}\cap \ldots \cap A_{i_r}\right|.$$ Again, for $k=0$ and $r=0$, we use the convention $$|\Omega|=\sum_{1\le i_1<i_2<\ldots<i_r\le n}\left|A_{i_1}\cap A_{i_2}\cap \ldots \cap A_{i_r}\right|.$$
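The counting version is easy to test on concrete sets. The following is a minimal sketch with made-up function names; it evaluates the formula for $|E_k|$, using the convention that the $r=0$ term equals $|\Omega|$, and compares it with a direct count.

```python
from itertools import combinations
from math import comb


def exactly_k_direct(sets, universe, k):
    """Number of elements of `universe` lying in exactly k of the given sets."""
    return sum(1 for x in universe if sum(x in s for s in sets) == k)


def exactly_k_formula(sets, universe, k):
    """sum_{r=k}^{n} (-1)^(r-k) * C(r, k) * (sum of sizes of all r-fold intersections)."""
    n = len(sets)
    total = 0
    for r in range(k, n + 1):
        if r == 0:
            inner = len(universe)  # convention: the empty intersection is the whole universe
        else:
            inner = sum(len(set.intersection(*c)) for c in combinations(sets, r))
        total += (-1) ** (r - k) * comb(r, k) * inner
    return total


universe = set(range(10))
sets = [{0, 1, 2, 3}, {2, 3, 4, 5}, {3, 5, 6}]
print([exactly_k_formula(sets, universe, k) for k in range(4)])
print([exactly_k_direct(sets, universe, k) for k in range(4)])
```

For this example both lines print `[3, 4, 2, 1]`: three elements lie in none of the sets, four in exactly one, two in exactly two, and one in all three.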

For a proof, let $\chi_S$ denote the characteristic function of $S\in \mathcal{F}$, so that $P(S)=\int\chi_S dP$. By writing $$E_k=\left(\bigcup_{1\le i_1<i_2<\ldots<i_k\le n}\bigcap_{j=1}^kA_{i_j}\right)\setminus\left(\bigcup_{1\le i_0<i_1<i_2<\ldots<i_k\le n}\bigcap_{j=0}^kA_{i_j}\right),$$ it follows that $E_k\in\mathcal{F}$. Here when $k=0$, we use the convention $$\bigcup_{1\le i_1<i_2<\ldots<i_k\le n}\bigcap_{j=1}^kA_{i_j}=\Omega.$$ We want to verify that $$\chi_{E_k}=\sum_{r=k}^n(-1)^{r-k}\binom{r}{k}\sum_{1\le i_1<i_2<\ldots<i_r\le n}\chi_{A_{i_1}\cap A_{i_2}\cap\ldots\cap A_{i_r}},\ \ \ \ \ (1)$$ where we interpret $\sum_{1\le i_1<i_2<\ldots<i_r\le n}\chi_{A_{i_1}\cap A_{i_2}\cap\ldots\cap A_{i_r}}$ when $k=0$ and $r=0$ as $\chi_\Omega=1$.

We first treat the case $k=0$. Recall that $$1-\chi_X=\chi_\Omega-\chi_X=\chi_{\Omega\setminus X}=\chi_{X^c}$$ and $$\chi_{X_1\cap X_2\cap \ldots \cap X_m}=\chi_{X_1}\ \chi_{X_2}\ \cdots \ \chi_{X_m}.$$ The case $k=0$ is easy, as expanding the product shows that the RHS of (1) is precisely $$\prod_{j=1}^n(1-\chi_{A_j})=\prod_{j=1}^n \chi_{A_j^c}=\chi_{\bigcap_{j=1}^nA_j^c}=\chi_{\left(\bigcup_{j=1}^nA_j\right)^c}=\chi_{E_0}.$$

For general $k$, fix $x\in \Omega$. Suppose that $x$ is in precisely $\ell$ sets among $A_1,A_2,\ldots,A_n$. Then it follows that $$\sum_{1\le i_1<i_2<\ldots <i_r\le n}\chi_{A_{i_1}\cap A_{i_2}\cap\ldots\cap A_{i_r}}(x)=\binom{\ell}{r} .$$ Therefore $$\sum_{r=k}^n(-1)^{r-k}\binom{r}{k}\sum_{1\le i_1<i_2<\ldots <i_r\le n}\chi_{A_{i_1}\cap A_{i_2}\cap\ldots\cap A_{i_r}}(x) = \sum_{r=k}^n (-1)^{r-k}\binom{r}{k}\binom{\ell}{r} = \left\{\begin{array}{ll}0&\text{if }\ell<k\\ \sum_{r=k}^\ell(-1)^{r-k}\binom{r}{k}\binom{\ell}{r}&\text{if }\ell\ge k.\end{array}\right.$$ So when $\ell<k$, both sides of (1) vanish at $x$, and (1) holds there. Let now $\ell\ge k$. Using $\binom{r}{k}\binom{\ell}{r}=\binom{\ell}{k}\binom{\ell-k}{r-k}$ we get $$\sum_{r=k}^\ell(-1)^{r-k}\binom{r}{k}\binom{\ell}{r}=\binom{\ell}{k}\sum_{r=k}^{\ell}(-1)^{r-k}\binom{\ell-k}{r-k}=\binom{\ell}{k}\sum_{j=0}^{\ell-k}(-1)^j\binom{\ell-k}{j}.\ \ \ \ \ (2)$$ It is well known that $\sum_{j=0}^m(-1)^j\binom{m}{j}=1$ for $m=0$, and $\sum_{j=0}^m(-1)^j\binom{m}{j}=(1-1)^m=0$ for $m>0$. Therefore, (2) gives $$\sum_{r=k}^\ell(-1)^{r-k}\binom{r}{k}\binom{\ell}{r}=\left\{\begin{array}{ll}1&\text{if } \ell=k\\0&\text{if }\ell>k.\end{array}\right.$$ Hence (1) evaluated at $x$ also yields the correct result when $\ell \ge k$. This shows that (1) is true.
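The binomial cancellation used in this step can be confirmed numerically for small parameters. The sketch below (the helper name `alternating_sum` is an assumption) evaluates $\sum_{r=k}^n(-1)^{r-k}\binom{r}{k}\binom{\ell}{r}$ and checks that it equals $1$ when $\ell=k$ and $0$ otherwise.

```python
from math import comb


def alternating_sum(n, ell, k):
    """sum_{r=k}^{n} (-1)^(r-k) * C(r, k) * C(ell, r); math.comb returns 0 when r > ell."""
    return sum((-1) ** (r - k) * comb(r, k) * comb(ell, r)
               for r in range(k, n + 1))


n = 8
ok = all(alternating_sum(n, ell, k) == (1 if ell == k else 0)
         for ell in range(n + 1)
         for k in range(n + 1))
print(ok)
```

The check covers both regimes of the proof: for $\ell<k$ every term vanishes because $\binom{\ell}{r}=0$, and for $\ell\ge k$ the sum collapses via $(1-1)^{\ell-k}$.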

By integrating (1) we get $$P(E_k)=\int \chi_{E_k}dP=\int\left(\sum_{r=k}^n(-1)^{r-k}\binom{r}{k}\sum_{1\le i_1<i_2<\ldots<i_r\le n}\chi_{A_{i_1}\cap A_{i_2}\cap\ldots\cap A_{i_r}}\right)dP.$$ By linearity of integration, $$P(E_k)=\sum_{r=k}^n(-1)^{r-k}\binom{r}{k}\sum_{1\le i_1<i_2<\ldots<i_r\le n}\int\chi_{A_{i_1}\cap A_{i_2}\cap\ldots\cap A_{i_r}}dP,$$ which is precisely (0).

score 0 · Answer 3 · answered Oct 04 '19 at 16:36

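This can be done by induction. Assume the formula holds for any $n$ sets $A_1, \ldots, A_n$, and add an extra set $A_{n+1}$. Let $A$ be the set of outcomes lying in exactly one of $A_1,\ldots,A_n$, and let $A'$ be the set of outcomes lying in exactly one of $A_1,\ldots,A_{n+1}$.

By the induction hypothesis, $P(A)$ equals the sum in the question. Since $A'$ is the disjoint union of $A\setminus A_{n+1}$ and $A_{n+1}\setminus \bigcup_{i=1}^nA_i$, adding the set $A_{n+1}$ adds its probability $P(A_{n+1})$ but subtracts $P\left(A_{n+1}\cap \bigcup_{i=1}^nA_i\right)$ and $P(A\cap A_{n+1})$.

$P(A\cap A_{n+1})$ can be calculated by considering the sets $B_i=A_i\cap A_{n+1}$ for $i\in \{1, \ldots, n\}$. The event $A\cap A_{n+1}$ consists of the outcomes lying in exactly one of the $B_i$, so the induction hypothesis applied to $B_1,\ldots,B_n$ gives $P(A\cap A_{n+1})=P(B_1)+P(B_2)+\cdots+P(B_n)-2P(B_1\cap B_2)-\cdots$.

Now we are left to note that $P\left(A_{n+1}\cap \bigcup_{i=1}^nA_i\right)=P(A_1\cap A_{n+1})+\cdots +P(A_n\cap A_{n+1})-P(A_1\cap A_2\cap A_{n+1})-\cdots$, which is just standard inclusion-exclusion. Adding the coefficients of the two expansions ($r$ from the first and $1$ from the second on each $r$-fold intersection of the $B_i$, with the same sign) gives the coefficient $r+1$ on each intersection of $r+1$ of the sets $A_1,\ldots,A_{n+1}$ that involves $A_{n+1}$.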
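The set-level bookkeeping in the induction step can be checked on an example. The sketch below uses hypothetical names and a counting measure on a small universe; it writes $A$ for the outcomes in exactly one of the old sets and $A'$ for the outcomes in exactly one of the old sets together with the new one, and verifies that $A'=(A\setminus A_{n+1})\cup\left(A_{n+1}\setminus\bigcup_i A_i\right)$ and that $A\cap A_{n+1}$ is the "exactly one" event for the sets $B_i=A_i\cap A_{n+1}$.

```python
def exactly_one(sets, universe):
    """Elements of `universe` lying in exactly one of the given sets."""
    return {x for x in universe if sum(x in s for s in sets) == 1}


def check_induction_step(old_sets, new_set, universe):
    """Verify the decomposition used in the induction step on concrete sets."""
    a = exactly_one(old_sets, universe)
    a_prime = exactly_one(old_sets + [new_set], universe)
    union_old = set().union(*old_sets)
    # A' splits into the part of A outside the new set and the part of the new set outside all old sets.
    assert a_prime == (a - new_set) | (new_set - union_old)
    # Counting version of P(A') = P(A) + P(A_{n+1}) - P(A n A_{n+1}) - P(A_{n+1} n union of old sets).
    assert len(a_prime) == len(a) + len(new_set) - len(a & new_set) - len(new_set & union_old)
    # A n A_{n+1} is the 'exactly one' event for B_i = A_i n A_{n+1}.
    b_sets = [s & new_set for s in old_sets]
    assert a & new_set == exactly_one(b_sets, universe)
    return True


universe = set(range(10))
old_sets = [{0, 1, 2, 3}, {2, 3, 4, 5}, {3, 5, 6}]
print(check_induction_step(old_sets, {1, 2, 6, 7}, universe))
```

In this example $A=\{0,1,4,6\}$ and $A'=\{0,4,7\}$, and the count reads $3=4+4-2-3$.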

robjohn · Answer 4 · 2019-10-05T23:19:46.380

This is precisely the $k=1$ case of the following

Theorem (Generalized Inclusion-Exclusion Principle)

Let $\{S(i)\}_{i=1}^m$ be a finite collection of sets from a finite universe.

Let $N(j)$ be the sum of the sizes of all intersections of $j$ of the $S(i)$: $$ N(j)=\sum_{|A|=j}\left|\,\bigcap_{i\in A} S(i)\,\right| $$ Thus, $N(0)$ is the size of the universe.

Then, the number of elements in exactly $k$ of the $S(i)$ is $$ \sum_{j=0}^m(-1)^{j-k}\binom{j}{k}N(j) $$

After showing the cancellation lemma $$ \sum_{j=k}^n(-1)^{j-k}\binom{n}{j}\binom{j}{k} =[n=k] $$ where $[\dots]$ are Iverson Brackets, the proof in this answer is only a few lines long.

score 0 · Answer 5 · answered Oct 05 '19 at 13:53

Preliminary. Let $(A_{i})_{i \in I}$ be a finite family of events and let $B$ be an event. Applying the Inclusion-Exclusion Principle to the events $A_{i}\cap B$, we get

\begin{align*} P\left(\cup_{i\in I} A_i \cap B \right) = \sum_{\substack{J \subseteq I \\ J \neq \varnothing}} (-1)^{|J|-1} P\left(\cap_{j \in J} A_j \cap B\right). \end{align*}

Now by noting that $P\left(B \setminus \cup_{i\in I} A_i\right) = P(B) - P\left(\cup_{i\in I} A_i \cap B \right)$, this implies

\begin{align*} P\left(B \setminus \cup_{i\in I} A_i\right) = \sum_{J \subseteq I} (-1)^{|J|} P\left(\cap_{j \in J} A_j \cap B\right) \tag{1} \end{align*}

Alternatively, $\text{(1)}$ can be proved directly from $P\left(B \setminus \cup_{i\in I} A_i\right) = E\left[ \mathbf{1}_{B} \prod_{i\in I}( 1 - \mathbf{1}_{A_i} )\right]$.
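Identity $\text{(1)}$ can be tested numerically before it is used. The sketch below assumes a uniform probability on a small finite sample space, and `check_identity_1` is a made-up helper name; it returns both sides of $\text{(1)}$ as exact fractions.

```python
from fractions import Fraction
from itertools import combinations


def check_identity_1(a_sets, b, omega):
    """Return (lhs, rhs) of P(B minus union of A_i) = sum over J of (-1)^|J| * P(B n intersection of A_j, j in J)."""
    def prob(event):
        # Uniform probability on omega (an assumption made for this illustration).
        return Fraction(len(event), len(omega))

    lhs = prob(b - set().union(*a_sets))
    rhs = Fraction(0)
    for size in range(len(a_sets) + 1):
        for chosen in combinations(a_sets, size):
            # With no sets chosen (J empty), the intersection is B itself.
            rhs += (-1) ** size * prob(set(b).intersection(*chosen))
    return lhs, rhs


omega = set(range(10))
a_sets = [{0, 1, 2, 3}, {2, 3, 4, 5}, {3, 5, 6}]
b = {1, 2, 6, 7, 8}
print(check_identity_1(a_sets, b, omega))
```

Here $B\setminus\bigcup_i A_i=\{7,8\}$, and the alternating sum is $(5-4+1-0)/10$, so both sides equal $1/5$.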

Proof. Write $[n] = \{1,\cdots,n\}$ and define the event $E_m$ by

\begin{align*} E_m = \{\text{exactly $m$ out of $A_1,\cdots,A_n$ occur}\} = \bigcup_{\substack{ J \subseteq [n] \\ |J| = m}} \left( ( \cap_{j \in J} A_j ) \setminus ( \cup_{k \in [n]\setminus J} A_k ) \right). \end{align*}

Then OP's case corresponds to $E_1$. Since the union above is disjoint, taking probabilities of both sides gives

\begin{align*} P(E_m) = \sum_{\substack{ J \subseteq [n] \\ |J| = m}} P\left( ( \cap_{j \in J} A_j ) \setminus ( \cup_{k \in [n]\setminus J} A_k ) \right) \stackrel{(1)}{=} \sum_{\substack{ J \subseteq [n] \\ |J| = m}} \sum_{K \subseteq [n]\setminus J} (-1)^{|K|} P\left( \cap_{k \in K \cup J} A_k \right). \end{align*}

Now, for each $I \subseteq [n]$ with $|I| = r \geq m$, there are exactly $\binom{r}{m}$ ways of splitting $I$ into two disjoint sets $J, K$ with $|J| = m$ and $|K| = r-m$, and so, the above sum simplifies to

\begin{align*} P(E_m) &= \sum_{r = m}^{n} (-1)^{r-m}\binom{r}{m} \sum_{\substack{ I \subseteq [n] \\ |I| = r}} P\left( \cap_{i \in I} A_i \right) \end{align*}
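The regrouping that produces the factor $\binom{r}{m}$ can be illustrated by enumeration. The sketch below (with the hypothetical helper `split_counts`) lists every pair $(J,K)$ with $|J|=m$ and $K\subseteq[n]\setminus J$, groups the pairs by $I=J\cup K$, and counts how often each $I$ occurs.

```python
from collections import Counter
from itertools import combinations
from math import comb


def split_counts(n, m):
    """Count, for each I = J | K, the pairs (J, K) with |J| = m and K a subset of [n] minus J."""
    ground = range(1, n + 1)
    counts = Counter()
    for j_set in combinations(ground, m):
        rest = [x for x in ground if x not in j_set]
        for size in range(len(rest) + 1):
            for k_set in combinations(rest, size):
                counts[frozenset(j_set) | frozenset(k_set)] += 1
    return counts


counts = split_counts(5, 2)
print(all(c == comb(len(i_set), 2) for i_set, c in counts.items()))
print(len(counts) == sum(comb(5, r) for r in range(2, 6)))
```

Each $I$ with $|I|=r\ge m$ is hit exactly $\binom{r}{m}$ times, and every such $I$ is hit, which is what turns the double sum over $(J,K)$ into a single sum over $I$.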

The case $m = 1$ is exactly the identity in the question. $\square$
