I'm trying to "explain" a formula. I don't think this counts as a formal proof, because I use a special case of the formula itself while "proving" it, so the tag I chose might need to be edited. Here is the formula, where $E_m$ counts the elements with exactly $m$ properties and $S_j$ is explained in the quoted text below:
$$ E_m=\sum_{j=m}^n(-1)^{j-m}\color{blue}{{j\choose m}}S_j $$
Below I quote the final version of one of my old answers, which I just finished editing. My main concern is the correctness of steps 3. and 4. (though any error you spot is welcome!), because the result is still very counterintuitive to me. Link to the original answer: https://math.stackexchange.com/a/3804796/390226
If my argument works, I would also like to know how to turn this "proof" into a formal one. Thanks in advance!
Terms used in the answer:
- $A_k$: The set of elements that have the property indexed $k$ (and possibly other properties as well).
- IEP for Inclusion-Exclusion Principle.
- Exactly-none IEP: exactly none of the indexed properties got selected via IEP. i.e.
Long answer to understand the two coefficients:
- Let's define $S_m$, a shorthand that simplifies the IEP formula. Assume there are $n$ properties in total and let $I\subseteq\{1,\dots,n\}$ be an index set. Then for any $m\le n$:
$$\begin{align} S_m&=\sum\left|\textrm{an intersection of }m\textrm{ sets}\right|\\ &=\sum_{|I|=m}\left|\bigcap_{j\in I}A_j\right| \end{align}$$
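As a concrete illustration, here is a minimal Python sketch that computes $S_m$ by brute force. The helper name `s` and the example sets (multiples of 2, 3 and 5 inside $\{1,\dots,10\}$) are assumptions made up for this illustration; they are not part of the formula.

```python
from itertools import combinations

def s(sets, universe, m):
    """S_m: sum of the sizes of all m-fold intersections (the empty intersection is the universe)."""
    total = 0
    for idx in combinations(range(len(sets)), m):
        inter = set(universe)
        for i in idx:
            inter &= sets[i]
        total += len(inter)
    return total

# Hypothetical example: A_1, A_2, A_3 = multiples of 2, 3, 5 in {1, ..., 10}
universe = set(range(1, 11))
A = [{x for x in universe if x % d == 0} for d in (2, 3, 5)]

s_values = [s(A, universe, m) for m in range(len(A) + 1)]
print(s_values)  # [S_0, S_1, S_2, S_3]
```

Here $S_1=5+3+2$ adds the three set sizes, and $S_2$ adds the sizes of the three pairwise intersections $\{6\}$, $\{10\}$ and $\emptyset$.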
- $E_0$ gives the exactly-none IEP (let $S_0=|U|$, the size of the universe $U$):
$$\begin{align}E_0 &=\sum_{j=0}^n(-1)^{j-0}{j\choose 0}S_j\\ &=\sum_{j=0}^{n}(-1)^{j}S_j\\ &=\left|\bigcap_{j=1}^n\bar{A_j}\right|.\\ \end{align}$$
For each $S_j$, the coefficient is $(-1)^j$:
$$\begin{align} E_0&=\sum_{j=m}^n(-1)^{j-m}{j\choose m}S_j, m=0\\ &=\sum_{j=0}^n(-1)^{j-0}{j\choose 0}S_j\\ &=\sum_{j=0}^{n}(-1)^{j}S_j. \end{align} $$
The ${j\choose m}$ disappears in $E_0$, but $E_m$ has more than that. So maybe there is more than one exactly-none IEP inside $E_m$?
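The $E_0$ case can be checked numerically. The following is only a sketch on one hypothetical example (multiples of 2, 3 and 5 inside $\{1,\dots,10\}$); the helper `s` is an illustrative name, and the check compares the alternating sum with a direct count of the elements that have none of the properties.

```python
from itertools import combinations

def s(sets, universe, j):
    """S_j: sum of the sizes of all j-fold intersections, with S_0 = |U|."""
    total = 0
    for idx in combinations(range(len(sets)), j):
        inter = set(universe)
        for i in idx:
            inter &= sets[i]
        total += len(inter)
    return total

# Hypothetical example sets, chosen only for illustration
universe = set(range(1, 11))
A = [{x for x in universe if x % d == 0} for d in (2, 3, 5)]

# E_0 via the formula: alternating sum of S_j
e0_formula = sum((-1) ** j * s(A, universe, j) for j in range(len(A) + 1))

# E_0 directly: elements lying in none of the A_k
e0_direct = len(universe - set().union(*A))

print(e0_formula, e0_direct)
```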
- My attempt to explain $E_m$:
$$ E_m=\sum_{j=m}^n(-1)^{j-m}{j\choose m}S_j $$
3.1. Observe the first term, $j=m$. Writing $A_{i_1,i_2,...,i_m}$ for the intersection $A_{i_1}\cap A_{i_2}\cap\dots\cap A_{i_m}$, $S_m$ gives:
$$\begin{align} S_m&=\sum_{|I|=m}\left|\bigcap_{j\in I}A_j\right|\\ &=\left|A_{1,2,...,m}\right| + \left|A_{1,2,...,(m-1),(m+1)}\right| + ... + \left|A_{(n-m+1),(n-m+2),...,n}\right|\\ \end{align}\\ $$
3.2. Now we calculate $E_0$ once for each term $A_{...}$, taking that term as the universe. Writing $E_{0,U}$ for $E_0$ calculated with universe $U$, we have:
$$\begin{align} E_{0,A_{1,2,...,m}}&=\color{blue}{\left|A_{1,2,...,m}\right|} + \sum_{j=m+1}^n(-1)^{j-m}S_j\\ E_{0,A_{1,2,...,(m-1),(m+1)}}&=\color{blue}{\left|A_{1,2,...,(m-1),(m+1)}\right|} + \sum_{j=m+1}^n(-1)^{j-m}S_j\\ \vdots\\ E_{0,A_{(n-m+1),(n-m+2),...,n}}&=\color{blue}{\left|A_{(n-m+1),(n-m+2),...,n}\right|} + \sum_{j=m+1}^n(-1)^{j-m}S_j\\ \end{align} $$
Notice that each $\left|A_{...}\right|$ is the leading term of its own $E_0$, and that their sum is $\color{blue}{S_m}$, i.e. the first term of $E_m$.
You might have many questions at this stage:
Q1:
Might those $A_{...}$ overlap one another?
Yes. But if each $E_{0, A_{\dots}}$ works, there is no overlap in what gets counted, because $E_m$ counts the elements with exactly $m$ properties once the calculation is complete.
Q2:
Each $E_0$, ignoring the signs $(-1)^{j-m}$, has a coefficient of $1$ on each term. Now you're trying to convince me that applying it many times will create ${j\choose m}$, a coefficient that varies from term to term, in the resulting $E_m$? Intuitively, if you apply it, say, $5$ times, shouldn't you get a $5$ on each term?
Yes. I don't know how to explain this yet either.
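A small instance at least shows that the coefficients really do vary with $j$. Take $n=3$ and $m=1$, with the hypothetical example of $A_1,A_2,A_3$ being the multiples of $2$, $3$ and $5$ in $\{1,\dots,10\}$, so that $S_1=5+3+2=10$, $S_2=1+1+0=2$ and $S_3=0$:

$$\begin{align} E_1&={1\choose 1}S_1-{2\choose 1}S_2+{3\choose 1}S_3\\ &=S_1-2S_2+3S_3\\ &=10-2\cdot 2+3\cdot 0=6. \end{align}$$

This agrees with a direct count: exactly the six elements $2,3,4,5,8,9$ have exactly one of the three properties.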
- Now the strangest step, which I can hardly believe myself:
Let's count how many copies of $S_j$ are needed for a given $j$. This is equivalent to counting how many universes contain each intersection of $j$ sets. The count is:
$$ {j\choose m} $$
Given any $j$ properties, each choice of $m$ of them yields one universe, i.e. one left-hand side in the list of 3.2. whose right-hand side includes that intersection. This explains the mysterious ${j\choose m}$ of $E_m$.
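Both the universe count and the full formula can be sanity-checked by brute force. The sketch below is a numerical check on randomly generated sets, not a proof; all names (`s`, `e_formula`, `e_direct`) and the random test data are assumptions introduced only for this illustration.

```python
from itertools import combinations
from math import comb
import random

def s(sets, universe, j):
    """S_j: sum of the sizes of all j-fold intersections, with S_0 = |U|."""
    total = 0
    for idx in combinations(range(len(sets)), j):
        inter = set(universe)
        for i in idx:
            inter &= sets[i]
        total += len(inter)
    return total

def e_formula(sets, universe, m):
    """E_m via the formula: sum over j >= m of (-1)^(j-m) * C(j, m) * S_j."""
    n = len(sets)
    return sum((-1) ** (j - m) * comb(j, m) * s(sets, universe, j)
               for j in range(m, n + 1))

def e_direct(sets, universe, m):
    """E_m by definition: elements belonging to exactly m of the sets."""
    return sum(1 for x in universe if sum(x in a for a in sets) == m)

random.seed(0)  # arbitrary seed, only for reproducibility
universe = set(range(30))
sets = [set(random.sample(sorted(universe), 12)) for _ in range(4)]
n = len(sets)

# Universe count: a fixed set J of j indices contains C(j, m) index sets I of size m
m = 2
universe_counts = [len(list(combinations(range(j), m))) for j in range(m, n + 1)]
binomials = [comb(j, m) for j in range(m, n + 1)]

# Full formula against the direct count, for every m
formula_values = [e_formula(sets, universe, k) for k in range(n + 1)]
direct_values = [e_direct(sets, universe, k) for k in range(n + 1)]
print(universe_counts == binomials, formula_values == direct_values)
```

The first comparison only confirms the counting claim of this step; the second compares the formula against the definition of $E_m$ on this one random instance.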