I'm a beginning student of Probability and Statistics and I've been reading the book Elementary Probability for Applications by Rick Durrett.
In this book, he outlines the four axioms of probability:
- For any event $A$, $0 \leq P (A) \leq 1$.
- If $\Omega $ is the sample space then $P (\Omega) =1$.
- If $A$ and $B$ are disjoint, that is, the intersection $A \cap B = \emptyset$, then $$P(A\cup B) = P(A) + P(B)$$
- If $A_1, A_2,\ldots$, is an infinite sequence of pairwise disjoint events (that is, $A_i\cap A_j = \emptyset$ when $i \neq j $) then $$P\left(\bigcup_{i=1}^\infty A_i\right)=\sum_{i=1}^\infty P(A_i).$$
The book doesn't explain why we need Axiom 4, and I haven't had any luck searching Wikipedia either. I don't understand how we can assign a probability to a union of infinitely many disjoint events. The book states that when you have infinitely many events the argument used for the previous axiom breaks down, so this has to be taken as a new assumption; but it also states that without it the theory of probability becomes useless.
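To check my understanding of what the axiom even says, here is the one concrete decomposition I could come up with myself (I may well be misusing it): flip a fair coin repeatedly and let $A_i$ be the event that the first heads appears on flip $i$. These events are pairwise disjoint with $P(A_i) = 1/2^i$, and the event "a heads eventually appears" is their union, so Axiom 4 would give
$$P\left(\bigcup_{i=1}^\infty A_i\right) = \sum_{i=1}^\infty \frac{1}{2^i} = 1.$$
If that's right, then the axiom is what lets us conclude that a heads appears with probability 1, which Axiom 3 alone doesn't seem to give us.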
I was wondering if there are any intuitive examples of situations where this fourth axiom applies.
Why is it so important for probability theory? And why does the author state that not everyone believes we should use this axiom?