I was trying to understand why it is mathematically justified to multiply edge probabilities in a tree diagram, and I came across the following question:
Why do we multiply in tree diagrams?
The second answer is nearly exactly what I needed; however, there are some things that I still don't understand about it. Or maybe some parts of the answer are just not mathematically rigorous enough for me (or were trivial enough to the poster that the details were omitted), especially in the context of viewing probability through set theory. That is what I wish to address: how to treat probability tree diagrams rigorously with set theory, and thus calculate probabilities on trees in a mathematically justified way.
The main issue I have is that leaves/outcomes are specified with set notation in a sloppy way, which has led to strange justifications for calculating probabilities in tree diagrams. I will try to describe what I think the issues are in detail, in the framework of set theory, to make sure that everything is precisely and clearly defined.
The exact issue I am having is with the notation $\cup$ and $\cap$ being used to describe probabilistic statements. In high school we are taught to think of these as ORs and ANDs. I wish to abandon that mentality (since I think it is one of the reasons for my confusion) and be extremely precise about the use of $\cap$ and $\cup$. Intersection and union are two operations that apply only to sets. I will use them in that way and wish to address their correct use in probability theory.
First, let's try to define "outcome" and "event" precisely and see how they relate to tree diagrams.
An outcome normally means a specific way of specifying the result of an experiment. For example, in the Monty Hall problem we can specify the outcome of the experiment by the following triplet:
- outcome = (car location, player's initial guess, door revealed by the host).
i.e. an outcome is fully specified once we give the location where the car actually is, the player's initial guess, and the door that was revealed by the host. This results in the following tree diagram:
(which I got from MIT's course 6.042, Mathematics for Computer Science). As can be seen, the leaves of the tree are the outcomes, and the set of all leaves is the whole sample space $S$. In these terms, the sample space is the set of all such triples.
Now, an event is a subset of this sample space, i.e. a chosen subset of the leaves.
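To make these two definitions concrete, here is a minimal Python sketch that builds the sample space as a set of triples and picks out one event as a subset of it. The rule that the host never reveals the car or the player's chosen door is my assumption about the game, and the names `sample_space` and `guess_is_correct` are only illustrative.

```python
from itertools import product

DOORS = ("A", "B", "C")

# Each outcome is a triple (car location, player's initial guess, door revealed).
# Assumption: the host never reveals the car or the player's chosen door.
sample_space = {
    (car, guess, revealed)
    for car, guess, revealed in product(DOORS, repeat=3)
    if revealed != car and revealed != guess
}

# An event is any subset of the sample space,
# e.g. "the player's initial guess is the car's location".
guess_is_correct = {w for w in sample_space if w[0] == w[1]}

print(len(sample_space))          # number of leaves in the tree
print(sorted(guess_is_correct))   # the leaves that make up this event
```

Under that assumption the tree has twelve leaves, and $(A, A, B)$ is one of them. Note that the event is an ordinary Python `set` whose elements are outcomes, so set operations on events are well defined here.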
The issue I have is that I have seen the leaves of such a tree diagram denoted as $(A \cap A \cap B)$ (for the first one in my example) instead of $(A, A, B)$. For me, these two are not the same. The second, a triplet, is just a sequence that acts as an "index" to specify a particular outcome in the sample space (i.e. an element of the set $S$). The notation with intersections (i.e. $A \cap A \cap B$) tries to specify a leaf, but it seems plain wrong and confusing to me (or a horrible abuse of notation? I am not sure...). Let me justify why I think it is an incorrect way to specify a leaf:
- Firstly, it is not clear to me what $(A \cap A \cap B)$ even means. To me, it just denotes the empty set, because intersection should only be applied to sets, and nothing in $(A \cap A \cap B)$ actually intersects.
- Secondly, even if you try to "repair" the first issue by insisting that the first, second and third positions are simply events, so that taking their intersection is valid, this still brings problems; I believe reading $(A \cap A \cap B)$ as an intersection of "events" is still wrong. That repair only raises further questions. First, if $A$ is now an event, then what exactly is it a subset of? (That is what an event is. If you are trying to use set notation, you had better specify what the sets and subsets are.) How do we redefine the sample space so that this notation for a leaf is justified? If we could do this, then (maybe) the justification explained in the question I linked might be valid (with further justification).
- If you try to use set notation to specify a leaf, it seems to me that the correct way to do it is with unions, not intersections. The reason is that this would actually lead to the correct meaning of what a triplet specifies (and avoid the empty-set issue from my first point). However, since the order of the elements of a set "doesn't matter" and the triplets are sequences (where the order matters), the way to fix the new problem I have introduced by using unions is to put a subscript on each entry giving its position in the triplet (in effect defining a bijection), i.e. the outcome $(A, A, B)$ corresponds to $\{ A_1 \cup A_2 \cup B_3\}$. Anyway, taking this definition doesn't help that much, because it is not clear to me how to use the general chain rule of probability to justify the probability of a leaf.
Basically, how do you rigorously justify using the chain rule of probability to calculate the probability of a single outcome in a probability tree diagram?
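To pin down what I am asking, here is a sketch of one candidate reading, which I am not claiming is the intended one: each position in the triplet is read as an event that is a subset of $S$ (for example, "all outcomes whose first entry is $A$"). The edge probabilities are assumptions on my part (car placed uniformly, guess uniform, host choosing uniformly among the doors he is allowed to open), and `leaf_weight`, `positional_event` and `prob` are hypothetical helper names.

```python
from fractions import Fraction
from itertools import product

DOORS = ("A", "B", "C")

def leaf_weight(car, guess, revealed):
    """Product of the assumed edge probabilities along the path to one leaf."""
    allowed = [d for d in DOORS if d != car and d != guess]
    if revealed not in allowed:
        return Fraction(0)
    return Fraction(1, 3) * Fraction(1, 3) * Fraction(1, len(allowed))

# Probability of each outcome; triples with weight zero are not in the sample space.
P_leaf = {w: leaf_weight(*w) for w in product(DOORS, repeat=3) if leaf_weight(*w) > 0}

def prob(event):
    """Probability of an event, i.e. a subset of the sample space."""
    return sum(P_leaf[w] for w in event)

def positional_event(position, label):
    """Subset of S made of the outcomes carrying `label` at index `position`."""
    return {w for w in P_leaf if w[position] == label}

E1 = positional_event(0, "A")   # car is behind A
E2 = positional_event(1, "A")   # player initially guesses A
E3 = positional_event(2, "B")   # host reveals B

leaf = E1 & E2 & E3             # a genuine intersection of subsets of S

# Chain rule: P(E1) * P(E2 | E1) * P(E3 | E1 and E2).
chain = prob(E1) * (prob(E1 & E2) / prob(E1)) * (prob(leaf) / prob(E1 & E2))

print(leaf, prob(leaf), chain)
```

In this sketch the intersection is the single leaf $(A, A, B)$, and the conditional probabilities recovered from the leaf measure match the edge weights I assumed. Since the measure was defined here by multiplying those edge weights in the first place, this only checks consistency; it is not the rigorous justification I am looking for.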