I have a dataset that gives information about a population. For instance, I know the fraction of people who are male (M) and within a certain age range (A), P(M & A), and I know the fraction of people who are male and live in a certain area (L), P(M & L).
What I'm interested in computing is P(M & A & L), which is the fraction of people who are male, are within a certain age range and live in a certain area.
Using the definition of conditional probability (the product rule), I can say that
P(M & A & L) = P(M & A | L) P(L)
But my dataset only gives P(L) and not P(M & A | L). However, if I assumed that M & A and L are independent, I would have
P(M & A | L) = P(M & A)
so that P(M & A & L) = P(M & A) P(L).
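To make the independence shortcut concrete: under that assumption the joint fraction P(M & A & L) is just the product P(M & A) P(L). Here is a minimal Python sketch of that calculation. The function name and the numbers are made up for illustration and are not taken from my dataset.

```python
def joint_under_independence(p_m_and_a: float, p_l: float) -> float:
    """Estimate P(M & A & L), ASSUMING (M & A) and L are independent."""
    for p in (p_m_and_a, p_l):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
    # Independence means the joint probability factorises into a product.
    return p_m_and_a * p_l


# Hypothetical inputs, chosen only to illustrate the arithmetic.
p_m_and_a = 0.12  # fraction who are male and in the age range
p_l = 0.25        # fraction who live in the area

estimate = joint_under_independence(p_m_and_a, p_l)
print(f"P(M & A & L) under independence: {estimate:.4f}")
```

If the age and sex mix in the area differs from the population as a whole, this product can be off in either direction.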
How large is the error on P(M & A | L) if I make this assumption? Do you know of any other method I could use to estimate P(M & A | L) without assuming independence?
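One thing I can say without any independence assumption is that the joint fraction is boxed in by the Fréchet bounds, max(0, P(M & A) + P(L) - 1) <= P(M & A & L) <= min(P(M & A), P(L)), and the upper bound can be tightened with P(M & L) because M & A & L is contained in M & L. The sketch below computes those bounds and the resulting worst-case error of the independence estimate. The function names and the numbers are hypothetical, so please treat it as an illustration of the question rather than a solution.

```python
from typing import Optional, Tuple


def frechet_bounds(
    p_m_and_a: float, p_l: float, p_m_and_l: Optional[float] = None
) -> Tuple[float, float]:
    """Bounds on P(M & A & L) that hold for ANY dependence structure."""
    lower = max(0.0, p_m_and_a + p_l - 1.0)
    upper = min(p_m_and_a, p_l)
    if p_m_and_l is not None:
        # (M & A & L) is a subset of (M & L), so P(M & L) also caps it.
        upper = min(upper, p_m_and_l)
    return lower, upper


def independence_error_range(
    p_m_and_a: float, p_l: float, p_m_and_l: Optional[float] = None
) -> Tuple[float, float]:
    """Smallest and largest possible (true value - independence estimate)."""
    lower, upper = frechet_bounds(p_m_and_a, p_l, p_m_and_l)
    estimate = p_m_and_a * p_l
    return lower - estimate, upper - estimate


# Hypothetical inputs, chosen only to illustrate the arithmetic.
bounds = frechet_bounds(0.12, 0.25, p_m_and_l=0.10)
errors = independence_error_range(0.12, 0.25, p_m_and_l=0.10)
print("bounds on P(M & A & L):", bounds)
print("possible error of the independence estimate:", errors)
```

Dividing the bounds by P(L) gives the corresponding range for P(M & A | L). These are worst-case limits only, so they say how wrong the assumption could be, not how wrong it actually is for my data.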