
Notations: $P(XY) := P(X \cap Y)$ and $X' := X^{\small \complement}$.

I read in a comment on MSE that $P(A | B) = P(A)$ and $P(A | B) = P(A | B')$ are equivalent definitions of independence of two events, and that both lead to $P(AB) = P(A)P(B)$. The first definition now seems rather natural to me, thanks to @lulu and @Ryan G.

We assume that the probabilities $0 < P(A), P(B) < 1$.

Derivation from the first def.: $P(A|B) = P(A) \iff \frac{P(AB)}{P(B)} = P(A) \iff P(AB) = P(A)P(B)$.

Derivation from the second def.: $P(A|B) = P(A|B') \iff \frac{P(AB)}{P(B)} = \frac{P(AB')}{P(B')}$.

Now if they are equivalent, then $\frac{P(AB')}{P(B')} = P(A)$, i.e. $P(A | B') = P(A)$, which is not necessarily the case.

Can anyone please confirm this or tell me what exactly is the statement/assumption that I am missing?

7 Answers


Assuming $P(A|B)=P(A|B')$, we have \begin{align*}& & \frac{P(AB)}{P(B)} &= \frac{P(AB')}{P(B')} \\ &\Leftrightarrow &P(B')P(AB) &= P(B)P(AB') \\ &\Leftrightarrow & (1-P(B))P(AB) &= P(B)P(AB') \\ &\Leftrightarrow & P(AB) &= P(B)(P(AB')+P(AB)), \end{align*} and since $P(AB') + P(AB) = P(A)$ we have shown $P(A|B) = P(A|B')$ is equivalent to $P(AB) = P(B)P(A)$.
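As a quick numerical sanity check of this chain of equivalences (an editorial sketch, not part of the original answer), here is a small Python example; the values of $P(A)$ and $P(B)$ are assumptions chosen for illustration:

```python
# Illustrative values (assumptions): P(A) = 0.3, P(B) = 0.6, with A and B
# independent, so the joint probabilities factor as products.
pA, pB = 0.3, 0.6
pAB = pA * pB            # P(A ∩ B)
pABc = pA * (1 - pB)     # P(A ∩ B')

cond_B = pAB / pB             # P(A | B)
cond_Bc = pABc / (1 - pB)     # P(A | B')

# P(A|B) = P(A|B') and P(AB) = P(A)P(B), as the equivalences assert.
assert abs(cond_B - cond_Bc) < 1e-12
assert abs(pAB - pA * pB) < 1e-12

# The partition fact used in the last step: P(AB') + P(AB) = P(A).
assert abs(pABc + pAB - pA) < 1e-12
```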

user6247850

We will prove the equivalence of the definitions.

The direction $P(A|B)=P(A)\implies P(A|B)=P(A|B')$ is straightforward: $P(A|B)=P(A)$ gives $P(AB)=P(A)P(B)$, hence $P(AB')=P(A)-P(AB)=P(A)P(B')$, and so $P(A|B')=P(A)=P(A|B)$.

The other case, for $P(A|B)=P(A|B')\implies P(A|B)=P(A)$ can be proved using the law of total probability:

$$\begin{align*} P(A) &= P(B)P(A|B)+P(B')P(A|B') \\ &= P(B)P(A|B)+(1-P(B))P(A|B) \\ &= P(A|B) \end{align*}$$

In the second equality we used the assumption $P(A|B)=P(A|B')$ and that $P(B')=1-P(B)$.
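A minimal numerical check of this computation (an editorial sketch; the probabilities below are made-up assumptions):

```python
# Law of total probability with arbitrary (dependent) conditionals:
pB = 0.4
pA_given_B, pA_given_Bc = 0.7, 0.2
pA = pB * pA_given_B + (1 - pB) * pA_given_Bc
assert abs(pA - 0.4) < 1e-12  # 0.4*0.7 + 0.6*0.2 = 0.28 + 0.12

# When P(A|B) = P(A|B') = q, the same formula collapses to P(A) = q:
q = 0.35
assert abs(pB * q + (1 - pB) * q - q) < 1e-12
```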


If $A$ is independent of $B$ then $P(AB')=P(A)-P(AB)=P(A)-P(A)P(B)=P(A)(1-P(B))=P(A)P(B')$, so $A$ is also independent of $B'$. Hence $P(A\mid B)=P(A\mid B')=P(A)$.

Conversely, if $P(A\mid B)=P(A\mid B')$ then multiplying both sides by $P(B)P(B')=P(B)(1-P(B))$ gives $P(AB)(1-P(B))=P(AB')P(B)$, that is $P(B)(P(AB')+P(AB))=P(AB)$, which is the same as $P(B)P(A)=P(AB)$ since $P(AB')+P(AB)=P(A)$.

nejimban
  1. Let $0<P(B)<1.$ Then $$\text{the probability of event }A \textbf{ does not depend on whether or not event $B$ occurs} \\\iff P(A|B)=P(A|B^c) \\\iff \frac{P(A\cap B)}{P(B)}=\frac{P(A\cap B^c)}{P(B^c)}=\frac{P(A)-P(A\cap B)}{1-P(B)} \\\iff P(A\cap B)=P(A)P(B),$$ using $P(A\cap B^c)=P(A)-P(A\cap B)$ and $P(B^c)=1-P(B)$. Notice that this intuitive characterisation of pairwise independence excludes the cases where the subjects' probabilities are $(0,0),(1,1),(0,1)$ or $(1,0).$

  2. On the other hand, the more common intuitive characterisation $$\text{the probability of event }A \textbf{ is unaffected by the knowledge that event $B$ occurs},$$ i.e., $P(A|B)=P(A),$ excludes only the $(0,0)$ case.

  3. In contrast, the formal definition—by sidestepping division by $0$—applies with no restriction on the subjects' probabilities.

    Apart from the restrictions, all three are equivalent to one another.
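To illustrate Point 3 concretely (an editorial sketch, not part of the original answer): when $P(B)=0$, the conditional characterisations break down, while the product definition still classifies the events as independent.

```python
# Hypothetical edge case: P(B) = 0 (an almost-never event).
pA, pB = 0.5, 0.0
pAB = 0.0  # A ∩ B ⊆ B forces P(A ∩ B) = 0

# The formal (product) definition applies with no restriction:
assert pAB == pA * pB

# But P(A | B) = P(A ∩ B)/P(B) is undefined (division by zero):
try:
    _ = pAB / pB
    defined = True
except ZeroDivisionError:
    defined = False
assert not defined
```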

ryang
  • Since there are multiple answers (as I expected) and the first answerer has written it quite clearly, I think it would be unfair to not accept his/her answer. I am sorry for that. But thank you for answering the question and letting me know of this identity. – dictatemetokcus Aug 23 '21 at 22:28
  • I would like to ask: what did you mean by "the former has a tighter restriction than the latter" while referring to both these definitions? – dictatemetokcus Aug 23 '21 at 22:29
  • I get it, thank you Ryan. I will have a look at the ref. – dictatemetokcus Aug 24 '21 at 00:03
  • @dictatemetokcus Glad to be of help! 1. There is only one definition, as referenced in Point 3 of this answer. 2. None of the 3 versions are actually technically equivalent to each other: Version 1 (the main subject of this page) has a tighter restriction than Version 2 (the standard informal characterisation), while Version 3 (the definition)—by design—is friendliest, in order to deal with impossible and almost-never events. (I've updated the answer to clarify.) – ryang Aug 24 '21 at 07:10

We can also use Bayes' theorem to prove the equivalence. Let's show that the second definition readily yields the first; the other direction is similar.

$P(B|A)$

$=\frac{P(A|B)P(B)}{P(A|B)P(B)+P(A|B')P(B')}$, by Bayes

$=\frac{P(A|B)P(B)}{P(A|B)P(B)+P(A|B)P(B')}$, since $P(A|B)=P(A|B')$, by second definition of independence

$=\frac{P(A|B)P(B)}{P(A|B)(P(B)+P(B'))}$

$=\frac{P(A|B)P(B)}{P(A|B)}$, since $P(B)+P(B')=1$, by the complement rule

$=P(B)$, assuming $P(A|B) \neq 0$.

Hence we have $P(B|A)=P(B)$, which is precisely the first definition of independence with the roles of $A$ and $B$ swapped.
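The Bayes computation above can be traced numerically (an editorial sketch; the specific values are assumptions chosen for illustration):

```python
# Assume the second definition holds: P(A|B) = P(A|B').
pB = 0.25
pA_given_B = 0.6
pA_given_Bc = pA_given_B  # second definition of independence

# Bayes' theorem for P(B|A):
numer = pA_given_B * pB
denom = pA_given_B * pB + pA_given_Bc * (1 - pB)
pB_given_A = numer / denom

# P(B|A) = P(B): the first definition, with A and B swapped.
assert abs(pB_given_A - pB) < 1e-12
```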

Sandipan Dey

There are only so many ways to prove the equivalence of the two forms of independence. However, as with most somewhat difficult concepts, a particular approach may resonate better with someone than other approaches. So at the risk of possibly duplicating someone else's formulas, I will present my approach along with my explanations.


Equivalence of the Two Forms

Suppose that $$ P(A\mid B)=P(A\mid B')\tag1 $$ Then $$ \begin{align} P(A) &=P(A\cap B)+P(A\cap B')\tag{2a}\\[3pt] &=P(A\mid B)P(B)+P(A\mid B')P(B')\tag{2b}\\[3pt] &=P(A\mid B)P(B)+P(A\mid B)(1-P(B))\tag{2c}\\[3pt] &=P(A\mid B)\tag{2d} \end{align} $$ Explanation:
$\text{(2a)}$: Disjoint Union
$\text{(2b)}$: Bayes' Theorem
$\text{(2c)}$: apply $(1)$ and Disjoint Union
$\text{(2d)}$: simplify

Suppose that $$ P(A\mid B)=P(A)\tag3 $$ Then $$ \begin{align} P(A\mid B') &=\frac{P(A\cap B')}{P(B')}\tag{4a}\\ &=\frac{P(A)-P(A\cap B)}{P(B')}\tag{4b}\\ &=\frac{P(A\mid B)-P(A\mid B)P(B)}{P(B')}\tag{4c}\\[5pt] &=P(A\mid B)\tag{4d} \end{align} $$ Explanation:
$\text{(4a)}$: Bayes' Theorem
$\text{(4b)}$: Disjoint Union
$\text{(4c)}$: apply $(3)$ and Bayes' Theorem
$\text{(4d)}$: Disjoint Union and simplify


Disjoint Union

By Disjoint Union, we mean that when $P(A\cap B)=0$, we have $P(A\cup B)=P(A)+P(B)$. This can be viewed as an application of the Inclusion-Exclusion Principle. We use it here to say that $P(A\cap B)+P(A\cap B')=P(A)$ and $P(B)+P(B')=1$.

robjohn

How is $P(A|B′)=P(A|B)$ the definition for independence of events $A,B$?

The equality says that, whether $B$ occurs or not, we have the same expectation that $A$ occurs; i.e., our probability assessment of $A$ does not depend on the occurrence of $B$.

That means exactly that: event $A$ is independent of event $B$.

Graham Kemp
  • 129,094