Non-measure-theoretic proof of the tower property of expectation

Question

I'm trying to derive a proof that, for discrete random variables $U$, $V$, $W$ with finite expectation, it holds that $$E[\ E[V\mid U,W]\mid W\ ] = E[V\mid W]. \qquad (1) $$ (I'm aware of the measure theory based proof, but I'm trying to find a proof that does without. The main difficulty I'm encountering is the additional conditioning on $W$; the proof of $E[ E[V\mid U] ] = E[V]$ is clear to me.)

Let $X(\Omega_X)$ denote the range of a random variable $X$. The LHS of $(1)$ is itself a random variable, namely a function $f$ where \begin{align*} f(w_1) &= \sum_{(u,w) \in U(\Omega_U)\times W(\Omega_W)} E[\ V \mid U = u \wedge W = w]\ \Pr[ U = u \wedge W = w \mid W = w_1 ]\\ &= \sum_{(u,w)} \sum_{v \in V(\Omega_V)} v \cdot \Pr[ V = v \mid U = u \wedge W = w]\cdot \Pr[ U = u \wedge W = w \mid W = w_1 ] \end{align*}

The RHS of $(1)$ is a function $g$ such that $$ g(w_1) = \sum_{v \in V(\Omega_V)} v \cdot \Pr[\ V = v \mid W = w_1\ ]. $$ To prove $(1)$, we need to show that for all $w_1 \in W(\Omega_W)$ we have $$f(w_1) = g(w_1).$$ This is true if $$ \Pr[ V = v \mid U = u \wedge W = w\ ] \cdot \Pr[U = u \wedge W = w \mid W = w_1 ] = \Pr[V = v \mid W = w_1 ], $$ but I couldn't progress from here. I have a feeling that my interpretation of the LHS is incorrect...

Are you considering the case when every random variable $U,V,W$ takes only finitely many values? — SBF, Aug 31 '12 at 08:06
Not necessarily, but for now I would be grateful for a proof even for this case. — somebody, Aug 31 '12 at 08:43

score 2 · Accepted Answer · answered Oct 14 '12 at 10:35

This is true if $$ \Pr[ V = v \mid U = u \wedge W = w\ ] \cdot \Pr[U = u \wedge W = w \mid W = w_1 ] = \Pr[V = v \mid W = w_1 ] $$

Not quite. In fact, this is true if, for every given $v$ and $w_1$, $$ \sum_{u,w}\Pr[ V = v \mid U = u, W = w] \cdot \Pr[U = u, W = w \mid W = w_1 ] = \Pr[V = v \mid W = w_1 ], $$ since multiplying this identity by $v$ and summing over $v$ yields $f(w_1)=g(w_1)$. To prove this, fix $u$ and consider $a(w)=\Pr[U = u, W = w \mid W = w_1 ]$. If one refers to the definition of the conditional probability of events, one sees that $a(w)=0$ for every $w$ except $w=w_1$, and that $a(w_1)=\Pr[U = u\mid W = w_1 ]$. It follows that each sum over $w$ in the LHS of the identity to be proved has only one nonzero term, which is $$ \Pr[ V = v \mid U = u, W = w_1] \cdot \Pr[U = u\mid W = w_1 ]. $$ For every $u$ such that $\Pr[U = u, W = w_1]>0$, this is $$ \frac{\Pr[ V = v,U = u, W = w_1]}{\Pr[U = u, W = w_1]} \cdot \frac{\Pr[U = u,W = w_1 ]}{\Pr[W = w_1 ]}=\Pr[ V = v, U = u\mid W = w_1] . $$ Hence the whole LHS is $$ \sum_u\Pr[ V = v, U = u\mid W = w_1]=\Pr[ V = v\mid W = w_1], $$ as desired.
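As a sanity check on the argument above, here is a minimal numerical sketch in Python that compares $f(w_1)$ and $g(w_1)$ on a small finite example. The joint distribution, the value ranges and all function names are hypothetical, chosen only for illustration; exact rational arithmetic is used so that the comparison is not affected by rounding.

```python
from fractions import Fraction
from itertools import product

# Hypothetical finite ranges for U, V, W (illustration only).
values_u, values_v, values_w = (0, 1), (0, 1, 2), (0, 1)

# Arbitrary strictly positive weights, normalised into a joint pmf of (U, V, W).
weights = {
    (u, v, w): 1 + u + 2 * v * w + (u * v) % 2
    for u, v, w in product(values_u, values_v, values_w)
}
total = sum(weights.values())
pmf = {key: Fraction(weight, total) for key, weight in weights.items()}


def prob(pred):
    """Probability of the event {(u, v, w) : pred(u, v, w)}."""
    return sum(p for (u, v, w), p in pmf.items() if pred(u, v, w))


def cond_exp_v_given_uw(u0, w0):
    """E[V | U = u0, W = w0], computed directly from the joint pmf."""
    denom = prob(lambda u, v, w: u == u0 and w == w0)
    numer = sum(v * p for (u, v, w), p in pmf.items() if u == u0 and w == w0)
    return numer / denom


def f(w1):
    """LHS of (1) at W = w1: sum over u of E[V | U=u, W=w1] * Pr[U=u | W=w1]."""
    p_w = prob(lambda u, v, w: w == w1)
    return sum(
        cond_exp_v_given_uw(u0, w1)
        * prob(lambda u, v, w: u == u0 and w == w1) / p_w
        for u0 in values_u
    )


def g(w1):
    """RHS of (1) at W = w1: E[V | W = w1]."""
    p_w = prob(lambda u, v, w: w == w1)
    return sum(v * p for (u, v, w), p in pmf.items() if w == w1) / p_w


for w1 in values_w:
    print(w1, f(w1), g(w1), f(w1) == g(w1))
```

Note that `f` already uses the observation from the proof that only the terms with $w = w_1$ survive, so the sum runs over $u$ alone. This check covers a single finite example and is of course no substitute for the proof.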
