
Definition: Suppose we are working on a probability space $(\Omega,\mathcal{F},\mathbb{P})$. Let $\mathcal{G} \subset \mathcal{F}$ be a sub-$\sigma$-algebra and let $X$ be an integrable random variable. Then there exists a unique $Z$, $\mathcal{G}$-measurable and integrable, such that for every bounded random variable $U$:

$$ E[XU]=E[ZU]. $$

We define $Z=E[X|\mathcal{G}]$, the conditional expectation of $X$ with respect to $\mathcal{G}$.
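
To fix ideas, here is one concrete instance of the identity (the example is my own, with $U$ taken to be a $\mathcal{G}$-measurable indicator, a restriction the answer below justifies). Take $\Omega=\{1,2,3,4\}$ with the uniform measure, $\mathcal{G}=\{\varnothing,\{1,2\},\{3,4\},\Omega\}$, and $X(\omega)=\omega$. Then $Z=E[X|\mathcal{G}]$ equals $3/2$ on $\{1,2\}$ and $7/2$ on $\{3,4\}$, and for $U=\mathbf{1}_{\{1,2\}}$:

$$ E[XU] = \frac{1+2}{4} = \frac{3}{4} = \frac{3}{2}\cdot\frac{1}{2} = E[ZU]. $$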

I don't get why we write the expression $E[XU]=E[ZU]$, and I also don't get why we are able to define $Z=E[X|\mathcal{G}]$ from that expression.

(I translated the definition from French, so sorry if it's not well formed.)


1 Answer


This definition is (slightly) incorrect as written. If, say, $X$ is a bounded positive random variable with $\operatorname{Var}(X)>0$, and $\mathcal{G}=\{\Omega, \varnothing\}$, then the $\mathcal{G}$-measurable (i.e., constant) random variable $Z$ satisfying the definition is necessarily $Z=E[X]$: taking $U=1$ in the definition forces $E[X]=E[Z]$, and $Z$ is constant. But then, taking $U=X$ in the definition (which is allowed, since $X$ is bounded), we get $E[XU] = E[X^2] \neq E[X]^2 = E[E[X]\,X] = E[ZX]$, where the inequality holds because $\operatorname{Var}(X)=E[X^2]-E[X]^2>0$. This is a contradiction.
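
To see the failure numerically (the specific $X$ here is my own choice, not from the question): let $X$ take the values $1$ and $2$ with probability $1/2$ each, so $E[X]=3/2$ and $\operatorname{Var}(X)=1/4>0$. Then

$$ E[X^2] = \frac{1^2+2^2}{2} = \frac{5}{2} \neq \frac{9}{4} = \left(\frac{3}{2}\right)^2 = E[X]^2, $$

so no constant $Z$ can satisfy the displayed identity for both $U=1$ and $U=X$.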

The remedy is that the definition should require the bounded random variables $U$ to be $\mathcal{G}$-measurable. In that case, approximating a bounded $\mathcal{G}$-measurable $U$ by simple functions and applying the dominated convergence theorem shows that the definition can be (equivalently) reduced to the case where the $U$'s are of the form $\mathbf{1}_{H}$, $H \in \mathcal{G}$. This form of the definition is much more intuitive, and there are answers elsewhere discussing why.
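
Spelled out, that reduced form of the corrected definition reads:

$$ E[X\,\mathbf{1}_H] = E[Z\,\mathbf{1}_H] \quad \text{for every } H \in \mathcal{G}, $$

i.e., $Z$ has the same average as $X$ over every event that $\mathcal{G}$ can "see". The version with general bounded $\mathcal{G}$-measurable $U$ is then recovered by linearity over simple functions together with dominated convergence.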