Proof of the Derived Distribution Procedure in Statistics

Question

In introductory statistics we are taught that, if you want to know the distribution of a random variable that is a function of another random variable, you can use the following procedure (this is copied from Bertsekas and Tsitsiklis, 2nd Ed. Section 4.1):

Calculate the CDF $F_Y$ of Y using the formula $$ F_Y(y) = P(Y \le y) = \int_{\{x|g(x) \le y\}} f_X(x) dx $$

Differentiate to obtain the PDF of Y: $$ f_Y(y) = \frac{dF_Y}{dy}(y). $$
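
As a numerical sanity check of the two steps (not a proof), here is a minimal Python sketch. The choices $X \sim N(0,1)$ and $g(x) = x^2$, the function names, and the grid parameters are all assumptions made for illustration; none of them come from the book.

```python
import math


def f_X(x):
    """Standard normal density (an assumed choice of f_X for illustration)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def g(x):
    """A non-monotonic transformation (an assumed choice for illustration)."""
    return x * x


def cdf_Y(y, lo=-8.0, hi=8.0, n=160_000):
    """Step 1: midpoint-rule approximation of the integral of f_X
    over the set {x | g(x) <= y}, truncated to [lo, hi]."""
    h = (hi - lo) / n
    total = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * h
        if g(x) <= y:  # indicator of the set {x | g(x) <= y}
            total += f_X(x)
    return total * h


def pdf_Y(y, dy=0.01):
    """Step 2: central-difference approximation of dF_Y/dy."""
    return (cdf_Y(y + dy) - cdf_Y(y - dy)) / (2.0 * dy)


def exact_cdf_Y(y):
    """Closed form for this example: P(X^2 <= y) = erf(sqrt(y / 2)), y > 0."""
    return math.erf(math.sqrt(y / 2.0))


def exact_pdf_Y(y):
    """Closed form for this example: chi-square density with 1 degree of freedom."""
    return math.exp(-0.5 * y) / math.sqrt(2.0 * math.pi * y)


if __name__ == "__main__":
    y = 1.0
    print("CDF: numeric", cdf_Y(y), "exact", exact_cdf_Y(y))
    print("PDF: numeric", pdf_Y(y), "exact", exact_pdf_Y(y))
```

The indicator test `g(x) <= y` is what lets the same code handle a non-monotonic $g$: the region of integration is defined by the condition itself, with no need for $g^{-1}$.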

That looks reasonable, and I can even draw pictures to convince myself that it is probably true. But I can't prove it. Here is my attempt.

By definition we have $Y = g(X)$ and I start with the definition of the PDF of $Y$

$$ P(c \le Y \le d) \doteq \int_{\{y|c \le y \le d\}} f_Y(y) dy. $$

Eventually, I would like to take the limit $c \rightarrow -\infty$, but I don't think I need that now and I like to use identities in proofs.

To make the problem easier for me to understand, I split the integral up into little pieces (hoping that the sum will come together in the end) and focus on the integral near the point $y_0$:

$$ P(c \le Y \le d) = \int_{\{y|c \le y \le y_0\}} f_Y(y) dy + \int_{\{y|y_0 \le y \le y_0 + \delta_y\}} f_Y(y) dy + \int_{\{y|y_0 + \delta_y \le y \le d\}} f_Y(y) dy. $$

Using $y = g(x)$, I have $dy = \frac{dg}{dx}(x)dx$ and, assuming $g(x)$ is invertible, I find that

$$ \int_{\{y|y_0 \le y \le y_0 + \delta_y\}} f_Y(y) dy = \int_{\{x|g^{-1}(y_0) \le x \le g^{-1}(y_0 + \delta_y)\}} f_Y(g(x)) \frac{dg}{dx}(x)dx. $$

I can look at the integral on the right and say that it looks like the definition of the PDF of $X$, if only the integrand were $f_X(x)$.

At this point I am stuck. I suspect that integration by parts will help, but when I try that, I don't seem to get anywhere, not least because there is always an $f_Y$ in there instead of an $f_X$.

On top of that, I assumed $g^{-1}(y_0) \le x \le g^{-1}(y_0 + \delta_y)$, i.e. that $g(x)$ is monotonically increasing. I think I can work out what to do if $g(x)$ is monotonically decreasing, but I have no idea what to do if it isn't monotonic.

What am I missing?

If your question is about where $\int_{\{x \mid g(x) \le y\}} f_X(x) \, dx$ comes from, I think it is immediate from rewriting $P(Y \le y)$ as $P(g(X) \le y)$ and expressing the latter as an integral. — angryavian, Aug 18 '21 at 16:42
@angryavian That error is well-spotted, thanks! As to your clarification question, it does not seem immediate to me at all. It seems plausibly true, but that isn't a proof. — Finncent Price, Aug 18 '21 at 16:57
I am using a more general definition of the density, where for "nice" sets $A$ (read: measurable sets) we have $P(X \in A) = \int_A f_X(x) \, dx$. Your definition only considers $A$ of the form $[c,d]$. With $A$ being $\{x \mid g(x) \le y\}$, the result follows immediately from $P(g(X) \le y)$. — angryavian, Aug 18 '21 at 17:02

Snoop · Answer 1 · 2021-08-19T08:19:03.530

Let $(\Omega,\mathcal{F},\mu)$ be a probability space. We have $X:\Omega \to \mathbb{R}$ which is a measurable random variable. Let $g:\mathbb{R} \to \mathbb{R}$ be Borel measurable. Define $Y:= g(X)$. Then it can be proved that $Y:\Omega \to \mathbb{R}$ is a measurable random variable. The cumulative distribution function of $Y$ is given by

$$F_Y(t)=\mu(\{Y\leq t\})=\mu(Y^{-1}(-\infty,t])$$

We have $Y^{-1}=X^{-1}\circ g^{-1}$, so

$$\begin{aligned}\mu(Y^{-1}(-\infty,t])&=\int_{\Omega}\mathbb{I}_{Y^{-1}(-\infty,t]}(\omega)\mu(d\omega)\\&=\int_{\Omega}\mathbb{I}_{X^{-1}(g^{-1}(-\infty,t])}(\omega)\mu(d\omega)\\&=\int_\Omega\mathbb{I}_{g^{-1}(-\infty,t]}(X(\omega))\mu(d\omega)\\&=\int_{\mathbb{R}}\mathbb{I}_{g^{-1}(-\infty,t]}(x)\mu_X(dx) \end{aligned}$$

where $\mu_X(A)=\mu(X^{-1}(A)), \, A \in \mathcal{B}(\mathbb{R})$, that is, the probability distribution of $X$. Therefore

$$\mu(\{Y\leq t\})=\mu_X(g^{-1}(-\infty,t])$$

But notice that

$$\int_{\mathbb{R}}\mathbb{I}_{g^{-1}(-\infty,t]}(x)\mu_X(dx)=\int_{g^{-1}(-\infty,t]}\mu_X(dx)$$

and $g^{-1}(-\infty,t]=\{x|g(x)\leq t\}$, and the result you wanted follows. If $X$ has a continuous density $f_X$, then

$$F_Y(t)=\int_{\{x|g(x)\leq t\}}\mu_X(dx)=\int_{\{x|g(x)\leq t\}}f_X(x)dx$$

In general, if the rv $Y$ has a continuous density $f_Y$, then

$$\frac{d}{dt}F_Y(t)=\frac{d}{dt}\int_{(-\infty,t]}f_Y(y)dy=f_Y(t)-\lim_{r \to -\infty}f_Y(r)\cdot 0=f_Y(t)$$
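
To connect this with the attempt in the question, here is a short worked sketch. The extra assumptions are mine: $f_X$ is continuous, and $g$ is either differentiable and strictly increasing with differentiable inverse, or the specific example $g(x)=x^2$.

If $g$ is strictly increasing, then $\{x|g(x)\leq t\}=(-\infty,g^{-1}(t)]$, so

$$F_Y(t)=\int_{-\infty}^{g^{-1}(t)}f_X(x)dx=F_X(g^{-1}(t)), \qquad f_Y(t)=f_X(g^{-1}(t))\,\frac{d g^{-1}}{dt}(t)$$

Putting $t=g(x)$ gives $f_Y(g(x))\frac{dg}{dx}(x)=f_X(x)$, which is the identity needed to turn the integrand in the question into $f_X(x)$.

If $g$ is not monotonic, the set $\{x|g(x)\leq t\}$ is simply no longer a single half-line. For $g(x)=x^2$ and $t>0$ we have $\{x|x^2\leq t\}=[-\sqrt{t},\sqrt{t}]$, hence

$$F_Y(t)=\int_{-\sqrt{t}}^{\sqrt{t}}f_X(x)dx=F_X(\sqrt{t})-F_X(-\sqrt{t}), \qquad f_Y(t)=\frac{f_X(\sqrt{t})+f_X(-\sqrt{t})}{2\sqrt{t}}$$

while for $t<0$ the set is empty and $F_Y(t)=0$.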

For reference on the last part: define $F^{(r)}(t)=\int_{[r,t]}f(y)dy$; then $F^{(r)}\to F$ by DCT and $\frac{d}{dt}F^{(r)}(t)=f(t)-f(r)\cdot 0=f(t),\forall r$. Then we obtain the result. — Snoop, Dec 08 '22 at 07:14
