In introductory statistics we are taught that, to find the distribution of a random variable $Y = g(X)$ that is a function of another random variable $X$, we can use the following procedure (copied from Bertsekas and Tsitsiklis, 2nd Ed., Section 4.1):
- Calculate the CDF $F_Y$ of Y using the formula $$ F_Y(y) = P(Y \le y) = \int_{\{x|g(x) \le y\}} f_X(x) dx $$
- Differentiate to obtain the PDF of Y: $$ f_Y(y) = \frac{dF_Y}{dy}(y). $$
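To convince myself numerically (this is a sanity check, not a proof), here is a minimal sketch of the two steps. The choices are my own assumptions for illustration and are not from the book: $X$ is standard normal, $g(x) = e^x$ (so $Y$ is lognormal and its PDF is known in closed form), the integral is a plain midpoint sum over $[-10, 10]$, and the derivative is a central difference with step `h`.

```python
import math

def normal_pdf(x):
    """Density f_X of a standard normal X (assumed example distribution)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def cdf_of_y(y, g, f_x, lo=-10.0, hi=10.0, n=200_000):
    """Step 1: F_Y(y) = integral of f_X over {x | g(x) <= y}, via a midpoint sum."""
    dx = (hi - lo) / n
    total = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * dx
        if g(x) <= y:  # keep only the points that belong to the set {x | g(x) <= y}
            total += f_x(x)
    return total * dx

def pdf_of_y(y, g, f_x, h=1e-2):
    """Step 2: f_Y(y) = dF_Y/dy, approximated by a central difference."""
    return (cdf_of_y(y + h, g, f_x) - cdf_of_y(y - h, g, f_x)) / (2.0 * h)

def lognormal_pdf(y):
    """Known closed-form density of Y = exp(X) for standard normal X."""
    return math.exp(-0.5 * math.log(y) ** 2) / (y * math.sqrt(2.0 * math.pi))

if __name__ == "__main__":
    # Print the numerical estimate next to the closed-form value.
    for y in (0.5, 1.0, 2.0):
        print(y, pdf_of_y(y, math.exp, normal_pdf), lognormal_pdf(y))
```

The two printed columns should agree to a couple of decimal places. The agreement is limited by the grid spacing and by `h`, because the indicator makes the summed CDF a step function of $y$.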
That looks reasonable, and I can even draw pictures to convince myself that it is probably true. But I can't prove it. Here is my attempt.
By definition we have $Y = g(X)$, and I start with the defining property of the PDF of $Y$:
$$ P(c \le Y \le d) \doteq \int_{\{y|c \le y \le d\}} f_Y(y) dy. $$
Eventually, I would like to take the limit $c \rightarrow -\infty$ (with $d = y$) to recover the CDF $F_Y(y)$, but I don't think I need that yet, and I prefer to work with identities in a proof.
To make the problem easier for me to understand, I split the integral into three pieces (hoping that the sum will come together in the end) and focus on the piece near a point $y_0$, where $c \le y_0 \le y_0 + \delta_y \le d$:
$$ P(c \le Y \le d) = \int_{\{y|c \le y \le y_0\}} f_Y(y) dy + \int_{\{y|y_0 \le y \le y_0 + \delta_y\}} f_Y(y) dy + \int_{\{y|y_0 + \delta_y \le y \le d\}} f_Y(y) dy. $$
Using $y = g(x)$, I have $dy = \frac{dg}{dx}(x)dx$ and, assuming $g$ is differentiable and invertible, I find that
$$ \int_{\{y|y_0 \le y \le y_0 + \delta_y\}} f_Y(y) dy = \int_{\{x|g^{-1}(y_0) \le x \le g^{-1}(y_0 + \delta_y)\}} f_Y(g(x)) \frac{dg}{dx}(x)dx. $$
I can look at the integral on the right and say that it looks like the definition of the PDF of $X$, if only the integrand were $f_X(x)$.
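The substitution itself at least seems to hold numerically. Below is a minimal sketch that compares the two sides of the last equation. The concrete choices are assumptions made only for this check: $g(x) = e^x$ (increasing and invertible), a standard lognormal $f_Y$, $y_0 = 1$, $\delta_y = 0.5$, and a midpoint rule for both integrals.

```python
import math

def f_y(y):
    """Assumed example density for Y: standard lognormal."""
    return math.exp(-0.5 * math.log(y) ** 2) / (y * math.sqrt(2.0 * math.pi))

def midpoint_integral(func, a, b, n=10_000):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    dx = (b - a) / n
    return sum(func(a + (i + 0.5) * dx) for i in range(n)) * dx

# Assumed example transformation: g(x) = exp(x), so g'(x) = exp(x), g^{-1}(y) = log(y).
g = math.exp
g_prime = math.exp
g_inverse = math.log

y0, delta_y = 1.0, 0.5

# Left-hand side: integrate f_Y over [y0, y0 + delta_y].
lhs = midpoint_integral(f_y, y0, y0 + delta_y)

# Right-hand side: integrate f_Y(g(x)) * g'(x) over [g^{-1}(y0), g^{-1}(y0 + delta_y)].
rhs = midpoint_integral(
    lambda x: f_y(g(x)) * g_prime(x),
    g_inverse(y0),
    g_inverse(y0 + delta_y),
)

print(lhs, rhs)
```

This only confirms that the change of variables is carried out correctly for one example.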
At this point I am stuck. I suspect that integration by parts will help, but when I try that, I don't seem to get anywhere, not least because there is always an $f_Y$ in there instead of an $f_X$.
On top of that, I assumed $g^{-1}(y_0) \le x \le g^{-1}(y_0 + \delta_y)$, i.e. that $g(x)$ is monotonically increasing. I think I can work out what to do if $g(x)$ is monotonically decreasing, but I have no idea what to do if it isn't monotonic.
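For what it's worth, the two-step procedure itself does not seem to care about monotonicity. Here is a minimal sketch with a non-monotonic example of my own choosing (an assumption, not from the book): $X$ standard normal and $g(x) = x^2$, so that $\{x \mid g(x) \le y\} = [-\sqrt{y}, \sqrt{y}]$ and $Y$ should be chi-square with one degree of freedom. The derivative is again a central difference.

```python
import math

def normal_cdf(x):
    """CDF of a standard normal X (assumed example distribution)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def cdf_of_square(y):
    """Step 1 for Y = X**2: integrate f_X over {x | x**2 <= y} = [-sqrt(y), sqrt(y)]."""
    if y <= 0.0:
        return 0.0
    r = math.sqrt(y)
    return normal_cdf(r) - normal_cdf(-r)

def pdf_by_differentiation(y, h=1e-5):
    """Step 2: f_Y(y) as a central-difference derivative of F_Y (needs y > h)."""
    return (cdf_of_square(y + h) - cdf_of_square(y - h)) / (2.0 * h)

def chi_square_1_pdf(y):
    """Known closed-form density of X**2 for standard normal X (y > 0)."""
    return math.exp(-0.5 * y) / math.sqrt(2.0 * math.pi * y)

if __name__ == "__main__":
    # Print the numerical estimate next to the closed-form value.
    for y in (0.25, 1.0, 4.0):
        print(y, pdf_by_differentiation(y), chi_square_1_pdf(y))
```

So the recipe appears to give the right answer here, even though my attempted proof does not cover this case.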
What am I missing?