How does integrating just over the PDF of a random variable give you the expected value of that random variable?

Question

My text contains the following exercise:

Let $X$ be a continuous random variable with CDF $F_X$. Suppose that $\mathbb{P}(X > 0) = 1$ and that $\mathbb{E}(X)$ exists. Show that:

$$\mathbb{E}(X) = \int_0^{\infty} \mathbb{P}(X > x) dx$$

I have already completed the proof, which centers on recognizing:

$$\int_0^{\infty} \mathbb{P}(X > x)\,dx = \int_0^{\infty} (1 - F_X(x))\,dx$$

And then applying integration by parts. But the proof didn't provide any intuition for why the statement is true. In particular, it is surprising to me that we can integrate over probabilities (technically over probability densities) without including any value of the random variable at all and end up with a value that is in the same units/space as the random variable. $0 \le F_X(x) \le 1$, but values of $X$ are only required to be non-negative, so they could be arbitrarily large. Why does this work?

@Masacroso oops, you're right, that is perfect! – Joseph Garvin Apr 18 '20 at 02:47

score 2 · Accepted Answer · answered Apr 18 '20 at 02:47

I think the discrete version is easier to explain, so I will go with that. Say $X$ is a discrete random variable that takes nonnegative integer values $0,1,2,\ldots$. We want to show that

$$ E[X] = \sum_{k=1}^\infty P(X \geq k). $$

First, by definition, we have

$$ E[X] = \sum_{k=0}^\infty kP(X=k). $$

This sum can be written out as

$$ P(X=1) + 2P(X=2) + 3P(X=3) + \cdots $$

In this sum, you have one copy of $P(X=1)$, two copies of $P(X=2)$, three copies of $P(X=3)$, and so on. Since every term is nonnegative, you can rearrange terms freely and rewrite the sum as

\begin{align*} &\big(P(X=1)+P(X=2)+P(X=3)+P(X=4)+\cdots\big) \\ +& \big(P(X=2)+P(X=3)+P(X=4)+\cdots\big) \\ +& \big(P(X=3)+P(X=4)+\cdots\big) + \cdots \end{align*}

Note that the first row is $P(X \geq 1)$, the second is $P(X \geq 2)$, the third is $P(X \geq 3)$, and so on. Therefore, we have rewritten the sum as

$$ \sum_{k=1}^\infty P(X \geq k), $$

which is what we wanted. Intuitively, both sums count the outcome $k$ exactly $k$ times: the first does so through the explicit factor $k$, and the second because $P(X=k)$ is contained in each of the $k$ tail probabilities $P(X \geq 1), \ldots, P(X \geq k)$. So the largest outcomes get the most weight in either form.
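As a quick sanity check of the rearrangement, here is a minimal Python sketch that computes both sums exactly for a small distribution. The probability mass function `pmf` is a made-up example chosen only for illustration, and the helper names `expectation` and `tail_sum` are hypothetical.

```python
from fractions import Fraction

# Hypothetical pmf on {0, 1, 2, 3, 4}, chosen only for illustration.
# Exact fractions avoid floating-point noise in the comparison.
pmf = {
    0: Fraction(1, 10),
    1: Fraction(2, 10),
    2: Fraction(3, 10),
    3: Fraction(1, 10),
    4: Fraction(3, 10),
}

def expectation(pmf):
    """E[X] from the definition: sum over k of k * P(X = k)."""
    return sum(k * p for k, p in pmf.items())

def tail_sum(pmf):
    """Sum over k >= 1 of the tail probabilities P(X >= k)."""
    top = max(pmf)  # tails beyond the largest outcome are zero
    return sum(
        sum(p for j, p in pmf.items() if j >= k)
        for k in range(1, top + 1)
    )

mean_by_definition = expectation(pmf)
mean_by_tails = tail_sum(pmf)
print(mean_by_definition, mean_by_tails)
```

Both values agree, as the argument above predicts. The same check can be run on any finite pmf supported on nonnegative integers.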

score 0 · Answer 2 · answered Apr 18 '20 at 02:58

Approach it from both ends and attempt to meet at the middle.

If the distribution of an absolutely continuous random variable $X$ has non-negative support, with probability density function $f_X(~)$ and cumulative distribution function $F_X(~)$, then we have:

$$\begin{align}\mathsf E(X)&=\int_0^\infty s~f_X(s)~\mathsf d s\tag 1\\[1ex]&=\int_0^\infty \left(\int_0^s 1~\mathsf dx\right)~f_X(s)~\mathsf ds\tag 2\\[1ex]&\vdots\tag 3\\[2ex]&\vdots\tag 4\\[1ex]&=\int_0^\infty \big(1-F_X(x)\big)~\mathsf d x\tag 5\\[1ex]&=\int_0^\infty \mathsf P(X>x)~\mathsf d x\tag 6\end{align}$$
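One possible way to bridge lines (2) and (5) is sketched below. It is not necessarily the route the answer intends. It assumes the order of integration may be exchanged, which Tonelli's theorem permits here because the integrand is non-negative. The indicator $\mathbf 1_{\{x<s\}}$ is notation introduced only for this sketch.

```latex
\begin{align}
\int_0^\infty \left(\int_0^s 1~\mathsf dx\right) f_X(s)~\mathsf ds
  &= \int_0^\infty \int_0^\infty \mathbf 1_{\{x<s\}}~f_X(s)~\mathsf dx~\mathsf ds \tag 3\\[1ex]
  &= \int_0^\infty \left(\int_x^\infty f_X(s)~\mathsf ds\right) \mathsf dx \tag 4
\end{align}
```

The inner integral in (4) is $\mathsf P(X>x) = 1-F_X(x)$, which gives line (5). Geometrically, both orders of integration cover the same region $0<x<s<\infty$: first by vertical strips, then by horizontal strips.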
