Show that the median minimizes $E[|X-c|]$.

Question

The following proof is adapted from the answer of @grand_chat. I changed the proof slightly, and it gives a different result; I am wondering why my version does not work.

Let $m$ be the median of $X$. Consider the case where $c \ge 0$ and $m = 0$. It suffices to prove that $E[|X - c|] \ge E[|X|]$. When $X < 0$, $|X - c| - |X| = c$. On the other hand, when $X \ge 0$, $|X-c| - |X| \ge -c$ regardless of whether $X \ge c$ or not. Therefore, we have that $$E[(|X-c| - |X|)I_{X < 0}] = \int (|X-c| - |X|) I_{X < 0}\, dP = c P(X <0),$$ and also $$E[(|X-c| - |X|)I_{X \ge 0}] = \int (|X-c| - |X|)I_{X \ge 0}\, dP \ge -c P(X \ge 0).$$ If we sum these two relations, we have that $E[|X-c| - |X|] \ge cP(X < 0) - cP(X \ge 0)$. However, in this case, the right-hand side is equal to $c(1-P(X \ge 0)) - c P(X \ge 0) \le 0$ because $P(X \ge 0) \ge 1/2$ (as $m = 0$ is a median), so this lower bound does not yield $E[|X-c|] \ge E[|X|]$.

The only change is that while the original proof uses $\{X \le 0\}$ and $\{X >0\}$, I instead use $\{X <0\}$ and $\{X \ge0\}$. Why does this variant of the original proof give a different result? Can you point out where my logic goes wrong?

score 1 · Accepted Answer · answered Aug 29 '20 at 17:39

The critical difference is indeed how you handle the event that $X=0.$

For a fixed value of $c,$ define a function $g$ such that $g(x) = \lvert x - c\rvert - \lvert x\rvert$ for all real $x.$ Then in the case $c \geq 0,$ $$ g(X) = \lvert X - c\rvert - \lvert X\rvert. $$

Now we can write several steps of both proofs more succinctly. When $X \leq 0,$ or when $X < 0$, we can verify that $g(X) = c.$ When $X > 0,$ or when $X \geq 0$, we can verify that $g(X) \geq -c.$

In particular, when $X=0,$ we have $g(X) = c,$ and of course this and the assumption that $c \geq 0 \geq -c$ together imply we also have $g(X) \geq -c.$

Of the many functions that bound $g$ from below, two are $$ L(x) = \begin{cases} c & x \leq 0, \\ -c & x > 0 \end{cases} $$ and $$ M(x) = \begin{cases} c & x < 0, \\ -c & x \geq 0. \end{cases} $$

That is, $L(x) \leq g(x)$ and $M(x) \leq g(x)$ for all real $x.$ Notice that $L$ and $M$ are identical except for the fact that $L(0) - M(0) = 2c.$
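A quick way to convince yourself of these two bounds is to check them on a grid. The sketch below is only an illustration: it takes $L$ equal to $c$ on $\{x \le 0\}$ and $-c$ on $\{x > 0\}$, and $M$ equal to $c$ on $\{x < 0\}$ and $-c$ on $\{x \ge 0\}$, matching the indicator sets each proof uses. The value of `c` and the grid are arbitrary choices of mine, not part of either proof.

```python
from fractions import Fraction

def g(x, c):
    # g(x) = |x - c| - |x|
    return abs(x - c) - abs(x)

def L(x, c):
    # lower bound used with the split {x <= 0}, {x > 0}
    return c if x <= 0 else -c

def M(x, c):
    # lower bound used with the split {x < 0}, {x >= 0}
    return c if x < 0 else -c

c = Fraction(3, 2)  # illustrative choice; any c >= 0 should behave the same
grid = [Fraction(k, 4) for k in range(-20, 21)]  # points in [-5, 5], step 1/4

# Both functions stay below g everywhere on the grid.
assert all(L(x, c) <= g(x, c) for x in grid)
assert all(M(x, c) <= g(x, c) for x in grid)

# They disagree at exactly one grid point, x = 0, and the gap there is 2c.
differing = [x for x in grid if L(x, c) != M(x, c)]
gap_at_zero = L(0, c) - M(0, c)
print(differing, gap_at_zero)
```

A grid check is of course not a proof, but it makes the single point of disagreement at $x = 0$ easy to see.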

What each proof does is essentially to integrate one of these functions applied to $X$ over the entire probability space. In the proof you referred to, the integral is $$ \int L(X)\, \mathrm dP = \int L(X) I_{X \leq 0}\, \mathrm dP + \int L(X) I_{X > 0}\, \mathrm dP $$ and in your proof the integral is $$ \int M(X)\, \mathrm dP = \int M(X) I_{X < 0}\, \mathrm dP + \int M(X) I_{X \geq 0}\, \mathrm dP. $$

We also find that $\int L(X)\, \mathrm dP \leq E[g(X)]$ and $\int M(X)\, \mathrm dP \leq E[g(X)].$

The first proof then finds that $\int L(X)\, \mathrm dP = c(2P(X\leq 0) - 1) \geq 0$ due to the fact that $\frac12 \leq P(X\leq 0)$ (by the assumption that $m=0$ is a median of $X$).

But we also have an upper bound for the integral, because another way of evaluating it is $$\int L(X)\, \mathrm dP = c(P(X=0) + P(X<0) - P(X>0)) $$ and we know that $P(X>0)\leq\frac12,$ $P(X<0)\leq\frac12,$ and $P(X=0) + P(X>0) + P(X<0) = 1.$ From these facts we can deduce that $\lvert P(X>0) - P(X<0)\rvert \leq P(X=0)$ and therefore $$ 0 \leq \int L(X)\, \mathrm dP \leq 2cP(X=0). $$

That is, we have both a lower bound and an upper bound for the integral.

For the purposes of the first proof, only the lower bound is necessary; but when we consider your proof, we must face the fact that $$ \int (L(X) - M(X))\, \mathrm dP = 2cP(X=0) $$ since $L(0) - M(0) = 2c$ and $L(X) - M(X) = 0$ whenever $X \neq 0.$ Therefore $\int M(X)\, \mathrm dP = \int L(X) \mathrm dP - 2cP(X=0)$ and the bounds of $\int L(X) \mathrm dP$ tell us that $$ -2cP(X=0) \leq \int M(X)\, \mathrm dP \leq 0. $$
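To see these bounds with actual numbers, here is a small sketch on a discrete distribution with an atom at $0$. The distribution (`pmf`) and the value of `c` are assumptions I picked for illustration; $L$ and $M$ are taken as $c$ on $\{x \le 0\}$ or $\{x < 0\}$ respectively and $-c$ elsewhere, matching the splits used in the two proofs.

```python
from fractions import Fraction

# Hypothetical example: X in {-1, 0, 1}, so 0 is a median and P(X = 0) = 1/2.
pmf = {-1: Fraction(1, 4), 0: Fraction(1, 2), 1: Fraction(1, 4)}
c = Fraction(1)  # illustrative choice with c >= 0

def expect(f):
    # E[f(X)] for the discrete distribution above
    return sum(p * f(x) for x, p in pmf.items())

def g(x):
    return abs(x - c) - abs(x)

def L(x):
    return c if x <= 0 else -c

def M(x):
    return c if x < 0 else -c

int_L = expect(L)        # integral used by the first proof
int_M = expect(M)        # integral used by the modified proof
E_g = expect(g)          # E[|X - c| - |X|]
atom = 2 * c * pmf[0]    # 2 c P(X = 0)

print(int_L, int_M, E_g, atom)
```

In this example the first integral is $\frac12$ and the second is $-\frac12$; they differ by exactly $2cP(X=0) = 1$, and only the first is a non-negative lower bound for $E[g(X)] = \frac12$.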

In summary:

Each proof depends on integrating a function of $X$ that is a lower bound of $\lvert X - c\rvert - \lvert X\rvert.$
In each proof, the integral is shown to be a lower bound of $E[\lvert X - c\rvert - \lvert X\rvert].$
The value of the integral in the first proof is in the closed interval $[0, 2cP(X=0)].$
The value of the integral in your proof is in the closed interval $[-2cP(X=0),0].$
The last step of the proof requires an integral whose value is provably non-negative.
The integral in the first proof suffices for this purpose; yours does not.
The flaw in your proof is that you reduced the value of the integral by the non-negative quantity $2cP(X=0),$ which you could not afford to do.
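As a sanity check on the statement itself (separate from either proof), one can evaluate $E[\lvert X - c\rvert]$ over a range of $c$ for a small distribution whose median is $0$ and confirm that the minimum lands at the median. The distribution and the grid of candidate values below are illustrative assumptions, not anything taken from the proofs above.

```python
from fractions import Fraction

# Hypothetical example distribution with median 0.
pmf = {-1: Fraction(1, 4), 0: Fraction(1, 2), 1: Fraction(1, 4)}

def expected_abs_dev(c):
    # E[|X - c|] for the discrete distribution above
    return sum(p * abs(x - c) for x, p in pmf.items())

# Candidate values of c in [-2, 2] with step 1/4.
candidates = [Fraction(k, 4) for k in range(-8, 9)]
best_c = min(candidates, key=expected_abs_dev)
best_value = expected_abs_dev(best_c)

print(best_c, best_value)
```

For this particular distribution $E[\lvert X - c\rvert] = \frac12 + \frac{\lvert c\rvert}{2}$ on $[-1, 1]$, so the minimizer is unique; in general the minimizer need not be unique when the median is not.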
