
This post is a follow-up to this question:

Maximum mean absolute difference of two iid random variables

The question is to show that for two iid random variables $X$ and $Y$ on the unit interval, one has:

$$\mathbb E[|Y-X|] \le 1/2 $$

(The maximizer then is the $1/2 (\delta_0+\delta_1)$ distribution.)

The proposed proof (there is only one), by Sergei Golovan, is quite striking in the tricks it uses. Still, I don't see a way to convert that proof into one phrased in terms of random variables only, which leaves me unsatisfied (in particular, the integration-by-parts step is difficult to interpret probabilistically).

Also, the upper bound, $1/2$, leaves me wondering whether some symmetry argument could be used here.

--

So I am asking whether there is a proof that sticks to the random variables only, in the sense, say, that it does not use integration by parts.

It may well be that there is no such proof, and that one ultimately has to resort to integration by parts; I have no idea.

--

Here are some equalities one may write to start with, but they do not seem to help much:

\begin{align*} \mathbb E[|Y-X|] & = \mathbb E[(Y-X) 1_{Y>X}]+ \mathbb E[(X-Y) 1_{X>Y}] \\ & = 2 \mathbb E[(Y-X) 1_{Y>X}] \\ & = 2 (\mathbb E[Y 1_{Y>X}] - \mathbb E[X 1_{Y>X}]) \\ & = 2 (\mathbb E[Y 1_{Y>X}] - \mathbb E[Y 1_{X>Y}]) \\ & = 2 \mathbb E[Y (1_{Y>X}-1_{X>Y})] \end{align*}
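As a numerical sanity check (not part of any proof), one can verify the claimed maximizer and the bound directly: for $X,Y$ iid with the $\tfrac12(\delta_0+\delta_1)$ distribution, enumerating the four equally likely pairs gives $\mathbb E[|Y-X|]=1/2$ exactly, while iid uniforms give the well-known value $1/3$, strictly below the bound.

```python
import random
from itertools import product

# Exact computation for the claimed maximizer: X, Y iid with P(0) = P(1) = 1/2.
vals, p = [0, 1], 0.5
e_abs = sum(abs(y - x) * p * p for x, y in product(vals, vals))
assert e_abs == 0.5  # the bound E[|Y-X|] <= 1/2 is attained

# Monte Carlo for iid Uniform(0,1): the exact value is 1/3, strictly below 1/2.
random.seed(0)
n = 200_000
mc = sum(abs(random.random() - random.random()) for _ in range(n)) / n
assert abs(mc - 1/3) < 0.01
```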

Olivier

1 Answer


Per the answer I have just posted at the reference question, there is a simple (essentially one-line) proof using only operations on random variables, together with the well-known identity expressing the expectation as the integral of the tail probabilities (which itself has a one-line proof via Tonelli's theorem).
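The tail formula in question is $\mathbb E[X]=\int_0^1 \mathbb P(X>t)\,dt$ for $X$ supported in $[0,1]$. A quick numerical check (just a sketch, using $X=U^2$ with $U$ uniform as an arbitrary test case, so that $\mathbb E[X]=1/3$ and $\mathbb P(X>t)=1-\sqrt t$):

```python
import random

random.seed(2)
n = 200_000
# Monte Carlo estimate of E[X] for X = U^2, U ~ Uniform(0,1); exact value 1/3.
mean = sum(random.random() ** 2 for _ in range(n)) / n

# Midpoint-rule estimate of the tail integral: P(U^2 > t) = 1 - sqrt(t).
m = 2_000
tail_integral = sum(1 - ((i + 0.5) / m) ** 0.5 for i in range(m)) / m

assert abs(mean - 1/3) < 0.01        # E[X] = 1/3
assert abs(tail_integral - 1/3) < 1e-3  # matches the tail integral
```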

The idea is to write the expectation as an integral over $[0,1]$ of a quantity that is bounded by $\tfrac 12$. The precise identity is $$ \frac{\mathbb E|X-Y|}{2}=\int_0^1 \mathbb P(X>t)\cdot \mathbb P(X\leq t)\ dt, $$ and you can bound the integrand by $\tfrac 14$ using the AM-GM inequality (i.e. $p(1-p)\leq \tfrac 14$), which gives $\mathbb E|X-Y|\leq \tfrac 12$.
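The identity itself is easy to check numerically; a minimal sketch for the uniform case, where $\mathbb P(X>t)\,\mathbb P(X\leq t)=t(1-t)$ integrates to $1/6$ and $\mathbb E|X-Y|=1/3$:

```python
import random

random.seed(1)
n = 200_000
xs = [random.random() for _ in range(n)]
ys = [random.random() for _ in range(n)]

# Left side: E|X-Y| / 2, estimated by Monte Carlo.
lhs = sum(abs(x - y) for x, y in zip(xs, ys)) / n / 2

# Right side: integral of P(X>t) * P(X<=t) dt; for Uniform(0,1) the
# integrand is t * (1-t), whose integral over [0,1] is exactly 1/6.
m = 10_000
rhs = sum((1 - t) * t for t in ((i + 0.5) / m for i in range(m))) / m

assert abs(rhs - 1/6) < 1e-3
assert abs(lhs - rhs) < 0.01  # the two sides of the identity agree
```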


I realize you asked for a proof that avoids integration by parts, and this is technically true of the proof I have given. However, be aware that while the tail-integral formula I used has a simple Tonelli proof with no integration by parts, it is often referred to as an "integration-by-parts" formula, since that is the simplest way to prove it in elementary probability theory when one does not have access to the Fubini/Tonelli theorems.


I realized I can make the proof of the identity more explicitly probabilistic as follows. Observe the almost sure identity $$ (X-Y)\cdot 1[X>Y]=\int_0^1 1[X>t\geq Y]\ dt, $$ and similarly with $X$ and $Y$ reversed. Since $$|X-Y|=(X-Y)\cdot 1[X>Y]+(Y-X)\cdot 1[Y>X],$$ it follows that $$ |X-Y|=\int_0^1 1[X>t\geq Y]+1[Y>t\geq X]\ dt. $$ Now by Tonelli's Theorem we can put in the expectations on both sides to obtain $$ \frac{\mathbb E|X-Y|}{2}=\int_0^1 \mathbb P(X>t\geq Y)\ dt=\int_0^1 \mathbb P(X>t)\cdot \mathbb P(t\geq Y)\ dt, $$ using independence of $X$ and $Y$. Since $X$ and $Y$ have the same distribution, the identity at the top of my post follows.
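The almost sure identity at the start of this argument is purely pointwise, so it can be checked deterministically; a small sketch comparing $(x-y)\,1[x>y]$ against a Riemann sum for $\int_0^1 1[x>t\geq y]\,dt$ at a few hypothetical sample points:

```python
# Pointwise check of (x - y) * 1[x > y] = integral_0^1 1[x > t >= y] dt
# via a midpoint Riemann sum on a fine grid, for points in [0, 1].
def layer_cake(x, y, m=100_000):
    return sum(1 for i in range(m) if x > (i + 0.5) / m >= y) / m

for x, y in [(0.9, 0.2), (0.3, 0.7), (0.5, 0.5), (1.0, 0.0)]:
    exact = (x - y) if x > y else 0.0
    assert abs(layer_cake(x, y) - exact) < 1e-3
```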

pre-kidney
  • One line from $\int 1_{X<t<Y} dt = (Y-X)_+$ indeed! Love it. +1 of course. – Olivier Aug 09 '19 at 07:44
  • Although, no need for reference for the inequality $x(1-x) \le 1/4$ as far as I am concerned :-) – Olivier Aug 09 '19 at 10:02
  • 1
    Last comment/question: you mention in the post to the reference question that "[your] proof applies in general to all random variables supported in [0,1]." But I think this is also the case for the original proof by @Sergey Golovan. If one interprets the integrals as Stieltjes integrals, I do not see any forbidden step... Tell me if this is wrong. https://en.wikipedia.org/wiki/Riemann%E2%80%93Stieltjes_integral – Olivier Aug 09 '19 at 10:06
  • @Olivier to me (as someone with only passing familiarity with the Stieltjes integral), the proof seems incomplete as it does not justify the existence of the various Stieltjes integrals which appear (see https://en.wikipedia.org/wiki/Riemann%E2%80%93Stieltjes_integral#Existence_of_the_integral for more on this). Note this is not to say there is an issue with Mr. Golovan's proof, since he claimed it only in the case of continuous random variables - it is your reinterpretation using Stieltjes integration which would seem to require a few extra justifications... – pre-kidney Aug 10 '19 at 04:03
  • Thanks. I checked carefully and the proof is indeed valid for any random variable (the integration by parts as written on the wiki is enough to justify the chain of equalities). – Olivier Aug 11 '19 at 14:17
  • Why does the Stieltjes integral exist though? That is the key... since a Stieltjes is defined as a limit of partial sums (just like the Riemann integral) it has issues of the limit not even existing. I am not convinced. The integration by parts is only meaningful if the Stieltjes integral exists, otherwise it is an equality between two undefined expressions... – pre-kidney Aug 11 '19 at 19:41
  • For this definition (as a limit of partial sums), it exists whenever the integrand is continuous and the integrator is of bounded variation (the interesting case in probability being a distribution function, hence bounded and increasing): this is one of a few possible sufficient conditions, check https://www.encyclopediaofmath.org/index.php/Stieltjes_integral – Olivier Aug 11 '19 at 21:21
  • Some authors (Revuz-Yor for instance, around page 5) define the Stieltjes integral $\int f_s dA_s$ as the Lebesgue integral of $f_s$ against the measure $\mu$ associated with the right-continuous function of bounded variation $A$ via $A_t = \mu([0,t])$. Then you again have an integration-by-parts formula for rather general integrands (bounded variation is the key phrase). – Olivier Aug 11 '19 at 21:32