
Suppose $M$ is a compact subset of the separable Hilbert space $l_2$, and suppose $X$ and $Y$ are i.i.d. random vectors with support in $M$. What is the largest possible $E\| X - Y \|$?

Suppose $M$ is a compact subset of the separable Hilbert space $l_2$. Then $M$ is closed and bounded, and since the norm is continuous on the compact set $M \times M$, there exist $x$ and $y$ in $M$ such that $\| x - y \| = \operatorname{diam}(M)$. Now suppose $X$ and $Y$ are i.i.d. random vectors such that $P(X = x) = P(X = y) = \frac{1}{2}$. One can see that

$$E\| X - Y \| = \frac{1}{4}\| x - x \| + \frac{1}{4}\| x - y \| + \frac{1}{4}\| y - x\| + \frac{1}{4}\| y - y \| = \frac{1}{2}\operatorname{diam}(M)$$
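As a quick numerical sanity check of this two-point construction, here is a minimal sketch (assuming numpy; the diametral pair in $\mathbb{R}^3$ merely stands in for one in $l_2$):

```python
import numpy as np

rng = np.random.default_rng(0)

# Two points x, y realizing the diameter (a pair in R^3 standing in
# for a diametral pair of a compact M in l_2)
x = np.array([0.0, 0.0, 0.0])
y = np.array([3.0, 4.0, 0.0])  # diam(M) = ||x - y|| = 5

# X, Y i.i.d. with P(X = x) = P(X = y) = 1/2
n = 10**6
X = np.where(rng.integers(0, 2, (n, 1)) == 0, x, y)
Y = np.where(rng.integers(0, 2, (n, 1)) == 0, x, y)

# Monte Carlo estimate of E||X - Y||; should be close to diam/2 = 2.5
print(np.linalg.norm(X - Y, axis=1).mean())
```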

Now let us prove an upper bound. To be more exact, we show that if $X = (X_n)_{n = 1}^{\infty}$ and $Y = (Y_n)_{n = 1}^{\infty}$ are i.i.d. random vectors with support in $M$, then $E\| X - Y \| \leq \sqrt{2}\, r$, where $r$ is the radius of the least closed ball containing $M$. By Hölder's inequality:

$$E\| X - Y \| \leq \left(E(\| X - Y \|^2)\right)^{\frac{1}{2}}.$$

And one can see that

\begin{align*} E(\| X - Y \|^2) &= E\langle X - Y, X - Y \rangle = 2E\langle X, X \rangle - 2E\langle X, Y \rangle \\ &= 2\left(\sum_{n = 1}^\infty E[X_n^2] - \sum_{n = 1}^\infty E[X_n Y_n] \right) \\ &= 2\left(\sum_{n = 1}^\infty E[X_n^2] - \sum_{n = 1}^\infty (E X_n)^2\right) \\ &\leq 2\sum_{n = 1}^\infty E[X_n^2] \\ &= 2E\langle X, X \rangle = 2E(\| X \|^2). \end{align*}

Here the first line uses that $X$ and $Y$ are identically distributed, the interchange of expectation and summation is justified by Tonelli's theorem, and $E[X_n Y_n] = E X_n \, E Y_n = (E X_n)^2$ by independence.
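The middle line above amounts to the identity $E(\| X - Y \|^2) = 2\left(E(\| X \|^2) - \| EX \|^2\right)$, which is easy to confirm numerically for a discrete distribution (a small sketch; the three-point distribution is an arbitrary choice):

```python
import numpy as np
from itertools import product

# An arbitrary discrete distribution on three points of R^2
pts = np.array([[0.0, 0.0], [1.0, 0.0], [0.3, 0.9]])
p = np.array([0.2, 0.5, 0.3])

# E||X - Y||^2 computed directly over all ordered pairs (X, Y independent)
lhs = sum(p[i] * p[j] * np.sum((pts[i] - pts[j]) ** 2)
          for i, j in product(range(3), repeat=2))

# 2(E||X||^2 - ||EX||^2)
mean = p @ pts
rhs = 2 * (p @ np.sum(pts ** 2, axis=1) - np.sum(mean ** 2))

print(lhs, rhs)  # the two values agree
```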

Now suppose that $M$ is contained in the closed ball with radius $\frac{1}{2}$ and center $0$. Then $\| X \|$ is a random variable with values in $[0, \frac{1}{2}]$, so its second moment does not exceed $\frac{1}{4}$ (for related bounds of this kind, see: What is the largest possible variance of a random variable on $[0; 1]$?), and we get $E\| X - Y \| \leq \sqrt{2 \cdot \tfrac{1}{4}} = \frac{1}{\sqrt{2}}$.

And now let us return to the general case. Suppose the least closed ball containing $M$ has center $z$ and radius $r$. Then $\frac{M - z}{2r}$ is contained in the closed ball with radius $\frac{1}{2}$ and center $0$, so the bound above applies to the rescaled vectors:

$$E\| X - Y \| = 2r\, E \left\| \frac{X - z}{2r} - \frac{Y - z}{2r}\right\| \leq \frac{2r}{\sqrt{2}} = \sqrt{2}\, r$$

So we know that the largest possible $E\| X - Y \|$ we seek is certainly $\geq \frac{1}{2}\operatorname{diam}(M)$ and certainly $\leq \sqrt{2}\, r$, where $r$ is the radius of the least closed ball containing $M$ (when $r = \frac{1}{2}\operatorname{diam}(M)$, this upper bound reads $\frac{1}{\sqrt{2}}\operatorname{diam}(M)$). However, I do not know how to find the exact value.
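In fact, the two-point construction is not optimal in general: for $M$ the three vertices of an equilateral triangle with side $1$ (so $\operatorname{diam}(M) = 1$ and $r = \frac{1}{\sqrt{3}}$), the uniform distribution on the vertices already gives $E\| X - Y \| = \frac{2}{3} > \frac{1}{2}$, while staying below $\sqrt{2}\, r = \sqrt{2/3} \approx 0.816$. A minimal exact check over the nine ordered pairs (numpy is used only for the norms; the triangle is an illustrative choice):

```python
import numpy as np
from itertools import product

# Vertices of an equilateral triangle with side 1, so diam(M) = 1
tri = np.array([[0.0, 0.0], [1.0, 0.0], [0.5, np.sqrt(3) / 2]])
p = np.full(3, 1 / 3)  # uniform distribution on the three vertices

e = sum(p[i] * p[j] * np.linalg.norm(tri[i] - tri[j])
        for i, j in product(range(3), repeat=2))

# 2/3 ~ 0.667: above the two-point value 1/2, and below the upper
# bound sqrt(2)*r = sqrt(2/3) ~ 0.816 (circumradius r = 1/sqrt(3))
print(e)
```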

This question is partially inspired by the following question: Probability distribution to maximize the expected distance between two points

Chain Markov
  • I made some edits to your question so that the formulas are easier to read, I hope. – Sangchul Lee Feb 18 '19 at 04:29
  • It definitely depends on the choice of $M$. I can set up a variational argument to obtain a characterization of the maximizer, and in the case where the convex hull of $M$ is a finite-dimensional polytope, I can give a formula computing the optimal constant. For the very specific case where the convex hull of $M$ is a regular $n$-gon (so that its $n$ vertices $v_1, \cdots, v_n$ also lie in the original set $M$), the expectation is maximized by $X, Y$ uniformly distributed on $\{v_1, \cdots, v_n\}$. – Sangchul Lee Feb 18 '19 at 08:30

1 Answer


Here is a note on some of my observations, but this answer is far from complete, as I do not explain many steps.


This problem admits a variational formulation; let me work with a compact set $M \subset \mathbb{R}^d$ for simplicity. Let $\mathcal{P}$ denote the set of all probability measures on $M$. Then, on $\mathcal{P}$, we define

$$ \langle \mu, \nu \rangle = \int_{M^2} \| x - y \| \, \mu(\mathrm{d}x)\nu(\mathrm{d}y), \qquad Q(\mu) = \langle \mu, \mu \rangle. $$

This is related to our problem by noting that if $X$ and $Y$ are i.i.d. random variables on $M$ with common distribution $\mu$, then $\mu$ is an element of $\mathcal{P}$ and $\mathbb{E}[\| X - Y \|] = Q(\mu)$. In light of this, the question boils down to finding the maximum of $Q$ over $\mathcal{P}$. If we equip $\mathcal{P}$ with the topology of weak-* convergence, then $\langle \cdot, \cdot \rangle$ is continuous, and so $Q$ attains its maximum on $\mathcal{P}$. The following is a variational characterization of these maximizers:
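To get a feel for this maximization, one can discretize: restrict $\mathcal{P}$ to measures supported on a finite grid of $M$ and optimize the weights directly. A rough sketch (assuming scipy is available; the unit square and the grid are my choices, and the solver only guarantees a local optimum):

```python
import numpy as np
from scipy.optimize import minimize

# Restrict P to measures supported on a coarse grid of M = unit square
xs = np.linspace(0, 1, 5)
grid = np.array([[a, b] for a in xs for b in xs])
D = np.linalg.norm(grid[:, None, :] - grid[None, :, :], axis=2)

Q = lambda w: w @ D @ w  # Q(mu) = <mu, mu> for a discrete mu

res = minimize(lambda w: -Q(w), np.full(len(grid), 1 / len(grid)),
               bounds=[(0, 1)] * len(grid),
               constraints={"type": "eq", "fun": lambda w: w.sum() - 1})

# Expected outcome: the mass concentrates on the four corners with
# Q -> (2 + sqrt 2)/4 ~ 0.8536, in line with the n-gon comment above,
# though SLSQP is only guaranteed to find a local optimum
print(Q(res.x))
print(grid[res.x > 1e-6])
```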

Claim. For $\mu \in \mathcal{P}$, the following are equivalent:

  1. $\mu$ maximizes $Q$.
  2. $\langle \mu, \delta_z \rangle \leq \langle \mu, \delta_x \rangle$ for any $z \in M$ and $x \in \operatorname{supp}(\mu)$.

The argument is quite similar to that in one of my previous answers to a similar question, so let me skip the proof at this moment and enjoy its consequences. Let $\mu$ be a maximizer of $Q$ over $\mathcal{P}$. Then the transform

$$ L\mu(z) := \langle \mu, \delta_z \rangle = \int_M \|x - z\| \, \mu(\mathrm{d}x)$$

is a continuous convex function on $\mathbb{R}^d$. Since every point in the support of $\mu$ is a maximum point of $L\mu$ over $M$, writing $m = \max_M L\mu$ we easily read off that

$$ \operatorname{supp}(\mu) \subseteq (L\mu)^{-1}(\{m\}) \qquad \text{and} \qquad M \subseteq (L\mu)^{-1}((-\infty, m]). $$
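For a concrete illustration of these two inclusions, take $\mu$ uniform on the four vertices of the unit square (by the symmetry of the square this is a natural candidate maximizer; a small numerical sketch):

```python
import numpy as np

# mu = uniform measure on the vertices of the unit square
V = np.array([[0, 0], [1, 0], [1, 1], [0, 1]], dtype=float)
w = np.full(4, 0.25)

def L_mu(z):
    """L mu(z) = <mu, delta_z> = sum_j w_j ||v_j - z||"""
    return w @ np.linalg.norm(V - z, axis=1)

# L mu equals m = (2 + sqrt 2)/4 ~ 0.8536 at every point of supp(mu) ...
print([L_mu(v) for v in V])

# ... and stays strictly below m elsewhere in the square (convexity of
# L mu pushes its maximum over M to the extreme points)
print(L_mu(np.array([0.5, 0.5])))  # ~ 0.7071 at the center
rng = np.random.default_rng(1)
print(max(L_mu(z) for z in rng.random((10**4, 2))))  # < 0.8536
```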

To make further progress, we separate an exceptional case from the general argument:

  1. If it happens that $\operatorname{supp}(\mu)$ lies in a line, the problem reduces to a one-dimensional one. In this case, it is not hard to check that $\mu$ must be of the form $\mu = \frac{1}{2}(\delta_{x_0} + \delta_{x_1})$ for some distinct points $x_0$ and $x_1$, and $M$ must be a compact subset of the line segment $\overline{x_0x_1}$.

  2. Otherwise, $(L\mu)^{-1}((-\infty, m])$ is a strictly convex set and $(L\mu)^{-1}(\{m\}) = \partial (L\mu)^{-1}((-\infty, m])$. This may be used to further restrict the possible form that $\operatorname{supp}(\mu)$ can take.

  3. As a special case, consider the situation where the convex hull of $M$ is a convex polytope. Then $\mu$ is supported on the vertex set of that polytope. If $x_1, \cdots, x_n$ denote these vertices, then the problem reduces to solving the system of linear equations

    $$ L \mu = m\mathbf{1}, \qquad \mathbf{1}^{\mathsf{T}}\mu = 1 $$

    Here, we identify $L$ as the symmetric matrix $L = (\| x_j - x_k \|)_{j,k=1}^{n}$ and the measure $\mu$ as the column vector $(\mu(\{x_j\}))_{j=1}^{n}$. Also, $\mathbf{1}$ is the column vector of length $n$ consisting of only $1$'s. The conclusion is that $m = 1/(\mathbf{1}^\mathsf{T}L^{-1}\mathbf{1})$, or equivalently, $m$ is the inverse of the sum of all entries of $L^{-1}$. (A numerical sketch of this computation follows the list.)

  4. In some cases, we may use the characterization of the maximizer directly. For instance, if the convex hull of $M$ is a ball, then $\mu$ must be the uniform distribution over the boundary of the ball, again allowing an explicit computation of the optimal constant.
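Here is the linear-algebra computation of item 3 for a concrete polytope (a minimal sketch; the unit square is my example, and the nonnegativity of the computed $\mu$ must be checked separately, since for non-symmetric polytopes some vertices may have to be dropped from the support):

```python
import numpy as np

# Vertices of the unit square, whose convex hull is the polytope
V = np.array([[0, 0], [1, 0], [1, 1], [0, 1]], dtype=float)
L = np.linalg.norm(V[:, None, :] - V[None, :, :], axis=2)  # (||x_j - x_k||)

one = np.ones(len(V))
mu = np.linalg.solve(L, one)
m = 1 / (one @ mu)  # m = 1 / (1^T L^{-1} 1)
mu = m * mu         # normalize so that 1^T mu = 1 and L mu = m 1

print(mu)  # [0.25 0.25 0.25 0.25] -- uniform on the vertices
print(m)   # (2 + sqrt 2)/4 ~ 0.8536 = max of Q over P
```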

Sangchul Lee