Mean minimizes squared error in normed vector space

Question

The goal is to show that the cost function $$J(x) = \sum_{k=1}^n ||x - x_k ||^2$$ is minimized at $x = m$, where $m$ is the sample mean $m = \frac1n \sum_{k=1}^n x_k$.

I would like to stay as general as possible and consider a normed vector space $\left(X, ||\cdot||\right)$ where $x_1, ..., x_n$ are given points in $X$.

I came across a previous post, "Formal proof that mean minimize squared error function". Its proof uses the expansion:

\begin{align} J(x) &= \sum_{k=1}^n ||(x-m) - (x_k - m)||^2 \\ &= \sum_{k=1}^n ||x-m||^2 - 2(x-m)^T \sum_{k=1}^n (x_k-m) + \sum_{k=1}^n ||x_k - m||^2 \end{align}

How did they get the second term $2(x-m)^T \sum_{k=1}^n (x_k-m)$?

My guess is that they used the identity $||a-b||^2 = ||a||^2 - 2 a \cdot b + ||b||^2$. However, this identity assumes that an inner product $a \cdot b$ is defined, which is not the case in a general normed space. What if the norm we're considering isn't induced by an inner product?

How would this proof be extended to normed vector spaces that don't have an inner product?

EDIT : The answers have pointed out that this claim isn't true in general (see counter-example below).

As a follow-up question : is it true for $n=2$ ? That is, given two points $x_1$ and $x_2$ in $X$, does $\frac12 (x_1 + x_2)$ always minimize $J$ ?

Consider whether this is even true if the norm used isn't the 2-norm. – Brian Borchers, Mar 22 '23 at 23:14
@BrianBorchers Thanks for your reply. I couldn't find any counter-example, which leads me to believe m always minimizes J. If the norm used isn't the 2-norm, the minimizer might not be unique. My claim is that m is one of the minimizers. – mathnoob44, Mar 22 '23 at 23:19

score 2 · Accepted Answer · answered Mar 22 '23 at 23:34

The proof implicitly assumes the Euclidean inner product and norm (hence the inner product $\langle x,y\rangle$ is written as $x^{T}y$).
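To spell out that step (a sketch assuming the norm is the Euclidean one, so that $\|v\|^2=v^Tv$): apply $\|a-b\|^2=\|a\|^2-2a^Tb+\|b\|^2$ with $a=x-m$ and $b=x_k-m$, then sum over $k$. The cross term vanishes because $\sum_{k=1}^n (x_k-m)=nm-nm=0$:

$$\begin{align} J(x) &= n\|x-m\|^2 - 2(x-m)^T\sum_{k=1}^n (x_k-m) + \sum_{k=1}^n \|x_k-m\|^2 \\ &= n\|x-m\|^2 + J(m) \\ &\geq J(m). \end{align}$$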

In any case, the statement is not true under general norms. Consider the max norm, which, for $v\in \mathbb{R}^d$, is defined as:

$$\|v\|_\infty:=\max(|v_1|,...,|v_d|).$$

In the special case of $\mathbb{R}^2$, consider vectors $x_1=(1,0),x_2=(0,1),x_3=(0,0),$ which have a sample mean of $m=(1/3,1/3).$ However,

$$J(m)=(2/3)^2+(2/3)^2+(1/3)^2=1>3/4=(1/2)^2+(1/2)^2+(1/2)^2=J((1/2,1/2)).$$
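The arithmetic can be double-checked with a short script. This is only a sketch: the helper names `max_norm` and `cost` are made up for illustration, and exact rationals are used so that no rounding is involved.

```python
from fractions import Fraction

def max_norm(v):
    # Max norm: largest absolute coordinate.
    return max(abs(c) for c in v)

def cost(x, points):
    # J(x) = sum over k of ||x - x_k||^2, using the max norm.
    return sum(max_norm([a - b for a, b in zip(x, p)]) ** 2 for p in points)

# The three points from the counter-example, as exact rationals.
points = [
    (Fraction(1), Fraction(0)),
    (Fraction(0), Fraction(1)),
    (Fraction(0), Fraction(0)),
]
n = len(points)

# Sample mean m = (1/3, 1/3).
mean = tuple(sum(p[i] for p in points) / n for i in range(2))

j_mean = cost(mean, points)
j_half = cost((Fraction(1, 2), Fraction(1, 2)), points)

print(mean, j_mean, j_half)
```

Here `j_mean` is $J(m)$ and `j_half` is $J((1/2,1/2))$, so comparing the two reproduces the inequality above.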

Golden_Ratio


Neat, thank you! I only considered two points when looking for a counter-example, and couldn't find any. As a follow-up question : does my claim hold for $n=2$, i.e. with only two points $x_1$ and $x_2$ ? – mathnoob44 Mar 22 '23 at 23:43
@mathnoob44 See my second answer – Golden_Ratio Mar 23 '23 at 00:21

Golden_Ratio · Answer 2 · 2023-03-23T00:31:13.360

1

As for the updated question, the claim holds for $n=2$ since

$$\begin{align}
J(x)-J(m)&=\|x-x_1\|^2+\|x-x_2\|^2-\Bigg\|\frac{1}{2}(x_1+x_2)-x_1\Bigg\|^2-\Bigg\|\frac{1}{2}(x_1+x_2)-x_2\Bigg\|^2\\
&=\|x-x_1\|^2+\|x-x_2\|^2-\Bigg\|\frac{1}{2}(x_2-x_1)\Bigg\|^2-\Bigg\|\frac{1}{2}(x_1-x_2)\Bigg\|^2\\
&=\|x-x_1\|^2+\|x-x_2\|^2-2\Bigg\|\frac{1}{2}(x_2-x_1)\Bigg\|^2\\
&=\|x-x_1\|^2+\|x-x_2\|^2-\frac{1}{2}\|x_2-x_1\|^2\\
&=\|x-x_1\|^2+\|x-x_2\|^2-\frac{1}{2}\|x_2-x+x-x_1\|^2\\
&\geq \|x-x_1\|^2+\|x-x_2\|^2-\frac{1}{2}(\|x_2-x\|+\|x-x_1\|)^2\quad \text{(triangle ineq.)}\\
&= \frac{1}{2}(\|x-x_1\|^2+\|x-x_2\|^2-2\|x_2-x\|\|x-x_1\|)\\
&=\frac{1}{2}(\|x-x_1\|-\|x-x_2\|)^2\\
&\geq 0.
\end{align}$$
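As a numerical sanity check (not a proof), the inequality $J(x)\geq J(m)$ for $n=2$ can be sampled under a few $p$-norms. The sketch below assumes random points in $\mathbb{R}^3$; the helpers `p_norm`, `cost` and `worst_gap`, the choice of norms, and the trial count are all illustrative.

```python
import random

def p_norm(v, p):
    # p-norm of a vector; p = inf gives the max norm.
    if p == float("inf"):
        return max(abs(c) for c in v)
    return sum(abs(c) ** p for c in v) ** (1.0 / p)

def cost(x, points, p):
    # J(x) = sum over k of ||x - x_k||_p^2.
    return sum(p_norm([a - b for a, b in zip(x, q)], p) ** 2 for q in points)

def worst_gap(p, trials=2000, dim=3, seed=0):
    # Smallest observed value of J(x) - J(m) over random x1, x2, x.
    rng = random.Random(seed)
    worst = float("inf")
    for _ in range(trials):
        x1 = [rng.uniform(-1, 1) for _ in range(dim)]
        x2 = [rng.uniform(-1, 1) for _ in range(dim)]
        x = [rng.uniform(-1, 1) for _ in range(dim)]
        m = [(a + b) / 2 for a, b in zip(x1, x2)]
        gap = cost(x, [x1, x2], p) - cost(m, [x1, x2], p)
        worst = min(worst, gap)
    return worst

gaps = {p: worst_gap(p) for p in (1, 1.5, 2, 3, float("inf"))}
for p, gap in gaps.items():
    print(p, gap)
```

Each printed gap should be nonnegative up to floating-point rounding, in line with the derivation above.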

edited Mar 23 '23 at 00:31

answered Mar 23 '23 at 00:21


Amazing! Thank you :) – mathnoob44 Mar 23 '23 at 13:01
@mathnoob44 no prob! – Golden_Ratio Mar 23 '23 at 15:12
