
For a data set $x_1,\dots,x_n$ of $n$ values with mean $\bar{x}$, the variance is defined as $$\sigma^2=\frac{\displaystyle\sum_{i=1}^n(x_i-\bar{x})^2}{n}$$

My textbook says that “the square ensures that each term in the sum is positive, which is why the sum turns out not to be zero.” However, wouldn’t $$\sigma^2=\frac{\displaystyle\sum_{i=1}^n|x_i-\bar{x}|}{n}$$ also prevent a sum of zero?
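For concreteness, here is a minimal Python sketch (the data values are made up for illustration) showing that both formulas are indeed non-negative, though they generally give different values:

```python
# Toy data (assumed for illustration) comparing the two dispersion
# measures: population variance vs. the absolute-value version.
xs = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
n = len(xs)
mean = sum(xs) / n

variance = sum((x - mean) ** 2 for x in xs) / n    # squared deviations
mean_abs_dev = sum(abs(x - mean) for x in xs) / n  # absolute deviations

print(mean)          # 5.0
print(variance)      # 4.0
print(mean_abs_dev)  # 1.5
```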

2 Answers


Yes, your expression is also non-negative. In the language of linear algebra, your expression is based on the $\ell^1$ norm, while the definition of variance is based on the $\ell^2$ norm. More generally, for any real number $p\geq 1$ there is an $\ell^p$ norm $$ ||x||_p=\Big[\sum_{i=1}^n|x_i|^p\Big]^{\frac{1}{p}} $$ as well as the $\ell^{\infty}$ norm $$ ||x||_{\infty}=\max\{|x_1|,\dots,|x_n|\}$$ These other norms can be useful in some contexts, but the $\ell^2$ norm is the most useful because it is the only one which is induced by an inner product (except for the trivial case $n=1$ when all these norms are the same). This means that one can then talk about angles between vectors.
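To make the norm picture concrete, here is a small Python sketch (using assumed toy data) that evaluates the $\ell^1$, $\ell^2$, and $\ell^{\infty}$ norms of the deviation vector $x_i-\bar{x}$; dividing the squared $\ell^2$ norm by $n$ recovers the variance, while dividing the $\ell^1$ norm by $n$ gives the absolute-value version from the question:

```python
# Assumed toy data; the l^p norms are applied to the deviations x_i - mean.
xs = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
n = len(xs)
mean = sum(xs) / n
devs = [x - mean for x in xs]

def lp_norm(v, p):
    """l^p norm of a vector v, for a real number p >= 1."""
    return sum(abs(c) ** p for c in v) ** (1.0 / p)

l1 = lp_norm(devs, 1)                 # sum of |x_i - mean|
l2 = lp_norm(devs, 2)                 # Euclidean length of the deviations
linf = max(abs(c) for c in devs)      # l^infinity norm: largest deviation

print(l1 / n)       # 1.5 -> the questioner's absolute-value measure
print(l2 ** 2 / n)  # 4.0 -> the variance
print(linf)         # 4.0 -> worst-case single deviation
```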

carmichael561

Variance is a function of the first and second moments of a distribution: $\sigma^2=\mathbb{E}[X^2]-(\mathbb{E}[X])^2$.

The expression above can be thought of either as an estimate of the variance of the distribution given samples, or as a discrete approximation of it.

Now, its non-negativity is a property, not the reason why it was defined that way.
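As a quick numerical check of the moment relationship (toy data assumed), the identity $\sigma^2=\mathbb{E}[X^2]-(\mathbb{E}[X])^2$ can be verified directly:

```python
# Verify that the variance is determined by the first two moments
# (sample versions, with assumed toy data).
xs = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
n = len(xs)

m1 = sum(xs) / n                 # first moment (mean)
m2 = sum(x * x for x in xs) / n  # second moment

variance_from_moments = m2 - m1 ** 2
variance_direct = sum((x - m1) ** 2 for x in xs) / n

print(variance_from_moments)  # 4.0
print(variance_direct)        # 4.0, matching the moment formula
```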

Royi