
Suppose that we have two different discrete $N$-dimensional signal vectors, namely $\mathbf{x}[i]$ and $\mathbf{y}[i]$, each consisting of a total of $M$ sample vectors.

$\mathbf{x}[m] = [x_{m,1} \;\; x_{m,2} \;\; x_{m,3} \;\; \cdots \;\; x_{m,N}]^\text{T}, \qquad 1 \leq m \leq M$
$\mathbf{y}[m] = [y_{m,1} \;\; y_{m,2} \;\; y_{m,3} \;\; \cdots \;\; y_{m,N}]^\text{T}, \qquad 1 \leq m \leq M$

I then build a cross-covariance matrix between these signals,

$\{C\}_{ij} = E\left\{(\mathbf{x}[i] - \bar{\mathbf{x}}[i])^\text{T}(\mathbf{y}[j] - \bar{\mathbf{y}}[j])\right\}, \qquad 1 \leq i,j \leq M$

where $E\{\cdot\}$ is the expected-value operator.

What is the proof that, for arbitrary vector sets $\mathbf{x}$ and $\mathbf{y}$, the covariance matrix $C$ is always positive semi-definite ($C \succeq 0$), i.e., all of its eigenvalues are non-negative?
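For concreteness, here is one possible numerical sketch of this construction, under the assumption that $E\{\cdot\}$ is estimated by averaging over $K$ independent realizations of each signal (the question does not specify this); numpy and the variable names are purely illustrative:

```python
import numpy as np

# Illustrative sizes: M signal vectors of dimension N, each observed over K realizations.
rng = np.random.default_rng(0)
M, N, K = 4, 3, 1000

# x[k, m, :] and y[k, m, :] hold the k-th realization of x[m] and y[m].
x = rng.normal(size=(K, M, N))
y = rng.normal(size=(K, M, N))

x_bar = x.mean(axis=0)   # estimate of the mean vector of each x[m]
y_bar = y.mean(axis=0)   # estimate of the mean vector of each y[m]

# C[i, j] estimates E{(x[i] - xbar[i])^T (y[j] - ybar[j])} by a sample average.
C = np.zeros((M, M))
for i in range(M):
    for j in range(M):
        C[i, j] = np.mean(np.sum((x[:, i, :] - x_bar[i]) * (y[:, j, :] - y_bar[j]), axis=1))

print(C.shape)  # (M, M)
```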

hkBattousai
  • 4,543

3 Answers


A symmetric matrix $C$ of size $n\times n$ is positive semi-definite if and only if $u^tCu\geqslant0$ for every $n\times1$ (column) vector $u$, where $u^t$ is the $1\times n$ transposed (row) vector. If $C$ is a covariance matrix in the sense that $C=\mathrm E(XX^t)$ for some $n\times 1$ random vector $X$, then the linearity of the expectation yields that $u^tCu=\mathrm E(Z_u^2)$, where $Z_u=u^tX$ is a real-valued random variable; in particular, $u^tCu\geqslant0$ for every $u$.

If $C=\mathrm E(XY^t)$ for two centered random vectors $X$ and $Y$, then $u^tCu=\mathrm E(Z_uT_u)$ where $Z_u=u^tX$ and $T_u=u^tY$ are two real-valued centered random variables. Thus, there is no reason to expect that $u^tCu\geqslant0$ for every $u$ (and, indeed, $Y=-X$ provides a counterexample).
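A quick numerical illustration of both points (a sketch assuming numpy, with the expectations replaced by sample averages; all names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
n, samples = 3, 100_000

# Zero-mean random vector X; estimate C = E(X X^t) by a sample average.
X = rng.normal(size=(samples, n))
C_xx = X.T @ X / samples

# u^t C u = E((u^t X)^2) >= 0; equivalently, all eigenvalues of C are non-negative.
print(np.linalg.eigvalsh(C_xx))   # all non-negative (up to sampling noise)

# Counterexample for the cross-covariance: Y = -X gives C = E(X Y^t) = -E(X X^t).
Y = -X
C_xy = X.T @ Y / samples
u = rng.normal(size=n)
print(u @ C_xy @ u)               # strictly negative, so this C is not PSD
```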

Did
  • 279,727

The covariance matrix $\mathbf{C}$ is defined by the formula $$ \mathbf{C} \triangleq E\{(\mathbf{x}-\bar{\mathbf{x}})(\mathbf{x}-\bar{\mathbf{x}})^T\}. $$ We are going to use the definition of a positive semi-definite matrix, which says:

A real symmetric matrix $\mathbf{A}$ is positive semi-definite if and only if
$\mathbf{b}^T\mathbf{A}\mathbf{b}\geq0$
holds for every real column vector $\mathbf{b}$ of appropriate size.

For an arbitrary real vector $\mathbf{u}$, we can write $$ \begin{array}{rcl} \mathbf{u}^T\mathbf{C}\mathbf{u} & = & \mathbf{u}^TE\{(\mathbf{x}-\bar{\mathbf{x}})(\mathbf{x}-\bar{\mathbf{x}})^T\}\mathbf{u} \\ & = & E\{\mathbf{u}^T(\mathbf{x}-\bar{\mathbf{x}})(\mathbf{x}-\bar{\mathbf{x}})^T\mathbf{u}\} \\ & = & E\{s^2\} \\ & = & \sigma_s^2, \\ \end{array} $$ where $\sigma_s^2$ is the variance of the zero-mean scalar random variable $s$, defined by $$ s = \mathbf{u}^T(\mathbf{x}-\bar{\mathbf{x}}) = (\mathbf{x}-\bar{\mathbf{x}})^T\mathbf{u}. $$ Since $s^2 \ge 0$ for every realization of $s$, its expectation is also non-negative, $$ \sigma_s^2 = E\{s^2\} \ge 0. $$ Thus, $$ \mathbf{u}^T\mathbf{C}\mathbf{u} = \sigma_s^2 \ge 0, $$ which implies that the covariance matrix of any real random vector is always positive semi-definite.
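As a sanity check, the identity $\mathbf{u}^T\mathbf{C}\mathbf{u} = \sigma_s^2$ can be verified on simulated data; this is only a sketch assuming numpy, with $E\{\cdot\}$ replaced by sample averages and all names illustrative:

```python
import numpy as np

rng = np.random.default_rng(2)
n, samples = 4, 50_000

# Correlated samples of the random vector x (one realization per row).
x = rng.normal(size=(samples, n)) @ rng.normal(size=(n, n))
x_bar = x.mean(axis=0)

# Sample estimate of C = E{(x - xbar)(x - xbar)^T}.
C = (x - x_bar).T @ (x - x_bar) / samples

u = rng.normal(size=n)
s = (x - x_bar) @ u               # realizations of s = u^T (x - xbar)

# u^T C u equals the variance of s; both are non-negative.
print(u @ C @ u, s.var())
```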

hkBattousai
  • 4,543
  • Interesting... Did you compare your approach to (a part of) an answer posted one year earlier? – Did Apr 24 '14 at 06:19
  • Rereading this answer five years later, I realize it is actually completely wrong, confusing random variables with real numbers. Nice upvotes though... – Did Nov 24 '18 at 19:51
  • @Did: Completely agree with you. No idea why this answer is even allowed here. – Akshay Bansal Feb 15 '19 at 09:25
  • @Did what do you mean, what's the matter here? Reading that second answer made me understand that the whole proof of the first answer lies upon the fact that there is a number that is squared, thus non-negative. What's wrong with that second answer? – Marine Galantin Apr 16 '20 at 15:39
  • @MarineGalantin: In the second calculation hkBattousai states that "s = u^T(x-xbar)" and in the third calculation "sigma_s = u^T(x-xbar)". s is a stochastic variable and sigma_s is the variance. – aaa May 29 '20 at 09:40
  • @aaa what do you mean? I don't understand your comment. – Marine Galantin May 29 '20 at 12:50
  • @MarineGalantin The incorrectness lies in that hkBattousai mixes the stochastic variable "s" with the variance "sigma", that is, he treats them as if they were equal. (That's at least how I understood it, I might be wrong.) – aaa May 29 '20 at 13:12
  • For anyone reading this now, the answer has been corrected. – dkarkada Feb 22 '24 at 20:41

This is the correct way to justify and elaborate on hkBattousai's answer.

Let $\mu=(E(X_{1}),...,E(X_{n}))^{T}$ and let $\Sigma$ be the covariance matrix of the random vector $(X_{1},...,X_{n})^{T}$. Then for any $v\in\Bbb{R}^{n}$ we have

\begin{align}\langle v,\Sigma v\rangle&=\sum_{i,j=1}^{n}v_{i}\Sigma_{ij}v_{j}\\ &= \sum_{i,j=1}^{n}v_{i}E\bigg((X_{i}-\mu_{i})(X_{j}-\mu_{j})\bigg)v_{j}.\end{align}

Now consider the random variable $\displaystyle Y=\sum_{i=1}^{n}v_{i}X_{i}$.

Then $\displaystyle\bigg(Y-\sum_{i=1}^{n}v_{i}\mu_{i}\bigg)^{2}=\sum_{i,j=1}^{n}\left(v_{i}X_{i}-v_{i}\mu_{i}\right)\left(v_{j}X_{j}-v_{j}\mu_{j}\right)=\sum_{i,j=1}^{n}v_{i}\left(X_{i}-\mu_{i}\right)(X_{j}-\mu_{j})v_{j}$

Thus $\displaystyle E\bigg(\sum_{i,j=1}^{n}v_{i}\left(X_{i}-\mu_{i}\right)(X_{j}-\mu_{j})v_{j}\bigg)=\sum_{i,j=1}^{n}v_{i}E\bigg((X_{i}-\mu_{i})(X_{j}-\mu_{j})\bigg)v_{j}=Var(Y)\geq0$

So $\langle v,\Sigma v\rangle\geq 0$ for all $v\in\Bbb{R}^{n}$, and that's basically it.
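The closing identity $\langle v,\Sigma v\rangle=\mathrm{Var}(Y)$ can also be checked numerically; a sketch assuming numpy, where `np.cov(..., bias=True)` is used so the normalization matches `Y.var()`:

```python
import numpy as np

rng = np.random.default_rng(3)
n, samples = 3, 200_000

# Samples of the random vector (X_1, ..., X_n), one realization per row.
X = rng.normal(size=(samples, n)) @ rng.normal(size=(n, n))
Sigma = np.cov(X, rowvar=False, bias=True)   # sample covariance matrix

v = rng.normal(size=n)
Y = X @ v                                    # realizations of Y = sum_i v_i X_i

print(v @ Sigma @ v, Y.var())                # equal, and non-negative
```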