8

Why is covariance considered an inner product if there is no projection of one vector onto another?

Right now I perceive it as just a multiplication of the $x$ component of the vector ($x_i - \bar{x}$) and the $y$ component ($y_i - \bar{y}$) of the same vector, done in order to understand the direction of the relationship.

Stephen Rauch
  • 1,783
  • 11
  • 22
  • 34
user641597
  • 143
  • 3
  • 7

2 Answers

2

Definition

An inner product (also known as the dot product or scalar product) can be defined on two vectors $\mathbf{x}$ and $\mathbf{y} \in \mathcal{R}^n$ as

$$ \mathbf{x}^\top\mathbf{y} = \langle \mathbf{x},\mathbf{y}\rangle_{\mathcal{R}^n}=\langle \mathbf{y},\mathbf{x}\rangle_{\mathcal{R}^n} = \sum_{i=1}^{n} x_i \, y_i $$

The inner product can be seen as the signed length of the projection of one vector onto another (scaled by the length of the latter), and it is widely used as a similarity measure between two vectors.
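To make this concrete, here is a small NumPy sketch (the vectors are just made-up examples) showing that the coordinate-wise sum and the projection interpretation give the same number:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])
y = np.array([4.0, 0.0, -1.0])

# Coordinate-wise definition of the inner product
dot = np.sum(x * y)

# Geometric view: <x, y> = (signed length of the projection of x onto y) * ||y||
y_hat = y / np.linalg.norm(y)    # unit vector in the direction of y
proj_length = x @ y_hat          # signed length of the projection of x onto y

print(dot, proj_length * np.linalg.norm(y))  # both give the same number
```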

The inner product also has the following properties:

  • It is symmetric
  • It is bilinear
  • It is positive-definite

The covariance of two random variables $X$ and $Y$ can be defined as

$$ E[(X-E[X]) \times (Y - E[Y])] $$

The covariance has the properties of being symmetric (commutative), bilinear and positive semi-definite.

These properties imply that the covariance is an inner product on a vector space, more specifically on the quotient space in which random variables whose difference is almost surely constant are identified (it is on this quotient that positive-definiteness holds).
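As a rough numerical check (sample covariance on simulated data; all names are purely illustrative), these properties can be verified directly:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=1000)
y = rng.normal(size=1000)
z = rng.normal(size=1000)

def cov(a, b):
    """Sample covariance of two equal-length samples."""
    return np.mean((a - a.mean()) * (b - b.mean()))

# Symmetry: Cov(X, Y) = Cov(Y, X)
print(np.isclose(cov(x, y), cov(y, x)))

# Bilinearity (in the first argument): Cov(aX + bZ, Y) = a Cov(X, Y) + b Cov(Z, Y)
a, b = 2.0, -3.0
print(np.isclose(cov(a * x + b * z, y), a * cov(x, y) + b * cov(z, y)))

# Positive semi-definiteness: Cov(X, X) = Var(X) >= 0
print(cov(x, x) >= 0)
```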

Association with the kernel trick

If you are familiar with Support Vector Machines, you are probably familiar with the kernel trick, where you implicitly compute the inner product of two vectors in a mapped space, called the feature space. Given such a mapping, you can compute the inner product in a possibly infinite-dimensional space without ever performing the mapping explicitly.

To perform that inner product, you need to find a function, known as a kernel function, that can compute this inner product without explicitly mapping the vectors.

For a kernel function to exist it needs to have the following attributes:

  • It needs to be symmetric
  • It needs to be positive-definite

These conditions are necessary and sufficient for a function $\kappa(\mathbf{x},\mathbf{y})$ to be an inner product in some vector space $\mathcal{H}$.

Since the covariance complies with this definition, it is a kernel function, and consequently it is an inner product in a vector space.
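For instance, a small NumPy sketch (with arbitrary simulated data) suggests how this plays out in practice: the Gram matrix built from the sample covariance used as a kernel comes out symmetric and positive semi-definite:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(5, 50))   # 5 "random variables", each observed 50 times

def cov_kernel(a, b):
    """Sample covariance used as a kernel function."""
    return np.mean((a - a.mean()) * (b - b.mean()))

# Gram matrix of the kernel over the 5 variables
K = np.array([[cov_kernel(a, b) for b in X] for a in X])

print(np.allclose(K, K.T))                      # symmetric
print(np.all(np.linalg.eigvalsh(K) >= -1e-10))  # positive semi-definite (up to float error)
```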

Pedro Henrique Monforte
  • 1,656
  • 1
  • 12
  • 26
  • There is a mistake in the definition of positive-definiteness: the condition $x \neq 0$ is missing. Additionally, the covariance is not positive-definite. Indeed, consider a non-zero constant random variable: it is not zero and yet its variance is 0. – Digitallis Dec 19 '23 at 16:13
0

The usual inner product on the space of random variables with finite second moments is defined by $$ \langle X, Y \rangle = E(XY)$$

This means that $Cov(X,Y) = \langle X-E(X), Y - E(Y) \rangle$. That is, the covariance between two random variables (with finite second moments) is the inner product of the corresponding centered variables.
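A quick empirical illustration of this identity (simulated data; the names and numbers are purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, size=10_000)   # E[X] = 5
y = 2 * x + rng.normal(size=10_000)    # correlated with x

inner = lambda a, b: np.mean(a * b)    # empirical analogue of E(XY)

cov_xy = np.mean((x - x.mean()) * (y - y.mean()))

# Cov(X, Y) equals the inner product of the centered variables...
print(np.isclose(cov_xy, inner(x - x.mean(), y - y.mean())))  # True
# ...but not the inner product of the uncentered variables
print(np.isclose(cov_xy, inner(x, y)))                        # False
```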


However, the covariance function $Cov(\cdot,\cdot) : X,Y \mapsto Cov(X,Y)$ is not an inner product (on the space of random variables with finite second moments)! Indeed, it would have to satisfy the following properties:

  1. Linear in each argument
  2. Symmetric, i.e. $$ Cov(X,Y) = Cov(Y,X)$$
  3. Positive-definite, i.e. $$ Cov(X,X) \geq 0$$ and $$ Cov(X,X)= 0 \iff X = 0$$ where $Cov(X,X) := Var(X)$

It is this last condition which is problematic. Indeed, recall that for any constant random variable $c \neq 0$ we have

$$ Var(c) = 0,$$ which contradicts the last required property.
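A tiny numeric illustration (the constant 7 is arbitrary):

```python
import numpy as np

c = np.full(1000, 7.0)   # a "sample" from the constant random variable c = 7
print(np.var(c))         # 0.0, even though c != 0, so positive-definiteness fails
```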

So unless you restrict yourself to a space of random variables with mean 0, the covariance will not be an inner product.

Digitallis
  • 101
  • 1