4

I am currently trying to understand the following three points which we discussed in lectures recently:

  1. We say that $X=(X_1,\ldots,X_d)$ is $d$-dimensional multivariate Gaussian distributed if $X\sim N(\mu,Q)$ for some $\mu\in\mathbb R^d$, $Q=(q_{ij})\in\mathbb R^{d\times d}$, that is, $X_i\sim N(\mu_i, q_{ii})$ and $\text{cov}(X_i,X_j)=q_{ij}$.

  2. There holds: $X=(X_1,\ldots,X_d)$ is $d$-dimensional multivariate Gaussian distributed if and only if any linear combination of $X_1,\ldots,X_d$ is Gaussian.

  3. If $X,Y$ are Gaussian variables, then $X+Y$ need not be Gaussian (in general). This only holds true if $X,Y$ are independent.


I get the feeling that the "multivariate Gaussian" definition in point (1) is somewhat wrong (or incomplete), because otherwise (2) and (3) would contradict each other. But (2) seems correct (as I have found it in many other lecture notes online) and (3) seems correct as well (because of Simon Nickerson's comment in "Proof that the sum of two Gaussian variables is another Gaussian").

I know there are other definitions for "multivariate Gaussian", which do not contradict (2) and (3), but I am basically wondering whether there is any way of fixing the definition I have got, or is it just plain wrong?

Phil-ZXX
  • 3,194

1 Answer

9

There is no contradiction. "$X$,$Y$ are Gaussian variables" is not equivalent to "$(X,Y)$ is Gaussian".

What is true is: if $X_1,X_2$ are jointly Gaussian (or, equivalently, if $(X_1,X_2)$ is a 2-dimensional Gaussian), then $X_1 + X_2$ (or any linear combination) is Gaussian.
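
As a worked version of that statement (a sketch, assuming $(X_1,X_2)\sim \mathcal N(\mu,Q)$ with $\mu=(\mu_1,\mu_2)$ and $Q=(q_{ij})$; the coefficients $a_1,a_2$ are introduced here only for illustration): $$a_1X_1+a_2X_2\sim \mathcal N\left(a_1\mu_1+a_2\mu_2,\; a_1^2 q_{11}+2a_1a_2 q_{12}+a_2^2 q_{22}\right)$$ In particular, $X_1+X_2\sim \mathcal N(\mu_1+\mu_2,\; q_{11}+2q_{12}+q_{22})$, which reduces to the familiar independent case when $q_{12}=0$.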

Now, to say that each of $X_1$ and $X_2$ is a Gaussian variable, only implies that the multivariate variable $(X_1,X_2)$ has some joint distribution, with the property that each marginal is Gaussian. This does not imply that they are jointly Gaussian.

Point 3 is correct but slightly misleading. It should really say: "This only holds true if $X,Y$ are jointly Gaussian (independence is just a particular case)."

For example: let $Z_1,Z_2$ be two iid $N(0,1)$ Gaussian variables. Let $X_1=Z_1$, $X_2= \operatorname{sgn}(Z_1) |Z_2|$. Check that $X_1$ and $X_2$ are both Gaussian, but $(X_1,X_2)$ is not jointly Gaussian. And check that $Y=X_1+X_2$ is not Gaussian (its density at $Y=0$ is zero).
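
The counterexample can also be checked numerically. The following is only a rough Monte Carlo sketch using the Python standard library; the function name `simulate`, the sample size, the seed and the window width `eps` are all arbitrary choices made for illustration. It estimates the mean and variance of $X_2$ (which should look like those of $N(0,1)$) and compares how often $Y=X_1+X_2$ lands near $0$ with what a centered Gaussian of the same variance would give.

```python
import math
import random

def simulate(n=100_000, seed=0, eps=0.1):
    """Monte Carlo check of X1 = Z1, X2 = sgn(Z1)|Z2|, Y = X1 + X2."""
    rng = random.Random(seed)
    x2s, ys = [], []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        x1 = z1
        x2 = math.copysign(abs(z2), z1)  # |Z2| carrying the sign of Z1
        x2s.append(x2)
        ys.append(x1 + x2)

    # Sample mean and variance of X2 (expected to be close to 0 and 1).
    mean_x2 = sum(x2s) / n
    var_x2 = sum((v - mean_x2) ** 2 for v in x2s) / n

    # Fraction of Y samples falling in the window (-eps, eps).
    mean_y = sum(ys) / n
    var_y = sum((v - mean_y) ** 2 for v in ys) / n
    frac_near_zero = sum(1 for v in ys if abs(v) < eps) / n

    # Mass a centered Gaussian with the same variance puts on (-eps, eps).
    gauss_frac = math.erf(eps / math.sqrt(2.0 * var_y))
    return mean_x2, var_x2, frac_near_zero, gauss_frac

if __name__ == "__main__":
    print(simulate())
```

If $Y$ were Gaussian, the last two returned values would roughly agree; in this simulation the observed fraction near $0$ should come out far smaller, consistent with the density of $Y$ vanishing at $0$.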


Edited: point 1 can indeed be confusing. What follows "that is..." is not part of the definition, but a consequence: if $X$ is $d$-dimensional Gaussian with this and that parameter, then each component $X_i$ is also Gaussian in itself, with these and those parameters; but the converse is not true.

To make point 1 a real definition one could say: $X$ is $d$-dimensional multivariate Gaussian distributed if $X\sim \mathcal N(\mu,Q)$ for some $\mu\in\mathbb R^d$, and $Q=(q_{ij})\in\mathbb R^{d\times d}$ positive definite. And by $X\sim \mathcal N(\mu,Q)$ we mean that its density is: $$f_X(x)=\frac{1}{\sqrt{(2 \pi)^d |Q|}} \exp \left(-\frac{(x-\mu)^t Q^{-1}(x-\mu)}{2}\right)$$

An alternative constructive definition would be: $X$ is $d$-dimensional multivariate Gaussian if there exists some linear (affine) transformation $Z=AX+b$ that results in a set of $d$ iid $\mathcal N(0,1)$ variables.
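
One way to make the constructive definition concrete for $d=2$ is via a Cholesky factor $L$ with $LL^t=Q$: then $X=\mu+LZ$ is Gaussian with parameters $(\mu,Q)$, and $Z=L^{-1}(X-\mu)$ is the transformation $Z=AX+b$ with $A=L^{-1}$, $b=-L^{-1}\mu$. The sketch below assumes $Q$ is a symmetric positive definite $2\times 2$ matrix; the helper names `cholesky_2x2`, `sample_gaussian_2d` and `whiten` are made up for this illustration, and Cholesky is just one possible choice of $A$.

```python
import math
import random

def cholesky_2x2(q):
    """Entries (l11, l21, l22) of lower-triangular L with L L^t = q.

    Assumes q is a symmetric positive definite 2x2 matrix.
    """
    (a, b), (_, c) = q
    l11 = math.sqrt(a)
    l21 = b / l11
    l22 = math.sqrt(c - l21 * l21)
    return l11, l21, l22

def sample_gaussian_2d(mu, q, rng):
    """Draw X = mu + L Z, where Z has two iid N(0,1) components."""
    l11, l21, l22 = cholesky_2x2(q)
    z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    return (mu[0] + l11 * z1, mu[1] + l21 * z1 + l22 * z2)

def whiten(x, mu, q):
    """Invert the map above: Z = L^{-1}(x - mu), by forward substitution."""
    l11, l21, l22 = cholesky_2x2(q)
    z1 = (x[0] - mu[0]) / l11
    z2 = (x[1] - mu[1] - l21 * z1) / l22
    return z1, z2

if __name__ == "__main__":
    rng = random.Random(0)
    mu, q = (1.0, -2.0), ((4.0, 2.0), (2.0, 3.0))
    x = sample_gaussian_2d(mu, q, rng)
    print(x, whiten(x, mu, q))
```

Note that the pair $(X_1,X_2)$ from the counterexample above admits no such $A$ and $b$, which is another way of seeing that it is not jointly Gaussian.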

leonbloy
  • 63,430
  • The contradiction I see is: Let $X_1$ and $X_2$ be any two Gaussian variables, i.e. $X_1\sim N(a,b)$, $X_2\sim N(c,d)$ for some $a,b,c,d$. So according to point 1 we have that $X=(X_1,X_2)$ is a multivariate Gaussian distribution. So by point 2, any linear combination is Gaussian, in particular $X_1+X_2$, but then point 3 becomes meaningless. Hence my question, is there anything wrong with point 1? :/ – Phil-ZXX Dec 06 '13 at 02:34
  • I see, I added some clarification. – leonbloy Dec 06 '13 at 02:56