
I understand that, via the Cholesky decomposition, a multivariate normal vector $X=[X_1, \dots, X_n]^T \sim N(0,\Sigma)$ can be simulated as $RZ$, where $Z=[Z_1, \dots, Z_n]^T \sim N(0,I_{n\times n})$ and $R$ is a lower triangular matrix such that $RR^T=\Sigma$.

My question is:

Assume that for each variable $X_i$ there are $m$ samples. Notice that $\Sigma_{ij} =\operatorname{cov}(X_i,X_j)=\frac{1}{m}\sum_{k=1}^{m}(x_{i,k}-\overline{x}_i)(x_{j,k}-\overline{x}_j)=\sum_{k=1}^{m}y_{i,k}\,y_{j,k}$, where $y_{i,k} = \frac{x_{i,k}-\overline{x}_i}{\sqrt{m}}$. Therefore, we can define an $n\times m$ matrix $A$ with $a_{ik}=y_{i,k}$. It is easy to see that

$AA^T=\Sigma$

So $X$ can be simulated as $AZ'$, where $Z'=[Z_1,\dots,Z_m]^T \sim N(0,I_{m\times m})$. Why don't people simply use this method to simulate a multivariate normal distribution, avoiding the computation of $R$? What is the disadvantage of this method compared to the Cholesky decomposition?
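For concreteness, the proposed construction can be sketched in NumPy as follows. This is a minimal sketch, not code from the question; the `samples` array (rows are variables, columns are the $m$ observations) is hypothetical data standing in for whatever samples are available:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: m = 1000 observations of each of n = 3 variables.
n, m = 3, 1000
samples = rng.standard_normal((n, m))

# A has entries a_{ik} = (x_{i,k} - xbar_i) / sqrt(m), so A A^T equals
# the (biased, 1/m-normalised) sample covariance matrix of the data.
A = (samples - samples.mean(axis=1, keepdims=True)) / np.sqrt(m)
sample_cov = np.cov(samples, bias=True)  # bias=True gives the 1/m normalisation
assert np.allclose(A @ A.T, sample_cov)

# One draw of the proposed simulation: X = A Z' with Z' ~ N(0, I_m),
# i.e. m univariate standard normals and an (n x m) matrix-vector product.
z = rng.standard_normal(m)
x = A @ z
```

Note that each draw consumes $m$ standard normal variates, one per column of $A$.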

dh7

1 Answer


Your equations aren't valid. You've equated the covariance matrix of the distribution with an estimate of the covariance matrix based on a sample. Such a method wouldn't be used because you'd be perpetuating the random bias in the samples by effectively redefining the distribution to match your samples. The case $m=1$ makes it particularly evident that this doesn't simulate the actual target distribution.

joriki
  • From my understanding, if we are going to use the Cholesky decomposition, an estimate of the covariance matrix based on the samples still needs to be computed, no? Only then can the lower triangular matrix be calculated and the simulation performed. – dh7 Jun 26 '18 at 21:13
  • @doudou: That depends entirely on your application. You didn't provide the context in which you're doing this. There are many applications where you know the covariance and don't need to estimate it from samples. However, if you restrict your question to cases where the covariance is unknown and needs to be estimated from samples, I think it might be an interesting question and one could try to analyse the respective advantages of these two methods. – joriki Jun 26 '18 at 21:41
  • One serious disadvantage of your method is that it requires $m$ univariate Gaussians and an $n\times m$ matrix multiplication, which could be huge if you take lots of samples to get a good estimate of the covariance matrix, whereas the conventional method only requires $n$ univariate Gaussians and an $n\times n$ matrix multiplication. – joriki Jun 26 '18 at 21:41
  • Thanks for your reply, @joriki. Sorry, I should've added the background: the covariance matrix needs to be estimated, as only samples of the variables are available. You are right that if $m \gg n$, the method I mentioned above requires many more random numbers to be generated; in the case $n \gg m$, my method should be preferred. Also, if we add or remove one $X_i$, the Cholesky decomposition requires recalculating the lower triangular matrix, since the dimension has changed. In my method no such calculation is needed: we just add or remove one row of the matrix $A$. – dh7 Jun 26 '18 at 22:44