This is my first post on math.stackexchange and I am not a mathematician, but I took some undergraduate math courses and some graduate courses on mathematical modelling, so I come with a basic understanding of math and have already computed a lot of eigenvectors and eigenvalues. I hope my notation is easy to follow.
I am trying to describe a two-dimensional scatter plot by 3 parameters, which define the shape of an ellipse (i.e. the length of the major axis, the length of the minor axis, and the angle between the major axis and the ordinate of a Cartesian coordinate system). The reason is that the scatter plot changes over time, and I want to track that change by tracking the change of those 3 parameters, i.e. the shape of the ellipse. I do not need a confidence region or anything like that, just the parameters of an ellipse that fits my data.
All I have (and need) are the x- and y-coordinates of the scatter plot. I understand that the eigenvectors of the covariance matrix give me the orientation of the major and minor axes, which in turn gives me the angle between the major axis and the ordinate (just as in a principal component analysis). What I do not understand is how to get the lengths of the major and minor axes from the covariance matrix. I found two different approaches:
- A similar post on math.stackexchange links to source code on Wikipedia. There, a Cholesky decomposition of the covariance matrix is computed and used to transform a unit circle. I cannot follow this mathematically, nor do I know why it would be correct.
- A similar post on stackoverflow defines the width and the height (i.e. the lengths of the major and minor axes) as 2 * nstd * np.sqrt(vals), where vals are the eigenvalues of the covariance matrix and nstd denotes the number of standard deviations. In my case nstd would be 3.
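A minimal numerical check of the first approach (my own sketch, not the linked Wikipedia code): if $Cov = LL^T$ is the Cholesky decomposition, then applying $L$ to points on the unit circle traces out an ellipse, and applying $L$ to standard-normal samples produces data whose covariance is exactly $Cov$ — which would explain why transforming a unit circle by $L$ gives the right ellipse.

```python
import numpy as np

cov = np.array([[9.0, 2.4], [2.4, 4.0]])
L = np.linalg.cholesky(cov)                  # lower-triangular, cov = L @ L.T

# Points on the unit circle, shape (2, N)
t = np.linspace(0.0, 2.0 * np.pi, 200)
circle = np.vstack([np.cos(t), np.sin(t)])

# Transforming the circle by L yields the "1-sigma" ellipse of cov
ellipse = L @ circle

# Sanity check: if z ~ N(0, I), then L @ z has covariance L @ I @ L.T = cov
z = np.random.default_rng(0).standard_normal((2, 100_000))
samples = L @ z
print(np.cov(samples))                       # approximately cov
```

The transformed circle is the same ellipse as the one from the eigen-decomposition, just parametrized differently.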
I can follow the second approach better; I used the Python code and the result looks great! But I do not understand why the axis length scales with the square root of the eigenvalues. I think there must be a simple relation between an eigenvalue of a covariance matrix and the standard deviation of the data projected onto the corresponding ellipse axis, which I just cannot find in the literature and cannot derive myself. Can anyone enlighten me on this, or point me to something to read so I can understand it better?
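The relation I am looking for can at least be checked numerically (my own sketch; the normal distribution is assumed only to generate test data): projecting the data onto a unit eigenvector $v$ of the sample covariance matrix $S$ gives a one-dimensional sample whose variance is $v^T S v = \lambda$, i.e. the eigenvalue itself, so the standard deviation along that axis is $\sqrt{\lambda}$.

```python
import numpy as np

rng = np.random.default_rng(42)
cov = np.array([[9.0, 2.4], [2.4, 4.0]])
data = rng.multivariate_normal([0.0, 0.0], cov, size=200_000)  # shape (N, 2)

S = np.cov(data.T)                        # sample covariance matrix
eigenvals, eigenvecs = np.linalg.eigh(S)  # eigh: ascending, orthonormal vectors

for i in range(2):
    v = eigenvecs[:, i]                   # unit eigenvector
    projected = data @ v                  # scalar projection of every point
    # variance along this axis equals the eigenvalue: v.T @ S @ v = lambda_i
    print(projected.var(ddof=1), eigenvals[i])
```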
Best, Nras.
Update: I got the textbook "Circular Statistics in Biology" by Batschelet, 1981. On pp. 268-274 the semi-axes and the inclination angle are calculated for an example with $\sigma_1=3$ and $\sigma_2=2$ and the correlation coefficient $\rho=0.4$. This yields the covariance matrix $$\begin{align*} Cov & =\left[ {\begin{array}{cc} \sigma_1^2 & \rho\sigma_1\sigma_2\\ \rho\sigma_1\sigma_2 & \sigma_2^2\\ \end{array} } \right] \\ &= \left[ {\begin{array}{cc} 9.0 & 2.4\\ 2.4 & 4.0\\ \end{array} } \right] \end{align*}$$ The book then goes through equations 14.4.8 to 14.4.11 and finds the results $$a=3.157,\quad b=1.742,\quad \Theta=21.9^\circ$$ The result just magically appears, as I cannot find the referenced equations: they stop after 14.4.7, right before this example, and the chapter finishes after the example. What a pity!
Running the code in Python, where the semi-axis length is defined as the square root of the eigenvalue, I obtain the same result:
```python
>>> import numpy
>>> cov = numpy.array([[9., 2.4], [2.4, 4.]])
>>> eigenvals, eigenvecs = numpy.linalg.eig(cov)
>>> numpy.sqrt(eigenvals)
array([ 3.1568251,  1.7419688])
>>> numpy.rad2deg(numpy.arccos(eigenvecs[0, 0]))
21.915430336046288
```
So there must be something that explains why the ellipse semi-axis length scales with the square root of the eigenvalue. I now know that it is true, but I still do not know why.
Update 2: The referenced equations really do not exist, but the calculation is sketched in the previous chapter, 13.8. Summary: let distributions 1 and 2 be centered at $\mu_1$ and $\mu_2$, respectively, and let $$ A=\sigma_2^2,\quad B=-\rho\sigma_1\sigma_2,\quad C=\sigma_1^2,\quad D=(1-\rho^2)\sigma_1^2\sigma_2^2.$$ Then the ellipse follows the equation $$ A(x-\mu_1)^2 + 2B(x-\mu_1)(y-\mu_2) + C(y-\mu_2)^2 = D.$$ From that equation the semi-axis lengths and the inclination of the major axis are computed as follows: let $$R=\left[(A-C)^2 + 4B^2\right]^{1/2},$$ then the semi-major axis $a$ and the semi-minor axis $b$ are $$b, a = \left(\dfrac{2D}{A+C\pm R}\right)^{1/2}$$ and the inclination is $$\Theta = \arctan\left(\dfrac{2B}{A-C-R}\right).$$ Substituting the values from the example yields exactly the same solution as the square-root-of-the-eigenvalues approach, with the inclination obtained from the arccos of the eigenvector.
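Both routes can be compared numerically with the example values (my own sketch of the formulas above, next to the eigenvalue computation):

```python
import numpy as np

sigma1, sigma2, rho = 3.0, 2.0, 0.4

# Batschelet's quantities (chapter 13.8)
A = sigma2**2
B = -rho * sigma1 * sigma2
C = sigma1**2
D = (1.0 - rho**2) * sigma1**2 * sigma2**2
R = np.sqrt((A - C)**2 + 4.0 * B**2)

a = np.sqrt(2.0 * D / (A + C - R))           # semi-major axis
b = np.sqrt(2.0 * D / (A + C + R))           # semi-minor axis
theta = np.degrees(np.arctan(2.0 * B / (A - C - R)))

# Eigenvalue route
cov = np.array([[sigma1**2, rho * sigma1 * sigma2],
                [rho * sigma1 * sigma2, sigma2**2]])
eigenvals, eigenvecs = np.linalg.eig(cov)

print(a, b, theta)           # approximately 3.157, 1.742, 21.9
print(np.sqrt(eigenvals))    # same semi-axes (possibly in another order)
```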
From here it cannot be much work to show that $a=\sqrt{\lambda_1}$ and $b=\sqrt{\lambda_2}$, where $\lambda_i$ are the eigenvalues of the covariance matrix.
> The eigenvectors returned by `numpy.linalg.eig` have unit length. Therefore, the eigenvalues directly relate to the variances (and the standard deviation is the square root of the variance). It's easiest to think about the uncorrelated case first. In that case, the eigenvectors are parallel to the coordinate system, and the eigenvalues are just the variances of the data (and the square root of those is the standard deviation). (Of course, relating the standard deviation to a confidence interval assumes a normal distribution, but that's a separate issue.) – Joe Kington Aug 28 '14 at 15:04
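The uncorrelated case from the comment is easy to verify numerically (my own sketch): with $\rho = 0$ the covariance matrix is diagonal, so the eigenvectors lie along the coordinate axes and the eigenvalues are exactly $\sigma_1^2$ and $\sigma_2^2$.

```python
import numpy as np

# rho = 0: diagonal covariance, sigma1^2 = 9, sigma2^2 = 4
cov = np.diag([9.0, 4.0])
eigenvals, eigenvecs = np.linalg.eig(cov)

print(np.sqrt(eigenvals))   # [3. 2.] -> the standard deviations
print(eigenvecs)            # identity: axes parallel to the coordinate system
```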