Complex matrix after taking square root in EM algorithm

Question

I was trying to implement the paper on aspect ranking: Product Aspect Ranking and Its Applications. The main algorithm described in the paper is based on the EM algorithm, and the authors clearly give the equations to compute in the E and M steps.

The problem is that during the M-step the equation

$\hat{\mathbf{\Sigma}} = \left( \frac{1}{\varphi}\sum_{r \in R}(\omega_r - \mu)(\omega_r - \mu)^T + \left(\frac{|R|-\varphi}{2\varphi}\right)^2 \mathbf{I} \right)^{1/2} - \frac{|R| - \varphi}{2\varphi}\mathbf{I}$

where $\varphi \in \mathbb{R}$, $\omega_r \in \mathbb{R}^{m}$, $\mu \in \mathbb{R}^{m}$, and $\mathbf{I} \in \mathbb{R}^{m \times m}$ is the identity matrix,

is causing the matrix $\mathbf{\Sigma}$ to contain complex numbers. This comes from the matrix square root in the first part of the equation. Since the paper says nothing about initialisation, I initialised each of the parameters from a standard normal distribution and set $\varphi = 100$. I also experimented with values of $\varphi$ in the range $[0.001, 100000]$, but that didn't help. Also, as asked here: square root of a real matrix, the answer says "A real matrix S can possess infinitely many real or nonreal square roots".

I'm implementing this in Python, and it gives me a warning about complex numbers. Also, judging by the values stored in $\omega_r \ \forall r$ after some iterations, I'm not getting results as good as those claimed in the paper. I believe this is because I'm not handling the complex numbers in this matrix. So my question is:

Should I be getting complex numbers in the matrix like that? And if so, is there any way to handle them so that the algorithm converges?

Thanks in advance :)

Mark L. Stone · Accepted Answer · 2018-06-27T11:25:19.307

Equation (13) in the paper shows $(\frac{|R|-\varphi}{2\varphi})^2$, where you forgot the square. Is that just a typo in your question presentation, or did you not do the squaring in your actual computation? If the latter, that can explain why you might get a non positive semidefinite argument of the square root, which might then not be real.

Edit: Given that the missing square is just a typo, I suggest you examine the matrix, let's call it $A$, whose square root you are taking.

First of all, make sure you are taking the matrix square root, not the square root of each element. Presuming that is done correctly, make sure that $A$ is symmetric. Even round-off level asymmetry can result in small magnitude imaginary components in the square root. If $A - A^T$ is not exactly zero, you should examine your code for any gross errors. If there are none, and $A-A^T$ is very close to, but not exactly equal to zero, you should symmetrize $A$ before taking the matrix square root. Do this by setting $A = $ the argument of the square root, then setting $A = 0.5*(A + A^T)$. This new $A$ should be exactly symmetric, and its matrix square root should be real.
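A minimal Python sketch of the steps above, assuming NumPy is available. The function names `psd_sqrt` and `sigma_update`, the tolerance `tol`, and the layout of `omega` (one row per $\omega_r$) are illustrative assumptions, not taken from the paper or from the question's code. The square root is computed from a symmetric eigendecomposition (`numpy.linalg.eigh`) rather than a general-purpose routine such as `scipy.linalg.sqrtm`, so the result is real by construction. Clipping tiny negative eigenvalues to zero is an extra guard that goes beyond the symmetrizing step.

```python
import numpy as np


def psd_sqrt(a, tol=1e-8):
    """Real square root of a (nearly) symmetric positive semidefinite matrix."""
    a = np.asarray(a, dtype=float)
    # Symmetrize to remove round-off level asymmetry: A = 0.5 * (A + A^T).
    a_sym = 0.5 * (a + a.T)
    # eigh treats its input as symmetric and returns real eigenpairs.
    eigvals, eigvecs = np.linalg.eigh(a_sym)
    # Assumption: eigenvalues far below zero point to a bug, not round-off.
    scale = max(1.0, float(np.max(np.abs(eigvals))))
    if eigvals.min() < -tol * scale:
        raise ValueError("matrix is not positive semidefinite")
    # Clip tiny negative eigenvalues so the element-wise sqrt stays real.
    eigvals = np.clip(eigvals, 0.0, None)
    # V diag(sqrt(w)) V^T is the matrix square root, not an element-wise one.
    return (eigvecs * np.sqrt(eigvals)) @ eigvecs.T


def sigma_update(omega, mu, phi):
    """Covariance update from the question; omega has shape (|R|, m)."""
    n_reviews, m = omega.shape
    centered = omega - mu
    c = (n_reviews - phi) / (2.0 * phi)
    # centered.T @ centered equals the sum of outer products over r.
    a = centered.T @ centered / phi + c**2 * np.eye(m)
    return psd_sqrt(a) - c * np.eye(m)


# Hypothetical data, only to exercise the functions.
rng = np.random.default_rng(0)
omega = rng.standard_normal((50, 4))
mu = omega.mean(axis=0)
sigma = sigma_update(omega, mu, phi=100.0)
print(np.isrealobj(sigma))
```

Note that `np.sqrt(a)` on the full matrix would take the square root of each element, which is the mistake warned about above.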

Also, regarding your statement "A real matrix S can possess infinitely many real or nonreal square roots", you should note the correct comment that "symmetric positive semidefinite matrices have unique positive semidefinite square roots." Your $A$ should be symmetric positive semidefinite; that's what the symmetrizing step is meant to ensure.
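A short check of why $A$ should be positive semidefinite in exact arithmetic, assuming $\varphi > 0$ and using the shorthand $c = \frac{|R|-\varphi}{2\varphi}$ (a symbol introduced here for convenience, not one from the paper). With

$A = \frac{1}{\varphi}\sum_{r \in R}(\omega_r - \mu)(\omega_r - \mu)^T + c^2 \mathbf{I}$

we get, for any $x \in \mathbb{R}^m$,

$x^T A x = \frac{1}{\varphi}\sum_{r \in R}\left((\omega_r - \mu)^T x\right)^2 + c^2 \|x\|^2 \ge 0$

so every eigenvalue satisfies $\lambda_i(A) \ge c^2$. Taking the unique positive semidefinite square root then gives

$\lambda_i(\hat{\mathbf{\Sigma}}) = \sqrt{\lambda_i(A)} - c \ge |c| - c \ge 0$

Under these assumptions, complex entries in $\hat{\mathbf{\Sigma}}$ would have to come from the numerics (asymmetry or round-off), not from the formula itself.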

Sorry that was my typo in the question. In the actual code, I did the square. Thanks :) — Naveen Pundir, Jun 27 '18 at 05:02
It worked! Thank you so much :) Also, for others, this post was also helpful:- https://math.stackexchange.com/questions/909814/how-to-avoid-complex-value-for-square-root-of-a-symmetric-matrix — Naveen Pundir, Jun 28 '18 at 12:32
That other link describes what to do if your matrix is not quite positive semidefinite. I described what to do if it's not quite symmetric. I've used both approaches, and others, on many occasions. — Mark L. Stone, Jun 28 '18 at 12:53
