
I'm trying to verify that, given a matrix $M$, the pseudo-inverse $$M^{+}=(M^TM)^{-1}M^T$$ is the solution to the least-squares problem, but something went wrong and I can't understand why...

$$e=\frac{1}{2}\|y-Mx\|^2=\frac{1}{2}(y-Mx)^T(y-Mx)\\ =\frac{1}{2}(y^Ty-y^TMx-x^TM^Ty+x^TM^TMx)\\ =\frac{1}{2}(y^Ty-2y^TMx+x^TM^TMx)$$

so $$\frac{de}{dx}=\frac{1}{2}(-2y^TM+x^TM^TM)=0\\ x^TM^TM=2y^TM\\M^TMx=2M^Ty\\x=2(M^TM)^{-1}M^Ty$$

Why can't I cancel the factor $2$?

sunrise

3 Answers


For a symmetric matrix $A$, the derivative of $x^\top A x$ with respect to $x$ is $2Ax$. You can prove this directly.
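For instance, one direct proof via the differential: since $(dx)^\top A x$ is a scalar, it equals its own transpose $x^\top A^\top\,dx$, so

$$d(x^\top A x) = (dx)^\top A x + x^\top A\,dx = x^\top(A^\top + A)\,dx = 2x^\top A\,dx$$

when $A$ is symmetric, i.e. $\frac{\partial}{\partial x}\,x^\top A x = 2Ax$. Applied to the question with $A = M^\top M$, the quadratic term contributes $2M^\top M x$ (not $M^\top M x$), so $$\frac{de}{dx}=\frac{1}{2}\left(-2M^\top y + 2M^\top M x\right)=M^\top M x - M^\top y = 0,$$ with no stray factor of $2$.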

Using this fact you get $M^\top y = M^\top M x$ which yields the answer.

angryavian

Hint: $x^TM^TMx=\|Mx\|^2$. When you take the derivative with respect to $x$, you can't treat $x^T$ as a constant.
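As a numerical sanity check (a minimal NumPy sketch; the random `M`, `y`, and test point `x` are illustrative, not from the question), a central-difference approximation of the gradient agrees with $M^T(Mx-y)$ and not with the question's $\frac{1}{2}M^TMx - M^Ty$:

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.standard_normal((5, 3))  # random tall matrix (full column rank almost surely)
y = rng.standard_normal(5)
x = rng.standard_normal(3)

def e(x):
    """Least-squares objective e(x) = 1/2 ||y - Mx||^2."""
    r = y - M @ x
    return 0.5 * (r @ r)

# Central-difference approximation of the gradient of e at x.
h = 1e-6
fd_grad = np.array([
    (e(x + h * np.eye(3)[i]) - e(x - h * np.eye(3)[i])) / (2 * h)
    for i in range(3)
])

correct = M.T @ (M @ x - y)             # M^T M x - M^T y: the 2 cancels the 1/2
askers = 0.5 * (M.T @ M @ x) - M.T @ y  # what the question's derivative would give

print(np.allclose(fd_grad, correct, atol=1e-5))  # True
print(np.allclose(fd_grad, askers, atol=1e-5))   # False
```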

Paul

The given form assumes a full-column-rank linear system $$ \mathbf{A}x = b $$ where $$ \mathbf{A}\in\mathbb{C}^{m\times n}_{\rho}, \quad b \in\mathbb{C}^{m}, \quad x\in\mathbb{C}^{n}, $$ with $\rho = n$. There is an additional requirement that $b\notin\mathcal{N}\left(\mathbf{A}^{*}\right)$.

The given solution comes from the normal equations $$ \mathbf{A}^{*} \mathbf{A}x = \mathbf{A}^{*} b, $$ which follow from requiring the residual $b-\mathbf{A}x$ to be orthogonal to the range of $\mathbf{A}$. The product matrix $\mathbf{A}^{*}\mathbf{A}$ can be inverted because the rank of $\mathbf{A}$ equals the number of columns. The solution is $$ x = \left( \mathbf{A}^{*} \mathbf{A} \right)^{-1} \mathbf{A}^{*}b. $$

Now the issue is to show that $$ \mathbf{A}^{\dagger} = \left( \mathbf{A}^{*} \mathbf{A} \right)^{-1} \mathbf{A}^{*} $$ using the singular value decomposition $$ \mathbf{A} = \mathbf{U} \, \Sigma \, \mathbf{V}^{*} = \left[ \begin{array}{cc} \color{blue}{\mathbf{U}_{\mathcal{R}}} & \color{red}{\mathbf{U}_{\mathcal{N}}} \end{array} \right] \left[ \begin{array}{c} \mathbf{S} \\ \mathbf{0} \end{array} \right] \color{blue}{\mathbf{V}_{\mathcal{R}}}^{*} = \color{blue}{\mathbf{U}_{\mathcal{R}}} \, \mathbf{S} \, \color{blue}{\mathbf{V}_{\mathcal{R}}}^{*}. $$ Show that $$ \mathbf{A}^{*} \mathbf{A} = \color{blue}{\mathbf{V}_{\mathcal{R}}} \, \mathbf{S}^{2} \, \color{blue}{\mathbf{V}_{\mathcal{R}}}^{*}, $$ which inverts easily: $$ \left( \mathbf{A}^{*} \mathbf{A} \right)^{-1} = \color{blue}{\mathbf{V}_{\mathcal{R}}} \, \mathbf{S}^{-2} \, \color{blue}{\mathbf{V}_{\mathcal{R}}}^{*}. $$ Finally, $$ \left( \mathbf{A}^{*} \mathbf{A} \right)^{-1}\mathbf{A}^{*} = \color{blue}{\mathbf{V}_{\mathcal{R}}} \, \mathbf{S}^{-2} \, \color{blue}{\mathbf{V}_{\mathcal{R}}}^{*} \left( \color{blue}{\mathbf{V}_{\mathcal{R}}} \, \mathbf{S} \, \color{blue}{\mathbf{U}_{\mathcal{R}}}^{*}\right) = \color{blue}{\mathbf{V}_{\mathcal{R}}} \, \mathbf{S}^{-1} \, \color{blue}{\mathbf{U}_{\mathcal{R}}}^{*} = \mathbf{A}^{\dagger}. $$
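As a numerical check of this identity (a minimal sketch assuming a random full-column-rank $\mathbf{A}$; NumPy's `np.linalg.pinv` computes $\mathbf{A}^{\dagger}$ from the SVD):

```python
import numpy as np

rng = np.random.default_rng(1)
# Random complex matrix; full column rank with probability 1.
A = rng.standard_normal((6, 3)) + 1j * rng.standard_normal((6, 3))
b = rng.standard_normal(6) + 1j * rng.standard_normal(6)

# Normal-equations form (A* A)^{-1} A*.
normal_form = np.linalg.inv(A.conj().T @ A) @ A.conj().T

# Pseudoinverse computed from the SVD.
A_dagger = np.linalg.pinv(A)

print(np.allclose(normal_form, A_dagger))  # True

# Both give the least-squares solution of A x = b.
x_lstsq = np.linalg.lstsq(A, b, rcond=None)[0]
print(np.allclose(normal_form @ b, x_lstsq))  # True
```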

dantopa
  • Mmmh... I'm not asking for the SVD, but for the proof of the relation between the pseudoinverse and the least-squares problem... – sunrise Mar 31 '17 at 05:55
  • @sunrise: Does this help? https://math.stackexchange.com/questions/2209379/singular-value-decomposition-proof/2211001#2211001 https://math.stackexchange.com/questions/772039/how-does-the-svd-solve-the-least-squares-problem/2173715#2173715 – dantopa Mar 31 '17 at 13:30