
Let $A$ be an $m\times n$ complex matrix. How can we prove the following? $$A^+=(A^*A)^DA^*=A^*(AA^*)^D,$$ where $D$ denotes the Drazin inverse and $A^+$ is the Moore-Penrose pseudoinverse of $A$. I found this question at http://planetmath.org/drazininverse.

Thanks.

Sam
  • This seems to duplicate the mislabeled question 'Moore-Penrose pseudo inverse' https://math.stackexchange.com/questions/2167660/moore-penrose-pseudo-inverse – dantopa Mar 16 '17 at 23:24

1 Answer


Core-Nilpotent Decomposition

Let $\mathbf{A} \in \mathbb{C}^{n\times n}$ be a singular square matrix with index $k$, and let $\rho = \text{rank} \left( \mathbf{A}^{k} \right)$. Then there exists a nonsingular matrix $\mathbf{Q}$ such that $$ \mathbf{A} = \mathbf{Q} \left[ \begin{array}{cc} \mathbf{C} & \mathbf{0} \\ \mathbf{0} & \mathbf{N} \end{array} \right] \mathbf{Q}^{-1} $$ where the core matrix $\mathbf{C} \in \mathbb{C}^{\rho \times \rho}$ is nonsingular and the matrix $\mathbf{N}$ is nilpotent of index $k$. We may think of this as a dilute form of diagonalization.
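For a concrete instance, take the trivial similarity $\mathbf{Q} = \mathbf{I}$ (chosen here purely for illustration): $$ \mathbf{A} = \left[ \begin{array}{ccc} 2 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{array} \right], \qquad \mathbf{C} = \left[ \begin{array}{c} 2 \end{array} \right], \qquad \mathbf{N} = \left[ \begin{array}{cc} 0 & 1 \\ 0 & 0 \end{array} \right] $$ Here $\mathbf{N} \ne \mathbf{0}$ but $\mathbf{N}^{2} = \mathbf{0}$, so the index is $k = 2$ and $\rho = \text{rank} \left( \mathbf{A}^{2} \right) = 1$.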

Drazin inverse

The Drazin inverse is defined in terms of the core-nilpotent decomposition $$ \mathbf{A}^{D} = \mathbf{Q} \left[ \begin{array}{cc} \mathbf{C}^{-1} & \mathbf{0} \\ \mathbf{0} & \mathbf{0} \end{array} \right] \mathbf{Q}^{-1} $$
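As a sanity check, here is a minimal numerical sketch (assuming numpy; the blocks $\mathbf{C}$, $\mathbf{N}$, $\mathbf{Q}$ below are illustrative choices, not canonical) that this block formula satisfies the defining properties of the Drazin inverse: $\mathbf{A}^{D} \mathbf{A} \mathbf{A}^{D} = \mathbf{A}^{D}$, $\mathbf{A} \mathbf{A}^{D} = \mathbf{A}^{D} \mathbf{A}$, and $\mathbf{A}^{k+1} \mathbf{A}^{D} = \mathbf{A}^{k}$.

```python
import numpy as np

# Illustrative core-nilpotent data: C nonsingular, N nilpotent of index 2.
C = np.array([[2.0]])                      # 1 x 1 nonsingular core
N = np.array([[0.0, 1.0],
              [0.0, 0.0]])                 # N @ N = 0
Q = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [0.0, 0.0, 1.0]])            # any nonsingular similarity

def blk(X, Y):
    """Block-diagonal matrix [[X, 0], [0, Y]]."""
    return np.block([[X, np.zeros((X.shape[0], Y.shape[1]))],
                     [np.zeros((Y.shape[0], X.shape[1])), Y]])

A  = Q @ blk(C, N) @ np.linalg.inv(Q)
AD = Q @ blk(np.linalg.inv(C), np.zeros_like(N)) @ np.linalg.inv(Q)

k = 2  # index of A, inherited from N
assert np.allclose(AD @ A @ AD, AD)              # A^D A A^D = A^D
assert np.allclose(A @ AD, AD @ A)               # A^D commutes with A
assert np.allclose(np.linalg.matrix_power(A, k + 1) @ AD,
                   np.linalg.matrix_power(A, k)) # A^{k+1} A^D = A^k
```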

Singular value decomposition

The SVD can be viewed as a tool which gets matrices as close to diagonalization as possible. For the target matrix we can lift the requirement $m=n$ that the matrix have as many rows as columns: $$ \mathbf{A} \in \mathbb{C}^{m\times n}_{\rho} $$

The SVD provides an orthonormal basis for both the domain $\mathbb{C}^{n}$ and the codomain $\mathbb{C}^{m}$: $$ \begin{align} \mathbb{C}^{n} &= \color{blue}{\mathcal{R} \left( \mathbf{A}^{*} \right)} \oplus \color{red}{\mathcal{N} \left( \mathbf{A} \right)} \\ \mathbb{C}^{m} &= \color{blue}{\mathcal{R} \left( \mathbf{A} \right)} \oplus \color{red}{\mathcal{N} \left( \mathbf{A}^{*} \right)} \end{align} $$ The $\color{red}{\text{nullspace}}$ terms will be silent in the pseudoinverse, just as the nilpotent matrix vanishes in the core-nilpotent decomposition. $$ \begin{align} \mathbf{A} &= \mathbf{U} \, \Sigma \, \mathbf{V}^{*} \\ &= \left[ \begin{array}{cc} \color{blue}{\mathbf{U}_{\mathcal{R}}} & \color{red}{\mathbf{U}_{\mathcal{N}}} \end{array} \right] \left[ \begin{array}{cccc} \sigma_{1} & & & \\ & \ddots & & \\ & & \sigma_{\rho} & \\ & & & \mathbf{0} \end{array} \right] \left[ \begin{array}{c} \color{blue}{\mathbf{V}_{\mathcal{R}}^{*}} \\ \color{red}{\mathbf{V}_{\mathcal{N}}^{*}} \end{array} \right] \\ &= \left[ \begin{array}{cccccc} \color{blue}{u_{1}} & \dots & \color{blue}{u_{\rho}} & \color{red}{u_{\rho+1}} & \dots & \color{red}{u_{m}} \end{array} \right] \left[ \begin{array}{cc} \mathbf{S}_{\rho\times \rho} & \mathbf{0} \\ \mathbf{0} & \mathbf{0} \end{array} \right] \left[ \begin{array}{c} \color{blue}{v_{1}^{*}} \\ \vdots \\ \color{blue}{v_{\rho}^{*}} \\ \color{red}{v_{\rho+1}^{*}} \\ \vdots \\ \color{red}{v_{n}^{*}} \end{array} \right] \end{align} $$ Let's rewrite the SVD in a form similar to the $\mathbf{CN}$ decomposition: $$ \mathbf{A} = \mathbf{U} \, \Sigma \, \mathbf{V}^{*} = \left[ \begin{array}{cc} \color{blue}{\mathbf{U}_{\mathcal{R}}} & \color{red}{\mathbf{U}_{\mathcal{N}}} \end{array} \right] \left[ \begin{array}{cc} \mathbf{S} & \mathbf{0} \\ \mathbf{0} & \mathbf{0} \end{array} \right] \left[ \begin{array}{c} \color{blue}{\mathbf{V}_{\mathcal{R}}^{*}} \\ \color{red}{\mathbf{V}_{\mathcal{N}}^{*}} \end{array} \right] $$ While the matrix of singular values $\mathbf{S}$ is diagonal, the core matrix $\mathbf{C}$ is merely nonsingular.
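A quick numerical sketch of this block structure (assuming numpy; the random rank-deficient matrix is purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, rho = 5, 4, 2
# A random m x n matrix of rank rho, built from thin factors.
A = rng.standard_normal((m, rho)) @ rng.standard_normal((rho, n))

U, s, Vh = np.linalg.svd(A)             # full SVD: U is m x m, Vh is n x n
UR, UN = U[:, :rho], U[:, rho:]         # range / nullspace blocks of U
VRh, VNh = Vh[:rho, :], Vh[rho:, :]     # range / nullspace blocks of V*
S = np.diag(s[:rho])                    # the nonzero singular values

# The nullspace blocks are silent: A = U_R S V_R^*,
# A annihilates V_N, and U_N^* annihilates A.
assert np.allclose(A, UR @ S @ VRh)
assert np.allclose(A @ VNh.conj().T, np.zeros((m, n - rho)))
assert np.allclose(UN.conj().T @ A, np.zeros((m - rho, n)))
```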

Manipulating the singular value decomposition

The forms needed are the Hermitian conjugate and the Moore-Penrose pseudoinverse. Recall that $\mathbf{S}^{\mathrm{T}} = \mathbf{S}$: $$ \begin{align} \mathbf{A} &= \mathbf{U} \, \Sigma \, \mathbf{V}^{*} = \left[ \begin{array}{cc} \color{blue}{\mathbf{U}_{\mathcal{R}}} & \color{red}{\mathbf{U}_{\mathcal{N}}} \end{array} \right] \left[ \begin{array}{cc} \mathbf{S} & \mathbf{0} \\ \mathbf{0} & \mathbf{0} \end{array} \right] \left[ \begin{array}{c} \color{blue}{\mathbf{V}_{\mathcal{R}}^{*}} \\ \color{red}{\mathbf{V}_{\mathcal{N}}^{*}} \end{array} \right] \\[5pt] \mathbf{A}^{*} &= \mathbf{V} \, \Sigma^{\mathrm{T}} \mathbf{U}^{*} = \left[ \begin{array}{cc} \color{blue}{\mathbf{V}_{\mathcal{R}}} & \color{red}{\mathbf{V}_{\mathcal{N}}} \end{array} \right] \left[ \begin{array}{cc} \mathbf{S} & \mathbf{0} \\ \mathbf{0} & \mathbf{0} \end{array} \right] \left[ \begin{array}{c} \color{blue}{\mathbf{U}_{\mathcal{R}}^{*}} \\ \color{red}{\mathbf{U}_{\mathcal{N}}^{*}} \end{array} \right] \\[5pt] \mathbf{A}^{\dagger} &= \mathbf{V} \, \Sigma^{\dagger} \mathbf{U}^{*} = \left[ \begin{array}{cc} \color{blue}{\mathbf{V}_{\mathcal{R}}} & \color{red}{\mathbf{V}_{\mathcal{N}}} \end{array} \right] \left[ \begin{array}{cc} \mathbf{S}^{-1} & \mathbf{0} \\ \mathbf{0} & \mathbf{0} \end{array} \right] \left[ \begin{array}{c} \color{blue}{\mathbf{U}_{\mathcal{R}}^{*}} \\ \color{red}{\mathbf{U}_{\mathcal{N}}^{*}} \end{array} \right] \end{align} $$ Assemble the product matrices $$ \begin{align} \mathbf{A}^{*} \mathbf{A} &= \left( \mathbf{V} \, \Sigma^{\mathrm{T}} \mathbf{U}^{*} \right) \left( \mathbf{U} \, \Sigma \, \mathbf{V}^{*} \right) = \mathbf{V} \, \Sigma^{\mathrm{T}} \Sigma \, \mathbf{V}^{*} = \left[ \begin{array}{cc} \color{blue}{\mathbf{V}_{\mathcal{R}}} & \color{red}{\mathbf{V}_{\mathcal{N}}} \end{array} \right] \left[ \begin{array}{cc} \mathbf{S}^{2} & \mathbf{0} \\ \mathbf{0} & \mathbf{0} \end{array} \right] \left[ \begin{array}{cc} \color{blue}{\mathbf{V}_{\mathcal{R}}} & \color{red}{\mathbf{V}_{\mathcal{N}}} \end{array} \right]^{*} \\ \mathbf{A} \mathbf{A}^{*} &= \mathbf{U} \, \Sigma \Sigma^{\mathrm{T}} \, \mathbf{U}^{*} = \left[ \begin{array}{cc} \color{blue}{\mathbf{U}_{\mathcal{R}}} & \color{red}{\mathbf{U}_{\mathcal{N}}} \end{array} \right] \left[ \begin{array}{cc} \mathbf{S}^{2} & \mathbf{0} \\ \mathbf{0} & \mathbf{0} \end{array} \right] \left[ \begin{array}{cc} \color{blue}{\mathbf{U}_{\mathcal{R}}} & \color{red}{\mathbf{U}_{\mathcal{N}}} \end{array} \right]^{*} \end{align} $$ The Moore-Penrose inverses of the product matrices are $$ \begin{align} \left( \mathbf{A}^{*} \mathbf{A} \right)^{\dagger} &= \left( \mathbf{V} \, \Sigma^{\mathrm{T}} \Sigma \, \mathbf{V}^{*} \right)^{\dagger} = \left[ \begin{array}{cc} \color{blue}{\mathbf{V}_{\mathcal{R}}} & \color{red}{\mathbf{V}_{\mathcal{N}}} \end{array} \right] \left[ \begin{array}{cc} \mathbf{S}^{-2} & \mathbf{0} \\ \mathbf{0} & \mathbf{0} \end{array} \right] \left[ \begin{array}{cc} \color{blue}{\mathbf{V}_{\mathcal{R}}} & \color{red}{\mathbf{V}_{\mathcal{N}}} \end{array} \right]^{*} \\ \left( \mathbf{A} \mathbf{A}^{*} \right)^{\dagger} &= \left( \mathbf{U} \, \Sigma \Sigma^{\mathrm{T}} \, \mathbf{U}^{*} \right)^{\dagger} = \left[ \begin{array}{cc} \color{blue}{\mathbf{U}_{\mathcal{R}}} & \color{red}{\mathbf{U}_{\mathcal{N}}} \end{array} \right] \left[ \begin{array}{cc} \mathbf{S}^{-2} & \mathbf{0} \\ \mathbf{0} & \mathbf{0} \end{array} \right] \left[ \begin{array}{cc} \color{blue}{\mathbf{U}_{\mathcal{R}}} & \color{red}{\mathbf{U}_{\mathcal{N}}} \end{array} \right]^{*} \end{align} $$
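These block identities lend themselves to a quick numerical check (a sketch assuming numpy; `np.linalg.pinv` plays the role of $\dagger$, and the generous `rcond` discards the numerically zero singular values):

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, rho = 5, 4, 2
A = rng.standard_normal((m, rho)) @ rng.standard_normal((rho, n))  # rank rho

U, s, Vh = np.linalg.svd(A)
UR, VR = U[:, :rho], Vh[:rho, :].conj().T
S2 = np.diag(s[:rho] ** 2)

# Gram matrices: only the S^2 core survives the block products.
assert np.allclose(A.conj().T @ A, VR @ S2 @ VR.conj().T)
assert np.allclose(A @ A.conj().T, UR @ S2 @ UR.conj().T)

# Their pseudoinverses invert the core: S^2 -> S^(-2).
assert np.allclose(np.linalg.pinv(A.conj().T @ A, rcond=1e-10),
                   VR @ np.linalg.inv(S2) @ VR.conj().T)
assert np.allclose(np.linalg.pinv(A @ A.conj().T, rcond=1e-10),
                   UR @ np.linalg.inv(S2) @ UR.conj().T)
```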

Connecting to the Drazin inverses

After the associations:

  1. $\mathbf{Q} \to \mathbf{V}$
  2. $\mathbf{Q}^{-1} \to \mathbf{V}^{*}$
  3. $\mathbf{C} \to \mathbf{S}^{2}$

the Drazin inverse is $$ \color{green}{\left( \mathbf{A}^{*} \mathbf{A} \right)^{D}} = \left( \mathbf{V} \, \Sigma^{\mathrm{T}} \Sigma \, \mathbf{V}^{*} \right)^{D} = \left( \mathbf{V} \left[ \begin{array}{cc} \mathbf{S}^{2} & \mathbf{0} \\ \mathbf{0} & \mathbf{0} \end{array} \right] \mathbf{V}^{*} \right)^{D} = \mathbf{V} \left[ \begin{array}{cc} \mathbf{S}^{-2} & \mathbf{0} \\ \mathbf{0} & \mathbf{0} \end{array} \right] \mathbf{V}^{*} = \color{green}{\left( \mathbf{A}^{*} \mathbf{A} \right)^{\dagger}} $$ These associations are legitimate because $\mathbf{A}^{*} \mathbf{A}$ is Hermitian, hence diagonalizable: its nilpotent part is $\mathbf{0}$, its index is at most one, and the unitary diagonalization above is itself a core-nilpotent decomposition. The other Drazin inverse follows in the same way: $$ \color{green}{\left( \mathbf{A} \mathbf{A}^{*} \right)^{D}} = \left( \mathbf{U} \, \Sigma \Sigma^{\mathrm{T}} \, \mathbf{U}^{*} \right)^{D} = \mathbf{U} \left[ \begin{array}{cc} \mathbf{S}^{-2} & \mathbf{0} \\ \mathbf{0} & \mathbf{0} \end{array} \right] \mathbf{U}^{*} = \color{green}{\left( \mathbf{A} \mathbf{A}^{*} \right)^{\dagger}} $$
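Since these product matrices are Hermitian, their Drazin inverses can be computed by inverting only the nonzero eigenvalues in an eigenbasis; a minimal sketch of the $\color{green}{\text{equivalence}}$ (assuming numpy; the helper `drazin_hermitian` and its tolerance are illustrative, not a library routine):

```python
import numpy as np

def drazin_hermitian(H, tol=1e-10):
    # H is Hermitian, hence diagonalizable with index at most one:
    # the Drazin inverse just inverts the nonzero eigenvalues.
    w, P = np.linalg.eigh(H)
    w_inv = np.zeros_like(w)
    nonzero = np.abs(w) > tol
    w_inv[nonzero] = 1.0 / w[nonzero]
    return (P * w_inv) @ P.conj().T       # P diag(w_inv) P^*

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 2)) @ rng.standard_normal((2, 4))  # rank 2

# For these Hermitian products, Drazin and Moore-Penrose coincide.
for G in (A.conj().T @ A, A @ A.conj().T):
    assert np.allclose(drazin_hermitian(G), np.linalg.pinv(G, rcond=1e-10))
```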

Conclusion

Another post derives the different forms of the Moore-Penrose pseudoinverse: What forms does the Moore-Penrose inverse take under systems with full rank, full column rank, and full row rank? Two specific cases are of interest in this post.

Case (1):

The only nontrivial nullspace is $\color{red}{\mathcal{N} \left( \mathbf{A}^{*} \right)}$: the target matrix is overdetermined, with more rows than columns and full column rank. The normal-equations solution is equivalent to the pseudoinverse: $$ \mathbf{A}^{\dagger} = \left( \mathbf{A}^{*} \mathbf{A} \right)^{-1} \mathbf{A}^{*} $$ Because $\mathbf{A}^{*} \mathbf{A}$ is nonsingular here, its Drazin inverse is the ordinary inverse, which leaves us with $$ \mathbf{A}^{\dagger} = \left( \mathbf{A}^{*} \mathbf{A} \right)^{-1} \mathbf{A}^{*} = \left( \mathbf{A}^{*} \mathbf{A} \right)^{D} \mathbf{A}^{*} $$
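A numerical sketch of this case (assuming numpy; the tall random matrix is illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((6, 3))   # tall: full column rank (almost surely)

# A*A is nonsingular, so its Drazin inverse is the ordinary inverse
# and the normal-equations form reproduces the pseudoinverse.
assert np.allclose(np.linalg.pinv(A),
                   np.linalg.inv(A.conj().T @ A) @ A.conj().T)
```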

Case (2):

The only nontrivial nullspace is $\color{red}{\mathcal{N} \left( \mathbf{A} \right)}$: the target matrix is underdetermined, with more columns than rows and full row rank. Since $\mathbf{A} \mathbf{A}^{*}$ is nonsingular, the normal-equations solution is again equivalent to the pseudoinverse: $$ \mathbf{A}^{\dagger} = \mathbf{A}^{*} \left( \mathbf{A} \mathbf{A}^{*} \right)^{-1} = \mathbf{A}^{*} \left( \mathbf{A} \mathbf{A}^{*} \right)^{D} $$
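The mirror-image check (again a numpy sketch with an illustrative random matrix):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((3, 6))   # wide: full row rank (almost surely)

# Mirror image of case (1): A A^* is nonsingular.
assert np.allclose(np.linalg.pinv(A),
                   A.conj().T @ np.linalg.inv(A @ A.conj().T))
```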

Case (3):

Neither nullspace is trivial. To prove the claim with the Drazin inverse, start with the Moore-Penrose identity $$ \mathbf{A}^{\dagger} = \color{green}{\left( \mathbf{A}^{*} \mathbf{A} \right)^{\dagger}} \mathbf{A}^{*} = \mathbf{A}^{*} \color{green}{\left( \mathbf{A} \mathbf{A}^{*} \right)^{\dagger}} $$ (The proof of this statement is on p. 27 of Regression and the Moore-Penrose pseudoinverse.)

The block form manipulations of the SVD verify this statement quickly; notice there is no assumption about the invertibility of the product matrices. Using the $\color{green}{\text{equivalence}}$ relationships then provides $$ \mathbf{A}^{\dagger} = \color{green}{\left( \mathbf{A}^{*} \mathbf{A} \right)^{D}} \mathbf{A}^{*} = \mathbf{A}^{*} \color{green}{\left( \mathbf{A} \mathbf{A}^{*} \right)^{D}} $$
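For the record, a numerical sketch of this general case (assuming numpy; the rank-deficient matrix and the hypothetical `drazin_hermitian` helper are illustrative):

```python
import numpy as np

def drazin_hermitian(H, tol=1e-10):
    # Drazin inverse of a Hermitian matrix: invert the nonzero eigenvalues.
    w, P = np.linalg.eigh(H)
    w_inv = np.zeros_like(w)
    nonzero = np.abs(w) > tol
    w_inv[nonzero] = 1.0 / w[nonzero]
    return (P * w_inv) @ P.conj().T

rng = np.random.default_rng(3)
A = rng.standard_normal((5, 3)) @ rng.standard_normal((3, 4))  # 5 x 4, rank 3
A[:, -1] = 0.0   # zero a column: rank stays 3, so A has neither full
                 # column rank (3 < 4) nor full row rank (3 < 5)

Ap = np.linalg.pinv(A, rcond=1e-10)
assert np.allclose(Ap, drazin_hermitian(A.conj().T @ A) @ A.conj().T)
assert np.allclose(Ap, A.conj().T @ drazin_hermitian(A @ A.conj().T))
```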

dantopa
  • @dantopa...Thanks for your solution. But my question is not answered yet. Here I am asking why $A^+=(A^*A)^DA^*$ or $A^+=A^*(AA^*)^D$ in the 3rd case? It seems that you have only proved that $(A^*A)^D=(A^*A)^+$ and $(AA^*)^D=(AA^*)^+$. – Sam Mar 24 '17 at 18:41
  • @Sam: Certainly the conclusion must be more explicit. Implied in this answer are results in this post. http://math.stackexchange.com/questions/1537880/what-forms-does-the-moore-penrose-inverse-take-under-systems-with-full-rank-ful/2200203#2200203. If the updated answer is not satisfactory, please post your criticisms. Thanks for a well-posed and insightful question. – dantopa Mar 24 '17 at 20:54
  • @dantopa.. I agree with your solution. But you haven't treated the 3rd case, I mean when $A$ has neither full column rank nor full row rank. Thanks for your time, too. I am still struggling with it. – Sam Mar 24 '17 at 21:36
  • Note that the nilpotent matrix $\mathbf{N}$ in the core-nilpotent decomposition of $A^* A$ will be $0$, since $A^* A$ is diagonalizable. – darij grinberg Mar 24 '17 at 22:07