
One well-known fact about matrix norms is the following:

If $\lambda_1,\dots,\lambda_n$ are the eigenvalues of an invertible square matrix $A$, ordered so that $|\lambda_1|\geq \dots\geq |\lambda_n|$, then for each $i$:

$$\frac{1}{||A^{-1}||} \leq |\lambda_i|\leq ||A||$$

If we take our matrix norm to be the matrix 2-norm, and recall that the matrix 2-norm of $A$ is its largest singular value, i.e. $||A||_2=\sigma_1$, then the upper bound implies:

$$|\lambda_1|\leq \sigma_1$$

My question is: are there necessary and sufficient conditions for when equality holds above? I vaguely remember something like: $|\lambda_1|=\sigma_1$ iff $\lambda_1$ is non-defective, i.e. its algebraic multiplicity equals its geometric multiplicity. However, I have not been able to find a reference or prove this. Can someone point me to a reference or provide a proof? Thank you in advance for your time.
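For concreteness, here is a quick numerical check of the bound with NumPy (the matrices are arbitrary illustrative choices): for a non-normal matrix the inequality can be strict, while for a normal (here symmetric) matrix it is an equality.

```python
import numpy as np

# Non-normal example: the bound |lambda_1| <= sigma_1 is strict here.
A = np.array([[1.0, 1.0],
              [0.0, 2.0]])
lam_max = max(abs(np.linalg.eigvals(A)))   # spectral radius, = 2
sig_max = np.linalg.norm(A, 2)             # largest singular value, > 2

# Normal (symmetric) example: equality holds.
B = np.array([[2.0, 1.0],
              [1.0, 2.0]])
lam_B = max(abs(np.linalg.eigvals(B)))
sig_B = np.linalg.norm(B, 2)
```

Note that $A$ above has distinct eigenvalues and is therefore diagonalizable, yet the inequality is strict, so the condition I am asking about cannot be quite as simple as I remember it.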

John

2 Answers


Let $\mathcal{S}_{\rho}$ be the subspace spanned by the eigenvectors of $A$ corresponding to the eigenvalues $\lambda$ such that $\rho(A)=|\lambda|$, and let $\mathcal{V}_1$ be the subspace of right singular vectors of $A$ associated with the maximal singular value $\sigma_1=\|A\|_2$ of $A$ (or, if you like, the eigenspace of $A^*A$ corresponding to the eigenvalue $\sigma_1^2$).

We have that $\rho(A)=\|A\|_2$ if and only if $\mathcal{S}_{\rho}\cap\mathcal{V}_1\neq\{0\}$ (that is, the intersection of the two subspaces is not trivial).

Note that $\|Ax\|_2=\rho(A)\|x\|_2$ for all $x\in\mathcal{S}_{\rho}$, since $Ax=\lambda x$ for some $\lambda$ such that $|\lambda|=\rho(A)$; that $\|Ax\|_2=\sigma_1\|x\|_2$ for all $x\in\mathcal{V}_1$; and that $\|Ax\|_2<\sigma_1\|x\|_2$ for all nonzero $x\not\in\mathcal{V}_1$.

Hence if $\mathcal{S}_{\rho}\cap\mathcal{V}_1$ is nontrivial, then $\rho(A)\|x\|_2=\sigma_1\|x\|_2$ for some nonzero $x\in\mathcal{S}_{\rho}\cap\mathcal{V}_1$ and hence $\rho(A)=\sigma_1$. On the other hand, if $\mathcal{S}_{\rho}\cap\mathcal{V}_1$ is trivial then any nonzero $x\in\mathcal{S}_{\rho}$ is not contained in $\mathcal{V}_1$ and thus $\rho(A)\|x\|_2=\|Ax\|_2<\sigma_1\|x\|_2$ for all nonzero $x\in\mathcal{S}_{\rho}$ resulting in $\rho(A)<\sigma_1$.
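This criterion can be sanity-checked numerically (a NumPy sketch; the matrices are arbitrary illustrative choices). For the non-normal matrix below, the eigenvector for the largest-modulus eigenvalue is not a top right singular vector, so $\rho(A)<\sigma_1$; for the normal matrix, the two subspaces intersect and equality holds.

```python
import numpy as np

# Non-normal: the eigenvector spanning S_rho is NOT in V_1.
A = np.array([[1.0, 1.0],
              [0.0, 2.0]])
w, vecs = np.linalg.eig(A)
x = vecs[:, np.argmax(abs(w))]           # eigenvector for rho(A) = 2
rho_A = max(abs(w))
sig_A = np.linalg.norm(A, 2)
ratio_A = np.linalg.norm(A @ x) / np.linalg.norm(x)   # = rho(A) < sigma_1

# Normal: the eigenvector for rho(B) = 3 is a top right singular vector.
B = np.array([[2.0, 1.0],
              [1.0, 2.0]])
wB, vecsB = np.linalg.eig(B)
y = vecsB[:, np.argmax(abs(wB))]
sig_B = np.linalg.norm(B, 2)
ratio_B = np.linalg.norm(B @ y) / np.linalg.norm(y)   # = sigma_1(B)
```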

Let $$ A=U\Sigma V^* $$ be the SVD of $A$ and consider the partitioning $$ A=[U_1,\tilde{U}]\begin{bmatrix}\sigma_1 I_{n_1} & 0 \\ 0 & \tilde{\Sigma}\end{bmatrix}\begin{bmatrix}V_1^*\\\tilde{V}^*\end{bmatrix}, $$ that is, $$ AV_1=\sigma_1 U_1, \quad A\tilde{V}=\tilde{U}\tilde{\Sigma}, $$ where the columns of $V_1$ span the subspace $\mathcal{V}_1$. If $\mathcal{S}_{\rho}\cap\mathcal{V}_1$ is nontrivial, we can find a unitary transformation $M\in\mathbb{C}^{n_1\times n_1}$ such that $V_1M=[V_{\rho},V_{\rho}^{\perp}]$, where the columns of $V_{\rho}$ form an orthonormal basis of $\mathcal{S}_{\rho}\cap\mathcal{V}_1$ and $V_{\rho}^{\perp}$ spans the remainder of $\mathcal{V}_1$. Therefore $$ AV_1M=A[V_{\rho},V_{\rho}^{\perp}]=\sigma_1 U_1 M = \sigma_1[U_{\rho},U_{\rho}^{\perp}], $$ where the partitioning of $U_1M$ conforms to the partitioning of $V_1M$. Since $Ax=\lambda x$ with $|\lambda|=\rho(A)$ for all $x$ in the range of $V_{\rho}$, the columns of $U_{\rho}$ and $V_{\rho}$ are related by $U_{\rho}=V_{\rho}D$, where $D=\mathrm{diag}(e^{i\phi_1},\ldots,e^{i\phi_k})$ and $k$ is the dimension of $\mathcal{S}_{\rho}\cap\mathcal{V}_1$.

Therefore, in terms of SVD:

$\rho(A)=\|A\|_2$ iff $A$ has an SVD $A=U\Sigma V^*$ with $\Sigma=\mathrm{diag}(\sigma_1,\ldots,\sigma_n)$, $\sigma_1\geq\cdots\geq\sigma_n$, $U=[u_1,\ldots,u_n]$, $V=[v_1,\ldots,v_n]$, such that $u_j=e^{i\phi_j}v_j$, $j=1,\ldots,k$, for some $k>0$.
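As a numerical illustration of this characterization (a NumPy sketch; the matrices are arbitrary choices): for a normal matrix, where $\rho(A)=\|A\|_2$, the top left and right singular vectors agree up to a unimodular factor, so $|\langle u_1,v_1\rangle|=1$; for a matrix with $\rho(A)<\|A\|_2$ they are not parallel.

```python
import numpy as np

# Normal matrix: rho(A) = ||A||_2, so u_1 = e^{i phi} v_1.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
U, S, Vh = np.linalg.svd(A)
# Both u_1 and v_1 are unit vectors, so parallelism <=> |<u_1, v_1>| = 1.
overlap_normal = abs(np.vdot(U[:, 0], Vh.conj().T[:, 0]))

# Non-normal matrix with rho(C) < ||C||_2: u_1 and v_1 are not parallel.
C = np.array([[1.0, 1.0],
              [0.0, 2.0]])
Uc, Sc, Vhc = np.linalg.svd(C)
overlap_strict = abs(np.vdot(Uc[:, 0], Vhc.conj().T[:, 0]))
```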


Consider the Schur decomposition of $A$, that is, let $U$ be unitary and $T$ be upper triangular such that $A = UTU^*$. I will leave it to you to verify that $\|T\|_2 = \|A\|_2$.

We can then write $T$ in the form $$ T = \pmatrix{\lambda_1 & v^*\\0&T'} $$ where $\lambda_1$ is the eigenvalue with greatest magnitude (the Schur form can be ordered so that it appears first), $v$ is a vector, and $T'$ is a smaller upper-triangular matrix. Let the vector $x$ be arbitrary with $\|x\| = 1$, and write it as a block vector in the form $x^T = (x_1, (x')^T)$. We then have $$ T^*T = \pmatrix{|\lambda_1|^2 & (\lambda_1 v)^*\\ \lambda_1 v & vv^* + (T')^*T'} $$

$$ x^*T^*Tx = \pmatrix{\overline{x_{1}}&(x')^*} \pmatrix{|\lambda_1|^2 & (\lambda_1 v)^*\\ \lambda_1 v & vv^* + (T')^*T'} \pmatrix{x_1\\x'} = \\ |\lambda_1|^2 |x_1|^2 + 2 \text{Re}\left\{\lambda_1 x_1 (x')^* v\right\} + (x')^* [vv^* + (T')^*T'] x' $$ And the question you have, then, is under what conditions (on $v$ and $T'$) we can guarantee that this total is bounded above by $|\lambda_1|^2$ for every unit vector $x$.
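As a sanity check, the block formula can be compared against a direct evaluation of $x^*T^*Tx$ (a NumPy sketch; all of the particular numbers below are arbitrary illustrative choices):

```python
import numpy as np

# An upper-triangular T in the block form [[lambda_1, v^*], [0, T']].
lam1 = 2.0 + 1.0j
v = np.array([0.5 - 0.3j, 1.0j])
Tp = np.array([[1.0, 0.7],
               [0.0, 1.5]], dtype=complex)
T = np.block([[np.array([[lam1]]), v.conj()[None, :]],
              [np.zeros((2, 1)), Tp]])

# An arbitrary test vector, partitioned as x = (x_1, x').
x1 = 0.6 + 0.2j
xp = np.array([0.3, -0.4 + 0.5j])
x = np.concatenate(([x1], xp))

direct = np.vdot(T @ x, T @ x).real      # x^* T^* T x = ||Tx||^2
formula = (abs(lam1)**2 * abs(x1)**2
           + 2 * (lam1 * x1 * np.vdot(xp, v)).real      # 2 Re{lam1 x1 (x')^* v}
           + np.vdot(xp, (np.outer(v, v.conj()) + Tp.conj().T @ Tp) @ xp).real)
```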

So, that's not a necessary and sufficient condition, but it certainly gets you closer.

In particular, we can say that if $v = 0$ and $\|T'\|_2 \leq |\lambda_1|$, then your condition is satisfied, which is more general than requiring that $A$ be normal.
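A quick numerical illustration of this sufficient condition (a NumPy sketch; the particular $\lambda_1$ and $T'$ are arbitrary choices): build such a $T$ directly, conjugate by a random orthogonal matrix, and check that the resulting $A$ satisfies $\|A\|_2 = |\lambda_1|$.

```python
import numpy as np

rng = np.random.default_rng(0)

# T = [[lambda_1, 0], [0, T']] with v = 0 and ||T'||_2 <= |lambda_1|.
lam1 = 3.0
Tp = np.array([[1.0, 0.5],
               [0.0, 2.0]])                 # ||T'||_2 < 3 = |lambda_1|
T = np.zeros((3, 3))
T[0, 0] = lam1
T[1:, 1:] = Tp                              # the v block stays zero

# A = Q T Q^* with Q orthogonal has the same 2-norm as T.
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))
A = Q @ T @ Q.T

sig1 = np.linalg.norm(A, 2)                 # equals |lambda_1| = 3
```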

Ben Grossmann