Singular value decomposition in the language of operator theory

Question

Let $H_i$ be a $\mathbb R$-Hilbert space, $A\in\mathfrak L(H_1,H_2)$ be compact, $|A|:=\sqrt{A^\ast A}$ and $\sigma\in\mathbb R$.

How would we describe the singular value decomposition of $A$ in the language of operator theory? (Assuming $\dim H_i\in\mathbb N$, if necessary.)

To fix terminology, say that $\sigma>0$ is a singular value of $A$ if $\sigma$ is an eigenvalue of $|A|$, i.e. $\mathcal N(\sigma-|A|)\ne\{0\}$. This definition is equivalent to the claim that there are $x_i\in H_i$ with $\left\|x_i\right\|_{H_i}=1$ and $$Ax_1=\sigma x_2\text{ and }A^\ast x_2=\sigma x_1.\tag1$$
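
A short derivation of this equivalence, as a sketch that uses only $|A|^2=A^\ast A$ and $|A|\ge0$: if $|A|x_1=\sigma x_1$ with $\left\|x_1\right\|_{H_1}=1$, set $x_2:=\sigma^{-1}Ax_1$, so that $Ax_1=\sigma x_2$ holds by definition and $$A^\ast x_2=\sigma^{-1}A^\ast Ax_1=\sigma^{-1}|A|^2x_1=\sigma x_1,\qquad\left\|x_2\right\|_{H_2}^2=\sigma^{-2}\langle A^\ast Ax_1,x_1\rangle_{H_1}=1.$$ Conversely, if $Ax_1=\sigma x_2$ and $A^\ast x_2=\sigma x_1$, then $$|A|^2x_1=A^\ast Ax_1=\sigma A^\ast x_2=\sigma^2x_1,\quad\text{so}\quad(|A|+\sigma)(|A|-\sigma)x_1=0.$$ Since $\langle(|A|+\sigma)y,y\rangle_{H_1}\ge\sigma\left\|y\right\|_{H_1}^2$ for all $y\in H_1$, the operator $|A|+\sigma$ is injective for $\sigma>0$, and hence $|A|x_1=\sigma x_1$.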

By the Courant-Rayleigh minimax principle, we may enumerate the singular values of $A$ in nonincreasing order. So, let $\sigma_i(A)$ denote the $i$th largest singular value of $A$ for $i\in\mathbb N$. (If there are only $k$ different singular values, $\sigma_i(A)=0$ for all $i>k$.)

Now we may mimic some parts of the spectral theorem for compact self-adjoint operators: Let \begin{align}E_i&:=\mathcal N(\sigma_i(A)-|A|),\\d_i&:=\dim E_i\end{align} and $\left(e^{(i)}_1,\ldots,e^{(i)}_{d_i}\right)$ be an orthonormal basis of $E_i$ for $i\in\mathbb N$ and \begin{align}(\sigma_i)_{i\in\mathbb N}:=(\underbrace{\sigma_1(A),\ldots,\sigma_1(A)}_{=:\:d_1\text{ times}},\underbrace{\sigma_2(A),\ldots,\sigma_2(A)}_{=:\:d_2\text{ times}},\ldots),\\(e_i)_{i\in\mathbb N}:=\left(e^{(1)}_1,\ldots,e^{(1)}_{d_1},e^{(2)}_1,\ldots,e^{(2)}_{d_2},\ldots\right).\end{align} Then $(e_i)_{i\in\mathbb N}$ is an orthonormal basis of $\mathcal N(A)^\perp$ (since $\mathcal N(A)=\mathcal N(|A|)$) and $$|A|x_1=\sum_{i\in\mathbb N}\sigma_i\langle x_1,e_i\rangle_{H_1}e_i\tag2.$$

How do we proceed from here? And how is this related to the polar decomposition$^1$ of $A$?

$^1$ There is a unique partial isometry $U$ from $H_1$ to $H_2$ with $\mathcal N(U)=\mathcal N(A)$ and $A=U|A|$.

Martin Argerami · Accepted Answer · 2020-05-13T20:11:47.957

The singular value decomposition is obtained from the polar decomposition, together with the spectral theorem.

The polar decomposition gives you $A=V|A|$, where $V$ is a partial isometry such that $\operatorname{ran}V^*V=\overline{\operatorname{ran}A^*}$, and $|A|=(A^*A)^{1/2}$. Since $|A|\in B(H_1)$ is positive and compact, we apply the Spectral Theorem to obtain $$\tag1 |A|=\sum_{j=1}^\infty\sigma_j\,P_j, $$ where $\sigma_1\geq\sigma_2\geq\cdots\geq0$ and each $P_j$ is a rank-one projection. We can rewrite $(1)$ as $$\tag2 |A|=U^*DU, $$ where $U$ is a unitary and $D$ is the diagonal operator (in the canonical basis, say) with diagonal $\sigma_1,\sigma_2,\ldots$

Then $$\tag3 A=VU^*DU=WDU $$ where $D$ is as above, $W$ is a partial isometry, and $U$ is unitary.

An often more useful way of writing this is to choose unit vectors $e_j$ with $P_je_j=e_j$ (so they form an orthonormal basis of the range of $|A|$) and write $(1)$ as $$\tag4 |A|=\sum_{k=1}^\infty\sigma_k\,\langle\cdot,e_k\rangle \,e_k. $$ Then $$\tag5 A=V|A|=\sum_{k=1}^\infty\sigma_k\,\langle\cdot,e_k\rangle \,Ve_k. $$ As $V$ is an isometry on $\operatorname{ran}|A|$, we get that $\{Ve_k\}$ is orthonormal. So the Singular Value Decomposition can be restated as saying

If $A\in L(H_1,H_2)$ is compact there exist orthonormal families $\{e_k\}\subset H_1$ and $\{f_k\}\subset H_2$ such that $$\tag6 A=\sum_{k=1}^\infty\sigma_k\,\langle\cdot,e_k\rangle \,f_k. $$
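
For a finite-dimensional sanity check of $(6)$, the following sketch builds $\sigma_k$, $e_k$ and $f_k$ for a hypothetical $2\times2$ real matrix along the route described above: diagonalize $A^*A=|A|^2$, take square roots of the eigenvalues, and set $f_k=Ae_k/\sigma_k$ (which is $Ve_k$). The matrix and all helper names are illustrative assumptions, and the closed-form eigenvector formula assumes that the off-diagonal entry of $A^*A$ is nonzero and that every $\sigma_k>0$.

```python
import math

# Hypothetical example matrix (given by rows); chosen only for illustration.
A = [[3.0, 0.0],
     [4.0, 5.0]]

def dot(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))

def matvec(M, x):
    return [dot(row, x) for row in M]

# Gram matrix G = A^T A = |A|^2, symmetric: [[a, b], [b, c]].
cols = [list(col) for col in zip(*A)]
a, b, c = dot(cols[0], cols[0]), dot(cols[0], cols[1]), dot(cols[1], cols[1])

# Eigenvalues of a symmetric 2x2 matrix in closed form, largest first.
mean, radius = (a + c) / 2, math.hypot((a - c) / 2, b)
eigenvalues = [mean + radius, mean - radius]

def unit_eigenvector(lam):
    # (b, lam - a) spans the kernel of G - lam; this assumes b != 0.
    v = [b, lam - a]
    norm = math.hypot(v[0], v[1])
    return [v[0] / norm, v[1] / norm]

sigma = [math.sqrt(lam) for lam in eigenvalues]      # eigenvalues of |A|
e = [unit_eigenvector(lam) for lam in eigenvalues]   # orthonormal family in H_1
# f_k = A e_k / sigma_k, i.e. V e_k for the polar part V; assumes sigma_k > 0.
f = [[y / s for y in matvec(A, ek)] for s, ek in zip(sigma, e)]

def svd_apply(x):
    """Evaluate sum_k sigma_k <x, e_k> f_k."""
    out = [0.0, 0.0]
    for s, ek, fk in zip(sigma, e, f):
        coeff = s * dot(x, ek)
        out = [o + coeff * fi for o, fi in zip(out, fk)]
    return out
```

Comparing `svd_apply(x)` with `matvec(A, x)` on a few vectors `x` checks the expansion numerically, and `dot(f[0], f[1])` being close to zero reflects that $V$ is isometric on the range of $|A|$.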

answered May 13 '20 at 04:06

As I read your answer, I've remembered that I already knew the answer some time ago. The funny thing is that I've answered a related question asked by you ... I'm hopping between too many disciplines recently; repeatedly forgetting things I already knew. – 0xbadf00d May 13 '20 at 04:45
BTW, is there an established naming for the operators $A^\ast A$ and $|A|$? – 0xbadf00d May 13 '20 at 04:49
Not super standard, but it is common to say that $|A|$ is the absolute value of $A$. – Martin Argerami May 13 '20 at 05:09
Thanks for mentioning it. Oh, and please note that something went wrong in your identity $V^\ast V=\overline{\operatorname{ran}A^\ast}$. We've clearly got $\mathcal N(V)=\mathcal N(A)$ and hence $\overline{\mathcal R(A^\ast)}=\mathcal N(V)^\perp$. – 0xbadf00d May 13 '20 at 10:07
Everything in your post is clear to me. Especially the "more useful" form. But I fail to see how we construct the unitary operator $U$. I've asked for that separately: https://math.stackexchange.com/q/3657914/47771. – 0xbadf00d May 13 '20 at 14:18
And there is another subtlety: You say the $e_j$ form an orthonormal basis of $\mathcal R(|A|)$, but I think they actually even form an orthonormal basis of $\mathcal N(A)^\perp=\mathcal N(|A|)^\perp=\overline{\mathcal R(|A|)}$. (And note that $\mathcal R(|A|)$ doesn't need to be closed; hence it need not be a Hilbert space itself.) Beyond that, I'm struggling a bit to answer the question whether $\operatorname{rank}|A|=\operatorname{rank}A$. It is surely true when $H_2$ is finite dimensional (since then $\overline{\mathcal R(|A|)}=\mathcal R(|A|)$). – 0xbadf00d May 13 '20 at 14:48
Your first question: $\mathcal N(V)^\perp=\mathcal N(V^\ast V)^\perp=\mathcal R(V^\ast V)$, so $V^\ast V$ is the projection onto $\overline{\mathcal R(A^\ast)}$. – Martin Argerami May 13 '20 at 14:58
Second question: it's a basis for the range of $|A|$: you can see it explicitly in $(4)$. And as $V$ is an isometry on the range of $|A|$ it preserves dimension, so the rank is the same. – Martin Argerami May 13 '20 at 15:06
First question: What I meant was that your identity doesn't make sense, since the left-hand side is an operator, while the right-hand side is a space. – 0xbadf00d May 13 '20 at 15:25
Second question: There seems to be something I'm missing. I've read that the orthonormal basis $(e_i)$ from the spectral theorem (which is only an orthonormal basis of $\mathcal R(|A|)$ as you say) can be supplemented to an orthonormal basis of $H_1$ by an orthonormal basis of $\mathcal N(|A|)$. How is this possible? I thought this would be an application of $H_1=\mathcal N(|A|)\oplus\mathcal N(|A|)^\perp$, but if $(e_i)$ is not an orthonormal basis of $\mathcal N(|A|)^\perp$, I don't get why this should hold. – 0xbadf00d May 13 '20 at 15:28
Yes, one can make the distinction. It doesn't give you any information, though: a projection identifies naturally with its range. – Martin Argerami May 13 '20 at 15:29
Yes, you have $\{e_j\}^\perp=R(|A|)^\perp=\ker |A|$. Not sure what the problem is. – Martin Argerami May 13 '20 at 15:34
What are you denoting by $\{e_j\}^\perp$? My problem is that $H_1=M\oplus M^\perp$ only holds for closed subspaces $M$. If $(e_i)_{i\in I}$ is an orthonormal basis of $\mathcal R(|A|)$, how do you express an element of $\overline{\mathcal R(|A|)}\setminus\mathcal R(|A|)$ only with $(e_i)_{i\in I}$ and an orthonormal basis of $\mathcal N(|A|)$? – 0xbadf00d May 13 '20 at 17:16
The orthogonal of a set is the set of those elements orthogonal to each element of the set; you've been using this notation all the time. As for your other questions, how is an "orthonormal basis of $M$" different from an "orthonormal basis of $\overline M$"? – Martin Argerami May 13 '20 at 17:26
Is this a question or are you saying that an orthonormal basis of $M$ is automatically an orthonormal basis of $\overline M$? I call an orthonormal family $(e_i)_{i\in I}$ an orthonormal basis of a subset $M$ of a Hilbert space $H$ if $x=\sum_{i\in I}\langle x,e_i\rangle_He_i$ for all $x\in M$. – 0xbadf00d May 13 '20 at 17:39
Yes. And what would an orthonormal basis for $\overline M$ be? If you can enlarge the basis, you would get that $M$ is not dense in $\overline M$. – Martin Argerami May 13 '20 at 17:43
Now you're confusing me. You say that if $(e_i)_{i\in I}$ is an orthonormal basis of $M$, then it must also be an orthonormal basis of $\overline M$? But this is precisely what I said initially for $M=\mathcal R(|A|)$, isn't it? Sorry if I'm missing something obvious. – 0xbadf00d May 13 '20 at 17:46
I feel a bit stupid, since you definitely have more knowledge than me, but I really don't get it. I also struggle to see what you wrote about $\operatorname{rank}|A|$. You say that $V$ is an isometry on $\mathcal R(|A|)$, but I would say it is even an isometry on $\overline{\mathcal R(|A|)}$. However, from the fact that it is an isometry on $\mathcal R(|A|)$ we can infer that $\operatorname{rank}|A|=\dim\mathcal R(|A|)=\dim V\mathcal R(|A|)$ and now I guess we just need to notice that $V\mathcal R(|A|)=\mathcal R(V|A|)=\mathcal R(A)$. – 0xbadf00d May 13 '20 at 18:31
In any case, I think you can strengthen your conclusion in your answer: $(e_i)_{i\in I}$ and $(f_i)_{i\in I}$ are not arbitrary orthonormal families of $H_1$ and $H_2$, respectively. They are orthonormal bases of $\mathcal R(|A|)$ and $\mathcal R(A)$, respectively. In particular, they have the same dimension. It would be great if you could fill the gaps which are still open in my last two comments. Even if it seems to be obvious for you. – 0xbadf00d May 13 '20 at 19:10
At this stage, I don't know what you understand and what you don't. Talking about an orthonormal basis of a non-closed space makes little sense. Like saying that $\{e^{2\pi i k t}\}_k$ is an orthonormal basis of $C[0,1]$ inside $L^2[0,1]$; it's already a basis of $L^2[0,1]$, so making the distinction is kind of pointless. That they are orthonormal bases of $\overline{\mathcal R(|A|)}$ and $\overline{\mathcal R(A)}$ respectively, follows directly from $(6)$; so I have no idea what gaps you are talking about. – Martin Argerami May 13 '20 at 20:32
Then we simply got a major misunderstanding. That the $\{e_j\}$ are an orthonormal basis of $\overline{\mathcal R(|A|)}$ was what I wrote initially and I felt like you've corrected me that they are only an orthonormal basis of $\mathcal R(|A|)$ in your third comment. – 0xbadf00d May 14 '20 at 04:10
I don't know what correction you refer to. But, more importantly, what difference could it possibly make? Again: an orthonormal basis cannot be "only" the orthonormal basis of a non-closed subspace. An orthonormal basis of a subspace is an orthogonal set of vectors that is maximally orthogonal in the subspace. If it is maximal orthogonal in a subspace, it is maximal orthogonal in the closure; so the distinction you are trying to make is absolutely irrelevant. – Martin Argerami May 14 '20 at 06:11
Yes, you're right. And that was why I was so confused by your correction. But I guess you didn't try to correct me and I just understood your comment totally wrong. – 0xbadf00d May 14 '20 at 06:13
I cannot comment/edit/delete the "correction" because I don't know what it is. – Martin Argerami May 14 '20 at 06:21
How did you obtain that $W=VU^\ast$ is a partial isometry? And what are $\mathcal N(W)$ and $\mathcal R(W)$? Clearly, $U$ is unitary and hence bijective. $V$ is a partial isometry with $\mathcal N(V)=\mathcal N(A)=\mathcal N(|A|)$ and $\mathcal N(W)=\{x\in H_1:U^\ast x\in\mathcal N(V)\}$ ... – 0xbadf00d May 15 '20 at 18:53
Since $U$ is a unitary, $W^*W=UV^*VU^*$, which is a projection; that's enough to imply that $W$ is a partial isometry. The equality $(3)$ tells you that $R(A)\subset R(W)$. Also, $WW^*=VU^*UV^*=VV^*$, so $R(W)=R(V)$. – Martin Argerami May 15 '20 at 19:18
Thanks. It's clear to me that $W^\ast W$ is an orthogonal projection and hence $W$ is a partial isometry. $\mathcal R(A)\subseteq\mathcal R(W)$ is trivial from the definition. But how did you conclude $\mathcal R(W)=\mathcal R(V)$ from $WW^\ast=VV^\ast$? – 0xbadf00d May 16 '20 at 04:59
As $W$ is a partial isometry, the projection $WW^*$ is the projection onto $R(W)$. – Martin Argerami May 16 '20 at 14:28
Can we alter the definition of $W$ and $D$ so that $W$ is an orthogonal operator on $H_2$? I've asked this question for the finite-dimensional case here: https://math.stackexchange.com/q/3656131/47771. But I'd be interested in the general case as well. – 0xbadf00d May 17 '20 at 19:04
I don't think so. If $A$ is a weighted unilateral shift, then $U=I$ and $W$ is the unilateral shift. – Martin Argerami May 17 '20 at 20:32
I'm not sure why this example shows that the desired construction is not possible. Maybe you've got my intent wrong? I'd like to have a result like Theorem 1.1 here: https://www2.math.ethz.ch/education/bachelor/lectures/hs2014/other/linalg_INFK/svdneu.pdf. – 0xbadf00d May 18 '20 at 05:02
Yes, that's exactly what I answered. If, in your notation, $$A=\sum_n\frac1n\,e_{n+1}\otimes e_n,$$ then $A$ cannot be written as $WDV$ with $D$ diagonal and $W,V$ unitaries. – Martin Argerami May 18 '20 at 05:44
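One way to see the obstruction, as a sketch assuming the convention $(e\otimes f)x=\langle x,f\rangle e$ (so that $Ae_n=\frac1ne_{n+1}$) and writing $H$ for the underlying space: $A$ is injective, while $e_1\perp\mathcal R(A)$. If $A=WDV$ with $W,V$ unitary and $D$ diagonal with entries $d_n$ with respect to some orthonormal basis $(g_n)$ of $H$, then $$\mathcal N(A)=\{0\}\implies d_n\ne0\text{ for all }n\implies g_n=D\left(d_n^{-1}g_n\right)\in\mathcal R(D)\implies\overline{\mathcal R(A)}=W\,\overline{\mathcal R(D)}=H,$$ which contradicts $e_1\perp\mathcal R(A)$.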
I see ... So, this is only possible in the finite-dimensional case, right? – 0xbadf00d May 18 '20 at 06:33
Yes, in finite dimensions you can do the polar decomposition with a unitary. – Martin Argerami May 18 '20 at 14:53
I'm wondering whether the SVD is uniquely characterized by assuming that $W$ is a partial isometry: https://math.stackexchange.com/q/3693892/47771. I'm not sure whether this is a general property of partial isometries, but at least the specific one you've constructed in your answer has "$|I|$ orthogonal columns": Since $V^\ast V$ is the orthogonal projection onto $\overline{\mathcal R(|A|)}$ and $(e_i)_{i\in I}$ is an ONB of this space, $$\langle f_m,f_n\rangle_{H_2}=\begin{cases}\delta_{mn}&\text{, if }m,n\in I\\0&\text{otherwise}\end{cases}$$ for all $m,n\in N_1=\mathbb N\cap[1,\dim H_1]$. – 0xbadf00d May 28 '20 at 15:58
From this we obtain $$W^\ast W=\sum_{i\in I}\tilde e_i\otimes e_i$$ (sum in the SOT), where $(\tilde e_n)_{n\in N_1}$ is an arbitrary ONB of $H_1$ we have chosen such that $D=\sum_{i\in I}\sigma_i\tilde e_i\otimes\tilde e_i$, and $$WW^\ast=\sum_{n\in N_1}f_n\otimes f_n.$$ What I'm trying to achieve is to see why $A^\ast A=U^\ast DW^\ast WDU=U^\ast D^2U$ (i.e. why the $W^\ast W$ can be removed). – 0xbadf00d May 28 '20 at 15:58
