Bounding the Hilbert-Schmidt norm of a certain linear operator

Question

This is a follow-up to the post "A bound for the Hilbert-Schmidt norm of a linear operator without using commutivity or simultaneous diagonalizability".

Consider an infinite dimensional separable Hilbert space $\mathcal{H}$, and let $A$, $B_m$, and $L$ denote linear, compact operators. Suppose further that $\|B_m\|_{op} \le \rho^m$ for some $0<\rho < 1$, and $L$ is symmetric and positive definite, so that the spectral theorem gives

$$ L(\cdot) = \sum_{\ell=1}^\infty \lambda_\ell \langle \phi_\ell,\cdot\rangle \phi_\ell, \;\; \sum_{\ell=1}^\infty \lambda_\ell < \infty. $$ We define a pseudo-inverse of $L$ as

$$ L^{-1}\pi_n(\cdot) = \sum_{\ell=1}^n \frac{ \langle \phi_\ell,\cdot\rangle}{\lambda_\ell} \phi_\ell, $$ where $\pi_n$ is the projection onto $\operatorname{span}(\phi_1,\dots,\phi_n)$. We assume that $$ D = \sum_{\ell=1}^\infty \frac{ \|A(\phi_{\ell})\|^2}{\lambda_\ell} < \infty. $$ What I then want to show is that

$$ \xi_m =\|L^{1/2}B_mL^{-1}\pi_n A^* \|_{HS}^2 = \sum_{j=1}^\infty \|L^{1/2}B_mL^{-1}\pi_n A^*(\phi_j)\|^2 \le \text{Const.}\, \rho^m. $$ This is relatively easy to show if $B_m$ and $L^{1/2}$ commute, but in general it is not true: Stephen Montgomery-Smith (https://math.stackexchange.com/users/22016/stephen-montgomery-smith) devised a very nice counterexample.
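
For readers who want to experiment, here is a minimal numerical sketch of $\xi_m$ in a finite-dimensional truncation. It assumes all operators are written as matrices in the eigenbasis $(\phi_\ell)$ of $L$. The dimension `N`, the eigenvalues $\lambda_\ell = \ell^{-2}$, and the random `A` and `R` are illustrative choices of mine, not part of the problem.

```python
import numpy as np

def xi(lam, B, A, n):
    """Squared Hilbert-Schmidt norm of L^{1/2} B L^{-1} pi_n A^*,
    with all operators written as matrices in the eigenbasis of L."""
    lam = np.asarray(lam, dtype=float)
    inv_n = np.zeros_like(lam)
    inv_n[:n] = 1.0 / lam[:n]            # diagonal of L^{-1} pi_n
    # Row scaling applies L^{1/2}; column scaling applies L^{-1} pi_n.
    M = (np.sqrt(lam)[:, None] * B * inv_n) @ A.conj().T
    return np.linalg.norm(M, "fro") ** 2

N, n, rho = 50, 20, 0.5
lam = 1.0 / np.arange(1, N + 1) ** 2     # assumed summable eigenvalues
rng = np.random.default_rng(0)
A = rng.standard_normal((N, N)) * lam    # column l is A phi_l, scaled by lam_l
R = rng.standard_normal((N, N))
R *= rho / np.linalg.norm(R, 2)          # rescale so that ||R||_op = rho
values = [xi(lam, np.linalg.matrix_power(R, m), A, n) for m in range(1, 6)]
print(values)
```

Such a truncation can only suggest how $\xi_m$ behaves in $m$ and $n$; it does not settle the infinite-dimensional question.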

My question now is: is it possible to establish this bound if instead $B_m = R^m$, where $R$ is a linear operator satisfying $\|R\|_{op}<1$?

This seems more likely to hold now since, in reference to the counterexample devised by @Stephen Montgomery-Smith, the range of $B_m$ can no longer shift around as before to meet eigenspaces with larger eigenvalues. Any ideas from you smart folks are much appreciated!

Frederik vom Ende · Accepted Answer · 2023-01-19T07:07:54.173

I strongly suspect that it is not enough to assume that $\sum_{\ell=1}^\infty \frac{ \|A\phi_{\ell}\|^2}{\lambda_\ell} < \infty$, but one rather needs $\sum_{\ell=1}^\infty \frac{ \|A\phi_{\ell}\|^2}{\lambda_\ell^{\bf 2}} < \infty$ (similar to what you state in your original post). This claim will be based on the following explicit calculation for the general case.

Because $L$ in particular is a normal trace-class operator (as you assume $L$ is compact with $\sum_\ell|\lambda_\ell|=\sum_\ell\lambda_\ell<\infty$) between separable Hilbert spaces, there exists an orthonormal basis $(\phi_j)_{j=1}^\infty$ of $\mathcal H$ and an $\ell^1$-sequence $\lambda$ such that $L=\sum_{j=1}^\infty\lambda_j |\phi_j\rangle\langle\phi_j|$. The basis property is important for evaluating the Hilbert-Schmidt norm, as for any bounded operator $X$ between separable Hilbert spaces one has $\|X\|_\mathrm{HS}^2=\sum_{j,k}|\langle h_j,Xg_k\rangle|^2$ where $(g_k)_k,(h_j)_j$ are arbitrary orthonormal bases of the respective spaces (cf. Proposition 16.10 in "Introduction to Functional Analysis" (1997) by Meise & Vogt). Now define $D:=\sum_{\ell=1}^\infty \frac{ \|A\phi_{\ell}\|^2}{\lambda_\ell^{\bf 2}} $ and assume $D$ is finite. For all $m,n$ we compute \begin{align*} \|L^{1/2}B_mL^{-1}\pi_n A^* \|_{\mathrm{HS}}^2 &=\sum_{j,k=1}^\infty\big|\langle\phi_j, L^{1/2}B_m(L^{-1}\pi_n) A^* \phi_k\rangle\big|^2\\ &=\sum_{j,k=1}^\infty\Big|\Big\langle L^{1/2}\phi_j, B_m\Big( \sum_{\ell=1}^n \frac{ |\phi_\ell\rangle\langle \phi_\ell|}{\lambda_\ell} \Big) A^* \phi_k\Big\rangle\Big|^2\\ &= \sum_{j,k=1}^\infty\lambda_j\Big|\sum_{\ell=1}^n \langle\phi_j,B_m\phi_\ell\rangle\frac{\langle\phi_\ell,A^*\phi_k\rangle}{\lambda_\ell} \Big|^2\\ &\leq \sum_{j,k=1}^\infty\lambda_j\Big(\sum_{\ell=1}^n |\langle\phi_j,B_m\phi_\ell\rangle|^2\Big)\Big(\sum_{\ell=1}^n\frac{|\langle\phi_\ell,A^*\phi_k\rangle|^2}{\lambda_\ell^2}\Big) \end{align*} where in the last step we used the Cauchy-Schwarz inequality on $\mathbb C^n$. Because the numbers we are summing up are non-negative, by a version of Fubini's theorem we are free to interchange summation as we please.
Hence, using Bessel's inequality in the form $\sum_{\ell=1}^n |\langle\phi_j,B_m\phi_\ell\rangle|^2=\sum_{\ell=1}^n |\langle\phi_\ell,B_m^*\phi_j\rangle|^2\leq\|B_m^*\phi_j\|^2$ together with Parseval's identity, we find \begin{align*} \|L^{1/2}B_mL^{-1}\pi_n A^* \|_{\mathrm{HS}}^2 &\leq \sum_{j=1}^\infty\sum_{\ell=1}^n\frac{\lambda_j}{\lambda_\ell^2} \|B_m^*\phi_j\|^2\sum_{k=1}^\infty|\langle\phi_k,A\phi_\ell\rangle|^2\\ &=\sum_{j=1}^\infty\sum_{\ell=1}^n\frac{\lambda_j}{\lambda_\ell^2} \|B_m^*\phi_j\|^2\|A\phi_\ell\|^2\\ &\leq \sum_{j,\ell=1}^\infty\frac{\lambda_j}{\lambda_\ell^2} \|B_m\|_{\mathrm{op}}^2\|A\phi_\ell\|^2\\ &=\Big( \sum_{j=1}^\infty\lambda_j\Big)\Big(\sum_{\ell=1}^\infty\frac{\|A\phi_\ell\|^2}{\lambda_\ell^2} \Big)\|B_m\|_{\mathrm{op}}^2=\|L\|_1D\|B_m\|_{\mathrm{op}}^2 \end{align*} Finally, if $\|B_m\|_{\mathrm{op}}\leq\rho^m$ for some $\rho\in(0,1)$ (so $\rho^2\leq\rho$), then one gets $$ \xi_m\leq D\|L\|_1\rho^{2m}=D\|L\|_1(\rho^2)^m\leq D\|L\|_1\rho^m $$ for all $m$, as desired.
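
The bound $\xi_m\leq\|L\|_1D\|B_m\|_{\mathrm{op}}^2$ can be sanity-checked numerically. The following is a rough sketch in a finite-dimensional truncation, where the matrices `A` and `R`, the eigenvalues $\lambda_\ell=\ell^{-2}$, and the sizes are arbitrary choices made only for illustration; $D$ is computed with $\lambda_\ell^2$ in the denominator, as in the calculation above.

```python
import numpy as np

rng = np.random.default_rng(1)
N, n, rho, m = 40, 25, 0.7, 3
lam = 1.0 / np.arange(1, N + 1) ** 2           # eigenvalues of L (illustrative)
A = rng.standard_normal((N, N)) * lam ** 1.5   # column l is A phi_l
R = rng.standard_normal((N, N))
R *= rho / np.linalg.norm(R, 2)                # rescale so that ||R||_op = rho
B = np.linalg.matrix_power(R, m)               # B_m = R^m, so ||B_m||_op <= rho^m

inv_n = np.zeros(N)
inv_n[:n] = 1.0 / lam[:n]                      # diagonal of L^{-1} pi_n
M = (np.sqrt(lam)[:, None] * B * inv_n) @ A.T  # L^{1/2} B_m L^{-1} pi_n A^*
xi_m = np.linalg.norm(M, "fro") ** 2

# D = sum_l ||A phi_l||^2 / lam_l^2 (column norms of A)
D = np.sum(np.linalg.norm(A, axis=0) ** 2 / lam ** 2)
bound = lam.sum() * D * np.linalg.norm(B, 2) ** 2
print(xi_m, bound, lam.sum() * D * rho ** (2 * m))
```

In this sketch the three printed numbers should appear in non-decreasing order, mirroring the chain of inequalities.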

The reason I believe $\lambda_\ell$ in the denominator of the constant $D$ is not enough in general is that $\xi_m$ inevitably features a factor $\lambda_{\ell}^{-2}$ (third step in the above calculation). If the constant $D$ only absorbs one of these, the second one has nowhere to go, thus making things diverge:

- the front $L^{1/2}$ cannot cancel it because the $B_m$ denies it access to the corresponding eigenvector $\phi_\ell$
- the $B_m$ cannot cancel it when bounding $|\langle\phi_j,B_m\phi_\ell\rangle|^2$ because there is no assumed connection between $B_m$ and the eigenvalues of $L$, also not if you choose $B_m:=R^m$ for some arbitrary (but fixed) $R$; see also the edit below

This is why, in my opinion, the only place the second $\lambda_\ell^{-1}$ can go is the constant $D$. This would also be in line with the counterexample Martin provided under your original post where he commented that the assumption $\sum_{\ell=1}^\infty \frac{ \|A\phi_{\ell}\|^2}{\lambda_\ell} < \infty$ is probably a typo and the denominator should feature $\lambda_\ell^2$ instead.
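
To make this heuristic concrete, here is a small numerical toy example. It is a hypothetical construction for illustration only and is not the counterexample from the original post. It takes $\lambda_\ell=\ell^{-2}$, a rank-one $A$ with $A\phi_\ell=\ell^{-1.6}\phi_1$, and a rank-one $B_m=\rho^m|\phi_1\rangle\langle v|$ with $v_\ell\propto\ell^{-0.6}$. Then $\sum_\ell\|A\phi_\ell\|^2/\lambda_\ell=\sum_\ell\ell^{-1.2}$ stays bounded, while $\xi_m$ is proportional to $\big(\sum_{\ell\le n}\ell^{-0.2}\big)^2$ and so keeps growing with $n$.

```python
import numpy as np

def xi_rank_one(n, N=400, rho=0.5, m=1):
    """xi_m for a rank-one toy choice of A and B_m (hypothetical example)."""
    ell = np.arange(1, N + 1, dtype=float)
    lam = ell ** -2.0                     # eigenvalues of L
    w = ell ** -1.6                       # A phi_l = w_l * e_1
    v = ell ** -0.6
    v /= np.linalg.norm(v)                # unit vector, so ||B_m||_op = rho^m
    e1 = np.zeros(N)
    e1[0] = 1.0
    A = np.outer(e1, w)
    B = rho ** m * np.outer(e1, v)
    inv_n = np.zeros(N)
    inv_n[:n] = 1.0 / lam[:n]             # diagonal of L^{-1} pi_n
    M = (np.sqrt(lam)[:, None] * B * inv_n) @ A.T
    return np.linalg.norm(M, "fro") ** 2

ell = np.arange(1, 401, dtype=float)
D_one = np.sum(ell ** -3.2 / ell ** -2.0)   # sum ||A phi_l||^2 / lam_l
D_two = np.sum(ell ** -3.2 / ell ** -4.0)   # sum ||A phi_l||^2 / lam_l^2
growth = [xi_rank_one(n) for n in (10, 100, 400)]
print(D_one, D_two, growth)
```

In this toy setting the version of $D$ with $\lambda_\ell^2$ in the denominator grows with the truncation size, which is consistent with the suspicion that the weaker assumption cannot give a bound uniform in $n$.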

Edit 1: There is another way to support my claim using $\|A\|_\mathrm{HS}^2=\operatorname{tr}(A^*A)$. Combining this with cyclicity of the trace one formally gets \begin{align*} \|L^{1/2}B_m(L^{-1}\pi_n)A\|_\mathrm{HS}^2&=\operatorname{tr}\big((L^{1/2}B_m(L^{-1}\pi_n)A)^*L^{1/2}B_m(L^{-1}\pi_n)A\big)\\ &=\operatorname{tr}\big( A^*(L^{-1}\pi_n)B_m^*LB_m(L^{-1}\pi_n)A \big)\\ &=\operatorname{tr}\big( (L^{-1/2}\pi_n)AA^*(L^{-1/2}\pi_n)(L^{-1/2}\pi_n)B_m^*LB_m(L^{-1/2}\pi_n) \big)\\ &\leq\big\| (L^{-1/2}\pi_n)AA^*(L^{-1/2}\pi_n) \big\|_{1}\big\|(L^{-1/2}\pi_n)B_m^*LB_m(L^{-1/2}\pi_n) \big\|_{\mathrm{op}} \end{align*} Note that until the last step -- where we used the usual trace norm inequality for products -- this was an equality, so no "loss of information" yet. The first factor in this upper bound (almost, if $A$ is replaced by $A^*$) equals your constant $D$ as \begin{align*} \big\| (L^{-1/2}\pi_n)AA^*(L^{-1/2}\pi_n) \big\|_{1}&=\operatorname{tr}\big( (L^{-1/2}\pi_n)AA^*(L^{-1/2}\pi_n) \big)\\ &=\sum_{\ell=1}^n\frac{\langle\phi_\ell,AA^*\phi_\ell\rangle}{\lambda_\ell}=\sum_{\ell=1}^n\frac{\|A^*\phi_\ell\|^2}{\lambda_\ell}\,. \end{align*} As before the problem is the other factor: a similar calculation shows \begin{align*} \big\|(L^{-1/2}\pi_n)B_m^*LB_m(L^{-1/2}\pi_n) \big\|_{\mathrm{op}}&\leq \sum_{j=1}^\infty\lambda_j\Big(\sum_{\ell=1}^n\frac{|\langle\phi_j,B_m\phi_\ell\rangle|}{\sqrt{\lambda_\ell}}\Big)^2\\ &\leq\Big(\sum_{j=1}^\infty\lambda_j\Big)\Big(\sum_{\ell=1}^n\frac{\|B_m\phi_\ell\|}{\sqrt{\lambda_\ell}}\Big)^2 \end{align*} which may diverge as $n\to\infty$ because, in general, there is no guarantee that $(\|B_m\phi_\ell\|\lambda_\ell^{-1/2})_{\ell=1}^\infty$ is summable. The (in my opinion only) way around this problem is the commutative case, i.e. $[B_m,L^{-1/2}\pi_n]=0$ for all $n\in\mathbb N$, because then \begin{align*} \big\|(L^{-1/2}\pi_n)B_m^*LB_m(L^{-1/2}\pi_n) \big\|_{\mathrm{op}}&=\big\|B_m^*(L^{-1/2}\pi_n)L(L^{-1/2}\pi_n)B_m \big\|_{\mathrm{op}}\\ &=\|B_m^*\pi_nB_m\|_\mathrm{op}\leq\|B_m\|_\mathrm{op}^2<\infty \end{align*}
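
The steps of Edit 1 can be checked numerically. The following is a minimal sketch in a finite-dimensional truncation with real random matrices `A` and `B` (illustrative assumptions, so adjoints are transposes). It checks the trace identity, the trace-norm inequality, the formula for the first factor, and the commuting case, with a diagonal matrix standing in for a $B_m$ that commutes with $L$.

```python
import numpy as np

rng = np.random.default_rng(2)
N, n = 30, 12
lam = 1.0 / np.arange(1, N + 1) ** 2
s = np.zeros(N)
s[:n] = lam[:n] ** -0.5                            # diagonal of L^{-1/2} pi_n
A = rng.standard_normal((N, N))
B = rng.standard_normal((N, N))

X = s[:, None] * (A @ A.T) * s                     # (L^{-1/2}pi_n) A A^* (L^{-1/2}pi_n)
Y = s[:, None] * (B.T @ (lam[:, None] * B)) * s    # (L^{-1/2}pi_n) B^* L B (L^{-1/2}pi_n)

M = (np.sqrt(lam)[:, None] * B * s ** 2) @ A       # L^{1/2} B (L^{-1} pi_n) A
hs_sq = np.linalg.norm(M, "fro") ** 2              # left-hand side
trace_xy = np.trace(X @ Y)                         # after cyclicity of the trace
upper = np.linalg.norm(X, "nuc") * np.linalg.norm(Y, 2)

# First factor: sum_{l<=n} ||A^* phi_l||^2 / lam_l (row norms of A)
first_factor = np.sum(np.linalg.norm(A, axis=1)[:n] ** 2 / lam[:n])

# Commuting case: a diagonal B commutes with L and with L^{-1/2} pi_n
d = rng.uniform(-1.0, 1.0, N)
Yd = np.diag(s * d * lam * d * s)
print(hs_sq, trace_xy, upper, np.linalg.norm(Yd, 2), np.max(d ** 2))
```

In the commuting case the second factor collapses to $\|B_m^*\pi_nB_m\|_\mathrm{op}$, which in this sketch is the largest of the first $n$ squared diagonal entries.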

Edit 2: One can re-formulate Edit 1 to show how the commutator $\Delta_{mn}:=[B_m,L^{-1/2}\pi_n]$ determines how good the upper bound for $\xi_m$ is (I'll leave out the ${}_\mathrm{op}$ for readability): \begin{align*} \big\|(L^{-1/2}\pi_n)B_m^*LB_m(L^{-1/2}\pi_n) \big\|&= \big\|\big(\Delta_{mn}+(L^{-1/2}\pi_n)B_m\big)^*L\big(\Delta_{mn}+(L^{-1/2}\pi_n)B_m\big) \big\|\\ &\leq \|\Delta_{mn}^*L\Delta_{mn}\|+2\|B_m^*\pi_nL^{1/2}\Delta_{mn}\|+\|B_m^*\pi_nB_m\|\\ &\leq \|\Delta_{mn}^*L^{1/2}L^{1/2}\Delta_{mn}\|+2\|B_m\|\|L^{1/2}\Delta_{mn}\|+\|B_m\|^2\\ &=(\|L^{1/2}\Delta_{mn}\|+\|B_m\|)^2 \end{align*} So this approach works whenever one can uniformly upper bound $\|L^{1/2}\Delta_{mn}\|$, e.g., in the case where the operators commute so $\Delta_{mn}=0$. Of course one could do something similar with the original $\xi_m$ and get an analogous (possibly even better) condition on $\Delta_{mn}$.
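
As a rough numerical illustration of the commutator bound, the sketch below compares both sides in a finite-dimensional truncation. The random matrix `B`, the eigenvalues, and the sizes are assumptions made only for this example.

```python
import numpy as np

rng = np.random.default_rng(3)
N, n = 30, 12
lam = 1.0 / np.arange(1, N + 1) ** 2
s = np.zeros(N)
s[:n] = lam[:n] ** -0.5
S = np.diag(s)                          # L^{-1/2} pi_n
L = np.diag(lam)
L_half = np.diag(np.sqrt(lam))

B = rng.standard_normal((N, N))
B *= 0.5 / np.linalg.norm(B, 2)         # rescale so that ||B||_op = 0.5

Delta = B @ S - S @ B                   # commutator [B, L^{-1/2} pi_n]
lhs = np.linalg.norm(S @ B.T @ L @ B @ S, 2)
rhs = (np.linalg.norm(L_half @ Delta, 2) + np.linalg.norm(B, 2)) ** 2
print(lhs, rhs)
```

For a generic `B` the right-hand side is dominated by $\|L^{1/2}\Delta_{mn}\|$, so the bound is only useful when that commutator term can be controlled uniformly in $n$.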

Thank you very much for the response and the time spent thinking about this problem, @Frederik vom Ende! I agree that what is required in general may well be that $\lambda_\ell$ is replaced with $\lambda_\ell^2$ in the condition. — LostStatistician18, Jan 17 '23 at 14:46
The reason I hold out hope is that in each case where an upper bound is obtained we use $|\langle \phi_\ell, B_m \phi_j \rangle| \le \|B_m \phi_j\|$. When $B_m$ and $L$ commute (are simultaneously diagonalizable), the inner product is zero if $j \ne \ell$, which leads to big savings. Could it be that in the $B_m= R^m$ case $|\langle \phi_\ell, B_m \phi_j \rangle|$ is somehow small when $|j-\ell|$ is large, so that we can bound the sum separately for $|j-\ell|$ large and $|j-\ell|$ small? Perhaps rather than $[B_m,L^{1/2}\pi_n]=0$ we can use that $[B_m,L^{1/2}\pi_n]$ is small in the right places. — LostStatistician18, Jan 17 '23 at 14:51
Good point, I put an edit at the end of my answer which "quantifies" how the commutator $[B_m,L^{-1/2}\pi_n]$ connects to whether this approach works or not. — Frederik vom Ende, Jan 18 '23 at 12:21
