I have been trying to work through the Wikipedia article titled "Compact operator on a Hilbert space". I have made it through to the section 'Spectral theorem', subsection 'The idea'. This section seems poorly written, and I would appreciate some clarification; hopefully someone can also update the Wiki article so that it's a bit more helpful.

Firstly, their derivation of $h'(0)=0$ seems to go round in a bit of a circle. What exactly are they trying to do? I would have thought that they would be trying to get to $d/dt \langle T(y+tz) , y+tz \rangle$, because this is known to be zero by the assertion that $y$ is extremal, but they get so close and then seem to go right back to the expression they started with. What's going on here?

Secondly, they define $m = \langle Ty , y \rangle / \langle y , y \rangle$ and claim that this can be rearranged to $Re \langle Ty - my , z \rangle = 0$ for arbitrary $z$. I can get to $\langle Ty - my , y \rangle = 0$ straightforwardly, but what have they done to introduce the arbitrary $z$?

Eddy

1 Answer

Let us focus on finite-dimensional Hilbert spaces $H$. Suppose we have a self-adjoint linear map $T:H\to H$. In order to prove the existence of an orthonormal basis of eigenvectors, one starts off by proving that there is one eigenvalue-eigenvector pair $(\lambda,v)$ (and one can normalize $v$). Then, one shows that $T$ restricts to a linear map on the orthogonal complement $\{v\}^{\perp}\to\{v\}^{\perp}$ and the restriction is still self-adjoint. Then, one can continue inductively, and the process will terminate in a finite number of steps (since $H$ is finite-dimensional), which means we have found an orthonormal basis of eigenvectors, and hence we’re done.
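For completeness, here is a short sketch of why the inductive step works, assuming we already have $T(v)=\lambda v$ with $\lambda$ real. The vectors $w,w_1,w_2$ below are just illustrative names for elements of $\{v\}^{\perp}$: \begin{align} \langle T(w),v\rangle&=\langle w,T(v)\rangle=\langle w,\lambda v\rangle=\lambda\langle w,v\rangle=0,\\ \langle T(w_1),w_2\rangle&=\langle w_1,T(w_2)\rangle. \end{align} The first line says that $T(w)$ is again orthogonal to $v$, so $T$ maps $\{v\}^{\perp}$ into itself (since $\lambda$ is real, it does not matter which slot of the inner product is conjugate-linear). The second line holds for all vectors in $H$, hence in particular for those in $\{v\}^{\perp}$, so the restriction is still self-adjoint.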

So, it all boils down to proving the existence of a single eigenvalue-eigenvector pair. Now, consider the quantity \begin{align} \lambda:=\sup\limits_{\|y\|=1}\langle T(y),y\rangle=\sup\limits_{y\in H\setminus\{0\}}\frac{\langle T(y),y\rangle}{\|y\|^2}. \end{align} Keep in mind that since $T$ is self-adjoint, we’re taking the supremum of real-valued quantities. Next, since $T$ is a linear map on a finite-dimensional space, it is automatically a bounded operator (equivalently, a continuous linear map). Hence, $y\mapsto \langle T(y),y\rangle$ is a continuous function on the compact (by finite-dimensionality!) unit sphere $S$ of $H$. Hence, by the extreme value theorem, this supremum is finite (i.e. $\lambda\in\Bbb{R}$ is well-defined) and in fact this supremum is actually attained by some vector $v\in S$. We now claim that the number $\lambda$ we have defined above is actually an eigenvalue of $T$ and that this vector $v$ is an eigenvector.
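In case the two remarks about the definition of $\lambda$ are not obvious, here is a brief sketch of both (the scalar $c$ is just an illustrative nonzero constant): \begin{align} \overline{\langle T(y),y\rangle}&=\langle y,T(y)\rangle=\langle T(y),y\rangle,\\ \frac{\langle T(cy),cy\rangle}{\|cy\|^2}&=\frac{|c|^2\langle T(y),y\rangle}{|c|^2\|y\|^2}=\frac{\langle T(y),y\rangle}{\|y\|^2}. \end{align} The first line uses conjugate symmetry and then self-adjointness, and shows that $\langle T(y),y\rangle$ equals its own conjugate, so it is real. The second line shows that the quotient is unchanged under rescaling $y$, which is why the supremum over the unit sphere agrees with the supremum over all nonzero vectors.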

One way to prove this is by differential calculus. Consider the function $f:H\setminus\{0\}\to \Bbb{R}$ defined as $f(y)=\frac{\langle T(y),y\rangle}{\|y\|^2}$. We have argued above that $\lambda$ is the maximum value of $f$ and that the unit vector $v$ is a maximum point. By basic differential calculus, it follows that $Df_v=0$ (“the derivative at a maximum point vanishes”). Hence, for all $z\in H$, we must have $Df_v(z)=0$. This is where the Wiki page seems to get tripped up with the computations. By the chain rule, the equation $Df_v(z)=0$ is equivalent to the perhaps more familiar directional-derivative statement $\frac{d}{dt}\bigg|_{t=0}f(v+tz)=0$. I guess people are more comfortable with directional derivatives, which is why Wiki decided to write things in that manner, so here goes with the computation (keep in mind that $\|v\|=1$): for all $z\in H$, \begin{align} 0&=\frac{d}{dt}\bigg|_{t=0}f(v+tz)\\ &=\frac{d}{dt}\bigg|_{t=0}\frac{\langle T(v+tz),v+tz\rangle}{\|v+tz\|^2}\\ &=\frac{\|v+tz\|^2\left[\langle T(z),v+tz\rangle + \langle T(v+tz), z\rangle\right] - \langle T(v+tz),v+tz\rangle \cdot 2\text{Re}(\langle v+tz,z\rangle)}{\|v+tz\|^4}\bigg|_{t=0}\\ &= \langle T(z),v\rangle+\langle T(v),z\rangle-2\lambda\text{Re}\left(\langle v,z\rangle\right)\tag{$*$}\\ &=2\text{Re}\left(\langle T(v),z\rangle\right) -2\lambda\text{Re}\left(\langle v,z\rangle\right)\tag{$T$ self adjoint}\\ &=2\text{Re}\left\langle T(v)-\lambda v,z\right\rangle. \end{align} I have used things like the quotient rule and ‘product rule’ (which is valid since the inner product is real-bilinear; see here for a general product rule). Once again, note that I’ve used $\|v\|=1$ and the definition $\lambda=\langle T(v),v\rangle$ in $(*)$, and in the last line I used that $\lambda$ is real, hence I was able to move it inside the real part. Notice that this equality holds for all $z$, so by applying this to $iz$, we get that the imaginary part also vanishes, since $\text{Re}\langle w,iz\rangle=\pm\text{Im}\langle w,z\rangle$ with the sign depending on which slot is conjugate-linear (of course on a real vector space we don’t need this extra step).
Hence, for all $z\in H$ the inner product vanishes, and thus we must have $T(v)-\lambda v=0$, which proves that (since $v\neq 0$) $v$ is an eigenvector of $T$ with eigenvalue $\lambda$.
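If a numerical sanity check helps, here is a minimal sketch in plain Python. The $2\times 2$ Hermitian matrix `T`, the vectors `y` and `z`, and all function names are my own illustrative choices (not from the Wiki article), and the inner product is assumed linear in the first slot. It compares a finite-difference estimate of $\frac{d}{dt}\big|_{t=0}f(y+tz)$ with the closed form $2\text{Re}\langle T(y)-f(y)y,z\rangle/\|y\|^2$, which is the same computation as above carried out at a general point $y$, and checks that the derivative vanishes at the maximiser.

```python
import math

# Hypothetical 2x2 Hermitian matrix, picked for illustration; its eigenvalues are 1 and 4.
T = [[2.0, 1 - 1j],
     [1 + 1j, 3.0]]

def apply(A, y):
    """Matrix-vector product A y."""
    return [sum(a * c for a, c in zip(row, y)) for row in A]

def inner(x, y):
    """Inner product, linear in the first slot, conjugate-linear in the second."""
    return sum(a * complex(b).conjugate() for a, b in zip(x, y))

def f(y):
    """Rayleigh quotient <T y, y> / ||y||^2 (real because T is Hermitian)."""
    return (inner(apply(T, y), y) / inner(y, y)).real

def directional_derivative(y, z, eps=1e-6):
    """Central-difference estimate of d/dt f(y + t z) at t = 0."""
    plus = [a + eps * b for a, b in zip(y, z)]
    minus = [a - eps * b for a, b in zip(y, z)]
    return (f(plus) - f(minus)) / (2 * eps)

def formula(y, z):
    """Closed form 2 Re <T y - f(y) y, z> / ||y||^2."""
    lam = f(y)
    residual = [a - lam * b for a, b in zip(apply(T, y), y)]
    return 2 * inner(residual, z).real / inner(y, y).real

# Unit eigenvector for the largest eigenvalue 4, i.e. the maximiser v.
v = [(1 - 1j) / math.sqrt(6), 2 / math.sqrt(6)]
# An arbitrary non-extremal point and an arbitrary direction.
y = [1.0, 0.5j]
z = [0.3 - 0.2j, 1.0 + 0.7j]

print("f(v)                   =", f(v))
print("derivative at v        =", directional_derivative(v, z))
print("finite difference at y =", directional_derivative(y, z))
print("closed form at y       =", formula(y, z))
```

At the maximiser `v` the derivative is zero up to rounding for any direction `z` you try, while at the generic point `y` the finite difference and the closed form agree but are nonzero, so the vanishing really is a feature of the extremal point.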


See Loomis and Sternberg’s Advanced Calculus for a slight variant of the proof. Once we define $v$ and $\lambda$ using the extreme-value theorem, they prove (page 258) that $(\lambda,v)$ is an eigenvalue-eigenvector pair slightly more algebraically, using a Cauchy-Schwarz-type trick (the necessary variant of Cauchy-Schwarz is proven on page 249, Theorem 1.1).

peek-a-boo