
Let matrix $A \in \mathbb{R}^{n\times n}$ be positive semidefinite.

  • Is it then true that $$ (A + \lambda I)^{-1} \to \mathbf{0} \quad (\lambda \to \infty) \quad ? $$

  • If so, is the fact that $A$ is positive semidefinite irrelevant here?


My thoughts so far: $$ (A + \lambda I)^{-1} = \Big(\lambda( \frac{1}{\lambda}A + I ) \Big)^{-1} = \frac{1}{\lambda} \Big(\frac{1}{\lambda}A + I \Big)^{-1} $$ I think that $\lim_{\lambda \to \infty} \Big( \frac{1}{\lambda}A + I \Big)^{-1} = I^{-1} = I$, but I don't know if I can just pass the $\lim$ through the inverse $(\cdot)^{-1}$ like that. If this is the case, then $$ \lim_{\lambda \to \infty} (A + \lambda I)^{-1} = \lim_{\lambda \to \infty} (1/\lambda) \lim_{\lambda \to \infty} (A/\lambda + I)^{-1} = 0 \cdot I = \mathbf{0} $$ as I'd like to show.
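As a sanity check (not a proof), here is a small numerical experiment; it is only a sketch, assuming numpy, with a randomly generated positive semidefinite $A$ and an arbitrary grid of $\lambda$ values. The spectral norm of $(A+\lambda I)^{-1}$ does appear to decay like $1/\lambda$:

```python
import numpy as np

rng = np.random.default_rng(0)

# A random positive semidefinite A (deliberately rank-deficient).
B = rng.standard_normal((5, 3))
A = B @ B.T  # 5x5 PSD matrix of rank at most 3

for lam in [1e0, 1e2, 1e4, 1e6]:
    inv = np.linalg.inv(A + lam * np.eye(5))
    # Spectral norm of the inverse; it should decay roughly like 1/lam.
    print(f"lambda = {lam:.0e}:  ||(A + lam*I)^-1|| = {np.linalg.norm(inv, 2):.3e}")
```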


Where this comes from:

I'm trying to justify a claim made in an econometrics lecture. Namely,

$$ \textrm{Var}(\hat{\beta}^{\textrm{ridge}}) = \sigma^2 (X^{T}X + \lambda I)^{-1} X^T X [(X^T X + \lambda I)^{-1}]^T \to \mathbf{0} $$ where $\hat{\beta}^\textrm{ridge}$ is the ridge estimator in a linear model, $X \in \mathbb{R}^{n \times p}$ is the design matrix, and the equality is known. The limit, however, wasn't justified.
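To convince myself numerically (again, only a sketch and not a justification), the following snippet, assuming numpy and a made-up design matrix $X$ and noise variance $\sigma^2$, evaluates the variance formula above for increasing $\lambda$; its norm shrinks toward $\mathbf{0}$:

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 50, 4
X = rng.standard_normal((n, p))  # hypothetical design matrix
sigma2 = 2.0                     # hypothetical noise variance

XtX = X.T @ X
for lam in [1e0, 1e2, 1e4, 1e6]:
    M = np.linalg.inv(XtX + lam * np.eye(p))
    var_ridge = sigma2 * M @ XtX @ M.T  # the variance formula quoted above
    print(f"lambda = {lam:.0e}:  ||Var(beta_ridge)|| = {np.linalg.norm(var_ridge, 2):.3e}")
```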

zxmkn
  • $A$ can be any matrix above. The point is, the inverse of a matrix is a continuous function in a neighbourhood of the identity; since $\frac{1}{\lambda}A + I$ eventually lies in such a neighbourhood, we may pass the limit inside the inverse by continuity, giving the desired result by the continuity of scalar multiplication. – Sarvesh Ravichandran Iyer Jan 24 '19 at 17:27
  • If $|\cdot|$ is a matrix norm, then the Neumann series guarantees that $A+\lambda I$ is invertible with $$(A+\lambda I)^{-1} = \sum_{n=0}^{\infty} \frac{(-1)^n}{\lambda^{n+1}}A^n, $$ which converges uniformly on the region $|\lambda| \geq |A|+\delta$ for any given $\delta > 0$. By the Weierstrass M-test, the limit as $\lambda\to\infty$ can be evaluated term-wise, proving the desired claim. – Sangchul Lee Jan 24 '19 at 17:34 [a numerical check of this series is sketched after these comments]
  • @астонвіллаолофмэллбэрг Great! That completes my line of reasoning. For others looking on, here's why there is a neighborhood of $I$ in $M_n(\mathbb{R})$ in which $(\cdot)^{-1}$ is continuous:

    $(\cdot)^{-1} : GL_n(\mathbb{R}) \to GL_n(\mathbb{R})$ is continuous and $GL_n(\mathbb{R})$ is open in $M_n(\mathbb{R})$ (see: https://math.stackexchange.com/a/810675/369800). [To understand the proof just linked: determinant continuous (see: https://math.stackexchange.com/a/121834/369800) and adjoint continuous (see: https://math.stackexchange.com/a/2031642/369800)]

    – zxmkn Jan 24 '19 at 19:07
  • Recall that the inverse matrix is the adjugate matrix divided by the determinant. Thus a "singularity" of the inversion only happens when the determinant vanishes. – Alexey Jan 26 '19 at 20:45
  • @TeresaLisbon Can we use your reasoning to exchange the limit and the inverse in $\lim_{\lambda\to\infty} (A+ \lambda B^TB)^{-1}$? Here $A$ is positive definite and $B$ is a full-row-rank $m\times n$ matrix. – Mah Feb 21 '21 at 23:01
  • @Mah I would like to think we can do so, but the argument is likely to be more convoluted. I think because $B^TB$ is positive definite, we can lower bound the smallest eigenvalue of $A+\lambda B^TB$ so that it goes to infinity with $\lambda$, and then we are done. – Sarvesh Ravichandran Iyer Feb 22 '21 at 04:09
  • @TeresaLisbon Thank you for your answer. $B^TB$ is only positive semi-definite: since $m<n$, $B^TB$ is not invertible. But I agree with you that we can lower bound the smallest eigenvalue of $A+\lambda B^TB$. I am more interested in moving the limit inside the inverse and then moving the whole thing to the other side of a linear equation: I have $x=\lim_{\lambda\to\infty}(A+\lambda B^TB)^{-1} y$, and I want $\lim_{\lambda\to\infty}(A+\lambda B^TB)x= y$. Do you think this is correct? – Mah Feb 22 '21 at 04:32
  • @Mah Oh, then I think it may happen a lot more often. A sufficient, but I don't think necessary, condition is that the smallest eigenvalue of $(A+\lambda B^TB)$ is uniformly bounded away from zero. That gives invertibility for each matrix, but also in the limit, which means you can move the limit to the LHS. I don't think this is necessary, since there may be situations where the eigenvalues go to zero but those that pertain to $x$ and $y$ (in some basis expansion) behave well. I am not being precise about the second part, but the first part should work. – Sarvesh Ravichandran Iyer Feb 22 '21 at 04:47
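To illustrate Sangchul Lee's Neumann-series comment above, here is a rough numerical sketch (assuming numpy; the matrix $A$, the choice $\lambda = 2\|A\|$, and the truncation order are arbitrary). It compares the truncated series $\sum_{n=0}^{N} \frac{(-1)^n}{\lambda^{n+1}} A^n$ with the exact inverse:

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((4, 4))      # no structure assumed on A
lam = 2 * np.linalg.norm(A, 2)       # pick lambda > ||A|| so the series converges

exact = np.linalg.inv(A + lam * np.eye(4))

partial = np.zeros((4, 4))
term = np.eye(4) / lam               # n = 0 term: I / lambda
for _ in range(20):                  # 20 terms suffice since ||A|| / lam = 1/2
    partial += term
    term = -term @ A / lam           # multiply by -A/lambda to get the next term

print(np.max(np.abs(partial - exact)))  # should be near machine precision
```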

2 Answers


The eigenvalues of $A+\lambda I$ are of the form $\lambda+\mu$, where $\mu$ is an eigenvalue of $A$ (necessarily real). Then, for $\lambda$ sufficiently large, the eigenvalues of $A+\lambda I$ are all $>1$.

Note that a matrix $S$ that diagonalizes $A$ also diagonalizes $A+\lambda I$: if $A=SDS^{-1}$ with $D$ diagonal, then $A+\lambda I=S(D+\lambda I)S^{-1}$.

Then, for $\lambda$ sufficiently large, $(A+\lambda I)^{-1}$ is diagonalizable with eigenvalues $1/(\lambda+\mu) \in (0,1)$, and therefore $$ \lim_{\lambda\to\infty}(A+\lambda I)^{-1}= S\Bigl(\,\lim_{\lambda\to\infty}(D+\lambda I)^{-1}\Bigr)S^{-1}=\mathbf{0}. $$ It is not necessary that $A$ be positive semidefinite: any symmetric matrix will do.
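A quick numerical check of this eigenvalue picture (only a sketch, assuming numpy; the symmetric test matrix and the value of $\lambda$ are arbitrary): the eigenvalues of $(A+\lambda I)^{-1}$ should match $1/(\lambda+\mu)$ and shrink to $0$ as $\lambda$ grows.

```python
import numpy as np

rng = np.random.default_rng(3)
G = rng.standard_normal((4, 4))
A = (G + G.T) / 2                    # symmetric, but not necessarily PSD

lam = 100.0
mu = np.linalg.eigvalsh(A)           # real eigenvalues of A, ascending order
inv_eigs = np.linalg.eigvalsh(np.linalg.inv(A + lam * np.eye(4)))

# The eigenvalues of (A + lam*I)^{-1} should equal 1/(mu + lam),
# and all of them go to 0 as lam grows.
print(np.sort(1.0 / (mu + lam)))
print(inv_eigs)                      # eigvalsh also returns ascending order
```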

egreg

The answer I liked the best was left in the comments by астон вілла олоф мэллбэрг, since it shows that $A$ does not need any special structure. Here I'm writing his comment up as an answer and including a bit more detail.


We have $$ (A + \lambda I)^{-1} = \Big(\lambda( \frac{1}{\lambda}A + I ) \Big)^{-1} = \frac{1}{\lambda} \Big(\frac{1}{\lambda}A + I \Big)^{-1}, $$ and we claim that $\Big(\frac{1}{\lambda}A + I \Big)^{-1} \to I^{-1} = I \quad (\lambda \to \infty)$. Therefore, $$ (A + \lambda I)^{-1} = \frac{1}{\lambda} \Big(\frac{1}{\lambda}A + I \Big)^{-1} \to 0 \cdot I = \mathbf{0} \quad (\lambda \to \infty), $$ which was the desired result.

We complete the proof by showing the claim. Since $GL_n(\mathbb{R})$ is open in $M_n(\mathbb{R})$, there is some $\epsilon > 0$ such that the open ball $B(I, \epsilon)$ is contained in $GL_n(\mathbb{R})$. Hence, for sufficiently large $\lambda$, we have $(A/\lambda + I) \in B(I, \epsilon) \subseteq GL_n(\mathbb{R})$. Since $(\cdot)^{-1} : GL_n(\mathbb{R}) \to GL_n(\mathbb{R})$ is also continuous, we have $$ \lim_{\lambda \to \infty}\Big(\frac{1}{\lambda}A + I \Big)^{-1} = \Big(\lim_{\lambda \to \infty} \frac{1}{\lambda}A + I \Big)^{-1}= I^{-1} = I, $$ which completes the proof.
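As a small numerical illustration of the claim (a sketch, assuming numpy; the test matrix below is deliberately non-symmetric and non-definite to emphasize that no structure on $A$ is needed), $\big\|(\frac{1}{\lambda}A+I)^{-1}-I\big\|$ indeed goes to $0$:

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((5, 5))      # arbitrary A: not symmetric, not definite

for lam in [1e1, 1e3, 1e5]:
    M = np.linalg.inv(A / lam + np.eye(5))
    dist = np.linalg.norm(M - np.eye(5), 2)
    print(f"lambda = {lam:.0e}:  ||(A/lam + I)^-1 - I|| = {dist:.3e}")
```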


To understand the linked proof of the continuity of $(\cdot)^{-1}$, see here for justification that the determinant operator is continuous and here for justification that the adjoint operator is continuous.

zxmkn