
Let $A$ be an $n\times n$ matrix with coefficients in $ [0,1] $. Let $ B $ be the matrix all of whose entries equal $ \frac{1}{2} $: $$B = \begin{pmatrix} \frac{1}{2} & \dots & \frac{1}{2} \\ \vdots & \ddots & \vdots \\ \frac{1}{2} & \dots & \frac{1}{2} \end{pmatrix}\,.$$ For all $ t \in (0,1] $ let $ A(t) = tB + (1-t)A $. The matrix $ A(t) $ is positive, hence primitive, for any fixed $ t $. Hence, by the Perron-Frobenius theorem, $ \rho(t) $ (the spectral radius of the matrix $ A(t) $) is a simple eigenvalue, and a corresponding eigenvector can be taken entrywise positive. Call it $ x(t) $, scaled so that $ \|x(t)\|_1 = \rho(t) $ (i.e. the sum of its components equals the spectral radius of the matrix). In this way I obtain the following properties: $$A(t)x(t) = \rho(t)x(t), \qquad \|x(t)\|_1 = \rho(t), \qquad x(t) > 0.$$ My question is: does $ \lim_{t \to 0^+}x(t) $ exist?
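
(For what it's worth, here is a minimal numerical sketch of this construction, assuming NumPy; the helper name `perron_vector` and the use of `numpy.linalg.eig` are just illustrative choices, not part of the question itself.)

```python
import numpy as np

def perron_vector(A, t):
    """Perron data of A(t) = t*B + (1-t)*A: returns (rho(t), x(t)) with x(t) > 0
    scaled so that ||x(t)||_1 = rho(t), as in the question."""
    n = A.shape[0]
    B = np.full((n, n), 0.5)            # the matrix with every entry 1/2
    At = t * B + (1 - t) * A            # A(t) is positive, hence primitive, for t > 0
    w, V = np.linalg.eig(At)
    k = np.argmax(w.real)               # for a positive matrix the Perron root has the largest real part
    rho = w[k].real
    x = np.abs(V[:, k].real)            # the Perron eigenvector can be taken entrywise positive
    return rho, rho * x / x.sum()       # rescale so that the components sum to rho(t)

A = np.random.rand(4, 4)                # coefficients in [0, 1]
for m in (10, 100, 1000, 10000):        # watch x(1/m) as m grows
    print(m, perron_vector(A, 1.0 / m))
```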

I have read that the spectral radius is continuous with respect to any matrix norm, so I would have $ \lim_{t \to 0^+}A(t) = A \Rightarrow \lim_{t \to 0^+}\rho(t) = \rho(0) $. Is that correct?

Anyway, passing to sequences with $ t = \frac{1}{n} $, I did not succeed in proving that the resulting sequence $ x_n = x(\frac{1}{n}) $ is a Cauchy sequence (this would imply that $ x_n $ converges, since the $ x_n $ lie in a compact subset of $ \mathbb{R}^n $, which is sequentially compact).

I think a better way to prove the existence of the limit above might be to prove monotonicity of the components of $ x_n $, using the monotonicity of the coefficients $ a_{ij}(t) $ of $ A(t) $. It is only an idea, and I do not know whether it works.

Thank you in advance.

NoName
U.G.
  • You can prove that any convergent subsequence of $x(t)$ will converge to an eigenvector of $A$ corresponding to a maximal eigenvalue. To prove that the whole family converges, you may need further assumptions on $A$ (but I don't have an explicit example where the sequence does not converge at the moment). – Surb May 13 '20 at 05:42
  • The special case where $A$ is stochastic was asked on MO four years ago without any definite answers. – user1551 May 15 '20 at 15:19

2 Answers


For $4$ days I have been stuck essentially at the place where Surb arrived. I detail the basic reasoning below.

Note that, since $t\mapsto A(t)$ is continuous, the spectral radius $\rho(t)$ of $A(t)$ is continuous on $[0,1]$.

When $t>0$: $A(t)$ is an analytic function of $t$ and a positive matrix; moreover $\rho(t)>0$ is a simple eigenvalue such that $A(t)x(t)=\rho(t)x(t)$, where $x(t)$ is a positive vector satisfying $\|x(t)\|=1$ for some norm. Then, according to

https://www.math.upenn.edu/~kazdan/504/eigenv.pdf

$\rho(t)$ and $x(t)$ are analytic functions on $(0,1]$.

$\textbf{Proposition 1.}$ If $A$ is symmetric, or more generally, $A$ is normal and $(A-A^T)B=B(A-A^T)$, then $x(t)$ converges.

$\textbf{Proof.}$ Since $A(t)$ is normal for every $t\in[0,1]$, $\rho(t)$ and $x(t)$ are analytic on $[0,1]$; cf. Theorem (A) in

https://arxiv.org/pdf/1111.4475v2.pdf

$\square$

$\textbf{Proposition 2.}$ Every cluster point of $x(t)$ (as $t\to 0^+$) is a non-negative vector in $\ker(A-\rho(A)I)$.

$\textbf{Proof}.$ Consider a sequence $(t_p)$ tending to $0$ such that $x(t_p)\rightarrow x\;(\geq 0)$; passing to the limit in $A(t_p)x(t_p)=\rho(t_p)x(t_p)$ and using the continuity of $\rho$, we get $Ax=\rho(A)x$. $\square$

$\textbf{Remark.}$ It "remains" to show that all these subsequential limits coincide.

$\textbf{Corollary.}$ If $\dim(\ker(A-\rho(A)I))=1$, then $x(t)$ converges.

EDIT.

$\textbf {Proposition 3.}$ $\lim_{t\to 0^+} x(t)$ (when it exists) is not a continuous function of $A$.

$\textbf{Proof.}$ Let $A=\begin{pmatrix}1&a&0\\0&4/5&1/10\\0&2/5&4/5\end{pmatrix}$, where $\rho(A)=1$. We use $\|\cdot\|_{\infty}$.

If $a=0$, then $\dim(\ker(A-I))=2$ and $x(t)\rightarrow [2/3,1/2,1]^T$.

If $a\not= 0$, then $\dim(\ker(A-I))=1$ and $x(t)\rightarrow [1,0,0]^T$. $\square$
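
(A quick numerical sanity check of this discontinuity, assuming NumPy; the script is only an illustration, not part of the proof.)

```python
import numpy as np

def limit_candidate(a, t):
    """Sup-norm-normalized Perron vector of A(t) = t*B + (1-t)*A
    for the 3x3 matrix of Proposition 3."""
    A = np.array([[1.0,   a, 0.0],
                  [0.0, 0.8, 0.1],
                  [0.0, 0.4, 0.8]])
    B = np.full((3, 3), 0.5)
    w, V = np.linalg.eig(t * B + (1 - t) * A)
    x = np.abs(V[:, np.argmax(w.real)].real)
    return x / x.max()                   # ||.||_inf normalization

for a in (0.0, 0.3):
    print(a, limit_candidate(a, 1e-6))
# expected: a = 0   ->  approximately [2/3, 1/2, 1]
#           a = 0.3 ->  approximately [1, 0, 0]
```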

EDIT 2. Regarding sasquires' post and the following reference, which user1551 gave (indirectly):

[1] Huppert, Willems, "A note on perturbations of stochastic matrices", Journal of Algebra (2000), freely accessible:

https://reader.elsevier.com/reader/sd/pii/S0021869300985442?token=A1062A5C61A49B95037719BF2E60DBE3462EA3D00AD1980BC8CFC9C479C5C1250AC05536F459134264505903FCFFA812

There are two questions. (i) Does $x(t)$ converge? (ii) If yes, towards what?

(i) For almost stochastic matrices, the authors of [1] claim yes, in Theorem 3-2 (a). Yet I am not convinced by the end of the proof of (a); could someone read this proof and say what they think?

(ii) sasquires claims, in his post, that the following is true (I think so too):

$\textbf{Conjecture.}$ Let $A\in M_n$ be non-negative, where $\rho(A)>0$ is semisimple and (do we need that?) is the unique eigenvalue of modulus $\rho(A)$. IF $x(t)$ CONVERGES, then the limit can be calculated explicitly as follows.

Let $(v_i)_{i\leq k},(u_i)_{i\leq k}$ be bases of $\ker(A-\rho(A) I)$ and $\ker(A^T-\rho(A)I)$ that satisfy $u_i^Tv_i=1$ and, when $i\not= j$, $u_i^Tv_j=0$.

(Of course, one must prove that such bases exist.)

Let $R$ be the $k\times k$ matrix with entries $R_{i,j}=u_i^T(B-A)v_j$, and let $c$ be the eigenvector of $R$ associated with $\max(\operatorname{spectrum}(R))$; then, up to a scalar factor, $\lim x(t)=\sum_i c_iv_i$.
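
(For the record, a rough numerical sketch of this recipe, assuming NumPy/SciPy; the biorthogonalization step and the helper name `conjectured_limit` are my own choices, and the code silently assumes that $V^TU$ is invertible, i.e. that the required bases exist.)

```python
import numpy as np
from scipy.linalg import null_space

def conjectured_limit(A, B):
    """Conjectured lim x(t): biorthogonal bases (v_i), (u_i) of the right/left
    eigenspaces of rho(A), R_ij = u_i^T (B - A) v_j, then sum_i c_i v_i where c is
    the eigenvector of R for max(spectrum(R)) -- all up to a scalar factor."""
    n = A.shape[0]
    rho = max(np.linalg.eigvals(A).real)            # rho(A) is an eigenvalue of A
    V = null_space(A - rho * np.eye(n))             # columns: a basis (v_i) of ker(A - rho I)
    U = null_space(A.T - rho * np.eye(n))           # columns: a basis (u_i) of ker(A^T - rho I)
    U = U @ np.linalg.inv(V.T @ U)                  # rescale so that u_i^T v_j = delta_ij
    R = U.T @ (B - A) @ V                           # the k x k matrix of the conjecture
    w, C = np.linalg.eig(R)
    c = C[:, np.argmax(w.real)].real
    x = V @ c
    if x.sum() < 0:                                 # fix the sign: the limit should be >= 0
        x = -x
    return x / np.abs(x).max()
```

For the matrix of Proposition 3 with $a=0$ (and $B$ the all-$1/2$ matrix), this should return approximately $[2/3,1/2,1]^T$, consistent with the value quoted there.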

sasquires, it's your idea; then, it's your job to write a flawless proof.

$\textbf{Remark.}$ If $x(t)$ converges and the above conjecture is true, then

$\textbf{Lemma.}$ Let $a,b>0$ and let $y(t)$ be the eigenvector associated with the pair $(aA,bB)$ (i.e. with $A$ replaced by $aA$ and $B$ by $bB$). Then $\lim y(t)=\lim x(t)$.

$\textbf{Proof.}$ We may assume that $b=1$. The new matrix $R$ becomes $S_{i,j}=u_i^T(B-aA)v_j=u_i^TBv_j-a\,u_i^TAv_j$; since $u_i^TAv_j=\rho(A)\,\delta_{ij}$, this gives $S=[u_i^TBv_j]-a\rho(A)I$. Subtracting a multiple of the identity does not change the eigenvectors, so we may replace $R$ with the matrix $[u_i^TBv_j]$. $\square$

In particular, we may replace $B$ with the matrix $[1/n]$ (all entries equal to $1/n$), which is stochastic, and $A$ with a matrix satisfying $\rho(A)=1$.

  • @U.G. thanks for the bounty. –  May 17 '20 at 09:26
  • I just saw this and laughed out loud at "it's your job." You and I are volunteers. My wife and I both work full time, and, during coronavirus, we have to take care of our two-year-old 24/7 on top of that, which (if you don't have a two-year-old, or at least one like ours) is the equivalent of another two full time jobs. Stackexchange is not my "job." If you were lost in the forest and someone gave you directions for how to get back to civilization, would you then tell them "No, it's your job to hold my hand and take me all the way there!"? – sasquires May 18 '20 at 22:31
  • I gave some ideas, but it's U.G.'s job to solve the problem, because presumably he/she has something invested in this problem. Nevertheless, I'm also interested in the answer, and I have some thoughts on a formal proof, so I will try it sometime in the next few days when I can find the time. – sasquires May 18 '20 at 22:32
  • @sasquires, to fight against the corona, I asked Cedric Villani to lend me his spider; about small children, I've already contributed. There are $2$ cases: (i) the method to find the limit (when it exists) is known and there is nothing interesting to do (note that the authors of [1] write in 3.5 that they don't know any method to find $x(0)$); (ii) otherwise, in my opinion, it's in your interest to write a proof, so the result will be yours... –  May 19 '20 at 12:47
  • About the existence theorem in [1], I would be interested in a reader's opinion about the end of the proof of theorem 3-2 (a). –  May 19 '20 at 12:51
  • I just had a chance to look at this reference while waiting for some code to compile. I just wanted to tell you that the end of the proof of Theorem 3.2(a) does indeed sound fishy to me---and I am the one who has been giving hand-wavy arguments in this discussion! But I haven't taken time to think about it carefully, so maybe I am just missing something. – sasquires May 19 '20 at 23:17
  • I finally went back and re-read the proof of Theorem 3-2(a) in [1]. I think that I buy it this time. Comments: 1. The end of the proof is stated very badly, but the idea is that if there were a singularity in $v_i(t)$, then there would be some nearby $\tilde{t}$ such that $|v_i(\tilde{t})| > 1$, but this is a contradiction. 2. When I first read it, I thought that there was some circular logic going on where $v$ was assumed to have certain properties that were used later in the proof, but upon re-reading I don't think that's the case. – sasquires Jun 10 '20 at 04:14

Edit: This answer has been mostly rewritten, since the previous versions contain a lot of text, most of which is no longer valuable (given that this answer is more comprehensive).

I read a paper that reminded me of this problem and decided to revisit it recently.  This problem is not nearly as difficult as I thought it was at the time of my last answer.  (I had previously considered an argument along the lines of the following, but I rejected it before thinking it through properly.)

This answer should be rigorous (and in danger of being pedantic), so I hope it will satisfy the OP's request.  But please let me know if you find any errors in the reasoning.


Introduction: Instead of considering the problem as stated by the OP, I will consider a similar and more general problem.  Let $$ X(t) = A + t B $$ where $A$ is any square nonnegative matrix and $B$ is any positive matrix of the same dimensions.

The OP's original question can be restored by substituting $t \to \frac{t}{1-t}$, $B = \tfrac{1}{2} 1 1^T$, and $A(t) = (1 - t)\,X\!\left(\tfrac{t}{1-t}\right)$.  (Obviously this would pose a problem if we cared what happens at $t=1$, but the question is only concerned with what happens in a neighborhood of $t=0$.)
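
(To spell out the algebra behind this substitution, which is immediate but worth recording; here $s$ is just a renaming of the perturbation parameter.)  With $s = \frac{t}{1-t}$, $$ X(s) \;=\; A + \frac{t}{1-t}\,B \;=\; \frac{1}{1-t}\bigl[(1-t)A + tB\bigr] \;=\; \frac{1}{1-t}\,A(t), $$ so $A(t)$ is a positive multiple of $X\!\left(\frac{t}{1-t}\right)$; the two matrices have the same eigenvectors, their spectral radii differ by the factor $\frac{1}{1-t}\to 1$, and $s \to 0^+$ exactly when $t \to 0^+$.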

Definitions: By the Frobenius-Perron theorem, the spectral radius of $A$, $\rho(A)$, is associated with one or more real nonnegative eigenvectors.  Let $R$ denote the right eigenspace associated to $\rho(A)$ and $L$ denote the associated left eigenspace.

Proposition 1: $\rho(X(t))$ is continuous as a function of $t$ on $[0, 1]$.

Proof: This is an immediate consequence of the fact that any eigenvalue of a continuously varying matrix can be expressed as a continuous function.  See this answer, for example.

Definitions: Let the vectors $v_i$ be a basis for $R$ and $u_i$ be a basis for $L$.  By the Frobenius-Perron theorem, all $u_i$ and $v_i$ must be nonnegative.  (Note that $\dim R = \dim L$ since every matrix is similar to its transpose.)

Proposition 2: The $u_i$ can be chosen such that $u_i^T v_j = 0$ when $i \ne j$ and $u_i^T v_i = 1$.  We will assume that this has been done below.

Proof: This is a well-known theorem.  See this answer, for example.

Definitions: Let $V$ be a matrix containing $v_i$ as columns and $U$ be a matrix containing $u_i$ as columns.  Let $P = V U^T$ and $Q = I - P$.  Note that $P$ and $Q$ have the same dimensions as $A$ and $B$.

Proposition 3: $P$ is a projection operator onto $R$ and $P r = r$ for any $r \in R$.  

Proof: This is also well-known, but here is a quick proof. $$ P^2 = (V U^T) (V U^T) = V (U^T V) U^T = V I U^T = V U^T = P$$ The main step $U^T V = I$ is a consequence of Proposition 2 and the normalization chosen there.  For any $r \in R$, $r = \sum_j c_j v_j$, so $(U^T r)_i = \sum_j c_j u_i^T v_j = c_i$ and $P r = \sum_i c_i v_i = r$.
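
(A small numerical illustration of these definitions, assuming NumPy/SciPy; the example matrix is the $a=0$ matrix from loupblanc's Proposition 3, and the final lines are only sanity checks of Propositions 2 and 3.)

```python
import numpy as np
from scipy.linalg import null_space

def projection_P(A):
    """P = V U^T, where the columns of V (resp. U) span the right (resp. left)
    eigenspace of rho(A) and are rescaled so that U^T V = I (Proposition 2)."""
    n = A.shape[0]
    rho = max(np.linalg.eigvals(A).real)
    V = null_space(A - rho * np.eye(n))
    U = null_space(A.T - rho * np.eye(n))
    U = U @ np.linalg.inv(V.T @ U)       # biorthogonalize: u_i^T v_j = delta_ij
    return V @ U.T

A = np.array([[1.0, 0.0, 0.0],
              [0.0, 0.8, 0.1],
              [0.0, 0.4, 0.8]])          # reducible, rho(A) = 1 with a 2-dimensional eigenspace
P = projection_P(A)
Q = np.eye(3) - P

r = null_space(A - np.eye(3))[:, 0]      # some r in R
print(np.allclose(P @ P, P))             # P is a projection           (Proposition 3)
print(np.allclose(P @ r, r))             # P acts as the identity on R (Proposition 3)
print(np.allclose(P @ A, A @ P))         # P A = A P = rho(A) P, used below in Proposition 6
```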

Definitions: Since $X(t)$ is a positive matrix for $t>0$, for all $t>0$ there is a unique positive eigenvector associated with $\rho(X(t))$, once we fix a normalization (say $\|x(t)\|_1 = 1$).  Denote it $x(t)$.

Note: All limits considered below are from the right (i.e., replace $t \to 0$ with $t \to 0^+$).

Proposition 4: $\lim_{t \to 0} Q x(t) = 0$.

Proof: This is the same as @loupblanc's Proposition 2 on this page.  The proof is as follows.  First, note that any limit point of $x(t)$ must be in $R$.  Consider any sequence $t_i \to 0$ such that $x(t_i)$ converges to some vector $x_*$.  Taking the limit in the eigenvalue equation (using Proposition 1), we find that $X(0) x_* = \rho(X(0)) x_*$, or $A x_* = \rho(A) x_*$, so $x_* \in R$.  This implies $P x_* = x_*$ by Proposition 3, so $Q x_* = 0$.  Finally, since $x(t)$ stays in a compact set (it is normalized) and every limit point $x_*$ satisfies $Q x_* = 0$, it follows that $Q x(t) \to 0$.

Definition: Let $Y = P B$.

Proposition 5: $Y$ is either a positive matrix or it has some zero rows and contains a positive submatrix.

Proof: $P$ is nonnegative (being the sum of products of nonnegative elements) and $B$ is positive.  $(PB)_{ij} = \sum_k P_{ik} B_{kj} > 0$ if there is any $k$ where $P_{ik} > 0$, so the only way for an element to be zero is if $P$ has an entire row that is zero.  Since the rank of $P$ is equal to $\dim R \ge 1$, then not all the rows of $P$ can be zero.  Excluding the indices where the rows of $P$ are zero, then the remaining submatrix must be positive.

Comment: Given Proposition 5, we can without loss of generality assume that $Y$ is a positive matrix.  If not, we can write $Y = \begin{pmatrix} Y_1 & Y_2 \\ 0 & 0 \end{pmatrix}$ where $Y_1$ and $Y_2$ are positive submatrices.  We can restrict the entire following discussion to the basis elements with positive entries, effectively replacing $Y$ with $Y_1$. Note that, in this case, the convergence of the bottom entries of $x(t)$ (on the rows where $P$ is zero) to zero is guaranteed by the fact that $\lim_{t \to 0} Qx(t) = 0$ and $Q = I - P$.

Definition: Let $w^T$ and $y$ be the left and right Frobenius-Perron eigenvectors of $Y$, respectively.  Note that since $Y$ is positive, both of these are positive.

Proposition 6. $\lim_{t \to 0} \frac{\rho(X(t)) - \rho(A)}{t} = \rho(Y)$.

Proof: $$ \rho(X(t))\, P x(t) = PX(t) x(t) = P \left( A + tB \right) x(t) = \rho(A)\, P x(t) + t\, Y x(t), $$ where we used $PA = V U^T A = \rho(A) V U^T = \rho(A) P$.  Rearranging gives $$ Y x(t) = \frac{\rho(X(t)) - \rho(A)}{t} P x(t). $$ Multiplying through by $w^T$ (so that $w^T Y = \rho(Y) w^T$) and dividing by $w^T P x(t)$ gives $$ \frac{\rho(X(t)) - \rho(A)}{t} = \rho(Y) \frac{w^T x(t)}{w^T P x(t)} = \rho(Y) \frac{w^T (P + Q) x(t)}{w^T P x(t)} = \rho(Y) \left( 1 + \frac{w^T Q x(t)}{w^T P x(t)} \right). $$

As $t \to 0$, $Qx(t) \to 0$.  Moreover, $w^T P x(t)$ cannot go to $0$: $P x(t) = x(t) - Qx(t)$ is nonnegative ($P$ is nonnegative, as noted in the proof of Proposition 5) and its norm stays bounded away from $0$ for small $t$ (since $\|x(t)\|$ is fixed and $Qx(t) \to 0$), while $w$ is positive.  Taking the limit on both sides proves the result.

Proposition 7: If $M(t)$ is a matrix varying with parameter $t$ that has a limit $L$ at $t=0$, and $m(t)$ is a vector whose norm is bounded for $t > 0$, then $\lim_{t \to 0} \bigl( M(t) m(t) - L m(t) \bigr) = 0$; in particular, if one of $\lim_{t \to 0} M(t) m(t)$ and $\lim_{t \to 0} L m(t)$ exists, then so does the other, and they are equal (i.e., the limit can be substituted for $M$).

Proof: There is probably an easier proof, but this is easy enough to prove directly.  Using any induced matrix norm, $$ 0 \le \| M(t) m(t) - L m(t) \| \le \| M(t) - L \| \| m(t) \| \to 0 $$ as $t \to 0$, where the last step uses the fact that $m(t)$ is bounded.  The result follows from the squeeze theorem.

Main theorem: $\lim_{t \to 0} x(t)$ exists and is equal to $y$.

Proof: From the proof of Proposition 6, we have $$ \left[ Y - \left( \frac{\rho(X(t))-\rho(A)}{t} \right) P \right] x(t) = 0. $$ Taking the limit and applying Propositions 6 and 7, we get $$ \lim_{t \to 0} \left[ \left( Y - \rho(Y) P \right) x(t) \right] = 0. $$ Since $\lim_{t \to 0} \rho(Y)\, Q x(t) = 0$ (Proposition 4), we can subtract it from both sides and use $P+Q=I$ to get rid of $P$, i.e., $$ \lim_{t \to 0} \left[ \left( Y - \rho(Y) I \right) x(t) \right] = 0. $$

This means that any limit point of $x(t)$ must lie in the eigenspace of $Y$ associated with $\rho(Y)$.  Since $Y$ is a positive matrix, this eigenspace is one-dimensional and spanned by $y$.  Every limit point of $x(t)$ is therefore a normalized nonnegative vector in a one-dimensional space containing the normalized positive vector $y$, hence equals $y$; since $x(t)$ ranges in a compact set, it follows that $\lim_{t \to 0} x(t) = y$.
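
(Finally, a numerical sanity check of Proposition 6 and the main theorem, assuming NumPy/SciPy; the example is chosen so that $Y$ is already positive, so the reduction of Proposition 5 is not needed, and the sup norm is used as the normalization, which is my own choice for the check.)

```python
import numpy as np
from scipy.linalg import null_space

def perron(M):
    """Spectral radius of M and a nonnegative, sup-norm-normalized Perron vector."""
    w, V = np.linalg.eig(M)
    k = np.argmax(w.real)
    x = np.abs(V[:, k].real)
    return w[k].real, x / x.max()

A = np.array([[1.0, 0.0, 0.0],
              [0.0, 0.8, 0.1],
              [0.0, 0.4, 0.8]])           # nonnegative, rho(A) = 1, dim R = 2
B = np.full((3, 3), 0.5)                  # any positive matrix works

n = A.shape[0]
rhoA = max(np.linalg.eigvals(A).real)
V = null_space(A - rhoA * np.eye(n))
U = null_space(A.T - rhoA * np.eye(n))
U = U @ np.linalg.inv(V.T @ U)            # U^T V = I        (Proposition 2)
P = V @ U.T                               # projection onto R (Proposition 3)
Y = P @ B                                 # here Y is positive (Proposition 5)
rhoY, y = perron(Y)

for t in (1e-2, 1e-4, 1e-6):
    rhoX, x = perron(A + t * B)           # X(t) = A + t B
    print(t, (rhoX - rhoA) / t, rhoY, np.max(np.abs(x - y)))
# expected: the second column tends to the third, rho(Y)  (Proposition 6),
#           and the last column tends to 0                 (main theorem: x(t) -> y)
```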

sasquires
  • I am interested in those cases when it is impossible to apply the Frobenius-Perron Theorem. I have done some numerical tests, and it seems that the sequence $ x(t) $ always converges (in particular, in the irreducible case it converges to the Frobenius-Perron eigenvector of $ A $), so I do not know how to construct a counterexample with more 'pathological' matrices (reducible, for instance). Thank you for your ideas. – U.G. May 14 '20 at 21:15
  • @U.G. I think that the overwhelming majority of reducible matrices are also covered by the argument above. For a reducible matrix, the eigenvalues are just the union of the eigenvalues of each of the irreducible diagonal blocks, which individually obey the Frobenius-Perron theorem. Unless two of these blocks have FP eigenvalues that are equal, the argument applies in its current form. Finally, when the eigenspace of the maximal eigenvalue of $A(0)$ has dimension two or greater, I strongly suspect we can use perturbation theory to figure out the limit of $x(t)$; I will try it when I get a chance. – sasquires May 14 '20 at 22:51
  • @U.G. Ok, I delivered the solution for the case of reducible matrices as well. It sounds like you already have some numerical testcases, so I would encourage you to try it numerically. – sasquires May 15 '20 at 06:01
  • @sasquires, I don't understand your last edit. Your calculation is well known, but it is valid only when the considered eigenvalue is simple, or when the matrix is normal and the eigenvalue is multiple (which is not the case here in general). For a general $A$, we don't even know whether the eigenvector is continuous; unfortunately, we are in the latter case. Despite this, you consider $y(0)$ and $y'(0)$! Incomprehensible! That said, the limit you suggest may (perhaps?) be the one we are looking for. Test it on the matrix $A$ of my post with $a\not= 0$. –  May 15 '20 at 11:48
  • @loupblanc "...when the matrix is normal and the eigenvalue is multiple..." That is precisely the case that I treated in the edit. I explicitly assumed this, if you read it. (I think that it might be that all that is necessary is that the relevant eigenspace is of full rank, though; need to think about it.) Since defective matrices are of Lebesgue measure zero in the set of all matrices, then this result handles "almost all" matrices. Your counterexample was expected given the explicit assumption that I made, but handling "almost all" matrices is quite valuable in any applied context. – sasquires May 15 '20 at 15:58
  • @loupblanc In my last comment, I forgot to respond to your objection to using $y'(0)$. You are correct that this is something that needs to be proved, and I have only provided the outline of a full proof (as I stated). But the proof of this point is simple. There is an intermediate step that I did not write explicitly above, where we have taken a derivative, but not set $t=0$ yet. This gives a differential equation for $y'(t)$. This has a solution in the neighborhood of zero in precisely the case that I handled (the eigenspace has full rank), in which case referring to $y'(0)$ is appropriate. – sasquires May 15 '20 at 21:55
  • @U.G. I just had a chance to try my solution on the matrix in loupblanc's Proposition 3. It works in both cases ($a=0$, where the degenerate solution is needed, and of course $a \ne 0$, where it is not). – sasquires May 16 '20 at 03:09
  • @loupblanc My apologies for using the word "counterexample" in a previous comment. I had misread that comment as suggesting that my method would not work on this matrix, and I was asserting that it would as long as it obeys the assumption that I made, which it does. I should have read your comment more carefully. – sasquires May 16 '20 at 03:11
  • Future readers, please ignore a few of my comments above. In particular, even if the differential equation I referred to has a solution, that does not mean that the solution is $x(t)$ (as defined by the OP). (There are also some minor mistakes, such as misreading "normal" as "nondefective," although I don't think that normality is necessary.) – sasquires Jun 10 '20 at 04:19
  • I was just skimming Numerical Methods for Large Eigenvalue Problems by Yousef Saad. Proposition 3.5 is enticing with regard to our problem. It asserts that the sine of the angle between the invariant subspace of an eigenvalue and the invariant subspace of a perturbed eigenvalue in a perturbed matrix is $O(t)$. However, "invariant subspace" is defined as $\textrm{Null}(A - \lambda I)^{l_i}$, where $l_i$ is the size of the Jordan block, so for example $\begin{pmatrix}0&1\\0&0\end{pmatrix}$ has everything as an invariant subspace. But this may still be useful with an extra condition on $A$. – sasquires Jun 12 '20 at 02:46
  • @sasquires , I will read in detail your comment and edit. –  Jun 12 '20 at 21:28
  • Okay, @loupblanc, I think this is a rigorous proof of a more general result. Let me know if you can find any errors. – sasquires Jul 22 '20 at 04:04
  • Okay, @U.G., I think this solves your problem. Let me know if you can find any errors. – sasquires Jul 22 '20 at 04:04
  • Thank you very much for your contribution. I will read your answer as soon as possible (unfortunately, these days I have been very busy and cannot yet follow the developments you kindly pointed out) and give you feedback. Thanks. – U.G. Jul 22 '20 at 10:59
  • I just noticed that loupblanc is now @user91684. This solves the problem, and I was hoping you could check to see if you can find any errors above. – sasquires Jul 23 '20 at 00:12
  • I read your proof: it seems correct to me, and I also think it contains a lot of good ideas; so, thank you again. I have no objections, only a question. Suppose that $Y_1$ is a $k \times k$ matrix, with $k < n$. At the end of the day, you will find only $k$ components for any limit point of $x(t)$, because of the reduction argument you employed after Prop. 5. What about the other components? Can we say anything about them? If not, we could not guarantee the convergence of $x(t)$ as $t$ approaches zero... am I right? – U.G. Jul 26 '20 at 11:06
  • @U.G. Thanks! I think that your concern is addressed by the fact that $\lim_{t \to 0} Qx(t) = 0$. Suppose that, using parallel notation to what is above, $P$ has the form $\begin{pmatrix} P_1 & P_2 \\ 0 & 0 \end{pmatrix}$. Also write $x(t) = \begin{pmatrix} x_\textrm{top}(t) \\ x_\textrm{bottom}(t) \end{pmatrix}$. Since $P+Q = I$, then $Q = \begin{pmatrix} Q_1 & Q_2 \\ 0 & I \end{pmatrix}$ (where the latter $I$ is a smaller identity matrix). Substituting this into $\lim_{t \to 0} Qx(t) = 0$ shows that $\lim_{t \to 0} x_\textrm{bottom}(t) = 0$. I will add an edit to the relevant comment. – sasquires Jul 28 '20 at 05:20