14

I am trying to understand a proof concerning commuting matrices and simultaneous diagonalization of these.

It seems to be a well known result that when you take the eigenvectors of $A$ as a basis and diagonalize $B$ with it then you get a block diagonal matrix:

$$B= \begin{pmatrix} B_{1} & 0 & \cdots & 0 \\ 0 & B_{2} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & B_{m} \end{pmatrix},$$

where each $B_{i}$ is an $m_{g}(\lambda_{i}) \times m_{g}(\lambda_{i})$ block ($m_{g}(\lambda_{i})$ being the geometric multiplicity of $\lambda_{i}$).

My question
Why is this so? I calculated an example and, lo and behold, it really works :-) But I don't understand how it works out so neatly.

Can you please explain this result to me in an intuitive and step-by-step manner? Thank you!

vonjd
  • 8,810
  • "diagonalize B with it" doesn't seem to be what you want to say, does it? – Rasmus Jun 20 '11 at 20:49
  • So, we are assuming that $A$ and $B$ are both diagonalizable, and that $AB=BA$. Then you take a basis $\beta$ of eigenvectors of $A$ (listed with all the vectors corresponding to the same eigenvalue together), and you look at the coordinate matrix of $B$ relative to $\beta$, and you wonder why this matrix is a block-diagonal matrix. Is this correct? – Arturo Magidin Jun 20 '11 at 21:04
  • @Arturo: now I am back - yes, this is correct! – vonjd Jun 21 '11 at 06:36
  • @ArturoMagidin: You do not need to assume both $A$ and $B$ are diagonalizable. You only need to assume one of them, say $A$ is diagonalizable. The diagonalizability of $B$ is a consequence. – Hans Feb 14 '23 at 07:40
  • @Hans You are replying to a comment that is almost 12 years old, which was seeking to clarify what someone meant back then. – Arturo Magidin Feb 14 '23 at 13:11
  • @ArturoMagidin: Well, that is the advantage of the "memory", isn't it? The age of the comments is less relevant than the idea that they convey. The original question is indeed vague and badly worded, and it still is. Your previous comment clarified it and is correct. I am only saying it can be improved and strengthened. I think it is great that successive comments add valuable information and improvements. – Hans Feb 14 '23 at 18:33

3 Answers

16

Suppose that $A$ and $B$ are matrices that commute. Let $\lambda$ be an eigenvalue for $A$, and let $E_{\lambda}$ be the eigenspace of $A$ corresponding to $\lambda$. Let $\mathbf{v}_1,\ldots,\mathbf{v}_k$ be a basis for $E_{\lambda}$.

I claim that $B$ maps $E_{\lambda}$ to itself; in particular, $B\mathbf{v}_i$ can be expressed as a linear combination of $\mathbf{v}_1,\ldots,\mathbf{v}_k$, for $i=1,\ldots,k$.

To show that $B$ maps $E_{\lambda}$ to itself, it is enough to show that $B\mathbf{v}_i$ lies in $E_{\lambda}$; that is, that if we apply $A$ to $B\mathbf{v}_i$, the result will be $\lambda(B\mathbf{v}_i)$. This is where the fact that $A$ and $B$ commute comes in. We have: $$A\Bigl(B\mathbf{v}_i\Bigr) = (AB)\mathbf{v}_i = (BA)\mathbf{v}_i = B\Bigl(A\mathbf{v}_i\Bigr) = B(\lambda\mathbf{v}_i) = \lambda(B\mathbf{v}_i).$$ Therefore, $B\mathbf{v}_i\in E_{\lambda}$, as claimed.
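As a quick sanity check, here is a minimal NumPy sketch of this invariance claim; the $3\times 3$ matrices $A$ and $B$ below are a hypothetical example chosen only for illustration (they commute and $A$ is diagonalizable), not matrices from the question.

```python
import numpy as np

# Hypothetical commuting pair: A is diagonalizable, AB = BA.
A = np.array([[2., 0., 0.],
              [0., 2., 0.],
              [0., 0., 5.]])
B = np.array([[1., 3., 0.],
              [4., 1., 0.],
              [0., 0., 7.]])
assert np.allclose(A @ B, B @ A)   # the commuting hypothesis

lam = 2.0
v = np.array([1., 1., 0.])         # an eigenvector of A for lambda = 2
assert np.allclose(A @ v, lam * v)

Bv = B @ v
# B v lands back in E_lambda: applying A to it just rescales it by lambda.
assert np.allclose(A @ Bv, lam * Bv)
```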

So, now take the basis $\mathbf{v}_1,\ldots,\mathbf{v}_k$, and extend it to a basis for $\mathbf{V}$, $\beta=[\mathbf{v}_1,\ldots,\mathbf{v}_k,\mathbf{v}_{k+1},\ldots,\mathbf{v}_n]$. To find the coordinate matrix of $B$ relative to $\beta$, we compute $B\mathbf{v}_i$ for each $i$, write $B\mathbf{v}_i$ as a linear combination of the vectors in $\beta$, and then place the corresponding coefficients in the $i$th column of the matrix.

When we compute $B\mathbf{v}_1,\ldots,B\mathbf{v}_k$, each of these will lie in $E_{\lambda}$. Therefore, each of these can be expressed as a linear combination of $\mathbf{v}_1,\ldots,\mathbf{v}_k$ (since they form a basis for $E_{\lambda}$). So, to express them as linear combinations of $\beta$, we just add $0$s; we will have: $$\begin{align*} B\mathbf{v}_1 &= b_{11}\mathbf{v}_1 + b_{21}\mathbf{v}_2+\cdots+b_{k1}\mathbf{v}_k + 0\mathbf{v}_{k+1}+\cdots + 0\mathbf{v}_n\\ B\mathbf{v}_2 &= b_{12}\mathbf{v}_1 + b_{22}\mathbf{v}_2 + \cdots +b_{k2}\mathbf{v}_k + 0\mathbf{v}_{k+1}+\cdots + 0\mathbf{v}_n\\ &\vdots\\ B\mathbf{v}_k &= b_{1k}\mathbf{v}_1 + b_{2k}\mathbf{v}_2 + \cdots + b_{kk}\mathbf{v}_k + 0\mathbf{v}_{k+1}+\cdots + 0\mathbf{v}_n \end{align*}$$ where $b_{ij}$ are some scalars (some possibly equal to $0$). So the matrix of $B$ relative to $\beta$ would start off something like: $$\left(\begin{array}{ccccccc} b_{11} & b_{12} & \cdots & b_{1k} & * & \cdots & *\\ b_{21} & b_{22} & \cdots & b_{2k} & * & \cdots & *\\ \vdots & \vdots & \ddots & \vdots & \vdots & \ddots & \vdots\\ b_{k1} & b_{k2} & \cdots & b_{kk} & * & \cdots & *\\ 0 & 0 & \cdots & 0 & * & \cdots & *\\ \vdots & \vdots & \ddots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 0 & * & \cdots & * \end{array}\right).$$

So, now suppose that you have a basis for $\mathbf{V}$ that consists entirely of eigenvectors of $A$; let $\beta=[\mathbf{v}_1,\ldots,\mathbf{v}_n]$ be this basis, with $\mathbf{v}_1,\ldots,\mathbf{v}_{m_1}$ corresponding to $\lambda_1$ (with $m_1$ the algebraic multiplicity of $\lambda_1$, which equals the geometric multiplicity of $\lambda_1$); $\mathbf{v}_{m_1+1},\ldots,\mathbf{v}_{m_1+m_2}$ the eigenvectors corresponding to $\lambda_2$, and so on until we get to $\mathbf{v}_{m_1+\cdots+m_{k-1}+1},\ldots,\mathbf{v}_{m_1+\cdots+m_k}$ corresponding to $\lambda_k$. Note that $\mathbf{v}_{1},\ldots,\mathbf{v}_{m_1}$ are a basis for $E_{\lambda_1}$; that $\mathbf{v}_{m_1+1},\ldots,\mathbf{v}_{m_1+m_2}$ are a basis for $E_{\lambda_2}$, etc.

By what we just saw, each of $B\mathbf{v}_1,\ldots,B\mathbf{v}_{m_1}$ lies in $E_{\lambda_1}$, and so when we express it as a linear combination of vectors in $\beta$, the only vectors with nonzero coefficients are $\mathbf{v}_1,\ldots,\mathbf{v}_{m_1}$, because they are a basis for $E_{\lambda_1}$. So in the first $m_1$ columns of $[B]_{\beta}^{\beta}$ (the coordinate matrix of $B$ relative to $\beta$), the only nonzero entries occur in the first $m_1$ rows.

Likewise, each of $B\mathbf{v}_{m_1+1},\ldots,B\mathbf{v}_{m_1+m_2}$ lies in $E_{\lambda_2}$, so when we express them as linear combinations of $\beta$, the only places where you can have nonzero coefficients are in the coefficients of $\mathbf{v}_{m_1+1},\ldots,\mathbf{v}_{m_1+m_2}$. So the $(m_1+1)$st through $(m_1+m_2)$th columns of $[B]_{\beta}^{\beta}$ can only have nonzero entries in the $(m_1+1)$st through $(m_1+m_2)$th rows. And so on.

That means that $[B]_{\beta}^{\beta}$ is in fact block-diagonal, with the blocks corresponding to the eigenspaces $E_{\lambda_i}$ of $A$, exactly as described.
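Here is a minimal NumPy sketch of the whole statement, again using a hypothetical $3\times 3$ commuting pair chosen only for illustration: the columns of $P$ are eigenvectors of $A$ grouped by eigenvalue (two for $\lambda=2$, one for $\lambda=5$), and $P^{-1}BP$ comes out block diagonal with a $2\times 2$ block and a $1\times 1$ block.

```python
import numpy as np

# Hypothetical commuting pair, A diagonalizable.
A = np.array([[2., 0., 0.],
              [0., 2., 0.],
              [0., 0., 5.]])
B = np.array([[1., 3., 0.],
              [4., 1., 0.],
              [0., 0., 7.]])
assert np.allclose(A @ B, B @ A)

# Columns of P: a basis of E_2 (first two columns) followed by a basis of E_5.
P = np.array([[1., 1., 0.],
              [1., -1., 0.],
              [0., 0., 1.]])
assert np.allclose(np.linalg.inv(P) @ A @ P, np.diag([2., 2., 5.]))

B_beta = np.linalg.inv(P) @ B @ P   # coordinate matrix of B relative to beta
print(np.round(B_beta, 10))
# Entries coupling E_2 and E_5 vanish, so B_beta is block diagonal
# with a 2x2 block and a 1x1 block.
assert np.allclose(B_beta[2, :2], 0) and np.allclose(B_beta[:2, 2], 0)
```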

Keshav
  • 1,620
Arturo Magidin
  • 398,050
  • Could you please clarify if you are assuming both $A$ and $B$ are diagonalizable? If so, it would be better to state the assumption at the outset. – Hans Feb 14 '23 at 06:10
  • @Hans Perhaps you should read the question carefully? It begins with the explicit assumption that $A$ is diagonalizable: "if you take eigenvectors of $A$ and use them as a basis..." The assumption that $B$ is diagonalizable isn't used yet. And see the comments below the question, too, to see why I don't have to repeat the assumptions already present in the question. – Arturo Magidin Feb 14 '23 at 06:23
  • OK. That is a bit roundabout for my taste, since if $A$ is not diagonalizable, its eigenvectors cannot be used as a basis. The statement may be better put as "when you take the eigenvectors of a diagonalizable $A$ as a basis". I think the question is not worded the best and is a bit vague and can use some editing. – Hans Feb 14 '23 at 06:43
  • @Hans The question is over 10 years old (as is the answer). Don't bump it just because it isn't fully to your taste. – Arturo Magidin Feb 14 '23 at 13:09
5

I will write $k_i=m_g(\lambda_i)$.

You are looking for the general form of a matrix $B$ that commutes with $$A= \begin{pmatrix} \lambda_1 I_{k_1} & 0 & \cdots & 0 \\ 0 & \lambda_2 I_{k_2} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_m I_{k_m} \end{pmatrix}.$$

If you put $B$ in the same block structure, you have

$$B= \begin{pmatrix} B_{11} & B_{12} & \cdots & B_{1m} \\ B_{21} & B_{22} & \cdots & B_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ B_{m1} & B_{m2} & \cdots & B_{mm} \end{pmatrix},$$ where $B_{ij}$ is a $k_i$-by-$k_j$ matrix.

Then $$AB= \begin{pmatrix} \lambda_1 B_{11} & \lambda_1 B_{12} & \cdots & \lambda_1 B_{1m} \\ \lambda_2 B_{21} & \lambda_2 B_{22} & \cdots & \lambda_2 B_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ \lambda_m B_{m1} & \lambda_m B_{m2} & \cdots & \lambda_m B_{mm} \end{pmatrix},$$ while $$BA= \begin{pmatrix} \lambda_1 B_{11} & \lambda_2 B_{12} & \cdots & \lambda_m B_{1m} \\ \lambda_1 B_{21} & \lambda_2 B_{22} & \cdots & \lambda_m B_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ \lambda_1 B_{m1} & \lambda_2 B_{m2} & \cdots & \lambda_m B_{mm} \end{pmatrix}.$$

You can compare blocks to see that $B$ must have the desired form if $AB=BA$: the $(i,j)$ block of $AB-BA$ is $(\lambda_i-\lambda_j)B_{ij}$, and since $\lambda_i\neq\lambda_j$ whenever $i\neq j$, commuting forces every off-diagonal block $B_{ij}$ ($i\neq j$) to be zero.
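A minimal NumPy sketch of this block comparison, with hypothetical sizes $k_1=2$, $k_2=3$ and eigenvalues $\lambda_1=2$, $\lambda_2=5$ chosen only for illustration: the $(i,j)$ block of $AB-BA$ is exactly $(\lambda_i-\lambda_j)B_{ij}$, so the commutator can vanish only if the off-diagonal blocks of $B$ do.

```python
import numpy as np

rng = np.random.default_rng(0)
k1, k2 = 2, 3
lam1, lam2 = 2.0, 5.0
A = np.diag([lam1] * k1 + [lam2] * k2)          # diag(lam1*I_{k1}, lam2*I_{k2})
B = rng.standard_normal((k1 + k2, k1 + k2))     # an arbitrary B, split into blocks

C = A @ B - B @ A
# The (i, j) block of AB - BA equals (lam_i - lam_j) * B_{ij} ...
assert np.allclose(C[:k1, k1:], (lam1 - lam2) * B[:k1, k1:])
assert np.allclose(C[k1:, :k1], (lam2 - lam1) * B[k1:, :k1])
# ... so AB = BA (C = 0) can only hold when the off-diagonal blocks of B are zero.
```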

Jonas Meyer
  • 53,602
4

Getting a block diagonal matrix expressing$~B$ just means that each eigenspace for$~A$ (whose direct sum fills the entire space, since $A$ is assumed diagonalisable) is $B$-stable. A useful fact that applies here is that when two linear operators commute, every subspace that is the kernel or the image of a polynomial in one of the operators is automatically stable for the other operator. A polynomial in the first operator is just another operator$~\psi$ that commutes with the second operator$~\phi$, so it suffices to show that the kernel and the image of $\psi$ are $\phi$-stable when $\psi$ and $\phi$ commute:

  • Kernel: if $v\in\ker\psi$ then $\psi(\phi(v))=\phi(\psi(v))=\phi(0)=0$, so indeed $\phi(v)\in\ker\psi$.
  • Image: if $v=\psi(w)$ then $\phi(v)=\phi(\psi(w))=\psi(\phi(w))$, which indeed is in the image of $\psi$.

The eigenspace of$~A$ for $\lambda$ is of course just the special case of the kernel of the polynomial $A-\lambda I$ in$~A$.
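A minimal NumPy sketch of both stability statements, using the same hypothetical commuting pair as in the first answer's sketch and $\psi=A-\lambda I$: $B$ maps $\ker\psi$ into $\ker\psi$ and the image of $\psi$ into itself.

```python
import numpy as np

# Hypothetical commuting pair chosen only for illustration.
A = np.array([[2., 0., 0.],
              [0., 2., 0.],
              [0., 0., 5.]])
B = np.array([[1., 3., 0.],
              [4., 1., 0.],
              [0., 0., 7.]])
assert np.allclose(A @ B, B @ A)

lam = 2.0
psi = A - lam * np.eye(3)          # the polynomial A - lambda*I in A

# Kernel stability: v in ker(psi) implies psi(B v) = B psi(v) = 0.
v = np.array([1., 1., 0.])
assert np.allclose(psi @ v, 0)
assert np.allclose(psi @ (B @ v), 0)

# Image stability: for w = psi(u), B w = psi(B u), which again lies in im(psi).
u = np.array([1., 2., 3.])
assert np.allclose(B @ (psi @ u), psi @ (B @ u))
```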