For a diagonalizable matrix, I wonder why/how someone came up with the formula:
$$\mathbf{P}^{-1}\mathbf{AP}=\mathbf{D}$$
and not something else like following:
$$\mathbf{AP}=\mathbf{D} $$
$$\text{or}$$
$$\mathbf{PA}=\mathbf{D}$$
-
Can you give some more context? What are P, D and A? – Klangen Jun 09 '19 at 08:50
-
@sdsad: do you mean the intuition behind diagonalization? – Chinnapparaj R Jun 09 '19 at 08:53
-
@ChinnapparajR Yeah, I think that's what he's asking. – Saketh Malyala Jun 09 '19 at 08:54
-
That is correct :-) – That Guy Jun 09 '19 at 10:07
3 Answers
Let $X_1, X_2, \cdots, X_n$ be linearly independent eigenvectors of a square matrix $A$ corresponding to the eigenvalues $\lambda_1, \lambda_2, \cdots, \lambda_n$.
Let $P = [{X_1} \ {X_2} \cdots {X_n}]$ be the matrix whose columns are these eigenvectors.
We also have $$A{X_i} = \lambda_i{X_i}.$$
So, $$AP = [A{X_1} \ A{X_2} \cdots A{X_n}] = [\lambda_1{X_1} \ \lambda_2{X_2} \cdots \lambda_n{X_n}]$$
$$AP = [{X_1} \ {X_2} \cdots {X_n}]\begin{bmatrix}\lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{bmatrix} $$
$$AP = PD$$
where $D$ is the diagonal matrix of eigenvalues. $P$ is invertible since the eigenvectors are linearly independent, so $|P| \not= 0$.
So,
$$P^{-1}AP = D$$
We can compute the higher powers of $A$ from the above equation, since
$$D^2 = (P^{-1}AP)(P^{-1}AP) = P^{-1}A(PP^{-1})AP = P^{-1}A^2P,$$
and similarly
$$D^3 = P^{-1}A^3P$$
$$\vdots$$
$$D^k = P^{-1}A^kP$$
or
$$A^k = PD^kP^{-1}$$
So it is conventional to write it as $P^{-1}AP = D$.
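As a concrete sanity check (a small worked example of my own; the matrix is chosen purely for illustration), take
$$A = \begin{bmatrix}2 & 1 \\ 1 & 2\end{bmatrix},$$
which has eigenvalues $\lambda_1 = 3$, $\lambda_2 = 1$ with eigenvectors $X_1 = (1, 1)^T$ and $X_2 = (1, -1)^T$. Then
$$P = \begin{bmatrix}1 & 1 \\ 1 & -1\end{bmatrix}, \quad P^{-1} = \frac{1}{2}\begin{bmatrix}1 & 1 \\ 1 & -1\end{bmatrix}, \quad P^{-1}AP = \begin{bmatrix}3 & 0 \\ 0 & 1\end{bmatrix} = D,$$
and the power formula gives
$$A^k = PD^kP^{-1} = \frac{1}{2}\begin{bmatrix}3^k + 1 & 3^k - 1 \\ 3^k - 1 & 3^k + 1\end{bmatrix},$$
which for $k = 1$ recovers $A$.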

-
Thanks. But still, why would you think about eigenvectors in the first place? – That Guy Jun 09 '19 at 09:35
-
Are you asking about the isolation in $$AP = [{X_1} \ {X_2} \cdots {X_n}]\begin{bmatrix}\lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{bmatrix}$$ ? – 19aksh Jun 09 '19 at 09:37
-
Not really. You derive the formula by using eigenvectors. But why do you even think about eigenvectors and not anything else? My question is more about the intuition for how someone ended up with this formula: why did they think about eigenvectors and not any other construct? – That Guy Jun 09 '19 at 10:06
-
Possibly worth reading: What is the importance of eigenvalues/eigenvectors? – Minus One-Twelfth Jun 10 '19 at 00:25
For a linear transformation $A$ on a finite-dimensional vector space $U$, we know that $A$ can be "represented" by a matrix. We usually denote this matrix by $[A]_{\mathcal{B}}$, where $\mathcal{B}$ is a basis for $U$.
Taking the basis for $U$ consisting of eigenvectors of $A$ (when this is possible, i.e. when $A$ has $\dim(U)$ linearly independent eigenvectors) gives a particularly simple (diagonal) $[A]_{\mathcal{B}}$: this is your $D$.
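To spell out why the eigenbasis does this (my own elaboration of the answer above): if $\mathcal{B} = \{v_1, \dots, v_n\}$ consists of eigenvectors with $Av_j = \lambda_j v_j$, then the $j$-th column of $[A]_{\mathcal{B}}$ records the coordinates of $Av_j$ in the basis $\mathcal{B}$, which are $(0, \dots, \lambda_j, \dots, 0)^T$. Hence
$$[A]_{\mathcal{B}} = \begin{bmatrix}\lambda_1 & & \\ & \ddots & \\ & & \lambda_n\end{bmatrix} = D.$$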

The defining equation for eigenvectors is $A\mathbf v=\lambda\mathbf v$. If we have a bunch of eigenvectors $\mathbf v_i$ with associated eigenvalues $\lambda_i$, using basic properties of matrix multiplication we can collect up all of the individual equations into a single “bulk” equation $$A\begin{bmatrix}\mathbf v_1 & \cdots & \mathbf v_n\end{bmatrix} = \begin{bmatrix}\mathbf v_1 & \mathbf v_2 & \cdots & \mathbf v_n\end{bmatrix} \begin{bmatrix}\lambda_1 & 0 & \cdots & 0 \\ 0&\lambda_2&\cdots&0 \\ \vdots & \vdots & \ddots & \vdots \\ 0&0&\cdots&\lambda_n\end{bmatrix},$$ or $AP=PD$ for short. Note that the matrix of eigenvalues right-multiplies the eigenvector matrix: we want to multiply each column of $P$ by the appropriate value.
If $A$ is diagonalizable, we can choose eigenvectors $\mathbf v_i$ such that they form a basis for the space, in which case $P$ will be square and invertible, so we can multiply both sides by $P^{-1}$ to get $P^{-1}AP=D$. I often use the equation $AP=PD$ as a starting point because I can never remember which side the inverse matrix goes on and it’s easy to derive from the fundamental eigenvector equation.
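To see the column-versus-row scaling concretely (a $2 \times 2$ illustration I'm adding, not from the answer itself): with $P = \begin{bmatrix}a & b \\ c & d\end{bmatrix}$ and $D = \begin{bmatrix}\lambda_1 & 0 \\ 0 & \lambda_2\end{bmatrix}$,
$$PD = \begin{bmatrix}\lambda_1 a & \lambda_2 b \\ \lambda_1 c & \lambda_2 d\end{bmatrix} \quad \text{(columns scaled, as desired)}, \qquad DP = \begin{bmatrix}\lambda_1 a & \lambda_1 b \\ \lambda_2 c & \lambda_2 d\end{bmatrix} \quad \text{(rows scaled)},$$
so the eigenvalue matrix must sit on the right: $AP = PD$, not $DP$.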

-
An easy way to remember is by noting that $D$ operates on coordinate vectors with respect to the basis of eigenvectors. So on the LHS of the matrix equation we need a vector expressed as a coordinate vector w.r.t. the basis consisting of eigenvectors. The columns of the transition matrix $P$ are the eigenvectors expressed as coordinate vectors w.r.t. the standard basis. Hence we apply $P$ to get a vector w.r.t. the standard basis, so that $A$ can be applied to it (if $A$ is the standard matrix representation of our linear transformation). – AnyAD Jun 09 '19 at 22:36
-
@AnyAD As Señor Wences used to say, “Easy for you; hard for me.” $A\mathbf v=\lambda\mathbf v$ is ingrained in stone. The rest is an almost trivial symbolic manipulation. For other things, remembering “input” and “output” bases is indeed easy. – amd Jun 09 '19 at 22:43
-
Don't disagree. (Thinking about transition matrices is sometimes useful, obviously depending on what one is trying to highlight) – AnyAD Jun 09 '19 at 22:47