6

Let $$A = \left(\begin{array}{cc} 2&3 \\ 3&4 \end{array}\right) \in M_2(\mathbb{C})$$

Find $P$ such that $P^TAP = D$ where $D$ is a diagonal matrix.

So here's the solution:

$$A = \left(\begin{array}{cc|cc} 2&3&1&0\\ 3&4&0&1 \end{array}\right) \sim \left(\begin{array}{cc|cc} 2&0&1&-3/2\\ 0&-1/2&0&1 \end{array}\right)$$

Therefore, $$P = \left(\begin{array}{cc} 1&-3/2\\ 0&1 \end{array}\right) \\ P^TAP = \left(\begin{array}{cc} 2&0\\ 0&-1/2 \end{array}\right) $$

What was done here exactly? I'd be glad if someone could elaborate on the process.

Thanks.
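Edit: as a sanity check (a quick Python/numpy sketch, not part of the book's solution), the claimed $P$ does satisfy $P^TAP = D$:

```python
import numpy as np

A = np.array([[2., 3.], [3., 4.]])
P = np.array([[1., -1.5], [0., 1.]])

print(P.T @ A @ P)  # [[ 2.   0. ], [ 0.  -0.5]]
```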

InsideOut
  • 6,883
jmiller
  • 637
  • Please improve the question by providing additional context, which ideally includes your thoughts on the problem and any attempts you have made to solve it. This information helps others identify where you have difficulties and helps them write answers appropriate to your experience level. – Fly by Night Aug 02 '15 at 19:41
  • I disagree. If I were given a solution key and this was how the solution was written, I'd be confused too. Personally, I do not understand the solution as it is presented either. The method that I had learned and teach is to find the eigenvalues and corresponding eigenvectors. As per the spectral theorem, since $A$ is a real symmetric matrix, we are guaranteed that such a $P$ that diagonalizes $A$ can be chosen to be orthogonal, which implies that $P^{-1}= P^T$. I'll write up my process below. – JMoravitz Aug 02 '15 at 19:43
  • I've understood that you want to row-reduce $A$ to some diagonal matrix $D$ and apply the same operations to the columns of the identity. – jmiller Aug 02 '15 at 19:48
  • Found some info here: https://books.google.co.il/books?id=23PmtnVdZWcC&pg=PA100&lpg=PA100&dq=P%5ETAP+Diagonal+row+reduce&source=bl&ots=wV5z7XiV2o&sig=CeasZfxZhnWNj1DlE2JYGXHftFY&hl=en&sa=X&ved=0CCwQ6AEwBGoVChMIyNPljpiLxwIVA24UCh0Olg5V#v=onepage&q=P%5ETAP%20Diagonal%20row%20reduce&f=false – jmiller Aug 02 '15 at 20:22
  • Funny, this is the second time I stumble across this Hermite method... the first time was also answered by Will Jagy; you can check it here – user190080 Aug 02 '15 at 20:24
  • I am really interested in understanding the process I've presented. I'd be glad if you could help me with that, @user190080. – jmiller Aug 02 '15 at 20:26
  • I think that what was done here is: if you reduce by a row operation, then it doesn't affect the identity matrix, whereas if you make a column operation, then you apply it to the identity matrix as well. – jmiller Aug 02 '15 at 20:34
  • First of all, it is very interesting that there are a lot of methods which lead to a diagonal representation of matrix $A$. The most general is the one from JMoravitz, where you get that $P^TP=\mathrm{Id}$; then there is the one from Will Jagy, which uses quadratic forms and might be quite fast to compute; and then there is your solution manual, which pretty much works exactly as in your linked reference - if you want to make use of the power of diagonalization while calculating exponents etc., then you probably need to go with the spectral theorem (although check the other example) – user190080 Aug 02 '15 at 20:37
  • @user190080, could you explain in simple words what the algorithm is? I am confused. Added another example above. – jmiller Aug 02 '15 at 20:40
  • In your first example, you use the fact that you can express row and column manipulations with the help of multiplication by so-called elementary matrices; left multiplication is a row manipulation and right multiplication is a column manipulation, see the wiki here. Then you transform until you reach your diagonal structure and you're done (this is not diagonalization in the common sense!) - your second example confuses me too... – user190080 Aug 02 '15 at 20:56
  • So basically, to find $P$, I need to apply only the column operations to the identity matrix on the right. Got it. Thanks. – jmiller Aug 02 '15 at 21:03
  • if you still have a question, just write here and I'll try to post an answer on this! – user190080 Aug 02 '15 at 21:26
  • @user190080, the OP is being taught to run Hermite backwards, one operation at a time. I put a summary in my answer. My way, which I find easy to remember, finds $Q^T D Q = A,$ so I then have a separate step as $P=Q^{-1}.$ – Will Jagy Aug 02 '15 at 22:23
  • @WillJagy ah ok, I thought it looked more like the multiplication of elementary matrices as described in his link, but the transformation in the 2nd example especially looked a bit strange to me... as long as the OP understood how it works, it's fine with me (and I learned just another method) – user190080 Aug 03 '15 at 00:18
  • @user190080, I have no idea what the OP understands, since he is asking us to explain some unknown book... however, his method is similar to Gauss reduction of binary quadratic forms, one step at a time. We start with a symmetric matrix $A_0.$ At each step, we are going to use some elementary matrix $E,$ same as in row reduction, such that $A_{n+1}=E^T A_n E$ has one fewer pair of off-diagonal nonzero entries. We also began with $P_0=I,$ and $P_{n+1}=P_n E.$ Eventually we get to $A_M=D$ and $P_M=P,$ with $P^T A P = D$ by construction (see the sketch just after these comments). – Will Jagy Aug 03 '15 at 00:30
  • @user190080 please take a look at http://math.stackexchange.com/questions/1388421/reference-for-linear-algebra-books-that-teach-reverse-hermite-method-for-symmetr where I carefully present a 2 by 2 example that requires two "steps" – Will Jagy Aug 09 '15 at 18:00
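For readers in the same position as the OP, here is a minimal executable sketch of the step-by-step congruence reduction described in the comments above (Python/numpy; the names are mine, and it assumes every pivot $A[i,i]$ it meets is nonzero, so it omits the pivot-repair tricks a full implementation would need):

```python
import numpy as np

def congruence_diagonalize(A):
    """Return P, D with P.T @ A @ P = D for symmetric A.

    One elementary congruence step per nonzero off-diagonal pair,
    assuming each pivot A[i, i] is nonzero when it is needed.
    """
    A = A.astype(float)
    n = A.shape[0]
    P = np.eye(n)
    for i in range(n):
        for j in range(i + 1, n):
            if A[i, j] != 0:
                E = np.eye(n)
                E[i, j] = -A[i, j] / A[i, i]  # column op: col_j -= (A[i,j]/A[i,i]) * col_i
                A = E.T @ A @ E               # matching row op keeps A symmetric
                P = P @ E                     # record the column ops in P
    return P, A

A = np.array([[2., 3.], [3., 4.]])
P, D = congruence_diagonalize(A)
print(P)            # [[ 1.  -1.5], [ 0.   1. ]]
print(D)            # [[ 2.   0. ], [ 0.  -0.5]]
print(P.T @ A @ P)  # equals D
```

Each `E` here is an elementary matrix: right multiplication applies a column operation, and left multiplication by `E.T` applies the matching row operation, which is exactly the "apply it to the identity matrix as well" bookkeeping from the comments.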

3 Answers

4

Hermite Reduction.

SEE ALSO Orthogonal basis for this indefinite symmetric bilinear form

Transforming quadratic forms, how is this theorem called?

What is the difference between using $PAP^{-1}$ and $PAP^{T}$ to diagonalize a matrix?

When you have a symmetric matrix with integer entries, you may use Hermite's method for diagonalizing it. The order they want is $P^T A P = D,$ so I will need to take an inverse at the end.

Make a column vector $$ V = \left( \begin{array}{c} x \\ y \end{array} \right) $$ and write out $$ V^T A V = 2 x^2 + 6 xy + 4 y^2 $$ Next, we cancel out all $x$ terms using $$ \left( x + \frac{3}{2} y \right)^2 = x^2 + 3 xy + \frac{9}{4} y^2, $$ and $$ 2 \left( x + \frac{3}{2} y \right)^2 = 2x^2 + 6 xy + \frac{9}{2} y^2. $$ As a result, $$ 2 \left( x + \frac{3}{2} y \right)^2 - \frac{1}{2} y^2 = 2 x^2 + 6 xy + 4 y^2 . $$
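That completion of squares is easy to confirm symbolically (a sympy sketch, added for verification; it is not part of the derivation):

```python
from sympy import symbols, expand, Rational

x, y = symbols('x y')
completed = 2 * (x + Rational(3, 2) * y)**2 - Rational(1, 2) * y**2
print(expand(completed))  # 2*x**2 + 6*x*y + 4*y**2
```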


In matrices, the direction I did is $$ \left( \begin{array}{cc} 1 & 0 \\ \frac{3}{2} & 1 \end{array} \right) \left( \begin{array}{cc} 2 & 0 \\ 0 & -\frac{1}{2} \end{array} \right) \left( \begin{array}{cc} 1 & \frac{3}{2} \\ 0 & 1 \end{array} \right) = \left( \begin{array}{cc} 2 & 3 \\ 3 & 4 \end{array} \right) $$

With $$ Q = \left( \begin{array}{cc} 1 & \frac{3}{2} \\ 0 & 1 \end{array} \right) $$ notice that the rows correspond exactly to the linear substitutions, the first row means $x + \frac{3}{2} y$ and the second row means $y.$

What I did so far is in the order $Q^T D Q = A.$ All we need to do is take $P = Q^{-1},$ which is easier than usual because $\det Q = 1.$ The result is $$ \left( \begin{array}{cc} 1 & 0 \\ -\frac{3}{2} & 1 \end{array} \right) \left( \begin{array}{cc} 2 & 3 \\ 3 & 4 \end{array} \right) \left( \begin{array}{cc} 1 & -\frac{3}{2} \\ 0 & 1 \end{array} \right) = \left( \begin{array}{cc} 2 & 0 \\ 0 & -\frac{1}{2} \end{array} \right) $$
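Both directions are easy to confirm exactly (a sympy sketch with rational arithmetic; the variable names are mine):

```python
from sympy import Matrix, Rational

A = Matrix([[2, 3], [3, 4]])
D = Matrix([[2, 0], [0, Rational(-1, 2)]])
Q = Matrix([[1, Rational(3, 2)], [0, 1]])
P = Q.inv()  # [[1, -3/2], [0, 1]], easy because det(Q) = 1

print(Q.T * D * Q == A)  # True
print(P.T * A * P == D)  # True
```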

The second example in the question, with 3 by 3 matrix, is $$ x^2 + 4 y^2 + 4 z^2 + 16 yz + 4 zx + 4 xy. $$ This is an example where an extra trick must be used: $$ (x+2y+2z)^2 = x^2 + 4 y^2 + 4 z^2 + 8 yz + 4 zx + 4 xy. $$ All that remains to construct is $8yz$ because we used up the $y^2$ and $z^2.$ The trick is that $(y+z)^2 - (y-z)^2 = 4yz,$ so $$ (x+2y+2z)^2 + 2 (y+z)^2 -2 (y-z)^2= x^2 + 4 y^2 + 4 z^2 + 16 yz + 4 zx + 4 xy. $$ Thus the diagonal matrix gets entries $1,2,-2$ and, in this direction,

$$ Q = \left( \begin{array}{ccc} 1 & 2 & 2 \\ 0 & 1 & 1 \\ 0 & 1 & -1 \end{array} \right) $$ and then $P = Q^{-1}$

$$ P = \left( \begin{array}{ccc} 1 & -2 & 0 \\ 0 & \frac{1}{2} & \frac{1}{2} \\ 0 & \frac{1}{2} & - \frac{1}{2} \end{array} \right) $$
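The same kind of check works for this $3\times 3$ example (again a sketch; the matrix `A` below is just the Gram matrix of the quadratic form $x^2 + 4y^2 + 4z^2 + 16yz + 4zx + 4xy$):

```python
from sympy import Matrix, diag

A = Matrix([[1, 2, 2], [2, 4, 8], [2, 8, 4]])  # Gram matrix of the form
Q = Matrix([[1, 2, 2], [0, 1, 1], [0, 1, -1]])
P = Q.inv()

print(P)                              # matches the P displayed above
print(P.T * A * P == diag(1, 2, -2))  # True
```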

Will Jagy
  • 139,541
  • This is interesting, and clearly a different method and result than that which I used. It also arrives at the same answer as given in the OP's solution key. Are the values in $D$ in this "diagonalization" important in any way? Are there multiple such "diagonalizations"? – JMoravitz Aug 02 '15 at 20:13
  • @JMoravitz, there are multiple such diagonalizations, the main restriction is Sylvester's law of inertia. By different choices, one may change the diagonal entries. No eigenvalues are needed, which is nice. You will find this in any book on quadratic forms; in particular, pages 17-19 in G. L. Watson, Integral Quadratic Forms. – Will Jagy Aug 02 '15 at 20:19
2

Generally, the easiest approach to diagonalization is to calculate the eigenvalues and corresponding eigenvectors and use them to form an orthonormal eigenbasis.

Such an orthogonal matrix is guaranteed to exist by the Spectral Theorem since our matrix, $A$, is a real symmetric matrix.

step 1: calculate eigenvalues

Find the eigenvalues by finding the characteristic polynomial: $\det(A-\lambda I) = (2-\lambda)(4-\lambda) - 3\cdot 3 = 8-6\lambda + \lambda^2 - 9 = \lambda^2 - 6\lambda - 1$

The roots of the characteristic polynomial are the eigenvalues. Solving via the quadratic formula gives us $\frac{6\pm\sqrt{36+4}}{2}=3\pm \sqrt{10}$

step 2: find the eigenvectors

Now, we try to find the eigenvectors.

An eigenvector for $\lambda_1=3+\sqrt{10}$ is a vector in the kernel of $A-\lambda_1 I$.

$rref\left(\begin{bmatrix} 2-3-\sqrt{10}&3\\3&4-3-\sqrt{10}\end{bmatrix}\right) = \begin{bmatrix}1&\frac{1-\sqrt{10}}{3}\\0&0\end{bmatrix}$, so the eigenvector $v_1$ is $\begin{bmatrix}\frac{-1+\sqrt{10}}{3}\\1\end{bmatrix}$.

Similarly, an eigenvector for $\lambda_2=3-\sqrt{10}$ is a vector in the kernel of $A-\lambda_2 I$.

$rref\left(\begin{bmatrix} 2-3+\sqrt{10}&3\\3&4-3+\sqrt{10}\end{bmatrix}\right) = \begin{bmatrix}1&\frac{1+\sqrt{10}}{3}\\0&0\end{bmatrix}$, so the eigenvector $v_2$ is $\begin{bmatrix}\frac{-1-\sqrt{10}}{3}\\1\end{bmatrix}$

step 3: form an orthonormal basis for each eigenspace

A convenient thing about this situation is that, thanks to the spectral theorem and the fact that our $A$ is real symmetric, vectors in different eigenspaces are already guaranteed to be orthogonal. Indeed, $\langle v_1, v_2\rangle = (\frac{-1+\sqrt{10}}{3})(\frac{-1-\sqrt{10}}{3})+1\cdot 1 = 0$

If we had a repeated eigenvalue, then we would need to apply the Gram-Schmidt process to the basis vectors of its corresponding eigenspace. In our case, each eigenvalue has multiplicity one, so we only need to normalize the vectors.

$u_1 = \frac{v_1}{\|v_1\|} = \begin{bmatrix} \frac{\sqrt{10}-1}{3\sqrt{1+\frac{1}{9}(\sqrt{10}-1)^2}} \\ \frac{1}{\sqrt{1+\frac{1}{9}(\sqrt{10}-1)^2}} \end{bmatrix}$, and $u_2 = \frac{v_2}{\|v_2\|}$ is computed in the same way.

These numbers were not very pretty to work with... oh well.

You then have $A = PDP^T$ where $P=[u_1,u_2]$ and $D=\begin{bmatrix}\lambda_1&0\\0&\lambda_2\end{bmatrix}$. $P$ is an orthogonal matrix, so $P^T=P^{-1}$, and we have $P^T A P=D$
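As a numerical cross-check of the whole computation (a numpy sketch; `numpy.linalg.eigh` is the standard eigensolver for symmetric matrices and returns eigenvalues in ascending order, so its columns may be ordered differently from the hand computation above):

```python
import numpy as np

A = np.array([[2., 3.], [3., 4.]])
w, P = np.linalg.eigh(A)  # eigenvalues ascending: 3 - sqrt(10), 3 + sqrt(10)

print(w)            # [-0.16227766  6.16227766]
print(P.T @ P)      # identity: the columns are orthonormal
print(P.T @ A @ P)  # diagonal matrix of eigenvalues, up to rounding
```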

JMoravitz
  • 79,518
  • Thank you for your efforts, and I am familiar with this method, although it seems like overhead in this particular case; I'd really like to understand the process I introduced in my question. – jmiller Aug 02 '15 at 20:15
  • As would I. I was unaware that another method and result existed. As shown in my work here, the values in $D$ that are used in your original post and in Will's answer are not the eigenvalues of the matrix $A$, and so I am curious as to the use and function of such an alternate diagonalization. The form I show here is useful since $|Px|=|x|$ as well as $P^TP=I$. This allows for easy calculation of $A^k$ for any positive integer $k$, whereas that is not the case with the other diagonalization, and the information contained in $D$ and $P$ is the eigenvalues and corresponding eigenvectors. – JMoravitz Aug 02 '15 at 20:23
  • JMoravitz, all that is being done in the question is running Hermite backward. We start with a symmetric matrix $A_0.$ At each step, we use some elementary matrix $E,$ same as in row reduction, such that $A_{n+1} = E^T A_n E$ has one fewer pair of nonzero off-diagonal entries. We also began with $P_0 = I,$ and $P_{n+1} = P_n E.$ Eventually we get to $A_M = D$ and $P_M = P.$ – Will Jagy Aug 02 '15 at 22:16
  • JMoravitz, please take a look at http://math.stackexchange.com/questions/1388421/reference-for-linear-algebra-books-that-teach-reverse-hermite-method-for-symmetr where I carefully present a 2 by 2 example that requires two "steps" – Will Jagy Aug 09 '15 at 18:00
0

When you parameterize conics with rotation, it is usual to make the change of variable $\mathbf{X} = P^T\mathbf{Y}$ so that $P^T\mathbf{A}P = D$, where $D$ is a diagonal matrix and $\mathbf{A}$ is the matrix associated with the conic. The goal is to eliminate the cross term $Bxy$.

The change of variable $\mathbf{X} = P^T\mathbf{Y}$ is deduced by completing the square (with respect to $x$ and $y$) in the equation of the conic, $Ax^2+Bxy+Cy^2+Dx+Ey+F=0$.

If $B \neq 0$ and $C \neq 0,$ the matrix $\mathbf{A} = \begin{pmatrix} A & B/2 \\ B/2 & C\end{pmatrix}$ is "diagonalized" by $\mathbf{P} = \begin{pmatrix} 1 & 0 \\ -B/(2C) & 1\end{pmatrix}$ because $$\mathbf{P}^T\mathbf{A}\mathbf{P} = \left( \begin{array}{cc} A-\frac{B^2}{4 C} & 0 \\ 0 & C \\ \end{array} \right)$$

If $B \neq 0$ and $A \neq 0,$ the matrix $\mathbf{A} = \begin{pmatrix} A & B/2 \\ B/2 & C\end{pmatrix}$ is "diagonalized" by $\mathbf{P} = \begin{pmatrix} 1 & -B/(2A) \\ 0 & 1\end{pmatrix}$ because $$\mathbf{P}^T\mathbf{A}\mathbf{P} = \left( \begin{array}{cc} A & 0 \\ 0 & C-\frac{B^2}{4 A} \\ \end{array} \right) $$

As a special case, if $A=C$, then $\mathbf{P}=\left( \begin{array}{cc} 1 & 1 \\ 1 & -1 \\ \end{array} \right)$.

$$\mathbf {P}^ T\mathbf {A}\mathbf {P}=\left( \begin{array}{cc} 2 A+B & 0 \\ 0 & 2 A-B \\ \end{array} \right)$$

In this last case, if you take $\frac{1}{\sqrt{2}}\mathbf{P}$, you get the eigenvalues on the diagonal.
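Both generic cases can be confirmed symbolically; here is a sympy sketch, with variable names of my choosing:

```python
from sympy import symbols, Matrix, simplify

A_, B_, C_ = symbols('A B C')
M = Matrix([[A_, B_ / 2], [B_ / 2, C_]])  # Gram matrix of Ax^2 + Bxy + Cy^2

P1 = Matrix([[1, 0], [-B_ / (2 * C_), 1]])  # case C != 0
P2 = Matrix([[1, -B_ / (2 * A_)], [0, 1]])  # case A != 0

print(simplify(P1.T * M * P1))  # diag(A - B**2/(4*C), C)
print(simplify(P2.T * M * P2))  # diag(A, C - B**2/(4*A))
```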

Edit: For a general method, see "Linear Algebra Done Wrong" by Sergei Treil.

wmora2
  • 137