To add to the earlier response: I interpret your question as asking for heuristics for manipulating matrices, not specifically for what matrix multiplication means.
I assume here that vectors are column vectors, so $x^T$ refers to a row vector, and that capital letters denote matrices. When $A=(a_{ij})$ then $A^T=(a_{ji})$, so transposition, that is, the interchange of rows and columns, corresponds to switching the indices! Remembering that, you can easily convert a symbolic matrix product into a sum over indexed expressions, manipulate it, and convert back to a symbolic matrix product.
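For instance, the standard rule $(AB)^T = B^T A^T$ (included here only to illustrate the index trick) follows in one line:
$$
\big((AB)^T\big)_{ij} = (AB)_{ji} = \sum_k a_{jk} b_{ki}
= \sum_k (B^T)_{ik}\,(A^T)_{kj} = (B^T A^T)_{ij}.
$$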
One useful trick is pre- and post-multiplication by diagonal matrices: pre-multiplication corresponds to operations on the rows, while post-multiplication corresponds to operations on the columns. That is, letting $D$ be a diagonal matrix, in $DA$ each row of $A$ is multiplied by the corresponding diagonal element of $D$, while in $AD$ each column of $A$ is multiplied by the corresponding diagonal element.
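A quick numerical sketch (using NumPy, purely as an illustration) makes the row/column scaling easy to check:

```python
import numpy as np

A = np.arange(1, 10).reshape(3, 3).astype(float)  # a 3x3 test matrix
d = np.array([1.0, 10.0, 100.0])                  # the diagonal elements
D = np.diag(d)

# D @ A scales the rows of A: row i is multiplied by d[i]
assert np.allclose(D @ A, d[:, None] * A)

# A @ D scales the columns of A: column j is multiplied by d[j]
assert np.allclose(A @ D, A * d[None, :])
```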
Now an example to show how to use these manipulation tricks. Suppose $X$ is an $n\times n$ matrix such that there exists a basis for $\mathbb R^n$ consisting of eigenvectors of $X$ (we assume all elements are real here). That is, the eigenvalue/eigenvector equation $Xx=\lambda x$ has $n$ linearly independent solutions, call them (or some choice of them if they are not unique) $x_1, \dots, x_n$, with corresponding eigenvalues $\lambda_i$, the elements of the diagonal matrix $\Lambda$. Write
$$
X x_i = \lambda_i x_i
$$
Now let $P$ be the matrix with the $x_i$ as columns. How can we write the equations above as one matrix equation? Since the constants $\lambda_i$ multiply columns, we know that in the matrix representation the diagonal matrix $\Lambda$ must post-multiply $P$. That is, we get
$$
X P = P \Lambda
$$
Pre-multiplying both sides by the inverse of $P$ (which exists since the columns $x_i$ are linearly independent), we get
$$
P^{-1} X P = \Lambda
$$
That is, we can see that $X$ is similar to the diagonal matrix consisting of its eigenvalues.
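Here is a small numerical check of this (a NumPy sketch, assuming a symmetric matrix so that a real eigenvector basis certainly exists):

```python
import numpy as np

X = np.array([[2.0, 1.0],
              [1.0, 3.0]])     # symmetric, so real eigenvalues and eigenvectors

lam, P = np.linalg.eig(X)      # columns of P are the eigenvectors x_i
Lambda = np.diag(lam)

# X P = P Lambda: the lambda_i post-multiply the columns of P
assert np.allclose(X @ P, P @ Lambda)

# P^{-1} X P = Lambda: X is similar to the diagonal matrix of its eigenvalues
assert np.allclose(np.linalg.inv(P) @ X @ P, Lambda)
```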
One more example: if $S$ is a sample covariance matrix, how can we convert it to a sample correlation matrix? The correlation between variables $i$ and $j$ is the covariance divided by the product of the standard deviations of variable $i$ and variable $j$:
$$
\text{cor}(X_i,X_j) = \frac{\text{cov}(X_i, X_j)}
{\sqrt{\text{var}(X_i) \text{var}(X_j) }}
$$
Looking at this with matrix eyes, we are dividing the $(i,j)$-element of the matrix $S$ by the square roots of the $i$th and $j$th diagonal elements! We are dividing each row of $S$ and each column of $S$ by the same diagonal elements, so this can be expressed as pre- and post-multiplication by the (same) diagonal matrix, namely the inverse of the one holding the square roots of the diagonal elements of $S$. We have found:
$$
R = D^{-1/2} S D^{-1/2}
$$
where $R$ is the sample correlation matrix, and $D$ is a diagonal matrix holding the diagonal elements of $S$.
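As a NumPy sketch of this (the data and variable names are just for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(size=(100, 3))     # 100 observations of 3 variables
S = np.cov(data, rowvar=False)       # sample covariance matrix

d = np.sqrt(np.diag(S))              # standard deviations: square roots of diag(S)
D_inv_sqrt = np.diag(1.0 / d)        # the matrix D^{-1/2}

R = D_inv_sqrt @ S @ D_inv_sqrt      # R = D^{-1/2} S D^{-1/2}

# agrees with the correlation matrix computed directly
assert np.allclose(R, np.corrcoef(data, rowvar=False))
```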
There are lots of applications of this kind of trick, and I find them so useful that textbooks should include them. One other example: now let $P$ be a permutation matrix, that is, an $n\times n$ matrix representing a permutation on $n$ symbols. Such a matrix has one 1 and $n-1$ zeros in each row and each column, and can be obtained by permuting (in the same way!) the rows and columns of an identity matrix. Then $AP$ (since it is a post-multiplication) permutes the columns of $A$, while $PA$ permutes the rows of $A$.
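A final illustrative sketch in NumPy: build a permutation matrix by permuting the rows of the identity, and check which side of the product permutes what.

```python
import numpy as np

A = np.arange(1, 10).reshape(3, 3)   # 3x3 test matrix
perm = [2, 0, 1]                     # a permutation of the indices 0, 1, 2
P = np.eye(3, dtype=int)[perm]       # permute the rows of the identity matrix

# P A permutes the rows of A (in the same way the identity's rows were permuted)
assert np.array_equal(P @ A, A[perm])

# A P permutes the columns of A (by the inverse permutation, since the
# permutation was written into the rows of P)
assert np.array_equal(A @ P, A[:, np.argsort(perm)])
```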