
I'm following some lecture notes regarding the quadratic form of matrices: $$ x^{\prime} A x=\left[x_{1} x_{2} \cdots x_{n}\right]\left[\begin{array}{cccc}a_{11} & a_{12} & \cdots & a_{1 n} \\ a_{21} & a_{22} & \cdots & a_{2 n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n 1} & a_{n 2} & \cdots & a_{n n}\end{array}\right]\left[\begin{array}{c}x_{1} \\ x_{2} \\ \vdots \\ x_{n}\end{array}\right]=\sum_{i=1}^{n} \sum_{j=1}^{n} x_{i} a_{i j} x_{j} $$

When doing this $x^{\prime} A x$ matrix multiplication, I had to multiply the matrices out in full to see, only at the very end, that it is indeed the same as $\sum_{i=1}^{n} \sum_{j=1}^{n} x_{i} a_{i j} x_{j}$.
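For example, with $n=2$, writing everything out gives
$$x^{\prime} A x=\left[x_{1}\; x_{2}\right]\left[\begin{array}{cc}a_{11} & a_{12} \\ a_{21} & a_{22}\end{array}\right]\left[\begin{array}{c}x_{1} \\ x_{2}\end{array}\right]=a_{11} x_{1}^{2}+a_{12} x_{1} x_{2}+a_{21} x_{2} x_{1}+a_{22} x_{2}^{2},$$
which matches the double sum term by term, but only after the full multiplication.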

I know the general entry-by-entry rule, but is there a way of seeing this immediately? Or is there a rule that nicely specifies a summation formula like this for any matrix multiplication?

The problem goes on to take the partial derivative of $f = x^{\prime} A x$ as follows: $$\frac{\partial f}{\partial x_{k}}=2 x_{k} a_{k k}+\sum_{j=1, j \neq k}^{n} a_{k j} x_{j}+\sum_{i=1, i \neq k}^{n} x_{i} a_{i k}=2 x_{k} a_{k k}+2 \sum_{j=1, j \neq k}^{n} a_{k j} x_{j}$$

Same question here. I had to write out the long version of the equation explicitly, then take the partial derivative, to see that the summation simplification works. How might I take the derivative of the double summation directly, without having to use the long expression?
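To convince myself, I also checked the simplified formula numerically (a quick NumPy sketch of my own; note that it assumes $A$ is symmetric, which the last equality requires):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
A = rng.standard_normal((n, n))
A = (A + A.T) / 2              # the simplification assumes A is symmetric
x = rng.standard_normal(n)

f = lambda v: v @ A @ v        # f(x) = x'Ax

# Central finite-difference approximation of df/dx_k
h = 1e-6
grad_fd = np.array([
    (f(x + h * np.eye(n)[k]) - f(x - h * np.eye(n)[k])) / (2 * h)
    for k in range(n)
])

# Claimed formula: df/dx_k = 2 a_kk x_k + 2 sum_{j != k} a_kj x_j = 2 (A x)_k
grad_formula = 2 * A @ x

print(np.allclose(grad_fd, grad_formula, atol=1e-5))  # True
```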


Edit: My work and the way I did it (screenshot of my handwritten working omitted).

aisync
  • You might like Einstein notation. – Eric Apr 24 '21 at 02:43
  • @Eric I'll look into it. My main question is whether I'm missing something that would've made the calculation much easier. The more I read, it seems like the summation was just there to simplify the answer, and not so much the way the answer was derived. – aisync Apr 24 '21 at 03:00
  • You did an extra step in splitting out the $\neq k$ component, so you could've gone from the sum to the derivative on each of the three parts (the middle one is 0) to $x'Ae_k + e_k' A x$ – Eric Apr 24 '21 at 03:06
  • Warning: in the formula for the derivative you are implicitly assuming that the matrix $A$ is symmetric. [Look at $A=\begin{pmatrix}0 & 1\\ 0 & 0\end{pmatrix}$, where $f=x_1 x_2$, to see what I mean.] – ancient mathematician Apr 24 '21 at 07:12
  • As to your first question I think that honestly it's just a matter of experience. But surely we all know that $Ax$ represents a set of equations $\sum_{j=1}^n a_{ij}x_j$, and pre-multiplying by $x^T$ just adds these up with weights $x_i$? (This is spelled out just after these comments.) – ancient mathematician Apr 24 '21 at 07:15
  • @ancientmathematician Yes, part of the problem set is proving why we can make that assumption. We assume $A$ is not symmetric and define $B = (A' + A)/2$. $B$ is symmetric, and after simplification $x'Bx$ is the same as the original $x'Ax$. I'm going to include a screenshot of my work in an edit above. I believe it is the same as the answer, but please let me know otherwise. – aisync Apr 24 '21 at 07:16
  • Of course I know that, but you didn't make the assumption in your question. – ancient mathematician Apr 24 '21 at 07:18
  • @aisync Now that I see your working I'd say this: learn to use the summation notation. Your working is full of "dot dot dot" which means "make a good guess at what I have missed out"; the summation notation is precise. It may not matter in easy cases, but it can lead to serious problems. – ancient mathematician Apr 24 '21 at 09:28
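In summation notation, that observation about $Ax$ reads
$$x^{T}Ax=\sum_{i=1}^{n}x_{i}\,(Ax)_{i}=\sum_{i=1}^{n}x_{i}\sum_{j=1}^{n}a_{ij}x_{j}=\sum_{i=1}^{n}\sum_{j=1}^{n}x_{i}a_{ij}x_{j},$$
which produces the double sum directly, without multiplying the matrices out in full.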

1 Answer


In Einstein notation, which differs from what you use only in that it hides the $\sum$s (we can infer them from which indices are repeated), matrix-vector multiplication is defined by $(Ax)_i=A_{ij}x_j$. Indeed, this is the most general linear transformation of $x$'s components that, like $x$, has one uncontracted index, so it's natural to place the $A_{ij}$ in a rectangular array we call a matrix, multiplying with vectors as defined above. For square $A$ conformable with $x$, $x^\prime Ax$ exists, and is (the $1\times1$ matrix whose only entry is) $x\cdot Ax$. We define dot products by $u\cdot v=u_iv_i$, a scalar because it lacks uncontracted indices, making it rotationally invariant. So $x\cdot Ax=x_i(Ax)_i=x_iA_{ij}x_j$.
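If a concrete rendering helps, Einstein notation maps almost one-to-one onto NumPy's `einsum` (a small illustrative sketch of my own; the variable names are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 3
A = rng.standard_normal((n, n))
x = rng.standard_normal(n)

Ax   = np.einsum('ij,j->i', A, x)      # (Ax)_i = A_ij x_j  (one free index: a vector)
quad = np.einsum('i,ij,j->', x, A, x)  # x_i A_ij x_j       (no free index: a scalar)

print(np.allclose(Ax, A @ x))          # True
print(np.isclose(quad, x @ A @ x))     # True
```

The index string to the left of `->` names each factor's indices; repeated letters are summed over, exactly as the notation hides the $\sum$s.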

Using $\frac{\partial x_i}{\partial x_k}=\delta_{ik}$, where $\delta$ is the Kronecker delta, together with the product rule,$$\begin{align}\frac{\partial}{\partial x_k}(x_iA_{ij}x_j)&=\delta_{ik}A_{ij}x_j+x_iA_{ij}\delta_{jk}\\&=A_{kj}x_j+x_iA_{ik}\\&=A_{kj}x_j+A^T_{ki}x_i\\&=[(A+A^T)x]_k.\end{align}$$
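As a sanity check, this can be verified against finite differences (a sketch, deliberately using a non-symmetric $A$ so that $(A+A^{T})x$ and $2Ax$ differ):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4
A = rng.standard_normal((n, n))        # not symmetric in general
x = rng.standard_normal(n)

f = lambda v: v @ A @ v                # f(x) = x'Ax

# Central finite-difference approximation of the gradient
h = 1e-6
grad_fd = np.array([
    (f(x + h * np.eye(n)[k]) - f(x - h * np.eye(n)[k])) / (2 * h)
    for k in range(n)
])

print(np.allclose(grad_fd, (A + A.T) @ x, atol=1e-5))  # True
print(np.allclose(grad_fd, 2 * A @ x,    atol=1e-5))   # False (needs A symmetric)
```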

J.G.