4

I do not want to use index notation.

I want to compute the derivative

$$ D_x (Axx^\top A) = ? $$ where $A$ is an $n\times n$ symmetric matrix and $x$ is a vector in $\mathbb{R}^n$. I tried resources such as the Matrix Cookbook, but they don't cover scenarios like this: here the function $f(x) = Axx^\top A$ takes a vector as input and returns a matrix as output.

It is possible to express this without using index notation, and I would like answers of that type. I would also like a step-by-step derivation, so that I can figure out how to perform similar calculations in the future.

Attempt

One attempt uses the Fréchet derivative definition (I will use the Frobenius norm): $$ \begin{align} \lim_{\|v\|\to 0} \frac{\|A(x+v)(x+v)^\top A - Axx^\top A - Dv\|_F}{\|v\|} &= \lim_{\|v\|\to 0} \frac{\|A(xv^\top + vx^\top + vv^\top)A - Dv\|_F}{\|v\|} \end{align} $$
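The algebra behind this limit can be sanity-checked numerically. The following is a small NumPy sketch (random symmetric $A$ and random $x$, $v$ are my own illustrative choices) verifying that $f(x+v) - f(x) = A(xv^\top + vx^\top + vv^\top)A$:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
B = rng.standard_normal((n, n))
A = (B + B.T) / 2                    # symmetric A, as in the question
x = rng.standard_normal(n)
v = rng.standard_normal(n)

f = lambda x: A @ np.outer(x, x) @ A         # f(x) = A x x^T A

lhs = f(x + v) - f(x)
rhs = A @ (np.outer(x, v) + np.outer(v, x) + np.outer(v, v)) @ A
print(np.allclose(lhs, rhs))                 # True up to floating point
```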

Euler_Salter
  • Since the result is a third-order tensor you cannot avoid writing ${\left(\frac{\partial T}{\partial x}\right)^{lm}}_i = {A^l}_i\, x_j A^{jm} + {A^m}_i\, x^k {A^l}_k = {A^l}_i\,(x^\top A)^m + {A^m}_i\,(Ax)^l$, where $T = Axx^\top A$.
    – Ted Black Mar 07 '24 at 22:38

2 Answers

6

Let's look at perturbations: $$f(x+v) = A(x+v)(x+v)^TA = Axx^TA + Axv^TA + Avx^TA + Avv^TA$$

The derivative is often defined as the unique linear map such that $$f(x+v) = f(x) + D_{f;x}(v) + o(\|v\|)$$ as $v\rightarrow 0$.

Thus $D_{f;x}: v\mapsto A(xv^T+vx^T)A$ is the derivative, viewed as a linear map. We can express it not as a matrix but as a third-order tensor (a matrix being a second-order tensor). However, compact expressions for higher-order tensors are elusive, which is why index notation is used.
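This derivative map can be checked against a finite difference. A quick NumPy sketch (the random symmetric $A$, the step size $h$, and the tolerance are my own illustrative choices): $(f(x+hv)-f(x))/h$ should agree with $A(xv^T+vx^T)A$ up to an $O(h)$ remainder.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5
B = rng.standard_normal((n, n))
A = (B + B.T) / 2                    # symmetric A
x = rng.standard_normal(n)
v = rng.standard_normal(n)

f = lambda x: A @ np.outer(x, x) @ A
Df = lambda x, v: A @ (np.outer(x, v) + np.outer(v, x)) @ A  # candidate derivative

h = 1e-6
fd = (f(x + h * v) - f(x)) / h       # finite-difference directional derivative
err = np.linalg.norm(fd - Df(x, v))
print(err < 1e-3)                    # remainder is O(h), so the error is tiny
```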

Snake707
1

Let $v\in\mathbb R^n$ with $v\neq 0_n$. Then \begin{align*} 0&\leq\frac{\Vert A(xv^\top + vx^\top)A - D_xv + Avv^\top A\Vert}{\Vert v\Vert} \\&\leq \frac{\Vert A(xv^\top+vx^\top)A - D_xv\Vert}{\Vert v\Vert} + \frac{\Vert Avv^\top A\Vert}{\Vert v\Vert} \\ &= \frac{\Vert A(xv^\top+vx^\top)A - D_xv\Vert}{\Vert v\Vert} + \frac{\Vert(Av)(Av)^\top\Vert}{\Vert v\Vert} \\ &\leq \frac{\Vert A(xv^\top+vx^\top)A - D_xv\Vert}{\Vert v\Vert} + \frac{\Vert Av\Vert\cdot\Vert Av\Vert}{\Vert v\Vert} \\ &\leq\frac{\Vert A(xv^\top+vx^\top)A - D_xv\Vert}{\Vert v\Vert} + \frac{\Vert A\Vert^2\Vert v\Vert^2}{\Vert v\Vert} \\ &=\frac{\Vert A(xv^\top+vx^\top)A - D_xv\Vert}{\Vert v\Vert} + \Vert A\Vert^2\Vert v\Vert,\end{align*} by the triangle inequality and the operator-norm inequality (used twice). The term $\Vert A\Vert^2\Vert v\Vert$ converges to $0$ as $\Vert v\Vert\rightarrow 0$. Now consider the linear map $D_x:\mathbb R^n\rightarrow\mathbb R^{n\times n}$ given by $D_xv = A(xv^\top + vx^\top)A$. For this choice, we have $$\frac{\Vert A(xv^\top+vx^\top)A - D_xv\Vert}{\Vert v\Vert} = 0.$$ Consequently, this expression (trivially) converges to $0$ as $\Vert v\Vert\rightarrow 0$. That is, this $D_x$ is the Fréchet derivative of $f(x) = Axx^\top A$.
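The two norm facts the chain relies on can also be verified numerically. A hedged NumPy sketch (random symmetric $A$ and $v$; I use the spectral norm for $\Vert A\Vert$, matching the operator-norm inequality): the rank-one remainder satisfies $\Vert(Av)(Av)^\top\Vert_F = \Vert Av\Vert^2 \leq \Vert A\Vert^2\Vert v\Vert^2$.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 6
B = rng.standard_normal((n, n))
A = (B + B.T) / 2
v = rng.standard_normal(n)

Av = A @ v
R = np.outer(Av, Av)                 # the remainder term A v v^T A
opA = np.linalg.norm(A, 2)           # operator (spectral) norm of A

# Frobenius norm of a rank-one matrix: ||u u^T||_F = ||u||^2, here ||Av||^2
print(np.isclose(np.linalg.norm(R, 'fro'), np.linalg.norm(Av) ** 2))
# operator-norm inequality applied twice: ||Av||^2 <= ||A||^2 ||v||^2
print(np.linalg.norm(R, 'fro') <= opA ** 2 * np.linalg.norm(v) ** 2 + 1e-12)
```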

But be careful: we can only compute the "point evaluation" $D_x$ (and not $D$ without the subscript) in this sense, since $D$ would be an element of $\mathbb R^{n\times n\times n}$ (i.e., a third-order tensor).
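If one does want the full third-order object, it can be stored as an $n\times n\times n$ array. A NumPy sketch under my own (arbitrary) index convention `D[i] = ∂f/∂x_i`: each slice is $A(x e_i^\top + e_i x^\top)A$, and contracting the tensor with $v$ recovers the linear map $D_xv$.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4
B = rng.standard_normal((n, n))
A = (B + B.T) / 2                    # symmetric A
x = rng.standard_normal(n)
v = rng.standard_normal(n)

# D[i] = partial derivative of f(x) = A x x^T A with respect to x_i,
# i.e. the matrix A (x e_i^T + e_i x^T) A
I = np.eye(n)
D = np.stack([A @ (np.outer(x, I[i]) + np.outer(I[i], x)) @ A for i in range(n)])

Dv = np.einsum('ilm,i->lm', D, v)    # contract the first index with v
expected = A @ (np.outer(x, v) + np.outer(v, x)) @ A
print(np.allclose(Dv, expected))     # True: the tensor encodes the linear map
```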