
I am trying to find the gradient

$$\nabla \mbox{trace}(Axx^TB)$$

where both $A$ and $B$ are $n \times n$ matrices, and $x$ is a column vector of length $n$.

I'm not sure how to approach this problem. I know that $xx^T$ is an $n \times n$ matrix with the squares $x_i^2$ along its diagonal, but how do the other two matrices multiply with it to produce the trace?

3 Answers

1

$$ \phi =\mathrm{tr} \left( \mathbf{A}\mathbf{x}\mathbf{x}^T\mathbf{B} \right) =\mathrm{tr} \left( \mathbf{B}\mathbf{A} \mathbf{x}\mathbf{x}^T \right) $$

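The differential is \begin{eqnarray} d\phi &=& (\mathbf{B}\mathbf{A})^T :d(\mathbf{x}\mathbf{x}^T) \\ &=& (\mathbf{B}\mathbf{A})^T : \left( d\mathbf{x}\,\mathbf{x}^T + \mathbf{x}\,d\mathbf{x}^T \right) \\ &=& \left( (\mathbf{B}\mathbf{A})^T + \mathbf{B}\mathbf{A} \right) \mathbf{x} : d\mathbf{x} \\ &=& 2\mathrm{sym}(\mathbf{B}\mathbf{A}) \mathbf{x} :d\mathbf{x} \end{eqnarray}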

where $\mathrm{sym}(\mathbf{C})= \frac12 \left( \mathbf{C}+\mathbf{C}^T \right)$ and the colon operator : denotes the Frobenius inner product, $\mathbf{M}:\mathbf{N} = \mathrm{tr} \left( \mathbf{M}^T\mathbf{N} \right)$.

The gradient is the vector $$ \frac{\partial \phi}{\partial \mathbf{x}}= 2\mathrm{sym}(\mathbf{B}\mathbf{A}) \mathbf{x}$$
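A quick numerical sanity check of this formula can be sketched in plain Python. Everything in the block is illustrative and not part of the answer: the helper names (`matvec`, `transpose`, `phi`, `grad_closed_form`, `grad_finite_diff`), the random test matrices, and the step size `h` are all assumptions chosen for the example. It compares $2\,\mathrm{sym}(\mathbf{B}\mathbf{A})\mathbf{x}$ with a central-difference approximation of the gradient.

```python
import random


def matvec(M, v):
    """Return the product M v, with M stored as a list of rows."""
    return [sum(m * c for m, c in zip(row, v)) for row in M]


def transpose(M):
    """Return the transpose of a list-of-rows matrix."""
    return [list(col) for col in zip(*M)]


def phi(A, B, x):
    """tr(A x x^T B): the i-th diagonal entry of (A x)(x^T B) is (A x)_i (B^T x)_i."""
    Ax = matvec(A, x)
    Btx = matvec(transpose(B), x)
    return sum(p * q for p, q in zip(Ax, Btx))


def grad_closed_form(A, B, x):
    """2 sym(BA) x, expanded as B A x + A^T B^T x."""
    BAx = matvec(B, matvec(A, x))
    AtBtx = matvec(transpose(A), matvec(transpose(B), x))
    return [p + q for p, q in zip(BAx, AtBtx)]


def grad_finite_diff(A, B, x, h=1e-6):
    """Central-difference approximation of the gradient of phi at x."""
    grad = []
    for k in range(len(x)):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[k] += h
        x_minus[k] -= h
        grad.append((phi(A, B, x_plus) - phi(A, B, x_minus)) / (2 * h))
    return grad


# Arbitrary test data (assumed for illustration only).
random.seed(0)
n = 4
A = [[random.uniform(-1, 1) for _ in range(n)] for _ in range(n)]
B = [[random.uniform(-1, 1) for _ in range(n)] for _ in range(n)]
x = [random.uniform(-1, 1) for _ in range(n)]

exact = grad_closed_form(A, B, x)
approx = grad_finite_diff(A, B, x)
max_err = max(abs(e - a) for e, a in zip(exact, approx))
print("largest componentwise difference:", max_err)
```

Because $\phi$ is quadratic in $\mathbf{x}$, the central difference has no truncation error, so any remaining difference should come from floating-point rounding alone.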

Steph
1

Given matrices ${\bf A}, {\bf B} \in \Bbb R^{n \times n}$, let scalar field $f : \Bbb R^n \to \Bbb R$ be defined by

$$ f ({\bf x}) := \mbox{tr} \left( {\bf A} {\bf x} {\bf x}^\top {\bf B} \right) = \mbox{tr} \left( {\bf x}^\top {\bf B} {\bf A} {\bf x} \right) = {\bf x}^\top {\bf B} {\bf A} {\bf x} $$

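Note that $f$ is a quadratic form ${\bf x}^\top {\bf M} {\bf x}$ with ${\bf M} := {\bf B} {\bf A}$, and the gradient of such a form is $\left( {\bf M} + {\bf M}^\top \right) {\bf x}$. Hence, the gradient of $f$ is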

$$ \nabla_{{\bf x}} f ({\bf x}) = \color{blue}{\left({\bf B} \, {\bf A} + {\bf A}^\top {\bf B}^\top \right) {\bf x}}$$
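As a sketch of why this holds, write ${\bf M} := {\bf B} {\bf A}$ (a shorthand introduced here for brevity) and differentiate componentwise:

$$ f ({\bf x}) = \sum_{i,j} M_{ij} \, x_i x_j, \qquad \frac{\partial f}{\partial x_k} = \sum_{j} M_{kj} \, x_j + \sum_{i} M_{ik} \, x_i = \left( \left( {\bf M} + {\bf M}^\top \right) {\bf x} \right)_k $$

Substituting ${\bf M} = {\bf B} {\bf A}$ and ${\bf M}^\top = {\bf A}^\top {\bf B}^\top$ recovers the expression above.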

0

$\operatorname{trace}(Axx^TB) = \langle Ax, B^Tx \rangle = \sum_{i\in[n]} (Ax)_{i} \times (B^Tx)_{i} = \sum_{i\in[n]} (A_i \cdot x) \times (B^i \cdot x),$ where $A_i$ is the $i$-th row of $A$, and $B^i$ is the $i$-th column of $B$.

$\langle X,Y \rangle = \operatorname{trace}(X^T Y)$ (Euclidean [matrix] inner product)

$x \cdot y = \sum_i x_i \times y_i$ (Euclidean [vector] inner product)
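One way to finish from this sum, sketched here with the product rule: each term is a product of two linear functions of $x$, and $\nabla (a \cdot x) = a$, so, treating $A_i^T$ and $B^i$ as column vectors,

$$\nabla \operatorname{trace}(Axx^TB) = \sum_{i\in[n]} \left[ (B^i \cdot x) \, A_i^T + (A_i \cdot x) \, B^i \right] = A^T B^T x + B A x.$$

The last equality uses $\sum_i A_i^T (B^Tx)_i = A^T (B^T x)$ and $\sum_i B^i (Ax)_i = B (A x)$. The result agrees with the gradient $(BA + A^TB^T)x$ obtained in the other answers.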

Vezen BU