Suppose we define a function $f : \mathbb{R}^n \to \mathbb{R}$.

We can write it in matrix form as $f(x) = \frac{1}{2}x^tAx + b^tx + c$, where $x$ is an $[n,1]$ vector.

To compute $\nabla f(x)$ we expand the matrix form as:

$f(x) = \frac{1}{2} \sum_i \sum_j x_ix_j A_{ij} + \sum_i b_ix_i + c$

However, when we take the partial derivative with respect to $x_k$, we get:

$\frac{\partial}{\partial x_k} f(x) =\frac{1}{2}(\sum_j x_j A_{kj} + \sum_i x_i A_{ik} ) + b_k$

I don't understand where the first two summations come from when differentiating with respect to $x_k$. Here $A$ is a symmetric $[n,n]$ matrix and the indices $i, j$ run from $1$ to $n$.

1 Answer

Let $k$ be fixed. Then in the term

$$\sum_{i,j=1}^n x_ix_j A_{ij},$$

there are only three types of terms that involve $x_k$:

$$\begin{cases}x_kx_k A_{kk} & \text{when } i=j=k, \\ \sum_{i\neq k} x_i x_k A_{ik} & \text{when } j=k, i\neq k, \\ \sum_{j\neq k} x_k x_j A_{kj} &\text{when } i=k, j\neq k. \end{cases}$$
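
As a small illustration (the values $n = 2$ and $k = 1$ are chosen here just for the example), the double sum has four terms:

$$\sum_{i,j=1}^2 x_ix_j A_{ij} = x_1x_1 A_{11} + x_2x_1 A_{21} + x_1x_2 A_{12} + x_2x_2 A_{22}.$$

The first term is the $i=j=k$ case, the second and third are the two cross terms that contain $x_1$ exactly once, and $x_2x_2A_{22}$ does not involve $x_1$ at all, so it contributes nothing to $\frac{\partial}{\partial x_1}$.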

Taking $\frac{\partial}{\partial x_k}$ of each term, we obtain

\begin{align} \frac{\partial }{\partial x_k} \sum_{i,j=1}^n x_ix_j A_{ij}&= 2x_k A_{kk} + \sum_{i\neq k} x_i A_{ik} + \sum_{j\neq k} x_j A_{kj}\\ &= x_k A_{kk} + \sum_{i\neq k} x_i A_{ik} + x_k A_{kk} + \sum_{j\neq k} x_j A_{kj} \\ &= \sum_{i} x_i A_{ik} + \sum_{j} x_j A_{kj}. \end{align}
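
If it helps to see the result numerically, here is a minimal sketch in plain Python that compares the formula for $\frac{\partial}{\partial x_k} f(x)$ with a central finite difference. The function names (`f`, `partial_formula`, `partial_numeric`), the step size `h`, and the sample values of `A`, `b`, `c`, `x` are all made up for this illustration and are not from the question.

```python
def f(x, A, b, c):
    """f(x) = 1/2 * sum_ij x_i x_j A_ij + sum_i b_i x_i + c, with x a plain list."""
    n = len(x)
    quad = sum(x[i] * x[j] * A[i][j] for i in range(n) for j in range(n))
    lin = sum(b[i] * x[i] for i in range(n))
    return 0.5 * quad + lin + c


def partial_formula(x, A, b, k):
    """Partial derivative w.r.t. x_k using the two sums from the derivation."""
    n = len(x)
    row_sum = sum(x[j] * A[k][j] for j in range(n))  # sum_j x_j A_kj
    col_sum = sum(x[i] * A[i][k] for i in range(n))  # sum_i x_i A_ik
    return 0.5 * (row_sum + col_sum) + b[k]


def partial_numeric(x, A, b, c, k, h=1e-6):
    """Central finite-difference approximation of the same partial derivative."""
    x_plus = list(x)
    x_minus = list(x)
    x_plus[k] += h
    x_minus[k] -= h
    return (f(x_plus, A, b, c) - f(x_minus, A, b, c)) / (2 * h)


# Sample data (assumed for the example): a symmetric 3x3 matrix A.
A = [[2.0, 1.0, 0.0],
     [1.0, 3.0, -1.0],
     [0.0, -1.0, 4.0]]
b = [1.0, 0.0, -2.0]
c = 1.5
x = [0.5, -1.0, 2.0]

for k in range(len(x)):
    print(k, partial_formula(x, A, b, k), partial_numeric(x, A, b, c, k))
```

The two columns should agree up to rounding error. Note that the derivation never uses symmetry, so the formula holds for any square $A$. When $A$ is symmetric, the two sums are equal and the partial derivative reduces to $\sum_j A_{kj}x_j + b_k$, that is, $\nabla f(x) = Ax + b$.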

Arctic Char