2

Let $A=\left(A_{i j}\right)$ be an $n \times n$ symmetric matrix, and define the function $f: \mathbb{R}^n \rightarrow \mathbb{R}$ as $$ f(x):= \frac{1}{2}\langle x, Ax\rangle. $$ Using the definition, determine the second-order derivative $D^2 f(a) \in \operatorname{Hom}^2\left(\mathbb{R}^n, \mathbb{R}\right)$. (Here, $\operatorname{Hom}^2\left(\mathbb{R}^n, \mathbb{R}\right)$ refers to the space of bilinear maps $\mathbb{R}^n \times \mathbb{R}^n \rightarrow \mathbb{R}$.)

I have found the derivative using the definition: it is $Df(a) = a^TA$. I also know that the definition of the second-order derivative is $\left(D^2 f\right)(a):=D(D f)(a) \in \operatorname{Hom}^2\left(\mathbb{R}^n, \mathbb{R}^m\right)$ (here with $m=1$), but I don't know how to use this to find the second-order derivative.

Allison

3 Answers

3

Using the definition of the inner product:

$$\langle x, Ax\rangle=\sum_{j}\sum_{i}x_j A_{i,j} x_i $$

Taking the partial derivative $\partial_{x_k}$ of the inner product:

$${\partial \over \partial {x_k}}\sum_{j}\sum_{i}x_j A_{i,j} x_i=\sum_{j}\sum_{i}{\partial \over \partial {x_k}}x_j A_{i,j} x_i$$

$$=\sum_{i}A_{i,k}x_i+\sum_{j}A_{k,j}x_j=(A^Tx)_k+(Ax)_k$$

The second "derivative" (in reality the Jacobian of the gradient, i.e. the Hessian matrix) has the mixed partial derivatives as its entries, so you need to take the partial derivative of each component of the gradient with respect to every variable.

$${\partial \over \partial x_l}{\partial \over \partial x_k}\langle x, Ax\rangle={\partial \over \partial x_l}\left(\sum_{i}A_{i,k}x_i+\sum_{j}A_{k,j}x_j\right)$$

$$=A_{l,k}+A_{k,l}=\left(A+A^T\right)_{k,l}$$

$${d^2\over d\mathbf x^2}\langle x,Ax\rangle=A+A^T $$

In your case this is $2A$ because $A$ is symmetric, and the final answer is just $A$ once you include the factor of $1/2$.

EDIT:

If you use the Einstein summation convention, it's more intuitive:

$$\langle x, Ax\rangle=A_j^i x_i x^j$$ $$\Rightarrow {\partial \over \partial x_k}\left(A_j^i x^j x_i \right)=A_j^i\delta_k^j x_i+A_j^i x^j \delta^k_i=A_k^i x_i+A_j^k x^j$$ $$\Rightarrow {d\over d\mathbf x}\langle x, Ax\rangle=x^T(A+A^T)$$


$${\partial \over \partial x_l}\left(A_k^i x_i+A_j^k x^j \right)=A_k^i \delta_i^l+A_j^k \delta^j_l=A_k^l+A_l^k $$ $$\Rightarrow{d\over d\mathbf x}x^T(A+A^T)=A+A^T $$


$${d^2\over d\mathbf x^2}\langle x, Ax\rangle=A+A^T$$
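A quick numerical sanity check of this result (a minimal sketch assuming NumPy; the finite-difference Hessian below is purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
A = rng.standard_normal((n, n))        # a generic, not necessarily symmetric A
g = lambda x: x @ A @ x                # g(x) = <x, Ax>

def hessian_fd(func, x, eps=1e-4):
    """Approximate the Hessian of func at x by central finite differences."""
    m = x.size
    H = np.zeros((m, m))
    I = np.eye(m)
    for k in range(m):
        for l in range(m):
            ek, el = eps * I[k], eps * I[l]
            H[k, l] = (func(x + ek + el) - func(x + ek - el)
                       - func(x - ek + el) + func(x - ek - el)) / (4 * eps**2)
    return H

x0 = rng.standard_normal(n)
print(np.allclose(hessian_fd(g, x0), A + A.T, atol=1e-6))   # True: D^2 <x,Ax> = A + A^T

S = (A + A.T) / 2                      # a symmetric matrix, as in the question
f = lambda x: 0.5 * x @ S @ x
print(np.allclose(hessian_fd(f, x0), S, atol=1e-6))          # True: D^2 f = S
```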

0

I did it in a different way, applying the definition of the derivative to $Df$ directly. Writing $C$ for the candidate second-order derivative, the remainder is $$ \begin{aligned} r(h) &= f'(a+h) - f'(a) - Ch \\ &= (a+h)^T A - a^TA - Ch \\ &= a^TA + h^TA - a^T A - Ch \\ &= h^TA - Ch. \end{aligned} $$ For $C$ to be the derivative we need $$ \lim_{h \to 0} \frac{h^TA - Ch}{\|h\|} = 0, $$ and since $h^TA - Ch$ is linear in $h$, this forces $h^TA = Ch$ for every $h$ (identifying the row vector $h^TA$ with the column vector $A^Th$). Hence $C = A^T = A$, because $A$ is symmetric, and so $A$ is the second-order derivative.
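A minimal numerical check of this argument (assuming NumPy; the variable names are only illustrative): with $C = A$ and $A$ symmetric, the remainder $f'(a+h) - f'(a) - Ch$ vanishes identically, so it is certainly $o(\|h\|)$.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5
M = rng.standard_normal((n, n))
A = (M + M.T) / 2                      # a symmetric matrix, as in the question

Df = lambda a: a @ A                   # f'(a) = a^T A, viewed as a (row) vector

a = rng.standard_normal(n)
h = 1e-3 * rng.standard_normal(n)

C = A                                  # candidate second-order derivative
remainder = Df(a + h) - Df(a) - C @ h  # = h^T A - A h = 0, since A = A^T
print(np.allclose(remainder, 0.0))     # True
```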

Allison
-1

Because you wrote $f(x) = \frac12\langle x, Ax\rangle$ I'm going to keep using $x$ consistently and write $$ Df(x)(u) = \langle x, Au\rangle = x^TAu. $$ We want to differentiate this expression again with respect to $x$; but it is a linear function in $x$, and the derivative of a linear function is that function itself. Hence $$ D^2f(x)(u, v) = D[x \mapsto Df(x)(u)](v) = \langle v, Au\rangle, $$ or in matrix notation $$ D^2f(x)(u, v) = v^TAu. $$
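A small numerical check of the identity $D^2f(x)(u, v) = v^TAu$ (a sketch assuming NumPy), differentiating the directional derivative $x \mapsto Df(x)(u)$ in the direction $v$ by central differences:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4
M = rng.standard_normal((n, n))
A = (M + M.T) / 2                          # symmetric A
f = lambda x: 0.5 * x @ A @ x              # f(x) = (1/2) <x, Ax>

x, u, v = rng.standard_normal((3, n))
eps = 1e-5

Df_u = lambda y: (f(y + eps * u) - f(y - eps * u)) / (2 * eps)   # ~ Df(y)(u)
second = (Df_u(x + eps * v) - Df_u(x - eps * v)) / (2 * eps)     # ~ D^2 f(x)(u, v)

print(np.isclose(second, v @ A @ u, atol=1e-4))                  # True
```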

  • Could someone explain the downvotes? This is correct and consistent with the other answers. Is there something wrong with my presentation? – Nicholas Todoroff May 18 '23 at 20:19
  • I think you got a little confused. The question asks for the solution to be a bilinear map, so the solution can be written as a matrix; your solution is a covector. – Daniel Muñoz May 18 '23 at 21:30
  • @DanielMuñoz My solution is not a covector; I left the actual calculation of the second derivative up to OP with the hint that the first derivative is linear in the point-of-differentiation $x$. I think though that I was wrong in telling them their derivative was incorrect; I got confused by their switch in notation from $x$ to $a$ for the point-of-differentiation. – Nicholas Todoroff May 18 '23 at 21:57
  • I've edited my answer and hopefully made the main point clearer. – Nicholas Todoroff May 18 '23 at 22:07
  • "...the derivative of a linear function is that function itself." Are you saying that linear functions are eigenvectors of derivatives? – Daniel Muñoz May 19 '23 at 11:47
  • @DanielMuñoz I'm not sure that interpretation really works, but maybe? This is a really basic fact of (total) derivatives. $Dg(x)$, the total derivative of $g$ at $x$, is the linear transformation that best approximates $g$ at $x$. So if $g$ is linear then intuitively $$Dg(x)(u) = g(u)$$ and this is very easy to prove. – Nicholas Todoroff May 19 '23 at 17:40