
What is the derivative of this expression with respect to $x$? Here, $c$ is a column vector.

$||Ax+b||^2_2c$

I think it has to be of the form $2A^T(Ax+b)c^T$ or $2c(Ax+b)^TA$ but I don't know how to arrive at this answer.

Edit: The original function I was working on is of the form $$f(x)=\frac{||Ax+b||^2_2}{c^Tx+d}$$ and I was trying to derive its Hessian.

2 Answers


Background info: There's a rule for taking derivatives in multivariable calculus which states that if $g:\mathbb R^n \to \mathbb R^m$ is differentiable and $C$ is a $k \times m$ matrix, then the derivative of the function $f(x) = C g(x)$ is $f'(x) = C g'(x)$. We can apply this rule in this problem.


In this case, your function can be written as $$ f(x) = c g(x) $$ where $g(x) = \| Ax + b \|_2^2$. The derivative of $f$ is \begin{align*} f'(x) &= c g'(x) \\ &= 2c(Ax + b)^T A . \end{align*}


Computing the derivative of the function $g(x) = \| Ax + b \|_2^2$ is a question that comes up frequently on this site. I think the most elegant way to do it is using the chain rule. Note that $g(x) = h(u(x))$, where $u(x) = Ax + b$ and $h(v) = \| v \|_2^2$. The derivatives of $u$ and $h$ are $u'(x) = A$ and $h'(v) = 2v^T$. So by the chain rule $$ g'(x) = h'(u(x)) u'(x) = 2(Ax + b)^T A. $$
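As a sanity check, the formula $f'(x) = 2c(Ax + b)^T A$ can be compared against a finite-difference Jacobian. The following is only a minimal sketch: the sizes, the random seed, the step size `h`, and the helper names `analytic_jacobian` and `numeric_jacobian` are all assumptions made for illustration, not part of the question.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, k = 4, 3, 2  # assumed sizes: A is m x n, x has length n, c has length k
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)
c = rng.standard_normal(k)
x = rng.standard_normal(n)

def f(x):
    """f(x) = ||Ax + b||_2^2 * c, a vector of length k."""
    r = A @ x + b
    return (r @ r) * c

def analytic_jacobian(x):
    """Closed form 2 c (Ax + b)^T A, a k x n matrix."""
    return 2.0 * np.outer(c, (A @ x + b) @ A)

def numeric_jacobian(func, x, h=1e-6):
    """Central-difference Jacobian; column j approximates d func / d x_j."""
    J = np.zeros((func(x).size, x.size))
    for j in range(x.size):
        e = np.zeros_like(x)
        e[j] = h
        J[:, j] = (func(x + e) - func(x - e)) / (2.0 * h)
    return J

J_exact = analytic_jacobian(x)
J_fd = numeric_jacobian(f, x)
max_abs_error = np.max(np.abs(J_exact - J_fd))
print(J_exact.shape, max_abs_error)
```

The Jacobian has one row per entry of $c$ and one column per entry of $x$, which is why $c$ has to sit on the left of the row vector $(Ax + b)^T A$.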

littleO
  • Why the downvote? Is this not correct? – littleO Apr 03 '20 at 05:01
  • Not sure why you were downvoted. But yeah, your answer looks correct so I upvoted you. Maybe we were skimping on the details of matrix calculus? Your edit seems to have addressed it. – Alex Lapanowski Apr 03 '20 at 05:56
  • @AlexLapanowski Yeah I guess there weren't enough details. By the way, I wouldn't call this matrix calculus; I think this is just vector calculus. (If we're taking the derivative of a function $f: \mathbb R^n \to \mathbb R^m$, then I'd say we're doing vector calculus. In my mind, matrix calculus involves functions which take a matrix as input, rather than a vector.) – littleO Apr 03 '20 at 05:58
  • @littleO updated the question with the original problem. – user436661 Apr 03 '20 at 06:33

Okay, as far as I understand, $A$ is a matrix, $b$ is a vector of appropriate size, and $c$ is a vector of arbitrary size. The vector $x$ is the only variable in this equation.

In this case, we can simply compute the derivative $\frac{\partial}{\partial x}\|Ax+b\|^{2}$ and then multiply the result by the column vector $c$.

By the rules of matrix calculus, $$ \frac{\partial}{\partial x}\|Ax+b\|^{2} = \frac{\partial}{\partial x} (x^\top A^\top Ax +2 b^\top Ax+\|b\|^{2})=2A^\top A x+2A^\top b. $$
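A quick numerical check of both the expansion and the resulting gradient, as a rough sketch only. The matrix sizes, the seed, the step size `h`, and the names `g_norm` and `g_expanded` are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 3))  # assumed sizes, chosen arbitrarily
b = rng.standard_normal(5)
x = rng.standard_normal(3)

def g_norm(x):
    """||Ax + b||^2 computed directly."""
    return np.sum((A @ x + b) ** 2)

def g_expanded(x):
    """x^T A^T A x + 2 b^T A x + ||b||^2, the expanded quadratic."""
    return x @ A.T @ A @ x + 2.0 * b @ A @ x + b @ b

# Gradient from the formula, as a column-style vector of length 3.
grad_formula = 2.0 * A.T @ A @ x + 2.0 * A.T @ b

# Central differences along each coordinate direction.
h = 1e-6
grad_fd = np.array([
    (g_expanded(x + h * e) - g_expanded(x - h * e)) / (2.0 * h)
    for e in np.eye(x.size)
])

expansion_gap = abs(g_norm(x) - g_expanded(x))
grad_gap = np.max(np.abs(grad_formula - grad_fd))
print(expansion_gap, grad_gap)
```

Both gaps should be close to zero, up to floating-point rounding.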

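We need to be careful that the dimensions work out. The gradient above is a column vector, so let's make it a row vector by taking the transpose and then multiply on the left by the column vector $c$. The result is a matrix with one row per entry of $c$. Hence, $$ c\bigg(\frac{\partial}{\partial x} \|Ax+b\|^{2} \bigg)^\top = 2c(A^\top A x+A^\top b)^\top = 2c(Ax+b)^\top A. $$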

Perhaps you're unfamiliar with matrix calculus. Here's the Wikipedia article covering it: https://en.wikipedia.org/wiki/Matrix_calculus

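It's just a matter of getting used to the notation. The only tricky result to show is that $\frac{\partial }{\partial x} x^\top Mx = 2Mx$ for any symmetric matrix $M$ and vector $x$ of appropriate dimensions (for a general square $M$ the result is $(M+M^\top)x$). Here $M = A^\top A$, which is symmetric.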
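For reference, here is a component-wise sketch of that identity. The entry notation $M_{ij}$ and the index $k$ are introduced only for this derivation. Writing $x^\top M x = \sum_{i,j} M_{ij} x_i x_j$ and differentiating with respect to a single coordinate $x_k$ gives \begin{align*} \frac{\partial}{\partial x_k} \sum_{i,j} M_{ij} x_i x_j &= \sum_{j} M_{kj} x_j + \sum_{i} M_{ik} x_i \\ &= (Mx)_k + (M^\top x)_k. \end{align*} Stacking the coordinates, the gradient is $(M + M^\top)x$, which reduces to $2Mx$ when $M$ is symmetric, as is the case for $M = A^\top A$.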

Alex Lapanowski