Derivative of $\|Ax+b\|_2^2c$ w.r.t. $x$

Question

What is the derivative of this expression with respect to x? Here, $c$ is a column vector.

$||Ax+b||^2_2c$

I think it has to be of the form $2A^T(Ax+b)c^T$ or $2c(Ax+b)^TA$ but I don't know how to arrive at this answer.

Edit: The original function I was working on is of the form $$f(x)=\frac{||Ax+b||^2_2}{c^Tx+d}$$ and I was trying to derive its Hessian.

Does this answer your question? How to take the gradient of the quadratic form? — Rodrigo de Azevedo, Apr 03 '20 at 06:22
Are you sure? Your question has been asked 100s of times before under different guises. Asking the community to answer the same over and over is a bit silly. — Rodrigo de Azevedo, Apr 03 '20 at 06:35
@user550103 Yes. It's a scalar $d$. And $b$ is a vector. Corrected. — user436661, Apr 03 '20 at 07:04
Ok, then you can use the quotient rule. Let's say $f(x) = \frac{g(x)}{h(x)}$, with $g(x) = \|Ax + b\|_2^2$ and $h(x) = c^Tx + d$. Then $f^\prime(x) = \frac{g^\prime(x)}{h(x)} - \frac{g(x) h^\prime(x)}{h(x)^2}$. The answers below give $g^\prime(x) = 2A^T\left( Ax+b\right)$, and $h^\prime(x) = c$. — user550103, Apr 03 '20 at 07:22
@user550103 gradient was okay for me. Dealing with the dimensions when deriving the Hessian was the problem. — user436661, Apr 06 '20 at 08:19
@user436661 I would suggest to write a new question and clearly ask for Hessian. — user550103, Apr 06 '20 at 08:30
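
For anyone who wants to sanity-check the gradient formula from the comments, here is a minimal NumPy sketch that compares it against central finite differences. The sizes, the random data, and the helper names (`f`, `grad_f`, `grad_fd`) are assumptions made for illustration; $c$, $x$ and $d$ are chosen nonnegative only so that the denominator $c^Tx + d$ stays away from zero.

```python
import numpy as np

def f(x, A, b, c, d):
    """f(x) = ||Ax + b||_2^2 / (c^T x + d), a scalar."""
    r = A @ x + b
    return (r @ r) / (c @ x + d)

def grad_f(x, A, b, c, d):
    """Quotient rule: g'/h - g h'/h^2 with g' = 2 A^T (Ax + b) and h' = c."""
    r = A @ x + b
    g = r @ r
    h = c @ x + d
    return 2.0 * (A.T @ r) / h - g * c / h**2

def grad_fd(x, A, b, c, d, eps=1e-6):
    """Central finite-difference gradient, one coordinate at a time."""
    out = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = eps
        out[i] = (f(x + e, A, b, c, d) - f(x - e, A, b, c, d)) / (2 * eps)
    return out

rng = np.random.default_rng(0)
m, n = 4, 3                      # arbitrary illustrative sizes
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)
c = rng.random(n)                # entries in [0, 1)
x = rng.random(n)                # entries in [0, 1), so c @ x >= 0
d = 1.0                          # keeps the denominator at least 1

err = np.max(np.abs(grad_f(x, A, b, c, d) - grad_fd(x, A, b, c, d)))
print(err)                       # should be tiny (finite-difference noise)
```

The gradient comes out as a vector of the same length as $x$, which is the shape to keep in mind when moving on to second derivatives.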

littleO · Accepted Answer · 2020-04-03T05:56:24.423


Background info: There's a rule for taking derivatives in multivariable calculus which states that if $g:\mathbb R^n \to \mathbb R^m$ is differentiable and $C$ is a $k \times m$ matrix, then the derivative of the function $f(x) = C g(x)$ is $f'(x) = C g'(x)$. We can apply this rule in this problem.

In this case, your function can be written as $$ f(x) = c g(x) $$ where $g(x) = \| Ax + b \|_2^2$. The derivative of $f$ is \begin{align*} f'(x) &= c g'(x) \\ &= 2c(Ax + b)^T A . \end{align*}

Computing the derivative of the function $g(x) = \| Ax + b \|_2^2$ is a question that comes up frequently on this site. I think the most elegant way to do it is using the chain rule. Note that $g(x) = h(u(x))$, where $u(x) = Ax + b$ and $h(v) = \| v \|_2^2$. The derivatives of $u$ and $h$ are $u'(x) = A$ and $h'(v) = 2v^T$. So by the chain rule $$ g'(x) = h'(u(x)) u'(x) = 2(Ax + b)^T A. $$
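
As a quick numerical check of the result $f'(x) = 2c(Ax + b)^T A$, here is a minimal NumPy sketch that compares the closed form with a central finite-difference Jacobian. The dimensions, the random data, and the helper names (`f`, `jacobian_formula`, `jacobian_fd`) are assumptions chosen for illustration, not part of the problem statement.

```python
import numpy as np

def f(x, A, b, c):
    """f(x) = ||Ax + b||_2^2 * c, mapping R^n to R^k."""
    r = A @ x + b
    return (r @ r) * c

def jacobian_formula(x, A, b, c):
    """Closed form 2 c (Ax + b)^T A, a k x n matrix."""
    r = A @ x + b
    return 2.0 * np.outer(c, r @ A)   # r @ A is the row vector (Ax + b)^T A

def jacobian_fd(x, A, b, c, eps=1e-6):
    """Central finite-difference Jacobian, built one column at a time."""
    n = x.size
    cols = []
    for i in range(n):
        e = np.zeros(n)
        e[i] = eps
        cols.append((f(x + e, A, b, c) - f(x - e, A, b, c)) / (2 * eps))
    return np.stack(cols, axis=1)

rng = np.random.default_rng(0)
m, n, k = 4, 3, 2                 # arbitrary illustrative sizes
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)
c = rng.standard_normal(k)
x = rng.standard_normal(n)

J = jacobian_formula(x, A, b, c)
J_fd = jacobian_fd(x, A, b, c)
print(J.shape)                    # one row per entry of c, one column per entry of x
print(np.max(np.abs(J - J_fd)))   # should be tiny (finite-difference noise)
```

The shape of `J` shows why $c$ has to sit on the left: a $k \times 1$ column times the $1 \times n$ row $g'(x)$ gives the $k \times n$ derivative.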

edited Apr 03 '20 at 05:56

answered Apr 03 '20 at 04:46


Why the downvote? Is this not correct? – littleO Apr 03 '20 at 05:01
Not sure why you were downvoted. But yeah, your answer looks correct so I upvoted you. Maybe we were skimping on the details of matrix calculus? Your edit seems to have addressed it. – Alex Lapanowski Apr 03 '20 at 05:56
@AlexLapanowski Yeah I guess there weren't enough details. By the way, I wouldn't call this matrix calculus; I think this is just vector calculus. (If we're taking the derivative of a function $f: \mathbb R^n \to \mathbb R^m$, then I'd say we're doing vector calculus. In my mind, matrix calculus involves functions which take a matrix as input, rather than a vector.) – littleO Apr 03 '20 at 05:58
@littleO updated the question with the original problem. – user436661 Apr 03 '20 at 06:33

Alex Lapanowski · Answer 2 · 2020-04-03T05:32:34.330

Okay, as far as I understand, $A$ is a matrix, $b$ is a vector of appropriate size, and $c$ is a column vector of arbitrary size. The vector $x$ is the only variable in this expression.

In this case, we can simply compute the derivative $\frac{\partial}{\partial x}\|Ax+b\|^{2}$ and then multiply the result by the column vector $c$.

By the rules of matrix calculus, $$ \frac{\partial}{\partial x}\|Ax+b\|^{2} = \frac{\partial}{\partial x} (x^\top A^\top Ax +2 b^\top Ax+\|b\|^{2})=2A^\top A x+2A^\top b. $$

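We need to be careful that the dimensions work out. The gradient above is a column vector with one entry per entry of $x$, so let's transpose it into a row vector and multiply by the column vector $c$ on the left. Hence, $$ c\,\bigg(\frac{\partial}{\partial x} \|Ax+b\|^{2} \bigg)^\top = 2c\,(A^\top A x+A^\top b)^\top, $$ which is a matrix with one row per entry of $c$ and one column per entry of $x$.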

Perhaps you're unfamiliar with matrix calculus. Here's the Wikipedia article covering it: https://en.wikipedia.org/wiki/Matrix_calculus

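It's just a matter of getting used to the notation. The only tricky result to show is that $\frac{\partial }{\partial x} x^\top Mx = (M + M^\top)x$ for any square matrix $M$ and vector $x$ of matching dimension. This reduces to $2Mx$ when $M$ is symmetric, which is the case here since $M = A^\top A$.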
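
The quadratic-form identity is easy to check numerically. The sketch below uses small hand-picked matrices (an assumption for illustration, as are the helper names `quad` and `grad_quad`). For a general square $M$ the gradient of $x^\top M x$ is $(M + M^\top)x$, and it coincides with $2Mx$ when $M$ is symmetric, as $A^\top A$ always is.

```python
import numpy as np

def quad(x, M):
    """Quadratic form x^T M x."""
    return x @ M @ x

def grad_quad(x, M):
    """Gradient of x^T M x for a general square M: (M + M^T) x."""
    return (M + M.T) @ x

def grad_fd(x, M, eps=1e-6):
    """Central finite-difference gradient, one coordinate at a time."""
    out = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = eps
        out[i] = (quad(x + e, M) - quad(x - e, M)) / (2 * eps)
    return out

x = np.array([1.0, 1.0])

# Non-symmetric example: 2 M x is NOT the gradient here.
M = np.array([[1.0, 2.0],
              [0.0, 1.0]])
print(grad_quad(x, M))   # (M + M^T) x = [4, 4]
print(2 * M @ x)         # 2 M x       = [6, 2]
print(grad_fd(x, M))     # agrees with (M + M^T) x up to rounding

# Symmetric example: S = A^T A, where both expressions coincide.
A = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])
S = A.T @ A
print(np.allclose(grad_quad(x, S), 2 * S @ x))
```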

I just updated the question with the original problem. – user436661 Apr 03 '20 at 06:23
