Product rule for vector-valued functions

Question

I'm trying to wrap my head around how to apply the product rule to vector-valued or matrix-valued functions.

Specifically, I'm trying to work through how to apply the product rule to $$x^TAx = f(x)g(x)$$ where $f(x) = x^T$, $g(x)=Ax$, $x\in\mathbb{R}^N$, and $A\in \mathbb{R}^{N\times N}$.

I know that $\nabla_x x^TAx = (A + A^T)x$ or $x^T(A + A^T)$, depending on the layout. I'm just using this as an example to see whether I can get the same result with the product rule.

This question explains it for scalar-valued functions as $$f(x)\nabla_x g(x)+g(x)\nabla_x f(x).$$

However, the dimensions don't work out when I plug the values into the expression above. As Travis wrote in a comment below, we should have:

$$ \nabla_x(x^TAx) = (\nabla_x x^T)Ax + x^T\nabla_x(Ax) $$

However, that still leaves an $x$ in the first term and an $x^T$ in the second. I don't see how the two terms conform, or how they combine to give $(A + A^T)x$ or $x^T(A + A^T)$.

This question is essentially asking the same thing, but the answer doesn't really involve the product rule above. I figure there must be some general formula to apply, as with scalar-valued functions.

Am I writing the product rule correctly in this case? Is there something I'm missing or doing incorrectly?

EDIT:

Building off of Algebraic Pavel's answer... I think the problem is that you have to formulate the functions $f(x)$ and $g(x)$ so they're in the same space.

That is, for $f,g:\mathbb{R}^N\rightarrow \mathbb{R}^M$, the product rule is:

$$\nabla_x (f(x)^Tg(x)) = f(x)^T\nabla_x g(x) + g(x)^T \nabla_x f(x)$$

So in the example above, if we let $f(x) = x$, $g(x)=Ax$, then the formula holds.
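
To spell that check out, assuming (as in Algebraic Pavel's answer) that $\nabla_x f$ denotes the matrix with entries $\partial f_i/\partial x_j$, so that $\nabla_x x = I$ and $\nabla_x (Ax) = A$:

$$ f(x)^T\nabla_x g(x) + g(x)^T\nabla_x f(x) = x^T A + (Ax)^T I = x^T A + x^T A^T = x^T(A + A^T). $$

Both terms are $1\times N$ row vectors, so the sum conforms, and transposing gives the column form $(A + A^T)x$.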

As another example, consider $$Axx^T$$ and let $f(x) = x^T A^T$ and $g(x) = x^T$. We have both $f,g:\mathbb{R}^{N\times 1} \rightarrow \mathbb{R}^{1\times N}$ and

$$\nabla_x (f(x)^Tg(x)) = \nabla_x (Axx^T) = Ax + xA^T$$

which holds. Notice that if we took $f(x) = Ax$ instead of $f(x) = (Ax)^T$, the rule would fall apart.

I still don't know whether this holds in all instances, though. Any counterexamples?

The rule is formally the same as for scalar-valued functions, so that $$\nabla_X (x^T A x) = (\nabla_X x^T) A x + x^T \nabla_X(A x) .$$ We can then apply the product rule to the second term again. NB if $A$ is symmetric we can simplify the final expression using $\nabla_X (x^T) = (\nabla_X x)^T$. – Travis Willse, May 04 '18 at 13:39
But doesn't that still leave you with an $x^T$ in one expression and an $x$ in another? I'm just not seeing how they conform... but I know I'm clearly missing something. We know that the answer is $(A + A^T)x$. – measure_theory, May 04 '18 at 13:42
Your notation is rather misleading, especially using $X$ in place of $x$ as the direction of differentiation. I see now that you're asking about another quantity altogether. – Travis Willse, May 04 '18 at 14:51
There is a very general rule for the differential of a product $$d(A\star B)=dA\star B + A\star dB$$ where $\star$ is any kind of product (matrix, Hadamard, Frobenius, Kronecker, dyadic, etc.) and the quantities $(A,B)$ can be scalars, vectors, matrices, or tensors. There is no general rule for the gradient of a product. – greg, May 04 '18 at 17:58
In order to make sense of a direct application of the product rule as you're trying to do, you first have to define what it means to apply $\nabla$ to a vector. What do you mean by it in your equations? – amd, May 04 '18 at 20:46

Algebraic Pavel · Answer 1 · 2018-05-04T15:31:21.440

6

It all depends on the conventions you use. Working out the derivative of the product component by component shows that in this case $$ \tag{1} \nabla_x[f(x)^Tg(x)]=f(x)^T\nabla_xg(x)+g(x)^T\nabla_x f(x). $$ So with $f(x):=x$ and $g(x):=Ax$, for which $\nabla_x f=I$ and $\nabla_x g=A$, we have $$ \nabla_x(x^TAx)=x^TA+x^TA^T=x^T(A+A^T). $$

If $f,g:\mathbb{R}^n\to\mathbb{R}^m$, then $$ \frac{\partial}{\partial x_j}f^Tg= \frac{\partial}{\partial x_j}\sum_{i=1}^mf_ig_i= \sum_{i=1}^m\left(f_i\frac{\partial g_i}{\partial x_j}+g_i\frac{\partial f_i}{\partial x_j}\right). $$ So defining $\nabla_x f$ as the $m\times n$ matrix $$ \nabla_x f=\left(\frac{\partial f_i}{\partial x_j}\right)_{ij} $$ gives (1), with both sides read as $1\times n$ row vectors.
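
As a quick numerical sanity check of (1), here is a minimal NumPy sketch. The helper `numerical_jacobian`, the random test matrix, the step size, and the tolerance are illustrative assumptions rather than part of the derivation; the Jacobian is approximated by central differences under the convention $(\nabla_x f)_{ij}=\partial f_i/\partial x_j$.

```python
import numpy as np


def numerical_jacobian(func, x, eps=1e-6):
    """Central-difference Jacobian with entries d func_i / d x_j (shape m x n)."""
    x = np.asarray(x, dtype=float)
    fx = np.atleast_1d(func(x))
    jac = np.zeros((fx.size, x.size))
    for j in range(x.size):
        step = np.zeros_like(x)
        step[j] = eps
        forward = np.atleast_1d(func(x + step))
        backward = np.atleast_1d(func(x - step))
        jac[:, j] = (forward - backward) / (2 * eps)
    return jac


# Hypothetical test data: a random (generally non-symmetric) A and a random x.
rng = np.random.default_rng(0)
n = 4
A = rng.standard_normal((n, n))
x = rng.standard_normal(n)

f = lambda v: v          # f(x) = x
g = lambda v: A @ v      # g(x) = A x
h = lambda v: f(v) @ g(v)  # h(x) = f(x)^T g(x) = x^T A x

# Left-hand side of (1): gradient of the scalar h as a 1 x n row vector.
lhs = numerical_jacobian(h, x)

# Right-hand side of (1): f^T (grad g) + g^T (grad f), each a length-n row.
rhs = f(x) @ numerical_jacobian(g, x) + g(x) @ numerical_jacobian(f, x)

# Closed form x^T (A + A^T).
closed_form = x @ (A + A.T)

print(np.allclose(lhs.ravel(), rhs, atol=1e-5))
print(np.allclose(rhs, closed_form, atol=1e-5))
```

Because $A$ is drawn at random, it is almost surely not symmetric, so the check exercises both the $x^TA$ and the $x^TA^T$ terms rather than the symmetric special case.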

edited May 04 '18 at 15:31

answered May 04 '18 at 15:11


So in general is the product rule then $u(x)\nabla_x v(x) + [\nabla_x u(x)v(x)]^T$? I guess I'm just kind of confused why we're taking the transpose in the second expression. – measure_theory May 04 '18 at 15:18
@measure_theory I've added some details. – Algebraic Pavel May 04 '18 at 15:32
Thanks, I'm still a bit confused about the general rule, though. See my edit above. – measure_theory May 04 '18 at 17:13
