4

I was messing around with the various representations of the Jacobian and noticed that it can apparently be written as follows:

$$\mathbf{J} = \mathbf{f}\nabla^T$$

where $\mathbf{f}$ is a multivariate vector function:

$$\mathbf{f}(x, y, z) = (f(x, y, z), g(x, y, z), h(x, y, z))$$

This expression for the Jacobian would expand to the following:

$$\mathbf{J} = \begin{bmatrix}f \\ g \\ h\end{bmatrix}\begin{bmatrix} \frac{\partial}{\partial x} & \frac{\partial}{\partial y} & \frac{\partial}{\partial z} \end{bmatrix}$$

This seems to work, but surprisingly, I can't find reference to this form anywhere. Does this work or is there some technicality I'm missing? If it does work, why is it never used?
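For concreteness, here is a minimal SymPy sketch of the matrix I expect $\mathbf f\nabla^T$ to produce, i.e. the ordinary Jacobian (the particular $f$, $g$, $h$ below are arbitrary stand-ins, not anything canonical):

```python
import sympy as sp

x, y, z = sp.symbols('x y z')

# Arbitrary stand-ins for (f, g, h); any smooth functions would do.
F = sp.Matrix([x*y, sp.sin(y)*z, x + z**2])

# Entry (i, j) of the Jacobian is dF_i/dx_j.
J = F.jacobian([x, y, z])
sp.pprint(J)
```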

  • The Jacobian of what exactly? – Ninad Munshi Jul 26 '23 at 02:17
  • The notation $\mathbf{J} = \mathbf{f}\nabla^T$ is never used because it is extremely confusing. You are free to rigorously introduce it before using it but as your reader I would ask myself every time "why"? – Kurt G. Jul 26 '23 at 08:12
  • How can you possibly write $f\frac{\partial}{\partial x}$ for $\frac{\partial}{\partial x}f$? – Ted Shifrin Jul 26 '23 at 23:03
  • The expression that you want is usually written as $J=\big(\nabla f\big)^T$ – greg Jul 29 '23 at 10:48

3 Answers

1

As @ZainJabbar alluded to, $\nabla$ differentiates whatever stands to its right, so instead of the Jacobian, your expression gives a "rescale-$\mathbf f$-by-the-divergence-of-a-second-vector-field" operator: $(\mathbf f\nabla^T)\mathbf g = \mathbf f\,(\nabla\cdot\mathbf g)$. Nevertheless, the Jacobian can be written as a Kronecker product:

$$J=\nabla^T \otimes \mathbf f$$
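As a sanity check, here is a minimal SymPy sketch (with an arbitrary test field) of the column structure this Kronecker product describes: block-column $j$ is $\frac{\partial}{\partial x_j}$ applied to all of $\mathbf f$, so the operator stays to the left of what it differentiates:

```python
import sympy as sp

x, y, z = sp.symbols('x y z')
F = sp.Matrix([x*y, sp.sin(y)*z, x + z**2])  # arbitrary test field

# nabla^T (Kronecker) F: column j of the product is (d/dx_j) applied to F,
# so in every entry the operator acts rightward, as convention demands.
K = sp.Matrix.hstack(F.diff(x), F.diff(y), F.diff(z))

assert K == F.jacobian([x, y, z])  # this is exactly the Jacobian
sp.pprint(K)
```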

Localth
  • Differentiating to the right is nothing more than a (very poor) convention. As long as we understand that $\mathbf f$ is being differentiated, then OP's expression is 100% correct. – Nicholas Todoroff Jul 26 '23 at 06:26
  • @NicholasTodoroff Not convinced... any sources? I'd say the convention serves a required grammatical purpose; otherwise $f(x)=f\,x=x\,f=x(f)$. Sure, if a new convention were explicitly stated for an exposition then that's fine, but then you'd have to redefine other syntactic rules. – Localth Jul 26 '23 at 06:50
  • What are you "not convinced" of, and what sort of sources are you looking for? I.e. what is it you want me to show you? – Nicholas Todoroff Jul 26 '23 at 14:43
  • @Localth It can also be written as $J^T = \nabla\otimes \mathbf f^T$ – greg Jul 29 '23 at 10:57
1

This is correct so long as it's understood that $\nabla$ is differentiating $\mathbf f$.

To elaborate a bit more: the Jacobian is the matrix representation of the total differential $\mathrm d\mathbf f$ of $\mathbf f$. It turns out that the total differential is given by the directional derivatives $D_{\mathbf h}$: $$ \mathrm d\mathbf f(\mathbf h) = D_{\mathbf h}\mathbf f. $$ However, the operator $D_{\mathbf h}$ is precisely $\mathbf h\cdot\nabla$ (with $\mathbf h$ undifferentiated): $$ \mathrm d\mathbf f(\mathbf h) = (\mathbf h\cdot\nabla)\mathbf f. $$ In fact, this can be taken as a definition of $\nabla$ (insofar as it gives us the gradient of a scalar function).

Now, if we represent vectors as column matrices, then the dot product is $$ \mathbf h\cdot\nabla = \mathbf h^T\nabla = \nabla^T\mathbf h, $$ where again we must note that $\mathbf h$ is undifferentiated. Let us put dots on $\dot\nabla$ and $\dot{\mathbf f}$ to make it clear that $\nabla$ differentiates only $\mathbf f$. Then, because $\nabla^T\mathbf h$ is scalar-like, $$ \mathrm d\mathbf f(\mathbf h) = (\dot\nabla^T\mathbf h)\dot{\mathbf f} = \dot{\mathbf f}(\dot\nabla^T\mathbf h) = (\dot{\mathbf f}\dot\nabla^T)\mathbf h, $$ so we see directly that $\dot{\mathbf f}\dot\nabla^T$ is the matrix representation of $\mathrm d\mathbf f$, making it by definition the Jacobian.
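For what it's worth, here is a small SymPy check of that last identity on an arbitrary test field: $(\mathbf h\cdot\nabla)\mathbf f$, with $\mathbf h$ held constant, agrees with the Jacobian acting on $\mathbf h$ (the field and symbols below are just illustrative choices):

```python
import sympy as sp

x, y, z = sp.symbols('x y z')
h1, h2, h3 = sp.symbols('h1 h2 h3')          # constant (undifferentiated) h
F = sp.Matrix([x*y, sp.sin(y)*z, x + z**2])  # arbitrary test field

# (h . nabla) F, where nabla differentiates only F:
directional = h1*F.diff(x) + h2*F.diff(y) + h3*F.diff(z)

# The Jacobian acting on h as a plain matrix-vector product:
Jh = F.jacobian([x, y, z]) * sp.Matrix([h1, h2, h3])

assert sp.simplify(directional - Jh) == sp.zeros(3, 1)
```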

0

Aloha,

The symbol $\nabla = \begin{bmatrix} \frac{\partial}{\partial x} \\ \frac{\partial}{\partial y} \\ \frac{\partial}{\partial z} \end{bmatrix}$ is known as a vector differential operator: a kind of function which takes in one function and returns another function.

Because $\nabla$ is a function, the notational rules around "$\nabla$" are the same as for functions. Consider the familiar function $\sin$: the expression $x \sin$ is different from $\sin x$. In the case of $\nabla$, this means that writing a function to the left ($f \nabla$) is understood as multiplying $\nabla$ by $f$, while writing the function to the right ($\nabla f$) is understood as applying $\nabla$ to $f$.

Let us do an example with the one-dimensional derivative. The expression

$$ \frac{d}{dx} x $$

is the differential operator $ \frac{d}{dx} $ applied to the function $x$, resulting in the constant function $1$.

Writing in a different order gives,

$$ x \frac{d}{dx} $$

This is now another differential operator: an object which takes in a function and returns $x$ times the derivative of that function. Here are some examples. I will use the notation

$$\text{Input} \mapsto \text{Output}$$

So,

$$ x \mapsto x \frac{d}{dx}(x) = x \cdot 1 = x $$ $$ x^3 \mapsto x \frac{d}{dx}(x^3) = x \cdot (3x^2) = 3x^3 $$ $$ e^x \mapsto x \frac{d}{dx}(e^x) = x \cdot (e^x) = x e^x $$
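If it helps, here are the same three examples in SymPy, with the operator $x\frac{d}{dx}$ implemented as an honest function-to-function map (a small sketch, assuming SymPy is available):

```python
import sympy as sp

x = sp.symbols('x')

# "x d/dx": differentiate the input, then scale by x.
op = lambda u: x * sp.diff(u, x)

assert op(x) == x                    # x * 1
assert op(x**3) == 3*x**3            # x * 3x^2
assert op(sp.exp(x)) == x*sp.exp(x)  # x * e^x
```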

We now note that we lose "commutativity" in this way of writing down functions: writing things in a different order implies a different operation. Let us do the multiplication and see what kind of operator we get.

$$\begin{bmatrix} f \\ g \\ h \end{bmatrix} \begin{bmatrix} \frac{\partial}{\partial x} & \frac{\partial}{\partial y} & \frac{\partial}{\partial z} \end{bmatrix} = \begin{bmatrix} f \frac{\partial}{\partial x} & f \frac{\partial}{\partial y} & f \frac{\partial}{\partial z} \\ g \frac{\partial}{\partial x} & g \frac{\partial}{\partial y} & g \frac{\partial}{\partial z} \\ h \frac{\partial}{\partial x} & h \frac{\partial}{\partial y} & h \frac{\partial}{\partial z} \end{bmatrix}$$

Wanna have a go at some function and multiply this matrix out? Keep in mind our discussion about multiplication order.
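If you want to check your work, here is a small SymPy sketch of what this operator matrix does: entry $(i,j)$ applied to a scalar test function $u$ gives $f_i\,\frac{\partial u}{\partial x_j}$, which is not the Jacobian entry $\frac{\partial f_i}{\partial x_j}$ (the functions below are arbitrary stand-ins):

```python
import sympy as sp

x, y, z = sp.symbols('x y z')
F = sp.Matrix([x*y, sp.sin(y)*z, x + z**2])  # arbitrary stand-ins for (f, g, h)
u = x*y*z                                    # an arbitrary test function
X = [x, y, z]

# Entry (i, j) of the operator matrix is F_i * d/dx_j.  Applied to u it
# gives F_i * du/dx_j -- compare with the Jacobian entry dF_i/dx_j.
applied = sp.Matrix(3, 3, lambda i, j: F[i] * sp.diff(u, X[j]))
sp.pprint(applied)
```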

Here is something very cool that you have discovered though! Consider the product of two matrices with real elements, $A, B$. Then (assuming the sizes match up)

$$ (AB)^T = B^T A^T $$

For example,

$$ \left( \begin{bmatrix} 1 & 1 & 1 \\ 2 & 3 & 4 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} \right)^T = \begin{bmatrix} 1 \\ 2 \end{bmatrix}^T = \begin{bmatrix} 1 & 2 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \end{bmatrix} \begin{bmatrix} 1 & 2 \\ 1 & 3 \\ 1 & 4 \end{bmatrix} $$

And this is an honest theorem (which you may prove). Why does it work here? It's because the order of writing products inside the matrix does not matter. That is, real number multiplication is commutative ($2 \cdot 3 = 3 \cdot 2$). This is in contrast to the derivative above.
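The numerical example above can be checked in a couple of lines of NumPy, purely as a sanity check:

```python
import numpy as np

A = np.array([[1, 1, 1],
              [2, 3, 4]])
B = np.array([[1],
              [0],
              [0]])

# (AB)^T equals B^T A^T, matching the worked example.
assert np.array_equal((A @ B).T, B.T @ A.T)
```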

Let me know what I may clarify / correct. I like that you are playing around with the math and demonstrating working knowledge with the material. Keep it up!

  • Differentiating to the right is nothing more than a (very poor) convention. As long as we understand that $\mathbf f$ is being differentiated, then OP's expression is 100% correct. – Nicholas Todoroff Jul 26 '23 at 06:26
  • For sure! This is definitely an important point to make clear and I am glad you mention this. This is part of why I used the phrase "writing ... is understood as ...". The above discussion is all based on notation and standard convention. This is likely why OP has not found this form in any textbooks and why it "has never been used". – Zain Jabbar Jul 26 '23 at 06:37
  • @NicholasTodoroff I agree that it is a convention. Why is it so much poorer than the convention that fixes OP's very confusing notation? – Kurt G. Jul 27 '23 at 08:32
  • @KurtG. The reason we can get away with differentiate-to-the-right (DttR) in the scalar case is because we are dealing with only one product (multiplication) and that product is commutative. This is not so with general vector algebra, and is exactly the reason you see the warning that $\nabla$ cannot be treated like a vector. If you remove DttR, then $\nabla$ provably satisfies all (linear) vector identities. – Nicholas Todoroff Jul 27 '23 at 17:07
  • I try to exposit on the general idea here, though it could be improved; I should probably at least add proofs for the chain rule and "subexpression rule". I've applied this viewpoint to many questions on this site, a recent example is here. – Nicholas Todoroff Jul 27 '23 at 17:07
  • In this answer I make use of the "cross divergence" $(\dot f\cdot\check\nabla)(\dot\nabla\cdot\check g)$ which I believe is impossible to express with DttR without resorting to partial derivatives and components. – Nicholas Todoroff Jul 27 '23 at 17:07
  • @NicholasTodoroff. Thanks for getting back. Can the vector identities $\nabla$ satisfies be described a bit more clearly? This one, $\mathbf{A}\times(\nabla \times \mathbf A)= \frac{1}{2}\nabla(\mathbf A \cdot \mathbf A)-(\mathbf A\cdot \nabla)\mathbf A$, is not "Grassmann like" according to a comment I made. I am wondering how to spot this quickly. – Kurt G. Jul 27 '23 at 17:26
  • @KurtG. This is actually the identity covered by my second link. My first link covers what I mean in detail, but the short of it is that the true expression is $$A\times(\dot\nabla\times\dot A) = (A\cdot\dot A)\dot\nabla - (A\cdot\dot\nabla)\dot A.$$ We keep track of the $A$ which is differentiated via $\dot A$, and when we do this the vector identity applies. – Nicholas Todoroff Jul 27 '23 at 17:45
  • To formalize this, for fixed $y$ define $$L_x(w) = A(y)\times(w\times A(x))$$ and $$M_x(w) = (A(y)\cdot A(x))w - (A(y)\cdot w)A(x).$$ Obviously $L = M$, and the equation in my previous comment is exactly $L_x(\nabla) = M_x(\nabla)$ for all $y$, particularly $y = x$. The key definition is this $L_x(\nabla)$. It is in this sense that I truly mean all linear vector identities are satisfied by $\nabla$. By "linear identity" I mean precisely that $L$ and $M$ are linear in $w$. – Nicholas Todoroff Jul 27 '23 at 17:45