Partial derivative of a vector with respect to a vector

Question

Let $\mathbf{p}(t) = [p_1(t),...,p_n(t)]^{T}$ and $\mathbf{q}(t) = [q_1(t),...,q_n(t)]^{T}$, where $p_i, q_i : \Bbb{R} \to \Bbb{R}$ are "nice" functions (probably differentiable). How exactly is $\frac{\partial}{\partial \mathbf{q}}(\mathbf{p})$ defined? I know it should be a matrix. Ultimately, I am trying to prove that $\frac{\partial}{\partial \mathbf{q}}(\alpha \mathbf{q}) = \alpha I_n$, where $\alpha \in \Bbb{R}$ is a constant and $I_n$ is the $n \times n$ identity matrix, and that $\frac{\partial}{\partial \mathbf{q}}(C \mathbf{q}) = C$, where $C$ is an $n \times n$ matrix over $\Bbb{R}$.

I found this, and so I guessed that the $(i,j)$-th entry of $\frac{\partial}{\partial \mathbf{q}}(\mathbf{p})$ should be $\frac{\partial}{\partial q_j}(p_i)$. If that is the case, I can sort of see why $\frac{\partial}{\partial \mathbf{q}}(\alpha \mathbf{q}) = \alpha I_n$ is true. But I don't know how to make sense of the notation $\frac{\partial}{\partial q_j}$. Are we taking the partial derivative with respect to a function instead of a variable? I guess this is relevant. In that MSE post, they are trying to compute $\frac{dg}{df}$, and they use Leibniz notation to make sense of it. But what does, e.g., $\frac{dx}{df}$ even mean?

StiftungWarentest · Accepted Answer · 2023-02-02T21:09:17.093

You're mixing two different kinds of differentiation.

Given a scalar function, let's say $f:\mathbb{R}^2\longrightarrow \mathbb{R}, (x_1,x_2)\mapsto f(x_1,x_2)$, the partial derivative $\frac{\partial f}{\partial x_1}$ measures the change of $f$ in the $x_1$-coordinate, i.e. in the direction of the first basis vector $e_1$. It is thus more accurately written as $\frac{\partial f}{\partial e_1}$ (but tbh hardly anyone uses that notation). For any vector $v = (v_1,v_2)\in \mathbb{R}^2$ you can measure the change of $f$ in that direction: it is $\frac{\partial f}{\partial v} = v_1\frac{\partial f}{\partial e_1}+ v_2\frac{\partial f}{\partial e_2}$ (or $v_1\frac{\partial f}{\partial x_1}+ v_2\frac{\partial f}{\partial x_2}$ in the more common notation). In this notation $e_1$ and $e_2$ (resp. $x_1$ and $x_2$) denote the direction in which we measure change. In that sense we always have $\frac{\partial x_2}{\partial x_1} = 0$, since $x_2$ does not change in the $e_1$-direction.

Now if for a coordinate vector $\mathbf{x} = (x_1,x_2)$ we write $\frac{\partial f}{\partial \mathbf{x}}$, that usually denotes the gradient $(\frac{\partial f}{\partial x_1},\frac{\partial f}{\partial x_2})^T$. And if the function has multiple components $f_1,\dots,f_n$, then $\frac{\partial \mathbf{f}}{\partial \mathbf{x}}$ is the matrix (called the Jacobi matrix) whose $i$th row is the gradient of $f_i$, written as a row. In that sense you're correct that the $(i,j)$-th entry of $\frac{\partial}{\partial \mathbf{q}}(\mathbf{p})$ is $\frac{\partial}{\partial q_j}(p_i)$, and you will also easily be able to show that $\frac{\partial}{\partial \mathbf{q}}(C\mathbf{q}) = C$ and so on. However, in this interpretation it makes no sense for $q_j$ to be a function; it needs to be a coordinate instead.
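
To spell out the index computation behind that claim, here is a short sketch. It assumes $q_1,\dots,q_n$ are independent coordinates, so that $\frac{\partial q_k}{\partial q_j} = \delta_{kj}$ (the Kronecker delta, a symbol introduced here for illustration): \begin{align*} \left(\frac{\partial (C\mathbf{q})}{\partial \mathbf{q}}\right)_{i,j} = \frac{\partial}{\partial q_j}\sum_{k=1}^{n} C_{ik}\, q_k = \sum_{k=1}^{n} C_{ik}\,\delta_{kj} = C_{ij} \end{align*} Taking $C = \alpha I_n$ then gives $\frac{\partial}{\partial \mathbf{q}}(\alpha \mathbf{q}) = \alpha I_n$ as a special case.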

Now if instead you were given two functions $f,g:\mathbb{R}\longrightarrow \mathbb{R}$, then by the chain rule we can calculate \begin{align*} \frac{df(x)}{dg(x)}\frac{dg(x)}{dx} = \frac{df(x)}{dx} \end{align*} and thus, wherever $\frac{dg(x)}{dx} \neq 0$, \begin{align*} \frac{df(x)}{dg(x)} = \frac{df(x)}{dx}\left(\frac{dg(x)}{dx}\right)^{-1} \end{align*} In that sense you could compute $\frac{\partial p_i(t)}{\partial q_j(t)}$ to be \begin{align*} \frac{\partial p_i(t)}{\partial q_j(t)} = \frac{\partial p_i(t)}{\partial t}\left(\frac{\partial q_j(t)}{\partial t}\right)^{-1} \end{align*} although writing $\partial$ here is considered inaccurate notation. Since this is not a directional derivative in the above sense, you really should be writing $\frac{d p_i(t)}{d q_j(t)}$ instead.

You are mixing both concepts, which is confusing to me and, it seems, also to yourself. You can of course attempt to define $\frac{\partial \mathbf{p}}{\partial \mathbf{q}}$ as \begin{align*} \left(\frac{\partial \mathbf{p}}{\partial \mathbf{q}}\right)_{i,j} = \frac{d p_i(t)}{d q_j(t)} \end{align*} but this does not have any interpretation known to me, and in particular the identities you want to show need not hold under this definition.
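
To see concretely how the identities can fail under that entry-wise definition, here is a minimal Python sketch. The curve $\mathbf{q}(t) = (t, 2t, t^2)$, the evaluation point $t = 1.5$, and the helper name `quotient_matrix` are all hypothetical choices made for illustration; they do not come from the question.

```python
def quotient_matrix(dq_dt, t):
    """Return the matrix whose (i, j) entry is (dq_i/dt) / (dq_j/dt) at time t.

    This is the chain-rule quotient d q_i / d q_j. It is undefined
    (ZeroDivisionError) whenever some dq_j/dt vanishes at t.
    """
    rates = [d(t) for d in dq_dt]
    return [[ri / rj for rj in rates] for ri in rates]


# Hypothetical curve q(t) = (t, 2t, t**2); derivatives written out by hand.
dq_dt = [lambda t: 1.0, lambda t: 2.0, lambda t: 2.0 * t]

M = quotient_matrix(dq_dt, t=1.5)
for row in M:
    print(row)
```

The diagonal entries are all $1$, but the off-diagonal entries are generally nonzero, so with $\mathbf{p} = \alpha\mathbf{q}$ this construction does not give $\alpha I_n$. Only the Jacobi-matrix reading, with the $q_j$ as independent coordinates, yields the identity.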

This is as much as I can tell you based on the information you gave me. Hope that helps :)

Yes, this does help. But now I realize I am confused by something. As mentioned, one thing I am trying to show is that $\frac{\partial}{\partial \mathbf{q}}(\alpha q) = \alpha I_n$. Since $\frac{\partial q_i}{\partial q_i} = \frac{\partial q_i}{\partial t} \cdot \left(\frac{\partial q_i}{\partial t} \right)^{-1} = 1$, it is easy to see that the $i$-th main diagonal entry of $\frac{\partial}{\partial \mathbf{q}}(\alpha q)$ is $\alpha$. But I don't see why the off-diagonal entries all have to be $0$. What if the $q_i = q_j$ for some $i \neq j$? — user193319, Feb 02 '23 at 19:20
Also, this doesn't seem very well-defined if any of the entries of $\mathbf{q} = [q_1(t),...,q_n(t)]^{T}$ are identically zero, right? There must be more assumptions on $\mathbf{q}$ in order for $\frac{\partial}{\partial \mathbf{q}}(\alpha q) = \alpha I_n$ to be true...unless I'm being a knucklehead. — user193319, Feb 02 '23 at 19:21
@user193319 You seem to be confused about the concept of directional derivatives, maybe read into that once more : https://en.wikipedia.org/wiki/Directional_derivative — StiftungWarentest, Feb 02 '23 at 19:25
Hmm...Maybe...I'm just wondering why $i \neq j$ implies $\frac{\partial q_i}{\partial q_j} = \frac{\partial q_i}{\partial t} \cdot \left( \frac{\partial q_j}{\partial t} \right)^{-1} = 0$? I didn't really see anything on that wiki page which could answer that question. — user193319, Feb 02 '23 at 19:36
