$\newcommand{\bs}[1]{\boldsymbol{#1}}$ $\newcommand{\xx}[0]{\boldsymbol{x}}$ $\newcommand{\XX}[0]{\boldsymbol{X}}$ $\newcommand{\pderiv}[2]{\frac{\partial{#1}}{\partial{#2}}}$

Hi! I've come across a mathematical contradiction which seems to imply that there is something lacking in my understanding of interchangeability of derivatives when it comes to tensors.

Say we have a 1st order tensor $\xx$ that depends on $\XX$ (also 1st order) and $t$:

$$\xx = \xx(\XX,t)$$

Are derivatives of $\xx$ with respect to $\xx$ itself and $t$ then interchangeable? That is, is the following true?

$$\pderiv{}{\xx}\left(\pderiv{\xx}{t}\right) = \pderiv{}{t}\left(\pderiv{\xx}{\xx}\right)$$

If they aren't, can someone please explain why not? If they are, then how can the following contradiction be explained?

Choose $\xx = t^2\XX$

$$\pderiv{\xx}{t} = 2t\XX = \frac{2}{t}\xx \quad \Rightarrow \quad \pderiv{}{\xx}\left(\pderiv{\xx}{t}\right) = \frac{2}{t}\bs{I}$$

But

$$\pderiv{}{t}\left(\pderiv{\xx}{\xx}\right) = \pderiv{}{t}\left(\bs{I}\right) = \bs{0}$$

(EDIT: The same problem obviously occurs with scalars)

andreasdr

1 Answer

Your expectation that these should commute may stem from the ambiguous notation we use for partial derivatives, which marks only what's being varied, not what's being held fixed. Note that you're taking the derivative w.r.t. $t$ while holding $X$ fixed and the derivative w.r.t. $x$ while holding $t$ fixed, so these are two partial derivatives belonging to different coordinate systems, $(X,t)$ and $(x,t)$, and we have no reason to expect them to commute. See also https://math.stackexchange.com/q/51955/

P.S.: The reason that partial derivatives within one and the same coordinate system usually commute is basically that

$$ \begin{eqnarray} && \left(\frac{\left(f(x+\Delta x,y+\Delta y)-f(x+\Delta x,y)\right)}{\Delta y}-\frac{\left(f(x,y+\Delta y)-f(x,y)\right)}{\Delta y}\right)/\Delta x\\ &=& \left(\frac{\left(f(x+\Delta x,y+\Delta y)-f(x,y+\Delta y)\right)}{\Delta x}-\frac{\left(f(x+\Delta x,y)-f(x,y)\right)}{\Delta x}\right)/\Delta y\;. \end{eqnarray} $$

This works because you reach the same point whether you first move by $\Delta x$ keeping $y$ fixed and then move by $\Delta y$ keeping $x$ fixed or vice versa, whereas this is usually not the case if the two steps are along axes in different coordinate systems.
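To see the identity above at work, here is a minimal numerical sketch. The test function `f`, the evaluation point and the step sizes are arbitrary choices made for illustration, not anything taken from the question. Within a single coordinate system $(x,y)$, the two orders of taking difference quotients agree up to rounding, because both reduce to the same four-point combination of function values.

```python
import math

def f(x, y):
    # Arbitrary smooth test function (an assumption for illustration).
    return math.sin(x) * math.exp(y) + x * y**2

def diff_y_then_x(f, x, y, dx, dy):
    # Difference quotient in y (at x+dx and at x), then difference quotient in x.
    at_x_plus = (f(x + dx, y + dy) - f(x + dx, y)) / dy
    at_x = (f(x, y + dy) - f(x, y)) / dy
    return (at_x_plus - at_x) / dx

def diff_x_then_y(f, x, y, dx, dy):
    # Difference quotient in x (at y+dy and at y), then difference quotient in y.
    at_y_plus = (f(x + dx, y + dy) - f(x, y + dy)) / dx
    at_y = (f(x + dx, y) - f(x, y)) / dx
    return (at_y_plus - at_y) / dy

x0, y0, dx, dy = 0.3, 0.7, 1e-3, 2e-3
a = diff_y_then_x(f, x0, y0, dx, dy)
b = diff_x_then_y(f, x0, y0, dx, dy)

# Analytic mixed derivative of this particular f, for comparison.
exact = math.cos(x0) * math.exp(y0) + 2 * y0

print(a, b, exact)
```

Both `a` and `b` approximate the mixed second derivative, and they coincide with each other far more closely than either matches `exact`, since the agreement between them is algebraic rather than a matter of small step sizes.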

For $x=x(X,t)$ and $X=X(x,t)$, the general formula for transforming partial derivatives yields, with a vertical bar marking the variable held fixed for first derivatives, and all second derivatives referring to the coordinates $(X,t)$,

$$ \begin{eqnarray} \left.\frac{\partial}{\partial t}\right|_X \left.\frac{\partial}{\partial x}\right|_t &=& \left.\frac{\partial}{\partial t}\right|_X\left(\left.\frac{\partial t}{\partial x}\right|_t\left.\frac{\partial}{\partial t}\right|_X+\left.\frac{\partial X}{\partial x}\right|_t\left.\frac{\partial}{\partial X}\right|_t\right) \\ &=& \left.\frac{\partial}{\partial t}\right|_X\left(\left.\frac{\partial X}{\partial x}\right|_t\left.\frac{\partial}{\partial X}\right|_t\right) \\ &=& \left(\left.\frac{\partial}{\partial t}\right|_X\left.\frac{\partial X}{\partial x}\right|_t\right) \left.\frac{\partial}{\partial X}\right|_t + \left.\frac{\partial X}{\partial x}\right|_t \frac{\partial^2}{\partial t\partial X}\; \end{eqnarray} $$

(since $\partial t/\partial x|_t=0$), whereas

$$ \begin{eqnarray} \left.\frac{\partial}{\partial x}\right|_t \left.\frac{\partial}{\partial t}\right|_X &=& \left(\left.\frac{\partial t}{\partial x}\right|_t\left.\frac{\partial}{\partial t}\right|_X+\left.\frac{\partial X}{\partial x}\right|_t\left.\frac{\partial}{\partial X}\right|_t\right)\left.\frac{\partial}{\partial t}\right|_X \\ &=& \left(\left.\frac{\partial X}{\partial x}\right|_t\left.\frac{\partial}{\partial X}\right|_t\right)\left.\frac{\partial}{\partial t}\right|_X \\ &=& \left.\frac{\partial X}{\partial x}\right|_t \frac{\partial^2}{\partial X\partial t} \\ &=& \left.\frac{\partial X}{\partial x}\right|_t \frac{\partial^2}{\partial t\partial X}\;. \end{eqnarray} $$

So commuting these two partial derivatives from different coordinate systems leads to an additional term

$$ \left.\frac{\partial}{\partial t}\right|_X \left.\frac{\partial}{\partial x}\right|_t - \left.\frac{\partial}{\partial x}\right|_t \left.\frac{\partial}{\partial t}\right|_X = \left(\left.\frac{\partial}{\partial t}\right|_X\left.\frac{\partial X}{\partial x}\right|_t\right) \left.\frac{\partial}{\partial X}\right|_t $$

proportional to a first derivative, which arises because the rate of change of $X$ with $x$ changes with $t$ (which is of course not the case in a single coordinate system, where the rate of change of $x$ with $x$ is constant, namely $1$). Indeed, in your case, where you apply the mixed derivative to $x$ itself, we have

$$ \begin{eqnarray} \left(\left.\frac{\partial}{\partial t}\right|_X\left.\frac{\partial X}{\partial x}\right|_t\right) \left.\frac{\partial}{\partial X}\right|_t x &=& \left(\left.\frac{\partial}{\partial t}\right|_Xt^{-2}\right) \left.\frac{\partial}{\partial X}\right|_t t^2X \\ &=& -2t^{-3}\cdot t^2 \\ &=& -\frac2t\;, \end{eqnarray} $$

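in agreement with your calculation, where the two orderings gave $0$ and $\frac2t$ respectively, so that their difference is $0-\frac2t=-\frac2t$.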
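The result can also be checked numerically for the scalar version of the example, $x=t^2X$. The sketch below works in the coordinates $(X,t)$ throughout; the helper names `d_dt_fixed_X` and `d_dx_fixed_t` are hypothetical, and the derivative w.r.t. $x$ at fixed $t$ is computed via the chain rule as $t^{-2}\,\partial/\partial X|_t$, which assumes $t\neq0$.

```python
def x_of(X, t):
    # The example from the question, scalar version: x = t^2 X.
    return t**2 * X

def d_dt_fixed_X(g, h=1e-3):
    # Central difference in t with X held fixed.
    return lambda X, t: (g(X, t + h) - g(X, t - h)) / (2 * h)

def d_dx_fixed_t(g, h=1e-3):
    # At fixed t, dx = t^2 dX, so d/dx|_t = (1/t^2) d/dX|_t (assumes t != 0).
    return lambda X, t: (g(X + h, t) - g(X - h, t)) / (2 * h) / t**2

X0, t0 = 1.5, 2.0

# d/dt|_X applied to (dx/dx|_t): the inner derivative is 1, so this is ~0.
t_after_x = d_dt_fixed_X(d_dx_fixed_t(x_of))(X0, t0)

# d/dx|_t applied to (dx/dt|_X): the inner derivative is 2tX = (2/t) x.
x_after_t = d_dx_fixed_t(d_dt_fixed_X(x_of))(X0, t0)

commutator = t_after_x - x_after_t
print(t_after_x, x_after_t, commutator, -2 / t0)
```

Up to rounding, `t_after_x` vanishes, `x_after_t` equals $2/t_0$, and `commutator` equals $-2/t_0$, which is the extra first-derivative term derived above.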

joriki
  • Thanks! That shed some light on things. – andreasdr Jul 21 '11 at 15:08
  • ...and your edit even more so, thanks! I'm still wondering about the following however, where does it come from? $\left.\frac{\partial}{\partial x}\right|_t = \left.\frac{\partial t}{\partial x}\right|_t\left.\frac{\partial}{\partial t}\right|_X+\left.\frac{\partial X}{\partial x}\right|_t\left.\frac{\partial}{\partial X}\right|_t$ – andreasdr Jul 23 '11 at 10:46
  • @andreasdr: This is just the standard chain rule for a change of variables from $(x,t)$ to $(X,t)$ (see e.g. http://en.wikipedia.org/wiki/Chain_rule#The_chain_rule_in_higher_dimensions, especially the last formula before the example); it may look unfamiliar because $t$ occurs in two different roles; if you write it out for $(x,t)$ and $(X,T)$ instead, it might ring a bell. – joriki Jul 23 '11 at 11:14
  • Got it, thanks! As you suspected, it was the term involving $\partial t/\partial x|_t$ that confused me, but now I see why it would appear in the case of four independent variables. – andreasdr Jul 23 '11 at 12:40
  • Lemme just write it out for future reference :) $\left.\frac{\partial}{\partial T}\right|_X \left.\frac{\partial}{\partial x}\right|_t = \left.\frac{\partial}{\partial T}\right|_X\left(\left.\frac{\partial T}{\partial x}\right|_t\left.\frac{\partial}{\partial T}\right|_X+\left.\frac{\partial X}{\partial x}\right|_t\left.\frac{\partial}{\partial X}\right|_T\right)$ – andreasdr Jul 25 '11 at 12:23