
I'm having a hard time understanding why \begin{equation}\frac{\partial x}{\partial y}\frac{\partial y}{\partial z}\frac{\partial z}{\partial x}=-1 \tag{1}\label{1} \end{equation} Wikipedia provides this derivation.
The proof starts by stating that there is a function $f$ such that $f(x,y,z)=0$ and that $z$ can be made a function of $x,y$. Furthermore, it states that one can find a curve along which $dz=0$ and $y$ is a function of $x$, so that the differential of $z$ can be written in terms of the differential of $x$ as $$dz=\frac{\partial z}{\partial x}dx+\frac{\partial z}{\partial y}\frac{\partial y}{\partial x}dx$$
The rest follows naturally from setting $dz=0$ and multiplying some partial derivatives by their inverses.
I have two problems with this proof:

1. Chain rule

The first one is that, since \begin{equation} \tag{2} \label{2} \frac{\partial z}{\partial x}=-\frac{\partial z}{\partial y}\frac{\partial y}{\partial x} \end{equation} the chain rule would seem to give $$\frac{\partial z}{\partial x}=-\frac{\partial z}{\partial x}$$ which would force this partial derivative to be zero. However, if this is true, \ref{1} yields $0=-1$. Is the chain rule not valid in this case? If so, why?

2. Inverse of the partials
The second is that the last step implicitly obtains \ref{1} by multiplying by the inverse of the right-hand side of \ref{2}. I thought the relationship $$\frac{\partial y}{\partial x}=\frac{1}{\frac{\partial x}{\partial y}}$$ was not true in general, as pointed out in this post. Is it true in this case? And if so, why?
Also, if that really is the case, then using the chain rule again yields, from \ref{1}, $$\frac{\partial x}{\partial z}\frac{\partial z}{\partial x}=-1 \iff 1=-1$$ What am I doing wrong?

  • Your main problem is lack of context. $x,y,z$ do not automatically have any relationships, so those partial derivatives do not have any meaning unless you set something up. And if (for example) $z$ is assumed to be a function of $x,y$, then, still, $\partial x/\partial z$ has no meaning, without further assumptions. You are in a situation where the natural symbol heuristics are misleading, unfortunately... Can you give your larger context? – paul garrett Jan 09 '22 at 23:15
  • The context is that of the Wikipedia demonstration I linked at the top of the question. Maybe I should edit my post to make it clearer? But anyway, the context is: given $f(x,y,z)=0$, consider a path along which, writing $z$ as a function of $x$ and $y$, $dz=0$. They further parametrize $y$ by $x$ and write out the differential of $z$ in the basis of the differential of $x$, from which everything else follows. – Lourenco Entrudo Jan 09 '22 at 23:21
  • Ah, yes, please do insert the context. The style on this site, for better or worse, is to ignore links... – paul garrett Jan 09 '22 at 23:22
  • Suppose $x+y+z=0$. Then, informally, ${\partial z\over\partial x}=-1$, ${\partial z\over\partial y}=-1$, and ${\partial y\over\partial x}=-1$, so you see that (2) is correct, and what you are calling the chain rule doesn't apply. – Gerry Myerson Jan 09 '22 at 23:22
  • I suggest you read this post and this post instead of Wikipedia. :) – Ted Shifrin Jan 09 '22 at 23:28
  • @GerryMyerson But why doesn't the chain rule apply? If I have a differentiable function $z(y,x)$ and a parametrization $y=\lambda(x)$ then shouldn't the partial derivative, with respect to $x$, of $z \circ \lambda(x)$, be $\frac{\partial z}{\partial y}\frac{\partial y}{\partial x}$ (noting that the last partial derivative is the same as $\lambda'(x)$)? – Lourenco Entrudo Jan 09 '22 at 23:37
  • @TedShifrin thank you very much. I understand that a careful approach using the implicit function theorem yields the correct results; however, I am still confused about why the "usual" chain rule does not apply (up to a -1). What is going on here? Is it because we're "performing calculus on a manifold"? If so, how exactly does that affect the chain rule in question? In light of my earlier response to Gerry's comment, would you consider writing an answer which clears up that conceptual struggle I'm having? P.S. I love your lessons on YouTube :) I started learning real analysis because of them – Lourenco Entrudo Jan 10 '22 at 00:01
  • The point is the one I made in the posts I linked. You have to pay attention to what variables are fixed when you take partial derivatives. If you have variables $x,y,r,\theta$ (or more), only if you fix all the remaining variables can you say $\frac{\partial x}{\partial\theta}\frac{\partial \theta}{\partial x} = 1$. The usual situation is that we have $x=x(r,\theta)$ and $\theta=\theta(x,y)$. If you compute the derivative of $\theta$ fixing $r$ rather than $y$, then it will work! Similarly, your two $\partial z/\partial x$ mean totally different things. Write it out very carefully! – Ted Shifrin Jan 10 '22 at 01:07
  • Here and here I've answered questions from other users which are more concrete applications of the abstract question you've asked; hopefully seeing these reinforces the point @Ted is making. Our notation lacks vitally important extra information, namely the ambient variables being held fixed (in physics these variables are included as subscripts). – Ninad Munshi Jan 10 '22 at 01:26
  • $z=z(x,y)$ is not the same thing as $f(x,y,z)=0$, so different rules apply. – Gerry Myerson Jan 10 '22 at 01:32
  • The problem is that in partial derivatives, the notation is problematic. $\partial z$ is not a unique term (nor is $\partial x$ or $\partial y$). The total identity of the partial differential is based on what's in the denominator. One option (which makes more sense of what you are seeing) is to translate $\frac{\partial z}{\partial x}$ into $\frac{\partial_x z}{dx}$, and recognize that $dz = \partial_x z + \partial_y z$. This will make the algebra more direct. – johnnyb Jan 10 '22 at 01:52
  • See my "Exploring Alternative Notations for Partial Differentials" for more information - https://www.researchgate.net/publication/334121766_Exploring_Alternate_Notations_for_Partial_Differentials – johnnyb Jan 10 '22 at 01:53
  • @johnnyb thank you. I had no idea that partial differentials could be used to improve notation – Lourenco Entrudo Jan 10 '22 at 02:48
  • The appearance of a $(-1)$ here really reminds me of the situation where you have a $3\times 3$ grid of chain complexes, and chain maps (down and right) that make all $6$ rows and columns into short exact sequences in every degree. Then you can get from bottom right to top left via a horizontal snake map, followed by a vertical snake map, or a vertical snake map, followed by a horizontal snake map. These two compositions differ by a factor $-1$: defying the usual "every diagram commutes" mantra, in the same way this triple product identity defies the notation. – tkf Jan 10 '22 at 08:58
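
The point made in the comments about which variables are held fixed can be checked numerically. The following is only a rough sketch using polar coordinates $x=r\cos\theta$, $y=r\sin\theta$; the sample point and the helper name `central_diff` are illustrative assumptions, not something taken from the thread.

```python
import math

def central_diff(func, t, h=1e-6):
    """Central finite-difference estimate of func'(t) (hypothetical helper)."""
    return (func(t + h) - func(t - h)) / (2 * h)

# Arbitrary sample point, chosen with 0 < theta < pi so that acos below is valid.
r0, theta0 = 2.0, 0.7
x0, y0 = r0 * math.cos(theta0), r0 * math.sin(theta0)

# dx/dtheta with r held fixed, from x = r cos(theta)
dx_dtheta_fixed_r = central_diff(lambda th: r0 * math.cos(th), theta0)

# dtheta/dx with y held fixed, from theta = atan2(y, x)
dtheta_dx_fixed_y = central_diff(lambda x: math.atan2(y0, x), x0)

# dtheta/dx with r held fixed, from theta = acos(x / r)
dtheta_dx_fixed_r = central_diff(lambda x: math.acos(x / r0), x0)

# Different variables held fixed: the product is sin(theta)^2, not 1.
mismatched = dx_dtheta_fixed_r * dtheta_dx_fixed_y
# Same variable (r) held fixed in both factors: the product is 1.
matched = dx_dtheta_fixed_r * dtheta_dx_fixed_r

print(mismatched, math.sin(theta0) ** 2)
print(matched)
```

The two symbols written $\partial\theta/\partial x$ are different functions here, which is why only the second product collapses to $1$.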

2 Answers


At least as a place-holder, in line with @GerryMyerson's apt comment:

Yes, there is an appealing heuristic that suggests that ${\partial z\over \partial x}={\partial z\over \partial y}{\partial y\over \partial x}$, ... and such things.

In a different universe, it might not matter that these named variables were related by $f(x,y,z)=0$ or $z=f(x,y)$ or some other relation. But, in our universe, this does have some relevance.

Even in a simpler situation, $f(x,y)=0$, whether or not we rename $f$ to $z$, a person might imagine that (via some sort of implicit function theorem, making $y$ a function of $x$) ${\partial y\over \partial x}={\partial f\over \partial x}/{\partial f\over \partial y}$... but that's off by a sign!?!?! :)

Careful application of the chain rule corrects the sign. :)

EDIT: When $y$ is (locally) defined as a function of $x$ by a relation $f(x,y)=0$, differentiating this with respect to $x$ gives $$ 0 \;=\; f_1(x,y)\cdot {dx\over dx} + f_2(x,y)\cdot {dy\over dx} \;=\; f_1(x,y)+f_2(x,y){dy\over dx} $$ where $f_i$ is the partial derivative of $f$ with respect to the $i$-th argument. This gives $$ {dy\over dx} \;=\; -{f_1(x,y)\over f_2(x,y)} $$ If we somewhat-abuse notation by thinking that $f_1=f_x$ and $f_2=f_y$, then this would be $$ {dy\over dx} \;=\; -{{\partial f\over \partial x}\over {\partial f\over \partial y}} $$ which does not give the expected heuristic outcome, being off by a sign. :)
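
As a quick sanity check of the sign, here is a small numerical sketch. The relation $f(x,y)=x^2+y^2-1$, the upper branch $y=\sqrt{1-x^2}$, the sample point, and the helper `central_diff` are all assumptions chosen for illustration; they are not part of the argument above.

```python
import math

def central_diff(func, t, h=1e-6):
    """Central finite-difference estimate of func'(t) (hypothetical helper)."""
    return (func(t + h) - func(t - h)) / (2 * h)

def f(x, y):
    # Example relation f(x, y) = 0: the unit circle.
    return x**2 + y**2 - 1.0

def y_of_x(x):
    # y defined locally as a function of x on the upper half of the circle.
    return math.sqrt(1.0 - x**2)

x0 = 0.6
y0 = y_of_x(x0)

f1 = central_diff(lambda x: f(x, y0), x0)  # partial of f in its first argument
f2 = central_diff(lambda y: f(x0, y), y0)  # partial of f in its second argument
dy_dx = central_diff(y_of_x, x0)           # slope of the implicitly defined curve

with_sign = -f1 / f2   # what implicit differentiation gives
heuristic = f1 / f2    # what "cancelling the partial f" would suggest

print(dy_dx, with_sign, heuristic)
```

At this point the slope agrees with $-f_1/f_2$ and differs from the naive quotient $f_1/f_2$ exactly by the sign.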

paul garrett
  • Could you clarify how "careful application of the chain rule" corrects the sign? – Lourenco Entrudo Jan 09 '22 at 23:40
  • Thank you :) However, I still feel a bit uneasy, as I previously explained in my comments to Ted and Gerry. I feel like thinking "heuristically" should yield the right results, and I'm still struggling with why it doesn't. Why is the situation in question different from just having a function $z(x,y)$ and a parametrization $\lambda(x)=y(x)$? I mean, I know what is different, the relationship $f(x,y,z)=0$, but what I mean is: why does having this constraint alter a chain rule which should be general for any function and parametrization (given they are differentiable)? – Lourenco Entrudo Jan 10 '22 at 00:13
  • @LourencoEntrudo, I think I do understand/sympathize with your question/concern... as this was a fact that disturbed me years ago, as well. Apparently, this is a fairly extreme case of a symbol-oriented heuristic being significantly "off" from the truth. "Only by a sign", ... :) In general, we would indeed want our notation to suggest only what is true, but this is not universally possible... apparently! :) – paul garrett Jan 10 '22 at 00:55
  • It is not so much a matter of notation as it is a matter of understanding why the derivative of the composition of two functions is not given by the chain rule – Lourenco Entrudo Jan 10 '22 at 01:08
  • @LourencoEntrudo, the derivative of the composition of two functions is given by the chain rule, but we can accidentally choose a notation that presents functions misleadingly. – paul garrett Jan 10 '22 at 19:07

It has come to me with all your help that my confusion was only a matter of damned notation and that in fact the chain rule is not broken in the implicit function theorem. It all boils down to what the original Wikipedia article calls a partial derivative and the way I also thought of it. For me, partial differentiation is always when only one parameter is free to move. It is the natural way of defining it in real analysis. Everything else is just a derivative of the composition of a function with a parametrization (what the physicists like to call the "total derivative"). And that is the derivative that is being used and that eluded me. So as an exercise, and to check that I have understood everything, I will answer my own post and "correct" the statement of the triple product.
Let $f=f(x,y,z)$, $z=z(x,y)$ and $y=y(x)$. Consider the differential of $z$, $$dz=\frac{\partial z}{\partial x}dx+\frac{\partial z}{\partial y}dy$$ This is the definition of the differential. But now, when considering the differential of $y$, we shouldn't write $dy=\frac{\partial y}{\partial x}dx$, but $dy=\frac{d y}{d x}dx$. This is the notation I was used to, and it indicates that $x$ is not the only variable free to vary: we are taking the derivative of $y=y(x,z)$ composed with $z=z(x)$.
We can even see the relationship of this derivative with $\frac{\partial y}{\partial x}$ by applying the chain rule: $\frac{dy}{dx}=\frac{\partial y}{\partial x}+\frac{\partial y}{\partial z}\frac{dz}{dx}$. This makes sense: $y$ is a function of $x$, but also of $z$ (were it not the case, $\frac{\partial y}{\partial z}$ in the triple product wouldn't even make sense). Then we write $$dz=\frac{\partial z}{\partial x}dx+\frac{\partial z}{\partial y}\frac{dy}{dx}dx$$ Moving now along a path where $dz=0$, $$\frac{\partial z}{\partial x}=-\frac{\partial z}{\partial y}\frac{dy}{dx} \tag{3} \label{3}$$
Now, to address my (1). Of course the right-hand side is not $\frac{\partial z}{\partial x}$. $\frac{\partial z}{\partial x}$ would be $\frac{\partial z}{\partial y}\frac{\partial y}{\partial x}$, where the difference between $\frac{\partial y}{\partial x}$ and $\frac{dy}{dx}$ has been discussed above.
To address (2): the relationship is not true in general, but in this case, differentiating the function $f$ with respect to $y$ and $x$ and comparing the two will show that $\frac{\partial z}{\partial x}\frac{\partial x}{\partial z}=1$, and so we write the "corrected" formula $$-1=\frac{\partial x}{\partial z}\frac{\partial z}{\partial y}\frac{dy}{dx}$$ I hate thermodynamics.
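
To convince myself numerically, here is a minimal sketch that checks the triple product, the reciprocal relation, and the "corrected" formula on one concrete example. The relation $xy-z=0$, the sample point, and the helper `central_diff` are assumptions picked only for illustration.

```python
def central_diff(func, t, h=1e-6):
    """Central finite-difference estimate of func'(t) (hypothetical helper)."""
    return (func(t + h) - func(t - h)) / (2 * h)

# Example relation f(x, y, z) = x*y - z = 0, so z = x*y, x = z/y, y = z/x.
x0, y0 = 2.0, 3.0
z0 = x0 * y0

# Each partial derivative holds the remaining third variable fixed.
dx_dy_fixed_z = central_diff(lambda y: z0 / y, y0)
dy_dz_fixed_x = central_diff(lambda z: z / x0, z0)
dz_dx_fixed_y = central_diff(lambda x: x * y0, x0)
triple_product = dx_dy_fixed_z * dy_dz_fixed_x * dz_dx_fixed_y  # approximately -1

# Reciprocal relation with the SAME variable (y) held fixed in both factors.
dx_dz_fixed_y = central_diff(lambda z: z / y0, z0)
reciprocal_check = dz_dx_fixed_y * dx_dz_fixed_y  # approximately 1

# dy/dx along the curve dz = 0, i.e. along z = z0 where y = z0 / x.
dz_dy_fixed_x = central_diff(lambda y: x0 * y, y0)
dy_dx_along_curve = central_diff(lambda x: z0 / x, x0)
corrected = dx_dz_fixed_y * dz_dy_fixed_x * dy_dx_along_curve  # approximately -1

print(triple_product, reciprocal_check, corrected)
```

In this example $\frac{dy}{dx}$ along the curve is just $\frac{\partial y}{\partial x}$ with $z$ held fixed, which is how the "corrected" formula and \ref{1} end up saying the same thing.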

  • Historical footnote: The mathematisation of thermodynamics with all its partial derivatives is due in great part to Poincaré (a mathematician of the years 1880-1910 you may know) – Jean Marie Jan 10 '22 at 07:57
  • Besides, a similar question here with an interesting answer. – Jean Marie Jan 10 '22 at 08:01
  • I really don't like the notation of a partial derivative where you allow more than one variable to move. It's automatically misleading because there seems to be no sense in which you can do it precisely without relating the variables in question, in which case it is better represented by a one-dimensional derivative, where you compose the function of several variables with a parametrization of every variable (except the one with respect to which you're differentiating) in terms of the variable you're differentiating with respect to. Honestly I might write a separate post to discuss that – Lourenco Entrudo Jan 10 '22 at 13:08