Why is this a fake proof?

Question

I am aware of the "definition" of the total differential as follows:

$$\mathrm{d}f = \frac{\partial f}{\partial x} \mathrm{d}x + \frac{\partial f}{\partial y} \mathrm{d} y.$$

Now, assume we wished to show that:

$$\frac{\mathrm{d}y}{\mathrm{d}x} = - \frac{\partial f / \partial x}{\partial f / \partial y}$$

for a level curve of $f$. Now, I thought one could accomplish this as follows: For a level curve, $f(x, y) = c$, we have that $\mathrm{d}f = 0$, so:

$$\frac{\partial f}{\partial x} \mathrm{d}x + \frac{\partial f}{\partial y} \mathrm{d} y = 0$$

and so therefore:

$$\frac{\mathrm{d} y}{\mathrm{d} x} = - \frac{\partial f / \partial x}{\partial f / \partial y}$$

by simple manipulation. I was told that this isn't rigorous, and furthermore, that it's entirely incorrect in nature. Now, I understand that it's nonrigorous, because it uses $dx$ and $dy$ loosely. I'm sure there is some way to make the above treatment of differentials rigorous, though it probably involves concepts that I haven't learned yet. What is the logical flaw in this "proof"?

Differentials (which are a type of infinitesimal) are rigorous as hyperreal numbers. Calculus (and analysis as a whole) with hyperreal numbers is known as non-standard analysis. — Jeevan Devaranjan, Dec 02 '15 at 01:01
For this to count as a "proof" of anything, you have to define what "$dx$" and "$dy$" are; otherwise you're just pushing around meaningless symbols. There are various definitions you could make, but none of them make every step of this argument valid. To name a specific "logical flaw", you have to say exactly what definition you're using. — Eric Wofsey, Dec 02 '15 at 01:01
The process implies that there are quantities $dy$ and $dx$ that, when divided, give you $dy/dx$. This is impossible at face value, but there are ways to make sense of this manipulation. — Ben Grossmann, Dec 02 '15 at 01:06
@Omnomnomnom How should I define $dx$ and $dy$ so that this would work? I learned that they are just "infinitesimal" changes, but this doesn't really mean much to me, and it certainly doesn't let me "prove" anything. "Impossible" manipulations such as the above using differentials work very well intuitively (like in the above "proof"), so I am sure there must be a better definition than "little change in x". Thinking of $\frac{dy}{dx}$ as $dy$ divided by $dx$ often seems to lead to valid results elsewhere too. — ra1nmaster, Dec 02 '15 at 01:07
On a curve on which $f$ is constant, the tangent vector at a given point is perpendicular to the gradient of $f$ there. Therefore the slope of the tangent vector is the negative reciprocal of the slope of the gradient of $f$. That is all the formula says. — Ian, Dec 02 '15 at 01:18
One way to make the derivation rigorous is to parametrize the curve, obtaining $\frac{df}{dt} = \frac{\partial f}{\partial x} \frac{dx}{dt} + \frac{\partial f}{\partial y} \frac{dy}{dt} = 0$. (You have a one parameter parametrization because you have a one dimensional surface). This equation literally just says that the tangent to the curve is perpendicular to the gradient of $f$. Proceeding this way reduces the problem to showing $\frac{\frac{dy}{dt}}{\frac{dx}{dt}} = \frac{dy}{dx}$, which is a combination of the chain rule and the inverse function theorem. — Ian, Dec 02 '15 at 01:22
The use of differential notation hints at the fact that this result doesn't depend on how we pick the parametrization (so in a sense the variable $t$ is a "nuisance variable" that we don't really care about very much). — Ian, Dec 02 '15 at 01:23
@Ian Isn't that last equation just a form of the chain rule? As in:
$\frac{dy}{dt} \cdot \frac{dt}{dx} = \frac{dy}{dx}.$

I understand that. But I don't quite see where the $t$ came from. And why is the surface one-dimensional? It seems two-dimensional to me. — ra1nmaster, Dec 02 '15 at 01:26
$\frac{1}{\frac{dx}{dt}} = \frac{dt}{dx}$ is the inverse function theorem. Then you use the chain rule after that. Just look at a prototypical example to see why the surface is one dimensional: $f(x,y)=x^2+y^2=1$. This is the unit circle, which is one dimensional (locally it looks like a line). — Ian, Dec 02 '15 at 01:27
@Ian Ok I see what you meant. I was a little confused at first. But can't I see your equation from this:
$$\nabla f \cdot \frac{\mathrm{d}\mathbf{r}}{\mathrm{d}t} = 0$$ where $\mathbf{r}$ is the parametrization of the curve you described. This works because the gradient is perpendicular to $\frac{\mathrm{d}\mathbf{r}}{\mathrm{d}t}.$

And then I would just use the dot product to get that equation? — ra1nmaster, Dec 02 '15 at 01:33
@Omnomnomnom This might be a little off topic, but since in my question, $dx$ and $dy$ are just mere meaningless symbols, I'm interested in knowing why we are allowed to make statements like: $\mathrm{d}y = f'(x) \mathrm{d}x$ or:
$\mathrm{d}f = \frac{\partial f}{\partial x} \mathrm{d}x + \frac{\partial f}{\partial y}\mathrm{d}y.$

Clearly, if these kinds of equations are going to be taught, they ought to be given a meaning. One way I see it is that $dy$ gives the linear change in $f(x)$ along a tangent line given a change in $x$ ($dx$), but this definition doesn't seem to hold in, say, integrals. — ra1nmaster, Dec 02 '15 at 02:28
@ra1nmaster I'd tend to say that all of that is really a way of encoding the chain rule — Ben Grossmann, Dec 02 '15 at 03:05
@ra1nmaster Also, before trying to find the derivative, we need to show that it exists. Once you know it exists, a similar argument using the chain rule derives the above result. — happymath, Dec 02 '15 at 04:31
The point is not whether this proof is rigorous or not. It explains intuitively why that formula is correct, so it's perfectly valid in that sense. I don't think it's right to call it "fake" or "incorrect". — , Dec 02 '15 at 21:55
@Omnomnomnom What's an example of $x(t), y(t)$ for which this holds? $$\frac{d^2 y}{dx^2} \ne \frac{d^2 y/dt^2}{\left(dx/dt\right)^2}$$ — , Dec 02 '15 at 21:58
An example where dividing $dy$ and $dx$ can lead to the wrong results: see this question — Ben Grossmann, Dec 03 '15 at 10:05
@NotNotLogical wait... miscalculated... I may just be wrong here — Ben Grossmann, Dec 03 '15 at 10:07

score 3 · Accepted Answer · answered Dec 02 '15 at 04:54

$\mathrm dx $ and $\mathrm dy$ can't be taken as meaningless symbols. If they were, how could you talk about what it means to divide them (or even add them)? How could you talk about $\mathrm df$ as a linear combination of them, unless that also makes it a meaningless symbol?

No, these are well-defined quantities called differential forms.

A differential $k$-form is a function of the coordinate position as well as a function of $k$ vectors. $\mathrm dx$ and $\mathrm dy$ are differential 1-forms, and the equation you were given for the total differential can be seen as defining $\mathrm df$ as a differential 1-form in terms of $\mathrm dx$ and $\mathrm dy$:

$$\begin{align*}\mathrm df&: \mathbb R^2 \times \mathbb R^2 \to \mathbb R\\ \mathrm df(r,a) &= \partial_1 f(r) \mathrm dx(r, a) + \partial_2 f(r) \mathrm dy(r,a)\end{align*}$$

This is a very verbose and explicit way of writing the total differential. I'm being extremely, extremely pedantic here: I'm not even calling $\partial_1 f$ by $\partial f/\partial x$ because, from a strict mathematics perspective, $f$ is a function of a vector argument, and the components of that argument (the coordinates) need not be called $x,y$.
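To make this concrete, here is a small worked instance. It assumes the standard coordinate 1-forms, which simply pick out the components of the vector argument, and it borrows $f(x,y) = x^2 + y^2$ from the unit-circle example in the comments. The component names $a_1, a_2, r_1, r_2$ and this choice of $f$ are illustrative assumptions, not part of the definition itself.

$$\begin{align*}\mathrm dx(r,a) &= a_1, \qquad \mathrm dy(r,a) = a_2, \qquad r = (r_1, r_2),\ a = (a_1, a_2),\\ \mathrm df(r,a) &= \partial_1 f(r)\, a_1 + \partial_2 f(r)\, a_2 = 2 r_1 a_1 + 2 r_2 a_2 \quad \text{for } f(r) = r_1^2 + r_2^2.\end{align*}$$

Read this way, $\mathrm df(r,a)$ is just the directional derivative $\nabla f(r) \cdot a$ of $f$ at the point $r$ along the vector $a$.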

When we consider a level curve of $f$, that means we have some curve $C: \mathbb R \to \mathbb R^2$ such that $(f \circ C)(t) = K$ for some constant $K$. Taking a derivative of this function yields, using the chain rule,

$$\begin{align*}0 &= (f \circ C)'(t) \\ &= \mathrm df(C(t), C'(t)) \\ &= [(\partial_1 f) \circ C](t) \mathrm dx(C(t), C'(t)) + [(\partial_2 f) \circ C](t) \mathrm dy(C(t), C'(t))\end{align*}$$

Again, being painfully explicit here. Most people would not even bother writing the function $C$ in here, and the derivative $C'$ would be considered implied.

Now at this point, wherever the denominators are nonzero, you can write

$$\frac{\mathrm dy(C(t), C'(t))}{\mathrm dx(C(t), C'(t))} = - \frac{[(\partial_1 f)\circ C](t)}{[(\partial_2 f) \circ C](t)}$$

That, however, is far from considering $y$ a function of $x$ and taking a derivative.
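As a sanity check, and only as an illustration, take the unit circle mentioned in the comments: $f(x,y) = x^2 + y^2$ with the parametrization $C(t) = (\cos t, \sin t)$, reading $\mathrm dx$ and $\mathrm dy$ as the first and second components of the vector argument. These choices are assumptions made for the example.

$$\begin{align*}C'(t) &= (-\sin t, \cos t),\\ \mathrm df(C(t), C'(t)) &= 2\cos t \cdot (-\sin t) + 2 \sin t \cdot \cos t = 0,\\ \frac{\mathrm dy(C(t), C'(t))}{\mathrm dx(C(t), C'(t))} &= \frac{\cos t}{-\sin t} = -\frac{2\cos t}{2 \sin t} = -\frac{\partial_1 f(C(t))}{\partial_2 f(C(t))} \quad (\sin t \neq 0).\end{align*}$$

Both sides here are functions of the parameter $t$; no function $y(x)$ has entered the picture.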

Now, what some people might do is define the level curve such that $C$ takes one of the coordinates (like $x$) and spits out the coordinate pair $(x,y)$ that corresponds to a point on the curve. When this is done, $C(x) = (x, Y(x))$ and $C'(x) = (1, Y'(x))$ for some function $Y$. Moreover, $\mathrm dx$ and $\mathrm dy$ don't actually depend on position at all: they only look at the second argument.

When this is done, the resulting equation looks like

$$Y'(x) = -\frac{\partial _1 f(x,Y(x))}{\partial _2 f(x,Y(x))}$$
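Taking the unit circle from the comments as an illustration (an assumed example, not part of the argument itself), let $f(x,y) = x^2 + y^2$. The upper half of the level curve $f = 1$ is the graph of $Y(x) = \sqrt{1 - x^2}$ for $-1 < x < 1$, and differentiating $Y$ directly gives the same slope as the ratio of partial derivatives:

$$Y'(x) = -\frac{x}{\sqrt{1 - x^2}} = -\frac{2x}{2\,Y(x)} = -\frac{\partial_1 f(x, Y(x))}{\partial_2 f(x, Y(x))}.$$

The restriction to $-1 < x < 1$ keeps $\partial_2 f(x, Y(x)) = 2\,Y(x)$ away from zero, which is exactly the condition needed to divide.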

So while the argument put forth is very powerful and very suggestive, and it gets the gist of things right, the bookkeeping required "under the hood" of this argument may require some additional thought.

Wow, thanks for the insight. I will admit, this appears to be something far beyond my current level of understanding. It seems that I have learned to work with differentials algebraically, but I've never really learned what they were. Does this definition have any relationship to intuitively seeing $dx$ as a "small change in $x$"? Also, when would one learn about differential forms? — ra1nmaster, Dec 02 '15 at 05:31
Differential forms would typically be introduced in the context of differential geometry: the study of curves, surfaces, and the like that are smooth enough to be described by differentiable functions. Differential forms are very convenient for integration on such regions. My point of view on "small changes in $x$" is that you'll stop thinking about small intervals and start thinking about weighted directions. Here, that direction is $C'$, the tangent direction to the level curve. Anything you can do with a differential could be done with tangent directions instead, using vectors. — Muphrid, Dec 02 '15 at 06:41
