Understanding what a differential is (Why can physicists multiply both sides by dx?)

Question

I've just finished my first semi-rigorous run through single variable calculus, where I tried proving most of the results using epsilon-delta proofs. I'm just not sure that I'm understanding the meaning of differentials properly.

I learned that the derivative with respect to a variable was an operator, meaning it 'took in' a function and 'outputted' a function that has a value of $\lim_{h\to 0} \frac{f(x+h)-f(x)}{h}$ at all values of x where $f$ is differentiable. Based on this definition, $\frac{dy}{dx}$ is a function of x.

I also learned that when integrating or finding the antiderivative with respect to x, the dx within the integral represented the infinitely small width of an interval in a partition of a Darboux or Riemann sum.

When solving differential equations with separation of variables or integrating using u-substitution, I've seen people say things like 'multiply both sides by dx' or 'bring dt to the other side', especially in physics. This doesn't sound right, because I'm not sure how multiplying a function ($\frac{dy}{dx}$) by an infinitesimal (dx) gives another infinitesimal. So I'm wondering whether this is just a notational shortcut for the chain rule, or whether writing things like $\frac{dy}{dx} dx = dy$ has some mathematical meaning.

See https://math.stackexchange.com/questions/1906241/when-not-to-treat-dy-dx-as-a-fraction-in-single-variable-calculus and all the questions linked in the comments there. — Arthur, Dec 20 '20 at 23:25
Different sources and arguments here https://math.stackexchange.com/questions/3819116/rigorously-whats-happening-when-i-treat-fracdydx-as-a-fraction/3819142#3819142 — zkutch, Dec 20 '20 at 23:29
The crux of the matter: Indeed when you multiply a real quantity by an infinitesimal, you get another infinitesimal. That is, if $f$ is real and $\epsilon$ infinitesimal, then obviously $f\epsilon$ is infinitesimal too. — Allawonder, Dec 21 '20 at 08:21

Joe · Accepted Answer · 2020-12-21T00:44:54.650

There are many instances where treating $dx$ as a number is convenient, and luckily, many times it can be at least partially justified. I'm sure you're familiar with the 'proof' of the chain rule: $$ \require{cancel} \frac{dy}{du} \cdot \frac{du}{dx} = \frac{dy}{\cancel{du}} \cdot \frac{\cancel{du}}{dx} = \frac{dy}{dx} \, . $$ While it is incorrect to naively view the chain rule as cancellation, I don't think this argument is that far off the truth. The chain rule can be more properly justified in the following way: $$ \lim_{\Delta x \to 0}\frac{\Delta y}{\Delta x} = \lim_{\Delta x \to 0}\frac{\Delta y}{\Delta u} \cdot \frac{\Delta u}{\Delta x} = \lim_{\Delta x \to 0}\frac{\Delta y}{\Delta u} \cdot \lim_{\Delta x \to 0}\frac{\Delta u}{\Delta x} \, . $$ It can then be shown that $$ \lim_{\Delta x \to 0}\frac{\Delta y}{\Delta u} = \lim_{\Delta u \to 0}\frac{\Delta y}{\Delta u} \, . $$ While even this argument sweeps some issues under the rug, the point remains that it is often convenient to regard derivatives as fractions.

It shouldn't be too surprising that limits of fractions often behave just like fractions. It is clear that $$ \lim_{h \to 0}\frac{f(x+h)-f(x)}{h} \approx \frac{f(x+h)-f(x)}{h} $$ for small enough $h$. This means that $dy/dx$ can, roughly speaking, be viewed as the quotient of two tiny quantities, $dy$ and $dx$, which partially justifies the manipulation of them in an equation. There have been many attempts to formalise these heuristic arguments, for example in nonstandard analysis. While I think most mathematicians will agree that none of these attempts have been entirely successful, it is reassuring to know that there is a framework that can turn these informal arguments into rigorous proofs.

$dy/dx$ can be viewed as an actual quotient in another sense, too. Say you plot the graph of $y=x^2$ and consider the gradient at a particular point $P$. If you move a certain distance along the tangent to $P$, going $dy$ units upwards and $dx$ units to the right, then $dy/dx$ should be exactly equal to $2x$. But if you 'zoom in' far enough, then moving along the tangent line looks no different to moving along the curve. This way of viewing $dy/dx$ as a quotient avoids the machinery of nonstandard analysis, while also being fairly intuitive in my opinion. This means that $$ \frac{dy}{dx}=f'(x) \implies dy = f'(x)dx = \frac{dy}{dx}dx $$ as you mentioned in your post. So $dx$ and $dy$ can be given a meaning as independent quantities, even though $dy/dx$ generally means 'apply the derivative operator, $\frac{d}{dx}$, to $y$'.
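
To make the tangent-line picture concrete, here is a small numerical check. The point $x=1$ and the step sizes are my own choices for illustration. Write $\Delta y$ for the change along the curve $y=x^2$ and $dy$ for the change along the tangent at $x=1$, where the gradient is $2x=2$: \begin{align} dx = 0.1: \quad & dy = 2 \cdot 0.1 = 0.2, & \Delta y &= 1.1^2 - 1^2 = 0.21, \\[4pt] dx = 0.01: \quad & dy = 2 \cdot 0.01 = 0.02, & \Delta y &= 1.01^2 - 1^2 = 0.0201. \end{align} In both cases $\Delta y - dy = (dx)^2$, so the gap between the curve and the tangent shrinks much faster than $dx$ itself. This is the sense in which 'zooming in' makes the two indistinguishable.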

Regarding some of the other instances where you question the use of infinitesimals:

Integration by substitution

Integration by substitution comes from reversing the chain rule. Recall that if $y=f(g(x))$, then $$ \frac{dy}{dx}=f'(g(x))g'(x) \, . $$ Thus, $$ \int f'(g(x))g'(x) \, dx = f(g(x))+C \, . $$ On the other hand, if we make the substitutions $u=g(x)$ and $du = g'(x) \, dx$, then the integral becomes $$ \int f'(u) \, du=f(u)+C=f(g(x))+C \, . $$ It just so happens that working out $du/dx$ and then 'multiplying by $dx$' means that you will make the correct substitutions. The formula can be summarised as $$ \int f'(g(x))g'(x) \, dx = \int f'(u) \, du \quad \text{where $u=g(x)$.} $$ In Leibnizian notation, it reads particularly well: $$ \int \frac{dy}{du} \cdot \frac{du}{dx} \, dx = \int \frac{dy}{du} \, du \, . $$ Again, the $dx$'s 'cancel' in a similar fashion to the chain rule.
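
As a quick illustration (the integrand is my own example, not one from the question), take $g(x)=x^2$ and $f'(u)=\cos u$. Setting $u=x^2$ gives $du/dx = 2x$, and formally $du = 2x \, dx$, so $$ \int 2x\cos(x^2) \, dx = \int \cos u \, du = \sin u + C = \sin(x^2) + C \, . $$ Differentiating the result with the chain rule gives back $2x\cos(x^2)$, which confirms that the formal substitution produced the right antiderivative.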

Separating the variables

It is common to solve first order differential equations in the following way: \begin{align} \frac{dy}{dx} &= f(x)g(y) \\[4pt] dy &= f(x)g(y)dx \\[4pt] \frac{1}{g(y)} dy &= f(x) \, dx \\[4pt] \int \frac{1}{g(y)} dy &= \int f(x) \, dx \end{align} Although the steps we used do seem questionable, the conclusion we draw is not. It can be directly shown that $$ \frac{dy}{dx} = f(x)g(y) \iff \int \frac{1}{g(y)} dy = \int f(x) \, dx \, . $$ Here is how: \begin{align} &\int \frac{1}{g(y)} dy = \int f(x) \, dx \\[4pt] \iff & \frac{d}{dx} \int \frac{1}{g(y)} dy = \frac{d}{dx} \int f(x) \, dx \\[4pt] \iff & \frac{1}{g(y)} \cdot \frac{dy}{dx} = f(x) \\[4pt] \iff & \frac{dy}{dx} = f(x)g(y) \, . \end{align} The third line uses the chain rule on the left-hand side: $\frac{d}{dx} \int \frac{1}{g(y)} dy = \left( \frac{d}{dy} \int \frac{1}{g(y)} dy \right) \cdot \frac{dy}{dx}$. I'm not sure if there is a 'deeper' reason why $dx$ and $dy$ can be treated as manipulable quantities in this case. Perhaps others can shed some light on that.
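
For a concrete instance of the steps above, take $f(x)=x$ and $g(y)=y$. This is an example I have chosen for illustration, and I assume $y \neq 0$ so that dividing by $g(y)$ is allowed: \begin{align} \frac{dy}{dx} &= xy \\[4pt] \int \frac{1}{y} \, dy &= \int x \, dx \\[4pt] \ln|y| &= \frac{x^2}{2} + C \\[4pt] y &= Ae^{x^2/2} \quad \text{where $A = \pm e^{C}$.} \end{align} Checking directly, $\frac{dy}{dx} = Axe^{x^2/2} = xy$, so the answer obtained by 'multiplying by $dx$' does satisfy the original equation.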
