
In many of my physics courses (don't worry, this is a mathematics question!) my teachers cancel out differentials, and every time they say: "If a mathematician saw me canceling out these differentials he would get mad, but we are physicists, so there is no problem in doing this." Still, I haven't seen an acceptable explanation of why this is so dramatically wrong. I would appreciate it if someone could explain!

Mikhail Katz

3 Answers


Physicists might use infinitesimals like in the following derivation of the product rule:

$$\begin{align} f(x+dx)g(x+dx) &= \left(f(x)+ f^\prime(x)dx\right) \cdot \left(g(x)+g^\prime(x)dx\right) \\ &= f(x)g(x)+ f^\prime(x)g(x) dx + f(x)g^\prime(x) dx + f^\prime(x)g^\prime(x) \underbrace{dx^2}_{=0} \\ & = f(x)g(x)+ (f^\prime(x)g(x) + f(x)g^\prime(x)) dx \end{align}$$

The problem is: what is $dx$? Here you will normally get the answer that $dx$ is an infinitesimal, i.e. a nonzero number whose distance from zero is smaller than every positive rational number $q\in\mathbb Q^+$. But there are some problems with this explanation:

  • $dx$ is mostly used like an ordinary real number. One builds fractions like $\tfrac{dy}{dx}$ and calculates with these objects as if they were real fractions. But if you think about how $dx$ is used, it would be a strange number. When I write $dx^2=0$, I use $dx$ as if it were zero. On the other hand, $dx$ might occur in the denominator of a fraction, which is only allowed for $dx\neq 0$. So sometimes $dx$ behaves like $0$ and sometimes like a nonzero number.
  • Due to the Archimedean property there is no real number satisfying the properties of $dx$. So if the number system you use is $\mathbb R$, the object $dx$ cannot be a number. Because the Archimedean property holds in contemporary analysis (it follows from the completeness of $\mathbb R$) and this theory has no other concept of infinitesimals, one cannot use $dx$ within today's analysis.
  • Normally nobody gives a mathematically rigorous definition of $dx$ when it is used in a physics course. So the question remains: what is $dx$?

Nowadays there are mathematical theories of infinitesimals: for example non-standard analysis, where the set of real numbers is extended to the set of hyperreal numbers, which contains infinitesimals. In this theory one can carry out calculations like the one above. So it is possible to cancel out differentials, provided one shifts the underlying theory from contemporary analysis to something like non-standard analysis.
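To make this a bit more concrete, here is a sketch (one common presentation in non-standard analysis, not the only one) of how the product-rule calculation above becomes rigorous: $dx$ is a genuine nonzero infinitesimal hyperreal, and the discarding of higher-order terms happens only at the end, via the standard part function $\operatorname{st}$, which sends every finite hyperreal to the unique real number infinitely close to it:

$$\begin{align} (fg)^\prime(x) &= \operatorname{st}\!\left(\frac{f(x+dx)\,g(x+dx)-f(x)\,g(x)}{dx}\right) \\ &= \operatorname{st}\!\left(\frac{f(x+dx)-f(x)}{dx}\, g(x+dx)+f(x)\,\frac{g(x+dx)-g(x)}{dx}\right) \\ &= f^\prime(x)\,g(x)+f(x)\,g^\prime(x), \end{align}$$

using that $\operatorname{st}$ respects sums and products and that $g(x+dx)$ is infinitely close to $g(x)$. In this picture nothing is ever literally set to zero; the informal rule "$dx^2=0$" corresponds to infinitesimal leftovers vanishing when the standard part is taken.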

My opinion: I do not think there is actually a problem. Physicists normally have a good intuition for infinitesimals: they know how to use them, they know which problems might occur, and they work effectively with them. Sure, it would be great if more people were aware of theories like non-standard analysis, but in my opinion one first has to develop an intuition for a concept before studying its rigorous definition. For example, you first calculate with real numbers in school and build some intuition for them before you go to university and learn what the axioms of the real number system are, or how the reals can be constructed via Dedekind cuts or Cauchy sequences.

  • Even in nonstandard analysis, I think that $\frac{dy}{dx}$ is the standard part of a fraction, rather than a fraction itself. (The "standard part" of a hyperreal is the real number closest to it. Infinite hyperreals have no standard part; infinitesimals have standard part 0.) Generally, though, I think that $\operatorname{st}(a)\operatorname{st}(b)=\operatorname{st}(ab)$, if these all exist, so it doesn't make too much of a difference. (That statement is basically another way of saying $(\lim f)(\lim g)=\lim(fg)$, but from the point-of-view of hyperreals.) – Akiva Weinberger May 08 '15 at 18:13
  • @columbus8myhw: $dy/dx$ is well defined in non-standard analysis, but you have $y^\prime(x) = \operatorname{st}\left(\frac{dy}{dx}\right)$. So you are right... – Stephan Kulla May 08 '15 at 18:29
  • @tampis, would you say that physicists are effectively doing non-standard analysis whenever they are working with infinitesimals? – Andrea May 10 '15 at 21:32
  • @tampis: In the presentations of NSA I'm used to, they define $\mathrm{d}y(x)$ to be $y'(x) \mathrm{d}x$, not $y(x + \mathrm{d}x) - y(x)$. Instead, $\Delta y(x)$ would be more likely to be used for $y(x + \mathrm{d}x) - y(x)$. The intent, presumably, is to correctly model the familiar manipulations with $\mathrm{d}$'s, not to give a redundant notation for working with infinitesimals. –  Oct 25 '15 at 09:44

I agree with the accepted answer: differential notation is a very useful tool for calculations, and in most of the situations where physicists and engineers use it, everything works out fine. That said, I'd like to point out a case where being sloppy with differential notation can lead one to an apparent "proof" of a false statement. I have heard this attributed to Cauchy, although I suspect this attribution is a "mathematical urban legend".

Suppose $f_n$ is a sequence of continuous functions which converges to a function $f$ on $[0,1]$. We ask whether $f$ must be continuous. We write

$$\begin{align} |f(x+dx)-f(x)| & =|f(x+dx)-f_n(x+dx)+f_n(x+dx)-f_n(x)+f_n(x)-f(x)| \\ & \leq |f(x+dx)-f_n(x+dx)|+|f_n(x+dx)-f_n(x)|+|f_n(x)-f(x)| \end{align}$$

Informally we now think about infinitely large $n$ and infinitely small $dx$. Then all three terms should be infinitely small (the first and third because of convergence and the second because of continuity). So the left-hand side should be infinitely small, and so $f$ should be continuous.

When we formalize the above argument, everything works out provided the $f_n$ converge uniformly. But if they converge only pointwise, then the conclusion fails: $f_n(x) = x^n$ converges to $0$ for $x \neq 1$ and to $1$ for $x = 1$, so the limit function is discontinuous at $x=1$. Note that unlike uniform convergence, pointwise convergence can be reasonably formulated without developing the axiomatic framework of analysis: all we need is a way to talk about convergence of sequences of numbers, and a notion of continuity.

The problem when we try to make the argument formal is that we need to pick a single $n$ to control both the first and third terms, and only after doing so do we choose how small $dx$ must be to control the second term. But that means we choose $n$ before we have chosen $dx$, and without uniform convergence we may need a larger $n$ to control the first term for our chosen $dx$.
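For reference, writing out the quantifiers shows exactly where the ordering matters (this is just the standard formulation, spelled out):

$$\text{pointwise: }\quad\forall x\in[0,1]\;\;\forall \varepsilon>0\;\;\exists N\;\;\forall n\ge N:\;|f_n(x)-f(x)|<\varepsilon$$

$$\text{uniform: }\quad\forall \varepsilon>0\;\;\exists N\;\;\forall n\ge N\;\;\forall x\in[0,1]:\;|f_n(x)-f(x)|<\varepsilon$$

In the pointwise version $N$ may depend on $x$, so a single $n$ chosen before $dx$ need not control $|f(x+dx)-f_n(x+dx)|$ at the point $x+dx$ we pick later; the uniform version supplies one $N$ that works at every point, which is what the informal argument silently assumed.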

It is hard to even describe this phenomenon in the infinitesimal language! One way to see it is in the hyperreal framework: take $N$ to be an infinite natural number and $dx=1/N$. Then the standard part of $(1-dx)^N$ is not $1$, but rather $e^{-1}$. And now we see the problem: the limit processes $n \to \infty$ and $x \to 1^-$ compete with one another, one trying to pull the result toward $0$ and the other trying to pull the result toward $1$. This effect is missed when we naively say that $n$ is infinite and $dx$ is infinitesimal without saying how they compare to one another.
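A quick numerical illustration of that competition (a large finite $N$ standing in for the infinite hypernatural, so this is only suggestive, not a hyperreal computation):

```python
import math

# f_N(x) = x**N evaluated at x = 1 - 1/N, i.e. with "dx = 1/N".
# If n -> infinity and dx -> 0 could be taken independently, these values
# would be close to f(1) = 1; instead they approach e**-1 ~ 0.3679.
for N in (10, 100, 10_000, 1_000_000):
    print(N, (1 - 1 / N) ** N)

print("1/e =", math.exp(-1))
```

The printed values settle near $0.3679$ rather than $1$, matching the standard part $e^{-1}$ described above.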

Ian
  • I've been trying to see where your argument fails (when $f$ isn't continuous), when we look at the hyperreals, and I think I've come to an odd conclusion. (Notation: $\mathbb R^*$ is the hyperreals, and $f^*$ is the natural extension of $f$ to $\mathbb R^*$.) So, as far as I can tell, even though the $f_n$ converge point-wise to $f$ on $\mathbb R$… the $f_n^*$ don't converge point-wise to $f^*$ on $\mathbb R^*$? – Akiva Weinberger May 08 '15 at 18:35
  • Assuming I haven't made a mistake, I think that means that $f_n\to f$ uniformly iff $f_n^*\to f^*$ point-wise. – Akiva Weinberger May 08 '15 at 18:39
  • @columbus8myhw On compact sets at least that is correct. I am not sure about the natural extension to the entire space. Incidentally, there is a corresponding theorem about uniform continuity: $f$ is uniformly continuous iff $f^*$ is continuous. – Ian May 08 '15 at 18:40
  • I probably should have said $[0,1]$ and $[0,1]^*$ at the end there instead of $\mathbb R$ and $\mathbb R^*$, respectively, since you said $f$ is defined only on $[0,1]$. Where $[0,1]^*$ is the set of hyperreals between $0$ and $1$ inclusive. – Akiva Weinberger May 08 '15 at 18:42
  • There is a theorem in Robinson's 1966 book that $f_n$ converges uniformly iff the natural extensions converge pointwise, in the sense of the epsilon-N definition, to the natural extension of $f$. – Mikhail Katz Jul 07 '16 at 07:32

The standard example is solving $y' = y$. The physicist would write $dy = y\,dx$, hence $\frac{dy}{y} = dx$, and then compute anti-derivatives of both sides. The physicist does not care about the mathematics here, only whether the manipulations lead to a correct answer. Indeed he divides by $y$, which could be zero, but again he does not care.
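For completeness, here is how that manipulation finishes, a standard computation whose only delicate point is exactly the division by $y$ mentioned above:

$$\frac{dy}{y}=dx \;\Longrightarrow\; \int\frac{dy}{y}=\int dx \;\Longrightarrow\; \ln|y| = x + C \;\Longrightarrow\; y = A e^{x},\qquad A=\pm e^{C}.$$

The constant solution $y\equiv 0$ is the case lost by dividing by $y$; it is recovered only if one also allows $A=0$ in the final formula.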

It is a mistake because: what do $dy$ and $dx$ even mean? I still have no idea what they mean, and despite people trying to explain it over the last 10 years, I have never come across any clear explanation. Think of it as the magic of a notation that solves the problem for you, that is it.
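That the notation really does lead to the correct answer here can be checked symbolically; a small sketch using SymPy, assuming it is available:

```python
import sympy as sp

x = sp.symbols("x")
y = sp.Function("y")

# Solve the ODE y' = y; the general solution is y(x) = C1*exp(x),
# matching the result of the informal dy/y = dx manipulation.
solution = sp.dsolve(sp.Eq(y(x).diff(x), y(x)), y(x))
print(solution)  # Eq(y(x), C1*exp(x))
```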

  • Good point. I also think that most physicists do not care about the underlying concepts as long as they get correct results. I recommend you read a textbook on non-standard analysis once you have time. It's an interesting theory... – Stephan Kulla May 08 '15 at 16:18
  • @tampis There is a calculus book, written by a physicist (I do not remember what it is called), which intentionally does calculus the "wrong" way. It looked pretty interesting; the intention of the book is to provide the intuition mathematicians used to have before rigor was ever an issue. The only time I teach my students infinitesimal arguments is in integral calculus, when we do various applications, because this is how physicists learn calculus. – Nicolas Bourbaki May 08 '15 at 17:54
  • This isn't an answer so much as a statement of a lack of knowledge about two things: (1) how calculus was done ca. 1680-1890, and (2) the existence of NSA. A nice freshman-level presentation of NSA is available online for free in Keisler's book: https://www.math.wisc.edu/~keisler/calc.html –  Oct 26 '15 at 00:56