5

It seems fairly common to describe $\mathrm{d}x$ in nonstandard analysis as an infinitesimal. But after thinking it through (and then skimming Keisler's text), I can't see the point and actually think it's misleading!

First, let me clearly point out that $\mathrm{d}y$ is not being used here to express a "difference in $y$"; this post follows the convention that $\Delta y$ is used for such things, and $\mathrm{d}y$ is reserved for the differential.

That is, suppose $y = x^2$. A "change in $y$" is the quantity $\Delta y$ given by, after fixing some change $\Delta x$ in $x$: $$ \Delta y = (x + \Delta x)^2 - x^2 = 2 x (\Delta x) + (\Delta x)^2 $$

This is not what $\mathrm{d}y$ is. We simply have $\mathrm{d}y = 2x \,\mathrm{d} x$. In general, if $y = f(x)$, then $\mathrm{d} y $ is simply defined to be $f'(x) \,\mathrm{d} x$. No differences, infinitesimal approximations, or anything of that flavor is going on here; $\mathrm{d} y$ is nothing more than a vessel for carrying around a copy of $f'(x)$.

(and $\mathrm{d} x$ was simply defined to be an independent, infinitesimal variable)
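
As a small worked check of the distinction (the sample values $x = 3$ and $\Delta x = 0.1$ are chosen purely for illustration, and $\mathrm{d}x$ is set equal to $\Delta x$ only so that the two quantities can be compared): $$ \Delta y = (3.1)^2 - 3^2 = 0.61, \qquad \mathrm{d}y = 2 \cdot 3 \cdot 0.1 = 0.6, \qquad \Delta y - \mathrm{d}y = 0.01 = (\Delta x)^2 $$ So $\mathrm{d}y$ records only the part of $\Delta y$ that is linear in $\Delta x$.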

A typical application of a differential is that in a definite integral $\int_a^b f(x) \,\mathrm{d}x$, we might decide to write down a Riemann sum over a partition into $H$ evenly spaced subintervals for some infinite $H$, and replace the notation $\mathrm{d}x$ with the width of a subinterval, $\frac{b-a}{H}$, to get $$ \int_a^b f(x) \,\mathrm{d}x \approx \sum_{i=1}^H f\left(a + \frac{b-a}{H}i \right) \frac{b-a}{H} $$ However, if I encode $\mathrm{d}x$ as an infinitesimal and then write $\int_a^b f(x) \epsilon$, there's no way to figure out what that means. You might set $H = (b-a)/\epsilon$ and write down the Riemann sum above, but that gives the wrong answer if I had encoded $\mathrm{d}x$ as $2 \epsilon$. The best you can do is to undo the encoding; e.g. $$ \int_a^b f(x) \epsilon \approx \sum_{i=1}^H f\left(a + \frac{b-a}{H}i \right) \frac{\epsilon}{\mathrm{d}x} \frac{b-a}{H} $$
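
The ambiguity described above can be imitated with ordinary floating-point numbers. This is only a finite analogue, not nonstandard analysis: a small real `eps` stands in for the infinitesimal, and the function names (`riemann_sum`, `encoded`, `naive`, `decoded`) and the choice $f(x) = x^2$ on $[0, 1]$ are assumptions made for illustration.

```python
def riemann_sum(coeff, a, b, h):
    """Right-endpoint Riemann sum of coeff(x) dx over [a, b] with h equal subintervals."""
    width = (b - a) / h
    return sum(coeff(a + width * i) * width for i in range(1, h + 1))

def f(x):
    # Illustrative integrand; its integral over [0, 1] is 1/3.
    return x * x

a, b = 0.0, 1.0
eps = 1e-3  # small real number standing in for the infinitesimal

def encoded(x):
    """The plain number f(x) * eps: all that is left once dx is replaced by a number."""
    return f(x) * eps

def naive(g, a, b, eps):
    """Guess H = (b - a) / eps and add up the encoded values directly."""
    h = round((b - a) / eps)
    width = (b - a) / h
    return sum(g(a + width * i) for i in range(1, h + 1))

def decoded(g, a, b, dx_value, h):
    """Undo the encoding: recover the coefficient of dx as g(x) / dx_value, then integrate."""
    return riemann_sum(lambda x: g(x) / dx_value, a, b, h)

naive_value = naive(encoded, a, b, eps)                # close to 1/3 whatever dx was
if_dx_is_eps = decoded(encoded, a, b, eps, 1000)       # close to 1/3
if_dx_is_2eps = decoded(encoded, a, b, 2 * eps, 1000)  # close to 1/6
```

The same number `f(x) * eps` integrates to roughly 1/3 or roughly 1/6 depending on which value was chosen for `dx`, and the naive sum cannot tell the two cases apart.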

Thus, the encoding of the differential form as an infinitesimal does not seem to do anything useful for this application of differentials. But maybe we can do other interesting arithmetic with them. However, I don't think there's any application of quantities like $(\mathrm{d}y)^2$ or $1 + \mathrm{d}y$ or $\sin(\mathrm{d}y)$ — it's the quantities like $(\Delta y)^2$ or $1 + \Delta y$ or $\sin(\Delta y)$ that we play with.

Instead, the only useful operations seem to be the ordinary differential form operations — things like adding two differential forms or multiplying a differential form by a function.

In sum, the only application of this definition seems to be to allow one to say that $\frac{\mathrm{d}y}{\mathrm{d}x}$ is the ratio of two hyperreal-valued variables — but even in ordinary analysis we can understand that as a ratio of differential forms!

Furthermore, insistence that $\mathrm{d}x$ be infinitesimal appears to be completely irrelevant; you could do the same thing in standard analysis simply by removing the constraint that $\mathrm{d}x$ be infinitesimal. In fact, to some extent, people do do the same thing; e.g. defining the differential of a function $f$ to be the function $\mathrm{d}f(x,e) = f'(x) e$.
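
A minimal sketch of that functional form, assuming the derivative is supplied by hand (the helper name `differential` and the example $y = x^2$ are illustrative choices, not from any particular text):

```python
def differential(f_prime):
    """Return the function df(x, e) = f'(x) * e, where e is an arbitrary real number."""
    def df(x, e):
        return f_prime(x) * e
    return df

# Illustrative choices: y = x^2 has f'(x) = 2x; the identity map has f'(x) = 1.
dy = differential(lambda x: 2.0 * x)
dx = differential(lambda x: 1.0)

x0 = 3.0
# The second argument need not be small: the ratio is f'(x0) for any nonzero e.
ratios = [dy(x0, e) / dx(x0, e) for e in (1e-6, 1.0, 250.0)]
```

Each ratio comes out as $f'(3) = 6$ up to rounding, whether the second argument is tiny or large, which is the sense in which infinitesimality plays no role in this construction.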

So, I pose my question — what is the point of making $\mathrm{d}x$ an infinitesimal hyperreal?

  • 2
    For more than a century we have had a version of calculus that is infinitesimal-free. Yet in applications and informal reasoning, infinitesimals remain useful. It is interesting that that intuition can be partly preserved in a formal version of calculus. – André Nicolas Jul 22 '16 at 23:02
  • 1
    Comment 1: It's nice to do formal analysis in a way that looks somewhat like how we do informal analysis. Unfortunately I will concede that hyperreal analysis does not really look quite as similar to informal analysis as one might hope. – Ian Jul 22 '16 at 23:03
  • @Andre: Yes, but that's $\Delta x$ and $\Delta y$. –  Jul 22 '16 at 23:04
  • 1
    Comment 2: The transfer principle pretty much says that hyperreal analysis does not directly buy you any "meaningful" theorems that standard real analysis (with ZFC as the ambient set theory) doesn't buy you (in a certain precise sense of "meaningful"). It merely changes the language that we talk about things in. Apparently this is sometimes useful; several of Terry Tao's recent papers state their proofs in hyperreal language for convenience reasons. – Ian Jul 22 '16 at 23:04
  • 2
    @Hurkyl I don't quite see what you mean; when you, say, prove that $x^2$ is continuous at $1$ using hyperreal machinery, you write $(1+dx)^2-1=2dx+(dx)^2$ and observe that this is infinitesimal if $dx$ is infinitesimal. The fact that I called the infinitesimal $dx$ is not essential; I can call it whatever I want, and maybe using the same symbol as we use in classical analysis is misleading. – Ian Jul 22 '16 at 23:06
  • 1
    @Ian: I don't write that; we're talking about actual differences in the variables here, so I'd use $\Delta x$. You can show $y=x^2$ is continuous at $1$ by showing that $\Delta y$ is infinitesimal. You can't do it by showing $\mathrm{d} y$ is infinitesimal, because to even define $\mathrm{d}y$ you need to know beforehand that $x^2$ is differentiable! –  Jul 22 '16 at 23:09
  • 1
    You are either misunderstanding me or else you are saying that the name for the infinitesimal is misleading. In the latter case, I can work with that point, but that seems like a rather pedantic criticism: if I say "$dx$ is an infinitesimal number" you should be able to roll with that even though you are used to a different usage of the symbol $dx$. The only reason you might not be able to roll with that is if you don't believe that there is a rigorous way to manipulate infinitesimals...but Robinson showed us that there is about 50 years ago. – Ian Jul 22 '16 at 23:11
  • @Hurkyl: I think there is a significant difference between taking the standard part of $\frac{f(x+dx)-f(x)}{dx}$ for an actual non-zero infinitesimal and taking the limit of $\frac{\Delta f}{\Delta x}$, though the effect is the same. – André Nicolas Jul 22 '16 at 23:11
  • @Ian: Well, I am asking about $\mathrm{d}x$ used the way that I used $\mathrm{d}x$ after all. I tried to make the notational usage clear at the beginning of my post. The way I used it is a thing people do -- e.g. that's how Keisler's book proceeds. Keisler always uses $\mathrm{d}y$ to mean $f'(x) \mathrm{d}x$ (when $y=f(x)$, of course). If he wants to talk about a change in $y$, he uses $\Delta y$. –  Jul 22 '16 at 23:14
  • 1
    It seems to me that your criticism is then merely about the usage of $dy:=f'(x)dx$, and has little to do with $dx$ itself being an infinitesimal. Specifically, your criticism seems to be that this really just carries $f'(x)$ around in a box, since we think of $dx$ as really being arbitrary. And there I think I agree with you. – Ian Jul 22 '16 at 23:21
  • @Ian: I think $\mathrm{d}y = f'(x) \mathrm{d}x$ is fine. That's how differential forms work, after all. My criticism is that there doesn't seem to be any point to realizing differential forms as hyperreals. –  Jul 22 '16 at 23:24
  • So that you can write integration as a sum? So that you can write the Wiener process as a bona-fide "broken line" instead of merely a limit of them? etc. – Ian Jul 22 '16 at 23:28
  • @Ian: I know how to write the definite integral of a differential form as a Riemann sum. I know, given standard $f$ and $\epsilon$, how to do as Keisler defines and write down for the definite integral of $f$ a Riemann sum over evenly spaced intervals of length $\epsilon$. But if we are interpreting differentials as infinitesimals, I don't know how to make sense of $\int_a^b f(x) \epsilon$, except by inverting the encoding and recovering the differential form it is intended to express. (in fact, $\int_a^b f(x) \epsilon$ depends on what precise infinitesimal we chose for $\mathrm{d}x$) –  Jul 22 '16 at 23:43
  • 3
    If $N$ is a hyperfinite integer and $\epsilon=\frac{b-a}{N}$ then $\sum_{n=1}^N f(a+\epsilon n) \epsilon$ is $\int_a^b f(x) dx$ or at least infinitely close to it. So I don't see what you mean exactly. Where I do see a discrepancy from intuition is that the hyperreal version of the Riemann integral is not just summing up the values at each point and multiplying by an infinitesimal; we are still ultimately partitioning, just into hyperfinitely many subintervals rather than taking a limit of finite partitions. – Ian Jul 22 '16 at 23:57
  • @Ian: And if $M$ is another particular hyperfinite integer and $\epsilon' = \frac{b-a}{M}$, then $\sum_{n=1}^M (2f(a + \epsilon' n) \epsilon')$ is infinitesimally close to $\int_a^b 2f(x) \, \mathrm{d}x$. But $f(x) \epsilon$ and $2f(x) \epsilon'$ are the same thing, so there is no well-defined way to take $f(x) \epsilon$ and say "integrate that". The only way to write down a Riemann sum is to also know what I encoded $\mathrm{d}x$ as, or something equivalent (like how many partitions you're 'supposed' to use). So what was the point of interpreting $f(x) \mathrm{d}x$ with $f(x) \epsilon$? –  Jul 23 '16 at 08:24
  • 1
    You may be somewhat missing the point. The hyperreals have no derivative; they have ordinary subtraction and division, and the standard part operation. The hyperreals also have no integral; they have summation and the standard part operation. Part of the point of all this was to describe these "limiting" processes in standard analysis (which are stuck being expressed in terms of quantifiers) in terms of things like sums over infinitely many terms (which are not limits; the hyperreals have no real notion of "limit", only something that transfers to the standard notion of "limit"). – Ian Jul 23 '16 at 10:30
  • Where I agree with you is that we might hope that instead of doing one of these partitions, we could simply sum up values at each hyperreal point in an interval and multiply by an appropriate infinitesimal. The fact that we need a hyperfinite partition instead is a bit annoying. On the other hand, the transfer principle somehow tells you that you shouldn't expect anything quite that magical. – Ian Jul 23 '16 at 10:32
  • @Ian: Think of the classic freshman question "Is $\frac{dy}{dx}$ a fraction?". An interesting and useful thing to say is "We can make sense of the difference quotient with $f'(x) = \mathrm{std}\left(\frac{\Delta y}{\Delta x} \right)$ for infinitesimal $\Delta x$". Sometimes (e.g. Keisler's book!) people say "Yes, because we make $dx$ a number and define $dy = f'(x) dx$", and I don't see the point of that. Maybe I am missing the point, which is why I want to learn about it! Maybe they are missing the point, in which case I would like to know I'm not missing a useful point of view. –  Jul 23 '16 at 10:53
  • I know you keep saying things, but in my eyes, you keep saying "this is the point of using infinitesimals; you can do these neat things with nonstandard analysis", but you don't seem to actually be saying anything about the specific construction I'm asking about. (also, addendum to the previous comment: Keisler's book says both things) –  Jul 23 '16 at 10:58
  • 3
    As I said earlier, I agree with you that responding to "is $\frac{dy}{dx}$ a fraction?" with "yes, just let $dx$ be a number and define $dy=f'(x) dx$" and not developing the subject any further is a waste of mental energy at best and misleading at worst, because the corresponding $dy$ is not actually a $\Delta y$ for any $\Delta x$. For infinitesimal $\Delta x$ it is so close to such a $\Delta y$ that the difference is still infinitesimal when divided by $\Delta x$. But so what? All you've done is rearranged things now, you haven't shown how $f'(x)$ itself emerges as a fraction. – Ian Jul 23 '16 at 11:14
  • 1
    Ah, I missed the earlier agreement. I feel bad for dragging this out now! –  Jul 23 '16 at 18:39
  • @Ian: You may be interested in this post where I sketch a framework that is 100% rigorous and yet supports intuitive Leibniz-style derivatives. Moreover, it is easy to use it with asymptotic analysis, and does not depend on any dubious set-theoretical assumptions. – user21820 Feb 08 '22 at 18:53

2 Answers

2

It is true that $dy$ is not the $y$-increment, but the point is that the derivative $\frac{dy}{dx}$ can be uniquely deduced (by taking the standard part) from the ratio $\frac{\Delta y}{\Delta x}$, which is a ratio of increments. This cannot be done in a framework using the real numbers only. Hyperreal infinitesimals are infinitesimals in the usual sense that they violate Definition 5.4 of Euclid's Elements (i.e., what has been known as the Archimedean property since Stolz in the 1880s). This was recognized already by Leibniz 300 years ago.

Mikhail Katz
0

I assert that there is no intrinsic reason to make $\mathrm{d}x$ an infinitesimal. However, conventions can force us to do so; for example, it is a consequence of:

  • The functional form of the differential: $\mathrm{d}f(x,y) = f'(x) y $
  • The habit of designating a particular variable (e.g. $x$) as special
  • The habit of implicitly partially evaluating differentials at the special difference $\Delta x$

By making the second argument implicit and fixed, that is, by using $\mathrm{d}f(x)$ to mean $\mathrm{d}f(x, \Delta x)$, the notation lends itself to using the variable form $\mathrm{d}y$ in place of $\mathrm{d}f(x)$ whenever $y = f(x)$.

Under these conventions, if $i$ is the identity function $i(x) = x$, then $$\mathrm{d}x = \mathrm{d}i(x) = \mathrm{d}i(x, \Delta x) = \Delta x $$

thus identifying the differential $\mathrm{d}x$ with the variable $\Delta x$ which, conventionally, is infinitesimal-valued.

  • If we wish for example to prove the chain rule in an accessible way it is helpful to have the equality $dx=\Delta x$ as in Keisler, to result in a famous cancellation (after suitable explanations concerning dependent and independent variables). On the other hand if you allow arbitrary values for $dx$'s then the argument becomes less transparent. – Mikhail Katz Mar 06 '17 at 14:48