25

When it comes to definitions, I will be very strict. Most textbooks tend to define the differential of a function/variable like this:


Let $f(x)$ be a differentiable function. Assuming that the change in $x$ is small enough, we can say: $$\Delta f(x)\approx {f}'(x)\Delta x$$ where $\Delta f(x)$ is the change in the value of the function. Now we define the differential of $f(x)$ as follows: $$\mathrm{d}f(x):= {f}'(x)\mathrm{d} x$$ where $\mathrm{d} f(x)$ is the differential of $f(x)$ and $\mathrm{d} x$ is the differential of $x$.


What bothers me is that this definition is completely circular: we are defining the differential in terms of the differential itself. Can we define the differential more precisely and rigorously?

P.S. Is it possible to define the differential simply as the limit of a difference as the difference approaches zero? $$\mathrm{d}x= \lim_{\Delta x \to 0}\Delta x$$ Thank you in advance.


EDIT:

I still think I haven't received the best answer yet. I would prefer the answer to be in the context of "Calculus" or "Analysis" rather than the "Theory of Differential Forms". And again, I don't want a circular definition. I think it is possible to define "Differential" with the use of "Limits" in some way. Thank you in advance.


EDIT 2 (Answer to "Mikhail Katz"'s comment):

the account I gave in terms of the hyperreal number system which contains infinitesimals seems to respond to your concerns. I would be happy to elaborate if anything seems unclear. – Mikhail Katz

Thank you for your help. I have two issues:

First of all, we define the differential as $\mathrm{d} f(x)=f'(x)\mathrm{d} x$, then we convince ourselves that $\mathrm{d} x$ is nothing but another representation of $\Delta x$, and then, without clarifying the reason, we treat $\mathrm{d} x$ as the differential of the variable $x$ and write the derivative of $f(x)$ as the ratio of $\mathrm{d} f(x)$ to $\mathrm{d} x$. So we have literally (and rather sneakily) defined "Differential" by another differential, which is circular.

Secondly, I think it should be possible to define the differential without any prior knowledge of the notion of derivative. We could then define "Derivative" and "Differential" independently and deduce that the relation $f'(x)=\frac{\mathrm{d} f(x)}{\mathrm{d} x}$ is a natural consequence of their definitions (possibly using the notion of limits), not part of the definition itself.

I know the relation $\mathrm{d} f(x)=f'(x)\mathrm{d} x$ always works and always gives us a way to calculate differentials. But I (as a strictly axiomatic person) cannot accept it as a definition of the differential.


EDIT 3:

Answer to comments:

I am not aware of any textbook defining differentials like this. What kind of textbooks have you been reading? – Najib Idrissi

 

which textbooks? – m_t_

Check "Calculus and Analytic Geometry", "Thomas-Finney", 9th edition, page 251

and "Calculus: Early Transcendentals", "Stewart", 8th edition, page 254

They literally defined differential by another differential.

  • That definition says "if you have the differential of $x$, then the differential of $f(x)$ is $f'(x) dx$". It doesn't really say what $dx$ is. For its own purposes it does not really need to do that, because for its purposes this is just notation, not really an object in its own right. One can make differentials into objects in their own right through either differential forms, hyperreal analysis, or smooth infinitesimal analysis. The latter two are the most prominent types of nonstandard analysis. These have been discussed at length on MSE, in the context of questions much like this one. – Ian Nov 03 '16 at 10:56
  • The "usual" approach is to define differentiable as the property of a function $f$ and then introduce the derivative of $f$ as the "related" fucntion $f'$. Then we have "symbols manipulations"... See e.g. Ethan Bloch, The real numbers and real analysis (2011), page 183. – Mauro ALLEGRANZA Nov 03 '16 at 10:58
  • It doesn't really matter what $\mathrm{d}x$ or $\mathrm{d}f(x)$ are -- you only care how they're related. In the same way you don't care what real numbers "are", just that you know how to do arithmetic with them. –  Nov 03 '16 at 11:17
  • I am not aware of any textbook defining differentials like this. What kind of textbooks have you been reading? – Najib Idrissi Nov 03 '16 at 14:50
  • @Hamed, the account I gave in terms of the hyperreal number system which contains infinitesimals seems to respond to your concerns. I would be happy to elaborate if anything seems unclear. – Mikhail Katz Nov 11 '16 at 07:56
  • "Most textbooks tend to define differential of a function/variable like this" which textbooks? – Matthew Towers Nov 12 '16 at 12:41
  • Just another discussion of the same subject: http://physics.stackexchange.com/questions/92925/how-to-treat-differentials-and-infinitesimals/93025#93025. – Tobias Nov 12 '16 at 14:24
  • @HamedBegloo, the definition is not circular because the infinitesimal $\Delta y$ is defined as the $y$-increment $f(x+\Delta x)-f(x)$. This was essentially Leibniz's approach and he rarely did things that were circular. – Mikhail Katz Nov 13 '16 at 15:05
  • Related: http://math.stackexchange.com/questions/1991575/why-cant-the-second-fundamental-theorem-of-calculus-be-proved-in-just-two-lines/1991585#1991585 – Ethan Bolker Nov 13 '16 at 15:11
  • Good question! Many years ago, I remember asking my introductory DE prof the same question and remember feeling the answer seemed awfully "hand-wavy." I didn't pursue the matter and never took another course on DE's (a mistake in hindsight). Perhaps the experts here can clarify in what context the differential is used in solving DE's. Quickly trying to re-familiarize myself with the topic online, it seems that, in ODE's anyway, it is always in the context of a ratio of two differentials. Is this correct? – Dan Christensen Nov 14 '16 at 14:52
    Or the DE that is given can be recast in a form using only such ratios. – Dan Christensen Nov 14 '16 at 15:50
  • @DanChristensen As far as I can see in how we treat ODEs, the answer is yes. For example, we easily take $\mathrm{d} x$ and $\mathrm{d} y$ from any side of the equation and move it to any other side to make a term of the form $\frac{\mathrm{d} y}{\mathrm{d} x}$, then we call it the "derivative of $y$". Isn't that true? – Hamed Begloo Nov 14 '16 at 16:29
  • I can clearly answer all of your questions, and I think you will definitely be satisfied. These very questions bothered me for my first two years as an undergraduate. Back then I asked the same question on my country's forum, and got non-ideal answers just as you are getting now. However, it may take me too much time to write, so ... – Eric Nov 14 '16 at 16:45
  • If so, perhaps a differential is purely a notational shortcut that encapsulates some fundamental theorem of differential equations: wherever you see $dy$ in a statement, say in an integral expression, with $y=f(x)$, you can substitute $f'(x) dx$. – Dan Christensen Nov 14 '16 at 16:55
  • @DanChristensen Maybe. But somehow I still think it's more than a notation. For example, think of how we represent derivatives as a ratio of differentials to show how the chain rule works, or how we break up a differential in the $u$-substitution method to solve integrals. – Hamed Begloo Nov 14 '16 at 17:17
  • @DanChristensen Or even in DEs: when we say we integrate both sides of the equation, we are not actually integrating. I mean we are not using the "Antiderivative" operation in this case. We are just applying the "elongated S" ($\int$) on both sides, not the "Antiderivative"; I would call it the "Antidifferential". When we think of it this way, I mean of "Differential" as an operator (which could possibly have an anti-operator), it seems quite interesting. I really think there is something deeper in this. – Hamed Begloo Nov 14 '16 at 17:17
  • What about a definition in the Algebraic context? You can consider the "space" of differentials as a $\mathcal{C}^0$-submodule. In this case $dx$ is just a generator. – Michele Maschio Nov 17 '16 at 11:59
  • @MicheleMaschio Thank you, but introducing a new space reminds me of talking in the context of "hyperreal numbers". I know that keeping the "hyperreal number system" in mind should solve many problems, but it seems many mathematicians dislike the concept, and a good number of them don't even count "hyperreals" as numbers, which discourages me from getting into the hyperreals. Is it possible to build a better definition in "real space"? – Hamed Begloo Nov 17 '16 at 15:34
  • Many related questions on this site, some of which may help. See http://math.stackexchange.com/questions/1991575/why-cant-the-second-fundamental-theorem-of-calculus-be-proved-in-just-two-lines – Ethan Bolker Dec 06 '16 at 15:16
  • I get the impression that the issue here is more about notation than anything else. –  Jan 06 '17 at 09:36
  • Incidentally, a "strictly axiomaticist person" should have absolutely no trouble accepting a definition of a thing in terms of how you use it. You seem to be insisting on reductionism, not 'axiomaticism'. –  Jan 06 '17 at 10:02
  • @Hurkyl "...should have absolutely no trouble accepting a definition of a thing in terms of how you use it." Sorry but I think it's pragamtism - just wanted to add another "-ism" :) - But seriously I think an axiomaticist would only accept definitions if they are defined in terms of predefined or primitive notions. – Hamed Begloo Jan 10 '17 at 16:18
  • @Hamed: An axiomatic system is exactly the opposite of that -- the main point of axioms for some concept is that one can reason about that concept using just the axioms. (in fact, sometimes axioms are used even when you do reduce a notion to something more primitive, because the axioms are easier to work with!) –  Jan 10 '17 at 16:34
  • @Hurkyl Well that seems reasonable too. Anyway does it mean it's better to accept "Differential" as a primitive? – Hamed Begloo Jan 10 '17 at 16:47
  • @Hamed: The usual answer to such questions is "yes and no"; a fair number of the concepts we have in mathematics were invented precisely because treating them as primitive makes a lot of problems much simpler. On the other hand, reducing them to other concepts (or maybe it's more accurate to say that you're relating one concept to a different one) can offer new insights on both concepts too. –  Jan 10 '17 at 20:34

8 Answers

35

Of course, defining $$ \mathrm{d}x= \lim_{\Delta x \to 0}\Delta x $$ is the same as defining $$ dx=0, $$ which makes no sense. The correct approach is to define the differential as a kind of linear function: the differential $df(x)$ (sometimes denoted by $df_x$) is the linear function defined by $$ df(x):\mathbb R\to\mathbb R\qquad t\mapsto f'(x)\cdot t $$ In particular $$ dx:\mathbb R\to\mathbb R\qquad t\mapsto t $$ Therefore, one can also write $ df(x)=f'(x)dx$ (the composition with the identity map).

This perhaps sounds trivial for scalar functions $f$. The concept is more interesting for vector functions of vector variables: in that case $df(x)$ is a matrix. The differential $df(x_0)$ has to be interpreted as the best linear function which approximates the incremental function $h(x):=f(x)-f(x_0)$ near $x=x_0$. In this sense, the concept is connected to the idea you have expressed through the approximate 'equation' $\Delta f(x)\approx {f}'(x)\Delta x$.
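To make "best linear approximation" concrete, here is a minimal numerical sketch (my addition, not part of the original answer; the choices $f(x)=\sin x$, $x_0=1$ and the step sizes are arbitrary and only for illustration). It checks that the error of the linear map $t\mapsto f'(x_0)\,t$ vanishes faster than $t$ itself:

```python
import math

# Illustration only: f(x) = sin(x), so f'(x) = cos(x).
f = math.sin
fprime = math.cos

x0 = 1.0
df = lambda t: fprime(x0) * t   # the differential df(x0): the linear map t -> f'(x0)*t

for t in [0.1, 0.01, 0.001, 0.0001]:
    increment = f(x0 + t) - f(x0)   # the actual increment of f
    error = increment - df(t)       # what the linear approximation misses
    # error/t should tend to 0 as t -> 0, i.e. the error is o(t)
    print(f"t={t:g}   error={error:.3e}   error/t={error/t:.3e}")
```

The printed ratio error/t shrinks roughly in proportion to $t$, which is exactly the sense in which $df(x_0)$ is the best linear approximation to the increment of $f$ at $x_0$.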

guestDiego
  • As someone reviewing calculus for the first time in ages I'm thrown off by your use of $t$. I understand the arrow notation to say it's a function from reals to reals, but I have no idea what the other kind of arrow means or what $t$ is. – Joseph Garvin Oct 03 '17 at 01:42
  • The arrow $\mapsto$ can be pronounced "maps to". It shows the input of the function on the left and the output on the right. For example, $t\mapsto 3t^2$ is the function that takes any number, squares it, and multiplies by $3.$ The symbol on the left of $\mapsto$ (in this case $t$) is just a name for "the input." The answer above says that after we choose a particular value of $x,$ the differential $df(x)$ becomes a function (but not a function of $x$) that multiplies its input by $f'(x).$ (Since we have already chosen $x,$ the expression $f'(x)$ is just a number.) – David K Oct 03 '17 at 02:49
  • If $dx$ is an identity map then wouldn't $df=f'(x)$? – Pineapple Fish Jun 19 '19 at 20:16
  • No, because $df$ is a map and $f'(x)$ is a number. It's necessary to add $dx$ to convert it to a map. – Ricardo Jun 09 '21 at 00:01
10

There are two ways of defining the differential of $y=f(x)$:

(1) as differential forms. Here $dx$ is a linear function on the tangent space (in this case tangent line) at a point, and the formula $dy=f'(x)dx$ is a relation between 1-forms.

(2) as an infinitesimal number. Such a number is an element of the hyperreal number system, as detailed in the excellent textbook by H. J. Keisler entitled Elementary Calculus that we are currently using to teach calculus to 150 freshmen.

Here the independent variable $\Delta x$ is an infinitesimal, one defines $f'(x)=\textbf{st}(\frac{\Delta y}{\Delta x})$ where "$\textbf{st}$" is the standard part function (or shadow) and $\Delta y$ is the dependent variable (also infinitesimal when the derivative exists). One defines a new dependent variable $dy$ by setting $dy=f'(x)dx$ where $dx=\Delta x$. Note that it is only for the independent variable $x$ that we set $dx=\Delta x$ (therefore there is no circularity).
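As a small worked example (my addition, simply applying the definitions above): take $y=x^2$ and a nonzero infinitesimal $\Delta x$. Then $$\frac{\Delta y}{\Delta x}=\frac{(x+\Delta x)^2-x^2}{\Delta x}=2x+\Delta x,\qquad f'(x)=\textbf{st}(2x+\Delta x)=2x,$$ so $dy=f'(x)\,dx=2x\,dx$, which differs from the infinitesimal increment $\Delta y=2x\,\Delta x+(\Delta x)^2$ only by the higher-order term $(\Delta x)^2$.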

The advantage of this is that one can calculate the derivative $\frac{dy}{dx}$ from the ratio of infinitesimals $\frac{\Delta y}{\Delta x}$, rather than merely an approximation; the proof of the chain rule becomes more intuitive; etc.

More generally if $z=f(x,y)$ then the formula $dz=\frac{\partial f}{\partial x} dx + \frac{\partial f}{\partial y}dy$ has two interpretations: as a relation among differential 1-forms, or as a relation among infinitesimal differentials. Classical authors like Riemann interpreted such relations as a relation among infinitesimal differentials.

It is not possible to define $dx$ by a limit as in $\mathrm{d}x= \lim_{\Delta x \to 0}\Delta x$ (as you wrote), because that would simply be zero. However, a generalisation of the limit called the ultralimit, as popularized by Terry Tao, works just fine and produces an infinitesimal value for $dx$.

More specifically, concerning your hope of somehow "defining differentials with the help of limits", the following can be said. The notion of limit can be refined to the notion of an ultralimit by refining the equivalence relation involved in defining the limit. Thus the limit of a sequence $(u_n)$ works in such a way that if $(u_n)$ tends to zero then the limit is necessarily zero on the nose. This does not leave much room for infinitesimals. However, the refined notion, the ultralimit, of a sequence $(u_n)$ tending to zero is typically a nonzero infinitesimal, say $dx$. We can then use this as the starting point for all the definitions in the calculus, including continuity and derivative. The formula $dy= f'(x) dx$ then literally makes sense for nonzero differentials $dx$ and $dy$ (unless of course $f'(x)=0$ in which case $dy=0$).
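For a rough idea of how an ultralimit can produce an infinitesimal (this is my own sketch of the usual ultrapower construction, not part of the answer; it presumes a fixed nonprincipal ultrafilter $\mathcal U$ on $\mathbb N$): the sequence $u_n=\frac1n$ has ordinary limit $0$, but its class $$dx:=[(u_n)]=[(1,\tfrac12,\tfrac13,\ldots)]\in\mathbb R^{\mathbb N}/\mathcal U$$ is positive and yet smaller than every positive real $r$, because the set $\{n:\frac1n<r\}$ is cofinite and hence belongs to $\mathcal U$. This class, the ultralimit of the sequence, is therefore a nonzero infinitesimal.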

The definition is not circular because the infinitesimal $\Delta y$ is defined as the $y$-increment $f(x+\Delta x)-f(x)$. This was essentially Leibniz's approach (differentials are just infinitesimals) and he rarely did things that were circular.

Mikhail Katz
7

We consider a real-valued function $y=f(x)$ differentiable at $x=x_0$.

The following reasoning can be found in section 3.7 of Höhere Mathematik, Differentialrechnung und Integralrechnung by Hans J. Dirschmid.

Definition: We call the change of the linear part of $f$ at $x=x_0$, considered as a function of the argument increment $\Delta x$, the differential of the function $f$ at $x_0$, symbolically \begin{align*} dy=f^\prime(x_0)\Delta x\tag{1} \end{align*} The linear part of $f$ at $x_0$ is the expression \begin{align*} f(x_0)+f^\prime(x_0)\Delta x \end{align*}

Note that we introduce the term $dy$ in (1) without using $dx$ and so avoid any circular reasoning.

Here is a small figure for illustration:

[Figure: illustration of the differential $dy=f^\prime(x_0)\Delta x$ as the change of the linear part of $f$ at $x_0$, viewed as a function of the argument increment $\Delta x$]

When talking about the differential $dy$ we use it for both as a function symbol and as the value of the function $dy$ evaluated at $\Delta x$. \begin{align*} dy=dy(\Delta x)=f^\prime(x_0)\Delta x\tag{2} \end{align*}


Connection with $dx$:

We consider the identity function $y=x$. Since $y^\prime=1$ we obtain by (2) \begin{align*} dy=1\cdot \Delta x=\Delta x \end{align*} Since $y=x$ and $dy=\Delta x$ we use this relationship to define \begin{align*} dx:=\Delta x \end{align*} and call it the differential of $x$.

With this two-step approach we can write $dy=f^\prime(x_0)\Delta x$ as \begin{align*} dy=f^\prime (x_0) dx\tag{3} \end{align*} and resolve the seemingly circular definition.

[Add-on 2016-11-15]:

From (3) we see the differentials $dy$ and $dx$ are proportional as functions of $\Delta x$. Since we are allowed to divide real functions, we can also consider the quotient \begin{align*} \frac{dy}{dx}=f^\prime(x_0)\tag{4} \end{align*} This justifies the term differential quotient.

Observe that the left-hand side of (4) is the quotient of two functions of the argument increment $\Delta x$, which does not occur on the right-hand side. This implies that the quotient does not depend on the argument $\Delta x$ of the numerator $dy$ and the denominator $dx$.


Approximation of $f$ at $x=x_0$:

The linear part $$f(x_0)+f^\prime(x_0)\Delta x$$ approximates the function $f$ at $x=x_0$ with an error which decreases at an order higher than first order. This implies that the change of the linear part - the differential $dy$ - approximates the change of the function, i.e. the difference $\Delta y=f(x_0+\Delta x)-f(x_0)$, with the same quality of error: \begin{align*} \Delta y=dy+\Delta x \varepsilon(\Delta x),\qquad \lim_{\Delta x\rightarrow 0}\varepsilon(\Delta x)=0. \end{align*}
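As a concrete illustration of these relations (my addition, not from Dirschmid's text): take $f(x)=x^3$. Then \begin{align*} \Delta y&=(x_0+\Delta x)^3-x_0^3=3x_0^2\,\Delta x+3x_0(\Delta x)^2+(\Delta x)^3\\ dy&=f^\prime(x_0)\,\Delta x=3x_0^2\,\Delta x\\ \Delta y-dy&=\Delta x\left(3x_0\,\Delta x+(\Delta x)^2\right), \end{align*} so here $\varepsilon(\Delta x)=3x_0\,\Delta x+(\Delta x)^2$, which indeed tends to $0$ as $\Delta x\rightarrow 0$.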

Markus Scheuer
  • So, you are really talking about a ratio $\frac{dy}{dx}=f'(x_0)$. Correct? (See my comment to the OP.) When would you want to use $dy$ in isolation like this to solve DE's? – Dan Christensen Nov 14 '16 at 15:12
  • Thank you for your answer. But it seems to me this could be just the definition for a special case. I mean, logically speaking, how could a "special" function (namely the identity function) come into play in a "general" definition which should apply to all functions? – Hamed Begloo Nov 14 '16 at 16:31
  • @HamedBegloo: I've added some information which might be useful. – Markus Scheuer Nov 15 '16 at 11:04
  • Thank you for adding more details. But I still think I didn't get an answer to the question stated in my comments. I asked whether the "differential of a function" can be given a general definition by some formula (algebraic law) that does not rely on a special function like the identity function. Am I not correct? – Hamed Begloo Nov 17 '16 at 08:13
  • @HamedBegloo: You're welcome. My answer has as focus your phrase: What bothers me is this definition is completely circular., which I thought is at the heart of your question. – Markus Scheuer Nov 17 '16 at 12:34
  • @HamedBegloo: When thinking about your statement in this last comment, it seems you should shift your point of view a little bit. Note, the differential quotient is a rather mathematically rigid formulation of the proportion of the change $\Delta y$ of any function $y$ (in the $y$-direction) with the change $\Delta x$ of $x$ (in the $x$-direction), obtained by taking the limit. The differential $dy$ is therefore fundamentally related to $\Delta x$ resp. $dx$. I don't see any possibility, or better any reason, why we should want to get rid of it. – Markus Scheuer Nov 17 '16 at 12:34
  • @MarkusScheuer I think you actually got it the other way around. In fact, not only do I not want to get rid of the notion of "differential quotient", I want to make it more well defined. That is one of the reasons I suggested defining it as the limit of a difference as the difference approaches zero. This would give us the ability to represent derivatives as a ratio of two differentials (as we know, we can show the limit of a ratio is the limit of the numerator divided by the limit of the denominator). – Hamed Begloo Nov 17 '16 at 15:32
  • @MarkusScheuer But although they are similar, I think there is an important distinction between $\Delta x$ and $\mathrm{d} x$. What I thought was that it might be possible to introduce a general definition of "Differential" without using special functions (and of course without being circular). – Hamed Begloo Nov 17 '16 at 15:32
3

I think the differential forms version deserves to be fleshed out a little more:

Let $x, y, z, \ldots$ be all the (scalar) variables in use. Write $p$ for a tuple that assigns values to those variables: $(x_p, y_p, z_p, \ldots)$. Then a variable quantity is a (mathematical) function that assigns a (real or vector) value to each tuple $p$. Note that the variables are well-defined variable quantities given by

$$x(x_p, y_p, z_p, \ldots) = x_p\\ y(x_p, y_p, z_p, \ldots) = y_p\\ z(x_p, y_p, z_p, \ldots) = z_p\\ \vdots$$

For each variable quantity $E$, we're going to define another quantity $dE$. In particular, if $E$ is a real variable quantity, the differential $dE$ of $E$ is going to be a partial function that assigns to each assignment $p$ a linear transformation from the vector space of assignments to the vector space of real numbers (under addition). If $E$ is a vector variable, $dE$ will map each $p$ to a linear transformation from the vector space of assignments to the vector space where $E$ takes its values (this is a generalization of the definition for real variables).

If $\Delta p$ is a small displacement of the assignment $p$, we want $E(p) + dE(p)\Delta p$ to be a good approximation to $E(p + \Delta p)$. Note first that $$dE(p)\Delta p \to 0 \text{ as } \Delta p \to 0$$ by definition, since we want $dE(p)$ to be linear. So unless $$E(p + \Delta p) \to E(p) \text{ as } \Delta p \to 0$$ i.e., unless $E$ is continuous at $p$, $E(p) + dE(p)\Delta p$ is never going to be a good approximation to $E(p + \Delta p)$. So we're only going to look at points $p$ where $E$ is continuous (there may not be any such points).

On the other hand, $$E(p) + Q\Delta p \to E(p) \text{ as } \Delta p \to 0$$ for all linear transformations $Q$, so that alone can't be a sufficient definition of $dE(p)$. Consider the following: $$x \to 0 \text{ as } x \to 0\\ x^2 \to 0 \text{ as } x \to 0,$$ but $$\frac{x}{x} \to 1 \text{ as } x \to 0\\ \frac{x}{x^2} \to \infty \text{ as } x \to 0\\ \frac{x^2}{x} \to 0 \text{ as } x \to 0$$ Intuitively, you can see that $x$ and $x^2$ go to $0$ at different speeds as $x \to 0$. We can use that idea to pin down $dE(p)$ more precisely. At a minimum, we want $E(p) + dE(p)\Delta p$ to approach $E(p + \Delta p)$ faster than $\Delta p$ goes to $0$. We can write this formally (rigorously) as $$\frac{E(p + \Delta p) - E(p) - dE(p)\Delta p}{\|\Delta p\|} \to 0 \text{ as } \Delta p \to 0$$ Note that this is precisely the same thing as defining $dE(p)$ to be the (vector) derivative of $E$ at $p$. The uniqueness of the linear transformation (if it exists) satisfying this property (the best linear approximation to $E$ at $p$) is a basic theorem proven in any vector analysis textbook.
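As a quick two-variable check of this definition (my own example, not in the original answer): take the variable quantity $E=xy$, so $E(p)=x_p y_p$, and the candidate $dE(p)\colon\Delta p\mapsto y_p\,\Delta x_p+x_p\,\Delta y_p$, where $\Delta x_p,\Delta y_p$ are the components of the displacement $\Delta p$. Then $$\frac{E(p+\Delta p)-E(p)-dE(p)\Delta p}{\|\Delta p\|}=\frac{\Delta x_p\,\Delta y_p}{\|\Delta p\|},$$ and since $|\Delta x_p\,\Delta y_p|\le\tfrac12\|\Delta p\|^2$, this tends to $0$ as $\Delta p\to 0$. So $d(xy)=y\,dx+x\,dy$ in the sense defined here.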

The variable quantity $f(x)$ is really a composition: $f(x)(p)$ really means $f(x(p))$. So the rule $$d(f(x)) = f'(x)dx$$ (which really means $$d(f(x))(p) = f'(x(p))(dx(p))$$) is just a simple application of the chain rule.

3

What bothers me is this definition is completely circular. I mean we are defining differential by differential itself. Can we define differential more precisely and rigorously?

What book are you reading and where did you find such a definition? Since you mentioned Stewart in your post, I would like to mention that the version he gave in his calculus book is not circular:

[Image: Stewart's definition of the differentials, in which $dx$ is taken to be an independent variable and the differential $dy$ is defined by $dy = f'(x)\,dx$]


[Added later:] In Stewart's definition, he is using the differential of $x$ to define the differential of $y$, which is not circular because they are two different things in the definition: first you define $dx$ to be $\Delta x$, which is a real number, and call it the "differential of $x$"; then you define "the differential of $y$ (at $x$)" to be $f'(x)\ dx$ and denote it by $dy$.


First of all we define differential as $\mathrm{d} f(x)=f'(x)\mathrm{d} x$ then we deceive ourselves that $\mathrm{d} x$ is nothing but another representation of $\Delta x$

No. It is the other way around in Stewart's definition. He defines $dx$ to be $\Delta x$ first.

and then without clarifying the reason, we indeed treat $\mathrm{d} x$ as the differential of the variable $x$

Again, it is the other way around. First $dx$ is defined, then it is called the differential of $x$.

and then we write the derivative of $f(x)$ as the ratio of $\mathrm{d} f(x)$ to $\mathrm{d} x$. So we literally (and also by stealthily screwing ourselves) defined "Differential" by another differential and it is circular.

No. The notation $\frac{dy}{dx}$ is not defined by $dy$ and $dx$. The three notations $\frac{dy}{dx}$, $dy$ and $dx$ are completely different things. You could say that this is an abuse of notation, but not circular.


I prefer the answer to be in the context of "Calculus" or "Analysis" rather than the "Theory of Differential forms". And again I don't want a circular definition. I think it is possible to define "Differential" with the use of "Limits" in some way.

  • In the context of an undergraduate-level calculus course, I don't think you should expect a "rigorous" definition of the differential of a function. In a "rigorous" analysis book, one would not even use the symbol "$\approx$". It seems that you already accept that an expression like $ \Delta y\approx f'(x)\Delta x $ is not rigorous.

  • The trouble with defining the differential of a function is that the mathematical objects "$dx$" and "$dy$" are not even real numbers. (By the way, I don't think any calculus book would tell you what a real number really is.) One might appreciate the beauty and rigour of the $\epsilon$-$\delta$ definition of a limit so much that one might think it's the only way to make a mathematical concept rigorous. However, that is not the case. In an undergraduate linear algebra course, one would rarely see any argument using the $\epsilon$-$\delta$ language. Without knowing what a linear transformation is (which, I would say, is the minimum requirement for giving a rigorous definition of differentials, if one does not want to resort to so-called non-standard analysis), one would hardly know what the differential of a function really is.

  • If you want to read "rigorous" mathematics, a book like Stewart's (though good for an introduction) would not be appropriate for you. You could try Analysis (I and II) by Terence Tao.

  • As Terence Tao said: There’s more to mathematics than rigour and proofs.

  • Thank you for your answer. But I think there is a reason why Leibniz's notation is used widely in calculus (even in advanced calculus books). If it didn't have a deep meaning we might have always used Lagrange's notation for derivatives and antiderivatives. Also consider how easily we break up differentials inside antiderivatives, or how easily we take differentials from one side of a differential equation to the other side to solve it. It seems we are treating differentials as actual algebraic expressions. If that weren't true, why do we see this treatment even in advanced textbooks? – Hamed Begloo Nov 17 '16 at 15:50
  • Sorry, I don't understand your comment. (1) I have no objection to Leibniz's notation for the derivative of a function. (2) I didn't say the notation has a deep meaning. (3) I'm making the point that the definition in Stewart is not circular. –  Nov 17 '16 at 18:31
  • OK, I will explain: (1) I meant that if differentials didn't have a rigorous meaning, then it would be possible to completely reformulate "Calculus" using Lagrange's notation (for both univariate and multivariate functions and for both the derivative and antiderivative operators), and thus we wouldn't have to struggle over what a differential is, and there would be no need to use Leibniz's notation and face this confusion. – Hamed Begloo Nov 17 '16 at 19:18
  • (2) By "Deep meaning" I meant the notation says that $\mathrm{d} y$ and $\mathrm{d} x$ have separate standalone meanings and $\frac{\mathrm{d} y}{\mathrm{d} x}$ has another meaning and then I concluded: "It seems we are treating differentials as actual algebraic expressions." I also gave some examples of this treatment. Finally I think this means "Differential" should be defined rigorously and separately. – Hamed Begloo Nov 17 '16 at 19:18
  • (3) Again (as I stated in the 2nd edit of my question), I think it's a deception, because we later treat $\mathrm{d} y$ and $\mathrm{d} x$ as two mathematical objects of the same kind. Especially when it comes to using the chain rule (cancelling out identical differentials), integration (breaking one differential into other differentials), solving differential equations (moving differentials of functions/variables from one side to the other), etc. – Hamed Begloo Nov 17 '16 at 19:19
0

My advice: Don't worry about it. I've always taught calculus without defining the damned things and done well with that approach. I of course push around differentials from time to time, as in changes of variables for integrals, but I introduce it with a public service announcement: this doesn't make literal sense, everybody, but let's use it as a convenient notational device.

Let me say I think $dy/dx$ as notation is great in some ways, and $\int_a^b f(x)\, dx$ is even better. It reminds you of where these objects of study come from. But the notation $dy/dx$ should be taken as a whole. It's not a quotient of anything, although in appearance it reminds one of the quotients $\Delta y/\Delta x.$ We should stop trying to carve $dy/dx$ into smaller pieces and leave it alone! (I once had a student who looked at $dx^2/dx$ on an exam, cancelled the $d$'s, then cancelled two $x$'s and obtained the answer of $x.$ I had to admit it had the right order of magnitude.)

To define $df$ as a linear mapping can confuse the heck out of students at the beginning. I remember self-studying calculus out of Thomas back in the day, and I still have a copy of that book. Thomas tried to explain $df$ as this linear mapping thingie, and rereading it now, it seems like a joke, a terrible idea. That seems far removed from the original idea of $df$ as something "incredibly small".

Sure, in the more advanced setting of multivariable calculus, you'll see $df$ all over the place, denoting a certain linear mapping. That's a whole different ball of wax however. It's decent enough notation there, when you have experience, and when there is little chance of confusion with the original notions of differentials.

As for hyperreals and nonstandard analysis and all that, I am not qualified to say much. I've always been skeptical of this stuff. Seems to me to go beyond the "ghosts of departed quantities" to dark matter. But some mathematicians (not that many really) love this approach. Anyone going down this road should be advised that you will learn a language not too many of your peers and teachers will understand.

zhw.
  • You have a very lively writing style. I like it. – littleO Nov 16 '16 at 02:22
  • Thank you for your advice. But I learned that mathematics is a precise and rigorous field of study which, in the worst case, is at least a self-consistent formalism. Maybe science can be stated in an informal way, but I don't expect mathematics to be this informal. – Hamed Begloo Nov 17 '16 at 08:14
  • As somebody studying out of Thomas right now, it would be useful if you could expand a bit on the idea of these things as units (I assume you mean something like how physical units are cancelled and such in physics). Also on how there is less chance of confusion once you get to multivariable calculus. I have reviewed linear algebra and so know what a linear transformation is, but it's not really clarifying the notation for me. – Joseph Garvin Oct 03 '17 at 02:03
  • @JosephGarvin Perhaps "units" is not a good choice of words. Nothing to do with the units of physics. The idea is that the symbol $dy/dx$ should be taken as a whole. Don't carve it up into "smaller pieces". (I think I'll edit my answer.) In the multivariable setting, I'm saying you are more advanced, are better at abstraction, and don't need as much feel good intuition. You could write  $Df(x), df(x),$ or something else. It wouldn't matter much. But the $D$ or $d$ reminds you of the word "derivative" which is where the idea was born. – zhw. Oct 03 '17 at 02:32
  • Agreeing with Hamed, the question is "what exactly is a differential" not how to use one. – Pineapple Fish Jun 19 '19 at 20:31
0

The differential of a function at a given point is the linear part of its behavior.

When you write $$f(x+dx)=f(x)+\Delta_f(x,dx),$$ the term $\Delta_f$ has a linear part, i.e. a part strictly proportional to $dx$, which we can denote $dy=s\,dx$, where $s$ is a constant, and a remainder, say $\Delta'_f$.

Hence,

$$\Delta_f(x,d x)=s\,dx+\Delta'_f(x,dx)$$ where $\Delta'_f$ has a superlinear behavior at $x$ (quadratic or more). Thanks to this property, we can define $s$ by means of a limit, letting $\Delta'_f$ vanish:

$$s:=\frac{\Delta_f(x,dx)-\Delta'_f(x,dx)}{dx}=\lim_{dx\to0}\frac{\Delta_f(x,dx)}{dx}.$$

(In fact $s$ is defined when the limit exists.)

Of course, this definition coincides with that of the derivative, which allows us to write

$$dy=f'(x)\,dx.$$

Note that $dx,dy$ are not considered as "infinitesimals", but as finite numbers (variable but proportional to each other).
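A quick sanity check of this decomposition (my added example): for $f(x)=x^2$ we have $$f(x+dx)=x^2+2x\,dx+dx^2=f(x)+\underbrace{2x\,dx}_{dy=s\,dx}+\underbrace{dx^2}_{\Delta'_f(x,dx)},$$ and indeed $$s=\lim_{dx\to0}\frac{2x\,dx+dx^2}{dx}=2x=f'(x),$$ with the remainder $\Delta'_f(x,dx)=dx^2$ superlinear (here quadratic) in $dx$.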