
So I think I understand what differentials are, but let me know if I'm wrong.

So let's take $y=f(x)$, where $f: [a,b] \subset \Bbb R \to \Bbb R$. Instead of defining the derivative of $f$ in terms of the differentials $dy$ and $dx$, we take the derivative $f'(x)$ as our "primitive". Then to define the differentials we do as follows:

We find some $x_0 \in [a,b]$ with a neighborhood $N(x_0)$ such that $f$ is differentiable at every $x \in N(x_0)$. Then we choose another point in $N(x_0)$, let's call it $x_1$, such that $x_1 \ne x_0$, and let $dx = \Delta x = x_1 - x_0$. Now this $\Delta x$ doesn't actually have to be very small, as we're taught in Calculus 1 (in particular it's not infinitesimal, it's finite). In fact, as long as $f$ is differentiable for all $x \in [-10^{10}, 10^{10}]$, we could choose $x_0 = -10^{10}$ and $x_1 = 10^{10}$.

Then we know that $\Delta y = f'(x_0) \Delta x + \epsilon(\Delta x)$, where $\epsilon(\Delta x)$ is the nonlinear remainder, which satisfies $\epsilon(\Delta x)/\Delta x \to 0$ as $\Delta x \to 0$. If $f$ is smooth, Taylor's theorem tells us that $\epsilon(\Delta x)$ can be written as a sum of higher powers of $\Delta x$ with appropriate coefficients (plus a remainder term); of course, $\epsilon(\Delta x)$ won't be so easy to describe if $f$ is only once differentiable. So we define $dy$ as $dy = f'(x_0)\, dx$: that is, $dy$ is the linear part of $\Delta y$. This has the very useful property that $\lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x} = \frac{dy}{dx} = f'(x_0)$. This is then not a definition of the derivative, but a consequence of our definitions.
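For concreteness, here's a quick numerical sketch of this decomposition (Python, with the toy function $f(x) = x^3$; both are my own arbitrary choices and not essential to the argument):

```python
# Delta y splits into a linear part f'(x0)*dx plus a remainder eps(dx)
# that shrinks faster than dx.  Toy example: f(x) = x**3, f'(x) = 3*x**2.
def f(x):
    return x**3

def fprime(x):
    return 3 * x**2

x0 = 2.0
for dx in [1.0, 0.1, 0.01, 0.001]:
    delta_y = f(x0 + dx) - f(x0)   # Delta y, the actual change
    dy = fprime(x0) * dx           # dy, the linear part
    eps = delta_y - dy             # eps(dx), the remainder
    print(dx, dy, eps, eps / dx)   # eps/dx -> 0 as dx -> 0
```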

It can be seen from this that $dy$ really depends on what we choose as $dx$, but $f'$ is independent of both.

This definition can be extended to functions of multiple variables, like $z = f(x, y)$, by letting $\Delta x = dx,\ \Delta y = dy$ and defining $dz = \frac{\partial f(x_0, y_0)}{\partial x}dx + \frac{\partial f(x_0, y_0)}{\partial y} dy$. So $dz$ is the linear part of $\Delta z$.
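Here's the same idea in two variables as a quick sketch (again Python; the example $f(x,y) = x^2 y$ is just something I picked):

```python
# dz is the linear part of Delta z for f(x, y) = x**2 * y,
# whose partials are df/dx = 2*x*y and df/dy = x**2.
def f(x, y):
    return x**2 * y

x0, y0 = 1.0, 2.0
fx, fy = 2 * x0 * y0, x0**2                       # partial derivatives at (x0, y0)
for h in [1.0, 0.1, 0.01]:
    dx, dy = h, h                                 # Delta x = dx, Delta y = dy
    delta_z = f(x0 + dx, y0 + dy) - f(x0, y0)     # Delta z, the actual change
    dz = fx * dx + fy * dy                        # dz, the linear part
    print(h, delta_z - dz)                        # remainder shrinks faster than h
```

Does all of the above look correct?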

If so, then where I'm having a problem is:
1) how then do we define the derivative of $f(x)$ if not by $f'(x_0) = \lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x}$?
2) how do we apply this definition of $dx$ to $\int_a^b f(x)dx$? It seems like the inherent arbitrariness of $dx$ is really going to get in the way of a good definition of the integral.

4 Answers


$\mathrm{d}y$ depends only on $y$: it doesn't depend on any choice of $x$ or anything else. That's one of the big advantages of differentials (as opposed to, say, partial derivatives).

A differential is a gadget that expresses how something varies. There are three main things you can do with such a gadget:

  • You can compare two differentials: e.g. if $x$ and $y$ are dependent on one another in a differentiable way, then their differentials are multiples of each other; e.g. if $y = f(x)$, then $\mathrm{d}y = f'(x) \mathrm{d}x$.
  • Given a differential, you can ask if it has an antiderivative: e.g. $2x \mathrm{d}x$ is the differential (often called the "exterior derivative") of $x^2$.
  • You can compute a (path) integral to 'add up' along a path all of the variations the differential expresses. e.g. $\int_0^1 2x \mathrm{d}x$ means we 'accumulate' all of the variations $2x \mathrm{d}x$ as we go from $x=0$ to $x=1$. And as we know $2x \mathrm{d}x = \mathrm{d}(x^2)$, our intuition is satisfied in the sense that accumulating how $x^2$ varies from $x=0$ to $x=1$ works out to $1^2 - 0^2$. (A short numerical sketch of this accumulation follows the list.)
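To make that last bullet concrete, here is a rough numerical sketch of the accumulation (Python is my choice here; nothing about it is essential):

```python
# Accumulate the variations 2*x*dx over a fine partition of [0, 1];
# the running total should land near 1**2 - 0**2 = 1.
n = 1000
total = 0.0
for i in range(n):
    x = i / n            # left endpoint of the i-th subinterval
    dx = 1.0 / n         # width of the subinterval
    total += 2 * x * dx  # add up this little variation of x**2
print(total)             # 0.999 here, i.e. 1 - 1/n, approaching 1 as n grows
```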

You can also ask the differential to give you an ordinary number expressing a variation along a (tangent) vector. A common notation for this, e.g. in $(x,y)$ coordinates, is to let the symbols $\partial/\partial x$ and $\partial/\partial y$ denote vectors; then, for a differential $\omega$, the notation $\frac{\partial}{\partial x} \omega$ means the ordinary number that $\omega$ yields for a variation along the vector $\partial/\partial x$.

e.g. we have $$ \frac{\partial}{\partial x} \mathrm{d}x = 1 \qquad \qquad \frac{\partial}{\partial x} \mathrm{d}y = 0 \qquad \qquad \frac{\partial}{\partial y} \mathrm{d}x = 0 \qquad \qquad \frac{\partial}{\partial y} \mathrm{d}y = 1$$
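If it helps to see this pairing very concretely, here is one simplified way to encode it (a Python sketch of my own; it is only an illustration, not a standard construction):

```python
# Encode a differential a*dx + b*dy as a rule that eats a tangent vector
# (vx, vy) and returns the number a*vx + b*vy.
def differential(a, b):
    def omega(vx, vy):
        return a * vx + b * vy
    return omega

dx = differential(1, 0)   # the differential dx
dy = differential(0, 1)   # the differential dy
ddx = (1, 0)              # the vector written d/dx above
ddy = (0, 1)              # the vector written d/dy above

print(dx(*ddx), dy(*ddx), dx(*ddy), dy(*ddy))   # 1 0 0 1, matching the table
```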

This is consistent with the notation for partial derivatives you've learned, in that, e.g.,

$$ \frac{\partial}{\partial x} f = \frac{\partial}{\partial x} \mathrm{d} f $$

where the left hand side is the meaning taken from introductory multivariable calculus, and the right hand side is the meaning I describe above (usually first introduced in differential geometry).

Incidentally, I think partial derivative notation is absolutely terrible, and I avoid using it whenever possible. I also find differentials more intuitive than partial derivatives, and I prefer to do all of my calculus in terms of differentials these days. A convenient analog of $f'$ for multivariable functions is to let, e.g., $f_1$ denote the derivative of $f$ in its first argument, $f_2$ the derivative in its second argument, and so forth. So I would prefer to write

$$ \mathrm{d}f(x,y) = f_1(x,y) \mathrm{d}x + f_2(x,y) \mathrm{d}y $$

rather than anything resembling the traditional notation for partial derivatives. If I want derivatives in the direction where $y$ is held constant, I express that by setting $\mathrm{d}y = 0$ rather than resorting to partial derivatives.
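As a small illustration of this notation (a Python sketch with a toy function $f(x,y) = x y^2$ of my own choosing):

```python
# For f(x, y) = x*y**2: f1 (derivative in the first argument) is y**2,
# and f2 (derivative in the second argument) is 2*x*y.
def f(x, y):
    return x * y**2

def f1(x, y):
    return y**2

def f2(x, y):
    return 2 * x * y

x0, y0 = 3.0, 2.0
dx, dy = 0.01, 0.0                         # holding y constant: set dy = 0
df = f1(x0, y0) * dx + f2(x0, y0) * dy     # df = f1*dx + f2*dy
print(df)                                  # 0.04
print(f(x0 + dx, y0 + dy) - f(x0, y0))     # actual change; agrees with df up to roundoff
```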

This use of combining vectors with differentials is related to the (unfortunately common) mistake / abuse of notation that you often see, where the notation $\mathrm{d}x$ is treated as an actual change in $x$, rather than as a gadget that can tell you what the change in $x$ is.

  • Is the implication here that there is no good definition of a differential? Or that it's something very complex and that it's more important to know the "main things you can do with" them? – Hola_Mundo Jul 23 '14 at 22:40
  • @user: There are several good ways to define them, but understanding the definition requires more sophistication than understanding how to use them. Compare to the fact that you can use real numbers quite proficiently without ever actually having seen the definition of the real numbers. If you go into differential geometry, you will see the definition of differential forms. If you go into abstract algebra, especially algebraic geometry, you will see them defined by means of generators and relations (i.e. the algebraic rules differentials satisfy). –  Jul 23 '14 at 22:47
  • The only thing I have against the notation $f_1$ for the 1st-argument partial of $f$ is that a common notation is $f = (f_1, f_2, ..., f_n)$. Of course, if it's clearly defined the first time that notation is used, that's not a problem. – Hola_Mundo Jul 23 '14 at 22:48
  • There is another definition of differential that is more elementary and encodes $\mathrm{d}f$ as an ordinary function: see Wikipedia. I don't know how adequate this definition is in the face of all the things differentials are used for, so I hesitate to recommend it. But it is closely related to the "combining a differential with a vector to return a number expressing the variation along that vector" meaning. –  Jul 23 '14 at 22:48
  • While I am interested in the notations and ideas you bring up here, the question was really more about how to define differentials and how those definitions relate to differentiation and integration. So your response really doesn't answer my question as is. That said, can you recommend some resource for learning more about what you've written here? It does seem interesting. – Hola_Mundo Jul 24 '14 at 14:37

Also, the definition of a derivative that I learned was $\lim_{h \to 0} \frac{f(x+h) - f(x)}{h}$... basically rise over run as the run ($\mathrm{d}x$) approaches $0$ (thus the tangent line concept).

user44789
  • Here's a tutorial on MathJax. – Hola_Mundo Jul 23 '14 at 22:13
  • $\Delta y = f(x_1) - f(x_0)$ by definition. Likewise $\Delta x = x_1 - x_0$. Now let $h=\Delta x = x_1 - x_0$. Then $\Delta y = f(x_0 + h) - f(x_0)$. So you can see that your definition is exactly the same as $\lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x}$. – Hola_Mundo Jul 23 '14 at 23:59

OP here. I think I've figured this out:

This definition does seem to hold for differentiation and integration.

Differentiation
My worry here was that because $\Delta x = dx$ and $\Delta y$ is a function of $\Delta x = dx$, the limit $\lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x}$ would also be dependent on $dx$, which would make the definition $f'(x) := \lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x}$ a circular argument (since, I thought, $dx$ was defined in terms of $f'(x)$) -- but in fact, it's really only $dy$ that's defined in terms of $f'(x)$. $dx$ is just some arbitrary change in $x$, i.e., $dx = x_1 - x_0$. We need $f'(x_0)$ to be defined in terms of $x_0$ alone, and the inherent arbitrariness of $x_1$ would make anything defined explicitly in terms of $dx$ not well-defined. However, what the limit operation really does is remove the arbitrariness of $x_1$. That is, the smaller we make $\| x_1 - x_0 \|$, the less "arbitrary" $x_1$ is; in the limit, it has lost all of its "arbitrariness". So $\lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x}$ does not actually depend on the value we initially select for $x_1$ at all. So good, this definition works for differentiation.
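To convince myself, here's a little numerical experiment (Python; $f(x) = \sin x$ is just a test function I picked): start from two wildly different choices of $x_1$, shrink toward $x_0$, and watch the difference quotient forget the starting choice.

```python
import math

def f(x):
    return math.sin(x)                      # toy function; f'(x) = cos(x)

x0 = 1.0
for x1_start in [100.0, -50.0]:             # two arbitrary initial choices of x1
    dx = x1_start - x0                      # the initial (large, arbitrary) Delta x
    for _ in range(6):
        dx /= 10.0                          # shrink x1 toward x0
    quotient = (f(x0 + dx) - f(x0)) / dx    # difference quotient at the now-small dx
    print(x1_start, quotient, math.cos(x0)) # both rows come out close to cos(1)
```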

Integration
I couldn't figure out here what some arbitrarily large interval $dx$ (weird to hear $dx$ described as "large", isn't it?) had to do with integration (and thus why it would be in the integrand). But again, an arbitrary interval under a de-arbitrarifying process (a limit) is exactly what we need. In this case $\int_a^b f(x)dx$ really means something like $$\lim_{\| dx_i \| \to 0} \sum_{i=0}^n f(x_i)dx_i= \lim_{\max(x_{i+1}-x_i) \to 0} \sum_{i=0}^n f(x_i)(x_{i+1} - x_i)$$ where each $dx_i = x_{i+1} - x_i$ is the width of a subinterval of $[a,b]$. Notice that each $x_{i+1} \to x_i$, and thus $x_{i+1}$ loses its "arbitrariness" in this limit.
In case you're thinking that these $dx_i$ are not defined the same way as in my question, notice that in the definition I gave, $x_0$ was fixed by the problem -- the only variable that we were free to fix was $x_1$. That's exactly how these $dx_i$ are defined: for instance, the first one is $dx_0 = x_1 - x_0 = x_1 - a$ for some arbitrary $x_1 \in (a,b]$. Assuming $x_1 \ne b$, then to fill out our partition we must define another subinterval $dx_1 = x_2 - x_1$, where $x_1$ in this case is not arbitrary -- it's defined to be the endpoint of our $1^{st}$ (or $0^{th}$, maybe, because of how I chose to define my partition) subinterval -- but $x_2$ is arbitrarily chosen from the interval $(x_1, b]$. And so on.
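Here's a rough numerical sketch of that refinement (Python; the uniform partitions and $f(x) = x^2$ on $[0,1]$ are purely my own simplifying choices):

```python
# Riemann sums for f(x) = x**2 on [0, 1] with finer and finer partitions;
# the sums approach the integral, 1/3, as the largest subinterval width shrinks.
def f(x):
    return x**2

a, b = 0.0, 1.0
for n in [10, 100, 1000, 10000]:
    dx = (b - a) / n                                   # all subinterval widths equal here
    total = sum(f(a + i * dx) * dx for i in range(n))  # sum of f(x_i) * dx_i
    print(n, total)                                    # tends to 1/3
```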

NOTE: I don't particularly like the notation I used to "define" integration above (the $\lim_{\| dx_i \| \to 0} \dots$ part); one of several reasons is that it really doesn't tell you that in the limit, $n \to \infty$. The Wikipedia page on the Riemann integral doesn't have such an equation (they just describe it with words... ugh). Do you guys know a better notation?

Conclusion
It seems that defining $dx := \Delta x$ and $dy := f'(x)dx$ does exactly what it needs to do. The relationship between the two can be used to approximate $\Delta y$ for small changes in $x$; neither differentiation nor integration is defined in terms of $dy$; and though both are defined in terms of $dx = x_1 - x_0$, they do not depend on any value we initially select for $x_1$. Probably one of the best things is that because $dx$ and $dy$ are finite, the relationship $f'(x) = \frac{dy}{dx}$ holds for any differentiable $f$.

NOTE: This definition of $dx$ and $dy$ does not seem to be the same as the one used in differential geometry -- as described by Hurkyl. But then, I'm not entirely sure, because I don't completely understand Hurkyl's answer. If anyone knows of a good primer on the notations and concepts he's using -- suitable for someone who's gone through only the calculus sequence and linear algebra -- I would be grateful for a link. However, even if the definitions are different, that doesn't mean mine isn't usable -- in fact, unless you guys can come up with a situation where $dx = x_1 - x_0$ and $dy = f'(x)dx$ (where $dx$ and $dy$ can be arbitrarily large) don't do what they're supposed to, I'm just going to take them as my definition from now on.

  • The moral is, I guess, that introductory calculus teachers need to quit telling students that $dx$ and $dy$ are really, really small changes in $x$ and $y$, so that we don't have to spend an entire day figuring out a better definition on our own. – Hola_Mundo Jul 24 '14 at 02:42

Differentials are infinitely small changes in $x$ or $y$. For instance, the concept of the integral is the sum of the areas of an infinite number of rectangles under a curve. The height of each is $f(x)$ and the width is $dx$.

  • When you write $dx$, is $dx$ a real number? Would you say it's infinitely small? What does that mean? – littleO Jul 23 '14 at 22:13
  • It's just a very small change in $x$. For instance, $1.001 - 1.0005$. So theoretically it's a real number but isn't generally represented as one. And by infinitely small it's meant that it's as small as you let it be. However, calculus is based on $dx$ being extraordinarily small for the closest accuracy of computations. – user121955 Jul 23 '14 at 22:32