41

In Leibniz notation, the second derivative is written as $$\dfrac{\mathrm d^2y}{\mathrm dx^2}.$$

Why does the $2$ appear in different places in the numerator and the denominator, compared with $\mathrm dy/\mathrm dx$?

bobobobo
  • 9,502
  • 24
    Mostly to confuse you. (I am somewhat serious.) – Pete L. Clark Mar 05 '11 at 03:53
  • 11
    It's somewhat logical that squaring the operator $\frac{d}{dx}$ results in $\frac{d^2}{dx^2}$ though... now let it operate on $y$. – Myself Mar 05 '11 at 03:57
  • 1
    @Pete L. Clark: Really? I mean, there are ways of making the notation rigorous... – Jesse Madnick Mar 05 '11 at 04:17
  • 6
    @Jesse: not really really. I agree that the notation is sensible (as evidenced by the answers below). But from a pedagogical perspective, I have found that explaining it is often more trouble than it's worth: it's tempting to switch to $y''$ instead. Really really the entire Leibniz notation, suggesting a ratio of differentials, is somewhat unfortunate. You could make it rigorous using (e.g.) nonstandard analysis, but the average student -- even if she is a future mathematician -- is not going to see or use that perspective. – Pete L. Clark Mar 05 '11 at 04:23
  • @Jesse: the same remarks apply to your (nice) explanation: try it out on a calculus student, or even a typical undergrad math major, and see what kind of response you get. – Pete L. Clark Mar 05 '11 at 04:27
  • @Pete: For calculus students, sure, I can imagine that explaining the notation probably is an irritation. And while the average mathematician may not ever use non-standard analysis, surely she would encounter Hessians and Frechet derivatives and such at some point. – Jesse Madnick Mar 05 '11 at 04:30
  • @Pete: Ah, sorry, I type slowly and missed your response. Yes, I suppose you're right... I guess it's simply my hope that the Frechet derivative perspective is given its fair time at some point, at least in upper-division math courses. – Jesse Madnick Mar 05 '11 at 04:33
  • @Pete I find your perspective of the notation "unfortunate" interesting considering that most of the scholars after Newton and Leibniz used Leibniz's notation much more often. – Doug Spoonwood Aug 21 '11 at 15:52
  • @Doug: rereading these comments, I find I was being unusually flippant. More seriously: of course the Leibniz notation is used by almost everyone at times, including me. What is "unfortunate" is the sort of yes/no answer that you get (or give!) when asked whether $\frac{dy}{dx}$ is actually a ratio of two quantities. – Pete L. Clark Aug 21 '11 at 20:11
  • https://math.stackexchange.com/questions/475016/leibniz-notation-for-high-order-derivatives/4831211#4831211 – zkutch Jan 24 '24 at 11:33

6 Answers

27

Purely symbolically, if we accept that $dy = f'(x)\,dx$, and treat $dx$ as a constant, then $$d^2y = d(dy) = d(f'(x)\,dx) = dx\,d(f'(x)) = dx\,f''(x)\,dx = f''(x)\,(dx)^2,$$ so dividing yields: $$\frac{d^2y}{(dx)^2} = \frac{d^2y}{dx^2} = f''(x).$$
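
As a concrete check of this symbolic computation (an illustrative example, not part of the original argument), take $y = x^3$ and keep treating $dx$ as a constant:

$$\begin{aligned} dy &= 3x^2\,dx, \\ d^2y &= d(3x^2\,dx) = dx\,d(3x^2) = dx\cdot 6x\,dx = 6x\,(dx)^2, \\ \frac{d^2y}{dx^2} &= 6x, \end{aligned}$$

which agrees with $f''(x) = 6x$ computed the usual way.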

As to where this notation actually comes from, though: My guess is that it comes from a time when mathematicians primarily thought of $dx$ and $dy$ as "infinitesimal quantities." There are ways of doing so rigorously (via non-standard analysis), and perhaps there is a way of making this notation rigorous that way.


However, we can still give rigorous meaning to these calculations without appealing to non-standard analysis by using the language of bilinear forms.

If $f$ is differentiable, we can define a map \begin{align*} df\colon \mathbb{R} & \to L(\mathbb{R}; \mathbb{R}) \\ df(x)(dx) & = f'(x)\,dx. \end{align*} Here, $L(\mathbb{R};\mathbb{R})$ denotes the set of linear maps from $\mathbb{R}$ to $\mathbb{R}$, and $dx$ is simply a real number. Going one step further, we can consider the map $$d^2f = d(df)\colon \mathbb{R} \to L(\mathbb{R};L(\mathbb{R};\mathbb{R})).$$ By identifying $L(\mathbb{R}; L(\mathbb{R}; \mathbb{R}))$ with the set of bilinear maps $B(\mathbb{R} \times \mathbb{R};\mathbb{R})$, we have the bilinear map $$d^2f(x)(dx^1, dx^2) = dx^1\, f''(x) \,dx^2$$ whose associated quadratic form is $$d^2f(x)(dx) = f''(x)\,(dx)^2.$$ It is now perfectly legal to divide both sides by $(dx)^2$ (for $dx \neq 0$), obtaining $$\frac{d^2f}{dx^2} = f''(x).$$
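
To see the pieces in a specific case (a sketch with an arbitrarily chosen function, writing $h$ and $k$ for the real numbers called $dx^1$ and $dx^2$ above), let $f(x) = e^{2x}$:

$$\begin{aligned} df(x)(h) &= 2e^{2x}\,h, \\ d^2f(x)(h,k) &= h\cdot 4e^{2x}\cdot k, \\ d^2f(x)(h) &= 4e^{2x}\,h^2. \end{aligned}$$

Dividing the last line by $h^2$ (for $h \neq 0$) gives $4e^{2x} = f''(x)$, as expected.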

Jesse Madnick
  • 31,524
  • 2
    If dx and dy referred to infinitesimal quantities I would expect both dx and dy to be treated as a unit. But only dx is, and I'm still left with no idea what it would mean to write $d$ or $d^2$ by itself. Why do the Ds in dy square but not the Ds in dx? If I accept your definition of a map named df, what is the definition of the function/map you're referring to as just $d$ within $d(df)$? Or are you defining a map whose whole name is $d(df)$? – Joseph Garvin Dec 07 '17 at 04:42
  • 6
    $d$ on top is an operator; $dx$ on the bottom is a variable. – Arturo Magidin Dec 07 '17 at 05:13
  • @JosephGarvin: Everything before the "However, we can still..." is purely symbolic; I haven't made any rigorous claims in that top section. So instead, let's talk about the stuff after the "However...", which is rigorous. As Arturo says, I'm using $dx$ (and $dx^1$, $dx^2$) as an ordinary real number (so, a variable), whereas $d$ itself is an operator. See, for instance, the Wikipedia article on Frechet derivatives, but note that the article uses $D$ instead of $d$. – Jesse Madnick Dec 07 '17 at 06:17
  • Is $dy$ an application of the $d$ operator to $y$? If so wouldn't it be more consistent (and sane) to write $d(d(y))$ rather than $d(dy)$? With this notation I'm rolling the dice every time trying to figure out what is a one/two letter real/operator. – Joseph Garvin Dec 09 '17 at 20:13
  • I think the lower half has the same problem. You write a definition for an operator -- it's unclear if you are defining an operator called $df$ or if you are defining an operator called $d$ and naming the function it takes as an argument $f$. You write $d(df)$ -- if I assume the former (you were defining $d$) then it would be clearer to write $d(d(f))$ -- if I assume the latter, then you have defined $df$ without defining $d$. – Joseph Garvin Dec 10 '17 at 19:53
  • 1
    @JosephGarvin: I'm using $df$ as shorthand for $d(f)$, and using $d(df)$ for $d(d(f))$. The operator $d$ inputs functions $f \colon \mathbb{R} \to V$ (where $V$ is a real Banach space) and outputs a certain function called $d(f) \colon \mathbb{R} \to L(\mathbb{R}; V)$. Defining the operator $d$ means defining $d(f)$ for all $f$. Conversely, defining $d(f)$ for every $f$ defines the operator $d$. The definition when $V = \mathbb{R}$ is as in my post. For general $V$, see the Wikipedia article I keep referencing. – Jesse Madnick Dec 10 '17 at 23:07
  • At last I found answer, which coincides with my answer on https://math.stackexchange.com/questions/475016/leibniz-notation-for-high-order-derivatives/4831211#4831211 . (+1). – zkutch Dec 20 '23 at 22:14
26

Somewhat mundanely,

$$ \frac{d}{dx}\left(\frac{d}{dx}(y)\right) = \frac{d}{dx}\left(\frac{dy}{dx}\right) = \frac{d\,dy}{dx\,dx} = \frac{d^2 y}{dx^2} $$

  • 10
    Wait, I am confused with the denominator $dx*dx=dx^2$ how is this possible? – Itakura Oct 14 '15 at 11:38
  • 4
    @KennyGuy maybe they think $dx$ as one variable (just like $x$) which make $dx*dx=dx^2$? – He Yifei 何一非 May 26 '17 at 14:16
  • 17
    This answer doesn't explain anything because the last jump in equality doesn't follow the normal multiplication rules. I don't think anybody asking the original question would be helped by this. Even if I read other answers and accept dx as a constant (in which case at least putting dx in parenthesis before squaring would help), why not dy? Why can you separately square just the d? – Joseph Garvin Dec 07 '17 at 04:30
  • 1
    @JosephGarvin: Are you trying to understand, or just trying to be contrarian? I ask because your tone is particularly combative. The $d$ on top is not the same as the $d$ in the bottom. The $d$ on top is an operator, whereas the $d$ in the bottom is just half of the symbol "$dx$". So "$dx^2$" is interpreted as $(dx)(dx)$; just like $\sin^2$ does not mean $s$ times $i$ times $n$ squared, here $dx\,dx$ does not mean "$d$ times $x$ times $d$ times $x$"; it means dee ex; single thing; and $dx^2$ does not mean "$d$ times $x$ times $x$", it means object dee ex, squared. – Arturo Magidin Dec 07 '17 at 05:13
  • 2
    @ArturoMagidin: Two things: 1) if an explanation the length of your comment is needed to, let's say, "justify" the final equality, then that should be part of the answer; and 2) if the $d$ on the top is different to the $d$ on the bottom, then they shouldn't look the same, i.e., the real answer is that the notation is, let's say, "not optimal". – Will R Dec 07 '17 at 07:10
  • 3
    @WillR Not my answer; and the length of the comment is also related to the length of the misunderstanding in the question being asked, so... no. In fact, the notation is very useful and very flexible and indicative; for example, it immediately tells you the correct units to use for the $n$th derivative (units of $y$ divided by (units of $x$)${}^n$). So... no. The real answer is that it needs proper interpretation. Just like $\sin^2(x)$ is correctly interpreted as $(\sin(x))^2$, and not as a product of $s$, $i$, $n^2$, and $x$; or, for that matter, $\sin(n)$, where the two $n$s are different. – Arturo Magidin Dec 07 '17 at 16:19
  • 1
    What would you expect $\frac{d}{dx} * \frac{d}{dx}$ to be (recalling that $dx$ is a single thing, and not $d * x$)? You'd expect it to be $\frac{d^2}{(dx)^2}$, and you wouldn't bother putting in parentheses. Then you apply this operator to $y$: $\frac{d^2}{dx^2}(y)$, which can also be written $\frac{d^2y}{dx^2}$. – A_P Dec 26 '19 at 21:12
11

The $d$ is meant to represent the "change in". And the Leibniz notation is meant to remind you that you are computing the ratio between the change in $y$ and the change in $x$.

When you take the second derivative, you are computing how the derivative is changing as $x$ changes; that is, you are trying to compute $$\frac{d(y')}{dx}.$$ Now, $y'$ is itself a rate of change: it is the rate at which $y$ changes. So the "numerator" of the differential notation is telling you that you are trying to consider the change in the change in $y$, not the change in $y^2$ (which is what "$dy^2$" would represent).

So you are trying to describe the change in "the-change-in-$y$", relative to how $x$ is changing. $x$ is only changing "once", so you should have a single $d$ in the "denominator" (remember, not really a denominator). So why $x^2$? Because you are trying to figure out the change of blah as $x$ changes, and blah is a rate of change as $x$ changes as well. So you are taking $x$ twice, but considering only one change. Hence, single $d$, but $x$ squared.
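
One way to make "the change in the-change-in-$y$" concrete is with finite differences. This is an illustration assumed here, not something the answer itself spells out. With a step $\Delta x = h$:

$$\begin{aligned} \Delta y(x) &= y(x+h) - y(x), \\ \Delta(\Delta y)(x) &= \Delta y(x+h) - \Delta y(x) = y(x+2h) - 2y(x+h) + y(x). \end{aligned}$$

For $y = x^2$ this gives

$$\Delta^2 y = (x+2h)^2 - 2(x+h)^2 + x^2 = 2h^2, \qquad \frac{\Delta^2 y}{(\Delta x)^2} = \frac{2h^2}{h^2} = 2 = y''.$$

The $\Delta$ is applied twice on top, while the bottom is the single step $\Delta x$, squared. For sufficiently smooth $y$ the same ratio tends to $y''(x)$ as $h \to 0$.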

Arturo Magidin
  • 398,050
  • 2
    Doesn't your explanation for $d^2y$ undermine your explanation for $dx^2$? If $dy/dx$ refers to the ratio of change in y to change in x, and I accept $d^2y$ as meaning the change in the change in y, one would still expect by your rationale for $d^2y/dx^2$ to refer to the ratio of the change in the change in y to the change in $x^2$. Also no idea what blah is meant to represent. – Joseph Garvin Dec 07 '17 at 04:54
  • 2
    @JosephGarvin: No, it doesn't; "blah" in this case is whatever it is that $y'$ is measuring. If $y$ is position, and $x$ is time, with $y'$ you are trying to figure out the rate of change of position over time; if you are trying to figure out the second derivative, you are trying to figure out the rate of change of the rate of change, over time squared (which is why if position is measured in miles and time in hours, velocity is measured in miles per hour, but acceleration, the rate of change of velocity, is measured in miles per hour squared). blah in that example is velocity. – Arturo Magidin Dec 07 '17 at 05:07
  • Let's just collectively ask the person who first used this notation? – Aelx Jan 18 '24 at 13:00
4

There is no possible way of understanding why Leibniz invented the notation he did unless you think about calculus the way Leibniz did, using infinitesimal numbers.

Take the velocity $dx/dt$. Leibniz would have described it as the ratio of two infinitesimals. (Nonstandard analysis shows that this idea can be made rigorous, but in any case limits didn't exist in Leibniz's time.) The numerator is an infinitesimal number with units of meters. The denominator is an infinitesimal with units of seconds. You divide them, and it gives m/s.

In the acceleration, $d^2x/dt^2$, the numerator is written to suggest something with units of meters, and the denominator to suggest units of seconds squared, giving the correct units of m/s$^2$.
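
A small worked case of the unit bookkeeping (an assumed example: constant acceleration $a$, with $x$ in meters and $t$ in seconds):

$$\begin{aligned} x(t) &= \tfrac12 a t^2 && [\mathrm{m}], \\ \frac{dx}{dt} &= a t && [\mathrm{m}/\mathrm{s}], \\ \frac{d^2x}{dt^2} &= a && [\mathrm{m}/\mathrm{s}^2]. \end{aligned}$$

Each differentiation with respect to $t$ contributes one more factor of seconds to the denominator, which is what the $dt^2$ records.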

2

Glossing over a few issues for clarity,

If I wanted you to differentiate, say $3x^4$ twice, I could ask the question in a variety of ways such as,

1) Find the second derivative of $3x^4$

2) If $f(x)=3x^4$, find $f''(x)$

3) If $y=3x^4$ find $\frac{d^2y}{dx^2}$

The last of these is an accepted lazy corruption of the more technically correct,

Find $$ \left(\frac{d}{dx}\right)^2 y$$ or find $$ \left(\frac{d}{dx}\right)^2 (3x^4).$$ So, essentially, you have spotted that mathematicians are quite lazy when they can get away with it!
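
For completeness, here is the computation itself, reading the squared operator as "apply $\frac{d}{dx}$ twice" (a worked step that was left implicit):

$$\left(\frac{d}{dx}\right)^2 (3x^4) = \frac{d}{dx}\left(\frac{d}{dx}(3x^4)\right) = \frac{d}{dx}\left(12x^3\right) = 36x^2.$$

All three phrasings of the question ask for this same result.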

Although we write $\frac{d}{dx}$ this isn't a fraction in the sense that $\frac{2}{5}$ is. Perhaps best to park that thought for now, although maybe 'expanding the brackets' of $\big(\frac{d}{dx}\big)^2$ as $\frac{d^2}{dx^2}$ rather than $\frac{d^2}{(dx)^2}$ is a reminder of that.

All of this does make it harder for beginners to make sense of the notation. There are many, many more examples I could give of such 'lazy corruption'. I think of it as being like learning any foreign language, where there are always all sorts of quirks and customs that break a general rule.

Once you understand the 'lazy corruption' in the context of its surroundings, the meaning is, more often than not, actually, perfectly clear.

This answer was given when this question was asked again in March 2019, here: "Why $x^2$ in $\frac{d^2y}{dx^2}$?"

  • 1
    "Corruption" makes it sound a bit like we had the good convention and then dropped some parentheses, but Leibniz's original writing was this way and everyone copied him. I basically agree with everything you're saying, though. – Mark S. Mar 20 '19 at 11:53
  • Good point, for something to become corrupted it has to have been correct first... maybe 'lazy adoption' of something that's not quite right would have been better in my answer, although if Leibniz wrote it like that, it's almost by definition right ! Maybe it is only over time, as it's been merged with how the rest of mathematics has developed that it's become a quirky inconsistency. Probably a topic for the History of Maths MSE. – Martin Hansen Mar 20 '19 at 13:14
2

Leibniz notation is probably the most common notation for the derivative, but it is just a notation: there is no objective derivation for the notation itself, and alternatives exist such as Euler notation ($D_x^2 y$). In fact, if we wanted to derive the Leibniz notation for the second derivative in a more systematic way, we could use the quotient rule on the first derivative:

$$ \frac{\mathrm d\left( \frac{\mathrm d y}{\mathrm d x} \right)}{\mathrm d x} = \frac{ \frac{\mathrm d^2 y}{\mathrm d x} - \frac{\mathrm d y}{\mathrm d x} \frac{\mathrm d^2 x}{\mathrm d x} }{\mathrm d x} = \frac{\mathrm d^2 y}{\mathrm d x^2} - \frac{\mathrm d y}{\mathrm d x}\frac{\mathrm d^2 x}{\mathrm d x^2} $$

Now we have a notation that we can algebraically manipulate to derive identities to, e.g., swap dependent and independent variable:

$$ - D_x^2 y \left( \frac 1 {D_xy} \right)^3 = D_y^2x $$
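
As a sanity check of this identity on one function (an example chosen here for illustration, restricted to $x > 0$ so the inverse exists), take $y = x^2$, so that $x = y^{1/2}$:

$$\begin{aligned} D_x y &= 2x, & D_x^2 y &= 2, \\ D_y x &= \tfrac12 y^{-1/2}, & D_y^2 x &= -\tfrac14 y^{-3/2} = -\frac{1}{4x^3}, \end{aligned}$$

$$- D_x^2 y \left( \frac{1}{D_x y} \right)^3 = -2\cdot\frac{1}{(2x)^3} = -\frac{1}{4x^3} = D_y^2 x.$$

Both sides agree, which is consistent with the identity but of course does not prove it in general.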

Using a system where differentials are algebraically manipulable highlights the importance of the distinction between $\mathrm d^2 x = \mathrm d(\mathrm d(x))$ and $\mathrm d x^2 = (\mathrm d(x))^2$. To see this hands-on, attempt to prove (4) via algebraic manipulation.

It's much harder to understand intuitively what $\mathrm d^2 t/\mathrm d t^2$ means in this new system but, crucially, it's not the derivative of $\mathrm d t / \mathrm d t$.

Zaz
  • 1,466