
I want to know if the Leibniz differential notation actually leads to contradictions - I am starting to think it does not.

And just to eliminate the most commonly showcased 'difficulty':

For the level curve $f(x,y)=0$ in the plane we have $$\frac{dy}{dx}=-\frac{\dfrac{\partial f}{\partial x}}{\dfrac{\partial f}{\partial y}}$$ If we were to "cancel" the differentials we would incorrectly derive $\frac{dy}{dx}=-\frac{dy}{dx}$. Why does this not work? Simple: The "$\partial f$" in the numerator is a response to the change in $x$, whereas the "$\partial f$" in the denominator is a response to the change in $y$. They are different numbers, and so cannot be cancelled.
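As a concrete sanity check of the displayed formula (an editorial numerical sketch with a function of my own choosing, not part of the original argument), one can compare both sides for the level curve $f(x,y)=x^2+y^2-1=0$ using finite differences:

```python
import math

# Level curve f(x, y) = x^2 + y^2 - 1 = 0: the unit circle.
def f(x, y):
    return x*x + y*y - 1.0

h = 1e-6
x0 = 0.6
y0 = math.sqrt(1.0 - x0*x0)  # the point (0.6, 0.8) lies on the curve

# Partial derivatives of f by central differences
fx = (f(x0 + h, y0) - f(x0 - h, y0)) / (2*h)
fy = (f(x0, y0 + h) - f(x0, y0 - h)) / (2*h)

# dy/dx along the curve, using the explicit branch y(x) = sqrt(1 - x^2)
y = lambda x: math.sqrt(1.0 - x*x)
dydx = (y(x0 + h) - y(x0 - h)) / (2*h)

print(dydx, -fx/fy)  # both are approximately -0.75
```

Both quotients agree with the exact slope $-x_0/y_0=-0.75$, whereas the naive cancellation $\frac{dy}{dx}=-\frac{dy}{dx}$ would force the slope to be $0$.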

Related: consult the answer to this previous question.


The other part has been moved to a new post here.

  • You claim to "eliminate the most commonly showcased difficulty" but I don't see how you're eliminating it. In fact, you seem to be answering your own question. – anon Apr 22 '14 at 21:19
  • One classic difficulty that should be noted is that if you claim that $dx$ is a non-zero "number", then your "numbers" must be non-Archimedean. – Ben Grossmann Apr 22 '14 at 21:22
  • You can't do this for the same reason that $\frac{53}{} \ne 53$. – Disintegrating By Parts Apr 23 '14 at 00:00
  • @hardmath: I don't think I'm shortchanging them in this case. They're reasoning by cancelling one piece of a notation with another piece of another notation without any other evidence of truth, and reaching a contradiction. – Disintegrating By Parts Apr 23 '14 at 01:55
  • @T.A.E. wrote above that "you can't do this for the same reason as, etc." Could you elaborate on what the this in your comment refers to? – Mikhail Katz Apr 23 '14 at 09:25
  • @user72694 I was referring to cancelling 'differentials'. The notation for a derivative is not a fraction; it's notation, and cancelling as though it were something more is no more valid than what I wrote, where I cancelled the operation of multiplication. I can understand asking a question about whether or not something suggested by notation might work, but it should be backed up with some reason other than that it is suggested by a false interpretation of notation. If NotNotLogical will interpret the meaning of the pieces, then I think they'll see why their question doesn't make good sense. – Disintegrating By Parts Apr 23 '14 at 16:15
  • @T.A.E., Leibniz thought that the notation $\frac{dy}{dx}$ is indeed a fraction. I am inclined to take Leibniz's viewpoint seriously, though I am open also to other viewpoints. I take it you have been taught that it is not a fraction? I expect your teacher is not familiar with infinitesimal-enriched continua. I can suggest some reading matter if you are interested in pursuing this beyond what you have been taught. – Mikhail Katz Apr 23 '14 at 16:18
  • @user72694 Leibniz didn't lose track of what the symbols meant in the context of the discussion. Nor did Newton. And you shouldn't encourage abandoning proper meaning behind symbols in favor of symbolic manipulation--that doesn't help anyone. – Disintegrating By Parts Apr 23 '14 at 16:20
  • @T.A.E., if I read your comment correctly, you seem to feel that Leibniz did not view $\frac{dy}{dx}$ as a fraction. I am willing to consider such a hypothesis. Do you have a source for this or is it your own opinion? – Mikhail Katz Apr 23 '14 at 16:23
  • @user72694 Don't put words in my mouth. I have seriously looked into the History, and these folks were not mindless symbolic manipulators. – Disintegrating By Parts Apr 23 '14 at 16:26
  • @T.A.E., I am sorry if I misinterpreted your position, but I would like to understand it nonetheless. I agree with you that Leibniz was not a mindless symbolic manipulator. But how do you feel Leibniz viewed the expression $\frac{dy}{dx}$? – Mikhail Katz Apr 23 '14 at 16:30
  • @user72694 - If you're interested in understanding concepts as opposed to meaningless symbolic manipulation, then consider starting with The History of Calculus and its Conceptual Development by Carl B. Boyer. – Disintegrating By Parts Apr 23 '14 at 16:33
  • @T.A.E., I have published articles in leading history and philosophy journals where I point out errors in Boyer's interpretation. Boyer is not considered to be cutting-edge scholarship by today's historians. But you are deflecting my question. I still don't know whether you feel Leibniz viewed $\frac{dy}{dx}$ as a ratio, and I am even more puzzled by your apparent unwillingness to answer such a straightforward question. – Mikhail Katz Apr 23 '14 at 16:37
  • @user72694 And now I challenge you to find a possible meaning for the symbolic manipulation leading to the incorrect formula. Try to explain what meaning those operations would have, and you'll see why I object. The great minds behind this subject had thoughts behind what they were trying to do, and that should be encouraged instead of mere symbolic manipulation. – Disintegrating By Parts Apr 23 '14 at 16:51
  • @T.A.E., does that mean that Leibniz thought of $\frac{dy}{dx}$ as a ratio or that he did not think of $\frac{dy}{dx}$ as a ratio? Or perhaps he did think of it as a ratio but revealing this to youthful minds might corrupt them (and lead them astray to "mere symbolic manipulation") and therefore it is imperative not to reveal it? – Mikhail Katz Apr 23 '14 at 17:08
  • @user72694 I don't object to thinking of infinitesimals. I object to symbolic manipulation without meaning. And I object to people trying to defend/encourage mindless manipulation of symbols. hardmath is using the symbols in a way that has meaning; there is serious thought behind what he is trying to do. Try to interpret $(\partial f/\partial x)/(\partial f/\partial y)$ as $dy/dx$ without thought behind what you're doing and you'll end up with wrong answers. Leibniz didn't work that way, and any scholar or Historian who makes claims about Leibniz should know that. – Disintegrating By Parts Apr 23 '14 at 17:22
  • @T.A.E. Please note that the calculation $(\partial f/\partial x)/(\partial f/\partial y)=dy/dx$ is precisely what I object to in my post. Also bear in mind that Leibniz designed his notation intentionally - it is no accident that it often "works" to treat differentials as quantities. – Apr 23 '14 at 17:27
  • @T.A.E., I have news for you. Not only did Leibniz think of $\frac{dy}{dx}$ as a ratio of infinitesimals, but Leibniz's viewpoint has a consistent interpretation that can be used even at the level of infinitesimal calculus; see Keisler's book. You seem to pay lip service to accepting infinitesimals, but your allegations of "mindless symbolic manipulation" appear to be a strawman attack against them. What is the basis for this? – Mikhail Katz Apr 23 '14 at 17:28
  • @NotNotLogical, of course your point is perfectly valid, and in no way contradicts the possibility of viewing $\frac{dy}{dx}$ as a ratio. – Mikhail Katz Apr 23 '14 at 17:31
  • @user72694 you don't have news for me. I've written about topics related to this, including justifying Newton's fluent/fluxion approach. And I have taught this approach with approval of the Department as well. You keep putting words into my mouth, and you're wrong. http://math.stackexchange.com/questions/618509/differentials-definition/628904#628904 . I said it before, I object to blind symbolic manipulation without meaning. – Disintegrating By Parts Apr 23 '14 at 18:33
  • @T.A.E., and I am in favor of motherhood and apple pie. On the other hand, I object to the implied allegation that viewing $\frac{dy}{dx}$ as a ratio, as Leibniz did, leads to blindness or loss of meaning; on the contrary. In my teaching experience, students vote with their feet when given a chance to take a calculus class using infinitesimals rather than epsilontics. – Mikhail Katz Apr 23 '14 at 18:39
  • @user72694 Again, putting words in my mouth. I don't object to thinking in this way, but blind manipulation of symbols isn't thinking. It's not different than writing (53)/ = 53. – Disintegrating By Parts Apr 23 '14 at 18:42
  • @T.A.E., could you please clarify why you feel that viewing $\frac{dy}{dx}$ as a ratio amounts to blind manipulation and absence of thinking? Just repeating this ad nauseam does not convince. If this is not what you mean, what do you mean exactly? I don't like blind manipulation either, but note that what most calculus students get out of epsilon-delta is usually mere blind manipulation of symbols without thinking. – Mikhail Katz Apr 23 '14 at 18:46
  • @user72694 Again, for contrast, please look at hardmath's post below. hardmath has done a beautiful job of explaining what's going on in this problem ... using infinitesimals. Why do I think this? Because hardmath is thinking about what is going on, and is using the symbols as words to express those thoughts. Thought doesn't lead hardmath anywhere close to the error contained in this post. The error in this post can only be found in the absence of thought about what is going on; it can only be found in symbolic manipulation that isn't driven by understanding the problem at hand. – Disintegrating By Parts Apr 23 '14 at 18:54
  • @T.A.E., I think that claiming that $\frac{dy}{dx}$ is a ratio is "not different than writing $(53)/=53$" as you did above is a rather extreme position that will be supported by few editors here. I must say I am puzzled by your adopting such a position given that you appear to have done some serious historical work on Newton. Note that one of Newton's approaches to calculus was precisely via infinitesimals. We can test the views of the community by posting this as a separate question. How many people do you think will share your view that these are "no different"? – Mikhail Katz Apr 23 '14 at 18:54
  • @user72694 As $\frac{dy}{dx}$ is used and defined in Calculus, it most certainly is not a ratio. That's a fact. – Disintegrating By Parts Apr 23 '14 at 18:58
  • @T.A.E., I have no objection to hardmath's answer and in fact I upvoted it. I note that the only difference between hardmath's answer and mine is that he is careful to avoid using fractions and always works with differentials. This was one of Leibniz's approaches as well, though at other times he did work with ratios of differentials; see this recent article. On the other hand, if one is in a field, I see no conceivable reason why one shouldn't divide by $dx$ and work with ratios. There does exist a modern formalisation of... – Mikhail Katz Apr 23 '14 at 18:58
  • ... infinitesimals where the number system is not a field and one has nilpotent infinitesimals. Here one can literally write $f(x+dx)=f(x)+A\,dx$ on the nose. This is called "smooth infinitesimal analysis". Perhaps you prefer this system? – Mikhail Katz Apr 23 '14 at 19:00
  • @T.A.E., you write above that "$\frac{dy}{dx}$ as used and defined in the calculus is most certainly not a ratio", and claimed that "that's a fact". Would you be willing to consult a calculus textbook that went through several editions and is still in print, where this expression is indeed a ratio? – Mikhail Katz Apr 23 '14 at 19:02
  • Please keep the comments polite, people. – Alex Becker Apr 23 '14 at 19:55
  • @AlexBecker Do you, by any chance, have an opinion on this post? :) –  Apr 24 '14 at 00:27
  • The consistency of differentials being treated as individual objects is further suggested by the chain rule for geometric derivatives: $$\sqrt[dx]{dy} = \sqrt[du]{dy}^{du \over dx}$$ The real question though isn't whether they are consistent, but rather which rules are the consistent ones and which aren't. Btw, I wonder why we use the notation ${d^2 y \over dx^2}$ rather than ${d^2 y \over d^2 x}$? – DanielV Apr 25 '14 at 09:41
  • Also, the consistency of partial derivatives should be considered separately from the consistency of derivatives. http://math.stackexchange.com/questions/760195/resolving-a-contradiction-in-the-proof-of-expected-value-of-binomial-distributio/760766#760766 is an example of a user being tricked by partial derivatives, and my brief explanation of their fundamental problem. Also, if you haven't already you might want to check out the paper "Is Mathematical History written by the Victors?" I haven't finished it yet, but it dwells on NotNotLogical's exact question. – DanielV Apr 25 '14 at 10:09
  • @DanielV, Leibniz wrote $dx^2$ in the denominator rather than $d^2x$ for a reason. The reason is that this is the square of $dx$. Meanwhile, $d^2 y$ is not the square of $dy$ but rather a second difference, something like $f(x+dx)+f(x-dx)-2f(x)$. – Mikhail Katz Apr 29 '14 at 07:46
  • A notation can't be inconsistent... – vonbrand May 02 '14 at 12:37
  • Notnotlogical, this question is getting a little crowded and it is not clear what has been answered already and what has not yet been answered. If you have a specific question that has not been answered yet, I would suggest posting it as a separate question. There does not seem to be any shortage of room here yet :-) – Mikhail Katz May 05 '14 at 12:50
  • @user72694 I know, but with the bounty on this question I didn't want to move to a new thread... –  May 05 '14 at 16:57
  • @NotNotLogical: I have no objections about moving your (very relevant) additional question to a new thread! Maybe I overreacted a bit after you awarded the first bounty to me, so there is no need to hold yourself back on account of my bounty. My bounty was mainly designated for the growing discussion on clarification of the purpose of the original thread. The interest in the technicalities seems to have decreased in the current thread, so your new question would be better served by a new thread! – String May 06 '14 at 07:43
  • @String Thanks for the suggestions, I have moved the question to a thread of its own. –  May 06 '14 at 14:52
  • @NotNotLogical, in a discussion of Zeno paradoxes here: http://mathoverflow.net/questions/27075/what-is-the-oldest-open-problem-in-mathematics/165550#165550 I got quite a tongue-lashing. Wonder what you think of this issue. – Mikhail Katz May 11 '14 at 06:59
  • @user72694 Sorry for the long wait - after thinking about it for a while, I basically agree with you. I'm not sure I would call it "unsolved" though, perhaps "unresolved". I think the danger with assuming fundamental philosophical issues like this to have been completely "dealt with" by current maths (convergent series in this case), is that it distracts from the more subtle problems at hand that will probably persist as long as mathematicians are around. As to the tongue-lashing, it seems that academics are very particular about their details :) PS unrelated to post, so email me if you want. –  May 19 '14 at 23:36

10 Answers


The discussion here has been quite interesting! I wrote about Leibniz's notation in my Bachelor's thesis in 2010, reading through major parts of Bos's 1974 PhD thesis on higher-order differentials in the Leibnizian calculus. I believe Bos is wrong on one point: assuming that one variable is in arithmetic progression (Bos's term) is never necessary - only convenient! I will address that below.

Leibniz's differentials

Leibniz developed his differentials, at first, from a geometrical intuition - although he reconsidered the actuality of this idea time and again. In my words, this idea can be very briefly summarized as:

A curve can be thought of as a polygon with infinitely many infinitely small sides $ds$. Each $ds$ is an infinitesimally small straight line segment which is part of the curve and (paradoxically) tangent to it at the same time. Gathering the $ds$ into one straight line segment $s=\int ds$ constitutes the length of the curve. Expressing such a curve by a geometrical relation between coordinate line segments $x$ and $y$, one could consider each $ds$ as the hypotenuse of a right triangle with legs $dx$ and $dy$, so that $dx^2+dy^2=ds^2$.

This is only to say that $dx,dy$ and $ds$ were thought of as geometrical and mutually dependent entities - never considered just numbers, as we allow variables to be today.

Just to stress how geometrical: the function nowadays expressed by the formula $f(x)=x^2$ would be something like $a\cdot y=x\cdot x$, where $a,y$ and $x$ were all considered line segments, so that in Leibniz's time both sides of the equation would constitute an area.

The level curve example

In the fractions $\frac{\partial f}{\partial x}$ and $\frac{\partial f}{\partial y}$, the $\partial f$'s in the two fractions are unrelated because:

  • We do not have $\partial f,\partial x$ and $\partial y$ as mutually dependent geometrical entities, for the reason you already gave: the first $\partial f$ is the change in $f$ when you move in the $x$-direction by the vector $(dx,0)$, whereas the second $\partial f$ corresponds to moving by the vector $(0,dy)$. So they are unequal, although both infinitesimally small ...
  • Even if we had some $df$ mutually dependent on $dx$ and $dy$, it would naturally have to be the change in $f$ when you travel along the vector $(dx,dy)$, and thus different from both $\partial f$'s described before.

The chain rule example

Since we consider higher-order differentials, the work of Bos is relevant here: had there been such a thing as a derivative $z=\frac{dy}{dv}$ in Leibniz's time, its differential should read $$ dz=d\frac{dy}{dv}=\frac{dy+ddy}{dv+ddv}-\frac{dy}{dv}=\frac{dv\ ddy-dy\ ddv}{dv(dv+ddv)} $$ Now, since $ddv$ is infinitesimally small compared to $dv$, we may drop the $ddv$ in the bracket and simply write $dv$ instead of $(dv+ddv)$. Therefore we have $$ \frac{dz}{dv}=\frac{dv\ ddy-dy\ ddv}{dv^3}=\frac{ddy}{dv^2}-\frac{dy\ ddv}{dv^3} $$ Note that $ddy$ can also be written as $d^2 y$. So the second-order derivative of $y$ with respect to $v$ equals $\frac{d^2 y}{dv^2}$ minus the extra fraction $\frac{dy\ d^2 v}{dv^3}$, which can only be disregarded if it is zero. This happens only if either $dy=0$ or $d^2 v=0$. Choosing $d^2 v$ identically zero does the trick and renders $dv$ constant.

Suppose now that $d^2 v\equiv 0$. Then for the example $y=u=v^2$ we see that $du=2v\ dv$ and furthermore $ddu=2v\ ddv+2\ dv^2=2\ dv^2$, where the last equality is due to our choice that $ddv$ is identically zero. Therefore we see that the derivative of $w=\frac{dy}{du}$ will be given as $$ \frac{dw}{du}=\frac{d^2 y}{du^2}-\frac{dy\ ddu}{du^3} $$ where the last fraction is far from being zero, as it may be rewritten - noting that $y=u\implies dy=du$ and that $\frac{dv}{du}=\frac{1}{2v}$ - to obtain $$ \require{cancel} \frac{\cancel{dy}\ ddu}{\cancel{du}\cdot du^2}=\frac{2\ dv^2}{du^2}=\frac{1}{2v^2} $$ This shows that assuming $\frac{d^2 y}{dv^2}$ to be the second-order derivative of $y=v^2$ with respect to $v$ in the modern sense makes $\frac{d^2 y}{du^2}$ differ by $\frac{1}{2v^2}$ from being the second-order derivative of $y=u$ with respect to $u$. Now since we know that $y=u$, we have $w=\frac{dy}{du}=1$ and thus $\frac{dw}{du}=0$. Therefore we must have $$ \frac{d^2 y}{du^2}-\frac{1}{2v^2}=0 $$ in this case, showing that $\frac{d^2 y}{du^2}=\frac{1}{2v^2}$. So with the choice $y=u=v^2$ and $ddv\equiv 0$, the equation $$ \frac{d^2 y}{du^2}\cdot\left(\frac{du}{dv}\right)^2=\frac{d^2 y}{dv^2} $$ may be successfully checked by applying $\frac{du}{dv}=2v$, since we then have $$ \frac{1}{2v^2}\cdot(2v)^2=2 $$ which is actually true. This is NOT a coincidence!
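The computation above can be imitated numerically (a small editorial sketch of mine, not part of the original answer): take $v$ in arithmetic progression with a constant step $h$ playing the role of $dv$ (so $ddv=0$), build first and second differences of $u=y=v^2$, and check the values claimed for $\frac{d^2y}{du^2}$, the "weird fraction", and the chain-rule identity:

```python
# Finite-difference model of the argument: v in arithmetic progression
# (ddv = 0), with y = u = v^2.
h = 1e-4          # the constant differential dv
v = 1.5           # evaluation point

u = lambda t: t*t
du  = u(v + h) - u(v)                  # first difference of u
ddu = u(v + 2*h) - 2*u(v + h) + u(v)   # second difference of u
dy, ddy = du, ddu                      # y = u, so dy = du and ddy = ddu

term1 = ddy / du**2          # plays the role of d^2 y / du^2
term2 = dy * ddu / du**3     # the "weird fraction" dy ddu / du^3

print(term1, term2)          # both are approximately 1/(2 v^2) = 0.2222...
print(term1 - term2)         # dw/du is 0, as it must be since w = dy/du = 1

# The chain-rule identity (d^2 y/du^2) * (du/dv)^2 = d^2 y/dv^2:
lhs = term1 * (du / h)**2
rhs = ddy / h**2
print(lhs, rhs)              # both are approximately 2
```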

Conclusion

The above calculations show that Julian Rosen's very appealing example of failure in the method of the Leibnizian calculus seems to rest on a misunderstanding about what is meant by the notion of $d^2 y$ and the hidden, but important, additional variables $ddv$ and $ddu$. This provides specific details regarding the comments given by user72694 below the answer from Julian.

However, proving that Leibniz's notation will never produce false conclusions when handled correctly is a whole different story. This is supposedly what Robinson managed to do, but I must admit that I have not read and understood that theory myself.

My Bachelor's thesis focused mainly on understanding how the method was applied by Leibniz and his contemporaries. I have often times thought about the foundations, but mainly from a 17th century perspective.

Comment on Bos's work

On page 31 of his thesis, Bos argues that the limit $$ \lim_{h_1,h_2\rightarrow 0}\frac{[f(x+h_1+h_2)-f(x+h_1)]-[f(x+h_1)-f(x)]}{h_1 h_2} $$ only exists if $h_1=h_2$, which then makes this limit equal $f''(x)$. But that is in fact not entirely true. The $x$-differences $h_1$ and $h_2$ need not be equal; it suffices for them to converge to being equal, which is a subtle, but important, variation of the setup. We must demand that $h_1$ and $h_2$ converge to zero in a mutually dependent fashion, so that $$ \lim_{h_1,h_2\rightarrow 0}\frac{h_2}{h_1}=1 $$ With this setup the limit of the large fraction from before may still exist, but it need not equal $f''(x)$. Since $h_1,h_2$ play the role of $dx$'s, this is equivalent to allowing $dx_1\neq dx_2$, so that $ddx=dx_2-dx_1\neq 0$ although it is infinitely smaller than the $dx$'s.
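To make this concrete, here is an editorial numerical sketch (my own choice of coupling, not taken from Bos or the thesis): take $h_1=h$ and $h_2=h+h^2$, so that $h_2/h_1\to 1$ while $ddx=h_2-h_1=h^2\neq 0$. A short Taylor expansion shows the quotient then tends to $f''(x)+f'(x)$ rather than $f''(x)$, and this is easy to observe numerically:

```python
import math

# Bos's limit with unequal steps: h1 = h, h2 = h + h^2, so h2/h1 -> 1
# while ddx = h2 - h1 = h^2 stays nonzero.
f = math.exp   # f' = f'' = exp, so f'(0) + f''(0) = 2 and f''(0) = 1
x = 0.0
h = 1e-4
h1, h2 = h, h + h*h

big = (f(x + h1 + h2) - 2*f(x + h1) + f(x)) / (h1 * h2)
print(big)     # approximately f'(x) + f''(x) = 2, NOT f''(x) = 1

# With equal steps the same quotient recovers f''(x) = 1:
equal = (f(x + 2*h) - 2*f(x + h) + f(x)) / (h * h)
print(equal)   # approximately 1
```

So the limit can exist under the weaker requirement $h_2/h_1\to 1$, but its value depends on the chosen progression.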

This means that it is in fact possible to imitate the historical notion of $dx$ being non-constant (and thereby $x$ being in non-arithmetic progression) directly by modern limits.

Extras regarding the OP's answer

You are quite right that the differentials can be successfully manipulated into the equation $$ \frac{d^2}{dv^2}\big(y(u(v))\big)=y''(u(v))\cdot u'(v)^2+y'(u(v))\cdot u''(v) $$ under the assumption that $ddv\equiv 0$.
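That equation is the familiar second-order chain rule, and it is easy to spot-check numerically; here is a quick editorial sketch with functions of my own choosing, $y(u)=\sin u$ and $u(v)=v^3$:

```python
import math

# Check d^2/dv^2 [ y(u(v)) ] = y''(u(v)) u'(v)^2 + y'(u(v)) u''(v)
y,  yp,  ypp = math.sin, math.cos, (lambda w: -math.sin(w))
u,  up,  upp = (lambda t: t**3), (lambda t: 3*t**2), (lambda t: 6*t)

v0, h = 0.9, 1e-4
comp = lambda t: y(u(t))

# Left side: central second difference of the composite function
lhs = (comp(v0 + h) - 2*comp(v0) + comp(v0 - h)) / h**2
# Right side: the two-term formula
rhs = ypp(u(v0)) * up(v0)**2 + yp(u(v0)) * upp(v0)

print(lhs, rhs)  # the two values agree to several decimal places
```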

There is, however, a more obvious and even less restrictive choice: leave the progressions of all three variables $u,v$ and $y$ unspecified, and yet connect the notation in a meaningful way to modern standards:

Introduce a fourth variable $t$ in arithmetic progression (i.e. $ddt\equiv 0$). One could think of it as a time variable, so that $u(t),v(t)$ and $y(t)$ are coordinate functions of some vector-valued function. Then Julian Rosen's equation can be directly transformed to $$ \frac{\left(\frac{d^2 y}{dt^2}\right)}{\left(\frac{du^2}{dt^2}\right)}\cdot\left(\frac{\left(\frac{du}{dt}\right)}{\left(\frac{dv}{dt}\right)}\right)^2=\frac{\left(\frac{d^2 y}{dt^2}\right)}{\left(\frac{dv^2}{dt^2}\right)} $$ and since $t$ is in arithmetic progression, $y''(t)=\frac{d^2 y}{dt^2}$, so that this may be written in modern notation as $$ \frac{y''(t)}{u'(t)^2}\cdot\left(\frac{u'(t)}{v'(t)}\right)^2=\frac{y''(t)}{v'(t)^2} $$ which is easily verified to be correct. This is probably the simplest account, but it only uses, and does not clearly exemplify, the necessity of choosing the progression of the variables. I think my first account did that better.

String
  • +1 this is amazing. I'm going to need some more time to sort through this, cheers! Also would you be willing to email me a copy of your abovementioned thesis? –  May 01 '14 at 01:35
  • And what did you mean by that last sentence? What does "$dx$ being in non-constant progression" refer to? –  May 01 '14 at 01:39
  • @NotNotLogical: I would really like to share my thesis, but it is written in Danish - my native tongue. Maybe I should translate certain extracts from it some day. – String May 01 '14 at 23:35
  • Regarding the citation "$dx$ being in non-constant progression" that is actually a bit unclear. It should read "$x$ being in non-arithmetic progression" or "$dx$ being non-constant" corresponding to $ddx\neq 0$. These terms may be found in Bos's work. The concept of "the progression of the variables" was (to my knowledge) coined by Bos himself and corresponds to the fact that at the time of Leibniz there was no independent variable. All variables were interdependent, depending on the shape of the curve and the chosen "progression". – String May 01 '14 at 23:41
  • I could try using Google Translate :) If the thesis is mostly prose, then it probably wouldn't be worth sending. On the other hand, if it's predominantly equations, there's a chance I could get through it with a free translation. If you would like to send it to me, please let me know, and I would love (attempting) to read it! As to the "non-constant progression", I think I have a better understanding of that now. Writing things in my own words (the 'update' post) was quite helpful. You have definitely stimulated my interest to study the historical calculus in more detail. Cheers! – May 01 '14 at 23:53
  • Here it is. The pages 39-53 (the printed page numbers not the page numbers the PDF viewer counts) describe the method both from a geometrical and calculational perspective. If you are interested, I could write an extra answer in here pinpointing the central aspects of the method of higher order derivatives by referring to formulas and figures in the PDF. Link: https://drive.google.com/file/d/0B4GXFmv1fpLza0huTUQ4cDNlLVk/edit?usp=sharing – String May 02 '14 at 00:46
  • @NotNotLogical: Having spent so much time understanding the subtleties of this historical topic, reading through manuscripts and historically theoretical texts, I have found your question one of the most interesting to respond to so far here on SE! – String May 02 '14 at 01:33
  • Thank you! It will be a real pleasure looking through your thesis! And thank you for the kind compliment about this post, I am glad that others have enjoyed this discussion as much as I have. Certainly feel free to post another answer if you like, that would be great. I am still digesting what several people have said, hopefully this discussion will continue :) –  May 02 '14 at 02:44
  • @NotNotLogical: I have no problem sharing my work. :) It still belongs to me nonetheless! I wish it were in English though. – String May 02 '14 at 08:59
  • Ok thank you very much! –  May 02 '14 at 20:32
  • @String would you mind taking a look at at this question by another user? I think you might be able to answer it: http://math.stackexchange.com/questions/724339/define-second-derivative-f-without-using-first-derivative-f – DanielV May 03 '14 at 05:36
  • @DanielV: Thank you for notifying me! I will certainly have a further look. The answer by Mauricio G Tec is very nice, though, but the main question is only very loosely adressed I agree :) – String May 03 '14 at 21:20
  • @user72694: I was surprised to read those comments, indeed! I have experienced too often how online discussions sometimes tend to spiral out of control. Another such example can be found in my thread about notation where my "thoughtless eagerness" caught negative attention from another user. Once that had happened it was simply irreversible. The point of no return emerges almost instantly in such occasions. Strange! – String May 11 '14 at 19:29
  • Anything having to do with pedagogy as well as philosophy is often viewed with suspicion, but you might be interested in consulting the SE matheducators forum. – Mikhail Katz May 12 '14 at 07:26
  • @user72694: Thank you for telling me about SE matheducators. Did you see this thread about teaching infinitesimals in freshman calc there? Quite interesting! – String May 12 '14 at 07:52
  • Did you mean to say "We do not have $∂f$, $∂x$ and $∂y$ mutually independent geometrical entities"? – The Quark Jul 18 '23 at 15:07
  • For the development of $d\frac{dy}{dv}$, the reasoning would be more directly understandable if the following step was added: $d\frac{dy}{dv}=df(dy,dv)$ with $f(u,v)=\frac{u}{v}$, so $d\frac{dy}{dv}=f(dy+ddy,dv+ddv)=...$. Somehow I had a hard time to see it right away. – The Quark Jul 18 '23 at 15:11
  • Have you encountered any use of keeping $dx_1$ and $dx_2$ independent in second derivatives or, perhaps more interestingly, in second-order differentials? I am wondering what could be missed or disregarded when setting $dx_1=dx_2=dx$... – The Quark Jul 18 '23 at 17:00
  • @TheQuark I think I meant to say what it says, but I will have to re-read the context a bit more. Your second point about notation confuses me a bit! I am working completely in Leibniz' notation, and it is not clear to me what you mean by your notation. Maybe connecting it to modern notation could help in some way, but $f$ was never part of that setup in my rendition. Your third point is interesting, and some books actually kept different progressions for some time, which can allow for vertical curves also, but it changes formulas without much benefit, they say. Use vector functions instead. – String Jul 22 '23 at 09:17
  • Sorry, I meant to write $d\frac{dy}{dv}=f(dy+ddy,dv+ddv)-f(dy,dv)=...$. My remark stemmed from the fact that $\frac{dy}{dv}$ as a derivative is normally a function of $v$, but writing $d\frac{dy}{dv}=\frac{dy+ddy}{dv+ddv}-\frac{dy}{dv}$ implies to consider $\frac{dy}{dv}$ as a function of $dy$ and $dv$, more precisely as a ratio between $dy$ and $dv$, correct? Also, thank you for your other replies. Would you have any title(s) and author(s) for the books that keep different progressions? I am somehow intrigued by this approach. – The Quark Jul 23 '23 at 09:25
  • @TheQuark We are mixing Leibniz with later notations here! In a way, you are right, but we are mixing and confusing differentials and functions. The modern notion of a function $f:X\to Y$ of one (or more) variables was not present in Leibniz' time. The same goes for derivatives. So $dz=d\frac{dy}{dv}$ is an oxymoron in terms of those concepts; it is a modified/obscure Leibnizian differential of the second order. – String Jul 23 '23 at 19:38
  • @TheQuark It is obscured by the fact that we do not have the concept of a function, and $dy$ and $dv$ are differentials. But I am playing along and assuming that a Leibnizian somehow stumbled upon or invented $z=\frac{dy}{dv}$ which in modern notation is $z=f'(v)$ such that $(z,v)$ is the curve of slopes. First when it comes to the second order derivative, the notion of progression becomes visible - there is a hidden variable, a modern function-based person would say, namely $(y,v)$ both depend on a hidden variable $t$, a sort of parametrisation. – String Jul 23 '23 at 20:00

The gist of the OP's explanation of why the "cancellation" of $\partial f$'s should not be allowed (and does not work) is correct, but something more can be said.

The partial derivative $\partial f/\partial x$ is the rate at which $f$ changes with respect to change in $x$ while holding $y$ constant. Similarly, the definition of $\partial f/\partial y$ entails a rate of change while holding $x$ constant.

Manipulation of the $dx$ and $dy$ symbols separately (rather than as an ordinary derivative $dy/dx$) produces sensible results:

$$ \frac{\partial f}{\partial y} \;dy + \frac{\partial f}{\partial x} \;dx = 0 $$

which accords with the underlying premise that $x,y$ are restricted to a level curve:

$$ f(x,y) = \text{constant} $$

This sensible computation, despite appearing to be a superficial manipulation of symbols, is taught in freshman calculus as implicit differentiation, so it bears considering why this should be allowed while "cancelling" $\partial f$'s should not. There is the hidden premise that $x$ is being kept constant when taking the limit that defines $\partial f/\partial y$, and similarly that $y$ is held constant when taking $\partial f/\partial x$. Combining a change in $x$ with one in $y$ is then properly done by implicit differentiation, subjecting their mutual changes to the constraint that $f$ is kept "level".

Added: A good notation is useful at least as much for what it hides/suppresses from its definition as for what it suggestively expresses. If a notation is soundly defined, any contradiction that arises from proper use has to be blamed on the underlying theory, rather than the notation itself.

Of course, in hiding some parts of the definition, a notation lends itself to "abuse". As we see above, treating derivatives literally as "fractions" is suggested by the notation, and is sometimes "allowable", sometimes not.

A related pitfall has to do with the commutativity of first partial derivatives. We all "know" that, under mild smoothness assumptions:

$$ \partial (\partial f /\partial x)/\partial y = \partial (\partial f /\partial y)/\partial x $$

However this depends on the pair $x,y$ being independent variables (holding one fixed while varying the other). I once tried to commute first partials while mixing Cartesian and polar coordinates in teaching a class, and promptly got a contradiction!

Consider for example the polynomial $f = x^2 + y^2 = r^2$ in both Cartesian and polar coordinates. Now $\partial (\partial f/\partial \theta)/\partial x$ is identically zero, but $\partial (\partial f/\partial x)/\partial \theta$ is not!
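This failure to commute can be checked numerically. The sketch below is my own illustration (the test point $(r,\theta)=(2,0.7)$ is an arbitrary assumption); the inner derivatives are taken exactly as described: $\partial f/\partial\theta$ at fixed $r$, and $\partial f/\partial x$ at fixed $y$, re-expressed in polar coordinates before differentiating in $\theta$:

```python
import math

# Numeric check that mixing Cartesian and polar coordinates breaks
# commutation of partials, for f = x^2 + y^2 = r^2.
f_cart = lambda x, y: x**2 + y**2     # f in Cartesian coordinates
f_polar = lambda r, th: r**2          # the same f in polar coordinates

r0, th0 = 2.0, 0.7                    # arbitrary test point
h = 1e-5

# One order: df/dtheta at fixed r is identically 0 (f_polar ignores theta),
# so its x-derivative is 0 as well.
df_dtheta = (f_polar(r0, th0 + h) - f_polar(r0, th0 - h)) / (2 * h)

# Other order: df/dx at fixed y is 2x; in polar coordinates that is
# 2 r cos(theta), whose theta-derivative at fixed r is -2 r sin(theta), NOT 0.
def df_dx_at(r, th):
    x, y = r * math.cos(th), r * math.sin(th)
    return (f_cart(x + h, y) - f_cart(x - h, y)) / (2 * h)

mixed = (df_dx_at(r0, th0 + h) - df_dx_at(r0, th0 - h)) / (2 * h)

print(df_dtheta)   # 0.0
print(mixed)       # ≈ -2 r sin(theta), nonzero
```

The discrepancy is exactly the point made in the text: the pair $(x,\theta)$ are not independent variables, so the usual symmetry of mixed partials does not apply.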

Fortunately I was able to learn from my mistake (not sure how much the students benefitted other than from entertainment value), and later it helped me appreciate why shape function derivatives do not commute in general.

So even when notations suggestively lead us astray, there may be a good lesson to be found.

hardmath
  • 37,015
  • 1
    Very good point. It could also be pointed out that the reason "why this should be allowed" is because there is in fact a consistent interpretation of the notation where $dx$ and $dy$ are differentials in a Leibnizian sense, i.e., infinitesimals. – Mikhail Katz Apr 23 '14 at 09:28
  • Exactly, you're not just performing mindless symbolic manipulations. There's meaning in what you have written, and leads to reasonable results. – Disintegrating By Parts Apr 23 '14 at 16:58
  • I don't see any mindless symbolic manipulations here, only less than fully thoughtful comments. – Mikhail Katz Apr 29 '14 at 07:49
  • 3
    @T.A.E. I find your constantly referring to other people's contributions as "mindless" to be very rude and inappropriate for this forum. An objective statement like "there is no application of this notation" or "this notation is misleading" would be more than welcome (if reasons are given for the statements), but to call others' work "mindless" is extremely presumptuous and unpleasant. – DanielV May 08 '14 at 14:02
  • 1
    @T.A.E. Furthermore, considerations of notation and rules of inference (which is what the word calculus means) are very important to work in formal reasoning. It is very difficult to actually turn "calculus" into a calculus, and this forums posts have been very enlightening to that end. – DanielV May 08 '14 at 14:05
  • When you write $\partial(\partial f/\partial\theta)/\partial x$ and $\partial(\partial f/\partial x)/\partial\theta$, the reason why it doesn't commute is because you did not define any function that is explicitly a function of both $x$ and $\theta$. It is not a proper second partial derivative since you are tacitly changing the definition of $f$ in between. – The Quark Jul 18 '23 at 16:40
  • @TheQuark: Yes, this is an abuse of notation that leads to the apparent inconsistency. I tried to highlight that idea above. – hardmath Jul 18 '23 at 19:12
  • Indeed, but what I wanted to point out is that the abuse of notation in this case is not with the partial derivative notation $\partial$ in itself, but with the use of the same letter $f$ to actually designate two different functions. So not directly related to the original question. – The Quark Jul 18 '23 at 22:06
5

As you suggest in your own question, there is in fact no contradiction in Leibniz's notation, contrary to persistent popular belief. Of course, one needs to distinguish carefully between partial derivatives and derivatives in the notation, as you did. On an even more basic level, the famous "inconsistency" of working one's way from $y=x^2$ to $dy=2xdx$ was handled successfully by Leibniz, who was aware that he was working with a generalized notion of "equality up to" rather than equality "on the nose". These issues were studied in detail in this recent study.

The formula $\frac{dy}{dx}=\frac{dy}{du}\frac{du}{dx}$ holds so long as we assign to the independent variable $du$ in the denominator of $\frac{dy}{du}$ the same value as that given by the dependent variable $du$ in the numerator of $\frac{du}{dx}$. On the other hand, if as is usual one uses constant differentials $du$ in computing $\frac{dy}{du}$ the formula will be incorrect. In each instance one has to be careful about the meaning one assigns to the variables, as elsewhere in mathematics. For details see Keisler.
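The point about the two $du$'s can be made concrete with finite increments. In the sketch below (my own illustration, with assumed test functions $u=x^2$ and $y=\sin u$), the identity $\frac{\Delta y}{\Delta x}=\frac{\Delta y}{\Delta u}\frac{\Delta u}{\Delta x}$ holds as exact algebra precisely because the $\Delta u$ being divided by is the same increment that $\Delta x$ induces:

```python
import math

x0, dx = 1.3, 1e-5             # base point and increment (arbitrary choices)
u = lambda x: x**2             # assumed test functions, not from the answer
y = lambda t: math.sin(t)

du = u(x0 + dx) - u(x0)        # the change in u INDUCED by dx ...
dy = y(u(x0) + du) - y(u(x0))  # ... and the change in y induced by that du

# With the same du in numerator and denominator the cancellation is exact algebra,
# and dividing out dx recovers the chain-rule value in the limit.
print(dy / dx, (dy / du) * (du / dx))   # identical numbers
print(math.cos(u(x0)) * 2 * x0)         # the chain-rule value y'(u) u'(x)
```

Had we instead divided by an independently chosen increment of $u$ (e.g. a fixed "constant differential"), the middle quantity would no longer telescope to $\Delta y/\Delta x$, which is the failure mode described above.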

The OP reformulated his question in subsequent comments as wishing to understand how Leibniz himself viewed his theory and why he believed it works. This seems like a tall order, but it so happens that there is a satisfactory answer in the literature. Namely, while Leibniz was obviously unfamiliar with the ontological set-theoretic material we take for granted today, he had a rather clear vision of the procedural aspects of his calculus, and moreover articulated them clearly, a fact lost on many historians writing today. The particular paradox of the differential ratio $\frac{dy}{dx}$ being apparently not equal on the nose to what we expect, e.g., $2x$ (which in particular undermines the "tautological" proof of the chain rule in one variable) was explained by Leibniz in terms of his transcendental law of homogeneity. On Leibniz see article1 and article2.

The consistency of Leibniz's law is demonstrated in the context of modern set-theoretic assumptions in terms of the standard part principle.

Mikhail Katz
  • 42,112
  • 3
  • 66
  • 131
  • Actually, your answer does not explain the need for the added negative. Where is your quantitative analysis? – Disintegrating By Parts Apr 23 '14 at 19:06
  • The OP explained this quite nicely by pointing out that the two occurrences of $\partial f$ have different meanings and therefore cannot be canceled. Furthermore $dx$ and $\partial x$ do not have the same meaning. That should apparently be enough to refute an incorrect calculation. – Mikhail Katz Apr 23 '14 at 19:09
  • @user72694 Regarding the edit about the regular chain rule: In the linked book, I find on page 88 that "When $dy/dx$ is computed with $x$ as the independent variable and $dx/dt$... with $t$... the two $dx$'s have different meanings, and the equation is not trivial." But then why is it true!? In JulianRosen's post, why can we not assign the independent variable $du$ in the first derivative the same value as the dependent $du$ in the second, so that they cancel? –  Apr 25 '14 at 16:25
  • @NotNotLogical, there is a subtle difference between the equation involving the $dx$'s and the equation invoving the $\Delta x$'s, which is explained in detail in the book. I notice you asked above about my email address; this can be found at my homepage linked at my page here. The book is written at the level of freshman calculus and in my opinion is more accessible than most epsilon, delta based standard textbooks, but this is nontrivial material after all. – Mikhail Katz Apr 26 '14 at 18:44
  • @user72694: Interesting references! They add quite a few pages to my reading list ... I hope you will comment on my post too, if you stumble upon historical misleadings in my rendering of it! I am still learning when it comes to understanding what Leibniz did in fact mean regarding the ontological issues of his infinitesimals. – String May 02 '14 at 22:20
  • @String, Leibniz's considered view was that metaphysical considerations should not stand in the way of the effectiveness of mathematical procedures. This was a remarkably modern view, so much so that it was not acceptable to his students Varignon, Bernoulli, and l'Hopital. To Leibniz the infinitesimals were mostly fictions and at the risk of lengthening your reading list even further I can recommend this. – Mikhail Katz May 04 '14 at 12:54
  • @user72694: Thank you! My possible misconception of Leibniz's view stems from my supervisor K. Andersen who happens to be Bos's "better half". She gave me the impression that she and Bos both thought, that Leibniz picked explanations to suit his audience rather than to hold any static position on the subject himself. But your study suggests that one can read behind that and identify a rather modern and formalist account as Leibniz's general point of view on infinitesimals, right? Certainly more pages I want to read! – String May 04 '14 at 14:44
  • 1
    @String, I would certainly hate to disagree with either Prof. Andersen or Prof. Bos whose opinions I highly value and respect. Furthermore, what you write does not contradict at all what we say in our paper. It is certainly beyond dispute that Leibniz, like a good pedagogue that he was, adopted his explanation to the perceived audience being addressed, and there is nothing wrong with that. – Mikhail Katz May 04 '14 at 14:53
  • @user72694: Both Andersen and Bos are very knowledgeable, indeed, and I was struck with awe reading the succinct account found in Bos PhD. That said, it should always be fair to disagree with prominent conceptions if an alternative view can be sufficiently substantiated. Still I am very uncertain about the specific standpoint of Andersen and Bos, let alone of Leibniz, about Leibniz's view on the ontology of infinitesimals. You probably know all three better than I do. Regards, String. – String May 04 '14 at 20:47
  • The link "article1" seems to be broken. –  Sep 17 '18 at 00:19
  • I don't understand when you write "On the other hand, if as is usual one uses constant differentials $du$ in computing $dy/du$ the formula will be incorrect." What do you mean by "constant differentials"? And I don't see why the $du$ should actually be the same for the "formula" to be correct. They don't have to: the formula means $(f\circ g)'=g'\cdot(f\circ g)$ and is correct when one $du$ is an independent variable and the other a dependent variable... – The Quark Jul 18 '23 at 17:41
  • @TheQuark: the expression "constant differentials" is familiar to Leibniz historians; see for example the seminal study by Henk Bos from 1974 (and of course Leibniz frequently uses the term). This can be explained as follows. The independent variable $x$ ranges through a certain interval $I_x$. Consider a uniform partition of $I$ into infinitesimal subsegments (here "uniform" means that all the subsegments have equal length). If the dependence of $u$ on $x$ is nonlinear, then the corresponding partition of the interval $I_u$ where $u$ takes its values will not in general be uniform... – Mikhail Katz Jul 19 '23 at 10:00
  • ... In calculating the ratio $\frac{dy}{du}$, one needs to use the partition of $I_u$ that's induced by the partition of $I_x$ (rather than some uniform partition of $I_u$). Then the chain rule will be satisfied. – Mikhail Katz Jul 19 '23 at 10:02
  • Than you for replying. If I understand correctly, "constant differential" is meant to say that it is assumed that $d^2u=0$? So you say that either the $dy$'s and the $du$'s are the same on both sides or the relation doesn't apply? But doesn't the chain rule $(f\circ g)'=g'\cdot (f'\circ g)$ precisely say that the formula holds even if the $dy$'s and the $du$'s have not the same meaning on each side, including then also the fact that the $du$'s that appear on the right side do not have the same meaning either? – The Quark Jul 19 '23 at 10:32
  • 1
    @TheQuark, of course, the chain rule holds in full generality :-) I was merely trying to explain why the rule $\frac{dy}{dx}=\frac{dy}{du}\frac{du}{dx}$ is not an algebraic tautology that it seems to be: the $du$ in the denominator and the $du$ in the numerator don't have exactly the same meaning. The former is an independent variable, whereas the latter is a dependent variable (depending on $dx$). The way to prove this is to start with infinitesimal $\Delta x$, compute the corresponding $\Delta u$, and if the latter is nonzero, compute the corresponding $\Delta y$. Then one has... – Mikhail Katz Jul 19 '23 at 10:41
  • ... an algebraic relation $\frac{\Delta y}{\Delta x} = \frac{\Delta y}{\Delta u}\frac{\Delta u}{\Delta x}$. Then one takes the standard part of both sides to obtain the chain rule. This is all explained in detail in Keisler. – Mikhail Katz Jul 19 '23 at 10:43
5

Leibniz notation for the second derivative suggests a version of the chain rule: $$ \frac{d^2y}{du^2}\left(\frac{du}{dv}\right)^2=\frac{d^2y}{dv^2}. $$ This does not hold in general: for example, with $y=u=v^2$ the left-hand side vanishes (since $y$ is linear in $u$), while $\frac{d^2y}{dv^2}=2$.
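A quick numeric check of this counterexample (my own sketch; the base point and step size are arbitrary choices), using the central second difference $\big(f(t+h)+f(t-h)-2f(t)\big)/h^2$:

```python
# Second-difference check of the naive second-order "chain rule"
#   d2y/du2 * (du/dv)^2 = d2y/dv2
# on the example y = u = v^2 (so y(u) = u and y(v) = v^2).
h = 1e-3          # step size (arbitrary choice)
v0 = 1.0          # base point (arbitrary choice)

u = lambda v: v**2
y_of_u = lambda t: t             # y = u, linear in u
y_of_v = lambda v: y_of_u(u(v))  # y(v) = v^2

def second_diff(fn, t, h):
    # central second difference (fn(t+h) + fn(t-h) - 2 fn(t)) / h^2
    return (fn(t + h) + fn(t - h) - 2 * fn(t)) / h**2

du_dv = (u(v0 + h) - u(v0 - h)) / (2 * h)        # ≈ 2 v0
lhs = second_diff(y_of_u, u(v0), h) * du_dv**2   # ≈ 0: y is linear in u
rhs = second_diff(y_of_v, v0, h)                 # ≈ 2

print(lhs, rhs)   # the naive rule fails: 0 vs 2
```
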

Julian Rosen
  • 16,142
  • 4
The explanation was clearly understood by Leibniz and is presented in H. Bos's seminal study of Leibnizian methodology. The point is that one can write the second derivative as $\frac{d^2y}{du^2}$ only if the $du$ are constant differentials. Thus the cancellation can only occur if $u$ is a linear function of $v$, in which case the relation is of course true. One should perhaps ponder the fact that if Leibniz's notation really led to contradictions, we wouldn't be using it. – Mikhail Katz Apr 22 '14 at 21:54
  • @user72694 What study is that to which you are referring? Is it different from the one you provided in your answer? –  Apr 22 '14 at 22:24
  • @NotNotLogical, Bos's study is here. – Mikhail Katz Apr 23 '14 at 09:29
  • @user72694 Could you explain why this is so? I think that you are right in what you say, but I would like to fully comprehend the subtlety which is occurring here. And could I email you if I have further questions? I see that you have studied this area quite extensively. Thanks for the answer and links, cheers! –  Apr 23 '14 at 18:17
  • @NotNotLogical: are you referring to the issue of "constant differentials"? This is really not that mysterious and can be understood in the context of the curvature of a plane curve. It is known that given an arclength parametrisation of the curve the curvature is given by the norm of the second derivative (up to sign anyway; if one wants to get the sign for closed curves one needs to study orientation). On the other hand if the parametrisation is not arclength then the formula becomes a bit more complicated. Here arclength parametrisation corresponds to "constant differentials". In terms of.. – Mikhail Katz Apr 23 '14 at 18:22
  • ... modern infinitesimals, this can be explained by choosing a hyperfinite grid that is evenly spaced. So long as the grid is evenly spaced, one will be able to calculate the second derivative by the usual "finite difference" formula $\frac{f(x+h)+f(x-h)-2f(x)}{h^2}$. Otherwise it does not work. – Mikhail Katz Apr 23 '14 at 18:24
  • @user72694 Thanks for the response! This post has given me lots to think about, hopefully I will sort it all out in my mind eventually. –  Apr 23 '14 at 18:32
  • @user72694 I see what you are saying about constant differentials. But why then does the single-order chain rule work? Why can we cancel $(dy/du)(du/dx)=dy/dx$? The first $du$ would seem to be an independent change in $u$, whereas the second $du$ is a dependent response to altering $x$. So why do they cancel? –  Apr 25 '14 at 00:52
  • I will add a comment to my answer. – Mikhail Katz Apr 25 '14 at 08:53
  • Why not just use a simpler example for second derivatives, $$\frac{d^2x}{d^2u}\frac{d^2u}{d^2t} \ne \frac{d^2x}{d^2t}$$ for any nontrivial example? – DanielV Apr 27 '14 at 08:31
  • @DanielV Your notation is wrong. For an independent variable, the symbol "$d^2 u$" would mean $0$ (if we take it to mean anything at all). A second order difference $d^2 y$ only makes sense for a dependent variable $y(u)$. –  Apr 27 '14 at 17:52
  • @JulianRosen What is your take on the point about having constant differentials? Do you think your contradiction still stands, or would you change your view? –  Apr 29 '14 at 02:54
  • I finally got around to reading Bos' thesis, and was intrigued to find this exact contradiction addressed on Note 61 (of page 31) with essentially the same explanation that was provided by user72694 and @string. Too bad nobody quoted that! –  May 28 '14 at 01:41
3

Notation can be closely associated with contradictions. A good historical example comes from the work of Nieuwentijt who was a contemporary of Leibniz's. Nieuwentijt criticized Leibniz's approach and proposed his own notation where there are only first-order infinitesimals $\frac{r}{\infty}$ (where $r$ is an ordinary number), whereas the product of two such is postulated to vanish. He wrote a book based on this approach. Nieuwentijt's notation works for some simple calculus problems treated in his book. As we know in retrospect, it cannot be a basis for calculus, because it violates the Leibnizian law of continuity: whatever succeeds for the finite, succeeds also for the infinite, and vice versa. Namely, applying the usual rules of algebra to Nieuwentijt's system one quickly runs into contradictions.

An alleged "contradiction" in Leibniz that has been persistently reported in the literature since at least 1734 (date of publication of George Berkeley's The Analyst) is the contention that $dx$ is assumed to be nonzero at the beginning of the calculation, and zero at the end of the calculation, as for example when one wishes to write $2x+dx=2x$ at the end of the calculation for $y=x^2$. The alleged logical inconsistency can be summarized in modern notation as follows: $(dx\not=0)\wedge(dx=0)$.

This persistent claim of logical inconsistency however has no basis as Leibniz clearly and repeatedly stated in his writings that he is working with a generalized relation of "equality up to" (rather than an equality "on the nose"). In particular, Leibniz never wrote or implied that $dx=0$, contrary to Berkeley's contention. This issue was dealt with in detail in a recent article in Erkenntnis here.

Thus the alleged logical inconsistency occurs only in attacks of the critics who have misunderstood Leibniz rather than in Leibniz's work itself.

Mikhail Katz
  • 42,112
  • 3
  • 66
  • 131
  • I awarded the bounty to this answer, because not only does it address the perspective suggested about what is really at stake in this question, but it also provides arguments and references for the analysis. I found it hard to choose between this answer and that by mweiss, since mweiss was the first address the inherent problems with the nature of the question itself. I think, however, that that discussion tended to be more of a dispute about wording rather than on substance if not elaborated upon. This answer got closest to carry out such an elaboration. Thank you! – String May 09 '14 at 08:21
3

Question Update


EDIT: Concerning the issues in this thread, I am wondering if anyone has further insights into why we are permitted to multiply through by $dt$ in, for example, $$\frac{dy}{dx}=\frac{dy/dt}{dx/dt}$$ We start with $dx$ an independent change, and $dy$ the corresponding perturbation in $y(x)$. We multiply through by (presumably) an independent change in $t$. Insofar as neither of the differentials in $x$ or $y$ results from the change in $t$, why are we justified in taking the two expressions to be derivatives $y'(t)$ and $x'(t)$?


As a few comments of mine (specifically, where @JulianRosen stands on his post, and some comments directed at @String regarding his post) have not been responded to, and the bounty time is running out, I am going to post an update on how I stand right now.

The main issue for me has revolved around dealing with the example $y=u=v^2$ for which the claim $$\frac{d^2y}{dv^2}=\frac{d^2y}{du^2}\left(\frac{du}{dv}\right)^2$$ appears to be false.

I initially thought that the claim was not meaningful in the Leibniz notation, because the $du$ on the far right is a dependent variable and will therefore not return a constant change in the second derivative, explaining why the formula is wrong.

However, the recent post by @String has (tentatively) convinced me that in fact the issue is much more subtle. The formula is in fact true - it just means something different from what we take it to mean. Specifically, the term $d^2y$ refers to a second-order difference - that is, a change in the change in $y$ caused by altering some independent variable (in this case $v$) - and that is all it is, nothing more. Hence the fraction $$\frac{d^2y}{du^2}$$ is simply a ratio of two differentials, one a second-order difference and the other the square of a first-order difference. A priori then, the above expression has NOTHING to do with our notion of a "second derivative" (that is, the derivative of the derivative). We could perhaps show that the two are equal, but this cannot be assumed. And in fact, they are not equal! As @String explained, we can show that $$\frac{d\left(\frac{dy}{du}\right)}{du}=\frac{d^2y}{du^2}-\frac{dy\,d^2u}{du^3}$$ where the left side is, by definition, the actual true "second derivative", and the right side is an expression equal to it. Hence we find that the ratio $$\frac{d^2y}{du^2}=\frac{d\left(\frac{dy}{du}\right)}{du}+\frac{dy\,d^2u}{du^3}$$ differs from the second derivative by the term on the right, and the product in the first formula at the very top becomes $$\frac{d^2y}{du^2}\left(\frac{du}{dv}\right)^2=\left(\frac{d\left(\frac{dy}{du}\right)}{du}+\frac{dy\,d^2u}{du^3}\right)\left(\frac{du}{dv}\right)^2=\frac{d\left(\frac{dy}{du}\right)}{du}\left(\frac{du}{dv}\right)^2+\frac{dy\,d^2u}{du\,dv^2}$$ Re-writing, we get the claim $$\frac{d^2y}{dv^2}=\frac{d\left(\frac{dy}{du}\right)}{du}\left(\frac{du}{dv}\right)^2+\left(\frac{dy}{du}\right)\frac{d^2u}{dv^2}$$ In modern notation, this says the following. Let $y=y(u)$ and $u=u(v)$.
Since we can take $d^2v\equiv 0$, the two ratios involving it are true second derivatives (see the post by @String), and so the claim is $$\frac{d^2}{dv^2}\big(y(u(v))\big)=y''(u(v))\cdot u'(v)^2+y'(u(v))\cdot u''(v)$$ which, of course, is valid. In other words, the "false" equation which appears in @JulianRosen's post is true, but its meaning is more complicated than I initially thought. The Leibniz differentials need to be interpreted as broadly as possible, simply representing $n$-th order differences, without assuming them to equal $n$-th order derivatives.
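The corrected formula at the end is the familiar second-derivative chain rule, and it can be sanity-checked numerically. The sketch below is my own illustration ($y=\sin u$ and $u=v^3$ are assumed test functions, and the base point is arbitrary); it compares a second difference of the composite against $y''(u)u'^2+y'(u)u''$:

```python
import math

# Sanity check of the corrected rule
#   (y o u)''(v) = y''(u(v)) u'(v)^2 + y'(u(v)) u''(v)
# with assumed test functions y(u) = sin(u), u(v) = v^3.
v0 = 0.8                                    # arbitrary base point
u = lambda v: v**3
u1, u2 = 3 * v0**2, 6 * v0                  # exact u'(v0), u''(v0)
y1, y2 = math.cos(u(v0)), -math.sin(u(v0))  # exact y'(u0), y''(u0)
rhs = y2 * u1**2 + y1 * u2                  # right-hand side of the rule

comp = lambda v: math.sin(u(v))             # the composite y(u(v))
h = 1e-4
lhs = (comp(v0 + h) + comp(v0 - h) - 2 * comp(v0)) / h**2  # second difference

print(lhs, rhs)   # agree to several decimal places
```

Note that the second difference of the composite is taken on an evenly spaced grid in $v$, which corresponds exactly to the choice $d^2v\equiv 0$ made in the text.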


That is the best I can do to communicate the current state of my (somewhat tenuous) understanding of these issues. Please feel free to comment on this post to point out any mistakes or misunderstandings that present themselves. And thank you to everyone who has participated for this amazing discussion! I hope it will be more-or-less resolved by the time I must award the bounty - and unfortunately I can't give the bounty to multiple people, although several here deserve it :) Cheers!

  • Bravo! I agree with your account leading to the relation for the true second derivatives of $y(u(v))$. Well done! Note that your use of the word differences to mean infinitesimals was not uncommon to Leibniz and his contemporaries whereas today differences is solely used for actual finite differences like $\Delta x$ and infinitesimals like $dx$ are called differentials. – String May 02 '14 at 01:01
The progression of the variables, to my knowledge identified and named so by Bos, imposes non-derivative-like behaviour on higher order differentials in general. In fact you can also have neither $ddv\equiv 0$ nor $ddu\equiv 0$ and still deduce meaningful relations. Some detail will follow in my next comment. – String May 02 '14 at 01:08
  • One way to leave the progression undetermined as Bos call it, ie. $ddv,ddu\not\equiv 0$, and still connect the relations to modern functions and derivatives is to introduce a new variable $t$ with $ddt\equiv 0$. Then the original equation by @JulianRosen may be written as: (see comment below) – String May 02 '14 at 01:13
  • $$ \frac{\left(\frac{d^2 y}{dt^2}\right)}{\left(\frac{du^2}{dt^2}\right)}\cdot\left(\frac{\left(\frac{du^2}{dt^2}\right)}{\left(\frac{dv^2}{dt^2}\right)}\right)^2=\frac{\left(\frac{d^2 y}{dt^2}\right)}{\left(\frac{dv^2}{dt^2}\right)} $$ – String May 02 '14 at 01:18
  • So then, using modern functions, we simply have $y(t),u(t)$ and $v(t)$ and then Julian Rosen's equation becomes $$\frac{y''(t)}{u'(t)^2}\cdot\left(\frac{u'(t)}{v'(t)}\right)^2=\frac{y''(t)}{v'(t)^2}$$ which is indeed true. Simple as that :) – String May 02 '14 at 01:21
  • Brilliant! And because $dt$ is truly independent, we can always set it to the same value and thus justify multiplying and dividing it throughout the identity. In fact, this sort of justifies all "differential claims" from a modern standpoint... very nice. PS you may want to add some of this to your post for future readers, I think it helps to further clarify your arguments. –  May 02 '14 at 02:35
The ability to put "algebraic" manipulations of derivatives on a sound footing owes much to the modern theory of linear operators on function spaces, in particular transform theory. I think we owe Leibniz credit for his skilled avoidance of notational abuse. – hardmath May 02 '14 at 16:16
  • @String See my edit, if you are interested in responding. (At this point, I would recommend starting a new answer to keep things fairly 'modulated', your original answer is quite long) –  May 05 '14 at 04:12
3

I think the question is ill-posed (or, what amounts to the same thing, makes some incorrect assumptions).

Notation does not lead to contradictions. Ever. In any discipline. That is because notation does not assert anything. Notation has no truth value. Notation consists of a set of symbols for recording things, and a set of rules for manipulating those symbols. The power of Leibniz's notations is precisely that when those rules are properly followed we end up with formulas that look like familiar fraction cancellation laws, etc., which makes them easier to remember and conceptually easier to understand. If Leibniz's notation is misused, then, yes, apparent contradictions can arise -- but that is not a flaw in the notation, but rather a flaw in those who misuse the notation.

Looking over all of the answers in this thread, including the example in the OP, you will find the same kind of dialectic over and over: "Some people write < formula >, which looks like a contradiction or inconsistency, but that is only because < formula > really means < other formula >." Yes, precisely. If notation is used wrong you get wrong answers; if you use notation correctly, you get correct answers. If you get a result that you know is false, you can be sure that the notation has been misused.

Now what people generally mean, I think, when they critique Leibniz's notation as "leading to contradictions" is that certain mis-uses of the notation are very tempting, and people are prone to making them. This may be true, although I would counter that other notations (primes, dots, what have you) also have their "attractive nuisances". But that is a psychological problem, having to do with the human tendency to look for shortcuts and to perceive seemingly apparent patterns that are not really there; the fault lies not with our d's but with ourselves.

mweiss
  • 23,647
  • Interesting perspective. I agree, on an abstract level, with most of what you have said. For me however, the issue is not so much that there are "very tempting" uses/mis-uses of Leibniz' notation - it is why there are "very tempting" uses of his notation. To put it differently, Leibniz notation allows many facts of calculus to be derived easily with simple algebraic rules. This suggests that differentials are, in fact, algebraic objects and that calculus can be thought of in this way without contradiction. But there are contradictions, so how do we explain those? How literal is "$dx$"? –  May 02 '14 at 02:29
  • 1
    ...Your position would require us to build an alternative version of calculus (presumably based on $\epsilon\delta$-limits) which we could use to verify the rules derived from "Leibnizian" arguments. My question is: how can we know when Leibniz differentials produce valid results within the system itself. Denying ourselves modern knowledge - putting us in the place of Leibniz - how can we distinguish between valid and invalid arguments based on differentials? That is the core of my question. I would contend that there are "very temping" uses because, in fact, those uses are valid. –  May 02 '14 at 02:32
  • Well, then, it seems to me that your issue is not with Leibniz's notation but with his theory. Leibniz's theory was based on infinitesimals, and we know (thanks to Robinson's nonstandard analysis) that there are consistent theories in which infinitesimals exist. So the idea of infinitesimals does not, in itself, lead to any contradictions. – mweiss May 02 '14 at 02:47
  • And I'll disagree that Leibniz's notation "allows us to derive" facts of calculus. It allows us to express them simply, but I don't think most people would accept as a derivation or a proof an argument that relied on plausible-seeming notation. – mweiss May 02 '14 at 02:49
  • 1
Of course for several centuries calculus was based on not-very-rigorous foundations, and people did all kinds of derivations using heuristics and plausible-seeming formulas. Some of those results turned out to be right, but we would not accept them as "proofs" in the modern day. Such heuristics could very well lead to contradictions, but in the face of such a contradiction there would always be a possible rebuttal of the form "You are misusing the notation" and then we are back to square 1. Which is why I think the question is ill-posed. – mweiss May 02 '14 at 02:51
  • I am interested in how Leibniz' notation aligns with and reflects his theory. On your point about deriving facts - this is essentially the heart of the question. I am perfectly aware that superficially, these 'derivations' are nothing more than accidents of notation. However, is there a deeper understanding of what the notation represents, based on a consistent understanding of calculus apart from limits and operators seen today, which allows these to be true proofs. –  May 02 '14 at 02:51
  • But note that although they could have derived falsehoods, it appears they did not. Nobody has yet produced an explicit error that Leibniz obtained due to his misleading notation. It appears that he wielded it with total accuracy, which suggests that he was not focusing on the notation so much as he was on the ideas. He then fine-tuned his notation to reflect those ideas. So why does the notation sometimes lead to apparent errors? Must we take $dy/dx$ to be an abstract object, the limit of a certain difference quotient, or can we regard it as involving true differentials? –  May 02 '14 at 02:54
  • By "nobody has yet produced" I of course meant "nobody in this discussion" not "nobody in the literature" which I cannot claim to be well acquainted with. I think that I perhaps jumped to some conclusions in these last two posts - you definitely have a point about the "circular" nature of misusing notation. +1 for making me re-think everything :) Cheers! –  May 02 '14 at 03:01
  • Well, again, the fact that Leibniz's infinitesimal-based theory can be embedded and replicated inside Robinson's nonstandard analysis proves, pretty definitively I think, that the theory is immune to inconsistency. – mweiss May 02 '14 at 03:06
  • Right, but it doesn't explain why it's immune to inconsistency. As far as we know, it's still just a miraculous accident of notation. I'm looking for the internal justification. That is, how did Leibniz, who knew nothing of Robinson or Weierstrass, know that his results were correct? –  May 02 '14 at 03:13
  • I think you're moving the goalpost. The question was "Can a contradiction be demonstrated?" Robinson's nonstandard analysis provides a rigorous theory of infinitesimals, with which all of Leibniz's heuristics can be justified, so the answer is No, no contradiction can be demonstrated. Now you want to know how Leibniz knew his theory was sound. That is again a psychological question, not a mathematical one. Maybe he was convinced on aesthetic grounds, or theological ones. Or maybe he wasn't convinced at all and just hoped it would all work out. – mweiss May 02 '14 at 03:16
  • I see your point. I meant 'contradiction' in the sense of 'does Leibniz notation (as he used it) ever lead to a falsehood'. Can we showcase a 'trap' that Leibniz may have fallen into? The post by Julian Rosen seems to answer "yes". Leibniz was not following the complex rules given by Robinson which were proved consistent. But a few people have argued that Leibniz would not have viewed second-order differentials as we do, and so would not have erroneously claimed that equation to be true. The question, I hope, is entirely mathematical. I am looking for the justification that such a method works –  May 02 '14 at 03:31
  • And let me stress again, internal justification. Not justification based on external results such as Robinson or Weierstrass, though these may serve as helpful "checks". I want to understand - and this is ultimately what the question is getting at - what the notation really means, and what ideas it represents. I have only ever had a "fuzzy" idea of why everything seems to magically work out when using the notation. –  May 02 '14 at 03:33
  • Okay, now you're really asking for something impossible. No mathematical system can prove its own consistency. Consistency proofs, by their very nature, require you to step outside of a system and study it from without. – mweiss May 02 '14 at 04:01
  • I have to agree with mweiss to some extent. I think what NotNotLogical was really searching for was a detailed, personal, and maybe intuitive understanding of what Robinson proved in a presumably very technical, non-intuitive way. The more times I write his name, the more I see that I have to read his theory some day :) – String May 02 '14 at 09:16
  • Several comments back you wrote: "Must we take dy/dx to be an abstract object, the limit of a certain difference quotient, or can we regard it as involving true differentials?" The answer is, no, you do not need to take it as the limit of a certain difference quotient, and you can regard it as involving true differentials. (I'm not sure what you mean by "an abstract object".) Is that what you're looking for? – mweiss May 02 '14 at 13:55
  • Yes, I think what String wrote is correct. "A detailed personal and intuitive" understanding of why things work out is exactly what I am looking for. Why do Leibniz-type manipulations make sense? Not "are they true?", which several people tell me they are, just why do they work out intuitively. I think I have just had some trouble articulating what I was getting at. And I too need to investigate Robinson's work :) –  May 02 '14 at 17:18

Introducing a seemingly independent variable $t$

NotNotLogical asked, in the latest edit of his answer, why it should be allowed to introduce $t$ and write $$ \frac{dy}{dx}=\frac{dy/dt}{dx/dt} $$ This touches on one major difficulty in Leibniz's original calculus: regularity conditions (differentiability, continuity, etc.) were only very vaguely thought of. To some extent this was simply because the notions we use to formulate such properties (functions, limits, etc.) had not yet been condensed and conceptualised at that point in history. They did know of singularities very well, though.

That said, maybe other notions could have done an equally proper job. However, the 17th-century mathematicians were only concerned with curves of great regularity. I am not even certain whether they would have classified modern "monstrous examples" like the Weierstrass function as curves at all.

One obvious way to introduce another variable without violating the ideas of the historical account of Leibniz's calculus would be to let $dt$ be the line element of the curve $(x,y)$, so that $dt^2=dx^2+dy^2$. If $dx$ and $dy$ make sense (i.e. the curve $(x,y)$ is sufficiently regular), this relation should make sense too.

What would never work is to introduce $dt$ without any relation to $dx$ and $dy$. Relations other than $dt^2=dx^2+dy^2$ could also work, but the differentials have to be interdependent in order for fraction-like manipulations with them to be well founded.
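The fraction-like identity $\frac{dy}{dx}=\frac{dy/dt}{dx/dt}$ can at least be sanity-checked symbolically. A minimal SymPy sketch (my hypothetical example, not part of the historical discussion), using the regular parametrisation $x=t^2$, $y=t^4$, so that $y=x^2$ and $dx/dt\neq 0$ for $t>0$:

```python
import sympy as sp

t = sp.symbols('t', positive=True)
# A hypothetical regular parametrisation: x = t^2, y = t^4, so y = x^2
x = t**2
y = t**4

# dy/dx computed as the ratio (dy/dt)/(dx/dt)
ratio = sp.diff(y, t) / sp.diff(x, t)

# dy/dx computed directly from y = x^2, then evaluated along the curve
u = sp.symbols('u')
direct = sp.diff(u**2, u).subs(u, x)

assert sp.simplify(ratio - direct) == 0  # both give 2t^2
```

Wherever the parametrisation degenerates ($dx/dt=0$), the ratio on the right is undefined, which matches the regularity caveat above.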

String

It is easy to see that the first differential does not change under a change of variable: if $y$ is a function of $x$, then $dy = y'(x)\,dx$. Treating both as functions of $t$ does not break this: $dy = y'(t)\,dt = y'(x)\,x'(t)\,dt = y'(x)\,dx$. Identities like $\frac{dy}{dx}=\frac{1}{\frac{dx}{dy}}$ also hold.

But this fails for the second and higher differentials: $d^2y=y''(x)\,dx^2$ only if $x$ is the independent variable. In general, for functions of $t$, we have $d^2y=y''(x)\,dx^2+y'(x)\,d^2x$, so the problem is created by the hidden assumption $d^2x=0$.

For functions of several variables the situation is similar, but note that if, for example, $u=f(x,y,z)$, then $du=\frac{\partial f}{\partial x}\,dx+\frac{\partial f}{\partial y}\,dy+\frac{\partial f}{\partial z}\,dz$, while the partial derivatives themselves cannot be interpreted as ratios of differentials.

To sum up, differential notation is not contradictory in terms of first derivatives, but it must not be used in proofs of the differentiation rules, because the chain rule is needed to justify these manipulations in the first place.
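The failure of the naive second-differential rule can be checked with SymPy. A sketch, using the hypothetical change of variable $x=t^2$ (so $d^2x\neq 0$) and $y=\sin x$; the names here are my own choices for illustration:

```python
import sympy as sp

t = sp.symbols('t')
x = t**2          # a nonlinear change of variable, so d^2x != 0
y = sp.sin(x)     # y as a function of x, evaluated along the curve

d2y_dt2 = sp.diff(y, t, 2)

# Naive rule d^2y = y''(x) dx^2, i.e. pretending d^2x = 0
u = sp.symbols('u')
naive = sp.diff(sp.sin(u), u, 2).subs(u, x) * sp.diff(x, t)**2

# Full rule: d^2y = y''(x) dx^2 + y'(x) d^2x
full = naive + sp.diff(sp.sin(u), u).subs(u, x) * sp.diff(x, t, 2)

assert sp.simplify(d2y_dt2 - full) == 0    # the full rule holds
assert sp.simplify(d2y_dt2 - naive) != 0   # the naive rule fails here
```

When $x$ is an affine function of $t$, the correction term $y'(x)\,d^2x$ vanishes and the two rules coincide, exactly as the answer states.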

  • $dy/dx = 1/(dx/dy)$ only when $y$ is monotonically increasing or decreasing. – Alex R. Apr 30 '14 at 19:47
  • @AlexR. By the Inverse Function theorem, this statement is locally true for any continuously differentiable function $y(x)$ at any point where $\frac{dy}{dx} \neq 0$. – Davis Yoshida May 01 '14 at 00:00
  • @DavisYoshida: in other words locally monotonic – Alex R. May 01 '14 at 00:18
  • @AlexR. Oh I thought you meant globally. My mistake. – Davis Yoshida May 01 '14 at 00:48
  • @AlexR. What's your point though? The equivalent "formal" version $\big(f^{-1}\big)'=1/f'\big(f^{-1}\big)$ has the same conditions on it... –  May 01 '14 at 01:06
  • @user135508, the "prime" notation $'$ should not be used for variables, only for functions. This is because $y'(x)$ is ambiguous. – Mikhail Katz May 01 '14 at 13:15
  • It is actually possible to express partial derivatives as ratios, although I'm not sure that it's very helpful. Specifically, if $du = A\,dx + B\,dy + C\,dz$, then $A = (du \wedge dy \wedge dz)/(dx \wedge dy \wedge dz)$, where $\wedge$ is the wedge product from the theory of exterior differential forms. As $A$ is not simply 'the partial derivative of $u$ with respect to $x$' but rather 'the partial derivative of $u$ with respect to $x$ when $y$ and $z$ are held constant', the presence of $dy$ and $dz$ in this ratio is not unreasonable. – Toby Bartels Apr 23 '17 at 03:50

Alright, I kinda don't understand the question, but consider the product rule for derivatives:

$$\frac{d(uv)}{dx} = \frac{du}{dx}v + \frac{dv}{dx}u$$

$$uv = \frac{dx}{d}\left(\frac{du}{dx}v + \frac{dv}{dx}u\right)$$

$$uv = \int\left(\frac{du}{dx}v + \frac{dv}{dx}u\right)dx$$

Since the inverse of $\frac{d}{dx}$ is $\frac{dx}{d}$, this cannot lead to a contradiction, because:

If $dx = 0$, then $\frac{d}{dx}$ is undefined; it follows that it has no inverse, which means that the integral $\int \frac{d(uv)}{0}$ can be ANY constant $C$, since the RATE OF CHANGE is undefined.

For simplicity, I assume Leibniz defined $\frac{d}{dx} = 0$ whenever $\frac{d}{dx}$ is undefined, which here is the case $dx = 0$.
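The step from the product rule to the integral can at least be checked symbolically. A minimal SymPy sketch, using the hypothetical choices $u=e^x$ and $v=\sin x$ (my own examples, not from the answer):

```python
import sympy as sp

x = sp.symbols('x')
u = sp.exp(x)   # hypothetical example function
v = sp.sin(x)   # hypothetical example function

# Integrate the product-rule expansion du/dx*v + dv/dx*u back up
recovered = sp.integrate(sp.diff(u, x)*v + sp.diff(v, x)*u, x)

# It agrees with uv up to the constant of integration
assert sp.simplify(recovered - u*v).is_constant()
```

This only confirms the antiderivative relation for specific smooth functions; it says nothing about the "$dx/d$" notation itself, which the comments below take issue with.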

  • You are correct about the indefinite integral of $uv$. Note the similarity to the rule of integration by parts: $$uv=\int u\,dv+\int v\,du \iff \int u\,dv=uv-\int v\,du$$ I think when you write $d/dx$ you are mixing up "operators" and "differentials". Differentials, which are being discussed in this post, are more akin to numbers that are manipulated algebraically. So you would always have $d$"something", never just $d$ by itself. The symbol $\frac{d}{dx}$ is more of a modern notation, and represents an operator (the derivative). It takes in a function, and outputs that function's derivative. ... –  May 01 '14 at 19:48
  • ... so I don't think, under any interpretation, it would make sense to "flip" $d/dx$ and get $dx/d$ - it's not really an algebraic quotient. PS If you want to make the math in your post look awesome, put dollar signs "$" to the left and right of the math. This gives a good summary of how to code everything: http://meta.math.stackexchange.com/questions/5020/mathjax-basic-tutorial-and-quick-reference –  May 01 '14 at 19:50
  • thanks for the heads up about mixing up the differentials and operators, also for the math code :) – Greg Dillon May 01 '14 at 20:53