38

I'm just learning about differential equation separability. I understand what a derivative is. One notation for derivative is $\frac{dy}{dx}$, which - misleadingly - is not a fraction. Since it's not a fraction, why are we "separating" differential equations by treating it as if it were a fraction? For example:

We have the following differential equation: $$\frac{dy}{dx} = y.$$

Then we separate the... whatever they are: $$\frac{dy}{y} = x\cdot dx.$$

What do $dy$ and $dx$ even represent when they are detached from each other? How is this valid math?

Then we integrate both sides of the equation. Even though we are integrating one side with $dy$ and the other side with $dx$, the equality is somehow magically not broken.

Also, somehow integrating $dy$ yields $y$, but integrating $dx$ doesn't yield $x$.

So my question is how do you make sense of $dy$ and $dx$ variables as they are separated from each other?

Ankoganit
bkoodaa
  • The answer by petru is great, and if you happen to have a copy of Stewart's Calculus, the section on differentials would be helpful to check out. – The Count Feb 13 '17 at 18:04
  • @TheCount, agreed. Additionally, while likely not needed, if you want/need a more rigorous treatment to help you understand these concepts, that falls into the realm of "non-standard analysis." Non-standard analysis seeks to make calculus rigorous without the typical delta-epsilon techniques (done by rigorously treating infinitesimals). – David Feb 13 '17 at 18:11
  • @Atte Juvonen is the right hand side xy or y? – Narasimham Feb 13 '17 at 18:15
  • @Narasimham I would assume it's $xy$ in the first equation; it would be much easier to omit a single $x$ than to accidentally write $x*$ in the second equation. – David Feb 13 '17 at 18:16
  • Berkeley's famous definition must be said: differentials are "the ghosts of departed quantities". (Of course, he said this to castigate differentials, which were used freely in his day despite not having any solid definition at the time.) – Paul Sinclair Feb 14 '17 at 00:30
  • $\frac{\text{d}y}{\text{d}x}$ is a fraction. That's literally how derivatives are defined, and Leibniz's notation was constructed specifically to make this nature obvious. The Wikipedia article on non-standard calculus provides background on the rigorous definitions (as opposed to the "just believe it on faith" approach some classical folks advocate). – Nat Feb 14 '17 at 05:50
  • See also http://math.stackexchange.com/questions/1252405/is-it-mathematically-valid-to-separate-variables-in-a-differential-equation. – Hans Lundmark Feb 14 '17 at 07:30
  • I think probably the simplest way to understand it is that you can do anything to both sides of an equation, including integrate with respect to whatever you choose (as long as it is done carefully). So to solve $\frac{dy}{dx}=\frac{g(x)}{f(y)}$, we actually just multiply by $f(y)$ and then integrate w.r.t. $x$, both completely uncontroversial steps: $\int f(y)\frac{dy}{dx}dx=\int g(x)dx$. The notation $\int f(y)dy=\int g(x)dx$ can just be thought of as a shorthand for this, if one is uncomfortable with splitting apart the differentials. It just so happens that $\int y y' dx = \int y dy$. – jdods Feb 14 '17 at 21:55
  • @Nat: As I've stated in comments to you, NSA requires rather strong set theoretic axioms that are actually unnecessary for real analysis. In contrast, if you just stick to asymptotic expressions you can do pretty much everything that NSA can do in a completely intuitive and rigorous way. Specifically, this post gives a brief outline of how one can interpret Leibniz's notation using asymptotic notions that captures the original idea of fractions of small changes without requiring infinitesimal quantities. – user21820 Feb 17 '17 at 07:32

7 Answers

28

Treating $\frac{dy}{dx}$ like a fraction is, as you have correctly stated, not really correct. What's really going on is the following (to stay with your example): take the differential equation

$$\frac{dy}{dx}=y(x)$$

and divide both sides by $y(x)$ (this is not a trivial step: check out user21820's answer for an example of what can go wrong here if you aren't careful; I won't focus on it further, because I don't believe it to be the main thing you are enquiring about) to obtain

$$\frac{dy}{dx}\cdot\frac{1}{y(x)}= 1$$

Using the chain rule backwards, we rewrite the LHS:

$$\frac{dy}{dx}\cdot\frac{1}{y(x)}=\frac{d}{dx}\left(\ln|y(x)|\right)$$

Plugging this in, we have

$$\frac{d}{dx}\left(\ln|y(x)|\right) = 1$$

Next we integrate both sides with respect to $x$ to obtain

$$\ln|y(x)| = x + C$$

Exponentiating and absorbing the $\pm$ that reflects the absolute value into our new constant $D$, we have the familiar solution

$$y(x) = De^x$$

Whenever you are doing separation of variables, this is effectively what happens in the background. Treating $\frac{dy}{dx}$ as a fraction is a useful way to remember this method easily and is a lot easier to write down. It also yields the correct results, so people in more applied subjects like physics do it all the time, after having ideally seen what's behind it at least once.

I think you can also make rigorous sense of $\frac{dy}{dx}$ as the ratio of two infinitesimal quantities using differential forms, but this is an area I have not studied, so I can't help you there.
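As a numerical sanity check (a minimal Python sketch of mine, not part of the original answer; the function names are arbitrary), Euler's method applied to $\frac{dy}{dx}=y$ with $y(0)=D$ does converge to $De^x$:

```python
import math

def euler(f, y0, x_end, n):
    """Integrate dy/dx = f(x, y) from x = 0 to x_end in n Euler steps."""
    x, y = 0.0, y0
    h = x_end / n
    for _ in range(n):
        y += h * f(x, y)
        x += h
    return y

D = 2.5  # arbitrary initial value y(0) = D
approx = euler(lambda x, y: y, D, 1.0, 100_000)
exact = D * math.e  # closed-form solution y(x) = D * e^x evaluated at x = 1
print(abs(approx - exact))  # small, and shrinks further as n grows
```

The point is only that the closed-form answer obtained by the chain-rule argument matches a direct numerical solution of the equation.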

Tom
  • This answer's plainly wrong; derivatives are defined as a fraction of infinitesimal quantities, which Leibniz's notation was constructed to reflect. Also, the chain rule comes from this definition, so your alternative approach is invalid if you choose to reject it. – Nat Feb 14 '17 at 05:45
  • @nat Derivatives can be defined in too many ways to count. The definition as a fraction of infinitesimal quantities is quite possibly the least common one, since developing rigorous infinitesimals is quite hard. The most common definition of a derivative I've seen uses limits, though you can surprisingly easily define them using a formalistic approach as a linear operator satisfying some rules. The chain rule itself also follows in too many ways to count, depending on what you choose as definitions. Altogether this answer is not wrong, and certainly not plainly so. – DRF Feb 14 '17 at 08:47
  • @DRF The limit-of-a-fraction definition presented in most intro textbooks is exactly identical to the hyperreal definition, which is necessarily consistent in any system that admits reals. While it's possible to define derivatives without explicitly stating that they're equivalent to a ratio, it's impossible to define derivatives without that property holding. – Nat Feb 14 '17 at 08:59
  • @DRF Actually, to say it another way: Because any system with real numbers automatically has hyperreal numbers, any system with real numbers necessarily contains a way to represent any derivative as a ratio. Then, denying that a derivative "is" a ratio is like denying that $1+1$ "is" $2$: it can only be supported by a crass refusal to acknowledge the conversion. – Nat Feb 14 '17 at 09:08
  • @nat What do you mean by "any system with real numbers automatically has hyperreal numbers"? You can certainly have models of the real numbers with no hyperreal numbers. You can always extend such a model to a model containing hyperreals, but it doesn't satisfy the same properties (though it does satisfy the same first-order properties), e.g. the resulting field is not Archimedean. As for the definition being the same... it's not. The resulting object is the same (isomorphic). (Anyway, we should move to chat.) – DRF Feb 14 '17 at 09:32
  • @Nat: I hereby challenge you to solve the problem in my post below using your method, without looking at the hint or solution sketches. The chain rule holds only under specific conditions if you wish to treat all variables on equal footing so that NSA (non-standard analysis) can be used. Also, NSA relies on AC, or at least an ultrafilter on the naturals, while classical analysis does not. And what if I don't believe in the existence of an ultrafilter on N? – user21820 Feb 14 '17 at 11:53
18

Most explanations of the method of separating variables do not make clear that it only works on a region where the arithmetic operations are all valid, including the division by $y$. Here is an example where the method fails to find the correct answers if you perform invalid operations anyway. (Well, what do you expect?)

Solve for $y$ as a function of a real variable $x$ given that the differential equation $\frac{dy}{dx} = 2\sqrt{y}$ holds.

Even Wolfram Alpha gets it wrong. Most students and some teachers will fail to get it right, and also fail to identify their mistake when told they are wrong, because fixing the mistake will require a proper foundation in logic.

Hint

The answer is not $y = (x+a)^2$, which you would get by the method of separating variables. What went wrong? Note that the error would still be there if you used the theorem that allows change of variables in an integral. Look carefully at each deduction step. One step cannot be justified based on any axiom. Think basic arithmetic. After you get that, you need to consider cases and use the completeness axiom for reals to extend the open intervals on which the standard solution works.

Solution sketch

The field axioms only give you a multiplicative inverse when it is not zero. Now how to solve the problem? Split into cases. Note that you need to work on intervals since having isolated points where $y$ is nonzero is useless. First prove that for any point where $y \ne 0$, there is an open interval around $x$ for which $y \ne 0$. Then we can use the completeness axiom for reals to extend the interval in both directions as far as $y \ne 0$. Now we can use any method to solve for $y$ on that interval. Note that the method of separating variables is formally invalid, so we should use the change of variables substitution. But the prerequisite for that is that $\frac{dy}{dx}$ is continuous, so we need to prove that! Well, $y$ is differentiable and hence continuous, so $2\sqrt{y}$ is continuous. So we get the solution on the extended interval, and it shows that $y$ becomes zero in exactly one direction in this example. Hence after some checking you will get either $y = 0$ or $y = \cases{ 0 & if $x \le a$ \\ (x-a)^2 & if $x > a$ }$ for some real $a$.

Alternative subproof

In fact, the substitution theorem can be completely avoided as follows. On any interval $I$ where $y \ne 0$, we have $y'^2 = 4y$, where "${}'$" denotes the derivative with respect to $x$. Thus $(y'^2)' = (4y)'$, which gives $2y'y'' = 4y'$, and hence $y'' = 2$ since $y' = 2\sqrt{y} \ne 0$. Thus $y' = 2x+c$ on $I$ for some real $c$, and hence $y = x^2+cx+d$ on $I$ for some real $d$. Note that most of the above steps are not reversible and hence we need to check all the solutions we finally obtain with the original differential equation. We would get $c^2 = 4d$. After simple manipulation we obtain the same result for $y$ on $I$ as in the other solution. The other parts of the solution still need to be there.
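To see the failure concretely, here is a quick numerical check (a Python sketch of mine, not part of the original post; the constant `a` is chosen arbitrarily): the piecewise solution satisfies $y' = 2\sqrt{y}$ everywhere, while the naive $y = (x-a)^2$ taken on the whole real line does not.

```python
import math

a = 1.0  # arbitrary constant, for illustration only

def y_piecewise(x):
    """The correct solution: 0 for x <= a, (x - a)^2 for x > a."""
    return 0.0 if x <= a else (x - a) ** 2

def y_naive(x):
    """What careless separation of variables suggests: (x - a)^2 everywhere."""
    return (x - a) ** 2

def residual(y, x, h=1e-6):
    """|y'(x) - 2*sqrt(y(x))|, with y' estimated by a central difference."""
    dy = (y(x + h) - y(x - h)) / (2 * h)
    return abs(dy - 2 * math.sqrt(y(x)))

# The piecewise solution satisfies the ODE on both sides of a ...
print(residual(y_piecewise, a - 2), residual(y_piecewise, a + 2))  # both ~ 0
# ... but the naive one fails for x < a, where y' < 0 while 2*sqrt(y) >= 0:
print(residual(y_naive, a - 2))  # ~ 8, far from 0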

Bottom line

Separating variables is not as simple as you might think. Many textbooks actually teach it wrong.

user21820
  • This is a good point that I should have made clearer in my answer. Since this didn't seem to be the focus of the OP's question, I glossed over it, but you are of course right that dividing through by $y(x)$ is not a trivial step. I have edited my answer to reflect that. – Tom Feb 14 '17 at 11:57
  • @Tom: That's great! Thanks for editing your answer; I'll remove reference to your answer now. – user21820 Feb 14 '17 at 12:09
  • Checking this out as you'd requested. It's a good reminder that people should consider branching cases when doing math, with this being a classic hidden division-by-zero fallacy. This fallacy can be used to "prove" stuff like $1=2$, or, in this case, cause people to lose interest in the trivial solution, forgetting that it exists and can be valid over the domain where the non-trivial solution isn't. – Nat Feb 14 '17 at 14:14
  • @Nat: Have fun solving it! Rigorously, please! =) – user21820 Feb 14 '17 at 14:19
  • @user21820 Hah I have to admit that it's strange to see someone calling integrations "fun", but it's nonetheless charming! But, yeah, got it. That said, why post it in this question? I mean, the fact that branching cases need to be considered is universal to all math; it's not in any way especially related to separable ODE's. So what's the relevance here? – Nat Feb 14 '17 at 14:40
  • @Nat: The relevance here is whether or not NSA provides an easier solution (including the argument for piecing together of the curves). If not, then there's little benefit in going to NSA for differential equations, especially when it relies on weird axioms like AC. It is far easier to just use sequences encoded in Landau notation, which will also get you the chain rule following from cancellation of nonzero terms of fractions. The obvious advantages of doing this is that it is not only intuitive but rigorously justifiable even in weak systems and easily translates to the ε-δ definition. – user21820 Feb 14 '17 at 15:04
  • @Nat: See an intuitive explanation and a few examples for the method and also this application. Basically the ultrafilter lemma is unnecessary for analysis because it simply gives an arbitrary and useless ordering on sequences that do not converge. I agree that it is an elegant way of producing a non-archimedean field elementarily equivalent to the reals, but practically it's not so useful. – user21820 Feb 14 '17 at 15:09
  • @Nat: Oh if you were asking why it's relevant to this question, at first the top-voted answer performed the illegal division without checking validity, and then in fact got the right answer because it erroneously expanded the possible values for the multiplicative constant $D$ to include $0$. The example in my post shows that we can't do that and hence many textbook presentations of separating variables are wrong. A technically interesting aspect is that the erroneous method actually gives the right answer if you require the solution to be meromorphic and you divide by meromorphic expressions. – user21820 Feb 15 '17 at 05:08
  • @user21820 How can we prove that for any point where $y\not =0$, there is an open interval around $x$ for which $y\not =0$? – S.H.W Feb 04 '19 at 21:53
  • @user21820 Also, please explain the usage of the completeness axiom for extending the interval. – S.H.W Feb 04 '19 at 22:13
  • @S.H.W: The differential equation comes with the implicit requirement that $\frac{dy}{dx}$ exists at every point of the solution, which implies that $y$ is continuous with respect to $x$, which in turn implies by a basic real analysis fact that if $y$ is nonzero then it remains nonzero in some interval around $x$. To extend $(a,b)$ for which $y ≠ 0$ when $x∈(a,b)$, let $a' = \inf({ p : \text{$y ≠ 0$ when $x∈(p,b)$} })$ and likewise for $b'$, and then you can prove that $(a',b')$ is a maximal open interval around $x$ for which $y ≠ 0$ when $x∈(a',b')$. Note that $a',b'$ may be $±∞$. – user21820 Feb 05 '19 at 14:25
  • @S.H.W: Remember that the question is to find all differentiable functions $f$ on the entire real line that satisfy the given equation, so for any $x$ such that $f(x) ≠ 0$ we can (as explained in my previous comment) easily construct the maximal open interval $(a,b)$ around $x$ on which $f ≠ 0$, and solve the differential equation on this interval to find out what $f$ must be on this interval. In this question, you will find that $f$ is increasing to the right on any such interval, so there must be at most one such interval and it must be $(a,∞)$ for some real $a$. – user21820 Feb 05 '19 at 14:43
  • @user21820 Thanks for your response. Can you provide a general procedure for solving separable differential equations? (by considering the domain of the solution) – S.H.W Feb 07 '19 at 19:37
  • @S.H.W: My above two comments are indeed giving the general procedure. If your domain is not the entire real line, then you accordingly have to figure out how the domain is divided into those intervals. The key point is that you can only divide by nonzero. If on some interval you get $f(y) \frac{dy}{dx} = g(x)$ for some continuous $f,g$, then you can take anti-derivative with respect to $x$ to get $\int f(y) \frac{dy}{dx}\ dx = \int g(x)\ dx + c$ (on that interval) for some constant $c$, and then $\int f(y) \ dy = \int g(x)\ dx + k$ for some constant $k$. – user21820 Feb 08 '19 at 02:49
  • In my post, I called the last step "change of variables substitution". That is the essentially unique correct way to do it. The full theorem is that if you have variables $x,y,z$ such that $\int z\ dy$ and $\frac{dy}{dx}$ are always defined (on the curve of interest), then $\int z \frac{dy}{dx}\ dx = \int z\ dy + c$ for some constant $c$. Proof is easy by chain rule: $\frac{d(\int z\ dy)}{dx} = z \frac{dy}{dx}$, thus $\int z\ dy$ is an anti-derivative for $z \frac{dy}{dx}$. For more, come to the basic mathematics chat-room. =) – user21820 Feb 08 '19 at 02:58
13

The idea of splitting the $dy$ and the $dx$ is really just a short-cut for the following:

Starting with a separable variable DE in the form $$f(x)=g(y)\frac{dy}{dx}$$ then integrate both sides with respect to $x$, so that$$\int f(x)dx=\int g(y)\frac{dy}{dx} dx=\int g(y) dy$$

...$+c$, of course.
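The reverse chain-rule step above can be checked numerically. In this sketch (mine, not part of the original answer), the particular choices $g(y) = \cos y$ and $y(x) = x^3$ are arbitrary illustrations:

```python
import math

def midpoint_integral(f, a, b, n=10_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

def g(y): return math.cos(y)      # arbitrary integrand g(y)
def y(x): return x ** 3           # arbitrary substitution y(x)
def dydx(x): return 3 * x ** 2    # its derivative dy/dx

lhs = midpoint_integral(lambda x: g(y(x)) * dydx(x), 0.0, 1.0)  # int g(y) (dy/dx) dx
rhs = midpoint_integral(g, y(0.0), y(1.0))                      # int g(y) dy
print(lhs, rhs)  # both approximate sin(1) = 0.84147...
```

Both sums agree to high accuracy, as the substitution theorem predicts.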

David Quinn
  • Could you expand on the final step there, where you are definitely NOT "just cancelling out the two $dx$'s", but some might think that's what's happening. WHY does $\int g(y)\frac{dy}{dx}\,dx = \int g(y)\,dy$? – Brondahl Feb 14 '17 at 13:26
  • @Brondahl. It's the Chain Rule in reverse. If $\frac{d}{dy}G(y)=g(y)$ then $\frac{d}{dx}G(y)=g(y)\frac{dy}{dx}$, therefore the last step follows. – David Quinn Feb 14 '17 at 13:35
    @Brondahl: Not quite what DavidQuinn said. You need to ensure that $\frac{dG(y)}{dx}$ is continuous (or at least integrable in $x$). This is one of the conditions that is often overlooked in textbooks, but it can fail badly (Volterra's function). Usually this condition is satisfied for differential equations because it usually assumes differentiability and hence continuity and so on. – user21820 Feb 14 '17 at 13:56
3

You first have to understand what a differential is: an infinitesimal difference between successive values of a variable.

The expression $dy = f'(x)\,dx$ is the mathematical definition of the differential.

Of course, $f'(x) = \frac{dy}{dx}$, so you can see the derivative as the ratio of the change in $y$ with respect to the change in $x$ (following the definition of a differential).

Second, integrating $dy$ yields $y+C$, and integrating $dx$ yields $x+K$, both $C$ and $K$ being constants of integration.

Hope I clarified it a bit for you.

EDIT:

For the case $f'(x)=1$

This means $\frac {dy}{dx} = 1$

If you integrate both parts in $dy = 1dx$, you get $y = x + k$.

EDIT2:

Remember, the derivative of a function is the slope of the tangent line at that point; for a linear function, it's the slope of the function itself.

[Figure: the lines $y = x + k$ for several values of $k$; all parallel, each with slope 1.]

See? For every function $y = x + k$, they all have the same slope: in a linear function, the constant only determines where the line cuts the $y$ axis, and has nothing to do with the slope of the tangent line (the function itself, in this case).

petru
  • Isn't integrating supposed to be the opposite operation of differentiation? If we differentiate $x + K$, we get $1$, not $dx$. – bkoodaa Feb 13 '17 at 20:51
  • Edited my answer, take a look. – petru Feb 13 '17 at 20:56
  • Thanks, but it's still unclear why integrating $dx$ yields $x + K$. If we differentiate $x + K$, we don't get $dx$, even though differentiation is supposed to be the reverse operation of integration. – bkoodaa Feb 13 '17 at 21:03
  • I really like this question; you are asking what most people take for granted. I'm gonna edit again. – petru Feb 13 '17 at 21:06
  • "They are infinitesimal difference between successive values of a variable." And how do you define "infinitesimal" and "successive" in this sentence? To define something in mathematics, you have to do so in terms that are themselves well-defined. Otherwise, it is all meaningless. – Paul Sinclair Feb 14 '17 at 00:38
  • What's your definition of tangent btw? I can't think of one that's not circular. – DRF Feb 14 '17 at 09:37
  • @petru Is the notation "dx" actually a shorthand for "1dx"? – bkoodaa Feb 14 '17 at 14:08
  • Yes, it is indeed. – petru Feb 14 '17 at 16:58
3

Something that I think has not been mentioned in the other answers is the fact that $dy $ and $dx $ are just notation here. And, more importantly, they are not necessary for separation of variables.

A separable equation is of the form $$\tag{*} g(y)\,y' = f(x).$$ One solves by recognizing that, if $G$ is an antiderivative for $g$ and $F$ for $f$, the above is $$\big(G(y(x))\big)' = F'(x),$$ and so we obtain the equality $$G(y) = F(x) + C.$$ As mentioned by user21820, putting the equation in the form $(*)$ requires care about the case $g(y) = 0$.
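For a concrete illustration (an example of mine, not from the original answer), take $g(y) = 2y$ and $f(x) = \cos x$. Then $(*)$ reads $$2y\,y' = \cos x,$$ and with the antiderivatives $G(y) = y^2$ and $F(x) = \sin x$ the method gives $$y^2 = \sin x + C,$$ with no $dy$ or $dx$ ever detached.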

In the end, $dy/dx $ can sometimes be treated as a quotient. But there is no need to.

Martin Argerami
2

$dy/dx$ is confusing for a lot of people, because it can stand for multiple things which, unintuitively, are all the same. $dy/dx$ is the rate of change of $y$ in terms of $x$. It is also the slope of the tangent line, computed from two infinitesimally close points, which is equivalent to the ratio of two infinitesimal changes.

If you accept the chain rule, then algebraically manipulating differentials should seem more natural: they become multiplicative. At least to some degree, $dy/dx$ does represent a fraction, though you probably don't think of it that way.

If you're not comfortable with using separable DEs, they are really just a special case of linear DEs, which don't use the separable method. So your problem should remedy itself soon.

Kaynex
  • Separable is not a special case of Linear. For example, $y' = x\cos y$ is separable, but it is not linear. – Paul Sinclair Feb 14 '17 at 17:20
  • Sorry, should have said "exact". Point being that all separable equations can be solved through other means. – Kaynex Feb 14 '17 at 17:48
1

I'm going to avoid the discussion of what $dx$ and $dy$ mean, since others have provided very good answers to that, and instead clarify the specific example in the OP:

We have the following differential equation: $$\frac{dy}{dx} = y.$$

Then we separate the... whatever they are: $$\frac{dy}{y} = x\cdot dx.$$

[....] Also, somehow integrating $dy$ yields $y$, but integrating $dx$ doesn't yield $x$.

The way you've narrated the example makes me suspect that there are a few procedural things going on that you are missing, quite apart from the (very reasonable) question about what the heck any of this means. For starters, you've made a simple algebraic mistake in separating the variables. Your second equation should be $$\frac{dy}{y} = dx$$

(Note that in the OP there was an extra $x$ on the right-hand side.)

Now the sentence "integrating $dy$ yields $y$, but integrating $dx$ doesn't yield $x$" does not really match what's going on. We're not integrating $dy$, we're integrating $\frac{dy}{y}$, so the integral on the left-hand side isn't $y$, but $\ln|y|$. Meanwhile the integral of the right-hand side is $x$. Include an undetermined constant and you've got the equation $$\ln|y| = x + C$$ Exponentiating both sides gives $$|y| = e^C e^x$$ or equivalently $$|y| = Ke^x$$ where $K = e^C$ is an arbitrary positive constant. Finally we drop the absolute value signs, which we compensate for by letting the constant be either positive or negative: $$y=Ae^x$$ where $A$ is any nonzero constant.
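A small numerical check (a Python sketch of mine, not part of the original answer) confirms that $y = Ae^x$ satisfies $\frac{dy}{dx} = y$ for negative $A$ just as well as for positive $A$, which is why dropping the absolute value only costs us the sign of the constant:

```python
import math

def residual(A, x, h=1e-6):
    """|y'(x) - y(x)| for y = A * e^x, with y' estimated by a central difference."""
    def y(t):
        return A * math.exp(t)
    dy = (y(x + h) - y(x - h)) / (2 * h)
    return abs(dy - y(x))

# Both signs of A solve the ODE, so A ranges over all nonzero constants:
print(residual(3.0, 1.0), residual(-3.0, 1.0))  # both ~ 0
```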

mweiss