I am currently learning about finding the area under a curve via integration using parametric equations. I was looking at this website http://tutorial.math.lamar.edu/Classes/CalcII/ParaArea.aspx and understood the material up to this line:

"So, if this is going to be a substitution we’ll need, dx = f'(t) dt".

Separately, I was given an easy way to remember this, namely that "(dx/dt) dt = dx" holds because "the two dt cancel each other out". Granted, it's easy to remember, but it does nothing to tell me why this should be the case, let alone the fact that it's mathematically improper.

So...can someone prove to me why dx = (dx/dt) dt?

Charlz97

4 Answers

If you want to avoid Leibniz notation altogether (as I tend to prefer doing), you can derive the area for a parametric curve using simple Riemann approximations.

Recall the definition of the definite integral of a continuous function $f(x)$ on an interval $[a,b]$: $$\int_a^b f(x)\, dx = \lim_{n\to\infty} \sum_{i=1}^n f(x_i^*) \Delta x$$ where $x_i^*$ lies in the subinterval $[x_{i-1}, x_i]$ after subdividing $[a,b]$ into $n$ subintervals of equal length $\Delta x = \frac{b-a}{n}$.

Now, consider a parametric curve $(f(t), g(t))$ for $t$ on some interval $[\alpha, \beta]$. We will show that the area under the parametric curve can be approximated by adding up rectangles, where rectangle $i$ will have a width of $\Delta x_i = f(t_i) - f(t_{i-1})$ and a height of $y_i = g(t_i)$. For instance, the $1^{st}$ subinterval will have a width of $\Delta x_1 = f(t_1) - f(t_0)$, where $t_0 = \alpha$. (Each rectangle's area is approximated using right endpoints, although we could just as easily have used left endpoints or midpoints.)

First, we subdivide the interval $[\alpha, \beta]$ into $n$ subregions, each of length $\Delta t = \frac{\beta - \alpha}{n}$.

Next, consider the $i^{th}$ subinterval $[t_{i-1}, t_i]$. The Mean Value Theorem guarantees that there is some value $t_i^*$ in this interval at which the tangent slope of $f$ equals the slope of the secant line through the endpoints. In symbols, we have $$\frac{f(t_i) - f(t_{i-1})}{t_i - t_{i-1}} = f'(t_i^*)$$ We can rewrite this equation in the form $$f(t_i) - f(t_{i-1}) = f'(t_i^*)(t_i - t_{i-1})$$ or, equivalently, $$\Delta x_i = f'(t_i^*) \Delta t$$ (Notice how closely this equation resembles the form $dx = f'(t)dt$. I will say more about this after the proof.)

Now, we have a formula for the width of an approximating rectangle. The height can then be taken to be $g(t_i^*)$, so the area is approximately equal to: $$\sum_{i=1}^n g(t_i^*)f'(t_i^*) \Delta t$$ Take the limit and we have the definite integral $$\lim_{n \to \infty}\sum_{i=1}^n g(t_i^*)f'(t_i^*) \Delta t = \int_{\alpha}^{\beta}g(t)f'(t) \, dt$$ Of course, $f(t)$ must be differentiable and $g(t)$ must be continuous on the interval $[\alpha, \beta]$ for this equation to make sense. So, if you're considering the actual curve drawn out by the parameter, it only makes sense to find the area under portions of the graph that are actual functions (you can't calculate area over a region where the graph curves back over itself, since $y$ is not a function of $x$ there). Also, if the graph retraces itself for some $t$ on the interval $[\alpha, \beta]$, your area calculations will be repeated, and you will end up with more (or less) area than you bargained for.
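
To see this formula in action numerically, here is a small Python sketch. (The curve $x = t^2$, $y = t$ is my own illustrative choice; it traces $y = \sqrt{x}$, whose area on $[0, 1]$ is $2/3$.)

```python
# Parametric curve (illustrative choice): x = f(t) = t^2, y = g(t) = t on [0, 1].
# This traces y = sqrt(x), whose area on [0, 1] is 2/3.
f = lambda t: t**2
g = lambda t: t

def parametric_area(f, g, alpha, beta, n=100_000):
    """Sum of rectangles with height g(t_i) and width f(t_i) - f(t_{i-1})."""
    dt = (beta - alpha) / n
    area = 0.0
    for i in range(1, n + 1):
        t_prev = alpha + (i - 1) * dt
        t_i = alpha + i * dt
        # By the MVT, f(t_i) - f(t_prev) = f'(t_i^*) * dt for some t_i^*.
        area += g(t_i) * (f(t_i) - f(t_prev))
    return area

print(parametric_area(f, g, 0.0, 1.0))  # ≈ 2/3
```

Each term $g(t_i)\,(f(t_i) - f(t_{i-1}))$ is exactly the height-times-width rectangle from the argument above, and the sum converges to $\int_0^1 \sqrt{x}\, dx = 2/3$.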

To address your original question about why dx = (dx/dt)dt: the short answer is that this is the definition of a differential. To give you some insight into why this is the case, consider a normal slope defined by two points, $\frac{y_2 - y_1}{x_2 - x_1} = \frac{\Delta y}{\Delta x}$. In particular, we have that $\frac{\Delta y}{\Delta x} \approx f'(x)$. Notice here that $\frac{\Delta y}{\Delta x}$ refers to a normal, everyday fraction, so if we wanted to approximate $\Delta y$, we could rearrange the equation to $\Delta y \approx f'(x) \Delta x$ without fuss.

This can be useful in cases where we know some easily-calculable value of $f$ (say $f(a)$), and want to calculate values of $f$ for $x$ very close to $a$. For instance, if $f(x)=\sqrt x$, it is easy to calculate $f(4)$, but not so much $f(4.0034)$. Thus, we can use equation $\Delta y \approx f'(x) \Delta x$ to approximate: $$f(a + \Delta x) = f(a) + \Delta y$$ $$f(a + \Delta x) \approx f(a) + f'(a) \Delta x$$ $$f(4 + 0.0034) \approx 2 + (0.25)(0.0034) = 2.00085$$ which is a good approximation for the actual value, $\approx 2.0008498$.
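
If you want to check this on a computer, here is a quick Python sketch of the same $\sqrt{4.0034}$ approximation (nothing here beyond the arithmetic above):

```python
import math

f = lambda x: math.sqrt(x)
fprime = lambda x: 1 / (2 * math.sqrt(x))  # derivative of sqrt(x)

a, dx = 4.0, 0.0034
approx = f(a) + fprime(a) * dx  # f(a) + f'(a) * Delta x = 2 + 0.25 * 0.0034
exact = f(a + dx)
print(approx)  # ≈ 2.00085
print(exact)   # ≈ 2.0008498...
```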

The differential, defined $dy = y'(x) dx$, is an expression of the fact that, for EXTREMELY small values (infinitely small) of $\Delta x$, the change in $y$ is exactly equal to the linear approximation. In other words, $$\lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x} = \frac{dy}{dx} = f'(x)$$ which is the definition of the derivative, is essentially the same thing as saying that the linear approximation for $f(x + \Delta x)$ is exactly accurate when $\Delta x$ goes to 0, which is what $dy = f'(x)dx$ is saying.

The reason why you are allowed to multiply by differentials when using substitution: It is true that $\frac{dy}{dx}$ is not a fraction; it is the derivative. A proof of why it is okay to treat $\frac{dy}{dx}$ like a fraction lies in the proof of the Substitution Rule for integration.

If $F$ is an antiderivative of $f$ (so $F' = f$) and $g$ is differentiable, then $$[F(g(x))]' = F'(g(x))g'(x)$$ by the chain rule. By the Fundamental Theorem of Calculus, we can integrate both sides to obtain $$\int F'(g(x))g'(x) \, dx = \int f(g(x))g'(x) \, dx = F(g(x)) + C$$ If we let $u = g(x)$, and we assume it's okay to operate with differentials on their own, then $$\frac{du}{dx} = g'(x) \Rightarrow du = g'(x)dx$$ Plugging this in, $$\int f(g(x))g'(x) \, dx = \int f(u) \, du = F(u) + C = F(g(x)) + C$$ which we know is the correct result. Thus, if we assume it's okay to operate with differentials in the context of substitution, we arrive at the same result, meaning that it's "basically" okay to do so.
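
You can also verify the Substitution Rule numerically. Below is a Python sketch using the example $g(x) = x^2$ and $f(u) = \cos u$ (my own illustrative choice), comparing $\int_0^1 f(g(x))g'(x)\,dx$ with $\int_{g(0)}^{g(1)} f(u)\,du$; both should equal $\sin 1$:

```python
import math

def riemann(h, a, b, n=200_000):
    """Midpoint Riemann sum of h on [a, b]."""
    dx = (b - a) / n
    return sum(h(a + (i + 0.5) * dx) for i in range(n)) * dx

g = lambda x: x**2          # u = g(x)
gprime = lambda x: 2 * x    # du = g'(x) dx
fu = lambda u: math.cos(u)  # f(u)

lhs = riemann(lambda x: fu(g(x)) * gprime(x), 0.0, 1.0)  # integral of f(g(x)) g'(x) dx
rhs = riemann(fu, g(0.0), g(1.0))                        # integral of f(u) du over [g(0), g(1)]
print(lhs, rhs, math.sin(1.0))  # all three agree
```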

This answer was longer than I planned, but hopefully it has helped in some way.

user407691

The best way to convince yourself that this is true is to integrate it. We understand macroscopic scales better, so if $$dx(t)=\frac{dx}{dt}dt$$ then integrating with respect to $t$ we obtain $$\int_{t_1}^{t_2}dx(t) = \int_{t_1}^{t_2}\frac{dx}{dt}dt$$ The left-hand side is an exact differential, hence $$x(t_2)-x(t_1) = \int_{t_1}^{t_2}\frac{dx}{dt}dt$$ which is exactly the Fundamental Theorem of Calculus.
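
A quick numerical check of this identity in Python, with the illustrative choice $x(t) = \sin t$:

```python
import math

x = lambda t: math.sin(t)       # x(t), an example differentiable function
xprime = lambda t: math.cos(t)  # dx/dt

t1, t2, n = 0.0, 1.2, 100_000
dt = (t2 - t1) / n
# Midpoint Riemann sum of the integral of (dx/dt) dt from t1 to t2
integral = sum(xprime(t1 + (i + 0.5) * dt) for i in range(n)) * dt
print(integral)        # ≈ x(t2) - x(t1)
print(x(t2) - x(t1))
```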

HBR

If $x$ is a function of $t$, say $x=f(t)$, then the derivative of $x$ w.r.t. $t$ is written as: $$\frac{\mbox{d}x}{\mbox{d}t} = f'(t)$$ This notation is often manipulated into $\mbox{d}x = f'(t) \, \mbox{d}t$ to be used in the substitution rule for integrals. If you have covered integration by substitution (independent of this particular context of parametric functions), you have probably encountered this notation?

More generally, for a function $y=f(x)$, the differential $\mbox{d}y$ is defined as: $$\mbox{d}y = f'(x) \, \mbox{d}x$$ And the choice of this notation nicely agrees with the Leibniz notation for the derivative (see above). Apply this to $x=f(t)$ for this specific context.

StackTD
  • Thanks! More generally speaking, I'm a little uncomfortable treating "dy/dx" - or in this case "dx/dt" as a fraction - is it meant to be so? Like...I'm looking for an explanation that goes beyond treating dy/dx as a fraction - because I know dy/dx is not a fraction. – Charlz97 Dec 08 '16 at 13:25
  • You're very right to be uncomfortable with treating it as a fraction but that would lead us quite far; take a look at Is $\frac{\textrm{d}y}{\textrm{d}x}$ not a ratio? for extensive answers on this matter. – StackTD Dec 08 '16 at 13:27
  • Loosely speaking, 'abusing' this notation (by sometimes treating it as a fraction, e.g. think of the chain rule) is often done because it [sometimes] works and because it offers an easy and intuitive notation. – StackTD Dec 08 '16 at 13:30

I would add this idea to the previous answers: you can do all the manipulations in a finite setting and then take limits.

For instance the definition of derivative is: $$x'(t)=\lim_{\Delta t\to 0}\frac{\Delta x}{\Delta t}$$

So you can assert that, when $\Delta t\approx 0$ then: $$x'(t)\approx\frac{\Delta x}{\Delta t}$$ thus $$\Delta x\approx x'(t) \Delta t$$ So in any context that implies $\Delta t \to 0$ (e.g. integration) you can replace the finite increments $\Delta x$, $\Delta t$ by the corresponding differentials, thus substitute $dx$ by $x'(t) dt$.
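
You can watch this approximation improve numerically; here is a short Python sketch with the illustrative choice $x(t) = t^3$ at $t = 2$:

```python
x = lambda t: t**3           # an example differentiable x(t)
xprime = lambda t: 3 * t**2  # its derivative x'(t)

t = 2.0
for dt in (0.1, 0.01, 0.001):
    actual = x(t + dt) - x(t)  # the true increment, Delta x
    approx = xprime(t) * dt    # the differential approximation x'(t) * Delta t
    print(dt, abs(actual - approx))  # the error shrinks as Delta t -> 0
```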

Not a very rigorous reasoning, but I think it gives the same intuition that was behind the historical definition of the differential.

Miguel
  • Taking the limits of the two parts separately seems strange. You say "$dt=\lim_{\Delta t\to 0} \Delta t$" but the limit on the right-hand side is $0$, right? But we don't mean $dt = 0$. Same for $dx$. – StackTD Dec 08 '16 at 13:42
  • @StackTD Of course this gives $0=0$, that is why I say that it is not very rigorous. Because the only way to state that both "zeros" have comparable rates of vanishing is to go back to the quotient and take the limit, which takes us back to the definition of derivative. – Miguel Dec 08 '16 at 13:45
  • It might be a matter of taste then, but I would avoid notations like "$dt=\lim_{\Delta t\to 0} \Delta t$". I also wouldn't say this is equivalent to "$0=0$" because although the RHS is in fact $0$, the LHS is not (meant to be). – StackTD Dec 08 '16 at 13:48