Clarification of the $u$-substitution theorem

Question

I came across this phrasing of the theorem justifying u-substitution:

Let $F(x)$ be an antiderivative of $f(x)$ in an interval $I.$ Let $\phi$ from $J$ to $I$, $\phi(t) = x$ be a differentiable function. Then $\int f(x)dx=\int f(\phi(t))\phi^{\prime}(t)dt.$

I am confused about the assumptions part - first of all, why can we assume that there exists a function $\phi$ such that $\phi(t)=x$?

Secondly, we know that the image of $\phi$ over $J$ is a subset of $I$. Why aren't we demanding that the image of $\phi$ be equal to $I$, and not just a subset? In my mind, we are "losing" some $x$ values that are not given by $\phi(t)$ if the image is strictly contained in $I$.

Lastly, in the final steps of the proof of this theorem, we said that $F(\phi(t))+c=F(x)+c=\int f(x)dx.$ Why can we treat $x$ just like a "dummy" variable? We assumed it equals a function $\phi(t)$ after all.

+1 : good posting. Unfortunately, the questions that you pose are difficult to answer in a manner that will be immediately understandable to you. I strongly suggest that you find (either in the Calculus book you are excerpting from, or some other Calculus book) applications of the theorem to see if they make sense. ...see next step — user2661923, Mar 18 '23 at 07:08
If you find a specific Integration problem, from a specific book, that is solved via substitution, and you are confused/unclear about the validity of one of the steps, then please post the problem, step by step, and indicate (very) clearly which specific step that you are questioning. — user2661923, Mar 18 '23 at 07:08
@user2661923 Thank you for your answer. I looked over many calculus books and applications of this theorem and they were clear to me - my problem is with those questions I posed above. — R24698, Mar 18 '23 at 07:12

score 6 · Answer 1 · answered Mar 18 '23 at 08:28

First, the expression $\phi(t) = x$ has no special meaning; it is just a way of recording that the function goes from $J$ to $I$. The author probably wrote it this way to make it more intuitive that we are substituting $x$ for $\phi(t)$. It also makes no difference whether $\phi(J)$ is a proper subset of $I$ or equals $I$: you can assume either and the theorem remains true, since we only require that the composition $f\circ\phi$ makes sense. Note, however, that the theorem (as you wrote it) is not true, because some hypotheses are missing. What we are really using is the following theorem:

Let $J=[\alpha, \beta]$, let $I$ be an interval, and let $\phi : J\rightarrow I$ have a continuous derivative on $J$. If $f: I\rightarrow \mathbb{R}$ is a continuous function, then $(f\circ \phi)\phi'$ is integrable over $J$ and $$\int_{\alpha}^{\beta} f(\phi(t))\phi'(t) \ dt = \int_{\phi(\alpha)}^{\phi(\beta)} f(x) \ dx$$
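
As a concrete illustration (the specific choices here are assumptions made for the example, not taken from the book): take $J=[0,1]$, $I=\mathbb{R}$, $\phi(t)=t^2$ and $f(x)=\cos x$. Then $\phi$ has a continuous derivative on $J$, $f$ is continuous on $I$, and $\phi(J)=[0,1]$ is a proper subset of $I$, yet the formula still holds:

$$\int_{0}^{1} \cos(t^2)\, 2t \ dt = \int_{\phi(0)}^{\phi(1)} \cos x \ dx = \int_{0}^{1} \cos x \ dx = \sin 1$$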

Finally, in the last step we are not treating $x$ as a dummy variable. The equality holds because, if $F$ is an antiderivative of $f$ and $H = F\circ \phi$, then by the chain rule $H$ is differentiable and $H'(t) = F'(\phi(t))\phi'(t) = f(\phi(t))\phi'(t)$. This implies that $H$ is an antiderivative of $(f\circ \phi)\phi'$, and by the Fundamental Theorem of Calculus we have

$$\int_{\phi(\alpha)}^{\phi(\beta)} f(x) \ dx = F(\phi(\beta)) - F(\phi(\alpha)) $$

$$\int_{\alpha}^{\beta} f(\phi(t))\phi'(t) \ dt = H(\beta) - H(\alpha)$$

But notice that, by the definition of $H$, we have $H(\beta) = F(\phi(\beta))$ and $H(\alpha) = F(\phi(\alpha))$, so the two right-hand sides agree; this is the equality that you have. The theorem and the proof you mentioned look like an engineering-style proof, which is probably why the author omitted some details whose absence can cause confusion.
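
A quick numerical sanity check of the definite-integral identity, as a minimal sketch. The choices $f=\cos$, $\phi(t)=t^2-t$ on $J=[0,2]$ and the helper `midpoint_integral` are illustrative assumptions, not anything from the book. Here $\phi$ is not monotonic, and $\phi(J)=[-1/4,2]$ is neither all of $I=\mathbb{R}$ nor equal to $[\phi(\alpha),\phi(\beta)]=[0,2]$, yet both sides should agree up to discretization error.

```python
import math

def midpoint_integral(g, a, b, n=100_000):
    """Approximate the integral of g over [a, b] with the composite midpoint rule."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

# Illustrative assumptions (not from the thread): f is continuous on I = R,
# and phi has a continuous derivative on J = [alpha, beta].
f = math.cos

def phi(t):
    return t * t - t        # not monotonic on [0, 2]; its image is [-1/4, 2]

def dphi(t):
    return 2 * t - 1        # derivative of phi

alpha, beta = 0.0, 2.0

# Left side: integral over J of f(phi(t)) * phi'(t) dt
lhs = midpoint_integral(lambda t: f(phi(t)) * dphi(t), alpha, beta)
# Right side: integral of f(x) dx from phi(alpha) to phi(beta)
rhs = midpoint_integral(f, phi(alpha), phi(beta))
# Exact value via the antiderivative F = sin of f = cos
exact = math.sin(phi(beta)) - math.sin(phi(alpha))

print(lhs, rhs, exact)
```

All three printed values should be close to $\sin 2$. The agreement is a consequence of $H = F\circ\phi$ being an antiderivative of $(f\circ\phi)\phi'$, which does not require $\phi$ to be injective or surjective.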

thank you, two questions:

what about the indefinite case?

If I got it right, the change in the integration limits from $\alpha$ to $\phi(\alpha)$ is the "adjustment" that allows $\phi(J)$ to be just a subset of $I$ and not $I$ itself? If not, how do you reconcile the fact that there are some values of $x$ in $I$ that we are not integrating over in the integral over $x$? — R24698, Mar 18 '23 at 09:53
You are right. Keep in mind that the theorem only says what happens with the integral of $f$ over the interval $[\phi(\alpha), \phi(\beta)]$, and it doesn't ensure that $[\phi(\alpha), \phi(\beta)] = I$. So yes, we lose some values of $x$ to integrate over, but in practice it doesn't matter, since we usually start with an integral of the form $$\int_{\alpha}^{\beta} f(\phi(t))\phi'(t) \ dt$$ which we want to reduce to the integral $$\int_{\phi(\alpha)}^{\phi(\beta)} f(x) \ dx$$ — Jorge S., Mar 18 '23 at 10:16

ryang · Accepted Answer · 2023-03-19T08:16:39.540

3

Let $F(x)$ be an antiderivative of $f(x)$ in an interval $I.$ Let $\phi$ from $J$ to $I$, $\phi(t) = x$ be a differentiable function. Then $\int f(x)dx=\int f(\phi(t))\phi^{\prime}(t)dt.$

I am confused about the assumptions part - first of all, why can we assume that there exists a function $\phi$ such that $\phi(t)=x$?

The sentences starting with "let" specify the notation for the objects whose relationship is displayed after the word "then". Your question is akin to asking why in the statement of Pythagoras' Theorem, we can assume that a right-angled triangle has a longest side of length $c.$ If your integrand can be expressed in terms of some function $\phi$ in a way that satisfies the theorem's conditions, then the theorem is applicable.

Secondly, we know that the image of $\phi$ over $J$ is a subset of $I.$ Why aren't we demanding that the image of $\phi$ be equal to $I$, and not just a subset?

$I$ is in fact meant to be the image, not necessarily the codomain, of $\phi.$

Lastly, in the final steps of the proof of this theorem, we said that $F(\phi(t))+c=F(x)+c=\int f(x)dx.$ Why can we treat $x$ just like a "dummy" variable? We assumed it equals a function $\phi(t)$ after all.

An indefinite integral's variable is not dummy, because its scope extends outside the integral; in this case, variable $x$ is related to variable $t$ via the function $\phi.$
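
For instance (an illustrative example, not taken from the book): with $\phi(t)=t^2$ and $f(x)=\cos x$, so that $F(x)=\sin x$, the theorem reads

$$\int \cos x \ dx = \sin x + C = \sin(t^2) + C = \int \cos(t^2)\, 2t \ dt, \quad \text{where } x=t^2.$$

The middle equality uses $x=\phi(t)$; it would be false for an $x$ unrelated to $t$, which is the sense in which $x$ is not a dummy variable here.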

By the way, contrary to Jorge's assertion, your theorem statement is correct: the indefinite-integral version has more relaxed conditions than the standard versions of the change-of-variable theorem. You may be interested to read my stronger (i.e., slightly more general) statement & proof of the theorem and explanation there regarding dummy variables.

edited Mar 19 '23 at 08:16

answered Mar 19 '23 at 00:19


Thank you. You said that $I$ is meant to be the image of $J$ under $\phi$. In my book and in other lecture notes, $I$ was just a subset of the codomain and not the image of $J$. If we assume that $I$ is a subset and not the image, we lose some possible $x$ values we are integrating over. How do we get around it?
Another point: I understand that $x$ is not a dummy variable, but if it is a function of $t$, why are we able to integrate it in the integral with respect to $x$ just like it was a "regular" variable? Since it is an integral with respect to $x$, and the relation to $t$ doesn't interest us?
– R24698 Mar 19 '23 at 08:59
@RTabak 1. "you said that $I$ is meant to be the image of $J$ under $ϕ.$ In my book and in other lecture notes, $I$ was just a subset of the codomain and not the image of $J.$" $\quad$ I'm saying that you need to give these authors (including the one who wrote the above excerpt from your book) the benefit of the doubt that they do mean that $I$ is the image of $ϕ$ (notice that none of them explicitly indicate that $I$ is the codomain?); click on my second link above, – ryang Mar 19 '23 at 14:10
where my statement of the theorem clearly exhibits $I$ as the image of $ϕ;$ to understand why this is the case, examine the first line of the proof that I provided there. 2. "I understand that $x$ is not a dummy variable, but if it is a function of $t,$ why are we able to integrate it* in the integral with respect to $x$ just like it was a "regular" variable? Since it is an integral with respect to $x,$ and the relation to $t$ doesn't interest us?" – ryang Mar 19 '23 at 14:11
You're being incoherent: for example, neither of the "it"s seems to connect to any previously mentioned object. In any case: this theorem is also called the change-of-variable theorem, and my counter-question to you is: whether you are working purely in the $x$-world or purely in the $t$-world, why should the relationship between $x$ and $t$ need to be in the foreground? Why should being a member of organisation A automatically affect your work for organisation B? – ryang Mar 19 '23 at 14:11
