I understand that integration by substitution can be justified in the following way. If $I=\int f'(g(x))g'(x) \, dx$ then $I=f(g(x))+C$. If we make the substitutions $$ u=g(x) \text{ and } du=g'(x)dx $$ then $I$ becomes $\int f'(u)du=f(u)+C=f(g(x))+C$, which is the same. However, I often see substitutions that don't seem to take this form. For example, a common way to evaluate $$ I=\int \sqrt{1-x^2} \, dx $$ is by setting $x=\sin u$ (or, to be more precise, $u=\arcsin x$). The derivative of $\arcsin x$ is $$ \frac{1}{\sqrt{1-x^2}} \, , $$ which does not appear in $I$. It doesn't seem like $I$ is of the form $\int f(g(x))g'(x) \, dx$. So why is this substitution justified? Are we still doing the chain rule in reverse, or is something else going on?
There are various techniques of integration that involve "substitution", but specifically $u$-substitution is a fairly direct consequence of taking a derivative. Describing it as "doing the chain rule in reverse" seems overly elaborate to me; the chain rule is for differentiating the composition of functions, and most times when you set $u = g(x)$, the composition of functions is pretty trivial. The big distinction between $u$-substitution and trigonometric substitution is the direction in which the integral gets rewritten. I can elaborate if you like, as an Answer. – hardmath Nov 12 '20 at 19:43
@hardmath Yes, that would be very helpful, thank you. It would be useful to know the difference between $u$-substitution and trigonometric substitution. – Joe Nov 12 '20 at 19:56
3 Answers
The fact that something does not appear does not mean it is not there. You can always multiply and divide; $$\sqrt{1-x^2}=\frac{1-x^2}{\sqrt{1-x^2}},\qquad \lvert x \rvert <1. $$
Let me give some more detail. Note that, for $\lvert x \rvert <1$, $$\label{1}\tag{1} \sqrt{1-x^2}=\frac{\cos^2(\arcsin x)}{\sqrt{1-x^2}}, $$ which equals $f(g(x))g'(x)$ for $f(u)=\cos^2 u, g(x)=\arcsin x$. And so, $$ \int \sqrt{1-x^2}\, dx = \int \frac{\cos^2(\arcsin x)}{\sqrt{1-x^2}}\, dx = \int \cos^2(u)\, du.$$ As you can see, this is exactly the application of the chain rule that you have mentioned. The usual technique of letting $u=\arcsin x$ and computing $dx$'s and $du$'s is just a practical method to compute \eqref{1}.
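To see the last step through to the end: $\int \cos^2 u \, du = \frac{u}{2} + \frac{\sin 2u}{4} + C$, and putting $u=\arcsin x$ back gives $\frac{1}{2}\arcsin x + \frac{1}{2}x\sqrt{1-x^2} + C$. The following is only a rough numerical sanity check of that closed form, not a proof; the function names and sample points are illustrative choices, and the derivative is approximated by a central difference.

```python
import math

def integrand(x):
    # The original integrand sqrt(1 - x^2), defined for |x| <= 1.
    return math.sqrt(1.0 - x * x)

def antiderivative(x):
    # From int cos^2(u) du = u/2 + sin(2u)/4 with u = arcsin(x);
    # sin(2u)/4 = sin(u)cos(u)/2 = x * sqrt(1 - x^2) / 2.
    return 0.5 * math.asin(x) + 0.5 * x * math.sqrt(1.0 - x * x)

def central_difference(F, x, h=1e-6):
    # Symmetric finite-difference approximation of F'(x).
    return (F(x + h) - F(x - h)) / (2.0 * h)

# Sample points strictly inside (-1, 1), where the argument above applies.
sample_points = [-0.9, -0.5, 0.0, 0.3, 0.8]
max_error = max(
    abs(central_difference(antiderivative, x) - integrand(x))
    for x in sample_points
)
print(max_error)
```

The printed discrepancy should be tiny (finite-difference and rounding error only), which is consistent with the antiderivative obtained from the substitution.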

Thank you for posting this answer, Giuseppe. I wonder if this approach can be used more generally. If $I = \int f(g(x)) \, dx$, then rewrite $I$ as $$\int f(g(x)) \cdot \frac{1}{g'(x)} \cdot g'(x) \, dx \, .$$ Then let $h$ be the function defined by $$h(x)=\frac{f(x)}{g'(g^{-1}(x))} \, .$$ This means that $$I=\int h(g(x))g'(x) \, dx = H(g(x)) + C \, ,$$ where $H$ is an antiderivative of $h$. The only requirement appears to be that $g$ is one-to-one, so that $g^{-1}$ exists. If this argument is correct, then it appears that substitution can be used in pretty much any context. – Joe Nov 17 '20 at 12:18
@Joe: sure. That's exactly the formal way of expressing the usual manipulations that one does when solving integrals. – Giuseppe Negro Nov 18 '20 at 21:52
That’s great to hear. I think that answers all of my questions, so I’ll be accepting this answer. – Joe Nov 18 '20 at 23:39
There is a small glitch in the answers given by both Giuseppe Negro and Joe. In the comments, Joe generalized Giuseppe's answer by multiplying and dividing by $g'(x)$, which might be undefined for some values of $x$. Joe's answer involves using $\frac{dx}{du} \frac{du}{dx} = 1$, but it could happen that sometimes $\frac{dx}{du} = 0$ and $\frac{du}{dx}$ is undefined. (The restriction $|x|<1$ that appears in Giuseppe's answer hints at the problem.)
Here is an example that illustrates the problem. Consider the integral $$ \int \frac{dx}{(\sqrt[3]{x})^2 + 1}. $$ Note that the integrand is defined and continuous for all $x$. We might solve it with the substitution $u = g(x) = \sqrt[3]{x}$, $x = u^3$, $dx = 3u^2\,du$ as follows: \begin{align*} \int \frac{dx}{(\sqrt[3]{x})^2+1} = \int \frac{3u^2\,du}{u^2+1} &= \int \left(3 - \frac{3}{u^2+1}\right)du\\ &= 3u - 3\tan^{-1}(u) + C = 3\sqrt[3]{x} - 3\tan^{-1}(\sqrt[3]{x}) + C. \end{align*} This answer is correct, but the justification proposed by Giuseppe and Joe fails at $x = 0$. The problem is that $\frac{du}{dx} = g'(x) = 1/(3 (\sqrt[3]{x})^2)$, which is undefined at 0. So you cannot simply multiply and divide by $g'(x)$ to justify this answer. And you can't use the chain rule to differentiate the answer at $x=0$, because $\sqrt[3]{x}$ is not differentiable at 0.
Nevertheless, the answer is correct for all $x$, including $x=0$. Here is a way to justify it. First, we write the integral as a definite integral. For any $a$ and $b$, $$ \int_a^b \frac{dx}{(\sqrt[3]{x})^2 + 1} = \int_{\sqrt[3]{a}}^{\sqrt[3]{b}} \frac{3u^2\,du}{u^2+1} = (3\sqrt[3]{b}-3\tan^{-1}(\sqrt[3]{b})) - (3\sqrt[3]{a}-3\tan^{-1}(\sqrt[3]{a})). $$ This can be justified by the substitution $x = u^3$, $dx = 3u^2\,du$, which will transform the second integral into the first. Note that $\frac{dx}{du}$ is defined for all $u$, so this works even if the interval of integration includes 0. Now rewrite this result with different letters: $$ \int_a^x \frac{dt}{(\sqrt[3]{t})^2+1} = (3\sqrt[3]{x}-3\tan^{-1}(\sqrt[3]{x})) - (3\sqrt[3]{a}-3\tan^{-1}(\sqrt[3]{a})). $$ Rearranging, we have $$ (3\sqrt[3]{x}-3\tan^{-1}(\sqrt[3]{x})) = \int_a^x \frac{dt}{(\sqrt[3]{t})^2+1} + (3\sqrt[3]{a}-3\tan^{-1}(\sqrt[3]{a})). $$ Finally, by the fundamental theorem of calculus we have $$ \frac{d}{dx}(3\sqrt[3]{x}-3\tan^{-1}(\sqrt[3]{x})) = \frac{1}{(\sqrt[3]{x})^2+1}, $$ which justifies the answer.
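As a rough numerical illustration (not part of the argument itself), the definite-integral identity above can be checked on an interval that contains $0$. The sketch below assumes a simple midpoint rule and the arbitrary endpoints $a=-1$, $b=2$; the helper names are hypothetical.

```python
import math

def integrand(x):
    # 1 / (cbrt(x)^2 + 1); abs(x) ** (2/3) equals cbrt(x)^2 for every real x.
    return 1.0 / (abs(x) ** (2.0 / 3.0) + 1.0)

def cbrt(x):
    # Real cube root that also handles negative inputs.
    return math.copysign(abs(x) ** (1.0 / 3.0), x)

def antiderivative(x):
    # The closed form 3*cbrt(x) - 3*arctan(cbrt(x)) from the substitution.
    u = cbrt(x)
    return 3.0 * u - 3.0 * math.atan(u)

def midpoint_rule(f, a, b, n=200_000):
    # Composite midpoint approximation of the integral of f over [a, b].
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

a, b = -1.0, 2.0  # interval deliberately chosen to contain x = 0
numeric = midpoint_rule(integrand, a, b)
exact = antiderivative(b) - antiderivative(a)
gap = abs(numeric - exact)
print(numeric, exact, gap)
```

The two values should agree to within the quadrature error, even though the interval crosses the point where $\sqrt[3]{x}$ fails to be differentiable.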
The upshot is that the method does work, but to avoid small glitches a slightly different justification is needed.
For more details, see my book Calculus: A Rigorous First Course, Section 8.4: Substitution with Inverse Functions.

A more sophisticated way of making substitutions is given in Michael Spivak's Calculus. Integrals of the form $$ \int f'(g(x))g'(x) \, dx $$ can be easily solved by substituting $u=g(x)$. In fact, many of these integrals are so simple that you can do them in your head. However, the substitution $u=g(x)$ can still be made even if the factor $g'(x)$ does not appear. In general, $$ \int f(g(x)) \, dx \tag{*}\label{*} $$ can be solved in the following way, provided that $g$ is one-to-one: \begin{align} u &= g(x) \\ x &= g^{-1}(u) \\ \frac{dx}{du} &= (g^{-1})'(u) \\ dx &= (g^{-1})'(u)du \end{align} This means that \eqref{*} can be transformed to $$ \int f(u)(g^{-1})'(u) \, du \, , $$ which in practice often makes the integral simpler to solve than before. Hence, $$ \int f(g(x)) \, dx = \int f(u)(g^{-1})'(u) \, du \, . $$ The validity of this approach can be demonstrated by differentiating $\int f(u)(g^{-1})'(u) \, du$ with respect to $x$: \begin{align} \frac{d}{dx}\int f(u)(g^{-1})'(u) \, du &= f(u)(g^{-1})'(u)\frac{du}{dx} \\ &= f(u) \frac{dx}{du} \frac{du}{dx} \\ &= f(g(x)) \, . \end{align}
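For a concrete instance of this recipe, take $\int e^{\sqrt{x}} \, dx$ with $f(u)=e^u$ and $g(x)=\sqrt{x}$ on $x>0$ (an example chosen here for illustration, not taken from Spivak). Then $g^{-1}(u)=u^2$ and $(g^{-1})'(u)=2u$, so the integral becomes $\int 2ue^u \, du = 2(u-1)e^u + C = 2(\sqrt{x}-1)e^{\sqrt{x}} + C$. The sketch below checks this numerically by comparing a central-difference derivative with the integrand; it is a sanity check under these assumptions rather than a proof.

```python
import math

def integrand(x):
    # f(g(x)) with f = exp and g = sqrt, for x > 0.
    return math.exp(math.sqrt(x))

def antiderivative(x):
    # With u = sqrt(x): g^{-1}(u) = u^2 and (g^{-1})'(u) = 2u, so the
    # transformed integral is int 2u e^u du = 2(u - 1)e^u (integration by parts).
    u = math.sqrt(x)
    return 2.0 * (u - 1.0) * math.exp(u)

def central_difference(F, x, h=1e-6):
    # Symmetric finite-difference approximation of F'(x).
    return (F(x + h) - F(x - h)) / (2.0 * h)

# Sample points in the region x > 0, where g is one-to-one and differentiable.
sample_points = [0.5, 1.0, 2.0, 4.0, 9.0]
max_error = max(
    abs(central_difference(antiderivative, x) - integrand(x))
    for x in sample_points
)
print(max_error)
```

A small printed discrepancy is what one would expect if differentiating the result of the formal substitution really does recover $f(g(x))$.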

This may be a great mnemonic, but does it justify the rule? It's hard to call this a proof. Can you show how this proves the correctness? The integrands are different functions, and even Spivak himself writes that these equations "can't be taken literally". (I'm not sure what a non-literal equation is or how it can be used in a proof.) – SRobertJames Jun 14 '23 at 13:11
@SRobertJames: The issue with the formula $$ \int f'(g(x))g'(x) \, dx = \int f'(u) \, du $$ is that the notation $\int f(x) \, dx$ ought to refer to the set of functions $F$ such that $F'=f$. Under this interpretation $\int f'(g(x))g'(x) \, dx$ should be the set of functions which contains $f\circ g$ (and every function belonging to this set equals $f\circ g$ up to a constant function). However, $\int f'(u) \, du$ should refer to the set of functions containing $f$, so in general, the two sets are not equal. – Joe Jun 16 '23 at 11:38
@SRobertJames: In actual practice, however, we tend to write things like $\int x \, dx = \frac{x^2}{2}+C$. Under this interpretation, $\int f'(g(x))g'(x) \, dx$ means not the set of functions containing $f\circ g$, but rather the formal expression "$f(g(x))+C$". If we adopt the convention that "$u=g(x)$", then everything works out fine. Now, you might complain that this is not rigorous – and you would be right – but the issue is not with integration by substitution per se, but rather the fact that integration notation is somewhat sloppy to begin with. – Joe Jun 16 '23 at 11:45
@SRobertJames: My answer shows, fairly convincingly, that the formal method for integration by substitution for indefinite integrals produces correct results, even though it looks dubious (for instance, because we are treating $du/dx$ as a fraction). But anyway, you can always check whether the dubious method works simply by differentiating your purported antiderivative of the function. – Joe Jun 16 '23 at 11:50