Reading the following intuitive explanation of the Change of Variable Theorem, I wonder if the explanation provided can be made into a rigorous proof. As of now, I'm merely interested in the case of Riemann integration:

Theorem: let $A\subseteq \mathbb{R}^n$ be open and let $g:A\to \mathbb{R}^n$ be a $C^1$ injective function such that $\det g'(a) \ne 0$ for any $a\in A$. If $f:g(A)\to \mathbb{R}$ is (Riemann) integrable, then $$\int_{g(A)}f = \int_A(f\circ g) |\det g'|.$$


Other proofs I've seen take a different, seemingly less intuitive route. As an example, Spivak's Calculus on Manifolds first applies multiple 'reductions' (e.g. showing that it is sufficient to prove the result whenever $g$ is a linear transformation), and continues by applying induction on the dimension $n$ of $\mathbb{R}^n$.

Sam
  • Since when is the Change of Variables Theorem called the reverse chain rule? The main difficulty with turning intuition into rigor here is that the Riemann integral requires a rectangular partition. Even linear maps rarely map rectangles to rectangles, and the case of a general $C^1$ diffeomorphism is considerably more troublesome. – Ted Shifrin Jan 11 '23 at 21:13
  • @TedShifrin you are right: whenever one deals with more than one variable the result is called "Change of Variable Theorem". I changed the post accordingly. I often think of it in my head as the (generalized) "Reverse Chain Rule" for hopefully obvious reasons though. – Sam Jan 11 '23 at 21:18
  • Even in one variable, the proof is the chain rule and the FTC. I would never think of it as a reverse chain rule, but so be it. – Ted Shifrin Jan 11 '23 at 21:34
  • @TedShifrin - As an intermediate step, we could define an integral using non-rectangular partitions, and prove that it's equivalent to the Riemann integral. I've taken that approach in https://math.stackexchange.com/questions/4829489/how-bad-is-the-error-in-the-volume-of-a-transformed-region-when-the-transformat – mr_e_man Dec 18 '23 at 00:05
  • @mr_e_man Proving that this works is the bulk of the non-inductive proof that’s in my multivariable mathematics textbook. It’s a non-trivial effort. – Ted Shifrin Dec 18 '23 at 01:16

1 Answer

Can it be done? Sure, with lots of effort and pain. But, I think it is much easier to follow the relatively slick approach in Spivak (given the purposes he has in mind), or just develop measure theory and Lebesgue integrals (so the definition is much more flexible, and you can handle limits with much greater ease), so that it becomes much more obvious where the true difficulty lies regarding the geometry and analysis.


In one dimension, the theorem is so simple because of the chain rule AND the FTC. In 1D, continuous functions map intervals to intervals, so the geometry is completely trivial. Furthermore, by using the FTC and the chain rule, one bypasses all the geometry/analysis regarding how a mapping $g$ distorts lengths/volumes (in particular, the geometric role of $g'$ as the change-of-length factor is underemphasized, and it is merely regarded as the correct thing to insert in order to undo the chain rule). So, in this regard, the 1D version is highly special/exceptional.
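
To make the one-dimensional shortcut concrete, here is a sketch under extra assumptions that are not part of the theorem in the question: $f$ is continuous, $g:[a,b]\to\mathbb{R}$ is $C^1$, and $F$ denotes any antiderivative of $f$ (which exists by the FTC). Then \begin{align} \int_a^b (f\circ g)\,g' = \int_a^b (F\circ g)' = F(g(b))-F(g(a)) = \int_{g(a)}^{g(b)} f. \end{align} No length-distortion argument appears anywhere, and the oriented integral carries $g'$ rather than $|g'|$. The absolute value only shows up once the right-hand side is rewritten as an unoriented integral over the set $g([a,b])$: if $g$ is injective and decreasing, both sides pick up a minus sign, which leaves $\int_{g([a,b])}f=\int_a^b (f\circ g)\,|g'|$.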

As soon as you go to two dimensions, you'll immediately recognize the crazy jump in difficulty because images of sets are no longer easily characterized (it's already not so obvious for linear maps, let alone non-linear maps, as mentioned in the comments). Using Riemann integrals alone just complicates matters because the Riemann integral is defined in a very ad-hoc manner: it is properly defined only on closed rectangles with sides parallel to the coordinate axes; carefully extending the definition to other types of sets (even simple things like a rotated rectangle, an open disk, or a bounded open set) requires quite a bit of finessing. Coupling this with the above fact that images of sets can be pretty wild, and that volumes scale according to a completely non-obvious factor, goes to show how non-trivial the theorem gets when you try to convert the intuition into an airtight theorem and proof (stated with generality that is sufficient for most practical applications). Also, one major issue is that you have to somehow manage the effects of sets of Lebesgue measure zero, even though at the stage of Riemann integrals people often don't develop many of their properties (e.g. a lot of this mess is hidden behind Spivak's definition of the extended integral right after the section on partitions of unity).
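
A standard example, sketched here only as an illustration, shows both issues at once. Take $A=(0,R)\times(0,2\pi)$ and $g(r,\theta)=(r\cos\theta,\,r\sin\theta)$, which is $C^1$ and injective on $A$ with $\det g'(r,\theta)=r\neq 0$. The image of this open rectangle is not a rectangle: it is the open disk of radius $R$ with the segment $[0,R)\times\{0\}$ removed. That missing segment has measure zero, and the formula has to be set up so that it does not matter: \begin{align} \int_{g(A)}1=\int_A|\det g'|=\int_0^{2\pi}\!\!\int_0^R r\,dr\,d\theta=\pi R^2. \end{align} The answer agrees with the area of the full disk precisely because the slit contributes nothing, which is the kind of measure-zero bookkeeping a Riemann-only proof has to carry out by hand.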

If I remember correctly, Hubbard and Hubbard's text might have a proof which more closely (compared to Spivak) resembles your linked approach (but then again, I'm not sure if they prove things with sufficient generality). But if you read their proof, you'll see that a lot of effort is spent trying to manage what are essentially just measure-zero effects, and the fact that you're not dealing with perfect rectangles.


Lebesgue integrals (which I think you know based on some of your previous questions) are completely indifferent to the geometry of sets, so that already relieves us of a big headache (all we care about are $\sigma$-algebras, and we just need the one relatively easy-to-prove technical lemma which says locally Lipschitz maps preserve the Lebesgue $\sigma$-algebra). So, regardless of how crazy the sets $A$ and $g(A)$ are, the Lebesgue integral is completely unaffected. Also, Lebesgue integrals and measure theory are very nice and intuitive in the sense that it is very easy to state and prove an abstract version of the change-of-variables theorem: "if I have two measure spaces and a mapping between the two, then an integral over the first is related to an integral over the second by such and such formula". Once you understand the recurring theme in math that given two spaces, you can translate information from one space to another (under some hypotheses), you'll see that the abstract change-of-variables theorem is very natural and obvious. So, with this, we now arrive at \begin{align} \int_{g(A)}f\,d\lambda&=\int_Ag^*(f\,d\lambda)=\int_A(f\circ g)\,d(g^*\lambda). \end{align} The only thing we have to work for now is to express the pullback measure $g^*\lambda$ on $A$ back in terms of the usual Lebesgue measure $\lambda$. Well, that is precisely why the Radon-Nikodym theorem exists (it applies because locally Lipschitz maps send null sets to null sets, so $g^*\lambda\ll\lambda$). Note that with Lebesgue integrals, it is at this stage that technical difficulties begin, because the Radon-Nikodym theorem is non-trivial. But once we accept it, we can write \begin{align} \int_{g(A)}f\,d\lambda&=\int_Ag^*(f\,d\lambda)=\int_A(f\circ g)\,d(g^*\lambda)=\int_A(f\circ g)\frac{d(g^*\lambda)}{d\lambda}\,d\lambda. \end{align} So, the only thing left now is to calculate the Radon-Nikodym derivative in terms of more familiar quantities.
Here, Lebesgue's differentiation theorem (non-trivial) tells us that for a.e. $x\in A$, \begin{align} \frac{d(g^*\lambda)}{d\lambda}(x)=\lim_{r\to 0^+}\frac{\lambda(g(B(x,r)))}{\lambda(B(x,r))}, \end{align} i.e. the Radon-Nikodym derivative is the limit of ratios of volumes. It is at this stage that you can use your corresponding knowledge of the linear case, and some technical analysis, to prove that this equals $|\det Dg_x|$. Thus, we finally get \begin{align} \int_{g(A)}f\,d\lambda&=\int_Ag^*(f\,d\lambda)=\int_A(f\circ g)\,d(g^*\lambda)=\int_A(f\circ g)\frac{d(g^*\lambda)}{d\lambda}\,d\lambda=\int_A(f\circ g)\cdot |\det Dg|\,d\lambda. \end{align} So, to summarize, the proof when using Lebesgue integrals consists of the above four equalities. The first and second are very easy and routine, and merely amount to "transferring information from $g(A)$ to $A$". The third and fourth steps use the Radon-Nikodym theorem and Lebesgue's differentiation theorem, both of which are non-trivial, but they are the rigorous incarnation of our intuition regarding "infinitesimal ratios of volumes of the transformed set to the original set". The above four equal signs are how I actually think about the rigorous proof of the change of variables theorem in $\Bbb{R}^n$ (which I have written up completely here).
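
For the linear case referred to above, the limit is immediate. As a sketch, assume $g(x)=Tx+b$ with $T$ an invertible matrix (the symbols $T$ and $b$ are introduced here purely for illustration), and take for granted the fact $\lambda(T(E))=|\det T|\,\lambda(E)$ together with translation invariance of $\lambda$. Then \begin{align} \frac{\lambda(g(B(x,r)))}{\lambda(B(x,r))}=\frac{\lambda(T(B(x,r))+b)}{\lambda(B(x,r))}=\frac{|\det T|\,\lambda(B(x,r))}{\lambda(B(x,r))}=|\det T|\quad\text{for every }r>0. \end{align} For a general $C^1$ map, the "technical analysis" roughly amounts to comparing $g$ near $x$ with its affine approximation $y\mapsto g(x)+Dg_x(y-x)$ and showing that the ratio is squeezed to $|\det Dg_x|$ as $r\to 0^+$.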

So, you see, the Lebesgue integral easily addresses the issue of how to relate integrals over two different spaces (i.e. the first equal sign above), which already offers a great technical simplification compared to Riemann. The only drawback is that we have two different measures: $\lambda$ and $g^*\lambda$. Bringing things back to the same measure is where the true analytic difficulties lie (and this is aptly reflected in the above proof as well). So I think this way, you can more clearly see where/how the intuition is being converted to rigor.

peek-a-boo