
Context: Self Studying Analysis.

Currently, I'm looking at the Inverse Function Theorem (Baby Rudin), and the proof just seems so contrived (especially the introduction of the contraction which has so many "useful" properties) even though I understand every step that he takes. This is not the only proof where I run into this situation, and I was wondering if there might be any resources out there that can bridge the gap between the final presentation (of the proof) and the thought process that motivates how the proof is constructed?

I suppose another way of phrasing this is - what do I need to work on in order to come up with my own proofs? Studying the proofs directly doesn't seem to help much in developing the ability to construct (more involved) proofs. Also, I've been working on all the exercises in every chapter so far, and I only look at the answer if I am unable to do an exercise after some time (a few hours), but I'm not so sure that this is the most effective way to generate the insight that underpins constructing proofs from scratch.

Sean Lee

3 Answers

4

Regarding your second question, to learn to prove stuff, read a LOT of proofs, and write a LOT of proofs.

Also, learn many examples and many counterexamples. Learn what's true and what's not true. In other words, build your scaffolding of mathematical knowledge as high and as strong as you can. You'll need that knowledge when writing proofs.


Regarding "contrived" proofs, to gain insight into how they are derived, you might dig into the history of the theorem, finding the oldest proofs that you can, and then seeing how the proof evolved in later research papers and textbooks, up to the present.

For this particular "contrived" proof of the inverse function theorem, perhaps one might have an issue with one of the tools in that proof, such as contraction mappings. Perhaps it seems rather "contrived" to go out of the way to consider contraction mappings when all you want to do is prove the Inverse Function Theorem.

Let me spin out a story.

Imagine you're studying an old theorem, let's call it Theorem A. You've learned the old, original proof, and it's the same proof everyone learns. It's a long proof, it's clunky, but it's kind of straightforward: do this first, yeah that makes sense; then do this, yeah that makes sense; and so on, and eventually you get to the end, it all makes sense to you, and maybe you could even reproduce the proof in all its long clunkiness.

But, you are convinced that there's some underlying principle that goes into this proof, something really basic and useful, if only you could figure out what that principle is and extract it from the proof.

And then you realize what that principle is. You write that principle down:

Theorem X ...

Now, you don't know yet whether Theorem X is true. But you decide to suspend your disbelief for a moment (an important skill for every mathematician to master), and instead to focus on verifying your intuition: Does Theorem X actually imply Theorem A? And when you write out the proof that Theorem X implies Theorem A, you see how beautiful and elegant the argument is (some might even call it a "contrived" argument). That convinces you: There might really be something here, if Theorem X is actually true. And so you set out to prove Theorem X, and you succeed. And then, lo and behold, it turns out that Theorem X can be applied all over the place, for many, many different things.

In this story, of course, Theorem A is the Inverse Function Theorem, and Theorem X is the Contraction Mapping Principle. The Contraction Mapping Principle is indeed very useful, with many, many other applications beyond the Inverse Function Theorem. For example, you can also use the Contraction Mapping Principle to prove the Existence and Uniqueness Theorem for linear ODEs (see the very nice little book by Victor Bryant entitled "Metric Spaces").

So, what's really going on in Baby Rudin is that he doesn't just want to give you a straightforward proof of the Inverse Function Theorem, however clunky it might be. He wants you also to learn a very important and useful abstract theorem, namely the Contraction Mapping Principle.

Lee Mosher
3

As suggested in a comment, it's hard to give a reasonable answer in this generality; if you show us a specific "contrived" proof people may be able to supply some motivation for the contrivance. But in this generality the question comes very close to "How do I learn to prove stuff?", and while good answers to that question exist they tend to be book-length.

Contrivances in General

Of course a big part of the answer to "How can I learn to prove stuff?" is "Learn the standard techniques in the area". Note

If a contrived trick gets used several times it has by definition become a standard technique.

Contractions

So. How did someone come up with that proof of IFT? Probably she or he knew this:

If you have to show that $f(x)=a$ has a solution it's often useful to find a different function $g$ such that you need to show $g$ has a fixed point.

This is so because there are various simple ways to show $g$ has a fixed point, for example by showing $g$ is a contraction. The first person to give that proof of IFT had seen several examples of solving equations by finding fixed points, so he or she decided to try it. It worked, thus adding to the list of examples where that particular trick is useful.

IFT

Some years ago I put IFT on my personal list of things I know how to prove. Not because I'd learned a proof. I never really feel I understand anything unless I've figured it out for myself; I convinced myself that I could prove IFT by changing it to a problem about fixed points.

Let's see if I can still do that. The actual theorem has a lot of technicalities, but the essence of it, the hard part, is this:

Theorem (most of IFT). Suppose $f:\Bbb R^n\to\Bbb R^n$ is $C^1$, $f(0)=0$ and $Df(0)$ is invertible. There exists $\delta>0$ such that if $|y|<\delta$ then there exists $x$ with $f(x)=y$.

The proof is immediate from the heuristic above plus one other "trick", which we can "motivate" (since I came up with the trick myself I just have to explain how I thought of it).

Motivation. Fix $y$ with $|y|$ small. We need to show that $$f(x)=y$$ has a solution. This is the same as showing that $g$ has a fixed point, where $$g(x)=x-f(x)+y.$$

So it's enough to show that $g$ is a strict contraction. By a little calculus, it would be enough to show there is a convex neighborhood $C$ of the origin such that the operator norm of the derivative satisfies $$||Dg(x)||\le\lambda<1\quad(x\in C).$$ But $$Dg(x)=I-Df(x)$$ (that is, $$Dg(x)h=h-Df(x)h).$$ Hmm. Since $Dg$ is continuous, this would show that $||Dg(x)||$ was small for small $x$ if we had $Df(0)=I$. Hmm... Aha. The condition $Df(0)=I$ is actually WLOG!
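The "little calculus" here is presumably the mean value inequality. A sketch, assuming $g$ is $C^1$ on the convex set $C$ and $x,x'\in C$ (the symbols $x'$, $t$ and $z$ are introduced only for this sketch):

$$g(x)-g(x')=\int_0^1 Dg\bigl(x'+t(x-x')\bigr)(x-x')\,dt,$$

so

$$|g(x)-g(x')|\le\sup_{z\in C}||Dg(z)||\,|x-x'|\le\lambda|x-x'|.$$

Convexity is what keeps the whole segment from $x'$ to $x$ inside $C$, so the bound on $||Dg||$ applies along it.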

Proof. Or actually not so much a proof as an explanation of the main step in the proof; I leave it to you to insert the quantifiers and the logic to convert this into something I'd actually accept as a proof from a student.

First, replacing $f$ by $f\circ\psi$ for an appropriate diffeomorphism $\psi$, we can assume that $$Df(0)=I.$$ (For example, if $T=Df(0)$ let $F(x)=f(T^{-1}x)$; showing $f(x)=y$ has a solution is the same as showing $F(x)=y$ has a solution.)

Fix $y$ and define $g$ as above. Since $Df$ is continuous there exists $\epsilon>0$ such that $$||Df(x)-I||\le 1/2\quad(|x|<\epsilon).$$ So $$||Dg(x)||\le 1/2\quad(|x|<\epsilon).$$ So $g$ is a strict contraction, hence $g$ has a fixed point.
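To see the fixed-point idea run numerically, here is a minimal Python sketch. It is only an illustration under assumptions: the map `f` below is a made-up example on $\Bbb R^2$ with $f(0)=0$ and $Df(0)=I$, and the iteration uses $g(x)=x-(f(x)-y)$, whose fixed points are exactly the solutions of $f(x)=y$. The function name `solve_by_contraction` and the tolerances are hypothetical choices, not anything from the book.

```python
import math

def f(x):
    """Hypothetical example map R^2 -> R^2 with f(0) = 0 and Df(0) = I."""
    x0, x1 = x
    return (x0 + x1 * x1, x1 + x0 * x1)

def solve_by_contraction(f, y, tol=1e-12, max_iter=200):
    """Iterate g(x) = x - (f(x) - y), starting at 0, until iterates stop moving.

    A fixed point of g satisfies f(x) = y. Convergence is only expected
    when y is small enough that g is a contraction near the origin.
    """
    x = (0.0, 0.0)
    for _ in range(max_iter):
        fx = f(x)
        x_new = tuple(xi - (fi - yi) for xi, fi, yi in zip(x, fx, y))
        if math.dist(x_new, x) < tol:
            return x_new
        x = x_new
    raise RuntimeError("iteration did not converge; y may be too large")

y = (0.05, -0.03)                # a small right-hand side
x = solve_by_contraction(f, y)
residual = math.dist(f(x), y)    # how far f(x) is from y
```

For this `f`, the entries of $Df(x)-I$ are at most $2|x|$ in size, so near the origin the iteration shrinks distances quickly and the residual should end up tiny. Trying a large `y` is a cheap way to watch the "small $y$" hypothesis matter.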

I bet that's more or less the same as the proof in the book, since it's the obvious (sorry) way to try to prove it using a contraction. (If it looks very different I conjecture that's because he didn't start with the "WLOG $Df(0)=I$"; if so try rewriting the proof in the book starting with $Df(0)=I$; that will simplify a lot of the formulas, probably making it all look more like what's above.)

Note. For a minute it seemed like that was a proof that $f$ is surjective, which of course is nonsense, because I didn't see where we used the fact that $y$ is small.

But duh. What's above reads as though the theorem was "Any strict contraction has a fixed point", which is also nonsense. The actual theorem is of course

Theorem (Banach). If $X$ is a complete metric space and $g:X\to X$ is a strict contraction then $g$ has a fixed point.

So it's enough to prove this:

There exists $r>0$ such that if $0<\rho<r$ there exists $\delta>0$ such that if $|y|<\delta$ then $g(X)\subset X$, for $X=\overline{B(0,\rho)}$.

Hint: $$|g(x)|\le|y|+|g(0)-g(x)|\le |y|+c|x|.$$
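One way to fill in the hint, as a sketch under the assumptions of the proof above: take $r=\epsilon$ and $c=1/2$, where $\epsilon$ is chosen so that $||Dg(x)||\le 1/2$ for $|x|<\epsilon$ (this $\epsilon$ does not depend on $y$, since $Dg$ does not), and use $f(0)=0$ to get $|g(0)|=|y|$. Then for $0<\rho<r$ and $|x|\le\rho$,

$$|g(x)|\le|g(0)|+|g(x)-g(0)|\le|y|+\tfrac12|x|\le|y|+\tfrac{\rho}{2}.$$

So $\delta=\rho/2$ should work: if $|y|<\delta$ then $|g(x)|<\rho$ for every $x\in\overline{B(0,\rho)}$, which gives $g(X)\subset X$. Since $X$ is a closed subset of $\Bbb R^n$ it is complete, so Banach's theorem applies to $g:X\to X$.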

0

Try these monographs aimed at undergraduate first-timers (to real analysis).

| Title | Author | Publication Year |
|---|---|---|
| How to Think about Analysis | Lara Alcock | 2014 |
| Writing Proofs in Analysis | Jonathan M. Kane | 2016 |
| An Introduction to Proof through Real Analysis | Daniel J. Madden, Jason A. Aubrey | 2017 |
| The Real Analysis Lifesaver: All the Tools You Need to Understand Proofs | Raffi Grinberg | 2017 |
| Real Analysis With Proof Strategies | Daniel W. Cunningham | 2021 |
| Analysis with an Introduction to Proof, 6th ed. | Steven R. Lay, Richard G. Ligo | 2023 |

Dave L. Renfro recommends other monographs at

H7 De