10

This is a very soft and potentially naive question, but I've always wondered about the seemingly common phenomenon where one method of proof makes a theorem easy to prove, while other methods are incredibly difficult.

For example, proving that every vector space has a basis (this may be a bad example). This is almost always done via an existence proof with Zorn's lemma applied to the poset of linearly independent subsets ordered by set inclusion. However, if one were to suppose there exists a vector space $V$ with no basis, it seems (to me) that deriving a contradiction from so few assumptions would be incredibly challenging.
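
For concreteness, the argument I have in mind runs roughly as follows (just a sketch of the usual Zorn's lemma proof): let $P$ be the poset of linearly independent subsets of $V$, ordered by inclusion. Every chain $\mathcal{C}$ in $P$ has the upper bound $\bigcup \mathcal{C}$, which is still linearly independent because any finite subset of it already lies in a single member of the chain. Zorn's lemma then produces a maximal element $B$, and maximality forces $\operatorname{span}(B) = V$: if some $v \in V \setminus \operatorname{span}(B)$ existed, then $B \cup \{v\}$ would be a strictly larger independent set. So $B$ is a basis.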

With that said, I had a few questions:

  1. Are there any other examples of theorems like this?
  2. Is this phenomenon simply due to the logical structure of the statements themselves, or is it something deeper? Is this something one can quantify in some way? That is, is there any formal way to study the structure of a statement and determine which methods of proof are ideal and which are not?
  3. With (1) in mind, are there ever any efforts to come up with proofs of the same theorem using multiple methods for the sake of interest?
Isochron
  • 1,126
  • 1
    I think that in many cases the differences in difficulty come from the level of the tools used. For example, proving the Pythagorean theorem with vector calculus is really easy, but doing it with only school-level geometric tools is more complicated. – Brian Britos Simmari Jan 18 '23 at 20:30
  • 1
    Is it more complicated? I don't think so - see this collection of easy geometric proofs here. – Dietrich Burde Jan 18 '23 at 20:41
  • 3
    A long time ago I asked a similar question on MO: https://mathoverflow.net/questions/43820/extremely-messy-proofs – Qiaochu Yuan Jan 18 '23 at 20:54
  • 2
    Not directly related to your questions but: I think the Zorn's lemma example for bases is a little misleading because if you just use choice and recursion directly, the proof is very similar to the intuitive proof for the finite-dimensional case: just keep choosing vectors for as long as possible while maintaining linear independence, and you'll get a basis. In my opinion ZL actually obscures the simplicity. – blargoner Jan 18 '23 at 21:43
  • 1
    Not a theorem proof, but Fortuné Landry spent a lot of work looking for the factorization into primes of the number $2^{58}+1 = 5 \cdot 107367629 \cdot 536903681$. When it was realized that in general $$2^{4k+2}+1 = (2^{2k+1}-2^{k+1}+1)\cdot (2^{2k+1}+2^{k+1}+1)$$ then it became a triviality to factor $2^{58}+1$. See Aurifeuillean factorization. (A quick verification of this identity is sketched just after these comments.) – Jeppe Stig Nielsen Jan 19 '23 at 07:06
  • Reminds me of the proof that the area of intersection of two concentric unit squares is greater than $3/4$. – Dan Jan 19 '23 at 07:16
  • Maybe the 'simpler' proof uses more powerful Lemmas? – Bram28 Jan 19 '23 at 14:41
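
(Verifying the identity from Jeppe Stig Nielsen's comment, since the arithmetic is quick: writing $a = 2^{2k+1}+1$ and $b = 2^{k+1}$, the right-hand side is $(a-b)(a+b) = a^2 - b^2 = 2^{4k+2} + 2^{2k+2} + 1 - 2^{2k+2} = 2^{4k+2}+1$. For $k = 14$ this gives $2^{58}+1 = (2^{29}-2^{15}+1)(2^{29}+2^{15}+1) = 536838145 \cdot 536903681$, and $536838145 = 5 \cdot 107367629$, recovering the stated factorization.)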

4 Answers

15

Very often the first proof of a result which appears in the literature is extremely messy because the mathematician who proved it is working at the very edge of what is possible with the tools of the day; then it gets simplified over time as other mathematicians better understand what is going on and develop better machinery for streamlining the proofs. These first proofs are typically not presented to students because they are terrible, but the disadvantage of not knowing them is that you don't see how valuable the machinery that streamlines the modern proofs is.

There are many examples of this sort of thing, some of which you can find at this MO question; here's one that I came across while writing a blog post about the Sylow theorems. It is about

Cauchy's theorem: if a finite group $G$ has the property that its order $|G|$ is divisible by a prime $p$, then $G$ has an element of order $p$.

There is an extremely slick proof of this theorem which comes from considering the set of solutions to the equation

$$\{ (g_1, \dots, g_p) \in G^p : g_1 g_2 \cdots g_p = e \}$$

and then considering the action of the cyclic group $\mathbb{Z}/p\mathbb{Z}$ by rotation $(g_1, g_2, \dots, g_{p-1}, g_p) \mapsto (g_2, g_3, \dots, g_p, g_1)$, which you can see in the link. It takes maybe three sentences to give.
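
For readers who don't want to click through, those three sentences go roughly like this (a compressed version of the linked argument): the set above has $|G|^{p-1}$ elements, since $g_1, \dots, g_{p-1}$ can be chosen freely and then $g_p = (g_1 \cdots g_{p-1})^{-1}$ is forced. The rotation action is well-defined (if $g_1 \cdots g_p = e$ then $g_2 \cdots g_p g_1 = e$), and since $p$ is prime every orbit has size $1$ or $p$, with the fixed points being exactly the constant tuples $(g, g, \dots, g)$ satisfying $g^p = e$. Counting mod $p$, the number of such $g$ is congruent to $|G|^{p-1} \equiv 0 \pmod{p}$, and since $g = e$ is one solution there are at least $p-1$ others, each of which has order exactly $p$.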

By contrast, Cauchy's original proof took 9 pages. He does it by explicitly constructing the Sylow $p$-subgroups of the symmetric group, then using a clever counting argument to show that if a finite group $G$ has the property that $p \mid |G|$ and also embeds into another finite group which has Sylow $p$-subgroups, then $G$ has an element of order $p$ (I believe Cauchy was working at a time when "finite group" always meant "finite group of permutations," so for him all finite groups were already embedded into symmetric groups); you can see the details in the link. I give only a very abbreviated sketch of the proof there; the full construction of the Sylow $p$-subgroups of the symmetric group is very tedious (I have never seen anyone give it in full, and I tried doing it in a follow-up blog post but gave up because it was too tedious).

This is a good example of what I mean: Cauchy was working at a very early time in group theory, before anyone had even defined an abstract group, and people just didn't understand group theory that well yet. There was not even the notion of a quotient group at the time. Once group theory was better understood, better proofs became possible. Actually, I have no idea who the above slick proof of Cauchy's theorem is due to, nor how many decades it took after Cauchy's original proof for someone to find it.

Cauchy's original proof does have the advantage that it is much closer to being a proof of the first Sylow theorem. It has a generalization due to Frobenius which shows that if a finite group $G$ embeds into a finite group $H$ which has a Sylow $p$-subgroup, then $G$ must have a Sylow $p$-subgroup. And then you can prove Sylow I by exhibiting the Sylow $p$-subgroups of the symmetric groups, or somewhat more easily, the general linear groups $GL_n(\mathbb{F}_p)$, then invoking Cayley's theorem.
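
To flesh out that last step (the standard computation, not specific to any of the linked posts): counting bases of $\mathbb{F}_p^n$ column by column gives $$|GL_n(\mathbb{F}_p)| = \prod_{i=0}^{n-1}(p^n - p^i) = p^{n(n-1)/2} \prod_{j=1}^{n}(p^j - 1),$$ and each factor $p^j - 1$ is coprime to $p$, so the $p$-part of the order is exactly $p^{n(n-1)/2}$. That is precisely the order of the subgroup of upper triangular matrices with $1$s on the diagonal, which is therefore a Sylow $p$-subgroup of $GL_n(\mathbb{F}_p)$.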

Qiaochu Yuan
  • 419,620
  • Very interesting! – Dietrich Burde Jan 18 '23 at 21:18
  • 1
    The slick proof is due to McKay, see https://math.stackexchange.com/questions/238385/clarification-of-mckays-proof-of-cauchys-theorem-for-groups – Brauer Suzuki Jan 19 '23 at 05:31
  • Rephrasing your first paragraph, something which IMHO should always be emphasized by teachers to students: "there is not much point knowing the smart way to solve a problem unless you've tried it the hard way first". – David Jan 24 '23 at 22:02
8

Questions like this always remind me of Hamilton's search for a multiplication in $\mathbb{R}^3$ which somehow extends the multiplication of real and complex numbers (*). He searched, in vain, for years. And he was, at his time, a really famous and renowned mathematician.

Today it is easy to see that this is not possible. If a vector space $V$ with odd dimension is a division algebra, then for $0\neq a\in V$ the map $x\mapsto ax$ is a linear map which must have a real eigenvalue $\lambda$, since its characteristic polynomial has odd degree and therefore has a real root by the intermediate value theorem. If $v\neq 0$ is an eigenvector, we have $(a -\lambda) v =0$, so $a= \lambda e$. Since $a$ was chosen arbitrarily, $V$ must be isomorphic to $\mathbb{R}$.

What Hamilton was lacking were the concepts and definitions involved in this short proof. There is a great deal of power, hidden knowledge, and decades of research effort packed into the definitions we are taught today when we learn mathematics.

((*) Summarized from the introduction to an article by Köcher and Remmert, from the book 'Numbers' by Ebbinghaus, Hermes, Hirzebruch et al., German version, Springer 1983)

Thomas
  • 22,361
  • 7
    To put this in context, Hamilton defined the quaternions in 1843. Grassmann didn't publish the first text on linear algebra until 1844, the word "matrix" was not coined until 1848, vector spaces were not defined until 1888, and eigenvectors and eigenvalues weren't defined in generality until 1904. So all things considered Hamilton did pretty well to even find the quaternions! – Qiaochu Yuan Jan 18 '23 at 21:11
  • @QiaochuYuan Highly appreciated, thanks a lot. – Thomas Jan 18 '23 at 21:22
  • 3
    To amend the context given by @QiaochuYuan: Hamilton got very excited when he read Grassmann's work, probably realizing how much simpler his quaternions would seem once people understood this new theory (linear algebra, that is), which is actually a very good insight. I read about that years ago in the biography Engel wrote about Grassmann in his edition of his works. Cf. my comment to https://mathoverflow.net/a/121601/27465 – Torsten Schoeneberg Jan 18 '23 at 22:29
  • @Thomas I don't understand the step from $(a - \lambda) v = 0$, for one eigenvector $v$, to $a = \lambda e$. – Robert Furber Jan 20 '23 at 12:32
    @RobertFurber this is probably because $a-\lambda$ is already too sloppy a way of writing this down; it should read $(a - \lambda e) v = 0$. ($a$ is an element of $V$, and its action on $V$ via the algebra multiplication identifies it with a linear transformation. $\lambda$ is just a real number, and its action on $V$ through scalar multiplication is the same as the action of $\lambda e$ on $V$.) – Thomas Jan 20 '23 at 14:15
  • @Thomas I was OK with that, I just can't see why the fact that $a$ and $\lambda e$ agree on the eigenspace for the eigenvalue $\lambda$, spanned by $v$, implies they are equal (on the rest of $V$). – Robert Furber Jan 20 '23 at 14:57
  • 1
    @RobertFurber if $V$ is an algebra and $a\in V$, then it induces a linear map $E_a v:= av$. On the left, you have a linear map applied to $v$. On the right, you have the algebra product. The same applies to $(a-\lambda e)v$. The linear map view tells you that there is a real $\lambda$ such that the map induced by $a-\lambda e$ has a nontrivial kernel. The algebra view tells you that the product of the corresponding algebra element with $v\neq 0$ is $=0$. Since we assume that $V$ is a division algebra, this implies one of the factors is $=0$, so $a=\lambda e$. – Thomas Jan 20 '23 at 15:12
  • @Thomas Thanks, that is clear now, we divide by $v$ (should have thought of that). I did not mind the "sloppy" way of writing it down at all. – Robert Furber Jan 20 '23 at 15:20
2
  1. Yes, there are many other examples of this. More or less every famous result (well, you wanted a "simple proof" for it) will be incredibly challenging, or even impossible, to prove by a different method. I think of Fermat's Last Theorem, the Poincaré conjecture, or the weak Goldbach conjecture, just to name a few. Of course, what counts as a "simple proof" depends on the context. Perhaps one day the proof of FLT will be considered "simple" in comparison to other proofs.

  2. No, I don't think this is apparent from the statement alone. Take Fermat's Last Theorem. How could one have "quantified" beforehand that a proof without elliptic curves and without modular forms would (probably) be extremely challenging, and much more difficult than the proof we have?

  3. Yes, there are famous theorems for which people have tried to find as many proofs as possible. Three examples are the Pythagorean theorem, the law of quadratic reciprocity, and the fundamental theorem of algebra. The Wikipedia article here mentions that "several hundred proofs of the law of quadratic reciprocity have been published."

Dietrich Burde
  • 130,978
  • 1
    "The book The Pythagorean Proposition by E. S. Loomis, the second edition of which was published in 1940, is a collection of 370 different proofs of the Pythagorean theorem." https://mathlair.allfunandgames.ca/pythprop.php – Gerry Myerson Jan 18 '23 at 23:19
0

Burnside's theorem, that every finite group of order $p^aq^b$ (where $p,q$ are primes) is solvable, has a short proof using character theory and a much longer proof without characters. See M. Isaacs' books for both proofs.