18

I suppose this is a question about mathematical convention. In a problem in Introduction to Probability by Bertsekas and Tsitsiklis, they ask the reader to prove an identity. But then their proof is mostly words:

Problem 3.* Prove the identity $$A \cup \Bigg( \bigcap_{n=1}^\infty B_n \Bigg) = \bigcap_{n=1}^\infty\big(A \cup B_n\big).$$

Solution. If $x$ belongs to the set on the left, there are two possibilities. Either $x \in A$, in which case $x$ belongs to all of the sets $A \cup B_n$, and therefore belongs to the set on the right. Alternatively, $x$ belongs to all of the sets $B_n$, in which case it belongs to all of the sets $A \cup B_n$, and therefore again belongs to the set on the right.

Conversely, if $x$ belongs to the set on the right, then it belongs to $A \cup B_n$ for all $n$. If $x$ belongs to $A$, then it belongs to the set on the left. Otherwise, $x$ must belong to every set $B_n$ and again belongs to the set on the left.

In mathematics, why is this allowed? Can you say that this is more correct than a proof that is just "Oh, it's obvious!" or "Just keep distributing $A$ over and over ad nauseam and you get the term on the right"?

I'm not trolling. I'm genuinely curious as to how thorough one must be when using words as proof.

Boris Valderrama
jds
  • 13
    I think it's great. It's airtight, erudite, and to the point. You want to keep notation under control in mathematical exposition. – ncmathsadist Jan 19 '19 at 23:39
  • 6
    There are a few formulæ to denote the objects. I'll add that you can say you've really understood a problem if you can solve it in words. – Bernard Jan 19 '19 at 23:43
  • 1
    Eschew obfuscation. Don't use complicated symbols if you don't need to. – Alex S Jan 20 '19 at 01:16
  • 8
    Edward Nelson's "A PROOF OF LIOUVILLE'S THEOREM". (That paragraph is the entire Proc. Amer. Math. Soc. paper.) – Keith McClary Jan 20 '19 at 03:08
  • 16
    I am very confused by the question. Are x, ∈, ∪ and Bn words? If the answer is no, then the proof is not "just words". If the answer is yes, then proofs consisting only of mathematical symbols are also "just words". What is this question actually asking? – Eric Lippert Jan 20 '19 at 05:18
  • I've re-read this question many times, and I can't find anything unclear about it. I don't believe it should have been closed. It asks why the use of natural language in a proof is OK, and how rigorously such language can and should be used. The question is then summarised in the final sentence. – timtfj Jan 23 '19 at 23:56
  • 1
    @gwg To make the question less "unclear", it might be worth adding a summary along these lines at the end: • In a proof, can words be used rigorously as symbols? • Can a formal proof use only words? – timtfj Jan 24 '19 at 01:12
  • 1
    @timtfj, thanks for the feedback. I don't mind that the question is on hold; I got my answer. That said, your summary question, "Can a formal proof use only words?" is almost exactly the title of the post, "Can a proof be just words?" I was not unclear and the condescension in the comments (not you) is pretty par for the course here. Doesn't matter. I learned something. – jds Jan 24 '19 at 21:28
  • @ncmathsadist "you want to keep notation under control" but do you? I mean, I find notation and symbols a hell of a lot easier to understand than wordy proofs. Complicated notation only gets replaced by complicated, esoteric word jargon. One of the things I love about math is that people from all over the world, all speaking different languages, still understand 2+3=5 even though their words for 2, 3, 5, addition, and equals are different in their own languages. Putting words in proofs doesn't get rid of complicated symbols; it just makes what's being said even more fuzzy (and takes up twice the space). – Pineapple Fish Oct 04 '21 at 19:17
  • @ncmathsadist IMO the typical math "proof" absolutely sucks as a teaching and explaining tool because you have to already know that the statement ought to be true to begin with. You don't get the important theorems by randomly stumbling about from axioms. Proofs don't tell you where mathematical truths come from; all they do is check them. I think that casual explanation, and casual explanation of notation, is the best teaching tool, and that's where words have a role. I'm just saying that in a formal proof I think in terms of manipulating the notation, not the words. – Pineapple Fish Oct 04 '21 at 19:31

10 Answers

59

Exactly as thorough as you would have to be using any other kinds of symbols. It's just that vast messes of symbols are hellish for humans to read, but sentences aren't. Adding symbols to something doesn't make it more rigorous, less likely to be wrong, or really anything else. Symbols are useful for abbreviating in situations where this adds clarity, and making complex arguments easier to follow, but shouldn't be used where they do not help in this regard.

user3482749
14

Yes, they can, and I'm of the opinion that symbolism and notation should be avoided unless they serve to simplify the presentation of the material or to perform calculations. For example, suppose you want to cut a cube so that each face has a three-by-three grid of smaller cubes, similar to the Rubix cube. With a little thought and experimentation one might conjecture that six is the minimal number of cuts. The best proof of this that I know of is simply "Consider the faces of the center cube." They require six cuts because there are six faces, and the result follows immediately. No symbols or calculation, but still logical and mathematically sound.

CyclotomicField
  • I think your example proof only shows that 6 is a lower bound, not that it is a minimum. – Paŭlo Ebermann Jan 20 '19 at 00:26
  • 4
    +1 Spot on, though I feel obliged to say that Rubik's Cube is named after Rubik (who I think invented it to demonstrate group theory). – timtfj Jan 20 '19 at 00:30
  • 1
    @PaŭloEbermann that's correct and one would need to ensure a six cut solution exists which I assumed would have been found during the formation of the conjecture. – CyclotomicField Jan 20 '19 at 02:55
  • 2
    (A tangent: just because each face has a tic-tac-toe pattern of cuts on it does not immediately imply that there is any "center cube" at all. To assert that there is one seems to presuppose that the obvious six-cut solution is unique ...) – hmakholm left over Monica Jan 20 '19 at 03:29
  • The way I first heard the "Rubik's cube" problem, it was phrased in terms of cutting a $3\times3\times3$ cube into $27$ unit-sized cubes. I suspect the problem was popular before Ernő Rubik was even born. The "Rubik's cube" formulation just requires a bit more work to show that if all cuts must be planar then you will get $27$ unit-sized cubes in the end and it can be done in six cuts. The trick is seeing that there is no trick to get the same result in fewer cuts. – David K Jan 20 '19 at 19:18
  • 1
    @TheGreatDuck: One possible interpretation of the problem as stated here would be that you're allowed to: (a) cut the cube into three layers, (b) cut the upper and lower layers into 9 small cubes each, (c) cut just the four corners (half cubes) off the middle layer, leaving a big octagon. That produces the specified pattern on the surface yet no center cube. – hmakholm left over Monica Jan 24 '19 at 14:22
  • @HenningMakholm Laymen never seem to come to that conclusion but given the creative attempts here I'll put this on my list of questions that only confuse mathematicians. – CyclotomicField Jan 24 '19 at 17:05
11

Natural language for expressing mathematical statements can indeed be vague and ambiguous. However, when you study mathematics, one thing you will usually learn at the beginning is how to use mathematical terminology in a rigid, unambiguous way (at least for communication with other people trained in mathematical terminology). This process usually takes some time if you are not a genius (I guess it took me about two years at university until I became reasonably fluent), so unfortunately I fear I cannot give you a small set of rules for which kind of language is "right" for mathematical proofs and which is "wrong". This is something you can only learn by practicing.

Hence, the answer is IMHO "yes, words are fine, when used correctly by a trained expert". (Amazingly, one could say the same about more formal proofs using symbols.)

Note that historically, before the 18th century, proofs using natural language were the de facto standard in mathematics. Most of the symbolic notation we usually use today was developed in the 18th and 19th centuries.

Doc Brown
  • It depends on the symbol. The most basic, ( such as = and + ) are several centuries older. – Paul Sinclair Jan 20 '19 at 01:22
  • @PaulSinclair: sure, see my edit. – Doc Brown Jan 20 '19 at 06:52
  • I think what is important is that each symbol used must have one single, unambiguous definition that cannot be simplified further (that is, it is not a name or a placeholder for something else). Mathematical symbols too are liable to have many definitions. Most mathematical symbols, even the most elementary ones like "+" and "=", are loaded and must be interpreted case by case according to the context. I'm disappointed that mathematics is not as rigorous as I once thought. – zeynel Nov 17 '23 at 09:07
7

Two points:

(i) Historically, all proofs were done in words—the use of standardised symbols is a surprisingly recent development. This is obscured a bit because a modern edition of, say, Euclid's Elements is likely to have had the words translated into modern notation.

(ii) Before symbols can be used they have to be defined, and ultimately that definition will be in words. It's easy to forget this, especially with ones that we use all the time and learnt in childhood. But, for example, we once had to learn that $2+3=5$ was short for "Two things together with three things is the same as five things".

Though a lot of us learnt instead that $2+3=5$ meant "Three things added to two things makes five things".

Now, these two definitions are different. One makes $2+3$ into an operation done to $2$, and treats $=$ as an instruction to carry it out; the other says that the number on the right has the same value as the expression on the left. The notation, though, doesn't make this distinction, and it's possible to spend years using the $=$ sign as though it meant "put the result of the operation on the left on the right".

So in this case we've got one string of symbols ($2+3=5$), a correct definition, and a misleading definition. And how do we clarify the correct meaning of the symbols? By choosing which verbal definition to use. The precision is in the words (at least if they're well chosen).

Of course, more advanced symbols will most likely have some mathematical symbols in their definitions—but ultimately, we'll get back to words.

timtfj
  • 1
    Bad example. I've yet to see a translation of Euclid's elements with modern symbolism interposed. About the only symbolism in Euclid is the labeling of points, lines, or other geometric elements, and Euclid did this himself. The only change translators make is to use the Latin alphabet instead of the Greek. – Paul Sinclair Jan 20 '19 at 01:28
  • 1
    Indeed a significant proportion of people seem to be using the $=$ as if it meant, "the next step in the procedure I'm thinking about is to write down the following", with no particular consideration of how that next step relates to what is already on the paper. – hmakholm left over Monica Jan 20 '19 at 03:33
  • @PaulSinclair I was thinking really of school textbook versions. For example, my father had Euclid as his geometry textbook at school, and as I remember it was riddled with $=$ and $\therefore $ signs and the various geometrical symbols for parallel, perpendicular, angle, triangle etc. I think more modern translations treat it as an ancient text to be represented as closely as possible, rather than as something people are going to learn geometry from. – timtfj Jan 20 '19 at 10:56
  • @HenningMakholm It seems to come up every so often on Mathematics Educators SE, either as a problem to be combated or as a suggested reason for other problems. – timtfj Jan 20 '19 at 11:00
  • @PaulSinclair I'll edit the example later. I suppose a 1940s school textbook is only a modern translation for certain contextualisations of "modern". – timtfj Jan 20 '19 at 11:07
  • Translations of Euclid purely into words are longer & generally harder to understand than those with a judicious use of symbols. II 6 may be worded "If a straight line is bisected and a straight line is added to it in a straight line, then the rectangle contained by the whole with the added straight line and the added straight line together with the square on the half equals the square on the straight line made up of the half and the added straight line." Even just parsing that is hard. In symbols, $(2a+b)b+a^2=(a+b)^2$ is clear & unambiguous. – Rosie F Jan 20 '19 at 15:12
  • @RosieF - that is certainly more clear as to what it means for numbers, but it completely misses almost everything Euclid is saying. And trying to think about it in that way is going to make his development harder to understand instead of easier, because it is unnecessary with this sort of algebra. Therefore you will miss the ideas of what he is doing. Euclid's geometric number theory is antiquated now, so the only reason to be studying it is to understand those ideas. – Paul Sinclair Jan 21 '19 at 03:28
  • 3
    @TheGreatDuck: I've seen people write things like: "Find the inflection point of $2x^3+x^2$. Solution: $2x^3+x^2=6x^2+2x=12x+2=0$ so $x=-1/6$." – hmakholm left over Monica Jan 24 '19 at 14:29
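
For contrast, here is a sketch of how the same computation might be written so that every $=$ sign states a genuine equality. The name $f$ is introduced here purely for illustration and does not appear in the comment above.

$$\begin{aligned}
f(x) &= 2x^3 + x^2, \\
f'(x) &= 6x^2 + 2x, \\
f''(x) &= 12x + 2.
\end{aligned}$$

Then $f''(x) = 0$ exactly when $x = -\tfrac{1}{6}$, and since $f''$ is linear with positive slope it changes sign there, so $x = -\tfrac{1}{6}$ is the inflection point. The arithmetic is the same as in the quoted "solution"; only the claims made by the $=$ signs differ.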
7

For your particular example:

Just keep distributing $A$ over and over ad nauseam and you get the term on the right.

would not be a convincing proof. This is not because it is in words, however -- words are perfectly fine.

But it fails to convince because the intersection is over an infinite family of sets. Your proposal would work fine for a finite intersection, in that it gives a recipe for constructing an algebraic proof that would itself be convincing. And in ordinary mathematics a convincing recipe for a convincing proof is itself as good as the real thing.

But for an infinite intersection, the algebraic calculation you're describing never ends! No matter how many steps you do, there will still be an intersection of infinitely many $B_n$s that have yet to be distributed over in your expression. So your recipe does not lead to a finite proof, and infinite things (to the extent they are "things" at all) are not convincing arguments.
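
As a rough sketch of what the "keep distributing" recipe does establish (the finite index $N$ is notation introduced here, not taken from the question), start from ordinary distributivity for two sets:

$$A \cup (B_1 \cap B_2) = (A \cup B_1) \cap (A \cup B_2).$$

Induction on $N$ then gives the finite version, with inductive step

$$\begin{aligned}
A \cup \Bigg( \bigcap_{n=1}^{N+1} B_n \Bigg)
&= A \cup \Bigg( \Bigg( \bigcap_{n=1}^{N} B_n \Bigg) \cap B_{N+1} \Bigg) \\
&= \Bigg( A \cup \bigcap_{n=1}^{N} B_n \Bigg) \cap \big( A \cup B_{N+1} \big) \\
&= \Bigg( \bigcap_{n=1}^{N} \big( A \cup B_n \big) \Bigg) \cap \big( A \cup B_{N+1} \big)
= \bigcap_{n=1}^{N+1} \big( A \cup B_n \big).
\end{aligned}$$

This proves the identity for every finite $N \ge 1$, but the statement with $\bigcap_{n=1}^\infty$ is not one of those cases for any $N$, so the induction alone never reaches it.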


There are ways to convert some cases of infinitary intuition into actual convincing proofs, but they have subtle pitfalls, so you can't get away with using them -- no matter whether with words or with symbols -- unless you also convince the reader/listener that you know what these pitfalls are and have a working strategy for avoiding them. Typically this means you need to explicitly describe how you handle the step from "arbitrarily but finitely many" to "infinitely many" (or in more sophisticated phrasing: what do you do at a limit ordinal?).

A somewhat unheralded part of mathematics education is that over time you will get to see sufficiently many examples of this that you collect a toolbox of "usual tricks". When communicating in a situation where you trust everyone knows the usual tricks you can often get away with not even specifying which trick you're using, if everybody present is experienced enough to see quickly that there's one of the usual tricks that will obviously work.

5

Yes, it's perfectly acceptable to write proofs using mostly words.

In modern mathematics, all statements can be written using only the symbols $\forall, \exists, \vee, \wedge, \implies, \lnot, \in, (, )$ and a countable collection of variables. Notice that each of these symbols has a rough English meaning as well:

  • $\forall$ : for all
  • $\exists$ : there exists
  • $\vee$ : or
  • $\wedge$ : and
  • $\implies$ : implies
  • $\lnot$ : not

In most "heavily-worded" mathematical proofs, the words used are often a rough image of the precise symbols above. For a small example from your proof:

if $x$ belongs to the set on the right, then it belongs to $A \cup B_n$ for all $n$

translates to

$$\Big(x \in \bigcap_{n = 1}^\infty (A \cup B_n)\Big) \implies \forall n\,\big( n \in \mathbb{N} \implies x \in A\cup B_n \big)$$

and that's not even the most precise form, as there are ways (using the primitive symbols above) to translate the union, intersection, and $\mathbb{N}$ symbol to their more primitive forms. Can you imagine translating your entire proof into this symbolic form? The rough image (the "word form") of this formalism is often enough for the reader to understand the precise meaning of Theorems and their proofs.
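
To give a flavour of what a fuller translation might look like, here is a sketch of the whole textbook argument as a chain of equivalences. It assumes $\mathbb{N} = \{1, 2, 3, \dots\}$ as the index set (a convention chosen here to match the $n = 1$ lower limit) and still leaves $\mathbb{N}$ itself unexpanded.

$$\begin{aligned}
x \in A \cup \Bigg( \bigcap_{n=1}^\infty B_n \Bigg)
&\iff x \in A \;\vee\; \forall n\,\big( n \in \mathbb{N} \implies x \in B_n \big) \\
&\iff \forall n\,\big( n \in \mathbb{N} \implies (x \in A \vee x \in B_n) \big) \\
&\iff \forall n\,\big( n \in \mathbb{N} \implies x \in A \cup B_n \big) \\
&\iff x \in \bigcap_{n=1}^\infty \big( A \cup B_n \big).
\end{aligned}$$

The only step that does real work is the second one, which uses the fact that $P \vee \forall n\, Q(n)$ is equivalent to $\forall n\,(P \vee Q(n))$ when $n$ does not occur in $P$. That is exactly the "If $x$ belongs to $A$ ... Otherwise ..." case split in the worded proof.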

Metric
2

Behind the proof system is logic... you need to write reasoning that is foolproof and can be reproduced by the reader to reach the same conclusion, and every step of the proof must be unambiguous and without "exceptions" (if there are special cases, they must be stated). As long as this is respected, the proof is correct and complete. When you see a symbolic proof, you can still read it in plain language, as long as you understand what it means, so there is no real difference (as long as the proof is rigorous, without "holes" or ambiguous statements).

Note that this excludes statements such as "this is obvious". You need to tell the reader of the proof what steps to take in their own mind to come to a single unmistakable conclusion. This part is very important: not understanding this leads some people to reject proofs as opinions (all pseudoscience relies on this fallacy).

Now, just as words are just notation for thoughts, so are symbolic expressions just short notation for longer words. Symbolic notation has the advantage of being language-independent and exact within its previously agreed-upon definitions. It often simplifies things in algebra, arithmetic, and functional analysis, where the work just follows simple steps without decision making and reasoning.

However, when it comes to logic, deduction, and other high-level thought processes, notation gets clumsier and a lot of times harder to understand (there are symbols for "therefore" and statements such as "A implies Β", but the author might not choose to use them). Instead of calculations, you have something that very much resembles formal computer programs, and fewer people are trained to read them fluently.

Think of lawyers: law is written in "English", but most "everyday English" isn't used, because it's ambiguous. Instead, the words are meticulously put together to try to cover all the corner cases and have only one interpretation (so much so that, for a layman, the text is almost incomprehensible). The metaphor is not the best, because in lawmaking there is no rigorous foundation (no true axioms) to rely upon, but I hope you understand the point.

orion
  • Words are indeed notation for thoughts. The trouble is that some mathematical thoughts have such a complicated structure that natural language isn't powerful enough to indicate that structure unambiguously, even if language has a word for each of the simplest elements of those mathematical thoughts. – Rosie F Jan 20 '19 at 15:21
2

Yes. All proofs can be written in words. While some will say that this is because you can use words in certain ways with formal descriptions and such, ultimately the real reason is because all mathematical symbols and statements correspond to written words! Now this isn't to say that things cannot get messy, but for instance take $4 + 5 = 9$. That is a symbolic statement. There is nothing fundamentally wrong with me instead saying that four plus five equals nine. The same could be said with a lot of other statements. Obviously some things will get messy due to lack of proper names, but I think one would be hard pressed to find something that cannot be expressed in words.

However, in the problem 3 example you give the proof does use words. I think the problem here is that you are confusing "proof" with "algebra/symbol manipulation". If you write a proof with nothing but math symbols I wouldn't really call that a proof. Perhaps on stack exchange it might qualify, but seriously to whoever does that - wrap it in a sentence and don't be lazy.

When I was taught proofs there were a few basic rules.

  1. Proofs are a piece of writing. Everything must be complete English sentences.

  2. Never use the word "obvious" or any synonyms. They are filler words and are usually placeholders for "I'm too lazy to do this or have a lack of knowledge".

  3. Never state things in the form "if done repeatedly" or the form "if we continue doing this over and over we obtain". It can create pitfalls if you use the same language with infinite steps rather than finite steps. Instead, say things like "expanding the equation further we obtain" or "integrating three more times we get".

  4. Don't write equations in words verbatim. In other words, if you have $4 + 5 = 9$ don't write "four plus five equals nine". Technically this has no bearing on the validity of the proof, but it's annoying for the reader.

  5. Write in formal language and keep it succinct. Don't go into details about your thought process and how you came up with the proof. Write that separately if wanted, like in a response.

The list has probably evolved for me over time, but I think this is the crux of what you need to make sure you do in a proof. And yes, saying something is obvious is technically alright if the "proof" is a sarcastic response to someone asking for a proof of something truly obvious such as asking for a proof of "4 + 5 = 9" in the context of a proof of a calculus identity. In that case saying it's true because it's assumed to be true in the context of that proof is alright, because you don't have to rebuild the entire foundation of arithmetic when proving that integration by parts is a valid integration formula, not unless you have some unusual desire to do that.

user64742
  • But written words can't express the structures we need to describe formulas. Try to distinguish, using only written words, among $\frac{\sin x}{\sqrt{a+b}}$, $\frac{\sin x}{\sqrt a+b}$, $\sqrt{\frac{\sin x}{a+b}}$, $\frac{\sin x}{\sqrt a}+b$, $\sin\frac x{\sqrt{a+b}}$, $\sin\frac x{\sqrt a+b}$, $\sin\frac x{\sqrt a}+b$. Dodges such as "all over" only get you so far. – Rosie F Jan 20 '19 at 15:38
  • And sometimes "obvious" might even stand for "if we check more thoroughly, it might turn out to be false" ;) – Hagen von Eitzen Jan 20 '19 at 18:15
  • And other times it may mean "obvious". I find it is poor advice to just advocate to never indicate when something is easy or straightforward. In certain situations, it is also terrible writing to present all sorts of routine details; it is not a matter of laziness. – Andrés E. Caicedo Jan 20 '19 at 18:19
  • 1
    "The ratio of sine x by the square root of the sum of a and b", "the ratio of sine x by the sum of the square root of a and b", "the square root of the ratio of sine x by the sum of a and b", "the sum of the ratio of sine x by the square root of a and b", "sine of the ratio of x by the square root of the sum of a and b", "sine of the ratio of x by the sum of the square root of a and b", and "the sum of sine of the ratio of x by the square root of a and b", respectively. – user3482749 Jan 22 '19 at 14:44
  • 2
    I would like to add the style rule: always express numerical values as numerals; save written-out numbers for things like "there are three cases to consider" and avoid "$n$ equals three", which reads unnaturally. If a number is being manipulated mathematically, make it look like a number. (I keep seeing "one" used for the value $1$ in posts, and it reads very oddly!) – timtfj Jan 23 '19 at 02:54
  • 2
    @timtfj I agree. "One" can sometimes even be ambiguous: "if $p$ is a prime, $k$ isn't one". Some writers use "unity", e.g. "roots of unity", which also reads oddly. – Rosie F Jan 24 '19 at 06:43
  • @TheGreatDuck I think that what is good or best style depends on where the proof is published. If it's in a paper in a peer-reviewed journal, perhaps it may omit routine steps -- readers may trust that the routine work is correct. However, I sometimes find myself interested in a math.SE answer & have to copy the proof & work through it step by step to fill gaps in inferences that weren't obvious to me at first reading. Then web sites don't have space-restrictions as tight as those of maths journals; e.g. here, omitting stuff just for space reasons cuts in only at 30000 characters. – Rosie F Jan 24 '19 at 07:18
  • I like this answer and your elaboration. $+1$. – Clayton Jan 25 '19 at 20:27
2

All proofs are words. Display and inline equations are words, phrases, and sentences written using a significantly more precise syntax and semantics than most natural languages, but they're still words.

When you speak to tell your friend, "$x = 2$", do you say words or do you somehow switch to some other mode of communication?

Eric Towers
0

I will prove that "proofs of only words" exist.

Assume not. That is, assume that all proofs must involve more than words. If this were true, then this proof would be impossible, yet it is completed with this sentence.


Another from a textbook somewhere.

All people can be categorized according to some categorization.

Proof by construction: divide all people into two categories, those who believe this statement and those who do not.

  • I like this. But I think that for it to be non-circular, you have to allow untrue proofs to be proofs. Otherwise, it uses its own truth as a prerequisite for working. – timtfj Jan 23 '19 at 02:39