
I wondered whether notation in mathematics can lead to actual mistakes in proofs and deductions, or whether notation is just a matter of style that doesn't matter logically. I read that there are formal rules for introducing new notation, which can be found here: How could we formalize the introduction of new notation? or https://en.wikipedia.org/wiki/Extension_by_definitions.

I assume that these rules are the way they are because using arbitrary notation can actually lead to mistakes, which is why new notation is restricted to those rules. However, I can't really come up with examples where it might lead to mistakes, and I haven't seen this discussed in any of my lectures so far. I have always seen notation as a matter of the author's style which doesn't make a difference, but since reading about those rules I am quite confused when I see notation and often wonder whether that notation can actually do harm.

Something that comes to my mind is the following. In Peano arithmetic, there exists an object $S(x)$ for all $x \in \mathbf{N}$, called the successor of $x$. Suppose $n$ is a natural number; then one could (informally, since this follows none of the rules above) suppress the dependency on $n$ and call this element $m$, or in short: $m:=S(n)$. All one does here is assign a new name to an existing object, but by doing so one suppresses a property of it. This is done when defining $1:=S(0)$, for example. Something that is also done sometimes is introducing a new symbol for an object that is not unique. Again, this is not captured by the rules above. There are probably more examples, but I hope this already illustrates what I am thinking about. Also, I have read that "$:=$" is informal and not allowed formally; why is that? I feel like this might be connected.

In short, should I think about notation or is it rather irrelevant, besides being a good way to increase intuition and understanding? Can it ever lead to mistakes? Thanks!

  • There are MUCH better issues to think about... but hey... it is your call. – David G. Stork Apr 04 '22 at 21:17
  • Although your Question focuses on the introduction of new names, you might be interested in reuse of names for related but technically inconsistent concepts, or what is commonly called abuse of notation. One of course wants to avoid errors and typically an author calls attention to such reuse for that reason. – hardmath Apr 04 '22 at 21:32
  • @DavidG.Stork It is not that I actively picked that issue, but rather that it came to my mind naturally and I couldn't get rid of it yet, probably because I didn't have a good answer. If I could I totally would! – user1578232 Apr 04 '22 at 21:42
  • Oh yes, most definitely. Careless notation can conceal reasoning errors. Abuse of notation is likewise, although this is more often valuable/useful. – ryang Apr 05 '22 at 03:59

3 Answers


Here's an example: We introduce a notation for the sum of finitely many numbers $a_1,a_2,\ldots, a_n$ as $$ \sum_{k=1}^na_k.$$ We introduce a strikingly similar notation for series: $$ \sum_{k=1}^\infty a_k,$$ but this is a totally different kind of beast. For example, $$ \sum_{k=1}^na_k=a_n+\sum_{k=1}^{n-1}a_k,$$ whereas $$ \sum_{k=1}^\infty a_k=a_\infty+\sum_{k=1}^{\infty-1}a_k$$ makes no sense. (Not to mention that the sum is always defined in a ring, allows permutation of the summands, and whatnot).
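To make the distinction concrete, here is a minimal sketch (my own illustration, not part of the answer above; the helper names are made up): the finite sum really does satisfy the peel-off-the-last-term recursion, while a series can only be approached as a limit of partial sums, with no last term $a_\infty$ to peel off.

```python
# Minimal sketch: finite sums vs. series (illustrative helpers, not a library API).

def finite_sum(a, n):
    """sum_{k=1}^n a(k), written so the recursion sum = a_n + (sum up to n-1) is explicit."""
    if n == 0:
        return 0
    return a(n) + finite_sum(a, n - 1)

def series(a, tol=1e-12, max_terms=10**6):
    """Approximate sum_{k=1}^infty a(k) as a limit of partial sums; there is no a_infty."""
    total = 0.0
    for k in range(1, max_terms + 1):
        term = a(k)
        total += term
        if abs(term) < tol:
            break
    return total

a = lambda k: 1 / 2**k
print(finite_sum(a, 10))  # 1023/1024 = 0.9990234375
print(series(a))          # partial sums approach 1
```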

Another example: We introduce the notation $$f(n)=O(g(n))$$ as a shorthand for "$f$ does not grow faster than $g$", or formally: $$ \exists c\in\Bbb R\colon \exists m\in\Bbb N\colon \forall n>m\colon |f(n)|<c|g(n)|$$ In spite of the equals sign being part of the notation, this must not be treated as an equality! Normally, from $a=b$ and $c=b$, we can conclude $a=c$, but from $$ n^2=O(2^n)\quad\text{and}\quad 2n^4=O(2^n)$$ you should not conclude $$ n^2=2n^4.$$
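For the Big-O example, a small numerical check (my own sketch; the function name and the cutoff values $c$ and $m$ are just illustrative choices) shows how both statements hold with suitable witnesses while the two left-hand sides are clearly different functions.

```python
# Minimal sketch: Big-O is a one-way relation, despite the '=' sign.

def is_big_o_witness(f, g, c, m, n_max=1000):
    """Numerically check |f(n)| < c*|g(n)| for all m < n <= n_max (evidence, not a proof)."""
    return all(abs(f(n)) < c * abs(g(n)) for n in range(m + 1, n_max + 1))

f1 = lambda n: n**2
f2 = lambda n: 2 * n**4
g  = lambda n: 2**n

print(is_big_o_witness(f1, g, c=1, m=4))    # True: n^2 = O(2^n)
print(is_big_o_witness(f2, g, c=1, m=17))   # True: 2n^4 = O(2^n)
print(f1(10), f2(10))                       # 100 vs 20000 -- so n^2 != 2n^4
```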

  • Thanks, I especially liked the second example. Do you have any advice for avoiding such mistakes? I guess one just has to be aware of the context and then one naturally does the "right" thing, but I don't know if that is too reliable, actually. This is particularly important since one introduces new notation that isn't covered by rules all the time, as far as I can see. – user1578232 Apr 04 '22 at 21:39
  • @user1578232: In the context of Big O notation, it helps to be familiar with the more generalized use of the notation. For example, we might informally write $ \sin x = x - \frac{x^3}{3!} + O(x^5) $, even though formally, $ O(x^5) $ is a set and shouldn't appear in that context. The formal notation for that is much more tedious and circuitous, however. So O(whatever) informally means "eh, it looks like (whatever) but I don't feel like writing out all the details," but at the same time it has a formal definition, which can be extended to usage like this one. – Kevin Apr 05 '22 at 07:10
  • Of course, $f=O(g)$ should really be $f\in O(g)$, but people often write the former. – J.G. Apr 05 '22 at 10:15

There is no way to define notation without using other notation or something even less formal. Similarly, in logic, there is a meta-logic that you use to prove things within a fixed logical system. This meta-logic comes from nowhere, just as much of our notation comes from nowhere and has no formal definition.

Notation should be chosen to minimise misconceptions, highlight intuition, avoid confusion and make writing and reading mathematical ideas quick. Sometimes these choices conflict with each other.

You have given one common example, elision of variable dependencies in a choice of variable name, which is sometimes needed but can cause mistakes. In analysis, say, we might choose an $\epsilon$ and then, based on that choice, find a $\delta$ such that... It can sometimes be helpful to write $\delta_\epsilon$ to reinforce that it is chosen differently for different $\epsilon$, but this is not often done in analysis, so that we don't end up with nested subscripts as in $\exists p\in E:\exists \epsilon_p>0:\forall \delta>0:\exists x_{p,\epsilon_p,\delta}\in E:|x_{p,\epsilon_p,\delta}-p|<\delta\land |f(x_{p,\epsilon_p,\delta})-f(p)|\ge\epsilon_p$ (the definition of "$f:E\rightarrow \mathbb{R}$, $E\subseteq \mathbb{R}$, is not continuous").


One notation that persistently caused me confusion comes from $\lambda$-calculus, where concatenation means function application, so $fx$ means $f(x)$. The whole point of $\lambda$-calculus is that you can apply any term to any other term, so you get lots of nested applications. The operation is not associative, because $f(g(x))\neq (f(g))x$ (something you don't have to think about in most other contexts as $f,g$ and $x$ would be different types of objects). However, you still elide the brackets, so $fgx=(fg)x=(f(g))x$, not $f(g(x))$. When you start writing terms like $(\lambda abcd.dcba)\boldsymbol{\Omega iii}\equiv(\lambda abcd.dcba)((\lambda x.xx)(\lambda x.xx))(\lambda x.x)(\lambda x.x)(\lambda x.x)$, rather than $((((\lambda a.(\lambda b.(\lambda c.(\lambda d.((dc)b)a))))((\lambda x.xx)(\lambda x.xx)))(\lambda x.x))(\lambda x.x))(\lambda x.x)$ (there'll definitely be a mistake or two in there somewhere...), you come to see why the notation we use is quicker to write and easier to read. Nonetheless, the notation would regularly cause me to make errors where a function took two inputs, the first of which was a function, and I would try to apply the first input to the second input, rather than passing them as the two inputs to the function.
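Here is a small sketch of the same pitfall in Python (my own illustration; the names `f` and `inc` are made up): lambda-calculus application $fgx$ corresponds to `f(g)(x)`, and grouping it as `f(g(x))` passes a value where a function was expected.

```python
# Minimal sketch: "f g x" means (f g) x, i.e. f(g)(x), not f(g(x)).

f = lambda g: lambda x: g(g(x))   # expects a function g, then a value x
inc = lambda x: x + 1

print(f(inc)(3))        # (f g) x: applies inc twice -> 5

try:
    f(inc(3))(10)       # f (g x): hands the number 4 to f where a function was expected
except TypeError as err:
    print("wrong grouping:", err)   # 'int' object is not callable
```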


Another example is where notation is chosen intentionally to draw a connection between certain properties, but this cannot be taken too far. We define exponentiation, $e^x$, first for natural numbers $x\in\mathbb{N}$, and then extend this to real numbers $x\in\mathbb{R}$ and eventually to all complex numbers $x\in\mathbb{C}$. It holds for any of these cases that:

$$ e^x=\sum_{n=0}^\infty\frac{x^n}{n!}$$

For $x\in\mathbb{N}$, this is a discovery: we define $e=\lim_{n\rightarrow\infty}(1+\frac{1}{n})^n\approx 2.71828$ or however else you like, and $e^2=e\times e$ and $e^3=e\times e\times e$ and so on, and then discover that the above equality holds. For $x\in\mathbb{C}$, however, this may well be our starting definition. But if you want to prove properties of $e$, you had best know what definition you are starting with to avoid circularity.

Beyond this, we can start defining things like $e^{\left[ \begin{array}{cc} 1&2\\ 3&4 \end{array} \right]}$, the "exponential" of a square matrix, by the infinite sum definition... but replacing $x\in\mathbb{C}$ with a matrix. Here, we will get ourselves into real trouble if we do not update our intuition about exponentials: in $e^x=\sum_{n=0}^\infty\frac{x^n}{n!}=\lim_{n\rightarrow\infty}(1+\frac{x}{n})^n$, the symbol $1$ must now be read as the identity matrix for the formula to make sense at all, and familiar rules such as $e^{x+y}=e^xe^y$ fail in general, because matrices need not commute.
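A quick numerical sketch of both points (my own illustration, assuming NumPy and SciPy are available; `exp_series` is a made-up helper): the truncated series reproduces `scipy.linalg.expm`, but $e^{A+B}=e^Ae^B$ already fails for two non-commuting $2\times 2$ matrices.

```python
# Minimal sketch: the series definition of the matrix exponential, and one way
# scalar intuition breaks: exp(A+B) != exp(A) exp(B) when A and B do not commute.
import numpy as np
from scipy.linalg import expm

def exp_series(A, terms=30):
    """Truncated sum_{n=0}^{terms} A^n / n! for a square matrix A."""
    result = np.zeros_like(A, dtype=float)
    power = np.eye(A.shape[0])   # A^0, the identity matrix (the role "1" plays here)
    factorial = 1.0
    for n in range(terms + 1):
        result += power / factorial
        power = power @ A
        factorial *= n + 1
    return result

A = np.array([[1.0, 2.0], [3.0, 4.0]])
B = np.array([[0.0, 1.0], [0.0, 0.0]])

print(np.allclose(exp_series(A), expm(A)))          # True: the series matches expm
print(np.allclose(expm(A + B), expm(A) @ expm(B)))  # False: A and B do not commute
```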

A.M.

Here is a frequent source of errors, especially under the tag Boolean algebra on this site.

In Boolean algebra, it is common practice to use the notation $x+y$ and $xy$ for the addition and product of $x$ and $y$, respectively, and $\overline x$ for the negation of $x$. This is not a bad notation, but it becomes ambiguous when you write $\overline x\overline y$. It is actually mostly a $\LaTeX$ problem, easily solved by writing $\bar x$ ($\texttt{\bar x}$) instead of $\overline x$ ($\texttt{\overline x}$), but even so, $\bar x \bar y$ is frequently confused with $\overline{xy}$.
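A tiny truth-table check (my own sketch) makes the danger explicit: $\bar x\,\bar y$ and $\overline{xy}$ are different Boolean functions; by De Morgan, $\overline{xy}=\bar x+\bar y$.

```python
# Minimal sketch: (not x)(not y) versus not(xy) over all Boolean inputs.
from itertools import product

for x, y in product([0, 1], repeat=2):
    bar_x_bar_y = (1 - x) * (1 - y)   # \bar{x}\bar{y}
    bar_xy      = 1 - x * y           # \overline{xy}
    print(x, y, bar_x_bar_y, bar_xy)

# The columns differ at (0, 1) and (1, 0): the two expressions are not the same function.
```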

J.-E. Pin