The usual formal approach in textbooks is to treat syntactic classes such as formulas or terms as subsets of the set of all strings (lists) of symbols drawn from a small vocabulary (yes, the set of all lists is a free monoid, but that fact doesn't add much value in this context: you just need to know that you can construct new lists from old by prefixing or postfixing a list with a symbol or by concatenating two lists). The subset is typically defined by saying that it is the smallest set closed under certain constructions.
So, for example, we might have a vocabulary comprising variables $x_1, x_2, \ldots$, a single (binary) operator symbol $+$ together with brackets and commas as punctuation symbols. We could then define the set $\cal T$ of all terms to be the smallest set of strings of these symbols that:
- Contains each string "$x_i$" comprising a single variable.
- Contains "$(t_1+t_2)$" whenever it contains $t_1$ and $t_2$ (here I have taken $t_1$, prefixed it with "(", postfixed it with "+", concatenated the result with $t_2$ and then postfixed that with ")").
Here, for readability, I am letting $\LaTeX$ typeset the strings with spaces around the symbols; these spaces are to be ignored. So "$x_1$", "$x_2$" and "$(x_1+x_2)$" are all terms, as is "$((x_1+(x_2+x_3))+x_4)$", but "$+x_1$", "$x_1+x_2$", "$((x_1+x_2))$" and "$(x_1+x_2)+x_3$" are not.
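If a computational reading helps, here is a minimal Haskell sketch of this string-level definition. It is purely illustrative and not part of the formal development; all the names (`Symbol`, `Str`, `isTerm`, `splitsOnPlus`) are ones I have made up:

```haskell
-- Illustrative sketch only; the names here are mine, not standard.
-- The vocabulary: variables x_i, the operator '+', and the two brackets.
data Symbol = Var Int | Plus | LParen | RParen
  deriving (Eq, Show)

-- A "string" in the formal sense is a list of symbols.
type Str = [Symbol]

-- Membership in the smallest set T: a string is a term iff it is a single
-- variable, or it has the shape ( t1 + t2 ) with t1 and t2 themselves terms.
isTerm :: Str -> Bool
isTerm [Var _] = True
isTerm (LParen : rest)
  | not (null rest), last rest == RParen =
      any (\(l, r) -> isTerm l && isTerm r) (splitsOnPlus (init rest))
isTerm _ = False

-- All ways of writing a string as l ++ [Plus] ++ r.
splitsOnPlus :: Str -> [(Str, Str)]
splitsOnPlus s = [ (take i s, drop (i + 1) s) | (i, Plus) <- zip [0 ..] s ]
```

With this, `isTerm [LParen, Var 1, Plus, Var 2, RParen]` is `True` while `isTerm [Var 1, Plus, Var 2]` is `False`, matching the examples above. The function naively tries every possible split on a "+"; the unique readability property discussed next is what guarantees that at most one split can succeed.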
Typically, formal reasoning about syntactic classes is done at the lowest level by complete induction on the length of the strings. With the above definition, an important property you might want to prove of $\cal T$ is that any term $t \in \cal T$ is either a single variable or has the form $(t_1+t_2)$ for some uniquely determined $t_1, t_2 \in \cal T$. (The definition is very precise about the placement of brackets to ensure this.) Having proved a result like that, you know that every term is either atomic (a variable) or is uniquely represented by combining two sub-terms with the operator symbol (together with some brackets that are just there to make things unambiguous). So you can prove properties of terms at a slightly more abstract level by induction over the structure of a term: i.e., by showing that a property holds of atomic terms and holds of any term if it holds of its sub-terms.
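Continuing the (purely illustrative) Haskell sketch above, unique readability is exactly what lets you write a function that recovers the decomposition of a term. The name `decompose` and the error handling are my own choices, and the function assumes its input really is a term; the top-level "+" is found by bracket counting, since it is the unique occurrence of the operator at nesting depth zero:

```haskell
-- Illustrative sketch, reusing Symbol and Str from above.
-- A term is either a single variable (Left i) or splits as ( t1 + t2 )
-- for exactly one pair of sub-term strings (Right (t1, t2)).
decompose :: Str -> Either Int (Str, Str)
decompose [Var i]         = Left i
decompose (LParen : rest) = go (0 :: Int) [] (init rest)  -- drop the final ')'
  where
    go 0 acc (Plus : r)   = Right (reverse acc, r)        -- '+' at depth 0
    go d acc (LParen : r) = go (d + 1) (LParen : acc) r
    go d acc (RParen : r) = go (d - 1) (RParen : acc) r
    go d acc (s : r)      = go d (s : acc) r
    go _ _   []           = error "decompose: not a term"
decompose _               = error "decompose: not a term"
```

For example, on the string for $((x_1+x_2)+x_3)$ it returns the pair of strings for $(x_1+x_2)$ and $x_3$; structural induction over terms corresponds to recursion through this decomposition.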
As has been suggested in the comments, in an introductory course you may be able to skip over some of these details, e.g., by simply adopting the more abstract point of view, where a term is viewed as a tree with leaves labelled by atomic symbols and nodes labelled by operator symbols (and this is how you would represent syntax in a computer implementation of formal syntax). With such an approach, you would use informal linear representations of the trees like "$(t_1 + t_2) + t_3$", where the brackets are just there to show the order of construction of the tree. You would also very likely adopt conventions for leaving those brackets out, as we do in arithmetic and most programming languages.
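For what it's worth, here is how that tree view might look in an implementation, again as a hedged sketch with made-up names (`Term`, `V`, `Add`, `render`):

```haskell
-- Illustrative sketch only. The tree view of terms: leaves are labelled by
-- variables, internal nodes by the (binary) operator.
data Term = V Int           -- the atomic term x_i
          | Add Term Term   -- the compound term built from two sub-terms
  deriving (Eq, Show)

-- A typical definition by structural recursion: render a tree back into the
-- fully bracketed linear notation used earlier.
render :: Term -> String
render (V i)     = "x" ++ show i
render (Add s t) = "(" ++ render s ++ "+" ++ render t ++ ")"
```

For instance, `render (Add (Add (V 1) (V 2)) (V 3))` produces the string "((x1+x2)+x3)", and every function defined by this kind of structural recursion comes with a matching proof principle: structural induction over `Term`.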
[Disclaimer: I have allowed the set of atomic symbols in my example to be infinite: some authors would object to that and require me to work with a finite vocabulary, so that my infinite set of variables would be defined as strings of symbols of some particular form, e.g., the letter $x$ followed by a string of decimal digits (just as happens in the definitions of programming languages).]