Understanding Metatheory and the Broader Picture of Foundational Set Theory

Question

So I'm trying to put together a clearer picture of what is going on when we study set theory. I'll describe my current picture which I'd appreciate some feedback on, and I'll ask some specific questions as well.

So from the start: if we take a Platonist perspective (which I was taught as the most pedagogically effective philosophy to have when learning set theory) then we assume sets in some way or another exist along with the intuitive properties like membership. Then when we list the ZFC axioms (which can be done via some bootstrapping process without need for sets) and we are just saying that sets satisfy these objects. Using our intuitive mathematical reasoning and the axioms we can develop all everyday mathematics, including mathematical logic. Is it fair to say that this intuitive notion of a set and mathematical reasoning is the `most' meta metatheory?

However, now having developed mathematical logic, using this metatheory we can consider ZFC formally as a mathematical object along with a sequent calculus (which I believe can also be developed purely syntactically without the need for sets) and use results like the completeness theorem to reason about the mathematical object of ZFC. In particular, using the metatheory we can say $ZFC \not\vdash CH$ and $ZFC \not\vdash \neg CH$. This however says that there is no formal proof of $CH$ or its negation. However, if a formal proof is just a mathematical object that is made to faithfully represent our informal notion of a proof within the metatheory, then how do we know that there is no informal proof of $CH$ or $\neg CH$? I guess to show that a formal proof is the same as an informal proof would require us to step outside the current metatheory so that we may talk more concretely about it, but this is not possible as it is the `most' meta, so we just believe that this formal and informal notion of a proof agree?

That's my current picture, but here are a couple of other questions, hopefully not making this post too long.

I am confused about how to know when we are working in a metatheory, and more generally what a metatheory is? For example I can't make sense of this paragraph in the lecture notes of a class I took: "For sets $x_1,...,x_n$, we let $\{x_1,...,x_n\}$ be the set containing exactly $x_1,...,x_n$. (We could prove this exists by induction on $n$, but one then has to ask where this induction takes place. At this stage it would take place in the metatheory (which is fine). Only once the Axiom of Infinity is introduced could we endeavour to prove a corresponding internal version within set theory.)" What is induction in the metatheory here?

Another question is how to make sense of classes. In ZFC they are informally defined and we think of something belonging to a class if and only if it satisfies some logical formula. When we work with classes and say some property of them holds, we are really just saying that if something satisfies the formula defining the class then it has such and such a property? I also understand that in Von Neumann-Bernays-Gödel set theory classes are given a formal existence.

I am quite confused and want to try to get the big picture as straight as possible, one that puts me in a comfortable position to proceed with learning more set theory. I am currently working through Kunen's introduction to independence proofs where there is a strong emphasis on distinguishing between the theory and the metatheory, so any response and even helpful references are really appreciated.

You write: "I am confused about how to know when we are working in a metatheory". The term metatheory refers to the theory you work in; that's exactly what the term means. This is in contrast to the object theory, which is the theory that you study. — Z. A. K., Feb 18 '24 at 05:16

Soundwave · Answer 1 · 2024-02-18T16:27:42.163

I'll take a crack at this.

Any time you make a logical argument, you maybe presume some things that are true (your axioms) and always presume some rules about how you can work with true premises to reach true conclusions (your rules). Together, they form a theory.

Maybe I'm doing some set theory, and I'm working in the theory $\mathsf{ZFC}$. Or maybe I'm doing some arithmetic and I'm working in Peano Arithmetic, $\mathsf{PA}$. But then I get curious, or maybe a little manic, or maybe my name is Hilbert, and I start thinking things like "I wonder if I can be sure the assumptions of my theory don't contradict each other?" or "I wonder if I can prove everything that is true with this theory?" or "My name is Hilbert and I can prove everything with this theory." Then a smart kid by the name of Gödel comes along and shows that actually, you can't. You can encode the syntax and proofs of $\mathsf{PA}$ inside $\mathsf{PA}$ and then show that, provided $\mathsf{PA}$ is consistent, $\mathsf{PA}\not\vdash\ulcorner\mathsf{PA\ is\ consistent}\urcorner$.

It sounds like this all probably isn't news to you. But here's the key point: $$\mathsf{ZFC}\vdash\ulcorner\mathsf{PA\ is\ consistent}\urcorner$$

and trivially so! $\mathsf{PA}$ is intended to model the natural numbers and very basic properties of them, so the natural numbers within $\mathsf{ZFC}$ serve as a model quite readily.
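
To spell out the "trivially so" a little, here is a rough sketch of the argument as it runs inside $\mathsf{ZFC}$. The name $\mathcal{N}$ for the structure and the choice of $0=1$ as the target contradiction are just notation introduced for this sketch, and the details of defining satisfaction for set-sized structures are taken for granted:

$$\begin{aligned}
&\mathsf{ZFC}\vdash\ulcorner\mathcal{N}\models\mathsf{PA}\urcorner, \quad\text{where } \mathcal{N} = (\omega, 0, S, +, \cdot) \\
&\mathsf{ZFC}\vdash\ulcorner\text{if } \mathcal{N}\models\mathsf{PA} \text{ and } \mathsf{PA}\vdash\phi, \text{ then } \mathcal{N}\models\phi\urcorner \quad\text{(soundness)} \\
&\mathsf{ZFC}\vdash\ulcorner\mathcal{N}\not\models 0=1\urcorner \\
&\text{hence}\quad \mathsf{ZFC}\vdash\ulcorner\mathsf{PA}\not\vdash 0=1\urcorner
\end{aligned}$$

The work is all in the first line, and it is exactly the step $\mathsf{PA}$ cannot carry out about itself: $\mathsf{PA}$ has no set $\omega$ to point to and no satisfaction relation for its own full language.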

The takeaway here is that whether or not something is provable is not just a function of its truth, but also of the theory the proof is to be conducted in. Always. Even (especially) if this theory is informal.

When we're proving theorems about a theory, whether explicitly by making statements like the ones above, or implicitly by, say, doing axiomatic set theory and working within a particular object theory like $\mathsf{ZFC}$, the theory the theorems are argued in is called the metatheory. In fields like model theory, semantics, and constructive mathematics, we're often explicit about what our metatheory is. In logic and other axiomatized fields, we're usually at least careful to be precise about what properties it has. In other fields, less so.

We are always working in a metatheory.

What metatheory? Well, it depends, and also often doesn't matter; much of mathematics is not all that precise about it. Here's a selection of properties that are important:

Is your metatheory classical?
Informal status: if unstated, almost certainly yes.

That is, is $P\vee\neg P$ true for all statements $P$? It's very intuitive that this should be true, and used frequently. "Either P or not P. If P, Q. Also if not P, Q. Thus, Q." But it has some nasty properties that lead some fields of math to avoid it. Due to how intuitive it is, it's really easy to footgun yourself by accidentally using it in informal reasoning when you mean to avoid it. It also likes to hide in other seemingly innocuous statements. Informal reasoning tends to just take it as fact.
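
A standard textbook illustration of that argument form (added here as an example, not something from the discussion above): let $Q$ be "there are irrational $a, b$ with $a^b$ rational" and let $P$ be "$\sqrt{2}^{\sqrt{2}}$ is rational."

$$\begin{aligned}
P &\;\Rightarrow\; \text{take } a = b = \sqrt{2}, \text{ so } a^b = \sqrt{2}^{\sqrt{2}} \text{ is rational} \\
\neg P &\;\Rightarrow\; \text{take } a = \sqrt{2}^{\sqrt{2}},\ b = \sqrt{2}, \text{ so } a^b = \sqrt{2}^{\sqrt{2}\cdot\sqrt{2}} = \sqrt{2}^{\,2} = 2 \\
P\vee\neg P &\;\Rightarrow\; Q
\end{aligned}$$

The proof establishes $Q$ without ever saying which pair $(a, b)$ works. That missing witness is the kind of "nasty property" that makes constructively minded fields avoid excluded middle.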

To what extent does your metatheory admit infinities?
Informal status: Usually. Sometimes rejected so that the metatheory has nice properties w.r.t. computation or as a philosophical exercise. Countable infinities crop up here and there informally without concern. Uncountable infinities are very suspicious (but also very rare).

Is your metatheory okay with making arguments about an infinite collection? Generally the bar is "effectively enumerable," as in, in principle you could go through all the elements in a particular order. Then the infinity is tamer, and this becomes the same as asking...

Does your metatheory admit induction?
Informal status: Usually. May be avoided as a philosophical exercise.

Not much more to say here. Talking about unending generalities is useful and done frequently, with only the strictest of finitists rejecting it as a philosophical exercise.

I will mention two theorems that are common to take for granted that do come from metatheoretic induction, because knowing that they are arguments in the metatheory and not provable in their respective theories is an important point of order: the deduction theorem, that if you can deduce $Q$ from assuming $P$ then you can conclude $P\rightarrow Q$, and predicate extensionality, $\forall x\forall y[x = y\rightarrow(\phi(x)\leftrightarrow\phi(y))]$.
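
As a rough sketch of where the metatheoretic induction sits in the deduction theorem, assume a Hilbert-style propositional calculus whose only rule is modus ponens (that choice of proof system, and the letters $\Gamma$, $R$ and $n$, are assumptions made for this sketch). The claim is that $\Gamma\cup\{P\}\vdash Q$ implies $\Gamma\vdash P\rightarrow Q$, and the induction runs over the length $n$ of the given derivation of $Q$:

$$\begin{aligned}
&\textbf{Case 1: } Q \text{ is an axiom or } Q\in\Gamma. && \text{Use } Q\rightarrow(P\rightarrow Q) \text{ and modus ponens.} \\
&\textbf{Case 2: } Q \text{ is } P. && \text{Use } \vdash P\rightarrow P. \\
&\textbf{Case 3: } Q \text{ follows from } R \text{ and } R\rightarrow Q. && \text{By the induction hypothesis, } \Gamma\vdash P\rightarrow R \text{ and } \Gamma\vdash P\rightarrow(R\rightarrow Q); \\
& && \text{use } (P\rightarrow(R\rightarrow Q))\rightarrow((P\rightarrow R)\rightarrow(P\rightarrow Q)).
\end{aligned}$$

Nothing here is a statement in the object language. The variable $n$ ranges over lengths of derivations, which are objects of the metatheory, and the conclusion is a recipe for turning one derivation into another.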

Does your metatheory admit choice?
Informal status: Depends. If you're doing very object level math where the metatheory is implicitly some sort of set theory, like analysis or topology, then usually yes, although the use of choice is generally noteworthy. Otherwise usually not, although category theorists are notoriously bad about this.

This one is mind-bending, and a whole thing. Basically argument forms like "There is an X, consider one such X" or "Consider each X in turn", which are innocuous if you make finitely many such choices, can become a whole other thing if you have to make infinitely many such choices or consider over an infinite set. Much has been written about choice and why one may consider leaving it behind, I will not pretend to be able to summarize it succinctly here.

Those are some of the properties you might consider. Technically there are other non-trivial properties, like for example whether talking about "functions from A to B" even makes sense, but virtually every metatheory takes them for granted so you will rarely worry about them. It otherwise usually doesn't matter exactly what the metatheory is, because the point of much of mathematics is that the metatheory is abstracted away, so you can focus on the object theory. In most cases you will be working in a classical metatheory with induction and infinity, with or without choice. It might as well be $\mathsf{ZFC}$ or $\mathsf{ZF}$, depending. If you're more into modern foundations you might consider the metatheories to be $\mathsf{MLTTW + EM}$ with or without non-constructive choice. If you're really out there you might consider yourself working in a topos with or without enough projectives. It really... just... doesn't matter. Pick your poison. There isn't a best theory, and none of these theories, even unspecified informal ones, are immune to formalization and analysis by yet another metatheory, if you were so inclined. As you might be catching on to...

There isn't a "most" meta theory. No one theory is "best" or "canonical."

So how do we know that the informal theories that we adopt casually can fairly be considered to coincide with our formal ones?

Well, at one point, we didn't! For a time we had $\mathsf{ZF}$, and then choice was discovered when close inspection of informal theories of analysis showed that presuming the reals admitted a well-order was a non-trivial assumption. In the very beginning we had unrestricted comprehension, until we found out that it's inconsistent. But at this point, we've learned a lot about theories of logic and sets, and we know all of known informal mathematics (modulo some sticky points at the limits of category and type theory) can be modelled by $\mathsf{ZFC}$, and furthermore that $\mathsf{ZFC}$ is a comically powerful theory, so identifying informal reasoning and $\mathsf{ZFC}$ is a conservative assumption.

We identify formal and informal proofs because the formal notion of proof in modern mathematics is very strong, and it's a conservative assumption to identify them.

But just as we've been wrong about comprehension and choice in the past, we could be wrong about this too. Indeed, informal reasoning outside of the known reaches of $\mathsf{ZFC}$ is common, and is important to steering how research is conducted. For example, you ask working computer scientists if $\mathsf{P}=\mathsf{NP}$, and they likely have an informal argument (probably) that they are not equal. Most research then takes the flavor of trying to find descriptions of these classes that are precise and tractable enough to distinguish them conclusively.

Once an argument coalesces into something formalizable, it becomes natural to then ask how this relates to other theories with known properties. Can the premises be proven from another theory? If not, is it philosophically sensible to include these premises in our metatheory from here on out? Without answering these questions, an informal argument really fails to be very meaningful. After all, as far as logic is concerned "suppose $\neg\mathsf{CH}$, then $\neg\mathsf{CH}$" is as good an informal proof as any that the continuum hypothesis is false, but it doesn't elucidate anything. Hence why we asked these questions, and discovered it was independent from $\mathsf{ZFC}$, and that both supposing it and its negation have interesting consequences.

We don't know that an informal "proof" of the continuum hypothesis doesn't exist (and I'd argue there are actually some fairly convincing arguments for taking it as true), but we do know it is independent of $\mathsf{ZFC}$, so whatever metatheory that proof may come in, it is stronger than $\mathsf{ZFC}$.

It's tempting to cling onto the informal notion of proof as some truth, but in the echo of every proof is always a metatheory. If that metatheory is philosophically compelling, then you may consider the proof philosophically compelling as well. If you don't, then the proof is perhaps a mere mathematical curiosity (or triviality, as the case may be).

It's metatheories all the way down. Where you decide to stop is a matter between you and the fractal.

And, a humble word of advice from a constructivist: don't try to get smart with $\mathsf{ZFC}$. It's got hands. It's probably already whispering more lies than truths. It'll treat you well.

Classes: An Appendix

Aah yes. Classes. The budding set theorist's first point of metatheoretic subtlety. Here's the deal with classes. If we're playing strictly by the rules, there are no classes. They are merely syntactic abbreviations, as follows: $$\{x\ |\ \phi(x)\}\in\{y\ |\ \psi(y)\} := \exists x[\forall y[y\in x\leftrightarrow\phi(y)]\wedge x\in\{y\ |\ \psi(y)\}]$$ $$x\in\{y\ |\ \psi(y)\} := \psi(x)$$ This is confusing, both because you very quickly build up layers upon layers of nested abbreviations, and because informal arguments start getting a little fast and loose about whether classes are really objects of the theory. But they are just abbreviations.

Yes, classes are just abbreviating statements about the properties of sets. Working with classes is in a very precise sense just working with properties of their elements.
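
As a small worked example of unrolling (an illustration added here, using the two abbreviation rules above with $\phi(x)$ and $\psi(y)$ both taken to be the trivial formulas $x = x$ and $y = y$), consider the statement that the class of all sets is a member of itself:

$$\begin{aligned}
\{x\ |\ x = x\}\in\{y\ |\ y = y\}
&:= \exists x[\forall y[y\in x\leftrightarrow y = y]\wedge x\in\{y\ |\ y = y\}] \\
&:= \exists x[\forall y[y\in x\leftrightarrow y = y]\wedge x = x]
\end{aligned}$$

The last line is an ordinary sentence about sets, equivalent to $\exists x\forall y[y\in x]$: "there is a set containing every set." $\mathsf{ZFC}$ refutes it, so the class-level statement is a perfectly meaningful abbreviation of a false sentence, and at no point did a class have to exist as an object.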

The idea is, in principle, you could unroll those abbreviations and pin down the formalities, and you'd just get theorems about intensely complicated statements not involving classes. You just really don't want to do that. For example, here's a humble extension axiom for $\mathsf{ZFC}$ concerning large sets. With 163 symbols (not including the expansion of equality which is taken as primitive in this metatheory), it's basically illegible without abbreviations. But it can be done.

Now $\mathsf{NBG}$ is just a theory that says, look, if we're serious about classes being "just an abbreviation" and our informal treatment of classes kinda sorta seems like they're sets, surely this should be formalizable. And it turns out it is, and furthermore, it has a nice, expected property: we say $\mathsf{NBG}$ is a conservative extension of $\mathsf{ZFC}$, meaning that if a provable statement of $\mathsf{NBG}$ looks like a statement of $\mathsf{ZFC}$ (or in the case of $\mathsf{NBG}$ even stronger: if it can be interpreted as a statement of $\mathsf{ZFC}$ by the natural interpretation of classes in $\mathsf{ZFC}$), then that statement is also provable in $\mathsf{ZFC}$. This is why informally we consider the theories interchangeable, because there's a very narrow formal sense in which they prove the same things.

Potato · Answer 2 · 2024-02-19T14:50:57.770

Regarding what constitutes the metatheory, Kunen defines it as "basic finitistic reasoning" in his Foundations of Mathematics (in I.7.2) and elaborates in his section III:

We conclude with two additional questions: What exactly is the metatheory? Why is it beyond reproach, as we said above, or even consistent?

For the first question, unfortunately, we cannot say exactly. Roughly, as we said, the metatheory is basic finitistic reasoning about finite objects such as finite numbers and finite symbolic expressions. One could attempt to give a precise definition of exactly what finitistic reasoning is — for example, we could say that it is what can be formalized within the system PRA mentioned above. But if you look at the definition of PRA (or of any other formal system), you will see that to understand the definition, you need to understand already basic finitistic reasoning. That is, starting from nothing, you can't explain anything.

(I think implicit here is the claim that finitistic reasoning is essentially the most minimal, conservative reasoning system possible. So if we use it as the metatheory, even though the reasoning is informal, there will be no doubts about its validity or the validity of our foundational project.)

For the second question, unfortunately, we cannot say exactly. Presumably, the metatheory uses the same reasoning that we use to reason about the world around us, so this "must" be correct or else we would all be insane but of course, that's not a proof. But, we can contrast this question with the question of whether ZFC is consistent. If ZFC turns out to be inconsistent, this may not be relevant to most people outside of pure mathematics. Perhaps the inconsistency lies with cardinals $\gamma$ such that $\gamma = \beth_\gamma$; then this might not even affect many set theorists. Of course, we would have to revise the axioms so as not to generate such cardinals, just as Zermelo had to axiomatize Cantor's informal set theory so as not to produce $\{x: x \notin x\}$. But much of set theory and independence proofs involves cardinals around $2^{\aleph_0}$ and $2^{2^{\aleph_0}}$, and these might survive the revision unscathed. But, if there are inconsistencies in ordinary finitistic reasoning, then we would all have to revise the way we think about the world. This is an interesting question in general philosophy (how do we know we are sane?), but it departs from the philosophy of mathematics.

I also find the following quote from Mileti's Modern Mathematical Logic (Section 8.2) helpful:

If you believe that there exists a real world of true mathematics, then there is no issue. We do normal mathematics in that world, and when we say the “set of axioms of ZFC,” we mean set in that sense, not in the sense of ZFC itself. In other words, the metatheory is the world of normal mathematical practice. Working in that metatheory, we are just building a toy world axiomatized by AxZFC, which is capable of embedding mathematics. Of course, we should not confuse the real world of true mathematics with this toy first-order theory, just like we should not confuse the actual laws of physics with a computer simulation of the laws of physics. Thinking about exotic models of ZFC is certainly fun and interesting, and can give us insight into which other axioms we might adopt, but a first-order theory is not meant to be a perfect reflection of mathematical reality. Instead, it is a playground that we can explore and analyze in order to gain insight about the true nature of mathematics (just as a computer simulation can give us insight into the physical universe).

The simulation analogy to me feels very apt. To explain why, let me digress a bit.

When Kunen discusses recursion theory in his Foundations, he gives a detailed discussion of the Church–Turing thesis, which basically says that the formal notion of computable functions he gives via $\Delta_1$ sets exactly coincides with our informal, pre-theoretic notion of "computable by an algorithm." Of course, one cannot prove this "rigorously." However, Kunen gives two pieces of evidence. First, we've been studying computability for many decades and have no plausible counterexample to the thesis. Second, all the "reasonable" attempts to formalize the informal notion have ended up being equivalent (the $\Delta_1$ description is provably equivalent to the Turing machine one, for example).

Kunen compares this (somewhat humorously) to the "Newton thesis" that the derivative of momentum (that is, the $F$ in $F = ma = \frac{d}{dt} p$) correctly captures our intuitive notion of "force." Again, there is no way to prove this, but our vast experience with physics suggests it is correct.

To the Church–Turing thesis and the Newton thesis, we might add the "ZFC thesis": all normal mathematical practice is formalizable in ZFC. The justification is the same: some combination of experience and the fact that all other proposed formalizations end up being equivalent, or nearly so. We take this as evidence that ZFC is sufficient to "simulate" our real mathematical practice. By "nearly so," I mean that we may quibble about things like the axiom of choice or whether to admit classes, like in NBG, but this doesn't affect the ability to formalize everyday mathematics. I'm also ignoring proposed formalizations that are not set-theoretic, like type-theoretic foundations, but their strength can again be understood in terms of set theories, so again there's nothing essentially new happening.

(Incidentally, the simulation analogy explains why it is silly to worry about "junk theorems" as in this question, at least according to the Platonist view adopted by your question and this answer. To worry about whether $3 \in \pi$, or whether $\pi$ is "actually" a set confuses the real concept of $\pi$ with its representation in ZFC. This is similar to being worried that a computer simulation of colliding billiard balls is telling us that billiard balls "truly are" bits stored in computer memory with properties like "existing at so-and-so address in memory." Just because we can answer mathematical questions by translating them into questions about sets doesn't imply that the objects of those mathematical questions "truly are" sets or have set-theoretic properties; all claims of "junk theorems" are a result of this conceptual confusion and hence wrong.)

For the formalist point of view, see III.2 of Kunen's Foundations and the discussion surrounding the above quote from Mileti's book.

Regarding your specific question about "induction in the metatheory," the author is just flagging that the discussion at that point is taking place in the metatheory. Induction is part of basic finitistic reasoning and so should be unobjectionable when used to justify that the set $\{x_1, \dots, x_n\}$ exists.
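
To make that concrete, here is a sketch of how such an induction could go, assuming only Pairing and Union are available at that point in the notes (this is an illustration, not the lecture notes' own proof):

$$\begin{aligned}
&\textbf{Base } (n = 1): && \{x_1\} = \{x_1, x_1\} \text{ exists by Pairing.} \\
&\textbf{Step } (n \to n+1): && \{x_1,\dots,x_{n+1}\} = \bigcup\,\{\{x_1,\dots,x_n\},\{x_{n+1}\}\} \text{ exists by Pairing and Union.}
\end{aligned}$$

For each concrete $n$ this yields an honest ZFC proof of the sentence $\forall x_1\cdots\forall x_n\,\exists z\,\forall y\,[y\in z\leftrightarrow(y = x_1\vee\cdots\vee y = x_n)]$. But that is a different sentence for each $n$, with length depending on $n$, so "for all $n$" cannot be a quantifier inside a single formula. It is a statement about infinitely many formulas, and the induction over $n$ is carried out by us, reasoning about formulas and proofs from the outside.
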
For a basic discussion of classes, see Jech's Set Theory (Third Millennium Edition), pages 5 and 6.

Although we work in ZFC which, unlike alternative axiomatic set theories, has only one type of object, namely sets, we introduce the informal notion of a class. We do this for practical reasons: It is easier to manipulate classes than formulas.

So it's an eliminable shorthand. In a true formalization you would just use the appropriate formula. (This is explained further in the "appendix" at the end of Soundwave's answer.)

Thanks for the great response, the simulation analogy is almost clicking. If I'm to understand correctly, from the Platonist perspective, the mathematical universe exists in some abstract sense independent of language etc (as do propositions about these objects). Then in axiomatic set theory, platonically believing in only sets, experience shows this is enough to formalise the entire mathematical universe whereby a formal proof, $ZFC \vdash \phi$, coincides with our intuitive notion? So in this formalisation, or interpretation of ZFC, it may be true that $3\in \pi$, but not in our `reality'? — space_kale, Feb 21 '24 at 10:50
I guess I am confused then about where we sit when doing `everyday maths'. Are we in the simulation working with set theoretic encodings of abstract mathematical objects, or are we somewhere else? When one does group theory for example, we require a set and a binary operation satisfying some axioms. How do we know how to work with a binary relation without reducing it to be defined in terms of sets? One could then ask the same thing of all the numbers, and then wouldn't we get set theoretic encodings and junk theorems again? — space_kale, Feb 21 '24 at 11:06
@space_kale Thanks for your comments. For your first question, I agree with what you say. In particular, from the Platonist perspective, it may be true that $3_{\mathrm{ZFC}} \in \pi_{\mathrm{ZFC}}$, where the subscript ZFC stands for "the encoding of this number as a set in ZFC", but it is not true that $3 \in \pi$, where $3$ and $\pi$ are the numbers we talk about in everyday mathematics. — Potato, Feb 21 '24 at 12:29
@space_kale The questions in your second comment are more philosophical, so I can only offer my own opinion. I believe that everyday math typically takes place outside the "simulation" of ZFC, and we only enter the "simulation" when we want to talk about metamathematical issues, or our informal reasoning leads to an obvious error or ambiguity. High school students competently manipulate the real numbers every day without knowing about how to encode them as Dedekind cuts, for example (and knowing about this encoding wouldn't help them at all). — Potato, Feb 21 '24 at 12:32
Groups were defined and used in the 1800's before ZFC existed. So clearly binary relations and the other necessary ingredients can be explained informally/intuitively, perhaps with an informal notion of "set" or "collection," without entering the "simulation" and picking up junk theorems. — Potato, Feb 21 '24 at 12:35
Also, the irrelevance of set theory (beyond the basic definitions) to most of research mathematics, and the fact that most mathematicians probably couldn't state the axioms of ZFC, suggests that they basically all operate outside of the "simulation." — Potato, Feb 21 '24 at 12:42

Julio Di Egidio · Answer 3 · 2024-02-19T16:51:14.783

So from the start: if we take a Platonist perspective (which I was taught as the most pedagogically effective philosophy to have when learning set theory) then we assume sets in some way or another exist along with the intuitive properties like membership.

We might do that, or we might e.g. take a more neutral formalist approach, so no metaphysical import: ultimately, I'd rather contend that the most "pedagogically effective" philosophy (as far as any non-philosophical discipline is concerned, at least) is no philosophy at all, e.g. we might just think of/present any theory, mathematical or otherwise, as a game.

Then when we list the ZFC axioms (which can be done via some bootstrapping process without need for sets) and we are just saying that sets satisfy these objects.

Satisfy these properties, rather. Right, but that does not solve, rather moves the foundational goalpost one step deeper, to the "bootstrapping" process now, namely the bootstrapping of a formal logic to even start writing statements of any (formal) theory.

Here is what I think is a wonderful introduction to what that really looks like starting from a blank page. Notice that it is about classical logic then standard set theory, which are not the only possible logics and set theories, but the method is what matters (he eventually proceeds to mathematics for physics, which is the goal of the course, but that is beside our point): Frederic Schuller, Lectures on Geometrical Anatomy of Theoretical Physics (YouTube), lectures 01 and 02 in particular.

Using our intuitive mathematical reasoning and the axioms we can develop all everyday mathematics, including mathematical logic. Is it fair to say that this intuitive notion of a set and mathematical reasoning is the `most' meta metatheory?

No: formal logic (mathematical or otherwise, mathematical is an application) is a pre-requisite to writing any theory: it is the (formal) language in which the theory is written (see my notes above), and it comprises (simply put) logical symbols and rules of inference.

That said, and to the crucial point of your question, it is not meta-theoretic that is relevant to the foundational problem/construction, it is pre-theoretic that we are talking about! (Schuller dubs it "pre-formal": indeed I do not think there is a standard terminology for it.)

In simple terms, we must at least be able to distinguish symbols and count to even start talking about any logic, and those are intuitive notions that it is pointless to formalize (which, by the way, would be meta-theoretical: a theory about another theory), since the very formalization endeavour in principle (I mean, to begin with) has to rely on those intuitive notions...

A quick clarification: If I understand correctly, you are using "pre-theoretic" to mean basic finitistic reasoning. This is what the question asker is calling the "metatheory." That is the terminology Kunen's book (referenced in the question) uses. — Potato, Feb 19 '24 at 19:40
You might call it "basic reasoning", finitistic or not, but it's pre-theoretic, and it's just not the same thing as meta-theoretic: in particular, and as I have said, it becomes meta-theoretic if you start theorising about it... and I do not think Kunen is saying otherwise, but I am not particularly an expert on Kunen. — Julio Di Egidio, Feb 20 '24 at 12:01
