8

A minor technical issue with ZF (and other set theories such as Morse-Kelley) is that if one isn't careful, the axioms will admit a degenerate model, in which there are no sets at all. The axiom of pairing is typical here:

If $x$ and $y$ are sets, then $\{x, y\}$ is also a set

If there aren't any sets, then the axiom is vacuous.

Some presentations get around this with an explicit axiom of the empty set:

$$\exists x.\forall y. y\notin x$$

From this, we know there's at least one empty set; from extensionality we can prove that it's unique, and this justifies assigning a symbol $\varnothing$ to it. (Kelley's original presentation of MK has instead an axiom “there exists a set”, and this, plus specification, is enough to prove the existence of the empty set.) Then pairing and union and specification get us a universe of other sets. Fine.
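
As a sketch of how “there exists a set” plus specification yields an empty set (the names $a$ and $e$ are only illustrative): given any set $a$, apply specification to the always-false condition $x\neq x$ to obtain

$$e = \{x\in a : x\neq x\},\qquad\text{that is,}\qquad \forall x.\,\bigl(x\in e \leftrightarrow (x\in a\land x\neq x)\bigr).$$

Since no $x$ satisfies $x\neq x$, it follows that $\forall x.\,x\notin e$, so $e$ is empty.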

But many presentations omit the axiom of the empty set. Instead, they claim, the axiom of infinity asserts the existence of a set $\omega$, and with specification we can get $\varnothing$ as a subset of $\omega$.

But it seems to me that this approach doesn't actually work. The axiom of infinity states: $$\exists S. (\varnothing\in S) \land (\forall y\in S. y\cup\{y\}\in S)$$

If we're using the axiom of infinity to prove the existence of the empty set, then at this point the symbol $\varnothing$ doesn't refer to anything, and is meaningless, and we have no business using it in the axiom.

The symbol “$\varnothing$” also appears in the axiom of regularity, but there it isn't so problematic, because it only appears in the context “$x=\varnothing$”. By this we actually mean “$\lnot\exists y. y\in x$”, so the issue is merely sloppy notation. But for the axiom of infinity the problem is deeper and it seems to me it can't be fixed notationally, because the axiom of infinity demands that “$\varnothing$” denote an actual object.
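
To make the notational point concrete, here is a sketch assuming one common formulation of regularity (presentations differ in the exact wording):

$$\forall x.\,\bigl(x\neq\varnothing \to \exists y\in x.\ y\cap x=\varnothing\bigr)$$

Reading “$x\neq\varnothing$” as “$\exists y.\,y\in x$” and “$y\cap x=\varnothing$” as “$\lnot\exists z.\,(z\in y\land z\in x)$” gives

$$\forall x.\,\bigl((\exists y.\,y\in x)\to \exists y.\,(y\in x\land\lnot\exists z.\,(z\in y\land z\in x))\bigr)$$

in which neither $\varnothing$ nor $\cap$ appears, and no object named $\varnothing$ is required to exist.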

Both the Wikipedia presentation of ZF (which claims to follow Kunen) and the Mathworld presentation (claims to follow Jech) have this defect.

I'm not trying to suggest that set theory itself is flawed; the issue is a minor technical one and is easily resolved by including an axiom of the empty set or even an axiom that asserts the existence of some set. My question is whether the typical presentation of set theory is defective. Or perhaps there is no defect and my understanding is deficient?

MJD
  • 5
    Isn't $\varnothing \in S$ the same as $\exists X (X\in S \wedge X = \varnothing)$? And you agreed that $X = \varnothing$ is not problematic. – Linear Christmas Jan 02 '22 at 12:42
  • 1
    I suppose that works. If you do that you've embedded the axiom of the empty set inside of the axiom of infinity, because (expanding “$X=\varnothing$”) you have said $\exists X(\ldots\land \lnot \exists y(y\in X))…$. – MJD Jan 02 '22 at 12:47
  • Yes, that's right. Very specifically, we claim that there is a set without elements that exists as part of another set (for example, $\omega$). One may then show that the empty set exists also by itself via specification, so one may omit the "there exists a(n empty) set" axiom. A presenter chooses between omitting the axiom, or keeping the small redundancy, or keeping it only to remove it after commentary, or adding it back when looking at some set theory without infinity, etc. – Linear Christmas Jan 02 '22 at 13:10
  • 2
    I think you may also replace the "$\varnothing \in S$" part with simply "$S \neq \varnothing$". So an axiom of infinity in the form $$\exists S(\exists X(X\in S)\wedge \forall y (y \in S \Rightarrow \exists u (u \in S \wedge y \subsetneq u))).$$ You shall have to do more work this way in the beginning to develop usual stuff, and I have not put in the work myself, so could be mistaken. – Linear Christmas Jan 02 '22 at 13:31

3 Answers

6

If you remember correctly, the symbol $\varnothing$ does not appear in the language of set theory. It is syntactic sugar. In reality it masks the following:

$$\exists z(\forall w(w\notin z))$$

In particular, it posits the existence of a set which is empty. The same goes for $\cup$ and $\{y\}$ that you've used in the statement of the axiom of infinity.

But, and here's the thing, at the end of the day, mathematics, including set theory, is done by humans for other humans. So we need this syntactic sugar, otherwise everything becomes overly formal and annoying, which makes it also somewhat inaccessible for newcomers.

Asaf Karagila
  • 3
    In general I agree with your point about overly formal and annoying. But the original purpose of ZF was to put set theory on a firm foundation after the catastrophic failure of naive set theory around 1905. So this is one of the rare places where it is important to not cut any corners. I was hoping for a more substantive response from you in particular. – MJD Jan 02 '22 at 12:50
  • 5
    What kind of substantive response do you want? You're using syntactic sugar to mask exactly that bit you're complaining about. There were some very significant exploits uncovered in the last few years. It's more important to people to know they were fixed, than to know how they work. If you truly care, you'll spend the necessary two drinks to think about it and figure it out by yourself. – Asaf Karagila Jan 02 '22 at 12:57
  • 1
    I'll try to pose a shorter and more coherent question, thanks. – MJD Jan 02 '22 at 17:28
  • 2
    The substantive response I was hoping for was a statement like “Yes, this is a technical error, although an unimportant one” or “No, this is not a technical error, because you have misunderstood the following: …”. But I thought it over more carefully as you suggested, looked to see how Jech handled it, and came to the conclusion that it is a genuine error. Thanks for your help. – MJD Jan 08 '22 at 17:35
  • 1
    @AsafKaragila, I think you misunderstood what bothers MJD. Yes, the "same" goes for \bigcup, but we have the axiom of union, ensuring that \bigcup A exists (is a set) whenever A exists. The "same" goes for {x}, but we have the axiom of pairing, ensuring that {x} exists whenever x exists. MJD's thesis (which is wrong, but for a completely different reason which I'll address in a separate comment) is that in the same tone, we should have the axiom of empty set to justify our usage of \emptyset--which in fact is given explicitly by some authors, though obviously not by all. – Veky Jan 09 '22 at 22:55
  • 1
    @Veky: No, the missing point here is that with the exception of Infinity, all the axioms start with $\forall$, meaning they will vacuously be true in the empty structure. – Asaf Karagila Jan 10 '22 at 00:41
  • Well, yes, but "the same goes" for \bigcup A once you have A too, and then you need the axiom of union in order to justify that piece of notation. The \emptyset is the same (at the first glance), only at the leafs of your proofs instead of internal nodes. – Veky Jan 10 '22 at 07:24
  • @Veky: I'm not sure why you keep bringing up unions. The axiom starts with a universal quantifier. – Asaf Karagila Jan 10 '22 at 08:24
  • 1
    @Asaf Those are two completely separate points. Imagine there is a "complement" operation in there. You could also say that "a^c" is simply "positing an existence of a set x such that "\forall y(y\in x\leftrightarrow y\notin a)", but it would make no sense, you'd still have to have an "axiom of complements" to do that. Alternatively, imagine a constant symbol "V" being there. You could also say that "V" is simply "positing an existence of a set x such that "\forall y(y\in x)", but it would make no sense, you'd still have to have an "axiom of universal set" to do that. – Veky Jan 10 '22 at 17:25
  • I hope I've shown you that "whether an operation presupposes any sets" and "whether an operation always gives a set" are two orthogonal questions. Whether the axiom is a Pi or a Sigma sentence is independent of whether the axiom holds. – Veky Jan 10 '22 at 17:27
  • 1
    @Veky: I can't follow your logic, I'm sorry. (Also, *please* use MathJax in the comments. I can compile $\rm\LaTeX$ code in my head, yes, but it's not as convenient for reading as you'd like your readers to be.) – Asaf Karagila Jan 10 '22 at 19:11
  • @AsafKaragila Re your comment: What did you mean by "There were some very significant exploits uncovered in the last few years"? Could you provide a reference of some kind? And especially, what is meant by "in the last few years"? :-) I am happy to open a beer and try to think by myself, but the trouble is I do not know what is being referred to. Thank you! – Linear Christmas Nov 19 '22 at 16:40
  • 1
    @LinearChristmas: I meant things like the heart bleed exploit, the openssl one, etc. – Asaf Karagila Nov 19 '22 at 18:05
  • @AsafKaragila Oh! It was a semi-tangential metaphor. Silly me... I feared that, somewhere, someone clever discovered problems with syntactic sugar in set theory... ;-) – Linear Christmas Nov 19 '22 at 18:11
6

This is similar to Asaf Karagila’s answer:

Let’s start more generally: Whenever we have some formula $\phi$ with free variable $x$ in some language and we can prove $\exists! x. \phi$ in some theory, we can extend the language by a new constant symbol $c$ (representing the unique $x$ making $\phi$ valid) and add the axiom $\forall x. (x = c \leftrightarrow \phi)$ to get a (basically) equivalent theory that allows us to talk about $c$.

To demonstrate this equivalence, we need to provide a way to transform formulas in the extended language (i.e. containing $c$) into formulas without $c$ such that the new theory proves a sentence with $c$ if and only if the original theory proves the transformed sentence without $c$.

We do this by induction over the structure of formulas. I won’t do everything here, but for a relation symbol $R$, we replace $R(c, t_1, \dots, t_n)$ (with terms $t_1, \dots, t_n$ not containing $c$, say) by $\forall x. (\phi \rightarrow R(x, t_1, \dots, t_n))$. You can do very similar things if $c$ is hidden deeper inside a term or appears in multiple positions.

Note that you can perform the transformation from the last paragraph even when you don’t have a proof of $\exists!x. \phi$ (yet). This is what happens in the presentation of the axiom of infinity. You perform this transformation with $\phi := \forall y. \neg(y \in x)$ and $c := \varnothing$. Your axiom becomes $$ \exists S. (\forall x. ((\forall y. \neg(y \in x)) \rightarrow x \in S)) \wedge (\forall y \in S. y \cup \{ y \} \in S) $$ (where technically, you would need to perform a very similar process for $\{ y \}$ and $\cup$ as well). This is now a perfectly fine axiom (not using $\varnothing$) that still ensures the existence of a set (an infinite set even). From this, you can prove the existence of the empty set (i.e. the necessary sentence $\exists! x. \phi$), justifying the introduction of the symbol a posteriori.
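
As a sketch of what the fully expanded axiom might look like, here the same universal-style translation is applied to $y \cup \{ y \}$ as well (the bound variable names $z$ and $w$ are illustrative, and other equivalent expansions are possible):

$$ \exists S. \Bigl(\forall x. \bigl((\forall y. \neg(y \in x)) \rightarrow x \in S\bigr)\Bigr) \wedge \Bigl(\forall y. \bigl(y \in S \rightarrow \forall z. \bigl((\forall w. (w \in z \leftrightarrow (w \in y \vee w = y))) \rightarrow z \in S\bigr)\bigr)\Bigr) $$

Only $\in$, $=$ and logical symbols remain. In this form the successor clause says something about a given $y$ only when a $z$ with exactly the elements of $y$ together with $y$ itself exists, which pairing and union guarantee.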

Of course, the introduction of the notation is somewhat circular if done this way, but from a purely logical point of view, nothing circular is going on.

Eike Schulte
3

You're missing a bit of context: in which logic are you establishing your ZF presentation? It is usually done in first-order logic, where you can prove the logical theorem $\exists x(x=x)$, and therefore "there exists a set", which you can use for further constructions. For example, in Hilbert's calculus, the formula is just a shortcut for $\lnot\forall x(x\ne x)$, and it is easy to prove (by contradiction) using universal instantiation.
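
A sketch of that argument, assuming a Hilbert-style system with equality in which $\forall x(x=x)$ is available as an axiom (the exact axioms vary between textbooks):

$$\begin{aligned}
&1.\ \forall x(x\ne x) &&\text{assumption, for contradiction}\\
&2.\ x\ne x &&\text{universal instantiation from 1}\\
&3.\ \forall x(x=x) &&\text{equality axiom}\\
&4.\ x=x &&\text{universal instantiation from 3}\\
&5.\ \lnot\forall x(x\ne x) &&\text{since 2 and 4 contradict each other}
\end{aligned}$$

Line 5 is $\exists x(x=x)$ once the abbreviation is unfolded. The instantiation steps are where the nonempty domain is built in: a variable can always be instantiated, whether or not the language has any constant symbols.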

Of course, there are logics (such as inclusive logic) which do allow for empty domains of discourse, but those are usually more complicated (i.e. have more special cases) and surely not "default" for presentation of ZF. In such logics, "there exists a set" must be taken as a non-logical axiom, since it doesn't follow logically. But if you do that, the question arises, why not simply use first-order logic, if you're postulating away the only option this more complicated logic gives you? :-)

Couchy
Veky
  • 1
    Thanks. Why do you think $\exists x(x=x)$ implies that there must be a set? Logic doesn't talk about sets. For that you need some axioms of sets. It is not hard to imagine a model in which one has an entire Von Neumann universe of proper classes, none of which is a set. – MJD Jan 10 '22 at 00:12
  • 2
    @MJD That might be at least a superficial gap in the case of MK, where the objects of the domain are classes, but in the ZF case, they are sets, so it does suffice to have $\exists x (x=x)$ as a logical axiom to conclude "there is a set" within that paradigm. (But I wouldn't agree with this post that "first order logic" necessarily disallows empty domains... in my experience, both conventions are used, though the convention that disallows is somewhat more common.) – spaceisdarkgreen Jan 10 '22 at 01:43
  • 2
    @MJD "Logic doesn't talk about sets". Yeah, logic is flexible this way and there's no reason why the "$\in$" in the language needs to be interpreted as a membership relation on some collection "real" sets (or classes) in the metatheory. But if we're working internally in ZF, then "set" is no more and no less than an informal name we give to the objects of the domain of discourse. That PA talks about "natural numbers", ZF talks about "sets", MK talks about "classes (and sets)", are just cues for the intuition. But then, in ZF, what does "there is a set" mean besides "$\exists x(x=x)$"? – spaceisdarkgreen Jan 10 '22 at 02:15
  • @MJD But then they would be sets for all intents and purposes. In the same way as finite von Neumann ordinals are natural numbers, since they form a model of PA. It is really immaterial that their internal implementation is using something else. Read about Hilbert's idea of "geometry of chimney-sweeps". ;-) – Veky Jan 10 '22 at 07:11
  • @spaceisdarkgreen I think this is just a nomenclature dispute (though I'd like if you can point me to a reputable source that presents first order logic in inclusive way). In any case, the logic in which ZF is being presented, at least by those authors such as Kunen and Jech, surely is meant to not allow empty domains. – Veky Jan 10 '22 at 07:13
  • 2
    @MJD \exists x(x=x) implies "there is a set (equal to itself)" in the same way that \exists x\forall y(y\notin x) implies "there is a set that no set is an element of". How can you understand what domain of discourse is in one case, and not in the other? :-) – Veky Jan 10 '22 at 07:18
  • @Veky Yes, definitely just a nomenclature dispute (and I'd acknowledge that "somewhat more common" is a bit of an understatement, particularly regarding older texts). I don't believe either Jech or Kunen give a development of 1OL semantics... Kunen lists "set existence" as an axiom but acknowledges that "usually" it is a consequence of the logic, thus redundant to state. But in my experience, set theorists often make a point of e.g. the axiom of infinity disallowing empty universes. But of course this doesn't make much of a difference practically. – spaceisdarkgreen Jan 10 '22 at 15:50
  • @Veky The attitude that nonempty structures should be included is common, particularly in model theory/algebraically minded people... Hodges' Model theory book has them. But then again many model theory texts don't (cf Tent and Ziegler, who disallow, but note that this convention is "annoying") See also https://math.stackexchange.com/questions/45198/whats-the-deal-with-empty-models-in-first-order-logic. – spaceisdarkgreen Jan 10 '22 at 15:51
  • @Veky (The part in my first comment about Jech and Kunen not developing 1OL semantics was just silly... of course they need to do something here to talk about models of set theory, just not right at the beginning where I was looking. Both Jech and Kunen (the old one - 1980) seem to naively allow empty structures, though the newer Kunen book explicitly doesn't... though of course this is unimportant practically since ZF doesn’t have an empty model, regardless. And of course the first sentence of my second reply is a typo... I meant "empty", not "nonempty".) – spaceisdarkgreen Jan 10 '22 at 17:11
  • @spaceisdarkgreen Well, of course model-theory people find it annoying that empty structures are disallowed. :-P But it's always so when you see just one side of the story. Hodges is a nice example of integrity that is required of a mathematician in such a case: see his "Warning" on page 41. He never even hints at a particular "proof calculus", because really all of them either are complete for the class of nonempty structures, or otherwise are completely unexplored. To me, FOL is not really logic if it only has a semantic part developed--if I want a proof-system-challenged logic, I use HOL. – Veky Jan 10 '22 at 18:03
  • 1
    @Veky I have little experience with this, but while I have certainly heard it is "annoying" to use/develop a proof system that is complete/sound wrt FOL with inclusive semantics, I don't think I buy that this is "completely unexplored"... see for instance JDH's answer in the post I linked above or the comments and answer here – spaceisdarkgreen Jan 10 '22 at 20:18
  • @spaceisdarkgreen Thanks for the reference. You have broadened my horizon a bit, but I still think it's not really FOL. Mendelson (and JDH's comments) nicely show what we're missing then: we can only consider relational signatures (no constant nor functional symbols, so the whole term-empire crumbles down), no truth of formulas (only on sentences), no formulas with free variables in proofs, no valuations (as total functions that we know and love)... and the modification of (A4) really seems hacky (formula mustn't be a sentence). But yes, you have answered my question, and thank you for that. – Veky Jan 11 '22 at 11:49