22

The book I am working from (Introduction to Set Theory, Hrbacek & Jech) gives a proof of this result, which I can follow as a chain of implications, but which does not make natural, intuitive sense to me. At the end of the proof, I found myself quite confused, and had to look carefully at the build-up to see how the conclusion followed. I get the steps, now - but not the intuition.


The authors took sets $X$ and $Y$ and assumed injections $f: X \rightarrow Y$, $g: Y \rightarrow X$. Since $X \sim g[f[X]]$ and $X \supseteq g[Y] \supseteq g[f[X]]$, and $Y \sim g[Y]$, the authors went to prove the lemma that $A \sim A_1, A \supseteq B \supseteq A_1 \implies A \sim B$.

For the lemma, they defined recursive sequences $\{A_n\}_{n \in \omega}, \{B_n\}_{n \in \omega}$ by $A_0 = A, A_{n+1} = f[A],$ $ B_0 = B, B_{n+1} = f[B].$ Since $A_0 \supseteq B_0 \supseteq A_1 \implies f[A_0] \supseteq f[B_0] \supseteq f[A_1]$, we get from induction that $A_{n+1} \subseteq A_n$. Putting $\{C_n\}_{n \in \omega} = \{A_n - B_n\}_{n \in \omega},$ they noted that $C_{n+1} = f[C_n]$ (since $A_n \supseteq B_n$ inductively, $f[A_n - B_n] = f[A_n] - f[B_n]$). Putting

$$ C = \bigcup_{n=0}^\infty C_n, \text{ } D = A - C,$$

they noted that $f[C] = \bigcup_{n=1}^\infty C_n$, and that $h(x): A \rightarrow f[C] \cup D$ can be defined by sending $x \in C \rightarrow f(x),$ and $x \in D \rightarrow x$. It is clear that $h$ is one-to-one and onto.

Inductively, for $n > 0$, $C_0 \cap C_n = \varnothing$; it follows that $C_0 \cap \bigcup_{n=1}^\infty C_n = C_0 \cap f[C] = \varnothing$. We know that $C_0 \cup f[C] \cup D$ = A, and since all three sets are disjoint, we may conclude that $f[C] \cup D = A - C_0 = A - (A - B) = B$. Thus, our bijection $h$ maps $A$ to $B$, as we wanted.


What is the intuition here? What were H&J, or Cantor, Bernstein, etc. thinking when they went to prove this - the "high-level" idea? Is there a clearer proof, in the sense of thought process (not necessarily shorter)?

user26857
  • 52,094
Chris
  • 4,865
  • 1
    If I can count all the $Y$s using $X$s and all the $X$s using $Y$s, then there must be the same number of $X$s as $Y$s. – Kevin Carlson Oct 30 '12 at 22:33
  • 2
    Little remark: If you just care about the result, there's a (in my opinion very beautiful) proof described on the wiki page on the theorem. (I think it's clearer. Obviously it's also shorter.) – k.stm Oct 30 '12 at 23:29

6 Answers6

16

$\newcommand{\ran}{\operatorname{ran}}$Here’s one way to think about it. Suppose that $y\in\ran f$; then we can pull $y$ back to $f^{-1}(y)\in X$. If $f^{-1}(y)\in\ran g$, we can pull it back to $g^{-1}(f^{-1}(y))\in Y$. If we continue this pulling back, one of two things must happen: either we reach a dead end at a point of $X$ or $Y$ that can’t be pulled back (because it’s in $Y\setminus\ran f$ or $X\setminus\ran g$), or we don’t.

Let $X_0=X\setminus\ran g$, the set of points of $X$ that cannot be pulled back at all, and let $Y_0=Y\setminus\ran f$. More generally, for each $n\in\omega$ let $X_n$ be the set of points of $X$ that can be pulled back exactly $n$ times, and let $Y_n$ be the set of points of $Y$ that can be pulled back exactly $n$ times. Finally, let $X_\omega$ and $Y_\omega$ by the subsets of $X$ and $Y$, respectively whose points can be pulled back infinitely many times.

At this point a sketch helps; it should show the partitions $\{X_n:n\le\omega\}$ of $X$ and $\{Y_n:n\le\omega\}$ of $Y$, and it should include arrows indicating what parts of $X$ get mapped to what parts of $Y$ and vice versa. To avoid having arrows crossing, I’ve taken $X$ and $Y$ apart in the following diagram.

$$\begin{array}{} X_0&\overset{f}\longrightarrow&Y_1&\overset{g}\longrightarrow&X_2&\overset{f}\longrightarrow& Y_3&\overset{g}\longrightarrow&X_4&\dots&X_\omega\\ Y_0&\overset{g}\longrightarrow&X_1&\overset{f}\longrightarrow&Y_2&\overset{g}\longrightarrow&X_3&\overset{f}\longrightarrow&Y_4&\dots&Y_\omega \end{array}$$

Each of the arrows is a bijection, so I can break up the diagram into $\omega$ self-contained parts. The first two parts are:

$$\begin{array}{} X_0&\overset{f}\longrightarrow&Y_1\\ Y_0&\overset{g}\longrightarrow&X_1 \end{array}\qquad \begin{array}{} X_2&\overset{f}\longrightarrow&Y_3\\ Y_2&\overset{g}\longrightarrow&X_3 \end{array}$$

Ignoring $X_\omega$ and $Y_\omega$ for the moment, I can rearrange the rest of the diagram to give me a bijection from $X\setminus X_\omega$ to $Y\setminus Y_\omega$:

$$\begin{array}{ccc} X_0&\overset{f}\longrightarrow&Y_1\\ X_1&\overset{g^{-1}}\longrightarrow&Y_0\\ X_2&\overset{f}\longrightarrow&Y_3\\ X_3&\overset{g^{-1}}\longrightarrow&Y_2\\ \vdots&\vdots&\vdots\\ X_{2k}&\overset{f}\longrightarrow&Y_{2k+1}\\ X_{2k+1}&\overset{g^{-1}}\longrightarrow&Y_{2k}\\ \vdots&\vdots&\vdots \end{array}$$

Finally, I claim that $f[X_\omega]=Y_\omega$: everything in $X_\omega$ can be pulled back infinitely often, so everything in $f[X_\omega]$ can be pulled back infinitely often, and therefore $f[X_\omega]\subseteq Y_\omega$. On the other hand, if $y\in Y_\omega$, then $y$ can be pulled back infinitely often, so it must be possible to pull $f^{-1}(y)$ back infinitely often, and therefore $f^{-1}(y)\in X_\omega$. Thus, $Y_\omega\subseteq f[X_\omega]$ as well. The diagram above can now be completed to show a bijection from $X$ onto $Y$:

$$\begin{array}{ccc} X_0&\overset{f}\longrightarrow&Y_1\\ X_1&\overset{g^{-1}}\longrightarrow&Y_0\\ X_2&\overset{f}\longrightarrow&Y_3\\ X_3&\overset{g^{-1}}\longrightarrow&Y_2\\ \vdots&\vdots&\vdots\\ X_{2k}&\overset{f}\longrightarrow&Y_{2k+1}\\ X_{2k+1}&\overset{g^{-1}}\longrightarrow&Y_{2k}\\ \vdots&\vdots&\vdots\\ X_\omega&\overset{f}\longrightarrow&Y_\omega \end{array}$$

The bijection is defined piecewise, but that’s no problem.

There are a few details to be filled in to make this a fully rigorous proof, but I think that it does give a reasonable idea of one possible intuition.

Added: Here’s a very rough sketch. Arrows from left to right are (parts of) $f$, and arrows from right to left are (parts of) $g$.

enter image description here

Brian M. Scott
  • 616,228
  • I'm having a hard time following this. Where you note that "at this point, a sketch helps" - what does this sketch look like? How have you taken the sets X and Y apart into their partitions? – Chris Oct 30 '12 at 23:38
  • @user1296727: I thought that the diagrams made that pretty clear; I don’t have very good facilities for doing so, but I’ll try to add a picture in a bit. – Brian M. Scott Oct 31 '12 at 00:00
  • Thank you for the diagram; it helps. When you say something is in $X_n$, do you mean that, by $n$ iterations of alternately pulling it back into Y, and then X, and then Y... you can get to something which can't be pulled back? (Otherwise, I don't see why the functions are bijections.) – Chris Oct 31 '12 at 00:26
  • @user1296727: Yes, that’s exactly right. Each pull-back is from one set to the other, so they must alternate. – Brian M. Scott Oct 31 '12 at 00:28
  • This looks like Eilenberg's Swindle. See http://en.wikipedia.org/wiki/Eilenberg%E2%80%93Mazur_swindle#Other_examples – Baby Dragon Sep 20 '13 at 21:42
11

It is a reasonably long and seemingly complicated proof, but actually the idea is very simple (or at least it is in the proof I've seen). We have injections $f:X\to Y$ and $g:Y\to X$. To build a bijection $h:X\to Y$ we need to send each point in $X$ to a unique point in $Y$, making sure we hit everything.

We can start by saying $h(x) = f(x)$. But of course, this doesn't hit everything if f is not surjective. But , for each element $y$ we don't hit in $Y$ (i.e. the set ${Y\setminus{f(X)}}$) there is a unique element $g(y)$ in $X$ which we can make map to $y$. So we edit $h$ such that now $$h(x) = \begin{cases} g^{-1}(x) & \mbox{if } g^{-1}(x) \notin f(X) \\ f(x) & \mbox{otherwise} \end{cases}$$

So this time we hit everything we didn't get first time from the $g^{-1}(x) $ part. But now we're missing some other parts! (namely the values $f(x)$ where $x \in g(Y\setminus f(X))$ but $f(x) \notin Y\setminus f(X)$.) We can define the sets that we "miss" in each iteration of improving $h$ as $C_n$ and say $C = \displaystyle \bigcup_1^{\infty}C_n$. Then defining $$h(x) = \begin{cases} g^{-1}(x) &\mbox{if } g^{-1}(x) \in C \\ f(x) &\mbox{otherwise} \end{cases}$$

And by thinking of how we built it up, it kind of makes a bit more sense as to why it is a bijection.

Please note that this isn't the most efficient way to carry out the proof, but I explained it in the way that I would come up with it whilst trying to see why it works, rather than what I'd write down formally. Also please note it's been a long time since I've seen the proof and I may have gotten some of the specifics wrong (please anyone correct me if this is the case!), but this is the right general idea (I hope!)

Tom Oldfield
  • 13,034
  • 1
  • 39
  • 77
11

My favorite proof is due to R. H. Cox, and I learned it from the wonderful Handbook of Analysis and its Foundations by Schechter, from which I've taken the illustration below. The basic idea is that an injection from $Y$ to $X$ identifies $Y$ with a subset of $X$. So we can reduce the problem to showing that if we have $Y\subseteq X$ and $f:X\to Y$ is an injection, then $|X|=|Y|$ and this is less confusing. let $C=X\backslash Y$. The sequence of sets $\big(f^n(C))$ is disjoint since $f(X)\subseteq Y$ and $f$ is injective. Let $S=\bigcup_{n=0}^\infty f^n(C)$. Define $g:X\to Y$ by

$$g(x) = \begin{cases} f(x)&\mbox{if } x\in S \\ x & \mbox{if }x\notin S. \end{cases}$$

It is easily verified that $g$ is a bijection.

Remark: According to Kunen's book on the foundations of mathematics, Dedekind already proved this version of the Cantor-Bernstein theorem in 1887.

enter image description here

Michael Greinecker
  • 32,841
  • 6
  • 80
  • 137
  • Yes, this theorem has quite the history. It's buried in a link [maybe in the comments] in one of my answers about the topic. – Asaf Karagila Oct 30 '12 at 23:40
  • @Asaf: I actually found the original reference online. I guess you refer to this comment by Theo. – Michael Greinecker Oct 30 '12 at 23:52
  • This proof is more succinct, but I don't know about clear. Why the heck would I expect your function $g$ to be a bijection? Why would I expect that this continued iteration of $f$ from the image of $X/Y$, over and over again, coupled with $X/Y$, would give me something in bijection with $X \cap Y$?! – Chris Oct 30 '12 at 23:54
  • @Michael: Yes, that's the one. – Asaf Karagila Oct 30 '12 at 23:57
  • @user1296727 Look at the picture. – Michael Greinecker Oct 31 '12 at 00:03
  • 1
    I did...? (Please do be explicit, I'm juggling two proofs at once.) – Chris Oct 31 '12 at 00:04
  • @user1296727 I think it should be clear why $g$ is a bijection. The identity maps $X\backslash S$ to itself and $S$ is constructed so that $f$ is injective on it and $f(S)\subseteq X$. Clearly, every point in $X\backslash f(S)$ is in $X\backslash S$ since $f(S)\subseteq S$. – Michael Greinecker Oct 31 '12 at 00:13
  • Let me rephrase. I get how the proof wraps up - $f(x)$ takes things in $S$ to $Y$, and something not in $S$ was in $X \cap Y$ to begin with, thus the identity mapping suffices, so $g$ is surjective (and obviously injective). What I don't get is, why would I ever expect (starting from scratch) that this whole argument - your post - would get me a bijection? – Chris Oct 31 '12 at 00:19
  • In this setting, you have o make somehow use of the $f$. You do not have to worry about points in $Y$ not being the image under $f$, since you can always apply the identity function there. So apply $f$ only to $X\backslash Y$. But you cannot map the points in $f(X\backslash Y)$ tot hemselves since they are alread in the range of $f$. So apply $f$ again.. The point is not that it is obvious that the result is correct, for this the proof using ordinals works better, but if there is a constructive proof at all, it should look something like this. – Michael Greinecker Oct 31 '12 at 00:27
  • @madmatician Basically, in order to make the proof comprehensible in terms of the picture. The proof doesn'tstrictly require it, the intuition IMHO does. – Michael Greinecker Aug 06 '16 at 23:29
2

As you pointed out, to prove Cantor-Schroeder-Bernstein theorem, one needs to prove the following lemma:

If $A_1 \subset B \subset A$ and $|A_1|=|A|$, then $|B|=|A|$.

According to the hypothesis, there exists some one-to-one mapping $f$ from $A$ onto $A_1$. So, we need a one-to-one mapping $g$ from $A$ onto $B$. How can we find that?

One may think that we should embed the set $B$ in the set $A$ by inclusion so that we find the required mapping as the inverse of the inclusion mapping, that is, the mapping$$h:A \to B \\ h(x)=x.$$This mapping is one-to-one, but it cannot be defined on the whole set $A$ because for $x \in A-B$ $h(x)=x$ is not contained in the set $B$.

One may think that we should use the mapping $f$ as our required mapping, that is,$$k:A \to B \\ k(x)=f(x).$$This mapping is one-to-one, but it cannot cover the whole set $B$ because the range of the function $f$ is the set $A_1 \subset B$ and there may be some $x\in B$ which is not in the set $A_1$.

As we see there are two extreme approaches for finding the required mapping; the first one covers the whole set $B$, and the second one covers the whole set $A$. So, it is intuitively expectable that to find the required mapping we should use both the mappings (approaches) simultaneously so that both the sets $A$ and $B$ are covered in a one-to-one manner.

But, there exists a problem. If we use both the mappings simultaneously, for example $g: A \to B \quad g(x)=\begin{cases}f(x) & \text{if } x\in C; \\ x & \text{if } x \in A-C \end{cases}$ for some subset $C \subset A$, then we may miss either the onto property or the one-to-one property, because the ranges of the pieces of $g$ may overlap.

So, our original problem is reduced to finding some subset $C \subset A$ such that the ranges of the pieces of the mapping are disjoint and the union of them is equal to the set $B$. Now, how to find such a $C$?

Here is an idea. Since the function $h(x)=x$ cannot be defined on $C_0=A-B$, as explained above, let us map this subset of $A$ by the function $k(x)=f(x)$ into the set $B$. So, we obtain the mapping$$g_0(x)= \begin{cases}f(x) & \text{if } x \in C_0; \\ x & \text{if }x \in A-C_0 \end{cases}.$$But, we have missed the one-to-one property because the ranges of the pieces overlap (In fact, $f[C_0]$ is contained in the range of the second one, since $f[C_0] \subset f[A] \subset A-C_0$).

So, we need to remove the problematic points $C_1=f[C_0]$ from the domain of the second piece (since the domain and the range of the function $h(x)=x$ are the same) to retain the one-to-one property. However, since we need to define the mapping $g$ on the whole set $A$, we need to add such points to the domain of the first piece. So, we obtain the mapping$$g_1(x)= \begin{cases}f(x) & \text{if } x \in C_0 \cup C_1; \\ x & \text{if }x \in A-(C_0 \cup C_1) \end{cases}.$$But, we have missed the one-to-one property because the ranges of the pieces overlap (In fact, $f[C_1]$ is contained in the range of the second one, since $f[C_1]=f^2[C_0] \subset f^2[A] \subset A-(C_0 \cup C_1)$).

So, we need to remove the problematic points $C_2=f[C_1]$ from the domain of the second piece (since the domain and the range of the function $h(x)=x$ are the same) to retain the one-to-one property. However, since we need to define the mapping $g$ on the whole set $A$, we need to add such points to the domain of the first piece. So, we obtain the mapping$$g_2(x)= \begin{cases}f(x) & \text{if } x \in C_0 \cup C_1 \cup C_2; \\ x & \text{if }x \in A-(C_0 \cup C_1 \cup C_2) \end{cases}.$$ $$\vdots \qquad \vdots \qquad \vdots$$ But, we have missed the one-to-one property because the ranges of the pieces overlap (In fact, $f[C_{n-1}]$ is contained in the range of the second one, since $f[C_{n-1}]=f^n[C_0] \subset f^n[A] \subset A-(C_0 \cup C_1 \cup \cdots C_{n-1})$).

So, we need to remove the problematic points $C_n=f[C_{n-1}]$ from the domain of the second piece (since the domain and the range of the function $h(x)=x$ are the same) to retain the one-to-one property. However, since we need to define the mapping $g$ on the whole set $A$, we need to add such points to the domain of the first piece. So, we obtain the mapping$$g_n(x)= \begin{cases}f(x) & \text{if } x \in C_0 \cup C_1 \cup \cdots \cup C_n; \\ x & \text{if }x \in A-(C_0 \cup C_1 \cup \cdots \cup C_n) \end{cases}.$$ $$\vdots \qquad \vdots \qquad \vdots$$ This pattern motivates us to define the mapping $g$ as follows.$$g(x)=\begin{cases}f(x) & \text{if } x \in C; \\ x & \text{if } x \in A-C \end{cases}, \qquad C= \bigcup_{n=0}^{\infty }C_n$$Noting that$f[C]=\bigcup_{n=1}^{\infty }C_n$, we can easily see that the mapping $g$ is one-to-one because each of its pieces is and the ranges of the pieces are disjoint and it is onto the set $B$.


Addendum

Looking at how the $C_n$'s are constructed, one may think that the existence of the set $C$ (and so the proof of the theorem) relies on the existence of some infinite set like $\mathbb{N}$ to be able to define the sets $C_n$'s recursively. However, in this section we show that such a view is not correct.

In fact, to obtain the bijective mapping $g$, we need some sets $C$ such that the values of the function $f$ at the points of $f[C]$ do not lie outside of $f[C]$. The existence of such a set can be guaranteed by applying some fixed-point theorem (Knaster-Tarski Theorem) to some monotone function of sets, as follows.

Let $F: \mathcal{P}(A) \to \mathcal{P}(B)$ be monotone, i.e., if $X \subset Y$, then $F(X) \subset F(Y)$ ($\mathcal{P}(A)$ is the power set of $A$). Consider the set $T= \{ X \subset A \mid F(X) \subset X \}$. It can be easily seen that $\overline{X}=\bigcap T$ is the least fixed point of $F$ (Proof: $A \in T$, so $T \neq \varnothing$ and so $\overline{X}=\bigcap T$ can be defined. Since $F$ is monotone and for any $X \in T$ we have $\bigcap T \subset X$, $F(\overline{X}) \subset F(X)$ for every $X \in T$, so $\overline{X} \in T$.

Since $F$ is monotone and $F(\overline{X}) \subset \overline{X}$, we have $F(F(\overline{X})) \subset F(\overline{X})$, so $F(\overline{X}) \in T$.

However, since $\overline{X} \subset X$ for every $X \in T$, we have $\overline{X} \subset F(\overline{X})$. Thus, $F(\overline{X})=\overline{X}$.

If $F$ has some other fixed points $X'$, i.e., $F(X')=X'$, then $X' \in T$. Since $\overline{X} \subset X$ for every $X\in T$, we conclude that $\overline{X}=\bigcap T$ is the least fixed point of $F$).

Consider the function $F(X)=(A-B)\cup f[X]$. Clearly, it is monotone, so the set $C=\overline{X}$ defined above is its least fixed point.

Now, we can easily see that the mapping $g:A \to B$ defined by$$g(x)=\begin{cases}f(x) & \text{if } x\in C; \\ x & \text{if } x \in A-C \end{cases}$$ is one-to-one and onto the set $B$ (We only need to note that $$\begin{align}f[C] \cup (A-C) & =f[C] \cup (A-((A-B) \cup f[C])) \\ & = f[C] \cup ((A-(A-B)) - f[C]) \\ & =f[C] \cup (B-f[C]) \\ & =B \end{align}$$(Please note that in the above calculation we have used the fact that $f[C] \subset A_1 \subset B \subset A$) and$$\begin{align}f[C] \cap (A-C) & =f[C] \cap (A-((A-B) \cap f[C])) \\ & = f[C] \cap ((A-(A-B)) - f[C]) \\ & =f[C] \cap (B-f[C]) \\ & = \varnothing \end{align}$$(Please note that in the above calculation we have used the fact that $f[C] \subset A_1 \subset B \subset A$)).

Now, the least fixed point of the function $F$ can be obtained recursively as follows.

Clearly the function $F$ is continuous, meaning that for any nondecreasing sequence of subsets of $A$, $\langle X_i \mid i \in \mathbb{N} \rangle$, $X_i \subset X_j$ whenever $i \le j$, we have$$F \left ( \bigcup_{i \in \mathbb{N}}X_i \right ) = \bigcup_{i \in \mathbb{N}} F \left ( X_i \right ).$$Let us define recursively $X_0=\varnothing$, $X_{i+1}=F(X_i)$ and then define $\overline{X}=\bigcup_{i \in \mathbb{N}}X_i$. Clearly, the $\langle X_i \mid n \in \mathbb{N} \rangle$ is a nondecreasing sequence of subsets of $A$. So we have$$\begin{align}F \left ( \bigcup_{i \in \mathbb{N}} X_i \right ) & = \bigcup_{i \in \mathbb{N}}F(X_i) \\ & = \varnothing \cup F(X_0) \cup F(X_1) \cup \cdots \\ & = X_0 \cup X_1 \cup X_2 \cup \cdots \\ & = \bigcup_{i \in \mathbb{N}} X_i. \end{align}$$Thus, $\overline{X}=\bigcup_{i \in \mathbb{N}}X_i$ is a fixed point of $F$.

Now, if $X'$ is another fixed point of $F$, since $F$ is monotone and $\langle X_i \mid i \in \mathbb{N} \rangle$ is a nondecreasing sequence of subsets of $A$, we have$$\varnothing \subset X' \quad \Rightarrow \quad X_1=F(\varnothing ) \subset F(X')=X' \\ X_1 \subset X' \quad \Rightarrow \quad X_2=F(X_1) \subset F(X')=X' \\ \vdots \qquad \vdots \qquad \vdots \\ X_{n-1} \subset X' \quad \Rightarrow \quad X_n =F(X_{n-1}) \subset F(X')=X' \\ \vdots \qquad \vdots \qquad \vdots$$So, $\overline{X}=\bigcup_{i \in \mathbb{N}} \subset X'$. Thus, $\overline{X}$ is the least fixed point of $F$.

Hence, we conclude that the fixed point of the function $F(X)=(A-B) \cup f[X]$, $C$, must be of the form$$\begin{align}C & =(A-B) \cup ((A-B) \cup f[A-B]) \cup ((A-B) \cup f[A-B] \cup f[f[A-B]]) \cup \cdots \\ & = C_0 \cup (C_0 \cup C_1) \cup (C_0 \cup C_1 \cup C_2) \cup \cdots \\ & = \bigcup_{i \in \mathbb{N}}C_n\end{align}$$(Please remember that $f$ is injective), which was already obtained from our original argument.

Therefore, the existence of the set $C$ can be confirmed without needing existence of some infinite set like $\mathbb{N}$. Now, if $A$ is a finite set, then $B$ must be equal to $A$ and so $C=A-B=\varnothing$. But, if $A$ is an infinite set (so the existence of some infinite set has been already assumed in our theory), then the set $C$ is constructed from an infinite chain of sets, as explained above.

Later
  • 722
  • 2
  • 8
  • 24
1

When Y is a subset of X, Y can be assumed a proper subset of X otherwise the result is trivial. f is then called a reflection. If Z is a subset of X that contains Y then X is partitioned into Z and the rest of X which can be imagined as a frame of Z. This partitioning is carried over by f to Y, and then again to the image of Y under f and so on. Thus we obtain a sequence of disjoint frames and a sequence of nested subsets of Z. Picture the frames as an infinite stack. f then pushes the stack one level down. So if you take f on the stack of frames and the identity on the subsets you get a 1-1 mapping from X onto Z.

This is the essence of Dedekind's proof mentioned above. cf. my book Arie Hinkis: Proofs of the Cantor-Bernstein Theorem, https://doi.org/10.1007/978-3-0348-0224-6

Arik
  • 86
0

The OP asks

What is the intuition here?

If $f$ is not surjective and $g$ is not surjective can we construct a subset $\hat f$ relation of, say

$\quad \displaystyle \Gamma = \text{Graph}(f) \cup {\big[\text{Graph}(g)\big]}^{-1}$

that puts $X$ and $Y$ into a $\text{1:1}$ correspondence?

(if $\rho$ is a relation $\rho^{-1}$ is the converse relation

The function $f: X \to Y$ supplies

$\quad x \leftrightarrow f(x)$

The function $g:Y \to X$ supplies

$\quad y \leftrightarrow g(y)$

Every $x \in X$ has a $\text{1:1}$ arrow with $x$ on the left.

Every $y \in Y$ has a $\text{1:1}$ arrow with $y$ on the right (flip the arrow since you are going from $X$ to $Y$).

So you are looking

enter image description here

for $\Gamma$ and you want to (gently) move some of the fingers (arrows) around to reach all the knuckles.

With the gentle approach, you would start by looking at some $y \in Y$ not in the image of $f$ and try to 'get something going' using $g$.

As you analyze these problem spots and try to find a remedy, you'll find that using $\Gamma$ as the 'spare parts store' will logically force the incremental build of $\hat f$. The flash of insight - fix all the problems spots $Y \setminus f(X)$ and everything will fall into place (no more problems - the construction of the bijective $\hat f$ build will be completed).

Although not required if the goal is just a bijection, you can fix all problems ($g$ is missing points also) in one swipe (proof) and build a bijective $\hat g: Y \to X$ along with $\hat f: X \to Y$. An appreciation of symmetry would be the impetus for this nicety.

CopyPasteIt
  • 11,366