The set of normal numbers is uncountable

Question

I'm a grad student tutoring undergrad math majors, and one of them asked this question; I got stuck trying to solve it. Say a real number in (0,1) is normal if its ternary expansion contains every finite string made up from 0, 1, 2. Prove that the set of normal numbers is uncountable.

Since this is not the standard meaning of "normal number" (it's a much weaker property than what "normal number" means), you should choose a different name. — Dave L. Renfro, Jul 10 '17 at 16:28

score 6 · Answer 1 · answered Jul 10 '17 at 19:30

Denote by $S_n$ the concatenation, in lexicographic order, of all strings of length $n$ over the digits $0,1,2$. A few examples: $$S_1=012, \quad S_2=000102101112202122$$ Now, the numbers obtained in the following way are all normal according to your definition: $$x((\alpha_i)_{i\in\mathbb{N}}):=0.S_1\alpha_1S_2\alpha_2S_3\alpha_3\ldots$$ where $\alpha_i\in\{0,1\}$. There are uncountably many such sequences $(\alpha_i)_{i\in\mathbb{N}}$, and $x((\alpha_i)_{i\in\mathbb{N}})\neq x((\beta_i)_{i\in\mathbb{N}})$ if $(\alpha_i)_{i\in\mathbb{N}}\neq (\beta_i)_{i\in\mathbb{N}}$ (a ternary expansion that contains every finite string is neither eventually all $0$'s nor eventually all $2$'s, so it is the only expansion of its number). Hence the claim is proven.
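The construction above can be checked on finite prefixes. The following is a minimal Python sketch, not part of the original answer; the helper names `S`, `x_prefix`, and `contains_all_strings` are made up for illustration, and only the first few blocks are built, so it illustrates the idea rather than proving anything about the infinite expansion.

```python
from itertools import product

DIGITS = "012"

def S(n):
    """Concatenate all ternary strings of length n in lexicographic order."""
    return "".join("".join(t) for t in product(DIGITS, repeat=n))

def x_prefix(alphas):
    """Digits after the point of S_1 a_1 S_2 a_2 ... S_k a_k, with k = len(alphas)."""
    return "".join(S(i) + str(a) for i, a in enumerate(alphas, start=1))

def contains_all_strings(digits, max_len):
    """True if every ternary string of length <= max_len occurs in digits."""
    return all(
        "".join(t) in digits
        for n in range(1, max_len + 1)
        for t in product(DIGITS, repeat=n)
    )

a = x_prefix([0, 1, 0])
b = x_prefix([0, 0, 0])
print(S(1), S(2))
print(contains_all_strings(a, 3), contains_all_strings(b, 3))
print(a != b)  # different choices of alpha_i give different digit strings
```

Whatever values the $\alpha_i$ take, the blocks $S_n$ stay intact, which is why every string of length $n$ still shows up.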

[Please forgive my sloppy way of writing it down; of course, one could put it much more precisely (by translating the $S_n$ to actual integers and defining $x((\alpha_i)_{i\in\mathbb{N}})$ as series), but this just becomes terribly messy...]

score 3 · Answer 2 · answered Jul 10 '17 at 20:32

Let's call the type of number you're asking about a "base-$3$ lexicon", a name inspired by the usage in Calude/Zamfirescu's 1998 paper "The typical number is a lexicon".

I'll first show how to generate continuum many (not just uncountably many) base-$3$ lexicons from a given base-$3$ lexicon. Then I'll discuss some other methods that strengthen this result.

Let $x$ be any base-$3$ lexicon such that $0 < x < 1.$ Thus, $x$ looks like this:

$$x \;\; = \;\; 0.a_1a_2a_3 \ldots a_n \ldots $$

where for each $n$ we have $a_n \in \{0, 1, 2\}$ and it is also the case that each finite ternary digit string appears at least once somewhere in the expansion.

Incidentally, it is actually the case that each finite ternary digit string appears infinitely often in the expansion. Why? Consider, for example, the digit string $102.$ Since $102102$ must also appear, the string $102$ appears at least twice. Since $102102102$ must also appear, the string $102$ appears at least three times. And so on. Note this implies that no matter how far out in the ternary expansion you go, each finite ternary digit string will show up at least once (in fact, infinitely many times) after the location you went to.

Let $n_1$ be the least positive integer such that all possible $1$-digit strings (there is a total of $3$ such strings) appear among the digits $a_{1},$ $a_{2},$ $\ldots,$ $a_{n_1}.$ Let $n_2$ be the least positive integer such that all possible $2$-digit strings (there is a total of $3^2 = 9$ such strings) appear among the digits $a_{n_{1} + 1},$ $a_{n_{1} + 2},$ $\ldots,$ $a_{n_2}.$ To see why $n_2$ exists, look again at the last sentence of the previous paragraph. Continuing in the same way, let $n_3$ be the least positive integer such that all possible $3$-digit strings (there is a total of $3^3 = 27$ such strings) appear among the digits $a_{n_{2} + 1},$ $a_{n_{2} + 2},$ $\ldots,$ $a_{n_3}.$ Keep going in the same way.
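To make the definition of $n_1, n_2, n_3, \ldots$ concrete, here is a small Python sketch that computes the first few cut points for a finite digit prefix. The function names and the sample prefix are assumptions for illustration: the sample is the concatenation of all ternary strings of lengths $1$ through $4$, used as a stand-in for the start of some lexicon.

```python
from itertools import product

def all_strings(k):
    """Set of all ternary strings of length k."""
    return {"".join(t) for t in product("012", repeat=k)}

def cut_points(digits, depth):
    """Return [n_1, ..., n_depth] as 1-indexed positions in digits.

    n_k is the least position such that every k-digit string occurs
    among the digits a_{n_(k-1)+1}, ..., a_{n_k} (with n_0 = 0).
    """
    cuts = []
    start = 0  # 0-indexed position of the first digit of the current block
    for k in range(1, depth + 1):
        missing = all_strings(k)
        end = start + k  # exclusive end of the first length-k window in the block
        while True:
            if end > len(digits):
                raise ValueError("prefix too short to determine n_%d" % k)
            missing.discard(digits[end - k:end])
            if not missing:
                break
            end += 1
        cuts.append(end)
        start = end
    return cuts

# Sample prefix (an assumption for this sketch, not taken from the answer).
sample = "".join(
    "".join(t) for n in range(1, 5) for t in product("012", repeat=n)
)
print(cut_points(sample, 3))
```

The search for $n_k$ only looks at windows that lie entirely after position $n_{k-1}$, which matches the requirement that the $k$-digit strings appear among $a_{n_{k-1}+1}, \ldots, a_{n_k}$.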

Now consider numbers having the following ternary expansions, where the *'s can be filled in with any sequence of ternary digits:

$$ 0.a_1a_2a_3 \ldots a_{n_{1}} \, * \, a_{n_{1} + 1} a_{n_{1} + 2} \ldots a_{n_2} \, * \, a_{n_{2} + 1} a_{n_{2} + 2} \ldots a_{n_3} \, * \, a_{n_{3} + 1} a_{n_{3} + 2} \ldots a_{n_4} \, * \, \ldots$$

Now observe that no matter what digits we put in the locations marked with a *, the result will still be a base-$3$ lexicon. For example, $102$ will appear in any of these numbers, since $102$ will show up somewhere between the 2nd and 3rd * locations.

Since there are continuum many ways to fill in the *'s with digits from $\{0,1,2\}$ (indeed, there are $2^{\aleph_0}$ = continuum many sequences of $0$'s and $1$'s, so we don't even need to use the digit $2),$ we get continuum many base-$3$ lexicons.
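The filling step can also be tried out on a finite prefix. The sketch below rests on stated assumptions: the digit prefix is the concatenation of all ternary strings of lengths $1$, $2$, $3$, and the cut points $3, 21, 102$ are simply the ends of those three groups. They are valid cut points but not necessarily the least ones, which does not affect the argument. The names `fill` and `is_lexicon_to_depth` are hypothetical.

```python
from itertools import product

def S(n):
    """Concatenate all ternary strings of length n in lexicographic order."""
    return "".join("".join(t) for t in product("012", repeat=n))

def fill(digits, cuts, fillers):
    """Insert fillers[i] right after the 1-indexed position cuts[i]."""
    out, prev = [], 0
    for cut, filler in zip(cuts, fillers):
        out.append(digits[prev:cut])
        out.append(filler)
        prev = cut
    out.append(digits[prev:])
    return "".join(out)

def is_lexicon_to_depth(digits, depth):
    """True if every ternary string of length <= depth occurs in digits."""
    return all(
        "".join(t) in digits
        for k in range(1, depth + 1)
        for t in product("012", repeat=k)
    )

digits = S(1) + S(2) + S(3)
cuts = [3, 21, 102]  # block k = digits[cuts[k-2]:cuts[k-1]] contains all k-strings

# Fill the three * positions with every choice of 0/1.
results = {fill(digits, cuts, f) for f in product("01", repeat=3)}
print(len(results))
print(all(is_lexicon_to_depth(r, 3) for r in results))
```

Each filler lands between two blocks and never inside one, so the blocks that guarantee the lexicon property are untouched, while different fillers give different digit strings.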

With very little additional work, we can strengthen this to: every nonempty open interval of real numbers contains continuum many base-$3$ lexicons. I'll describe one way to show this by using a representative example. Suppose we want to show there are continuum many such numbers between the ternary numbers $0.12012001$ and $0.12012002.$ First pick a base-$3$ lexicon between these two numbers: take any base-$3$ lexicon between $0$ and $1,$ call it $0.b_1b_2b_3 \ldots b_n \ldots,$ and then $0.12012001b_1b_2b_3 \ldots b_n \ldots$ will be a base-$3$ lexicon between $0.12012001$ and $0.12012002.$ Then repeat the above process, with the additional requirement that $n_1 > 8$ so that you'll stay between $0.12012001$ and $0.12012002$ no matter how the *'s are filled in.

Even stronger results are true. The set of all real numbers between $0$ and $1$ whose ternary expansions do not contain a specified ternary digit is (when $0$ and $1$ are thrown back in, which only makes the set larger) a Cantor set with a uniform dissection ratio of $\frac{1}{3}.$ Similarly, the set of all real numbers between $0$ and $1$ whose ternary expansions do not contain a specified $2$-digit ternary string is (when $0$ and $1$ are thrown back in, which only makes the set larger) a Cantor set with a uniform dissection ratio of $\frac{1}{9}.$ And so on. Each of these Cantor sets is small in the sense that each has Lebesgue measure zero (easy proof by advanced method: use the Lebesgue density theorem) and each is nowhere dense. So what's left over when you toss out all the numbers whose ternary expansions omit at least one finite digit string is what's left over when you throw out countably many sets, each of which has Lebesgue measure zero and is nowhere dense. This is a set that is maximally large both in the Lebesgue measure sense and in the Baire category sense. That is, almost all real numbers in both the Lebesgue measure sense and the Baire category sense are base-$3$ lexicons. Contrast this with the set of base-$3$ normal numbers, which is large in the sense of Lebesgue measure but small (a first Baire category set) in the sense of Baire category -- see my 16 June 2001 sci.math post.
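For readers who prefer an elementary route to the measure-zero claim, here is a sketch of a covering argument. It is a substitute for the density-theorem proof mentioned above, not a rendering of it, and the symbols $w,$ $k,$ $m,$ $E_w$ and $\lambda$ (Lebesgue outer measure) are introduced here only for illustration. Fix a ternary string $w$ of length $k,$ and let $E_w$ be the set of $x \in (0,1)$ having a ternary expansion in which $w$ never appears. Split such an expansion into consecutive aligned blocks of $k$ digits. Since $w$ never appears, none of the first $m$ blocks equals $w,$ so $E_w$ is covered by $(3^k - 1)^m$ closed intervals, each of length $3^{-km}.$ Therefore, for every $m,$

$$\lambda(E_w) \;\; \leq \;\; \left(3^k - 1\right)^m \cdot 3^{-km} \;\; = \;\; \left(1 - 3^{-k}\right)^m \;\; \longrightarrow \;\; 0 \quad \text{as } m \to \infty.$$

Hence each $E_w$ has measure zero. There are only countably many finite strings $w,$ so the set of numbers in $(0,1)$ that are not base-$3$ lexicons, which is contained in $\bigcup_w E_w,$ has measure zero as well.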

Incidentally, the reason I began the previous paragraph with "even stronger results are true" is that the complement of a measure zero set has continuum many points in every nonempty open interval. The same holds for the complement of a first Baire category set, and even more so for the complement of a set that is simultaneously measure zero and first category, such as the set of numbers that are not base-$3$ lexicons. Moreover, it is easy to construct sets whose complements have continuum many points in every nonempty open interval but that are NOT measure zero or first category.
