Fixing a flaw with less fuss?

Dana MacKenzie is the author of a terrific book, The Universe in Zero Words. Actually the book has many words, 216 pages of them, but is a brilliant history of mathematics as told through equations.

Today I want to talk about one of his chapters that discusses the equation ${2^{\aleph_{0}} = \aleph_{1}}$.

This equation expresses Georg Cantor’s Continuum Hypothesis. Building on his proof that the reals are uncountable, Cantor believed that they are not only uncountable, but are exactly equal to the next “size” possible for sets, ${\aleph_{1}}$. We now know that this statement is independent of the standard axioms of set theory, thanks to the work of Kurt Gödel and Paul Cohen.

I just returned from visiting QCRI, a new research lab in Qatar, and read the whole book cover to cover during the sixteen hour plus flight home to Atlanta. It did not take the full trip to read, but it helped pass the time: I also ate, slept, and watched many movies.

Mackenzie has written a unique and beautiful book. He has the ability to explain math ideas in a few sentences that give a non-expert at least some idea what the idea is about. I strongly recommend the book as a great read—even if you are not on a very long flight.

As you might guess he presents the famous proof of Cantor that the real numbers are uncountable. One issue that he skips in the main text he addresses in a rather long footnote:

This argument has a slight flaw in it due to the fact that some numbers have two decimal representations, for example ${0.499\dots = 0.5000\dots }$

I have seen this issue raised many times before in regard to the proof and thought it might be worth a bit of discussion. The following is a small contribution to this “slight flaw.” Note I am only addressing those who believe that Cantor’s proof is correct—not those who still think the reals are countable. Recall we have discussed this issue a few times before—see here and here for more details.

## The Argument

Mackenzie presents the classic proof that takes any countable list of decimal numbers of the form ${\boldsymbol{0.}\dots}$ and proves that there is another decimal number of the same form that is not in the list. He does this by building the diagonal number: he adds ${1}$ to each digit on the diagonal with the rule that ${9}$ wraps back to ${0}$.

Let’s be a bit more mathematical. Given a countable list of infinite decimals

$\displaystyle D = D_{1},D_{2},D_{3},\dots$

say the Cantor diagonalizer is the number ${C(D)}$ defined by the rule: the ${k}$th digit is ${D_{k}(k) + 1 \mod 10}$. The theorem is of course that for all lists ${D}$, the decimal ${C(D)}$ is not in the list ${D}$.

The difficulty, the difficulty Mackenzie calls a “slight flaw,” is that while ${C(D)}$ is not in the list ${D}$ it could represent a number that is in the list. The issue is the generalization of the fact that

$\displaystyle 0.4999\dots = 0.5000\dots$

Call decimals that represent the same real number twins. A number can have at most one twin, and that only happens if the number ends in all ${0}$‘s or all ${9}$‘s.

The problem is that this could be the list:

$\displaystyle \begin{array}{rcl} &0&.499999\dots \\ &0&.199999\dots \\ &0&.299999\dots \\ \vdots \end{array}$

The diagonal number is ${.5000\dots}$ which is not in the list, but its twin, ${0.4999\dots}$ is in the list.

## Fixing The Slight Flaw

One way to avoid the flaw is to use a better local rule. Ken and I, when we teach this argument use a better rule. Ken sends all digits to ${5}$, except that he sends ${5}$ to ${4}$. Thus, in the example we just gave,

$\displaystyle \begin{array}{rcl} &0&.499999\dots \\ &0&.199999\dots \\ &0&.299999\dots \\ \vdots \end{array}$

The diagonalizer would be ${0.555\dots}$ which works fine. The trick is to avoid those pesky ${0}$‘s and ${9}$‘s.

## Another Way

What I realized during the long flight is there is another way to keep the standard “add one to digit rule,” but still make the theorem work. This avoids changing the rule. It is not fundamentally better than the above rule change, but seems like a useful remark. It is based on the fact that in order to get into trouble the list ${D}$ must be strange.

Lemma: If ${D}$ is a countable list of decimals that contains a representative of every rational number, then the Cantor diagonalizer ${C(D)}$ is not in the list ${D}$, nor is its twin.

Proof: Suppose ${c = C(D)}$ has a twin. There are two cases: one where ${c}$ ends in all ${0}$‘s and one where ${c}$ ends in all ${9}$‘s. Let’s look at the first case, since the other is handled by the same argument.

Thus,

$\displaystyle c = \alpha 000\dots$

for some finite string of decimal digits ${\alpha}$. Let ${k}$ be the length of ${\alpha}$. Now since ${D}$ contains all rational numbers, it must contain each of

$\displaystyle \begin{array}{rcl} &0&.333\dots \\ &0&.0333\dots \\ &0&.00333\dots \\ &0&.000333\dots \\ \vdots \end{array}$

Since each of the above decimals is in the list, one of them appears at a position ${m}$ that is greater than ${k}$. But then in the ${m}$-th place, ${c}$ contains a ${1}$ or a ${4}$, not a ${0}$. This is a contradiction, and the lemma is proved. $\Box$

## No Pigeons Were Harmed In This Proof

If you examine the numbers again, you’ll see that ${m+1}$ members of the list have a ${3}$ in positions ${m+1}$ and higher. Since at most ${m}$ of those can come in the first ${m}$ entries of the overall list ${D}$, it follows that at least one of them is still waiting to be diagonalized when we get past the ${\alpha}$ part of ${c}$. Thus in fact ${c}$ does have at least one ${4}$ in it.

This stronger inference uses the pigeonhole principle. We seem to have mentioned this only once, briefly, on this blog. Although it was coined as the Schubfachprinzip (“Principle of the Drawers”) by Johann Dirichlet as early as 1834, the first use of “pigeonhole” for it in English appears to be a paper by Raphael Robinson published in 1941. Ken thinks the term must be older and must refer not to “pigeons in holes” but rather to the racks of mail slots still called pigeonholes today in the colleges that make up Oxford and Cambridge and some other universities. If there are ${n}$ Fellows and ${n+1}$ letters arrive, some lucky Fellow will receive at least two letters${\dots}$

The pigeonhole principle is not trivial to teach, and there have been beautiful papers in proof complexity that quantify its power. If ${2n}$ letters arrive, or ${n^2}$, or exponentially many, the inference that someone gets two becomes progressively easier. If infinitely many letters arrive, at least one of the ${n}$ Fellows will receive infinitely many letters.

But our proof above needs none of these inferences. It is only saying that out of those infinitely many rational numbers with ${3}$s, at least one will occur after the given finite number ${k}$ of places. We would be curious to know if this principle has a standard name. In any event, we feel it is transparent enough not to teach as a “topic.”

## Open Problems

Is this simple observation useful in some context? I do like, personally, that we can keep the simple “add one” rule and avoid the flaw. What do you think?

September 8, 2012 11:16 pm

How about modifying the original list so that whenever a number appears its twin also appears? From the original list L, create a new list L’ which has this property by adding a number’s twin, if it exists, right after it in the list. Now use diagonalization against L’.

2. September 9, 2012 2:33 am

The n’th digit in two twins are either the same or differs by at most 1. So instead of “add one mod 10” you could simply use “add two mod 10”.

September 9, 2012 3:47 am

What happens when someone comes along and starts counting reals in the following way?:
Let real(1) = the first element of the set: 0<x<1;
Let real(2) = the first element of the set: real(1)<x<1:
……………
Let real(m+1) = the first element of the set: real(m)<x<1.

September 9, 2012 8:15 am

Egads, you’re right! My god, you have overturned a century of mathematics. You -must- publish!

September 9, 2012 10:01 pm

Tomas,

You might have missed the point by a couple thousands of years, but who is counting anyway.

Like the rest of us, Cantor inherited the Euclidean point. If he has a scalpel sharp enough to cut a single point from a continuous line, the scalpel should be sharp enough to start chopping end points off line segments.

Just a metaphysical observation, not meant to be taken too seriously. 🙂

September 9, 2012 8:25 am

Heh heh, what do you mean by first? Cool, for R(2) it must be of the form 0.yyyyy… Please write down how many leading zero’s the number has after the decimal and its first few leftmost non-zero digits. For any actual listing (Counting) you have to be able to do this.

• September 9, 2012 10:50 am

Not only are the real numbers not well-ordered, but also a well-ordered set can easily be uncountable. So this doubly doesn’t work.

September 9, 2012 2:12 pm

It’s more accurate to say that the _standard_ ordering of the reals isn’t a well-ordering. Of course if you assume Choice then there must be _some_ well-ordering of the reals, but that’s a separate matter entirely. (On the other hand, it’s a well-known theorem that any sequence of reals compatible with the ‘standard’ order type must have countable order type itself, no matter how many reals there actually are, so this procedure can’t work in yet another way…)

September 9, 2012 6:12 am

‘Pigeonhole’ meaning ‘compartment for giving/receiving documents’ is standard usage in British English, not just at certain universities. (I’ve heard it in Australia as well, so it’s not even specific to the UK.)

5. September 9, 2012 4:03 pm

The example in question strikingly highlights that Cantor’s diagonal construction of an unlisted real < 1 from the assumed enumeration runs the risk of confusing denotation with connotation.

Thus the expression .4999…' is not a real number, but only the denotation of a real number. The real number denoted by .4999…' is the limit of the Cauchy sequence of rational numbers .4, .49, .499, .4999, …'. This limit is the real number denoted by .5'.

So the first constraint on Cantor's assumed enumeration ought to be that it must list only non-rational real numbers < 1.

Now, for Cantor's construction to yield a non-rational real < 1 that is not in such an enumeration, it must be possible to assert for any such enumeration that the diagonal construction will not yield a recurring `9' after some point (as this would merely yield a rational number that is not in the enumeration by definition).

Prima facie, it would seem that such an assertion can only be made axiomatically.

Although such assertion may be possible in ZF or ZFC, to treat it as self-evident—as Cantor's diagonal argument appears to do—seems more theological than mathematical.

• September 10, 2012 9:21 am

Sorry! That was a hasty response, and obviously wrong!

6. September 10, 2012 1:25 am

I have a high IQ and like math, but I cannot, no matter how much I might try, get my brain around this stuff. However, I have a son who eats this for breakfast, so he can carry the heavy load for me. Whew!

7. September 10, 2012 1:38 am

The more there is discussion about uncountability of reals the less intuitive the picture become. Initially, it was clear that it is possible to write a number that would be different from every number because at least one digit (on diagonal) is different. This intuition comes from the finite sets (and as such the Cantor argument seem to be the psychological trick). It is not clear, after all discussions, why the same logic should work for infinite sets. In particular, lets write the sequence $b_0 = 0.0, b_1 = 0.1, b_2=b_{10_2}= 0.01$, e.g. $b_n = 0.$ reverse binary representation of $n$. Since we have infinite numbers, infinitely many numbers are infinitely long. I understand argument that for every finite N, and nN+1 tells that we need to add countably many rows (numbers), equal to the size of the set at step N, and add “1” to the added part at the end. Therefore, for all countably many N the size of the set of all numbers with N digits (including infinitely many) is countable. The Cantor diagonalization argument would only say that for every n the constructed number is below diagonal. And by the definition of infinity, it does not matter how many diagonal digits (equals to tested numbers in the list) we tested there always at least one more. The power set arguments seem to be more convincing, unless we would have more discussion on it.

The only point I want to make is that Cantor argument by itself is not very convincing. It is based on assumption that we can write infinitely many digits, and test infinitely many digits at once, which is possible the reason of artificial superpower of real computation.

8. September 10, 2012 1:42 am

As usual half of the comment is cut. Sorry for disturbance.

September 10, 2012 12:14 pm

How do we know that the only numbers with twins are of the form 0.x9999?

10. September 11, 2012 2:43 am

For bonus points, propose a fix that works when the real numbers are written in binary, rather than decimal expansion (it can be done).

11. September 12, 2012 11:36 am

‘Not the list of all real numbers but some special list of them can have a diagonalizer ending with all $0$‘s or $9$‘s’ is that what your lemma prove?

12. September 12, 2012 11:43 am

Please ignore my previous comment. It was incorrectly formatted. Here is the correct one:

‘Not the list of all real numbers but some special list of them can have a diagonalizer ending with all $0$‘s or $9$‘s’ is that what your lemma proves?

September 12, 2012 9:29 pm

f.nasim

That is essentially right.