Cantor’s first proof that the reals are uncountable

Georg Cantor created set theory, a paradise that David Hilbert declared: “no one shall expel us from ${\dots}$” Hilbert’s full quote is: “No one shall expel us from the paradise that Cantor has created for us.”

“No one” referred to Leopold Kronecker who thought Cantor a “corrupter of youth”, and to Henri Poincaré who thought Cantor’s set theory a malady that would one day be cured. Even great scientists can be wrong about new directions–perhaps something we should all remember. If Cantor’s work was going on today, would he get funded or even get his papers into the top conferences? I wonder.

Cantor created set theory to solve problems in Fourier series, especially looking for conditions that imply that a Fourier series uniquely determines a function. This led him to consider actual infinite processes, which was a giant conceptual step beyond any previous work. Infinities had been used before in mathematics, but never studied in a way that Cantor did. While set theory may seem to be the most abstract of theories, its roots were based on concrete questions from function theory.

We all know the famous diagonal proof of Cantor, that proves that the reals are uncountable, so I will not repeat the argument here. If you need a refresher, or have not seen the proof before, there are many good expositions available.

The diagonal proof has been used over and over in mathematics. In the beginning of complexity theory, Alan Turing used a diagonal argument for his proof that the Halting Problem is undecidable. Later, time and space hierarchies were proved using Cantor’s method, with additional insights into Turing machine simulations. There are countless–bad pun–applications of Cantor’s diagonal proof method.

What you may not know is that Cantor discovered his wonderful proof in 1891. However, almost twenty years earlier, in 1874, he first proved that the reals were uncountable. His earlier proof (EP) is not based on diagonalization, does not seem to generalize, and is not well known.

The proof apparently was discovered in December 1873 and published in 1874 as “On a Property of the Collection of All Real Algebraic Numbers.” The main consequence that Cantor noted, at the time, was the existence of transcendental numbers in any interval. This had been proved earlier in 1844 by Joseph Liouville using constructive methods. The reason for the strange title was that Cantor was worried about opposition from Kronecker and others: yet his main result is that the reals are uncountable, not that there are transcendental numbers.

Today I would like to present the earlier proof, which I think you might enjoy seeing for two reasons. First, it is very different from the diagonal proof, and shows how Cantor’s thinking evolved over time. It also shows how deep the simple diagonal proof is: if it took the master almost twenty years to find it, one could argue that it really is deep. As I have said before, often things that seem trivial today were once impossible, were once unknown.

Second, I think the EP is an instance of an important principle on how we prove that objects exist. There are many ways that you can try to prove that some mathematical object exists. An obvious method, that sometimes works, is to “write down” the object in closed form. I will not be precise on what closed form means, but I hope you get the general idea. For example, when asked for a number that is not rational, you could write down ${\sqrt 2}$. Or when asked for a function that is periodic you can give ${x \rightarrow \sin(x)}$ as an example.

Often, however, there is no simple way to write down what you want: the object that you are trying to construct is too complex. In this case a powerful method is to construct the object as a limit of simpler objects. This is exactly what Cantor’s EP does. I wonder about his EP, and whether or not it could help solve some of our open problems. More on that later.

The Earlier Theorem and Proof

Theorem: The real numbers are uncountable.

Cantor’s earlier proof (EP) assumes that

$\displaystyle r_{1}, r_{2}, \dots,$

is a list of reals all in the interval ${[0,1]}$. He then constructs, by a limiting process, a real number ${x}$ that is in the interval ${[0,1]}$ and satisfies, for all ${k}$, ${x \neq r_{k}}$. This of course shows that the reals are uncountable, but the proof seems more concrete, to me, than the diagonal proof.

Let ${I_{0} = [0,1]}$. We will follow Cantor and construct a series of proper intervals

$\displaystyle I_{0} \supseteq I_{1} \supseteq \dots$

Suppose that we have constructed ${I_{k-1}}$ for ${k \ge 1}$, then we will construct the next interval as follows. Let ${I_{k-1} = [a,b]}$. Select ${c}$ so that

$\displaystyle a < c < b$

and ${c}$ is not equal to ${r_{k}}$. Then, ${r_{k}}$ is in at most one of the two intervals:

$\displaystyle [a,c] \text{ or } [c,b].$

Let ${I_{k}}$ be the first interval that does not contain ${r_{k}}$.

We claim that the nested intervals ${I_{k}}$ tend to an interval ${I_{\infty}}$, which can be degenerate. Let ${x}$ be any point in ${I_{\infty}}$. By construction, ${x}$ is in ${[0,1]}$. Moreover, ${x}$ cannot be equal to any ${r_{k}}$. This follows since at step ${k}$ we selected ${I_{k}}$ so that ${r_{k} \not\in I_{k}}$. But, since ${I_{\infty} \subseteq I_{k}}$, it follows that ${r_{k}}$ is not in ${I_{\infty}.}$

Note, the fact that nested closed intervals tend to a limit interval is the place where Cantor uses the fact that the reals are complete. That is any bounded increasing (decreasing) sequence of reals has a limit point.

Open Problems

I hope you like Cantor’s proof. I find it more intuitive than the diagonal proof, perhaps that is why Cantor found the EP first?

An obvious open question is: can Cantor’s EP method be used to solve some theory problem? Is it just a curiosity, or is there something that is fundamentally useful about the method? What would the notion of limit be? In complexity theory we have used the diagonal method over and over, could it be possible that the EP method could have some advantage in solving some problems?

1. April 18, 2009 6:02 pm

It’s amazing the amount of math originally motivated by making Fourier analysis rigorous (functional analysis, Lebesgue integration,…). I didn’t know Cantor’s set theory was part of it.

October 1, 2011 6:11 pm

Laurent Schwarz’s theory of distributions is also one of these byproducts. Schwarz was a member of the Bourbaki group which originated itself around 1935 in an attempt to found rigorously the prerequisites to Stokes’s formula. Jean Dieudonné – one of Bourbaki’s founding members – also used the EP in his 1960 book “Foundations of Modern Analysis”. He defined the reals axiomatically as an archimedean ordered field that satisfies the axiom of nested intervals.

2. April 18, 2009 6:13 pm

Nifty post. A quibble: Riemann’s work on manifolds, Dedekind’s work on ideals, and Frege’s logic of quantification probably deserve equal mention with Cantor in terms of the invention of set theory. As I see it, axiomatic set theories like ZF, NBG, and NF arose out of an effort to keep the mathematical results of Riemann, Dedekind, and Cantor in a foundational system like Frege’s but without the stratification brought on by using type theory as a way out of the Russell paradox. Calling set theory any one person’s “invention” is in that regard a bit misleading. Anyhow, I realize that is all beside the main point of your post.

April 18, 2009 8:07 pm

I may be missing the obvious, but shouldn’t the sequence $r_{1}, r_{2}, \dots$ be allowed to be a sequence of real numbers?

April 19, 2009 6:56 am

I think the EP and the classic diagonal are related, but they are different. My point was if they are so close why did he take almost 20 years to find the next proof?

Also the list of numbers should of course be reals not rationals.

April 18, 2009 10:42 pm

I dont see any deference between two proofs, just the second one has the unimportant details removed. Think of fixing digits as restricting the final real number to the interal of reals starting with those digits. Thus I dont think this can be of any help.

April 19, 2009 11:39 am

This proof resembles the proof of Baire’s theorem ([0, 1] is not a first category set).

April 19, 2009 3:20 pm

I agree with Anonymous, above. Can you elaborate on what you think the essential difference is?

One difference I see is that the EP does not require you to choose a base in which to represent your numbers (e.g. base 10), whereas in the diagonal proof you typically choose (arbitrarily) a base at the outset. But this seems like a non-essential difference.

April 19, 2009 6:01 pm

I guess I agree and disagree. I see no other way to prove the theorem other then some proof like this: Divide the construct up into stages. The stages each construct partial information about x. At stage k make the a finite number of choices so that no matter what you do later on the number constructed will continue to not agree with the first k numbers in the list. You must be careful that the choices you make do not make x not exist: this could happen if your construction made x infinite, for example.

I see that both proofs do this. But the key to me is that the EP works only for the reals. While the diagonal works for any set. Somehow there is some conceptual jump that Cantor did not make for almost 20 years. Note, the diagonal proof base 2 works for sets, but not for reals without a fix: this is due to repeating expansions. Apparently Cantor made a mistake along this line in an initial draft of the diagonal proof.

I hope that helps. I give in the sense that you are right, but that was not really the point of the post.

April 19, 2009 9:50 pm

I don’t actually think the EP is any more specific to the real numbers than the usual presentation of diagonalization is. In both cases, you need some principle that says the diagonal real number exists. In the former case, this comes from the intersection of nested closed intervals; in the latter case it isn’t really any simpler, but it looks simpler since the existence of real numbers with arbitrary decimal expansions is familiar since childhood.

In general, diagonalization is the observation that you can enforce as many inequalities as you want if you can make corresponding choices to ensure that each one holds. One difference between the two proofs is the logical dependencies between the choices. In the usual presentation, all the choices of individual decimal digits are completely independent (and the same holds for Cantor’s theorem about power sets). In the EP, the options available in each choice depend on all the previous choices. So you’re right that there’s some extra flexibility here, but it still fits firmly within the technique of diagonalization, and when necessary people already exploit the flexibility.

The ultimate version of this is priority arguments, where not only are the choices not independent, but in fact the dependencies sometimes force you to go back and change previous choices.

8. April 20, 2009 12:33 am

There is a theorem that a nonempty compact Hausdorff topological space without isolated points is uncountable. (Thm. 27.7 in J.Munkres “Topology”).
(and it obviously implies uncountability of [0,1])
Its proof looks quite similar to EP…

April 20, 2009 4:11 am

I suppose that the step “select $c$ so that $a” is less innocent than it actually looks…

• December 6, 2013 10:00 am

It’s perfectly innocent. Choose c_1 = (2a+b)/3 and c_2 = (a+2b)/3, one of them has to be different from r_k.

10. April 20, 2009 1:53 pm

As Dima notes, this is just a special case of a more general theorem in topology: that a non-empty compact Hausdorff space without isolated points is uncountable. Also, I wouldn’t say that this not-well-known, at least amongst topologists (not that I’m a topologist). I’ve seen this proof in both a real-analysis class as well as a topology class.

April 20, 2009 2:13 pm

Thanks for clarifying. Your most recent description reminded me of the construction of generic oracles, which is actually very much like Cantor’s EP: you take a nested sequence of open subsets of Cantor space, each time choosing the next subset so as to avoid (ie, diagonalize against) some particular property. Then, because Cantor space is compact, this nested sequence of open subsets is nonempty, and if you did the construction right, consists of a singleton. That singleton is the language (or oracle, as generics are more commonly used) you were looking for.

Of course, generic oracle constructions are somewhere between straightforward diagonalization and the priority method mentioned by Anon 3.

For interest, here is another proof of the uncountability of the reals: http://arxiv.org/abs/0901.0446. It may, when fully unraveled, come down to the same type of argument, but it at least appears to be different in character. (It is apparently based on a proof given by the Bourbaki group in their 1949 book IV: Functions of One Real Variable (in French).)

April 20, 2009 4:18 pm

Both proofs are directly related to different constructions of the real numbers.

Proofs of the equivalence between the construction by decimal expansions and the others (Dedekind, Cauchy) are harder than both uncountability proofs.

With different definition you build different proofs. The reason why EP proof was “discovered” before could be that the existence of real number comes from cuts and limits rather than an esoteric infinite words fantasy.

> I suppose that the step “select c so that a<c<b” is less innocent than it actually looks…
I don’t think so. Changing this by c = max{(a+b)/2, (2a+b)/3}-{rk} keeps the proof healthy. (for example)

May 24, 2009 11:48 am

You should read A. Garciadiego, BERTRAND RUSSELL AND THE ORIGINS OF THE SET-THEORETIC ‘PARADOXES.’ Quite apart from showing the ridiculous “foundations” of Cantor’s ideas, Garciadiego shows that the various “paradoxes” are not paradoxes at all. Cantor’s concerns are unmotivated.

Why are you so out of touch with what is going on in the historiography of set theory? Read Ferreiros on Cantor’s ridiculous notions, or Grattan-Guinness on Poincare’s lack of understanding. Or even read the Feferman edition editorial comments on Godel’s sloppiness as a researcher.

I’m not sure Godel is making any argument at all, but if someone bothers to disentangle all his idiocy, we will see the constructivist (that is, natural mathematical) intervention in his argument. Then you will see that his ideas have no logical content.

“I see it, but I cannot believe it,” Cantor said. You obviously believe it. You’re an idiot.

You can also read my introduction to this new wave of set theory historiography online:

John Ryskamp, “Paradox, Relativity, Natural Mathematics and Twentieth-Century Ideas.”

Wake up.

November 29, 2009 7:11 pm

John Ryskamp:
I’m not sure Godel is making any argument at all, but if someone bothers to disentangle all his idiocy, we will see the constructivist (that is, natural mathematical) intervention in his argument. Then you will see that his ideas have no logical content.

Thank you!

September 23, 2016 4:42 am

sincerely
Goedel and Cantor

February 5, 2010 10:06 pm

Errett Bishop uses this version of Cantor’s proof in his famous book “Foundations of Constructive Analysis”. The reason he avoids the version with infinite sequences of digits is precisely the reason brought up on this blog for the BBP algorithm for pi not quite being an algorithm, that is to say, the standard nuisances with recurring decimals. This is also why people who do exact real computation avoid decimal expansions.

15. June 29, 2010 8:37 am

Been reading your blog now for quite some time and really love it. I don’t know if it’s your style or not , but do you think you could perhaps do a post on the oil spill in the gulf?

I love your thoughts and opinions, and would love to see your commentary on this sad tragedy.

June 30, 2010 5:51 pm

Hello.

What if, in a general theory of everything kind of way, some singular human conscious had used simple Yes(I should feel guilty) or No(I do not feel guilty) answers to answer every moral question that he could remember that had ever occurred in his life. In this way he could have become completely pure. He would have truly asked himself all the questions that could be asked of him. He would have emotionally evolved back to when he was a child.

What if he then assigned an ‘It doesn’t matter as life is only made to make mistakes’ label on every decision that he had analysed.

This would not make him God or the devil, but just very still and very exhausted. Anybody can do this but just for the purpose of this experiment lets say I have done this. Which I have.

There are no fears in me and if I died today I could deal with that because who can know me better than myself? Neither God or the Devil. I am consciously judging myself on ‘their’ level.

To make this work, despite my many faults, take ME as the ONLY universal constant. In this sense I have killed God and the Devil external to myself.The only place that they could exist is if I chose to believe they are internal.

This is obviously more a matter for a shrink more than a mathematician, but that would only be the case depending on what you believed the base operating system of the universe to be. Math / Physics / morals or some other concept.

As long I agreed to do NOTHING, to never move or think a thought, humanity would have something to peg all understanding on. Each person could send a moral choice and I would simply send back only one philosophy. ‘ LIFE IS ONLY FOR MAKING MISTAKES’.

People, for the purpose of this experiment could disbelief their belief system knowing they could go back to it at any time. It would give them an opportunity to unburden themselves to someone pure. A new Pandora’s box. Once everyone had finished I would simply format my drive and always leave adequate space for each individual to send any more thoughts that they thought were impure. People would then eventually end up with clear minds and have to be judged only on their physical efforts. Either good or bad. It would get rid of a lot of maybes which would speed lives along..

If we then assume that there are a finite(or at some point in the future, given a lot of hard work, there will be a finite amount) amount of topics that can be conceived of then we can realise that there will come to a time when we, as a people, will NOT have to work again.

Once we reach that point we will only have the option of doing the things we love or doing the things we hate as society will be completely catered for in every possible scenario. People will find their true path in life which will make them infinitely more happy, forever.

In this system there is no point in accounting for time in any context.

If we consider this to be The Grand Unified Theory then we can set the parameters as we please.

This will be our common goals for humanity. As before everyone would have their say. This could be a computer database that was completely updated in any given moment when a unique individual thought of a new idea / direction that may or may not have been thought before.

All that would be required is that every person on the planet have a mobile phone or access to the web and a self organising weighting algorithm biased on an initial set of parameters that someone has set to push the operation in a random direction.

As I’m speaking first I say we concentrate on GRAINE.

Genetics – Robotics – Artificial Intelligence – Nanotechnology and Zero Point Energy.

I have chosen these as I think the subjects will cross breed information(word of mouth first) at the present day optimum rate to get us to our finishing point, complete and finished mastery of all possible information.

Surely mastery of information in this way will lead us to the bottom of a fractal??? What if one end of the universes fractal was me??? What could we start to build with that concept???

As parameters, we can assume that we live only in 3 dimensions. We can place this box around The Milky Way galaxy as this gives us plenty of scope for all kinds of discoveries.

In this new system we can make time obsolete as it only making us contemplate our things that cannot be solved because up to now, no one has been willing to stand around for long enough. It has been holding us back.

All watches should be banned so that we find a natural rhythm with each other, those in our immediate area and then further afield.

An old clock watch in this system is should only be used when you think of an event that you need to go to. It is a compass, a modern day direction of thought.

A digital clock can be adapted to let you know where you are in relation to this new milky way boxed system.(x,y,z).

With these two types of clocks used in combination we can safely start taking small steps around the box by using the digital to see exactly where you are and then using the old watch to direct you as your thoughts come to you.

We can even build as many assign atomic clocks as are needed to individually track stars. Each and every star in the Milky Way galaxy.

I supposed a good idea would be to assume that I was inside the centre of the super-massive black hole at the centre of the galaxy. That would stop us from being so Earth centric.

We could even go as far as to say that we are each an individual star and that we project all our energies onto the super-massive black hole at the centre of the galaxy.

You can assume that I have stabilized the black hole into a non rotating perfect cube. 6 outputs /f aces in which we all see ‘the universe and earth, friends’ etc. This acts like a block hole mirror finish. Once you look it is very hard to look away from.

The 6 face cube should make the maths easier to run as you could construct the inner of the black hole with solid beams of condensed light(1 unit long) arranged into right angled triangles with set squares in for strength.

Some people would naturally say that if the universe is essentially unknowable as the more things we ‘discover’ the more things there are to combine with and therefore talk about. This is extremely fractal in both directions. There can be no true answers because there is no grounding point. Nothing for the people who are interested, to start building there own personal concepts, theories and designs on.

Is it possible that with just one man becoming conscious of a highly improbable possibility that all of universes fractals might collapse into one wave function that would answer all of humanities questions in a day? Could this be possible?

Answers to why we are here? What the point of life really is et al?

Is it possible that the insignificant possibility could become an escalating one that would at some point reach 100%???

Could it be at that point that the math would figure itself out so naturally that we would barely need to think about it. We would instantly understand Quantum theory and all.

Can anybody run the numbers on that probability?

December 4, 2010 8:13 am

Actually both proofs are wrong. Barbers don’t cut their own hair, except on Crete.

October 12, 2016 5:55 pm

The earlier proof is a diagonal argument but geometric:

Imagine a segment S with P = P0, P1, … points on it. Subdivide S = S0 into subsegments S00 and S01 such that S00 and S01 do not meet at P0. Let S1 be that subsegment where P0 isn’t. If for some i Si is devoid of any Pj then any point in Si will be apart from any Pj. Otherwise subdivide Si into subsegments Si0 and Si1 as before and let Si+1 bet that subsegment where Pk is not, Pk being the first P on Si. The intersection of S0, S1, … gives at least one point apart from any P. Suppose it is a point, its locational code is then the dyadic sequence x0x1… where S = S0x0, S1x1, … : a diagonal point.

It is strange that Goedel’s “17 Gen r” is frequently thought to be based on diagonalization. It is not, at least in Goedel’s 1931 paper.

Turing is a dwarf on Goedel’s shoulders. The claim that we owe Computer Science to Turing is distressing. As for the practical part of the matter, Zuse had the hardware at least one year before Turin’s thesis, even if we were to forget Pascal, Leibniz, Babbage etc.
Turing himself made a veiled admission that what he had done was a formulation of Goedel’s 1931 result in terms of machines, which were and still are in the spirit of the time. Goedel was convinced of the adequacy of Turing’s paper for this reason. (The avowal was expressed in terms of an analogy with an excited electron or something, if memory serves).