Making Choices
The Axiom Of Choice, some history, some ideas
Gregory Moore is a historian of mathematical logic. One connection he has to things dear to us is that he edited the first two volumes of Kurt Gödel’s Collected Works, as well as some of the work of Bertrand Russell.
Today I want to talk about making choices and in particular the axiom of choice.
I just spent a week at the beach, where I had a chance to read quite a bit while listening to the waves. One book I read was hard to put down—it is a book by Moore titled Zermelo’s Axiom of Choice. Okay I am a math type. I did read a couple of thrillers, but Moore’s book really is a fascinating read.
The Axiom
Moore’s book calls the axiom of choice “The Axiom.” It really is Zermelo’s Axiom of Choice, but “Axiom” is much shorter and probably cuts the length of the book by quite a bit. The Axiom was first stated explicitly by Ernst Zermelo in 1904.
You probably know it: The Axiom states that for any family $\mathcal{F}$ of non-empty sets, there is a function $f$ so that for each set $S$ in the family $\mathcal{F}$, $f(S)$ is an element of $S$. There is no other constraint on $f$: there can be sets $S \neq T$ so that $f(S) = f(T)$. The critical point is that $f(S)$ is always an element from $S$.
Intuitively the Axiom says that there always is a choice function $f$. The function $f$ chooses some element from each set $S$ in the family. The point of the Axiom is that while there is often a way to define an explicit rule for $f$, this is not always possible. The Axiom therefore states that no matter how complex—in any sense—the sets in $\mathcal{F}$ are, there is a way to select the required elements.
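To make the definition concrete, here is a minimal Python sketch—mine, not from Moore or Zermelo—of a choice function for a finite family of sets of integers. For such families no axiom is needed, since an explicit rule like “take the minimum” is available; the Axiom is what guarantees an $f$ exists even when no explicit rule can be written down.

```python
# A choice function for a *finite* family of non-empty sets of integers.
# Here an explicit rule suffices: pick the minimum element of each set.
def choice(family):
    """Map each set S in the family to some element f(S) of S."""
    return {frozenset(S): min(S) for S in family}

family = [{3, 1, 4}, {1, 5}, {9, 2, 6}]
f = choice(family)
assert all(f[frozenset(S)] in S for S in family)  # f(S) is always in S
# Note f({3,1,4}) = f({1,5}) = 1: distinct sets may share a chosen
# element, exactly as the definition allows.
```

The Axiom earns its keep only when the family is infinite and the sets are so unstructured—arbitrary sets of reals, say—that no rule like `min` can be defined.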
From Wikipedia:
“The Axiom of Choice is necessary to select a set from an infinite number of pairs of socks, but not an infinite number of pairs of shoes.” — Bertrand Russell
The story of the Axiom—and the reaction to the Axiom—form the subject of the book. See the book for the details; it really is fun. I would like to say something about the book, but first let me give some good news and bad news.
Good News And Bad News
You probably know some of the consequences of the famous Axiom. Essentially there are two types of results. Some results would be classified by most people as good results, while other results would be classified by many as bad results. There are weaker versions of the Axiom that miss some of the bad consequences, but they also miss some of the good results. To avoid complication let’s just list some results obtained with the Axiom, labeling them as good or bad.
Good: The real numbers are not the union of a countable set of countable sets of reals. The Axiom is used to prove this. Note that the famous diagonal result of Georg Cantor shows that the reals are not a countable set. Yet his proof does not rule out that the reals could be the union of a countable list of countable sets of reals. Strange, but true. So for all those who think the reals are countable—we have discussed this before—here is some hope. If you deny the Axiom, then you get something close to countable.
Let me explain this more carefully, since it seems crazy. How can the countable union of countable sets be anything but countable? When I saw this I thought: hey that is easy to prove. So let’s go and “prove it.”
Suppose that $A_1, A_2, A_3, \dots$ is a countable family of sets where each $A_i$ is also countable. Let’s prove that the union of all these sets is itself countable—that is, that $B = \bigcup_{i=1}^{\infty} A_i$ is countable. Since each $A_i$ is countable, we can list its members as $a_{i,1}, a_{i,2}, a_{i,3}, \dots$ Then $B$ is countable by the same diagonal argument that Cantor used to show that the rationals are countable. Done.
This does not use the Axiom. Right? Wrong. The Axiom is used in a subtle manner that is nearly invisible. If you want a challenge, take a moment and see if you can see where the Axiom is used.
The Axiom was invoked when we enumerated the elements of all the sets $A_i$. There are infinitely many $A_i$, and each has infinitely many bijections onto the positive integers. We used the Axiom to select one of these bijections for each $i$, to make the enumeration $a_{i,j}$ well-defined. Indeed, using forcing methods there are models of set theory without the Axiom where the reals—though uncountable—are the countable union of countable sets. Amazing.
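Here is a small Python sketch—my illustration, not from the post’s sources—of Cantor’s diagonal sweep. Note where the choice hides: the function must be *handed* one fixed enumeration per set; picking those enumerations, one for each of infinitely many sets, is the appeal to (countable) choice.

```python
from itertools import count, islice

def enumerate_union(enumerations):
    """Enumerate the union of countably many countable sets A_1, A_2, ...

    `enumerations(i)` must return *one chosen* enumeration of A_i (an
    infinite iterator).  Supplying this function is exactly where
    countable choice hides: each A_i has infinitely many possible
    enumerations, and someone must pick one for every i at once.
    """
    iters = []
    for n in count(1):
        iters.append(iter(enumerations(n)))  # bring A_n into play
        for it in iters:                     # diagonal sweep: one new
            yield next(it)                   # element from each set so far
    # Element j of A_i is emitted at outer step i + j - 1, so every
    # element of the union appears (possibly with repetitions).

# Toy instance: A_i = {i, 2i, 3i, ...}.
demo = lambda i: (i * k for k in count(1))
print(list(islice(enumerate_union(demo), 15)))
```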
Good: The Lebesgue measure is countably additive. This is one of the basic features that make measure theory work. The Axiom is used to prove this. Its use is related to the previous “good” result.
Bad: There are non-measurable sets. Life would be much simpler if all sets were measurable, but the Axiom shows that this is false.
Bad: There is no finitely additive measure defined on all subsets of three-dimensional space that is invariant under Euclidean transformations and gives the unit ball a finite non-zero measure. This follows from the famous Banach-Tarski paradox: the unit ball can be divided into a finite number of pieces and reassembled into two identical unit balls.
The Story
What I found interesting is the confusion that surrounded the early days of set theory, especially with regard to the Axiom. Zermelo introduced the Axiom to prove Cantor’s claim that the reals could be well-ordered. Of course the reals have a simple order defined on them, but a well-order is more. It requires that every non-empty subset has a least element. This fails for the usual order—just consider all the positive reals. They clearly have no least element.
Zermelo showed that the Axiom implies that every set can be well ordered. At the time many doubted that every set could be well ordered, but the Axiom seemed more reasonable. This is the reason that Zermelo introduced it. Many doubted the Axiom. Their doubts came from various sources. Initially the idea that there is always a choice function seemed quite powerful. Later as consequences of the Axiom were discovered, especially “bad” ones, many disliked the Axiom.
What I thought was cool about the story is that Zermelo had a simple point. Throughout analysis the Axiom had for years been used repeatedly by mathematicians without their knowing they were using it. So Zermelo’s point was: you have already used the Axiom in your work—so why can’t I use it to prove the well-ordering result? I found it extremely interesting that people could use the Axiom in their research, and still fight against it.
Read the book for all the details and much more about the Axiom.
Complexity Theory Version?
While reading the book, I did think about whether or not there is some finite version that is relevant for us. We tend to worry mostly about finite sets, but when talking about decision problems we are actually concerned with countable sets. Is there some connection between the Axiom and our basic unsolved questions?
One possible connection is the question: what is a finite set anyway? There are many ways to define what makes a set finite. The standard notion is that a set $S$ is finite provided there is a bijection between $S$ and $\{1, 2, \dots, n\}$ for some natural number $n$. But there are other definitions. What is relevant is that these definitions seem to capture the same notion of finiteness, yet without the Axiom their equivalence is not provable. So if we think in complexity theory about these other notions of finite, could that lead us to some new insights about complexity theory? I wonder.
Here is one of the most famous alternative definitions, due to Richard Dedekind. A set $S$ is Dedekind-infinite provided there is a bijection between a proper subset of $S$ and $S$ itself. It is Dedekind-finite if it is not Dedekind-infinite. See here and here for more details. Note the Axiom is required to show that this definition is the same as the standard one. Okay, a weaker choice axiom is enough, but some form is required.
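As a small illustration—mine, not Moore’s—here is the classic witness that the natural numbers are Dedekind-infinite: the successor map is a bijection between all of $\mathbb{N}$ and the proper subset $\mathbb{N} \setminus \{0\}$.

```python
# N is Dedekind-infinite: n -> n + 1 is a bijection between N and the
# *proper* subset N \ {0}.
succ = lambda n: n + 1   # injective; every m >= 1 is hit
pred = lambda m: m - 1   # its inverse on N \ {0}

assert all(pred(succ(n)) == n for n in range(1000))     # spot-check
assert all(succ(pred(m)) == m for m in range(1, 1000))  # spot-check
# No standardly-finite set admits such a map: a bijection with a proper
# subset of itself would violate the pigeonhole principle.  The subtle
# direction--every Dedekind-finite set is standardly finite--is the one
# that needs (countable) choice.
```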
Open Problems
Are you a believer of the Axiom? Or are you a doubter?
Can we relate Dedekind finite in some way to fundamental questions in complexity theory?
Software engineers are commonly called upon to order the set of Turing machines in P (as in: here are two algorithms, each presented concretely and finitely as a TM input tape … which tape is asymptotically faster?). It is unsettling to contemplate that such orderings may be undecidable. As Bill Thurston says in On Proof and Progress in Mathematics
On a more optimistic note, Thurston also says in his top-rated answer to the MathOverflow Question “What’s a Mathematician to Do?”:
Perhaps we’ll end the 21st century with a better appreciation of how Thurston’s apprehensions (of the first quote) act to condition Thurston’s hopes (of the second quote).
To put it another way, the following three classes of assertions are ontologically identical:
• axiomatic assertions (we can choose algorithms A and B, both P)
• oracular assertions (let A and B be algorithms in P)
• advertising assertions (algorithm A is faster than B)
Practicing engineers appreciate the generic infeasibility of assessing the truth-values of advertising assertions … yet circumstances require that we do our best. This (wonderful!) GLL post reminds us that axiomatic and oracular assertions are no different.
Examples In Dick’s examples, theorems that depend upon choice functions acted (subtly and unconsciously) to introduce the new axiom “C” into the formal system “ZF”.
Query Have oracle-dependent complexity-theoretic theorems similarly introduced (subtly and unconsciously) new axioms into set theory?
Observation Formalizing projects like Vladimir Voevodsky’s HoTT constructivist program — as described earlier this month in the wonderful GLL essay Surely You Are Joking? — are beginning to provide definitive answers to these tough questions.
Reason for Hope As Bill Thurston’s Proof and Progress essay reminds us:
The sooner, the better! 🙂
A further (and very enjoyable) reference in regard to the above ideas is the introductory chapter to Ravi Vakil’s (justly celebrated as it seems to me) free-as-in-freedom course notes Foundations of Algebraic Geometry in which we read:
An open question of central consequence to complexity theorists in particular — and arguably to the STEM enterprise in general — is whether the postulated separations of the Complexity Zoo require, for their rigorous proof, more scrupulous attention to foundational issues than their pedagogy has previously devoted to them.
Ravi Vakil includes, in this same introduction, a partial reference to one of my favorite David Mumford quotes (from Mumford’s Curves and their Jacobians, 1975), which reads in full:
In the decades since 1975, the mathematical vision associated to Mumford’s “real adventure” has grown to span such a vast STEM domain that even we medical researchers/engineers are embracing it!
Thank you John for pointing to Ravi Vakil’s excellent course in algebraic geometry. I wish I had the time to study it thoroughly…
I thought about exactly these issues (even while reading the same book!) a few years ago. For most complexity versions of AC I could come up with, the common wisdom is that they are false, and some of them, I think, are even provably false.
Depending on how you phrase the axiom of choice, in the complexity world you can get:
– Every polytime computable equivalence relation has a polytime canonical form. In my paper with Lance Fortnow (“Complexity Classes of Equivalence Problems Revisited”) we showed that this would imply that factoring is easy, NP=UP=RP, and PH=BPP. (A toy illustration of canonical forms appears after this list.)
– Proposition Q (see Fenner-Fortnow-Naik-Rogers, “Inverting Onto Functions”): every honest, polytime, surjective function has a polynomial-time inverse. Equivalently: given any NP machine deciding SAT, one can in polynomial time construct a satisfying assignment from an accepting path of that machine. Also equivalent: finding, in polytime, accepting paths for any nondeterministic machine deciding $\Sigma^*$. Prop Q implies that P = NP ∩ coNP, and more.
– For every infinite language $L$ (in P, depending on how you formulate it), $L$ and $L \times L$ have the same p-cardinality, i.e. there are partial polytime functions $f, g \colon \{0,1\}^* \to \{0,1\}^*$ such that $L$ is contained in the domain of $f$, $L \times L$ is contained in the domain of $g$, and $f \circ g$ restricted to $L \times L$ is the identity on $L \times L$ and $g \circ f$ restricted to $L$ is the identity on $L$. A priori this is weaker than saying they are p-isomorphic, since $f, g$ here need not be total. Not sure if this has “bad” complexity consequences or not.
– … (I remember there were several other complexity statements, but I don’t remember them as it’s been a few years)
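Regarding the first item in the list above, here is a toy Python sketch (my own, purely illustrative) of a polytime equivalence relation that does happen to have a polytime canonical form. The open question is whether every polytime equivalence relation admits one; Fortnow-Grochow show that assuming a blanket guarantee would have dramatic consequences.

```python
# Toy instance of "polytime canonical form": the equivalence relation
# "x ~ y iff x is an anagram of y" is polytime decidable, and sorting
# the characters gives a polytime canonical form -- one fixed
# representative per equivalence class.
def equivalent(x: str, y: str) -> bool:
    return sorted(x) == sorted(y)

def canonical_form(x: str) -> str:
    return "".join(sorted(x))  # same output for every member of x's class

assert equivalent("listen", "silent")
assert canonical_form("listen") == canonical_form("silent") == "eilnst"
# Whether *every* polytime equivalence relation has such a function is
# open; taking it as an axiom makes factoring easy, gives NP = UP = RP,
# and collapses PH to BPP, per Fortnow-Grochow.
```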
I read on Fortnow’s blog that he does believe factoring is easy.
QuestionMan
Yes. The question is to find the algorithm. We will see…
How frustrating would it be if someone proved a $2^{(\log\log n)^{1+\epsilon}}$ algorithm valid for all $\epsilon > 0$, but bringing $\epsilon$ to $0$ needed something as non-trivial as bringing $\epsilon$ to $0$ in the $n^{1+\epsilon}$ algorithm for FFTs. The only thing is, in the factoring algorithm, if $\epsilon \neq 0$ it would mean $P \neq NP$—but convincingly by only just that much.
From Grochow’s comment, maybe we live in a world where both $\epsilon = 0$ and $\epsilon > 0$ (any value large enough to render it non-quasi-polynomial) are acceptable—just like living with AC and living without AC are both acceptable. Is this possible in complexity theory?
Obviously one is tempted to compare schemes over $\mathbb{F}_1$ and the integers. Polynomial factorization over finite fields has proven deterministic complexity (under GRH) of $n^{\log n} (\log q)^{O(1)}$, where $n$ is the degree of the polynomial. Extending the analogy to $q = 1$, can we then guess an $n^{\log n} = 2^{(\log n)^2}$ complexity algorithm for integers of bit size $n$?
Joshua,
Have you tried negating your various proposals? As I suggested below, it seems that it’s rather the non-existence of some fast algorithm that has a structuring effect in complexity theory, in opposition to what happens in set theory where you must assume the existence of a particular infinite set to get nicer results. It looks as if these two disciplines were each other’s reflected image along the computability axis – however poetic this might sound to you… 🙂
Another phrasing of the same idea is that there’s no complexity theory when all problems are easy (resp. no axiomatic set theory when all sets are finite). Therefore, our axioms will have to state the existence of some complex problems (resp. of some large infinite sets). Similarly, there’s no probability theory when all events are certain, no chaos theory when all processes are stable, and so on…
Reblogged this on Pink Iguana and commented:
Not enough Axiom of Choice posts
the following paper is an interesting application of the axiom:
“A Peculiar Connection Between the Axiom of Choice and Predicting the Future,” by Christopher S. Hardin and Alan D. Taylor.
this is a link to the paper
Your example about countable union of countable sets only requires a weak form of the axiom of choice: the axiom of countable choice.
Minar
Yes, only countable choice is needed. I decided to avoid that distinction, but you are right.
The Axiom of Choice is clearly bunk. It smacks of intellectual dishonesty and the bad things it allows seem particularly bad; bad enough to motivate the search for better proofs of the good things (where possible). If the Banach-Tarski paradox doesn’t convince you, nothing will.
But I’m a constructivist at heart, which naturally prejudices me against it… The relationship between constructivism and the Axiom is itself very interesting, as some (modified) forms of the Axiom are compatible with some forms of constructivism.
I agree. Also a doubter for the same reasons.
Nothing should be able to convince anybody that a provably undecidable statement is false. You may just try to convince us that the axiom of choice is useless, since it obviously is for all constructive purposes. There are known alternatives where every set of reals is measurable—Solovay’s model—but the resulting theories haven’t been embraced by all mathematicians. I think most algebraists prefer the full axiom of choice, for that matter. With Voevodsky’s HoTT program we’ve seen type theory proposed as alternative foundations, but there isn’t even universal agreement as to which set theory should be used! Likewise, there probably never will be common agreement as to which type system should be used.
Typo: “So for all those who doubt the reals are countable” –> “doubt the reals are uncountable” or “think the reals are countable”
My wife rejects AC since AC implies the Banach-Tarski paradox.
For the sake of my marriage I agree with her.
I can imagine math history going a different way, where Banach-Tarski is discovered earlier and hence AC is rejected early on. We could still have AC for countable sets.
How much math would be lost? A lot. How much math that real people in the real world really use would be lost? I suspect not that much, but I would be happy to be proven wrong.
I have never understood the negative reaction people have to the Banach-Tarski paradox. The decomposition is non-measurable. What, exactly, is the big deal? It simply reflects the “paradoxical” properties of the free group on two letters, and the fact that the group is a subgroup of SO(3). You might as well come to the conclusion that three-dimensional geometry is inconsistent.
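For readers who want the mechanism behind this remark, here is the standard paradoxical decomposition of the free group $F_2 = \langle a, b \rangle$ that the comment alludes to, written as a small LaTeX sketch (my addition; $W(x)$ denotes the set of reduced words beginning with the letter $x$):

```latex
% F_2 splits into the identity plus four "cones" of reduced words:
\[
  F_2 = \{e\} \cup W(a) \cup W(a^{-1}) \cup W(b) \cup W(b^{-1}).
\]
% Translating W(a^{-1}) by a recovers every word not starting with a,
% and similarly for b, so the pieces rebuild TWO copies of F_2:
\[
  F_2 = W(a) \cup a\,W(a^{-1}), \qquad F_2 = W(b) \cup b\,W(b^{-1}).
\]
```

Embedding $F_2$ as a group of rotations inside SO(3) and using the Axiom to pick one representative from each orbit transfers this duplication to the unit ball.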
To add to the confusion, Banach and Tarski also proved that the paradox was impossible in two dimensions–and their proof of that relied on choice. It took about 10 years before a constructive proof that the paradox was impossible in two dimensions was discovered.
Query If separations in the Complexity Zoo are shown to be undecidable in some (strong) constructivist framework (like HoTT for example) then in what sense might/should/could longstanding open problems like PvNP cease to be “math that real people in the real world really care about”?
Alternatively, might/should/could it be the 21st century’s new-fangled constructive frameworks like HoTT that “real people in the real world” cease to care about?
Or else, should we “real people in the real world” simply have faith that these foundational matters are scarcely likely ever to be relevant to “math that we really use”?
These interlocking questions seem (to me) to be mighty tough *and* mighty important! 🙂
The above three-part query was a response to Bill Gasarch’s very interesting AC vs Banach-Tarski remarks. Perhaps not the least significant role of complexity theory is that it provides a natural test-bed for these tough foundational issues.
W.r.t. the relevance of AC to computer science, I really liked this answer on cstheory by Timothy Chow: http://cstheory.stackexchange.com/a/4031/4896. What he says there is very related to your remarks about defining “finite”. For example, the standard Graph Minor Theorem needs some form of choice, but if you fix an encoding of the minor relation then choice is not necessary anymore. His point is that essentially all natural theorems in complexity theory are arithmetic or can be rephrased as arithmetic statements, and therefore are provable without choice.
It may be hard to prove AC=False or True. However, it is easy to show that AC is both True and False. Dick knows the proof but he has to avoid it because it is bad news for him.
Rafee Kamouna.
The axiom of choice – any collection of nonempty sets has a nonempty product – can be viewed as a tool for reasoning about the many objects which can’t be seen constructively – such as the non-measurable sets on the line. I think the beauty of set theory is greatly enriched by this optimistic assumption.
By contrast, the hypothesis P=NP – the set of polytime algorithms for SAT is nonempty – would destroy the so-called “polynomial hierarchy” and, for this reason, most mathematicians prefer to suppose P!=NP. Moreover, the latter reflects more accurately our everyday experience and common wisdom. So much so that some large parts of complexity theory are actually structured by this rather pessimistic assumption.
All this to say that existence assumptions yield opposite effects in set theory than they do in complexity theory. Structurally speaking, the complexity equivalent to the axiom of choice is really P!=NP. So it’s fair to say that, in a way, complexity theorists have been working with their own “axiom of choice” since the beginning…
Indeed, set theory studies various degrees of uncomputability – several strengths of choice axioms, several sizes of infinite cardinals – while complexity theory’s about the various degrees of complexity. Just because the barrier of computability lies between these two sciences, that doesn’t mean they can’t be unified!
… which brings us directly back to Martin-Löf’s type theory, wherein a set is a problem and its elements are the methods of solving it. So, if I were to design a new foundation of math, I’d try to find one that encompassed the hardness of solving the problems.
… though P!=NP looks more like an equivalent of the axiom of infinity.
… in a context where polytime = finite, with a cardinal equal to the degree of the polynomial. Indeed, why not try to measure the asymptotic behavior of an algorithm by a set-theoretic cardinal instead of an increasing function? Thus, the exponential algorithms would be those of countable complexity. Hopefully, this association – of a cardinal to an algorithm/problem/complexity class – could be proved functorial in a natural sense.