What Is A Complexity Class?
Hartley Rogers is a mathematician at MIT who wrote arguably the definitive book on basic and advanced aspects of recursion theory. His book entitled Theory of Recursive Functions and Effective Computability is a classic. He also had many strong Ph.D. students over the years, including several who are well known computer scientists: Patrick Fischer, David Luckham, and Paul Young.
Today I plan on talking about the foundations of complexity theory. Several commentators have asked more than once, “what is a complexity class?” I will try to answer that and a bit more.
I learned recursion theory from Rogers’ book. I learned early on to beware of exercises marked with the scary symbol. These were very hard problems. There may be more recent treatments of recursion theory, but to me Rogers’ book will always be one of the great textbooks on any subject.
A recent New York Times front page article on live vs. distance learning states two main points. More and more schools are using on-line technology—it is driven by lack of funds, and there is less and less difference being found between live and on-line learning. While I never was in a live class with Rogers, I wish I had been. There are many stories of how he taught, but my favorite is that he rewarded top students with Leibniz cookies.
I guess that will be always hard to do in an on-line environment—perhaps one day it will be possible to have a cookie roll out of your laptop. I do support experiments that are trying to make better use of technology to teach, but getting a chocolate cookie seems hard to beat.
Let’s move on to the foundations of complexity theory, and I will connect back to Rogers later.
The Big Three Ideas
Complexity theory is based on three ideas, they are ideas that are so built into our “DNA” as theorists that I think we sometimes forget how important they are. There are other key ideas, but these three come first, in my opinion.
Boolean strings: I already got in trouble over this one here, but I will try to state it again. The starting insight is that we can use boolean strings to represent all our basic objects. From integers to matrices, from permutations to groups, we can and do use strings of ‘s and ‘s to represent them. Strings are the basic building blocks that we use every day in complexity theory. We say all the time, Let be an natural number It is implicitly assumed that everyone will convert that to Let be a string of ‘s and ‘s that encodes the number in the usual way. I believe the first to talk at length about binary numbers was the great Gottfried Leibniz. He was not the first to discover the binary system, but he did write at about it in his Explication de l’Arithm’tique Binaire in 1703. Also see the original here.
The definition of a problem is simple: a problem is nothing more than a set of strings. That is it. Problems often consist of Boolean strings, but we can allow arbitrary finite alphabets, since any alphabet can be encoded into a Boolean one.
This is one of the simplest definitions in mathematics, yet it is really quite important. The notion of a problem as a formal mathematical object allows us to study problems, without this definition a problem would be a vague notion. I think that one of the great achievements of theory is the formalization of the “problem” notion.
Before this definition people might have studied a problem like: “how do we assign tasks to machines?” or “how do we find the shortest path in a graph.” But they were not precise mathematical objects. So even though the definition is almost trivial, it is very important.
In the early days of theory problems were usually called languages. A language was just a set of strings:
The reason they were called languages initially is based on the early motivations of the field. The initial questions were driven by two types of languages: natural languages and artificial ones. Certainly Noam Chomsky and many others were interested in trying to model real languages, such as English. Others were driven by their interest in computer programming languages. The questions raised in early theory reflected these interests, as did the results. For example, one beautiful idea is called BNF—and stands for Backus-Naur Form. John Backus was interested in ways to describe programming languages like Algol, which lead him to create BNF. Later he used related tools in his important work on FORTRAN; for this and other seminal work he was awarded the Turing Award in 1977.
Properly a language denotes a decision problem. There are also function problems, such as given a whole number , find its prime factors. But these can be converted to decision problems over strings, such as the set is less than some prime factor of . A fast way to solve yields a fast way to factor numbers.
A family of problems is called a complexity class. Thus, is a complexity class provided it consists of problems. It is that general, that simple, and that elegant. Any such family is a complexity class. But as in the famous book Animal Farm by George Orwell:
All complexity classes are equal, but some complexity classes are more equal than others.
Our friends at Wikipedia give a much more restrictive definition of a complexity class:
In computational complexity theory, a complexity class is a set of problems of related resource-based complexity.
I think the above definition is fine for helping pin down the notion of complexity, but for class it is too restrictive in two ways. Not all interesting classes are related by a known computational resource. The notion of what are important resources itself keeps changing and growing. In the beginning there was just time and space, then we added nondeterminism, then probabilities of various sorts, then interaction, stream-ability, and so on. Perhaps energy is next.
Second, the notion of “related” better applies to the idea of a complexity level. The -complete problems are all closely related and form a level as well as a complexity class, but is a complexity class with levels that are quite unrelated. Of course, relatedness depends on the level of fine detail—compared to primitive-recursive all of is relatively easy—but saying “related” misses the larger point, I think. The notion of class identifies problems that share important characteristics, which may not be resource-bound and may not be “equal” even in Orwell’s sense.
I like having a general view of what a complexity class is, to focus on what makes problems interesting. In any event, not all possible complexity classes are interesting, and we will soon see some things that make a class, well, “classy.”
How Many Classes Are There?
The answer to this question depends on what you mean. The cardinality of strings is countable , the cardinality of problems is , and the cardinality of complexity classes is .
Of course most of these classes are uninteresting. But I thought it would be fun to point out the sizes involved. If nothing, it is clear there is a hierarchy here: there are many more complexity classes than problems, and many more problems than strings.
Since there are complexity classes and there are only a finite number of people doing research in complexity theory, we must focus on some of the classes. There is no single rule that I know that makes a complexity class “interesting.” But some classes are definitely more equal than others.
At the top level some classes seem to be very natural, others seem less natural but help explain the structure of other classes, and some classes exist for more practical reasons. I think some exist because we can prove something about them, while other more natural classes are too difficult to understand.
There is a great resource for those interested in complexity classes—especially the subset of the classes that people have considered. It is called the “Complexity Zoo” and is run by:
- Zookeeper: Scott Aaronson
- Veterinarian: Greg Kuperberg
- Tour Guide: Christopher Granade
They say that `there are now 493 classes and counting!” Of course we know that there are many more classes, but 493 is the count of “interesting” ones. Here is the tag cloud of the zoo:
One of the things that makes the Complexity Zoo different from a normal zoo, is that sometimes classes disappear. Sometimes we discover that two complexity classes are really the same—I do not know if this happens in real zoos. Do zookeepers find that animals they thought were different are really exactly the same? This does happen in our world. There was, for example, a time when and were thought to be separate, but we now know they are the same. That classes can disappear seems to be one the cool properties of our world.
Properties Of Complexity Classes
One of the properties that makes a complexity class special is having a complete problem. Suppose that is a class, if there is a single problem in the class that captures the entire class this is very neat. That a complexity class can be represented by one problem is quite special. This happens, of course, for the class : the famous SAT problem, among many others, is -complete. All the intricate structure of is captured by SAT.
The precise definition of being a complete problem for a class depends on the important notion of “reduction.” For to be complete it must have two properties:
- The problem must be in the class .
- Given any problem in the class there must be a reduction from to .
Note, the type of reduction can and does vary: there is polynomial time reduction, Turing reduction, reductions based on lower complexity classes, and so on. The intuitive notion is, however, simple: reduces to provided the ability to solve the problem allows one to solve .
I wanted a name for a complexity class that does have this special property: having a complete problem. I have tried to create a good name, I do not know one that describes exactly this property. I thought it might make sense to call such a class a -class after the usual symbol used for one of the first complexity classes ever discovered to have a complete problem. In recursion theory, see Rogers book, the letter is used to denote the Gödel numbers of those Turing Machines that halt on input . If there is a known name, or a better one, let me know. For now a class is a -class if it has a complete problem.
In Rogers book various notions of complete are defined. There are, by the way, important classes that are not known to be -classes. Perhaps the most famous, notorious even, is . This is the class of problems accepted by polynomial time probabilistic machines than have two sided bounded error. One of the great open problems is: Does this class have a complete problem? Another is the same question for .
All of this is laid out clearly in Rogers’ book, although he does not state some of what I have as explicitly. He assumes natural numbers and does not start with boolean strings. But the notion of problems is clear, as is the notion of complexity classes.
There are countless open problems. The main questions fall into two types: what are the relationships between various classes and what are the properties of a class. The former most famous example is the question, and the latter most famous is the question does have a complete problem, is it a -problem?
[various word-level tweaks]