Is it Time to Declare Victory in Counting Complexity?
The last dichotomy theorem: graph homomorphism with complex values
Today is a new type of post for us—a guest post, by Jin-Yi. We have used his suggested post title, but have inserted a couple sections for added background.
Well we always do things differently here at GLL—as Tina Turner says:
You see we never ever do nothing
Nice and easy
We always do it nice and rough.
We have several guests lined up for the new year to write posts, but as always we will add a bit of extra flavor and a few comments—Ken and I hope that they will understand.
What Is A Dichotomy Theorem?
One of the central questions of complexity theory is: given a computational problem, can we tell whether it is easy or hard? For example, is it in polynomial time or not? In full generality this meta-problem of classifying problems is hopeless—beyond hopeless, if that makes sense. How could we possibly take an arbitrary problem and classify it?
Yet we would like to do this automatically and for as large a class of problems as possible. One method is to search the Web and see if it has been studied before; a related one is to use the famous Garey and Johnson book. But that only goes so far.
The central idea of a dichotomy theorem is two-fold, pun intended. It is to restrict both the family of problems and the notions of complexity. For every problem of a restricted kind, it says that either the problem is easy or it is definitively hard.
Definitively hard can mean many things, from -complete, to -hard, to -hard. The power of such a theorem is determined by the “size” of the family of problems—the larger the family, the better the theorem. Eliminating any in-between possibility of hardness is why it is called a dichotomy theorem, since, of course, the meaning of “dichotomy” is the splitting of a whole into two pieces.
A Failed Dichotomy?
Before the guest post, let’s talkabout the famous question that Emil Post raised in the early days of computability theory, in the 1940′s. Following Alan Turing’s seminal result that the Halting Problem is undecidable, people quickly undertook to classify problems as computable not. A problem—or if you prefer a set—is called computably enumerable (c.e.) if it is possible to list its values by a Turing machine. The old style that I learned was to call these sets r.e., but that name is out of date. So use “c.e.” if you want to seem to be au courant, especially if you are talking to logicians.
As people began to classify problems it quickly appeared that every one was either computable or equivalent to the Halting Problem. Somehow they could find nothing in-between. Post asked whether there were problems, natural or not, that were in-between. Essentially he asked: is there a dichotomy theorem for c.e. sets?
This great question led to a flurry of ideas and results, but no solution. Post and others defined a variety of types of sets: simple, hypersimple, hyperhypersimple, and others that attempted to solve his problem. (These are real names—I do not know if there was a hyperhyperhypersimple notion.) The idea was this: suppose that you could define a property of sets so that:
- If a set satisfies , then it cannot be equivalent to the Halting Problem.
- There are undecidable c.e. sets that satisfy .
Then there would be sets between computable and complete—the dichotomy theorem for c.e. sets would fail. None of these notions worked, since the language they used for defining the property X was not powerful enough. Eventually, in a brilliant argument, at about the same time in the 1950′s, and independently, Richard Friedberg and Albert Muchnik proved that such intermediate sets do exist. The used a method, now called the priority method, to construct the sets. Therefore the dichotomy theorem failed for c.e. sets.
The issue then becomes whether properties that distinguish the priority-constructed sets are “natural.” There is some analogy to the famous theorem by Richard Ladner which we also referenced alongside Jin-Yi earlier, insofar as it produces sets that are “gappy.” Ladner’s technique can be adapted to many cases to break up possible dichotomies, but always with intuitively “unnatural” sets. In any event it’s not clear how much generality one has to sacrifice to get a dichotomy—and the point is that one always has to sacrifice some.
Let’s now turn to Cai’s post on the current status of dichotomy theorems in complexity theory. We have edited it somewhat, and it refers to Jin-Yi in third person in some places to say we concur with some opinions.
The Guest Post
If you care about classification theorems in complexity theory, especially for the complexity theory of counting problems, or the so-called Sum-of-Product type problems (a.k.a. partition functions), you may want to know about the recent result of Jin-Yi and Xi Chen. Even if you do not care, you may still wish to know about this paper.
The paper has a plain title: Complexity of Counting CSP with Complex Weights. But what the paper has done can be the conclusion of an era for the theory of counting problems.
Or is it?
Let me (Jin-Yi) first give you some historical context, starting from the days of Kurt Gödel and Alan Turing and Emil Post.
Complexity theory—in particular Steve Cook’s NP-completeness—essentially springs from a direct adaptation of Post’s ideas with a polynomial time restriction. Post’s problem was so intriguing because most “natural problems” seem to be either decidable or as hard as the Halting Problem. In fact, it was said that one of the psychological barrier to proving Hilbert’s 10th Problem undecidable was the fear that it might be of intermediate computability, and no one knew then how to prove a natural problem undecidable unless it is as hard as the Halting Problem. Of course ultimately Hilbert’s 10th Problem was proved equivalent to the Halting problem, by the combined work of Martin Davis, Yuri Matiyasevich, Hilary Putnam and Julia Robinson, with Matiyasevich providing the last crucial step.
Back in Complexity Theory, the story is different. The analogue of decidable sets is P, and the analogue of c.e. sets is NP. We still can’t prove NP P, whose analogue in Computability Theory is Turing’s undecidability of the Halting Problem. However, assuming NP P, Richard Ladner did show that there are problems of intermediate complexity in NP. They are neither in P nor NP-complete. However, there is one thing in common with Computability Theory: No “natural problem” in NP has been shown to be of intermediate complexity, even assuming NP P. In fact, most “natural problems” in NP seem to be either in P or NP-complete (some exceptions include Graph Isomorphism and Factoring, but there are no proofs.)
Dichotomy for CSPs
Giving this belief a precise form, Tomás Feder and Moshe Vardi formulated a CSP dichotomy conjecture. Here CSP stands for Constraint Satisfaction Problem. This is an extremely broad class of problems expressible as follows: Let be an arbitrary finite set called the domain set. Let be an arbitrary finite set of relations on . Each has an arity . Then denotes the following class of problems: The input is a set of variables over , and a collection of constraints , each applied to a sequence of variables. The question is, is there an assignment that satisfies all the constraints. Feder and Vardi conjectured that for every , is either in P or NP-complete. This is still open.
The Feder-Vardi Conjecture is about decision problems. There are two other versions of Constraint Satisfaction Problems. One is the optimization version: What is the maximum number of constraints that can be satisfied in a particular instance? The other is the counting version: How many satisfying assignments are there? It is in this counting version we wish to report a recent decisive victory.
One can express counting CSP problems as follows: Identify each with a 0-1 valued function ; then the conjunction of a collection of constraints can be replaced by their product. Then a counting CSP problem is to evaluate the following so-called partition function on an input instance ,
where has arity and is applied to variables . Of course if is 0-1 valued, this counts the number of solutions.
In this Sum-of-Product form, there is a vast generalization where each function need not be 0-1 valued. This heritage comes not from logic, but from statistical physics and elsewhere.
In many physical systems, global behavior can be determined by local interactions, which are naturally expressed in this Sum-of-Product form. Included in this class are the Ising Model, the Potts Model, and many others. There is a long tradition in statistical physics to classify the so-called Exactly Solved Models, as expounded in this book by Rodney Baxter.
In computer science language this translates to polynomial time computable. Physicists did not have a rigorous notion of what is not Exactly Solvable; the proper notion in computer science is #P-hardness.
For the whole class #P, Ladner’s theorem still holds: If #P P, then there are counting problems of intermediate complexity between P and being #P-hard. However no natural counting problems are known to have this intermediate complexity between P and #P-hardness. Again the general belief is that all natural counting problems are either in P or #P-hard.
A succession of beautiful papers can be seen in this light, culminating in the latest Cai-Chen dichotomy theorem. These papers give ever stronger indications to the scope of this general belief.
Graph Homomorphisms: A Natural Case
Over the years, a great deal of effort has been focused on a special case of CSPs: Graph Homomorphisms. This is the case when there is a single binary constraint function . For undirected graphs, as is the basic case, this is a symmetric binary function on and is usually represented as a matrix. This version of the problem was first proposed by Laszlo Lovász in 1967: Given a symmetric matrix , compute
where is any undirected graph, with vertices representing variables, and edges representing constraints.
Let’s list some highlights of this substantial body of work:
For graph homomorphisms, first Martin Dyer and Catherine Greenhill proved (note a complexity dichotomy: For any symmetric 0-1 matrix , is either in P or #P-complete, depending explicitly on . Then Andrei Bulatov and Martin Grohe extended this dichotomy to all non-negative matrices. The next major step was a beautiful paper by Leslie Goldberg, Grohe, Mark Jerrum and Marc Thurley, extending this dichotomy to all real matrices. When constraint functions can take both positive and negative values, it becomes significantly more difficult to prove complexity dichotomies. This is because there is the possibility of cancellation, which makes life much more difficult to analyze. Compare this situation with the permanent versus the determinant, where cancellation actually can make a problem easier. The recent work on holographic algorithms is also a good reminder of this. The situation is somewhat analogous to monotone versus non-monotone circuit complexity.
The final step for was a long and intricate paper by Cai, Chen and Pinyan Lu, which extended the dichotomy to the full generality of all complex-valued matrices. This blog has discussed this paper of over 100 pages before here. The upshot is that they gave a complete classification of to be either in P or #P-hard, with an explicit criterion on .
For directed graph homomorphisms, the matrix is not necessarily symmetric. In this case, the strongest previous result is another paper by Cai and Chen, giving a dichotomy of with a decidable criterion on all non-negative . However, this criterion is more opaque. Before that, Dyer, Goldberg and Mike Paterson did prove a dichotomy for more restricted directed graphs, with a more transparent tractability criterion, called Lovász-goodness. The drawback is that it only applies to special 0-1 matrices that as adjacency matrices define acyclic graphs.
A major accomplishment was a dichotomy by Bulatov, who showed that for every finite set of relations (i.e., 0-1 valued functions) the counting CSP problem is either in P or #P-complete. Bulatov is a master of applying techiques from Universal Algebra to computer science, and this proof uses deep results such as commutator theory and tame congruence theory from Universal Algebra. A major open problem of Bulatov’s theorem is the decidability of the criterion for tractability. This was solved by Dyer and David Richerby, who gave an alternative proof to Bulatov’s dichotomy theorem, which uses very little Universal Algebra. Using a technique of Lovász, they showed that their dichotomy criterion for tractability is decidable. However, even though it is decidable, compared to the Cai-Chen-Lu criterion for Graph Homomorphism, it is still not as explicit.
The Unified Dichotomy Theorem
Now all these dichotomy theorems have become special cases of the following theorem by Cai and Chen.
Theorem: Given any finite and any finite set of constraint functions , where each maps (for some ) to the complex numbers, the problem of computing the partition function is either computable in P or #P-hard.
The tractability criterion for this theorem is explicitly given. However it is not known whether the criterion is decidable. The proof uses ideas from both the Dyer-Richerby paper and the Cai-Chen-Lu paper.
More than the technical issue of decidability of the criterion, there is a very different taste to this dichotomy theorem, compared to the Cai-Chen-Lu paper on Graph Homomorphism. There is a sense in which the Cai-Chen-Lu paper on Graph Homomorphism really lets you understand what makes a problem tractable, and what makes it #P-hard. The criterion is delicate, but roughly speaking, in order to be computable in polynomial time, the matrix must be a rank-one modification of tensor products of Fourier matrices; otherwise it is #P-hard. By contrast, the criterion for the broader CSP dichotomy of Cai and Chen is infinitary: It defines certain infinitary objects from the finite set , and imposes some conditions on them.
One can caricature the situation as follows: In the Cai-Chen-Lu dichotomy, they conquered the land with meticulous care. They catalogued all the flowers and inspected all the caves. In the latest CSP dichotomy, they claimed victory by having encircled the mountain; the land is logically conquered, but one does not really know what treasures lie within.
So is it time to declare victory? You be the judge.
One final comment, from Dick: One simple way to explain the reason for all the above beautiful work shooing that problems are either easy or hard is that . If these complexity classes did collapse that would of course explain away all the dichotomy theorems. The reason there is nothing “natural” in between is that there is nothing there. Just a final thought.