Beating A Forty Year Old Result: Hamilton Cycles
A new exponential algorithm for finding a Hamilton cycle in a graph
Andreas Björklund is a complexity theorist who has many interesting results on the complexity of hard problems. He is a master at algorithms for hard problems such as computing the permanent and computing the number of solutions to #P complete problems.
Today I will discuss his recent paper that will soon appear at FOCS 2010. I think it is a beautiful result, and a clever proof. It solves a decades old problem.
I also think that Björklund’s result is an example of a trend among theory results. Most researchers believe that NP-complete problems not only lack polynomial-time algorithms, but require exponential time in worst case. Hence there is greater attention to algorithms with running time
where is a constant and is a parameter of the problem. Note, usually is not just the number of bits required to represent the problem. An example is SAT itself: there the parameter is the number of variables. The conjecture, in this case, that the best algorithm is exponential in is called the Exponential Time Hypothesis. I must add that I find this conjecture very hard to believe. Yet all algorithms to date for SAT do indeed take exponential time—perhaps I am wrong.
A Hamiltonian cycle is a cycle in an undirected graph that visits each vertex exactly once and also returns to the starting vertex. A graph is said to be Hamiltonian if it contains a Hamiltonian cycle.
The obvious algorithm is try all cycles: this would take time at least. One principle I have always told my students is:
No “real” problem’s best algorithm runs in time.
I have many times seen initial algorithms that take this time, but there always seems to be a way to get a better bound. The classic result, with the much better bound, is due to Michael Held and Richard Karp in 1970:
Theorem: There is a deterministic algorithm that takes an undirected graph on vertices as input and decides if the graph is Hamiltonian in time.
The key insight of their famous result is to make clever use of dynamic programming. The induction essentially keeps track not ordered sequences of vertices visited, but sets of vertices visited. This can be used to reduce to .
Björklund’s new result is a major improvement over this result—the first in forty years.
Theorem: There is a Monte Carlo algorithm detecting whether an undirected graph on vertices is Hamiltonian or not running in time, with false positives and false negatives occurring with probability exponentially small in .
The major point is that his algorithm runs in exponential time still, but the constant is now . As I stated earlier I believe that we will see many more results like this; results that improve the constant on an exponential algorithm.
An Overview of The Proof
Ryan Williams was kind enough to supply a summary of the proof. I have made some changes, so any errors are all mine. As usual look at the paper for the details.
But first, here is my summary of Ryan’s summary:
- Replace the graph’s adjacency matrix so that each replaced by a different variable—except that symmetric positions get the same variable.
- Then notice that the determinant of this matrix is the same as the permanent over any finite field of characteristic . This follows since in such a field.
- This permanent is, of course, a very complex polynomial with an exponential number of terms. However, it is easy to tell by evaluation at a random point whether or not the polynomial is trivial—has no terms or not.
- The claim we would like to make, but cannot, is that this polynomial is trivial if and only if there is no Hamilton cycle. Of course this would imply a randomized polynomial time algorithm, so it is too much to hope for.
- The claim is therefore false: the polynomial can be trivial for the wrong reasons. The fix to this is the most clever part of the proof. The idea is to add extra labels to the variables to avoid being trivial for the wrong reason. The cost of this last step is exponential. But, it is not too exponential.
Here is Ryan’s summary of the proof. We will first start with a broken polynomial time algorithm for Hamiltonian cycle, then over a series of steps we will try to fix it.
Let be an undirected graph. Define a matrix over variables as follows.
when and and in . Let when in , and let when in .
That is, the matrix is similar to an adjacency matrix, and it is symmetric except for the first row and first column. Suppose we plug in random nonzero values for the ‘s into drawn from the field , and evaluate the permanent of the resulting matrix. If the permanent is zero, say “no Hamilton cycle” If nonzero, say “Hamilton cycle.”
This is a polynomial time algorithm, since permanent = determinant over characteristic two. And, as the paper states, the permanent sums over all oriented cycle covers in the graph, and for each cover we have the product of all where is an arc in the cycle cover. That is, it enumerates all cycle covers along with all possible orientations of the cycles in the cover. One might expect that, over characteristic two, the “bad” cycle covers might all cancel, and so we only get a nonzero evaluation with high probability (whp) if there’s a Hamiltonian cycle. Note that if there is a Hamiltonian cycle, then we do get a nonzero evaluation whp, because for each Hamiltonian cycle we have two orientations of it which contribute an expression of the form
where is a product of all other arcs in the cycle to the sum, for some node . By the Schwartz-Zippel lemma, the sum of these will evaluate to nonzero whp, regardless of what the rest of the polynomial looks like—or by another result.
However the algorithm fails to find a Hamiltonian cycle whp, for one reason: these directed cycle covers can have “2-cycles” which go from some vertex to then back to . If the cycle cover is a collection of 2-cycles plus a cycle involving vertex 1, then when you “re-orient” one of these 2-cycles as in Lemma 1 of the paper, you get the same cycle cover back, not a different one. Hence you cannot “pair up” the cycle covers that have these 2-cycles so that their corresponding monomials all cancel out modulo 2. For all other cycle covers which are not Hamiltonian, they actually do cancel in the evaluation, as the proof of Lemma 1 shows.
The key insight is to assign distinct “labels” to all the arcs in the cycle cover, so that when you reverse the directions of arcs in a 2-cycle, you do get a different 2-cycle. If you have the 2-cycle with label “a” on and label “b” on , then the reverse cycle now has label “b” on and label “a” on , which is a “different” assignment of labels.
The good news is this works. The bad news is there are potentially too many labels needed. A naive method would use labels and this would yield an algorithm for Hamiltonian cycle that runs in , which is too slow. But at least we can compute this “labeled cover” problem in time, where is the number of labels needed.
The next important idea is to figure out a way to use less than labels. We will pick a random bisection of the graph into roughly equal parts and . Fix a Hamilton cycle. With decent probability, there are at most edges of this cycle where both endpoints of the edge are in . The paper explains how to use this partition to reduce the number of labels needed.
As I said earlier, I believe there is a “trend” among research that goes to a quote of Alan Perlis:
“for every polynomial-time algorithm you have, there is an exponential algorithm that I would rather run.”
I am quite interested in work that is trying to improve the exponential on the running time. There have already been several neat papers, including this one, and I expect to see more.