
Beating A Forty Year Old Result: Hamilton Cycles

September 9, 2010

A new exponential algorithm for finding a Hamilton cycle in a graph

Andreas Björklund is a complexity theorist who has many interesting results on the complexity of hard problems. He is a master of algorithms for hard problems, such as computing the permanent and counting the solutions of #P-complete problems.

Today I will discuss his recent paper that will soon appear at FOCS 2010. I think it is a beautiful result, and a clever proof. It solves a decades old problem.

I also think that Björklund’s result is an example of a trend among theory results. Most researchers believe that NP-complete problems not only lack polynomial-time algorithms, but require exponential time in the worst case. Hence there is growing attention to algorithms with running time

\displaystyle  c^n

where {c>1} is a constant and {n} is a parameter of the problem. Note that usually {n} is not the number of bits required to represent the problem. An example is SAT itself: there the parameter {n} is the number of variables. The conjecture that, in this case, the best algorithm is exponential in {n} is called the Exponential Time Hypothesis. I must add that I find this conjecture very hard to believe. Yet all algorithms to date for SAT do indeed take exponential time—perhaps I am wrong.

The Result

A Hamiltonian cycle is a cycle in an undirected graph that visits each vertex exactly once and also returns to the starting vertex. A graph is said to be Hamiltonian if it contains a Hamiltonian cycle.

The obvious algorithm is to try all cycles: this would take at least {n!} time. One principle I have always told my students is:

No “real” problem’s best algorithm runs in {n!} time.

I have many times seen initial algorithms that take this time, but there always seems to be a way to get a better bound. The classic result, with the much better bound, is due to Michael Held and Richard Karp in 1962:

Theorem: There is a deterministic algorithm that takes an undirected graph on {n} vertices as input and decides if the graph is Hamiltonian in {O^*(2^n)} time.

The key insight of their famous result is a clever use of dynamic programming. The induction keeps track not of the ordered sequences of vertices visited, but only of the sets of vertices visited. This reduces {n!} to {2^n}.
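Their dynamic program is short enough to state in code. Here is a minimal Python sketch of the idea (the function name and layout are mine, not Held and Karp’s): the table is indexed by the set of visited vertices and the current endpoint, rather than by the whole ordered sequence.

```python
def held_karp_hamiltonian(adj):
    """Held-Karp dynamic programming: index partial paths by the *set* of
    visited vertices instead of the visiting order, giving O*(2^n) time
    instead of n!.  adj is an n x n 0/1 adjacency matrix; assumes n >= 3."""
    n = len(adj)
    full = (1 << n) - 1
    # dp[mask][v]: is there a path from vertex 0 visiting exactly the
    # vertices in `mask` and ending at v?
    dp = [[False] * n for _ in range(1 << n)]
    dp[1][0] = True  # the trivial path consisting of the start vertex alone
    for mask in range(1 << n):
        if not mask & 1:
            continue  # every path we track starts at vertex 0
        for v in range(n):
            if dp[mask][v]:
                for w in range(n):
                    # extend the path by any unvisited neighbor of v
                    if adj[v][w] and not (mask >> w) & 1:
                        dp[mask | (1 << w)][w] = True
    # close the cycle: a path covering all vertices that ends at a neighbor of 0
    return any(dp[full][v] and adj[v][0] for v in range(1, n))
```

For the 4-cycle this returns True; for a star on three vertices (which has no Hamilton cycle) it returns False.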

Björklund’s new result is a major improvement over this result—the first in forty years.

Theorem: There is a Monte Carlo algorithm detecting whether an undirected graph on {n} vertices is Hamiltonian or not running in {O^*(1.657^n)} time, with false positives and false negatives occurring with probability exponentially small in {n}.

The major point is that his algorithm runs in exponential time still, but the constant is now {1.657 < 2}. As I stated earlier I believe that we will see many more results like this; results that improve the constant on an exponential algorithm.

An Overview of The Proof

Ryan Williams was kind enough to supply a summary of the proof. I have made some changes, so any errors are all mine. As usual look at the paper for the details.

But first, here is my summary of Ryan’s summary:

  1. Modify the graph’s adjacency matrix so that each {1} is replaced by a distinct variable—except that symmetric positions get the same variable.
  2. Then notice that the determinant of this matrix is the same as the permanent over any finite field of characteristic {2}. This follows since {-1 = 1} in such a field.
  3. This permanent is, of course, a very complex polynomial with an exponential number of terms. However, it is easy to tell, by evaluating the polynomial at a random point, whether or not it is trivial—that is, identically zero.
  4. The claim we would like to make, but cannot, is that this polynomial is trivial if and only if there is no Hamilton cycle. Of course this would imply a randomized polynomial time algorithm, so it is too much to hope for.
  5. The claim is therefore false: the polynomial can be trivial for the wrong reasons. The fix to this is the most clever part of the proof. The idea is to add extra labels to the variables to avoid being trivial for the wrong reason. The cost of this last step is exponential. But, it is not too exponential.

\displaystyle  \S

Here is Ryan’s summary of the proof. We will first start with a broken polynomial time algorithm for Hamiltonian cycle, then over a series of steps we will try to fix it.

Let {G = (V,E)} be an undirected graph. Define a matrix {A} over variables as follows.

\displaystyle A(i,j) = A(j,i) = x_{ij}

when {i \neq 1}, {j \neq 1}, and {\{i,j\} \in E}. Let {A(1,i) = x_{1i}} when {\{1,i\} \in E}, and let {A(i,1) = x_{i1}} when {\{1,i\} \in E}.

That is, the matrix {A} is similar to an adjacency matrix, and it is symmetric except for the first row and first column. Suppose we plug random nonzero values drawn from the field {GF(2^n)} into {A} for the {x_{ij}}‘s, and evaluate the permanent of the resulting matrix. If the permanent is zero, say “no Hamilton cycle.” If nonzero, say “Hamilton cycle.”

This is a polynomial time algorithm, since permanent = determinant over characteristic two. And, as the paper states, the permanent sums over all oriented cycle covers in the graph, and for each cover we have the product of all {x_{ij}} where {(i,j)} is an arc in the cycle cover. That is, it enumerates all cycle covers along with all possible orientations of the cycles in the cover. One might expect that, over characteristic two, the “bad” cycle covers might all cancel, and so we only get a nonzero evaluation with high probability (whp) if there’s a Hamiltonian cycle. Note that if there is a Hamiltonian cycle, then we do get a nonzero evaluation whp, because for each Hamiltonian cycle we have two orientations of it which contribute an expression of the form

\displaystyle (x_{1v} + x_{v1})P

where {P} is a product of the variables on the other {n-2} arcs in the cycle, for some node {v}. By the Schwartz-Zippel lemma, the sum of these will evaluate to nonzero whp, regardless of what the rest of the polynomial looks like—or by another result.
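To make this “broken” starting point concrete, here is a minimal Python sketch (my construction, not code from the paper): it builds the matrix {A} above, plugs in random nonzero field elements, and checks whether the determinant—which equals the permanent in characteristic two—is nonzero. For simplicity it works over the fixed field {GF(2^8)} with the AES reduction polynomial rather than {GF(2^n)}, and the repetition count is my own choice.

```python
import random

AES_POLY = 0x11B  # x^8 + x^4 + x^3 + x + 1, irreducible over GF(2)

def gf_mul(a, b):
    """Carry-less multiplication in GF(2^8), reducing modulo AES_POLY."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & 0x100:
            a ^= AES_POLY
    return r

def gf_inv(a):
    """Inverse via a^(2^8 - 2): the nonzero elements form a group of order 255."""
    r, e = 1, 254
    while e:
        if e & 1:
            r = gf_mul(r, a)
        a = gf_mul(a, a)
        e >>= 1
    return r

def gf_det(M):
    """Determinant over GF(2^8) by Gaussian elimination.  In characteristic
    two, row swaps do not change the sign, and determinant = permanent."""
    M = [row[:] for row in M]
    n = len(M)
    det = 1
    for i in range(n):
        piv = next((r for r in range(i, n) if M[r][i]), None)
        if piv is None:
            return 0
        M[i], M[piv] = M[piv], M[i]
        det = gf_mul(det, M[i][i])
        inv = gf_inv(M[i][i])
        for r in range(i + 1, n):
            if M[r][i]:
                f = gf_mul(M[r][i], inv)
                for c in range(i, n):
                    M[r][c] ^= gf_mul(f, M[i][c])
    return det

def broken_hamiltonicity_test(adj, trials=8):
    """The broken algorithm: fill A with random nonzero field values,
    symmetric except in row/column 0 (playing the role of vertex 1 in the
    text), and evaluate the determinant.  Nonzero => claim a Hamilton cycle.
    It errs on graphs whose non-cancelling 2-cycle covers survive."""
    n = len(adj)
    for _ in range(trials):
        A = [[0] * n for _ in range(n)]
        for i in range(n):
            for j in range(i + 1, n):
                if adj[i][j]:
                    if i == 0:
                        A[0][j] = random.randrange(1, 256)  # x_{1j}
                        A[j][0] = random.randrange(1, 256)  # x_{j1}, independent
                    else:
                        A[i][j] = A[j][i] = random.randrange(1, 256)
        if gf_det(A):
            return True
    return False
```

On a 4-cycle the Hamiltonian terms survive and the evaluation is nonzero with high probability; on a star the permanent polynomial is identically zero, so the test always says no.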

However, the algorithm can claim a Hamiltonian cycle when none exists, for one reason: these directed cycle covers can have “2-cycles” which go from some vertex {i} to {j} and then back to {i}. If the cycle cover is a collection of 2-cycles plus a cycle involving vertex 1, then when you “re-orient” one of these 2-cycles as in Lemma 1 of the paper, you get the same cycle cover back, not a different one. Hence you cannot “pair up” the cycle covers that have these 2-cycles so that their corresponding monomials all cancel out modulo 2. All other cycle covers which are not Hamiltonian actually do cancel in the evaluation, as the proof of Lemma 1 shows.

The key insight is to assign distinct “labels” to all the arcs in the cycle cover, so that when you reverse the directions of arcs in a 2-cycle, you do get a different 2-cycle. If you have the 2-cycle {(1,2),(2,1)} with label “a” on {(1,2)} and label “b” on {(2,1)}, then the reverse cycle now has label “b” on {(1,2)} and label “a” on {(2,1)}, which is a “different” assignment of labels.

The good news is this works. The bad news is there are potentially too many labels needed. A naive method would use {n} labels, and this would yield an algorithm for Hamiltonian cycle that runs in {O^*(2^n)} time, which is too slow. But at least we can compute this “labeled cover” problem in {2^k \mathsf{poly}(n)} time, where {k} is the number of labels needed.

The next important idea is to figure out a way to use fewer than {n} labels. We pick a random bisection of the graph into roughly equal parts {V_1} and {V_2}. Fix a Hamilton cycle. With decent probability, at most {n/4} edges of this cycle have both endpoints in {V_1}. The paper explains how to use this partition to reduce the number of labels needed.

Open Problems

As I said earlier, I believe there is a “trend” among researchers that is captured by a quote of Alan Perlis:

“For every polynomial-time algorithm you have, there is an exponential algorithm that I would rather run.”

I am quite interested in work that improves the base of the exponential in the running time. There have already been several neat papers, including this one, and I expect to see more.

32 Comments
  1. Joachim Gudmundsson permalink
    September 9, 2010 8:04 pm

    As always a great post, thanks. However, I’m pretty sure the photo is not of Andreas Björklund (at least not the Andreas Björklund that wrote the paper).

  2. Fedor permalink
    September 10, 2010 2:17 am

Is that really Andreas in the photo? In real life he looks more like here

    • rjlipton permalink*
      September 10, 2010 8:07 am


      I got the picture wrong. I just clicked on wrong one. It is fixed now—I believe.

      Sorry for any confusion.

  3. Skinner permalink
    September 10, 2010 4:46 am

    I’m sorry, I can’t understand the labeling part of the proof sketch. What is a label of a variable? Do we assign it to variables or arcs of cycle covers?

  4. September 10, 2010 6:07 am

    Thanks for writing about this wonderful and important result!

    You did get Andreas’s picture wrong, however. You can see him chatting with Anna Östlin Pagh at the Reykjavik ICALP 2008 at .

  5. Josephina permalink
    September 10, 2010 8:48 am

Hi Prof. Lipton, could you explain why you find the ETH difficult to believe?

    • rjlipton permalink*
      September 10, 2010 8:58 am


Why do I not believe ETH? Well, even a proof that P is not equal to NP only requires proving that SAT needs more than polynomial time on infinitely many inputs. It says nothing about whether we can solve it in slightly super-polynomial time. SAT could be doable in n^{log n} time and that would not contradict P not equal to NP. So why think that the best algorithm we will ever find for SAT is going to be c^n? That is what I have trouble believing.

      • Russell Impagliazzo permalink
        September 10, 2010 1:09 pm

As one of the introducers of at least the name ETH, let me point out that this is labelled a “hypothesis” in our paper, not a “conjecture”. I am not sure whether ETH holds, and really wouldn’t want to give more than 50/50 odds either way (unlike P vs. NP, however irrational my belief). However, I think either a proof or a disproof would be very interesting, so it is a useful hypothesis to relate to other plausible complexity statements. When we prove that ETH implies that some other problem is hard, we are saying that to give a better algorithm for the new problem, you have to give a better algorithm for 3-SAT, of a different order of improvement than people have succeeded at so far. So it is some evidence that major improvement for the problem shown ETH-hard is at least quite difficult, if not necessarily evidence that no such algorithm exists.

By the way, improved exponential time algorithms for standard NP-complete problems have been a mini-sub-field since at least Tarjan and Trojanowski in the 70’s, and there has been progress all along. Hamiltonian path was an example of a type of problem where 2^n seemed a natural barrier, so this is indeed a big result. But I don’t think it came out of nowhere, and we could point to many other significant improved algorithms over the last five years. So I think your prediction that this will be a productive area in the future is the safe kind: it’s already being fulfilled before it’s made….

  6. September 10, 2010 9:41 am

    I remember a version of linked lists from Knuth, where a head node was inserted. It made the code simpler, but presumably also made it more efficient. Some years later I optimized another algorithm by adding in ‘virtual’ entities. They didn’t really exist, but they bypassed the corner cases nicely, again saving time.

    Perhaps this is the key to some optimizations: it’s not what you take out or reuse, but instead it’s what you add in without altering the results that makes it faster?

    • September 10, 2010 10:56 am

      Paul, an elegant algebraic/geometric instantiation of this principle is the specification of quaternionic (4D) coordinate functions on a rotation group (3D) manifold … the redundant coordinate induces metric and symplectic isomorphisms (i.e., the familiar “♭” and “♯” operations on vector arrays) that are purely algebraic and singularity-free.

At any given moment, a pretty substantial chunk of all the computer cycles on the planet are integrating molecular trajectories via this insight … almost any modern synthetic biology group accumulates many terabytes of integrated dynamical trajectories every week, and conditions their overall research objectives and strategy largely upon them.

      Unsurprisingly, as engineers become more-and-more focussed upon autonomous dynamical systems (ranging from dynamical molecules to dynamical vehicles), the union of algebraic efficiency with geometric naturality—gee, maybe this field ought to be called “algebraic geometry”?—emerges as not only central to mathematics, but central to the entire STEM enterprise.

      The paucity of undergraduate textbooks that espouse this naturality-driven point of view arguably holds-back the global STEM enterprise … and this is where STEM-centric weblogs (like this one) provide a vital service in providing the advanced mathematical tools that enable engineers to envision a 21st century that is secure, prosperous, free, and “interesting” … which (from a purely engineering point-of-view) is a mighty challenging exercise.

  7. September 10, 2010 10:01 am

    That Alan Perlis quote was terrific: “For every polynomial-time algorithm you have, there is an exponential algorithm that I would rather run.”

    I parsed it as “For every strictly polynomial-time algorithm you have, there is a worst-case exponential algorithm that I would rather run because in practice it works so much better.”

    This parsing was fun because so many of the greatest algorithms fit this principle pretty accurately. In the domain of optimization, the simplex algorithm (of course). In the quantum domain, density functional formalisms. In information theory, turbo codes. In linear algebra, preconditioners that in practice almost always make convergence faster, but in worst-case instances destroy convergence completely.

    So it seemed pretty clear that Perlis had articulated a profoundly general “Great Truth”. Then I followed Dick’s link, and discovered that (seemingly?) Alan Perlis had in mind an abstract principle … not concrete instantiations.

    Perhaps someday we might have a post that discussed the broader history and implications of the Perlis quote? With case histories of algorithm development, please … we engineers *LOVE* case histories!

    By the way, digging for this quote uncovered a new book: The P=NP Question and Gödel’s Lost Letter … by some guy named Richard Lipton … yah can find it on … and it looks TERRIFIC!

    Dick, my sincere appreciation, thanks, and congratulations are extended! You have solved everyone’s holiday gift-giving problem! 🙂

    Now if only you and Terry Tao would join forces, and write a joint column for Scientific American … the glory days of the STEM enterprise might return … 🙂

    • Anonymous permalink
      September 10, 2010 3:53 pm

      I am sad that there’s no kindle version of the book.

    • rjlipton permalink*
      September 12, 2010 1:04 pm


      You are very kind.

  8. September 10, 2010 1:50 pm

    Hi Dick
    Maybe I’m mistaken, but isn’t the Held-Karp paper that does dynamic programming on the TSP the one from 1962 ? (the 1970 papers were the ones that gave the linear relaxation for TSP, IIRC)

    • Craig permalink
      September 12, 2010 12:46 pm

      That’s what I thought. I would be surprised if this new result can be made to work for the TSP.

    • rjlipton permalink*
      September 12, 2010 1:05 pm


      You are correct. My mistake. So the result is even better. Thanks for the great correction.


  9. September 10, 2010 2:05 pm

    Just a note in passing. When I have this article open in a tab the last part of the title is cropped out and it reads “Beating A Forty Year Old.” Perhaps you want to alter the wording a little…

  10. September 10, 2010 9:25 pm

    Does this O*(.) notation mean “ignore polynomial factors” like \tilde{O}(.) means ignore logarithmic factors?

    • rjlipton permalink*
      September 12, 2010 1:06 pm


Yes, it is exactly that in this case. It allows us to be even a bit more approximate. But here the key term so dominates that it is safe.

  11. September 11, 2010 12:24 am

    Open problem?: How to quantum computerize Björklund’s Monte Carlo algorithm. Maybe achieve quadratic speed up?

    • rjlipton permalink*
      September 12, 2010 1:07 pm


      A very nice question. I guess you are thinking about using Grover? That would be a cool result.

  12. September 11, 2010 4:26 am

    For this problem, I have a polynomial time algorithm and its proof. But, or smart men do not trust a pig, even do not want to pay a little time, a little attention, or pigs do not trust a smart man and even do not want to pay a little time, a little attention!
    I have no way.

  13. September 12, 2010 9:42 am

    Is it not true that each HC problem can be mapped to an Exact Cover problem with size c.n.log(n).m where n is the number of vertices and m the number of edges? The paper Faster Solutions for Exact Hitting and Exact SAT gives an algorithm with O(2^{0.2985n}). Would that not be enough to beat the O(2^n) boundary?

  14. September 14, 2010 2:49 pm

    I’ve seen the term “STEM disciplines” frequently used in the last few weeks. Can anyone clarify what it means?

    • rjlipton permalink*
      September 14, 2010 4:17 pm


      S.T.E.M. stands for science, technology, engineering, and mathematics check this.


  15. September 17, 2010 1:18 am

    I don’t understand why we can say that this “beats” the Held-Karp algorithm. There is a non-zero probability that the new algorithm gets the wrong answer, correct? If I must guarantee whether a given graph has a Hamiltonian cycle, I cannot use this new algorithm.

    • Jeremy H permalink
      September 27, 2010 5:50 pm

      The deterministic algorithm takes O(2^n) time, while the randomized one takes O(1.657^n) time. As a back-of-the-envelope calculation, let’s ignore constant factors. Then for n=100, the randomized algorithm runs roughly 100,000,000 times faster than the deterministic one.

      Running the algorithm 300 times and taking the majority answer yields an error probability of roughly 1/[# of particles in the universe]. And we still net a speed boost of about 300,000.

      By comparison, the computer running the algorithm is far more likely to err due to being repeatedly bombarded by cosmic radiation.

  16. Venkata permalink
    September 30, 2010 4:43 pm

    “There have already been several neat papers, including this one, and I expect to see more.”
    Could someone please list a few more such papers that use pretty fundamental stuff, and yet produce great results?

    • October 11, 2010 9:47 am


      Excuse me for butting in like this! I believe that the most efficient algorithm is one that links a Hamiltonian with a Lagrangian, since it would describe all the mechanisms of human consciousness


      Clovis Simard


  1. Minesweeper – P vs NP « INNOWORLD
  2. Inexact Remarks On Exact TSP Algorithms « Gödel’s Lost Letter and P=NP
