Is the Traveling Salesman Problem NP-Complete?

Ron Graham is part mathematician, part computer scientist, part juggler, part magician–he is one of the most talented people I know. He has done terrific work in the theory of computation, and today I will talk about just one of his many great results.

Speaking of talks, I would like to share one story about “talks” that involves Ron. I know others, but let’s save those for a future post. This one is probably famous–especially among those who saw it. The talk concerns the work that I will discuss today–his proof that the Euclidean Traveling Salesman Problem is NP-hard. He presented this work at the ACM STOC conference back in 1976. The work was joint with Mike Garey and David Johnson, who are the authors of the famous book “Computers and Intractability: A Guide to the Theory of NP-Completeness”.

In 1976 Graham used an overhead projector to present his talk: this was way before PowerPoint. Back then, talks were prepared by writing on clear plastic transparencies called slides. The slides were placed on the flat glass surface of a projector, under which was a very bright lamp: the light passed through the slide, reflected off a mirror, and then hit the screen. In this way the speaker could see the slide and the audience could see the slide’s image projected on the screen.

Sometimes people would point to places on the slide with a pen or even with their fingers in order to highlight some key part of the slide. This would, of course, appear on the screen as a dark shadow, since the pen or finger was opaque. Pointing in this manner could be annoying if done poorly, but it was a common practice. Today people prefer laser pointers, but in 1976, just as there was no PowerPoint, there were no laser pointers. I imagine then a laser would have cost thousands of dollars and been the size of a small desk.

The proof that the Euclidean Traveling Salesman Problem could encode all of NP required a “gadget”. Proofs of this type often do. This result was no different, and since the gadget was a geometric shape, Ron used a small piece of cardboard to represent it. He could move his gadget around the slides to highlight important parts of the construction. One advantage back then was that speakers could use two projectors, each with a different slide on it. This was extremely useful, as a speaker could leave a complex slide up while presenting the next one. Note that we have lost this ability in today’s presentations: two steps forward, one step backward.

Ron was doing exactly this. He had two projectors going and was well into his talk when a miracle happened. A miracle. He was moving the cardboard piece around one slide: you could see on the screen not only the piece but also the fingers that were moving it. He then calmly walked over to the second projector to present another slide. But his fingers were clearly still on the first projector. I remember vividly how it slowly dawned on the audience that the fingers on the first projector were not Ron’s. They were fake. It was a great trick and caused an uproar of laughter. Very neat.

A Problem with The Traveling Salesman Problem

Graham’s paper proved that the Euclidean Traveling Salesman Problem (TSP) was NP-hard. But it left open whether the problem was NP-complete. The difficulty hinges on a beautiful open problem that appears to still be open over thirty years later. The problem is this: Can we tell whether or not

$\displaystyle \sqrt{a_{1}} + \sqrt{a_{2}} + \sqrt{a_{3}} + \dots + \sqrt{a_{n}} \le k$

is true in polynomial time? Here ${a_{1},a_{2},\dots,a_{n}}$ and ${k}$ are integers. Let’s call this problem the sum of squares problem (SSP). If SSP could be solved in polynomial time, then TSP would be in NP and so it would be NP-complete. Recently, Michel Goemans posted this problem on the website called the Open Problem Garden. This is a site that is compiling a list of open problems for mathematics and for theory. Take a look at it. Even better, post your open problems there. Better still, solve some of the open problems. By the way, Goemans also pointed out that this problem is of importance to our understanding of semi-definite programming.

Why is SSP Hard?

I just got a great question that I decided to add to the post. The question was: “why is SSP hard?” For any integer $a$ we can compute the value of $\sqrt{a}$ to $n$ bits of precision in time polynomial in $n$. So why not just compute all the square roots to high precision and see if the sum of the approximations is less than $k$? The answer is that we cannot show that a sum like

$\displaystyle \sqrt{a_{1}} + \sqrt{a_{2}} + \sqrt{a_{3}} + \dots + \sqrt{a_{n}}$

could not be extremely close to $k$. This is the reason the problem is still open: without a bound on the number of bits needed to distinguish the sum from ${k}$, there can be no straightforward polynomial time algorithm.
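To make the difficulty concrete, here is a sketch in Python of the naive high-precision approach. This is my own illustration, not an algorithm from the post: it shows why computing digits is easy, and also why the comparison is not a correct decision procedure when the sum is extremely close to $k$.

```python
from decimal import Decimal, getcontext

def naive_ssp(a, k, digits=50):
    """Naive test of sqrt(a_1) + ... + sqrt(a_n) <= k by computing each
    square root to `digits` decimal digits.  This is NOT a correct
    decision procedure: if the true sum lies within the rounding error
    of k, the comparison below can give the wrong answer."""
    getcontext().prec = digits
    total = sum(Decimal(x).sqrt() for x in a)
    return total <= k

# sqrt(2) + sqrt(3) is about 3.146, so it is <= 4 but not <= 3.
print(naive_ssp([2, 3], 4))  # True
print(naive_ssp([2, 3], 3))  # False
```

No matter how large `digits` is chosen, an adversarial instance whose sum agrees with $k$ to even more digits defeats the test; that is exactly the missing bound discussed above.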

Partial Results

There have been some partial results on the sum of squares problem (SSP). First, there are results from computational geometry that have shed some light on related questions. A typical result is to look at how close the following can be to zero:

$\displaystyle \left| \sqrt{a_{1}} + \dots + \sqrt{a_{n}} - \sqrt{b_{1}} - \dots - \sqrt{b_{m}} \right|.$

See the following for some recent results.

Second, there are some very pretty results by Eric Allender and his co-authors on a complexity approach to the SSP. They show that the problem is in ${\mathsf{CH}}$, the counting hierarchy. I will not go into details today, but will discuss these results in a future post.

An Exact Sum of Squares Problem

I would like to present a result that uses a very important technique due to Zhi-Zhong Chen and Ming-Yang Kao. They used the method in a great paper Reducing randomness via irrational numbers. I love this paper and will definitely be discussing it in the near future in more detail.

I will present their method as a tool for solving the exact sum of squares problem (ESSP): the problem of deciding whether or not

$\displaystyle \sum_{i} a_{i} \sqrt {b_{i}} = 0$

where the ${a_{i}}$ are positive or negative integers and the ${b_{i}}$ are positive integers. The problem is more general than SSP, since we allow coefficients on the square roots; it is simpler, since we only ask you to determine whether or not the sum is exactly equal to ${0}$. The technology that Chen and Kao have works only for equalities; it appears not to be of any use for inequalities. I say appears–perhaps there is a clever way to extend their method to inequalities. I do not see how, but perhaps you will.

There are algorithms for the ESSP. See this for details. The reason I wanted to show you the Chen-Kao idea is that it is potentially a very useful “trick”. You may be able to use this trick in many other contexts where other methods do not work.

Their method uses some basic facts from field theory and from the theory of algebraic numbers. If you do not know this material, there are many good places on the web to pick up the basics. See this for a list of sites on the topic. Perhaps the lesson for us is that classic mathematics can often make important contributions to the design of algorithms.

I think the best way to present their idea is through a simple example: suppose that

$\displaystyle \alpha = 112\sqrt 3 + 134\sqrt 5 -201\sqrt 6.$

This is a toy example, so it is easy to determine that it is not ${0}$. If you calculate it you will get about ${1.27}$–not too hard to see that it is non-zero. But consider the following number

$\displaystyle \beta = 112\sqrt 3 - 134\sqrt 5 -201\sqrt 6.$

The value of $\beta$ is about ${-598}$. Clearly, it is easier to calculate ${\beta}$ to low precision and see that it is non-zero.
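A quick numeric check of these two values (a throwaway illustration, not from the paper):

```python
import math

# alpha and beta from the post; beta flips the sign of the sqrt(5) term.
alpha = 112 * math.sqrt(3) + 134 * math.sqrt(5) - 201 * math.sqrt(6)
beta = 112 * math.sqrt(3) - 134 * math.sqrt(5) - 201 * math.sqrt(6)

print(alpha)  # roughly 1.275
print(beta)   # roughly -597.99
```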

The first observation is that ${\alpha = 0}$ if and only if ${\beta = 0}$. The reason is that ${\beta}$ is a conjugate of ${\alpha}$. Algebraic numbers have conjugates and one conjugate is zero if and only if all the conjugates are zero. Moreover, in the case of numbers that are sums of square roots conjugates are easy to determine: in general the numbers ${\sqrt a}$ and ${-\sqrt a}$ are conjugates.

Thus, if we are testing for zero we can use any conjugate that we wish: one is as good as another. The second observation is that conjugates can often have wildly different absolute values. We just gave an example where one was over ${100}$ times larger than the other. Chen and Kao prove that this is the typical situation. For a more precise statement look at their paper, but the key insight is that most of the conjugates will be relatively large. Thus, to solve an ESSP, first pick a random conjugate of the sum that you wish to test; then compute it to a polynomial number of bits of accuracy. If the answer is small, then say the sum is exactly ${0}$; otherwise, say it is non-zero. The probability that this works is quite high. Very neat idea.
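Here is a rough sketch of that procedure in Python. It is my own toy rendering, not the Chen-Kao algorithm itself: it uses double precision instead of a carefully chosen polynomial number of bits, and it assumes that every sign pattern of the square roots gives a conjugate, which holds in nice cases (say, ${b_i}$ distinct primes) but not in general.

```python
import math
import random

def probably_nonzero(terms, trials=20, tol=1e-6):
    """terms is a list of (a_i, b_i) pairs representing sum_i a_i*sqrt(b_i).
    Flip the sign of each square root at random to get a conjugate; if
    some conjugate is clearly far from zero, the original sum is nonzero,
    since one conjugate is zero iff all of them are.  Toy version: assumes
    every sign pattern is a conjugate (e.g. the b_i are distinct primes)."""
    for _ in range(trials):
        conj = sum(random.choice([1, -1]) * a * math.sqrt(b) for a, b in terms)
        if abs(conj) > tol:
            return True   # found a clearly nonzero conjugate
    return False          # every sampled conjugate was tiny: likely zero

# The alpha from the post is nonzero, and this sees it immediately, since
# every one of its eight conjugates has absolute value at least about 1.27.
print(probably_nonzero([(112, 3), (134, 5), (-201, 6)]))  # True
```

The point of the real algorithm is quantitative: Chen and Kao bound how many conjugates can be small, so polynomially many bits of one random conjugate suffice with high probability.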

For those who want a bit more detail, here goes. The number ${\alpha}$ is not only an algebraic number but an algebraic integer. The norm of ${\alpha}$, usually denoted by ${N(\alpha)}$, must be at least ${1}$ in absolute value unless ${\alpha}$ is zero. But the norm is just the product of all the conjugates of ${\alpha}$. An easy argument then shows that since the absolute values of all the conjugates are not too large, there cannot be too many small conjugates.

Open Problems

The main open problem is to show that the SSP can be done in polynomial time. However, this is overkill: you actually only need to show that the SSP can be done in nondeterministic polynomial time. This would be enough to show that TSP is NP-complete. This is a simple observation, yet I do not know of any results that have tried to exploit this insight. Perhaps you can use it to solve Graham’s original problem.

A second even smaller observation is this: consider a SSP,

$\displaystyle \sqrt{a_{1}} + \sqrt{a_{2}} + \sqrt{a_{3}} + \dots + \sqrt{a_{n}} \le k$

that arises from a TSP problem. In this case the square roots arise from the calculation of Euclidean distances. If one point is ${(r,s)}$ and the other is ${(u,v)}$, then their distance is

$\displaystyle \sqrt { (r-u)^{2} + (s-v)^{2}}.$

After clearing denominators the values of ${a_{i}}$ will be special integers. In particular, they will always be the sum of two squares. Fermat essentially proved that ${x}$ is representable as the sum of two squares if and only if, in the prime factorization of ${x}$, every prime of the form ${4t+3}$ occurs an even number of times. The final thought today is: can we exploit this fact? Can the special nature of the square roots be useful?
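As a sanity check on Fermat's criterion, here is a small trial-division tester (my own illustration):

```python
def is_sum_of_two_squares(x):
    """Fermat's criterion: x >= 0 is a sum of two squares iff every
    prime p with p = 3 (mod 4) appears to an even power in x."""
    if x == 0:
        return True
    p = 2
    while p * p <= x:
        if x % p == 0:
            e = 0
            while x % p == 0:
                x //= p
                e += 1
            if p % 4 == 3 and e % 2 == 1:
                return False
        p += 1
    # any leftover x > 1 is prime; reject it if it is 3 mod 4
    return not (x % 4 == 3)

print([x for x in range(1, 30) if is_sum_of_two_squares(x)])
# → [1, 2, 4, 5, 8, 9, 10, 13, 16, 17, 18, 20, 25, 26, 29]
```

Every ${a_i}$ coming from a squared Euclidean distance ${(r-u)^2 + (s-v)^2}$ passes this test by construction; whether that structure can be exploited is the question.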

1. March 4, 2009 2:15 pm

Can you explain why the SSP problem is hard? Can’t we compute square roots in polynomial time?

March 4, 2009 2:44 pm

Yes I will update the post.

We can compute square roots fast. The problem is one of approximation. Can a sum of squares be very very close to an integer? The answer is yes. This is why we may need an exponential number of bits to solve the SSP.

2. March 4, 2009 5:56 pm

I’m confused in the part where you talk about alpha and beta above. Is there any nontrivial situation where sum_i a_i sqrt(b_i) = 0, where all b_i are distinct squarefree integers, all a_i are nonzero integers?

March 4, 2009 7:12 pm

You are close. There is a paper that shows that the only way there is a zero is if at least two of the square roots have a rational ratio. This can be used to get an algorithm. My point in showing the Chen-Kao idea was that it can be used for many other problems. Sorry for any confusion. See this for details. Hope that helps.

I also think that the conjugate method may be used in a much more general setting. That is also why I included it. Thanks again for your comment.

March 4, 2009 8:29 pm

To give a (randomized) NP verification that sum_i sqrt(b_i) <= k,
why not guess the value S (less than or equal to k) of the sum,
then verify sum_i sqrt(b_i) – sqrt(S^2) = 0 using the randomized
poly-time exact sum of squares algorithm?

I suppose the answer is that this doesn’t show the problem is in NP,
because the reduction is randomized?

March 4, 2009 10:18 pm

The problem is this. Suppose that $S$ is the sum of the squares and we want to check if $S \le k$. Then guessing the value of $S$ does not help, since it will not in general be a rational number. So what do you guess? That is the problem, not the use of randomness.

March 4, 2009 8:31 pm

correction: “using the randomized poly-time exact sum of square-roots (ESSP) algorithm”

March 5, 2009 2:07 am

Can you try the following “amplification”?

To test if:

sqrt(a_1) + sqrt(a_2) + …. <= k

you square the left and square the right and are left with an equation in the form:

2*(sum a_i*a_j) <= k’

I guess the problem is that k’ could be much smaller than k …

March 5, 2009 7:39 am

I like the general approach, but I think the algebra is a bit different. When you square the sum $\sqrt a + \sqrt b + \sqrt c$ you get $a + b +c +\text{ cross terms.}$ The cross terms are all the products of the square roots–no? For example $\sqrt {ab}$ is one of the terms and so on.

March 5, 2009 11:28 am

i.e. to get actual sqrt(a) rather than writing it out?

March 5, 2009 12:49 pm

Yes…

to get latex add the five letters latex right after the first dollar.

March 5, 2009 11:34 am

The cross terms are also square roots, i.e. sqrt(ab), but ab can be computed exactly in polynomial time. So it just results in another equivalent problem of the same type, with n*n terms on the left and a (new) integer on the right. Maybe this problem would have a bigger “gap”, which could make it easier.

March 5, 2009 12:50 pm

It might but not clear…dick
I do like the idea of trying to amplify the gap. That’s clever. The trouble is that the number of terms could increase too…one needs to balance these effects. Good luck.

March 5, 2009 4:18 pm

It seems there could be a lot of terms, especially if all the $a_i$‘s are prime.

Just one other idea. Suppose we are testing if

$A = sum a_i < k$

Suppose that it is the case that $k-1 < A < k+1$. Then if $A < k$, it is the case that $A-k+1 1$.

So now we just want to test if the $(A-k+1)^2$ is smaller or larger than $A-k+1$. I don’t know how much precision that would require …

March 5, 2009 4:22 pm

my latex got a little messed up.

I meant, “… it is the case that A-k+1 < 1”.

Also, I meant:

$A = sum \sqrt(a_i) < k$

March 5, 2009 11:15 pm

Here’s an approach to the problem. $\sqrt{a_1} + \ldots + \sqrt{a_n}$ is an algebraic number, since the algebraics are a field. This means it’s the root of polynomial $p \in Z[x]$. Suppose we know that our sum is within 1 of some integer $k$, and we can find such a $p$ with the following properties:
* computable in poly time
* the only root within 1 of $k$ is our sum
* our sum is a root of multiplicity 1.

If we can meet all of these properties, we should be able to test $p(k-1)$ and $p(k)$ — which will be integers. If they have the same sign, our root must be less than $k$, otherwise it must be greater.

From what I remember of field theory, the minimal polynomial of our sum will be of degree $2^n$ (or lower in special cases), and the roots will be all conjugates of the sum — in particular, ours will be the largest root by at least 2. The degree sounds prohibitive, but the examples I tried suggest we can evaluate the polynomial by squaring $n$ times and subtracting off things at each step. For example, the polynomial for $\sqrt{a} + \sqrt{b}$ is $p(x) = (x^2 - a - b)^2 - 4ab$.

I’m not certain that we can either generate or evaluate this polynomial in poly time, but so far it seems conceivable.

March 6, 2009 1:20 am

Correction for the above: if $p(k-1)$ and $p(k)$ have different signs, then the sum is smaller than $k$, and otherwise it’s larger.

Thinking about this approach some more, it sounds $p$ would probably have to involve all of the elementary symmetric polynomials in the $a_k$‘s, which might be a deal-breaker.


March 6, 2009 11:42 am

In your example of using then Chen-Kao trick for ESSP, isn’t it hard to determine which combinations of signs you’re allowed to flip to get a conjugate? It seems to depend on factoring the numbers in the square roots.

For example consider x = sqrt(2) + sqrt(3) + sqrt(6), deciding whether to flip the signs of sqrt(2) and sqrt(3) already determines whether or not you flip the sign of sqrt(6). To be very specific: sqrt(2) + sqrt(3) – sqrt(6) is not a conjugate of x.

How would you go about taking a random conjugate without factoring the b_i?

March 6, 2009 11:58 am

Good question. You can use gcd’s to get around the factoring problem. In your example it not important taht all numbers are prime but that you find a basis of numbers that are relatively prime.

14. March 6, 2009 5:23 pm

rjlipton: “that” is what you meant, right? L:)

March 6, 2009 6:18 pm

yes….rjlipton

15. April 7, 2009 5:51 am

Let’s be honest, trying every option always works (if you have infinite time, of course), and the algorithm is easy.

I might be wrong, but I think if you base the order in which you deliver the postcards on the angular distance of the different locations you have to visit, there is no chance you have to do extra mileage delivering the cards, and computing the closest angle for the next delivery is a pretty easy task. The complexity of the algorithm does not grow exponentially like in the NP case; it has a lower order of complexity.
So pick a random starting delivery point, deliver, find the next point for delivery based on the closest angle to the previous one (always rotate in the same direction), till you finish your task.
M

April 7, 2009 7:15 am

I think that this does not work in general. A good place to look at the traveling salesman problem and various heuristics is the book by Bill Cook et al. See here. The angle idea will not work in general. See the book for examples of why. Also, here is another example from their site of how complex these tours can get.

April 7, 2009 9:13 pm

If I recall correctly from a talk by Ron Graham not too long ago, he mentioned that the trick of moving some terms to the other side and squaring, and repeating the process will ultimately give an inequality (to be checked) of the form “E <= F?” where E and F are expressions in which the operations are addition, subtraction, multiplication and squaring and the operands are a_1, a_2, …, a_n and k. Clearly without the squaring operation, both E and F can be evaluated in polynomial time. But in this case, there will be O(n) squaring operations and hence brute-force will obviously not work.

Here is a possible way to make progress on this problem: Given two such expressions E and F (as above), can we compute the sizes of expressions E and F? (i.e., the number of digits in E and F)? Can we compute the first (leading) digits of E and F?

I think there is a polynomial time algorithm for these two problems. (For example, an interesting special case is when the only operation is squaring. In this case, the problem boils down to the following: Given two positive integers A and B, compute the number of digits in A^B. Also compute the first digit of A^B. A polynomial time algorithm for these two problems was presented in:

Mika Hirvensalo, Juhani Karhumäki: Computing Partial Information out of Intractable One – The First Digit of 2^n at Base 3 as an Example. MFCS 2002: 319-327.)

The basic idea to attack the original problem is: First compute the lengths of E and F, in the only case of tie, proceed to compute the first few digits of E and F etc. But this may not work if the number of digits we may need to resolve a tie in the worst-case is exponential in the size of E and F.

17. April 9, 2009 9:28 am

A small comment: This problem was posted to the Open Problem Garden by someone with the username “abie” and not by Michel Goemans. Michel Goemans is given tentatively as the originator of the problem; this is surely incorrect. As a registered user of the Open Problem Garden, you can correct this entry by replacing “Michel Goemans” with the true originator of the problem (Graham? or someone else?). You can also expand the entry, e.g., by adding more references.

April 9, 2009 4:50 pm

…Often we find in mathematics and logic it is those problems which we attach our names to variables which become the “boson”(unviewable+biased) due to exaggerant claims of ownership.