Why TSP resists being mapped to a small linear program

Sebastian Pokutta just gave a beautiful colloquium talk on proving lower bounds on the size of linear programs. He is a recent addition to the Georgia Tech faculty, but alas not in our College of Computing. He is a faculty member of the ISyE Department at Tech, which is perhaps the best industrial engineering department in the world, and his addition certainly adds to their strength.

Today Ken and I want to talk about results that he presented and their connection to attempts to prove that the Traveling Salesman Problem (TSP) can be solved in polynomial time.

TSP is one of the most famous of all the ${\mathsf{NP}}$-complete problems. It is famous enough to be the subject of a movie.

The attempts on TSP may be more fundamental still. Recall that we once presented ${\mathsf{P}}$ versus ${\mathsf{NP}}$ itself not in terms of Turing machines or nondeterminism, but as an issue between linear and integer programs, that is between LP’s and IP’s.

## The Start Of The Game

The idea is to express the problem as a linear programming problem, then invoke the wonderful result that any linear program can be solved in time polynomial in its size. Put another way: get an LP for the TSP, and then let the LP solver do all the heavy lifting—sorry, too much jargon?

This is a plan, a great plan, but with one small difficulty. How do we express the TSP as an LP that is not too large? For decades people have tried to find such LP’s, and all have failed—at least so far.

This and other attempts have led to a kind of “game” between those who try to solve TSP this way and those who try to find flaws in their proofs. Let’s call the players the makers and the breakers. The makers try to find the magic encoding of TSP into an LP, while the breakers try to show that the encoding is incorrect.

This is an interesting game, indeed a game that I hope some day one side wins forever. To be sure, this requires that ${\mathsf{P}=\mathsf{NP}}$ be resolved. In one direction, if we can efficiently encode TSP instances via LP’s then ${\mathsf{P}=\mathsf{NP}}$. The other direction uses the fact that solving LP’s is known to be a complete problem for ${\mathsf{P}}$ even under very restricted reductions. These reductions would be efficient, but they need not preserve the “character” of the original formulation of the TSP instance.

Exactly what kind of character an LP formulation of a TSP instance may have is how the game is played. Not just how big it is, but how it encodes features of the instance, and how it relates to integer programs for TSP, are the main issues.

## History

The game started in earnest when Ted Swart in the mid 1980’s claimed to have discovered a way to express a special case of the TSP as a polynomial-sized LP—he did the Hamiltonian Cycle problem. This really is the start: his attempt is listed #1 on Gerhard Woeginger’s ${\mathsf{P}}$ versus ${\mathsf{NP}}$ claims page. Note that there are sub-classes of instances on which the answers to TSP and Hamiltonian Cycle coincide and remain ${\mathsf{NP}}$-complete. As shown in a 1981 paper by Alon Itai, Christos Papadimitriou, and Jayme Luiz Szwarcfiter, these can be vertex-induced subgraphs of the rectangular grid, which are planar graphs. Then there is a TSP tour of total cost ${n}$ (using Euclidean length) if and only if the graph has a Hamiltonian cycle.

Swart’s idea was to encode Hamiltonian Cycle/TSP as an integer program in a way that its relaxation to a linear program had the same answers. The encoding was direct: it defined what is called the standard TSP polytope ${P_0}$ for a given instance. ${P_0}$ lives in a space that has a co-ordinate ${x_{i,j}}$ for every pair of nodes (cities) ${i}$ and ${j}$. Points in this space are assignments of real numbers to every (possible) edge. ${P_0}$ is defined to be the convex hull of those points that assign ${1}$ to ${n}$ edges that form a cycle, and ${0}$ to all of the ${\binom{n}{2}}$ other edges.
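To make the definition of ${P_0}$ concrete, here is a minimal sketch (my own illustration, not from the post) that enumerates its vertices for a small ${n}$: each vertex is the 0/1 edge-indicator vector of a Hamiltonian cycle on ${n}$ labeled cities.

```python
# Enumerate the vertices of the standard TSP polytope P0 for a small n.
# Each vertex assigns 1 to the n edges of some cycle through all cities
# and 0 to the remaining C(n,2) - n edges.
from itertools import combinations, permutations

def tsp_polytope_vertices(n):
    """Return the set of 0/1 edge-indicator vectors of all n-city tours."""
    edges = list(combinations(range(n), 2))       # one coordinate per pair
    index = {e: k for k, e in enumerate(edges)}
    vertices = set()
    # Fix city 0 as the start; the set() deduplicates the two traversal
    # directions of each undirected cycle.
    for perm in permutations(range(1, n)):
        tour = (0,) + perm
        vec = [0] * len(edges)
        for a, b in zip(tour, tour[1:] + (tour[0],)):
            vec[index[(min(a, b), max(a, b))]] = 1
        vertices.add(tuple(vec))
    return vertices

# For n = 5 there are (5-1)!/2 = 12 distinct tours, each using 5 of the
# 10 possible edges.
verts = tsp_polytope_vertices(5)
print(len(verts))                        # 12
print(all(sum(v) == 5 for v in verts))   # True
```

Even for modest ${n}$ the vertex count ${(n-1)!/2}$ explodes, which is exactly why one hopes for a compact description by constraints rather than by vertices.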

In an IP we can define the desired extreme points directly by constraints saying ${x_{i,j} \in \{0,1\}}$, but in an LP relaxation we can only say ${x_{i,j} \geq 0}$ and ${x_{i,j} \leq 1}$. The other constraints saying ${\sum_j x_{i,j} = 2}$ for each ${i}$, and preventing a union of disjoint cycles, can be handled already by LP, the latter most efficiently with some extra variables. The problem is that the resulting relaxed LP polytope may have other extreme points that are not integral, and hence do not represent tours.
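A small sketch (my illustration, not from the post) of why the degree constraints alone are not enough: on ${n = 6}$ cities, putting ${x_{i,j} = 1}$ on the edges of the two disjoint triangles ${\{0,1,2\}}$ and ${\{3,4,5\}}$ satisfies ${0 \leq x \leq 1}$ and ${\sum_j x_{i,j} = 2}$ for every ${i}$, yet it is a union of subtours, not a tour.

```python
# Check that a union of two disjoint triangles passes the degree
# constraints but fails a subtour-elimination constraint.
from itertools import combinations

n = 6
triangles = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5)]
x = {e: 0.0 for e in combinations(range(n), 2)}
for e in triangles:
    x[e] = 1.0

# Degree constraints: every city is incident to exactly 2 selected edges.
degrees = [sum(v for (i, j), v in x.items() if c in (i, j)) for c in range(n)]
print(degrees)        # [2.0, 2.0, 2.0, 2.0, 2.0, 2.0]

# Subtour elimination for S = {0,1,2}: a genuine tour must cross the cut
# between S and its complement at least twice, but this point crosses 0 times.
S = {0, 1, 2}
crossing = sum(v for (i, j), v in x.items() if (i in S) != (j in S))
print(crossing)       # 0.0
```

There are exponentially many such cut constraints, which is why the post notes they are handled most efficiently with extra variables (a flow-based formulation, for instance).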

Swart thought he had found an LP encoding and objective function that evaded the problem, but his initial method failed: errors were found by breakers. Then he repaired his encoding, and more errors were found. The game with Swart’s methods looked like it could go on forever, but finally Mihalis Yannakakis in 1988 stopped it with a famous paper, Expressing combinatorial optimization problems by linear programs.

Mihalis showed that no symmetric LP can efficiently encode the TSP. Symmetric means that if you permute the labels ${i,j,\dots}$ of the cities, then any other variables can be permuted so that the LP itself remains the same. In particular, no one city has a distinguished role. Since Swart’s attempts were all symmetric, that stopped him. Mihalis gave the breakers a powerful tool: they no longer had to look at the details of the LP encoding—if the encoding had a certain symmetry then it had to be wrong. Pretty neat.

## The Game Today

Mihalis left the makers a possible approach. If they could avoid the symmetry condition then his breaker theorem would no longer apply. He conjectured that the symmetry condition could be lifted, and he was right. Enter Sebastian and his colleagues: Gábor Braun, Samuel Fiorini, Serge Massar, David Steurer, Hans Raj Tiwary, and Ronald de Wolf. In a series of papers they gave the breakers a much more powerful set of tools. Besides removing the symmetry condition, they extended and improved the lower bounds. The two main papers are

• Linear vs. Semidefinite Extended Formulations: Exponential Separation and Strong Lower Bounds.

• Approximation Limits of Linear Programs (Beyond Hierarchies).

As usual see the original papers for the details. In the rest we will try to give a flavor of what these results are and what they mean.

## How Did I Get Here?

Before I start talking about LP’s and TSP’s and the like, I would like to relate the cool answer to a question Prasad Tetali posed after Sebastian’s talk, when the three of us were talking alone. Sebastian answered—you can guess the question—that he did not set out to work on the TSP problem. Rather he was interested in communication complexity, and worked on that for quite a while. Then he and his co-authors noticed that there were neat connections to the classic paper of Mihalis—connections that allowed them to improve his results. The rest is history, ongoing.

## Projections

The key to the interest in the maker-breaker game is that the makers are not obligated to encode the TSP directly. They can use any variables that they like: they do not have to use the obvious ones ${x_{i,j}}$. This freedom carried to the extreme would make the breakers’ job essentially impossible. Like any game—from chess to football—one needs rules to have a “fair” game; the completely unstructured game is as hard as ${\mathsf{P}}$ versus ${\mathsf{NP}}$ itself. So the key notion of “character” of a TSP instance is what the game players call extended formulations.

Suppose that ${P}$ is some polytope that represents some combinatorial problem, such as ${P_0}$ for TSP. The maker need not create an efficient encoding of ${P}$, but is allowed to work in a higher dimension and create an efficient encoding of another polytope ${Q}$. The restriction is that there must be a linear projection ${\pi}$ so that

$\displaystyle \pi(Q) = P.$

Essentially the polytope ${P}$ is a shadow of the polytope ${Q}$. Note that in the following picture from Sebastian’s talk the shadow polytope has more faces than the higher-dimensional one: six vs. eight.

One thing I feel is under-emphasized in these papers is the attractiveness of extended formulations. The size of the smallest polytope that projects to another one is a natural question, and has quite a pretty theory. The question has beautiful connections to other parts of theory, such as communication complexity. For those trying to solve TSP, if it did have a small extended formulation, one could use it to solve the TSP fast. If the polytope ${Q}$ is simpler than ${P}$, working in the higher-dimensional space is just fine. Projections are good for solving. This is why a maker would seek to find small extended formulations—and why it is significant that they cannot work.
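A hedged, generic illustration of why small extended formulations help (my own standard example, not from the papers): the cross-polytope ${\{x : \sum_i |x_i| \leq 1\}}$ in ${\mathbb{R}^n}$ has ${2^n}$ facets, but writing ${x_i = u_i - v_i}$ with ${u, v \geq 0}$ and ${\sum_i (u_i + v_i) \leq 1}$ gives a polytope ${Q}$ in ${2n}$ dimensions with only ${2n+1}$ constraints, whose projection ${\pi(u,v) = u - v}$ is exactly ${P}$.

```python
# Optimize over a polytope with 2**n facets by solving a small LP over
# its extended formulation Q and projecting the solution back down.
import numpy as np
from scipy.optimize import linprog

n = 10                                    # P has 2**10 = 1024 facets
c = np.arange(1, n + 1, dtype=float)      # maximize c.x over P

# LP over Q in variables (u, v): minimize -c.(u - v)
obj = np.concatenate([-c, c])
A_ub = np.ones((1, 2 * n))                # sum(u) + sum(v) <= 1
res = linprog(obj, A_ub=A_ub, b_ub=[1.0], bounds=[(0, None)] * (2 * n))
x = res.x[:n] - res.x[n:]                 # project back: x = u - v

# Over the cross-polytope the optimum of c.x is max_i |c_i|, attained at
# a vertex +/- e_i; here that is c_10 = 10.
print(round(float(c @ x), 6))             # 10.0
```

The solver never sees the 1024 facet inequalities; it works entirely with the 21 constraints of ${Q}$. This is the sense in which projections are good for solving, and why a small extended formulation of ${P_0}$ would crack TSP.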

## Main Results

The question is now quite precise: Can we show that any ${Q}$ that projects to the polytope ${P}$ must be very large? That is, must it require an exponential number of constraints? If this is true, then the makers are in trouble. At least they cannot use higher-dimensional variables and clever constraints to encode the TSP. There are still possible loop-holes, but the game has become much harder for them.

The results that are proved are of the form: any polytope ${Q}$ that projects to—i.e., is an extended formulation of—the polytope ${P}$ must have size at least ${\dots}$ Here is the main example specific to TSP:

Theorem 1 Any extended formulation of the TSP polytope ${P_0}$ must have at least ${2^{\Omega(\sqrt{n})}}$ constraints.

This applies even when the LP for ${Q}$ itself is not symmetric. One thing this implies is that the force of Mihalis’ original insight is not just against symmetric programs, but against all programs that can project to a certain symmetric target.

Note also that as with Mihalis’ original barrier result, this takes out of the maker’s hands the argument that all the magic can be put into the choice of objective function. The barriers apply already just at the step of defining the feasible solution space. We don’t need to examine the objective function in any claimed program whose constraints are already ruled out.

## Barriers, Dodgeball, and Alley-Oop

Barriers do offer the maker some guidance in finding approaches that might possibly work. But what Ken and I wish to say is that the guidance should be more than just playing “dodgeball” around the barriers. The reason is that the breakers may have set up not only the ground-level barriers but also the potential of an “alley-oop” play. This is a play where the other side thrusts the ball higher than your defense considered, and can jump and fire it down from above and behind you.

In the TSP case, suppose you encode TSP via a basic polytope ${P_1}$ different from ${P_0}$. Maybe the basic variables aren’t edges between cities after all; maybe they’re compound; maybe they’re something else. You might think you’ve dodged the theorem. But first if your constraints defining ${P_1}$ are still symmetric, they may fall under a slight modification of the theorem.

OK, let’s say your ${P_1}$ assigns distinguished roles to one or more of the cities. Now you’ve got a different basic ${P_1}$ and have avoided symmetry. Since projections and other easy inversions of extensions are your friends, you create a bigger LP ${Q}$, but not too big—it’s still polynomial-sized. Are you safe now?

Here’s the rub: Because ${Q}$ is not too big, there’s a lot of room to build a ${Q'}$ with a higher set of constraints but still under a bound like ${2^{\sqrt{n}}}$. Because you say your ${Q}$ accurately encodes the TSP your way, there’s a lot of leverage for the higher ${Q'}$ to encode TSP in some standard way, and so project to ${P_0}$. Indeed ${Q'}$ might use extra variables to achieve the kinds of symmetries that ${P_1}$ sought to avoid, while preserving the property that all (relevant) extreme points are integral—taking your statement for ${Q}$ as a basis. But Sebastian’s theorem rules out ${Q'}$, and so your attempt ${Q}$ can still get dunked after all. That’s the alley-oop.

In circuit lower bounds, the alley-oop can be extending a proposed argument so that it naturalizes, perhaps adding the “bait-and-switch” trick we noted here. In other cases it can be taking a field extension, or using padding and translation to show implausible larger consequences.

The moral we draw is that makers facing such barriers shouldn’t play dodgeball. Rather than squint at the barriers looking for loopholes, they should erect a counter-principle that is by-nature tall enough to prevent an alley-oop. The theorems in Sebastian’s joint papers have clear ideas, which they extend to other computational problems besides TSP. Thus any maker wishing to advance another round should not expect to find a proof that works ad-hoc via a loophole, but should seek one that stands on a clear new idea. Preferably the idea should likewise extend to other problems directly, rather than by appeal to complexity-theoretic reductions.

## A Personal Connection

The way that Sebastian proves his lower bounds is to show that the size of any ${Q}$ that projects to ${P}$ is controlled by the complexity of playing certain communication protocols. I must add a personal note about this, since (nondeterministic) communication complexity plays a major role here. Back in 1981, Bob Sedgewick and I introduced this notion in a paper entitled Lower Bounds for VLSI. We needed this type of protocol to prove new lower bounds on the size of VLSI circuits.

Our results also were a kind of barrier. Our lower bound proofs actually didn’t care whether the VLSI connections themselves could be nondeterministic. Of course there is no practical notion of “nondeterministic VLSI,” but some problems have unexpectedly good upper bounds in that model. What this means is that anyone hoping to prove higher lower bounds for practical deterministic VLSI would have to try techniques other than ours—but ours were the only game in town for some time.

One key idea that was first made explicit in our paper is the relation between the cost of the protocol and the cost of covering 0-1 matrices with rectangles.
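A toy sketch of that relation (my own example, not from the paper): the nondeterministic communication cost of a 0-1 matrix ${M}$ equals ${\log_2}$ of the minimum number of all-ones submatrices (“rectangles”) needed to cover the 1-entries of ${M}$. The identity matrix needs ${n}$ rectangles, since no two of its diagonal 1s fit in one all-ones rectangle, while its complement (the “not-equal” matrix) is covered by only ${2\log_2 n}$.

```python
# Cover the 1-entries of the "not-equal" matrix on {0,...,7} x {0,...,7}
# with 2*log2(8) = 6 all-ones rectangles, one per (bit, value) pair.
import numpy as np

def rectangle(rows, cols, shape):
    """0/1 matrix that is 1 exactly on the combinatorial rectangle rows x cols."""
    R = np.zeros(shape, dtype=int)
    R[np.ix_(rows, cols)] = 1
    return R

n = 8
M = 1 - np.eye(n, dtype=int)              # M[x][y] = 1 iff x != y

cover = np.zeros_like(M)
count = 0
for b in range(3):                        # log2(8) = 3 bits
    for t in (0, 1):
        # All pairs (x, y) that disagree in bit b: an all-ones rectangle of M.
        rows = [x for x in range(n) if (x >> b) & 1 == t]
        cols = [y for y in range(n) if (y >> b) & 1 != t]
        cover |= rectangle(rows, cols, M.shape)
        count += 1

print(count)                              # 6 rectangles
print(bool(np.all(cover == M)))           # True: exactly the 1s of M are covered
```

Yannakakis’s theorem turns this combinatorial quantity into geometry: the smallest extended formulation of a polytope is governed by the nonnegative rank of its slack matrix, which such rectangle covers lower-bound.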

Again see Sebastian’s papers for details of how all this fits together.

## Open Problems

Can the makers still succeed? The new theorems have made it much harder for them to win. But there are still loop-holes and still room for creative new ideas. Let’s leave the discussion of those for another day.


1. November 29, 2012 4:41 pm

Being a convex polytope theorist I find the questions and the new results fascinating! As far as I know, examples of this kind—polytopes P with 0-1 coordinates in dimension n that cannot be described as projections of higher-dimensional polytopes Q defined by a number of constraints polynomial in n—were not known before, so this is also a purely geometric fruit of computational-complexity thinking and methods.

• March 7, 2014 4:16 pm

As far as I know, Thomas Rothvoss showed a few years before Pokutta et al. that almost all such 0-1 polytopes (for the appropriate measure) have exponential extension complexity. Still, Pokutta et al. might have given the first explicit example.

November 30, 2012 1:27 am

Small correction: Q has 6 facets, P has 8 facets. (“facets”, not “faces”).

• November 30, 2012 11:21 pm

Hmmm…I think of “facets” as having any dimension and “faces” as the n-1 dimensional structures. But according to Wikipedia, which recognizes the ambiguity here, I (Ken) may have the usage backwards.

3. November 30, 2012 5:39 am

Dear Dick and Ken, you have wonderful resources. What I like most (not that my opinion counts) is the constructive approach on both sides, the makers and the breakers. The major success at the boundary between the XIX and XX centuries was a general education that led to a scientific revolution. Now we seem to have a huge gap between mathematical advances and engineering approaches. We are missing the critical mass of intuition to tunnel through the barriers (if we are quantum beings). Your site is one of the few great resources that reduce that gap. Moreover, your tolerance for “crackpot” comments seems to play the role of the doctor from “Чудесный доктор” (“The Wonderful Doctor”) by Kuprin.

Now let me suggest something for the makers.

The discussion here stops at LP. On the other hand, LP is a particular instance of Second-Order Cone Programming, which in turn is a particular instance of Semidefinite Programming, which is in a sense (see the implementation) a least-squares approach with inequalities. One instance of semidefinite programming is the Sum of Squares of Polynomials (SOS) approach. Each of these, being more general, should provide some advantage in terms of the length of the encoding. Getting to the core of SOS we see a least-squares problem over the set of multivariate monomials, with all the complications arising from it. This is the point where it gets incredibly hard to understand for outsiders: real algebraic geometry. One of my friends once said that algebraic geometry is hard, and real algebraic geometry is evil. The problem with the SOS approach is that it gives only a relaxation, for the same reason that Nullstellensatz certificates require high-degree polynomials with many monomials: there are too many quadratic tautologies, too many ways high-degree monomials can be split into quadratic components. Neither SOS via semidefinite programming nor Nullstellensatz provides a way to exclude these tautologies from, or embed them into, the algorithm (by the way, this seems to be the reason that adding redundant equations helps in both cases). So the real barrier to P vs NP (exaggerating here) is Hilbert’s attempt to construct the foundations of mathematics, expressed in propositional proof systems with exponentially many tautologies. As an example, the pigeonhole principle is intuitively clear, yet it is very hard for propositional proof systems.

November 30, 2012 10:46 am

I enjoyed your article very much – thank you – this whole site is excellent. Your comments about Ted – very sincere and accurate. I was an undergraduate at the time, Ted taught me graph theory. We later went on to become colleagues and excellent friends. What you don’t know is, we presented similar work at a conference in Winnipeg (organized by the late Ralph Stanton) not long after. In implementations – up to some reasonable size – we never came across fractional solutions in practice – the downfall of his original work. But we were left with an interesting thought – the probability of finding a fractional solution (in which case his algorithm could yield no decision) relative to an integer solution in the case of Hamiltonicity. What we really needed at the time – and never found – was an example of a non-Hamiltonian graph that still yielded a feasible LP. The Horton graph was a great candidate but too large for us. So this work was left unfinished – in the sense that we wondered about how “close” his polytope modelled an NP-complete problem. No matter … it was a fantastic time. Before Yannakakis’s proof, others had also been involved. We had some excellent interactions with a man named Michael Robson of Australia (at the time) – who presented theoretical counter examples. We also created a few larger LP’s for solution via CPLEX, and Robert Bixby solved them for us.

Regarding the recent proof – that no polynomial sized extended formulation of the TSP polytope exists – I didn’t find the result too surprising – although the proof is hugely significant i.e. generalization of Yannakakis’s result, and that the proof might generalize to relaxations of the TSP polytope – I believe. I’ll explain below. So then with LP being P-complete (noted in another blog on this site), any polyhedral approach to show P=NP – in my opinion – needs to make only limited use of the TSP polytope – sufficient to build tractable formulations as needed. The P-complete issue then wraps this whole idea as the existence of a one-shot polyhedral model. Hmmm. Hard to imagine. In the back of my thoughts, you would need a model that of course works exactly like the verification part of testing a problem in NP – the very thing we’re looking for. For example, if it were possible to assume the existence of an integer solution in a mixed poly-sized LP, and at the same time – discard pieces of the polytope as you ‘discharged’ parts you didn’t need – while at the same time build new pieces as you moved towards a solution (or no solution) – the idea would be to stave off intractabilities – while you worked toward an end goal. This is all just meta-thinking … but that’s kind of what you’d need I think. That way, we’d make a new polytope for every graph instance – so P-complete is satisfied – AND – the insight is simply summed up as – ‘start tractable, stay tractable, end tractable’.

So last few comments … about the TSP polytope and why I think even a relaxation could never be poly-sized … Rather than view the TSP polytope as the convex hull of tours, it’s perhaps more insightful to appreciate its complicated and beautiful properties with respect to how it models YES’ decisions. A set of Hamiltonian graph instances that contain the same set of tours corresponds with the same d-supporting hyperplane of the TSP polytope, defined by the affine hull of these tours. There exist factorial numbers of isomorphic d-supporting hyperplanes (up to permutation of vertex labels) for each set of these instances, and there exist factorial numbers of distinct sets of graph instances whose affine hull of tours is likewise dimension d, d=0,1,2,…,dim(TSP polytope). The complete set of YES’ decisions are encoded by the TSP polytope as the set of supporting hyperplanes constructed by permutation of every combination of tours allowed in every Hamiltonian graph instance. This IS NP-completeness, that the simple definition `the convex hull of tours’, yields intricate and complicated consequences i.e. that the TSP polytope encodes seemingly intractable amounts of information, from a single asymmetric special instance, through to all isomorphs of every graph. The point to be made here is that the TSP polytope is highly special. It’s common knowledge that no-one knows the faces of the TSP polytope, not even their number, nor can an arbitrary inequality be verified in polynomial time as inducing a face of the TSP polytope (I recall??). A guideline in searching for a model of the non-Hamilton Tour decision problem is therefore to search for a polytope that shares only a few weak properties common to the TSP polytope. That is, if the polytope were to share too many common properties or perhaps just a few strong properties, there might exist some kind of isomorphism with the TSP polytope sufficient to make the polytope just as hard to find as the TSP polytope! 
And finally my point … Even if a relaxed formulation of the TSP polytope exists such that each face included a 0/1 tour extremum, it’s hard to imagine that sufficient polyhedral symmetry can be gained by – in principle – adding more extrema to induce a polynomial sized formulation, that expresses all faces of the TSP polytope including specially added extrema that must now lie in each face.

Thanks!

Steve Gismondi

• November 30, 2012 11:17 pm

Stephen, also Mikhail above, thank you for great and immediately helpful comments!

December 7, 2012 12:54 pm

Hello
there is something like that with P=NP, and between an angle of 0 and 90° there is something like square 3, or square-root of 3, or 3 or 4 squares (“racines”: roots, something that can be the result or a length)
http://www.les-mathematiques.net/phorum/file.php?8,file=26095,in_body_attachment=1

February 7, 2013 11:48 am

A little comparison between quantum mechanics and complexity theory.

1) Universality
All matter is composed of quantum particles.
Every NP problem is polynomially reducible to an NP-complete problem.

2) Special behavior
Quantum particles exhibit bizarre properties which are counter-intuitive.
The complexity of NP-complete problems exhibits unexplained undecidability.

3) Duality
There are two ways of considering matter – either as a wave or as a particle.
There are two ways of considering a process – either as a proof or as a program.

4) Self-reference
Particles are observed by means of light, and light is also made of particles.
Program behavior is proved, and proofs are also programs.

5) Indetermination
At the quantum level, position and momentum can’t be known simultaneously with exactness (Heisenberg).
At the NP-complete level, correctness and efficiency can’t be known simultaneously with exactness (experience).

Is the above analogy a mere coincidence? Or could it be a barrier to P=NP proofs? In all cases, I find it rather puzzling.