A Neat Result About The Simplex Algorithm
And a neat question about the neat result
Thomas Hansen gave a talk at our ARC Theory Day a week ago Friday. He is at the Center for the Theory of Interactive Computation, in the Department of Computer Science, Aarhus University. He has won best-paper awards already for his terrific work on the complexity of linear programming.
Today I (Dick) would like to talk about his result and an open question that was raised at the end of his talk.
Before I discuss the result I would like to thank the organizers Prasad Tetali and Prasad Raghavendra again. One reason our Theory Day on 11/11/11 was special, besides the thrice-in-a-century pattern of the date, was the international mix of the participants.
I have just come back from a visit to the Middle East, where the State of Qatar is expanding their vision for computer science research. The Qatari capital Doha is itself an international city, as workers from other countries actually outnumber local residents. It will be interesting to see the expansion of computer science into regions such as the Mideast and Africa and into smaller countries like Qatar.
Some Simplex History
A linear program in variables and constraints defines a polytope of feasible solutions—that is, ones that satisfy the inequalities. The optimum value of the linear function being maximized is always attainable when is a vertex of the polytope. This led George Dantzig to propose his simplex method, which first finds a starting vertex on the polytope and then iterates walking to a neighboring vertex until an optimum vertex is found. The issue is the rule to select .
If is a non-optimal vertex, then because is linear, at least one of its neighbors gives a higher value. Since there are neighbors, a polynomial-time algorithm could poll all of them. The decision is not simply a matter of taking the neighbor giving the highest in “greedy” style, because one may for instance prefer traversing a longer edge that opens into a seemingly better region of the polytope. indeed, Victor Klee and George Minty found an example where the original greedy rule leads to the algorithm tracing out all corners of a hypercube, and extensions of this example showed that several other deterministic rules have cases where they take exponential time.
This enhanced the idea of selecting a neighbor at random from among all with . There are various policies on how to weight these neighbors. Stephen Smale and others proved in the early 1980’s that under various distributions of problem instances, i.e. of linear program constraint matrices and , several of these randomized methods find an optimum in polynomial time with high probability. However, it was not known whether all sequences of fixed programs, one for each and corresponding , yielded this expected polynomial behavior over the randomization in the algorithm alone.
The paper presented by Hansen is joint with Oliver Friedmann and Uri Zwick, and is titled: Subexponential lower bounds for randomized pivoting rules for the simplex algorithm. It appeared in the last STOC conference. The new theorems all are of the form:
Theorem 1 For the random pivoting rule X there is a concrete family of linear programs so that for all , the expected time using rule X is at least
where is a constant that depends on X.
The proofs are quite intricate and clever—see the paper for the details, and also this followup paper by Friedmann. The results use connections between pivoting steps performed by simplex-based algorithms and improving switches performed by policy iteration algorithms for solving Markov Decision Processes.
At the end of the talk a question was raised by an audience member—I am sorry I do not know who. I thought the question was quite neat and would like to repeat it here. The question makes sense for algorithms other than just simplex, but I will phrase it as only a question about simplex.
Suppose that you have two pivoting rules and . Also suppose that you can prove there is are concrete families of linear programs and so that:
- The rule takes at least steps on .
- The rule takes at least steps on .
Can we prove that there must be one family of linear programs so that both and take at least steps? Here can be smaller than .
Informally, is there a way to put together a linear program family that fools both rules? Note, it seems possible that rule could do well on those programs that has trouble with, and vice-versa.
Note that we are talking about fooling for all , or at least all sufficiently large . Thus it is not enough to define and . The combinatorial issue is whether we can combine two polytopes to have the adversarial features of each. The weaker idea of fooling for infinitely many suggests a variant question, however.
A Related Question
Ken contributes a related question. Define a “meld” of rule and rule to be a rule that at each pivot step first selects or , and then follows the policy of or . The section between and can be probabilistic or deterministic, even depending on the values at the pivot step.
Given and as above, can we find a sequence that fools every meld of and ?
Now the question seems to be nontrivial even if we only mean fooling at infinitely many . In the above example where is made by alternating and , it is conceivable that a meld could probabilistically “learn” which type of adversarial polytope it is on, and then switch to the other policy.
Solve the question that was raised—it may be easy. Or work on more complex pivoting rules. Is there a randomized rule that works in expected polynomial time for all sequences of programs?
Note that many basic questions about linear programming are still wide open. Earlier we blogged about the surprising refutation of Warren Hirsch’s conjecture that there would always exist a path to an optimum vertex in or fewer steps.
It is commonly known that linear programming belongs to , but this presumes that the number of bits needed to specify the coefficients factors into the input length. No algorithm is known that runs in time polynomial in and in a computation model with unit-cost arithmetic—whether one exists is Smale’s ninth problem.