And a neat question about the neat result

Thomas Hansen gave a talk at our ARC Theory Day a week ago Friday. He is at the Center for the Theory of Interactive Computation, in the Department of Computer Science, Aarhus University. He has won best-paper awards already for his terrific work on the complexity of linear programming.

Today I (Dick) would like to talk about his result and an open question that was raised at the end of his talk.

Before I discuss the result I would like to thank the organizers Prasad Tetali and Prasad Raghavendra again. One reason our Theory Day on 11/11/11 was special, besides the thrice-in-a-century pattern of the date, was the international mix of the participants.

I have just come back from a visit to the Middle East, where the State of Qatar is expanding their vision for computer science research. The Qatari capital Doha is itself an international city, as workers from other countries actually outnumber local residents. It will be interesting to see the expansion of computer science into regions such as the Mideast and Africa and into smaller countries like Qatar.

## Some Simplex History

A linear program in ${n}$ variables and ${m = n+r}$ constraints defines a polytope of feasible solutions—that is, ones that satisfy the inequalities. The optimum value of the linear function ${f(x)}$ being maximized is always attainable when ${x}$ is a vertex of the polytope. This led George Dantzig to propose his simplex method, which first finds a starting vertex ${v}$ on the polytope and then iterates walking to a neighboring vertex ${v'}$ until an optimum vertex is found. The issue is the rule to select ${v'}$.

If ${v}$ is a non-optimal vertex, then because ${f}$ is linear, at least one of its neighbors ${v'}$ gives a higher value. Since there are ${rn}$ neighbors, a polynomial-time algorithm could poll all of them. The decision is not simply a matter of taking the neighbor ${v'}$ giving the highest ${f(v')}$ in “greedy” style, because one may for instance prefer traversing a longer edge that opens into a seemingly better region of the polytope. indeed, Victor Klee and George Minty found an example where the original greedy rule leads to the algorithm tracing out all ${2^n}$ corners of a hypercube, and extensions of this example showed that several other deterministic rules have cases where they take exponential time.

This enhanced the idea of selecting a neighbor at random from among all ${v'}$ with ${f(v') > f(v)}$. There are various policies ${X}$ on how to weight these neighbors. Stephen Smale and others proved in the early 1980’s that under various distributions of problem instances, i.e. of linear program constraint matrices and ${f}$, several of these randomized methods find an optimum in polynomial time with high probability. However, it was not known whether all sequences of fixed programs, one for each ${n}$ and corresponding ${r}$, yielded this expected polynomial behavior over the randomization in the algorithm alone.

## The Result

The paper presented by Hansen is joint with Oliver Friedmann and Uri Zwick, and is titled: Subexponential lower bounds for randomized pivoting rules for the simplex algorithm. It appeared in the last STOC conference. The new theorems all are of the form:

Theorem 1 For the random pivoting rule X there is a concrete family of linear programs ${P_{n}}$ so that for all ${n}$, the expected time using rule X is at least

$\displaystyle 2^{n^{\alpha}},$

where ${\alpha > 0}$ is a constant that depends on X.

The proofs are quite intricate and clever—see the paper for the details, and also this followup paper by Friedmann. The results use connections between pivoting steps performed by simplex-based algorithms and improving switches performed by policy iteration algorithms for solving Markov Decision Processes.

## The Question

At the end of the talk a question was raised by an audience member—I am sorry I do not know who. I thought the question was quite neat and would like to repeat it here. The question makes sense for algorithms other than just simplex, but I will phrase it as only a question about simplex.

Suppose that you have two pivoting rules ${X}$ and ${Y}$. Also suppose that you can prove there is are concrete families of linear programs ${P_{n}}$ and ${Q_{n}}$ so that:

1. The rule ${X}$ takes at least ${2^{n^{\alpha}}}$ steps on ${P_{n}}$.

2. The rule ${Y}$ takes at least ${2^{n^{\alpha}}}$ steps on ${Q_{n}}$.

Can we prove that there must be one family of linear programs ${R_{n}}$ so that both ${X}$ and ${Y}$ take at least ${2^{n^{\beta}}}$ steps? Here ${\beta > 0}$ can be smaller than ${\alpha}$.

Informally, is there a way to put together a linear program family that fools both rules? Note, it seems possible that rule ${Y}$ could do well on those programs ${P_n}$ that ${X}$ has trouble with, and vice-versa.

Note that we are talking about fooling for all ${n}$, or at least all sufficiently large ${n}$. Thus it is not enough to define ${R_{2n} = P_n}$ and ${R_{2n+1} = Q_n}$. The combinatorial issue is whether we can combine two polytopes to have the adversarial features of each. The weaker idea of fooling for infinitely many ${n}$ suggests a variant question, however.

## A Related Question

Ken contributes a related question. Define a “meld” of rule ${X}$ and rule ${Y}$ to be a rule that at each pivot step first selects ${X}$ or ${Y}$, and then follows the policy of ${X}$ or ${Y}$. The section between ${X}$ and ${Y}$ can be probabilistic or deterministic, even depending on the values at the pivot step.

Given ${P_n}$ and ${Q_n}$ as above, can we find a sequence ${R_n}$ that fools every meld of ${X}$ and ${Y}$?

Now the question seems to be nontrivial even if we only mean fooling at infinitely many ${n}$. In the above example where ${R_n}$ is made by alternating ${P_n}$ and ${Q_n}$, it is conceivable that a meld could probabilistically “learn” which type of adversarial polytope it is on, and then switch to the other policy.

## Open Problems

Solve the question that was raised—it may be easy. Or work on more complex pivoting rules. Is there a randomized rule ${Z}$ that works in expected polynomial time for all sequences of programs?

Note that many basic questions about linear programming are still wide open. Earlier we blogged about the surprising refutation of Warren Hirsch’s conjecture that there would always exist a path to an optimum vertex in ${r}$ or fewer steps.

It is commonly known that linear programming belongs to ${\mathsf{P}}$, but this presumes that the number of bits needed to specify the coefficients factors into the input length. No algorithm is known that runs in time polynomial in ${n}$ and ${r}$ in a computation model with unit-cost arithmetic—whether one exists is Smale’s ninth problem.

1. November 23, 2011 5:13 pm

The results of Oliver, Thomas, and Uri are truly wonderful and so is the connection with various classes of stochastic games.

November 24, 2011 6:59 am

“…, but this presumes that the number of bits needed to specify the coefficients factors into the input length.” ?
… is unlimited ?

November 24, 2011 3:31 pm

Why isn’t $R_{n^2} = P_n \times Q_n$ the answer to the problem (gives an infinte number of problems)?
If rule $X$ could solve the $R_n$ in less than $2^{n^\alpha}$ steps, it would also solve $P_n$ as a side result in less than $2^{n^\alpha}$ steps.
Hence, we could speed up rule $X$ by making problems larger. Are there reasons, why this property should be unrealisitic?

So, we can argue analogously for strategy $Y$. Hence we cannot solve $R_{n^2}$ in less than $2^{n^\alpha}$ steps. So we cannot solve $R_n$ in less than $2^{n^\beta}$ steps with $\beta = \frac{\alpha}{2}$.

This solution looks a bit trivial, so what am I missing?

If I am not mistaken, it should also be true that if $(p_1, q_1), (p_2, q_2)$ are adjacent vertices of $P_n \times Q_n$, then $p_1, p_2$ should be adjacent vertices of $P_n$ and similarly for $Q_n$. So we are actually only making the problem larger in a very specific way.

Uhh.. just noticed: Taking the product of two polytopes only adds the number of variables and constraints. So its $R_{2n} = P_n \times Q_n$. And we can even find better bounds on $\beta$.