Getting On Base With P=NP
Getting somewhere on the P=NP question
George Herman Ruth, the “Babe,” was one of the greatest baseball players of all time. Ruth was in the first group of players elected to the Baseball Hall of Fame. He also held the home run record of 60 in the 1927 season, until it was broken in 1961 by Roger Maris hitting 61. Since that season was eight games longer, and since the three subsequent players with more homers in a season are under substance-abuse clouds, it is still possible to argue Ruth’s homer spree was the greatest ever.
Today I want to talk about the status of the question and baseball. July 4th is the traditional midpoint of the Major League Baseball season, although the actual midpoint was last Thursday or Friday for most teams.
One of the lessons of baseball is that swinging for the fence at every at bat is not a great idea. One is as likely to strike out as get a hit, and even less likely to hit a home run. For example, when Ruth played his first season for the Yankees in 1920, in 458 at-bats he had 172 hits, and of those 54 were home runs. A tremendous season, a tremendous record, but not every at-bat was a hit, and not every hit was a home run.
“Babe Ruth struck out 1330 times.”—Graffiti Seen in New York City.
The point of this is that we should be trying to achieve quite modest goals. We should be not looking for home runs, but for walks and singles. Babe Ruth still stands 3rd on the all-time career walks list, despite many fewer at-bats than the two above him. He stands 2nd to only Ted Williams in career on-base percentage (OBP). This, OBP, is the main baseball stat sought by general manager Billy Beane according to the book Moneyball by Michael Lewis. How can we get on-base with the question?
Even though most “experts” believe that is likely to be not equal to , there are more claims that they are equal. On the P-versus-NP page maintained by Gerhard Woeginger, we count 38 claims of equal to 29 of not-equal, through June 2011.
The reason we surmise is that a proof that they are equal only requires that one supply an algorithm that solves one of the known -complete problems. Usually the problem selected is , but sometimes other problems have been used. Probably the Traveling Salesman Problem (TSP) is the second most favorite problem.
The home-run would be to find a polynomial algorithm for , or TSP, or your other favorite problem. I have two suggestions here:
Get a single: The best known algorithms for all run in time bounds that are worse case
where is a constant that depends on the number of variables allowed in the clauses. And is as usual the number of variables. The suggestion is to try and beat the best known bounds. A result that there is an algorithm that runs in time
for each would be tremendous. The conjecture called the ETH, not to be confused with the great institution ETH, is that this is impossible. The conjecture is due to Russell Impagliazzo and Ramamohan Paturi in 1999. A personal note: I may be wrong about many things, but I really find it hard to believe that 3-, for example, will not be shown to be decidable in time
But who knows.
If you can prove such an algorithm exists that would be wonderful, and would be a great breakthrough. Note it would say quite simply: there is a method for solving that requires a much smaller search than brute force. This would be wonderful.
Get a walk: Improve the best known algorithm. Just improve it at all. An improvement here would be a great result—not earth shattering, but a great result nevertheless. One obvious idea would be to try to use powerful tools to improve the running times. The current algorithms for are extremely clever, but do not use powerful algorithmic tools. I would guess that tools such as Semi-Definite Programming, Linear Programming, extractors, expanders, and so on should be used in any attempt to “get a walk.”
The situation for lower bounds is even more difficult than that for better upper bounds. We have almost no lower bounds at all. Further there are the famous obstacles of oracles, natural proofs, and algebraic barriers that must be avoided.
Get a single: We do know something about the hardness of . We know that any Turing machine that solves it must take super-linear time provided the algorithm uses very little space. See this for a new discussion of what is known. I believe that it should be possible to improve these results—although I currently have no idea how to do that.
One bright spot is that such results do not seem to violate any of the above barriers. Also the current proofs of these results, while tricky, use no powerful technology. I wonder if the application of the right powerful tool could at least allow us to get a “single.”
Get a walk: While we know something about the complexity of , we really have almost no hard evidence that deterministic Turing machines are weaker than nondeterministic ones. The best are results that I have discussed here. These show that nondeterminism is slightly stronger than determinism:
Theorem: The complexity class is unequal to the complexity class .
We do not even know if is unequal to the complexity class . This is quite embarrassing, especially since many believe that there is an exponential gap between deterministic time and nondeterministic time—recall the ETH conjecture.
Get a walk or try for a single—forget about home runs for now.
July 4th is the US Independence Day. Some have also tried to prove that is independent—from some strong system of logic. That would be a revolutionary result, but it’s like trying to attack by air rather than “one if by land, two if by sea.”