# Getting On Base With P=NP

*Getting somewhere on the P=NP question*

George Herman Ruth, the “Babe,” was one of the greatest baseball players of all time. Ruth was in the first group of players elected to the Baseball Hall of Fame. He also held the single-season home run record of 60, set in 1927, until Roger Maris broke it with 61 in 1961. Since that 1961 season was eight games longer, and since the three subsequent players with more homers in a season are under substance-abuse clouds, it is still possible to argue that Ruth’s homer spree was the greatest ever.

Today I want to talk about the status of the P=NP question and baseball. July 4th is the traditional midpoint of the Major League Baseball season, although the actual midpoint was last Thursday or Friday for most teams.

One of the lessons of baseball is that swinging for the fence at every at bat is not a great idea. One is as likely to strike out as get a hit, and even less likely to hit a home run. For example, when Ruth played his first season for the Yankees in 1920, in 458 at-bats he had 172 hits, and of those 54 were home runs. A tremendous season, a tremendous record, but not every at-bat was a hit, and not every hit was a home run.

“Babe Ruth struck out 1330 times.”—Graffiti Seen in New York City.

The point of this is that we should be trying to achieve quite modest goals. We should not be looking for home runs, but for walks and singles. Babe Ruth still stands 3rd on the all-time career walks list, despite many fewer at-bats than the two above him. He stands 2nd only to Ted Williams in career on-base percentage (OBP). OBP is the main baseball stat sought by general manager Billy Beane according to the book *Moneyball* by Michael Lewis. How can we get on-base with the P=NP question?

**Equal**

Even though most “experts” believe that P is likely to be not equal to NP, there are more claims that they are equal. On the P-versus-NP page maintained by Gerhard Woeginger, we count 38 claims of equal to 29 of not-equal, through June 2011.

The reason, we surmise, is that a proof that they are equal *only* requires that one supply an algorithm that solves one of the known NP-complete problems. Usually the problem selected is SAT, but sometimes other problems have been used. Probably the Traveling Salesman Problem (TSP) is the second most favorite problem.
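To see concretely why one algorithm would suffice, recall what places SAT in NP in the first place: a claimed satisfying assignment can be checked quickly, even though finding one may be hard. Here is a minimal sketch in Python (the helper name and clause encoding are my own, purely for illustration):

```python
# A CNF formula is a list of clauses; each clause is a list of signed
# integers: +i stands for variable i, -i for NOT variable i.

def check_certificate(formula, assignment):
    """Verify a claimed satisfying assignment in time linear in the formula size.

    This cheap verification is exactly what puts SAT in NP; the hard part,
    the P=NP question, is whether certificates can also be *found* fast.
    """
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in formula
    )

# (x1 OR x2) AND (NOT x1 OR x2)
f = [[1, 2], [-1, 2]]
print(check_certificate(f, {1: True, 2: True}))   # True
print(check_certificate(f, {1: True, 2: False}))  # False
```

Since the check is easy, a polynomial-time procedure that finds such certificates for any one NP-complete problem would settle the whole question.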

The home run would be to find a polynomial algorithm for SAT, or TSP, or your other favorite problem. I have two suggestions here:

**Get a single:** The best known algorithms for SAT all run in time bounds that are worst case

$2^{cn}$,

where $c$ is a constant that depends on the number of variables allowed in the clauses. And $n$ is as usual the number of variables. The suggestion is to try and beat the best known bounds. A result that there is an algorithm that runs in time

$2^{\epsilon n}$

for each $\epsilon > 0$ would be tremendous. The conjecture called the ETH (the Exponential Time Hypothesis), not to be confused with the great institution ETH in Zürich, is that this is impossible. The conjecture is due to Russell Impagliazzo and Ramamohan Paturi in 1999. A personal note: I may be wrong about many things, but I really find it hard to believe that 3-SAT, for example, will not be shown to be decidable in time $2^{\epsilon n}$ for every $\epsilon > 0$.

But who knows.

If you can prove such an algorithm exists, that would be wonderful, a great breakthrough. Note it would say quite simply: there is a method for solving SAT that requires a much smaller search than brute force.
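For scale, “brute force” here means the trivial enumeration of all $2^n$ assignments, which is the baseline any “single” must beat. A toy sketch, purely illustrative:

```python
from itertools import product

def brute_force_sat(formula, n):
    """Decide satisfiability of a CNF formula over variables 1..n
    by trying all 2^n assignments: the brute-force baseline."""
    for bits in product([False, True], repeat=n):
        assignment = {i + 1: bits[i] for i in range(n)}
        if all(any(assignment[abs(lit)] == (lit > 0) for lit in clause)
               for clause in formula):
            return assignment
    return None  # unsatisfiable

# (x1 OR x2) AND (NOT x1 OR x2) AND (NOT x2) is unsatisfiable:
print(brute_force_sat([[1, 2], [-1, 2], [-2]], 2))  # None
```

The best known k-SAT algorithms lower the constant in the exponent, but every known bound is still of the form $2^{cn}$ for some fixed $c > 0$.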

**Get a walk:** Improve the best known SAT algorithm. Just improve it at all. An improvement here would be a great result—not earth shattering, but a great result nevertheless. One obvious idea would be to try to use powerful tools to improve the running times. The current algorithms for SAT are extremely clever, but do not use powerful algorithmic tools. I would guess that tools such as Semi-Definite Programming, Linear Programming, extractors, expanders, and so on should be used in any attempt to “get a walk.”
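For a sense of what “extremely clever, but no powerful tools” looks like, here is a bare-bones backtracking search with unit propagation in the style of DPLL; this is a toy sketch of mine, not one of the record-holding algorithms, which refine the branching far more aggressively. The open suggestion is whether relaxation tools like LP or SDP could push past such purely combinatorial refinements.

```python
def dpll(clauses):
    """Backtracking SAT search with unit propagation (DPLL-style toy)."""
    clauses = [list(c) for c in clauses]
    # Unit propagation: a one-literal clause forces that literal to be true.
    while True:
        units = [c[0] for c in clauses if len(c) == 1]
        if not units:
            break
        lit = units[0]
        new = []
        for c in clauses:
            if lit in c:
                continue                  # clause already satisfied
            reduced = [l for l in c if l != -lit]
            if not reduced:
                return False              # empty clause: contradiction
            new.append(reduced)
        clauses = new
    if not clauses:
        return True                       # every clause satisfied
    # Branch on the first literal of the first remaining clause.
    lit = clauses[0][0]
    return dpll(clauses + [[lit]]) or dpll(clauses + [[-lit]])

print(dpll([[1, 2], [-1, 2], [-2]]))  # False
print(dpll([[1, 2], [-1, 2]]))        # True
```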

**Not Equal**

The situation for lower bounds is even more difficult than that for better upper bounds. We have almost no lower bounds at all. Further there are the famous obstacles of oracles, natural proofs, and algebraic barriers that must be avoided.

**Get a single:** We do know something about the hardness of SAT. We know that any Turing machine that solves it must take super-linear time **provided** the algorithm uses very little space. See this for a new discussion of what is known. I believe that it should be possible to improve these results—although I currently have no idea how to do that.

One bright spot is that such results do not seem to violate any of the above barriers. Also the current proofs of these results, while tricky, use no powerful technology. I wonder if the application of the right powerful tool could at least allow us to get a “single.”

**Get a walk:** While we know something about the complexity of SAT, we really have almost no hard evidence that deterministic Turing machines are weaker than nondeterministic ones. The best are results that I have discussed here. These show that nondeterminism is *slightly* stronger than determinism:

**Theorem:** The complexity class NTIME(n) is unequal to the complexity class DTIME(n).

We do not even know if DTIME(n^2) is unequal to the complexity class NTIME(n^2). This is quite embarrassing, especially since many believe that there is an exponential gap between deterministic time and nondeterministic time—recall the ETH conjecture.

**Open Problems**

Get a walk or try for a single—forget about home runs for now.

July 4th is the US Independence Day. Some have also tried to prove that P=NP is independent—from some strong system of logic. That would be a revolutionary result, but it’s like trying to attack by air rather than “one if by land, two if by sea.”

“How can we get on-base with the P=NP question?”

Doesn’t Goedel’s famous ‘undecidable’ arithmetical formula [R(x)] help get us there since:

(a) There is no Turing machine TM such that, given any numeral [n], TM will halt and return the value ‘0’ as evidence that [R(n)] interprets as true in the standard model of PA (hence [R(x)] can be said to be algorithmically uncomputable as always true);

(b) For any given numeral [n], there is always some Turing machine TM_n that will halt and return the value ‘0’ as evidence that [R(n)] interprets as true in the standard model of PA (hence [R(x)] can be said to be algorithmically verifiable as always true)?

I do not see the relevance of Godel’s undecidability result to P=?NP. Godel’s problem does not consider how long it takes to decide something, only whether it is decidable. The problems even have different types, Integer->Integer vs Integer->Boolean. The second is decidable in general by a Turing machine by brute force, the first is not.

“I do not see the relevance of Godel’s undecidability result to P=?NP.”

Well, the crude argument goes like this:

(i) We can prove that an arithmetical formula [F(x)] has a proof (of fixed length) in PA if, and only if, there is a program (algorithm) TM (of fixed size) such that, for any given numeral [n], TM gives evidence that [F(n)] is a true arithmetical proposition over the domain N of the natural numbers;

(ii) In his famous 1931 paper Goedel has shown how to construct an arithmetical formula [R(x)]—to which Goedel refers by its Goedel-number ‘r’—such that [R(x)] is not PA-provable;

(iii) So [R(x)] has no proof (of fixed length) in PA, and there is no program TM (of fixed size) such that, for any given numeral [n], TM gives evidence that [R(n)] is a true arithmetical proposition over the domain N of the natural numbers;

(iv) Now, if [R(x)] were in P, then there would necessarily be a program TM (of fixed size) such that, for any given numeral [n], TM would give evidence that [R(n)] is a true arithmetical proposition over the domain N of the natural numbers;

(v) So [R(x)] is not in P;

(vi) In his 1931 paper Goedel also proved that, for any given numeral [n], the formula [R(n)] is PA-provable;

(vii) So, for any given numeral [n], there is a program TM_n (of fixed size) that gives evidence that [R(n)] is a true arithmetical proposition over the domain N of the natural numbers;

(viii) It should, thus, be possible to formally argue that [R(x)] is in NP, unless we argue that [R(x)] is neither in P nor in NP;

(ix) However, since [R(x)] can reasonably be said to be algorithmically verifiable as true for any given [n], but not algorithmically computable as true for any given [n], we should be able to formally conclude that P is not NP.

As with many engineers, it used to be my intuition that any (uniform) Turing machine in $P$ in some sense “gives evidence” (in the sense of (iv) above) that the words in its language possess such-and-such an attribute. After all, Turing machines are commonly designed by software engineers precisely to accomplish this purpose.

But this intuition was overturned by the class of Turing machines in $P$ that Joel David Hamkins constructs in his highly-rated answer to the MathOverflow question *Can a problem be simultaneously polynomial time and undecidable?* I have long wished that some expert in complexity theory would write a thorough-going exposition of these “wild” members of $P$ and $NP$, which so thoroughly confound my most natural and cherished intuitions of what Turing machines can and cannot accomplish.

Bhupinder, I have to agree with oOo that your arguments have nothing to do with PvNP. Furthermore, your statement (i) above is not true; just because a statement F(x) is not verifiable in PA doesn’t mean it’s not verifiable by Turing machines. Indeed it’s pretty easy to construct Turing machines that solve problems that PA can’t. For example, a Turing machine can easily demonstrate for any given X that the Goodstein sequence terminates, but PA cannot prove this for all X.

“Furthermore, your statement (i) above is not true; just because a statement F(x) is not verifiable in PA doesn’t mean it’s not verifiable by Turing machines. Indeed it’s pretty easy to construct Turing machines that solve problems that PA can’t. For example, a Turing machine can easily demonstrate for any given X that the Goodstein sequence terminates, but PA cannot prove this for all X.”

1. The proof of (i)—referred to as a Provability Theorem for PA—is given at:

It is based on a distinction between algorithmic verifiability and algorithmic computability.

2. Strictly speaking, a Turing machine cannot “demonstrate for any given X that the Goodstein sequence terminates”, although we can construct a Turing machine that will halt on any given natural number n if, and only if, the Goodstein sequence for n terminates.

3. Now, the argument that Goodstein’s sequence must terminate for any natural number n is non-finitary and—in the light of Thoralf Skolem’s remarks (see link below) on the ‘Skolem’ paradox—cannot reasonably be considered as conclusive. The argument is based on the dubious assumption that any theorem over the finite ordinals whose proof appeals to the properties of transfinite ordinals must hold over the natural numbers. The validity of such an assumption is challenged at:
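Whatever one makes of the transfinite termination argument, the machine in point 2 above is easy to realize concretely: iterate the Goodstein step (write the value in hereditary base-b notation, replace every b by b+1, subtract 1) and halt when the value reaches 0. A Python sketch, with helper names of my own; small starting values terminate by direct computation, no transfinite reasoning needed:

```python
def hereditary(m, b, new_b):
    """Write m in hereditary base-b notation, then replace every b by new_b."""
    if m == 0:
        return 0
    total, k = 0, 0
    while m > 0:
        m, d = divmod(m, b)
        if d:
            # Exponents are themselves rewritten in hereditary notation.
            total += d * new_b ** hereditary(k, b, new_b)
        k += 1
    return total

def goodstein_steps(n):
    """Steps until the Goodstein sequence starting at n reaches 0.

    Halts on n iff the sequence for n terminates; for n = 4 the step
    count is already far too large to compute this way.
    """
    b, steps = 2, 0
    while n > 0:
        n = hereditary(n, b, b + 1) - 1
        b, steps = b + 1, steps + 1
    return steps

print(goodstein_steps(3))  # 5: the sequence for 3 is 3, 3, 3, 2, 1, 0
```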

oOo, your questions are discussed in two survey articles that are hosted on Gerhard Woeginger’s P-versus-NP page: [1] Michael Sipser’s “The History and Status of the P versus NP question” (1992), and [2] Scott Aaronson’s “Is P Versus NP Formally Independent?” (2003). Scott’s survey in particular states its conclusions crisply.

If we suppose that all three of Scott’s conjectures are formally true, then what are the implications for complexity theory’s flagship research problem P=?NP? This we may take to be a key question that arises naturally from this wonderful *Gödel’s Lost Letter* column.

Moreover, if we regard computation and simulation as being pretty much the same thing, then we can transfer Scott’s three conjectures to the quantum systems engineering domain without much change:

Two strikingly similar burning arrows are naturally associated to these parallelized conjectures: we can hope to pullback P and NP onto restricted languages that are separable yet adequately respect our (evolving) notions of “computational goodness,” and similarly we can hope to pullback quantum trajectories onto restricted state-spaces that are simulable yet adequately respect our (evolving) notions of “quantum goodness.”

The latter arrow of (pulled back) simulation is at present somewhat better understood than the former arrow of (restricted) computation, but really we are far from having a satisfactory understanding of either kind of arrow.

These striking parallels are why we systems engineers care passionately about both classes of conjectures … and why we enjoy *Gödel’s Lost Letter* so much … and why we’re willing and even eager to shoot burning arrows (as necessary) to make concrete progress in solving these two flagship problems of the 21st century.

“If we suppose that all three of Scott’s conjectures are formally true, then what are the implications for complexity theory’s flagship research problem P=?NP?”

Both the articles you cite have a similar perspective on ‘undecidability’, which is perhaps shared almost universally. This is a perspective based on two—generally implicit—assumptions:

1. PA is consistent and has a model over the domain N of the natural numbers;

2. PA is omega-consistent (for a definition see link below).

The first is intuitionistically unobjectionable; the second is not.

To see this note that:

(a) PA is omega-consistent if, and only if, the formula [(Ex)F(x)] can be interpreted in every model of PA as ‘There is some natural number n such that F(n)’.

(b) Almost every text on first-order logic explicitly states—as something that should be intuitively obvious to the reader—that the existential quantifier [E] of any first-order language can always be interpreted as ‘There is some …’.

(c) Brouwer had pointed out in a seminal 1908 paper that such an interpretation of the existential quantifier is ambiguous if interpretation is intended over an infinite domain. He essentially argued that such postulation is invalid as a general logical principle in the absence of a means for constructing some putative object/s for which the proposition holds in the domain of the interpretation.

(d) Despite Brouwer’s objection, the standard interpretation of PA is believed to be free of contradiction even though it explicitly assumes that the formula [(Ex)F(x)] can be interpreted in every model of PA as ‘There is some natural number n such that F(n)’.

(e) Although under this interpretation the PA-formulas are constructively decidable as satisfied / unsatisfied or true / false if, and only if, they are algorithmically verifiable as satisfied / unsatisfied or true / false (in the sense of my earlier companion post, but defined formally in the paper linked to below), the interpretation cannot claim to be finitary, as it does not lead to a finitary justification of the Axiom Schema of (finite) Induction of PA from which we may conclude—in an intuitionistically unobjectionable manner—that PA is consistent.

Prima facie, any mathematical argumentation that bases itself on the assumptions 1 and 2 implicitly should be in danger of being grounded on shifting sands.

Now, if Computability Theory is incontrovertibly convincing in any respect, it is in the practical demonstration of the premise—to the validity of which we freely entrust our lives at each moment of our existence in a technological society—that machine languages admit perfect and effective communication within the ambit of their construction.

So the question may not be as to the consequences for Complexity Theory if Scott’s conjectures are formally true. A more pertinent query may be that if problems such as the PvNP problem appear intractable, do we need to investigate the possibility that the assumptions (and conclusions) on which we base our theory do not adequately reflect the premise on which we base our practice (formalisation of which, ultimately, should be the raison d’etre of any theory)?

Now, there is an argument (link below) that our practical premise is grounded not only in sound intuition, but also in a sound theory free of the objectionable assumption 2 where:

(i) We can define an algorithmic interpretation of PA over N, under which the PA-formulas are constructively decidable as satisfied / unsatisfied or true / false if, and only if, they are algorithmically computable as satisfied / unsatisfied or true / false (in a specifically defined sense);

(ii) The PA-axioms—including the Axiom Schema of (finite) Induction—are algorithmically computable as satisfied / true under the algorithmic interpretation of PA;

(iii) Since the Rules of Inference of PA preserve algorithmically computable truth/satisfiability under such an interpretation, we would have a finitary—and intuitionistically unobjectionable—proof of the consistency of PA;

(iv) It would also follow that PA is omega-inconsistent and algorithmically complete in the sense that a PA-formula [F] is provable if, and only if, there is an algorithm (Turing machine) that will give evidence that [F] is ‘algorithmically’ true (as opposed to ‘verifiably’ true under the standard interpretation);

(v) PA is not omega-consistent and incomplete in Goedel’s sense (to the effect that there is a PA formula [R(x)] such that both [(Ax)R(x)] and [~(Ax)R(x)] are not PA-provable).

Bhupinder, as a person who is both hardware-oriented and a non-expert in set theory, pretty much any response I could offer to your post would echo the concluding section to Scott Aaronson’s survey article *Is P Versus NP Formally Independent?* I particularly admired the remark “In one crucial respect, P=?NP is not at all like [the Continuum Hypothesis]. It’s a $\Pi_2$-sentence. It refers, not to a transfinite realm beyond human experience, but to machines, Boolean circuits, 1’s and 0’s.” It would be my expectation as an engineer that if the truth of P=?NP were somehow shown to depend crucially upon issues associated to (for example) $\omega$-incompleteness, or any other transfinite aspect of set theory, then mathematical experts should repair the definitions of P and/or NP to exclude this possibility. Because Scott’s survey is entirely correct (or so engineers are inclined to agree) to stipulate that in any proper formulation of complexity theory the P=?NP question must be answerable by reference to physical machines, Boolean circuits, and binary digits. After all, that’s the only kind of computation/simulation that we engineers are ever likely to work with.

May I say in passing, this topic sure creates numerous opportunities to reference some terrific articles!

Luke G (and oOo and Anand too), Turing machines that recognize Goodstein’s sequence are interesting, but (regrettably) these machines belong to a class that is far outside of P and (AFAICT) far outside even of EXP. These machines’ immense runtimes substantially diminish their practical interest to engineers.

It is known that Turing machines exist in P that accept Gödel-undecidable propositions and that these machines can be concretely instantiated. These short-runtime machines are interesting by reason of the natural question: what do we learn and/or verify by actually running them? Here we have the useful and natural intuition that we learn nothing and verify nothing … and yet it’s not obvious (to me anyway) how to state this intuition formally and prove it generally.

Then it is natural to wonder, given that there exists a subclass of P-machines that teach us nothing and verify nothing, then should we restrict the P=?NP problem — and hopefully simplify solving it — by excluding these useless machines? I have longed for a survey or textbook on complexity theory that addressed these considerations (Oded Goldreich’s recent *P, NP, and NP-completeness* comes close) because from an engineering point-of-view these questions are both mathematically natural and eminently practical.

With regard to broader considerations relating undecidability to P=?NP, Juris Hartmanis’ 1978 monograph is distinguished as being (to borrow Scott Aaronson’s pithy characterization) both “meaty” and “meta”:

In light of subsequent work it seems to me that six words might be naturally appended to Hartmanis’ 1978 introductory assertion, such that it would read “In the 2010s we started to understand what is and is not *practically* or *feasibly* computable (and thus *verifiably* simulable) in P.” This would serve as a starting point for a 21st century sequel to Hartmanis’ monograph … a sequel that would find at least four interested readers among the *Gödel’s Lost Letter* readership. 🙂

“Juris Hartmanis’ 1978 monograph … ‘We conclude by considering the possibility that the classic P=NP problem may be independent of the axioms of formal mathematical systems, such as formal set theory’.”

So far as formal set theory is concerned, I suspect that it is more of a likelihood than a possibility.

Reason: The distinction between algorithmically computable functions, and functions that are algorithmically verifiable but not algorithmically computable—which I suggest may lead to the desired ‘base’ for addressing the PvNP problem gainfully—cannot be reflected in formal set theory, since functions are defined extensionally as mappings (sets of the values that they define).

For instance, in Theorem VII of his famous 1931 paper Goedel has shown that if f(x)=y is a primitive recursive relation, then we can use it to define an arithmetical relation F(x, y) such that, for any natural numbers m and n, f(m)=n if, and only if, F(m,n) is true.

Now, given any such relation f(x)=y we can construct a Turing machine TM_f that will halt on (m, n) if, and only if, f(m)=n.

However, given F(x,y) we cannot (see link below) construct a Turing machine TM_F that will halt on (m, n) if, and only if, F(m,n) is true.

In other words, two number-theoretic relations can be instantiationally equivalent (and therefore define the same set in a formal set theory) but have differing computational properties; a fact that cannot be reflected in formal set theory.
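The computable half of this contrast is easy to exhibit in code: for any given recursive f, a machine that halts on (m, n) exactly when f(m)=n is immediate (the toy f below is my own, purely for illustration); it is the quantified arithmetical relation F that admits no such machine in general.

```python
def f(x):
    """A toy primitive recursive function: f(x) = x*x + 1."""
    return x * x + 1

def tm_f(m, n):
    """Halt and return True iff f(m) = n; diverge otherwise.

    This realizes the machine TM_f described above for a computable f;
    no analogous TM_F exists for the arithmetical relation F in general.
    """
    while f(m) != n:
        pass  # loop forever when f(m) != n
    return True

print(tm_f(2, 5))  # halts and prints True, since f(2) = 5
```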

This implicit limitation of formal set theory should indeed be of some concern for engineers who have been trained to accept formalisations of Computability issues in formal set theory as both natural and adequate.

Anand, although the three of you might not agree, it seems to me that you and Scott Aaronson and Juris Hartmanis have all stated broadly the same case for the undecidability of P=?NP (albeit in three uniquely different styles). Needless to say, these questions relating to undecidability are subtle, open, and difficult … and likely to remain so for awhile. And it is entirely possible too, that some ingenious researcher will surmount all obstructions associated to oracles, natural proofs, and algebraic barriers, and concretely demonstrate that P=?NP *is* decidable, via key insights that (as Scott has opined) “lie in yellow books yet to be written.”

My opinion/hope is that coming decades will witness plenty of progress on P=?NP via *both* strategies, which perhaps (IMHO) are best viewed as being not merely compatible with one another, but even mutually dependent.

Thanks for the wonderful column, and especially the link to Gerhard Woeginger’s web page … it’s packed with terrific references. The concluding statement of the column especially made me laugh out loud:

It was surprising to me that the authors of the multiple “Burning Arrow” columns here on *Gödel’s Lost Letter* have resisted the temptation to shoot a few additional fiery arrows. So here are two more.

The first burning arrow is more “smoldering” than “burning.” We reason that since the fortress of P=?NP has withstood all attacks by land and sea for many decades, perhaps it’s becoming increasingly reasonable to devote resources to an air assault. An informal sampling of links from Gerhard Woeginger’s web page establishes that (in order mentioned on Gerhard’s page) strategists like Stephen Cook, Michael Sipser, Scott Aaronson, Christos Papadimitriou, Lance Fortnow and Steve Homer, and Scott Aaronson all more-or-less deprecate the possibility that an air assault might be successful, by omission if not otherwise. Moreover, the most comprehensive discussion (that I know) of air assault strategies is Juris Hartmanis’ *Feasible computations and provable complexity properties* (1978), which is cited only by Scott Aaronson’s review (and even Scott mistakenly cites the title of Hartmanis’ monograph as “provable complexity problems” rather than “provable complexity properties”). For younger researchers this implies that air assaults on P=?NP are a relatively underexploited research strategy. C’mon kids, the senior researchers don’t know everything!

A second arrow is associated to the notion that possibly P=?NP is the wrong flagship question for complexity theory (although I greatly respect what I take to be Oded Goldreich’s opinion to the contrary). After all, that flagship hasn’t been sailing very fast! To put a point on the second arrow, perhaps it is possible to restrict the definitions of P and NP such that the restricted classes are provably separable, while at the same time sufficiently large to encompass Russell Impagliazzo’s world of *Cryptomania*. The intuition, due mainly to Hartmanis, is that this complexity-theoretic paradise might possibly be reached without ever formally resolving P=?NP, by excluding from P and NP solely “wild” languages that are recognized solely by “wild” Turing machines whose properties are undecidable to a degree that renders them inaccessible to verifiable human programming methods.

Would complexity theorists in general, and cryptographers in particular, be satisfied to live (provably) in Impagliazzo’s *Cryptomania*, without any formal resolution of P=?NP? Heck, don’t ask me … many engineers would settle for it though! 🙂

Dick,

I’m sending you this link because 3sat is reversible just like this.

For students especially, yet another heuristic for “getting on base” in grappling with a vast and intricately structured mathematical domain (like the Complexity Zoo) is to focus on *naturality* as a tool for separating the essential from the inessential.

A paywall-free source of survey articles relating to emerging ideals of naturality in the context of computation and/or simulation is the *Bulletin of the AMS* … a journal that publishes so many naturality-centric articles that I have to think that this focus is a deliberate policy of the AMS editors.

Recent AMS articles that (for me) have been illuminating and/or simply fun are Landsberg’s “Geometry and the complexity of matrix multiplication” (2008), Pelayo and Ngo’s “Symplectic theory of completely integrable Hamiltonian systems” (2011), and Gromov’s “Crystals, proteins, stability and isoperimetry” (2011). These articles provided (what were for me) fresh perspectives on familiar systems; perspectives completely different from those provided by standard textbooks.

So, what exactly should *naturality* mean in the context of computation and/or simulation? Systematically looking for fresh answers to this question is yet another strategy for getting to first base.

For example, an arxiv search for “((simulation AND separation) AND natural) AND complexity” finds Fortnow and Santhanam’s “Robust simulations and significant separations” (2010) as a very interesting single — perhaps even an extra-base hit? — that exemplifies this natural hitting strategy.

Too long answers/comments. Our life is too short to go through all this … It would be good to FIX the problem. Not just its vicinity. I am afraid that many readers of your (wonderful in all respects!) blog are more and more lost by there being no “mainline” to follow.