The Shapes of Computations
Or rather, what can the shapes of proofs tell us about them?
April CACM source |
Juris Hartmanis did much to lay the landscape of computational complexity beginning in the 1960s. His seminal paper with Richard Stearns, “On the Computational Complexity of Algorithms,” was published 50 years ago this month, as observed by Lance Fortnow in his blog with Bill Gasarch. It is a great achievement to open a new world, but all the more mysterious that after 50 years so much of its landscape remains unknown.
Today we ask what might determine the unseen topography and how much some recent large-data discoveries may help to map it.
The idea for this post arose from a possibly phantom memory that Juris (co-)wrote a short draft survey on “Shapes of Computations” sometime in 1986–1989 when I was at Cornell. I recall the specific phrase “long and skinny” to describe space-bounded computations. An space-bounded computation can explore all of a polynomial -sized undirected graph by saving just the current node and some auxiliary information by which to choose a neighbor for the next step. The trace of this computation becomes an -length sequence of -bit strings. A polynomial-space computation doing an exponential search of a game tree has the same long-and-skinny “shape” even though the scale is greater with regard to the input length . Polynomial time-bounded computations, however, can use polynomial space, whereupon they become “short and fat.” Breadth-first search of a graph is a canonical algorithm that hogs space for its relatively short duration.
Which computations fall between these “Laurel and Hardy” extremes? For SAT and the other -complete problems, this is the great question. The surest way to separate from and from would be to characterize these problems by a third distinct shape of computation. But we have not even separated from , nor logspace from , so what can we pretend to know?
Shapes of Proofs
My memory has probably been transferred from a column Juris wrote with his students Richard Chang, Desh Ranjan, and Pankaj Rohatgi for the May 1990 issue of the Bulletin of the EATCS. It has this nice diagram:
The analogy between computations and proofs has been instrumental since the early days of Kurt Gödel and Alonzo Church and Alan Turing. Proofs do, however, give nondeterminism “for free”; is treated the same as in their diagram, same as , while “nondeterministic polynomial space” equals . Hence I’ve regarded “proofs” as secondary to “computations” as objects for complexity. However:
- We have a lot more examples of proofs, going back before Euclid;
- The search for a proof is computational;
- Interactive proofs reflect a two-player game setting where the emphasis is more on solving; and
- We now have examples of large computer-generated proofs to analyze.
The EATCS column defines a notion of width of a proof, characterizes via polynomial width proofs, and marvels at how the classic interactive protocol for retains the “skinny” shape with less length. Indeed, in cases where the verifier is able directly to check evaluations of the unique multilinear extension of the arithmetization of a quantified Boolean formula, every proof step involves just two terms for the rounds to . A related form of skinniness had been brought out by Jin-Yi Cai and Merrick Furst in their 1987 paper, “ Survives Three-Bit Bottlenecks.” The column goes on to emphasize that the form of the proof lends itself to quicker probabilistic verification. This aspect was shortly codified in the definition of probabilistically checkable proof, which lends itself most readily to characterize and .
Three Kinds of Proofs
Amid all this development on “long and skinny” proofs, what can we say about “short and fat” ones? Intuitively, such proofs have lots of cases, but that is not their full story. The network of logical dependence matters too. Hence we think there are most helpfully three kinds of proofs in regard to shape:
- Proofs with close to one single path to the conclusion;
- Proofs with many cases, but the cases have a simple hierarchical tree pattern and can mostly be checked sequentially;
- Proofs with a more web-like pattern of logical dependence between parts.
Direct evaluations of quantified Boolean formulas in have type 2, while the interactive proof with polynomials gives the “feel” of type 1 to both the prover and the recipient of the proof.
Chess problems prefer types 1 or a limited form of 2 for esthetics. The website ChessBase.com recently republished the longest branching-free “Mate-In-N” problem ever created, by Walther Jörgenson in 1976. It is mate-in-203 with no alternative move allowed to the winner, and virtually no case analysis of alternate defensive tries by the loser either.
However, a chess search often has type 3. Often there will be different starting sequences of moves that come to the same position . The value of that was computed the first time is stored in a hash table so that the later sequences are immediately shown that value, cutting off their need for any further work. This resembles breadth-first search insofar as marked nodes may be touched later along other paths. The dependencies of values become web-like. Injured values from hash collisions can cause huge ripple effects, as I covered in a post three years ago.
The stored-hash tactic is much the same as a lemma in a proof that is used multiple times. We suspect that last year’s 6-gigabyte computer-generated proof of discrepancy > 2 in the Erdős Discrepancy Conjecture has many such lemmas, and hence is more type 3 than 2. The provers Boris Konev and Alexei Lisitsa have an updated page with further computations. They do not link the whole impossibility proof of a length 1,161-or-more sequence of discrepancy 2, but do give some of the “DRUP” certificates of unsatisfiability. DRUP stands for reverse-unit propagation with clause deletion, and that propagation strikes us as largely composed of cases and lemmas. The subcases might be codable at high level via predicates for m < 1,161 expressing the unavailability of length- subsequences fulfilling some extra conditions , with such predicates being copiously re-used.
One finds an explosion of stored sub-cases in chess endgame tables. However, in many positions a vast majority of them are to prove wins that a chess master would see as “trivially true” in a few seconds. In other cases an alternative by the loser might simply jump to a later stage of the main trunk line, thus merely accelerating the same loss rather than posing a new case. (Similarly, an alternative by the winner might merely allow the defender to wind back to an initial stage, without much need for separate verification.) We wonder how far this pertains to the longest endgame win discovered in the new 7-piece endgame tables–mate in 545 moves. That is, how large is the part of the proof tree that needs to be specified, so that a chess program given values for positions in the tree could verify the rest via its local search?
Computations Short-Cutting Proofs?
This last eventuality prompts our new speculation: Perhaps we can rigorously develop a science of when large sections of proofs can be effectively handwaved. This would appeal to the computational style most often postulated for our human brains: shallow but broad and with great power of heuristic association, analogous to the complexity class of poly-size constant-depth threshold circuits. Even when yoked with computers the added-value of our grey matter is not to be discounted, as attested in my joint paper last summer—in section 6 showing how human-computer tandems performed better at chess in 2005–08 than computers alone.
We have recently twice covered a conjecture by Freeman Dyson that one feels should lend itself to this kind of treatment. Many other open conjectures in number theory are felt to be “probably” true, where “probably” has a technical sense that might be developed further into some kind of dependent structure: if that handwave is valid then all-the-more certainly so is this one. The idea could be helped by enumeration of exceptions that, once handled, enable the rest of the proof to be executed with a broad brush. As linked from an essay post by Scott Aaronson, Tim Gowers relates a relevant opinion by Don Zagier in a MathOverflow comment. We morph this opinion to say that mathematicians may need a “handwave heuristic” simply because many “obviously true” statements don’t connect to anything substantial that would give a reason for their truth.
This could push many proofs of type 3 toward types 2 or 1. Note that in the way interaction and randomness combine to move type 2 toward type 1, we are already agreeing to tolerate a chance of error. It is the nature of the kind of error involved in converting instances of type 3 into type 2 that needs further consideration. We wonder whether current developments such as homotopy type theory are embracing not just exact patterns but also heuristics for when a search is overwhelmingly likely to succeed—or to fail.
This still leaves our original question of shapes of computations. In the past both Dick and I have undertaken some exploration of conditions under which computations might be similarly “self-improved.” That idea will have to be for another time.
Open Problems
Can we assign particular “shapes” of computations canonically to specific computational problems? Can this help guide concrete attacks, or is it no more than tantamount to solving the big open questions to begin with?
Again we congratulate Juris and Richard on this 50th anniversary of their achievement. We also tip our hat to a comment by John Sidles in our “Case Against Cases” post which partly prompted this further essay. Much more can be said about “shapes of proofs”, including the “tree-like” and “DAG-like” proofs investigated by Steven Cook and Robert Reckhow in the 1970s.
[word changes to second paragraph of “short-cutting” section, ending, and caption, added note with Cook-Reckhow]
It’s The Vision Thing —
• http://inquiryintoinquiry.com/2015/05/10/survey-of-animated-logical-graphs-•-1/
Reblogged this on Pink Iguana.
exactly, the time/space hierarchy is very much like a characterization of computational tableau “dimensions”. there is also a lot of tie-in to CAs. its amazing how mysterious computational complexity remains a half century later. there was a lot of “low hanging fruit”/ theory that got resolved fairly quickly but then there are decades of unresolved questions, it accelerated quickly & then ran into what now seems like a brick wall. P/NP was discovered only ~½ decade and less after the seminal hartmanis-stearns paper you cite… it would be neat if someone wrote a wideranging consolidated historical survey at this pt. there are many such separate accounts.
Discourse in regard to mathematical proofs, truth, understanding, beauty, utility (etc.) has the all the diverse elements of the fable Blind men and an elephant.
No doubt this diversity is a good thing … surely mathematics would be impoverished if there were but one way to approach it … a trait that the mathematical “elephant” shares with many other structured-yet-diverse human activities (e.g., poetry, enterprise, romance, exploration, philosophy).
In recent weeks Michael Harris’ weblog Mathematics without apologies has hosted a marvelously interesting series of essays/comments upon this topic; the recent post “Jacob Lurie explains and elaborates on his ‘No Comment'” (May 18, 2015) is a good start.
The blogosphere hosts many high-quality essays … what’s notable about Harris’ recent essays is the outstanding quality and diversity of the comments.
There are several issues of computation shape and proof style that raise their heads already at the ground level of boolean functions and propositional calculus. The dimensions I have found most prominent at this stage are:
• Insight Proofs vs. Routine Proofs
• Model-Theoretic Methods vs. Proof-Theoretic Methods
• Equational (Information-Preserving) vs. Implicational (Information-Reducing) Proofs
More later, after I dig up some basic examples …
The most striking example of a “Primitive Insight Proof” (PIP❢) known to me is the Dawes–Utting proof of the Double Negation Theorem from the CSP–GSB axioms for propositional logic. There is a graphically illustrated discussion here. I cannot guess what order of insight it took to find this proof — for me it would have involved a whole lot of random search through the space of possible proofs, and that’s even if I got the notion to look for one.
There is of course a much deeper order of insight into the mathematical form of logical reasoning that it took C.S. Peirce to arrive at his maximally elegant 4-axiom set.
The insight that it takes to find a succinct axiom set for a theoretical domain falls under the heading of abductive or retroductive reasoning, a knack as yet refractory to computational attack, but once we’ve lucked on a select-enough set of axioms we can develop theorems that afford us a more navigable course through the subject.
For example, back on the range of propositional calculus, it takes but a few pivotal theorems and the lever of mathematical induction to derive the Case Analysis-Synthesis Theorem (CAST), which provides a bridge between proof-theoretic methods that demand a modicum of insight to model-theoretic methods that can be run routinely.
The AoPS “Proof Without Words Gallery“ shows us that for some proofs, the human mind can delegate “CPU” ratiocination to “GPU” insight-generators (the hyper-efficient workings of the latter being inaccessible to the conscious mind’s introspection).
A little more seriously, the cited Hartmanis / Chang / Ranjan / Rohatgi article “On IP=PSPACE and theorems with narrow proofs” asserts
With the benefit of twenty-five years of hindsight, we appreciate that that NP =? P is about this question but it is legitimate to wonder whether NP =? P is solely about this question.
Indeed, the TCS Stackexchange community wiki “Does P contain languages whose existence is independent of PA or ZFC?” asks a question that inspires the reflection:
These considerations are motivated (of course) by Hartmanis’ own writings … in particular Hartmanis’ monograph Feasible computations and provable complexity properties (1978) … indeed the postulates and arguments of the Hartmanis et al “Theorems with narrow proofs” article are crafted with scrupulous care (as I read them anyway) to avoid the explicit and/or implicit introduction of oracle-decided class-membership.
Conclusion It’s not easy even to talk sensibly about algorithms whose runtime exponents are undecidable in ZFC … much less prove theorems about them!