Should we expect simplicity in a theory named for complexity?

*Amer. Phy. Soc. interview source*

Sabine Hossenfelder is a physicist at the Frankfurt Institute for Advanced Studies who works on quantum gravity. She is also noted for her blog BackRe(Action) and has a forthcoming book, Lost in Math: How Beauty Leads Physics Astray. Its thesis is that the quest for beauty and simplicity in physics has led to untestable theories and diverted attention from concrete engagements with reality.

Today we wonder whether her ideas can be tested, at least by analogy, in computational complexity.

Her book is slated to appear on June 12. We have not seen an advance copy, but the book grew from her past commentaries, including this from 2016, this in Nature in 2017, and this last week. Criticism of string theory goes back even before 2006, when the book and blog Not Even Wrong by Peter Woit of Columbia and the book The Trouble With Physics by Lee Smolin appeared. We are not trying to join that debate but rather to engage with the general thesis she stated here:

> Do we actually have evidence that elegance is a good guide to the laws of nature?

She continues: “The brief answer is no, we have no evidence. … Beautiful ideas sometimes work, sometimes they don’t. It’s just that many physicists prefer to recall the beautiful ideas which did work.” For example, supersymmetry is beautiful but has gone far toward a definite “doesn’t work” verdict.

In theoretical computing and mathematics we both remember and preserve beautiful ideas that work. But as bloggers looking to the future as she does, we address ideas that have not yet emerged from their shells, to help judge which ones to try hatching. Algorithms and complexity have feet planted not just in Platonic reality but in the empirical fact of programs giving correct answers within the time and other constraints we say they will. Hence we have a testbed for how often a priori beautiful ideas have proved effective, and vice versa.

## Physics and Complexity

Certainly the burst of particle physics in the early-to-mid 20th century came with unanticipated complexity. We mention one well-known anecdote that, to judge from her index, is not among those in her book: Isidor Rabi won the 1944 Nobel Prize for his discovery of nuclear magnetic resonance, which he used not to treat sports injuries but to discern the magnetic moment and nuclear spin of atoms. When the muon was discovered but appeared to play no role in nuclear interactions, he famously reacted by exclaiming,

> Who ordered that?

Muons are now ingrained in the Standard Model of particle physics, which has much beauty but also “bolted-on” aspects that those seeking greater beauty hope to supersede. The model is incomplete with regard to gravity and neutrino masses, and it leaves issues about dark energy and the matter/antimatter imbalance unaddressed.

William of Ockham’s “Razor” is most often quoted as “Entities should not be multiplied beyond what is necessary,” a Latin formulation due to John Punch in the early 1600s. Estimating where the bar of “necessary” is set is still an issue. Anselm Blumer, Andrzej Ehrenfeucht, David Haussler, and Manfred Warmuth connected Ockham’s Razor to the complexity of learning in 1987, and this was further sharpened by Ming Li, Paul Vitanyi, and John Tromp. Further connections via Kolmogorov complexity and algorithmic probability lead to arguments summarized in a nice survey by Li and Vitanyi with Walter Kirchherr. They quote John von Neumann,

> The justification (of a model) is solely and precisely that it is expected to work. … Furthermore, it must satisfy certain aesthetic criteria—that is, in relation to how much it describes, it must be simple.

and continue in their own words:

> Of course there are problems with this. Why should a scientist be governed by ‘aesthetic’ criteria? What is meant by ‘simple’? Isn’t such a concept hopelessly subjective?

The answer they seek is that simpler theories have a higher probability of having been actuated. This may apply well, in testable ways, in large-scale environments such as machine learning and “fieldwork” in the biological sciences. Whether it applies at the scale of one theory for one universe is another matter.
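The probabilistic claim rests on the universal prior, under which a hypothesis whose shortest description is ${L}$ bits long gets weight proportional to ${2^{-L}}$. A minimal sketch of the arithmetic, with made-up bit counts for two hypothetical theories (our own toy illustration, not from the survey):

```python
from fractions import Fraction

def universal_prior_weight(program_length_bits: int) -> Fraction:
    """Unnormalized universal-prior weight of a hypothesis whose
    shortest description is program_length_bits long: 2^-L."""
    return Fraction(1, 2 ** program_length_bits)

# Two hypothetical theories explaining the same data (bit counts invented):
simple_theory = 50    # describable in 50 bits
baroque_theory = 120  # needs 120 bits

ratio = universal_prior_weight(simple_theory) / universal_prior_weight(baroque_theory)
print(ratio)  # 2^70 -- the simpler theory is astronomically favored
```

The point is only the exponential gap: every extra bit of description halves the prior weight, so “simpler” wins long before any data arrive.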

At least we can say that complexity theory proposes grounds for judgment in the physics debate. Hossenfelder seems aware of this, to go by a snippet on page 90 that was highlighted by an early reviewer of her book:

> Computational complexity is in principle quantifiable for any theory which can be converted into computer code. We are not computers, however, and therefore computational complexity is not a measure we actually use. The human idea of simplicity is very much based on ease of applicability, which is closely tied to our ability to grasp an idea, hold it in mind, and push it around until a paper falls out.

It hence strikes us as all the more important to reflect on what complexity is like as a theory.

## Complexity and the Three Families

We have three main natural families of complexity classes: ${\mathsf{DTIME}[t(n)]}$, ${\mathsf{NTIME}[t(n)]}$, and ${\mathsf{DSPACE}[s(n)]}$. Up to polynomial equivalence these stand apart and form a short ladder with rungs:

- ${s(n) = O(1)}$;
- ${s(n) = O(\log n)}$ and ${t(n) = 2^{O(s(n))} = n^{O(1)}}$;
- ${s(n) = n^{O(1)}}$ and ${t(n) = \exp(n^{O(1)})}$;
- ${s(n) = \exp(n^{O(1)})}$, which is exemplified among natural computational problems by the computation of Gröbner bases and the equivalence of regular expressions with squaring.
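The power of squaring can be seen concretely: an expression of description size ${O(n)}$ can force matching strings of length ${2^n}$, which is the root of the exponential-space blowup for equivalence testing. A minimal sketch in Python, using the bounded repeat `{2}` to play the role of the squaring operator:

```python
import re

def squared(expr: str, n: int) -> str:
    """Apply 'squaring' n times; the bounded repeat {2} doubles the
    required string length at each application."""
    for _ in range(n):
        expr = f"({expr}){{2}}"
    return expr

e = squared("a", 10)
print(len(e))                                   # 51 -- linear in n
print(re.fullmatch(e, "a" * 1024) is not None)  # True -- matches exactly a^(2^10)
```

Ten applications yield a 51-character expression whose only matching string has length 1024; without squaring, an equivalent expression must spell out all 1024 repetitions.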

Complexity theory’s first remarkable discovery is that almost all of the many thousands of much-studied computational problems are quantized into the completeness levels of these classes. The reductions involved are often much finer than their defining criterion of being poly-time or log-space computable. Without question the reductions and quantization into three families are beautiful. Requoting Rabi now:

> Who ordered them?

The families intertwine:

$\displaystyle {\mathsf{DSPACE}[O(\log n)] \subseteq \mathsf{P} \subseteq \mathsf{NP} \subseteq \mathsf{PSPACE} \subseteq \mathsf{EXP} \subseteq \mathsf{NEXP} \subseteq \mathsf{EXPSPACE}.}$

The problems they quantize are similarly ordered by reductions. Thus we can extend Rabi with a pun:

> Who totally ordered them?
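To make the ordering by reductions concrete, here is a toy many-one reduction (our own illustration, not from the post): a graph has an independent set of size ${k}$ exactly when it has a vertex cover of size ${|V| - k}$, so any solver for Vertex Cover answers Independent Set questions too. The brute-force procedure below stands in for whatever oracle we have for the target problem:

```python
from itertools import combinations

def has_vertex_cover(vertices, edges, k):
    """Brute-force decision procedure for Vertex Cover (a stand-in
    for any solver or oracle for the target problem)."""
    return any(all(u in cover or v in cover for u, v in edges)
               for cover in combinations(vertices, k))

def has_independent_set(vertices, edges, k):
    """Many-one reduction: S is independent iff V - S is a vertex
    cover, so ask about covers of size |V| - k instead."""
    return has_vertex_cover(vertices, edges, len(vertices) - k)

V = [1, 2, 3, 4]
E = [(1, 2), (2, 3), (3, 4)]         # a path on four vertices
print(has_independent_set(V, E, 2))  # True: {1, 3} is independent
print(has_independent_set(V, E, 3))  # False
```

The reduction itself does only trivial arithmetic; all the hardness is delegated to the oracle call, which is the pattern behind the quantization into completeness levels.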

Yet whether these classes are all distinct has escaped proof. The belief they are distinct is founded not on elegance but on myriad person-years of trying to solve these problems.

Stronger separation conjectures such as Unique Games and (S)ETH, however, seem to be hailed as much for explanatory power as for solid evidence. As a cautionary coda to how we have blogged about both, we note that the former’s bounds were shaved in exactly the range of exponential time bounds that the latter hypotheses rely on for their force.

What is also like the situation in physics is a disconnect between (i) how complexity theory is usually defined via asymptotic time and space measures and (ii) concrete real-world feasibility of algorithms, aspects of which we have flagged. This also infects the reliance on unproven assumptions in cryptography, as many have remarked, and may be unavoidable. In crypto, at least, there is vast experience with attempts to break the conditionally secure systems, a check we don’t see how to have with published algorithms.

## Bolder Conjectures

Rather than shrink from the physics analogy, we want to test it by going even further with conjectures and comparing their ramifications for theory-building. Here is the first:

> Every “reasonable” complexity class is equal to a member of one of the three main families.

Note that some of the big surprises in complexity theory went in the direction of this conjecture. The result that ${\mathsf{IP}=\mathsf{PSPACE}}$ is a perfect example. Also Savitch’s theorem, giving ${\mathsf{NSPACE}[s(n)] \subseteq \mathsf{DSPACE}[s(n)^2]}$, shows that up to polynomial equivalence we need only ${\mathsf{DSPACE}[s(n)]}$ and not its nondeterministic counterpart, which is in any case closed under complement. Short of specifying exactly which of the several hundred classes in the complexity “Zoo” are “reasonable,” we note that many of its classes ${\mathcal{C}}$ are reasonable and such that the equality of ${\mathcal{C}}$ to one of the basic time or space classes would be a huge result. For classes ${\mathcal{C}}$ that are not polynomially closed, such as linear time, linear space, or ${2^{O(n)}}$ exponential time, we would still get equality to a basic ${t(n)}$-time or ${s(n)}$-space class.

Our second conjecture might be called “superreducibility”:

> For every two “natural” computational problems ${A}$ and ${B}$, either ${A \leq_T^p B}$ or ${B \leq_T^p A}$.

This is roughly entailed by the first conjecture since the three families are totally ordered. It may be viable for finer reductions that collapse complements, such as polynomial-time one-query reducibility. It is however false without the “natural” qualifier: by a theorem of Richard Ladner, whenever ${A \leq_T^p B}$ but ${B}$ does not reduce back to ${A}$, there are infinitely many pairwise-incomparable languages between ${A}$ and ${B}$. We wonder whether one can formulate an opposite of the “gappiness” property used to prove this theorem in order to make the second conjecture more formal.
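As a reminder of the latitude ${\leq_T^p}$ allows, here is a toy search-to-decision Turing reduction (our own illustration): finding the smallest factor of ${n}$ by binary search, using only ${O(\log n)}$ calls to a yes/no oracle for the decision version.

```python
def smallest_factor(n, has_factor_leq):
    """Turing reduction from factor-finding to the decision problem
    'does n have a nontrivial factor d with 2 <= d <= m?'.
    Makes only O(log n) oracle queries."""
    if n < 4 or not has_factor_leq(n, n - 1):
        return None                   # n is prime (or too small)
    lo, hi = 2, n - 1
    while lo < hi:                    # binary search on the threshold m
        mid = (lo + hi) // 2
        if has_factor_leq(n, mid):
            hi = mid
        else:
            lo = mid + 1
    return lo

# Brute-force oracle standing in for an arbitrary black box.
def oracle(n, m):
    return any(n % d == 0 for d in range(2, m + 1))

print(smallest_factor(91, oracle))   # 7
print(smallest_factor(97, oracle))   # None
```

Unlike a many-one reduction, the reduction may query the oracle adaptively many times, which is why Turing reducibility is the coarser and more forgiving notion in the conjecture.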

Combined time-space classes ${\mathsf{TISP}[t(n),s(n)]}$ for different ${t(n),s(n)}$ pairs may furnish exceptions to both conjectures, but how natural are they? Eric Allender noted to us that ${\mathsf{TimeAlternations(exp,poly)}}$ has the first-order theory of the reals with addition as a natural complete problem, as shown by Leonard Berman. It fits between ${\mathsf{NEXP}}$ and ${\mathsf{EXPSPACE}}$, but equality to either would surprise. It preserves the total-order conjecture, however. Closer to home are “intermediate” problems within ${\mathsf{NP}}$ or in the realm of ${\mathsf{BQP}}$ or ${\mathsf{SZK}}$ or others. We surveyed work by Eric and others that gives some of these problems greater inter-relatedness under randomized Turing reductions but less likelihood of hardness. Notwithstanding these issues, we feel it will take a stronger principle to deflate the guidance value of these conjectures.

If we had a choice in building complexity theory, would we build it like this? Should we invest effort to simplify the theory? Is there a model that improves on the Turing machine? Are there theories within computational complexity for which lack of beauty inhibits their development? For one example, Dick and I started a theory of “progressive” algorithms but ran into uglefactions.

## Kolmogorov Complexity as an Example

The clearest example for our thoughts about theory-building may be Kolmogorov complexity (KC) itself. It is the most direct effort to quantify information. If there is any place where we should expect a simple theory with unique concrete answers and radiant beauty, this is it.

Much as we love and apply the subject, we do not get that tingling feeling. First, the metric by which it is quantified—a universal Turing machine (UTM)—is an arbitrary choice. Any UTM ${U}$ has equal footing in the theory as it stands. The difference made by the choice of ${U}$ is just an additive shift related to the size of ${U}$, and the theory is invariant under such shifts. But if you want to know concrete KC values, there are strenuous efforts to make.

Second, there are multiple basic definitions, starting with whether the set of code strings needs to be prefix-free. No clear winner has emerged from the original competing proposals.

Third, the metric is uncomputable. Proposals for approximating it by feasible KC notions have only multiplied the entities further. One can base them on automata that have computable decision properties but then there are as many notions as automata models. I (Ken) mentioned here a conversation last year among several principals in this line of work that did not radiate satisfaction about concreteness.

Fourth, these issues complicate the notation. Is it ${K(x)}$ or ${C(x)}$ or ${KC(x)}$—or ${Kt(x)}$ or ${K^t(x)}$ or ${KT(x)}$—conditioned by default or not on ${n = |x|}$ and the like, and are we dropping or keeping additive constants and (log-)logs?

We note a new paper on making the KC theory more “empirical” that may help clean things up. But in the meantime, we cannot deny its importance and success. Our point is that the above marks of ugliness are a brute fact of reality, and any attempt at a more beautiful theory of strings would fly in the face of them.
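A small empirical illustration (our own, not from the post) of how the feasible stand-ins multiply: each off-the-shelf compressor yields its own upper bound on the description length of a string, and hence its own “KC,” with no canonical choice among them.

```python
import bz2
import lzma
import zlib

def kc_upper_bounds(x: bytes) -> dict:
    """Each compressed length upper-bounds the description length of x
    relative to that compressor's implicit reference 'machine'."""
    return {
        "zlib": len(zlib.compress(x, 9)),
        "bz2":  len(bz2.compress(x, 9)),
        "lzma": len(lzma.compress(x)),
    }

x = b"the quick brown fox jumps over the lazy dog " * 40
print(kc_upper_bounds(x))  # three different numbers for one string
```

All three values are far below ${|x|}$ for this repetitive string, but they disagree with each other, which is exactly the "multiplied entities" problem in miniature.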

## Open Problems

In what ways might the quest for beauty and simplicity in complexity theory be necessarily compromised? What do you think of our conjectures: like them? refute them?

May 20, 2018 7:58 am

I fully agree with the view of Prof. Sabine Hossenfelder. As cryptologists, our science always looks to solve practical problems in practical time. A cryptologist cannot afford to stay inactive because of the apparent complexity of hard problem instances for which there are no known feasible solutions, such as solving multivariate polynomial equations, prime factorization of large numbers known to have large prime factors, discrete log problems on multiplicative groups, SVP in lattices, decoding an arbitrary linear code, etc. All of these are equivalent to instances of Boolean satisfiability. In cryptology it is in fact necessary to solve the all-SAT problem (that of representing all satisfying solutions of a Boolean formula). This problem is known to be even harder than the CNF-n-SAT problem, which is NP-complete for n > 2. But in practice we have to solve it anyway.

May 20, 2018 2:16 pm

A belief analogous to “the fundamental theory of the universe is beautiful” might be “for any natural problem, the fastest algorithm is beautiful.” This might not be true – for many problems, the fastest algorithm might involve a lot of specialized kluges and fine-tuned bells and whistles to reduce the running time (or memory use or whatever we want to optimize). On the other hand, these klugey algorithms don’t teach us how to solve _other_ problems. It’s only the beautiful algorithms that show us deep new ideas, surprising connections, and entire new families of algorithms.

This is why the best work in computer science involves beautiful mathematics – not necessarily the “best” in terms of engineering, but the best in terms of advancing our understanding.

May 23, 2018 12:32 pm

I don’t think I quite caught the connection between computational complexity and Kolmogorov complexity. The latter is clearly related to simplicity. The former is also called “complexity” but does not seem to have much relevance to the topic…?

May 23, 2018 3:45 pm

To make a more constructive comment: Kevin Kelly has in a number of papers given an a priori justification for Ockham’s razor. He uses the framework of formal learning theory to show that the “simplest” hypothesis has the best worst case performance for the number of retractions leading to the truth. This seems to be the kind of justification Li and Vitanyi are trying to find.

Granted, Kelly doesn’t exactly prove that “simpler theories have higher probability of having been actuated”. Nonetheless, he gives (to my knowledge) the first solid non-empirical justification of Ockham’s razor. A downside is that he has to use a rather artificial theory of learning (formal learning theory) instead of more practical theories like SLT (statistical learning theory) with strong connection to machine learning. But I guess the hope is that the results can be generalized. For example, in SLT, “simpler” hypothesis classes tend to have a lower VC dimension (alas, there are exceptions).