Congratulations to John Nash and Louis Nirenberg on the 2015 Abel Prize

John Nash and Louis Nirenberg have jointly won the 2015 Abel Prize for their work on partial differential equations (PDEs). They never wrote a joint paper, but Nirenberg evidently got Nash excited about David Hilbert’s 19th problem during Nash’s frequent visits to New York University’s Courant Institute in the mid-1950s. Nash in return stimulated Nirenberg with his verbal approach of barraging a problem with off-kilter ideas. The Norwegian Academy of Science and Letters recognized their ‘great influence on each other’ in its prize announcement.

Today we congratulate both men on their joint achievement.

Hilbert’s 19th problem asked whether all solutions to certain partial differential equations must be analytic functions—that is, expressible as power series on local neighborhoods. Enough progress had been made since the 1930s that the remaining task could be likened to building a short road bridge without central stanchions or much room below for support. If you just shoot the road across, it is hard to maintain the level needed to reach the other side. But if you aim up, then you create more room to make an arch for best support.

The level known before Nash went to work—and Ennio De Giorgi slightly earlier in Italy—was that solutions to Hilbert’s equations gave a property on second derivatives (pardoning a negligible set of points with discontinuities or other bad non-differentiable behavior) that was insufficient. The support structure on the other side of the bridge needed some kind of continuity condition on the first partial derivatives. The task was aiming at a condition low enough to prove but high enough to land where needed on the other side. This makes us wonder whether we have similar situations in computational complexity without fully realizing it.

Nirenberg is one of the few surviving researchers who worked on the Manhattan Project—in his case, on a part that had been contracted to Canada’s National Research Council in Montreal. Richard Courant’s son Ernst was a co-worker and suggested NYU as a destination for a Master’s, which led to doctoral work and affiliation there. For his PhD thesis he completed an attack by Hermann Weyl on the problem of embedding the 2-sphere, equipped with any Riemannian metric of positive Gaussian curvature, into ${\mathbb{R}^3}$ as a convex surface with the standard Euclidean metric, so that paths of corresponding points have the same length under the respective metrics. With Luis Caffarelli and Joseph Kohn he gave what are still considered the most stringent restrictions on possible singularities in solutions to the Navier-Stokes equations. He is acclaimed as the grand master of enduringly useful inequalities, which he said he “loves” more than equations.

## Continuity Conditions

The basic definition for a function ${f : A \rightarrow B}$ to be continuous is that for every ${x \in A}$ and ${\epsilon > 0}$ there exists ${\delta > 0}$ such that for all ${y \in A}$ with ${||x - y|| < \delta}$, ${||f(x) - f(y)|| < \epsilon}$. Actually, a more basic definition is that for every open subset ${S}$ of ${B}$, ${f^{-1}(S)}$ is open as a subset of ${A}$, but we are presupposing metrics on ${A}$ and ${B}$ so that ${||\cdot||}$ is defined for each.

If for every ${\epsilon}$ there is a ${\delta}$ that works for all ${x}$, then ${f}$ is uniformly continuous. In much of real or complex analysis this is the strongest uniformity condition one needs. It comes for free if ${A}$ is compact. We can postulate a mapping ${g}$ sending ${\epsilon}$ to ${\delta}$:

$\displaystyle (\exists g)(\forall \epsilon)(\forall x)(\forall y): ||x - y|| < g(\epsilon) \implies ||f(x) - f(y)|| < \epsilon.$

However, this still does not articulate a numerical relationship between ${||x - y||}$ and ${||f(x) - f(y)||}$. It does not guarantee differentiability, much less that the first derivatives be continuous.

The key conditions have the form

$\displaystyle ||f(x) - f(y)|| \leq C||x - y||^{\alpha},$

where ${C}$ and ${\alpha}$ are real constants. If ${\alpha = 1}$ and ${C < 1}$ then ${f}$ is a contracting mapping. Contracting mappings can be called fellow-travelers of Nash’s work. Nash used the Brouwer fixed-point theorem to prove his famous theorem about equilibria in non-zero-sum games, but one can also use a more-general fixed-point theorem together with contracting mappings. Matthias Günther found an alternative proof of Nash’s equally great embedding theorem for Riemannian manifolds into Euclidean space via contracting mappings.

If ${\alpha = 1}$ and ${C}$ is arbitrary, then ${f}$ satisfies a Lipschitz condition, named for Rudolf Lipschitz. Although Lipschitz conditions are commonly available and frequently useful, they were still too high to aim for in this problem. What De Giorgi and Nash accomplished was showing that things could work with any appropriately given or chosen ${\alpha}$. The criterion with exponents ${0 < \alpha < 1}$ allowed is called Hölder continuity, named for Otto Hölder.
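
To make the contrast concrete, here is a small numeric sketch (our illustration, not part of the original argument) using ${f(x) = \sqrt{x}}$ on ${[0,1]}$: it satisfies the Hölder condition with ${\alpha = 1/2}$ and ${C = 1}$, yet no Lipschitz constant can work near ${0}$.

```python
import math

def f(x):
    return math.sqrt(x)

# Hölder check with C = 1, alpha = 1/2: |sqrt(x) - sqrt(y)| <= sqrt(|x - y|)
# holds for all x, y in [0, 1]; we spot-check it on a grid.
grid = [i / 20 for i in range(21)]
holder_ok = all(abs(f(x) - f(y)) <= abs(x - y) ** 0.5 + 1e-12
                for x in grid for y in grid)

# Lipschitz fails: the ratio |f(x) - f(0)| / |x - 0| = 1 / sqrt(x) grows
# without bound as x -> 0, so no single constant C suffices.
ratios = [abs(f(x) - f(0.0)) / x for x in (1e-2, 1e-4, 1e-6)]

print(holder_ok)   # True
print(ratios)      # roughly [10, 100, 1000]: unbounded growth
```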

Hölder continuity turned out to be the key for Hilbert’s 19th. The proof again was like building a bridge. For any instance of Hilbert’s equations, one could find some ${\alpha}$ and ${C}$ to make the inequality work. Given any ${\alpha > 0}$ (and ${C}$), analyticity could be shown to follow. We wonder if we can solve open problems in computer theory not by attacking them directly but by trying to set up this kind of proof structure.

## Continuity and Containment

Applying this idea could be as simple as the condition for saying one complexity class ${C}$ is contained in another class ${D}$. Consider the statement

$\displaystyle \mathsf{NP = P},$

where the classes are represented by standard computable enumerations ${[N_i]}$ of poly-time NTMs and ${[P_k]}$ of poly-time DTMs, respectively. The enumerations are nondecreasing with respect to code size. In terms of those machines, the statement is:

$\displaystyle (\forall i)(\exists k)\,L(N_i) = L(P_k).$

We can strengthen this by insisting that ${k}$ be given by a computable mapping of ${i}$:

$\displaystyle (\exists g)(\forall i): L(N_i) = L(P_{g(i)}).$

Thus a complexity class containment involves a mapping between “spaces” of machines. We can ask what further conditions such a mapping may or must meet. If we start with machines ${N_i}$ and ${N_j}$ that are “close” in some respect, how far apart can the machines ${P_{g(i)}}$ and ${P_{g(j)}}$ be?

Because ${\mathsf{NP}}$ has complete sets we get some further properties for free. If ${\mathsf{NP = P}}$ then there is an ${s}$ such that ${\mathsf{SAT} = L(P_s)}$. The mapping ${g}$ from ${N_i}$ executes the reduction to ${\mathsf{SAT}}$ and composes its output with ${P_s}$ to get ${P_k = P_{g(i)}}$ such that ${L(P_k) = L(N_i)}$.
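
As a toy rendering of this composition (our sketch; `reduction` and `P_s` are hypothetical stand-ins, with direct evaluation in place of a real reduction and SAT solver), the mapping ${g}$ is just code-gluing:

```python
# Toy model of the mapping g (illustrative names, not a real construction):
# "machines" are Python predicates, `reduction` stands in for the Cook-Levin
# translation of "N accepts x" into a formula, and `P_s` stands in for the
# poly-time SAT solver hypothesized under NP = P.

def reduction(N, x):
    # A real reduction would build a Boolean formula; the toy just packages
    # the machine and input as the "formula."
    return (N, x)

def P_s(formula):
    # Stand-in solver: evaluates the packaged instance directly.
    N, x = formula
    return N(x)

def g(N):
    # P_{g(i)} simply embeds the code of N_i and P_s, so its size is
    # linear in |N_i| plus a fixed constant for P_s.
    return lambda x: P_s(reduction(N, x))

# Stand-in NTM: accepts strings with an even number of 1s.
N_i = lambda x: x.count("1") % 2 == 0
P_gi = g(N_i)
print(P_gi("1011"))   # prints False (three 1s)
print(P_gi("11"))     # prints True
```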

It is important to note that although inputs ${x}$ of length ${n}$ given to ${P_{g(i)}}$ are expanded by the reduction into formulas of size more than linear in ${n}$ which are input to ${P_s}$, the code of ${P_{g(i)}}$ simply embeds that of ${N_i}$ and ${P_s}$ and so has size linear in ${|N_i|}$. Moreover, if we weaken the mapping condition to say

$\displaystyle (\exists g)(\forall i): L(N_i) \equiv^F L(P_{g(i)}),$

where ${A \equiv^F B}$ means that ${(A \setminus B) \cup (B \setminus A)}$ is finite, then we can employ Leonid Levin’s universal search algorithm to write the code of ${P_s}$ in advance. This ensures that the code expansion from ${N_i}$ to ${P_{g(i)}}$ has a reasonable additive constant. In any event, with respect to the ‘metric’ of code size we can deduce a kind of Lipschitz condition: for all ${i < j}$:

$\displaystyle |P_{g(j)}| - |P_{g(i)}| \leq C\cdot(|N_j| - |N_i|).$
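
A toy rendering of this size accounting (our sketch, with program “code” as plain strings and a made-up placeholder for the code of ${P_s}$):

```python
# Toy sketch: P_{g(i)}'s code embeds N_i's code next to the fixed code of
# P_s, so sizes differ only by a constant offset (placeholder string below).
P_s_code = "<code of universal/Levin solver>"   # hypothetical fixed program

def g_code(N_code):
    # Embed rather than expand: concatenate the two code strings.
    return N_code + "|" + P_s_code

codes = ["NTM:" + "x" * k for k in (5, 17, 40)]  # stand-ins for N_i, N_j, ...
sizes_N = [len(c) for c in codes]
sizes_P = [len(g_code(c)) for c in codes]

# |P_{g(j)}| - |P_{g(i)}| equals |N_j| - |N_i|: a Lipschitz condition, C = 1.
diffs = [(sizes_P[j] - sizes_P[0], sizes_N[j] - sizes_N[0]) for j in (1, 2)]
print(diffs)   # each pair has equal entries
```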

And with respect to running time, although that of ${P_s}$ can be astronomical (as we noted in last year’s post), it is still ${O(n^{\alpha})}$ for some fixed ${\alpha}$. The ${n^c}$ running time of ${N_i}$ does mean the reduction expands inputs ${x}$ into formulas of size ${\tilde{O}(n^c)}$ (the tilde means ignoring factors polynomial in ${\log n}$), which makes an overall runtime of ${\tilde{O}(n^{c\alpha})}$. Rather than use “${\tilde{\Theta}}$” for exact runtimes, let us ignore more than just ${\log}$ factors by defining a function ${t(n)}$ to be ${\Phi(n^c)}$ if for all ${b}$ and ${d}$ with ${b < c < d}$,

$\displaystyle n^b = o(t(n)) = o(n^d).$
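
For a quick numeric sanity check on this definition (our sketch), take ${t(n) = n^2\log n}$, which should be ${\Phi(n^2)}$: with ${b = 1.9}$ and ${d = 2.1}$, both ratios ${n^b/t(n)}$ and ${t(n)/n^d}$ shrink toward ${0}$. The second converges slowly, so widely spaced sample points are used.

```python
import math

def t(n):
    return n**2 * math.log(n)

b, d = 1.9, 2.1
samples = (1e6, 1e15, 1e30)

# Phi(n^2) requires n^b = o(t(n)) and t(n) = o(n^d) for every b < 2 < d;
# both ratio sequences below decrease toward 0 along the samples.
lower = [n**b / t(n) for n in samples]   # n^b / t(n) -> 0
upper = [t(n) / n**d for n in samples]   # t(n) / n^d -> 0 (slowly)

print(lower)
print(upper)
```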

What we’d like to do is take two machines ${M}$ and ${N}$—deterministic or nondeterministic—that have runtimes in ${\Phi(n^b)}$ and ${\Phi(n^c)}$, respectively, and define a distance ${d(M,N)}$ in terms of ${b}$ and ${c}$. We’d further like to arrange at least that under the hypothesized mapping ${g}$,

$\displaystyle d(P_{g(i)},P_{g(j)}) \leq C\cdot d(N_i,N_j)^{\alpha'},$

perhaps with ${\alpha' > \alpha}$. This uses the putative runtime ${\Phi(n^\alpha)}$ of ${P_s}$ to create an analogue of a Hölder condition.

If we define the metric simply on the exponents as ${|c - b|}$ then we get a Lipschitz condition. The running times of ${P_{g(i)}}$ and ${P_{g(j)}}$ become ${\Phi(n^{\alpha b})}$ and ${\Phi(n^{\alpha c})}$, so their ${d}$-distance is (at most) ${\alpha\cdot|c - b|}$. However, we would like to involve quantities like “${n^b}$” and “${n^c}$” or something else that is exponential in ${b}$ and/or ${c}$ in the metric. We could try ${d'(N_i,N_j) = |2^c - 2^b|}$, but then to get even a Hölder condition on the mapping we are seeking ${\alpha'}$ such that

$\displaystyle |2^{\alpha c} - 2^{\alpha b}| \leq C\cdot|2^c - 2^b|^{\alpha'}.$

This is not valid without further qualification because ${2^c - 2^b = 1}$ is possible, among other things. We would be interested to find a reasonable metric based on running-time and/or program size that gives a Hölder but not Lipschitz condition.
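
A small numeric check of this failure (our sketch): fixing ${2^c - 2^b = 1}$ and taking ${\alpha = 2}$ for illustration, the left-hand side ${2^{\alpha c} - 2^{\alpha b}}$ grows without bound while ${C\cdot|2^c - 2^b|^{\alpha'} = C}$ stays constant, so no choice of ${C}$ and ${\alpha'}$ works.

```python
import math

alpha = 2.0   # illustrative exponent blow-up from composing with P_s

lhs = []
for b in range(1, 31):
    c = math.log2(2.0**b + 1.0)       # chosen so that 2^c - 2^b = 1
    lhs.append(2.0**(alpha * c) - 2.0**(alpha * b))

# Algebraically lhs = (2^b + 1)^2 - 2^(2b) = 2^(b+1) + 1, which is unbounded,
# while the right-hand side C * |2^c - 2^b|^alpha' equals C for every alpha'.
print(lhs[0], lhs[9], lhs[29])   # roughly 5, 2049, 2^31 + 1
```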

Can ${\mathsf{P = NP}}$ happen anyway with a Hölder or even Lipschitz condition under a metric like ${d'}$? It does with an oracle. The construction giving

$\displaystyle \mathsf{NP^{QBF} = P^{QBF}}$

basically maps each oracle NTM ${N_i}$ to an oracle DTM ${P_k}$ that simply bundles the formula-translation of ${N_i}$’s computation into the oracle queries, and so has the same polynomial running time as ${N_i}$ up to log factors. Hence we can get a Lipschitz property under various distances ${d}$ that use the exponents ${c}$ in running times ${\Phi(n^c)}$. This property does not necessarily “relativize,” and it may be interesting to ask what happens if it does.

## Some Quotes

Perhaps ideas like this can help probe other complexity class relations. When the classes do not have complete sets (or are not known to have them), even getting a computable embedding function ${g}$ can be problematic. That the concepts may be simple is not a block; the point is to find a combination of ideas that are conducive to deep uses. For instance, the maximum principle simply states that a solution to an elliptic or parabolic PDE attains its maximum over any connected open subset on the boundary of that subset. A Simons Foundation feature on Nirenberg quotes him as saying,

“I have made a living off the maximum principle.”

Of course this is similar to principles in convex optimization which Nash initially studied.

As with similar ideas we’ve posted, we’re casting about for a new handle on open problems. To quote Sylvia Nasar’s biography *A Beautiful Mind*, pages 218–221, about Hilbert’s 19th problem:

> [Nash] had a theory that difficult problems couldn’t be attacked frontally. He approached the problem in an ingeniously roundabout manner, first transforming the nonlinear equations into linear equations and then attacking these by nonlinear means.

At worst we can pass on advice from De Giorgi, who narrowly beat Nash to the solution with a markedly different proof:

“If you can’t prove your theorem, keep shifting parts of the conclusion to the assumptions, until you can.”

Wikipedia’s bio of De Giorgi cites this from a MathOverflow thread titled “Should one attack hard problems?” that leads with ${\mathsf{P = NP?}}$

Nasar finishes her account by addressing the “shock” of De Giorgi’s earlier proof on Nash, quoting Peter Lax that when the two met at Courant in 1957, “it was like Stanley meeting Livingstone.” She puts more blame for Nash’s subsequent troubles, quoting Nash himself, on “his attempt to resolve the contradictions in quantum theory,” a kind of attempt we have also been guilty of promoting. Oh well.

## Open Problems

Can such ideas of continuity, metrics, and more broadly topology help to gain insight about complexity classes?

1. April 16, 2015 6:05 am

There is some work I did toward the differential analysis of Boolean functions here:

Differential Logic and Dynamic Systems

3. April 16, 2015 8:25 pm

I chose ${\Phi}$ in ${\Phi(n^c)}$ to stand for “fuzzy ${n^c}$” and because it’s like a ${\Theta}$ rotated 90 degrees. I hope I’ve got the statement of Weyl’s problem right—as with “analytic” there are different definitions of “isometric” and I wanted to be more helpful than just saying “isometrically embedded.” Plus clarifying that the metric in ${\mathbb{R}^3}$ is taken along the surface of the body not through its interior. Nirenberg in this 2002 AMS interview partially contradicts Nasar’s account of his interaction with Nash, and I tried in the intro to bridge the accounts along lines recognized by the Norwegian Academy.

4. April 17, 2015 11:00 am

Minor typo: in the second display formula in the “Continuity and Containment” section, k is unquantified and the quantification over j is not used. So, either j should be k or k should be j :).

• April 17, 2015 12:03 pm

Thanks. There was another P_j that needed changing to P_k further down. I guess we need a ‘Lint for WordPress Math’ :-). My original draft’s technical part ended shortly after the mention of Levin’s universal machine; it didn’t try to carry the allusion to “metrics” any further. So it just did N_i-to-P_j. Then when I felt the rest could stand as speculation (with some nod from Dick) I grouped ‘i’ and ‘j’ with NTMs and edited P_k to be the DTM.