A surprising theorem about differential equations

 Composite of src1, src2.

Olivier Bournez and Amaury Pouly have proved an interesting theorem about modeling physical systems. They presented their paper at ICALP 2017 last month in Warsaw.

Today Ken and I wish to explain their theorem and its possible connections to complexity theory.

Of course as theorists we are most interested in discrete systems and rarely if ever mess with differential equations. I do recall, with some awe, that when I started my career at Yale the numerical analysts were experts at ODEs: that’s ordinary differential equations for the rest of us. An ODE is an equation that involves functions of one independent variable and its derivatives. A famous one is Isaac Newton’s second law of motion:

$\displaystyle F = m \frac{dv}{dt}.$

They used their ability to guess solutions to discrete recurrence systems. I do not believe there is an exact meta-theorem connecting discrete recurrences with ODEs but at the heuristic level they were able to just look at a recurrence and say I believe the solution is ${\dots}$ And they were usually right.

Let’s get back to what Bournez and Pouly (BP) have proved.

## The Old Result

Let’s first state what BP proved and then discuss their result. Their result is an extension of a 1981 theorem of Lee Rubel, which Bournez and Pouly call “an astonishing fact.” Rubel proved there is a differential equation that is “universal” in the sense expressed by his theorem statement:

Theorem 1 There exists a fourth order differential algebraic equation (DAE)

$\displaystyle \qquad P(y',y'',y''',y'''')=0 \qquad(*)$

where ${P}$ is a polynomial in four variables and integer coefficients such that for any continuous function ${\phi(t)}$ and any ${\epsilon>0}$ there is a smooth solution ${y(t)}$ to (*) such that for all real ${t}$,

$\displaystyle |y(t)-\phi(t)| < \epsilon.$

Rubel actually proved more: the theorem allows ${\epsilon}$ to be any continuous positive function: so the error between the solution ${y}$ and ${\phi}$ can decay off as ${t}$ tends to plus or minus infinity. He also exhibited specific polynomials ${P}$. If we rename the differentials to variables ${w,x,y,z}$, respectively, then Rubel’s simplest one is:

$\displaystyle P(\!w,\!x,\!y,\!z\!) \!=\! 3w^4 x z^2 \!-\! 4 w^4 y^2 z \!+\! 6w^3 x^2 yz \!+\! 24 w^2 x^4 z \!-\! 12 w^3 x y^3 \!-\! 29 w^2 x^3 y^2 \!+\! 12 x^7.$

Boulez and Pouly note that simpler ones were found by others, including two whole families by Richard Duffin and one in a neat paper by Keith Briggs where one can take any ${n > 3}$:

$\displaystyle \begin{array}{rcl} P_1(w,x,y,z) &=& n^2 w^2 z + 3n(1-n)wxy + (2n^2 - 3n + 1)x^3\\ P_2(w,x,y,z) &=& nw^2 z + (2 - 3n)wxy + 2(n-1)x^3\\ P_3(w,x,y,z) &=& n^2 w^2 z - 3n^2 wxy + 2(n^2 - 1)x^3 \end{array}$

Note that all three have the same “differential monomials” and approach each other as one chooses ${n}$ higher, as is most obvious on multiplying ${P_2}$ by ${n}$.

## The New Result

The theorem may be astonishing but according to BP is also a bit disappointing. They say:

As we said, Rubel’s proof can be seen as an indication that (fourth-order) polynomial implicit DAE is too loose a model compared to classical ODEs, allowing in particular to glue solutions together to get new solutions. As observed in many articles citing Rubel’s paper, this class appears so general that from an experimental point of view, it makes little sense to try to fit a differential model because a single equation can model everything with arbitrary precision.

They cite two more deficiencies of all these results:

First, … the proofs heavily rely on the fact that constructed DAE does not have unique solutions for a given initial data. … Rubel’s DAE never has a unique solution, even with a countable number of [initial] conditions of the form ${y^{(k_i)}(a_i) = b_i}$. Second, the proofs usually rely on solutions that are piecewise defined. Hence they cannot be analytic, while analycity is often a key expected property in experimental sciences.

This leads into what BP proved (their emphasis):

Theorem 2 There exists a fixed polynomial in ${d+1}$ variables, for some ${d}$, so that for any continuous ${f(t)}$ and any ${\epsilon>0}$ there exist ${\alpha_{0},\dots,\alpha_{d-1}}$ so that there is a unique analytic function so that

1. The function ${y}$ is a solution in the sense that

$\displaystyle y(0)=\alpha_{0},y'(0)=\alpha_{1}(0),\dots,y^{(d-1)}(0)=\alpha_{d-1}\quad\text{and}\quad p(y,y',\dots,y^{d})=0;$

2. For all real ${t}$,

$\displaystyle {|y(t)-f(t)| < \epsilon}$.

As with Rubel’s result they actually prove the stronger form where ${\epsilon = \epsilon(t)}$ can be any continuous positive function so that the error between the solution ${y}$ and ${f}$ can decay off just like before.

## ODEs As Models

BP’s theorem differs from Rubel’s in that their solution is unique and analytic. This has several ramifications. For one it makes the proof harder, since gluing functions together as Rubel did fails for analytic function. The proof uses some programming tricks instead. One ingredient is that the function

$\displaystyle \frac{1}{2 - \cos(x) - \cos(\alpha x)}$

can be modulated for any irrational ${\alpha}$ to bring both ${x}$ and ${\alpha x}$ arbitrarily close to multiples of ${2\pi}$ without the denominator vanishing. That is, they can make the fraction grow fast while retaining analyticity. Émile Borel once conjectured that solutions to ${n}$-variable ODEs that are defined on all of ${\mathbb{R}}$ must have growth bounded by a stack of ${n}$ exponentials. This conjecture was refuted even for ${n = 2}$, and BP were able to apply the refutation.

Second, their theorem cuts down the extreme multiplicity of solutions all the way to one. In real life we expect that the equation given the initial conditions uniquely forces the total behavior of the solution. As they quote, Rubel’s paper had even given as open “whether we can require in our theorem that the solution that approximates ${\phi}$ to be the unique solution for its initial data.”

Their result can be viewed both as a positive result and a negative one. On the one hand they show that ODEs are very powerful and can model almost any reasonable computation. You can think of this as showing the power of analog computation. On the other hand, they show that models based on ODEs may likewise be too powerful. They too may be so powerful that they are not really useful.

One weakness is that their polynomial ${p}$ is not so simple. They outline how their construction can be effectivized to give ${d}$ somewhere north of 300. Possibly it can be tightened much further. They point out an analogy with the kind of universality shown for the Riemann zeta function by Sergei Voronin, citing one of our two posts on it for reference. The zeta function is analytic but cannot be a solution to any polynomial ODE (nor DAE). We can try to motivate the task of improving ${p}$ and ${d}$ by noting how a minimum universal ODE is a natural and fixed version of a minimum universal program.

## Open Problems

What are some further implications of the new result for modeling nature? Is there some way we can stratify differentially simulatable systems according to some (concrete) measure of their computational complexity?