With a lemma from 1947 that might be useful today?

 Cropped from article on his letters

Freeman Dyson passed away last February 28th, one day short of the leap day, February 29th. He was one of the great physicists, one of the great writers about science, and one of the great thinkers of all time. He is missed.

Today we wish to discuss a tiny part of Dyson’s contributions to mathematics—and ask whether it has been developed further.

Dyson did so much in so many areas of science that we will leave it to others to discuss it. We covered a puzzle of his years ago here. He famously showed that quantum electrodynamics (QED) is a consistent theory—in particular showing that different theories connecting quantum mechanics and special relativity were the same. He published many interesting books about science in general. He speculated about aliens, about space travel, and much more.

Our focus is on a beautiful result of his proved in 1947. A result about rational numbers. Nothing grandiose, nothing about infinite visions, nothing about worlds that we cannot easily imagine.

## Approximations To Numbers

For over 2000 years we have known that ${\sqrt{2}}$ is not expressible as a rational fraction. This is sometimes credited to Hippasus of Metapontum. The obvious question, at least to math types, is how close can we make it to a rational number? The answer is interesting, with many consequences. And includes some top mathematicians such as Johann Dirichlet, Joseph Liouville, Axel Thue, Carl Siegel, Dyson, and Klaus Roth.

${\bullet }$ Hippasus: The value ${\sqrt{2}}$ is not rational.

${\bullet }$ Folklore: The value ${\sqrt{2}}$ like all algebraic numbers ${\alpha}$ can be algorithmically approximated by rationals. Note that ${\sqrt{2}}$ is algebraic since it satisfies the equation

$\displaystyle x^{2} - 2 = 0,$

with ${D=2}$. The number ${D}$ is called the degree of ${\alpha}$.

${\bullet }$ Dirichlet: The value ${\alpha}$ can be well approximated. That is

$\displaystyle \left| \alpha - \frac{p}{q} \right| \le \frac{1}{q^{2}},$

can be done for infinitely many ${p}$ and ${q}$.

${\bullet }$ Liouville: The value ${\alpha}$ cannot be too well approximated. That is

$\displaystyle \left| \alpha - \frac{p}{q} \right| \le \frac{c}{q^{v}},$

cannot be done for infinitely many ${p}$ and ${q}$ and some constant ${c > 0}$ provided ${v}$ is larger than ${D}$.

${\bullet }$ Thue: The value ${\alpha}$ cannot be too well approximated. That is

$\displaystyle \left| \alpha - \frac{p}{q} \right| \le \frac{c}{q^{v}},$

cannot be done for infinitely many ${p}$ and ${q}$ and some constant ${c > 0}$ provided ${v}$ is larger than ${D/2+1}$.

${\bullet }$ Siegel: The value ${\alpha}$ cannot be too well approximated. That is

$\displaystyle \left| \alpha - \frac{p}{q} \right| \le \frac{c}{q^{v}},$

cannot be done for infinitely many ${p}$ and ${q}$ for some constant ${c>0}$ provided ${v}$ is larger than ${2\sqrt{D}}$.

${\bullet }$ Dyson: The value ${\alpha}$ cannot be too well approximated. That is

$\displaystyle \left| \alpha - \frac{p}{q} \right| \le \frac{c}{q^{v}},$

cannot be done for infinitely many ${p}$ and ${q}$ for some constant ${c > 0}$ provided ${v}$ is larger than ${\sqrt{2D}}$.

${\bullet }$ Roth: The value ${\alpha}$ cannot be too well approximated. That is

$\displaystyle \left| \alpha - \frac{p}{q} \right| \le \frac{1}{q^{v}},$

cannot be done for infinitely many ${p}$ and ${q}$ provided ${D \geq 2}$ and ${v}$ is larger than ${2}$.

Note that the last result is best possible in the sense that Dirichlet achieved ${v = D = 2}$, although slightly stronger statements than Roth can be made by using logarithms.

To illustrate these formulas, Liouville proved that the following number is transcendental:

$\displaystyle \begin{array}{rcl} \sum_{k=1}^{\infty} \frac{1}{10^{k!}} &=& 0.110001000000000000000001000000000000000000000000000000000000\\ && 00000000000000000000000000000000000000000000000000000000001\dots \end{array}$

Note that the exponent in Liouville’s number grows as ${k! \approx \exp(k\log k)}$. The later formulas improve this to simply exponential in ${k}$. For instance,

$\displaystyle \sum_{k=1}^{\infty}\frac{1}{10^{2^k}}$

is transcendental. We’ll leave it to our readers to figure out which formula gives this consequence and will give the historical answer at the end.

## A Key Lemma

What we wish to highlight from Dyson’s 1947 paper is the main lemma for his proof. He called it the key new idea in his paper and also said it might have independent interest. We agree, and we abstract some of its statement into the following definitions.

Suppose we have a polynomial ${f(x,y)}$. Call a point ${(x_0,y_0)}$ a zero of order ${(p,q)}$ if ${0 \leq m \leq p}$ and ${0 \leq n \leq q}$ implies

$\displaystyle \frac{\partial^m}{\partial x} \frac{\partial^n}{\partial y}f(x = x_0, y = y_0) = 0.$

Since this includes the case ${m = n = 0}$, the point ${(x_0,y_0)}$ must be a zero of ${f(x,y)}$ as well as of all the partial derivatives. For example, when

$\displaystyle f(x,y) = xy - x - y + 1,$

the point ${(1,1)}$ is a zero of order ${(1,0)}$ and of order ${(0,1)}$ but not of order ${(1,1)}$. The diagonal-step nature of this example plays into the next definition. We follow Dyson in numbering from zero.

Definition 1 A “${\lambda}$-staircase” for ${f(x,y)}$ is given by points ${(x_0,y_0),\dots,(x_r,y_r)}$ and nonnegative reals ${t_0,\dots,t_r}$ such that for all ${i}$ and ${n \leq t_i}$,

$\displaystyle (x_i,y_i) \quad\text{is a zero of order} \quad (\lfloor\lambda(t_i - n)\rfloor, n).$

It is understood that the ${x_i}$ are distinct and the ${y_i}$ are distinct, but they need not be distinct from each other.

The parameter ${\lambda}$ is the steepness of the staircase. The idea is to make the staircase quite steep by taking both ${\lambda}$ and the degree ${u}$ of ${x}$ to be large. This is controlled by a parameter ${\delta}$ that is inverse to ${\lambda}$ and chosen to meet the hypotheses of the key lemma:

Lemma 2 Suppose ${(x_0,y_0),\dots,(x_r,y_r)}$ and ${t_0,\dots,t_r}$ constitute a ${\lambda}$-staircase for a bivariate polynomial ${f(x,y)}$ of degree ${u}$ in ${x}$ and degree ${s}$ in ${y}$, where for each ${i}$, ${0 \leq i \leq r}$,

$\displaystyle \lfloor t_i \rfloor \leq \min\{s, \frac{u+1}{\lambda} - 1\}.$

Suppose ${\delta}$ is a positive real number such that:

$\displaystyle \lambda > \frac{2}{\delta}, \quad \text{and}\quad \delta \geq \frac{2s}{r(u+1)}.$

Then

$\displaystyle \lambda \sum_{i=0}^r (1 + \lfloor t_i \rfloor)(t_i - \frac{1}{2}\lfloor t_i \rfloor) \leq (s+1)(u+1)(1 + \frac{1}{2}r(r+1)\delta). \ \ \ \ \ (1)$

## Interpretation and Application

Dyson supplied the following interpretation in his paper. The left-hand side of (1) approximately counts the number of zeroes in the staircase—approximately because ${\lambda}$ is left as a real number. We want a good upper bound on this number

A polynomial ${f(x,y)}$ of degree ${u}$ in ${x}$ and ${s}$ in ${y}$ may have up to ${(u+1)(s+1)}$ non-zero coefficieints. Thus ${(u+1)(s+1)}$ is intuitively the maximum number of zeroes that could be arranged “on purpose” by the choice of ${f}$.

We want to minimize the number af additional zeroes that could exist. Lemma 2 says that this is limited by the factor ${(1 + \frac{1}{2}r(r+1)\delta)}$. We want to make this factor approach ${1}$. We can do so by making ${\delta}$ arbitrarily small. By the condition ${\lambda > \frac{2}{\delta}}$ this entails choosing ${\lambda}$ large. The other requirement to chooise ${\delta}$ is that it must majorize

$\displaystyle \frac{2s}{r(u+1)}.$

We need ${\delta r^2}$ to be small yet bigger than

$\displaystyle \frac{2sr}{u+1}.$

This means making ${\frac{u}{s} > 2r - \frac{1}{s}}$. For whatever number ${r}$ of points we are concerned with, we can meet this by making the degree ${u}$ in ${x}$ be larger than the degree ${s}$ in ${y}$ by the factor ${2r}$. That is all we need to do for ${\delta}$ and ${\lambda}$ to exist that will allow us to apply the lemma.

Thus the upshot is that suitably “lopsided” polynomials, together with a connected region of their partial derivatives, cannot have too many more than the prescribed number of zeroes, counting multiplicity of the orders of the zeroes. The proof of the lemma works by successive applications of reasoning about determinants and linear independence of polynomials that serve as components of ${f(x,y)}$. It is fairly long-winded and ends with some painstaking estimates.

The application supposes for sake of contradiction that there are infinitely many fractions ${\frac{p}{q}}$ giving closer approximations to ${\alpha}$ than the theorem statement allows. It suffices to consider the case where ${\alpha}$ is an algebraic integer—that is, a root of a univariate polynomial ${g(z)}$ with integer coefficients. The supposition yields candidate polynomials ${f(x,y)}$ as differences of two polynomials, one derived from ${g}$ and the other from the closely approximating fractions. The ${f(x,y)}$ have both a large staircase of zeroes and a bounded space of possible coefficients, which imposes constraints on the degrees in ${x}$ and ${y}$. These elements can be manipulated, using the approximating fractions on one hand and the fact of ${g}$ (and hence its degree) being fixed on the other hand, to create the lopsided form of ${f(x,y)}$ in which the hypotheses of lemma 2 take effect. The lemma then cranks out a contradiction.

At the end of his proof, Dyson deftly explains how taking ${t = 1}$, where ${t}$ results from the process that selects the sequence ${t_i}$ for the lemma, yields Siegel’s theorem. Thus his framework affords an extra lever for manipulating ratios, in his case the ratio ${\frac{s}{t}}$. Tending the ratio to ${\sqrt{r/2}}$ rather than to ${s}$ produces his theorem.

Our point of interest is whether Dyson’s setup can be used to attack other questions about polynomials in complexity theory. We have not yet formed an understanding of how specific it is to the approximation application. Perhaps for complexity we would want to generalize it from polynomials in 2 to ${n}$ variables. There could be some relation to the ${d}$-variable techniques used for integer multiplication as we covered here, but again we’re just at the point of asking. Dyson ends his paper with the opinion that

“such an investigation … would not be in any way a hopeless undertaking.”

## Open Problems

Have there been useful extensions of Dyson’s lemma, as opposed to improvements by Roth and others to his approximation bounds? Has Dyson’s “non-hopeless investigation” been brought to fruition? What other applications would it have? What are the closest techniques that have been used in complexity theory?

The search for proving transcendental numbers has gone in other directions. These are represented in a survey paper by Jeffrey Shallit, which grew into the book Automatic Sequences with Jean-Paul Allouche. To answer our reader question above, in the book they trace the transcendence of ${\sum_k 10^{-2^k}}$ to a 1916 paper by Aubrey Kempner which uses a different technique.

[Edit: Fixed Roth bound on degree.]

March 3, 2020 2:22 pm

I have found a beautiful technique to solve complete or partially problems such as:

1- P vs NP
2- L vs NL
3- L vs P
4- Goldbach’s conjecture
5- Twin prime conjecture
6- Beal’s conjecture
7- Riemann hypothesis

See more in my notions:

March 4, 2020 3:16 am

Thanks for this beautiful discussion of Dyson’s result, an extraordinary man I had the chance to meet several years ago… I think that in Roth’s result: “and v is larger than D” should be “and v is larger than 2.”

March 4, 2020 9:52 am

Dear Giuseppe:

Thanks for comment. Of course you are right and will fix the error right away.

Best

Dick

3. March 4, 2020 2:07 pm

There is some superficial similarity between the guarantees of Thue-Siegel-Dyson theorems and those of Reed-Solomon decoding algorithms. Thue’s approach uses a bivariate polynomial f(X,Y) that is linear in Y (analogy with unique decoding of RS codes), and Siegel uses higher degree in Y (analogy with list decoding of RS codes and the guarantee one gets by bounding individual degrees). Roth’s approach uses multivariate polynomials (which are also used to go beyond RS codes in list decoding). Of course beyond the superficial similarity of what kind of monomials are used in the polynomial method, the details are different and very deep in the line of work on Diophantine approximation.

• March 5, 2020 11:21 pm

Venkat, thanks very much for this comment in line with the post’s main “bleg” (blog-beg).

March 5, 2020 9:28 am

So the answer is that none of the theorems you mentioned), even Roth’s theorem, is strong enough to prove the transcendence of sum(10^(-2^k), k=1..infinity). In fact, this number even has bounded partial quotients, so it behaves more like a quadratic irrational with respect to approximation by rationals. To prove it is transcendental, you need different techniques. Actually, it is a direct consequence of the relatively recent results of Adamczewski & Bugeaud on the so-called “automatic real numbers”.

• March 5, 2020 9:43 pm

Jeff, thanks! It was actually not intended as a trick question; I did not appreciate the difference between 10^{-2^k} and 10^{-3^k} as exemplified on page 18 (and earlier) of 1966 notes by Joseph Lipman.

March 5, 2020 3:13 pm

I knew Dyson fairly well, as I was down at IAS often and Dyson took an interest in a result of mine in “String Theory” (OLD SCHOOL!!!) along with a novel conjecture involving continued fractions and Klein’s String (Klein’s String is beautiful — Davenport, in his book on “The Higher Arithmetic” shows it and also mentions Dyson as Dyson made contributions to Diophantine approximation problems.

I made a 4k high definition film of me and him discussing various issues. I opened the floor by saying that, perhaps, in all his talks and interviews he had neglected to say something and mentioned that here was a chance to rectify matters (he joked “Famous last words!”). He went on to talk about Oliver Heaviside who put Maxwell’s work into a form that was comprehensible; Dyson made clear that Heavyside was not given enough credit. Apparently, he had just been reading a book about Heavyside.

I was a bit amazed, as this was in late May, 2018 — my wife had just died from a terrible mistake at a hospital after a long battle with a rare ovarian cancer where she did well being treated with some exotic pulsed electromagnetic waves of super-high strength that were applied to her when she slept. But she needed IV treatment for a common fungus and acquired an e-coli infection at the hospital and then got sepsis; she fine one minute and then was finished off in an hour. Dyson wrote to me before I came down to visit that the last place one should be in is a hospital. Standard joke about how you can enter a hospital but it is often difficult (or impossible) to get out!

He looked absolutely fantastic, however, and his mind was sharper than I had witnessed it in decades. I was amazed (he had just returned from a visit to see a daughter in CA). I came away think that he absolutely HAD to be on one or ten of those brain stimulating drugs that people like Bill Gates supposedly use. In any case, I was saddened to hear of his passing.

At some point, I will make the 4k video available. Parts of it are humorous: I kidded him when he brought up Heavyside acting like I had misheard him and that he was talking about “heavy water”! In any case, he was quite a fixture at IAS and very gracious with me, so I am thankful that he took time to talk with me every couple of years when I visited Princeton. On film, I asked him a question which I don’t think anyone had ever broached with him — the issue of publishing Feynman’s work before Feynman published it himself (this isn’t entirely accurate but it did get him a bit riled up).

• March 5, 2020 11:20 pm

Peter, thanks very much for these reminiscences. I would like very much to have met Dyson.