We lost a great computer scientist.

Frances Allen was one of the leaders who helped create the field of compiler research. Fran was an elite researcher at IBM, and won a Turing Award for her pioneering work. Allen collected many other awards as well.

Perhaps the coolest award is one that Fran could never win: the Frances E. Allen Award, created this year by the IEEE:

For innovative work in computing leading to lasting impact on other fields of engineering, technology, or science.

I consider Fran a friend, although we never worked together—our areas of interest were different. One fond memory of mine is being on a panel a while ago with Fran. What a delightful presence.

Fran always seemed to be smiling, always with that smile. The following images come in large part from interviews, lectures, and award events; the fact that it is so easy to find them is a testament to her public engagement as well as her scientific contributions.

## Compilers Were Key

There was a time when compilers were the most important program available on any new computer. Perhaps on any computer. Here is proof:

1. Computers are there to solve problems.

2. We must write programs to solve problems.

3. Details of the computer and its instructions are complicated, which makes writing programs hard.

4. Thus, automating the writing of programs is important.

5. QED: we must have compilers.

Okay this is not really a “proof”, but there is some truth to the argument. Fran was at IBM and worked on some of the early compilers, including FORTRAN and related languages. IBM wanted to sell computers (well, in the early days, rent them). One potential roadblock, IBM realized, was that new computers could be hard to program. Thus it was important to ensure that companies rented new machines as fast as IBM could manufacture them. This created the need for compilers, and even more for optimizing compilers.

In order to ship more machines, the code that a compiler created had to be efficient. Hence, a stress on Allen was to figure out how compilers could generate high-quality code. This led Fran and others like John Cocke to discover many compiler techniques that are still used today. A short list of the ideas is:

• Branch prediction

• Register allocation

• Control flow graphs

• Program dependence graphs

• And many more.

What is so important is that Allen’s work was not just applicable to this machine or that language. Rather, it could be used for almost any machine and almost any language. This universal nature of the work on compilers reminds me of what we try to do in theory. Allen’s research was so important because it could be used for future hardware as well as future languages.

## Register Allocation

Guy Steele interviewed Allen for the ACM here. During the interview Fran talked about register allocation:

I have a story about register allocation. FORTRAN back in the 1950’s had the beginnings of a theory of register allocation, even though there were only three registers on the target machine. Quite a bit later, John Backus became interested in applying graph coloring to allocating registers; he worked for about 10 years on that problem and just couldn’t solve it. I considered it the biggest outstanding problem in optimizing compilers for a long time. Optimizing transformations would produce code with symbolic registers; the issue was then to map symbolic registers to real machine registers, of which there was a limited set. For high-performance computing, register allocation often conflicts with instruction scheduling. There wasn’t a good algorithm until the Chaitin algorithm. Chaitin was working on the PL compiler for the 801 system. Ashok Chandra, another student of Knuth’s, joined the department and told about how he had worked on the graph coloring problem, which Knuth had given out in class, and had solved it not by solving the coloring problem directly, but in terms of what is the minimal number of colors needed to color the graph. Greg immediately recognized that he could apply this solution to the register allocator issue. It was a wonderful kind of serendipity.
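
The Chaitin story above has a simple algorithmic core. Here is a minimal sketch (ours, not IBM’s actual allocator) of register allocation by graph coloring: symbolic registers that are live at the same time “interfere,” and we try to color the interference graph with ${k}$ machine registers using Chaitin’s simplify-and-select heuristic. Spilling is omitted for brevity.

```python
# Sketch of Chaitin-style register allocation by graph coloring.
# Nodes are symbolic registers; an edge joins two registers that are
# live at the same time ("interfere").  We color with k machine
# registers.  Spilling is omitted: we return None when stuck.

def chaitin_color(interference, k):
    graph = {v: set(ns) for v, ns in interference.items()}
    work = {v: set(ns) for v, ns in interference.items()}
    stack = []
    while work:
        # Simplify: remove any node with fewer than k neighbors.
        cand = next((v for v in work if len(work[v]) < k), None)
        if cand is None:
            return None  # a real allocator would spill a register here
        stack.append(cand)
        for n in work[cand]:
            work[n].discard(cand)
        del work[cand]
    colors = {}
    while stack:
        # Select: pop nodes, giving each the lowest color unused by
        # its already-colored neighbors.
        v = stack.pop()
        used = {colors[n] for n in graph[v] if n in colors}
        colors[v] = min(c for c in range(k) if c not in used)
    return colors

# Four symbolic registers interfering in a chain: two machine
# registers suffice.
interference = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
alloc = chaitin_color(interference, k=2)
print(alloc)
```

The select step always succeeds: when a node was simplified it had fewer than ${k}$ remaining neighbors, and exactly those neighbors are colored before it.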

## Compilers Create Theory

The early goal of creating compilers led directly to some wonderful theory problems. One whole area that dominated early theory research was language theory. In particular, understanding questions that arise in defining programming languages. Syntax came first—later semantics was formalized.

Noam Chomsky created context-free grammars to help understand natural languages in the 1950s. His ideas were used by John Backus, also a Turing Award winner from IBM, to describe the then-new programming language IAL. This is known today as ALGOL 58, which became ALGOL 60. Peter Naur, on the ALGOL 60 committee, called Backus’s notation for ALGOL’s syntax Backus normal form; it is now called BNF, for Backus-Naur form.
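
The payoff of BNF was that a parser could follow the grammar almost mechanically. As an illustration (a toy grammar of our own, not the actual ALGOL 60 syntax), here is a recursive-descent recognizer whose functions mirror BNF rules one-for-one, with left recursion rewritten as iteration:

```python
# Toy BNF-style grammar (hypothetical mini-grammar, not real ALGOL):
#
#   <expr>   ::= <term>   | <expr> + <term>
#   <term>   ::= <factor> | <term> * <factor>
#   <factor> ::= <digit>  | ( <expr> )

def recognize(s):
    """Return True iff s is generated by the toy grammar above."""
    pos = 0

    def peek():
        return s[pos] if pos < len(s) else None

    def eat(ch):
        nonlocal pos
        if peek() != ch:
            raise SyntaxError(f"expected {ch!r} at {pos}")
        pos += 1

    def expr():          # <expr> ::= <term> { + <term> }
        term()
        while peek() == "+":
            eat("+"); term()

    def term():          # <term> ::= <factor> { * <factor> }
        factor()
        while peek() == "*":
            eat("*"); factor()

    def factor():        # <factor> ::= <digit> | ( <expr> )
        if peek() == "(":
            eat("("); expr(); eat(")")
        elif peek() is not None and peek().isdigit():
            eat(peek())
        else:
            raise SyntaxError(f"unexpected {peek()!r} at {pos}")

    try:
        expr()
        return pos == len(s)
    except SyntaxError:
        return False

print(recognize("1+2*(3+4)"))   # well-formed
print(recognize("1+*2"))        # malformed
```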

Theorists (I confess I was one) worked on many questions about such languages. Existence problems, decidability problems, efficient algorithms, and closure properties were just some examples. It is not clear how much of this theory affected compiler design, but I would like to think that some was useful. Theorists should thank the compiler researchers. I do.

For instance the 1970 STOC program had many papers on language related topics—here are some:

• A Result on the Relationship between Simple Precedence Languages and Reducing Transition Languages. Gary Lindstrom

• The Design of Parsers for Incremental Language Processors. Ronald Book, Sheila Greibach, Ben Wegbreit

• Tape- and Time-Bounded Turing Acceptors and AFLs. David Lewis

• Closure of Families of Languages under Substitution Operators. William Rounds

• Tree-Oriented Proofs of Some Theorems on Context-Free and Indexed Languages. Barry Rosen

• On Syntax-Directed Transduction and Tree Transducers. Alfred Aho, Jeffrey Ullman

• The Analysis of Two-Dimensional Patterns using Picture Processing Grammars. Jean-Francois Perrot

• On Some Families of Languages Related to the Dyck Language. Joseph Ullian

• Three Theorems on Abstract Families of Languages. Joseph Ullian

By the way, Abstract Families of Languages, or AFLs, were created by Seymour Ginsburg and Sheila Greibach in 1967 as a way to generalize context-free languages.

## Open Problems

Fran was asked by Steele in that interview: Any advice for the future?

Yes, I do have one thing. Students aren’t joining our field, computer science, and I don’t know why. It’s just such an amazing field, and it’s changed the world, and we’re just at the beginning of the change. We have to find a way to get our excitement out to be more publicly visible. It is exciting, in the 50 years that I’ve been involved, the change has been astounding.

Thanks Fran. Much of that change is due to pioneers like you. Thanks for everything.

A breakthrough on the separating words problem

Zachary Chase is a graduate student of Ben Green at Oxford. Chase has already solved a number of interesting problems; check his site for more details. His advisor is famous for his brilliant work—especially in additive combinatorics. One example is his joint work with Terence Tao proving this amazing statement:

Theorem 1 The prime numbers contain arbitrarily long arithmetic progressions.

Today we wish to report Chase’s new paper on a problem we have twice discussed before.

But first Ken wants to say something about Oxford where he got his degree long before Green arrived.

## Oxford Making Waves

Green moved to Oxford in 2013. He holds a professorship associated with Magdalen College. I (Ken) did not know him when I started at Oxford in 1981. It would have been hard, as Green was only 4 years old at the time. But I did know the preteen Ruth Lawrence when she started there, and even once played a departmental croquet match including her in which Bryan Birch made some epic long shots. Lawrence had joined St. Hugh’s College in 1983 at the age of twelve.

Oxford has been on Dick’s mind and mine more in the past six years than before. Both of us were guests of Joël Ouaknine in 2012–2015 when he was there. Oxford has developed a front-line group in quantum computation, which fits, as David Deutsch’s role as an originator of the field began there—note my story in the middle of this recent post.

Recently Oxford has been in the news for developing a promising Covid-19 vaccine. ChAdOx1 heads Wikipedia’s list of candidate vaccines and has gone to final trials, though there is still a long evaluation process before approval for general use.

Before that, a modeling study from Oxford in March raised the question of whether many more people have had Covid-19 without symptoms or any knowledge. This kind of possibility has since been likened to a “dark matter” hypothesis, not just now regarding Covid-19 but a decade ago and before.

A main supporting argument is that a wide class of mathematical models can be fitted with higher relative likelihood if the hypothesis is true. I have wanted to take time to evaluate this argument amid the wider backdrop of controversy over inference methods in physics, but online chess with unfortunately ramped-up frequency of cheating has filled up all disposable time and more.

## The Problem

Back to Chase’s new results on the following problem:

Given two distinct binary strings of length ${n}$, there is always a finite state deterministic automaton (FSA) that accepts one and rejects the other. How few states can such a machine have?

This is called the separating words problem (SWP). Here we consider it for binary strings only.

John Robson proved ${O(n^{2/5})}$ states are enough—we suppress any log factors. Some like to write this as ${\tilde{O}(n^{2/5})}$. Chase improves this to ${\tilde{O}(n^{1/3})}$:

Theorem 2 For any distinct ${x,y \in \{0,1\}^{n}}$, there is a finite state deterministic automaton with ${O(n^{1/3} \log^{7} n)}$ states that accepts ${x}$ but not ${y}$.

We previously discussed this twice at GLL. We discussed the background and early results here. The original problem is due to Pavel Goralcik and Vaclav Koubek. They proved an upper bound that was ${o(n)}$. Then we went over Robson’s bound here. The best upper bound was Robson’s result until Chase came along.
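
For intuition, the problem can be checked by brute force on tiny strings. The following sketch is ours and purely illustrative—the search is exponential and nothing like the techniques discussed next. It enumerates all transition functions on ${s = 1, 2, \dots}$ states until the two strings end in different states; some choice of accepting states then accepts one and rejects the other.

```python
from itertools import product

# Brute-force SWP for tiny strings (illustrative only; exponential).
# Enumerate all transition functions on s states over alphabet {0,1}
# with start state 0.  If the two strings end in different states,
# some accepting set accepts x and rejects y.  Assumes x != y.

def final_state(delta, w):
    state = 0
    for ch in w:
        state = delta[(state, ch)]
    return state

def min_separating_states(x, y):
    s = 1
    while True:
        for trans in product(range(s), repeat=2 * s):
            delta = {(q, ch): trans[2 * q + int(ch)]
                     for q in range(s) for ch in "01"}
            if final_state(delta, x) != final_state(delta, y):
                return s
        s += 1

print(min_separating_states("0010", "0100"))
```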

## The Approach

All the approaches to SWP seem to have a common thread. They find some family of “hash” functions ${H}$ so that:

1. Any ${h}$ in ${H}$ can be computed by a FSA with few states.

2. For any ${x \neq y}$ binary strings of length ${n}$, there is an ${h \in H}$ so that ${h(x) \neq h(y)}$.

The challenge is to find clever families that can do both: be easy to compute and also be able to tell strings apart. Actually this is only a coarse outline—Chase’s situation is a bit more complicated.

## The Proof

We have taken the statement of Theorem 2 verbatim from the paper. It has a common peccadillo of beginning a sentence for a specific ${n}$ but writing ${O(\cdots n \cdots)}$ later. However, this is how we think intuitively: in terms of how the pieces of the formula behave. Chase declares right away his intent to ignore the power of ${\log n}$. How he gets the power ${1/3}$ of ${n}$ is the real point. We can convey the intuition in brief.

A length-${n}$ binary string can be identified with its set ${A \subseteq [n]}$ of positions where the string has a ${1}$. Chase begins by showing how a power of ${1/2}$ on ${n}$ is obtainable by considering sets of the form

$\displaystyle A_{i,p} = \{j : j \in A \wedge j \equiv i \pmod{p}\},$

where ${p}$ is prime and ${i < p}$. Suppose we know a bound ${k}$ such that for all distinct ${A,B \subseteq [n]}$ (that is, all distinct binary strings of length ${n}$) there is a prime ${p < k}$ and ${i < p}$ such that

$\displaystyle |A_{i,p}| \neq |B_{i,p}|.$

Then by the Chinese Remainder Theorem, there is a prime ${q}$ of magnitude about ${\log n}$ such that

$\displaystyle |A_{i,p}| \not\equiv |B_{i,p}| \pmod{q}.$

Now we can make a finite automaton ${M_{A,B}}$ with states ${(j,\ell)}$ that always increments ${j}$ modulo ${p}$ and increments ${\ell}$ modulo ${q}$ each time it reads a ${1}$ when ${j \equiv i \pmod{p}}$. Then ${M_{A,B}}$ has order-of ${pq \approx k\log n}$ states. The finisher is that ${k = \tilde{O}(n^{1/2})}$ suffices. Again we ignore the peccadillo but we add some redundant words to the statement in the paper between dashes:

Lemma 3 For any distinct ${A,B \subseteq [n]}$—of size at most ${n}$—there is a prime ${p = \tilde{O}(n^{1/2})}$ such that for some ${i \in [p]}$, ${|A_{i,p}| \neq |B_{i,p}|.}$
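
To make the counting automaton concrete, here is a small Python sketch (our illustration, not code from Chase’s paper). Given two distinct strings it searches for a prime ${p}$ and residue ${i}$ with ${|A_{i,p}| \neq |B_{i,p}|}$, then a prime ${q}$ separating those counts mod ${q}$, and simulates the roughly ${pq}$-state machine:

```python
# Sketch of the counting automaton M_{A,B} described above.
# The search bound 2n+3 is generous: some prime p > n must separate
# the strings, and some prime q fails to divide the count difference.

def primes_up_to(k):
    sieve = [True] * (k + 1)
    sieve[0] = sieve[1] = False
    for m in range(2, int(k ** 0.5) + 1):
        if sieve[m]:
            sieve[m * m :: m] = [False] * len(sieve[m * m :: m])
    return [m for m, is_prime in enumerate(sieve) if is_prime]

def count_mod(s, i, p):
    # |A_{i,p}|: ones of s at positions j with j congruent to i mod p
    return sum(1 for j, c in enumerate(s) if c == "1" and j % p == i)

def separating_machine(x, y):
    n = len(x)
    for p in primes_up_to(2 * n + 3):
        for i in range(p):
            a, b = count_mod(x, i, p), count_mod(y, i, p)
            if a != b:
                for q in primes_up_to(2 * n + 3):
                    if a % q != b % q:
                        return p, i, q
    raise ValueError("x and y must be distinct")

def run(s, p, i, q):
    # States are pairs (j mod p, count mod q): about p*q states.
    j = ell = 0
    for c in s:
        if c == "1" and j == i:
            ell = (ell + 1) % q
        j = (j + 1) % p
    return ell  # accept exactly the final counts matching x's

p, i, q = separating_machine("0110100110", "0110101001")
print(p, i, q)
assert run("0110100110", p, i, q) != run("0110101001", p, i, q)
```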

The power ${1/2}$ is of course weaker than Robson’s ${2/5}$, but this statement conceals two “levers” that enable leap-frogging ${2/5}$ to get ${1/3}$. The first is that we don’t have to limit attention to sets ${A,B}$ that come from places where the corresponding strings ${x,y}$ have a ${1}$. Consider any string ${w}$ and take ${A_w}$ to be the set of index positions ${j}$ in which ${x}$ has the substring ${w}$ beginning at place ${j}$. Define ${B_w}$ likewise for ${y}$. Then we can try to prove results of the following form given ${m < n}$:

Proposition 4 For all distinct ${x,y \in \{0,1\}^n}$ there is ${w \in \{0,1\}^m}$ such that ${A_w \neq B_w}$ and

$\displaystyle |A_w|,|B_w| = O(\frac{n}{m}).$

A finite automaton using this extension needs ${m}$ states to store ${w}$ in its finite control. The second lever is to try to prove results of this form, where now the words “of size at most ${N}$” are not redundant:

Lemma 5 (?) For any distinct ${A,B \subseteq [n]}$ of size at most ${N}$ there is a prime ${p = \tilde{O}(N^{1/2})}$ such that for some ${i \in [p]}$, ${|A_{i,p}| \neq |B_{i,p}|.}$ [Update: see note below.]

Now we need to balance the levers using the proposition and the lemma together. Since ${w}$ will add order-${m}$ states to the automaton, we balance it against ${p}$ from the previous argument. So take ${m = n^{1/3}}$. Then ${N = \frac{n}{m} \approx n^{2/3}}$. Lemma 5 then gives the bound

$\displaystyle k = \tilde{O}(N^{1/2}) = \tilde{O}(n^{1/3})$

on the magnitude of the needed primes ${p}$. This yields the ${\tilde{O}(n^{1/3})}$ breakthrough on SWP.

Here a famous New Yorker cartoon with the caption “If only it were so simple” comes to mind. But there is a catch. Chase is not quite able to prove Lemma 5. However, the ${w}$ lever comes with extra flexibility that enables finding ${w}$ that makes ${A_w \neq B_w}$ and also gives those sets an extra regularity property ${X}$. Using ${X}$, he is able to show the existence of good hash functions of a certain type. The modified lemma is enough to prove his new bound. The proof still uses intricate analysis including integrals.

This is classic high-power mathematics. When some idea is blocked, try to weaken the requirements. Sometimes it is possible to still proceed. It is a lesson that we sometimes forget, but a valuable one nevertheless.

## Open Problems

We like the SWP and think Chase’s contribution is impressive. Note that it adds a third element ${w}$ to ${p}$ and ${q}$ in the automaton. Can the argument be pushed further by finding more levers to add more elements? Is Lemma 5 true as stated, and with what (other) tradeoffs of ${m}$ and ${N}$ between it and Proposition 4? [Update: not for extreme tradeoffs—see this comment—but plausibly for ${m,n,N}$ polynomially related.]

We feel there could also be interesting applications for his theorem as it stands. Is the ability to tell two strings apart with a simple device—a FSA with not many states—useful? Could it solve some open problem? It does seem like a basic insight, yet we have no candidate application. Perhaps you have an idea.

[added Q on Lemma 5 to “Open Problems”, “lower” bound –> “upper” bound in third section, update in Open Problems.]

And Razborov’s brilliant proof method

Stasys Jukna is the author of the book Extremal Combinatorics With Applications in Computer Science.

Today we talk about Jukna’s book on extremal combinatorics.

A flying start from nearby Rochester

Anurag Agarwal and Richard Zanibbi are tenured faculty in Mathematics and Computer Science, respectively, at RIT. They partner with Clyde Lee Giles of Penn State and Douglas Oard of U.Md. on the MathSeer project. If the name reminds you of CiteSeer${}^{x}$, no surprise: Giles co-originated that and still directs it.

Today we note last month’s release of a major piece of MathSeer called MathDeck and show how to have fun with it.

Ron Graham passed away, but he lives on…

[Image cropped from tribute by Tom Leighton]

Ron Graham just passed away Monday at the age of ${84}$ in La Jolla near UCSD.

Today Ken and I wish to say a few words about Ron.

Some different ideas for marking the Fourth

[Image: “Founding Frenemies” source]

John Adams and Thomas Jefferson did not use Zoom. Their correspondence, from 1777 up to their deaths hours apart on July 4, 1826, fills a 600-page book.

Today, Independence Day in the US, we consider the kind of intellectual fireworks represented by the correspondence.

By blowing up its objects

[Image: composite crop of src1, src2]

Joshua Greene and Andrew Lobb proved last month that for every smooth Jordan curve in the plane and every real ${r \leq 1}$, there are four points on the curve that form a rectangle whose sides have ratio ${r}$.

Today we explain how this result relates to Otto Toeplitz’s famous “square peg conjecture,” which is the case ${r = 1}$ when the curve need not be smooth.

Gossip and more.

[Image: composite of src1, src3]

Jessica Deters, Izabel Aguiar, and Jacqueline Feuerborn are the authors of the paper, “The Mathematics of Gossip.” They use infection models—specifically the Susceptible-Infected-Recovered (SIR) model—to discuss gossip. Their work was done before the present pandemic, in 2017–2019. It is also described in a nice profile of Aguiar. Their analogy is expressed by a drawing in their paper:

Not just for today, but for the summer at least, Ken and I want to share some gossip, share some problems, and ask our readers a question.

Some thoughts on P versus NP

Norbert Blum is a computer science theorist at the University of Bonn, Germany. He has made important contributions to theory over his career. Another claim to fame is that he was a student of Kurt Mehlhorn, indeed the third of Mehlhorn’s eighty-eight listed students.

Today I wish to discuss a new paper by Blum.

Proofs and perpetual motion machines

Leonardo da Vinci is, of course, famous for his paintings and drawings, but he was also interested in inventions and in various parts of science, including mathematics and engineering. It is hard to imagine that he died over 500 years ago, given his continued impact on our world. He devised inventions practical and impractical: musical instruments, a mechanical knight, hydraulic pumps, reversible crank mechanisms, finned mortar shells, and a steam cannon.

Today I wish to discuss proofs and perpetual motion machines.

You might ask: What do proofs and perpetual motion machines have in common? Proofs refer to math proofs that claim to solve open problems like P ${\neq}$ NP. Ken and I get such claims all the time. I take a look at them, not because I think they are likely to be correct, but because I am interested in understanding how people think.

I started to work on discussing such proofs when I realized that such “proofs” are related to claims about perpetual motion machines. Let’s see how.

## Perpetual Motion Machines

A perpetual motion machine is a machine that operates indefinitely without an energy source. This kind of machine is impossible, as da Vinci knew already:

Oh ye seekers after perpetual motion, how many vain chimeras have you pursued? Go and take your place with the alchemists.
—da Vinci, 1494

I like this statement about applying for US patents on such machines:

Proposals for such inoperable machines have become so common that the United States Patent and Trademark Office (USPTO) has made an official policy of refusing to grant patents for perpetual motion machines without a working model.

Here is a classic attempt at perpetual motion: The motion goes on “forever” since the right side floats up and the left side falls down.

The analogy between proofs and perpetual motion machines is this: debunking such a machine is not done by looking carefully at each gear and lever to see why the machine fails to work. Rather, it is done like this:

Your machine violates the fundamental laws of thermodynamics and is thus impossible.

Candidate machines are not studied to find the exact flaw in their design. The force of fundamental laws allows a sweeping, simple, and powerful argument against them. There are similar ideas in checking a proof. Let’s take a look at them.

## Proofs

Claims are made about proofs of open problems all the time. Often these are made for solutions to famous open problems, like P${\neq}$NP or the Riemann Hypothesis (RH).

Math proofs are used to try to get to the truth. As we said before, proofs are only as good as the assumptions made and the rules invoked. The beauty of the proof concept is that arguments can be checked, even long and complex ones. If the assumptions and the rules are correct, then no matter how strange the conclusion is, it must be true.

For example:

${\bullet }$ The Riemann rearrangement theorem. A sum

$\displaystyle a_{1} + a_{2} + a_{3} + \dots$

that is conditionally convergent can be reordered to yield any number. Thus there is a series

$\displaystyle b_{1} + b_{2} + b_{3} + \dots$

that sums conditionally to your favorite number ${M}$, and yet ${b_{1},b_{2},\dots}$ is just a rearrangement of ${a_{1},a_{2},\dots}$. This says that addition is not commutative for infinite series.

${\bullet }$ What is the largest triangle that can be covered by two unit squares? The following shows that the answer is unexpected:
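
The rearrangement theorem lends itself to a quick numerical demonstration. Here is a sketch (ours) of the standard greedy scheme applied to the alternating harmonic series ${1 - 1/2 + 1/3 - \cdots}$, which normally sums to ${\ln 2}$: add unused positive terms while the partial sum is at most the target ${M}$, and unused negative terms while it is above.

```python
import math

# Greedy rearrangement of the alternating harmonic series toward a
# target M: take the next unused positive term 1/1, 1/3, 1/5, ...
# while the partial sum is at most M, otherwise subtract the next
# unused negative term 1/2, 1/4, 1/6, ...  The error after many steps
# is bounded by the last term crossed over.

def rearranged_partial_sum(M, steps=100000):
    pos, neg = 1, 2
    total = 0.0
    for _ in range(steps):
        if total <= M:
            total += 1.0 / pos
            pos += 2
        else:
            total -= 1.0 / neg
            neg += 2
    return total

# Same terms, three different limits.
for target in (3.0, 0.0, math.log(2)):
    print(target, round(rearranged_partial_sum(target), 3))
```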

The point of a proof is that it is a series of small steps. If each step is correct, then the whole is correct. But in practice proofs are often checked in other ways.

## Checking Proofs

The starting point for my thoughts—joined here with Ken’s—are these two issues:

1. A proof that has many small steps but no global picture is hard to motivate.

2. A proof with complex logic at the high level is hard to understand.

Note that a deep, hard theorem can still have straightforward logic. A famous theorem of Littlewood has for its proof the structure:

• Case the RH is false: Then ${\dots}$

• Case the RH is true: Then ${\dots}$

The RH-false case takes under a page. The benefit with this logic is that one gets to assume RH for the rest. The strategy for the famous proof by Andrew Wiles of Fermat’s Last Theorem (FLT)—incorporating the all-important fix by Richard Taylor—has this structure:

• If ${X}$ then ${Y}$.

• If not-${X}$ then ${Z}$.

• ${Y}$ implies FLT.

• ${Z}$ implies FLT.

Wiles had done the last step long before but had put it aside since he didn’t know how to get ${Z}$. The key was framing ${X}$ so that it bridged the gap in his originally-announced proof while its negation enabled the older proof.

Thus what we should seek are proofs with simple logic at the high level that break into cases or into sequential sub-goals, so that the proof is a chain of relatively few of those goals.

## Shapes and Barriers

This makes Ken and me think again about an old paper by Juris Hartmanis with his students Richard Chang, Desh Ranjan, and Pankaj Rohatgi in the May 1990 Bulletin of the EATCS titled, “On IP=PSPACE and Theorems With Narrow Proofs.” Ken’s post on it included this nice diagram of what the paper calls “shapes of proofs”:

Ken’s thought now is that this taxonomy needs to be augmented with a proof shape corresponding to certain classes believed to be properly below polynomial time—classes within the NC hierarchy. Those proofs branch at the top into manageable-size subcases, and/or have a limited number of sequential stages, where each stage may be wide but is shallow in its chains of dependencies. Call this shape a “macro-tree.”

The difference between the macro-tree shape and the sequential shapes pictured above is neatly captured by Ashley Ahlin on a page about “Reading Theorems”:

Note that, in some ways, the easiest way to read a proof is to check that each step follows from the previous ones. This is a bit like following a game of chess by checking to see that each move was legal, or like running a spell-checker on an essay. It’s important, and necessary, but it’s not really the point. … The problem with this is that you are unlikely to remember anything about how to prove the theorem, if you’ve only read in this manner. Once you’re read a theorem and its proof, you can go back and ask some questions to help synthesize your understanding.

The other high-level structure that a proof needs to make evident—before seeing it is reasonable to expend the effort to check it—is shaped by barriers. We have touched on this topic several times but maybe have not stated it full on for P versus NP. A recent essay for a course led by Li-Yang Tan at Stanford does so in just a few pages. A proof should state up front how it works around barriers, and this alone makes its strategy easier to follow.

The idea of barriers extends outside P versus NP, of course. Peter Scholze seems to be invoking it in a comment two months ago on an April post by Peter Woit about the status of Shinichi Mochizuki’s claimed proof of the ABC conjecture:

I may have not expressed this clearly enough in my manuscript with Stix, but there is just no way that anything like what Mochizuki does can work. … The reason it cannot work is a[nother] theorem of Mochizuki himself. … If the above claims [which are negated by the theorem] would have been true, I would see how Mochizuki’s strategy might have a nonzero chance of succeeding. …

Thus what Ken and I conclude is that in order for a proof to be checkable chunk by chunk—not line by line—it needs to have:

1. A top-level decomposition into a relatively small number of components and stages—like legs in a sailing race—and

2. A demonstration of how the stages navigate around known barriers.

Lack of a clear plan of the first kind already says that the proof attempt cannot avoid being snagged on a barrier, as surely as natural laws prevent building a perpetual-motion machine.

## Open Problems

Does this help in ascertaining what shape a proof that resolves the P versus NP problem must have?