Our 1977 paper on the role of formal methods

 [ Harvard ]

Harry Lewis is known for his research in mathematical logic, and for his wonderful contributions to teaching. He had two students that you may have heard of before: a Bill Gates and a Mark Zuckerberg.

Today I wish to talk about a recent request from Harry about a book that he is editing, and about a short solution that was hard for me to find.

 [ Royal Society ]

Sir Timothy Gowers is a Fields medalist and fellow blogger. Sometimes he (too) writes about simple topics.

Today I would like to talk about a simple problem that came up recently.

The problem is a simple-to-state “obvious fact”. The reason I thought you might be interested is that I had a tough time finding the solution. I hope you find the explanation below interesting.

The general issue of proving obviously true statements is discussed here for example. Here is an example from Gowers:

Let ${I_{1},I_{2},\dots }$ be intervals of real numbers whose lengths sum to less than ${1}$. Then their union cannot be all of ${[0,1]}$.

He says:

It is quite common for people to think this statement is more obvious than it actually is. (The “proof” is this: just translate the intervals so that the end point of ${I_{1}}$ is the beginning point of ${I_{2}}$, and so on, and that will clearly maximize the length of interval you can cover. The problem is that this argument works just as well in the rationals, where the conclusion is false.)

## The Problem’s Statement

The following simple problem came up the other day. Suppose that ${t}$ is an odd number. Show there is some ${x}$ so that

$\displaystyle (x,t) = 1 \text{ and } (4x+1,t) = 1.$

Here ${(a, b)}$ is the gcd, the greatest common divisor of ${a}$ and ${b}$. For example,

$\displaystyle (21, 14) = 7.$

This result seems totally obvious; it must be true. But I had trouble finding a proof.

## The Problem’s Cousin

There is an unproved conjecture in number theory that says: there are an infinite number of ${x}$ so that both ${x}$ and ${4x+1}$ are prime. Taking such an ${x}$ larger than ${t}$ clearly gives an ${x}$ for our problem.

I like conjectures like this since they give you an immediate insight that a statement is likely to be true. But we would like a proof that does not use any unproved conjectures. Our problem can be viewed as a poor version of some of these conjectures. Suppose that you have a conjecture that there are an infinite number of ${x}$ so that

$\displaystyle f_{1}(x),f_{2}(x),\dots,f_{m}(x),$

are all prime for some given functions ${f_{k}(x)}$. Then the poor version is to prove that there are ${x}$ so that these numbers are all relatively prime to some given ${t}$. There are some partial results to the prime version by Ben Green and Terence Tao.

## The Problem’s Failed Proofs

My initial idea was to try to set ${x}$ to something like ${ty + 1}$. The point is that this always satisfies the first constraint: that is ${(ty+1, t)=1}$ for any ${y}$. Then I planned to try and show there must be some ${y}$ that satisfies the second constraint. Thus the goal is to prove there is some ${y}$ so that

$\displaystyle (4(ty+1)+1, t) = 1.$

But this can be false. Note that if ${5}$ divides ${t}$ then

$\displaystyle 4(ty+1)+1 = 4ty + 5$

and so ${4x+1}$ is always divisible by ${5}$, hence never relatively prime to ${t}$. Oops.
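
To see this failure concretely, here is a quick numeric check. This is a small Python sketch of my own, not from the original argument, with ${t=15}$ as an example value: the first constraint always holds, but the second never can.

```python
from math import gcd

# A check of the failed substitution x = t*y + 1 when 5 divides t.
# Example value t = 15 (my choice): the first constraint always holds,
# but 4x + 1 = 60y + 5 is a multiple of 5, so the second never does.
t = 15
for y in range(100):
    x = t * y + 1
    assert gcd(x, t) == 1              # first constraint: always satisfied
    assert (4 * x + 1) % 5 == 0        # but 5 divides 4x + 1 ...
    assert gcd(4 * x + 1, t) > 1       # ... so the second constraint fails
print("x = t*y + 1 never works for t = 15, as claimed")
```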

My next idea was to set ${x}$ to a more “clever” value. I tried

$\displaystyle x = ty + d.$

Here I thought that I could make ${d}$ special and control the situation. Now

$\displaystyle 4(ty+d)+1 = 4ty + 4d+1.$

This looked promising. I then said to myself: why not make ${4d+1}$ a large prime ${p}$? Then clearly

$\displaystyle 4x + 1 = (4t)y + p.$

Since ${4t}$ and ${p}$ are relatively prime, the famous Dirichlet’s Theorem on arithmetic progressions says we could make ${4x+1}$ a prime too by selecting ${y}$. This would satisfy the second constraint, and we are done.

Not quite. The trouble is that we need to have also that

$\displaystyle (x,t) = 1.$

Now this is

$\displaystyle (ty + d, t) = 1.$

The trouble is that ${d}$ might not be relatively prime to ${t}$. So we could just ${\dots}$ This seems like a recursion and I realized that it might not work.

## The Problem’s Solution

I finally found a solution thanks to Delta Airlines. My dear wife, Kathryn Farley, and I were stuck in DC for several hours waiting for our delayed flight home. This time was needed for me to find a solution.

The key for me was to think about the value ${t}$. It is usually a good idea to look at the simplest case first. So suppose that ${t=1}$, then clearly the constraints

$\displaystyle (x,t) = 1 \text{ and } (4x+1,t) = 1,$

are now trivial. The next simplest case seems to be when ${t}$ is a prime. Let’s try ${t=3}$. Now ${x=1}$ works. Let’s generalize this to any prime ${t>3}$. The trick is to set ${x}$ so that

$\displaystyle x \equiv 2 \bmod t.$

Then ${4x+1}$ is equal to ${9}$ modulo ${t}$, which is not divisible by ${t}$. This shows that when ${t}$ is an odd prime there is always some ${x}$.

Okay, how do we get the full result? What if ${t}$ is the product of several primes? The Chinese remainder theorem to the rescue. Suppose that ${t}$ is divisible by the distinct odd primes ${p_{1},p_{2}, \dots, p_{m}}$. We can easily see that we do not care if there are repeated factors, since they cannot change the relative-primality constraints.

Then we constrain ${x}$ by:

$\displaystyle x \equiv 1 \bmod 3$

and

$\displaystyle x \equiv 2 \bmod p_{k}$

for all of the primes ${p_{k}>3}$ dividing ${t}$. Then the Chinese remainder theorem proves there is some ${x}$; note that the congruence modulo ${3}$ does no harm even when ${3}$ does not divide ${t}$. Done.
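
To make the construction concrete, here is a small Python sketch of my own, not part of the original argument, that builds ${x}$ by the Chinese remainder theorem exactly as above and then verifies both gcd constraints for all odd ${t}$ up to a bound. The helper names are mine.

```python
from math import gcd

def odd_prime_factors(t):
    """Distinct odd prime factors of t (t is assumed odd)."""
    factors, d = set(), 3
    while d * d <= t:
        while t % d == 0:
            factors.add(d)
            t //= d
        d += 2
    if t > 1:
        factors.add(t)
    return factors

def crt(residues, moduli):
    """Chinese remainder theorem for pairwise coprime moduli."""
    x, m = 0, 1
    for r, p in zip(residues, moduli):
        x += m * ((r - x) * pow(m, -1, p) % p)  # now x ≡ r (mod p) as well
        m *= p
    return x % m

def find_x(t):
    """Return x with gcd(x, t) = 1 and gcd(4x + 1, t) = 1, for odd t."""
    if t == 1:
        return 1
    residues, moduli = [], []
    for p in odd_prime_factors(t):
        moduli.append(p)
        residues.append(1 if p == 3 else 2)  # x ≡ 1 (mod 3), x ≡ 2 (mod p) otherwise
    return crt(residues, moduli)

for t in range(1, 2000, 2):
    x = find_x(t)
    assert gcd(x, t) == 1 and gcd(4 * x + 1, t) == 1
print("construction verified for all odd t < 2000")
```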

## Open Problems

Is there some one-line proof of the problem? Do you know any references? There are several obvious generalizations of this simple problem; perhaps someone might look into them.

Cracking a Diophantine problem for 42 too

Andrew Booker is a mathematician at the University of Bristol, who works in analytic number theory. For example, he has a paper extending a result of Alan Turing on the Riemann zeta function. Yes, our Turing.

Today Ken and I will talk about his successful search for a solution to a 64-year-old problem.

A recipe for changing the objectives of problems

 [ Composite crop of src1, src2, src3 ]

Aram Harrow, Avinatan Hassidim, and Seth Lloyd are quantum stars who have done many other things as well. They are jointly famous for their 2009 paper, “Quantum algorithm for linear systems of equations.”

Today Ken and I discuss an aspect of their paper that speaks to a wider context.

A clever trick on combining automata

John Robson, who goes by Mike, has worked on various problems including what is still the best result on separating words—the topic we discussed the other day. Ken first knew him for his proof that ${N \times N}$ checkers is ${\mathsf{EXPTIME}}$-complete and similar hardness results for chess and Go.

Today I want to talk about his theorem that any two words can be separated by an automaton with relatively few states.

In his famous paper from 1989, he proved an upper bound on the Separating Words Problem. This is the question: Given two strings ${S}$ and ${T}$, how many states does a deterministic automaton need to be able to accept ${S}$ and reject ${T}$? His theorem is:

Theorem 1 (Robson’s Theorem) Suppose that ${S}$ and ${T}$ are distinct strings of length ${n}$. Then there is an automaton with at most ${O(n^{0.4}\log^{0.6} n)}$ states that accepts ${S}$ and rejects ${T}$.

The story of his result is involved. For starters, it is still the best upper bound after almost three decades. Impressive. Another issue is that a web search does not quickly, at least for me, find a PDF of the original paper. I tried to find it and could not. More recent papers on the separating words problem reference his 1989 paper, but they do not explain how he proves it.

Recall the separating words problem: Given two distinct words of length ${n}$, how few states can a deterministic finite automaton have and still accept one of the words while rejecting the other? Thus his theorem shows that the number of states needed grows at most like ${n^{0.4}}$, up to logarithmic factors, which is below the square root of ${n}$.

I did finally track the paper down. The trouble for me is the paper is encrypted. Well not exactly, but the version I did find is a poor copy of the original. Here is an example to show what I mean:

 [ An Example ]

So the task of decoding the proof is a challenge. A challenge, but a rewarding one.

## A Cool Trick

Robson’s proof uses two insights. The first is that he uses some basic string-ology; that is, he uses some basic facts about strings. For example, he uses that a non-periodic string cannot overlap itself too much.

He also uses a clever trick on how to simulate two deterministic machines for the price of one. This in general is not possible, and is related to deep questions about automata that we have discussed before here. Robson shows that it can be done in a special but important case.

Let me explain. Suppose that ${\alpha}$ is a string. We can easily design an automaton that accepts ${x}$ if and only if ${x}$ is the string ${\alpha}$. The machine will have order the length of ${\alpha}$ states. So far quite simple.

Now suppose that we have a string ${S}$ of length ${n}$ and wish to find a particular occurrence of the pattern ${\alpha}$ in ${S}$. We write ${\#(S,\alpha)}$ for the number of occurrences of ${\alpha}$ in ${S}$. The task is to construct an automaton that accepts at the end of the ${k^{th}}$ copy of ${\alpha}$. Robson shows that this can be done by an automaton that has order

$\displaystyle \#(S,\alpha) + |\alpha|$

states, where ${|\alpha|}$ is the length of the string ${\alpha}$.

This is a simple, clever, and quite useful observation. Clever indeed. The obvious automaton for this task would seem to require a Cartesian product of two machines. This would imply that it would require

$\displaystyle \#(S,\alpha) \times |\alpha|$

states. Note the times operator ${\times}$ rather than addition. Thus Robson’s trick is a huge improvement.

Here is how he does this.

## His Trick

Robson uses a clever trick in his proof of the main lemma. Let’s work through an example with the string ${100}$. The goal is to see if there is a copy of this string starting at a position that is a multiple of ${3}$.

The machine starts in state ${(0,Y)}$ and tries to find the correct string ${100}$ as input. If it does, then it reaches the accepting state ${(3,Y)}$. If while doing this it gets a wrong input, then it switches to states that have stopped looking for the input ${100}$. After seeing three inputs the machine reaches ${(3,N)}$ and then moves back to the start state.

 [ The automaton ]
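
Here is a tiny simulation of this machine. It is my own reconstruction of the picture above, with state names matching the text; it only illustrates the idea, not Robson’s exact figure.

```python
# My reconstruction of the toy machine: it looks for the pattern "100" only
# at positions that are multiples of 3.  States are pairs (i, flag) as in the
# text: i = symbols read in the current block of three, flag = 'Y' while the
# block still matches "100" and 'N' once it has failed.
PATTERN = "100"

def aligned_matches(s):
    """Return the (1-based) positions where an aligned copy of '100' ends."""
    state = (0, 'Y')
    hits = []
    for pos, ch in enumerate(s, start=1):
        i, flag = state
        if flag == 'Y' and ch == PATTERN[i]:
            state = (i + 1, 'Y')
        else:
            state = (i + 1, 'N')
        if state == (3, 'Y'):      # accepting state: an aligned copy found
            hits.append(pos)
        if state[0] == 3:          # a block of three is over: reset
            state = (0, 'Y')
    return hits

print(aligned_matches("100110100100"))   # prints [3, 9, 12]
```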

## The Lemmas

We will now outline the proof in some detail.

### Hashing

The first lemma is a simple fact about hashing.

Lemma 2 Suppose ${1 \le r \le m}$ and

$\displaystyle 1 \le k_{1} < \cdots < k_{m} \le n.$

Then all but ${{O}(m \log n)}$ primes satisfy

$\displaystyle k_{i} \equiv k_{r} \bmod p \text{ if and only if } i =r.$

Proof: Consider the quantities ${|k_{r} - k_{i}|}$ for ${i}$ not equal to ${r}$. Call a prime bad if it divides one of these quantities. Each quantity is at most ${n}$ and so is divisible by at most ${\log_{2} n}$ primes. Hence there are at most ${{O}(m\log n)}$ bad primes in total. $\Box$
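
As a sanity check, here is a small Python experiment of my own, with arbitrary small numbers, illustrating the lemma: for positions ${k_{1} < \cdots < k_{m}}$ only a few primes can confuse ${k_{r}}$ with another ${k_{i}}$.

```python
from random import sample, seed

def primes_up_to(n):
    """Simple sieve of Eratosthenes."""
    sieve = [True] * (n + 1)
    sieve[0:2] = [False, False]
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            sieve[i*i::i] = [False] * len(sieve[i*i::i])
    return [p for p, is_p in enumerate(sieve) if is_p]

# Illustration with small numbers (my own example): positions k_1 < ... < k_m
# in [1, n], a target index r, and a count of the primes p for which some
# other k_i collides with k_r modulo p.
seed(0)
n, m = 10_000, 20
ks = sorted(sample(range(1, n + 1), m))
r = 7                                   # an arbitrary target index
bad = [p for p in primes_up_to(200)
       if any(ki % p == ks[r] % p for i, ki in enumerate(ks) if i != r)]
print(len(bad), "bad primes below 200:", bad)
# The lemma says only O(m log n) primes can be bad, so a good prime of size
# about m times a polylog factor always exists.
```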

### Strings

We need some definitions about strings. Let ${| \alpha |}$ be the length of the string ${\alpha}$. Also let ${\#(S,\alpha)}$ be the number of occurrences of ${\alpha}$ in ${S}$.

A string ${\alpha}$ has the period ${p}$ provided

$\displaystyle \alpha_{i} = \alpha_{i+p},$

for all ${i}$ so that ${i+p}$ is defined. A string ${\alpha}$ is periodic provided it has a period ${p>0}$ that is less than half its length. Note, the shorter the period the more the string is really “periodic”: for example, the string

$\displaystyle 10101010101010$

is more “periodic” than

$\displaystyle 10000001000000.$

Lemma 3 For any string ${u}$ either ${u0}$ or ${u1}$ is not periodic.

Proof: Suppose that ${\beta=u\sigma}$ is periodic with period ${p}$ where ${\sigma}$ is a single character. Let the length of ${\beta}$ equal ${l}$. So by definition, ${1 \le p \le l/2}$. Then

$\displaystyle \beta_{i} = \beta_{i+p},$

for ${1 \le i \le l-p}$. So it follows that

$\displaystyle \beta_{l-p} = \beta_{l} = \sigma.$

So the last character ${\sigma}$ is forced to appear at position ${l-p}$, which is a position inside ${u}$, since

$\displaystyle l/2 \le l-p \le l-1 < l.$

This shows that ${u1}$ and ${u0}$ cannot both be periodic: if ${u0}$ had period ${p}$ and ${u1}$ had period ${q}$, both at most ${l/2}$, then ${p+q \le l}$, so by the Fine and Wilf periodicity theorem the common prefix ${u}$ has period ${\gcd(p,q)}$, and stepping from position ${l-p}$ to position ${l-q}$ in steps of ${\gcd(p,q)}$ forces ${0 = 1}$.

$\Box$
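
Lemma 3 is also easy to confirm by brute force. Here is a short Python check of my own over all short binary strings ${u}$; it uses the period bound ${p \le l/2}$ from the proof, which only makes the test stricter.

```python
from itertools import product

def is_periodic(s):
    """True if s has a period p with 0 < p <= len(s)/2 (the proof's bound)."""
    l = len(s)
    return any(all(s[i] == s[i + p] for i in range(l - p))
               for p in range(1, l // 2 + 1))

# Brute-force check of Lemma 3 on all binary strings u of length up to 14.
for length in range(1, 15):
    for bits in product("01", repeat=length):
        u = "".join(bits)
        assert not (is_periodic(u + "0") and is_periodic(u + "1"))
print("Lemma 3 verified for all binary u with |u| <= 14")
```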

Lemma 4 Suppose that ${\alpha}$ is not a periodic string. Then the number of copies of ${\alpha}$ in a string ${S}$ is upper bounded by ${{O}(M)}$ where

$\displaystyle M = \frac{|S|}{|\alpha|}.$

Proof: The claim follows once we prove that no two copies of ${\alpha}$ in ${S}$ can overlap in more than ${l/2}$ positions, where ${l}$ is the length of ${\alpha}$: then consecutive copies start more than ${l/2}$ apart, so there are at most ${O(|S|/l)}$ of them.

If ${\alpha}$ has two copies in ${S}$ that overlap then clearly

$\displaystyle \alpha_{i} = \alpha_{i+d},$

for some ${d>0}$ and all ${i}$ in the range ${1,\dots,l-d}$. This says that ${\alpha}$ has the period ${d}$. Since ${\alpha}$ is not periodic it follows that ${d > l/2}$. This implies that the overlap of the two copies of ${\alpha}$ has length at most ${l/2}$. Thus we have shown that they cannot overlap too much. $\Box$
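
Here is a quick empirical illustration of the overlap bound, with my own example pattern rather than anything from the paper: for a non-periodic pattern, consecutive occurrences must start more than half a pattern-length apart.

```python
def occurrences(S, a):
    """Start positions (0-based) of all copies of a in S, overlaps allowed."""
    return [i for i in range(len(S) - len(a) + 1) if S[i:i + len(a)] == a]

# My own example: the pattern "10110" is not periodic (no period p <= 2),
# so consecutive copies must start more than len(a)//2 = 2 positions apart.
S = "10" + "110" * 8
a = "10110"
occ = occurrences(S, a)
gaps = [b - c for c, b in zip(occ, occ[1:])]
assert all(g > len(a) // 2 for g in gaps)
print(len(occ), "occurrences; gaps:", sorted(set(gaps)))
```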

### Main Lemma

Say an automaton finds the ${k^{th}}$ occurrence of ${\alpha}$ in ${S}$ provided it enters a special state after scanning the last bit of this occurrence.

Lemma 5 Let ${S}$ be a string of length ${n}$ and let ${\alpha}$ be a non-periodic string. Then there is an automaton with at most ${\widetilde{O}(M)}$ states that can find the ${k^{th}}$ occurrence of ${\alpha}$ in ${S}$ where

$\displaystyle M = \#(S,\alpha) + |\alpha|.$

Here ${\widetilde{O}(M)}$ allows factors that are fixed powers of ${\log n}$. This lemma is the main insight of Robson and will be proved later.

## The Main Theorem

The following is a slightly weaker version of Robson’s theorem. I am still confused a bit about his stronger theorem, to be honest.

Theorem 6 (Robson’s Theorem) Suppose that ${S}$ and ${T}$ are distinct strings of length ${n}$. Then there is an automaton with at most ${\widetilde{O}(\sqrt {n})}$ states that accepts ${S}$ and rejects ${T}$.

Proof: Since ${S}$ and ${T}$ are distinct we can assume that ${S}$ starts with the prefix ${u1}$ and ${T}$ starts with the prefix ${u0}$ for some string ${u}$. If the length of ${u}$ is ${\widetilde{O}(\sqrt {n})}$ the theorem is trivial: just construct an automaton that checks the first ${|u|+1}$ bits and accepts exactly the strings that begin with ${u1}$. This accepts ${S}$ and rejects ${T}$.

So we can assume that ${u = w\alpha}$ for some strings ${w}$ and ${\alpha}$ where the latter is order ${\widetilde{O}(\sqrt {n})}$ in length. By Lemma 3 we can assume, swapping the roles of ${S}$ and ${T}$ if needed, that ${\alpha1}$ is not periodic. So by Lemma 4 we get that

$\displaystyle \#(S,\alpha1) = \widetilde{O}(\sqrt{n}).$

Then by Lemma 5 we are done. $\Box$

## Proof of Main Lemma

Proof: Let ${S}$ have length ${n}$ and let ${\alpha}$ be a non-periodic string in ${S}$ of length ${l}$. Also let ${\#(S,\alpha) = m}$. By Lemma 4, the overlap lemma, it follows that ${m}$ is bounded by ${O(|S|/|\alpha|)}$.

Let ${\alpha}$ occur at locations

$\displaystyle 1 \le k_{1} < \cdots < k_{m} \le n.$

Suppose that we are to construct a machine that finds the ${r^{th}}$ copy of ${\alpha}$. By Lemma 2, the hashing lemma, there is a prime ${p=\widetilde{O}(m)}$ so that

$\displaystyle k_{i} \equiv k_{r} \bmod p$

if and only if ${i=r}$. Note we can also assume that ${p > l}$.

Let’s argue the special case where ${k_{r}}$ is ${0}$ modulo ${p}$. If it is congruent to another value the same argument can be used. This follows by having the machine initially skip a fixed amount of the input and then do the same as in the congruent to ${0}$ case.

The automaton has states ${(i,Y)}$ and ${(i,N)}$ for ${i=0,\dots,p}$. The machine starts in state ${(0,Y)}$ and tries to get to the accepting state ${(l,Y)}$. The transitions include:

$\displaystyle (0,Y) \underset{\alpha_{1}}{\rightarrow} (1,Y) \underset{\alpha_{2}}{\rightarrow} (2,Y) \underset{\alpha_{3}}{\rightarrow} \cdots \underset{\alpha_{l}}{\rightarrow} (l,Y).$

This means that the machine keeps checking the input to see if it is scanning a copy of ${\alpha}$. If it gets all the way to the accepting state ${(l,Y)}$, then it stops.

Further transitions are:

$\displaystyle (1,N) \rightarrow (2,N) \rightarrow \cdots \rightarrow (p,N),$

and

$\displaystyle (0,Y) \underset{\neg \alpha_{1}}{\rightarrow} (1,N), (1,Y) \underset{\neg \alpha_{2}}{\rightarrow} (2,N), \dots, (l-1,Y) \underset{\neg \alpha_{l}}{\rightarrow} (l,N).$

The second group means that if a wrong input happens, then ${(i,Y)}$ moves to ${(i+1,N)}$. Finally, the state ${(p,N)}$ resets and starts the search again: it is really the same as the start state ${(0,Y)}$, so the transitions that would enter ${(p,N)}$ go directly to ${(0,Y)}$ instead (a deterministic machine has no epsilon moves, but identifying the two states achieves the reset).

Clearly this has the required number of states and it operates correctly. $\Box$
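
To make the construction concrete, here is a small Python simulation of my own, using the state names from the proof. It looks for a copy of ${\alpha}$ starting at a position that is a multiple of ${p}$; by the hashing lemma, in ${S}$ that position singles out the ${r^{th}}$ copy.

```python
def find_aligned_copy(S, alpha, p):
    """Simulate the machine from the proof: states (i,'Y') and (i,'N') for
    i = 0..p, start state (0,'Y'), special state (l,'Y').  It flags the first
    copy of alpha that starts at a position divisible by p (0-based); the
    state (p,'N') is identified with (0,'Y'), which is the reset."""
    l = len(alpha)
    assert p > l, "the proof chooses a good prime p larger than |alpha|"
    state = (0, 'Y')
    for pos, ch in enumerate(S, start=1):
        i, flag = state
        if flag == 'Y' and ch == alpha[i]:
            state = (i + 1, 'Y')
        else:
            state = (i + 1, 'N')
        if state == (l, 'Y'):          # special state: aligned copy found
            return pos - l             # its starting position
        if state[0] == p:              # a block of p symbols is over: reset
            state = (0, 'Y')
    return None

# Example with my own small numbers: pattern "101" and p = 5.  The copy
# starting at position 1 is ignored; the copy starting at position 5 is found.
print(find_aligned_copy("0101010100101", "101", 5))   # prints 5
```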

## Open Problems

The open problem is: Can the SWP be solved with a better bound? The lower bound is still order ${\log n}$. So the gap is exponential.

Another exponential gap in complexity theory?

Jeffrey Shallit is a famous researcher into many things, including number theory and being a skeptic. He has a colorful website with an extensive quotation page—one of my favorites by Howard Aiken is right at the top:

Don’t worry about people stealing an idea. If it’s original, you will have to ram it down their throats.

Today I thought I would discuss a wonderful problem that Jeffrey has worked on.

Jeffrey’s paper is joint with Erik Demaine, Sarah Eisenstat, and David Wilson. See also his talk. They say in their introduction:

Imagine a computing device with very limited powers. What is the simplest computational problem you could ask it to solve? It is not the addition of two numbers, nor sorting, nor string matching—it is telling two inputs apart: distinguishing them in some way.

More formally:

Let ${x}$ and ${y}$ be two distinct strings of length ${n}$ over the usual binary alphabet. What is the size of the smallest deterministic automaton ${M}$ that can accept one of the strings and reject the other?

That is, how hard is it for a simple type of machine to tell ${x}$ apart from ${y}$? There is no super cool name for the question—it is called the Separating Words Problem (SWP).
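
For very small examples one can compute this quantity by brute force. Here is a sketch of my own; the helper names are mine, and it is feasible only for a handful of states since it tries all ${k^{2k}}$ transition tables.

```python
from itertools import product

def separates(x, y, k):
    """Is there a k-state DFA over {0,1} accepting x and rejecting y?
    Brute force: states 0..k-1, start state 0; try every transition table.
    If the two strings end in different states, some accepting set works."""
    def run(delta, s):
        q = 0
        for ch in s:
            q = delta[(q, ch)]
        return q
    for table in product(range(k), repeat=2 * k):
        delta = {(q, ch): table[2 * q + int(ch)]
                 for q in range(k) for ch in "01"}
        if run(delta, x) != run(delta, y):
            return True
    return False

def sep(x, y):
    """Fewest states of a DFA separating x from y (brute force, tiny cases)."""
    k = 1
    while not separates(x, y, k):
        k += 1
    return k

print(sep("0010", "0100"))   # prints 2 for this small pair
```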

## Some History

Pavel Goralčík and Václav Koubek introduced the problem in 1986—see their paper here. Suppose that ${x}$ and ${y}$ are distinct binary words of length ${n}$. Define ${{\bf SEP}(x,y)}$ to be the number of states of the smallest automaton that accepts ${x}$ and rejects ${y}$ or vice-versa. They proved the result that got people interested:

Theorem 1 For all distinct binary words ${x}$ and ${y}$ of length ${n}$,

$\displaystyle {\bf SEP}(x,y) =o(n).$

That is, the size of the automaton is asymptotically sublinear. Of course there is trivially a way to tell the words apart with order ${n}$ states. The surprise is that one can do better, always.

In 1989 John Robson obtained the best known result:

Theorem 2 For all distinct binary words ${x}$ and ${y}$ of length ${n}$,

$\displaystyle {\bf SEP}(x,y) = O(n^{2/5}(\log n)^{3/5}).$

This bound is pretty strange. We rarely see bounds like it. This suggests to me that it is either special or it is not optimal. Not clear which is the case. By the way, it is also known that there are ${x}$ and ${y}$ so that

$\displaystyle {\bf SEP}(x,y) = \Omega(\log n).$

Thus there is an exponential gap between the known lower and upper bounds. Welcome to complexity theory.

What heightens interest in this gap is that whenever the words have different lengths, there is always a logarithmic-size automaton that separates them. The reason is our old friend, the Chinese Remainder Theorem. Simply, if ${m < n}$ there is always a short prime ${p}$ that does not divide ${n - m}$, which means that the DFA that goes in a cycle of length ${p}$ will end in a different state on any ${x}$ of length ${m}$ than on any ${y}$ of length ${n}$. Moreover, the strings ${0^m}$ and ${0^N}$, where ${N}$ equals ${m}$ plus the least common multiple of ${1,\dots,m+1}$, require ${\Omega(\log N)}$ states to separate. Padding these with ${1}$s gives equal-length pairs of all lengths ${n}$ with ${{\bf SEP}(x,y) \sim \log n}$.
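
The different-lengths argument is easy to check directly. Here is a tiny sketch, mine rather than the paper's, that finds the least prime cycle length separating two given lengths.

```python
def smallest_separating_cycle(m, n):
    """For lengths m < n, find the least prime p that does not divide n - m.
    A DFA that just counts the input length modulo p ends in different states
    on any string of length m and any string of length n."""
    def is_prime(k):
        return k > 1 and all(k % d for d in range(2, int(k ** 0.5) + 1))
    p = 2
    while (n - m) % p == 0 or not is_prime(p):
        p += 1
    return p

# Example: lengths 1000 and 1288 differ by 288 = 2^5 * 3^2, so p = 5 works:
# five states suffice to separate any pair of these two lengths.
print(smallest_separating_cycle(1000, 1288))   # prints 5
```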

Some other facts about SWP can be found in the paper:

• (a) It does not matter whether the alphabet is binary or larger.

• (b) For random distinct ${x,y}$, the expected number of states needed to separate them is at most ${4}$.

• (c) All length-${n}$ pairs ${x,y}$ can be distinguished by deterministic pushdown automata (with two-way input tape) of size ${O(\log n)}$.

Point (b) underscores why it has been hard to find “bad pairs” that defeat all small DFAs. All this promotes belief that logarithmic is the true upper bound as well. Jeffrey stopped short of calling this a conjecture in his talk, but he did offer a 100-pound prize (the talk was in Britain) for improving Robson’s bound.

## Some Questions

There are many partial results in cases where ${x}$ and ${y}$ are restricted in some way. See the papers for details. I thought I would just repeat a couple of interesting open cases.

${\bullet }$ How hard is it to tell words from their reversal? That is, if ${x}$ is a word can we prove a better bound on

$\displaystyle {\bf SEP}(x,x^{R}).$

Recall ${x^{R}}$ is the reversal of the word ${x}$. Of course we assume that ${x}$ is not the same as its reversal—that is, we assume that ${x}$ is not a palindrome.

${\bullet }$ How hard is it to tell words apart from their cyclic shifts?

${\bullet }$ How hard is it to tell words from their ${\dots}$ You get the idea: try other operations on words.

## Open Problems

The SWP is a neat question in my opinion. I wonder if there would be some interesting consequence if we could always tell two words apart with few states. A good measure of a conjecture is how many consequences follow from it. I wonder if there could be some interesting applications. What do you think?

Self-play and Ramsey numbers

 [ Talking about worst case ]

Avrim Blum is the CAO for TTIC. That is, he is the Chief Academic Officer at the Toyota Technological Institute at Chicago. Avrim has made, and continues to make, key contributions to many areas of theory—including machine learning, approximation algorithms, on-line algorithms, algorithmic game theory, the theory of database privacy, and non-worst-case analysis of algorithms.

Today I want to discuss a suggestion of Avrim for research on self-play.