Make that 1942

Raphaël Salem was a great mathematician who worked mainly on Fourier analysis. He will always be remembered for his many beautiful contributions and for the Salem Prize that is named after him. He even has his own class of real numbers: a real algebraic number ${\alpha>1}$ is a Salem number if all its conjugates have absolute value at most ${1}$, with at least one of absolute value exactly ${1}$. See this for a list of known Salem numbers.

Today I want to discuss a result that is simple to state, widely used, and proved by a clever argument.

One of the great things about proving mathematical theorems is that they are forever. We still talk about results from Euclid, 2300 years later; perhaps we will talk about your result years from now. Franchino Gaffurio noted the importance of arithmetic progressions to harmony in his treatise Theorica musicae in 1492. The result in question is a bit more recent—it was proved in 1942 by Salem and Donald Spencer.

What made me think about their theorem is that it is used in very diverse areas of mathematics. For example, it plays a role in the study of ${\omega}$, the matrix multiplication exponent, and also was used recently by Dieter van Melkebeek in a proof of a beautiful result on the complexity of ${\mathsf{SAT}}$.

These applications of the theorem are quite different: one is looking for patterns in trilinear forms, and one is looking for the existence of a protocol that Alice and Bob can use to solve instances of Boolean satisfiability. Very different problems, I think anyone would agree. Yet the same theorem is used.

Let’s turn to the statement of this wonderful theorem. Perhaps you will be able to use it in some proof in the future.

The Theorem

The paper of Salem and Spencer is titled: On Sets Of Integers Which Contain No Three Terms In Arithmetical Progression. The title kind of gives away the whole result. This was probably more important in 1942, since a paper with a title like A Result On Patterns In Numbers would have been hard to find. Recall there was no searching inside papers in those days. Nowadays you can use any title at all and people can still search and find your result. They can even search to see if you have worked on a particular equation. But back then a good title was worth many citations.

Let ${S}$ be a set of integers. Say that the set is progression-free if for any three distinct integers ${x,y,z}$ in the set,

$\displaystyle x + y \neq 2z.$

Note that this is the same as saying that there is no progression of length three in the set: if

$\displaystyle x, x + \Delta, x +2\Delta$

are all in the set with ${\Delta > 0}$, then

$\displaystyle x + (x + 2\Delta) = 2(x + \Delta).$

Salem and Spencer defined ${\nu(N)}$ to be the maximum number of elements that can be in a subset of ${\{1,\dots,N\}}$ that is progression-free. They proved:

Theorem: For any ${\epsilon>0}$,

$\displaystyle \nu(N) > N^{1-\frac{\log 2 + \epsilon}{\log \log N}}.$

Note this means that ${\nu(N)}$ is ${N^{1-o(1)}}$: there always exists a set ${S}$ that is very dense yet avoids every three-term progression.
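For small ${N}$, the quantity ${\nu(N)}$ can be computed directly by brute force. Here is a quick sketch in Python (my own check, not from the paper; the function names are made up):

```python
from itertools import combinations

def is_progression_free(s):
    """True if no three distinct x, y, z in s satisfy x + y == 2*z."""
    vals = set(s)
    for x, y in combinations(vals, 2):
        # the midpoint of distinct x and y would complete a 3-term progression
        if (x + y) % 2 == 0 and (x + y) // 2 in vals:
            return False
    return True

def nu(N):
    """Size of the largest progression-free subset of {1, ..., N}.

    Exponential-time search; only feasible for small N.
    """
    for size in range(N, 0, -1):
        if any(is_progression_free(sub)
               for sub in combinations(range(1, N + 1), size)):
            return size
    return 0
```

For instance, `nu(8)` returns `4` (witnessed by ${\{1,2,4,5\}}$), matching the known values of this sequence.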

Unconventional Wisdom, Again

In their introduction they pointed out that at the time the result was surprising, since the conventional wisdom then was that ${\nu(N)}$ had to behave like ${N^{\alpha}}$ for some ${\alpha < 1}$. Clearly their result is much stronger than this; again people guessed wrong. Here the people who made the guess were Paul Erdős and Paul Turán. In that same paper Erdős and Turán made a small calculation error that was pointed out in one of the shortest papers I have ever seen. It was in the Journal of the London Mathematical Society:

As I have quoted before, from Roger Brockett:

“It is hard to argue with a counter-example.”

The Proof

I will not give the full proof, since it is quite nicely presented in the original paper; you can also take a look at a more modern treatment here.

The main idea of the proof is very simple. Let ${x}$ be a number in the range ${1,\dots,N}$ and use ${v_{x}}$ to denote the vector ${(v_{0},\dots,v_{m})}$ where

$\displaystyle x = \sum_{k=0}^{m} v_{k}10^{k}.$

The vector ${v_{x}}$ is just the vector of the digits of ${x}$ in decimal notation. Look at the set ${S}$ of those ${x}$ whose digits all lie in the set ${\{0,1,2,3,4\}}$. Then the critical observation is that for ${x}$ and ${y}$ in the set ${S}$,

$\displaystyle x + y = z \text{ if and only if } v_{x} + v_{y} = v_{z},$

where the latter sum is the vector sum of the digits. This follows since there is no possible carry in adding ${x}$ and ${y}$. For example,

$\displaystyle \begin{array}{r} 1234231301 \\ +\ 4023141312 \\ \hline 5257372613. \end{array}$

The reason this is an important observation is that it replaces looking for a set of integers with no three-term progression by looking for a set of vectors that contains no line. This is much easier, as you might expect. The remainder of the proof replaces decimal by a higher radix and replaces the restriction on the digits by a more complex one. The idea is to restrict the digits so that addition of integers satisfying the restriction behaves mostly like vector addition.
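The no-carry observation can be checked directly on the example above (a small Python sketch, not part of the original proof):

```python
def digits(x, base=10):
    """Digit vector of x, least significant digit first."""
    out = []
    while x:
        out.append(x % base)
        x //= base
    return out

# Both numbers below have every decimal digit in {0,1,2,3,4}, so no
# column sum exceeds 8 and the addition produces no carries.
x, y = 1234231301, 4023141312
z = x + y
assert z == 5257372613

# Hence integer addition agrees with coordinate-wise digit addition.
assert [a + b for a, b in zip(digits(x), digits(y))] == digits(z)
```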

See the paper for the details—it is less than three pages long and not hard to follow. Also see this recent paper by Bill Gasarch, James Glenn, and Clyde Kruskal, which covers constructions of 3-free sets for small ${n}$ with tables for ${n \leq 250}$, and their followup on theory and heuristics for higher ${n}$. See also this 2010 post by Bill, and this one by Gil Kalai, on a “breakthrough” and a “barrier.”

Open Problems

Can you use this wonderful theorem in your work? Do you know of another theorem like this that is widely used in very different parts of mathematics?

1. December 6, 2011 9:47 am

I ought to know the answer to this, but still. Your description of the proof could be a description of the Behrend construction of 1946. Was Behrend’s contribution to this story just a better optimization of the parameters or was the idea of taking a sphere his too? If the latter, then what sort of line-avoiding set did Salem and Spencer go for?

• December 6, 2011 10:27 am

Behrend introduced the use of a sphere and got better asymptotics. In their original paper, Salem and Spencer just used two properties for numbers written in base d with d large: having small enough digits that no carrying can occur when doing the addition, and having each of those digits occur exactly the same number of times.
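[A tiny sanity check of the two properties just described, in Python with hypothetical small parameters: base ${d=8}$, four digits, each of the values ${0,1,2,3}$ used exactly once.]

```python
from itertools import permutations, combinations

d = 8  # base; every digit stays below d/2 = 4, so additions never carry
# Each allowed digit value occurs exactly once in each number.
S = sorted(sum(a * d**k for k, a in enumerate(p))
           for p in permutations(range(d // 2)))

# Verify the resulting integers contain no 3-term progression.
vals = set(S)
for x, y in combinations(vals, 2):
    if (x + y) % 2 == 0:
        assert (x + y) // 2 not in vals
```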

• June 15, 2016 3:20 pm

Behrend’s construction is to take integers with $d$-ary expansions $(a_1,\dots ,a_n)$ so that $0\le a_i < d/2$ for each $i$ and the sum of squares of the $a_i$s (as integers) is a constant C.  (So they are all on a sphere, as Henry said.) Then there is no 3-term AP, and for some C the size is by pigeonhole at least $d^n/(2^n n^2)$. (You can improve the $n^2$ by being careful.) If we take $d=\sqrt {\log N}$  then we get a 3-term-free set of size $N/\exp (K \sqrt {\log N})$
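[A matching sketch of the sphere construction just described, again with hypothetical small parameters:]

```python
from itertools import product, combinations

d, n = 10, 3  # base d, n digits, all digits below d/2 so sums never carry
spheres = {}
for v in product(range(d // 2), repeat=n):
    spheres.setdefault(sum(a * a for a in v), []).append(v)

# Pigeonhole: keep the best-populated sphere (one value of the constant C).
best = max(spheres.values(), key=len)
S = {sum(a * d**k for k, a in enumerate(v)) for v in best}

# A sphere contains no three collinear points, so S has no 3-term AP.
for x, y in combinations(S, 2):
    if (x + y) % 2 == 0:
        assert (x + y) // 2 not in S
```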

Can’t we just take all $a_i$s such that $d/4 \le a_i < d/2$ for every $i$ (where again $d=\sqrt {\log N}$) for an example (worse but somehow conceptually simpler) of size $N/\exp (\log N^{1/4})$?

• June 15, 2016 3:39 pm

Surely that set contains lots of 3APs. Isn’t it Freiman isomorphic to a $d$-dimensional grid? Or have I misunderstood what you are saying?

• June 15, 2016 3:53 pm

oops I was confused with the Z/3Z case of triples summing up to 0. 🙂

December 6, 2011 6:01 pm

The linked paper is an improvement by Behrend (the same one mentioned in gowers’s comment?). I think this is the Salem and Spencer paper: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1078539/pdf/pnas01647-0055.pdf

December 8, 2011 10:38 am

It is unusual for the representation of numbers to be useful. I know of
two brainteaser-like situations where it is.

Firstly:

$\sum_{n=0}^\infty a^{-n}$

But if $a$ is an integer greater than one, there is an unusual way to
compute the series: it is equal to 1.1111… in base $a$. And by doing
longhand division it is easy to check that this is equal to 10/(a-1)
(where 10 is $a$ written in base $a$).
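[A quick exact-arithmetic check of this identity, using Python's Fraction; my sketch, not part of the comment.]

```python
from fractions import Fraction

for a in (2, 3, 10):
    # Partial sums of 1 + 1/a + 1/a^2 + ... approach a/(a-1),
    # which reads as "10/(a-1)" once 10 is interpreted in base a.
    partial = sum(Fraction(1, a**n) for n in range(40))
    # The remaining tail is exactly 1/(a^39 * (a-1)).
    assert Fraction(a, a - 1) - partial == Fraction(1, a**39 * (a - 1))
```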

Secondly:

Suppose you have a biased coin that gives heads 20% of the time and
tails 80% of the time. You can flip this coin to make a draw of a
binary random variable that is 0 or 1 with equal probability. But
can you do so with a bounded number of flips? The answer is no.

Suppose you could do so in at most N flips. Then you could partition
the 2^N possible sequences of flips into two sets of probability 1/2
each. The probability of each set is equal to the sum of the
probabilities of the sequences comprising it, each of which is equal
to 0.2^n 0.8^{N-n} for some n. It is clear this is impossible if you
write everything in base 5: the probability of each sequence has a
finite expansion in base five, so their sum can never be equal to
1/2 = 0.22222222…, which has an infinite expansion.

I wonder if there is a way to generalize either of these to more
general cases?
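[The impossibility can be verified exhaustively for tiny N with exact rational arithmetic; a sketch of my own, not part of the argument above.]

```python
from fractions import Fraction
from itertools import product, combinations

N = 3
p = Fraction(1, 5)  # heads with probability 0.2
seqs = list(product((0, 1), repeat=N))
probs = [p**sum(s) * (1 - p)**(N - sum(s)) for s in seqs]
assert sum(probs) == 1

# No subset of flip sequences has total probability exactly 1/2:
# every subset sum has denominator dividing 5^N, while 1/2 needs a factor of 2.
for r in range(len(probs) + 1):
    for combo in combinations(probs, r):
        assert sum(combo, Fraction(0)) != Fraction(1, 2)
```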

4. December 8, 2011 12:50 pm

Amusingly, the 1942 work of Salem and Spencer is the topic of the very first English language mathematical article that was ever called a “breakthrough” by any MathSciNet reviewer. Was this intentional on the part of Gödel’s Lost Letter and P=NP … or was it simply good taste in choosing “breakthrough” topics?

Specifically, the first-ever-theorem-to-be-called by MathSciNet reviewers a “breakthrough” was Robert Rankin’s improvement and extension of the Salem-Spencer bound as stated in Rankin’s 1960 article “Sets of integers containing not more than a given number of terms in arithmetical progression”, which was reviewed by Tom Apostol (MR0142526).

Since Apostol’s “breakthrough” in first applying the term “breakthrough” to mathematics, more than eight hundred subsequent MathSciNet reviews have followed suit. 😉

In aggregate, what does this rising tide of mathematical breakthroughs mean, and what does it portend for the 21st century? This I take to be among the most natural and important questions that are associated to the GASARCH/Fortnow Computational Complexity weblog’s presently active topic “What is a Breakthrough?”