Why the recent breakthrough is important

 photo sources: UNH and Simons article

Yitang Zhang, of the University of New Hampshire, has apparently proved a finite approximation to the famous Twin Prime Conjecture. This is a result of the first order. After ten days of progressively more detailed news, including today’s article in the New York Times, Zhang’s 56-page preprint has just been released on the Annals of Mathematics journal website. This is the final accepted version of the original submission, which was said in a story in Nature last week to have needed only “a few minor tweaks.”

Today Ken and I want to explain important aspects of the Twin Prime Conjecture.

Recall that the Twin Prime Conjecture states that there are an infinite number of primes ${p}$ and ${q}$ that are as close as possible: ${p-q = 2}$. Well 3 and 2 are closer, but that can only happen once, so the best one can hope for is primes that are within two of each other.

Zhang’s beautiful result is that there are an infinite number of primes ${p}$ and ${q}$ so that ${p-q = 2N}$ and ${N}$ is bounded by an absolute constant. The constant is large—in the tens of millions—but it is a constant. Perhaps we should call these “Cousin Primes,” since they are not within two; I will leave the naming of them to the number theorists. Whatever you call them, his result is a huge improvement to what was previously proved, and is a singular advance.

The proof is long, which is not unexpected. Ken saw the news of the paper’s release earlier today on Terry Tao’s Google+ page about the work, which gives some idea of how the proof goes. There are many links and comments in a post by Peter Woit that also mentions a recently announced proof by Harald Helfgott of the “ternary Goldbach conjecture” that every odd number above 5 is the sum of three primes.

So what can we possibly add to the discussion about Zhang’s breakthrough? Nothing on the proof right now. Something, however, on why the Twin Prime Conjecture can be really useful. This is all from a personal point of view, but one that I hope you will enjoy. Let’s first take a quick look at what was known before his work, and then discuss what it may be useful for.

## Before Zhang

One measure of the density of the primes is that the summation

$\displaystyle \sum_{p} \frac{1}{p}$

does not converge. That is, the sum

$\displaystyle \sum_{p

tends to infinity as ${x}$ tends to infinity. The growth is slow, but the sum does diverge. In 1915, Viggo Brun used sieve methods to prove that twin primes were rarer in a precise sense: the summation over twin primes

$\displaystyle \sum_{p} \frac{1}{p}$

converges. Indeed his result can be improved to show that the number of twin primes less that ${x}$ is bounded above by

$\displaystyle O\left(\frac{x}{(\log x)^{2}}\right).$

Using heuristic arguments, Godfrey Hardy and John Littlewood guessed that not only are there an infinite number of twin primes, but that there density is close to what a “random” model would predict. Let ${\pi_{2}(x)}$ be the number of twin primes less than ${x}$—recall ${\pi(x)}$ is used to denote the number of primes less than ${x}$—then the Hardy-Littlewood Conjecture is that

$\displaystyle \pi_{2}(x) \approx C\left(\frac{x}{(\log x)^{2}}\right),$

for an explicit constant

$\displaystyle C = 0.6601618158468695739278121100145\dots$

Tao is on record as saying that certain approaches based on sieve theory cannot resolve the Twin Prime conjecture—see this for a short discussion. Mark Lewko, in a comment to a MathOverflow thread on Zhang’s paper, indicates that its mechanism alone cannot reduce the gap under ${16}$, and it does not circumvent a more general obstacle to gaps below ${6}$. However, even if Zhang’s new techniques do not overcome such general barriers, at least they push against them with a lot more oomph.

## Another Problem

Years ago I worked on a problem that had nothing directly to do with the Twin Prime Conjecture. The question is a fundamental one about the complexity of Boolean functions. It is not classic, has not been open for a very long time, and could be trivial. But like many problems about Boolean functions it turned out to fight back hard, and the best we could do was to make a small dent in the problem.

The work was joint with Mihail Kolountzakis, Evangelos Markakis, Aranyak Mehta, and Nisheeth Vishnoi. See the paper for details. Indeed while I helped start the research with Markakis, Mehta, and Vishnoi, without Kolountzakis the work would have never been completed. Our result was then greatly improved by Amir Shpilka and Avishay Tal, as we covered here.

The problem is concretely about Boolean functions ${f}$ of ${k}$ variables, and seems not to involve prime numbers at all. For any subset ${S}$ of the coordinates, the corresponding Fourier coefficient is given by

$\displaystyle \hat{f}(S) = \frac{1}{2^k} \sum_{x \in \mathbb{Z}_2^k} f(x)\chi_S(x)$

where ${\chi_S(x)}$ is ${-1}$ if ${\sum_{i \in S} x_i}$ is odd, and ${+1}$ otherwise. The problem is:

What is the smallest value ${\tau(k)}$ such that for every symmetric 0-1 function ${f(x_1,\dots,x_k)}$ on ${\mathbb{Z}_2^k}$ that is not affine linear—by symmetry, this means neither constant nor a parity function—some ${\hat{f}(S)}$ with ${1 \leq |S| \leq \tau(k)}$ does not vanish?

## The Prime Gap Trick

A heuristic that I have used before is this: When trying to prove some theorem in number theory, assume any reasonable property that you need about primes. Either you will at least get a conditional theorem, or you might later be able to weaken the assumption you made. The latter is what happened to us. Or you might be luckier and get your theorem to follow both from the assumption and its negation, so that you don’t need it at all. I noted once how Littlewood did this with the Riemann Hypothesis—incidentally that post was about a theorem of Ravi Kannan, and I am attending a birthday workshop in his honor later this week.

In this case we reduced the problem to showing that a certain integer-valued polynomial is constant over the set ${\{0,1,\dots,k\}}$. Then we expressed the connection in the paper in these terms:

First, we show that ${P}$ is constant over the union of two small intervals ${\{0, ..., t\} \cup \{k - t, ..., k\}}$. This is obtained by looking at ${P}$ modulo carefully chosen prime numbers. One way to prove this (at least infinitely often) would be to assume the twin primes conjecture… We manage to replace [this] by choosing four different primes in a more involved manner…

Thus we actually did something else besides use twin primes, but this is how we got the idea. Moreover, Shpilka and Tal used gaps between consecutive primes in a different way, obtaining bounds in terms of the largest such gap in the interval ${1\dots k}$, which is known to be ${O(k^{0.525...})}$. If we can now get our hands on enough cases where the gaps are small, maybe we can improve the estimates further. Why is it useful to have primes with small gaps?

We have covered the use of the Chinese Remainder Theorem for analytical results before. Usually for complexity-theoretic applications such as this one, we want the primes themselves to be small—and don’t mind having a bunch of primes. For logspace we can work with polynomially-many primes in a sequential manner, so long as each needs only ${O(\log n)}$ bits to store.

When we don’t need size to be so constrained, however, it can be more useful to have the gap between primes ${p,p+c}$ in the representation be small. Then if we know in advance that values ${m = P(x)}$ are below ${p^2/c}$, we know that the values ${m \bmod{p}}$ and ${m \bmod{p+c}}$ have to be close in an absolute sense. In particular, they cannot be closer than ${c}$ to each other unless ${m < p}$, making them equal—and in our case we would get them all to be zero. For higher ranges of ${m}$ one retains a lot of this control. The larger ${p}$ and the smaller ${c}$, the bigger the effective range.

We will hence be further interested in how dense the pairs of “cousin” primes must be, and how efficiently they can be generated. Anytime there is a breakthrough, it is time to revisit old ideas and see whether they too can profit.

## Open Problems

How far can the gap between consecutive primes, for infinitely many such pairs, be reduced? Do this and Helfgott’s result on Goldbach herald a more general breakthrough in number theory?

[updated link and info about paper in opening paragraph; sourced photo and linked to Simons Foundation article]

1. May 21, 2013 8:17 pm

This is the 50th “Pip” post—the reasons for this to be our joint-author name are here. Actually Dick wrote almost all of this except parts of the intro noting today’s release of the paper and the last few paragraphs—the main reason I’m posting is that he and Judith are enjoying their anniversary this evening, while Ken is out fighting Androids.

May 21, 2013 11:01 pm

Consider a quantum dynamical system of $n$ qudits, each of dimension $d$, comprising a state-space $\mathbb{C}^{n^d}$. Fix $d$ to be a small number (e.g., $d \in (2,3,4,5)$), and consider the limit $n\gg d$ (physically, the limit of a many-qudit computer).

The answer to the following question is sought (as a warm-up to a question for TCS StackExchange):

What is known (rigorously) regarding the asymptotic probability $P(n,d)$ that $\exists k\ge0: \text{mod}(n^d,dn-n+1) = d^k$, for fixed $d$ and large $n$?

Remark: numerical experiments readily establish that the naive guess $P(n,d) \simeq \mathcal{O}(1/n)$ is a gross underestimate; empirically it seems that $P(n,d) \simeq \mathcal{O}(1/\text{log}\,n)$ or possibly even $P(n,d) \simeq \mathcal{O}(1/\text{log}\,\text{log}\,n)$. The question asked is what is provably known regarding the large-$n$ asymptotics of $P(n,d)$?

The question-asked connects closely to the celebrated (among number theorists!) Cunningham Project — which is humanity’s longest-running computational research program — and it relates specifically to “classical” (long-known) results regarding the algebraic factors, Aurifeuillian factors, and “other” factors of Cunningham numbers.

What is the connexion to quantum computing? That is the focus of a planned TCS StackExchange question (which is why these references are sought).

The Moral of the Story  Yes indeed, twin primes (and many other factoring theorems) are exceedingly useful!

• May 22, 2013 10:16 am

Typo — in the above question, $\text{mod}(n^d,dn-n+1)$ should have been (obviously) $\text{mod}(d^n,dn-n+1)$.

E.g., for $d=2$ and $n\sim 10^8$ — that is, a dynamical system of $n\sim 10^8$ qubits — it turns out that $P({\sim}10^8,2) \sim 0.1$, which is $\mathcal{O}(n)$ larger than the naive “factors are randomly distributed” estimate $P({\sim10^8},2) \sim 10^{-8}$.

May 22, 2013 12:28 am

Nice result. Unfortunately the paper is behind a pay-wall. Please supply a free version. Nobody should pay to read advances in pure science. Ever.

• May 22, 2013 8:30 am

Agreed—your request is hereby redirected to the “parties of the first and second part…”

May 22, 2013 10:14 am

What paywall? I thought the PDFs on the Annals of Mathematics are open access.

May 23, 2013 1:34 am

Well, for me it says the following:

We have not been able to recognize your IP address as one of the subscribers to the Annals of Mathematics. Please note that online access to PDF copies of articles is by subscription only.

If you would like to purchase a subscription please refer to our subscription webpage.”

May 24, 2013 9:19 pm

_Annals_ was once an arXiv overlay journal, but this is no longer the case. See Scott Morrison’s blog post, http://sbseminar.wordpress.com/2012/02/01/journals-and-the-arxiv/, for more information.

Apparently journals need revenue to operate. In the coming brave new world where no one will have to pay to read about scientific advances, authors are going to have to pay to write about them.

May 27, 2013 5:18 am

Well, I disagree with you Xbad. Almost every respectable scientist, in TCS at least, puts his or hers publications online for free these days. So in the very near future no one will pay to read pure science. See e.g., many new open access journals in TCS. They don’t have revenue and they do operate.

May 27, 2013 7:32 pm

I put my own publications (math and physics) online for free as well, and I hope that Zhang will decide to do the same. But there are very troubling signs that science publishing as a whole is rushing headlong towards the author-pays model. This includes some ventures promoted by very prominent mathematicians.

• May 31, 2013 7:14 am

A bit angry: “Nobody should pay to read advances in pure science. Ever.”

Why should people pay tens of thousands (or even hundred thousands) dollars for getting training to make them capable of reading pure science, and then reading the fruits of pure science should be free (which mean that the marginal expenses for administrating the journals be paid by the public or by the scientists themselves?)

May 22, 2013 9:01 am

Zhang’s breakthrough proves that there is at least one $k$ such that $p_{n+1} – p_n = 2k$ holds for an infinite number of n.

As a layperson to this area I am curious if there is a general feeling in the number theory community about which of the following possibilities is likely to be true?

The above statement holds for every $k$

The above statement holds for an infinite number of $k$

The above statement only holds for a finite number of $k$, so Zhang’s result is close to being the best possible outcome.

The above statement only holds for a finite number of $k$ one of which is $1$, the twin-prime conjecture

Results along these lines could help further unravel the tapestry of the distribution of prime numbers.

May 22, 2013 6:20 pm

It is usually suspected that the statement holds for every k. This is called Polignac’s conjecture.

May 24, 2013 11:33 am

Thanks Anon! Your response is much appreciated. The conjecture you cite, if proven, would be an astonishing property.

June 20, 2016 1:19 pm

Wait, isn’t it OBVIOUS that for any two prime numbers>2, p1 – p2 will be 2N?? All the primes are odd, so duh?

5. May 22, 2013 12:05 pm

a stellar result. way cool! again shows some of the deep connections between TCS & number theory which Ive always seen as fundamental & imho is sure to continue to strengthen in strong & surprising ways. and the deep/mysterious links between primes & TCS are probably still largely not mapped out.

6. May 24, 2013 11:06 pm

How about ‘Affine Primes’, the affinity being increased as the upper-bound decreases.

Guy

• May 28, 2013 10:10 am

… an affine prime was had by all …

7. November 5, 2013 10:16 pm

tribute to zhang quoting lots of recent cyber refs/links. also, the isolated outsider/”academic grind” angle. on how zhang worked in isolation almost entirely outside academia.

December 26, 2014 3:16 am

There are infinitely many primes P and P + 2 is a prime number
The twin prime formula Using the criterion of prime numbers, we can get the following conclusions:

“if the natural numbers q and q+2 cannot be any less than\sqrt{q+2} prime divisibility, then q and q+2 is prime.。
This is because a natural numbern is prime if and only if it can’t be any less than other prime divisibility on\sqrt{n} .
Using mathematical language expresses the conclusions above, is:
There is a group of natural numberb_{1}, b_{2} \cdot , b_{k} making

Thep_{1},p_{2},\dots,p_{k} from small to large order during the first k prime numbers: 2, 3, 5,…. And meet。

\forall 1 \le i \le k, \ \ 0 < b_{i} < p_{i}, \ b_{i} \neq 0, \ b_{i} \neq p_{i} – 2.
Natural number q so obtained if meet the q<p^{2}_{K+1}-2，qandq+2 ，

are a pair of twin primes. We can think of (1) equivalence of content type conversion become congruence equations。

Because of (2).p_{1},p_{2},…,p_{k} are all prime numbers, so two two coprime,based on the Chinese Remainder Theorem (China remainder theorem), for a givenb_{1}, b_{2} \cdot , b_{k}, (2) is only a less than positive integerp_{1} p_{2} \cdots p_{k} solution。.

Example
For example, k=1,q=2m_{1}+1，solutionq=3, 5 。 Since5<3^2-2 ，

so that 3 and3+2 ；5 and5+2 are the twin prime. Thus the interval(3, 3^2) in the
whole of prime twins.。
For instance

k=2, equation of q=2m_{1}+1=3m_{2}+2， solutionq=5, 11, 17。Since17<5^2-2，
so the11 with11+2 ；17 and 17+2 are the twin primes.
Because this isall possible b_{1}, b_{2} \cdot , b_{k} value, so the interval (5, 5^2) are of prime twins.。
k=3 5m_{3}+1 5m_{3}+2 5m_{3}+4
q=2m_{1}+1=3m_{2}+2= 11,41 17 29
Because this is all possibleb_{1}, b_{2} \cdot , b_{k} value, so the interva(7, 7^2)are of prime twins。

k=4 7m_{4}+1 7m_{4}+2 7m_{4}+3 7m_{4}+4 7m_{4}+6
q=2m_{1}+1=3m_{2}+2=5m_{3}+1= 71 191 101 11 41
q=2m_{1}+1=3m_{2}+2=5m_{3}+2= 197 107 17 137 167
q=2m_{1}+1=3m_{2}+2=5m_{3}+4= 29 149 59 179 209
Because this is all possibleb_{1}, b_{2} \cdot , b_{k} value, so the interva(11, 11^2)are of prime twins。 　8 less than the 121-2 of the solution。

Copy this to be a not leak to find all the prime twins within an arbitrary largenumber of. For all the possibleb_{1}}, b_{2} … , b_{k} value, (1) and (2) in the（p_{1}）（p_{2}）(p_{3})…（p_{k}）range,， with（p_{1}-1）（p_{2}-2）(p_{3}-2)…（p_{k}-2）…..（3）solution。

Conclusion
The twin prime conjecture is (1) and (2) type, value arbitrarily large when a q<p^{2}_{K+1}-2 solution in K.