Making Primes More Random
Ben Green grew up about 100 meters from the house in Bristol where Paul Dirac was born and lived until age 10. He has followed Dirac to Cambridge University as a Fellow of Trinity College, which adjoins St. John’s College where Dirac studied and taught. Both also held appointments at Princeton. Green is known for many things including, proving with Terence Tao that for every natural number there exist -many primes in arithmetic progression. That is, there is a prime and an integer such that for all , the number is also prime.
Today I want to talk about mathematical tricks—not magic, not games, but small insights that are very useful. Tricks.
There are two kinds of tricks. One is a device for understanding a theorem, the other is a shortcut to getting it in the first place. The Dirac belt trick is the former kind: it illustrates physical systems that are preserved under two turns of a circle—a 720-degree rotation—but not under one turn. The Dirac delta function is the latter kind: It allows you to postulate a function that is zero except at one point, such that , and also whose Fourier transform is the constant-1 function. This can speed up calculations, and certain “meta-theorems” often remove the need to do anything more to make a proof involving such calculations rigorous.
Green’s co-author Tao maintains tricks as a category on his blog. Both of them assisted Tim Gowers in setting up the Tricki Wiki for collecting and discussing mathematical tricks, along with Alex Frolkin and Olof Sisask. In a followup, Gowers asked whether it needed more examples of research-level tricks compared to undergraduate-level tricks. Let’s talk a little more about what tricks are.
I really like “tricks.” One recurrent theme in mathematics is that big results are easy to find, usually are well known—especially to experts—and beginners can find them in almost any textbook. This is true whether the area is graph theory, number theory, or complexity theory; real analysis, complex analysis, or functional analysis; algebraic geometry, differential geometry, or topology. How did the last manage to avoid being called “-theory” or “-analysis”?
A trick is different from an ordinary “tool” in that it has some element of surprise. It achieves something by a way that is at first counter-intuitive. That’s what makes it fun. Magic tricks without surprise, without the watcher’s expectation being deceived, are nothing.
The trouble with tricks is that they often have no name, or an un-descriptive name, as with the “W-trick” below, and consequently are hard to find. The tricks often are not written down, or if they are, they are hard to find even if you have a general idea of what to search for. A trick may be used in a long proof, causing a causal reader to miss it. Or even if you see them, you may not internalize them and add them to your own bag. Another way that tricks are learned is through oral conversations, lectures, and technical talks. When I teach a class I try to highlight the tricks—at least I try.
The Tricki Wiki was an attempt to make tricks, mathematical tricks, better documented. I thought it was, and still is, a brilliant idea; yet it seems to not have become as popular as it should have. Maybe in time it will.
For now let me explain one trick that is simple, useful, and powerful: the W-trick. Then Ken will ask whether research-level power can be obtained by extending an old and simple trick of arithmetic.
The trouble with the prime numbers is that while they embody considerable randomness, they are biased in many obvious ways. For one they are all odd, except the poor number . Even among the odd numbers the bias is progressive: only one prime is divisible by , and so on. This messes up lots of simple arguments, and makes some proofs about primes extra complicated. A second trouble is that their density is close-but-not-quite constant: the number of primes below goes as .
Enter the W-trick. Consider the mapping
where is the product of all the primes less than some threshold and is relatively prime to . Often is just fine.
The numbers that primes map to are called “W-tricked primes.” They behave more like “pseudorandom numbers” than primes do. W-tricked primes are well distributed modulo provided is greater than or equal to all the prime factors of .
The key to using W-tricked primes in the Green-Tao theorem on arithmetic progressions is that such progressions are preserved by this linear map: if the W-tricked primes have a progression of length , for example, then so do the real primes.
Recently I have used this same idea to show that W-tricked numbers are more likely to be relatively prime: have no common factor in common. If and are chosen randomly in a long interval, then it is well known that they will be relatively prime about of the time, about . But W-tricked numbers will be relatively prime almost all the time: as goes to infinity the probably of a common factor goes to zero.
Duality is an umbrella term for a host of tricks. For example, the proof of the Four-Color Map Theorem begins by forming the dual graph in which every “country” becomes a vertex, and two vertices are connected if their countries share a boundary. Repeating the process would restore the original map, which makes the graph strictly a dual. In linear programming and other optimization techniques the presence of dual programs is often important. The Fourier transform gives a dual form to any function, and the Hadamard transform provides an important dual basis in quantum computation. Change-of-basis is another umbrella term for many tricks in linear algebra—or maybe many of them are just tools.
A kind of duality may operate with the Green-Tao theorem. Instead of the set of primes being “tricked up” to be denser, one can also “trick down” the integers into a suitably pseudorandom subset . Whereas the primes have density approaching zero in , their density in may stay above some constant. The trick is set up so that one can carry over Szemeredi’s Theorem that every subset of of positive density has arbitrarily long arithmetic progressions, from to . This view was developed in a paper by Omer Reingold, Luca Trevisan, Madhur Tulsiani, and Salil Vadhan, people we know well in complexity theory. Gowers proved something similar independently.
Often the dual is “the same” as the original but just viewed in a mirror. This may be the case here, since this is actually proved by enlarging to where stays having positive density in . The W-trick is a concrete instance of the latter. Still, these papers show that the dual view of the Green-Tao “transference principle” has its own uses, all derived from tools of pseudorandomness developed in complexity theory. Cool and worth deeper study.
The Square Trick
We have previously discussed an old trick known to the Babylonians for reducing any integer multiplication to the lookup of three squares, or alternatively two squares:
Ken uses this as a first example of a Turing reduction that needs more than one oracle call. In the post I stopped at consulting a table of large enough squares, but Ken wants to go further with simulating the squaring part. The hitch is in the denominators of the fractions—if we could pretend they don’t exist, we’d be home.
The reason is another trick that allows replacing squaring by shifting. Regard the finite field as a vector space of dimension over . A vector in this space is normal if its iterated squares
are all linearly independent, and hence form a basis of the vector space. Then for any field element represented over this basis, the left-shift (wrapping the first bit around to be the last) represents . This trick is exploited to implement codes that do frequent squaring.
Alas this only works in characteristic 2, so that the final dividing by 2 or 4 in the formulas for is like dividing by zero. Over the corresponding concept involves -th powers of , not squares. The vanishing of multiples of the characteristic is needed anyway so that expressions of the form leave only the terms in , which is what gives the shifting behavior for to begin with. Unfortunately, the formula above for becomes undefined in characteristic 2. Ken still wonders, though, whether there is some consistent way to postulate it, so that—as with the Dirac delta function—it can be used in computations.
The general question is how do we document and pass on tricks to others? This is an important issue especially for new researchers to an area of mathematics.
A specific question that I think arises is, can we create a “non-linear” W-trick? The idea would be to build a nonlinear mapping that allowed us to map primes to another set and use the existence of progressions there to conclude something useful about primes themselves? Is there any way to create a W-tricked set of primes that allows progress on open questions in the structure of primes? I am sure that Green and Tao, among many others, have thought about this. But perhaps you might have a new idea here?