How to avoid the pain of estimating tough sums

 Cricketing source

Andrew Granville is a number theorist, who has written—besides his own terrific research—some beautiful expository papers, especially on analytic number theory.

Today Ken and I wish to talk about his survey paper earlier this year on the size of gaps between consecutive primes.

The paper in question is here. It is a discussion of the brilliant breakthrough work of Yitang Zhang, which almost solves the famous twin-prime conjecture. You probably know that Zhang proved that gaps between consecutive primes are infinitely often bounded by an absolute constant. His constant initially was huge, but using his and additional insights it is now at most ${600}$ and plans are known that might cut it to ${6}$.

As Andrew says in his paper’s introduction:

To Yitang Zhang, for showing that one can, no matter what

We believe we need the same attitude to make progress on some of our “untouchable” problems. Perhaps there is some budding complexity theorist, who is making ends meet at a Subway™ sandwich shop, and while solving packing problems between rolls in real time is finding the insights that can unlock the door to some of our open problems. Could one of these be ready to fall?

• Is NP closed under complement?

• What is the power of ${\bmod \ 6}$ polynomials?

• Can we at least prove SAT is not computable in linear time?
• And so on …

Who knows.

## Monotone Counting

Throughout mathematics—especially number theory—computing the number of objects is of great importance. Sometimes we can count the exact number of objects. For example, it is long known that there are exactly ${n^{n-2}}$ labeled trees, thanks to Arthur Cayley.

The number of labeled planar graphs is another story—no exact formula is known. The current best estimate is an asymptotic one for the number of such graphs on ${n}$ vertices:

$\displaystyle g\cdot n^{-7/2}\cdot \gamma^n\cdot n!,$

where ${\gamma\approx 27.22687}$ and ${g\approx 0.43\times 10^{-5}}$.

Thanks to a beautiful result of the late Larry Stockmeyer we know that estimating the number of objects from a large set may be hard, but is not too hard, in the sense of complexity theory. He showed that

Theorem 1 For any ${\epsilon > 0}$ one can design an ${n^{O(1)}}$-time randomized algorithm that takes an NP-oracle ${H}$, a predicate ${R(x,y)}$ decidable in ${n^{O(1)}}$-time (where ${n = |x|}$) and an input ${x}$ and outputs a number ${A_x}$ such that with probability at least ${1 - 2^{-n}}$, the true number of ${y}$ such that ${R(x,y)}$ holds is between ${(1-\epsilon)A_x}$ and ${(1+\epsilon)A_x.}$

The predicate ${R(x,y)}$ is treated as a black box, though it needs to be evaluated in polynomial time in order for the algorithm to run in polynomial time. The algorithm has a non-random version that runs within the third level of the polynomial hierarchy.

We won’t state the following formally but Larry’s method can be extended to compute a sum

$\displaystyle \sum_{k=1}^{N} a_{k}$

to within a multiplicative error without an NP-oracle, provided the map ${k \rightarrow a_{k}}$ is computable in polynomial time, each ${a_{k} \ge 0}$, and the summands are not sparse and sufficiently regular. Simply think of ${N = 2^n}$, identify numbers with strings so that ${k}$ runs over ${\{0,1\}^n}$, and define ${R(k,y)}$ to hold if ${1 \leq y \leq a_k}$.

## Cancellation

Larry’s method fails when a sum has both negative and positive terms; that is, when there is potential cancellation. Consider a sum

$\displaystyle S = \sum_{k=1}^{N} a_{k}.$

Even if the terms ${a_{k}}$ are restricted to be ${+1}$ and ${-1}$, his method does not work. Rewrite the sum as

$\displaystyle P=\sum_{a_{k}>0} a_{k} \text{ and } Q =\sum_{a_{k}<0}a_{k}.$

Then knowing ${P}$ and ${Q}$ approximately does not yield a good approximation to the whole sum ${S}$. We could have ${P = A + C}$ and ${Q = B - C}$ where ${C}$ is a large term that cancels so that the sum is ${A+C +B -C = A+B}$. So the cancellation could make the lower order sums ${A}$ and ${B}$ dominate.

This happens throughout mathematics, especially in number theory. It also happens in complexity theory, for example in the simulation of quantum computations. For every language ${L}$ in BQP there are Stockmeyer-style predicates ${R(x,y)}$ and ${S(x,y)}$ such that ${h = |y|}$ equals the number of Hadamard gates in poly-size quantum circuits recognizing ${L}$ whose acceptance amplitude is given by

$\displaystyle \frac{|\{y: R(x,y)\}| - |\{y: S(x,y)\}|}{\sqrt{2^h}}.$

Although the individual sums can be as high as ${2^h}$ their difference “miraculously” stays within ${\sqrt{2^h}}$, and the former is never less than the latter—no absolute value bars are needed on the difference. See this or the last main chapter of our quantum algorithms book for details. Larry’s algorithm can get you a ${(1 + \epsilon)}$ approximation of both terms, but precisely because the difference stays so small, it does not help you approximate the probability. This failing is also why the algorithm doesn’t by itself place BQP within the hierarchy.

## When Only Positivity is Needed

What struck Ken and me first is that the basic technique used by Yitang Zhang does not need to estimate a sum, only to prove that it is positive. This in turn is needed only to conclude that some term is positive. Thus the initial task is lazier than doing an estimate, though estimates come in later.

The neat and basic idea—by Daniel Goldston, János Pintz, and Cem Yíldírím in a 2009 paper that was surveyed in 2007—uses indicator terms of the form

$\displaystyle t_k = \ell(k+b_1) + \ell(k+b_2) + \cdots \ell(k+b_r) - \ln(3N),$

where ${\ell(p)}$ is defined to be ${\ln(p)}$ if ${p}$ is prime and is ${0}$ otherwise. Here ${k}$ runs from ${N}$ to ${2N}$ and ${0 < b_1 < b_2 < \cdots < b_r < N}$. If the term ${t_k}$ is positive then at least two of the elements ${\ell(k+b_i)}$ and ${\ell(k+b_j)}$ must be prime, which means that the gap between them is at most the fixed value ${g = b_r - b_1}$. Doing this for infinitely many ${N}$ yields Zhang’s conclusion that infinitely many pairs of primes have gap at most ${g}$. Can it really be this simple?

The strategy is to minimize ${g}$, but it needs to provide a way to get a handle on the ${\ell(\cdot)}$ function. This needs forming the sum

$\displaystyle S = \sum_{k=N}^{2N} w_k t_k,$

where the ${w_k}$ are freely choosable non-negative real weights. The analysis needs the ${b_i}$ to be chosen so that for every prime ${p \leq r}$ there exists ${k < p}$ such that ${p}$ does not divide any of ${k+b_1,\dots,k+b_r}$. This defines the ordered tuple ${(b_1,\dots,b_r)}$ as admissible and enters the territory of the famous conjecture by Godfrey Hardy and John Littlewood that there exist infinitely many ${k}$ such that ${k+b_1,\dots,k+b_r}$ are all prime. This has not been proven for any admissible tuple—of course the tuple ${(0,2)}$ gives the Twin Prime conjecture—but thanks to Zhang and successors we know it holds for ${(0,b)}$ for some ${b \leq 600}$.

The analysis also needs to decouple ${N}$ from the ${b_i}$—originally there was a dependence which only enabled Goldston, Pintz, and Yíldírím to prove cases where the gaps grow relatively slowly. Andrew’s survey goes far into details of how the sum ${S = \sum w_k t_k}$ and its use of the ${\ell(\cdot)}$ function must be expanded into further sums using arithmetic progressions and the Möbius function in order for techniques of analysis to attack it. The need for estimation heightens the cancellation problem, which he emphasizes throughout his article. The issue is, as he states, of central importance in number theory:

The two sums … are not easy to evaluate: The use of the Möbius function leads to many terms being positive, and many negative, so that there is a lot of cancellation. There are several techniques in analytic number theory that allow one to get accurate estimates for such sums, two more analytic and the other more combinatorial. We will discuss them all.

As usual see his lovely article for further details. Our hope is that they could one day be used to help us with our cancellation problems.

## Tricks

I thought that we might look at two other strategies that can be used in complexity theory, and that might have some more general applications.

${\bullet }$ Volume trick

Ben Cousins and Santosh Vempala recently wrote an intricate paper involving integrals that compute volumes. At its heart however is a simple idea. This is represent a volume ${V}$ in the form

$\displaystyle V = \frac{A_{1}}{A_{2}} \cdot \frac{A_{2}}{A_{3}} \cdots \frac{A_{m-1}}{A_{m}},$

where ${A_m}$ is the hard quantity that we really want to compute. The trick is to select the other ${A_{i}}$‘s in a clever way so that each ratio is not hard to approximate. Yet the cancellation yields that

$\displaystyle V = A_{1} / A_{m}.$

If ${A_{1}}$ is simple, and if we have another way to estimate ${V}$, then we get an approximation for the hard to compute ${A_{m}}$.

${\bullet }$ Sampling trick

Let ${S = \sum_{k} a_{k}}$ and ${R = \sum_{k} a_{k}r(k) }$ where ${r(k)}$ is a random function with expectation ${p>0}$. Then we know that ${S}$ is equal to ${E[R]/p}$. Note that this does not need the values ${r(k)}$ to be independent.

Can we use this idea to actually compute a tough sum? The sums we have in mind are inner products

$\displaystyle \langle X,Y \rangle = \sum_k \bar{x}_k y_k,$

where the vectors ${X}$ and ${Y}$ are exponentially long but have useful properties. They might be succinct or might carry a promised condition such as we discussed in connection with an identity proved by Joseph Lagrange.

Both cases arise in problems of simulating quantum computations. Our hope is that the regularities in these cases enable one to restrict the ways in which cancellations can occur. A clever choice of dependence in the ${r(k)}$ values then might interact with these restrictions. Well, at this point it is just an idea. Surveying things that boil down to this kind of idea could be another post, but for now we can invite you the readers to comment on your favorite examples.

## Open Problems

Can we delineate methods for dealing with cancellations, also when we don’t need to estimate a sum closely but only prove it is positive?

[fixed omission of NP-oracle for Stockmeyer randomized counting]

1. August 28, 2015 1:20 am

No need to bitch about, “some” SATs are linearly satisfiable, not only 2-SAT but also all renamable-Horn; http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.36.1930

• August 31, 2015 11:29 am

Thanks very much—thinking backward from the mentioned result with Dick, I forgot about the NP-oracle condition. We’re still not sure of exactly how best to put it.