Cook’s Class Contains Pi
Cook’s class and the BBP algorithm for Pi
Steve Cook won the Turing Award for his famous paper on SAT that started the P=NP problem. He has done many other wonderful things in many areas of computational complexity; I especially love his work on proof systems. I am proud of my Cook number.
Cook is also famous for having a complexity class named after him: SC. Steve’s Class, SC, is the set of languages accepted in polynomial time and poly-logarithmic space. Think of SC as the set of languages that can be recognized efficiently, but with very small space. Today, as the problems we study begin to contain vast amounts of data, and as the size of the objects that we wish to understand grows, Steve’s Class has become even more important.
When I was an assistant professor at Yale University, David Dobkin and I would often argue about which kind of mathematical object is best to have named after you. For example, would you like a constant (Euler), an inequality (Jensen), or an equation (Schrödinger) named for you, and so on? We finally decided the best was a “universe”. If you do not know the universe we were thinking of, it is called the Herbrand Universe, named after the logician Jacques Herbrand. Having a complexity class named after you, like Steve has, is still pretty cool.
Computing $\pi$ fast involves two ideas: first, a fast converging summation, and second, the use of efficient algorithms for high precision arithmetic operations. For example, the methods use series like the following, due to S. Ramanujan:

$$\frac{1}{\pi} = \frac{2\sqrt{2}}{9801} \sum_{k=0}^{\infty} \frac{(4k)!\,(1103 + 26390k)}{(k!)^{4}\, 396^{4k}}.$$
These and other remarkable formulas are the basis of modern computations of $\pi$. You might ask: how does one find such formulas? I have no idea; that is definitely not my area of expertise.
The problem with these fast algorithms is not speed; they run in time almost linear in the number of bits of $\pi$ that you want. The problem is the amount of space that they require. Prior to 1995, algorithms used linear space. So if you wanted to compute the first trillion bits of $\pi$, the computation cost was reasonable; however, the space cost was huge.
In 1995, David Bailey, Peter Borwein, and Simon Plouffe discovered a beautiful algorithm that changed all this. Their algorithm–called BBP–computes the $n^{th}$ bit of $\pi$ without first computing any of the previous bits. This went against the conventional wisdom. Before their work most thought that you needed to compute the earlier bits of $\pi$ to get the later bits. They shattered the conventional wisdom. If you have read any other parts of my blog, then you know this is one of my main ideas: we all guess wrong sometimes. What about P=NP?
From the viewpoint of complexity theory they showed that the problem of computing the $n^{th}$ bit of $\pi$ is in SC. The BBP algorithm has an interesting history and some controversy surrounding it. See this for a view from Plouffe about it. I only point this out to show you that the story behind results can be quite interesting. Indeed.
The BBP Algorithm
There are two things about the BBP algorithm that I find intriguing and want to share with you:
- The algorithm uses the exponential trick that we sometimes take for granted as computer scientists;
- The algorithm has a bug. It is not really an algorithm. Well, it is almost an algorithm.
The first point is that the algorithm uses the following:
Lemma 1 The value of $2^{n}$ modulo $m$ can be computed in time polynomial in the number of bits in $n$ and $m$.
This well known fact–to computer scientists–is of central importance to many areas. It is used in cryptography, in number theory, and elsewhere. It is interesting that they felt the need to explain the exponential trick in some detail. For example, they not only give the detailed code of this trick, they give the example of computing a $17^{th}$ power by repeated squaring,

$$x^{17} = ((((x^{2})^{2})^{2})^{2})\cdot x,$$

“thus producing the result in only 5 multiplications, instead of the usual 16”. I would like to talk more about this important idea in a later post.
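Here is a minimal sketch of the trick in Python. The function name and the sanity check are mine; Python’s built-in three-argument `pow` implements the same idea.

```python
def pow_mod(b, e, m):
    """Compute b**e mod m by repeated squaring.

    Walks the binary expansion of the exponent, so it uses
    O(log e) multiplications instead of e - 1, and every
    intermediate value stays below m**2.
    """
    result = 1 % m
    b %= m
    while e > 0:
        if e & 1:                  # current low bit of e is 1
            result = (result * b) % m
        b = (b * b) % m            # square the base
        e >>= 1                    # move to the next bit
    return result

# Sanity check against the built-in three-argument pow:
assert pow_mod(3, 17, 1000) == pow(3, 17, 1000)   # 3**17 = 129140163
```

The space point is what matters for BBP: the answer mod $m$ is found without ever writing down the huge number $b^{e}$ itself.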
Let’s now turn to finally explaining how their algorithm works. Rather than compute the bits of $\pi$, it is easier to explain the algorithm for the simpler case of the number $\ln 2$. We first note that

$$\ln 2 = \sum_{k=1}^{\infty} \frac{1}{k\,2^{k}}$$

is a well known summation–to those who know such sums. Their idea is this: to compute the $(d+1)^{st}$ bit of a number $\alpha$ one needs only to compute the fractional part of $2^{d}\alpha$. Denote the fractional part of $x$ by $\{x\}$. Suppose that $\alpha = \ln 2$. Now let

$$2^{d}\ln 2 = \sum_{k=1}^{\infty} \frac{2^{d-k}}{k} = n + f,$$

where $n$ is an integer and $0 \le f < 1$. Then, $f = \{2^{d}\ln 2\}$ is equal to

$$\left\{ \sum_{k=1}^{d} \frac{2^{d-k} \bmod k}{k} \;+\; \sum_{k=d+1}^{\infty} \frac{2^{d-k}}{k} \right\},$$

since all the other terms are integers: we use that $2^{d-k} = qk + (2^{d-k} \bmod k)$ for an integer $q$. Suppose that $f \ge 1/2$; then the $(d+1)^{st}$ bit is $1$. If $f < 1/2$, then the bit is $0$. Here the precision of the calculation, call it $m$, will be selected later on.

Thus, to find the $(d+1)^{st}$ bit of $\ln 2$ we need only compute the fractional part of:

$$\sum_{k=1}^{d} \frac{2^{d-k} \bmod k}{k} \;+\; \sum_{k=d+1}^{\infty} \frac{2^{d-k}}{k}.$$

Call the first sum $S_{1}$ and the second $S_{2}$. The key insight is that each of these sums can be computed to high precision in polynomial time and small space.
The sum $S_{1}$ is equal to

$$\sum_{k=1}^{d} \frac{r_{k}}{k}$$

for $r_{k} = 2^{d-k} \bmod k$. Note, this uses the fact that

$$\{x + j\} = \{x\} \text{ for any integer } j.$$

Clearly, this can be computed in polynomial time to $m$ bits of precision in space polynomial in $m$ and $\log d$. Note, this is where they need to use the exponential trick: it is important that the values of $2^{d-k} \bmod k$ are small and easy to compute. The sum $S_{2}$ is even easier to compute. It is almost a geometric series so the terms decay off fast: thus, to compute it to $m$ bits of precision is easy in polynomial time and in small space.
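Putting the pieces together, here is a sketch in Python of the digit-extraction scheme for $\ln 2$. The names are mine, and I use double-precision floats as a shortcut rather than a careful fixed-precision computation, so it is only trustworthy for modest $d$.

```python
def log2_bit(d, terms=100):
    """Return bit d+1 of the binary expansion of ln 2, without
    computing any of the earlier bits.

    S1: sum over k <= d of (2^(d-k) mod k)/k -- the exponential
        trick keeps each numerator below k.
    S2: the rapidly decaying tail, sum over k > d of 2^(d-k)/k,
        truncated after `terms` terms.
    """
    frac = 0.0
    for k in range(1, d + 1):
        frac += pow(2, d - k, k) / k   # built-in modular exponentiation
        frac -= int(frac)              # keep only the fractional part
    for k in range(d + 1, d + terms):
        frac += 2.0 ** (d - k) / k
    frac -= int(frac)
    return 1 if frac >= 0.5 else 0

# First twenty bits of ln 2 = 0.10110001011100100001... in binary:
bits = "".join(str(log2_bit(d)) for d in range(20))
assert bits == "10110001011100100001"
```

Note that the working storage is just one small accumulator: nothing of size comparable to $d$ bits is ever held in memory.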
They then have the fractional part of $2^{d}\ln 2$: if it is larger than $1/2$ they know that the bit is $1$; if it is smaller than $1/2$ they know that the bit is $0$. A very pretty idea.
A “small detail”: what is the value of $m$? Recall $m$ is the precision of the calculation. In practice they have found that a small fixed precision is quite enough. From a complexity point of view, we need to argue that making $m$ poly-logarithmic in $d$ suffices. They do not supply this argument. We will get back to this issue shortly; it is the reason the algorithm may not work.
The BBP method is the same for $\pi$: they replace the summation for $\ln 2$ by the following summation for $\pi$:

$$\pi = \sum_{k=0}^{\infty} \frac{1}{16^{k}} \left( \frac{4}{8k+1} - \frac{2}{8k+4} - \frac{1}{8k+5} - \frac{1}{8k+6} \right).$$
Part of the exciting story of their algorithm is how they found this summation. It was not known to anyone; they had to go out and discover the summation. How they did this is a long story that I will talk about another day. They used guessing and a clever search method based on lattice algorithms.
Note, the structure of their summation is of the same type as that for $\ln 2$. In particular, it is of the form:

$$\sum_{k=0}^{\infty} \frac{1}{b^{k}} \cdot \frac{p(k)}{q(k)},$$

where $p$ and $q$ are polynomials and $b$ is an integer. Such sums can all be used to get the base $b$ digits of the value of the sum.
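For concreteness, here is the same scheme sketched in Python for $\pi$ in base $16$, using the BBP summation. Again the names are mine, and double precision limits how far out the digits can be trusted.

```python
def pi_hex_digit(d, terms=100):
    """Return the d-th hexadecimal digit of pi after the point
    (d = 1 gives the first), via the BBP formula

        pi = sum_k 16^-k (4/(8k+1) - 2/(8k+4) - 1/(8k+5) - 1/(8k+6)).
    """
    def series(j):
        # Fractional part of sum_k 16^(d-1-k) / (8k+j).
        s = 0.0
        for k in range(d):
            s += pow(16, d - 1 - k, 8 * k + j) / (8 * k + j)
            s -= int(s)                     # keep the fractional part
        for k in range(d, d + terms):       # fast-decaying tail
            s += 16.0 ** (d - 1 - k) / (8 * k + j)
        return s - int(s)

    frac = 4 * series(1) - 2 * series(4) - series(5) - series(6)
    frac -= int(frac)
    if frac < 0:                            # normalize into [0, 1)
        frac += 1
    return int(frac * 16)

# pi = 3.243F6A88... in hexadecimal.
digits = "".join("0123456789ABCDEF"[pi_hex_digit(d)] for d in range(1, 9))
assert digits == "243F6A88"
```

The structure mirrors the $\ln 2$ case exactly: one modular-exponentiation sum per term of $p(k)/q(k)$, plus a geometric-like tail.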
The beautiful algorithm of BBP does not actually work. The problem is simple. In a nutshell: what if the fractional part they get lies between $1/2 - \epsilon$ and $1/2 + \epsilon$, where $\epsilon$ is the accuracy of the computation? Is the bit $0$ or is it $1$? They cannot tell. This is the problem. Of course they could increase the precision and lower $\epsilon$, but how far do they have to go? In practice, as I said earlier, this issue seems to be a non-issue. However, to prove that their algorithm is in SC requires a bound. They do not have one.
The problem has to do with long runs of $1$’s in the number whose bits they are computing. For example, suppose that $\pi$ had a huge number of $1$’s in a row somewhere in its binary expansion. Then, how could we tell what is the correct bit? We must determine whether the fractional part is larger or smaller than $1/2$: what happens if it is extremely close to $1/2$? The answer is this is the bug in the algorithm. The problem is this: the two numbers

$$0.0111111\cdots \quad \text{and} \quad 0.1000000\cdots$$

are extremely close together; how do you tell them apart if the run of $1$’s is very long? The following remarkable quote is in Wikipedia:
“There is a vanishingly small possibility that a particular computation targets a section that is subject to small round-off errors being carried for many digits (e.g. in base 10, a long string of zeros or string of 9’s within the constant) and that the error will propagate to the most significant digit, but being near this situation is obvious in the final value produced.”
Excuse me, this is nonsense. The statement “vanishingly small possibility” is a statement of probability theory. There is nothing random here at all. I agree that it seems that the problem may never arise in practice, but that is different. The last point of the quote is correct: if the value of the fractional part is very close to $1/2$, we will know that we are in trouble. But that means that BBP is not really an algorithm.
The obvious open problem is to fix BBP and make it a real algorithm. I have a strong belief that the following should hold:
“Theorem” 2 Computing the $n^{th}$ bit of $\pi$ is in SC.
The plan of a proof is to use their basic BBP method. However, when one detects a long run of $1$’s, then use the fact that $\pi$ satisfies a transcendence measure. This means that $\pi$ satisfies:

$$\left| \pi - \frac{p}{q} \right| \ge \frac{1}{q^{c}}$$

for integers $p$ and $q$ with $q$ large enough, and some fixed constant $c$. I believe that this can be used to show that the above theorem will work. Roughly, a long run of $1$’s would violate the above inequality for a suitable choice of $p/q$.
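To see why such a measure would bound the runs, here is the rough calculation, a sketch under the assumption that the measure holds with some exponent $c$ (taking $q = 2^{d}$, the natural choice for a binary run):

```latex
% Suppose pi has a run of m consecutive 1's starting just after
% bit position d. Then 2^d pi lies within 2^{-m} of an integer,
% so for some integer p,
\left| \pi - \frac{p}{2^{d}} \right| < \frac{1}{2^{\,d+m}} .
% The transcendence measure, applied with q = 2^d, gives
\left| \pi - \frac{p}{2^{d}} \right| \ge \frac{1}{(2^{d})^{c}} = \frac{1}{2^{\,cd}} .
% Combining the two inequalities: d + m < cd, that is,
m < (c - 1)\, d .
```

So under this assumption a run starting at position $d$ has length less than $(c-1)d$, which is exactly the kind of bound the BBP analysis is missing.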
I may write a post working out the details of this argument, or you can have a try and work it out yourself.