Barrington Gets Simple
On the power of bounded width computations
David Barrington is famous for solving a long-standing open problem. What I love about his result is that the problem was open for years, his solution is beautiful, and also that he surprised us. Most of us who had worked on the problem guessed the other way: we thought for sure it was false. Once again the conventional wisdom was wrong. I wonder about other problems.
Bounded Width Computation
There are many restricted models of computation; without these weak models we would have no lower bounds. Proving in a general model–such as unrestricted boolean circuits–that an explicit problem cannot be computed efficiently seems to be completely hopeless. However, once some restriction is placed on the model there is at least hope that an explicit problem has a lower bound. Barrington worked on a restricted model called bounded width computations. While most (many? all?) of the community believed that this model was extremely weak and would yield lower bounds, he shocked us and showed that bounded width computations are quite powerful.
A bounded width computation is a way of computing a boolean function. Suppose that $x_1, \dots, x_n$ are the inputs to the function. You can think of a bounded width computation as a line of boxes. Each box is associated with some input $x_i$. The box gets bits from the left, uses these and the input $x_i$, and decides on its output, which goes to the box on its right. Of course the first (leftmost) box is special and gets a fixed value from the left; the last (rightmost) box outputs the answer. The size of the computation is the number of boxes, or equivalently the length of the line. So far anything can be computed by such devices. To see this just have each box concatenate the left input with its $x_i$. Then the last box gets $x_1 \cdots x_n$ as a string and so can easily determine the value of any boolean function.
The key is that we restrict the number of bits that can be passed from one box to the next. In particular bounded width computations must have their boxes receive and send only $O(1)$ bits, independent of the number of inputs $n$. This is the restriction that makes the model interesting. While the number of bits that can be passed must stay bounded, the length can grow faster than $n$. Therefore, different boxes can use or read the same input $x_i$. Without this added ability the model would reduce to just a finite state type machine and would be quite weak when coupled with the bounded width restriction.
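To make the model concrete, here is a minimal Python sketch. The encoding is my own illustration, not anything from the post: each box reads one input bit and passes only a constant-size state to its right neighbor.

```python
from itertools import product

# A toy bounded width computation: a program is a list of boxes. Each
# box reads one input bit x[var] and maps the incoming state, an
# integer below a fixed constant width, to an outgoing state. The
# state is the only information passed down the line, so only O(1)
# bits move between boxes no matter how large n is. A longer program
# may revisit the same input variable many times.

def run(program, x, start=0, accept={0}):
    """Each box is (var, table) with table[state][x[var]] = next state."""
    state = start
    for var, table in program:
        state = table[state][x[var]]
    return state in accept

# Example: XOR of two bits with width 2; the state tracks the parity
# of the bits read so far.
flip = [[0, 1], [1, 0]]        # on bit 0 keep the state, on bit 1 flip it
xor_prog = [(0, flip), (1, flip)]

for bits in product([0, 1], repeat=2):
    assert run(xor_prog, bits) == ((bits[0] ^ bits[1]) == 0)
```

Here the program accepts exactly when the XOR is zero; the point is only that the inter-box traffic stays constant while the program length is free to grow.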
The question that Barrington solved was: exactly what can bounded width computations compute with size at most polynomial? The conventional wisdom at the time was that such a model could not compute, for example, the majority function. My intuition–that was dead wrong–was that in order to compute the majority function the boxes would have to somehow pass along the number of $1$ bits that are to their left. This of course does not work because representing this count takes $\log n$ bits, which is certainly not constant. I even worked with Ashok Chandra and Merrick Furst and proved a non-linear lower bound on the size of a bounded width computation for majority. While our bound was just barely non-linear we thought that we might be able to improve it and eventually show that majority was hard. By the way this work with Chandra and Furst will be covered soon in another post on communication complexity.
Barrington proved the following beautiful theorem:
Theorem 1 Any boolean formula of size $s$ can be computed by a bounded width computation of size at most $s^{O(1)}$.
The theorem shows immediately that we were wrong. Since majority can be shown to have a polynomial size formula, his theorem showed that the conventional wisdom was wrong. Often a weaker theorem is stated: if the formula has depth $d$, then the bounded width computation has size at most $2^{O(d)}$. The above is really not stronger, since it has long been known that any formula of size $s$ can be converted into one of depth $O(\log s)$. The corresponding question about depth for circuits is still wide open: we have no idea how to make a circuit have smaller depth without exploding its size.
That formulas can be made to have logarithmic depth in their size without increasing their size too much has a long and interesting history. I do not have the space now to explain the details, but the key is that binary trees have good recursive properties–let’s leave it at that for now. Spira worked on the first result, then Brent, Pratt, and others. See Sam Buss’s paper for a recent result and some more background.
I will not give the full proof of Barrington’s Theorem, but will give an overview. His brilliant insight was to restrict the values that boxes pass from one to another to elements of a fixed finite group $G$. Clearly, as long as the group is fixed this will satisfy the bounded width restriction: an element of a fixed finite group only takes $O(1)$ bits. What is not at all obvious is that by restricting his attention to finite groups he could see how to make the bounded width model compute any formula. I think this is a nice example of a trick that we should use more often. By adding constraints he made the problem easier to understand: Henry Kissinger once said, “The absence of alternatives clears the mind marvelously.” By focusing on groups David was forced to use certain constructs from group theory that made his method work.
Barrington’s theorem is proved by showing that every formula can be computed by a bounded width computation over the same fixed group $G$. The group $G$ is assumed to be a non-abelian simple group. A key notion is that of the commutator of two group elements: the commutator of $a$ and $b$ is $aba^{-1}b^{-1}$. Since $G$ is non-abelian and simple it follows that the commutator subgroup $[G,G]$ is all of $G$; in particular there are elements whose commutator is not the identity.
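The existence of a useful non-trivial commutator is easy to witness concretely. Here is a short Python check, my own sketch rather than anything from the post, using permutations of {0,...,4} written as tuples of images, with the convention that (p∘q)(i) = p(q(i)). The particular 5-cycles are an illustrative choice.

```python
# Permutations of {0,...,4} as tuples: p[i] is the image of i.

def compose(p, q):
    """(p . q)(i) = p(q(i))."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, img in enumerate(p):
        inv[img] = i
    return tuple(inv)

def commutator(a, b):
    """The commutator a b a^{-1} b^{-1}."""
    return compose(compose(a, b), compose(inverse(a), inverse(b)))

identity = (0, 1, 2, 3, 4)
a = (1, 2, 3, 4, 0)   # the 5-cycle (0 1 2 3 4)
b = (2, 0, 4, 1, 3)   # the 5-cycle (0 2 4 3 1)

c = commutator(a, b)
assert c != identity
print(c)  # (3, 0, 4, 2, 1) -- itself a 5-cycle
```

That the commutator of two 5-cycles can again be a 5-cycle is exactly the kind of fact the construction below leans on.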
The way that a bounded width computation will compute a formula is simple: if the formula is true let the output be the identity element $1$ of the group; if the formula is false, then the output will be a $g$ that is not the identity. To make the induction work we will insist that $g$ can be any prescribed non-identity element. This is a standard Pólya trick, strengthening the induction hypothesis, that makes the proof work nicely. Clearly, we can handle any input variable $x_i$. Negation is also simple: just multiply the output by $g^{-1}$; the identity and non-identity outputs trade places, so the negation is computed with values $1$ and $g^{-1}$.
The key case is: how do we handle an “or” gate? Suppose that we want to compute $f \vee g$. Further assume that we want the values of this computation to be $1$ and $h$, where $h \neq 1$. Then, we can select $a$ and $b$ so that $aba^{-1}b^{-1} = h$; in the groups used, such as the alternating group $A_5$, every element is a commutator. Let’s use $C_f$ (resp. $C_g$) to denote the bounded width computation that computes $f$ with values $1$ and $a$ (resp. $g$ with values $1$ and $b$). Then, the claim is that

$C_f \rightarrow C_g \rightarrow C_f^{-1} \rightarrow C_g^{-1}$

computes the “or”. The arrow means the computations form a series of “boxes”, and $C_f^{-1}$ (resp. $C_g^{-1}$) denotes the computation with its values inverted, i.e. with values $1$ and $a^{-1}$ (resp. $1$ and $b^{-1}$). It should be clear that if either $C_f$ or $C_g$ outputs $1$, then the whole computation outputs $1$. What happens if they both do not output $1$? Then we get the value

$aba^{-1}b^{-1} = h.$

But this is exactly the value that we needed. Pretty clever.
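The four truth assignments of the “or” gadget can be checked mechanically. The sketch below is my own encoding, not code from the post: “true” is the identity permutation, and we run $C_f \to C_g \to C_f^{-1} \to C_g^{-1}$ on each case, with illustrative 5-cycles standing in for $a$ and $b$.

```python
# Self-contained check of the commutator "or" gate on permutations of
# {0,...,4}. A subformula evaluates to the identity when true and to a
# fixed non-identity permutation when false.

def compose(p, q):
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, img in enumerate(p):
        inv[img] = i
    return tuple(inv)

identity = (0, 1, 2, 3, 4)
a = (1, 2, 3, 4, 0)   # C_f outputs a when f is false, identity when true
b = (2, 0, 4, 1, 3)   # C_g outputs b when g is false, identity when true
h = compose(compose(a, b), compose(inverse(a), inverse(b)))  # a b a^-1 b^-1

def or_gate(f, g):
    """Simulate C_f -> C_g -> C_f^{-1} -> C_g^{-1} on truth values f, g."""
    vf = identity if f else a
    vg = identity if g else b
    return compose(compose(vf, vg), compose(inverse(vf), inverse(vg)))

for f in (True, False):
    for g in (True, False):
        out = or_gate(f, g)
        if f or g:
            assert out == identity       # the "or" is true: output 1
        else:
            assert out == h != identity  # both false: output the commutator h
```

If either factor is the identity, the product telescopes to the identity; only when both are non-identity does the full commutator $h$ survive, which is the whole trick.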
I have some interesting related results that I wanted to talk about. Since this post is already getting a bit long I will explain some of the related work on Barrington’s Theorem in the future.
This area is rich with open problems. For starters we do not understand what groups are needed to make Barrington’s Theorem work. He shows that any non-abelian simple group is enough. In the opposite direction it is known that no nilpotent group will work. The big open question is what about solvable groups? We believe that they should not be able to compute the majority function, but we all guessed wrong before so perhaps we are wrong again. It is not hard to see that the exact technology that David uses will not work for solvable groups. But that, of course, proves nothing. I consider this one of the most interesting open questions in this area of theory. The reason that nilpotent groups can be handled is that they are always direct products of $p$-groups. And $p$-groups can be handled by known results.
I have a completely wild idea that I hesitate to share lest you stop reading my blog and think I have lost it. But I cannot resist so I will tell you anyway. Suppose that you could prove from first principles that no group of odd order could compute the majority function. This would be a major result, since it would imply the famous (to group theorists at least) Odd Order Theorem of Walter Feit and John Thompson. They showed that every group of odd order is solvable; equivalently, no non-abelian simple group has odd order. So if no odd order group can compute majority, then by Barrington’s Theorem no odd order group is non-abelian simple, which gives their theorem. The original proof of the Feit-Thompson Theorem is over 200 journal pages–255 to be exact.
A possibly more doable idea would be to prove the also famous theorem of Burnside. Burnside’s theorem from group theory states that if $G$ is a finite group of order $p^a q^b$, where $p$ and $q$ are primes, then $G$ is solvable. Hence each non-abelian finite simple group has order divisible by at least three distinct primes.
The reason I think that there is at least some hope that complexity theory could prove such theorems is that we bring a completely different set of tools to the table. I think it could be the case that these tools could one day prove a pure theorem of mathematics in an entirely new manner. That would be really cool.