Who’s Afraid of Natural Proofs?
Obstacles to circuit lower bounds
Alexander Razborov is one of the world experts on circuit lower bounds. Perhaps the expert. In 1990 he won the Nevanlinna Prize for his work on lower bounds, and more recently, in 2007, he won the Gödel Prize (with Steven Rudich) for their beautiful work on Natural Proofs (NaP).
His joint work on NaP is the subject of today’s post. Should we be worried about the so-called “Natural proof obstacle” to resolving P=NP? Or is this obstacle not an obstacle? I do not know, but would like to discuss this issue with you today.
Do “impossibility proofs” matter? I mean proofs that show that “you cannot solve problem $X$ by method $Y$”? Sometimes they do: the Gödel Incompleteness Theorem is a perfect example; it essentially destroyed any attempt to complete David Hilbert’s program and prove arithmetic consistent. Yet Gerhard Gentzen carried on and proved that first order arithmetic is consistent, although he had to use transfinite induction up to $\epsilon_0$. This is the limit ordinal of the sequence
$$\omega, \quad \omega^{\omega}, \quad \omega^{\omega^{\omega}}, \quad \dots$$
The great logician Alfred Tarski, on hearing of Gentzen’s theorem, was asked if he was now more confident in the consistency of arithmetic–he answered “by an epsilon.” (Gentzen’s work was based on a logic that he called “natural deduction”–curious coincidence?)
Another example is the work of Paul Cohen on set theory. I met Paul Cohen at a Royal Society workshop on the notion of “mathematical proof”: this has nothing to do with NaP. It was focused on the growing number of computer aided proofs, and on what the math community should think about proofs that are not checkable by humans. There is now a growing list of such proofs: first there was the four color theorem, then the Kepler problem, then the four color theorem again, then some results in geometry. I will have a post in the future on this interesting issue.
Cohen is, of course, world famous for creating the notion of forcing, and for using forcing to prove that the Axiom of Choice and the Continuum Hypothesis are independent of ZF set theory. After he proved this great result he was asked if he had been worried about his approach, since it came very close to violating a known impossibility theorem. He–it is claimed–said: “What result?”
I find this story, true or not, revealing. Perhaps to make progress on hard problems one needs to have a certain world view: stay positive, and believe that a solution is possible. We must not let impossibility theorems scare us away from thinking about problems. We can let them guide us; we can let them show us what we cannot do, and what we can do. But we must move forward.
Suppose that you want to prove that a boolean function $f$ has high complexity–in some sense. Then, you might think that you could use special properties of this particular boolean function. Makes sense. Perhaps the function is the determinant of a square matrix, and your goal is a non-linear lower bound on its complexity. The determinant has many special properties that perhaps you can use in your lower bound proof: for example, the determinant of an invertible matrix is non-zero, or certain determinants count the number of spanning trees of a graph, and so on.
The obstacle is, roughly, that a large class of approaches to circuit lower bounds must prove more. Your lower bound for $f$ must also prove that many other unrelated functions have large complexity. Thus, you cannot use any special properties of your function. None. This seems like a major obstacle. Imagine that I ask you to prove that a certain number is prime, but your proof must at the same time prove that many other unrelated numbers are prime. That sounds nuts. When you prove that a number is prime, you prove that it is prime: nothing else.
I think of the obstacle as a kind of drag-along principle: if you prove $f$ is complex, then you prove much more.
Obstacle One: Bait and Switch
There are two concrete theorems that show different versions of the drag-along principle. I like to call the first one “Bait and Switch”: it is close to a question that I posed in the first post of this blog. However, it is a bit different from what I had in mind at the time; I was looking for a stronger result. More on this in the future.
Let $\mathcal{B}$ be the set $\{0,1\}$. Then, we plan to study the circuit and formula complexity of boolean functions. We use $\mathcal{B}_n$ to denote the space of all boolean functions with $n$ inputs, i.e. all $f$ so that $f: \mathcal{B}^n \rightarrow \mathcal{B}$. We also use $\mathcal{M}_n$ to denote the space of all $n$-input monotone functions.
If $f$ is an $n$-input boolean function, i.e. $f \in \mathcal{B}_n$, then we use $C(f)$ to denote the size of the smallest circuit for $f$: in our circuits all the negations are at the inputs, and we only allow $\wedge$ and $\vee$ as operations. We use $C_m(f)$ to denote the monotone complexity of $f$ for $f \in \mathcal{M}_n$.
I follow the treatment of Sanjeev Arora and Boaz Barak. The key idea is to look at complexity measures on boolean functions. A measure $\mu$ is a complexity measure if it assigns a positive value to each boolean function. It is normalized so that it assigns at most $1$ to variables and their negations. Also it must satisfy the following two simple rules:
$$\mu(f \vee g) \le \mu(f) + \mu(g) \qquad \text{and} \qquad \mu(f \wedge g) \le \mu(f) + \mu(g).$$
As they point out, a lower bound on $\mu(f)$ yields a lower bound on the formula size of $f$: this follows by a simple induction. The Bait and Switch Lemma is the following:
Theorem: Suppose $\mu$ is a complexity measure and there exists a function $f \in \mathcal{B}_n$ such that $\mu(f) \ge c$. Then, for at least $1/4$ of all $g$ in $\mathcal{B}_n$, $\mu(g) \ge c/4$.
Proof: Let $g$ be any function in $\mathcal{B}_n$. Define $h = f \oplus g$, where $\oplus$ is the exclusive or. Then, $f = g \oplus h$: this follows from the definition of exclusive or, since $g \oplus h = g \oplus (f \oplus g) = f$. Writing $f = (g \wedge \neg h) \vee (\neg g \wedge h)$, the two rules give
$$\mu(f) \le \mu(g) + \mu(\neg g) + \mu(h) + \mu(\neg h).$$
By way of contradiction assume that $\{p : \mu(p) < c/4\}$ contains more than $3/4$ of all of $\mathcal{B}_n$. If we pick the above function $g$ randomly, then $g, \neg g, h, \neg h$ are also uniformly random functions (though not independent). Using the trivial union bound we have
$$\Pr[\mu(g), \mu(\neg g), \mu(h), \mu(\neg h) \text{ are all } < c/4] > 0.$$
For such a $g$, the inequality above implies that $\mu(f) < 4 \cdot c/4 = c \le \mu(f)$, which is a contradiction.
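To make the lemma concrete, here is a small exhaustive experiment (my own sketch, not from Arora–Barak): formula leaf-size $L$ is a complexity measure (it is $1$ on literals and subadditive under $\wedge$ and $\vee$), and for small $n$ we can compute it exactly by dynamic programming and watch the $1/4$ bound hold.

```python
# Exhaustive check of the Bait-and-Switch bound for n = 3 (a sketch).
# A function is encoded as a bitmask over the 2^n input assignments.
n = 3
POINTS = 1 << n                   # 8 assignments
NFUNCS = 1 << POINTS              # 256 boolean functions of 3 variables
MASK = NFUNCS - 1

# truth tables of the literals x_i and ~x_i
lits = []
for i in range(n):
    t = sum(1 << a for a in range(POINTS) if (a >> i) & 1)
    lits += [t, t ^ MASK]

# L[f] = least number of leaves in an AND/OR formula with literal leaves,
# computed by iterating L(g & h), L(g | h) <= L(g) + L(h) to a fixed point
INF = float('inf')
L = {f: (1 if f in lits else INF) for f in range(NFUNCS)}
changed = True
while changed:
    changed = False
    for g in range(NFUNCS):
        for h in range(NFUNCS):
            for f in (g & h, g | h):
                if L[g] + L[h] < L[f]:
                    L[f] = L[g] + L[h]
                    changed = True

c = max(L.values())               # the largest measure of any function
frac = sum(v >= c / 4 for v in L.values()) / NFUNCS
assert frac >= 1 / 4              # the conclusion of the lemma
print(f"max leaf size = {c}; fraction with L >= {c/4}: {frac:.0%}")
```

Of course the point of the lemma is exactly what this little experiment shows: the hardest function drags a quarter of all functions up with it.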
The reason this is an interesting theorem is that it shows that any formula lower bound based on a complexity measure would have to prove more. It would have to prove that not only $f$ but many other functions are hard. Thus, it would not be able to use the special structure of the boolean function $f$.
This is a blog. Right. Well I must point out that about twenty-five years ago, a famous complexity theorist circulated a proof using exactly this method to show that SAT had super-polynomial complexity. The proof of course was wrong, but it took “us” a while to find the bug. No, it was not me, and I will not say who.
Obstacle Two: Natural Proofs
The second drag-along principle is NaP, and it is much more sophisticated. I will not explain the details of the notion of NaP. The main idea is this: suppose that we wish to prove that a boolean function $f$ has large complexity. A “natural” idea is to find a predicate $P$ on boolean functions in $\mathcal{B}_n$ so that (i) $P(f)$ is true and (ii) $P(g)$ is true for a large fraction of $g \in \mathcal{B}_n$.
Then, it should follow that the complexity of $f$ is large. This is reasonable, since the conditions roughly say that $f$ has some key property in common with random boolean functions, and standard counting methods prove that most boolean functions are hard. Making all this work is the brilliance of the NaP paper. Note, they need a stronger condition on the predicate $P$ and other assumptions. See their paper for the details.
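For completeness, here is a back-of-the-envelope version of that counting argument (a sketch with loose constants, not taken from their paper): a circuit with $s \ge n$ gates can be described with $O(s \log s)$ bits, so there are at most $2^{O(s \log s)}$ circuits of size $s$, while $\mathcal{B}_n$ contains $2^{2^n}$ functions. Thus for $s = 2^n/(10n)$ we have
$$2^{O(s \log s)} \ll 2^{2^n},$$
and so almost every boolean function on $n$ inputs has circuit complexity $\Omega(2^n/n)$.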
They show that most (all?) of the standard lower bound proofs follow the above structure. That is the reason that NaP seems to be such a powerful idea: a lower bound is forced to drag along too much.
I wonder if the following can avoid the drag-along obstacles: use monotone complexity. The idea is simple: in order to get a lower bound on a boolean function $f$, find the “right” $g$ and prove a lower bound on its monotone complexity $C_m(g)$.
To carry out this program we will study an old idea that defines a transformation $T$ on boolean functions with several interesting properties. If $f$ is a boolean function, then $T(f)$ is a monotone boolean function, where $T$ is the transformation. Moreover, its monotone circuit complexity is almost the same as $f$’s non-monotone circuit complexity. More precisely,
$$C(f) - O(n) \le C_m(T(f)) \le C(f) + O(n).$$
If you believe what I just said, then feel free to skip to the open problem section. Or you can read on and see the details. Or you can come back later. Or you can do something else. Your choice.
Now let us turn to defining our transformation. We need two simple auxiliary functions $A(x,y)$ and $B(x,y)$. Let $A(x,y)$ equal
$$\bigwedge_{i=1}^{n} (x_i \vee y_i),$$
and let $B(x,y)$ equal
$$\bigvee_{i=1}^{n} (x_i \wedge y_i).$$
Let $f \in \mathcal{B}_n$. Then, $T(f)$ is equal to
$$T(f)(x,y) = (f(x) \wedge A(x,y)) \vee B(x,y).$$
Note that $T(f)$ is a function of $2n$ variables, i.e. $T(f) \in \mathcal{B}_{2n}$. The main properties of our transformation are contained in the next theorems.
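To get a feel for the definition, here is the $n = 1$ case worked out from the formulas above: $T(f)(0,0) = 0$, $T(f)(0,1) = f(0)$, $T(f)(1,0) = f(1)$, and $T(f)(1,1) = 1$. So the transformation pins down the “illegal” points $(0,0)$ and $(1,1)$, and copies $f$ on the dual-rail points where $y = \neg x$.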
Theorem 1: For any $f \in \mathcal{B}_n$, the function $T(f)$ is monotone.
Proof: Suppose that $f$ is an $n$-input boolean function. Let $g$ be equal to $T(f)$. If $g$ is not monotone, there must be $x, y$ so that $g(x,y) = 1$ and changing some input from $0$ to $1$ makes $g = 0$. Clearly, $B(x,y) = 0$; otherwise, after the change $g$ would still be $1$. Since $g(x,y) = 1$ it must be the case that $A(x,y) = 1$. But then the changed input, say in position $i$, is paired with an input that is already $1$ (since $x_i \vee y_i = 1$), so after the change $x_i \wedge y_i = 1$ and $B$ must be equal to $1$. This is a contradiction.
Theorem 2: For any $f \in \mathcal{B}_n$ and all $x$, $T(f)(x, \neg x) = f(x)$, where $\neg x = (\neg x_1, \dots, \neg x_n)$.
Proof: Suppose that $f$ is a boolean function. Let $y$ be equal to $\neg x$. Then, by definition $A(x,y) = 1$ and $B(x,y) = 0$. Thus, $T(f)(x,y)$ is equal to $f(x)$.
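Since the transformation is so simple, it is easy to check Theorems 1 and 2 exhaustively for small $n$. Here is a minimal Python sketch (the encoding of inputs as 0/1 tuples, and all function names, are my own choices for illustration):

```python
from itertools import product

def A(x, y):                       # AND over all i of (x_i OR y_i)
    return all(xi | yi for xi, yi in zip(x, y))

def B(x, y):                       # OR over all i of (x_i AND y_i)
    return any(xi & yi for xi, yi in zip(x, y))

def T(f):                          # T(f)(x, y) = (f(x) AND A) OR B
    return lambda x, y: bool((f(x) and A(x, y)) or B(x, y))

def is_monotone(g, m):
    # flip each 0-bit of each input to 1; g must never drop from 1 to 0
    for a in product((0, 1), repeat=m):
        for i in range(m):
            if a[i] == 0 and g(a) and not g(a[:i] + (1,) + a[i + 1:]):
                return False
    return True

n = 3
f = lambda x: sum(x) % 2           # parity: very far from monotone
Tf = T(f)
flat = lambda a: Tf(a[:n], a[n:])  # view T(f) as a function of 2n bits

assert is_monotone(flat, 2 * n)                       # Theorem 1
assert all(Tf(x, tuple(1 - b for b in x)) == bool(f(x))
           for x in product((0, 1), repeat=n))        # Theorem 2
print("T(f) is monotone and T(f)(x, ~x) = f(x), n =", n)
```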
Theorem 3: For any $f \in \mathcal{B}_n$,
$$C(f) - O(n) \le C_m(T(f)) \le C(f) + O(n).$$
Proof: Suppose that $f$ is an $n$-input boolean function. $C(f) \le C_m(T(f)) + O(n)$ follows from Theorem 2: substituting $y_i = \neg x_i$ into a monotone circuit for $T(f)$ yields a circuit for $f$ with only $O(n)$ extra gates. Now suppose that $f$ has a circuit of length $l$: let the circuit for $f$ be $g_1, \dots, g_l$. We will construct a monotone circuit for $T(f)$. Use $x_1, \dots, x_n$ and $y_1, \dots, y_n$ as the inputs. To start let $h_1, \dots, h_l$ be the circuit where $h_k$ is constructed as follows:
- if $g_k$ is equal to the input $x_i$, then $h_k$ is the same input $x_i$;
- if $g_k$ is equal to the negated input $\neg x_i$, then $h_k$ is the input $y_i$;
- if $g_k = g_i \wedge g_j$, then $h_k = h_i \wedge h_j$;
- if $g_k = g_i \vee g_j$, then $h_k = h_i \vee h_j$.
Next compute $T(f)$:
$$(h_l \wedge A(x,y)) \vee B(x,y).$$
This takes at most $O(n)$ extra steps. So the total computation is now $l + O(n)$ long.
The key is the following claim: $(h_l \wedge A(x,y)) \vee B(x,y) = T(f)(x,y)$. Suppose that the left hand side is different from $T(f)(x,y)$ for some values of the inputs $x$ and $y$. Then, clearly, $B(x,y) = 0$; otherwise, they would agree. Also $A(x,y)$ must equal $1$; again, if not, the two values could not disagree, since both would equal $B(x,y) = 0$. We now claim that for each $i$, $y_i = \neg x_i$. Suppose that this was false. Then, $x_i = y_i$ for some $i$. Clearly, the common value cannot be $1$ since $B(x,y) = 0$. Also the common value cannot be $0$ since $A(x,y) = 1$. This proves that for each $i$, $y_i = \neg x_i$.
Finally, a simple induction proves that $h_k(x, \neg x) = g_k(x)$ for $k = 1, \dots, l$. This is clear since the only difference in the two computations is that $y_i = \neg x_i$ takes the place of $\neg x_i$. This shows that $h_l(x,y) = f(x)$, so the two sides agree at $(x,y)$ after all, a contradiction. Hence the circuit computes $T(f)$, and $C_m(T(f)) \le l + O(n)$.
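The construction is easy to mechanize. Below is a minimal Python sketch (the gate encoding and the names monotonize and evaluate are mine, purely for illustration): a circuit is a list of gates, and the compiler renames each $\neg x_i$ to $y_i$ and then appends the $O(n)$ gates for $(h_l \wedge A) \vee B$.

```python
from itertools import product

def evaluate(circuit, x, y=()):
    # straight-line evaluation; gates reference earlier gate indices
    vals = []
    for g in circuit:
        if g[0] == 'x':     vals.append(x[g[1]])
        elif g[0] == '~x':  vals.append(1 - x[g[1]])
        elif g[0] == 'y':   vals.append(y[g[1]])
        elif g[0] == 'and': vals.append(vals[g[1]] & vals[g[2]])
        else:               vals.append(vals[g[1]] | vals[g[2]])
    return vals[-1]

def monotonize(circuit, n):
    # step 1: rename every negated input ~x_i to the fresh input y_i
    out = [('y', g[1]) if g[0] == '~x' else g for g in circuit]
    top = len(out) - 1                     # index of h_l

    def push(gate):
        out.append(gate)
        return len(out) - 1

    a = b = None                           # step 2: build A and B
    for i in range(n):
        t = push(('or', push(('x', i)), push(('y', i))))
        a = t if a is None else push(('and', a, t))   # A = AND_i (x_i OR y_i)
    for i in range(n):
        t = push(('and', push(('x', i)), push(('y', i))))
        b = t if b is None else push(('or', b, t))    # B = OR_i (x_i AND y_i)
    push(('or', push(('and', top, a)), b))            # output: (h_l AND A) OR B
    return out

# demo: f(x) = x_0 XOR x_1, with negations only at the inputs
xor = [('x', 0), ('~x', 1), ('and', 0, 1),
       ('~x', 0), ('x', 1), ('and', 3, 4), ('or', 2, 5)]
mono = monotonize(xor, 2)
assert not any(g[0] == '~x' for g in mono)            # no negations remain
for x in product((0, 1), repeat=2):
    y = tuple(1 - b for b in x)
    assert evaluate(mono, x, y) == evaluate(xor, x)   # T(f)(x, ~x) = f(x)
print("monotone circuit uses", len(mono) - len(xor), "extra gates")
```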
All the above can be generalized to handle formula complexity instead of circuit complexity. There is essentially no difference in the proofs.
My question is: does the following violate either the Bait and Switch Lemma or the NaP result? Suppose that $f$ is a boolean function that you wish to show has complexity at least $c$. Prove that the monotone complexity of $T(f)$ is at least $c$. Then, use Theorem 3 to conclude that the general complexity of $f$ is at least $c - O(n)$. Note, the above is interesting even for $c$ polynomial in $n$, provided $c$ is large compared to $n$.
Bait and Switch Lemma (BSL): This does not seem to obviously apply. The problem is that we get a lower bound on $T(f)$, which is a monotone function. There does not appear to be a BSL for monotone functions. The problem is: how can we “add” something to a monotone function, as the exclusive-or trick does, and then remove it later on?
Natural Proofs: This also does not seem to directly apply. The problem now is that $T(f)$ is highly non-random. It is a monotone function, but even more: it is almost always $1$. As a matter of fact, on a uniformly random input pair $(x,y)$ it will be equal to $0$ with probability at most $(3/4)^n$, since $T(f) = 0$ forces $B(x,y) = 0$, i.e. no index $i$ with $x_i = y_i = 1$. How can a function that is essentially a constant look like a random one?
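A quick exhaustive count makes the point concrete (a self-contained sketch, with parity as the test function; A, B, T as defined earlier):

```python
from itertools import product

n = 3
f = lambda x: sum(x) % 2                        # parity on n bits
A = lambda x, y: all(xi | yi for xi, yi in zip(x, y))
B = lambda x, y: any(xi & yi for xi, yi in zip(x, y))
Tf = lambda x, y: bool((f(x) and A(x, y)) or B(x, y))

# count the inputs on which T(f) is 0; the fraction is at most (3/4)^n
zeros = sum(not Tf(x, y)
            for x in product((0, 1), repeat=n)
            for y in product((0, 1), repeat=n))
print(zeros / 4 ** n, "<=", (3 / 4) ** n)       # 0.359375 <= 0.421875
```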
I think these questions may be easy to resolve, and that the two obstacles are indeed obstacles. But I am confused. For Arora’s view on NaP see his comments here.
By the way, this is a milestone post for this blog. As Knuth pointed out when he turned 64, such events are important to computer scientists. So thanks for your support and interest, and I hope that together we can get to many more posts. Thanks again….dick lipton.