Obstacles to circuit lower bounds

Alexander Razborov is one of the world experts on circuit lower bounds. Perhaps the expert. In 1990 he won the Nevanlinna Prize for his work on lower bounds, and more recently, in 2007, he won the Gödel Prize (with Steven Rudich) for their beautiful work on natural proofs (NaP).

His joint work on NaP is the subject of today’s post. Should we be worried about the so-called “Natural proof obstacle” to resolving P=NP? Or is this obstacle not an obstacle? I do not know, but would like to discuss this issue with you today.

Do “impossibility proofs” matter? I mean proofs that show that “you cannot solve problem ${X}$ by method ${Y}$.” Sometimes they do: the Gödel Incompleteness Theorem is a perfect example; it essentially destroyed any attempt to complete David Hilbert’s program and prove arithmetic consistent. Yet Gerhard Gentzen carried on and proved that first-order arithmetic is consistent, although he had to use transfinite induction up to ${\epsilon_{0}}$. This is the limit ordinal of the sequence

$\displaystyle \omega, \omega^{\omega},\omega^{\omega^{\omega}},\dots$

The great logician Alfred Tarski, on hearing of Gentzen’s Theorem, was asked if he was now more confident in arithmetic’s consistency; he answered “by an epsilon.” (Gentzen’s work was based on a logic that he called “natural deduction”; a curious coincidence?)

Another example is the work of Paul Cohen on set theory. I met Paul Cohen at a Royal Society workshop on the notion of “mathematical proof”: this has nothing to do with NaP. It was focused on the growing number of computer-aided proofs, and on what the math community should think about proofs that are not checkable by humans. There is now a growing list of such proofs: first there was the four color theorem, then the Kepler conjecture, then the four color theorem again, then some results in geometry; the list is growing. I will have a post in the future on this interesting issue.

Cohen is, of course, world famous for creating the notion of forcing, and for using forcing to prove that the Axiom of Choice and the Continuum Hypothesis are independent of ZF set theory. After he proved this great result he was asked if he had been worried about his approach, since it came very close to violating a known impossibility theorem. He, it is claimed, replied: what result?

I find this story, true or not, revealing. Perhaps to make progress on hard problems one needs to have a certain world view, keep positive, and believe that a solution is possible. We must not let impossibility theorems scare us into not thinking about problems. We can let them guide us; we can let them show us what we cannot do, and what we can do. But we must move forward.

The Obstacle

Suppose that you want to prove that a boolean function ${f}$ has high complexity, in some sense. Then you might think that you could use special properties of this particular boolean function. Makes sense. Perhaps the function is the determinant of a square matrix, and your goal is a non-linear lower bound on its complexity. The determinant has many special properties that perhaps you can use in your lower bound proof: for example, the determinant of an invertible matrix is non-zero, certain determinants count the number of spanning trees of a graph, and so on.

The obstacle is, roughly, that a large class of approaches to circuit lower bounds must prove more. Your lower bound for ${f}$ must also prove that many other unrelated functions also have large complexity. Thus, you cannot use any special properties of your function. None. This seems like a major obstacle. Imagine that I ask you to prove that a certain number is prime, but your proof must at the same time prove many other unrelated numbers are prime. That sounds nuts. When you prove that a number is prime, you prove that it is prime: nothing else.

I think of the obstacle as a kind of drag-along principle: if you prove ${f}$ is complex, then you prove much more.

Obstacle One: Bait and Switch

There are two concrete theorems that show different versions of the drag-along principle. I like to call the first one “Bait and Switch”: it is close to a question that I posed in the first post of this blog. However, it is a bit different from what I had in mind at the time; I was looking for a stronger result. More on this in the future.

Let ${B}$ be the set ${\{0,1\} }$. We plan to study the circuit and formula complexity of boolean functions. We use ${ {\cal B}_n}$ to denote the space of all boolean functions with ${n}$ inputs, i.e., all ${f}$ such that ${f:B^n \rightarrow B}$. We also use ${ {\cal M}_n}$ to denote the space of all ${n}$-input monotone functions.

If ${f}$ is an ${n}$-input boolean function, i.e., ${f \in {\cal B}_n}$, then we use ${C(f)}$ to denote the size of the smallest circuit for ${f}$: in our circuits all the negations are at the inputs and we only allow ${\{\vee,\wedge\}}$ as operations. We use ${C_{m}(f)}$ to denote the monotone complexity of ${f}$ for ${f \in {\cal M}_{n}}$.

I follow the treatment of Sanjeev Arora and Boaz Barak. The key idea is to look at complexity measures on boolean functions. A map ${\mu}$ is a complexity measure if it assigns a positive value to each boolean function and is normalized so that it assigns ${1}$ to variables and their negations. Also, it must satisfy the following two simple rules:

$\displaystyle \mu(f \vee g) \le \mu(f) + \mu(g)$

and

$\displaystyle \mu(f \wedge g) \le \mu(f) + \mu(g).$

As they point out, a lower bound on ${\mu(f)}$ yields a lower bound on the formula complexity of ${f}$: this follows by a simple induction, since a formula with ${s}$ leaves is built from literals (each of measure ${1}$) by gates that at most add the measures of their inputs, and so computes a function of measure at most ${s}$. The Bait and Switch Lemma is the following:
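To make this concrete, here is a small sketch (in Python; the truth-table encoding and all names are mine, not from Arora–Barak) that computes one particular complexity measure, the minimal leaf count of an ${\{\vee,\wedge\}}$-formula over literals, for every ${2}$-variable boolean function, and checks both rules exhaustively. Truth tables are encoded as 4-bit integers, so ${\vee}$ and ${\wedge}$ become bitwise `|` and `&`.

```python
N = 2                       # number of variables
ROWS = 1 << N               # 4 rows in a truth table
FULL = (1 << ROWS) - 1      # the all-ones truth table, as a bitmask

def var(i):                 # truth table of the variable x_i
    return sum(1 << r for r in range(ROWS) if (r >> i) & 1)

literals = [var(i) for i in range(N)] + [var(i) ^ FULL for i in range(N)]

# mu[t] = minimal number of leaves in an {AND, OR}-formula over literals
# computing truth table t; literals themselves get measure 1.
mu = {t: 1 for t in literals}
changed = True
while changed:              # closure: combine known formulas until fixpoint
    changed = False
    for a, ca in list(mu.items()):
        for b, cb in list(mu.items()):
            for t in (a | b, a & b):
                if ca + cb < mu.get(t, 10 ** 9):
                    mu[t] = ca + cb
                    changed = True

assert len(mu) == 16        # every 2-variable function has some formula
# leaf count obeys both rules of a complexity measure
assert all(mu[a | b] <= mu[a] + mu[b] and mu[a & b] <= mu[a] + mu[b]
           for a in mu for b in mu)
```

The subadditivity assertions pass by construction here; the point of the lemma below is that *any* map satisfying them, however cleverly chosen, is subject to the same drag-along effect.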

Theorem: Suppose ${\mu}$ is a complexity measure and there exists a function ${f \in {\cal B}_{n}}$ such that ${\mu(f) > s}$. Then, for at least ${1/4}$ of all ${g}$ in ${{\cal B}_{n}}$, ${\mu(g) > s/4.}$

Proof: Let ${g}$ be any function in ${{\cal B}_{n}}$, and set ${h = f \oplus g}$, so that ${f = h \oplus g}$. Then, ${\mu(f) \le \mu(g) + \mu(\bar g) + \mu(h) + \mu(\bar h)}$: this follows from the definition of exclusive “or” and the two rules, since

$\displaystyle f = (f \oplus g) \oplus g = h \oplus g = (h \wedge {\bar g}) \vee ({\bar h} \wedge g).$

By way of contradiction assume that ${\{g : \mu(g) \le s/4\}}$ contains more than ${3/4}$ of all of ${{\cal B}_{n}}$. If we pick the above function ${g}$ uniformly at random, then ${\bar g, h, \bar h}$ are also uniformly random functions (though not independent). Each of the four events “${\mu > s/4}$” has probability less than ${1/4}$, so by the trivial union bound

$\displaystyle \Pr[ \text{all of } h,\bar h,g,\bar g \text{ have } \mu \le s/4 ] > 0.$

For such a choice of ${g}$, the inequality above gives ${\mu(f) \le s}$, which is a contradiction. $\Box$
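The decomposition at the heart of the proof is easy to sanity-check numerically. Here is a sketch in Python (the integer truth-table encoding is my choice, not the post’s): it verifies ${f = (h \wedge \bar g) \vee (\bar h \wedge g)}$ with ${h = f \oplus g}$ for many random switch functions ${g}$.

```python
import random

n = 4
ROWS = 1 << n
FULL = (1 << ROWS) - 1          # truth tables of n-input functions as ints

def NOT(t):
    return t ^ FULL             # complement a truth table

random.seed(1)
f = random.randrange(1 << ROWS)          # the function we want to be hard
for _ in range(1000):
    g = random.randrange(1 << ROWS)      # a random function to switch to
    h = f ^ g                            # the "bait": h = f XOR g
    # the decomposition used in the proof: f = (h AND ~g) OR (~h AND g),
    # so mu(f) <= mu(h) + mu(~g) + mu(~h) + mu(g) by the two rules
    assert f == (h & NOT(g)) | (NOT(h) & g)
```

Note how ${f}$ disappears from the right-hand side: it is rebuilt entirely out of ${g}$, ${h}$, and their complements, which is exactly why a small measure on those four functions caps the measure of ${f}$.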

The reason this is an interesting theorem is that it shows that any formula lower bound based on a complexity measure would have to prove more. It would have to prove that not only ${f}$ but many other functions are hard. Thus, it would not be able to use the special structure of the boolean function ${f}$.

This is a blog. Right. Well, I must point out that about twenty-five years ago a famous complexity theorist circulated a proof using exactly this method to show that SAT had super-polynomial complexity. The proof of course was wrong, but it took “us” a while to find the bug. No, it was not me, and I will not say who.

Obstacle Two: Natural Proofs

The second drag-along principle is NaP, and it is much more sophisticated. I will not explain the details of the notion of NaP’s. The main idea: suppose that we wish to prove that a boolean function ${f \in {\cal B}_{n}}$ has large complexity. A “natural” idea is to find a predicate $\Phi$ on boolean functions in ${{\cal B}_{n}}$ so that (i) $\Phi(f)$ is true and (ii) $\Phi(g)$ is true for a large fraction of the ${g}$ in ${{\cal B}_{n}}$.

Then, it should follow that ${C(f)}$ is large. This is reasonable, since the conditions roughly say that ${f}$ has some key property in common with random boolean functions, and standard counting methods prove that most boolean functions are hard. Making all this work is the brilliance of the NaP paper. Note that they need stronger conditions on the predicate $\Phi$ and other assumptions. See their paper for the details.

They show that most (all?) of the standard lower bound proofs follow the above structure. That is the reason that NaP seems to be such a powerful idea. A lower bound is forced to drag along too much.

Monotone Circuits

I wonder if the following can avoid the drag-along obstacles: use monotone complexity. The idea is simple: in order to get a lower bound on a boolean function ${f}$, for the “right” ${g}$,

$\displaystyle \textit { prove a monotone lower bound on } g$

then

$\displaystyle \textit { get a general lower bound on } f.$

To carry out this program we will study an old idea that defines a transformation on boolean functions with several interesting properties. If ${f}$ is a boolean function, then ${\L (f)}$ is a monotone boolean function, where ${\L}$ is the transformation. Moreover, its monotone circuit complexity is almost the same as the non-monotone circuit complexity of ${f}$. More precisely,

Theorem: For any boolean function ${f}$,

$\displaystyle C(f) \le C_m(\L (f)) \le C(f) + 4n.$

If you believe what I just said, then feel free to skip to the open problem section. Or you can read on and see the details. Or you can come back later. Or you can do something else. Your choice.

$\displaystyle \S\S\S$

Now let us turn to defining our transformation. We need two simple auxiliary functions ${\alpha}$ and ${\beta}$. Let ${\alpha(x_1, \dots ,x_n,y_1, \dots ,y_n)}$ equal

$\displaystyle \bigwedge_i (x_i \lor y_i)$

and let ${\beta(x_1, \dots ,x_n,y_1, \dots ,y_n)}$ equal

$\displaystyle \bigvee_i (x_i \land y_i) .$

Let ${f:B^n \rightarrow B}$. Then, ${\L (f) }$ is equal to

$\displaystyle { \big(\alpha(x_1, \dots ,x_n,y_1, \dots ,y_n) \land f(x_1, \dots ,x_n)\big) \lor \beta(x_1, \dots ,x_n,y_1, \dots ,y_n). }$

Note that ${\L (f)}$ is a function of ${2n}$ variables, i.e., ${ {\L (f)}:B^{2n} \rightarrow B}$. The main properties of our transformation are contained in the next theorems.

Theorem 1 For any boolean function ${f}$, ${\L (f)}$ is a monotone function.

Proof: Suppose that ${f}$ is an ${n}$-input boolean function, and let ${ \L (f)}$ be equal to ${g(x,y)}$. If ${g}$ is not monotone, there must be ${a,b}$ so that ${g(a,b) = 1}$ and changing some input bit from ${0}$ to ${1}$ makes ${g = 0}$. Clearly, ${\beta(a,b) = 0}$: otherwise, since ${\beta}$ is monotone, after the change ${\beta}$ would still be ${1}$, and so would ${g}$. Since ${g(a,b) = 1}$ and ${\beta(a,b) = 0}$, it must be the case that ${\alpha(a,b) = 1}$. But then the partner of the flipped bit must equal ${1}$: if we flip ${x_j}$ from ${0}$ to ${1}$, then ${\alpha(a,b) = 1}$ forces ${y_j = 1}$, and symmetrically for flipping ${y_j}$. So after the change some ${x_j \wedge y_j = 1}$, hence ${\beta = 1}$ and ${g = 1}$. This is a contradiction. $\Box$

Theorem 2 For any boolean function ${f}$,

$\displaystyle \L (f)(x_1, \dots ,x_n,\lnot x_1, \dots ,\lnot x_n) = f(x_1, \dots ,x_n).$

Proof: Suppose that ${f(x_1, \dots ,x_n)}$ is a boolean function. Let ${y}$ be equal to ${\lnot x_1, \dots ,\lnot x_n}$. Then, by definition ${\alpha(x,y) = 1}$ and ${\beta(x,y) = 0}$. Thus, ${ \L (f)}$ is equal to ${f(x)}$. $\Box$
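Theorems 1 and 2 are easy to check by brute force for small ${n}$. The following sketch (in Python; the choice of parity as the non-monotone test function, and all names, are mine) verifies both properties on every input:

```python
from itertools import product

n = 3
cube = list(product([0, 1], repeat=n))

def alpha(x, y):
    return all(xi | yi for xi, yi in zip(x, y))   # AND_i (x_i OR y_i)

def beta(x, y):
    return any(xi & yi for xi, yi in zip(x, y))   # OR_i (x_i AND y_i)

def L(f):
    # the transformation: L(f)(x, y) = (alpha(x, y) AND f(x)) OR beta(x, y)
    return lambda x, y: bool((alpha(x, y) and f(x)) or beta(x, y))

def f(x):                      # a deliberately non-monotone test: parity
    return sum(x) % 2 == 1

g = L(f)

# Theorem 1: flipping any single 0 input to 1 never decreases L(f)
for x, y in product(cube, cube):
    for i in range(n):
        if x[i] == 0:
            assert g(x, y) <= g(x[:i] + (1,) + x[i + 1:], y)
        if y[i] == 0:
            assert g(x, y) <= g(x, y[:i] + (1,) + y[i + 1:])

# Theorem 2: substituting y = NOT x recovers f
for x in cube:
    assert g(x, tuple(1 - b for b in x)) == f(x)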

Theorem 3 For any boolean function ${f}$,

$\displaystyle C(f) \le C_m(\L (f)) \le C(f) + 4n.$

Proof: Suppose that ${f}$ is an ${n}$-input boolean function. The inequality ${C(f) \le C_m(\L (f))}$ follows from Theorem 2: substituting ${y_i = \lnot x_i}$ into a monotone circuit for ${\L(f)}$ yields a circuit for ${f}$ of the same size. Now suppose that ${f}$ has a circuit of size ${l}$: let the circuit for ${f}$ be ${f_1, \dots ,f_l.}$ We will construct a monotone circuit for ${\L (f)}$. Use ${x_1, \dots ,x_n}$ and ${y_1, \dots ,y_n}$ as the inputs. To start, let ${g_1, \dots ,g_l}$ be the circuit where ${g_i}$ is constructed as follows:

1. if ${f_i}$ is equal to the input ${x_j}$, then ${g_i}$ is the same input ${x_j}$;
2. if ${f_i}$ is equal to the negated input ${\lnot x_j}$, then ${g_i}$ is the input ${y_j}$;
3. if ${f_i = f_j \lor f_k}$, then ${g_i = g_j \lor g_k}$;
4. if ${f_i = f_j \land f_k}$, then ${g_i = g_j \land g_k}$.

Next compute ${g_m = (\alpha \land g_l) \lor \beta}$, extending the circuit to

$\displaystyle {g_1, \dots ,g_l, \dots ,g_m.}$

This takes at most ${4n}$ extra gates: ${2n-1}$ for ${\alpha}$, ${2n-1}$ for ${\beta}$, and two more for the final ${\land}$ and ${\lor}$. So the total circuit now has size at most ${l + 4n}$.

The key is the following claim: ${g_m = \L (f)}$. Suppose that ${g_m}$ differs from ${\L (f)}$ for some values of the inputs ${x}$ and ${y}$. Then, clearly, ${\beta(x,y) = 0}$; otherwise, both would equal ${1}$. Also ${\alpha(x,y)}$ must equal ${1}$; again, if not, both would equal ${0}$. We now claim that for each ${k}$, ${x_k = \lnot y_k}$. Suppose not: then ${x_k = y_k}$ for some ${k}$. The common value cannot be ${1}$, since ${\beta = 0}$; and it cannot be ${0}$, since ${\alpha = 1}$. This proves that for each ${k}$, ${x_k = \lnot y_k}$.

Finally, a simple induction proves that ${g_i = f_i}$ for ${i \le l}$: the only difference between the two computations is that ${y_k}$ takes the place of ${\lnot x_k}$, and on these inputs ${y_k = \lnot x_k}$. This shows that ${g_l = f}$, and hence ${g_m = (\alpha \land f) \lor \beta = \L(f)}$ on these inputs, contradicting the assumption that they differ. $\Box$
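The gate-by-gate construction in the proof can be sketched in code, assuming a toy straight-line-program encoding of circuits (the encoding and the sample circuit below are mine, for illustration only):

```python
from itertools import product

n = 3
# A straight-line circuit for f(x) = (x0 AND NOT x1) OR x2, with negations
# only at the inputs; 'and'/'or' gates refer to earlier entries by index.
circ_f = [('x', 0), ('nx', 1), ('and', 0, 1), ('x', 2), ('or', 2, 3)]

def eval_circ(circ, x, y=None):
    vals = []
    for t in circ:
        if t[0] == 'x':    vals.append(x[t[1]])
        elif t[0] == 'nx': vals.append(1 - x[t[1]])
        elif t[0] == 'y':  vals.append(y[t[1]])
        elif t[0] == 'or': vals.append(vals[t[1]] | vals[t[2]])
        else:              vals.append(vals[t[1]] & vals[t[2]])
    return vals[-1]

def monotone_version(circ, n):
    # cases 1-4: replace each negated input NOT x_j by the fresh input y_j
    g = [('y', t[1]) if t[0] == 'nx' else t for t in circ]
    out = len(g) - 1
    def push(t):
        g.append(t)
        return len(g) - 1
    pairs = [(push(('x', i)), push(('y', i))) for i in range(n)]
    a = b = None
    for xi, yi in pairs:              # alpha = AND_i (x_i OR y_i)
        o = push(('or', xi, yi))
        a = o if a is None else push(('and', a, o))
    for xi, yi in pairs:              # beta = OR_i (x_i AND y_i)
        t = push(('and', xi, yi))
        b = t if b is None else push(('or', b, t))
    push(('or', push(('and', a, out)), b))   # g_m = (alpha AND g_l) OR beta
    return g

def gates(c):                         # count only AND/OR gates
    return sum(t[0] in ('and', 'or') for t in c)

gm = monotone_version(circ_f, n)
assert all(t[0] != 'nx' for t in gm)          # no negations: monotone
assert gates(gm) <= gates(circ_f) + 4 * n     # at most 4n extra gates
for x, y in product(product([0, 1], repeat=n), repeat=2):
    al = all(xi | yi for xi, yi in zip(x, y))
    be = any(xi & yi for xi, yi in zip(x, y))
    assert eval_circ(gm, x, y) == int((al and eval_circ(circ_f, x)) or be)
```

The final loop checks ${g_m = \L(f)}$ on every input pair, not just where ${\alpha = 1}$ and ${\beta = 0}$: as in the proof, both sides agree trivially whenever ${\beta = 1}$ or ${\alpha = 0}$.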

All the above can be generalized to handle formula complexity instead of circuit complexity. There is essentially no difference in the proofs.

Open Problems

My question is: does the following violate either the Bait and Switch Lemma or the NaP result? Suppose that ${f}$ is a boolean function that you wish to show has complexity $t(n)$. Prove that the monotone complexity of ${\L(f)}$ is at least $t(n)$. Then, use Theorem 3 to conclude that the general complexity of ${f}$ is at least $t(n)-4n$. Note, the above is interesting even for $t(n)=cn$, provided $c \gg 4.$

Bait and Switch Lemma (BSL): This does not seem to obviously apply. The point is that we get a lower bound on ${\L(f)}$, which is a monotone function, and there does not appear to be a BSL for monotone functions. The problem is: how can we “add” something to a monotone function and then remove it later on?

Natural Proofs: This also does not seem to directly apply. The problem now is that ${\L(f)}$ is highly non-random. It is a monotone function, but even more: it is almost always ${1}$. As a matter of fact, it is equal to ${0}$ with probability at most $(3/4)^n$, since ${\L(f)}$ can only be ${0}$ when ${\beta = 0}$. How can a function that is essentially a constant look like a random one?
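The $(3/4)^n$ bound is easy to verify exactly for small $n$ (a sketch in Python; the exhaustive count is my illustration):

```python
from itertools import product
from fractions import Fraction

n = 4
cube = list(product([0, 1], repeat=n))

def beta(x, y):
    return any(xi & yi for xi, yi in zip(x, y))   # OR_i (x_i AND y_i)

# L(f)(x, y) can be 0 only when beta(x, y) = 0, no matter what f is.
# beta = 0 means no index i has x_i = y_i = 1, i.e. 3 of the 4 possible
# (x_i, y_i) pairs at each index, so Pr[beta = 0] = (3/4)^n exactly.
zeros = sum(1 for x in cube for y in cube if not beta(x, y))
assert zeros == 3 ** n
assert Fraction(zeros, 4 ** n) == Fraction(3, 4) ** n
```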

I think these questions may be easy to resolve, and that the two obstacles are indeed obstacles. But I am confused. For Arora’s view on NaP see his comments here.

By the way, this is the $32^{nd}$ post of this blog. As Knuth pointed out when he turned $64$, such events are important to computer scientists. So thanks for your support and interest, and I hope that together we can get to many more posts. Thanks again….dick lipton.

1. March 25, 2009 1:49 pm

While Sasha Razborov is certainly a leading complexity theorist, calling him an “expert on circuit lower bounds” strikes me as like going back to 1600 and looking for the world expert in quantum mechanics :)

March 25, 2009 8:59 pm

I am not sure I see where the obstacle is in the first theorem you stated (from Arora-Barak), at least with respect to proving circuit lower bounds. After all, we already know that most functions have exponential circuit complexity. So the conclusion should not deter us from proving a lower bound for a specific function. Right?

March 25, 2009 9:07 pm

The issue is that we know that most functions are hard, but we want to prove that this function is hard. That is the issue. It is like knowing that a large fraction of the numbers with 1000 bits are prime. But we may wish to show that this number is prime. Does that help?

3. March 27, 2009 7:59 pm

Thanks for the 100000 excellent articles thus far! This is really, really good content; have you considered writing a book, even if it was just a collection of musings such as these? I was reading Rabin’s and Karp’s Turing Award lectures yesterday after class, and thinking it’d be great to have a literate “tour of complexity” book, even if it was just annotations of results à la Proofs from THE BOOK.

BTW, the Dyson reactor was the Triga. Elisha Otis was at the New York World’s Fair in 1853; I was wrong about the Chicago Fair being 1896 (it was 1893).

ps Hehehe, wearing a Knuth shirt to 7530 Thursday was not based on your postscript! I’d just got my first Knuth check the day previous: http://www-cs-faculty.stanford.edu/~knuth/boss.html

March 29, 2009 12:28 am

I have to second Anon’s comment. We wish to show that a particular function $f$ is hard, for a particular $f$. The BSL then means that a lot of other functions are hard. But we already knew that lots of other functions are hard.

Dick, your response seems to support the idea that the BSL doesn’t hinder progress. If we wish to show that a particular number is prime, then knowing that lots of other numbers are prime certainly doesn’t hurt. It may not help, but it certainly doesn’t hurt.

March 29, 2009 2:43 pm

I’m with Anon and Jeremy. To use your example, suppose we wanted to test if some number n is composite. Unfortunately, suppose we have a BSL-type Lemma: If n is composite, then most of the numbers between 10n and 100n are as well.

But we already knew the conclusion, so I see no reason to be worried at all. We could just go and prove that n is composite.

April 4, 2009 12:25 pm

I guess what that result says is that one can prove a function is complex, but using any special property of that specific function is unlikely to help.

8. April 8, 2009 4:47 pm

In my view, the drag-along principle should not be viewed as some kind of “absolute” barrier, but more of an observation about the kinds of lower-bound proofs we have been able to come up with.

There are several known ways to get around the drag-along principle: (1) monotone circuits, as you observe here and as noted in Razborov and Rudich’s paper; (2) diagonalization; (3) detailed combinatorial analysis, as in the best known linear lower bounds on circuit sizes of explicit functions (N. Blum, Lachish-Raz). The problem isn’t so much that we don’t know how to escape the drag-along principle at all; the problem is that for the explicit functions of greatest interest, these known ways of escaping it don’t seem very promising. The most obvious attempts to separate NP from P by diagonalization run into Baker-Gill-Solovay, and so far nobody seems to have any idea how to exploit the special combinatorial structure of SAT to prove a circuit lower bound. As for monotone circuits, there is again no definite barrier here, but what is lacking is an idea of how to proceed once (say) the P vs. NP problem is recast in this framework.

To put it another way, if you have some idea about how to prove interesting general circuit lower bounds by means of monotone circuits, then I think you should definitely pursue it, rather than assuming that there must be some barrier here that you’re missing.

April 8, 2009 8:15 pm

I agree. But I still do not see exactly how the current barriers work against certain other approaches, such as monotone circuits or others. I think there is a lot more to say about this issue, and I thank you for your thoughts.

9. April 9, 2009 10:20 am

If you’re interested in pushing the naturalization barrier further, here’s something to think about. Razborov and Rudich stated their result as a barrier against proving P/poly circuit lower bounds. Avi Wigderson suggested that perhaps their result could be strengthened to a barrier against proving superlinear circuit lower bounds. I can’t see how to do this, but it is true that Razborov and Rudich’s construction is easily adapted to show that if SIZE($n^c$) contains $2^{n^\epsilon}$-hard pseudorandom number generators, then there is no natural proof that is useful against SIZE($n^e$) for any $e > 1+ c/\epsilon$. For details see Theorem 1 of my paper on almost-natural proofs. It would be interesting to strengthen this to a barrier against superlinear circuit lower bounds.

10. October 5, 2009 3:59 pm

First of all, I found this discussion of Dick’s, as well as the *new* results appearing in it, a “hammer”. Usually people think that monotone circuits and lower bounds for them are useless. Dick shows that this is not the case.

About the “natural proofs barrier”. Here I have very mixed feelings. (I myself am fighting with lower bounds.) Sasha and Rudich say: to prove a lower bound you have to find a criterion that is (i) large (fulfilled by a random function), (ii) constructive (can be easily verified starting from the truth table), and (iii) useful (satisfying it means being out, having large circuit complexity). Though (iii) is a trivial requirement, both (i) and (ii) seem to me quite artificial. Say, a random graph is not K_4-free, nor is K_4-freeness easy to test (in time logarithmic in the number of vertices). So why should we insist on properties of random functions and/or properties that are constructive?

I have worked on lower bounds for years, and “natural proofs” have not frightened me at all! I hang on to what Avi Wigderson once said: prove *any* lower bound!

Go ahead! Prove lower bounds! Don’t look at any “barriers”!

11. October 6, 2009 8:48 am

P.S. I just realized that the meaning of the expression “to be a hammer” is not so common. It is actually a German substitute for “amazing”. I often heard Germans saying “das ist ein Hammer” (“this is a hammer”) when seeing something more than cool. Neither English nor German is my native language, but I like this expression …

When saying “a hammer” I meant Dick’s very nice trick to make *any* boolean function monotone, that is, to reduce the lower bounds problem to the monotone case. This is different from the standard trick of taking a slice of a function f, that is, letting f’(x) be equal to f(x) for all vectors x with k ones, and letting f’ be equal to 0 (resp. 1) for all vectors with fewer (resp. more) ones.

We already have several arguments to prove high lower bounds for monotone circuits and formulas. Unfortunately, so as they are, they work neither for slice functions nor for functions of the form L(f). Still, this “not working for some functions as yet” is much better (at least less depressing) situation than “no argument for any function”, as in the case of non-monotone circuits.

[Also, I use *word* to emphasize the word, not to quote it.]

September 22, 2010 9:46 am

A small typo: proving Cm(L(f)) >= t(n) should imply C(f) >= t(n) - 4n; instead you wrote that Cm(L(f)) >= t(n) - 4n should imply C(f) >= t(n).

What a nifty construction! I have always felt that negations do not play a significant role in circuit complexity, but I never thought of making this that explicit.