An overview of the new circuit lower bound of Ryan Williams

Joel Seiferas is a complexity theorist who has proved many important results. One of his results that is especially critical in modern complexity theory is the famous nondeterministic time hierarchy theorem. I also note that one of his most recent papers is on attempting to make the famous AKS sorting algorithm less galactic.

Today I thought we would outline in more detail how Ryan Williams’ great result goes. The “we” is Ken Regan and I.

Joel, besides his theorems and results, almost singlehandedly maintained a indispensable resource for theoretical computer science. This was in the days before Citeseer, DBLP, and other online bibliographies, even before Netscape and AltaVista. He typed citations of papers as they appeared in an abbreviated format, producing a compact plaintext file that was easy to browse. There were no page or issue numbers, just enough so you could find the paper—in your physical library. Joel’s handiwork can still be seen here.

Recall Ryan proves that ${\mathsf{NTIME}(2^n)}$ does not have ${\mathsf{ACC}^0}$ circuits of polynomial size. The first class is a uniform complexity class—it consists of languages accepted by nondeterministic Turing Machines that can run in exponential time. The class is uniform because any language in the class is defined by a program: the program can be clever, can do different things on different length inputs. But it is a fixed finite program.

The second is a non-uniform complexity class—it consists of languages defined by constant depth circuits with special gates. The gates are allowed to compute certain function types based on the sum of the inputs—and additionally, in ${\mathsf{ACC}^0}$ circuits, NOT gates are is allowed too. Each gate sums its input bits and then decide whether to output a ${0}$ or a ${1}$ based on the sum:

1. OR: Output ${1}$ if the sum is non-zero.
2. MOD: Output ${1}$ if the sum is zero modulo a fixed ${m>1}$.

Since ${\mathsf{ACC}^0}$ is non-uniform the circuit can depend on the input length ${n}$ in any manner at all. This is why it is called non-uniform. You can think of it as defined by a “program,” but one that is infinite in length:

If ${n=1}$, then use this circuit ${\dots}$

If ${n=2}$, then use this circuit ${\dots}$

${\vdots}$

If ${n=1415926535 8979323846}$, then use this circuit ${\dots}$

${\vdots}$

The reason there is such a mystery about the behavior of circuits is this ability to vary with the input length ${n}$, the ability to change what they do with the value of ${n}$. Nonuniform circuits can exploit the existence of problem-solving structures that are hard to compute, such as witness sets that arise from applications of the “Probabilistic Method.” Even so, it seems hard to believe that this ability really can be exploited to make problems much easier, but even among quite natural problems we cannot rule this out. The “we” here is not Ken and I, but is everyone.

The Landscape of Ignorance

In this section we summarize what is known about the various complexity classes: ${\mathsf{ACC}^0}$, ${\mathsf{P}}$, ${\mathsf{NP}}$, ${\mathsf{PP}}$, ${\mathsf{EXP}}$, ${\mathsf{DTIME}[2^n]}$, ${\mathsf{NTIME}[2^n]}$, and so on. You can skip this section and still understand the outline of Ryan’s proof. If you read on do not let the abundance of \textsf{sans-serif} words scare you. The general take-away is that we know some things, but not as much as we would like.

We do not know whether uniform ${\mathsf{ACC}^0}$ is properly contained in ${\mathsf{NP}}$, but we do know it is different from ${\mathsf{PP}}$ by a theorem of Eric Allender and his student Vivek Gore. Eric later extended this to separate uniform ${\mathsf{TC^0}}$ from ${\mathsf{PP}}$, and Ryan’s paper also builds on Allender-Gore. Note that since ${\mathsf{PP} \subseteq \mathsf{EXP}}$ and ${\mathsf{EXP}}$ has simple reductions to ${\mathsf{DTIME}[2^n]}$, it was known that ${\mathsf{DTIME}[2^n]}$ and hence ${\mathsf{NTIME}[2^n]}$ do not have uniform ${\mathsf{TC^0}}$ circuits. For nonuniform circuits, however, the best-known separation from nonuniform ${\mathsf{ACC}^0}$ was basically the same as for ${\mathsf{TC^0}}$ and ${\mathsf{P/poly}}$ itself: a class with somewhat more alternation than ${\mathsf{NTIME}[2^n]}$ that was defined in terms of “Merlin-Arthur games.”

Understand, most people believe that even nonuniform ${\mathsf{ACC}^0}$ is really weak, in some sense exponentially weaker than ${\mathsf{P}}$. And ${\mathsf{P}}$ is exponentially weaker than ${\mathsf{DTIME}[2^n]}$, let alone ${\mathsf{NTIME}[2^n]}$. Thus two exponentials and a little more were still not giving us a separation with nonuniform circuits—until Ryan’s paper, that is.

This is why circuits are so mysterious. Thanks to Ryan they are a bit less today.

Outline of the Proof

Suppose that you wish to prove that ${\mathsf{NTIME}(2^n)}$ is not contained in the circuit family ${\mathsf{ACC}^0}$ of polynomial size. Here is how you might proceed, especially if you are Ryan.

There is no cost in assuming that this is false: that ${\mathsf{NTIME}[2^n]}$ is contained in the circuit class ${\mathsf{ACC}^0}$. You can assume this—of course you then must show that it leads to a contradiction. Godfrey Hardy once said:

Reductio ad absurdum, which Euclid loved so much, is one of a mathematician’s finest weapons. It is a far finer gambit than any chess play: a chess player may offer the sacrifice of a pawn or even a piece, but a mathematician offers the game.

The plan is to use the assumption to show that we can simulate any nondeterministic Turing machine that runs in time ${2^n}$ in time ${o(2^n)}$. Provided the little ${o(2^n)}$ is not too close to ${2^n}$, then by the wonderful time hierarchy for such machines you will have your contradiction. This theorem, due to Joel Seiferas, Michael Fischer, and Albert Meyer is one of the great results of complexity theory. Note, it is critical that the machines are nondeterministic, since the corresponding hierarchy theorem for deterministic machines is much weaker.

We will now outline the steps of the proof. We will leave out the detailed constants and calculation parts—proofs like this have to check various bounds carefully. They are not that hard, but will detract from giving you the overall flow of the beautiful ideas. We hope this outline is helpful.

${\bullet}$ Pump Up The Volume. We have assumed that ${\mathsf{NTIME}[2^n]}$ is in ${\mathsf{ACC}^0}$. This is a powerful assumption, but it implies even more. So the first thing to do is to use it, via standard arguments, to show that another complexity class has ${\mathsf{ACC}^0}$ circuits. Let’s just call this class ${\mathsf{N^*}}$. For the expert it is ${\mathsf{E^{NP}}}$.

${\bullet}$ All The Small Things. Steve Cook’s famous theorem showed that any nondeterministic computation could be encoded into a 3-SAT problem. Since he first proved this almost 40 years ago, there have been improvements made to the exact encoding. At first any encoding was fine, later for more refined results more efficient encodings became necessary.

The next step is to assume that ${L}$ is a language in ${\mathsf{NTIME}[2^n]}$. Then we can conclude that there is a very simple circuit ${\mathsf{Des}(k)}$ that describes the set of 3-clauses. This is a critical step and uses the better encodings. The point is important: we cannot afford to write down the clauses that describe the computation of ${L}$. They would be at least ${2^n}$ long, actually longer, and we would already be in trouble. How can we run faster than ${2^n}$ if we write down an object that size? We certainly cannot. Hence, we need a small circuit that indirectly describes the clauses. The circuit ${\mathsf{Des}}$ can be used to get the ${k^{th}}$ clause in polynomial time. Again the key is that the describing circuit is tiny compared with the set of clauses it describes.

This type of Cook’s Theorem is called ${\mathsf{SUCCINCT \ SAT}}$. While the details of proving it are quite neat, the idea that such a describing circuit exists should not be too shocking. The point is that Cook’s Theorem is quite “regular” and this allows all to work.

${\bullet}$ Can I Get A Witness? We have to decide if a given input ${x}$ is in ${L}$. We know that this is true if and only if—I hate iff—the clauses described by the circuit ${\mathsf{Des}}$ are satisfiable. So the next step is to try and guess an assignment that makes all the clauses happy. Again we have a problem: the assignment is huge, and so we cannot even think about writing it down. But we have a powerful assumption: we have that ${\mathsf{N^*}}$ is in ${\mathsf{ACC}^0}$ of polynomial size. Thus the idea is to use an ${\mathsf{N^*}}$ machine to guess the assignment, but again use our assumption to get another polynomial sized circuit ${\mathsf{Wit}}$. This circuit has the property that ${\mathsf{Wit}(i)}$ is the assignment to the ${i^{th}}$ variable used in the clauses.

There is a technical point here—actually the reason we “pumped up the volume” to the class ${\mathsf{N^*}}$. In order to show that ${\mathsf{Wit}}$ exists we use a standard trick: we need to select the “first” satisfying assignment. If there were many assignments, then there would be the possibility that out witness circuit could be wrong. This step needs ${\mathsf{N^*}}$ and uses a binary search trick to get the first assignment. Are you still with us? This last part is a technical but necessary part; you can just avoid thinking about if you like.

${\bullet}$ Trust Yourself. Let’s summarize: we have a small circuit that describes the clauses and a small circuit that describes an alleged witness. The last step is to check that they are in agreement, that is to verify that every clause is satisfied. Again there are way too many clauses to do this by brute force. So what do we do?

This is Ryan’s main insight: this is another SAT problem itself, but it is about a polynomial sized ${\mathsf{ACC}^0}$ circuit. The SAT problem checks whether or not each clause is satisfied by the assignment encoded by the witness circuit. The good news is this is a small object—it is only polynomial size. The bad news is it a SAT problem for a quite powerful class of circuits. But the last insight is that Ryan can prove this can be done in ${o(2^n)}$.

${\bullet}$ Don’t Worry About The Government.

Finally, Ryan exploits the fact that ${\mathsf{ACC}^0}$ circuits ${C_n}$ of size ${s}$ can be represented as a simple function of many simple sub-circuits. Namely, there are circuits ${D_n}$ that have just two levels: a single output gate that computes a symmetric function of ${s' = 2^{\mathsf{polylog}(s)}}$-many level-1 gates, where each level-1 gate is an AND of input bits, some of which may be negated. It should be noted that his top-level argument does not use the standard feature of such “${\mathsf{SYM+}}$” circuits that the AND gates have only ${\mathsf{polylog}(s)}$ fan-in, though this property is needed inductively in their construction, and is achieved by his paper’s Appendix. This is like having a system of very small local governments with overlapping jurisdictions, united by a symmetric, hence unbiased, but very simple central authority. Would this be a better system of government, without state and county layers? Since it would need many thousands of Congressmen, probably not.

It is also important that ${\mathsf{ACC}^0}$ circuits are closed under a second kind of “localization”: For any circuits ${C_n}$ and ${\ell < n}$, we can form new circuits ${C'_m}$ with ${m = n - \ell}$ inputs, by choosing a set ${J}$ of ${\ell}$-many input variables, and taking a big OR of ${2^\ell}$-many circuits ${C_m^a}$. Each of the ${C_m^a}$ is obtained by fixing the variables in ${J}$ to an assignment ${a \in \{0,1\}^{\ell}}$. Then, and only then, do we apply the above to get ${D'_m}$ from ${C'_m}$. For a good choice of ${\ell}$, close but not too close to ${n}$, the ${2^\ell}$ term gets broken down by the ${\mathsf{polylog}(s)}$ terms in the above conversion. Applications of dynamic programming, replacing a fast multiplication method for “skinny” matrices in the original draft, then complete Ryan’s proof.

Open Problems

We have been thinking about how to axiomatize the properties of ${\mathsf{ACC}^0}$ circuits (in particular, the ${n^{O(1)}}$-size case of the class ${\mathsf{ACC}^0}$), so as to state general conditions for a circuit class ${{\cal C}}$ to enjoy Ryan’s separation theorems. We would like to get this for ${{\cal C} = \mathsf{TC}^0}$ or even ${{\cal C} = \mathsf{P/poly}}$. Well, everyone would like to get this—there has been much discussion of how possibly to extend Ryan’s methods, and we hope we can help in the conversation.

Finally, there are two special “hidden” features in this post—do you see them?

1. November 16, 2010 8:52 am

Oh, that was helpful! I think I have to read it more than once, but it is such a blessing you and Ken did that job. Thanks so much!

November 16, 2010 9:03 am

You probably refer to a misplaced paragraph (I assume Joel is the author of the outline) and the digits of
pi as a value of n.

November 16, 2010 9:30 am

I hope you don’t mind if I note that the last paragraph before the “Open Problems” is a bit rushed, especially compared to the rest of the very nice exposition. I think many people would be interested in your insights on the fast ACC SAT algorithm.

• November 16, 2010 10:54 am

Actually, I forgot to remove the sentence “For a good choice of {\ell}, close but not too close to {n}, the {2^\ell} term gets broken down by the {\mathsf{polylog}(s)} terms in the above conversion.” That’s not actually used, per my sentence with “It should be noted…” in the paragraph above it.

I should continue by saying that the circuits D’_m have m = n – \ell inputs, where \ell will be chosen equal to n^e for some small e (epsilon). The big point is then to show that satisfiability of D’_m can be checked in time order-of 2^m not 2^n, multiplied by lower-order factors in n. Since m = n – n^e, this will create a bound on the nondeterministic time that is o(2^n), sufficiently far to invoke the known NTIME hierarchy theorems. The remaining detail is to show that there is no slippage in the exponent of 2^m. Originally this was done by making two matrices, one with rows indexed by all possible assignments to the first m/2 variables and columns indexed by the polylog(s)-many AND gates, and the other similar but transposed for the other m/2 variables. The entries are 0 if the corresponding AND gate is forced to zero by the corresponding part of the assignment. Multiplying the matrices and feeding into the symmetric gate then determines whether D’_m is satisfiable. Since the matrices are “thin”, a cited theorem by Coppersmith shows they can be multiplied with only logarithmic overhead. This has now been replaced by a more-direct method. Dick and I are still checking details on that, and how much of the setup is actually necessary—this is what we allude to in the open-problems section. And more information may be forthcoming…

4. November 16, 2010 9:51 am

I can see clearly now!

November 16, 2010 1:31 pm

daveagp

Very good

5. November 16, 2010 6:42 pm

I was surprised to see a picture of Joel on a blog post about Ryan’s result! Check out http://www.math.ias.edu/~ryanw/ for a nice photo of Ryan Williams.

• November 16, 2010 11:43 pm

Hi, Danny. To offer a partly-apt chess analogy: Kasparov may have found the best Theoretical Novelties in the Gruenfeld System, but we still call it the Gruenfeld System :-).

To answer more directly, Dick and I take the long view. Here the seminal idea is that you can get a tighter hierarchy for NTIME than DTIME—as perceived by Seiferas-Fischer-Meyer [JACM 1978], with that author order.

November 17, 2010 9:27 am

It seems that you have left out a key piece of Ryan’s argument. The “Pump up the Volume” part of the outline is misleading. and does not seem to apply to the proof that NEXP does not have small ACC^0 circuits. In the proof that E^NP can’t have small ACC^0 circuits, the implied collapse is indeed at the NE level (2^n time can be sped up to o(2^n) time). However, the argument that NEXP does not have small ACC^0 circuits requires a version of Impagliazzo, Kabanets, and Wigderson’s “easy witness” technique, in particular that under the assumption that NEXP is in P/poly (a much weaker assumption then E^NP is in ACC^0), one can find witnesses for NEXP languages that are succinctly described by polysize circuits . This certainly does not work via “standard arguments” as you describe. In this case, BTW, there is no binary search for witnessing assignments since the witnessing circuits exist directly via the version of the [IKW] result (and the fact that the assumption means that those polysize circuits are actually ACC^0 circuits of suitable size).

November 17, 2010 2:00 pm

Thanks for the terrific write-up of a guide to Ryan’s welcome new result. I hadn’t
encountered SUCCINCT SAT before (sheltered life and all that) and I’m surprised how
little Google turns up under that terminology. What a cute trick to arrange that “the describing circuit is tiny compared with the set of clauses it describes.” The usual
problem that I’ve encountered with this sort of succinctness in other contexts is that by
simple counting arguments there are never enough short/succinct descriptions to go
around. I haven’t wrapped my head around whether that’s a problem that needs to be
evaded in the current proof or if it just seems like a problem.

An example evasion in Impagliazzo/Kabanets/Wigderson “In Search of an Easy
Witness: Exponential Time vs. Probabilistic Polynomial Time” is elaborated as
“Since there are few binary strings with small descriptions, such a search is more efficient…if our search fails, then we obtain…a certain ‘hardness test’ [with which to use]
known hardness-randomness tradeoffs.” Does anything like that happen in this proof?
Aren’t there about as many CNF formulas as binary strings?

Anyway, I haven’t even grokked the definition of SUCCINCT SAT yet. I get that it is a
problem whose inputs are circuits expressing compressed encodings of 3-CNF formulas, but what I don’t get is what standard encoding of 3-CNF formulas is assumed and how sensitive SUCCINCT SAT is to the cleverness of that starting encoding. No point in compressing a bloated encoding of course, surely you have to start by removing all the bloat you can. What hurts is that I’m sure this is actually the easy part that should have been covered in my first complexity theory class.

Anyway the tour through the proof as presented here is highly intuitive and makes me
want to dig through the details. Good show!

• November 19, 2010 3:40 pm

Paul, you are right, thanks! In terms of our thinking, the “Can I get a Witness?” section was supposed to gloss over the use of IKW, and we got our wires crossed on whether ” N* ” should mean NEXP or E^NP. We’ve actually been wanting to streamline the argument so that it uses a single N* that sits over both NE/NEXP and E^NP, so that the proof for NE doesn’t have to be routed thru details of the proof for E^NP. (This might, however, have to sacrifice the better concrete lower bound for E^NP.)

Here’s our amended version of the chain of events:

() Assumption A1: NTIME[2^n] \subseteq ACC^0.

() Goal: Simulate a general NTIME[2^n] machine N in nondeterministic time 2^n/g(n), for big enough g(n) to contradict A1.

() A1 ==> A2: NEXP \subseteq ACC^0.

() NEXP <= P/poly (which is weaker than A2) ==> A3: every language in NEXP has poly-size witness-describing circuits ==> A3′: every yes-instance of Succinct 3SAT has a succinct satisfying assignment. (This uses IKW, which uses de-randomization.)

() A3′ && P \subseteq ACC^0 ==> A4: the Succinct 3SAT instances obtained from the cases where the machine N accepts its input x have ACC^0 circuits describing a witness. (This routes thru the proof for the E^NP case; again you are right that the binary-search step for E^NP is not part of where these roads join.)

Then A4 feeds into the “Trust Yourself” section…

7. November 18, 2010 6:40 am

(probably many asked it already) Is it possible that Ryan method will lead to strengthening his theorem to the case where ACC is replaced by higher complexity class where the natural proof barrier applies like tc0 or even P

• November 19, 2010 4:40 pm

Indeed, at one point the “Don’t Worry” section was titled, “What’s Special About ACC?” As Ryan himself emphasizes, and my reply to Paul (Beame) just now re-iterates, there’s a de-coupling of the assumption “N*”-in=ACC^0 into N*-in-P/poly and … later … P/poly things you get are in ACC^0. Thus alternatives for doing the “later” things for higher classes would yield separations, e.g. Ryan’s paper mentions a sufficient condition to separate from NC^1.