A new result on the power of bounded depth circuits

Maurice Jansen has spent the past few years helping push back the frontiers of complexity theory. His penetrating work is revealing the structure of what makes a problem hard, and could lead to the breakthrough that we all are looking for.

Today we would like to discuss recent work that he has done in his current postdoc with Rahul Santhanam, whom we mentioned recently here. Their paper will appear in the proceedings of the 38th ICALP conference, July 4–8 in Zürich, Switzerland. It is exciting work that uses methods that we feel are—and will continue to be—central to finding lower bounds.

Perhaps one of the reasons he has been so successful is that he has also pushed hard on the physical frontiers of theory. Even in the age of the Internet, physical presence means a lot. Maurice has traveled from Russia to China to India, working with some top researchers and getting papers into these countries’ main theory conferences. He also organized a workshop on algebraic complexity while at Aarhus with Peter Bro Miltersen, at which Manindra Agrawal, Pascal Koiran, and Ketan Mulmuley were plenary speakers giving multiple talks. No doubt he has helped make complexity theory more global. The more who think hard about our great problems, the better.

In particular, Maurice has held a postdoc at Tsinghua University in Beijing, at the Institute for Theoretical Computer Science which was co-founded by Andy Yao. Maurice has also visited India where he worked with B.V. Raghavendra Rao and Jayalal Sarma, both students of Meena Mahajan. Globetrotting along with Maurice have been his wife Cathy and their young child. They toured some of the beautiful sights of China, including Xi’an of the famous Terracotta Warriors.

Wherever he is, Maurice has been squarely devoted to difficult problems of lower bounds against families of constant depth arithmetical circuits. Some of this work has been alone and some joint. We might not consider ICALP in Zürich to be as exotic as his other travels, but Switzerland has been “frontier” territory in wars from Hannibal to WW II.

The Frontiers

Let “exponential” size mean size ${2^{n^a}}$ for some ${a > 0}$, let “fully exponential” refer to size ${2^{\Omega(n)}}$, and let “demi-exponential” mean size ${s(n)}$ where ${s(s(n))}$, or ${s(s(s(n)))}$, or some finite iteration ${s^k(n)}$, is exponential—it is like a demitasse, a half cup. We ignore NOT gates at the inputs. Well, we do not really ignore them, for they are very important gates, but we do not count them in the size of a circuit. Arithmetical circuits can have unbounded fan-in ${+}$ and ${*}$ gates, and besides the input gates, can have gates holding fixed numerical constants.

The current frontiers of what is known about lower bounds are:

1. Constant-depth circuits of unbounded fan-in AND and OR gates need exponential size even to compute simple functions such as parity.
2. Constant-depth circuits of Boolean and Mod-${m}$ gates (for fixed ${m}$) need demi-exponential size to compute functions in ${\mathsf{NEXP}}$. This is Ryan Williams’ recent breakthrough result—previously little more was known for these circuits than for the ostensibly more-powerful circuits in the next item.
3. Constant-depth circuits of threshold gates need demi-exponential size to compute functions in ${\mathsf{MA_{EXP}}}$, which is a “Merlin-Arthur” class that extends ${\mathsf{NEXP}}$. The same is known for any kind of 0-1 logic circuit. Put another way, ${\mathsf{MA_{EXP}}}$ is the smallest standard complexity class that is known not to have polynomial-size Boolean circuits.
4. Polynomial-size, constant-depth threshold circuits that are uniform, however, cannot compute the permanent (Eric Allender, after his previous work with Vivek Gore). The size lower bound, however, is known to extend only to size functions ${s(n)}$ that fall short of being demi-exponential for any ${k}$.
5. For constant-depth arithmetic circuits over the rationals, however, essentially nothing is known.

Indeed, Manindra Agrawal and V. Vinay showed that if the permanent in ${N = n^2}$ variables cannot be computed by ${2^{o(n)}}$ size arithmetical circuits of depth 4, then it requires ${2^{\Omega(n)}}$ size, period. Thus we have the

Arithmetical Frontier: Prove super-polynomial lower bounds on arithmetical circuits of depth 4, or of depth 3.

Indeed there are not even any ${\Omega(N^2)}$ bounds on depth 3 for the permanent, and none beyond ${\Omega(N^2)}$ on depth 3 for any polynomial.

Given this state of affairs, what do we do? We have already done some game-changing in going from Boolean to arithmetical circuits. Alternatively, we can look for plausible premises to add, and/or place reasonable restrictions on the circuits. This is the kind of work Maurice has been assiduously doing, ever since his Ph.D. work under Ken which centered on arithmetical circuits whose numerical constants have bounded magnitude. The main new conceptual ingredient is one we have propounded here before.

Enter Succinctness

A string ${x}$ of length ${2^n}$ is succinct if there is an ${n}$-input circuit ${D}$ of size ${\mathsf{poly}(n)}$, or at least size ${\ll 2^n}$, such that for all ${i}$, ${D(i)}$ outputs the bit ${x_i}$. A graph ${G = (V,E)}$ is succinct if its edge relation ${E}$ is succinct, i.e., there is a small ${D}$ such that ${D(i,j) = 1}$ if and only if ${(i,j) \in E}$. A circuit ${C}$ is succinct if its underlying graph is succinct, together with information about the type of a given gate in ${C}$.
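As a toy illustration (ours, not from the paper), consider the string of length ${2^n}$ whose ${i}$th bit is the parity of the binary representation of ${i}$. The describing program ${D}$ is tiny even though the string it describes is huge:

```python
def D(i: int) -> int:
    """Succinct description of a string x of length 2**n:
    D(i) returns the bit x_i, here the parity of the number
    of 1s in the binary representation of the index i."""
    return bin(i).count("1") % 2

# For n = 20 the described string has 2**20 = 1,048,576 bits,
# yet D itself is a constant-size program.
print([D(i) for i in range(8)])  # [0, 1, 1, 0, 1, 0, 0, 1]
```

The same idea gives succinct graphs: a small program ${D(i,j)}$ deciding adjacency describes a graph on exponentially many vertices.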

In particular, Maurice and Rahul define a family of arithmetical circuits ${C_n}$, each of size ${s(n)}$, to be succinct if there are Boolean circuits ${D_n}$ with ${m = 3 + \lceil\log_2 n\rceil + 2\lceil\log_2 s(n)\rceil}$ inputs and size ${2^{o(m)}}$ that decide questions of the form:

Is gate $a$ connected to gate $b$? Does it have type $t$?

Here ${t}$ can designate ${+}$ or ${*}$ for addition and multiplication gates, respectively, or ${i}$ for the ${i}$th input gate, or one of the allowed constants ${-1}$, ${0}$, or ${1}$. Thus these succinct circuits also have bounded constants. However, their notion of succinctness extends to allow ${t}$ also to belong to the double-exponential sequence ${\pm 2,4,16,256,256^2,\dots}$; call them spaced constants. They prove:
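To make the direct-connection queries concrete, here is a minimal sketch of our own toy family (not one from the paper): the circuits ${C_n}$ computing ${x_1 + \cdots + x_n}$, where gates ${0,\dots,n-1}$ are the inputs and gate ${n}$ is a single unbounded fan-in ${+}$ gate. The describing circuit ${D_n}$ needs only index arithmetic:

```python
def connected(n: int, a: int, b: int) -> bool:
    """Direct-connection query: is the output of gate a wired
    into gate b in C_n? Gates 0..n-1 are the inputs; gate n is
    one unbounded fan-in + gate summing all of them."""
    return b == n and 0 <= a < n

def gate_type(n: int, a: int) -> str:
    """Type query: '+' for the sum gate, otherwise which input."""
    if 0 <= a < n:
        return f"input {a + 1}"
    if a == n:
        return "+"
    raise ValueError("no such gate")

print(connected(4, 2, 4), gate_type(4, 4))  # True +
```

The point is that answering such queries takes time (and circuit size) polynomial in the index length, not in the size of ${C_n}$ itself.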

Theorem 1: The permanent does not have constant-depth succinct arithmetical circuits of polynomial size.

Note that we are talking about the non-existence of circuits ${C_n}$ of size ${s(n) = n^{O(1)}}$ whose direct-connection language can be decided by Boolean circuits of size ${n^{o(1)}}$, since ${m = O(\log n)}$ and so ${2^{o(m)} = 2^{o(\log n)} = (2^{\log_2 n})^{o(1)} = n^{o(1)}}$. For technical reasons their proofs employ bounds ${n^{1/\gamma(n)}}$ where ${\gamma(n)}$ is unbounded but stays ${o(\log\log n)}$. Of course allowing the circuits ${D}$ to have size sub-exponential in their number ${m}$ of inputs is pretty generous, since previously we have thought of the succinct describing circuits as having polynomial size. Thus their work makes a fairly minimal restriction of succinctness, and hence their result is strong.

Using Succinctness

Succinctness is a weaker condition than uniformity but more restrictive than full generality. A uniform family of circuits is describable by one single algorithm; in a succinct family, every circuit has a small description, but the descriptions can vary from one circuit to another in any way at all. Of course, a general family of circuits need have no small descriptions, and can vary from one member to another in an arbitrary manner.

One challenge that succinctness raises over uniformity is that if a circuit has a succinct description, then it is possible to guess that description in place of the much larger circuit. This is very powerful: the ability to guess a much larger object without incurring the cost of writing it down is quite useful. The catch is that there is no generic way to know that you have guessed correctly. Thus the use of succinctness assumptions requires some checking method to verify that the guessed circuit is really the correct one.

Proof Idea

Their proof is clever and exploits a condition we’ll call Hypothesis H asserting the existence of a “nice” table ${a = [a(n,i,j)]}$ of numbers. From H the theorem follows. Maurice and Rahul do not know how to prove H directly, but they use a proof-by-contradiction method. Denying Theorem 1 causes the nonuniform counting hierarchy to collapse into nonuniform ${\mathsf{TC^0}}$. This collapse is enough to exploit their ability to create a “moderately nice” table in the polynomial hierarchy. On applying the famous theorem of Seinosuke Toda and the proof by Viktória Zankó that the 0-1 permanent is complete for ${\mathsf{\#P}}$ via many-one reductions, the table is good enough to imply Theorem 1 anyway. The only way to resolve this contradiction is for Theorem 1 to be true.

To state H, which is really a family of hypotheses over different constant depths ${d}$, define the language

$\displaystyle \mathsf{Bit}(a) = \{(1^n,i,j,k): \text{ bit } k \text{ of } a(n,i,j) = 1 \}.$

Hypothesis H then asserts that there exist polynomials ${p,q}$ and a function ${a}$, where ${a(n,i,j)}$ is defined for ${0 \leq i < n}$ and ${0 \leq j < p(n)}$ and is a string of at most ${q(n)}$ bits, so that the following holds.

• Easy: ${\mathsf{Bit}(a)}$ is decidable by succinct ${\mathsf{TC}^0}$ circuits.
• Hitting: For all sufficiently large ${n}$, for every depth-${d}$ arithmetic circuit ${E}$ of size ${n}$ with spaced constants, if the polynomial ${E(x_1,\dots,x_m)}$ computed by ${E}$ (${m \leq n}$) is not identically zero, then there exists ${j}$ such that

$\displaystyle E(a(n,0,j),a(n,1,j),\dots,a(n,m-1,j)) \neq 0.$
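The hitting condition is the standard hitting-set requirement from black-box polynomial identity testing. In the simplest univariate case (our illustration, far weaker than the multivariate condition above), any set of distinct rational points of size greater than the degree hits every nonzero polynomial, since a nonzero polynomial cannot vanish more often than its degree:

```python
def hits(E, points):
    """Return the first index j with E(points[j]) != 0, or None
    if E vanishes on the whole list. If E is a nonzero univariate
    polynomial of degree < len(points) and the points are distinct,
    the answer is never None."""
    for j, pt in enumerate(points):
        if E(pt) != 0:
            return j
    return None

# E(x) = x**3 - x is nonzero yet vanishes at 0, 1, and -1;
# any four distinct points must contain a witness.
print(hits(lambda x: x ** 3 - x, [0, 1, -1, 2]))  # 3
```

The hypothesis asks for the much stronger multivariate analogue: one easy-to-describe table of points that witnesses the nonzeroness of every small depth-${d}$ circuit.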

Note that the circuit ${E}$ need not be succinct, though it has spaced constants. Nor does H impose a second succinctness assumption as a condition on Theorem 1 itself. Instead what is happening is that denying Theorem 1—i.e. allowing succinct constant-depth arithmetical circuits for the permanent—allows one to find succinct Boolean threshold circuits for ${\mathsf{Bit}(a)}$, which together with satisfying the hitting condition yields the desired contradiction.

This looks convoluted, since it is; but it is not circular and is quite ingenious. It is the kind of argument that we need to understand better, since it seems very powerful and could lead to great progress on lower bounds.

Further details are bound up with black-box de-randomization of polynomial identity testing, and can be found in the paper. It is a globetrotting tour of major complexity concepts, even more than we’ve mentioned.

Open Problems

Can we better understand the kind of diagonalization that the above notion of hitting represents, and extend Maurice and Rahul’s kind of argument? Can we use it for Boolean as well as arithmetical complexity theory?

Can the restriction on constants be removed from their theorem? Note that other lower bounds on arithmetic circuits also require restrictions on constants. Is there really a way to use the size or bit-pattern of constants to “cheat” and decrease the size of a circuit? Jacques Morgenstern’s 1973 ${\Omega(n\log n)}$ lower bound on the FFT uses the bounded-constants assumption, and we believe that today, nearly forty years later, there is no proof that relaxes it in any significant way.

April 28, 2011 4:41 pm

Can you give an example of an (explicit, fairly nice) demi-exponential function? Examples aren’t springing to mind.

• April 28, 2011 5:05 pm

They are notoriously hard to represent explicitly. The best I can do, it seems, is to link this MathOverflow item on half-exponential functions.

• April 29, 2011 11:57 am

This is my first attempt at using Luca Trevisan’s LaTeX2WP package, so apologies are extended if the following has typesetting errors.

To expand upon Ken’s remarks regarding demi-exponential functions (which is a fine name for them!), the analytic structure of these functions derives from the Lambert ${W}$ function, which is the subject of a classic article On the Lambert W Function (1996) by Corless, Gonnet, Hare, Jeffrey, and Knuth (yes, one somehow knew that Donald Knuth’s name would arise in connection to such an interesting function!).

The connection arises via the following construction. Suppose that a demi-exponential function ${d}$ satisfies ${(d \circ d \circ \dots \circ d)(z) = \gamma \beta^z}$, where ${d}$ is composed ${k}$ times. We say that ${k}$ is the order of the demi-function, ${\gamma}$ is the gain, and ${\beta}$ is the base. It is easy to show that the fixed points of ${d}$ are given explicitly in terms of the ${n}$-th branch of the Lambert function as ${z_f = -W_n(-\gamma \ln \beta)/\ln \beta}$. Then by a series expansion about these fixed points (optionally augmented by a Padé resummation) it is straightforward to construct the demi-exponential functions both formally and numerically.
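The fixed-point formula is easy to check numerically. A minimal sketch using SciPy’s `lambertw` (the parameter values ${\gamma = 0.3}$, ${\beta = 2}$ are our own illustrative choices, picked to satisfy ${\gamma \le 1/(e \ln \beta)}$ so that real fixed points exist):

```python
import math
from scipy.special import lambertw

gamma, beta = 0.3, 2.0
assert gamma <= 1 / (math.e * math.log(beta))  # real-fixed-point regime

for branch in (0, -1):  # the two real branches of W
    z_f = -lambertw(-gamma * math.log(beta), branch).real / math.log(beta)
    # A fixed point of d is also a fixed point of d composed k times,
    # so it must satisfy gamma * beta**z_f = z_f.
    assert abs(gamma * beta ** z_f - z_f) < 1e-9
    print(branch, z_f)
```

Both branches give positive real fixed points here, in accord with the smoothness condition described above.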

Provided the demi-exponential base and gain satisfy ${\gamma \le 1/(e \ln \beta)}$, such that the fixed points associated to the ${n=-1}$ branch of the ${W}$-function are real and positive, this construction yields smooth demi-exponential functions that pleasingly accord with our intuition of what demi-exponential functions “should” look like.

Counter-intuitively though, whenever the specified gain and base are sufficiently large that ${\gamma > 1/(e \ln \beta)}$, then the demi-exponential function has no real-valued fixed points, but rather develops jump-type singularities. In particular, the seemingly reasonable parameters ${\beta=e}$ and ${\gamma=1}$ have no smooth demi-exponential function associated to them (at least, that’s the numerical evidence).

Perhaps this is one reason that demi-exponential functions have a reputation for being difficult to construct … it is indeed very difficult to construct smooth functions for ranges of parameters such that no function has the desired smoothness!

It might be feasible (AFAICT) to write an article On demi-exponential functions associated to the Lambert W Function, and to include these functions in standard numerical packages (SciPy, MATLAB, Mathematica, etc.). Some tough challenges would have to be met, however. Especially, there is at present no known integral representation of the demi-exponential functions (known to me, anyway), and yet such a representation would be very useful (perhaps even essential) in rigorously proving the analytical structures that the numerical Padé approximants show us so clearly.

April 29, 2011 6:11 am

Ravi Kannan got the Knuth Prize. But no mention of it in the theory blogs! Why?

• April 29, 2011 9:13 am

Kannan’s Knuth Prize was Tuesday’s top story on the Fortnow/GASARCH Computational Complexity weblog.

April 30, 2011 2:08 am

You are right! I wish Dick and Ken would write one of these days on some of Ravi’s work mentioned in the Knuth Prize announcement.

May 2, 2011 10:12 am

Curious,

We should. His work on matrix sampling and low-rank approximations is magical.

April 29, 2011 12:24 pm

a very recent related manuscript is available here:
http://homepages.inf.ed.ac.uk/cgi/mjansen1/JaSa11-2.pdf

June 21, 2013 4:41 am

Morgenstern proves an $\Omega(n \log n)$ lower bound for the Fourier transform in the bounded-coefficient model.

Let $x = [x_1, x_2, \ldots, x_n]'$ be a given vector and $F$ be the Fourier transform matrix.

It is not known whether $Fx$ needs $n \log n$ operations in the unbounded-coefficient model when the $x_i$’s are treated generically as variables.

If one treats each $x_i$ as a finite-precision integer of bounded size $2^b$ (instead of as a variable), is it possible that $Fx$ needs just $O(n)$ $\{+,-,\times\}$ operations in the unbounded-coefficient model?

Assume $F$ is of finite precision. Or replace $F$ with $H$ in the above question, where $H$ is the Hadamard transform matrix.
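For the Hadamard variant, note that the fast Walsh–Hadamard transform already computes $Hx$ with $n \log_2 n$ operations while staying inside the bounded-coefficient model: every step is an addition or subtraction, so all coefficients are $\pm 1$. A small sketch (ours):

```python
def fwht(x):
    """Fast Walsh-Hadamard transform of a list whose length is a
    power of two. Every step replaces a pair of entries by their
    sum and difference, so all coefficients are +/-1:
    n*log2(n) bounded-coefficient operations in total."""
    x = list(x)
    h = 1
    while h < len(x):
        for i in range(0, len(x), 2 * h):
            for j in range(i, i + h):
                x[j], x[j + h] = x[j] + x[j + h], x[j] - x[j + h]
        h *= 2
    return x

print(fwht([1, 0, 1, 0]))  # [2, 2, 0, 0]
```

The open question above is whether exploiting bounded-precision inputs and unbounded coefficients could push this down to $O(n)$ operations.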