A FOCS 2009 paper that sheds new light on the power of computation over composite moduli

Avi is one of the greatest complexity theorists in the world; do I need to add Wigderson? He has done so much great work that it would take the entire post to even begin to list his contributions. As you probably know, he, along with Omer Reingold and Salil Vadhan, just won the prestigious Gödel Prize for their brilliant work on the zig-zag product.

Today I would like to talk about recent work that he and Arkadev Chattopadhyay have done on one of my favorite general questions: understanding the power of computing over finite rings rather than finite fields. Another simple way to say this is: what is the power of modular computations over composite moduli? We know very little about their power, and this, in my opinion, is one of the great challenges we face in complexity theory.

I have many stories about Avi, but I thought I would share the story of his first published paper. At the time he was a graduate student and I was running a problem seminar. One problem that I discussed one day was on coloring graphs. Back then the following question was open: Let ${G}$ be a graph with ${n}$ vertices and assume that ${G}$ has a ${3}$-coloring. If you restrict your algorithm to polynomial time, how many colors do you need? At the time we did not know about PCP methods, so for all we knew perhaps ${O(1)}$ colors were enough.

I could prove that there was a polynomial-time algorithm that used at most ${O(n/\log n)}$ colors. This had also been observed by others, and is really simple. Just break the vertices of ${G}$ into ${n/(t\log n)}$ groups of size ${t\log n}$. Each group induces a ${3}$-colorable subgraph, so trying all colorings, in time exponential in ${t\log n}$, finds a proper ${3}$-coloring of the group. Do this for each group, using a fresh set of ${3}$ colors each time. This uses ${3n/(t\log n)}$ colors and takes polynomial time, as long as ${t}$ is fixed.
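A minimal sketch of this grouping argument in Python (the names, the adjacency-dict representation, and the fixed group size are my choices, not from the post):

```python
from itertools import product

def three_color_group(vertices, adj):
    """Brute-force a proper 3-coloring of the induced subgraph on
    `vertices`. Exponential in the group size, which is why groups of
    size t*log(n) keep the overall algorithm polynomial."""
    verts = list(vertices)
    for assignment in product(range(3), repeat=len(verts)):
        coloring = dict(zip(verts, assignment))
        if all(coloring[u] != coloring[v]
               for u in verts for v in adj[u] if v in coloring):
            return coloring
    return None  # cannot happen if the whole graph is 3-colorable

def group_coloring(adj, group_size):
    """Color a 3-colorable graph with 3 * (n / group_size) colors:
    split the vertices into groups, brute-force a 3-coloring of each
    group's induced subgraph, and give each group a private palette."""
    vertices = list(adj)
    coloring = {}
    for i in range(0, len(vertices), group_size):
        group = vertices[i:i + group_size]
        local = three_color_group(group, adj)
        # Shift into this group's private set of 3 colors; edges
        # between groups are then automatically properly colored.
        offset = (i // group_size) * 3
        for v, c in local.items():
            coloring[v] = c + offset
    return coloring
```

Since the palettes of distinct groups are disjoint, only edges inside a group need checking, and those are handled by the brute-force step.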

Avi was attending my problem seminar. The next day he saw me and said that he had a polynomial time algorithm that used only ${O(\sqrt n)}$ colors. I was impressed, since I had thought hard about this problem. He then explained his algorithm, which was correct and quite neat. This was his first paper.

You may know his proof, but for those who do not, here are the key ideas. Avi's idea was to look at the degrees of the vertices of ${G}$. If there is a vertex, say ${v}$, with degree at least ${\sqrt n}$, then its neighborhood induces a bipartite graph (this uses that ${G}$ is ${3}$-colorable). So the neighbors of ${v}$ can be colored with ${2}$ colors in polynomial time, and ${v}$ itself with a third. Remove them all and continue. On the other hand, if all the vertices have degree less than ${\sqrt n}$, then use the following theorem:

Theorem: A graph with maximum vertex degree ${d}$ can be colored in at most ${d+1}$ colors.

This is a basic theorem in graph theory. Avi not only knew the theorem, but realized that the coloring could be found in polynomial time; like many graph theory proofs, this one is constructive.
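Putting the two cases together, here is a toy Python sketch of the whole algorithm (the function names, the exact threshold ${\sqrt n}$, and the graph representation are my choices; it assumes the input really is ${3}$-colorable):

```python
import math

def wigderson_color(adj):
    """Toy version of Wigderson's O(sqrt(n))-coloring of a graph
    promised to be 3-colorable. `adj` maps vertices to neighbor sets."""
    adj = {v: set(ns) for v, ns in adj.items()}   # private copy
    n = len(adj)
    threshold = math.isqrt(n)
    coloring, next_color = {}, 0

    def two_color(vertices):
        # The neighborhood of any vertex of a 3-colorable graph is
        # bipartite, so a simple traversal 2-colors it.
        colors = {}
        for s in vertices:
            if s in colors:
                continue
            colors[s], stack = 0, [s]
            while stack:
                u = stack.pop()
                for w in adj[u]:
                    if w in vertices and w not in colors:
                        colors[w] = 1 - colors[u]
                        stack.append(w)
        return colors

    # Phase 1: while some vertex has degree >= sqrt(n), spend 3 fresh
    # colors on it and its neighborhood, then delete all of them.
    while True:
        high = next((v for v in adj if len(adj[v]) >= threshold), None)
        if high is None:
            break
        nbrs = set(adj[high])
        for w, c in two_color(nbrs).items():
            coloring[w] = next_color + c
        coloring[high] = next_color + 2
        next_color += 3
        removed = nbrs | {high}
        for v in removed:
            del adj[v]
        for v in adj:
            adj[v] -= removed

    # Phase 2: all remaining degrees are < sqrt(n), so a greedy
    # coloring with fresh colors needs at most sqrt(n) more of them.
    for v in adj:
        used = {coloring[w] for w in adj[v] if w in coloring}
        c = next_color
        while c in used:
            c += 1
        coloring[v] = c
    return coloring
```

Phase 1 can run at most about ${\sqrt n}$ times, since each round deletes at least ${\sqrt n}$ vertices, so both phases together use ${O(\sqrt n)}$ colors.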

I think, all in all, it is a brilliant piece of work. I was impressed with Avi, and depressed that the rest of us had missed such a beautiful argument.

Modular Computation

The central problem is to understand the power of modular computations. Computations over prime moduli are reasonably well understood; that is not true for composite moduli. The reason composite moduli are harder to understand comes down to the following simple fact:

Lemma: Let ${f:(\mathbb{Z}_{m})^{n} \rightarrow \mathbb{Z}_{m}}$. If ${m}$ is a prime, then ${f}$ is equal to a polynomial modulo ${m}$.

This is false when ${m}$ is composite. The fact that not all functions are polynomials destroys the technology that Alexander Razborov and Roman Smolensky developed, so brilliantly, for understanding circuits of constant depth with modular gates over a fixed prime.
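To see the failure concretely for ${m = 6}$, here is a quick Python check (my own toy example, not from any paper); it uses the classical fact that ${x^3}$ and ${x}$ agree as functions modulo ${6}$:

```python
from itertools import product

M = 6

def induced_function(coeffs, m=M):
    # The map Z_m -> Z_m defined by the polynomial with these
    # coefficients (constant term first), recorded as a value table.
    return tuple(sum(c * x ** i for i, c in enumerate(coeffs)) % m
                 for x in range(m))

# x^3 and x agree as functions on Z_6 (check mod 2 and mod 3 using
# Fermat's little theorem), so every univariate polynomial function
# mod 6 is already induced by a polynomial of degree at most 2.
poly_functions = {induced_function(c) for c in product(range(M), repeat=3)}

# Far fewer than the 6^6 = 46656 functions from Z_6 to Z_6 appear.
# For instance the indicator of {0} is missing: any integer
# polynomial must satisfy p(0) = p(2) (mod 2), but the indicator
# takes values 1 and 0 at those points.
indicator_of_zero = (1, 0, 0, 0, 0, 0)
```

Over a prime modulus the corresponding count would be all ${p^p}$ functions, which is exactly the lemma above.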

One early result that shows the difference between prime and composite moduli is the beautiful work of David Barrington, Richard Beigel, and Steven Rudich, who showed that the OR function can be computed modulo ${6}$ by a polynomial of degree ${O(\sqrt n)}$. They actually prove much more; see their paper for details. The conventional wisdom is to expect better lower-bound technology in the future.

I cannot resist mentioning a result that is due to Nayantara Bhatnagar, Parikshit Gopalan, and myself. We show an equivalence between the degree of symmetric polynomials modulo composites and certain natural simultaneous communication protocols. This reduces the problem of proving bounds on the degree of symmetric polynomials to proving bounds on simultaneous communication protocols. Since communication bounds are often easier to understand, this equivalence helps explain the Barrington-Beigel-Rudich results. It also gives some new results. I may do a detailed post on this in the future.

The problem with our work is that it only works for symmetric polynomials. If the best way to compute some symmetric function is with a non-symmetric polynomial, then our method fails. Thus, one of the neat questions is: how much does the restriction to symmetric polynomials cost?

Linear Equations over Finite Rings

Avi and Arkadev Chattopadhyay have a wonderful paper that will appear in FOCS 2009, with new results on the limits of modular computation. They consider the constant-depth (in fact depth-${3}$) circuit family ${\Gamma}$ with three layers: (i) the top gate is a majority gate, (ii) the middle-level gates are all ${AND}$ gates, and (iii) the bottom gates are ${\text{MOD}_{m}}$ gates that can have arbitrary accepting sets.

As an aside, they also handle the “dual” family to ${\Gamma}$ obtained by replacing all the ${AND}$ gates by ${OR}$ gates.

A word about “accepting sets.” Because the modular computations are over rings, one must allow arbitrary sets for acceptance. By this we mean that a modular gate outputs ${1}$ exactly when

$\displaystyle c_{1}x_{1} + \cdots + c_{n}x_{n} \bmod m \;\in\; A$

where ${A}$ is allowed to be any subset of ${\mathbb{Z}_m}$. Again, this is because not all functions are polynomials. Arkadev points out to me that no superlinear lower bounds are known for depth-two ${\text{MOD}_{6}}$ circuits when the gates have arbitrary accepting sets.
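In code, such a gate might be modeled like this (a toy Python sketch of my own, not from the paper):

```python
def mod_gate(coeffs, accept, m):
    """A MOD_m gate with accepting set `accept`: outputs 1 exactly when
    the linear form c_1*x_1 + ... + c_n*x_n lands in `accept` mod m."""
    def gate(xs):
        return int(sum(c * x for c, x in zip(coeffs, xs)) % m in accept)
    return gate

# With the accepting set {1, 3, 5}, a MOD_6 gate on four 0/1 inputs
# computes parity; a single accepting residue would not suffice,
# since the sum ranges over both 1 and 3 on odd inputs.
parity = mod_gate([1, 1, 1, 1], {1, 3, 5}, 6)
```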

Then,

Theorem: Let ${m,q}$ be co-prime integers such that ${m}$ is the product of two distinct primes. If ${C}$ is a type ${\Gamma}$ circuit that uses ${\text{MOD}_m}$ gates and computes ${\text{MOD}_{q}}$, then the size of the circuit is ${2^{\Omega(n)}}$.

I will not try to explain the full proof of this result; please see their paper for that. However, I will try to give some idea of it. They are naturally led to the problem of understanding systems of ${t}$ equations of the form

$\displaystyle l_{i}(x_{1},x_{2},\dots,x_{n}) \bmod m \;\in\; A_{i}$

where ${l_{1},\dots,l_{t}}$ are linear forms and each ${A_{i}}$ is an accepting set.

Let ${S}$ be the set of solutions to such a system. They must prove that the correlation between the solution set ${S}$ and ${\text{MOD}_{q}}$ is exponentially small.
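For tiny parameters one can compute such a correlation by brute force. Here is a toy Python sketch (the representation of the system and the convention that ${\text{MOD}_{q}}$ fires exactly when the input sum is divisible by ${q}$ are my choices):

```python
from itertools import product

def correlation(system, q, n):
    """Correlation over {0,1}^n between membership in the solution set
    of `system` (a list of (coeffs, accepting_set, m) triples) and the
    MOD_q function, normalized to lie in [0, 1]."""
    diff = 0
    for xs in product((0, 1), repeat=n):
        in_S = all(sum(c * x for c, x in zip(cs, xs)) % m in A
                   for cs, A, m in system)
        mod_q = sum(xs) % q == 0
        # +1 when the two indicators agree, -1 when they disagree.
        diff += 1 if in_S == mod_q else -1
    return abs(diff) / 2 ** n
```

For example, the single equation ${x_{1} + x_{2} \bmod 2 \in \{0\}}$ agrees with ${\text{MOD}_{2}}$ on every input, so its correlation is ${1}$; the point of their lemma is that for ${m = 6}$ and ${q}$ co-prime to ${6}$ no such system achieves anything non-negligible.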

The easy direction is to see that if this were false, there would be a problem. Suppose some system of linear equations had noticeable correlation with ${\text{MOD}_{q}}$. Then we might be able to amplify that correlation via the majority gate, so that the circuit computes ${\text{MOD}_{q}}$ well. This is not a proof, but I think it gives some motivation as to why such equations arise in their proof.

The difficulty with such a system is that we are over a finite ring, not a field. Thus, there is no Gaussian elimination available to help understand the solution set. Their attack is based on exponential sums, and uses several cool insights. Please take a look at the paper for details.

Open Problems

Their main theorem is restricted to moduli that are products of two distinct primes. Unless they have solved it already, an obvious open problem is what happens for three or more primes. Three is sometimes special: I still love Albert Meyer's comment:

“Prove the theorem for ${3}$ and let ${3}$ go to infinity.”

Of course, the other main question is: what is the power of computation modulo composites?

Also related: is there a theory of linear equations over finite rings? This could help with our understanding of modular computation, but would also be of independent interest.

July 7, 2009 9:17 am

Minor typos: I think you want to use four colors to color v and all adjacent vertices, where v has degree at least sqrt n. (Or three colors to color just the adjacent vertices.) The neighborhood including v need not be bipartite. Also, the paragraph after “Linear Equations over Finite Rings” is repeated.

You say that “what is the power of modular computations over composite moduli?” is “one of the great challenges we face in complexity theory.” I can see why this might be hard but don’t have any sense for why it is a great challenge. Are there applications? Or what is the motivation?

Thanks for the post.

July 7, 2009 12:04 pm

The vertices adjacent to v are 2-colorable: in the 3-coloring, v gets colored, say, red, and none of its neighbors can be red, so the neighbors use only the two remaining colors, green and blue. Also, a critical point is that one can 2-color a bipartite graph in polynomial time.

The motivation for the composite case? I guess it is a simple type of computation that we cannot understand. It’s a kind of test: how can we hope to understand polynomial time if we cannot even handle mod 6 gates?

Thanks for the other typo.

July 7, 2009 9:40 am

A minor comment, Dick: the “d + 1 colors” Theorem you mention follows from a simple greedy algorithm; Brooks’ Theorem guarantees d-colorability (unless the graph — assumed connected — is a clique or an odd cycle). Of course, this issue is not a big deal in Avi’s approach.

July 7, 2009 12:00 pm

Of course, my error. Yes, it does not matter in the application.