How finding upper bounds can separate complexity classes Dexter Kozen is a famous theorist, who among many other wonderful things co-invented the notion of alternating Turing Machines. He also is the author of a terrific book—his other books are pretty neat too—but this one is my favorite. Dexter also proved a controversial theorem decades ago about the power of diagonal methods in the separation of complexity classes. More on that later.

Today I want to talk about a possible approach to the separation of powerful complexity classes. The approach is based on a diagonal argument, as predicted by Dexter’s paper. The method is general, and tries to separate classes not by lower bounds, but finding upper bounds.

When alternating Turing Machines were discovered, the idea was somehow in the “air.” Two papers at the same FOCS presented essentially the same results; these papers were later merged into one famous paper by Ashok Chandra, Dexter Kozen, and Larry Stockmeyer.

I was at Yale, in the computer science department, when Dexter co-invented alternation. It’s a long time ago, but I was working with a graduate student named David Cotton. He was, one of the few students who I worked with over the years that dropped out of the PhD program—I somehow failed him. He had some family issues that probably made it hard for him to continue, but I always felt bad that he did not get his degree. I have lost touch with him, and hope he is doing well.

I mention David because just before FOCS that year, David was beginning to work on alternation. We did not call it that, of course, but as I said the idea was in the “air.” He was working on relating alternating time and space, when we heard about the FOCS results. We did not have the results, but I think that David was getting close, since he was a very strong problem solver.

One example of his ability was evident one day at a problem seminar that we ran at the time at Yale. The seminar meet once a week, where we discussed open problems and people presented ideas or even solutions to the problems.

One week someone, I forget who, presented a neat problem about finding a linear time algorithm for a certain problem—let’s just call it X. The plan was to meet the next week and see if anyone had solved the problem. David did not attend the meeting when we discussed X. Since this was pre-email and Internet, I do not think he even heard about the problem.

That week I spent a huge amount of time trying to get a linear time algorithm for the problem. Partly, X was a cool problem, and mostly it was the competitive spirit. I wanted to solve the problem, before anyone else did. I eventually found a complex recursive algorithm that seemed to be linear, but the analysis of the running time depended on solving a tough recurrence—one that was too hard for me to solve. So I when down the hall to see Stan Eisenstat, who is a numerical analyst, and can solve just about any recurrence. He convinced me quickly that my algorithm ran in ${n\log n}$ time, and I stopped working on the problem—I had no other ideas.

At the next meeting of the seminar, Stan started to explain a linear time algorithm that he had “found.” I quickly saw that Stan was presenting “my” algorithm, but he had changed his mind and now could prove that it ran in linear time. However, the analysis was really complex, soon the blackboard was covered in mathematics: a lemma here and an equation there. It was a mess.

Then, David walked in; he always arrived late for the seminar. He saw Stan at the board and asked someone what was he doing. They handed David a sheet that described the problem X that Stan was trying to solve. David said to no-one in particular, “but that is trivial.” Stan ignored him. Then, David said it louder, and finally Stan turned to David, and as Stan slowly put his chalk down, in the chalkboard tray, he said with a smile on his face: “so why don’t you come up and show us your solution.” David stood up, walked up to the board, picked up the chalk, and proceeded to present an algorithm that ran in ${2n-1}$. The algorithm, once you saw it, was obviously correct—we all had missed it. Stan was speechless, as we all were. The meeting broke up.

Too bad David moved on to other things. In any event, let’s turn to see how we might solve P=NP. Perhaps if David was around still, perhaps he could help us make the following ideas work.

Upper Bounds That Imply Lower Bounds

I would love to see us start to prove some serious separation results. I know I have argued for possible collapses, but I do believe that some of our classes may be different. So I decided that it was time to write about an attack on separation theorems.

Here goes. The main issue that I see is that lower bounds are very difficult. You then say, “how can we then prove separation results without proving lower bounds?” The answer is based on using simple diagonalization along with some combinatorial cleverness. The hope is that this method will be able to avoid the known pitfalls—such as oracle results.

The method is based not on finding lower bounds, but on proving upper bounds. A simple analogy is: suppose I want to show you that ${X < Y}$. I can do two things: I can prove that ${X}$ is small, and conclude that ${X}$ must be less than ${Y}$; or I could prove that ${Y}$ is large and conclude that ${Y}$ is larger than ${X}$. They are really different approaches. Here is an old example that I have mentioned before—I have already discussed this theorem. If all sets in P have quadratic size circuit complexity, then a neat result of Ravi Kannan shows that P ${\neq}$NP. More generally,

Theorem: Suppose there is a constant ${c>0}$ so that for any ${S \subseteq \{0,1\}^{*}}$ computable in polynomial time, for each ${n}$ the circuit complexity of ${S \cap \{0,1\}^{n}}$ is ${O(n^{c})}$. Then, P does not equal NP.

Proof: By way of contradiction assume that P ${=}$NP. Ravi’s theorem proves that for any ${d}$ there is a set in second level of the polynomial hierarchy that requires a circuit of size ${n^{d}}$ for all ${n}$. But, since we have assumed that P ${=}$NP, the hierarchy collapses: thus, there is a set ${S_{d}}$ that is in P, yet ${S_{d}}$ requires circuit complexity of ${n^{d}}$. This is a contradiction, since we are free to choose ${d}$ so that ${d>c}$. $\Box$

Conditional Upper bounds and Separation

The basic insight is that we are good at creating algorithms, and terrible at proving that algorithms do not exist. My approach to separation is to show that the following type implication can be proved: $\displaystyle {\cal U} \implies \mathsf{L} \neq \#\mathsf{P}.$

Where the statement ${{\cal U}}$ is an assertion that a certain upper bound is true, conditioned on the assumption that ${\mathsf{L} = \#\mathsf{P}}$.

The conditional assumption is “free,” since we can assume it and reach a contradiction. Thus, the upper bound ${{\cal U}}$ can use any consequence of the collapse ${\mathsf{L} = \#\mathsf{P}}$. In particular the upper bound can use guessing and counting as part of the algorithm. Note, this follows since the collapse makes all of these fall into the class ${\mathsf{L}}$.

I have several upper bounds in mind. Today I will present the simplest one that is based on the determinant. There are technically weaker statements that I believe can be used. Of course the ultimate goal is to try and get an upper bound assumption that is provable. Such a bound, would lead us to an unconditional separation of the classes ${\mathsf{L}}$ and ${\#\mathsf{P}}$.

Note, before we go further that there is one interesting difference in this approach from trying to find a circuit lower bound. We currently have no idea how to prove strong circuit lower bounds in the general model. Even worse there is evidence that they may be hard to prove—for example, the work on Natural Proofs. On the other hand, there is no reason, that I know, to believe that clever upper bounds, especially conditional ones, exist for reasonable problems. We need only apply our algorithmic creativity to show that the right ${{\cal U}}$ is true and then we are done.

We are trying to prove implications of the form: $\displaystyle {\cal U}|\neg S \implies S.$

Here ${{\cal U}|T}$ means the statement ${{\cal U} \wedge T}$: I like the notation since it emphasizes the fact that ${T}$ is being conditionally assumed.

What To Try To Prove?

I think that P=NP is possible, but if I had to try to prove that they are unequal I would proceed as follows. I would try to first prove something much “easier,” well at least not as hard as the full problem. What I would do is try to prove that $\displaystyle \mathsf{L} \subsetneq \#\mathsf{P}.$

It is entirely possible that ${\#\mathsf{P}}$ could equal ${\mathsf{PSPACE}}$ so that the above could certainly be true.

This reminds me of a great talk given by Juris Hartmanis at Yale back in the 1970’s—see my earlier post. During the talk, he put up a slide that showed, $\displaystyle \mathsf{L} \subseteq \mathsf{P} \subseteq \mathsf{NP} \subseteq \#\mathsf{P} \subseteq \mathsf{PSPACE}.$

He then pointed out that at least one of the subset relationships must be strict, but we have no idea which one. The reason they must be strict is that ${\mathsf{L}}$ is easily shown by diagonalization to be weaker that ${\mathsf{PSPACE}}$: thus, they all cannot be equal. But which subsets are strict?

Describing Circuits

We need a notion of non-uniformity. Suppose that ${A}$ is a circuit and ${x=x_{1},\dots,x_{n}}$ is a string. Say that ${A}$ describes ${x}$ if for all ${i=1,\dots,n}$, $\displaystyle A(i)=x_{i}.$

Let us use ${A \rightarrow x}$ to denote this.

Now suppose that ${S}$ is a set. We use ${S \in }$ TC ${^{0}//f(n)}$ to mean that that there is a depth ${d}$ and a polynomial ${n^{c}}$, so that for all ${n}$ there exists a general boolean circuit ${A}$ of size ${f(n)}$ with the following property: There is a threshold circuit ${T}$ of depth ${d}$ and size at most ${n^{c}}$ such that

1. The circuit ${A}$ describes ${T}$, that is ${A \rightarrow \mathsf{Encode}(T)}$ where ${\mathsf{Encode}(T)}$ is some natural encoding of the threshold circuit ${T}$;
2. The threshold circuit ${T}$ computes ${S}$ for all inputs of length ${n}$.

The above definition is like a second level of non-uniformity. Instead of requiring a short description for the threshold circuit, we require a circuit of small enough size, which provides a short description of the threshold circuit that computes $S$

By ${\mathsf{det}_{n}}$ let us mean the boolean function of deciding the determinant of an ${n}$ by ${n}$ matrix, where the entries are from the finite field of two elements.

How To Try To Prove It?

Here is a sample of the type of result we are working on:

Theorem: Suppose that computing ${\mathsf{det}_{n}}$ is in TC ${^{0}//n^{\epsilon}}$ for any ${\epsilon>0}$. Then, L ${\subsetneq}$ #P is true.

I am working on the proof of this with my graduate student Shiva Kintali. The rough outline is this. Assume by way of contradiction that L= ${\#}$P—call this the “collapse.” Then, suppose that ${M}$ is a Turing machine that runs in time say ${n^{100}}$. The actual running time will depend on the constant in the collapse of ${\#}$P to L—but, for the outline we will ignore this. The goal is to simulate ${M}$ in time ${n^{99}}$, which will violate the classic time hierarchy theorem.

How do we do this? Well we have a number of powerful tools that we can use:

• We have the collapse, which allows us to “count” the size of any easy-to-describe set. This is very powerful.
• We also have that the determinant can be computed by some circuit that has constant depth and polynomial size. Note, we do not know which circuit it is, but, we do know that there is some circuit.
• We can “find” the circuit by using the collapse again, since the collapse allows us also to do guessing efficiently. Of course, we must be able to check that our guess is correct.
• Finally, we can reduce the simulation of ${M}$ to determining whether or not a graph is connected. This can be done by calculating a determinant by the famous Matrix Tree Theorem. Note, we are using the following fact: a graph is connected if and only if it has a spanning tree, which in turn is true if and only if a certain matrix has a non-zero determinant.

Open Problems

Can we make this work? We actually think we can weaken the hypothesis on the determinant quite a bit. Roughly we only need that an approximate determinant type function can be computed efficiently—more on this soon.

1. October 2, 2009 3:44 pm

Since #P is a class of function problems and L is a class of decision problems, I take it you mean “P^{#P}” instead of “#P”.

Just an observation: if the determinant problem is in TC^0 (uniformly) then we have at least NL != P^{#P}. The determinant is hard for NL (in particular the class “GapL”), so if it is in TC^0 then TC^0 = NL. However, we also know TC^0 != P^{#P} (since the permanent is not in TC^0; this is a result of Allender). Maybe this approach can also get the conclusion you want, under a slightly non-uniform assumption like yours?

2. 3. 