A discussion and proof that $\mathsf{NLOG}$ is closed under complement

Neil Immerman and Robert Szelepcsenyi are two computer scientists who are famous for solving a long standing open problem. What is interesting is that they solved the problem at the same time, in 1987, and with essentially the same method. Independently. Theory is like that sometimes.

They solved the so called “LBA problem”. This problem was first raised in the 1960’s. When I was a graduate student we all knew about the problem. But no one thought about the question as far as I could tell. It stood unsolved for so long for two reasons, in my opinion. First, no one really worked on the problem. It appeared to be unapproachable. Second, we all guessed wrong. The conventional wisdom (CW) was that the problem went one way. That was obvious. But Immermann and Szelepcsenyi showed that it went the other way. Sound familiar? Could P=NP go the same way? Who knows?

There is one more comment I must add. Robert was a student when he solved the problem. The legend is that he was given a list of homework problems. Since he missed class he did not know that the last problem of his homework was the famous unsolved LBA question. He turned in a solution to the homework that solved all the problems. I cannot imagine what the instructor thought when he saw the solution. Note, it is rumored that this has happened before in mathematics. Some believe this is how Green’s Theorem was first solved. In 1854 Stoke’s included the “theorem” on an examination. Perhaps we should put P=NP on theory exams and hope $\dots$

## The LBA Problem

The LBA problem was first stated by S.Y. Kuroda in 1964. The problem arose in a natural way since in the 1960’s there was great interest in the power of various types of grammars. Grammars were used to describe languages. There was a hierarchy of more and more complex types of grammars. At the bottom were grammars for regular languages, next came context-free grammars. Then, came context-sensistive grammars which were very powerful. They could describe quite complex languages. A natural question that arose immediately is: if $L$ is a context-sensistive language is it’s complement $\bar{L}$ also one? This was easy to resolve for the regular case, yes, and for the context-free case, no. For the context-sensitive case the answer was elusive. The CW was that it had to be the case that the answer was no, but no one had any clue how to proceed.

A language is context-sensitive precisely if it is accepted by a nondeterministic Turing machine whose work tape is exactly the same length as the input. Such a machine is called a Linear Bounded Acceptor or LBA. The LBA problem, then asked, whether or not such machines were closed under complements. That is whether or not if a language was accepted by an LBA, then so was it’s complement. Since LBA’s are nondeterministic, the CW was simple: there must be a context-sensitive language whose complement is not context-sensitive. Note, the intuition was reasonable: how could a non-deterministic Turing Machine with only linear space check that no computation accepted some input? The obvious methods of checking all seemed to require much more than linear space.

## The Solution

George Polya suggests that when trying to prove something, often a good strategy is to try to prove more. This and other great suggestions are in his classic book: How to Solve It. The rationale behind his suggestion is simple: when you are proving “more” you often have a stronger inductive statement.

Neil and Robert did exactly this. They did not determine whether or not the start state can reach the final state. Instead they counted exactly how many states could be reached from the start. Counting how many states are reachable gives them a “Polya” type advantage that makes their whole method work.

The key to their solution is the following cool idea. Suppose that we have any set $S$ of states. Call a nondeterministic machine $G$ a weak generator of $S$ provided it always generates a sequence
$s_{1}, \dots, s_{t},v$
where each $s_{i}$ is a state and $v \in \{0,1\}$. If $v=1$, then the states $s_{i}$ are all distinct and
$S = \{ s_{1}, \dots,s_{t} \}.$
Further, the machine has some nondeterministic computation that outputs a sequence that ends in $1$.

Let $S_{k}$ be all the states that are reachable in at most $k$ steps from the start state. Reducing the LBA problem to calculating the size of $S_{k}$ for a large enough $k$ is an easy remark. See their paper for this. Neil and Robert’s great insight is that there is a weak generator for $S_{k}$ for any $k$. Moreover, this generator can be constructed from just the cardinality of $S_{k}$. This is the remarkable idea: knowing only the size of the set $S_{k}$ one can construct a weak generator for $S_{k}$.

We now need to explain how they can create a weak generator for $S_{k}$ from just the cardinality of $S_{k}$. Let $m=0$. The idea is to go through all states $s$ in a fixed order. For each $s$ we will guess whether $s$ is in $S_{k}$. If we guess no, then we move on to the next state. If we guess yes, then we guess a path from the start to $s$ of length at most $k$. If the guessed path is correct, then we output $s$ and increment the counter $m$ by $1$. If the path is incorrect we move onto the next state. When we have gone through all states $s$ we check the value of $m$. If it is equal to the cardinality of the set $S_{k}$, then we output $v=1$. Otherwise, we output $v=0$.

The key is this is a weak generator for $S_{k}$. All the properties are easy to check. The main point is that if the machine guesses wrong during the computation, then some state will be missed and $m$ will not equal the size of the set $S_{k}$. Very neat.

The last point is that we can construct the size of $S_{k}$ inductively. Clearly, the size of $S_{0}$ is $1$. Suppose that we know the size of $S_{k}$. We will show how to construct the size of $S_{k+1}$. Since we have the size of $S_{k}$ we have a weak generator $G$ for $S_{k}$. We will use that to compute the size of the set $S_{k+1}$. Let $m=0$ again be a counter. We again go through all states $s$ in some fixed order. For each state $s$ we run the generator $G$. We can assume that the generator creates
$s_{1}, \dots, s_{t}, 1.$
If the last is not $1$, then we just reject later when we see this. Otherwise, for each $s_{i}$, we check (i) is $s=s_{i}$ or (ii) is $s$ reachable in one step from $s_{i}$. In either case we increment $m$ by $1$ and move on to the next state $s'$. If both fail, them we try $s_{i+1}$. It is not hard to see that this computes the number of states that are reachable from the start in $k+1$ or less steps.

## A Proof From the Sixties

With all due respect to Neil and Robert there is nothing about their clever solution that could not have been done 20 years ago. They solve the LBA problem directly, and they use no fancy methods or techniques that were developed during the last twenty years. It’s very clever, but there is nothing there that was unknown in the sixties. I often wonder whether there is some similar simple idea that would resolve P=NP? Is there some basic idea(s) that we are all missing? I hope that someone reading this will see something that we all have missed. That would be great.

## Open Problems

An obvious question is: can any similar method be used to show that NP is closed under complement? One interesting note is that for their counting method to work they do not need to get the sets $S_{k}$ exact cardinality. Even a coarse approximation would suffice. This suggests some ideas that might work in other contexts.

September 23, 2009 2:17 pm

“Note, it is rumored that this has happened before in mathematics.”…George Dantzig did something similar as a student. See http://en.wikipedia.org/wiki/George_Dantzig

2. April 27, 2011 10:11 am

The legend about Robert is not entirely true. Branislav Rovan was Robert’s master thesis advisor at Comenius University in Bratislava, Slovakia. While advising Robert, he showed him a table with known results for operations on languages. Complement for LBAs was empty but professor Rovan warned Robert that he should not spend time on it since it seemed to be a hard open problem. Robert didn’t listen and solved it in the end.

This story was told by professor Rovan (http://www.dcs.fmph.uniba.sk/~rovan/) during a lecture on LBAs while I was doing my master’s at Comenius University. You might try to ask him for further details.