New Ideas On Nondeterministic Simulation
Using the weak trace method to get better simulations of nondeterministic Turing machines
Rao Kosaraju is a world famous computer scientist who has worked on many aspects of the theory of computation. He has done seminal work on algorithms, on data structures, on geometry, on applications of theory to diverse areas, and on complexity theory. For example, he worked on, perhaps, the first “correct” proof of the Vector Addition Reachability Problem–see this for more details.
This is a continuation of my last two posts on the simulation of nondeterministic Turing machines (NTM). The last couple of days have been busy trying to get my head around what is going on with the major question: How fast can we simulate a non-deterministic Turing Machine (NTM)?
As I thought about Turing machines, I could not stop thinking about one of Rao’s beautiful results: his algorithm for finding the strong connected components (SCC) of a directed graph. I thought of this for two reasons: first, his major trick is a “reversal” of the edges of the graph, and second there is a time-reversal aspect to the history of his algorithm. See this for a nice exposition of how Rao’s algorithm works. Reversals play an important role, as you will see, in understanding how to simulate NTM’s.
As I understand, Rao discovered his algorithm in 1978, while the first algorithm that solved SCC in linear time was published in 1972 by Bob Tarjan. Our friends at Wikipedia get the time sequence a bit messed up: sometimes they point out that Bob’s algorithm is a variation of Rao’s. That is a bit strange, to call an earlier algorithm a variation of a later one. Bob was first. Rao was second. Rao’s algorithm has some advantages over Bob’s; for example, it is somewhat simpler than Bob’s. But there is no doubt that Tarjan’s algorithm came first.
The story of how Rao discovered his algorithm may help explain the confusion of who did what when. Is that clear: ”who did what when”–sounds a bit like a comment made at a Senate hearing.
Rao told me that one day he was eating lunch at the Johns Hopkins’ faculty club, which is a quite pleasant place to eat lunch, when he realized that he had forgotten his notes for his algorithms class. Right after lunch he had planned to present Tarjan’s SCC algorithm to this class. Rao was in trouble, since he could not remember Bob’s linear time algorithm. Rao’s office was too far away: by the time he got back to his office and got his notes, he would be very late to class. This was way before the Internet, so there was no way to lookup anything, which meant there was no way to get to his notes. No way to find the details of Tarjan’s algorithm.
What to do? Rao said he had a few minutes, so he sat down and worked out Bob’s linear time algorithm for SCC, “from memory”. He was relieved that he seemed to “recall” all the details, and just made it to his class on-time. There, he presented “Bob’s” linear time algorithm to his students, which went well. After class he went back to his office and finally looked at his notes. He was shocked to see that what he presented in class was an algorithm that was fundamentally different than Tarjan’s: Rao had “remembered” a new algorithm for SCC. Pretty neat. Perhaps, from time to time, we should all forget our notes before teaching?
I would like to say that is how I found the better simulation I discussed in my last post, or how I found the simple theorem I will discuss shortly. But, that would be incorrect. I also think Rao’s story shows something about his tremendous abilities that few have.
Summary of Results
Here is a summary of what is going on: Suppose that the NTM has states, tapes, and distinct non-blank symbols for the tapes. Let the time the machine takes be . In the following I will summarize what I know about the simulation of by a deterministic Turing machine.
I will not be concerned with terms of the form where is absolute, since the dominant term is usually exponential in . The exact powers of will be worked out in a paper.
Some of the results require a bit of care since the simulating machine is a deterministic Turing machine and not a random access machine. See Table 1.
Classic Method Based on Tree Search: This uses time and polynomial in space. The value of is . The idea here is to view the computation as a tree. Each node is a Turing machine configuration. The branching factor is based on the number of ways the finite state control can change and the possible motions of the tape heads.
Method Based on DAG search: This uses time and space . The value of is . The idea here is to view the computation as a directed acyclic graph (DAG). Each node is a Turing machine configuration. Since it is a DAG, the number vertices is roughly the number of possible tape configurations. This places outside of the exponential term. See the post on Bart’s idea for more details.
Our Method Based on Weak Traces: This uses time and space that is polynomial in . The value of is .
Note, if space must be polynomial in the weak trace method is best. If only time matters, then the DAG method is best. However, we have a new method that dominates all the above provided one is only interested in the exponential base of .
A New Method Based on Weak Traces: This uses time and space . The value of is where is arbitrary and the value of depends on .
The key to the last result is that we get as the base on the exponential term. This is a step toward the conjecture. If we could push this a bit further than I believe that we could solve the conjecture.
The high level view of the last result is this. First, we prove that we can reduce the number of reversals in a NTM without making any change to the running time. This yields a weak trace method that removes the factor of due to tape motion. The second idea is based on a simple observation: in a weak trace there is no reason to record the symbol left on a tape location that will not ever be re-visited. This symbol can be dropped. The way we exploit this is to argue that either (i) dropping these redundant bits yields a big improvement, or (ii) we can use a different strategy based on the DAG method to simulate the NTM.
This will all be discussed in detail in a future post–not the next one–enough on NTM for a few days. Finally, I am working with Subrahmanyam Kalyanasundaram and Farbod Shokrieh on a detailed paper on the new bound.
A key lemma is the following reversal reduction lemma:
Lemma: Suppose that is a multi-tape NTM that runs in time . Then, for any , there is another multi-tape NTM that simulates and runs in time
and makes at most reversals. Further, both machines have the same number of tapes and tape alphabet.
The point of this lemma is that the time of the simulation is essentially the same, yet the number of reversals is reduced by an arbitrary factor.
Thus, we have proved, for the case of two tapes and binary alphabet,
Theorem: For a two-tape binary alphabet NTM with running time , there is a deterministic simulation with running time,
for any .
Can we decrease the constant below ? I believe that this should be possible.
A last open problem for today: what happens with the simulation of a bounded error probabilistic Turing machine? Can we get the same type of efficient deterministic simulations?