An application of the notion of block respecting to better circuit simulations of Turing computations

Wolfgang Paul is one of the great theorists who unfortunately left theory, decades ago, for other pastures. He left to become a professor in Germany, where he created a group to work on parallel computers. I recall that this was during the time of the Ultracomputer project of Jack Schwartz at NYU, and many theorists worked on the design of parallel computers. Paul actually worked with his group on building real parallel computers. Later he continued working on electronic design technology—especially formal properties of such systems. For example, one of his latest papers is “On the correctness of upper layers of automotive systems.” He has received the prestigious Gottfried Wilhelm Leibniz Prize, among many other awards.

Today I want to talk about one of his great ideas, which played a crucial role in the proofs of several theorems on the separation of complexity classes. The idea is the notion of a “block respecting” Turing Machine (TM). He told me that he believed that this notion—while only used as part of the proof of his main theorems—was his most important contribution to complexity theory. I agree, at least partially, since the notion of block respecting has been used a number of times to prove other results. It probably is not as well known as it should be, and today I will show you a recent application of its power.

Paul was visiting Cornell, I recall for the year, when he proved his great separation theorem, with John Hopcroft and Leslie Valiant, showing that space is more powerful than time. I spent most of the summer after the result was announced at IBM Yorktown. Ron Rivest and Albert Meyer were also there for the whole summer, and the three of us decided to rent a house together near IBM.

One day Ron and I thought we could use the planar separator theorem in a clever way to improve the space-time separation theorem of Paul-Hopcroft-Valiant. Ron and I explained the argument to Albert, who believed it, and then it got late and Ron and I said goodnight to Albert and went to bed.

I was awakened from a deep sleep by Albert banging on my door in the middle of the night. Albert also did the same with Ron’s door, and soon Ron and I came to our senses, at least partially. Albert then proceeded to tell us that we were wrong, and he backed his claim up with a beautiful argument that showed that our whole approach was incorrect. Ron and I then retired—for a second time—to our respective rooms and went back to sleep. Perhaps Albert could have waited for the morning, but I guess not. In the light of day it was clear that Albert had completely destroyed any hope we had of improving the separation theorem, so Ron and I never again tried to improve it.

Block Respecting Turing Machines

In 1975 at FOCS, John Hopcroft, Wolfgang Paul, and Leslie Valiant presented their famous theorem:

Theorem: Every deterministic TM of time complexity ${T(n)}$ can be simulated by a deterministic TM of space complexity ${T(n)/\log T(n)}$.

One of the keys to the proof of the theorem is the notion of a block respecting TM. A TM is block respecting for block size ${B}$ provided that, when each Turing tape is divided into blocks of length ${B}$, the machine is only allowed to cross a block boundary at times ${ckB}$, i.e., at multiples of ${cB}$, where ${c}$ is an absolute constant.

Theorem: If a deterministic TM runs in time ${T(n)}$, then there is a block respecting TM that accepts the same language and runs in time ${O(T(n))}$.

What is cool about this theorem is that it is proved by an idea that is really a system programming trick. Essentially, the key is to think of the blocks of each Turing tape as “pages”, and the goal then becomes one of trying to avoid too many page faults.

Imagine the tapes have been divided into blocks, each of size ${B}$. If one just runs the TM, one can easily arrange that at the times ${ckB}$ each tape head is at the first cell of some block. The difficulty is that during the ${cB}$ steps between these times, the TM’s heads could repeatedly cross block boundaries. Indeed, the worst case could happen: one of the heads could jump back and forth over a block boundary at every time step.

The analogy to paging in real systems would be the behavior that is sometimes called thrashing. If the blocks were real pages, then this jumping back and forth could lead to a huge cost in loading and unloading the pages. Paul’s great insight is that for a small cost in running time this thrashing can be avoided. See their paper for the details. Perhaps this type of insight was why Paul was drawn toward real computers and away from TM’s. In any event it was a huge loss for the theory community.
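To make the block-respecting condition concrete, here is a small sketch, in Python, of a checker for a single tape. The trace representation (a list of head positions, one per time step) and the defaults are my own illustration, not from the paper; the thrashing trace shows exactly the bad behavior described above.

```python
# Sketch: check that a recorded head trajectory is block respecting.
# Assumptions (illustrative, not from the paper): `positions` lists the
# head position at each time step for one tape; c is the constant from
# the definition; block index of a cell is position // B.

def is_block_respecting(positions, B, c=1):
    """Return True if the head crosses a block boundary only at
    times that are multiples of c*B."""
    for t in range(1, len(positions)):
        prev_block = positions[t - 1] // B
        cur_block = positions[t] // B
        if cur_block != prev_block and t % (c * B) != 0:
            return False
    return True

# A head sweeping right crosses boundaries only at t = 4, 8 when B = 4:
ok_trace = list(range(12))
print(is_block_respecting(ok_trace, B=4))   # True
# A "thrashing" head jumps over a boundary at every step:
bad_trace = [3, 4, 3, 4, 3, 4]
print(is_block_respecting(bad_trace, B=4))  # False
```

The checker only verifies the property; Paul's contribution is the harder direction, namely transforming an arbitrary machine so the property holds with only a constant-factor slowdown.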

If a TM that runs in time ${T(n)}$ is block respecting with block size ${B}$, then there is a natural graph that is defined for each input ${x}$. This graph is the block-respecting graph of the TM on input ${x}$. The graph consists of vertices that represent the blocks of each of the tapes of the TM that are visited at time ${ckB}$ for ${k=1,\dots,T(n)/B}$. Suppose there are two tapes. Then let ${U_{k}}$ and ${V_{k}}$ be the blocks visited by the TM at time ${ckB}$ on the respective tapes. Then, the graph has another vertex ${W_{k}}$ and edges:

$\displaystyle \begin{array}{rcl} U_{k} \rightarrow W_{k} &\text{ and }& V_{k} \rightarrow W_{k} \\ W_{k} \rightarrow U_{k+1} &\text{ and }& W_{k} \rightarrow V_{k+1}. \end{array}$

Think of ${W_{k}}$ as the operation of taking the contents of the two blocks ${U_{k}}$ and ${V_{k}}$ and creating the new contents of those blocks after the next ${cB}$ steps.
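As a sketch, the edge rule above can be written out directly. Here U and V are the sequences of block indices visited at the times ${ckB}$ on the two tapes; the sample sequences are made up purely for illustration.

```python
# Sketch of the block-respecting graph for a two-tape machine.
# U[k], V[k] are the block indices visited at time c*k*B on tapes 1, 2;
# the ("W", k) vertices model the cB-step computation segments.

def block_respecting_graph(U, V):
    """Return the edge list  U_k -> W_k,  V_k -> W_k,
    W_k -> U_{k+1},  W_k -> V_{k+1}."""
    edges = []
    for k in range(len(U) - 1):
        edges.append((("U", k, U[k]), ("W", k)))
        edges.append((("V", k, V[k]), ("W", k)))
        edges.append((("W", k), ("U", k + 1, U[k + 1])))
        edges.append((("W", k), ("V", k + 1, V[k + 1])))
    return edges

# Made-up visit sequences for three checkpoint times:
edges = block_respecting_graph(U=[0, 0, 1], V=[0, 1, 1])
for e in edges:
    print(e)
```

Each ${W_{k}}$ has in-degree and out-degree two here; with ${t}$ tapes it would be ${t}$.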

Note, for each input ${x}$, there can be a different graph. Thus, one of the key steps in any application of the block respecting concept is to somehow “guess” the actual block-respecting graph for the given input. While the graph can vary with different inputs, it has a critical property: the graph is much easier to describe than the whole computation of the TM. The TM takes ${T(n)}$ steps and so requires about that number of bits to describe its operation. However, the block-respecting graph has order ${T(n)/B}$ vertices, and so can be described in many fewer bits. This reduction is of critical importance in all applications. I will shortly exploit this to prove a result.
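Here is a back-of-the-envelope comparison of the two description sizes. The constants and exponents are my own illustrative choices, not from the paper: each graph vertex is charged roughly ${\log(T/B)}$ bits to name its block.

```python
import math

# Rough comparison: bits to describe a full T(n)-step computation
# versus bits to describe its block-respecting graph.
# Illustrative constants only: quasi-linear time n*(log n)^2,
# block size B = (log n)^3.

def description_bits(n):
    log_n = int(math.log2(n))
    T = n * log_n ** 2          # quasi-linear time bound
    B = log_n ** 3              # block size: a power of log n
    full = T                    # ~T(n) bits for the whole run
    k = T // B                  # ~T(n)/B graph vertices
    # each vertex names a block, so ~log(T/B) bits per vertex
    graph = k * max(1, int(math.log2(max(k, 2))))
    return full, graph

full, graph = description_bits(2 ** 20)
print(full, graph, graph < full)
```

For ${n = 2^{20}}$ the graph description is smaller by orders of magnitude, which is the room the pigeonhole argument below exploits.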

Approximation Theorem

Say that ${f_{n}(x)}$, where ${x=(x_{1},\dots,x_{n})}$, is computable by a deterministic TM in quasi-linear time if the machine runs in time bounded by ${n(\log n)^{c}}$ for some constant ${c}$.

Let ${g(x)}$ take on values in the set ${\{0,1,*\}}$: call such a function a partial boolean function. Say that ${g}$ approximates a boolean function ${f}$ provided that whenever ${g(x)}$ has a boolean value, ${g(x) = f(x)}$. Let the support of the approximation be the set of ${x \in \{0,1\}^{n}}$ so that ${g(x)}$ is boolean. Thus, if ${g=f}$, the support is all ${n}$ bit binary strings, while if ${g}$ is always ${*}$, the support is the empty set.
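A toy example of these definitions, with a made-up ${f}$ and ${g}$: take ${f}$ to be parity on three bits, and let ${g}$ answer only when the first bit is ${0}$, returning ${*}$ otherwise.

```python
from itertools import product

# f: parity on 3 bits. g: a partial boolean function that answers
# only when the first bit is 0 and returns "*" otherwise.
# (Both functions are made up purely to illustrate the definitions.)

def f(x):
    return sum(x) % 2

def g(x):
    return f(x) if x[0] == 0 else "*"

# The support: inputs where g is boolean.
support = [x for x in product([0, 1], repeat=3) if g(x) != "*"]
print(len(support))  # 4 of the 8 inputs
```

On its support ${g}$ agrees with ${f}$ everywhere, so ${g}$ approximates ${f}$ with support of size ${4 = 2^{n-1}}$ here.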

Theorem: Suppose that ${f_{n}(x)}$ is a boolean function that is computed by a deterministic TM that runs in quasi-linear time. Then, there is a partial boolean function ${g_{n}(x)}$ that approximates ${f_{n}(x)}$, and the circuit complexity of ${g_{n}(x)}$ is at most ${O(n \log \log n)}$. Moreover, the cardinality of the support of the approximation is at least ${2^{n-o(n)}}$.

Note, this is both stronger than the famous Fischer-Pippenger circuit bound, and weaker. It is strong since the size of the circuit is now ${O(n \log \log n)}$ instead of ${O(n \log n)}$. It is weaker since the approximation only holds for a set of size at least ${2^{n-o(n)}}$ instead of holding everywhere.

Proof: An outline of the proof. Use the block respecting theorem with ${B}$ equal to a power of ${\log n}$. Notice that there are fewer block-respecting graphs than inputs, so by the pigeonhole principle some block-respecting graph must work for many inputs. Also, the simulation uses the Fischer-Pippenger construction only on computations of size ${B}$. Thus, the circuit is of size

$\displaystyle \frac{T(n)}{B} O(B \log B)$

which is quasi-linearly bounded. $\Box$
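Spelling out that last size bound: with ${B}$ a power of ${\log n}$, say ${B = (\log n)^{k}}$ for a constant ${k}$, we have ${\log B = O(\log \log n)}$, so

$\displaystyle \frac{T(n)}{B} \cdot O(B \log B) \;=\; O(T(n) \log B) \;=\; O(T(n) \log \log n),$

which is again quasi-linear when ${T(n) \le n(\log n)^{c}}$.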

Open Problem

The main result here is that there is a boolean circuit that beats the Fischer-Pippenger theorem on many inputs. However, the number of inputs covered can still be an exponentially small fraction of ${2^{n}}$. Can we do better? Is there a way to increase the fraction of inputs on which the approximation works, say to a polynomial fraction? Another question concerns the approximation concept used here, which never makes a mistake: when ${g(x)=0}$ or ${g(x)=1}$ we know that it is right. Can we do better if we allow errors?

Finally, the methods here only work up to quasi-linear time computations. Can we do something similar for arbitrary time bounds on the TM’s?

August 5, 2009 8:31 pm

Note, this is…stronger than the famous Fischer-Pippenger circuit bound…since the size of the circuit is now O(n \log \log n) instead of O(n \log n)

Doesn’t the Fischer-Pippenger result give a circuit of size O(n \log^{c+1} n)?

August 6, 2009 5:30 am

I have a question: The second theorem of this post, does it hold for any choice of B?

Aside from this question, it may be obvious but I would just like to add that I love your blog-posts and I read all of them. Thank you!

August 6, 2009 8:44 am

Yes, any B. But B must satisfy the usual TM restrictions: the machine must be able to construct the value B in time T. So except for degenerate values of B all is well.

August 10, 2009 8:09 pm

Does the use of the pigeonhole principle make the approximating circuits (ostensibly) non-uniform? Indeed, there are almost-fully-exponentially-many possible block-respecting graphs involved. This seems to be a notable difference from the Pippenger-Fischer circuits.

It might be interesting to work this out for some concrete languages or functions that have linear-time TMs, but are not known to have linear-size circuits, such as the function I wrote about here.