How to use memory without a trace

Harry Buhrman is a quantum complexity theorist. That is, if you take a measure of him, you will sometimes find that he is doing complexity theory, sometimes quantum computing, sometimes both, and sometimes neither. Whichever it is, he has been doing it superbly for a quarter century. He has a joint paper in STOC 2014, which begins today, that is part complexity and part—something else.

Today Ken and I want to talk about the beautiful new concept and results in that paper. They can be viewed as complexity theory results: this model of computation can compute such-and-such. They can also be viewed as security results: malware programs can do things to avoid detection.

We are not suggesting that Harry and his co-authors—Richard Cleve, Michal Koucký, Bruno Loff, and Florian Speelman—are up to something nefarious. Their paper is called Computing with a full memory: Catalytic space. No hint of security or anything diabolical here. Indeed they seem to think of the result more as a result about chemical reactions, hence the term “catalytic.”

We beg to see their result in a different light. Of course, the authors can call it whatever they like: one of the perks of writing the paper. Of course, we can interpret the result also in any way that we like: fair is fair. Seeing it as related to malware and security is meant as a compliment, and we certainly hope that Harry and his co-authors take it that way. Or perhaps they might take it in a complementary way.

## A New Model of Computation

So what is their new concept? It is a new type of model of computation. We quote them:

Imagine the following scenario. You want to perform a computation that requires more memory than you currently have available on your computer. One way of dealing with this problem is by installing a new hard drive. As it turns out you have a hard drive but it is full with data, pictures, movies, files, etc. You don’t need to access that data at the moment but you also don’t want to erase it. Can you use the hard drive for your computation, possibly altering its contents temporarily, guaranteeing that when the computation is completed, the hard drive is back in its original state with all the data intact?

They go on to qualify an important point:

One natural approach is to compress the data on the hard disk as much as possible, use the freed-up space for your computation and finally uncompress the data, restoring it to its original setting. But suppose that the data is not compressible. In other words, your scheme has to always work no matter the contents of the hard drive. Can you still make good use of this additional space?

The short answer is yes. In order to state their result we need to formalize their model a bit. Their model leans on a restricted class of ordinary random-access machine (RAM) programs with the standard numerical ${\{+,-,*\}}$ operations and registers ${r_1,r_2,r_3,\dots}$. Their transparent programs have instructions only of the form

$\displaystyle r_i \text{ \tt += } r_j*r_k \qquad\text{and its inverse}\qquad r_i \text{ \tt -= } r_j*r_k,$

where ${j,k \neq i}$. If we restrict ${i > n}$ for some ${n}$, then the initial contents ${x_1,\dots,x_n}$ of ${r_1,\dots,r_n}$ are read-only inputs. The main point is that the model also allows ${r_{n+1}}$ through the last register ${r_N}$ to have arbitrary initial values. A program ${P}$ computes ${y = f(x_1,\dots,x_n)}$ if the initial value ${r_N}$ is transformed to ${r_N + y}$.

The stealthy aspect is that if you allocate one more register ${r_{N+1}}$, then you can write the program

$\displaystyle P' = r_{N+1} \text{ \tt -= } r_N;\;\;P;\;\;r_{N+1} \text{ \tt += } r_N;\;\;P^{-1},$

where ${P^{-1}}$ applies the inverse of each instruction in ${P}$ in reverse order. This computes ${y}$ into ${r_{N+1}}$ while leaving all previous registers unaltered. The trick works even if the values belong to a finite ring. With a finite ring we can model the registers by regularly-spaced finite segments of a tape, and thus hope to control the programs by Turing machines.
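To see the trick in action, here is a small Python simulation of transparent programs. This is a sketch under our own conventions, not the paper's: the instruction encoding, the modulus, and the constant-one register ${r_0}$ (used to express the plain ${r_i \mathrel{+}= r_j}$ copies as products) are all our choices.

```python
import random

M = 97  # work in the finite ring Z_97; any modulus would do

def run(program, regs):
    """Execute a list of (i, sign, j, k) instructions: r_i (sign)= r_j * r_k."""
    for i, sign, j, k in program:
        regs[i] = (regs[i] + sign * regs[j] * regs[k]) % M

def inverse(program):
    """Reverse the instruction order and flip each sign."""
    return [(i, -sign, j, k) for i, sign, j, k in reversed(program)]

# P adds y = x1*x2 into r_N, where r_1, r_2 hold the inputs and N = 3.
# Register r_0 permanently holds 1, so a plain copy r_i += r_j can be
# written as the product r_i += r_j * r_0 (an encoding choice of ours).
N = 3
P = [(N, +1, 1, 2)]

# P' = (r_{N+1} -= r_N); P; (r_{N+1} += r_N); P^{-1}
P_prime = [(N + 1, -1, N, 0)] + P + [(N + 1, +1, N, 0)] + inverse(P)

x1, x2 = 5, 7
regs = [1, x1, x2, random.randrange(M), random.randrange(M)]  # junk in r_3, r_4
before = regs[:]
run(P_prime, regs)

assert regs[:N + 1] == before[:N + 1]                # r_0 .. r_N untouched
assert regs[N + 1] == (before[N + 1] + x1 * x2) % M  # y landed in r_{N+1}
print("r_{N+1} gained", (regs[N + 1] - before[N + 1]) % M)  # prints 35
```

Whatever junk ${r_3}$ and ${r_4}$ start with, the program ends with ${r_3}$ restored exactly and ${r_4}$ incremented by ${x_1 x_2}$—which is precisely the catalytic guarantee.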

## The Idea

With ${s(n)}$ tape cells, one can write the address of any location on a tape of size ${2^{s(n)}}$. Thus a log-space Turing machine can control a polynomial number of registers, and also read the input tape. If the programs ${P_n}$ to run on inputs of length ${n}$ are uniform enough to generate each successive instruction in logspace, then our Turing machine can execute them. Our machine can also save an ${O(\log n)}$-sized portion of a larger tape, introduce its own value there, and restore the original later.

Now suppose the input holds the adjacency matrix ${A}$ of a directed graph on ${n = 2^k}$ nodes. Add the identity to get ${B = A + I}$; if we can square ${B}$ iteratively ${k}$ times, then the ${(1,n)}$ entry of ${B^n}$ is nonzero exactly when there is a path from node ${1}$ to node ${n}$, which solves a problem complete for nondeterministic logspace. Squaring a matrix has very regular polynomial-length transparent programs.

The final hitch is that the final ${(1,n)}$ entry, and entries along the way, can have values up to ${n^n}$. One solution is to compute modulo ${p}$ for a long enough logspace-uniform sequence of small primes ${p}$; then by the Chinese remainder theorem we can accept iff any one of them gives a non-zero value mod ${p}$. Since ${p}$ has ${O(\log n)}$ bits, our machine can remember the original (arbitrary) contents of the destination register ${r_N}$, detect whether the computed value changed them, and restore ${r_N}$ afterward.
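As a concrete—and entirely unofficial—illustration of the squaring-mod-small-primes idea, here is a Python sketch. The particular primes and helper names are our own choices, and we square ${B = A + I}$ so that one matrix power detects paths of every length up to ${n}$:

```python
# Reachability by iterated squaring: B = A + I, so B^(2^k)[0][n-1] > 0 iff
# node n is reachable from node 1. Entries can reach n^n over the integers,
# so we square modulo small primes and accept iff some residue is nonzero.
# (Soundness: a nonzero residue certifies a nonzero integer count. For
# completeness the primes' product must exceed n^n; we use toy primes here.)

def mat_mult_mod(X, Y, p):
    n = len(X)
    return [[sum(X[i][t] * Y[t][j] for t in range(n)) % p for j in range(n)]
            for i in range(n)]

def reachable(A, primes):
    """True iff there is a path from node 1 to node n in the digraph A."""
    n = len(A)
    for p in primes:
        B = [[(A[i][j] + (i == j)) % p for j in range(n)] for i in range(n)]
        k = max(1, (n - 1).bit_length())   # square k times: 2^k >= n - 1
        for _ in range(k):
            B = mat_mult_mod(B, B, p)
        if B[0][n - 1] != 0:
            return True
    return False

# 4-node example: edges 1 -> 2 -> 4, node 3 isolated. The integer count in
# the (1,4) entry is 6, which vanishes mod 2 and mod 3 -- prime 5 saves us,
# showing why a whole sequence of primes is needed.
A = [[0, 1, 0, 0],
     [0, 0, 0, 1],
     [0, 0, 0, 0],
     [0, 0, 0, 0]]
print(reachable(A, [2, 3, 5]))   # True: the path 1 -> 2 -> 4 exists
```

In the catalytic setting each of these arithmetic steps would in turn be carried out by transparent instructions over the borrowed registers, with the modulus ${p}$ small enough to fit the machine's own ${O(\log n)}$ workspace.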

## Main Results

Their formal model thus augments the standard Turing machine model—which has work tapes and read-only input—with an auxiliary tape. They call the machine catalytic if it computes the same value for every possible initial setting of the auxiliary tape, and that tape is unchanged at the end. They define ${\mathsf{CSPACE}(s(n))}$ to be the class of sets computed by catalytic Turing machines whose work-tape is bounded by ${s(n)}$ tape cells, and whose auxiliary space is bounded by ${2^{O(s(n))}}$ cells. Write ${\mathsf{CL}}$ for ${\mathsf{CSPACE}(\log n)}$. This is a deterministic log-space class, yet by the above argument, it contains non-deterministic logspace: ${\mathsf{NL \subseteq CL}}$. In fact, they prove something stronger:

Theorem 1 ${\mathsf{TC^1 \subseteq CL \subseteq ZPP}}$.

Here ${\mathsf{TC^1}}$ is the class of languages accepted by uniform log-depth threshold circuits, and ${\mathsf{ZPP}}$ is the class of languages with programs that halt in expected polynomial time and are always correct when they halt. The former inclusion builds on wonderful known connections between arithmetical and Boolean complexity classes using counting as a go-between, and their paper gives a great survey of them. The latter inclusion exploits determinism: computations started with different initial settings of the auxiliary tape can never share a configuration, since from a shared configuration onward the machine would behave identically, making the final auxiliary-tape contents wrong for at least one of the settings. Since the work space is logarithmic, the total number of configurations divided by the number of auxiliary settings is polynomial, so the average computation length per setting is polynomial. Hence a randomized TM that sets the auxiliary tape uniformly at random runs in expected polynomial time. They also prove:
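To spell out the counting behind the ${\mathsf{ZPP}}$ bound—our paraphrase, not the paper's notation—suppose the machine has ${q}$ states, work tape of length ${O(\log n)}$, and auxiliary tape of length ${a = n^{O(1)}}$. The total number of configurations is at most

$\displaystyle q \cdot n \cdot O(\log n) \cdot 2^{O(\log n)} \cdot a \cdot 2^{a} \;=\; n^{O(1)} \cdot 2^{a}.$

The ${2^a}$ auxiliary settings get disjoint sets of configurations, so the average computation length per setting is ${n^{O(1)}}$—and the expectation over a uniformly random setting is exactly this average.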

Theorem 2

• If the Exponential Time Hypothesis is true, that is ${\mathsf{SAT} \notin \mathsf{BPTIME(2^{o(n)})}}$, then ${\mathsf{SAT} \notin \mathsf{CSPACE(o(n))}}$.

• There is an oracle ${A}$ such that ${\mathsf{CL}^A = \mathsf{PSPACE}^A}$, indeed ${\mathsf{CSPACE}^A(s(n)) = \mathsf{DSPACE}^A(2^{O(s(n))})}$ for every ${s(n) \geq \log n}$.

• There is an oracle ${B}$ such that ${\mathsf{NL}^B \not\subseteq \mathsf{CL}^B}$.

The last statement seems jarring since above we proved ${\mathsf{NL \subseteq CL}}$, but it plays on known issues with defining oracles for space-bounded machines. That’s the very last result in their paper, but…

## More Applications?

Note that what they show is that memory already in use can serve as temporary storage during a computation. This is why we see their result as related to security and malware.

Imagine the following scenario. You are a malware program. You do not want to ask for additional storage, since this could alert the system that you are running. Instead you plan to use memory that is currently being used for storage of objects like movies or whatever. The key is that you use this memory to expand your power of computation, but when done you leave the memory in the exact state that it was in when you started.

Clearly, this is great for a malware program that needs lots of memory during its execution. It can avoid making any system call to allocate additional memory, yet can use tons of memory while it is executing. This stealthy type of behavior seems quite cool, quite powerful, and surprising, and if practical it could be used in real attacks on systems.

## Open Problems

As they say, how does ${\mathsf{P}}$ relate to ${\mathsf{CL}}$? As we say, if every polynomial-time program could be “catalyzed” by a program with a small footprint of its own, what greater dangers would we face?

## Comments

1. June 2, 2014 6:35 am

Nothing diabolical? I can think of one demon who might be interested – Maxwell’s. A bank of memory full of zeroes is a useful resource for such a demon. This seems to suggest that memory full of random stuff (equivalent to a system at thermodynamic equilibrium) is just as good. Does this mean that we can avoid the heat death of the universe? Where’s the catch?

• June 2, 2014 11:03 am

That’s a neat idea, and spurs me to read Roger Penrose’s Cycles of Time which I hear talks about creatio ex equilibrio.

June 2, 2014 7:23 am

In future versions of Linux, the home and the swap partitions will be one and the same…