How to use crypto methods to create efficient codes for space-limited channels

Adam Smith is a complexity theorist who works in the area of coding theory, and especially the interface between information theory and cryptography.

Today I plan on discussing a result he has with Venkatesan Guruswami. They both are masters of using cryptography and information theory, and they use these tools to prove a pretty theorem in information theory; this result will appear in the upcoming FOCS 2010. Guruswami gave a detailed talk on it last weekend at the Eastern Great Lakes Theory of Computation Workshop, which was held on the University at Buffalo campus. It was organized by Bobby Kleinberg, Atri Rudra, and Alan Selman—I understand it was a great success; I wish I could have been there.

The central idea is one that I find extremely appealing. In their paper they study, among other things, the consequences of limiting the power of the communication channel. This is not a new idea: information theory has long studied binary symmetric channels. These are channels that flip each transmitted bit independently with some probability ${p}$. The reason these channels are so popular is two-fold: (i) they are the simplest non-trivial channels and usually the easiest to understand, and (ii) they can sometimes be used to handle even more complex situations.
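As a concrete sketch (my own illustration, not from the paper), a binary symmetric channel with crossover probability ${p}$ can be simulated in a few lines of Python:

```python
import random

def bsc(bits, p, rng=None):
    """Pass a list of 0/1 bits through a binary symmetric channel:
    each bit is flipped independently with probability p."""
    rng = rng or random.Random()
    return [b ^ (1 if rng.random() < p else 0) for b in bits]

# Alice sends a message; the channel corrupts roughly a p-fraction of it.
rng = random.Random(0)
sent = [rng.randint(0, 1) for _ in range(10_000)]
received = bsc(sent, p=0.1, rng=rng)
flips = sum(s != r for s, r in zip(sent, received))
print(flips / len(sent))  # close to 0.1
```

The point of the independence assumption is visible here: the channel consults only its coin flips, never the message, which is exactly what an adversarial channel is allowed to violate.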

Here is a figure from Wikipedia that shows Alice sending bits to Bob over such a channel.

I have discussed the idea of limiting the power of the communication channel before—see this.

An aside: are you all getting tired of Alice and Bob? They seem to be everywhere. I recently was told to stop using them to explain things. What do you think? By the way, here is a great comment about them from John Gordon:

So you see Alice has a whole bunch of problems to face. Oh yes, and there is one more thing I forgot to say—Alice doesn’t trust Bob. We don’t know why she doesn’t trust him, but at some time in the past there has been an incident.

Now most people in Alice’s position would give up. Not Alice. She has courage which can only be described as awesome. Against all odds, over a noisy telephone line, tapped by the tax authorities and the secret police, Alice will happily attempt, with someone she doesn’t trust, whom she cannot hear clearly, and who is probably someone else, to fiddle her tax returns and to organize a coup d’etat, while at the same time minimizing the cost of the phone call.

A coding theorist is someone who doesn’t think Alice is crazy.

Computationally Limited Channels

A major idea in their paper is to restrict the communication channel between Alice and Bob—yes, I will stay with them—to a logspace-limited computation. They actually prove a number of other results, but this is the one that I will discuss. Restricting the space used by the channel to decide which bits to change is, in my opinion, pretty reasonable.

But I am biased. Almost twenty years ago I worked on a related problem and got weaker results. I did, however, introduce—I believe—the idea of limiting the computational power of the channel. My results were for the case where the channel was polynomial-time limited, which is of course more powerful than a logspace-limited one. At least we believe that is true: recall we still do not know whether logspace equals polynomial-time.

Shortly after my paper was out, Zvi Galil, Moti Yung, and Xiangdong Yu wrote a paper on limited channels, changing my assumption to logspace. I made a terrible mistake. They asked me to join them, and we eventually wrote a joint paper, which we never got published. It was titled: “Computational error-correcting codes achieve Shannon’s bound explicitly.”

My terrible mistake was joining them. Their paper without me had a line in the introduction about the “great idea of Lipton.” But once I was added to the paper that line was removed, as it should have been. Rule:

Never join a paper that gives you a great reference.

The New Results

Adam and Venkatesan consider, more generally, space-limited channels. They show that as long as the channel is restricted to logspace, they have an efficient coding method that achieves rates up to the channel capacity of ${1-H(p)}$ and recovers, with high probability, a short list that contains the correct message.
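Here ${H(p)}$ is the binary entropy function, so ${1-H(p)}$ is the familiar Shannon capacity of a binary symmetric channel with crossover probability ${p}$. A quick way to get a feel for the numbers:

```python
import math

def binary_entropy(p):
    """H(p) = -p*log2(p) - (1-p)*log2(1-p), with H(0) = H(1) = 0."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def capacity(p):
    """Best achievable rate over a channel flipping each bit w.p. p."""
    return 1 - binary_entropy(p)

print(round(capacity(0.1), 4))  # 0.531
```

So at a 10% flip rate, no code can reliably carry more than about 0.53 message bits per transmitted bit; their result says a logspace-bounded adversary cannot push this any lower.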

I will not attempt to give any details of how they prove their theorem; take a look at their paper. I will make a few high-level comments about their wonderful paper. It makes use of Noam Nisan’s famous pseudorandom generator. This is one of the great results in complexity theory: Noam shows that there is a pseudorandom number generator that “fools” logspace computations, and does this unconditionally—Noam’s proof uses no unproved assumptions, no need to assume that one-way functions exist.
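The shape of Nisan’s generator is simple to sketch, even if the analysis is not. Below is a toy Python sketch of the recursive structure only: the seed is a block ${x}$ plus hash functions ${h_1,\dots,h_k}$, and ${G_j(x) = G_{j-1}(x) \,\|\, G_{j-1}(h_j(x))}$. The hash family used here is a stand-in of my own choosing; the real construction needs a genuinely pairwise-independent family, and of course this sketch proves nothing about fooling logspace.

```python
import random

def nisan_prg(n, k, rng):
    """Toy sketch of the recursion in Nisan's space-bounded generator.
    Seed: an n-bit block x plus k hash functions h_1..h_k.
    G_0(x) = x;  G_j(x) = G_{j-1}(x) || G_{j-1}(h_j(x)).
    Output: n * 2^k bits from a seed of O(k*n) bits."""
    mask = (1 << n) - 1
    # Toy hash h(x) = (a*x + b) mod 2^n; a real instantiation uses a
    # pairwise-independent family (e.g., affine maps over GF(2^n)).
    hs = [(rng.getrandbits(n) | 1, rng.getrandbits(n)) for _ in range(k)]

    def g(j, x):
        if j == 0:
            return [x]
        a, b = hs[j - 1]
        return g(j - 1, x) + g(j - 1, (a * x + b) & mask)

    blocks = g(k, rng.getrandbits(n))
    return "".join(format(blk, f"0{n}b") for blk in blocks)

out = nisan_prg(n=8, k=4, rng=random.Random(3))
print(len(out))  # 128 = 8 * 2**4 output bits from an O(4*8)-bit seed
```

The exponential stretch from a short seed is the whole game: a logspace channel cannot tell these bits from truly random ones, which is what lets the construction derandomize against it.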

Their paper also extends the previous work in several ways. The major one, in my opinion, is that they get rid of the assumption of a shared secret that was used in the previous work. In my paper and in the paper of Galil, Yung, and Yu, the assumption was that Alice and Bob had a shared secret. Clearly removing this is extremely important; they call such “setup” assumptions. The proof that they can do this is quite subtle. The “obvious” idea is to send the shared secret over the channel, but this must be done with great care: recall they are making no assumptions about the existence of public-key systems. Take a look at how they make all this work—it is quite impressive.

Open Problems

Suppose that you have a result that uses an error-correcting code somewhere. To make your result work against worst-case adversaries, you would likely need a classic code that can handle an all-powerful adversary. What would happen if you replaced the worst-case code by the code of Adam and Venkatesan? Of course your result might now fail, since there could be a situation that invalidates it. The open question is: can your result still be proved true if the adversary has limited computational power? I would love to see such a result.

Another question is a version of this for the SAT problem, my favorite problem. Suppose that a logspace machine ${M}$ must create instances of SAT. If we allow the machine ${M}$ to be randomized, it is unclear why the generated instances would be any easier than “general” ones. However, suppose that we require that the machine ${M}$ know the answer to the instances it generates:

1. If ${M}$ generates a satisfiable formula ${\phi}$, then ${M}$ must also generate a satisfying assignment.
2. If ${M}$ generates an unsatisfiable formula ${\phi}$, then ${M}$ must also generate a short proof of this.

What is the complexity of SAT for formulas generated in this way?
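Condition 1 at least is easy to satisfy by a “planted assignment” construction; the following is my own toy sketch (in plain Python, with no attempt to keep the generator in logspace): pick a random assignment first, then force every clause to contain a literal it satisfies.

```python
import random

def planted_3sat(n_vars, n_clauses, rng):
    """Generate a random 3-CNF together with a planted satisfying
    assignment, so the generator knows a witness by construction.
    A literal is a pair (var, sign); it is true iff assignment[var] == sign."""
    assignment = [rng.choice([False, True]) for _ in range(n_vars)]
    clauses = []
    for _ in range(n_clauses):
        vars_ = rng.sample(range(n_vars), 3)
        lits = [(v, rng.choice([False, True])) for v in vars_]
        # Fix one random literal to agree with the planted assignment,
        # guaranteeing the clause is satisfied.
        i = rng.randrange(3)
        v = lits[i][0]
        lits[i] = (v, assignment[v])
        clauses.append(lits)
    return clauses, assignment

def satisfies(clauses, assignment):
    return all(any(assignment[v] == sign for v, sign in c) for c in clauses)

phi, a = planted_3sat(50, 200, random.Random(1))
print(satisfies(phi, a))  # True
```

Note the catch: planted distributions like this one are known to leak statistical information about the hidden assignment, which is exactly why it is unclear whether such “answer-aware” generators can produce genuinely hard instances.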

1. October 14, 2010 12:00 pm

The Gordon quote is great. It brings Alice to life in a way dry texts and lectures never do. There might be an implicit funny skit about Alice similar to “Minesweeper: The Movie” (http://www.youtube.com/watch?v=LHY8NKj3RKs). Your anecdote about not joining papers that would otherwise give you a great reference is also very funny. Finally, your open problem from this post is (IMHO) one of the best, if not the best, open problems suggested so far.

Another open problem, which may speak more to my ignorance than to anything really interesting: what can restricted channels or computationally limited adversaries tell us about class separations? For example, is it possible to show that P != L by virtue of different things being possible against adversaries with access to L and P resources?

October 14, 2010 1:33 pm

Even though my comment does not add any new content, I totally agree, the open problem regarding SAT is really sweet.

• October 15, 2010 1:04 am

It doesn’t seem like the first component of your open question is as interesting as the second. M can always generate a satisfying assignment and then add extra variables and clauses until the formula becomes hard. This might be easiest to illustrate with subset sum: generate several random numbers of a certain order, pick a random subset, and set the target value to their sum.
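That subset-sum construction can be sketched in a few lines (my illustration of the commenter’s idea):

```python
import random

def planted_subset_sum(n, bits, rng):
    """Generate a subset-sum instance whose answer the generator knows:
    pick random numbers, pick a random subset, make its sum the target."""
    numbers = [rng.getrandbits(bits) for _ in range(n)]
    subset = [i for i in range(n) if rng.random() < 0.5]
    target = sum(numbers[i] for i in subset)
    return numbers, target, subset

numbers, target, subset = planted_subset_sum(30, 64, random.Random(7))
print(sum(numbers[i] for i in subset) == target)  # True
```

The generator holds a witness by construction, yet the published instance (numbers, target) looks like a random dense subset-sum instance to a solver.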

The second component seems more difficult.

2. October 15, 2010 10:26 am

On Scott Aaronson’s weblog Shtetl-Optimized there is an ongoing discussion of a MathOverflow question posed by Tim Gowers, namely “Are there any interesting examples of random NP-complete problems?”, that (AFAICT) is complementary to the present discussion.

I too thought the Guruswami-Smith preprint was terrific! For me, their beautiful theorems concretely illustrate our natural intuition, that problems which are infeasible to solve when they are posed by adversaries having access to omniscient oracles, can look *much* easier when they are posed by dumber adversaries.

Indeed, the word “explicit” appears 42 times in the Guruswami-Smith preprint … and the word “oracle” appears not at all. 🙂

October 15, 2010 10:29 am

Is there a polynomial-time algorithm known for NP-complete problems where the word sizes one has to operate on are exponentially large? Say you do quadratically many multiplications to solve an NP-complete problem, but on words of size n^n bits. If such an algorithm were found, would that imply P=NP? Or at least, would it be of any interest to the community?

October 15, 2010 2:20 pm

LifeIsChill

That is a nice question. There are algorithms that, given large words, can solve NP-complete problems in a polynomial number of steps. I cannot think of a pointer right away, but this is known. A good idea though.

October 15, 2010 10:31 pm

Hi Professor Lipton:
Thank you.
What about word sizes growing polynomially, say $n^k$ bits? Say one has a solution to computing $n!$ (factorial) in $\log^{c}(n)$ steps but with arithmetic operations on words of $n^k$ bits.

Would that be of any importance? Are such relations known? Are there any references?

October 15, 2010 11:02 pm

We need at least $n\log(n)$ bits to represent $n!$. However, are there easy solutions to computing $n!$ given a word size of $n^k$ bits?

4. October 15, 2010 11:23 am

Yet another wonderfully interesting aspect of the Guruswami-Smith preprint is that we can regard it equivalently as being about reliable communication channels, or as being about efficient simulation algorithms.

Here the point is that Bob can decode Alice’s message if and only if Bob can efficiently simulate Eve’s error injection process; in this sense decoding and simulation are (essentially) equivalent problems.

The equivalence between reliable communication and efficient simulation arises commonly and naturally in broad classes of practical engineering problems. For example, quantum spin microscopes commonly are regarded as communication channels, over which Alice (the transmitting spins in the sample) communicates information (namely, supramolecular 3d structure) to Bob (the receiving spins in the microscope), and it is common practice to associate concrete Shannon capacities to this channel.

Adopting the Guruswami-Smith point-of-view, we see that to approach quantum spin microscope channel capacities in practice, it is necessary to simulate the processes that inject errors into this communication channel. Then the Guruswami-Smith results tell us that such simulations can be computationally efficient—even against adversarial errors—if computational bounds are placed upon the error-injection process.

AFAICT, these considerations apply broadly to both classical and quantum dynamical processes. It is necessary only that Mother Nature not be too profligate in the computational resources that she brings to bear, in posing the problems that she presents to us.

That is why I particularly admired the Guruswami-Smith passage: “Codes that tolerate adversarial errors are attractive because they can model channels whose true behavior is unknown or varies over time, including, for example, burst errors and echo. In contrast, codes tailored to any one of these models tend to fail when the model changes. For example, concatenated codes, which can transmit efficiently and reliably at the Shannon capacity with random errors, fail miserably in the presence of burst errors that occur in long runs.”

For Guruswami and Smith to explicitly recognize that Mother Nature’s dynamics commonly are adversarial, and yet are not exponentially (or even undecidably) adversarial, and from this natural restriction to derive explicit algorithms for communication/simulation that are powerful, practical, and “pretty”, seems to me to be a terrific accomplishment, that is likely to prove very broadly applicable across the STEM enterprise.

So, thanks go to Gödel’s Lost Letter and P=NP for this terrifically interesting post!