Skip to content

You can test them, you can prove them, they can still break ’em

Rafail Ostrovsky is one of the most prolific theorists in the world. He has already written over two hundred papers on many diverse aspects of computational theory, which is very impressive for someone at any age—and he is young. They range from cryptography to data structures to distributed computing to graph algorithms. He is terrific.

Today I want to talk about a quite neat paper of his on the insecurity of “provably” secure protocols.

The issue is the use of hashing functions to build certain “secure” RAM implementations, and the paper is joint with Eyal Kushilevitz and Steve Lu. It just appeared at the SODA 2012 conference. We will try to stop sounding repetitive, but there were many cool papers, as usual, at SODA.

## Building on the Community

Before joining the UCLA faculty, Rafail was at Telcordia in New Jersey, and I would see him fairly often, since I was also still at Princeton. I recall many of our conversations went like this:

Me: So what are you up to?

Rafail: Oh I have to get the final version of three papers back to conferences X and Y and of course I need to finish the drafts of two new papers on ${\dots}$

He was, and still is, an unbelievably hard worker, with a great success rate at the best conferences. He does have a “secret sauce” that I can tell you—it’s how he is able to get so many papers accepted. A typical Rafail paper would be described by him like this:

Do you know the recent result X that appeared in the last FOCS conference? No? Well they proved that S is true, and we can now show that S’ is actually true. Even further we can extend what they did and show ${\dots}$

I am not suggesting that these many papers were incremental. No. Not at all. What they had in common was that they fit perfectly into the theory structure. A typical paper would often solve an open problem, or extend the result X from a special class of whatever to the general case, or improve the previous bounds. Or all the above. These results were almost always hard and clever, and the proofs often used new techniques. But by being close to existing work, they were easy for the program committees to evaluate, and their evaluations were almost always positive.

Recall Issac Newton’s famous remark:

If I have seen a little further it is by standing on the shoulders of Giants.

In Rafail’s case it is by standing on the shoulders of many theorists. By the way there is some controversy over whether Newton meant this as a positive or a negative comment. Some now believe it was a sarcastic remark directed against Robert Hooke, at the beginning of what became lifelong animosity. Hooke was not tiny, but he was slight in build. Ken agrees with the commentary here. The British two pound coin bears the inscription Standing On The Shoulders Of Giants on its edge; of course this is intended as a quotation of Newton.

## The Oblivious RAM Problem

For the truly worried and paranoid, even encrypted computations can leak some information if one is not careful. Suppose that a computation can be watched, and while all the operations are performed on encrypted data, the pattern of which parts of memory are accessed is revealed. It is possible in this case that important information is lost.

Enter the notion of an Oblivious RAM. This is a memory protocol that reveals nothing about the computation except the size of memory and the length of the computation. This is pretty slim pickings for an attacker. The concept was introduced by Oded Goldreich and Rafail separately in 1987 and 1990 in what became a joint journal paper in 1996. To quote the motivation from the SODA 2012 paper:

A secure Oblivious RAM simulation allows for a client, with small (e.g., constant size) protected memory, to hide not only the data but also the sequence of locations it accesses (both reads and writes) in the unprotected memory of size ${n}$.

Once they had made the notion precise, the race was on to get the most efficient simulation possible in software for an Oblivious RAM. Read their current paper and this followup for a full discussion of the literature comparing the rich space of solutions. Ostrovsky has an efficient simulation algorithm that is referred to as the hierarchical algorithm, because of the nature of the data structure used by the RAM. At a very high level we can view the hierarchical algorithm as a series of memory tables that grow at a geometric rate. The protocol moves information from buffers to buffers, as you might guess. This is a technique that is used in a variety of places in theory, but here there is an additional need to shield the locations. So there is a clever reshuffling of the information from one table to another that keeps the accesses secret.

## A Flaw in Previous Schemes

Security protocols are tricky. One of the main contributions in this paper is the discovery that some previously published papers handle the buffers in ways that can leak information when implementing the hierarchical algorithm. The key insight is how the overflows of certain hash tables are handled. In cuckoo hashing each data item ${a}$ has two possible locations ${h_1(a),h_2(a)}$ and is always found at one of them, or not at all; inserts attempt to resolve collisions by successively moving collided-with items to their alternate locations, and the table overflows if a previously-moved item is collided with again. I will quote their explanation of the flaw; see the paper for all the details:

After the end of a reshuffle, the server knows that the data stored at a level can be inserted into the hash-table at that level without an overflow. While this simple observation might not seem very useful, it could actually reveal a significant amount of information.

For a concrete example, when using cuckoo hashing, it is known that no three elements ${a,b,c}$ have the property that “${h_{1}(a) = h_{1}(b) = h_{1}(c)}$ and ${h_{2}(a) = h_{2}(b) = h_{2}(c)}$”, otherwise they could not have been inserted into the table. However, during subsequent probes ${x,y,z}$ to that hash-table, for elements ${x,y,z}$ which are actually not in the table, with some noticeable probability, we may have “${h_{1}(x) = h_{1}(y) = h_{1}(z)}$ and ${h_{2}(x) = h_{2}(y) = h_{2}(z)}$”; in such a case, the server immediately knows that ${x,y,z}$ cannot all live inside that table. This can be used to distinguish two sequences of memory accesses, one in which only elements living at this hash-table are read, and one which reads elements that do not live at this level. If the table is of constant size, this lead to a distinguisher with constant success probability!

Two side comments come to mind: first the use of the symbol “${!}$” for other than factorial is a sign that it comes from a crypto paper. No slight intended, but they are used much more in such papers than any other area of theory. I once saw a paper, by other authors, that had several dozen such symbols. Part of the GLL style sheet is that we never use “${!}$”—we try and get you excited by other means. The other comment is that like many results the flaw was also independently discovered—this time by Michael Goodrich and Michael Mitzenmacher and also by Bogdan Carbunar and Radu Sion.

## Error Modes

Security protocols are tricky for many reasons. One of the most vulnerable places for many protocols is in how they handle errors or exceptions. Years ago I showed, with Anita Jones, how a certain published protocol could be broken exactly because of this issue: the protocol broke precisely when an error occurred.

There are two ways around the handling of exceptions. One is to show that they happen very rarely. If this is the case then the protocol usually can be saved, because information is lost with such a low probability that it does not affect the overall security.

A better method is to arrange that the exceptions are handled properly. This can be difficult and is what Rafail’s paper does. In the first of two main theorems, they extend the scheme by Goodrich and Mitzenmacher to do so and improve a quadratic overhead (in ${\log n}$) to nearly linear. Again read his paper for the full details on what the existing flaws are, and how they can be avoided.

## Open Problems

How can we believe in provably secure protocols if there are mistakes of this kind? Can we build a set of tools that help us understand how a protocol behaves when an error occurs? I have discussed this before here, and will discuss it again, no doubt.

Advertisements
2 Comments leave one →
1. Frank Vega permalink
February 20, 2012 4:55 pm

I tried to prove the existence of one-way functions which have a closed relation with public key protocol. See:

http://the-point-of-view-of-frank.blogspot.com/

### Trackbacks

1. Magic To Do | Gödel's Lost Letter and P=NP