Rabin Flips a Coin
The start of the modern era of random algorithms
Michael Rabin is one of the greatest theoreticians in the world. He is a Turing Award winner (with Dana Scott in 1976), a winner of the Paris Kanellakis Award, a winner of the EMET Prize, and of many other honors. What is so impressive about Rabin’s work is the unique combination of depth and breadth; the unique combination of solving hard open problems and creating entire new fields. I can think of few who have done anything close to this.
For example, one of his results is the famous proof, in 1969, that the monadic second-order theory of two successors is decidable. I will not explain what this is here, but trust me that this is one of the deepest decidability results in logic. What is even more remarkable about this theorem is the myriad of corollaries that flow from it. He also helped create the field of modern cryptography, and has made contributions to almost all parts of computer science, from distributed computing to secure operating systems. However, one of my personal favorites among his results is the randomized linear time pattern matching algorithm that he created in 1987 with Dick Karp. This work is so elegant and simple: it is based on just doing the “obvious” brute force algorithm with a dollop of randomness to make it run in linear time. What a wonderful result. I will discuss it in detail in a future post.
I first met Rabin when I was a graduate student at Carnegie Mellon–around 1970. One day Rabin visited CMU and my advisor at the time–Don Loveland–set up a meeting between the three of us. Michael kindly asked me what I was working on. I was, at that time, working on a problem that concerned the power of a certain type of tree automata. Rabin listened to my problem and asked if it would be okay if he thought about the question. I replied of course–I wanted to know the answer more than anything. A few days later I was entering a classroom where the theory qualifier examination was being given. Loveland was proctoring the exam and casually told me that Rabin had just called him with a new idea on how to solve my problem. Don said it was unclear if it would work, but he would tell me the idea later on, after the exam. I still remember sitting down at my desk and consciously thinking: I could do the exam and likely pass, or I could try to figure out what Rabin’s new idea was and surely fail. But I could not do both. I decided to work on the exam–not an easy choice. As it turned out, as I found out the next day, Rabin’s idea was very clever, but was not enough to crack the problem. It is still open today.
Since then I have had the pleasure to work with Michael and to count him as a close colleague and friend. Today, I will talk about his work that essentially brought “randomness” to the design of algorithms.
The Nearest Neighbor Problem
Rabin may not have been the first to use randomness in algorithms, but he was the last to introduce the notion. Often in mathematics the key is not who discovers something first, but who discovers it last. Rabin discovered the power of randomness last. He used randomness to solve a geometric problem about points in the plane. His result was that with randomness the problem could be solved in linear time. He gave a series of talks based on his ideas that quickly spread the word throughout the theory community that randomness is powerful. In a short time, there were many other papers using randomness. Today, of course, randomness is a key tool in the design of algorithms. We teach whole courses on it (I am doing that right now at Tech), and there are whole conferences on just this critical topic. It is such a part of theory that it may be hard to believe that there was a time when algorithms were always deterministic.
The problem was the nearest neighbor problem for the plane. Let points $p_1, \dots, p_n$ be in the unit square. The problem is to find the two points $p_i$ and $p_j$ that are the closest in Euclidean distance. It is convenient to assume that there is a unique closest pair. If there are several pairs with the same minimum distance, then Rabin’s algorithm still works. However, this assumption makes the discussion easier to follow.
Here is the outline of Rabin’s algorithm. Clearly, the problem can be solved in $O(n^2)$ time by a brute force method: just compute the distance between each pair of points. Rabin had the clever idea of using randomness. So he randomly selects $\sqrt{n}$ of the points: call them $S$. Then he computes the minimum distance between the points in $S$. This he does by a brute force algorithm: clearly this takes $O(n)$ time. Call the minimum distance among the points in the sample set $\delta$.
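The sampling step can be sketched in Python. This is only a hypothetical illustration, not Rabin’s own formulation: it assumes a sample of size about $\sqrt{n}$, represents points as (x, y) tuples in the unit square, and the name `sample_min_distance` is mine.

```python
import math
import random

def sample_min_distance(points):
    """Pick ~sqrt(n) of the points at random and return the minimum
    pairwise distance delta within the sample, by brute force."""
    n = len(points)
    m = max(2, math.isqrt(n))            # sample size ~ sqrt(n)
    sample = random.sample(points, m)
    delta = float("inf")
    for i in range(m):
        for j in range(i + 1, m):
            dx = sample[i][0] - sample[j][0]
            dy = sample[i][1] - sample[j][1]
            delta = min(delta, math.hypot(dx, dy))
    return delta
```

Since the sample has about $\sqrt{n}$ points, the double loop makes about $n/2$ distance comparisons, so this step runs in linear time.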
He then proceeds to imagine that the unit square is divided up into a “checkerboard” where each small square is of size $\delta$ by $\delta$. Clearly, there may be a huge number of these squares. If, for example, $\delta$ is very small, then there could be many more squares than points. In any event, no algorithm that hopes to run in linear time can afford to “look” at all the squares. So we must figure out a way to avoid that. Let’s call a square filled if it contains one or more of the original $n$ points. Note, there are at most $n$ filled squares. Rabin then does the following. First, for every filled square he computes the closest pair; this is again done by brute force. Second, for every pair of adjacent filled squares he computes the closest pair between their points, where squares are adjacent if they share even one vertex. He then claims that he has found the closest pair.
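Here is a hypothetical sketch of the checkerboard step, using a Python dictionary as the hash table over the filled squares only (the function name and data representation are mine, not Rabin’s; `delta` is the sampled minimum distance and is assumed to be positive):

```python
import math
from collections import defaultdict

def closest_pair_with_grid(points, delta):
    """Bucket the points into a delta-by-delta grid, storing only the
    filled squares, then compare points within each filled square and
    between adjacent squares (those sharing at least one vertex)."""
    grid = defaultdict(list)
    for p in points:
        grid[(int(p[0] // delta), int(p[1] // delta))].append(p)

    best, pair = float("inf"), None
    for (cx, cy), bucket in grid.items():
        for dx in (-1, 0, 1):            # this square plus its 8 neighbors
            for dy in (-1, 0, 1):
                for q in grid.get((cx + dx, cy + dy), []):
                    for p in bucket:
                        if p is q:
                            continue
                        d = math.hypot(p[0] - q[0], p[1] - q[1])
                        if d < best:
                            best, pair = d, (p, q)
    return pair, best
```

Each candidate pair is examined twice here, which only affects the constant factor; correctness rests on the claim that the closest pair must share a square or lie in adjacent squares.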
This is really quite a simple argument. No matter what $\delta$ is, the closest pair must either lie in a square together or lie in adjacent squares. If the two points were separated by a full square, then their distance would be at least $\delta$. But $\delta$, being a minimum over only a sample of the points, is at least the true minimum distance, so the closest pair cannot be separated in this way.
There are two last pieces of his algorithm. First, we must show that we can find the filled squares in linear time. As pointed out earlier, we cannot afford to look at all the squares. This is done by a hashing technique, and it was the subject of some discussion at the time. Clearly, one needs a model of computation that allows the filled squares to be found quickly. Since this is not a randomization idea, we will not comment further on it, other than to say it is an issue. Steve Fortune and John Hopcroft discuss this here.
What is much more interesting is the running time of the algorithm. Let $n_i$ be the number of points in the $i$-th filled square. It is easy to show that the cost of comparing the distances within the filled squares and between adjacent filled squares is bounded above by $O\left(\sum_i n_i^2\right)$.
Note, the value of this sum must be linear for Rabin’s algorithm to run in linear time. The value depends on the quality of the sampled $\delta$: if $\delta$ is too big, then the sum can easily be super-linear. In a tour-de-force Rabin is able to show that, because the sample had $\sqrt{n}$ points, the expected size of $\sum_i n_i^2$ is linear in $n$.
After Rabin visited me at Yale and gave his talk on his nearest neighbor algorithm, I tried to reproduce the proof that $\sum_i n_i^2$ is linear. The proof is difficult and I could not reconstruct it. The trouble is that the sample is of size $\sqrt{n}$ and thus yields about $n$ distances. If these were independent, they would give a good estimate of the real minimum distance. However, they clearly are not independent. If $x$, $y$, and $z$ are points in $S$, and $x$ and $y$ are close while $y$ and $z$ are far apart, then it seems likely that $x$ and $z$ would also be far apart. This is what makes the analysis so hard.
I soon realized that there was a way to fix Rabin’s algorithm so the analysis would be simple. The key is not to sample points, but rather to sample distances. More precisely, select $n$ pairs of points at random and compute their distances. Now let $\delta$ be the minimum among these $n$ distances. It is now quite simple to prove that the expected value of $\sum_i n_i^2$ is linear in $n$. The reason it is so much easier is that the $n$ distances are all mutually independent by definition.
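The pair-sampling fix can be sketched as follows (a hypothetical illustration; the function name is mine):

```python
import math
import random

def pair_sample_min_distance(points):
    """Sample n pairs of points, rather than sqrt(n) points, and return
    the minimum of the n resulting distances. These n distances are
    mutually independent, which is what simplifies the analysis."""
    n = len(points)
    delta = float("inf")
    for _ in range(n):
        p, q = random.sample(points, 2)  # one random pair of distinct points
        delta = min(delta, math.hypot(p[0] - q[0], p[1] - q[1]))
    return delta
```

This still costs only $O(n)$ time, and the resulting $\delta$ can be fed to the same checkerboard step as before.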
There is one lesson here and one open problem. The lesson is to always strive to make everything as independent as possible when designing a random algorithm. The reason, I am sure, that this did not occur to Michael is that he is so strong that he could prove that sampling points worked. For the rest of us, sampling pairs of points is a great simplification.
The open problem is a general question about the nearest neighbor problem. I will talk more about this in future posts, but the problem of finding the exact nearest neighbor in an arbitrary-dimension Euclidean space is still open. It is one of my favorite open problems.