The start of the modern era of random algorithms

Michael Rabin is one of the greatest theoreticians in the world. He is a Turing Award winner (with Dana Scott in 1976), a winner of the Paris Kanellakis Award, a winner of the EMET Prize, and the recipient of many other honors. What is so impressive about Rabin’s work is the unique combination of depth and breadth: the unique combination of solving hard open problems and creating entire new fields. I can think of few who have done anything close to this.

For example, one of his results is the famous proof, in 1969, that the monadic second-order theory of two successors–${S2S}$–is decidable. I will not explain what this is here, but trust me that this is one of the deepest decidability results in logic. What is even more remarkable about this theorem is the myriad corollaries that flow from it. He also helped create the field of modern cryptography, and has made contributions to almost all parts of computer science, from distributed computing to secure operating systems. However, one of my personal favorites of his results is the randomized linear time pattern matching algorithm that he created in 1987 with Dick Karp. This work is so elegant and simple: it is based on just doing the “obvious” brute force algorithm with a dollop of randomness to make it run in linear time. What a wonderful result. I will discuss it in detail in a future post.

I first met Rabin when I was a graduate student at Carnegie Mellon–around 1970. One day Rabin visited CMU and my advisor at the time–Don Loveland–set up a meeting between the three of us. Michael kindly asked me what I was working on at the time. I was, at that time, working on a problem that concerned the power of a certain type of tree automata. Rabin listened to my problem and asked if it would be okay if he thought about the question. I replied of course–I wanted to know the answer more than anything. A few days later I was entering a classroom where the theory qualifier examination was being given. Loveland was proctoring the exam and casually told me that Rabin had just called him with a new idea on how to solve my problem. Don said it was unclear if it would work, but he would tell me the idea later on, after the exam. I still remember sitting down at my desk and consciously thinking: I could do the exam and likely pass, or I could try to figure out what Rabin’s new idea was and surely fail. But I could not do both. I decided to work on the exam–not an easy choice. As it turned out, as I found out the next day, Rabin’s idea was very clever, but was not enough to crack the problem. It is still open today.

Since then I have had the pleasure to work with Michael and to count him as a close colleague and friend. Today, I will talk about his work that essentially brought “randomness” to the design of algorithms.

The Nearest Neighbor Problem

Rabin may not have been the first to use randomness in algorithms, but he was the last to introduce the notion. Often in mathematics the key is not who discovers something first, but who discovers it last. Rabin discovered the power of randomness last. He used randomness to solve a geometric problem about points in the plane. His result was that with randomness the problem could be solved in linear time. He gave a series of talks based on his ideas that quickly spread the word throughout the theory community that randomness is powerful. In a short time, there were many other papers using randomness. Today of course randomness is a key tool in the design of algorithms. We teach whole courses on it (I am doing that right now at Tech), and there are whole conferences devoted to this critical topic. It is such a part of theory that it may be hard to believe that there was a time when algorithms were always deterministic.

The problem was the nearest neighbor problem for the plane. Let ${n}$ points

$\displaystyle x_{1},x_{2},\dots,x_{n}$

be in the unit square. The problem is to find the two points ${x_{i}}$ and ${x_{j}}$ that are the closest in Euclidean distance. It is convenient to assume that there is a unique closest pair. If there are several with the same minimum distance, then Rabin’s algorithm still works. However, this assumption makes the discussion easier to follow.

Rabin’s Algorithm

Here is the outline of Rabin’s algorithm. Clearly, the problem can be solved in time ${O(n^{2})}$ by a brute force method. Just compute the distance between each pair of points. Rabin had the clever idea of using randomness. So he randomly selects ${\sqrt n}$ of the points: call them ${S}$. Then he computes the minimum distance between the points in ${S}$. This he does by a brute force algorithm: clearly this takes

$\displaystyle O( (\sqrt n)^{2}) = O(n)$

time. Call ${d}$ the minimum distance among the points in the sample set ${S}$.

He then proceeds to imagine that the unit square is divided up into a “checkerboard” where each small square is ${d}$ by ${d}$. Clearly, there may be a huge number of these squares. If, for example, ${d}$ is very small, then there could be many more squares than points. In any event, no algorithm that hopes to run in linear time can afford to “look” at all the squares. So we must figure out a way to avoid that. Let’s call a square filled if it contains one or more points from the ${n}$ original points. Note, there are at most ${n}$ filled squares. Rabin then does the following: First, for every filled square he computes the closest pair. This is again done by brute force. Second, for every pair of adjacent filled squares he computes the closest pair between their points: call squares adjacent if they share even one vertex. He then claims that he has found the closest pair.

This is really quite a simple argument. Since ${d}$ is itself the distance between two of the ${n}$ points, the true minimum distance is at most ${d}$. So no matter what ${d}$ is, the closest pair must either lie in a square together or in adjacent squares. If they were separated by a square, then their distance would be at least ${d}$.
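The whole scheme can be sketched in a few lines of Python. This is only a minimal sketch, assuming the points are given as ${(x,y)}$ tuples with distinct coordinates in the unit square; the helper name `brute_force_closest` is my own, and real implementations need the hashing model discussed below to find the filled squares:

```python
import math
import random
from itertools import combinations

def brute_force_closest(points):
    """O(m^2) scan over all pairs; returns the minimum distance."""
    return min(math.dist(p, q) for p, q in combinations(points, 2))

def rabin_closest_pair(points):
    """Sketch of Rabin's algorithm: sample about sqrt(n) points, let d be
    their closest-pair distance, hash every point into a d-by-d grid cell,
    and compare points only within a cell and between adjacent cells.
    Assumes the points are distinct, so d > 0."""
    n = len(points)
    sample = random.sample(points, max(2, math.isqrt(n)))
    d = brute_force_closest(sample)  # brute force on sqrt(n) points: O(n)

    # Hash each point to its grid cell; only filled cells are ever stored.
    cells = {}
    for p in points:
        key = (int(p[0] // d), int(p[1] // d))
        cells.setdefault(key, []).append(p)

    # Compare within each filled cell and against the 8 adjacent cells.
    best = d
    for (cx, cy), bucket in cells.items():
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                for q in cells.get((cx + dx, cy + dy), []):
                    for p in bucket:
                        if p is not q:
                            best = min(best, math.dist(p, q))
    return best
```

Note that `best` starts at ${d}$, so the algorithm is correct even when the sampled pair happens to be the true closest pair.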

There are two last pieces of his algorithm. First, we must show that we can find the squares that are filled in linear time. As pointed out earlier we cannot afford to look at all the squares. This is done by a hashing technique and was the subject of some discussion at the time. Clearly one needs a model of computation that allows the finding of the filled squares fast. Since this is not a randomization idea we will not comment further on it other than to say it is an issue. Steve Fortune and John Hopcroft discuss this here.

What is much more interesting is the running time of the algorithm. Let ${f_{k}}$ be the number of points in the ${k^{th}}$ filled square. It is easy to show that the cost of the comparing of the distances in the filled squares and in the adjacent squares is bounded above by

$\displaystyle C = \sum_{k} O(f_{k}^{2}).$

Note, the value of this sum must be linear for Rabin’s algorithm to run in linear time. The value ${C}$ depends on the quality of the sampled ${d}$. If ${d}$ is too big, then the sum can easily be super-linear. In a tour-de-force Rabin is able to show that because the sample had ${\sqrt n}$ points the expected size of ${C}$ is linear in ${n}$.

Open Problems

After Rabin visited me at Yale and gave his talk on his nearest neighbor algorithm I tried to reproduce the proof that ${C}$ is linear. The proof is difficult and I could not prove it. The trouble is that the sample ${S}$ is of size ${\sqrt n}$ and thus determines about ${n}$ pairwise distances. If these were independent they would give a good estimate of the real minimum distance. However, they clearly are not independent. If ${a,b,c}$ are points in ${S}$, then if ${a,b}$ are close and ${b,c}$ are far apart, it seems likely that ${a,c}$ would also be far apart. This is what makes the analysis so hard.

I soon realized that there was a way to fix Rabin’s algorithm so the analysis would be simple. The key is not to sample points, but rather to sample distances. More precisely, select ${n}$ pairs of points and compute the distances. Let ${d}$ now be the minimum among these distances. It is now quite simple to prove that ${C}$ is linear in ${n}$. The reason it is so much easier is that now the distances are all mutually independent by definition.

There is one lesson here and one open problem. The lesson is to always strive to make everything as independent as possible in designing a random algorithm. The reason, I am sure, that this did not occur to Michael is that he is so strong that he could prove that sampling points worked. For the rest of us, sampling pairs of points is a great simplification.

The open problem is a general question about the nearest neighbor problem. I will talk more about this in future posts, but the problem of finding the exact nearest neighbor in an arbitrary dimension Euclidean space is still open. One of my favorite open problems.

12 Comments
1. March 6, 2009 4:23 pm

In the Algorithm Design book by Kleinberg and Tardos, they describe a similar hash-points-into-squares algorithm for finding the closest pair. Do you know who came up with this version of the algorithm? It is so simple that even I can prove it is expected linear time.

It basically goes like this.

Randomize the input order.
Let C = distance(a_1, a_2)
Make a grid of size C, hash items into appropriate cells.
For i = 3 to n
Find i’s nearest candidate neighbor via cell probe search.
If a neighbor is found at distance < C, update C and rehash

Each neighbor candidate search is O(1), done with a few hashes.

The algorithm is expected to do O(log(n)) re-hashes, but each takes time proportional to i. The probability of needing to rehash is 2/i, the probability that this point is in the closest pair after examining i points. Thus, the expected cost is sum_{i=2}^n Prob[rehash at step i] * Cost(rehash at i) = sum_{i=2}^n (2/i) * i = sum_{i=2}^n 2 = O(n).

March 6, 2009 6:17 pm

Rabin was definitely the first to do something like this. However, their algorithm is a bit different, so I do not know who came up with it first…rjlipton

2. March 28, 2009 8:46 am

In 1990, Yossi Matias and I worked on this problem. Our paper appeared
in the 1991 Canadian Conf. on Comp Geometry. It has the basic idea
of picking a point at random, and then computing the distance to its
closest point and then using this distance for constructing a grid that
is used to filter points. There was a paper by Golin, Raman,
Schwarz and Smid in SODA 1993, that built upon this approach for
dynamic closest pairs and I think in another paper they simplified
our original algorithm as well.

The following link has the paper:
http://www.cs.umd.edu/~samir/grant/cp.pdf

3. DR. EDWARD SIEGEL/"PHYXSICAL-MATHEMATICIST"/"MATHICIST"/"FUZZYICS"="CATEGORYICS"("SON OF 'TRIZ'") permalink
July 26, 2010 6:17 pm

RABIN GAVE A VERY NICE TALK AT THE WIENERFEST AT MIT IN 1994 ABOUT HIS REDISCOVERY IN (SO-CALLED) “COMPUTATIONAL-COMPLEXITY” OF:
PLATO-ARISTOTLE “SQUARE-OF-OPPOSITION”,
AKA SIEGEL “FUZZYICS”=”CATEGORYICS” (“SON OF ‘TRIZ'”) DICHOTOMY
(REF:” SYMP. ON FRACTALS”, MATERIALS RESEARCH SOC. FALL MTG., BOSTON(1989)-5-CONTIGUOUS-PAPERS!!!,
AKA SIEGEL-BAEZ “CATEGORY-SEMANTICS”
BUT CO-EDITOR DAVID JERISON SAYS HE COULD NEVER GET A MANUSCRIPT FROM RABIN.
MAYBE SOME READER KNOWS IF AND WHERE AND WHEN HE EVER PUBLISHED IT AND CAN E-MAIL ME THIS INFORMATION?
IF NOT, CAN SOMEONE GET HIM TO PUBLISH IT SOMEWHERE READABLE AND QUOTABLE/REFERENCEABLE?

July 27, 2010 7:02 am

I would love to see this.

4. Li Chen permalink
October 12, 2012 1:32 pm

in the adjacent squares is bounded above by

\displaystyle C = \sum_{k} O(f_{k}^{2}).

For adjacent squares, one should add some cross terms f_k f_{k’}; but since each square has only a constant number of adjacent squares, the formula is OK.