The Iceberg Effect in Theory Research
The iceberg effect in research: how theorems can be lost
Fred is my favorite name when I need a “random” name. You might have also noticed that the picture is not even a picture of a person—it’s an iceberg. I will explain why in a moment.
Usually I talk about real people, but today I thought I would use phony names to protect the innocent—or at least protect my friends.
I plan to talk about an issue that comes up in research:
Is the fact X a new result? Or is it a known result?
At FOCS, a great researcher, who I will call “Fred,” asked me whether a theorem he had just proved was new. He did this in the hall right outside one of the conference rooms, which did not allow us much time or any blackboard for drawing pictures. His result is a clever algorithm that yields a neat approximation to a well-studied problem. He asked me and others if it was “new.” My immediate response was that I knew a related theorem, but that Fred’s theorem seemed new to me.
Moreover, Fred’s theorem seemed to generalize the known theorem, which made me excited, so I began to write an entire post on his result. In the post I talked about his theorem, its proof, and how it differed from the old known result. In order to do the latter I had to Google until I found the old paper. This was not a completely trivial task, since the paper was so old. I try to be careful in how I cite papers and wanted to see the original. That is where I hit a snag: his “new” theorem was decades old. It was “known.” Or was it?
The Iceberg Effect
The confusion about whether or not Fred’s theorem was new could be traced to what I will call the iceberg effect. Often an author may write up a paper that becomes famous for some theorem T. But also in the paper there are other theorems, which are not nearly as exciting as the main theorem T. Over time everyone knows T, we teach T to our students, we use T in our papers, and T becomes part of the fabric of theory.
The problem is that T is like the visible part of an iceberg. We see the tip, T, above the water, but the theorems below the water line are soon forgotten. They may be quite clever or hard to prove, but people just know about T. This is where the iceberg effect hits. You prove a “new” theorem, which is not new—it is one of the invisible theorems below the water line.
This is exactly what happened to Fred. He lost a theorem, and I lost a post. Oh well.
There are many other effects that make it hard to know for sure that something is new. I think that this problem is less of an issue in “hot” areas, since the results are fairly recent.
For results that are older the iceberg effect and other effects can be a major problem. There are so many on-line papers, workshops, conferences, journals, and other sources of information that it is easy to overlook something. This is not unique to theory, but it is an issue that we need to work on.
The Theorem In Question
Dick Karp proved a theorem about TSP for random points in the unit square. His theorem is based on a famous result of Jillian Beardwood, John Halton, and John Hammersley, who proved that the length of the optimal TSP tour for $n$ uniform independent random points in the unit square is asymptotically $\beta\sqrt{n}$, almost surely, for some constant $\beta > 0$.
Karp proved his famous result in his 1977 paper:
Theorem: Let $n$ random points be in the unit square, and let the length of their optimal tour be $L$. Then, there is a polynomial-time algorithm that, given the points, finds a tour of length $L'$ so that $L' \le L + o(\sqrt{n})$, almost surely.
Since the optimal tour is $\Theta(\sqrt{n})$ almost surely, this result shows that the tour found is close to the optimal length: the error term is of lower order.
This is the T result. The “new” theorem was:
Theorem: Let $n$ arbitrary points be in the unit square, and let the length of their optimal tour be $L$. Then, there is a polynomial-time algorithm that, given the points, finds a tour of length $L'$ so that $L' \le L + o(\sqrt{n})$.
What Karp proved in his original paper is much stronger than his theorem on random points, which is the T result that we all remember. He actually proved the new result in one of his below-the-waterline theorems: he proved exactly the above theorem, stated in slightly different language.
For a more recent treatment of Karp’s result, check out these notes.
I will include a sketch of Karp’s theorem, with a weaker bound. Actually, this is how Fred sketched his “new” result, when he spoke to me in the hall. Check out Karp’s paper or the notes for the better bound. The key ideas are all here, however.
Suppose that $n$ points are given that all lie in the unit square. The points need not be random.
The algorithm has three steps.
Step 1: Divide the square into small square cells of side $1/(g(n)\sqrt{n})$, where $g(n)$ is a slow-growing function to be chosen later. Then, if a small cell has more than one point in it, connect all the points together at their respective centroid. Now each small cell can be thought of as having at most one point; call these points the leaders. The error introduced by this step is at most $O(n \cdot 1/(g(n)\sqrt{n})) = O(\sqrt{n}/g(n))$, which is $o(\sqrt{n})$.
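As a rough illustration, Step 1 might look like the following sketch. The cell side $1/(g(n)\sqrt{n})$ and the centroid snapping follow the description above; the function name and the way $g(n)$ is passed in are my own choices, not Karp’s.

```python
import math
from collections import defaultdict

def leaders(points, g):
    """Snap points to per-cell centroids (Step 1 sketch).

    points: list of (x, y) pairs in the unit square
    g: value of the slow-growing function g(n), e.g. log(n) ** 0.25
    Returns one leader (the centroid) per nonempty small cell.
    """
    n = len(points)
    side = 1.0 / (g * math.sqrt(n))  # side of a small cell
    cells = defaultdict(list)
    for (x, y) in points:
        # index of the small cell containing (x, y); clamp for x or y == 1.0
        i = min(int(x / side), int(1 / side))
        j = min(int(y / side), int(1 / side))
        cells[(i, j)].append((x, y))
    # replace each cell's points by their centroid: the leader
    return [
        (sum(x for x, _ in pts) / len(pts), sum(y for _, y in pts) / len(pts))
        for pts in cells.values()
    ]
```

In the analysis, a tour of the leaders is converted into a tour of the original points by detouring within each cell, which is what charges the $O(\sqrt{n}/g(n))$ error to this step.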
Step 2: Next divide the square into large square cells of side $\sqrt{\log n}/(g(n)\sqrt{n})$. Each large cell contains at most $\log n$ leaders, since a large cell contains at most $\log n$ small cells. Use a brute-force dynamic program to solve the TSP for each large cell exactly. This causes no error, but takes time at most $2^{O(k)}$ per cell, where $k$ is the number of leaders in the cell.
If $k = O(\log n)$, then this runs in polynomial time.
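The brute-force dynamic program of Step 2 can be taken to be the standard Held–Karp bitmask recurrence, which finds an optimal closed tour of $k$ points in $O(2^k k^2)$ time, polynomial in $n$ when $k = O(\log n)$. A self-contained sketch (illustrative; Karp’s paper does not depend on this particular subroutine):

```python
import math

def exact_tsp_length(pts):
    """Held-Karp DP: length of an optimal closed tour of pts.

    Runs in O(2^k k^2) time for k = len(pts), so it is only
    feasible for small k, e.g. the O(log n) leaders in one cell.
    """
    k = len(pts)
    if k < 2:
        return 0.0
    d = [[math.dist(p, q) for q in pts] for p in pts]
    INF = float("inf")
    # dp[mask][j] = shortest path that starts at point 0, visits
    # exactly the points in mask, and ends at point j
    dp = [[INF] * k for _ in range(1 << k)]
    dp[1][0] = 0.0
    for mask in range(1 << k):
        if not mask & 1:  # point 0 is always the start
            continue
        for j in range(k):
            if dp[mask][j] == INF:
                continue
            for m in range(k):
                if mask >> m & 1:
                    continue  # already visited
                nm = mask | (1 << m)
                cand = dp[mask][j] + d[j][m]
                if cand < dp[nm][m]:
                    dp[nm][m] = cand
    full = (1 << k) - 1
    # close the tour by returning to point 0
    return min(dp[full][j] + d[j][0] for j in range(1, k))
```

Running this once per large cell, over all cells, stays within polynomial total time because each cell holds at most $\log n$ leaders.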
Step 3: Finally, connect all the large cells together by a simple path: the path snakes across each row of large cells. The total length of this path is linear in the number of rows of large cells, which is $g(n)\sqrt{n}/\sqrt{\log n}$. This path can be all error, so the total error from the steps is at most: $O(\sqrt{n}/g(n)) + O(g(n)\sqrt{n}/\sqrt{\log n})$.
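The snake order of Step 3 is just a boustrophedon sweep over the grid of large cells: left to right across one row, then right to left across the next. A tiny sketch (the function name and the row/column indexing are my own):

```python
def snake_order(cells_per_side):
    """Boustrophedon order of the large-cell indices (Step 3 sketch).

    cells_per_side: number of large cells along one side of the square.
    Returns the (row, col) indices in the order the snake path visits them.
    """
    order = []
    for row in range(cells_per_side):
        cols = range(cells_per_side)
        if row % 2:
            cols = reversed(cols)  # alternate direction on odd rows
        order.extend((row, col) for col in cols)
    return order
```

Visiting the cells in this order, and splicing in each cell’s exact tour as it is reached, makes consecutive cells adjacent, which is why the connection cost is linear in the number of rows.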
Since $g(n)$ can be taken to be $\log^{1/4} n$, the total error is bounded by $O(\sqrt{n}/\log^{1/4} n)$, which is $o(\sqrt{n})$ and implies the theorem.
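The choice of the slow-growing function balances the two error terms: $\sqrt{n}/g(n)$ from snapping points to centroids, and $g(n)\sqrt{n}/\sqrt{\log n}$ from the snake path. These are equal exactly when $g(n) = \log^{1/4} n$. A quick numerical check (illustrative only; this just evaluates the two bounds):

```python
import math

def error_terms(n):
    """Evaluate the two error bounds with g(n) = log(n) ** 0.25.

    Returns (step1, step3) where
      step1 = sqrt(n) / g(n)                  (centroid-snapping error)
      step3 = g(n) * sqrt(n) / sqrt(log n)    (snake-path error)
    """
    g = math.log(n) ** 0.25
    step1 = math.sqrt(n) / g
    step3 = g * math.sqrt(n) / math.sqrt(math.log(n))
    return step1, step3
```

For $n = 10^6$ both terms come out near $519$, well below $\sqrt{n} = 1000$, matching the claim that the additive error is $o(\sqrt{n})$.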
I may be alone, but I often run into this question of what is known. The iceberg effect is one barrier that I have hit. But there are other issues. Can we do a better job of making sure that we do not re-invent what is known?
Or was it really known if top researchers did not recall Karp’s “other” result? Perhaps the outcome here is good—I had forgotten that Karp’s ideas worked just fine for any collection of points. So perhaps the whole affair has worked out just fine.