An experimental approach to asymptotic analysis

Bob Sedgewick is an expert on the analysis of algorithms–one of the best in the world. This is not too surprising since he is a PhD student of Donald Knuth, who coined the term analysis of algorithms. Bob is also famous for his textbooks on algorithms and more recently his monograph on the analysis of algorithms with Philippe Flajolet. The last is a required book for anyone trying to understand the performance of an algorithm.

Today’s post is on the potential for abuse with the famous $O(f(n))$ notation. I have posted on this already twice. Twice. I think that it is a serious issue, and that Bob has a simple but powerful “solution”.

I have many stories about Bob. One of the strangest is the time that we were offered a bribe. For the record we did not take the bribe–although the statute of limitations is probably long run out on us. Also for the record the bribe offer was very smooth, so we could be wrong–no it was a bribe offer.

At the time Bob was the chair of the department of computer science at Princeton. For a variety of reasons there was a mathematician, we will call X, that was not going to be kept on at the Princeton mathematics department. This department is perhaps the best in the world, so that should not come as a shock.

The complication is that X was sort of a computer scientist. For reasons that I did not understand then, nor do I understand now, there were people in the political world that really wanted X to stay at Princeton. Powerful people. So far fine. It’s nice to have friends in high places. The trouble began when Bob was “asked” to a meeting with them. I was brought along for support, and it turned out that was a good idea.

The meeting was strange. They spoke highly of X. We agreed and said that we had a mechanism for hiring faculty. For starters, X should send us a resume, which would start the hiring review process. They nodded, but said why could we not just hire X. Bob repeated again and again that there was a process–and on it went.

Finally, one of the senior political types made it very clear that we personally would be very happy if we just hired X. No, he did not open a briefcase of cash, but his meaning was clear. Bob and I repeated the mantra, have X send in the resume. The meeting finally ended.

As soon as Bob and I were outside, we looked at each other and Bob said, “that was a bribe offer, no?” I agreed. A very strange affair all around. By the way X went elsewhere and did just fine.

How to Not to Abuse O-Notatation

Bob has a simple idea that I find quite compelling–I will present it in his own words. Think of this as a guest post. His initial “this” refers to an algorithm that we were discussing via email.

This example brings to mind the idea of rampant misuse of O-notation and an easy way to fix it that I’ve been teaching and using lately. Suppose that you think that the running time of algorithm is ${O(N \log N)}$ and another is ${O(N)}$. With O-notation, you have nothing to test with an experiment. But in almost all cases where you are going to run an experiment, it is reasonable to use ${\approx}$ notation to express a hypothesis. In this case, you would hypothesize that the running time of the first is ${\approx a N \log N}$ and the second is ${\approx b N}$ for some constants ${a}$ and ${b}$. So the ratio of the running times ${R(N)}$ should be ${\approx (a/b) \log N}$. Then if you run the same experiment for ${10N}$ and take the ratio ${R(10N)/R(N)}$, all the constants cancel and this ratio of ratios approaches ${1}$. Under the hypothesis, running the experiment for larger ${N}$ should give a better approximation to ${R(N)}$. Or, you can just run an experiment and solve for the constants to predict running time.

In short, experimental verification of hypotheses is easy with ${\approx}$ notation and impossible with O-notation. Despite that fact, O-notation is widely misused to predict and compare algorithm performance.

Bob then added, in his next email:

But actually log factors are not so important with this approach. For an unknown algorithm, we hypothesize ${T(N) \approx aN^b}$, find ${b}$ by computing ${T(2N)/T(N)}$ for as large an ${N}$ (and as many experiments) as we can afford, then use the actual running time to solve for ${a}$. This gives very accurate predictions. Essentially the log factor is absorbed into the constant, which
is not so bad for large ${N}$.

Open Problems

Try Bob’s idea out. Does this simple idea work on interesting problems?

July 24, 2009 10:53 pm

What is the title of Sedgewick and Flajolet’s monograph? My Google-ing skill seems to be off today.

July 24, 2009 11:36 pm

Sedgewick and Flajolet’s book is “An Introduction to the Analysis of Algorithms”. You can find it here as well as a link to it at Sedgewick’s homepage.

2. July 26, 2009 2:59 am

My favourite book to learn programming what the one by knut.
I was thinking that O is often missused because in many algorithms you do not know when the iterations are going to end ,because they can depend stronly on the data they are processing, so you can not infer the order the algoerithm might have.
Maybe I am wrong, maybe you can just put always the higher possible one.
Can anybody explain in more detail?

July 27, 2009 8:53 am

Some colleagues of mine, Simon Goldsmith, Alex Aiken, and Dan Wilkerson, wrote a paper that tries to fit an implementation’s performance to a power-law, i.e. solving for a and b in T(N) ~= aN^b , using the implementations run time on different workloads. They are more fine grained than just looking at the final running time, in that they actually look at how often each basic block executes. This turns out to work surprisingly well and in some cases discovered performance bugs in implementations:

“Measuring empirical computational complexity”
http://portal.acm.org/citation.cfm?id=1287681

Their tool is available here:
http://trend-prof.tigris.org/

4. July 28, 2009 10:28 pm

Nice stuff! That sounds similar to the automated performance testing/analysis system I put together recently: http://ndickson.wordpress.com/2009/07/09/inventor-performance-viewer/

Good to know that some other people are working on this front too.

$\Theta(f(n))$ means that there’s some unknown constant on f(n) plus lower order terms. In the notation above, there’s a known/measured constant (if it’s a runtime, then it has units, such as milliseconds). For example, saying that something is $\Theta(n\log n)$ doesn’t say whether it’s “500 days * nlogn” or whether it’s “2ns * nlogn”, whereas the notation above does; it only removes lower order terms instead of also removing the constant.