Liskov Wins The Turing Award
Simulating queues and stacks and amortized complexity
Barbara Liskov just won the Turing Award. Congratulations to Barbara, an excellent choice for this great award. Many of you who work in the theory of computing may not know Barbara or her work. While I have never worked in her area, I have always had a wide set of interests, and am well aware of her important contributions to programming. Today I plan to talk about a connection between her work and theory.
An observation is that if you want to win a Turing Award working on programming languages is not a bad idea. The first Turing award went to Alan Perlis for Algol, John Backus won for Fortran, Niklaus Wirth for Pascal, John McCarthy for LISP (partially), Robin Milner for LCF, Alan Kay for Smalltalk, and several other awards where given for similar types of research. You might wonder if it’s too late for you to switch areas It’s way too late for me.
Stacks and Queues
There is an interesting connection between her work on abstraction and a basic result of theory. I thought I would share this with you today. One of her contributions is the development of the concept of abstraction. There are many deep issues that surround this concept, but today I will focus on a simple example involving stacks and queues.
We need to be precise so let me define what I mean by a stack and by a queue. A stack is a data structure that operates as follows:
- You can test whether or not the stack is empty;
- You can always put an element on top of the stack;
- If the stack is not empty, then you can remove an element from the top of the stack.
Note, in theory we sometimes prefer to call this a pushdown. One of Liskov’s contributions is the creation of a methodology to make what I stated in “English” formal. A visual model for a stack is the device that you find at buffets that hold clean plates: waiters can push new clean plates onto the top, customers can only remove plates from the top.
We also need the notion of queue another data structure that operates as follows:
- You can test whether or not the queue is empty;
- You can always put an element at the end of the queue;
- If a queue is not empty, then you can remove an element from the front of the queue.
A visual model for a queue is the TSA security check line at the airport. People can only join at the end; and they can only leave from the front of the line.
A Simple Simulation
It is not hard to prove the following: two stacks can simulate one queue. Let the two stacks be denoted by and : think “left” and “right”. At the start both and are empty. I need to explain how to do the three operations of a queue:
- The queue is empty if and only if is empty;
- To add a new element push it onto the top of the stack ;
- To remove an element do the following: move all the elements from the stack to the stack . Then, remove the top element from . Finally, move all the elements of back to the stack .
Notice that the the stack always contains all the elements of the queue except during the simulation of the “remove” operation. The nice feature of this simulation is that the first two operations are fast: they each can be done in constant time. The problem with this simulation is the third operation. Each time it is performed all of the elements have to be moved from . This is potentially a very slow operation, if there are many elements in the queue.
You could try to fix this by keeping all the elements in the stack . Now adding elements to the queue would be expensive, but removing them would be cheap. The problem is: how can we make it so that all the operations take constant time? The answer unfortunately is that this is impossible. However, we can do something that is almost as good.
A Better Simulation
We will show that there is a way to use two stacks to simulate one queue so that each element is only moved a constant number of times. Therefore, the average cost over time for operations will be linear in . In theory terms what we are doing is using not worst case complexity but amortized complexity. This concept was created by Danny Sleator and Bob Tarjan and will be the subject of a future post. Again let the two stacks be and . Initially they are empty. I need to explain how to do the three operations of a queue:
- The queue is empty if and only if and are both empty;
- To add a new element push it onto the top of the stack ;
- To remove an element do the following: There are two cases. In the first case is non-empty. In this case remove the top element from . In the second case is empty. In this case move all of the stack onto the stack Then, remove the top element from
There are two claims. First, that this only moves each element a constant number of times. That should be clear, since elements only move from to –they never go back. Second, that this correctly implements a queue. The insight is that suppose holds and holds . Then, the arrival of the elements was
The reason for this is that as we move elements from to the order was “flipped”. Thus, this is an implement of a queue by two stacks that takes constant amortized complexity: pretty neat.
What is the open problem? First, let me tell you a story about the Princeton Computer Science Department. Once while I was at Princeton I was in charge of the graduate program. Like many places we had a written multi-day exam: one day was on theory, one day on systems and so on. Then, after the exams were all graded we got together as a faculty to decide which students passed. This year there was an interesting debate. The theory faculty were upset because many of their students failed a theory question. The question was to give a good simulation of a queue by two stacks, I had written that question. The debate was pretty heated. Several of the theory faculty thought the question was a “system” question, while others thought the question was a “theory” question. Eventually all cooled down and we made reasonable decisions about the students.
However, one of my theory colleagues was unconvinced. He still claimed that this was not a theory question. I took him to my office and pulled out a paper on Turing Machine models and showed him the exact lemma that proved the above amortized simulation. He finally, gave in.
So what is the open problem? I think the real issue is one of how we teach and educate our students. I think that problems are problems, but in real life they do not come with labels. I would argue that anyone who claims to understand stacks and queues should either know or be able to figure out an efficient simulation. I actually suggested to my faculty that we drop the labels on the exams. That we mix up the questions and give on each day. Let the students figure out if the question is a theory or a system question. About two thirds of the faculty thought it was a great idea, and the other two thirds that it was a crazy idea. We never did change the exam.
My question today is how can we better educate students to solve problems, to apply knowledge across boundaries, and in general be better problem solvers? What do you think?