# Natural, Natural, and Natural

* A great recent talk on natural algorithms *

Bernard Chazelle is one of the world experts in computational geometry, lower bounds on data structures, upper bounds on data structures, and discrepancy theory—among many other aspects of theory. His book on the discrepancy is a classic. If you need to know about placing points so they are “well distributed,” you should read his book.

Today I plan to discuss a recent presentation he gave at Georgia Tech, actually at the ARC center, on Natural Algorithms (NA).

I have had the pleasure of knowing Bernard since he was a graduate student at Yale, where his advisor was David Dobkin. After I got to Princeton in 1980, I quickly recruited David, who then quickly recruited Bernard. I cannot imagine Princeton without either of them.

I have one story I would like to relate about Bernard. It concerns the PhD. examination process we used at Princeton years ago. One year I was in charge of the graduate examination process. We gave the students written exams, and then got the faculty together to discuss each student’s case. Our pass/fail decision was based on the exam and other information—for example, input from their advisor.

The written part of the exam was given over three days: one day was on systems, one on theory, and one on applications. There was a question on the systems exam that most of the theory students failed. This led to a pretty heated discussion about whether or not the question was fair. It was my question, so I thought it was okay, but many did not. Curiously, the systems faculty liked the question, but argued it was a theory question.

Bernard liked the question, thought it was fair, but still agreed it was a systems question—one theory students might not need to understand. Finally, cooler heads prevailed and we made decisions about exam grades all the faculty could support.

After the meeting Bernard was still convinced my question was not something a theory student should know. So I asked him to come to my office, where I pulled out an old paper on Turing Machines, and pointed to a lemma—there was my question. He finally agreed the question was definitely a theory one. I always thought the question was just a basic fact about data structures. Here is the question, you judge:

Using two stacks, show you can simulate a queue so the cost of operations of the queue is bounded by .

What do you think? Theory or systems?

** Natural **

The word “natural” is a wonderful word, full of positive feelings—I think good thoughts when I hear it. This is why the word is used so often: there are natural resources, there are natural remedies, there is natural selection, there is natural history, there is the movie “The Natural,” and on and on. Not surprisingly, “natural” is also popular in theory and in mathematics. Here are a few uses of “natural” in our area of theory.

**Natural Proof Systems** Gerhard Gentzen created natural deduction in 1935. He created a set of inference rules for predicate calculus: rules with a simple clarity, rules with a certain naturalness. For example,

is a simple natural deduction. You should read as meaning: if the formulas in are true, the formula follows. The brilliance of Gentzen was creating a logic consisting only of natural rules that is still complete: this uses his famous cut elimination method.

**Natural Mappings**

I have never used category theory in my research, and only know the very basics. If you are not familiar with category theory look here. With apologies to the experts, I view category theory as a way to study functions rather than objects as the primary concern. This shift in focus seems simple, but has remarkable consequences. For example, category theory is indispensable in explaining why some maps are “natural” and others are not. Category theory can do much more, but this is certainly one of the elementary but very powerful uses of it. Saunders Mac Lane once said:

“I didn’t invent categories to study functors; I invented them to study natural transformations.”

An important example comes from the theory of vector spaces. Consider, to be concrete, finite dimension vector spaces over the reals. Every such space has a dual space of the same dimension: the dual is nothing more than all the linear functionals defined on the vector space. A linear functional is just a linear map from to .

Since any two such vector spaces are isomorphic if they have the same dimension, the following two statements are true for any :

- The space is isomorphic to .
- The space is isomorphic to .

Yet, intuitively these statements are different: in (1) the map seems to depends on the basis used to define the spaces; in (2) the map is canonical or natural. The latter follows since there is a “canonical map” from to : Define by,

where is in the vector space and is in the dual space . This map clearly does not depend on any basis of the vector spaces—it is natural. Category theory gives a precise way to define natural maps, and unravels why these two isomorphisms are different.

**Natural Selection** John Holland in the early 1970’s created a notion of natural selection for the creation of algorithms. I will have to discuss this in more detail another time.

**Natural Proofs** In 1994 Alexander Razborov and Steven Rudich created the notion of natural proofs. See my earlier discussion for more details on this important notion.

**Natural Algorithms** This is an area started by Chazelle in the early 2000’s and was the topic of his talk. NA is an attempt to apply complexity theory to the study of natural phenomena like bird flocking.

** Form of Bernard’s Talk **

Bernard gave a beautiful talk on his recent work on NA. It was terrific on two levels: a great presentation and great results. The presentation was so special I thought I would discuss how he did it. It was unique.

His slides were very sparse. Some slides had a single phrase or a single picture or a single equation. Usually we see talks with slides full of bullets and multiple pieces of information. Bernard’s was the opposite—a minimalist approach to the use of powerpoint. It flowed easily from one idea to the next, and had a beautiful rhythm. I was very impressed with the form of his talk. If you can have him visit and give a talk, you will be very happy you did.

** Content of the Talk **

In Bernard’s talk were some quite striking results and some intriguing open problems. One of the most surprising was an “old” result he has proved before on the theory of how birds flock. This is an intriguing question, with no compelling answer. They do flock, but we do not understand why they do, or how.

There are formal mathematical models, however, that let us try to study flocking. One of the models Bernard studies is a variant of the model of Felipe Cucker and Steve Smale. All models try to capture the following ideas:

- Birds will try to avoid local crowding.
- Birds tend to fly towards the average direction of their neighbors.
- Birds tend to fly towards the average position of their neighbors.

What is surprising is Bernard proves an upper bound and a lower bound on how long it will take for the birds to achieve a steady state. The times are immense—both the upper and the lower bound.

Theorem:

- A group of birds reach steady state in at most time steps.
- There are initial configurations of birds requiring more than time steps before steady state.

The notation with the arrows is defined by: and for ,

The proofs and the results are beautiful, but they suggest something is missing in the model, since the bounds are immense. Birds do actually flock in reasonable amounts of time, so there must a more realistic model. His results still are amazing.

Birds flocking over Rome:

** Open Problems **

One open problem is to find a model of flocking, for instance, where the bounds do not need exponential notation. This should be possible. The current work is beautiful, but birds do flock together faster than Bernard’s lower bound. He and I spoke about this after his talk and he agrees it is one of the key open problems in the area.

Using two stacks, show you can simulate a queue so the cost of n operations of the queue is bounded by O(n).It may

alsobe a fair systems problem, but it is certainly a fair theory problem. It’s just a standard amortized-analysis problem; I’ve asked similar questions on undergrad algorithms exams.Now, whether this was a fair theory problem

in the mid-1980sis another question entirely.About your first example:

I think that Gentzen showed his cut-elimination theorem only for sequent calculus and not natural deduction. Some sort of cut-elimination for natural deduction was only proved later by Prawitz (1965).

I will check, thanks for pointing this out.

> Using two stacks, show you can simulate a queue so the cost of n operations of the queue is bounded by O(n).

I can’t imagine a universe where this question is hard for a graduate theory student to answer. The only way I can see how you might get stumped is if you simply give up too early on your candidate algorithms. Technically speaking it is “amortized analysis”, but I don’t think anything fancy is required.

It’s a theory question that I’d like to think a systems grad student could answer. It is framed in terms concepts (stacks and queues) that will be familiar to systems people, and uses only very basic theory.

It’s not a systems question because if you’re a systems dude and you want a stack, you, like, just use a stack, you know.

Also, if you’re thinking in systems terms, amortized is not good enough unless it’s well behaved. Unless you’re thinking of a better implementation than I have in mind, the first pop after you pushed 10G of entries onto the queue would be a very ugly thing to have happen inside an OS kernel. Subsequent pops might be fast and amortize in theory, but won’t make up for the interrupts you didn’t catch.

Or have I failed to get it, like most of the theory students did?

Or have I failed to get itAlas no, you DID get it I guess.

It’s indeed very simple:

-Have an “input” stack and an “output” stack.

-Whenever an item come in push it onto the input stack.

-Whenever an output is requested and the output stack is not empty pop from it.

-Whenever an output is requested and the output stack is empty flush the whole input stack onto the output stack before proceeding.

– Since each item is pushed/popped only twice this is obviously O(n).

But that’s MATHEMATICISM AT IT’S WORST, totally disconnected from reality.

That’s what is taught in universities, no wonder that all software is such a pile of crap.

The foolishness of mathematicians has been pointed long ago by E.W.Dijkstra.

While I am at it (snarking), another terminally ill “genius” is Oleg Kiselyov, his website is filled to the brim with astoundingly insightful yet

uselessresults, because these are only extremely sophisticated TOY EXAMPLES, not the nagging real problems found in everyday practice.By the way, according to my systems folks this stack simulation is used in real code. So perhaps more realistic than you think.

If I recall the history correctly, useless all of discrete mathematics was totally disconnected from reality until the computer came along. So I don’t think it’s a good idea to discourage mathematicians from pursuing useless but interesting inquiries.

@Prof Lipton: I do like the 2 stacks == 1 queue question. It isolates testing for understanding of concepts and the ability to do basic manipulations, from testing for puzzle-solving ability. A student should be able to discover the algorithm. There just aren’t that many possible ways of passing data through two stacks where every input is eventually an output. And the O(n) analysis is then straightforward. The nearest thing to a “trick” is the amortization.

Why did the theory students tend to fail this? Was it just that they weren’t fluent with queues and stacks?

Subsequent pops might be fast and amortize in theory, but won’t make up for the interrupts you didn’t catch.You can actually get a constant lower bound instead of an amortized one. This is a classic of purely functional data structures : you can implement a queue with two lists (exactly the problem in this post), but you can then use lazy lists and a slightly clever implementation to get O(log n) delay for atomic operations in the worst case, and then use another clever trick to get O(1) delay in the worst case.

The idea of the first trick is to interleave “append” “reverse” computation steps : you reverse and add the input list incrementally to the output list instead of doing it only when the output list is empty.

The second trick is a subtle technique to force incremental evaluation of the resulting lazy list.

I discovered those techniques in Chris Okasaki’s wonderful “Purely Functional Data Structures” book. Queues were specifically described in his article “Simple and Efficient Purely Functional Queues and Deques”.

The presentation style you described reminded me of Larry Lessign’s presentation style (Larry Lessig is the main proponent of Creative Commons License) Many people like this style so much they follow his presentation strategy

Please look at the type of presentation by Larry Lessig (style)

http://presentationzen.blogs.com/presentationzen/2005/10/the_lessig_meth.html

Bernard’s presentation style could be different –

I have heard several variations of the Mac Lane quotation that also appears on the Wikipedia page in the past and wondered what the original statement was. The Wikipedia page unfortunately does not provide a citation. Categories for the Working Mathematician by Mac Lane has the following:

“As Eilenberg-Mac Lane first observed, `category’ has been defined in order to be able to define `functor’ and ‘functor’ has been defined in order to be able to define ‘natural transformation.'”

Benjamin Pierce’s book has the following quotation from one of the original Eilenberg-Mac Lane papers.

“It should be observed that the whole concept of a category is essentially an auxiliary one; our basic concepts are essentially those of a functor and a natural transformation…”

For an even more concrete example of why the lack of a canonical isomorphism between a vector space and its dual is important it helps to look at a vector space over the binary Z2 finite field. In that setting, you can’t define an inner product on the vector space–one usual means of inducing a natural duality map when working over the reals.

From the 1000 foot view I’ve found Bernard Chazelle’s natural algorithms idea intriguing. However, I don’t find the flocking work to be as compelling. For the following reason:

I had an economics professor who worked on some of the earlier computer simulations. One day in class I asked about the long term equilibrium behavior of a model we were studying. He replied that it didn’t matter. The models were all rubbish in the long term anyway due to inaccuracy and exogenous shocks. The point of the models were to study shorter term cause and effect.

Consequently, I’m skeptical about whether the sorts of asymptotic bounds achieved are all that relevant. Is there some reason this argument is incorrect that I’ve overlooked? Perhaps there are other more compelling examples of algorithmic thinking used for analysis in the natural sciences.

Nice points. One wonders what happens if we insist that the birds do not reach steady state, but only have to get close to steady state. I do not know what happens then.

Chazelle responded to this concern somewhat, here:

http://blog.computationalcomplexity.org/2009/11/multi-agent-biological-systems-and-nat.html?showComment=1257902789089#c1155632379743181735

The point, I think, is that we need to build new tools of algorithmic analysis. Lyapunov functions have been very useful in control theory, but their application is perhaps more an art than a science. New combinatorial tools to analyze nondeterministic paths (such as Chazelle’s “virtual bird”) are a theme that this blog has visited several times.

Some of the considerations that motivate the ideas of “naturalness” in algebraic geometry and category theory are expressed in the Preface to John Lee’s

Introduction to Smooth Manifolds, and then fleshed-out in Mac Lane’sMathematics: Form and Function.Here are some excerpts from Lee’s Preface:

———–

Over the past few centuries, mathematicians have developed a wondrous collection of conceptual machines designed to enable us to peer ever more deeply into the invisible world of geometry in higher dimensions.

The price we pay for this power, however, is that the machines are built out of layer upon layer of abstract structure.

Starting with the familiar raw materials of Euclidean spaces, linear algebra, and multivariable calculus, one must progress through topological spaces, smooth atlases, tangent bundles, cotangent bundles, immersed and embedded submanifolds, tensors, Riemannian metrics, differential forms, vector fields, flows, foliations, Lie derivatives, Lie groups, Lie algebras, and more … just to get to the point where one can even think about studying specialized applications of manifold theory such as gauge theory or symplectic topology.

The student’s main job is to absorb all the definitions and learn to think about familiar objects in new ways.

It is the bane of this subject that there are so many definitions that must be piled on top of one another before anything interesting can be said, much less proved.

The old joke that “differential geometry is the study of properties that are invariant under change of notation” is funny primarily because it is alarmingly close to the truth.

——-

Here are excerpts from Mac Lane:

——-

Modern computers have made big matrices manageable, but unfortunately many current texts of elementary linear algebra bury the ideas under a morass of muddled matrix manipulations with no understanding of the concepts.

“It has taken me over fifty years to understand the derivation of Hamilton’s equations … the point of this cautionary tale is the difficulty in getting to the bottom of it all.

Effective or tricky formal manipulations are introduced by Mathematicians who doubtless have a guiding idea—but it is easier to state the manipulations than to formulate the idea in words. Just as the same idea can be realized in different forms, so can the same formal success be understood by a variety of ideas. A perspicacious exposition of a piece of Mathematics would let the ideas shine through the display of manipulations.

———

It is clear that commutative diagrams (for example) were conceived by Mac Lane as a tool to “let the ideas shine through” what would otherwise be “morasses of muddled manipulations”.

Modern systems engineering can be understood as the confluence of these ideas. In order to “let the ideas shine through”, diagrammatic techniques of every variety are finding increasing use in engineering.

Didn’t you talk about this very question here:

https://rjlipton.wordpress.com/2009/03/12/liskov-wins-the-turing-award/

I did indeed.

lmao cool story dude.

Very nice site!

As a Newbie, I am always searching online for articles that can help me. Thank you

I don’t like the sound of all those lists he’s making – it’s like intriguing too innumerable notes at seminary; you feel you’ve achieved something when you haven’t.