A puzzle and a conference

Zohar Manna is an expert on the mathematical concepts behind all types of programming. For example, his 1974 book the Mathematical Theory of Computation was one of the first on the foundations of computer programming. He wrote textbooks with the late Amir Pnueli on temporal logic for software systems. As remarked by Heiko Krumm in some brief notes on temporal logic, there is a contrast between analyzing the internal logic of pre- and post-conditions as each statement in a program is executed, and analyzing sequences of events as a system interacts with its environment.

Today I want to talk about an encounter with Zohar years ago, and how it relates to a puzzle that I love.

Zohar did one of the coolest PhD theses ever. It was on schema theory, which we have talked about before here. He was advised by both Robert Floyd and Alan Perlis in 1968—two Turing Award winners. The thesis Termination of Algorithms was short, brilliant, a gem. I recall when I started as a graduate student reading it and thinking this was beautiful. His own abstract is:

The thesis contains two parts which are self-contained units. In Part 1 we present several results on the relation between the problem of termination and equivalence of programs and abstract programs, and the first order predicate calculus. Part 2 is concerned with the relation between the termination of interpreted graphs, and properties of well-ordered sets and graph theory.

## The Encounter

I will explain the puzzle in a moment, but first let me describe the encounter I had with Zohar. At a STOC, long ago, I saw him and started to explain my result: the solution to the puzzle. After hearing it he said that he liked the solution and then added that he once had worked on this problem. I said that I thought I knew all his work and was surprised to hear that. He smiled and seemed a bit reluctant to explain. Eventually he explained all.

It turned out that he and Steven Ness had a paper On the termination of Markov algorithms at the 1970 Hawaiian International Conference on System Science, held in Honolulu. Zohar explained the conference was not the “best,” but was held every year in the Hawaiian Islands. In January. Where it is warm. And sunny. It sounded like a great conference to me.

I soon went to a couple of these conferences. I stopped going after they accepted one of my papers, then rejected it—this is a long story that I will recount another time. The “accepted-rejected” paper finally did appear in an IEEE journal.

Zohar explained that he went to the conference for the same obvious reasons. He also explained to me the “three-person rule.” The Hawaiian conference was highly parallel and covered many areas of research. Zohar said that you were always guaranteed to have at least three people in the room for your talk: There was the speaker, the session chair, and the next speaker. Hence the three-person rule.

## The Puzzle

The issue is the distributive law:

$\displaystyle A \times (B+C) \rightarrow A \times B + A \times C.$

Consider any expression that has only variables and the two operations plus (${+}$) and times (${\times)}$. Suppose one applies the distributive law to the expression, in any order. One stops when there are no further possible applications. The question, the puzzle, was: Does the distributive law always stop? Of course it does. Or does it? The puzzle is to prove that it actually does stop.

I raised this recently as my favorite result, with a smile, during my Knuth Prize lecture at FOCS. I said I had a nice solution and would be glad either to give it or let the audience think about it. They seem to want to think about it, so I gave no solution.

My lecture had been right before dinner, and the next day I spoke to some people about whether they’d thought about it. A few said they had some idea of how to proceed, but no one seemed to have a proof. The reason the problem is a bit challenging is that the rule increases the size of the expression: two copies of ${A}$ now appear instead of one. This means that any local measure of structure may fail.

Indeed Zohar’s proof uses a well ordering argument that is not too hard, but is perhaps a bit hard to find. Check out his paper with Ness.

## A Solution

The first thing I noticed immediately when I heard the problem—see here for the context—was this: The distributive law preserves the value of the expression. We apply it because it expands the expression but it does not change the value of the expression. A rule like

$\displaystyle A \times (B+C) \rightarrow A \times B + B \times C$

does not preserve value. So who cares whether it halts or not?

But the distributive law preserves the value. So here is a proof based on that observation. Notice the following two things:

1. The value is the same as the rule is applied.
2. The size of the expression grows as the rule is applied.

The trick is to replace all the variables by ${2}$. The value stays the same, but it is easy to argue that if the rule never stops then eventually the value of the expression would increase without bound. This is a contradiction.

## Open Problems

Are there other termination problems that can be attacked in this way?

1. November 12, 2014 12:10 pm

Nice trick, but unless I’m wrong a simpler observation works here. Just notice that if two expressions are added, then they behave as two disjoint copies of the problem. So after applying the rule and then separating the two terms that are added after it, we find that both new expressions are shorter than the original one.

November 12, 2014 12:40 pm

In response to domotorp: The difficulty is that the law may be applied “in any order”, including to a subexpression of another point of application. So, it’s not enough to reason only about the two shorter terms, since the outer term can now grow.

November 12, 2014 12:44 pm

I’d like to propose a simpler (and very precise) argument.

Let us assign to every expression a natural number, by
– replacing every variable with 1
– replacing every (x + y) with (x + y) — ok, this one is basic
– replacing every (x \times y) with (x^2 + y^2)

One can see, that applying the rewriting rule (distributive law) strictly lowers the number assigned to the expression, because (a+b)^2 < a^2 + b^2 and the number is always non-negative.

November 12, 2014 12:53 pm

Well, obviously there should be * instead of + in the third bullet

November 12, 2014 1:18 pm

I guess it is not really the same sort of thing, but this problem reminds me of this paper: http://arxiv.org/abs/1101.1667 , where the authors consider iteratively applying various operations to a regular language, and then show that you can only get finitely many languages from any given starting language.

November 12, 2014 5:35 pm

The easy-to-understand (and widely implemented) recursive path ordering (FOCS 79) shows that both examples (the value-preserving rule and the non-preserving one) always halt, for any starting expression, regardless of the order in which the rule is applied.

Just make x more “significant” than +. To compare two expressions with the same outermost symbol, compare their arguments (either lexicographically or as multisets, where one big element can “cover” any number of smaller ones). If the outermost symbols differ (as in the distributive law), then compare the one with the more significant symbol [in this case Ax(B+C)] with the subexpressions of the other [AxB and AxC]. If the former is bigger (recursively) than the latter, then it [Ax(B+C)] is bigger; if one of the latter subexpressions is bigger, then it is the smaller. Furthermore, an expression is always bigger than its subexpressions.

Ax(B+C) > AxB + AxC
because
Ax(B+C) > AxB and Ax(B+C) > AxC
because
{A,B+C} > {A,B} and {A,B+C} > {A,C}
because
B+C > B,C.

November 12, 2014 7:49 pm

“The size of the expression grows as the rule is applied.”

I must be missing something here.

A x (B + C) -> A x B + A x C

It seems to me that the size of the expression is the same as the rule is applied; similar to the behavior of the value.

Before, we have 7 symbols (omitting spaces): ‘A’, ‘B’, ‘C’, ‘x’, ‘(‘, ‘+’ and ‘)’. And after, we continue to have 7 symbols: two ‘A’s, ‘B’, ‘C’, two ‘x’s and ‘+’.

The rule exchanges ‘(‘ and ‘)’ for ‘A’ and ‘x’ then it rearranges the symbols.

• November 12, 2014 8:22 pm

The point—which could have been clearer—is that “A” might itself be a larger expression inside parentheses. Indeed, when you “unfactor” any polynomial (A+B)(C+D)(E+F)(G+H)…(Y+Z), in whichever order, you formally are doing this all the time.

7. November 12, 2014 10:51 pm

I don’t mean to be a wet blanket, but I don’t think I buy the argument made in the “A Solution” section that because the value is preserved, but the expression grows, the expression can’t grow forever. There are numerous substitutions that can grow the expression forever but not change the value, e.g. x -> x/2 + x/2. That substitution rule can be applied arbitrarily many times to yield an arbitrarily large expression for a constant value. Of course nobody’s disputing that distribution does finish, but I don’t think it’s a valid proof; some additional property of distribution must be used.

For example, if, on each step, you always distribute the operator at the the deepest nesting level that can be distributed, and allow for double-distribution (both sides of a single operator at once) to count as a single step, the number of pairs of brackets containing two or more items always decreases in each step, and there were finitely many at the start, so there are finitely many steps. That’s also a bit hand-wavy, in that it’s just a rough summary, but it’s a bit more solid.

• November 12, 2014 11:03 pm

To partly reverse what I just said, the proof in the “A Solution” section is valid if one simply uses that addition and multiplication of a number that is at least 2, by 2, will only ever increase its value by at least 2.

8. November 13, 2014 12:12 am

There was the speaker, the session chair, and the next speaker. Hence the three-person rule.

I know this conference series. When we are young our (with my brother) paper(*) accepted and published in the conference proceedings but we could able to attend. So this may be an counter-example to the “three-person rule”.

(*) I. Cahit and R. Cahit, “Generation of trees, cotrees and 2-trees of a linear graph”, Conference on Information and System Sciences, Honolulu, Hawaii, 1972.

• November 13, 2014 12:23 am

9. November 13, 2014 2:47 pm

I think I’m not understanding something here. In your description of the puzzle, you say the initial expression involves only variables and the symbols for ‘plus’ and ‘times.’ But the distributive law also involves parentheses, and they are essential to its meaning. Applying the distributive law strictly decreases the number of parentheses in the expression. Since there were only finitely many to begin with, this must stop at some point.

What am I missing?

• November 14, 2014 4:35 pm

Agreed — arguing that the number of parentheses decreases was my first thought. Why does this not constitute a proof?

November 15, 2014 7:46 am

That was also my idea – each application of the distributive law decreases the number of parentheses by two, making the process eventually stop. I suppose Dick and Ken have a more semantic proof. Maybe a syntactic proof is implicitly forbidden?

November 16, 2014 8:06 am

It’s wrong. If A contains parentheses, they all get duplicated. So the number of parantheses doesn’t strictly decrease at each application of the distributive law.

November 16, 2014 3:32 pm

OK, but each step decreases the number of parentheses of each term in the resulting sum. So, if you apply distribution to each term in parallel at each step, the term with maximum number of parentheses will eventually have zero parenthesis.

November 16, 2014 5:58 pm

To make it trivial, let’s say B and C have no parantheses. If A has 1 million parentheses, then the whole initial expression has 1,000,002 parentheses, and the resulting expression has 2,000,000 parentheses. And it’s just a trivial case.

Intuitively this can’t keep exploding like that forever. But go prove it formally if you can. That’s the challenge. Obvious things can be hard to prove.

November 16, 2014 6:20 pm

Alright, I must say that I went through a similar argument as I was reading the article. And I totally dismissed it when I saw the much simpler argument.

As I was going through my own initial argument, I was seeing A as a flyweight object (it’s an OOP pattern). At each distributive expansion, the number of pointers to flyweights would often increase but the parantheses within all flyweights would in general decrease. So there might be a way to prove it as you said. But it could be laborious.

November 16, 2014 6:32 pm

The new expression is indeed made of two terms with 1,000,000 parentheses each. This is actually two less than the old one – which I guess can be generalized and formalized without difficulty.

December 6, 2014 9:14 pm

Stipulate that parens removal proceeds from the innermost parens first to the outermost last?

November 13, 2014 2:49 pm

I’m not a mathematician, but there appears to be direction discrepancy on the distributive property. If you distributed out the expressions to a preset point, then you would make ax + bx into x(a+b) by factoring. This eliminates the terms until all that is left is the value of the expression. Working it that way, it definitely ends.

Working the other way in your puzzle, every time you break a variable x = x1 + x2, you are creating one possible case of an original, unfactored problem.

So it looks like your first conclusion is temporally opposite your second conclusion.

November 14, 2014 8:54 am

RIP Alexander Grothendieck (1928-2014).

Wonder how he’d have solve this one. However he’s changed the face of mathematics forever, for better or for worse. There’s one like him every century… well I mean the good centuries, of course.

12. November 14, 2014 3:26 pm

Grothendieck encourages us to “Paint like Titian”

Dick Lipton’s call for open problems inspires the following tribute to Alexander Grothendieck:

Lipton’s call for open problems
Are there other termination problems that can be attacked in this way [namely]:

• The value is the same as the rule is applied, and

• The size of the expression grows as the rule is applied.

Response Glory — and plausibly a Fields Medal too — await any mathematician who prove the resolution of singularities by the following Grothendieck-inspired “Liptonian” method.

Physical example  [1] Associate (somehow) to any algebraic variety a set of symplectic structures and Hamiltonian symbols, [2]  Arrange (somehow) that the Poisson brackets associated to the Hamiltonian symbols canonically capture the singularity-structure of the variety, [3] show (somehow) that the canonical symplectic structures and symbols lift to a nonsingular variety having the same Poisson-backet algebra as the canonical varietal dynamics.

Physical conjecture  [A] Nature’s canonically unique singularity-resolving variety is Hilbert-space; [B] Nature’s canonically unique Hamiltonian symbols are those of `t Hooftian gauge field field theory; [C] Nature’s canonically unique underlying algebraic state-space is (what else?) the state-space manifild of M-theory; and [D] Nature’s canonically unique underling dynamics is (of course!) quantum gravity.

Conclusion  Grothendiek’s ideas marvelously encourage every young researcher to dream that they can “paint like Titian!” We can all appreciate, and share, and be grateful for this wonderful and inspiring hope, as Grothendieck’s legacy to everyone.

• November 15, 2014 6:02 am

In response to comments regarding Grothendieck on Shtetl Optimized, I have quoted there some comments by Juan Antonio Navarro Gonzalez regarding Grothendieck’s celebrated question What is a meter?; the point being that (as it seems to me)

Resolving the Harrow-Kalai quantum computing debate  Grothendieck’s mathematical worldview encourages us to regard Harrow-style Hilbert dynamics as the natural resolution of Kalai-style varietal dynamics. And this resolution’s realization is not as some far-future postulate, but an eminently practical worldview for which transformational engineering applications are near at-hand.

Here the main intended point, is that we can all of us be grateful — young researchers especially — for the dazzling wealth of inspiration that Grothendieck’s life and work provide to our 21st century STEAM community.

13. November 14, 2014 3:47 pm

We want a feature on Grothendieck and TCS that does not involve the Grothendieck constant:)!!

• November 16, 2014 5:43 pm

It was already being done as you wish🙂, but took awhile. The link you gave below was especially helpful among the obits, and I led with it.

14. November 17, 2014 12:43 pm

Consider a 2D lattice. At the beginning some (but finite) squares have a clonal plant, the others are empty. If a square is occupied, it remains occupied. If at least 2 neighbors of an empty square are occupied, it will be occupied next year. The grow will stop after a finite number of years.