Real-time computation and the start of stringology

Zvi Galil is my boss’s boss: he is the Dean of the College of Computing at Georgia Tech. He is also one of the world’s experts on stringology, an area that he helped create and name in 1984. He is well known for many other beautiful results—perhaps one of his most famous is the paper with Ofer Gabber entitled: Explicit Constructions of Linear Size Superconcentrators.

Today I want to talk about another result of Zvi that is quite surprising and helped launch the area of stringology.

I still remember when Gabber and Galil announced their breakthrough result in 1979. I was at Berkeley, and a group of us got their paper and began to study it, trying to understand the proof. One of the key lemmas was a deep inequality about complex numbers—do not let it scare you:

Lemma: Let ${\{a_{m,n}\}}$ be a doubly infinite array of complex numbers. Also assume ${a_{0,0}=0}$ and ${\sum_{m,n} |a_{m,n}|^2 < \infty}$. Then

$\displaystyle \sum_{m,n} |a_{m+2n,n} - a_{m,n}|^2 + \sum_{m,n} |a_{m,n+2m} - a_{m,n}|^2 \ge 8c\sum_{m,n} |a_{m,n}|^2.$

Here ${c}$ is equal to ${(2- \sqrt 3)/4}$.

Their proof showed that a certain explicit graph was essentially an expander. Their idea was to think of this property not as a graph property, but as an inequality. This is a powerful idea that allowed them to use methods from analysis to prove their theorem. It is another example of the unexpected connections between different parts of mathematics.

I recall spending a good part of the summer trying to really understand where this inequality came from, and why it was true. While their construction has since been improved, the original paper and the proof still stand as one of the great results of theory.

Importance, Again

What makes a problem important? I have discussed this many times before, but it is still one of the questions that I think about, especially late at night. Some problems are clearly important because they are hard, because they “fight back.” Certainly the work on superconcentrators was important—by this measure and others.

Some problems are important because they combine a lovely realistic model with deep mathematics. The other problem of Galil that helped start stringology is a combination: it is a hard problem, and a problem that is about a realistic model of computation. Curiously the model of computation has somehow become less studied than it was years ago—I am not sure why that is true. The model concerns real-time computation. In order to be concrete we will define real-time only for Turing machines, but the notion can and has been defined for almost any model of computation.

In a real-time computation we imagine that a Turing machine is required to read one character from its input tape at each time step. It is not allowed to compute multiple steps before reading the next input character—it must read one. This can be relaxed to insist that the machine read a character every constant number of steps—what is important is that the time between reading input symbols is bounded. The machine must then give its verdict within the same bounded time after reading the last input character. This is a tremendous restriction on algorithms; even a linear-time computation is rarely a real-time computation, since it may read in the whole input and then rescan it. Of course a linear-time computation takes constant amortized time per input symbol, but the real-time constraint is much stronger.

In modern terms the real-time restriction is a reasonable and practical model: in many situations data arrives whether or not you want it to. An internet router gets packets as they come. It can buffer or store them, but it cannot just ignore them. Further, performance requirements dictate that the router be as close to real-time as possible; otherwise packets will be delayed or even lost.

I think the interest in real-time computation in the past, especially algorithms on Turing machines, was not driven by practical concerns, however. It was driven by the surprise that real-time Turing machines were so powerful. Even with this restriction they could do magical things. One of the classes of things they could do involved basic questions about strings.

So let’s turn now to stringology.

Palindromes-semordnilaP

Recall a string ${x}$ is a palindrome if it is identical to ${x^R}$, where ${x^R}$ is the reversal of the string ${x}$. Palindromes are strings that read the same forward as backward. Some favorites are:

• A man, a plan, a canal: Panama
• Able was I ere I saw Elba
• Never odd or even

Note in these examples we are a bit loose about the notion: we do not count spaces or punctuation, and we ignore case. Thus the last example as a string should be

$\displaystyle \text{neveroddoreven}$
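This loose convention is easy to make precise. Here is a small Python sketch (the helper name is my own choice, not from the post):

```python
def is_loose_palindrome(s: str) -> bool:
    """Check whether s reads the same forward and backward after
    dropping spaces and punctuation and ignoring case."""
    t = "".join(ch.lower() for ch in s if ch.isalnum())
    return t == t[::-1]
```

All three examples above pass this check, while an ordinary word like “stringology” does not.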

The notion of palindrome is a mathematical one: the string reads the same forward or backward. However, it is not the same picture forward and backward: only the symbols are the same, not the images. For instance, “${\text{A}>\text{A}}$” is a palindrome as a string, but not as a figure, since backwards it would be “${\text{A}<\text{A}}$.”

There has been work on writing that reads the same from different orientations—this has nothing, I believe, to do with stringology. But it is beautiful work. The artist John Langdon has created many pictures with miraculous symmetry properties.

The Discovery: Act I

Zvi tells me that in the fall of 1975 he was a postdoc at IBM Yorktown Heights. Albert Meyer was also visiting there, and one day he handed Zvi a handwritten manuscript. It was a translation of a Russian paper by Anatol Slisenko—the translation had been made by Bob Daley. The paper had one result. An amazing result. Slisenko had proved:

Theorem: There is a multi-tape Turing Machine that can recognize initial non-trivial palindromes in real-time.

Galil explains that he thought this result was “incredible and impossible.” Real-time is a very strong constraint, and Zvi could not believe that palindromes could be recognized by such machines.

The intuition I imagine he had is this: Suppose the machine has read ${x_1,\dots,x_k}$ of the input ${x_1,\dots,x_n}$. Since the machine is real-time, it has only constant time to decide whether ${x_1,\dots,x_k}$ is or is not a palindrome. It just seems impossible.

Yet Galil had a translation of Slisenko’s paper: Recognizing a symmetry predicate by multihead Turing machines with input. Here “symmetry” predicate refers to palindromes. The paper was huge. Slisenko’s journal paper was 173 pages, and Daley’s translation was ${500+}$ handwritten pages.

Galil was taken aback by Slisenko’s result. He decided to try to read the paper, but after about 30 of the 500 pages he gave up. Yet he felt, correctly as it turned out, that there should be a much shorter and simpler proof. He set out to find it. He soon succeeded, with a proof that is only 15 pages. Of course some of the notions of Slisenko were used, some were modified, and many new ideas were added by Zvi.

And as he told me:

This began my love affair with string algorithms.

The Discovery: Intermission

I cannot do justice to Galil’s proof, so I will not even try to explain it in any detail. The critical insight is the following “meta” theorem:

Theorem: If a linear time off-line algorithm ${\cal A}$ satisfies the predictability condition, then ${\cal A}$ can be transformed into a real-time algorithm.

The predictability condition is a technical condition on the behavior of an algorithm—it is due to Galil, and is one of the principal reasons his proof is so much shorter. The intuition is this: if the algorithm ${\cal A}$ spends too much time processing some input symbol, then it must be the case that many of the next output symbols are ${0}$. This “prediction” allows the theorem to convert an algorithm that is not real-time into one that is. See his paper for the details.

The Discovery: Act II

The story does not end there: it was really the beginning of a much longer story. Slisenko answered by creating a much shorter proof of his own that is published here. The abstract is: “A comparatively short proof is given of the recognizability of palindromes in real-time on multitape Turing machines. It is based on the same idea as the original proof by the author, and on Z. Galil’s idea for simplifying the proof by using the Fischer-Paterson algorithm for finding all symmetric suffixes in linear time.”

The Discovery: Act III

Perhaps more important than these papers is that they helped launch an area of theory that is called Stringology. It is roughly the study of strings, their properties, and algorithms. It is quite active today: there are countless papers, current workshops, and journals devoted to this important area.

In 1997 Zvi gave a talk about the state of stringology; the talk’s title was: Open Problems in Stringology, Thirteen Years Later. His abstract is quite amusing, as are many of his talks. Somehow he got “thirteen” worked into the abstract three times: The talk was on Friday, June thirteen, 1997; it was about his 1984 talk where he created the field and listed thirteen open problems; and his 1984 talk was exactly thirteen years earlier.

Stringology is doing well today. Modern string problems often allow approximations—since the exact questions are too hard—and many algorithms are probabilistic. Alberto Apostolico is also at Tech, and is another leader of this important area.

$\displaystyle \S$

I cannot resist giving one elementary result about strings, yet one that I find amazing. I guess one insight from stringology is that the notion of strings seems so simple, so basic, yet strings have many wonderful properties and surprising structure. In my opinion that is the importance of much of stringology: the illumination of how to think about and manipulate strings.

The problem I have always liked is this: Suppose that ${x}$ and ${y}$ are two strings over some alphabet. When are there positive integers ${m,n}$ so that

$\displaystyle x^m = y^n \ ?$

Here ${x^m}$ is the concatenation of ${m}$ copies of ${x}$; for instance, ${(abc)^3 = abcabcabc}$. The question is, given ${x}$ and ${y}$, determine whether or not there are ${m,n}$ so that ${x^m = y^n}$. A moment’s thought should allow you to see that there is a simple quadratic-time algorithm. Just let ${m = |y|}$ and ${n = |x|}$. Then check to see if ${x^m}$ is equal to ${y^n}$. It is not hard to see that this yields a correct quadratic worst-case algorithm. The surprise, in my opinion, is that there is a simple linear-time algorithm. The algorithm is just to check whether or not

$\displaystyle xy = yx.$

Of course the hard part, like many fast algorithms, is proving that this is correct. It is because of the following theorem:

Theorem: Suppose that ${x,y}$ are two strings. Then there are positive ${m,n}$ so that ${x^m = y^n}$ if and only if ${xy = yx}$.

The proof of this theorem is not too hard, but it is typical of many string theorems. It requires a non-trivial argument, with careful case analysis. Added: here is a reference to a proof of this theorem.
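Both algorithms fit in a couple of lines. Here is a sketch in Python (the function names are my own, and nonempty input strings are assumed):

```python
def powers_equal_quadratic(x: str, y: str) -> bool:
    """Decide whether x^m = y^n for some positive m, n by brute force.
    Taking m = |y| and n = |x| makes both sides have length |x|*|y|,
    so the comparison costs quadratic time in the worst case.
    Assumes x and y are nonempty."""
    return x * len(y) == y * len(x)

def powers_equal_linear(x: str, y: str) -> bool:
    """The same decision in linear time: by the theorem above,
    x^m = y^n for some positive m, n if and only if xy = yx."""
    return x + y == y + x
```

For example, both functions agree that `"ab"` and `"abab"` have a common power, while `"ab"` and `"ba"` do not.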

Open Problems

Galil asked me if the original Slisenko paper was one of the longest papers in Theory to date. Recall Slisenko’s original paper was 173 journal pages. What do you think?

January 12, 2011 11:02 am

If you like palindromes you should check out this classic by Weird Al Yankovic

This is obviously “inspired” by Bob Dylan’s video of “Subterranean Homesick Blues” as appeared in the movie “Don’t Look Back” (I can’t seem to locate it, unfortunately).
Al’s song is written entirely in palindromes (including your “never odd or even”, but also such classics as “go hang a salami, i’m a lasagna hog” and others), and it does not make much less sense than Bob’s, which is not to say it makes much sense 🙂

Marcin

January 12, 2011 11:16 am

In the first series in the inequality of lemma 1, is that supposed to be 2n rather than 2n,n?

January 12, 2011 2:22 pm

jules

will check

3. January 12, 2011 11:26 am

And there was a Stringology workshop this year, again 13 years later, which again Galil attended: http://www.cs.biu.ac.il/stringology2010/

4. January 12, 2011 3:32 pm

What is the relationship between stringology and combinatorics on words ?

January 12, 2011 9:02 pm

dc

I think they are highly related. I would guess that stringology is more algorithmic, but I have no easy answer.

January 12, 2011 7:07 pm

Can you give a pointer to the proof of theorem that x^n = y^m for some n,m iff xy = yx?

January 12, 2011 11:27 pm

After working it out on paper, it is very easy to go in both directions of the implication.

January 13, 2011 9:01 am

Ross

I will try to get the pointer

January 13, 2011 10:50 pm

Ross,

I remembered the theorem, but finally found a reference. It is Lyndon’s theorem. I think I learned it from Ginsburg’s book on languages. Here is one reference:
here

6. January 12, 2011 7:27 pm

Thanks, what a beautiful article !

“Here x^m is the concatenation of n copies of x; for instance,…”
=> shouldn’t this be “concatenation of m copies of x” ?

January 12, 2011 9:01 pm

PC Gent

Yes, typo-think-o

7. January 12, 2011 8:43 pm

Just for fun: one Chinese palindrome is 上海自来水来自海上 (“Shanghai’s tap water comes from the sea”). There are many popular Chinese palindromes.

Another string, less symmetric and not a palindrome but even more famous and popular, is the following poem:

http://zh.wikipedia.org/zh/%E7%92%87%E7%8E%91%E5%9B%BE

Reading it forward, backward, circularly, from bottom to top, or along the diagonal, one gets different meaningful poems, which the author composed out of longing for her husband.

• January 12, 2011 8:49 pm

The poem is even more symmetric in another light; in other words, the strings obtained by reading it forward, backward, circularly, from bottom to top, or along the diagonal are all strings in the language.

• January 12, 2011 8:57 pm

One question is: under what conditions do all strings in a language have this property? Or, how many strings with this property are in a language?

8. January 12, 2011 9:08 pm

One such trivial language is a^n, and it is the unique one with the property.

January 12, 2011 11:31 pm

I’m not as fascinated that a multi-tape Turing Machine can recognize palindromes in real time. Of course, there is something there because of how restrictive the real time computation model is. But having several tapes is a very powerful freedom. If the same time bounds were true for a single-tape Turing Machine, I would find this endlessly fascinating.

January 13, 2011 1:17 am

I guess “endlessly fascinating” here is a joke, right? Real time implies linear time, and single-tape Turing machines need quadratic time for palindromes.

January 13, 2011 10:57 am

I did not mean this as a joke – no.

I meant what I said: that if it were true that X, then I would be fascinated.

A clear way to communicate this is to highlight that my claim isn’t a claim about how quickly single-tape Turing Machines can recognize the language PALINDROMES. It is a claim about how I would feel if they could do it in real time (or even linear time), which, as it happens, they cannot.

A second clear way to say it is: false -> false is true.

Truly, the post was making a point about intuition. I was basically saying: the work is cool, however I don’t find the results as surprising as you’ve suggested I should.

• January 14, 2011 7:21 am

In fact, when real-time computation is considered, the input is assumed to arrive on a separate device – a read-only input tape, not on the worktape. Palindromes can be recognized in linear time on such a machine using a single worktape. Can they be recognized in real time?

• January 14, 2011 10:10 am

Hmm, I have not been very careful: I forgot about the directionality of the input tape.
It has to be two-directional for the linear-time palindrome detection. In real-time computation, you cannot use the two directions, so a real-time solution would imply a linear-time solution with a one-way tape. Is this possible?

January 13, 2011 6:56 am

By the way, if we relax real-time to allow randomization, require the algorithm to output the correct answer with high probability, and allow some further (reasonable, I think) relaxations, there is a very simple solution based on Karp-Rabin fingerprints. The relaxations are to either 1) allow O(log n) steps of computation between input symbols; OR 2) define real time on a RAM, and make sure that the length of the string fits in a word — then O(1) RAM operations between reading input symbols suffice.

Say the string is $x \in \{0, 1\}^n$. We can think of $x$ as a number in binary. It’s enough to keep $A = x \pmod{p}$ and $B = x^R \pmod{p}$ for a randomly chosen prime $p$ from a large enough range $[0, U]$. If we have $A$ and $B$ for the first $t-1$ bits of $x$, it’s easy to update them when the new bit $x_t$ arrives: $A := 2A + x_t \pmod{p}$ and $B := B + 2^t x_t \pmod{p}$ (the current value of $2^t \pmod{p}$ can be kept and updated on each step). Now we only need to choose a large enough $U$, and I believe something on the order of $n^2$ will suffice to make sure that with high constant probability $A = B$ implies $x = x^R$. Then all computations can be done with numbers that fit in $O(\log n)$ bits, and there are only a constant number of arithmetic operations done between updates.
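To make the updates concrete, here is a minimal Python sketch of this fingerprint scheme. The class name is made up, and the prime is fixed rather than randomly chosen from $[0, U]$, so this illustrates only the O(1)-per-bit update rule, not the full randomized guarantee:

```python
class PalindromeFingerprint:
    """Streaming check of whether the bit-prefix read so far is a
    palindrome, via Karp-Rabin style fingerprints.  NOTE: a real
    implementation must pick p as a random prime from a large range
    [0, U]; the fixed prime below is only for illustration."""

    def __init__(self, p: int = 2**31 - 1):
        self.p = p
        self.fwd = 0    # A = x mod p: the prefix read as a binary number
        self.rev = 0    # B = x^R mod p: the reversed prefix as a number
        self.pow2 = 1   # 2^t mod p, where t bits have been read so far

    def push(self, bit: int) -> bool:
        """Consume one bit using O(1) arithmetic operations.
        Returns True iff the fingerprints of the prefix and its
        reversal agree (a palindrome, up to fingerprint collisions)."""
        self.fwd = (2 * self.fwd + bit) % self.p
        self.rev = (self.rev + self.pow2 * bit) % self.p
        self.pow2 = (self.pow2 * 2) % self.p
        return self.fwd == self.rev
```

For instance, feeding the bits 1, 0, 1 reports a palindrome after the first and third bits but not after the second.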

On a different note: isn’t the real-time model somewhat subsumed by the streaming model? Streaming is a little bit of a relaxation, since randomization and polylog many steps per update are allowed. Also, the small-space requirement is made explicit in streaming, although I believe that with real-time computation the limited time-per-update restriction, together with a restriction of limited post-processing time (is this restriction made in real-time computation?), would imply limited space usage. Either way, to me streaming seems to capture the essence of the real-time model, but might be a little more closely aligned with realistic models of computation (always hard to argue) and also allows for even more remarkable results.

• January 13, 2011 10:04 am

Sasho, from an engineering point-of-view, doesn’t the viability of randomization methods depend on the tolerance for error?

In particular, if we use a randomized method to filter palindromes from a random input stream—maybe because palindromic inputs “jam” some upstream algorithm—then don’t the false-positives outnumber the true-positives?

As a side note, in the link Dick provided it was interesting to read that:

“In 2005 the Prague Stringology Club extended the range of its research topics also to data compression …”

Gosh … now that Stringology formally encompasses broad-ranging topics like “Lossless zip compression versus lossy jpeg compression” … we are led to wonder … are there any boundaries to the topics in mathematics, physical science, linguistics, and even cognitive science, that are relevant to Stringology?

January 13, 2011 10:33 am

John, if you can tolerate probability $\delta$ for false positives, it’s enough (or even more than enough) to set $U = O(n^2/\delta)$. If you have a long string of length $N$, and you want to test, on each step, if the prefix so far is a palindrome, you can set $\delta' = \delta/N$ to get $U = O(N^2/\delta') = O(N^3/\delta)$, and, with probability $1 - \delta$, there will be no false positives. Notice that any integer inside $[0, U]$ still fits within $O(\log N)$ bits, i.e. one RAM word. In one sentence, the dependence of the number of bits you need for the arithmetic computations on the error probability is logarithmic, so you can afford quite low error rates.

As you can tell I am not an engineer, but I suppose if you expect not to encounter palindromes too often, and you set the error probability to be low, then you can check positive answers deterministically and you won’t have to do it too often. Whether a simple randomized solution is better or worse than a more complicated one is, I guess, a question for the engineers. But my point was that the *proof* is very, very simple – probably simpler than the simple proofs for deterministic algorithms. I am not sure how simple Slisenko’s second algorithm is.

• January 13, 2011 11:23 am

I’m pretty confident that we’re not disagreeing Sasho … for a random input string the probability of it being palindromic is exponentially small, thus (almost) all the candidate palindromes identified by random methods will be false-positives.

One practical example of real-time assessment is the second-by-second decision “Should this airliner’s starboard engine be shut-down, because it is about to explode”? In this example the sole input is the “string” of data provided by the flight sensors … both Type I and Type II errors have unacceptably high costs … and the decision *must* be made in real-time.

In grappling with problems in this class, experience has taught systems engineers to be suspicious of every kind of algorithm … and doubly suspicious of probabilistic algorithms. And yet, there are plenty of probabilistic algorithms that—with careful design and tuning—do work amazingly well in real-world applications; the ethernet protocols for dealing with packet collisions are one example.

January 13, 2011 12:11 pm

We probably are not disagreeing 🙂 Still, engineers use algorithms no matter how suspicious they are of them. And while using the wrong model might make some modes of formal analysis unusable, a carefully analyzed algorithm should always be a good idea. I think Karp-Rabin based algorithms are actually fairly practical: can use shift operators for speed, there is pretty good control on error probabilities, I don’t think there are huge constants hiding anywhere. But I do understand your point, and that’s why good deterministic algorithms are important, not only in complexity analyses.

January 13, 2011 9:02 pm

Does ‘real-time’ mean time n plus some constant for input size n?
If so, then I find this result truly compelling and now have an itching to read Galil’s proof. Is it particularly difficult?

• January 14, 2011 7:26 am

Yes, real-time does mean this. And moreover, there is no end-marker: the machine gives a correct answer for every prefix of the input, all within this tight time bound.

12. April 11, 2012 3:11 pm

I’m curious as to the mentality behind naming the area “stringology” and calling these “string algorithms” when strings are a very specific computer science artifact and the majority of the math is graph-based, producing algorithms for classes of graphs with certain properties.

Maybe it has to do with grants the authors are funded by and the journals where they are publishing.