## Separating Words by Automata

*Another exponential gap in complexity theory?*

[ From his home page ]

Jeffrey Shallit is a famous researcher in many things, including number theory; he is also a well-known skeptic. He has a colorful website with an extensive quotation page—one of my favorites, by Howard Aiken, is right at the top:

Don’t worry about people stealing an idea. If it’s original, you will have to ram it down their throats.

Today I thought I would discuss a wonderful problem that Jeffrey has worked on.

Jeffrey’s paper is joint with Erik Demaine, Sarah Eisenstat, and David Wilson. See also his talk. They say in their introduction:

Imagine a computing device with very limited powers. What is the simplest computational problem you could ask it to solve? It is not the addition of two numbers, nor sorting, nor string matching—it is telling two inputs apart: distinguishing them in some way.

More formally:

Let $x$ and $y$ be two distinct long strings over the usual binary alphabet. What is the size of the smallest deterministic automaton that can accept one of the strings and reject the other?

That is, how hard is it for a simple type of machine to tell $x$ apart from $y$? There is no super cool name for the question—it is called the *Separating Words Problem* (SWP).
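
To make the definition concrete, here is a tiny brute-force sketch (my own illustration, not code from the paper) that computes the measure for very short binary words. It uses the fact that a $k$-state DFA can separate $x$ from $y$ exactly when some $k$-state transition function sends them to different end states; making one of those two end states accepting finishes the job.

```python
from itertools import product

def end_state(delta, s):
    # Run the DFA from start state 0; delta maps (state, symbol) -> state.
    q = 0
    for ch in s:
        q = delta[(q, ch)]
    return q

def sep(x, y, max_states=5):
    """Number of states of the smallest DFA separating x and y (brute force).

    Exponential in the number of states, so only usable for tiny words.
    """
    assert x != y
    for k in range(1, max_states + 1):
        # Enumerate every transition function on k states over alphabet {0,1}.
        for trans in product(range(k), repeat=2 * k):
            delta = {(q, c): trans[2 * q + int(c)]
                     for q in range(k) for c in "01"}
            if end_state(delta, x) != end_state(delta, y):
                return k
    return None  # not separable within max_states states

print(sep("0101", "0111"))  # 2: the parity of the 1s already separates them
```

A one-state machine can never separate anything, so two states is the best possible answer for any distinct pair.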

## Some History

Pavel Goralčík and Václav Koubek introduced the problem in 1986—see their paper here. Suppose that $x$ and $y$ are distinct binary words of length $n$. Define $\mathsf{SEP}(x,y)$ to be the number of states of the smallest automaton that accepts $x$ and rejects $y$, or vice-versa. They proved the result that got people interested:

**Theorem 1** *For all distinct binary words $x$ and $y$ of length $n$,*

$$\mathsf{SEP}(x,y) = o(n).$$

That is, the size of the automaton is asymptotically sub-linear in $n$. Of course there is trivially a way to tell the words apart with order $n$ states. The surprise is that one can do better, always.
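
To see where the order-$n$ figure comes from, here is a minimal sketch (my own illustration, with hypothetical helper names) of the trivial automaton: it has $|x|+2$ states and accepts exactly the word $x$, so it rejects any other word.

```python
def exact_match_dfa(x):
    """Build a DFA with len(x) + 2 states that accepts exactly the word x.

    States 0..len(x) walk down x symbol by symbol; any wrong symbol
    falls into the dead state, which absorbs everything.
    """
    n, dead = len(x), len(x) + 1
    delta = {}
    for i, ch in enumerate(x):
        for c in "01":
            delta[(i, c)] = i + 1 if c == ch else dead
    for c in "01":
        delta[(n, c)] = dead      # extra symbols after a full match fail
        delta[(dead, c)] = dead
    return delta, {n}             # transition table and accepting set

def accepts(dfa, s):
    delta, accepting = dfa
    q = 0
    for ch in s:
        q = delta[(q, ch)]
    return q in accepting

d = exact_match_dfa("0110")
print(accepts(d, "0110"), accepts(d, "0111"))  # True False
```

So any distinct pair of length-$n$ words is separated by roughly $n$ states; the content of Theorem 1 is that this can always be beaten.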

In 1989 John Robson obtained the best known result:

**Theorem 2** *For all distinct binary words $x$ and $y$ of length $n$,*

$$\mathsf{SEP}(x,y) = O\left(n^{2/5}\,(\log n)^{3/5}\right).$$

This bound is pretty strange; we rarely see bounds like it. This suggests to me that it is either special or it is not optimal—not clear which is the case. By the way, it is also known that there are $x$ and $y$ so that

$$\mathsf{SEP}(x,y) = \Omega(\log n).$$

Thus there is an exponential gap between the known lower and upper bounds. Welcome to complexity theory.

What heightens interest in this gap is that whenever the words have different lengths, there is always a logarithmic-size automaton that separates them. The reason is our old friend, the Chinese Remainder Theorem. Simply, if $|x| \neq |y|$ there is always a short prime $p = O(\log n)$ that does not divide $|x| - |y|$, which means that the DFA that goes in a cycle of length $p$ will end in a different state on any $x$ of length $|x|$ from the state on any $y$ of length $|y|$. Moreover, the strings $0^{m}$ and $0^{m'}$ where $m'$ equals $m$ plus the least common multiple of $1, 2, \dots, t$ require more than $t$ states to separate. Padding these with $1$s gives equal-length pairs of all lengths $n$ giving $\mathsf{SEP}(x,y) = \Omega(\log n)$.
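
The length argument is easy to check by hand; the sketch below (my own illustration of the reasoning above, not code from any of the papers) finds the smallest prime on which two lengths differ, and the cycle automaton of that length then ends in different states.

```python
def is_prime(k):
    return k >= 2 and all(k % d for d in range(2, int(k ** 0.5) + 1))

def separating_prime(len_x, len_y):
    """Smallest prime p with len_x % p != len_y % p.

    Such a p exists whenever the lengths differ, and it is small:
    the difference |len_x - len_y| <= n has at most log2(n) distinct
    prime factors, so some short prime must miss it.
    """
    assert len_x != len_y
    p = 2
    while len_x % p == len_y % p or not is_prime(p):
        p += 1
    return p

def cycle_end_state(p, s):
    # The p-state cycle DFA just counts the input length modulo p.
    return len(s) % p

x, y = "0" * 8, "0" * 12
p = separating_prime(len(x), len(y))
print(p, cycle_end_state(p, x), cycle_end_state(p, y))  # 3 2 0
```

Here $0^{8}$ versus $0^{12}$ is separated by the 3-state cycle automaton, since $8 \equiv 2$ and $12 \equiv 0 \pmod 3$.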

Some other facts about SWP can be found in the paper:

- (a) It does not matter whether the alphabet is binary or larger.
- (b) For random distinct $x, y$, the expected number of states needed to separate them is $O(1)$, since the expected position of the first differing symbol is constant.
- (c) All length-$n$ pairs can be distinguished by deterministic pushdown automata (with two-way input tape) of size $O(\log n)$.
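
Point (b) can be made intuitive with a quick experiment. The sketch below (my own illustration; the construction and names are assumptions, not the paper's proof) bounds the states needed by counting to the first differing position and branching there, then estimates the average of that bound over random pairs.

```python
import random

def first_diff_bound(x, y):
    # States 0..i count up to the first differing position i; from state i
    # the automaton branches on the symbol it reads there into one of two
    # sink states. That is i + 3 states in all, so SEP(x, y) <= i + 3.
    i = next(j for j in range(len(x)) if x[j] != y[j])
    return i + 3

def random_word(rng, n):
    return "".join(rng.choice("01") for _ in range(n))

def average_bound(n, trials=2000, seed=0):
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        x = random_word(rng, n)
        y = random_word(rng, n)
        while y == x:            # sample distinct pairs only
            y = random_word(rng, n)
        total += first_diff_bound(x, y)
    return total / trials

print(average_bound(50))  # about 4: the first difference comes very early
```

Random pairs almost always disagree within the first few symbols, which is why the expected separation cost does not grow with $n$.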

Point (b) underscores why it has been hard to find “bad pairs” that defeat all small DFAs. All this promotes the belief that logarithmic is the true upper bound as well. Jeffrey stopped short of calling this a conjecture in his talk, but he did offer a 100-pound prize (the talk was in Britain) for improving Robson’s bound.

## Some Questions

There are many partial results in cases where and are restricted in some way. See the papers for details. I thought I would just repeat a couple of interesting open cases.

How hard is it to tell words from their reversals? That is, if $w$ is a word, can we prove a better bound on

$$\mathsf{SEP}(w, w^{R})?$$

Recall $w^{R}$ is the reversal of the word $w$. Of course we assume that $w$ is not the same as its reversal—that is, we assume that $w$ is not a palindrome.

How hard is it to tell words apart from their cyclic shifts?

How hard is it to tell words from their $\dots$? You get the idea: try other operations on words.

## Open Problems

The SWP is a neat question in my opinion. I wonder if there would be some interesting consequence if we could always tell two words apart with few states. A good measure of a conjecture is how many consequences follow from it. I wonder if there could be some interesting applications. What do you think?

## Self-Play Is Key?

*Self-play and Ramsey numbers*

[ Talking about worst case ]

Avrim Blum is the CAO of TTIC. That is, he is the Chief Academic Officer at the Toyota Technological Institute at Chicago. Avrim has made, and continues to make, key contributions to many areas of theory—including machine learning, approximation algorithms, on-line algorithms, algorithmic game theory, the theory of database privacy, and non-worst-case analysis of algorithms.

Today I want to discuss a suggestion of Avrim for research on self-play. Read more…

## Danny Cohen Passed Away

## A Matroid-Quantum Connection

*With tighter links between notions of rank?*

Composite of src1, src2 (our congrats)

James Oxley and Geoff Whittle are mathematicians at Louisiana State University and the Victoria University of Wellington, New Zealand, respectively. They have written many joint papers on matroid theory, but none on quantum computing.

Today we observe that they really have *indirectly* written a paper on quantum computing.

## Our Trip To Monte Carlo

*Why does randomness help?*

Kathryn Farley is my dear wife. She and I are currently on a cruise through the Mediterranean. Our trip started in Barcelona and is stopping daily at various cities as we journey to Rome. “Tough duty,” but we are trying to enjoy it.

Today I wish to talk about our visit to Monte Carlo.

Read more…

## Predicting Chess and Horses

*Using predictivity both to sharpen and cross-check models*

Cropped from article source

Patrice Miller and Jeff Seder look under the hide of horses. Their company EQB does predictive modeling for horse racing based on biometric data. They are famous for having advised the owner of American Pharoah not to sell because the horse had a powerful heart. In 2015, American Pharoah became the first Triple Crown winner since Affirmed in 1978; the also-hearty Secretariat won it in 1973.

Today I am happy to announce an extended version of my predictive model for chess and discuss how it gains accuracy by looking under the hood of chess positions.

Read more…