Counting Is Sometimes Easy
Cases where we can count objects in polynomial time
New Interim Dean—our congrats
Mike Sipser is one of the top theorists, especially in complexity theory, and has been Head of the Mathematics Department at MIT since 2004. He has long been known for some very clever proofs and for his famous survey paper on the status of $\mathsf{P}$ versus $\mathsf{NP}$. He is more recently known for co-authoring the paper that introduced adiabatic quantum computing, which is the kind being given physical realization by the company D-Wave Systems. But he is perhaps best known nowadays for his textbook Introduction to the Theory of Computation. I have used the earlier versions many times for teaching our undergraduate class; I have not used the third edition, mainly because I teach other things these days.
Today Ken and I wish to talk about a topic that is covered in his book, finite state automata, in relation to counting.
Yes, lowly finite state automata (FSA). In my opinion FSA are one of the great inventions of theory. They led Michael Rabin and Dana Scott to discover nondeterminism, yielding a Turing Award along the way. They led algorithm designers like Don Knuth to discover, with Jim Morris and Vaughan Pratt, the first linear-time pattern matching algorithm. And much more.
Mike’s book was discussed before here, where I talked about his use of FSA to prove that the first order theory of addition is decidable. This is one of my favorite applications of FSA, which I learned from Ann Yasuhara directly—it is also included in her early book, Recursive Function Theory and Logic.
A Look at the Book
Mike’s book proves some interesting theorems that are ancient—okay, many decades old. This yields what in retrospect look like omissions, which are bound up with the history of the theorems. For example, consider the following classic one:
Theorem 1 A language is context-free if and only if some nondeterministic pushdown automaton (PDA) accepts it.
This is proved in detail in his book—see page 117 in the new edition, Theorem 2.20. But the proof establishes more, much more, that is not stated. I have used both of these consequences in my own work over the years:
- Given a context-free grammar we can construct the PDA in polynomial time, and conversely given the PDA we can construct the grammar.
- The sizes of the grammar and the description of the PDA are related by a polynomial bound.
Note that the best construction, I believe, goes from a PDA with $n$ states to a grammar with $O(n^3)$ rules. If there is a better one it would have some interesting consequences.
Let me say that this omission is not isolated to Mike’s book; others almost always leave out these interesting refinements. I believe the reason is simple: the above theorem was proved before polynomial time was defined, and the textbooks follow that historical sequence. Hence the omission. Perhaps these refinements can be added to a fourth edition.
Ken adds—also writing much of the rest of this post: I now like the sequence of skipping grammars and PDA’s and going straight from regular languages and FSA to Turing computability and complexity. After this ‘kernel’ material the instructor has an option of covering grammars, and then the polynomial-overhead concepts are in place. Or the instructor can do more with complexity or logic or some other core topic.
Let’s get back to FSA and counting.
Theorem 2 Let $M$ be a deterministic FSA with $q$ states and input alphabet $\Sigma$. Then we can determine the cardinality of the set $L(M) \cap \Sigma^n$ in time polynomial in $n$ and the size of $M$.
Put another way, we can count in polynomial time the number of strings of length $n$ that a FSA accepts. If the FSA is fixed, then the time is polynomial only in $n$. Actually, the time is essentially bilinear in $n$ and the size of $M$—it depends on the exact computational model.
The algorithm to prove this is often described as “dynamic programming,” but often that just means maintaining a well-chosen data structure. Here we allocate a slot for each state $s$ of the FSA, and maintain for each the number $N_m(s)$ of strings of length exactly $m$ that reach $s$ from the start state $s_0$. Initially $N_0(s_0) = 1$, for the empty string, and all other $N_0(s) = 0$. Now to update to $N_{m+1}(s)$, find all states $r$ and characters $c$ such that $\delta(r,c) = s$, where $\delta$ is the transition function of $M$, and sum up the values $N_m(r)$. That is,

$$N_{m+1}(s) \;=\; \sum_{(r,c)\,:\,\delta(r,c)=s} N_m(r).$$
Finally, $|L(M) \cap \Sigma^n|$ equals the sum of $N_n(s)$ over all accepting states $s$. Assuming random access to the data slots, and unit time for arithmetic on small numbers, this runs in time $O(n \cdot q \cdot |\Sigma|)$.
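Here is a minimal sketch of the algorithm in Python, assuming the DFA is given as a transition-table dictionary; the names are ours, purely illustrative:

```python
# Minimal sketch of the counting algorithm, assuming the DFA is given as a
# transition-table dict mapping (state, character) -> state.

def count_accepted(n, start, accepting, delta):
    """Count the strings of length n accepted by the DFA."""
    states = {s for (s, _) in delta} | set(delta.values()) | {start}
    N = {s: 0 for s in states}   # N[s] = number of length-m strings reaching s
    N[start] = 1                 # m = 0: only the empty string
    for _ in range(n):
        N2 = {s: 0 for s in states}
        for (r, c), s in delta.items():
            N2[s] += N[r]        # extend every string reaching r by character c
        N = N2
    return sum(N[s] for s in accepting)

# Example: binary strings with an even number of 1s.
delta = {("even", "0"): "even", ("even", "1"): "odd",
         ("odd", "0"): "odd", ("odd", "1"): "even"}
print(count_accepted(4, "even", {"even"}, delta))  # -> 8 of the 16 strings
```

The inner loop ranges once over all (state, character) pairs per length step, which is where the bilinear running time comes from.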
Sounds simple, but as often happens in complexity, delicacy and difficulty lurk not too far away.
The NFA Case and an Application
First, does the algorithm work when $M$ is nondeterministic? It certainly runs, and counts something, but not the number of accepted strings. So can we modify it to do so?
The answer is no—or maybe better to say “ostensibly not”: Given a Boolean formula $\phi$ that is a conjunction of clauses $C_1,\dots,C_m$ over variables $x_1,\dots,x_n$, design an NFA $N$ that begins with a nondeterministic choice of one of states $s_1,\dots,s_m$. From every $s_j$, $N$ starts reading its input deterministically. The part of $N$’s code from $s_j$ is written so that if $x \in \{0,1\}^n$ is an assignment that makes every literal in $C_j$ false—note that the literals can be presented in index order—then $N$ accepts $x$. Thus the formula is unsatisfiable if and only if $|L(N) \cap \{0,1\}^n| = 2^n$. So if we had a polynomial-time algorithm to compute $|L(N) \cap \{0,1\}^n|$, we’d have $\mathsf{P} = \mathsf{NP}$.
Moreover, $2^n - |L(N) \cap \{0,1\}^n|$ counts the satisfying assignments. Hence relaxing from DFA to NFA makes our little problem $\#\mathsf{P}$-complete. Now it’s important that $N$ is part of the input. If $N$ is fixed and we only want to compute $|L(N) \cap \Sigma^n|$ given $n$, then of course we can convert $N$ to an equivalent DFA which is likewise fixed, and run our algorithm. Hence one must be delicate with what constitutes the input.
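The key property of the reduction can be checked by brute force on small formulas. This toy code (our own, not the NFA construction itself) verifies that the set the NFA accepts—assignments falsifying at least one clause—covers all $2^n$ strings exactly when the formula is unsatisfiable:

```python
# Brute-force sanity check of the reduction's key property: the NFA accepts
# exactly the assignments that falsify some clause, so the formula is
# unsatisfiable iff the NFA accepts all 2^n length-n strings.
from itertools import product

def falsifies(assignment, clause):
    # clause = list of literals: +i means x_i, -i means NOT x_i (1-indexed)
    return all((assignment[abs(lit) - 1] == 0) == (lit > 0) for lit in clause)

def nfa_accepts_count(clauses, n):
    # number of length-n inputs accepted by the union-of-branches NFA
    return sum(any(falsifies(a, c) for c in clauses)
               for a in product([0, 1], repeat=n))

print(nfa_accepts_count([[1], [-1]], 1))  # x1 AND (NOT x1): unsat -> 2 = 2^1
print(nfa_accepts_count([[1, 2]], 2))     # x1 OR x2: only 00 falsifies -> 1
```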
For an example, call a string special if it contains exactly $k$ copies of some fixed pattern $P$, for some $k$ that is a prime number. We can count the number of special strings of length $n$ in time polynomial in $n$. To do so, we program a choice over all states $s_k$, $1 \le k \le n$, as before, this time keeping only those $k$ that are prime. From each $s_k$ we program a DFA $M_k$ that accepts a string if and only if it contains exactly $k$ copies of $P$. This resembles the NFA case, but is different because there is no overlap in the strings accepted by the respective $M_k$. So we just count the numbers of length-$n$ strings accepted by each $M_k$ and add them up.
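A hedged sketch of this application: since the pattern is left generic in the text, we pick "01" purely as an illustration, and we fold the choice over $k$ into one dynamic program whose states pair a pattern-matcher state with an occurrence count:

```python
# Count length-n binary strings containing a prime number of (possibly
# overlapping) occurrences of a pattern P.  The pattern "01" is our
# illustrative choice; the method works for any fixed pattern.
from collections import Counter

def is_prime(k):
    return k >= 2 and all(k % d for d in range(2, int(k ** 0.5) + 1))

def count_special(n, pattern="01", alphabet="01"):
    # Knuth-Morris-Pratt failure function: drives the matcher
    # deterministically and restarts it after each full match
    fail = [0] * len(pattern)
    for i in range(1, len(pattern)):
        j = fail[i - 1]
        while j and pattern[i] != pattern[j]:
            j = fail[j - 1]
        fail[i] = j + 1 if pattern[i] == pattern[j] else 0

    def step(j, c):  # DFA transition on matcher state j < len(pattern)
        while j and pattern[j] != c:
            j = fail[j - 1]
        return j + 1 if pattern[j] == c else 0

    # Dynamic program over product states (matcher state, occurrence count)
    N = Counter({(0, 0): 1})
    for _ in range(n):
        N2 = Counter()
        for (j, k), cnt in N.items():
            for c in alphabet:
                j2 = step(j, c)
                if j2 == len(pattern):          # completed one more occurrence
                    N2[(fail[j2 - 1], k + 1)] += cnt
                else:
                    N2[(j2, k)] += cnt
        N = N2
    return sum(cnt for (j, k), cnt in N.items() if is_prime(k))

print(count_special(4))  # only "0101" has a prime number (2) of "01"s
```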
This is not a hugely important application. We selected it to show that there are counting problems that might be tricky to solve without the FSA method. This and other examples may be useful.
What Else Can Be Counted?
Note that even though the decision problem $\mathsf{2SAT}$ is in $\mathsf{P}$, the counting problem $\#\mathsf{2SAT}$ is still $\#\mathsf{P}$-complete. For an aside, $\mathsf{2SAT}$ is a major example in Mike’s quantum paper, but here we raise the question, what else can be counted?
For instance, counting the number of solutions to an $n$-variable polynomial $f$ over $\mathbb{Z}_q$ is $\#\mathsf{P}$-complete. It becomes polynomial-time, however, when $f$ has degree at most $2$. This is when $q$ is fixed. What if $q$ is variable? We can also ask about computing $N_a = |\{x : f(x) = a\}|$ for all $a \in \mathbb{Z}_q$, for the particular purpose of computing the exponential sum

$$S(f) \;=\; \sum_{x \in \mathbb{Z}_q^n} e^{2\pi i f(x)/q}.$$
Dick co-authored a paper recently with Jin-Yi Cai, Xi Chen, and Pinyan Lu, showing that when $f$ has degree at most $2$, $S(f)$ is computable in time polynomial in $n$ and $\log q$. In particular this means that for $q = 2^k$, when we can represent the members of $\mathbb{Z}_q$ by strings in $\{0,1\}^k$, the time to compute $S(f)$ is polynomial in $n$ and $k$.
A little thought shows that this suffices to compute any individual number $N_a$ in time polynomial in $n$ and $q$, indeed by computing all of them. But if we just want one, and $q = 2^k$, can we do it in time polynomial in $n$ and $k$? This is not obvious to me (Ken), and at least for now leaves the funny situation where we can compute in polynomial time the historically important function $S(f)$, which involves all the values $N_a$, but we don’t see a way to compute any one of them in the same time.
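To spell out the “little thought” (our own sketch, under the standard definition $S(f) = \sum_{x \in \mathbb{Z}_q^n} e^{2\pi i f(x)/q}$): for each $c \in \mathbb{Z}_q$ the polynomial $cf$ still has degree at most $2$, and writing $\omega = e^{2\pi i/q}$, the values $S(cf)$ form a discrete Fourier transform of the counts $N_a$, which can be inverted:

$$S(cf) \;=\; \sum_{a \in \mathbb{Z}_q} N_a\,\omega^{ca},
\qquad\text{so}\qquad
N_a \;=\; \frac{1}{q}\sum_{c \in \mathbb{Z}_q} S(cf)\,\omega^{-ca}.$$

Each of the $q$ sums $S(cf)$ is computable in time polynomial in $n$ and $\log q$, but the inversion uses all $q$ of them—which is why this route gives time polynomial in $n$ and $q$, not in $n$ and $\log q$.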
Open Problems
Who originally proved efficient counting for deterministic FSA? We have not been able to track that down. Are there some cool applications?
What is the answer to the last problem? Should counting receive more emphasis in texts at the level of Mike’s?
I (Ken) add a story: I first met Mike when we shared a compartment on the train between Hamburg, Germany and Aarhus, Denmark, on the way to ICALP 1982. I had just moved from algebraic combinatorics into complexity during my first year at Oxford, and naturally asked him what were the prospects for proving $\mathsf{P} \neq \mathsf{NP}$. He replied “It will be proved … yes we will prove it,” and backed up his confidence by naming some results and giving some ideas along lines that would later be called the “Sipser programme” of approach via circuit lower bounds. (Did we mention that he also wrote with Ravi Boppana a bellwether survey on circuit complexity?) I guess there wasn’t a time limit on his assertion…