Cases where we can count objects in polynomial time

New Interim Dean — our congrats (source)

Mike Sipser is one of the top theorists, especially in complexity theory, and has been Head of the Mathematics Department at MIT since 2004. He is long known for some very clever proofs and his famous survey paper on the status of ${\mathsf{P = NP}}$. He is recently known for co-authoring the paper that introduced adiabatic quantum computing, which is the kind being given physical realization by the company D-Wave Systems. But he is perhaps best known nowadays for his textbook Introduction to the Theory of Computation. I have used the earlier version many times for teaching our undergraduate class; I have not used the third edition, mainly because I teach other things these days.

Today Ken and I wish to talk about a topic that is covered in his book, finite state automata, in relation to counting.

Yes, lowly finite state automata (FSA). In my opinion FSA are one of the great inventions of theory. They led Michael Rabin and Dana Scott to discover nondeterminism, yielding a Turing Award along the way. They led algorithm designers like Don Knuth to discover, with Jim Morris and Vaughan Pratt, the first linear-time pattern matching algorithm. And much more.

Mike’s book was discussed before here, where I talked about his use of FSA to prove that the first order theory of addition is decidable. This is one of my favorite applications of FSA, which I learned from Ann Yasuhara directly—it is also included in her early book, Recursive Function Theory and Logic.

## A Look at the Book

Mike’s book proves some interesting theorems that are ancient—okay, many decades old. This yields what in retrospect look like omissions, which are bound up with the history of the theorems. For example, consider the following classic one:

Theorem 1 A language is context-free if and only if some nondeterministic pushdown automaton (PDA) accepts it.

This is proved in detail in his book—see page 117 in the new edition, Theorem 2.20. But the proof establishes more, much more, that is not stated. I have used both of these consequences in my own work over the years:

• Given a context-free grammar we can construct the PDA in polynomial time, and conversely given the PDA we can construct the grammar.

• The sizes of the grammar and the description of the PDA are related by a polynomial bound.

Note that the best construction, I believe, goes from a PDA with ${S}$ states to a grammar with ${O(S^{3})}$ rules. If there is a better one it would have some interesting consequences.

Let me say that this omission is not isolated to Mike’s book; other texts almost always leave out these interesting refinements. I believe that the reason is simple: the above theorem was proved before polynomial time was defined, and the textbooks are sequenced the same way. Hence the omission. Perhaps the refinements can be added to a fourth edition.

Ken adds—also writing much of the rest of this post: I now like the sequence of skipping grammars and PDA’s and going straight from regular languages and FSA to Turing computability and complexity. After this ‘kernel’ material the instructor has an option of covering grammars, and then the polynomial-overhead concepts are in place. Or the instructor can do more with complexity or logic or some other core topic.

## Counting

Let’s get back to FSA and counting.

Theorem 2 Let ${M}$ be a deterministic FSA with ${S}$ states and input alphabet ${\Sigma}$. Then we can determine the cardinality ${C_M(n)}$ of the set ${L(M)\cap \Sigma^{n}}$ in time polynomial in ${S}$ and ${n}$.

Put another way, we can count in polynomial time the number of strings of length ${n}$ that a FSA accepts. If the FSA is fixed, then the time is polynomial in ${n}$ alone. Actually, the time is essentially linear in both ${S}$ and ${n}$—the exact bound depends on the computational model.

The algorithm to prove this is often described as “dynamic programming,” but often that just means maintaining a well-chosen data structure. Here we allocate a slot for each state ${q}$ of the FSA, and maintain for each ${k \leq n}$ the number ${N_{k,q}}$ of strings of length exactly ${k}$ that reach ${q}$ from the start state ${s}$. Initially ${k = 0}$, ${N_{0,s} = 1}$ for the empty string, and all other ${N_{0,q} = 0}$. Now to update it to ${N_{k+1,q}}$, find all states ${p}$ and characters ${c}$ such that ${\delta(p,c) = q}$, where ${\delta}$ is the transition function of ${M}$, and sum up ${N_{k,p}}$. That is,

$\displaystyle N_{k+1,q} = \sum_{p,c: \delta(p,c) = q} N_{k,p}.$

Finally ${C_M(n)}$ equals the sum of ${N_{n,q}}$ over all accepting states ${q}$. Assuming random access to the data slots, and unit time for arithmetic on small numbers, this runs in time ${O(Sn)}$.
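For concreteness, here is a minimal sketch of this table-filling algorithm in Python. The representation of the DFA (a `delta` dictionary mapping `(state, char)` pairs to states) is our own choice for illustration, not anything fixed by the text.

```python
def count_accepted(states, delta, start, accepting, n):
    """Count the strings of length n accepted by a DFA.

    delta is a dict mapping (state, char) -> state; the alphabet is
    implicit in its keys.
    """
    # N[q] = number of strings of the current length that reach state q
    N = {q: 0 for q in states}
    N[start] = 1  # the empty string reaches the start state
    for _ in range(n):
        N_next = {q: 0 for q in states}
        for (p, c), q in delta.items():
            # every length-k string reaching p extends by c to reach q
            N_next[q] += N[p]
        N = N_next
    return sum(N[q] for q in accepting)

# Example: DFA over {0,1} accepting strings with an even number of 1s
delta = {("even", "0"): "even", ("even", "1"): "odd",
         ("odd", "0"): "odd", ("odd", "1"): "even"}
print(count_accepted({"even", "odd"}, delta, "even", {"even"}, 3))  # 4: 000, 011, 101, 110
```

With a hash table for the slots this is the ${O(Sn)}$ bound above, up to the cost of big-number arithmetic.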

Sounds simple, but as often happens in complexity, delicacy and difficulty lurk not too far away.

## The NFA Case and an Application

First, does the algorithm work when ${M}$ is nondeterministic? It certainly runs, and counts something, but not the number of accepted strings. So can we modify it to do so?
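To see what it does count, here is a toy sketch of our own: running the same recurrence on an NFA, where `delta` now maps to sets of states, tallies accepting computation paths, and these can outnumber the accepted strings.

```python
def count_accepting_paths(states, delta, start, accepting, n):
    """Run the DFA-counting recurrence on an NFA.

    delta maps (state, char) -> set of successor states; the result
    counts accepting *paths* of length n, not accepted strings.
    """
    N = {q: 0 for q in states}
    N[start] = 1
    for _ in range(n):
        N_next = {q: 0 for q in states}
        for (p, c), successors in delta.items():
            for q in successors:
                N_next[q] += N[p]
        N = N_next
    return sum(N[q] for q in accepting)

# Toy NFA over {a}: the single accepted string "a" has two accepting paths.
delta = {("s", "a"): {"q1", "q2"}}
print(count_accepting_paths({"s", "q1", "q2"}, delta, "s", {"q1", "q2"}, 1))  # 2 paths, 1 string
```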

The answer is no—or maybe better to say “ostensibly not”: Given a Boolean formula that is a conjunction of clauses ${C_1,\dots,C_m}$, design an NFA ${M}$ that begins with a nondeterministic choice of one of ${m}$ states ${q_j}$. From every ${q_j}$, ${M}$ starts reading its input ${x \in \{0,1\}^n}$ deterministically. The part of ${M}$’s code from ${q_j}$ is written so that if ${x}$ is an assignment that makes every literal in ${C_j}$ false—note that the literals can be presented in index order—then ${M}$ accepts ${x}$. Thus the formula is unsatisfiable if and only if ${C_M(n) = 2^n}$. So if we had a ${\mathsf{poly}(Sn)}$ algorithm to compute ${C_M(n)}$, we’d have ${\mathsf{NP = P}}$.

Moreover, ${2^n - C_M(n)}$ counts the satisfying assignments. Hence relaxing ${M}$ from DFA to NFA makes our little problem ${\mathsf{\#P}}$-complete. Now it’s important that ${M}$ is part of the input. If ${M}$ is fixed and we only want to compute ${C_M(n)}$ given ${n}$, then of course we can convert ${M}$ to an equivalent DFA which is likewise fixed, and run our algorithm. Hence one must be delicate with what constitutes the input.

For an example, call a string special if it contains ${j}$ copies of the pattern ${u = 010111}$ for some ${j}$ that is a prime number. We can count the number of special strings of length ${n}$ in time polynomial in ${n}$. To do so, we take ${m = \lfloor n/6 \rfloor}$ and program a choice over all ${j \leq m}$ as before, this time keeping only the ${q_j}$ for which ${j}$ is prime. From each ${q_j}$ we program a DFA ${M_j}$ that accepts a string ${x}$ if and only if it contains exactly ${j}$ copies of ${u}$. This resembles the ${\mathsf{SAT}}$ case, but is different because there is no overlap in the strings accepted by the respective ${M_j}$. So we just count the numbers of length-${n}$ strings accepted by each and add them up.
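Here is one way this count can be carried out in code. It is a sketch of our own that, instead of the branching NFA described above, runs the dynamic program over an equivalent product automaton whose states pair a Knuth-Morris-Pratt matching state with the number of occurrences of ${u}$ seen so far:

```python
def count_special(n, u="010111"):
    """Count length-n binary strings containing a prime number of
    occurrences of the pattern u, by dynamic programming over a DFA
    whose states are (KMP state, occurrence count) pairs."""
    m = len(u)
    # Knuth-Morris-Pratt failure function for u
    fail = [0] * m
    k = 0
    for i in range(1, m):
        while k and u[i] != u[k]:
            k = fail[k - 1]
        if u[i] == u[k]:
            k += 1
        fail[i] = k

    def step(state, c):
        """Advance the matcher by character c; report a found occurrence."""
        while state and u[state] != c:
            state = fail[state - 1]
        if u[state] == c:
            state += 1
        if state == m:           # full match: count it and fall back
            return fail[m - 1], 1
        return state, 0

    def is_prime(j):
        return j >= 2 and all(j % d for d in range(2, int(j ** 0.5) + 1))

    # dp[(kmp_state, occurrences)] = number of strings realizing that pair
    dp = {(0, 0): 1}
    for _ in range(n):
        nxt = {}
        for (s, j), cnt in dp.items():
            for c in "01":
                s2, occ = step(s, c)
                key = (s2, j + occ)
                nxt[key] = nxt.get(key, 0) + cnt
        dp = nxt
    return sum(cnt for (s, j), cnt in dp.items() if is_prime(j))
```

For instance, the only length-12 string with a prime number of occurrences is ${u}$ repeated twice, since ${010111}$ cannot overlap itself.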

This is not a hugely important application. We selected it to show that there are counting problems that might be tricky to solve without the FSA method. This and other examples may be useful.

## What Else Can Be Counted?

Note that even though the decision problem ${\mathsf{2SAT}}$ is in ${\mathsf{P}}$, the counting problem ${\mathsf{\#2SAT}}$ is still ${\mathsf{\#P}}$-complete. As an aside, ${\mathsf{2SAT}}$ is a major example in Mike’s quantum paper. But here we raise the question: what else can be counted?

For instance, counting the number ${N_m(f)}$ of solutions to an ${n}$-variable polynomial ${f}$ over ${\mathbb{Z}_m}$ is ${\mathsf{\#P}}$-complete. It becomes ${n^{O(1)}}$-time, however, when ${f}$ has degree at most ${2}$, provided ${m}$ is fixed. What if ${m}$ is variable? We can also ask about computing ${N_m(f - j)}$ for all ${j \in \mathbb{Z}_m}$, for the particular purpose of computing the exponential sum

$\displaystyle Z_f = \sum_{j=0}^{m-1} N_m(f - j) \omega^j = \sum_{x \in \mathbb{Z}_m^n} \omega^{f(x)},$

where ${\omega = \exp(\frac{2\pi i}{m})}$.

Dick co-authored a paper recently with Jin-Yi Cai, Xi Chen, and Pinyan Lu, showing that when ${f}$ has degree at most ${2}$, ${Z_f}$ is computable in time polynomial in ${n}$ and ${\log m}$. In particular this means that for ${m = 2^r}$, when we can represent the members of ${\mathbb{Z}_m}$ by strings in ${\{0,1\}^r}$, the time to compute ${Z_f}$ is ${\mathsf{poly}(nr)}$.

A little thought shows that this suffices to compute any individual number ${N_m(f - j)}$ in time ${\mathsf{poly}(nm)}$, indeed by computing all of them. But if we just want one, and ${m = 2^r}$, can we do it in time ${\mathsf{poly}(nr)}$? This is not obvious to me (Ken), and at least for now leaves the funny situation where we can compute in ${\mathsf{poly}(nr)}$ time the historically important ${Z_f}$ function which involves all the ${N_m(f - j)}$ values, but don’t see a way to compute any one of them in the same time.

## Open Problems

Who originally proved efficient counting for deterministic FSA? We have not been able to track that down. Are there some cool applications?

What is the answer to the last problem? Should counting receive more emphasis in texts at the level of Mike’s?

I (Ken) add a story: I first met Mike when we shared a compartment of the train between Hamburg, Germany, and Aarhus, Denmark, on the way to ICALP 1982. I had just moved from algebraic combinatorics into complexity during my first year at Oxford, and naturally asked him what were the prospects for proving ${\mathsf{P \neq NP}}$. He replied “It will be proved … yes we will prove it,” and backed up his confidence by naming some results and giving some ideas along lines that would later be called the “Sipser programme” of approach via circuit lower bounds. (Did we mention that he also wrote with Ravi Boppana a bellwether survey on circuit complexity?) I guess there wasn’t a time limit on his assertion…

1. April 5, 2014 6:06 am

A related problem asks, given two NFAs $A$ and $B$ and an integer $n$ (written in binary), whether $A$ has more accepting runs of length $n$ than $B$. This easily reduces to the question of whether, given a square integer matrix $M$ and an integer $n$, the bottom-left entry of the matrix power $M^n$ is positive. By iterated squaring this reduces to the problem PosSLP: whether the output of a given algebraic circuit with integer inputs and addition and multiplication gates is positive. The best known upper bound for PosSLP is the fourth level of the counting hierarchy, but can one do better for matrix powering?

• April 6, 2014 10:13 am

An interesting point implicit in James’s comment: solving the DFA counting problem by matrix exponentiation reduces the dependence on $n$, so that one can take $n$ in binary and the algorithm is still polynomial in the length of the input.
Now consider the NFA case: a classic result (due to Meyer and Stockmeyer, I believe) is that NFA universality (whether the language accepted is the set of all strings) is PSPACE-complete. This is proved by constructing an automaton that recognizes the strings which are not an accepting computation history of a given polynomial-space machine. Since the length of such a computation can be bounded by a $2^{poly}$ function, the representation of the length in binary has polynomial size; so with the length given in binary, counting in polynomial time would prove PSPACE = P.
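The matrix-exponentiation point can be sketched as follows (our own illustration): form the $S \times S$ matrix whose $(p,q)$ entry counts the characters carrying $p$ to $q$; the count of accepted length-$n$ strings is then read off from the $n$-th power of this matrix, computed with $O(\log n)$ multiplications.

```python
def mat_mult(A, B):
    """Multiply two square integer matrices."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_pow(A, e):
    """Compute A**e by repeated squaring: O(log e) multiplications."""
    n = len(A)
    R = [[int(i == j) for j in range(n)] for i in range(n)]  # identity
    while e:
        if e & 1:
            R = mat_mult(R, A)
        A = mat_mult(A, A)
        e >>= 1
    return R

def count_by_matrix(states, delta, start, accepting, n):
    """Count length-n strings accepted by a DFA via its transition matrix."""
    idx = {q: i for i, q in enumerate(states)}
    A = [[0] * len(states) for _ in states]
    for (p, c), q in delta.items():
        A[idx[p]][idx[q]] += 1  # one transition per character
    An = mat_pow(A, n)
    return sum(An[idx[start]][idx[q]] for q in accepting)

# Even-number-of-1s DFA again; n can now be astronomically large.
delta = {("even", "0"): "even", ("even", "1"): "odd",
         ("odd", "0"): "odd", ("odd", "1"): "even"}
print(count_by_matrix(["even", "odd"], delta, "even", {"even"}, 3))  # 4
```

Of course the counts themselves have up to $n \log |\Sigma|$ bits, so only the number of arithmetic operations, not the bit cost, is polynomial in $\log n$.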

• April 6, 2014 12:05 pm

Hmm, in the first claim (regarding DFAs) I assumed unit cost for arithmetic. So in a bit model like a TM I’d have to restrict the numbers, e.g., compute modulo some small enough number. I suppose that this is why problems like the one James described are hard (any comments?). For the NFA reduction, it suffices to consider a decision problem (is the count equal to $2^n$?).

2. April 5, 2014 10:04 am

Nice post! This is exactly how I started my “Introduction to Theoretical Computer Science” course in the previous semester. Moreover, I made counting a central topic. The motivating simple problem in the first lecture was to count the decimal strings (or numbers) of length n with all 10 digits present. The students (The City College of New York) gave a bunch of wrong formulas or expressions that were “useless” for computing. So I asked them to write code for counting to check who was right… and they complained that it was too slow… what could be a better motivation for a theory (fortunately a simple one!). There were several more counting discussions; the favorite one was counting the number of 2-colorings (also by linear algebra). The cool thing is that the inclusion-exclusion trick, which I proved in class together with the Bonferroni inequalities, diagonalizes the corresponding matrix. Another simple, entertaining related topic: a (constructive) representation of finite Markov chains as DFAs with IID input. Anyway, as a researcher I always thought that the most important pieces of knowledge are tricks. Now, as a novice professor, I am trying to share some of them with my students. But the trick is…
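The first-lecture problem above has a short inclusion-exclusion answer, summing over the set of missing digits. A sketch of our own:

```python
from math import comb

def count_all_ten_digits(n):
    """Decimal strings of length n containing every digit 0-9,
    by inclusion-exclusion over which digits are missing."""
    return sum((-1) ** k * comb(10, k) * (10 - k) ** n for k in range(11))

print(count_all_ten_digits(10))  # 3628800 = 10!, each digit exactly once
```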

• April 5, 2014 10:53 am

Indeed, when I am trying to decide if a student deserves to have our Discrete Mathematics requirement waived (or deemed fulfilled by a transfer course), my favorite question is: “If you roll an N-sided die N times, what is the chance of getting all N different numbers?” I specialize N=6 for an ordinary die as a prompt.
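For what it is worth, the answer specializes to N!/N^N, which is easy to check numerically (a two-line sketch of our own):

```python
from math import factorial

def prob_all_distinct(N):
    """Chance that N rolls of a fair N-sided die show N different faces."""
    return factorial(N) / N ** N

print(prob_all_distinct(6))  # 720/46656, about 0.0154
```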

• April 8, 2014 5:26 am

> the favorite one was counting the number of 2-colorings(also by linear algebra).

Could you explain the role of linear algebra in this? Don’t you just ensure that each connected component is bipartite and then the result is 2^(number of components)? Or can finding connected components and checking bipartiteness be seen as linear algebra?

• April 10, 2014 10:03 pm

@Amanda: The number of 2-colorings is half of the number of solutions of some system of linear equations in GF(2). This is how one can decide and count in one shot.
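The linear-algebra idea can be sketched in code (our own rendering): write one equation x_u + x_v = 1 per edge over GF(2), so solutions are exactly the assignments in which adjacent vertices differ, and Gaussian elimination gives 2^(n − rank) of them when the system is consistent.

```python
def count_2_colorings(n, edges):
    """Count proper 2-colorings of a graph on vertices 0..n-1, as
    solutions of x_u + x_v = 1 (one equation per edge) over GF(2).
    The two global color swaps are counted separately."""
    # Each row is a bitmask of variables, with bit n holding the RHS.
    rows = [(1 << u) | (1 << v) | (1 << n) for u, v in edges]
    rank = 0
    for col in range(n):
        pivot = next((i for i in range(rank, len(rows))
                      if rows[i] >> col & 1), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for i in range(len(rows)):
            if i != rank and rows[i] >> col & 1:
                rows[i] ^= rows[rank]  # XOR is addition over GF(2)
        rank += 1
    if any(row == 1 << n for row in rows):  # a row reading 0 = 1: odd cycle
        return 0
    return 2 ** (n - rank)

print(count_2_colorings(3, [(0, 1), (1, 2)]))          # path: 2 colorings
print(count_2_colorings(3, [(0, 1), (1, 2), (0, 2)]))  # triangle: 0
```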

3. April 5, 2014 11:43 am

Very nice post! Counting and computation is a great connection where perhaps much more is yet to be explored, so I certainly share Leonid’s feelings about it. I remember that Knuth’s three-volume “Art” was a source of wonderful enumerative combinatorics much before I was interested in computing. One thing I have liked especially since the early days is counting trees, and there too it looks like there is more to be said and asked.

There is much about counting and complexity in the context of graph polynomials. In other directions, some works of Igor Pak trying to put the notion of “bijective proof” on precise grounds come to mind, and also recent works of Igor and Greta Panova that are related to counting as well as to Mulmuley’s geometric complexity.
Let me mention a few questions that I asked: http://cstheory.stackexchange.com/questions/19189/counting-isomorphism-types-of-graphs

Mike Sipser is also famous for his 1986 computational complexity class, where among the students were: Steven Rudich, Russell Impagliazzo, Noam Nisan, Ronitt Rubinfeld, Moni Naor, Lance Fortnow, John Rompel, Lisa Hellerstein, Sampath Kannan, Rajeev Motwani, Roman Smolensky, Danny Soroker, Valerie King, Mark Gross, and more.

4. April 5, 2014 3:14 pm

“mainly because I teach other things these days.”

So what do you teach these days?

5. April 6, 2014 11:34 am

1. Counting the number of spanning trees is also easy.

2. I think there are at least 3 main classes of counting problems: those in FP, those that are #P-complete but in FPRAS, and the class of problems that are not even in FPRAS unless RP = NP. The first example of the third kind is in the seminal Jerrum, Valiant, Vazirani paper: counting the number of cycles in a directed graph. They actually showed that it is not in FPAUS unless RP = NP, but it is easy to show that the same holds for FPRAS too.

Personally, I am very interested in making a public page (wiki page or similar) recording which counting problems are in which class.
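On item 1 above: by Kirchhoff’s matrix-tree theorem, the number of spanning trees is the determinant of any cofactor of the graph Laplacian. A self-contained sketch of our own, using exact rational elimination to avoid floating-point error:

```python
from fractions import Fraction

def det(M):
    """Determinant by exact Gaussian elimination over the rationals."""
    A = [[Fraction(x) for x in row] for row in M]
    n, sign = len(A), 1
    for i in range(n):
        p = next((r for r in range(i, n) if A[r][i]), None)
        if p is None:
            return Fraction(0)  # zero column: singular
        if p != i:
            A[i], A[p] = A[p], A[i]
            sign = -sign
        for r in range(i + 1, n):
            f = A[r][i] / A[i][i]
            for c in range(i, n):
                A[r][c] -= f * A[i][c]
    result = Fraction(sign)
    for i in range(n):
        result *= A[i][i]
    return result

def count_spanning_trees(n, edges):
    """Matrix-tree theorem: build the Laplacian, delete row and
    column 0, and take the determinant of the remaining minor."""
    L = [[0] * n for _ in range(n)]
    for u, v in edges:
        L[u][u] += 1
        L[v][v] += 1
        L[u][v] -= 1
        L[v][u] -= 1
    minor = [row[1:] for row in L[1:]]
    return int(det(minor))

print(count_spanning_trees(4, [(0, 1), (0, 2), (0, 3),
                               (1, 2), (1, 3), (2, 3)]))  # K4 has 16
```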