Impetus to study a new reducibility relation

 See Mike’s other projects too

Michael Wehar has just earned his PhD degree in near-record time in my department. He has posted the final version of his dissertation titled On the Complexity of Intersection Non-Emptiness Problems which he defended last month. The dissertation expands on his paper at ICALP 2014, joint paper at ICALP 2015 with Joseph Swernofsky, and joint paper at FoSSaCS 2016.

Today, Dick and I congratulate Mike on his accomplishment and wish him all the best in his upcoming plans, which center on his new job with CapSen Robotics near Pitt and CMU in Pittsburgh.

Mike’s thesis features a definition that arguably has been thought about by researchers in parameterized complexity but passed over as too particular. If I were classifying his definition like salsa or cheddar cheese,

• Mild;

• Medium;

• Sharp;

then Mike’s definition would rate as

• Extra Sharp.

Too sharp, that is, for ${O}$-notation, thus putting it beyond our usual complexity theory diet. But the problem it addresses has also seemed beyond the reach of complexity theory while we can’t even get a clear handle on ${\mathsf{P}}$ versus ${\mathsf{NP}}$:

What are the relationships between the ${n^k}$ levels of deterministic versus nondeterministic time and/or space, and versus ${k\log n}$ or ${n^k}$ levels of fairly-counted deterministic and nondeterministic space?

“Fairly counted” space means that we cast away the “linear space-compression” theorem by restricting Turing machines to be binary, that is to use work alphabet ${\Gamma = \{0,1\}}$. The blank character can be read but not written. This doesn’t mean swearing off ${O}$-notation for space, but does mean employing it more carefully.

Questions about particular polynomial-time exponents and space-usage factors animate fine-grained complexity, which has seen a surge of interest recently. Consider this problem which is central in Mike’s thesis:

• Intersection (Non-)Emptiness ${\mathsf{IE}}$: Given ${k}$ automata ${M_1,\dots,M_k}$, each with ${n}$ states, is ${L(M_1) \cap \cdots \cap L(M_k) \neq \emptyset}$?

The “naive” Cartesian product algorithm works in time basically ${n^k}$. The fine-grained question is, can one do better than naive? As we covered before, Mike’s early work showed that getting time ${n^{o(k)}}$ would have the sharp consequence ${\mathsf{P \neq NL}}$, which improved the ${\mathsf{NP \neq NL}}$ conclusion in a 2003 paper by Dick with George Karakostas and Anastasios Viglas.
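To make the naive algorithm concrete, here is a minimal Python sketch (not from the thesis): it explores the product automaton lazily by breadth-first search, visiting at most ${n^k}$ product states. The dict-based DFA encoding is an assumption made purely for illustration.

```python
from collections import deque

def intersection_nonempty(automata):
    """Decide IE for a list of DFAs by BFS over their product automaton.

    Each DFA is assumed to be a dict {'start': s, 'accept': set,
    'delta': {(state, symbol): state}} over a shared input alphabet.
    Exploring the product lazily visits at most n^k product states --
    the "naive" Cartesian product algorithm.
    """
    alphabet = {sym for A in automata for (_, sym) in A['delta']}
    start = tuple(A['start'] for A in automata)
    seen = {start}
    queue = deque([start])
    while queue:
        states = queue.popleft()
        # A product state is accepting iff every component accepts.
        if all(q in A['accept'] for q, A in zip(states, automata)):
            return True
        for sym in alphabet:
            try:
                nxt = tuple(A['delta'][(q, sym)] for q, A in zip(states, automata))
            except KeyError:
                continue  # some DFA has no move on sym (partial table)
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False
```

For example, intersecting a DFA for "contains a 1" with one for "even length" reports non-emptiness (witness "10"), while intersecting "contains a 1" with "contains no 1" reports emptiness.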

Mike’s thesis expands the mechanism that produces this and related results, with the aim of a two-pronged attack on complexity bounds. He has framed it in terms of parameterized complexity, which we’ll discuss before returning to the thesis.

## Some Fixed-Parameter Basics

Consider three classic ${\mathsf{NP}}$-complete problems, each given an undirected graph ${G = (V,E)}$ and a number ${k}$:

• Vertex Cover: Is there ${S \subseteq V}$ of size at most ${k}$ such that every edge is incident to a node in ${S}$?

• Clique: Is there ${S \subseteq V}$ of size at least ${k}$ such that every two nodes in ${S}$ are joined by an edge?

• Dominating Set: Is there ${S \subseteq V}$ of size at most ${k}$ such that every node not in ${S}$ has an edge to a node in ${S}$?

Parameterized complexity started with the question:

What happens when some value of the parameter ${k}$ is fixed?

For each problem ${B}$ and each ${k}$, we can define the “slice” ${B_k}$ to be the language of strings—here denoting graphs ${G}$—such that the answer for ${(G,k)}$ is “yes.” Each individual slice belongs to ${\mathsf{P}}$: just try all ${\binom{n}{k} = O(n^k)}$ choices of ${S}$, and when ${k}$ is fixed this gives polynomial time. The fine-grained question is:

Can we solve each slice faster? In particular, can we solve ${B_k}$ in time ${t(k)n^c}$ where the exponent ${c}$ is fixed independent of ${k}$?
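As a concrete illustration of the ${O(n^k)}$ slice algorithm just described, here is a Python sketch (an assumption of mine, not code from the thesis) for the slice ${\text{Clique}_k}$: it simply tries all ${\binom{n}{k}}$ candidate sets ${S}$.

```python
from itertools import combinations

def clique_slice(adj, k):
    """Decide the slice Clique_k by brute force over all C(n, k) subsets.

    adj is assumed to be a set of frozenset edges on vertices 0..n-1
    (isolated vertices are ignored, which cannot hurt a clique search).
    For each fixed k this is time O(n^k * k^2): polynomial per slice,
    but with an exponent that grows with k.
    """
    n = max((v for e in adj for v in e), default=-1) + 1
    for S in combinations(range(n), k):
        # S is a clique iff every pair inside S is an edge.
        if all(frozenset((u, v)) in adj for u, v in combinations(S, 2)):
            return True
    return False
```

On a triangle this finds a 3-clique; on a 3-vertex path it correctly reports none.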

The start of much combinatorial insight and fun is that the three problems appear to give three different answers:

• Vertex Cover: Build ${S}$ by streaming edges ${(u,v)}$ in a fixed order. If neither ${u}$ nor ${v}$ already belongs to ${S}$, make a binary choice of which to add; once ${|S| = k}$, accept if ${S}$ covers every edge. If not, start over and try the next string in ${\{0,1\}^k}$ to guide the choices. This gives time ${O(2^k)n^c}$ for some fixed ${c}$.

• Clique: We can get ${n^{ak}}$ where ${a < 1}$ (see this), but the exponent is still linear in ${k}$.

• Dominating Set: No algorithm with time better than ${n^{k - o(k)}}$ is known or expected.
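The Vertex Cover item above is the classic bounded search tree, and a recursive Python sketch (my own illustration, under the assumption that edges are given as pairs) may make the ${O(2^k)n^c}$ bound vivid: the recursion branches two ways at most ${k}$ times, so there are at most ${2^k}$ leaves, each costing polynomial time.

```python
def vertex_cover_branch(edges, k):
    """Bounded search tree for Vertex Cover.

    Pick any uncovered edge (u, v); some endpoint must be in the cover,
    so branch on adding u or adding v. Depth is at most k, giving at
    most 2^k leaves with poly(n) work each -- the O(2^k * n^c) algorithm.
    """
    if not edges:
        return True          # every edge is covered
    if k == 0:
        return False         # edges remain but the budget is spent
    u, v = next(iter(edges))
    for w in (u, v):
        # Adding w to the cover removes all edges touching w.
        remaining = [e for e in edges if w not in e]
        if vertex_cover_branch(remaining, k - 1):
            return True
    return False
```

For instance, a triangle has no vertex cover of size 1 but does have one of size 2, and a 3-vertex path is covered by its middle vertex alone.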

The exact ${c}$ for Vertex Cover may be model-dependent; for RAMs we get ${c = 1}$ and indeed time ${O(1.274^k + kn)}$ is known. But the fact of its being fixed holds in any reasonable model and classifies Vertex Cover as belonging to the class ${\mathsf{FPT}}$ for fixed-parameter tractable. To address whether Clique and Dominating Set belong to ${\mathsf{FPT}}$, a novel reducibility and completeness theory was built.

## Parameterized Reductions and Logspace

To exemplify an FPT-reduction, consider the problem

• Short NTM Acceptance: Does a given nondeterministic Turing machine ${N}$ accept the empty string within ${k}$ steps?

When ${k}$ is variable this is a generic ${\mathsf{NP}}$-complete problem, so of course we can reduce Clique to it, but consider the reduction done this way: Given ${G}$ and ${k}$, make an alphabet ${\Gamma}$ with a character ${c_u}$ for each ${u \in V}$. Build an NTM ${N_G}$, with a table for ${E}$, that nondeterministically writes down ${k' = \binom{k}{2}}$ edges ${(c_u,c_v)}$ on its tape, then deterministically checks whether only ${k}$ different nodes appear in these edges, accepting if so. The computation takes time ${k''}$ that is ${O(k')}$ and can be defined solely in terms of ${k}$. The latter fact makes ${f(G,k) = (N_G,k'')}$ an FPT-reduction.

FPT-reductions work similarly to ordinary polynomial-time reductions. If our NTM problem belongs to ${\mathsf{FPT}}$ then so does Clique. There is a trickier FPT-reduction the other way, so the problems are FPT-equivalent. Both are complete for the first level, called ${\mathsf{W[1]}}$, of the ${\mathsf{W}}$-hierarchy. Dominating Set is hard for the second level, ${\mathsf{W[2]}}$; Clique FPT-reduces to it, but not vice-versa unless ${\mathsf{W[2] = W[1]}}$.

Just as all major ${\mathsf{NP}}$-complete problems are related by logspace, not just polynomial-time, reductions, parameterized complexity has a logspace-founded version. The class ${\mathsf{XL}}$ consists of parameterized problems ${B}$ whose slices ${B_k}$ individually belong to ${\mathsf{L}}$—that is, to deterministic logspace—and ${\mathsf{XNL}}$ and ${\mathsf{XP}}$ are defined similarly via the conditions ${B_k \in \mathsf{NL}}$ and ${B_k \in \mathsf{P}}$. Interestingly, the separation ${\mathsf{FPT \subset XP}}$ is known by standard diagonalization means, but ${\mathsf{FPT}}$ is not known to be contained in ${\mathsf{XNL}}$, for reasons similar to the ${\mathsf{P}}$ vs. ${\mathsf{NL}}$ problem. A recent Master’s thesis by Jouke Witteveen has details on these classes.

All four of our problems belong to ${\mathsf{XL}}$. Furthermore, the slices ${B_k}$ belong to “almost exactly ${k\log n}$” space using binary TMs. The space is needed mainly to write down each candidate ${S}$ and can be quantified as ${k\log n + o(\log n)}$. We could jump from this to define finer variations on ${\mathsf{XL}}$ and ${\mathsf{XNL}}$ but Mike, joined by Swernofsky, chose to refine the reductions instead. Now we are primed to see how their ideas might impact separations.

## LBL Reductions

Our reduction from Clique to Short NTM Acceptance bumped ${k}$ up to ${k'' = \Theta(k^2)}$. Blowups in ${k}$ are important to a central concept called kernelization which deserves more space—see this survey. They can be exponential—the ${O(2^k n)}$ algorithm for Vertex Cover hints at how—or even worse. Hence people have refined parameterized reductions according to whether they blow up ${k}$

• polynomially, or

• linearly.

But before now, we haven’t noted in this line a strong motivation to limit the blowup even further, such as to ${k'' = k + o(k)}$ or ${k'' = k + d}$ for some fixed constant ${d}$ (however, see this).

Our reduction also bumped up the alphabet size. This appears necessary because for binary NTMs the problem is in ${\mathsf{FPT}}$: we can traverse the ${2^k}$ possible guesses made on the tape, and for each one, solve an instance of graph-reachability.

So what can we do if we insist on binary TMs? Mike and Joseph fastened on to the idea of making the ${k}$ in ${k\log n}$ space become the parameter. We can define “Log NTM Acceptance” to be like “Short NTM Acceptance” except that the NTM ${N}$ is allowed ${k\log n}$ space (where ${n}$ is the size of ${N}$ in states). We get a problem in ${\mathsf{XNL}}$ that is in fact complete for ${\mathsf{XNL}}$ under parameterized logspace reductions. Likewise “Log DTM Acceptance” is the case where ${N}$ is deterministic, which is complete for ${\mathsf{XL}}$. Then we define “Medium DTM Acceptance” where the question asks about acceptance within ${n^k}$ time steps regardless of space. Mike’s thesis also covers “Long DTM Acceptance” where the time is ${2^{n^k}}$.

The ${k}$ in ${k\log n}$ allows a binary TM to write down the labels of ${k}$ nodes in a size-${n}$ structure at a time. In that way it subsumes the use of ${k}$ in the above three graph problems but is more fluid—the ${k\log n}$ space could be used for anything. Keying on ${k}$ drives the motivation for Mike’s definition of LBL (for “level-by-level”) reductions, whose uniform logspace version is as follows:

Definition 1 A parameterized problem ${[A_k]}$ LBL-reduces to a parameterized problem ${[B_k]}$ if there is a function ${f}$ computable in ${c\log n}$ space such that for all ${x}$ and ${k}$, ${x \in A_k \iff f(x,k) \in B_k}$, where ${c}$ is fixed independent of ${k}$.

That is, in the general FPT-reduction form ${f(x,k) = (x',h(k))}$, we insist on ${h(k) = k}$ exactly. This reduction notion turns out to be the neatest way to express how Mike’s thesis refines and extends previous work, even his own.

 With our Chair, Chunming Qiao, and his UB CSE Graduate Leadership Award.

## Some Results

The first reason to care about the sharper reduction is the following theorem, which really summarizes known diagonalization facts. The subscript ${b}$ reminds us that the TMs have binary work alphabet; it is not needed on ${\mathsf{DTIME}[n^{ck}]}$ since the alphabet size does not affect the exponent.

Theorem 2

• If a parameterized problem ${[A_k]}$ is LBL-hard for Log DTM Acceptance, then there is a constant ${c > 0}$ such that for all ${k}$, ${A_k \notin \mathsf{DSPACE}_b[ck\log n]}$.

• If a parameterized problem ${[A_k]}$ is LBL-hard for Log NTM Acceptance, then there is a constant ${c}$ such that for all ${k}$, ${A_k \notin \mathsf{NSPACE}_b[ck\log n]}$.

• If ${[A_k]}$ is LBL-hard for Medium DTM Acceptance, then there is a constant ${c}$ such that for all ${k}$, ${A_k \notin \mathsf{DTIME}[n^{ck}]}$.

Moreover, if there is a constant ${d}$ such that each ${A_k}$ belongs to ${\mathsf{DSPACE}_b[dk\log n]}$, ${\mathsf{NSPACE}_b[dk\log n]}$, or ${\mathsf{DTIME}[n^{dk}]}$, respectively, then these become LBL-equivalences.

The power, however, comes from populating these hardnesses and equivalences with natural problems. This is where the ${\mathsf{IE}}$ problem—as a parameterized family of problems asking about the intersection of the languages of ${k}$-many ${n}$-state DFAs—comes in. We can augment them by adding one pushdown store to just one of the DFAs, calling them ${P_1,M_2,\dots,M_k}$ where ${P_1}$ is the PDA. Then call the problem of whether

$\displaystyle L(P_1) \cap L(M_2) \cap \cdots \cap L(M_k) \neq \emptyset$

by the name ${\mathsf{IE}_{D+1P}}$. It curiously turns out not to matter whether ${P_1}$ or the ${M_i}$ are nondeterministic for the following results:

Theorem 3

• ${\mathsf{IE}}$ is LBL-equivalent to Log NTM Acceptance. Hence there is a ${c > 0}$ such that for each ${k}$, ${\mathsf{IE}_k \notin \mathsf{NSPACE}_b[ck\log n]}$.

• ${\mathsf{IE}_{D+1P}}$ is LBL-equivalent to Medium DTM Acceptance. Hence there is a ${c > 0}$ such that for each ${k}$, ${(\mathsf{IE}_{D+1P})_k \notin \mathsf{DTIME}[n^{ck}]}$.

Between the ICALP 2015 paper and writing the thesis, Mike found that the consequence of the first part had been obtained in a 1985 paper by Takumi Kasai and Shigeki Iwata. Mike’s thesis has further results establishing a whole spectrum of ${\mathsf{IE}}$ variants that correspond to other major complexity classes and provide similar lower bounds on ${k}$-levels within them. The question becomes how to leverage the greater level of detail to attack lower bound problems.

## Onto Bigger Questions

The two-pronged nature of the attack on complexity questions is shown by the following pair of results.

Theorem 4 ${\mathsf{P = NL}}$ if and only if ${\mathsf{IE}}$ and ${\mathsf{IE}_{D+1P}}$ are LBL-equivalent.

Theorem 5 If for every ${a > 0}$ there is a ${k}$ such that ${\mathsf{IE}_k \in \mathsf{DTIME}[n^{ak}]}$, then ${\mathsf{P \neq NL}}$.

It has been known going back to Steve Cook in 1971 that logspace machines with one auxiliary pushdown, whether deterministic or nondeterministic, capture exactly ${\mathsf{P}}$. Since pushdowns are restricted and concrete, this raised early hope of separating ${\mathsf{P}}$ from ${\mathsf{PSPACE}}$—intuitively by “demoting” some aspect of a ${\mathsf{P}}$ problem down to ${\mathsf{L}}$ or ${\mathsf{NL}}$ which are known to be different from ${\mathsf{PSPACE}}$. There was also hope of separating ${\mathsf{P}}$ and ${\mathsf{L}}$ or at least finding more-critical conditions that pertain to their being equal.

What Mike’s theorem does is shift the combinatorial issue to what happens when the pushdown is added to a collection of DFAs. Unlike the case with an auxiliary pushdown to a ${k\log(n)}$ space-bounded TM, there is a more-concrete sense in which the relative influence of the pushdown might “attenuate” as ${k}$ increases. Can this be leveraged for a finer analysis that unlocks some secrets of lower bounds?

At the very least, Mike’s LBL notion provides a succinct way to frame finer questions. For example, how does ${\mathsf{P = PSPACE}}$ hang on the relation between individual ${\mathsf{DTIME[n^k]}}$ levels of ${\mathsf{P}}$ and ${\mathsf{DSPACE[n^k]}}$ levels of ${\mathsf{PSPACE}}$? The simplest way to express this, using Mike’s notation ${D^T_{n^k}}$ for Medium DTM Acceptance and ${D^S_{n^k}}$ for the analogous parameterized problem for ${n^k}$ space-bounded acceptance, seems to be:

Theorem 6 ${\mathsf{P = PSPACE}}$ if and only if ${D^T_{n^k}}$ and ${D^S_{n^k}}$ are LBL-equivalent.

Mike’s thesis itself is succinct, at 73 pages, yet contains a wealth of other ${\mathsf{IE}}$ variants for tree-shaped automata and other restricted models and connections to forms of the Exponential Time Hypothesis.

## Open Problems

How can one appraise the benefit of expressing “polynomial” and “${O(\log n)}$” complexity questions consciously in terms of individual ${n^k}$ and ${k\log n}$ levels? Does the LBL reduction notion make this more attractive?

It has been a pleasure supervising Mike and seeing him blaze his way in the larger community, and Dick and I wish him all the best in his upcoming endeavors.

[fixed first part of Theorem 2, added link to Cygan et al. 2011 paper, added award photo]

January 19, 2017 1:04 pm

Interesting post! A remark w.r.t. the line “But before now, we haven’t noted a strong motivation to limit the blowup even further, such as to ${k'' = k + o(k)}$ or ${k'' = k + d}$ for some fixed constant ${d}$”. This is the central point in lower bounds under the Strong ETH; see for example the “On problems as hard as CNF-SAT” paper by Cygan et al.

• January 19, 2017 1:08 pm

Thanks, Daniel! In that line, I meant reference within parameterized complexity. But I’ll consider further.

• January 19, 2017 11:25 pm

Thank you very much for sharing the paper. It’s great!

By the way, I am a big fan of your book “Parameterized Algorithms”!