And Then There Were Two
The mystery of disappearing intermediate complexity
Agatha Christie was a famous writer, perhaps the most famous ever, of mystery novels and stories, and wrote well into her eighties. Her last novel, Sleeping Murder, was published posthumously in 1976. In 2005, the self-publishing site Lulu.com ran a computer analysis that judged this “the perfect title for a best-selling novel.”
Today we wish to discuss whether the vs. problem—or the complexity of any much-studied decision problem—will remain a mystery for as long as Christie wrote mysteries.
Her novel And Then There Were None is “the world’s best-selling mystery ever” according to its Wikipedia entry. This was not the original title. That and a third title by which the book has been known both began with the words “Ten Little……” The first completion is a word we’ve agreed to ban, while Dick and I feel the second, “Indians,” should refer only to India. “Native American” is not much better. For a third change, the nursery rhyme in current editions of the novel is now called “Ten Little Soldiers.”
The words “and then there were none” end the nursery rhyme, and describe the process in the novel. We see a similar process happening with problems in that were for some time not known to be in nor -complete.
NP-Intermediate?
Richard Ladner proved that if , then one can construct languages in that are -intermediate, meaning neither in nor -complete. The languages constructed in the proof, however, are artificial. The question is whether any “natural” decision problems are intermediate.
The famous text by Michael Garey and David Johnson on -completeness identified a dozen candidate intermediate problems. They did not define a decision version of FACTORING separate from the entry on COMPOSITE NUMBER. We will add DISCRETE LOGARITHM as a separate entry and one more still-open problem below, so perhaps this post should be titled “And Then There Were Five.”
Garey and Johnson’s List
Six of the twelve problems were solved in the two years between the text’s 1979 first edition and David’s first NP-completeness column which was devoted to them. Here they are, with his numbers. See David’s column 1, column 19, and column 24 for any references not given here.
- GRAPH ISOMORPHISM: Given (simple, undirected) graphs and , are they isomorphic? Still open.
- SUBGRAPH HOMEOMORPHISM: For any fixed graph , given any graph , does have an edge-induced subgraph that becomes isomorphic to when vertices of degree 2 are ignored? (Ignoring such a means replacing the two edges by one edge and deleting .) In —but with a multiple-exponential dependence of the constant on the size of . -complete when is a variable part of the input.
- GRAPH GENUS: Given and , can be mebedded without edge-crossings on a surface of genus ? -complete as proved by Carsten Thomassen in 1988. But in for any fixed , via similar theory on forbidden minors as item 2.
- CHORDAL GRAPH COMPLETION: Given and , can we add or fewer new edges to make a chordal graph? -complete, by Mihalis Yannakakis in 1981.
- CHROMATIC INDEX, also called EDGE COLORING: Given , can the edges be colored with (or fewer) colors so that no two edges of the same color share a vertex? -complete, proved in 1981 by Ian Holyer.
- SPANNING TREE PARITY: Given and a partition of the edges into two-element subsets, does have a spanning tree that includes neither or both edges from each subset? In , proved by László Lovász in 1978–1980.
- PARTIAL ORDER DIMENSION: Given a finite poset and , are there linear extensions of such that in if and only if in every extension? -complete, proved by Eugene Lawler and Oliver Vornberger in 1981, and independently in stronger form by Yannakakis the same year.
- PRECEDENCE CONSTRAINED 3-PROCESSOR SCHEDULING: Given a finite poset and , can be partitioned into subsets each of size at most 3 whose indices respect the ordering? Still open, even for a fixed number of processors, though -complete if is variable. See this paper by Hans Bodlaender and Mike Fellows for related hardness results.
- LINEAR PROGRAMMING: In , by the famous ellipsoid algorithm of Leonid Khachiyan. Boris Yamnitsky and Leonid Levin noted that an older published algorithm runs in deterministic polynomial time.
- (NON-)TOTAL UNIMODULARITY: Given an integer matrix with entries , , or , is there a square sub-matrix whose determinant has absolute value ? In , proved in 1980 by Paul Seymour.
- COMPOSITE NUMBER: In , by the celebrated but not-quite-practical deterministic algorithm of Manindra Agrawal, Neeraj Kayal, and Nitin Saxena.
- MINIMUM WEIGHT TRIANGULATION: Given a set of points in the plane, and , is there a triangulation of whose total edge length is at most ? -complete, proved by Wolfgang Mulzer and Günther Rote in 2006—the latest problem to fall.
Other Candidates
The first two were generally understood as being on “the list” all along, while the third may have the longest lasting significance.
- FACTORING: Given natural numbers , does have a prime factor greater than ?
- DISCRETE LOGARITHM: Given natural numbers , does there exist such that modulo ?
- CIRCUIT MINIMIZATION: Given a string where for some , and (in binary), is there a -input Boolean circuit of size at most such that for all , , ?
The last was studied by Valentine Kabanets and Jin-Yi Cai, who gave interesting theoretical evidence for its truly being intermediate.
With FACTORING, if the question is: given , is a prefix of the unique prime factorization of , then the problem belongs to , not just .
Some other candidate intermediate problems may be found here.
What Would a Gap Mean?
Even if Garey and Johnson had originally given the full list of fifteen problems, then as with “ten little_______________s,” ten of them would now be gone. Perhaps only CIRCUIT MINIMIZATION will ultimately be left over, and that will be the guilty party.
If there truly is a full gap between “easy” and “maximally hard” for problems in , what would it mean? It may mean that complexity is better thought of in terms of quantized levels rather than classes. Ladner’s theorem constructing problems properly between any two such levels detracts from this picture, but perhaps we would see Ladner as defining one dimension of “gappiness” and then find a technical condition for natural-ness by which complexity cannot be subdivided. This is of course just speculation at this point, but we have to think out-of-the-box to solve the mystery.
A Cue From Christie?
Spoiler alert: if you have plans to read her novels, then skip the following:
Agatha Christie was not just a famous mystery write. She was famous for thinking outside the box—some of her mysteries broke the rules of the genre. Her Murder of Roger Ackroyd, for example, was quite controversial for breaking one of the rules—the narrator of the novel is the murderer. One of her other most famous novels, Murder on The Orient Express breaks another rule—all the suspects are guilty. In the book And Then There Were None she broke yet another—at the end of the novel everyone dies, including the murderer. And in the 1920′s, she made herself a mystery, by disappearing
I (Dick) think the lesson is that to make progress in our area we probably have to think outside the box—like she often did. Break the rules. She would have made a great theorist.
Since this post is all about open problems, we break our rules in the next section.
Open Problems
Agatha Christie was a famous writer, perhaps the most famous ever, of mystery novels and stories, and wrote well into her eighties. (…)
[fixed k>=3 in problem 8.]
You forgot the model-checking problem for the mu-calculus (Emerson-Jutla-Sistla). Or maybe it is now known it is in P ?
I’m always surprised that the most obvious NP-intermediate candidate is omitted from such lists, namely LOGCLIQUE (given a graph G, does it have a clique of size log |V(G)| ?). If LOGCLIQUE is in P, or if LOGCLIQUE is NP-complete, then in both cases we could solve n-variable 3SAT in 2^{o(n)} time, contradicting the Exponential Time Hypothesis (ETH). Therefore, if you believe ETH, then you can easily construct simple (but somewhat technical) problems such as LOGCLIQUE that are between P and NP-complete problems. More generally, ETH seems to suggest that for every (reasonable) function between polynomial and exponential there is a problem that requires exactly that much time. So indeed there seems to be an infinite hierarchy of complexities between polynomial and exponential time.
Thanks—-that is an interesting example. The post we’re drafting for Monday will state our shared skepticism about ETH; as with circuits for E, I think time 2^{O(n/log n)} is possible.
More interesting examples are welcome. Also, we linked but didn’t list the other ones given in DSJ’s “Column 24″—opinions that some are really substantial would interest us.
Take a look at the list here: http://cstheory.stackexchange.com/questions/79/problems-between-p-and-npc
A few more potential examples: of intermediate problems
1) Unique games (conjectured (plausibly) to be NP complete; believed by some to be in P potentially intermediate)
2) Christos classes: that represents various non constructive mathematical methods are known (under plausible conditions) not to be NP complete so must be genuinly intermediate.
3) Finding the sink of an acyclic unique sink orientation (AUSO) of the cube. (The unique sink property is assumed for all subcubes.) This is quite an interesting class of problems and it would be nice to connect them to CC.
(Inspired by Ramsey theory) Given c,n,m and a partial c-coloring of the nxm grid that has no mono rectangles, is there a way to extend to a rect-free coloring.
Can replace rectangle with squares or other shapes.
Clearly in NP, I can’t get it into P nor prove it NPC nor prove that IF it was NPC then
bad things happen. VERY FEW people have worked on this so it may well be in P
or NPC (I think NPC more likely) hence taking it off this list.
Other partial-coloring-Ramsey-type problems can be genreated as well.
I want to suggest an example very tentatively because I haven’t thought about it at all, so it might be obviously NP-complete. (I very much doubt it’s in P.) Suppose we take a non-singular matrix over . The problem is to determine whether it can be reduced to the identity using at most m Gaussian row-operations. (The problem of finding an explicit matrix that needs a superlinear number of operations is a well-known beautiful open problem, which is why it seems very unlikely that this problem is in P.) I can believe that a more general problem about representing group elements by short words in a set of generators might be NP-complete, but what seems to give this one a chance of being intermediate is that we are talking about a specific group and a specific set of generators. Does anyone know anything about the complexity of this problem?
Now was #11 ever a possible? Miller’s paper was 1976, three years before their book (1979), so PRIMES in P was known under ERH, and no one would think the opposite. It should not have been on their list as “possible” IMO.
For the description of the 8th problem, there is a mistake. “Still open, even for a fixed number {k > 3} of processors,” It should be {k>=3}.
You are right, per these notes. Don’t recall what my source was; usually I’d have written p >= 4 so it may just be an uncaught inequality typo. Thanks!