Exponential hardness connects broadly to quadratic time
Cropped from src1, src2
Arturs Backurs and Piotr Indyk are student and advisor. The latter, Piotr, is one of the leading algorithms and complexity theorists in the world—what an honor it must be for Arturs to work with him as his advisor.
Today Ken and I want to talk about their paper on edit distance and an aspect that we find puzzling.
The paper in question is, “Edit Distance Cannot Be Computed in Strongly Subquadratic Time (unless SETH is false).” It is in this coming STOC 2015.
What we find puzzling is that the beautiful connection it makes between two old problems operates between two different levels of scaling, “” and “” This messes up our intuition, at least mine.
I, Dick, have thought long and hard, over many years, about both the edit distance problem and about algorithms for satisfiability. I always felt that both algorithms should have much better than the “obvious” algorithms. However, I was much more positive about the ability for us to make a breakthrough on computing the edit distance, then to do the same for satisfiability.
The way of linking the two problems is to me quite puzzling. Quoting my favorite band, the Talking Heads:
… Now don’t you wanna get right with me?
I hope you get ev’rything you need
Puzzling evidence …
The edit distance between two strings is defined as the minimum number of insertions, deletions, or substitutions of symbols needed to transform one string into another. Thus CAT requires three substitutions to become ATE, but it can also be done by one insertion and one deletion: pop the C to make AT and then append E to make ATE. Thus the edit distance between these two words is 2. The problem of computing the edit distance occurs in so many fields of science that it is hard to figure out who invented what first. The case of strings of length is easily seen to be computable in time quadratic, , by a dynamic programming algorithm that builds up edit distances between initial substrings.
Chak-Kuen Wong and Ashok Chandra proved this is optimal in the restricted model where one can only compare characters to each other. There are algorithms that beat quadratic by logarithmic factors—they essentially treat blocks of characters as one. But it remain open after much research whether there is an algorithm that runs in time of order , for example.
The problem is the usual question of testing Boolean clauses to see if they can all be satisfied at the same time by the same assignment to the Boolean variable. restricts to formulas in conjunctive normal form, and restricts to clauses with at most literals per clause.
Backurs and Indyk prove that if there exists such that edit distance can be decided in time , then there exists such that for formulas with variables and clauses can be solved in time . They build on a connection to SETH showed ten years ago by Ryan Williams in part of a wider-ranging paper.
The basic idea is how behaves with regard to partitions of the variables of a formula into two size- subsets, call them and . Let be the set of assignments to and to . For every assignment , let be the set of clauses it satisfies, and the remaining clauses which it does not satisfy. Similarly define and for . Then
Now let us identify with regarded as an -bit vector and similarly with , also re-labeling to be the set of -many ‘s, for ‘s. Then as Williams observed, is satisfiable if and only if we have a yes-instance of the following problem:
Orthogonal Vectors Problem (): Given two sets of -many length- binary vectors, are there and such that ?
It is obvious to solve in time by trying all pairs. The nub is what happens if we achieve anything slightly better in the exponent than quadratic, say time . Then with we get time
for , which contradicts SETH.
What’s puzzling is that the evidence against doing better than quadratic comes when is already exponential, . Moreover, the instances involved are ridiculously large, exponential sized, and we don’t even care that they have a succinct genesis in terms of . (Note that we have swapped the letters and from their paper—we find it helpful to keep “” the larger one.)
Backurs and Indyk itemize several problems to which this connection was extended since 2010, but we agree that Edit Distance () is the most striking addition to this list. Their new result is a kind of “SETH-reduction” from to Can we capture its essence without referencing each time?
The results here and before all use an unusual type of reduction. Ken and I think it would be useful to formalize this reduction, and try to understand its properties. It is not totally correct to call it simply a quasi-linear time reduction because multiple parameters are involved—we can call them and quite generally.
In the above case with and , if the clause size is fixed then we have so . It hence suffices to have a reduction from to that is computable in quasi-linear time, here meaning time . Indeed, we can allow time for any function .
When talking just about the problems and , however—without reference to — and are separate parameters with no relation specified. It suffices to say that the reduction is polynomial in and quasi-linear in . This is essentially what Backurs and Indyk do. Their “” is called ““; then they define , , and ; then they multiply , and so on. The details in their paper are considerable, involving an initial reduction from to a problem they call , and this is one reason we’d like to streamline the reduction concept.
If we assume , then “quasi-linear in and polynomial in ” is the same as “linear in and polynomial in .” Perhaps the latter phrase is the simplest and best way to define the reduction? However, we do not need to specify “polynomial in ” either. It is enough to have a suitable sub-exponential time in . For instances with we would need not , while for we would need .
Parameterized complexity defines reductions with two parameters, but the simplest ones are not exactly suitable. Finally, we wonder whether it would help to stipulate any of the specific structure that comes from including that the instances are succinct. Note that we once covered succinctness and a hypothesis roughly related to SETH (see this comment for a circuit form). This paper highlighted by Backurs and Indyk works from but says it could have used , while still not formalizing the reduction concept. Likewise their other major references; some work from and others not. The latest of them, by Karl Bringmann and Marvin Künnemann who show “SETH-hardness” for on binary strings, defines a framework for the gadgets used in all these reductions.
Power and Puzzlement
The remarks just above about the reduction time in “” make us recall the three most common levels of exponential hardness. The power index of a language was coined by Richard Stearns and Harry Hunt in a 1990 paper. It is the infimum of such that belongs to time . Their “Satisfiability Hypothesis” (SH) about the power index of satisfiability is much weaker than SETH, though not as weak as conjecturing merely a lower bound of .
- SH: has power index 1.
- ETH: All algorithms for require time .
- SETH: All algorithms for require time .
The latter two come in slightly different versions bringing and/or a fixed in – into the picture, and of course all these versions might be false. SETH is distinguished as the closest to the upper bound. There are randomized algorithms for – that run in time for some and all , and can be replaced by in general.
Stearns and Hunt emphasized the effect of reductions on SH and the power index in general. The same can be done for ETH. But we remain equally puzzled about the issue of the size of the problems used in the reductions. We start with a SAT problem that uses bits in its description. This is viewed then eventually as an edit distance problem that uses exponential in bits. The is the edit problem is extremely large. Of course this is just fine, since we only claim to get an exponential time algorithm.
The point is that our intuition about the edit distance problem is all on problems that have modest size . I, Dick, actually had a student build a hardware machine that did modest size such problems many times faster than any computer. So all my intuition was—is—about small size edit problems. When the size of becomes astronomical my intuition may fall apart. Could this explain the reason that the result here seems puzzling?
So what is the complexity of computing the edit distance? Can we really not do better than the obvious algorithms? This seems hard, no puzzling, to us; but it may indeed be the case.
The 3SUM problem, which we recently covered, is attacked by some of these papers but has not of yet been brought within the scope of the reductions from . The decision tree upper bound has not yet yielded an algorithm that actually runs in time Yet perhaps the above kinds of reductions also generally preserve the decision-tree complexity? This would make the decision-tree result a real obstacle to showing “SETH-hardness” in that manner.