An improved exponential time algorithm for the knapsack problem

Nick Howgrave-Graham and Antoine Joux are experts in the area of computational number theory and cryptography.

Today I plan to talk about their new algorithm for the knapsack problem. It is an exponential time algorithm with a much smaller exponent than any previous algorithm. The knapsack problem is one of my favorite problems, and their improvement is clever and surprising.

Nick has done some beautiful work in computational number theory. One of my favorites is his fast algorithm for factoring numbers of the form ${p^{r}q}$, where as usual ${p}$ and ${q}$ are primes. He has shown, with Dan Boneh and Glenn Durfee, that if ${r}$ is large enough—order logarithmic in ${p}$—then such numbers can be factored in polynomial time.

Antoine has done some wonderful work on secure hash functions: for example, his team found the first collision for SHA-0. In cryptography, hash functions require a variety of special properties to be useful. Distinct messages with the same hash value must exist by the pigeonhole principle, but they should be hard to find. If collisions were easy to find, then it might be possible to “fool” someone into accepting the wrong message as the intended one. Antoine and his colleagues exhibited two concrete messages that collide under SHA-0.

The exact messages are not important; the great result is they can be found.

Their paper will appear in the next EUROCRYPT. I like one of the traditions of this conference, which is an evening “rump” session. The session consists of short presentations that range from “here is my paper you did not accept,” to “here is my great new paper you will accept,” to jokes. The rump session presentations are helped by a generous supply of beer and by general good company.

One year, while I was eating dinner with a large group right before the rump session, we decided we had to present a talk during the evening session—but on what? After some discussion we agreed on an idea I suggested; I did not present the idea but one of my colleagues did. The idea was a new protocol for program committees; the protocol would allow committees to have provable deniability. Our protocol was simple:

After the committee was done deciding which papers to accept and which to reject, they would do the following. They would randomly select one accepted paper to reject and one rejected paper to accept.

The protocol would give each program committee member the ability to handle the two most dreaded questions they get:

• “Why did you reject my paper?”
• “Why did you accept their paper?”

Each of these questions could now be answered with a smile. As far as I know the program committee does not currently use our protocol, but I am not sure.

Let’s turn to the knapsack problem.

The Problem

Recall the knapsack problem asks whether, given positive integers ${ a_{1}, \dots, a_{n}}$, there is a subset ${ I}$ of the indices ${ 1}$ to ${ n}$ so that

$\displaystyle \sum_{i \in I} a_{i} = b$

where ${ b}$ is another given positive integer. The obvious algorithm would try all subsets ${ I}$. Since there are ${ 2^{n}-1}$ non-empty subsets the algorithm takes a long time. An early algorithm, due to Ellis Horowitz and Sartaj Sahni in 1974, made a key observation. The above equation can be re-written as follows:

$\displaystyle \sum_{i \in I} a_{i} = b - \sum_{j \in J} a_{j}$

where ${ I \subseteq \{1, \dots, n/2\}}$ and ${ J \subseteq \{n/2+1, \dots, n\}}$. (Yes, we assume that ${ n}$ is even, but if ${ n}$ is odd the method can easily be fixed.) Then, the algorithm computes two sets. The first consists of all the values of

$\displaystyle \sum_{i \in I} a_{i}$

where ${I \subseteq \{1, \dots, n/2\}}$; and the second consists of all values of

$\displaystyle b - \sum_{j \in J} a_{j}$

where ${ J \subseteq \{n/2+1, \dots, n\}}$. Then simply check whether these two sets have a value in common. The point is that they have a value in common if and only if the knapsack problem has a solution. The punch line is that the cost is dominated by the number of values in each set: ${ 2^{n/2}}$. Note, checking for a common value can be done either by hashing or by sorting; in either case this takes time only slightly larger than the total size of the sets.
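The Horowitz-Sahni idea fits in a few lines of Python; this is a toy sketch (the function name and the example values are mine), using a hash set for the common-value check:

```python
def subset_sum_mitm(a, b):
    """Meet-in-the-middle subset-sum decision (Horowitz-Sahni style).

    Returns True iff some subset of a sums to b.  Time and space are
    O(2^{n/2}) up to polynomial factors, versus 2^n for brute force.
    """
    n = len(a)
    left, right = a[:n // 2], a[n // 2:]

    def all_sums(items):
        # Build every subset sum of the given half incrementally.
        sums = {0}
        for x in items:
            sums |= {s + x for s in sums}
        return sums

    left_sums = all_sums(left)    # all sums over the first half
    right_sums = all_sums(right)  # all sums over the second half
    # A solution exists iff some left sum equals b minus some right sum.
    return any(b - s in left_sums for s in right_sums)

# Example: 3 + 7 + 2 = 12, so the first call succeeds.
print(subset_sum_mitm([3, 5, 7, 2, 11], 12))  # True
print(subset_sum_mitm([3, 5, 7, 2, 11], 1))   # False
```

Using Python sets for the membership test is exactly the hashing option mentioned above; sorting both lists and merging would work equally well.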

The New Algorithm

Nick and Antoine have proved the following:

Theorem: There is a randomized algorithm for the knapsack problem that runs in time ${\bar{O}(2^{0.311n})}$ and uses space ${\bar{O}(2^{0.256n})}$.

Note, ${\bar{O}(2^{cn})}$ is just the usual O-notation except the missing constant is allowed to be a missing polynomial. There is always a chance this could be abused: what if the missing polynomial is ${n^{1000}}$? In their case it is not; see their paper for the actual bounds. Another way to see they have not “cheated” is that they have a running implementation with reasonable performance.

I think this is a terrific result, since the bound they beat is almost thirty years old. Further, one wonders whether or not there could be further improvements to the exponent on the running time. Of course, I believe P=NP is possible, so even more surprising bounds could exist. In any event, a great result.

The Schroeppel and Shamir Algorithm

How do Nick and Antoine prove their theorem? I will try to sketch the main ideas for you, but please look at their paper for the details.

Their algorithm is based on an earlier algorithm of Richard Schroeppel and Adi Shamir. This algorithm uses the same method I just outlined, with a key twist. The previous algorithm created two sets ${S_{L}}$ and ${S_{R}}$. The first ${S_{L}}$ consists of all the values of

$\displaystyle \sum_{i \in I} a_{i}$

where ${I \subseteq \{1, \dots, n/2\}}$; and the second ${S_{R}}$ consists of all values of

$\displaystyle b - \sum_{j \in J} a_{j}$

where ${ J \subseteq \{n/2+1, \dots, n\}}$. What Schroeppel and Shamir focused on was not reducing the running time, but the storage of the algorithm. Their twist is a method that still takes time ${2^{n/2}}$, but uses only ${2^{n/4}}$ memory. This was a huge improvement, since even for modest sizes of ${n}$ the ${2^{n/2}}$ memory requirement of the old algorithm made it impractical.

The key to reducing the memory needed is not to store all of ${S_{L}}$ and ${S_{R}}$. Instead, create a subroutine that creates the elements from these sets in sorted order, and this reduces the storage required. This is achieved by viewing ${S_{L}}$ and ${S_{R}}$ as coming from other sets. For example, ${S_{L}}$ is viewed as the set of all sums of ${S_{LL}}$ and ${S_{LR}}$ where: ${S_{LL}}$ is the set of all sums from ${a_{1}}$ to ${a_{n/4}}$ and ${S_{LR}}$ is the set of all sums from ${a_{n/4+1}}$ to ${a_{n/2}}$. In a like manner ${S_{R}}$ is viewed as the set of all sums of two sets ${S_{RL}}$ and ${S_{RR}}$.

This method, as I stated already, reduces the storage required to ${2^{n/4}}$, but leaves the running time the same. The algorithm must still look at ${2^{n/2}}$ values to determine whether or not the knapsack problem has a solution.
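The “produce the elements in sorted order” subroutine can be realized with a priority queue; here is a minimal sketch (names mine) that streams all pairwise sums of two sorted lists, such as ${S_{LL}}$ and ${S_{LR}}$, while holding only one heap entry per element of the first list:

```python
import heapq

def sorted_pair_sums(xs, ys):
    """Yield every sum x + y (x in xs, y in ys) in nondecreasing order.

    Both input lists must be sorted.  The heap never holds more than
    len(xs) entries, so memory is O(len(xs)) instead of the
    O(len(xs) * len(ys)) needed to store all sums at once.
    """
    if not xs or not ys:
        return
    # One heap entry per x, each paired with an index into ys.
    heap = [(x + ys[0], i, 0) for i, x in enumerate(xs)]
    heapq.heapify(heap)
    while heap:
        s, i, j = heapq.heappop(heap)
        yield s
        if j + 1 < len(ys):
            heapq.heappush(heap, (xs[i] + ys[j + 1], i, j + 1))

# All sums of {1, 3} and {0, 2}, smallest first.
print(list(sorted_pair_sums([1, 3], [0, 2])))  # [1, 3, 3, 5]
```

Taking the two lists to be ${S_{LL}}$ and ${S_{LR}}$ (each of size ${2^{n/4}}$) streams ${S_{L}}$ in sorted order while storing only about ${2^{n/4}}$ numbers; ${S_{R}}$ is streamed the same way from ${S_{RL}}$ and ${S_{RR}}$, and the two sorted streams can then be scanned together for a common value.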

The new clever insight is another twist that reduces the time to ${2^{0.311n}}$ while very slightly increasing the storage requirements. Again consider finding ${n/2}$ elements among

$\displaystyle a_{1}, \dots, a_{n}$

that sum to ${b}$: of course all cases must be handled, but the case where the answer uses ${n/2}$ elements is the one that takes the longest. Define ${{\cal S}_{n/4}}$ as the set of all sums of ${n/4}$ of the elements. Then there are elements ${\sigma_{1}}$ and ${\sigma_{2}}$ in ${{\cal S}_{n/4}}$ such that

$\displaystyle \sigma_{1} + \sigma_{2} = b, \ \ \ \ \ (1)$

if the knapsack has a solution. Moreover, there will be many pairs that solve (1), when the problem has a solution. Therefore, the entire set of solutions to (1) need not be examined.

There is a general trick for searching a large space of potential solutions, provided there are many solutions: pick a hash function and a random hash value, and search only the potential solutions with that hash value. In their algorithm they pick a number ${M}$ whose size is about the number of solutions to (1):

$\displaystyle \binom{n/2}{n/4}.$

Then, they pick a random ${R}$, and try to find solutions to the equations:

$\displaystyle \begin{array}{rcl} \sigma_{1} &\equiv& R \bmod M \\ \sigma_{2} &\equiv& b-R \bmod M. \end{array}$

The time to search for such a solution is bounded roughly by

$\displaystyle \binom{n}{n/4}/M \approx 2^{0.311n}.$

The details involve a clever use of the Schroeppel-Shamir algorithm to make their hash search work correctly.
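Here is a toy Python sketch of the residue-class filtering (all names and the example instance are mine). Unlike the real algorithm, it enumerates every n/4-element subset up front and checks disjointness of the two halves naively, so it illustrates only the filtering idea, not the claimed running time:

```python
import math
import random
from itertools import combinations

def half_size_knapsack(a, b, trials=100):
    """Toy sketch of the random-residue filtering trick.

    Looks for a subset of exactly n/2 elements summing to b, written as
    sigma1 + sigma2 with each sigma a sum over n/4 elements.  Only sums
    in a random residue class mod M are kept, shrinking both search
    sets by a factor of roughly M.  Returns a solving set of indices,
    or None if no solution was found.
    """
    n = len(a)
    q = n // 4
    # M is about the number of ways one n/2-element solution splits
    # into two n/4-element halves.
    M = math.comb(n // 2, q)
    quarters = [(frozenset(c), sum(a[i] for i in c))
                for c in combinations(range(n), q)]
    for _ in range(trials):
        R = random.randrange(M)
        # Keep only sums congruent to R, resp. b - R, mod M.
        left = [(idx, s) for idx, s in quarters if s % M == R]
        right = {}
        for idx, s in quarters:
            if s % M == (b - R) % M:
                right.setdefault(s, []).append(idx)
        for idx1, s1 in left:
            for idx2 in right.get(b - s1, []):
                if idx1.isdisjoint(idx2):  # the halves must not overlap
                    return idx1 | idx2
    return None

# 1 + 4 + 16 + 64 = 85 uses exactly half of the eight elements.
a = [1, 2, 4, 8, 16, 32, 64, 128]
solution = half_size_knapsack(a, 85)
print(solution is not None and sum(a[i] for i in solution) == 85)
```

In the real algorithm the residue classes are not produced by filtering a precomputed list; the Schroeppel-Shamir machinery generates only the sums in the chosen class, which is where the ${2^{0.311n}}$ bound comes from.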

I hope this captures some of the flavor of what they do. See their well-written paper for the full details.

Open Problems

The obvious question is how low can the exponent be? Can knapsack be done in time

$\displaystyle 2^{o(n)}?$

Another question is whether their ideas can be used to solve other related open problems.

February 5, 2010 2:52 pm

Just curious where you heard about the result, given that Eurocrypt acceptances aren’t posted yet?

February 5, 2010 5:38 pm

I heard about it from the authors.

February 5, 2010 11:22 pm

Does their new approach also work in the general version of the Knapsack problem when there is a weight restriction?

February 6, 2010 4:54 pm

I would see two questions (that can be combined) here:
1) Does it work for integer knapsacks (rather than 0-1 knapsacks)?
2) Does it work for knapsacks where, instead of achieving equality, you want to maximize the knapsack value under a weight condition?

My earlier intuition on these questions would be:
1) For integer knapsacks, where one wants to satisfy Sum(x_i p_i) = Target, something in the same vein is certainly doable. However, I do not know what complexity that would yield. In addition, especially if negative values of p_i are allowed, it might be possible to do better in terms of memory by using cycle-finding techniques.
2) Maybe, since the idea of exploding a single solution into many representations still works. But I do not see how to adapt the use of modular values to deal with inequalities.

Anyway, these are nice open problems.

February 6, 2010 9:42 am

I did link it at the beginning, but it was hard to see. Here it is again: the paper

February 6, 2010 11:01 am

In the linked paper, the authors use an AVL tree to enumerate the sorted sums.

“The main difficulty with this algorithm is that, in order to obtain the overall smallest current sum at each step of the algorithm, one needs to keep the list of sums sorted throughout the algorithm. Of course, it would be possible to sort after each insertion, but that would yield a very inefficient algorithm. Thankfully, by using sophisticated data structure such as balanced trees also called AVL trees, after their inventors Adelson-Velsky and Landis, it is possible to store a sorted list of elements in a way that fits our needs.”

Wouldn’t a priority queue (such as a binary heap) be simpler and as efficient since the only deletions are delete-mins?

February 6, 2010 4:37 pm

Yes, a heap would do.

Note that this use of AVL trees or heaps is in the original Shamir-Schroeppel algorithm. For practical implementations, getting rid of these data structures altogether, as we propose in section 2.2, is preferable, since they are either costly in terms of memory (AVL trees) or cache-unfriendly (heaps).

February 6, 2010 2:23 pm

If $P = NP$ what current hash functions would you recommend?

February 6, 2010 9:07 pm

What if there are no solutions to the problem? The Horowitz and Sahni algorithm takes Theta(2^{n/2}) time in this case. How long would this new algorithm take in this case?

February 7, 2010 5:49 am

The algorithm is a Monte-Carlo algorithm and we are in essentially the same situation as with Solovay-Strassen primality testing. If a solution is found, then it is correct (it corresponds to the case of a composite number, where a witness is given). If no solution is found, then the problem probably has none (similar to the probable prime case). By repeating the algorithm several times (with different random choices), you increase the confidence of the result.

Right now, to prove that there is no solution, you need to run Horowitz-Sahni or preferably Shamir-Schroeppel (to use less memory). It might be possible with a more careful analysis to show that for most choices of knapsack elements (a_1, …, a_n), even the “no solution” answer is correct, but this remains to be done.

February 7, 2010 12:19 pm

Excellent job! Very clever! I am of the opinion that it is impossible to beat Horowitz and Sahni’s time O(2^{n/2}) for determining whether an instance of the SUBSET-SUM problem has a solution or not, in the worst case scenario. I even wrote a brief paper giving an argument why.

http://arxiv.org/abs/cs.CC/0607093

Your result has forced me to rethink it all.

February 7, 2010 12:23 pm

BTW, I found a typo in the definition of density of a knapsack on page two. It should be d=n/max_i \log a_i.

February 10, 2010 10:51 pm

Antoine Joux said, “It might be possible with a more careful analysis to show that for most choices of knapsack elements (a_1, …, a_n), even the “no solution” answer is correct, but this remains to be done.”

I think the key here is the word “most”. A trivial example of the algorithm most likely failing is where (a_1,…,a_n)=(1,1,…,1) and b= n/2, since in this instance, the algorithm (the above version) would most likely return “no solution”. I’m certain that there are lots of nontrivial examples that would be cases of “hard” instances that this 2010 algorithm would not be able to handle better than the Horowitz-Sahni algorithm. But this is a great improvement from what was known before.

I think the upshot here is that improvements to the Horowitz-Sahni algorithm will never handle all instances in better than O(2^{n/2}) time. A more useful analogy for this algorithm is the Fermat's-little-theorem test of whether a^{n-1} = 1 (mod n), rather than Solovay-Strassen primality testing. The Fermat test fails with probability one on some instances; in fact there are infinitely many instances on which it is guaranteed to fail (the Carmichael numbers). The Solovay-Strassen test works for all instances in a probabilistic sense: the probability of failure is at most 1/2 on every instance. The probability of failure of the above algorithm on some instances approaches one as n approaches infinity.

February 9, 2010 4:01 pm

This is a great result, and in hindsight it looks strange that no one thought of it before (great results usually are like that). I (and I would bet that many others) tried to solve Subset Sum faster by taking random mod’s of this and that, but never could get anywhere with it. It’s unlikely this will be the last word on the problem, but it’s the first genuinely new insight in a long time.

May 7, 2010 3:21 am

Update: There is now a full version of the paper available that makes it clear that the algorithm works in the mentioned running time with high probability for a random knapsack instance. The question of improving the worst-case running time remains open.

May 7, 2010 6:19 am

Yes. The original paper was unclear on this point.

The random case is interesting but still does not break the worst-case bound. So the search is still on for a better algorithm for all knapsacks.

July 12, 2010 5:34 am

I have a Polynomial time algorithm for the Knapsack problem that handles 1 and 2 dimensional data, unbounded, decimal or integer and that I’ve tested on 10,000 values.

I have sample data sets on my website in case anyone would like to compare their results. All comments welcome!

Regards, Walt