An exponential algorithm for knapsack

Alan Perlis won the first Turing Award for his pioneering work on computer languages. He helped create the now famous department of computer science at Carnegie-Mellon University, then later in his career he created another department at Yale University. Alan was always smiling even through he spent much of his adult life in a wheel chair. He delighted in all things, especially anything about computing. You could walk in his office anytime and chat about just about anything. My connection with Alan was as a graduate student at Carnegie-Mellon, and later as a junior faculty at Yale when he was the chair there.

Perlis was famous for many “sayings”. He talked about “one man’s constant is another’s variable,” he coined the term “Turing Tar-pit”. The latter referred, of course, to the fact that just about anything interesting that one wanted to do with computers was undecidable. He talked about the “IBM fog descending over all”–recall this was when IBM was the dominate player in computers.

## Polynomial vs. Exponential Algorithms

Alan had one saying that is quite relevant to P=NP: he once said, “for every polynomial-time algorithm you have, there is an exponential algorithm that I would rather run”. His point is simple: if your algorithm runs in $n^{4}$ than an algorithm that runs in $n2^{n/10}$ is faster for $n=100$. While this is a simple point I think that it is a profound one. When we talk about P=NP we must not forget that the actually running time of the algorithm is the key. For example, if there is a factoring algorithm that runs in time $n2^{n/100}$ such an algorithm would easily break all today’s factor based crypto-systems. Yet it’s asymptotic performance would pale when compared to the best known factoring methods.

## An Exponential Algorithm for Knapsack

One of my favorite algorithms is one for the knapsack problem that runs in time exponential in $n$, but the exponential term is $2^{n/2}$ instead of the obvious $2^{n}$. This is a huge improvement over the obvious method. I have no idea who this algorithm is due to. I have known about it for decades–I would be greatly interested in any reference to it. But please to be the “first” the reference has to be before 1974, at least.

Recall the knapsack problem is the question of whether or not given the positive integers $a_{1}, \dots a_{n}$, there is a subset $I$ of the indices $1$ to $n$
so that
$\sum_{i \in I} a_{i} = b$
where $b$ is another given positive integer. The obvious algorithm would try all subsets $I$. Since there are $2^{n}-1$ non-empty subsets the algorithm takes a long time.

The cool algorithm makes one small observation. The above equation can be re-written as follows:
$\sum_{i \in I} a_{i} = b - \sum_{j \in J} a_{j}$
where $I \subseteq \{1, \dots, n/2\}$ and $J \subseteq \{n/2+1, \dots, n\}$. (Yes, we assume that $n$ is even, but if $n$ is odd the method can easily be fixed.) Then, the algorithm computes two sets. The first consists of all the values of
$\sum_{i \in I} a_{i}$
where $I \subseteq \{1, \dots, n/2\}$; and the second consists of all values of
$b - \sum_{j \in J} a_{j}$
where $J \subseteq \{n/2+1, \dots, n\}$. We then simply check to see if these two have a value in common. The point is that they have a value in common if and only if the knapsack problem has a solution. The punch-line is that the cost for this is now dominated by the number of values in each set: $2^{n/2}$. Note, checking if they have a common value can be done either by hashing or by sorting. But, in either case, this takes time just slightly larger than the sum size of the sets. Very neat, I think.

## Open Problems

The above algorithm relies on the following idea. Suppose that $A$ and $B$ are finite sets of integers, and define $A+B = \{a+b \mid a \in A \text{ and } b \in B\}$. Then, determining whether or not $x$ is in $A+B$ can be done in time $|A|+|B|$ if hashing is used. (If you use sorting the time goes up by a log-term.) Note, this is potentially much faster than a brute force search of the set $A+B$.

A natural problem is what happens with a set like $A+B+C$? Is there a way to tell if $x$ is in this set in time $|A|+|B|+|C|$, for example? If this is possible, then the knapsack problem has an algorithm whose exponential term is $2^{n/3}$. I have thought about this and related approaches quite often. I have yet to discover anything, I hope you have better luck.

February 27, 2009 4:22 pm

My friend Stephan Mertens points me to the following reference:

@Article{horowitz:sahni:74,
author = {E.~Horowitz and S.~Sahni},
title = {Computing partitions with applications to the
{K}napsackproblem},
journal = jJACM,
year = {1974},
volume = {21},
number = {2},
pages = {277-292}
}

July 4, 2009 1:08 pm

This problem is usually called SUBSET-SUM. G. Woeginger in his paper “Open problems around exact algorithms” cites [horowitz:sahni:74] when presenting this algorithm.

3. July 6, 2009 9:04 pm

The last problem you mention — given sets A, B, and C, and an integer x, determine whether x is in the set A + B + C — is equivalent to the well known 3SUM problem, which is widely believed to require almost-quadratic time.

July 6, 2009 9:06 pm

I agree. But the whole question is can we beat that?

November 22, 2011 11:09 am

You can’t beat 2^{n/2} because there are only two sides to every equation. If there were three sides to an equation, you could get 2^{n/3} time. Whatever you do algebraically to the knapsack problem equation, you will always get at least 2^k expressions on one side and at least 2^{n-k} expressions on the other side for some k=1,…,n. Minimize 2^k+2^{n-k} and you get the order of 2^{n/2} expressions on both sides.

November 21, 2011 7:49 am

Am I missing something? I do not understand the knapsack algorithm as described in the post. Given elements {3,4,5,8}and I want to see if there is a subset of elements that sums up to “7”.

So, I would end up seeing if there is a common element in following sets: (I divided the initial set into sets: {3,4} and {5,8}.

{3,4,7}
{2,-1,-6} = {7-5,7-8,7-13}

So, there are no elements contained in both sets, but the subset {3,4} sums up to 7.

I am sure I am missing something. It would be nice if you could enlighten me or point me in the right direction.

November 21, 2011 6:19 pm

I think you are missing the empty subsets. So you get
{0,3,4,7} and {7,2,-1,-6} which have a common element.

November 21, 2011 8:52 pm

Thank you, I probably should have paid more attention when reading the post, but think that I really learned something from my mistake. Probably more about how easy it is to be fooled by incidental formulations than about the algorithm.
I made the assumption that the formulation: “The obvious algorithm would try all subsets I. Since there are 2^{n}-1 non-empty subsets the algorithm takes a long time.” ,suggested that the empty set was not relevant for the algorithm. While this seems to be true for the “obvious algorithm” (as long as we stick to b being a positive integer – for b = 0 maybe the empty set would be a valid solution?), I managed to stick with that assumption for the whole rest of the post, even when a different algorithm was discussed.

“For example, if there is a factoring algorithm that runs in time n2^{n/100} such an algorithm would easily break all today’s factor based crypto-systems.”
I agree that an exponential algorithm is not necessarily useless. But is there not a fundamental difference in the fact that it would be relative easy to adjust the cryptosystems by using numbers just big enough to counter such a fast exponential algorithm. (just because it does not scale very good)

A polynomial algorithm fast enough to break today´s cryptosystems, I believe, could not so easily be countered by the very same methods.
Am I seeing this wrong?

• November 23, 2011 12:38 am

Thanks to both of you for working out these examples further—I’ve sometimes struggled with half-developed examples in books, and I’d have been thrown off by the 2^n – 1 thing too.

If Mr. (Dr.?) Johansen doesn’t mind my saying so, it’s a bit hard for him to miss the empty-set case :-). His Twitter feed also includes a touching tribute to Dennis Ritchie here.

November 23, 2011 2:38 am

Now I got really confused here, because I don’t have a Twitter account! Apparently there are at least three twitterers sharing my name, and I found your link on one of their feeds. Even more confusing, I did recognize the site you linked to, as I also dabble in esolangs. 🙂

• November 23, 2011 12:42 pm

That is astounding! I thought the esoteric languages connection was a positive identification of who you were… I was mulling the idea of saying something about “UnLambda” in relation to work I did with Brian Postow and the late Carl Smith which can be found on my pubs page, but it was late and I kept it short. The Twitter feed in question is here, which is not his, but his page on esoteric languages is this.