
Knapsack problems are easily solved by dynamic programming. Dynamic programming runs in polynomial time; that is why we do it, right?

I have read it is actually an NP-complete problem, though, which would mean that solving the problem in polynomial time is probably impossible.

Where is my mistake?

Raphael
Strin
  • Keep in mind that DP is polynomial in the "table size". The table is exponentially large for Knapsack (see Kaveh's answer). – Raphael Mar 31 '12 at 07:13

3 Answers


The Knapsack problem is $\sf{NP\text{-}complete}$ when the numbers are given as binary numbers. In this case, the dynamic programming algorithm takes exponentially many steps (in the size of the input, i.e., the number of bits in the input) to finish.$^\dagger$

On the other hand, if the numbers in the input are given in unary, the dynamic programming algorithm runs in polynomial time (in the size of the input).

Problems of this kind are called weakly $\sf{NP\text{-}complete}$.
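
To make this concrete, here is a minimal Python sketch of the standard dynamic program (the function name and example numbers are illustrative, not from the original answer). The table has $W+1$ entries per item, so its size is polynomial in the value $W$ but exponential in the number of bits of $W$:

```python
def knapsack_01(values, weights, W):
    """Classic O(n*W) dynamic program for the 0-1 knapsack problem.

    The dp array has W+1 entries, so the running time is polynomial
    in the *value* W but exponential in the bit-length of W.
    """
    dp = [0] * (W + 1)  # dp[w] = best value achievable with capacity w
    for value, weight in zip(values, weights):
        # Iterate capacities downward so each item is used at most once.
        for w in range(W, weight - 1, -1):
            dp[w] = max(dp[w], dp[w - weight] + value)
    return dp[W]

# Example: items with values 60, 100, 120 and weights 10, 20, 30.
print(knapsack_01([60, 100, 120], [10, 20, 30], 50))  # prints 220
```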

$\dagger$: Another good example for understanding the importance of the encoding used for the input is the usual primality-testing algorithm that goes from $2$ up to $\sqrt{n}$ and checks whether any of these numbers divides $n$. This is polynomial in $n$ but not necessarily in the input size: if $n$ is given in binary, the input size is $\lg n$ and the algorithm runs in time $O(\sqrt{n}) = O(2^{\lg n/2})$, which is exponential in the input size. The computational complexity of a problem is always measured with respect to the size of the input.

An algorithm of this kind, i.e., one that is polynomial in the largest number appearing in the input but exponential in the input length, is called pseudo-polynomial.
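
For concreteness, here is the footnote's trial-division test sketched in Python (the function name is illustrative):

```python
import math

def is_prime_trial_division(n):
    """Primality test by trying every divisor from 2 up to sqrt(n).

    Performs O(sqrt(n)) divisions: polynomial in the value n, but
    exponential in the input size lg(n) when n is written in binary.
    """
    if n < 2:
        return False
    for d in range(2, math.isqrt(n) + 1):
        if n % d == 0:
            return False
    return True

# For a 64-bit n the loop may run about 2**32 times, even though
# the input itself is only 64 bits long.
```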

Kaveh
  • But think about the objects to be put in the knapsack. The objects need to be part of the input, and the input size is polynomial in the number of objects. If there are enough objects, the input is polynomial in the size of the problem. So why can't I say that the Knapsack problem is in P in terms of table size? Am I wrong? – Strin Apr 01 '12 at 16:20
  • @Strin, no, a small number of objects can be sufficient to fill a large knapsack, e.g., if the size of the knapsack is $m$, one object of size $m$ is sufficient. The size of the input is roughly $2\lg m$, much smaller than $m$. (I am assuming that we are talking about 0-1 Knapsack.) – Kaveh Apr 02 '12 at 01:21
  • Can you break the input down into smaller inputs whose binary encodings are small enough for the algorithm to finish in polynomial time, and then combine the solutions? – Char May 16 '12 at 09:13
  • @Kaveh "The size of the input is roughly 2 lg m" I don't understand where you get that part from. The relationship between m (pack size) and n (num of items) is totally unknown, right? And re "when the numbers are given as binary numbers"... but couldn't you say that for anything? With most algorithms, we talk about input size in base 10. Why talk about binary here? And whether you encode in binary, octal, decimal, etc... the actual number of times you iterate through your main algorithm loop is directly dependent on both n and W. – The111 Jun 29 '13 at 23:09
  • @Char, depends on the details but in general greedy heuristics like what you describe are unlikely to give an optimal answer. – Kaveh Jun 30 '13 at 04:36
  • @The111, the point is not binary vs. decimal, it is binary vs. unary. Other bases like decimal only change the size of objects by a constant factor relative to binary. Binary is how inputs are encoded on digital computers as bits, which is why I prefer to use it; however, nothing changes if you use decimal. Unary encoding, on the other hand, causes an exponential increase relative to binary encoding. $2 \lg m$ is roughly the number of bits needed to encode two numbers $\leq m$: the size of the knapsack and the size of the object. – Kaveh Jun 30 '13 at 04:39
  • @Kaveh Still not getting the connection. A complete scan of an array of size N for example takes O(N). This is despite the obvious fact that if we store N in our program, the computer will use binary to do so. Should we say the scan is exponential in N because only lg N bits are needed to store N? No, because you still need to make N distinct stops in your scan. I'm not sure why you consider the knapsack problem different. Why would unary even enter the discussion? What algorithm anywhere uses unary to represent N? – The111 Jun 30 '13 at 05:34
  • From what I've read today, it seems to me the reason the O(w * N) dyn table build is not polynomial is because w and N are both variables that can increase with bigger problems, and more specifically, one of them could be exponentially larger than the other. But in practice, they usually aren't, so the algorithm usually runs in polynomial time. – The111 Jun 30 '13 at 05:36
  • @The111, you are thinking in terms of the RAM model; complexity classes like $\mathsf{NP}$ are defined using Turing machines. Reading $N$ numbers cannot be done in $O(N)$ on a Turing machine: the total size of $N$ numbers can be arbitrarily large and cannot be bounded by any function of $N$ (not even an exponential one). Dynamic programming is polynomial time in the size of the input when $w$ is small relative to the total size of the input, but that is essentially equivalent to saying that we can encode the numbers in unary. – Kaveh Jun 30 '13 at 05:38
  • @Kaveh Ok I will admit I'm not that familiar with the model standard you refer to. But it seems boggling that you claim with this model reading a 1 x N array cannot be bounded even by an exponential... yet a w x N 2-D array is bounded exponentially? How the heck does that work? Wouldn't 2-D in N be more complex than 1-D in N... in any model standard? – The111 Jun 30 '13 at 05:43
  • @The111, I think it is better if you post that as a new question and I will post an answer. I think your question is more fundamental, and these comments are not very related to this question. – Kaveh Jun 30 '13 at 05:45
  • Ok, done. http://cs.stackexchange.com/questions/12981/turing-machines-and-complexity-1-d-array-vs-2-d Thanks. :-) – The111 Jun 30 '13 at 05:49

The main confusion lies in the difference between "size" and "value".

"Polynomial Time" implies polynomial w.r.t the size of input.

"Pseudopolynomial Time" implies polynomial w.r.t the value of the input. It can be shown (below) that this is equivalent to being exponential w.r.t the size of the input.


In other words: Let $N_{size}$ represent the size of the input and $N_{val}$ represent the value of the input.

Polynomial Time: $O(N_{size}^x)$ for $x\in\mathbb{N}$

Pseudopoly. Time: $O(N_{val}^x)$ for $x\in\mathbb{N}$

Now, the knapsack problem has a pseudopolynomial, not polynomial, solution, because the running time of the dynamic programming solution depends on a value: it is $O(nW)$, where $n$ is the number of items and $W$ is the maximum capacity.

Now, a value can be converted into a size by counting the number of digits it takes to write it down. $N_{size}=\log_b(N_{val})$ tells you how many digits are needed to represent $N_{val}$ in base $b$. Solving for $N_{val}$ gives:

$$N_{val}=b^{N_{size}}$$

Plugging this into the pseudopolynomial time definition shows that it is exponential w.r.t. $N_{size}$:

Pseudopoly. Time: $O(b^{xN_{size}})$ for $b, x\in\mathbb{N}$
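
As a rough numeric illustration of this conversion (a hypothetical snippet, not part of the original answer), with base $b = 2$:

```python
# N_val = b**N_size for b = 2: adding one bit to the capacity W
# doubles the width of the O(nW) knapsack table.
for n_size in (8, 16, 32):
    W = 2**n_size - 1  # largest capacity expressible in n_size bits
    assert W.bit_length() == n_size
    print(f"N_size = {n_size:2d} bits -> table width W+1 = {W + 1:,}")
```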

bcorso
  • Created an account here just to say thank you so much! Only after your example did I finally understand it. – Inoryy Mar 06 '14 at 06:03
  • Your answer beats everyone, bravo! – Muhammad Razib Dec 01 '15 at 05:48
  • To add to this great answer: if we change $W$ from 100 to 101, the size of the problem does not increase; the size increases when we add another bit to $W$, which makes it twice as big, so the table would have twice as many rows. Hence increasing the size by one doubles the running time, which is why it is exponential. – Amen Jul 26 '17 at 22:44
  • @bcorso Suppose you are given a value $N$ and you had to find the sum of the numbers from $1$ to $N$ using a for loop; would that be a pseudopolynomial-time algorithm? – DollarAkshay Jul 21 '19 at 17:01

The Knapsack problem as defined in Karp's paper is NP-complete, since there is a reduction from another NP-complete problem (Exact Cover, in this case) to Knapsack. This means that there is no polynomial-time algorithm that solves all instances of the Knapsack problem, unless $\text{P}=\text{NP}$.

There are, however, different variants (e.g., 0-1 Knapsack and others) that may or may not have polynomial-time solutions or good approximations. But this is not the same as the general Knapsack problem. Also, there might be efficient algorithms that work for specific (families of) instances, but these algorithms will take longer on other instances.

Juho
Ran G.
  • Logged in just to downvote: the question asks exactly about the conundrum posed by P = NP, despite the illusion of Knapsack running in poly time using dynamic programming. Restating that Knapsack cannot be in P because it is NP-complete doesn't help. – Sid Sarasvati Apr 05 '20 at 23:28