I ended up writing this for someone on StackOverflow; thought I’d preserve it here too.

We start with the idea of a decision problem, a problem for which an algorithm can always answer “yes” or “no.” We also need the idea of two models of computer (Turing machine, really): deterministic and non-deterministic. A deterministic computer is the regular computer we always thinking of; a non-deterministic computer is one that is just like we’re used to except that is has unlimited parallelism, so that any time you come to a branch, you spawn a new “process” and examine both sides. Like Yogi Berra said, when you come to a fork in the road, you should take it.

A decision problem is in P if there is a known polynomial-time algorithm to get that answer. A decision problem is in NP if there is a known polynomial-time algorithm for a non-deterministic machine to get the answer.

Problems known to be in P are trivially in NP — the nondeterministic machine just never troubles itself to fork another process, and acts just like a deterministic one. There are problems that are known to be neither in P nor NP; a simple example is to enumerate all the bit vectors of length n. No matter what, that takes 2^{n} steps.

(Strictly, a decision problem is in NP if a nondeterministic machine can arrive at an answer in poly-time, and a deterministic machine can *verify* that the solution is correct in poly time.)

But there are some problems which are known to be in NP for which no poly-time deterministic algorithm is known; in other words, we know they’re in NP, but don’t know if they’re in P. The traditional example is the decision-problem version of the Traveling Salesman Problem (decision-TSP): given the cities and distances, is there a route that covers all the cities, returning to the starting point, in less than *x* distance? It’s easy in a nondeterministic machine, because every time the nondeterministic traveling salesman comes to a fork in the road, he takes it: his clones head on to the next city they haven’t visited, and at the end they compare notes and see if any of the clones took less than *x* distance.

(Then, the exponentially many clones get to fight it out for which ones must be killed.)[1]

It’s not known whether decision-TSP is in P: there’s no known poly-time solution, but there’s no proof such a solution doesn’t exist.

Now, one more concept: given decision problems P and Q, if an algorithm can transform a solution for P into a solution for Q in polynomial time, it’s said that Q is *poly-time reducible* (or just reducible) to P.

A problem is NP-complete if you can prove that (1) it’s in NP, and (2) show that it’s poly-time reducible to a problem already known to be NP-complete. (The hard part of that was proving the *first* example of an NP-complete problem: that was done by Steve Cook in Cook’s Theorem.)

So really, what it says is that if anyone ever finds a poly-time solution to one NP-complete problem, they’ve automatically got one for *all* the NP-complete problems; that will also mean that P=NP.

A problem is NP-hard if and only if it’s “at least as” hard as an NP-complete problem. The more conventional Traveling Salesman Problem of finding the shortest route is NP-hard, not strictly NP-complete.

Footnotes:- Or, of course, the first clone into home-base first checks to see if their distance was less than
*x*, hangs the yes or no sign, and then machine guns all the late arrivals. Simultaneous arrivals are left as an exercise. [↩]

## { 2 } Comments

A problem is NP-complete if you can prove that (1) it’s in NP, and (2) show that it’s poly-time reducible to a problem already known to be NP-complete.

didn’t you just put inside the definition of NP-Complete the term itself to define? inside the definition you use NP-Complete, so what is the first NP-Complete problem?

Ooh, excellent question. Steve Cook was the first person to define NP-complete problems, and you’re exactly right. What Cook’s theorem originally said was that there exists a class of problems which are (1) in NP, and (2) which can be reduced in polynomial time to an instance of boolean satisfiability. So this definition can be restated as defining the class of problems which are in NP and which are reducible to boolean satisfiability. Since polynomial-time reducibility is closed under composition — that is, if X is poly-time and Y is poly-time, the Y(X) is also poly-time — all that’s necessary to prove any particular problem is in the class is to show that the problem is in NP and can be reduced in poly-time to any known NP-complete problem.

## { 4 } Trackbacks

[...] What are P, NP-Complete, and NP-Hard? | Explorations We start with the idea of a decision problem, a problem for which an algorithm can always answer “yes” or “no.” We also need the idea of two models of computer (Turing machine, really): deterministic and non-deterministic. A deterministic computer is the regular computer we always thinking of; a non-deterministic computer is one that is just like we’re used to except that is has unlimited parallelism, so that any time you come to a branch, you spawn a new “process” and examine both sides. Like Yogi Berra said, when you come to a fork in the road, you should take it. (tags: np-complete np-hard) [...]

[...] What are P, NP-Complete, and NP-Hard? | Explorations We start with the idea of a decision problem, a problem for which an algorithm can always answer “yes” or “no.” We also need the idea of two models of computer (Turing machine, really): deterministic and non-deterministic. A deterministic computer is the regular computer we always thinking of; a non-deterministic computer is one that is just like we’re used to except that is has unlimited parallelism, so that any time you come to a branch, you spawn a new “process” and examine both sides. Like Yogi Berra said, when you come to a fork in the road, you should take it. (tags: np-complete np-hard) [...]

[...] What are P, NP-Complete, and NP-Hard? [...]

[...] Blogs that have covered this topic: Explorations To Imaging, and [...]

## Post a Comment