NP-Completeness (NPC)

Introduction

Computer scientists have been trying to provably distinguish the problems that exist in the set NP – P from P. This task has baffled the computer science community for a long while now. Ever since 1971, when Stephen Cook presented his paper The Complexity of Theorem Proving Procedures the computer science community has taken one step closer to showing that a problem L exists in NP – P. This kind of problem is called an NP-complete problem. NPC problems are a class of extremely difficult problems to solve (in the worst case), in fact they are considered to be the hardest problems in NP. NPC theory provides techniques for showing that a given difficult problem can be a member of this class and is just as hard as any other problem in this class, primarily so that one can make a wise choice for an approximate solution instead of an exact solution.

The two diagrams on the right help show where NPC is with respect P, NP and undecidable problems. The entire premise behind NPC is the complexities of many hard problems in NP are related in such a way that if one NPC problem can be found with only a polynomial time bounded solution then all NPC problems can have a polynomial time bounded solution, thus proving P = NP. So far no NPC problems have a polynomial time bounded solution thus suggesting that P ê NP. We ought to believe this to be true until one is found.

 

 

NPC

 

NP

 

Tractable Problems

(poly-time algorithms)

 

Intractable Problems

(super-poly algorithms

exist, possibly NPC)

 

Undecidable

Problems

(no algorithm exists)

 
Octagon:

P

 

Moreover, NPC problems are applicable only towards decision problems and not optimization problems. However, an optimization problem can be cast to a decision problem by setting a bound on the value that is to be optimized. NPC problems are interesting because they exist in real world applications. For instance a traveling salesman, who must visit many cities, each only once and end at the city that he started from, wants to find the minimum length in total distance that he has to travel. This problem being an optimization problem can be cast to a decision problem. For instance, given a certain boundary B, is the total length of the traveling salesman's tour less than B? This problem, known as the Traveling Salesman problem, is NPC.

Cobham-Edmonds

In a nutshell, NPC deals with the notion of the required computational resources of reducing one NPC problem to another one in polynomial bounded time via a deterministic computation model such as a Turing Machine. The Cobham-Edmonds Thesis provides the definition of "feasible computation", which is a problem computable in polynomial bounded time. Theoretical considerations exist that make some problems appear unlikely that any polynomial-time solution will be discovered for problems that may have inefficient solutions. Let's first state the Cobham-Edmonds Thesis for computational feasibility:

The problem of determining whether a given string is an element of a given language L is tractable iff L ë P.

NPC Proof

To show that a problem L ë NPC, an attempt is made to show that no efficient solution exists. There are four main steps to proving a problem L ë NPC:

1. Prove L ë NP
2. Select an already known NPC problem La
3. Prove L ë NP-hard by describing a transformation¦, where La µ Lb, and
4. Verify that the algorithm computing ¦ runs in poly-time

Existence in NP

Showing that L ë NP is usually easy, because when given an assignment of the input, it is easy to verify that answer to the decision problem. Lets look at a very simple example to emphasize the point. Suppose we are given the following problem:

General Question: Is there a positive integer m such that N = 4m?
General Problem Instance: A positive integer N.

Suppose that this is an NP decision problem, which it is not but is used here for demonstration purposes only then we would want to verify this. A non-deterministic machine (NDM) can guess m and check in poly-time weather or not m * 4 = N. For instance, if N = 12, our first guess may be 0, our second guess may be 1, and then 2, then finally 3. Then N = 12 is a "yes" instance. A "no" instance would be N = 10, because (a NDM would guess) no value for m exists that when multiplied by 4 equals 10.

The idea of the NDM is that it simultaneously computes all guesses and one of them is the correct one, hence the power of a NDM. So, supply an (general or abstract) answer (a guess, or a certificate) and check it. If this verification can be done in poly-time then it is in NP. This part of the proof is usually omitted, since it is usually trivial to show L ë NP.

NP-Hard, NPH, (via Polynomial Reducibility)

The main technique for showing that two (NPC) problems are related to each other is polynomial reducibility. Reducibility is constructing a transformation that maps all instances of the first problem to all equivalent instances of the second problem. For instance, when problem P1 is (poly) reduced to another problem P2, P2 is then at least as hard as P1. Any solution to P2 immediately yields a solution to P1. So, P1 is no harder than P2. In other words, this reducible transformation allows for the conversion of an algorithm solving the second problem into an algorithm that can solve the first problem. When a problem is efficiently reduced to a second problem and a solution can be found to the second problem, that solution can be efficiently used to solve the original problem.

Stephen Cook and Leonid Levin discovered that certain problems in complexity class NP are related to the complexity of the entire class, such that if a polynomial bounded time solution exists for any of these certain problems then all problems in NP can be solved in polynomial bounded time, which would determine that P = NP. Ultimately, the Cook-Levin theorem states that once we know one NPC problem then we can obtain other NPC problems by polynomial time reducing from it. So, we have that if L1 µ L2, then if L2 ë P, then L1 ë P, and if L1 ì P, then L2 ì P, likewise. The proof gives light to the property that if one poly-time solution is found for an NPC problem then all of the NPC problems, because they are poly related, have poly-time solutions suggesting that P = NP. The proof constructs a poly-time machine M1 that recognizes L1, by composing a poly-time machine Mf that computes f and M2 that recognizes L2. The following diagram shows the conceptual view that M1 computes x ë L1 iff ¦(x) ë L2:

 

 

 

 

 


Let's now take a look at a definition of polynomial reducibility:

Problem La is polynomial-time reducible to problem Lb, which can be written as La ²p Lb, iff $ a Turing-computable polynomial time function ¦ such that for any input instance x, x ë La ó ¦(x) ë Lb.

Mapping instances in poly-time from one (NPC) problem to another (NPC) problem is not always simple. Sometimes it takes creativity to discover the relationship between two NPC problems. Some NPC problems are more suitable for this poly transformation process, for instance, poly-mapping LSAT to L3SAT, because both problem instances use OR clauses. There are basically three parts to poly reduction:

The more formal and more powerful definition of NP-hard (NPH) is Turing reducibility in using an Oracle Turing machine, OTM. An OTM is much like a deterministic TM, except that given an additional oracle tape, when the machine enters the oracle state, say qc, then the oracle computes an output all in one step, for a corresponding input that begins on its tape. Conceptually, the TM consults an oracle, which computes a hypothetical function. The idea behind NPH vs. NPC is that NPC problems are poly-time solvable iff P = NP, where as NPH problems cannot be solved in poly-time period unless P = NP.

1. Specify a function ¦ that maps each instance of La to a corresponding instance Lb.
2. Prove that function ¦ satisfies x ë La¦(x) ë Lb.
3. Prove that function ¦ satisfies x ë La¦(x) ë Lb.

Lets do a simple example to illustrate the point of poly-reducibility:

Suppose we want to show that the Traveling Salesman (TS) problem above is NPC. We then have to choose an existing NPC problem that is well suited for a poly mapping. We choose the Hamiltonian Circuit (HC) below. An instance to the TS problem is a set C of cities, a set of distances D: C x C → N, (where N is the set of natural numbers), and a bound B. So the problem of TS would be, is there a tour where the salesman visits each city c ë C only once, ending at the first city, such that the total distance d ë D ² B?

Well, the poly-time transformation is simply as follows:

Part 1: (construct ¦)

Let C = V.

Let d(α, β) : α, β ë C, be a distance between 2 cities, where d(α, β) = 1 if α, β ë E, and 2 otherwise.

Let bound B = |V|.

For each of the m(m - 1) / 2 distances d(α, β), we only need to examine G to see whether or not {α, β} is an edge in E, thus function ¦ runs in poly-time.

Part 2: (HC to TS)

For the first requirement, if LHC is yes on G, then there is a tour of cities of length B, since each distance traveled in the tour corresponds to an edge of G, thus has length 1.

Part 3: (TS to HC)

For the converse requirement, if LTS is yes on (C, D, B), then from the construction of d we have all the edges for LHC.

NPC

Now, we can formally define NPC:

If L ë NP and L ë NP-hard, then L is said to be NPC.

In other words: If problem L is a member of NP and every L¢ ²p L, where L¢ is NPC, then L is NPC.

The First NPC Problem

Satisfiability (SAT)

SAT was the first problem to be proved NPC. This is known as Cook's theorem. The problem instance is a set of variables U, and a collection of clauses C over U. The question then becomes is there a truth assignment that satisfies C? The proof is rather lengthy and uses the existence of a nondeterministic Turing machine to represent the general language L ë NP, so that L is reducible to LSAT.

More NPC Problems

The following are probably the most popular NPC problems that are used for proving new NPC problems, primarily because they were in the first set of 21 problems that came about in 1972.

1. Three-Satisfiability (3SAT)

General Question: Does there exist a truth assignment for U that satisfies all clauses in C?
General Problem Instance: Collection of clauses C = {c1, c2, ..., cm} over a finite set of variables U, where |ci| = 3 such that 1 ² i ² m.

2. Three-Dimensional Mapping (3DM)

General Question: Does S contain a S¢ , where S¢ ê S and |S¢ | = k, and no two members of S¢ have the same coordinate?
General Problem Instance: A set S
ê X x Y x Z, such that X, Y and Z are disjoint sets where |X| = |Y| = |Z| = k.

3. Vertex Cover (VC)

General Question: Does there exist a VC of size K ² J for G, such that V¢ ê V, |V¢ | ² K, and for each edge {α, β} ë E, at least one of α and β belongs to V¢ ?
General Problem Instance: A graph G = (V, E), and a J ² |V|, such that J
ë Z+.

4. Clique

General Question: Is there a clique in graph G of size J ³ K, where V¢ ê V, |V¢ | ² J, and every two vertices in V¢ are joined by an edge in E?
General Problem Instance: A graph G = (V, E), a K ² |V|, such that K
ë Z+.

5. Hamiltonian Circuit (HC)

General Question: Is there a HC for G, with an ordering of vertices <v1, v2, ..., vm> of G, such that m = |V|, {vm, v1} ë E and {vj, vj+1} ë E where 1 ² j ² m, " j?
General Problem Instance: A graph G = (V, E).

6. Partition

General Question: Is there an S¢ ê S, where Σ f(s) " s ë S¢ = Σ f(s) " s ë S - S¢ ?
General Problem Instance: A finite set S, and
" s ë S we have size f(s) ë Z+.

Conclusion

It is important for algorithm designers to become familiar with class of NPC problems because proving that a problem is NPC would then lead you to find an approximation algorithm otherwise you could waste your time in finding an algorithm to solve the problem exactly. If P í NP then no NPC have poly solutions. It appears, based on the identification of many NPC problems, that P í NP or that no one is able to prove that an NPC problem exists in P. Since there is no proof either way, we're left with circumstantial evidence that strongly suggests P ê NP. So, this implies that NPC problems are solvable in poly-time by a nondeterministic (Turing) machine and not by a deterministic one. Thus, by the Cobham-Edmonds thesis, we can say that NPC problems are intractable. Currently, for over 30 years now, with attempts by many researchers in the field at finding the answer, the general consensus is that a polynomial bounded time solution to any of these problems would have been found by now, which leaves one of the most significant questions of computer science open-ended.

Links

http://www.nd.edu/~cholak/computability/computability.html
http://www.busygin.dp.ua/npc.html
http://www.princeton.edu/~kazad/resources/cs/npcomplete.htm
http://www.cs.virginia.edu/~robins/cs660/cs660_slides_144_157.pdf
http://www.nada.kth.se/~viggo/problemlist/compendium.html
http://www.nada.kth.se/~viggo/wwwcompendium/wwwcompendium.html
http://www.cs.duke.edu/education/courses/cps130/spring02/lectures/lecture24.pdf
http://en.wikipedia.org/wiki/NP-hard
http://www.cs.pdx.edu/~york/cs350/NPC1.pdf