In computational complexity theory, a problem is NP-complete if it is in NP, and it is NP-hard. Informally, a problem is in NP if there exist an efficient algorithm, a polynomial time algorithm, that can verify the solution to this problem. It is NP-hard if every problem in NP can be reduced to it in polynomial time. In other, NP-hard problem is at least as hard as the hardest problem in NP. A problem that is NP-Hard does not necessary belongs to NP.

The Boolean satisfiability problem (SAT) asks to determine if a propositional formula (example depicted) can be made true by an appropriate assignment of truth values to its variables. While it is easy to verify whether a given assignment renders the formula true,[1] no essentially faster method to find a satisfying assignment is known than to try all assignments in succession. Cook and Levin proved that each easy-to-verify problem can be solved as fast as SAT, which is hence NP-complete.

The name "NP-complete" is abbreviation for "Nondeterministic Polynomial-time Complete". In this name, "Nondeterministic Polynomial-time" refers to the complexity class of Decision problems that can be decided in polynomial number of steps using nondeterministic Turing machines, a Turing machine that have an nondeterministic transition function. "Complete" refers to the property of being able to simulate every problem in a given complexity class.

The set of NP-complete problems is often denoted by NP-C or NPC.

Although a solution to an NP-complete problem can be verified "efficiently", there is no known algorithm till now that decides NP-Complete problems efficiently. That is, the Time complexity required to decide the problem by any currently known algorithm, so far, increases rapidly as the size of the problem grows.

Knowing if an efficient algorithm exists to decide NP-Complete problem is a major unsolved problems in computer science, called the P versus NP problem. Since NP-complete problems are very common and frequent in several fields, several coping mechanism and algorithm techniques has been developed such the using heuristic methods, approximation algorithms, and Fixed-parameter algorithms.

Formal Definition edit

We define a language   as subset of binary strings from all possible binary string combinations. We say a language   is in NP if there exist a polynomial time Turing machine   that takes two binary strings, usually called the verifier of  , such for every binary string  ,

 

  is a polynomial in size of  , and   is called the certificate for   with respect to the language   and machine  

Given two languages  , we say   is a Karp Polynomial-time reducible to  , If there exists a polynomial-time computable function   such that if   then   We denote this fact by  .

A language   is NP-hard if for every  , we have  . A language   is NP-complete if it is NP-hard and  .

Background edit

 
Euler diagram for P, NP, NP-complete, and NP-hard set of problems. The left side is valid under the assumption that P≠NP, while the right side is valid under the assumption that P=NP (except that the empty language and its complement are never NP-complete, and in general, not every problem in P or NP is NP-complete)

The concept of NP-completeness was introduced in 1971 (see Cook–Levin theorem), though the term NP-complete was introduced later. At the 1971 STOC conference, there was a fierce debate between the computer scientists about whether NP-complete problems could be solved in polynomial time on a deterministic Turing machine. John Hopcroft brought everyone at the conference to a consensus that the question of whether NP-complete problems are solvable in polynomial time should be put off to be solved at some later date, since nobody had any formal proofs for their claims one way or the other. This is known as "the question of whether P=NP".

Nobody has yet been able to determine conclusively whether NP-complete problems are in fact solvable in polynomial time, making this one of the great unsolved problems of mathematics. The Clay Mathematics Institute is offering a US$1 million reward to anyone who has a formal proof that P=NP or that P≠NP.

The Cook–Levin theorem states that the Boolean satisfiability problem is NP-complete. In 1972, Richard Karp proved that several other problems were also NP-complete (see Karp's 21 NP-complete problems); thus there is a class of NP-complete problems (besides the Boolean satisfiability problem). Since the original results, thousands of other problems have been shown to be NP-complete by reductions from other problems previously shown to be NP-complete; many of these problems are collected in Garey and Johnson's 1979 book Computers and Intractability: A Guide to the Theory of NP-Completeness.[2]

NP-complete problems edit

 
Some NP-complete problems, indicating the reductions typically used to prove their NP-completeness

An interesting example is the graph isomorphism problem, the graph theory problem of determining whether a graph isomorphism exists between two graphs. Two graphs are isomorphic if one can be transformed into the other simply by renaming vertices. Consider these two problems:

  • Graph Isomorphism: Is graph   isomorphic to graph  ?
  • Sub-graph Isomorphism: Is graph G1 isomorphic to a sub-graph of graph  ?

The Sub-graph Isomorphism problem is NP-complete. The graph isomorphism problem is suspected to be neither in P nor NP-complete, though it is in NP. This is an example of a problem that is thought to be hard, but is not thought to be NP-complete. These are called NP-Intermediate problems and exist if and only if P≠NP.

The easiest way to prove that some new problem is NP-complete is first to prove that it is in NP, and then to reduce some known NP-complete problem to it. Therefore, it is useful to know a variety of NP-complete problems. The list below contains some well-known problems that are NP-complete when expressed as decision problems.

To the right is a diagram of some of the problems and the reductions typically used to prove their NP-completeness. In this diagram, problems are reduced from bottom to top. Note that this diagram is misleading as a description of the mathematical relationship between these problems, as there exists a polynomial-time reduction between any two NP-complete problems; but it indicates where demonstrating this polynomial-time reduction has been easiest.

There is often only a small difference between a problem in P and an NP-complete problem. For example, the 3-satisfiability problem, a restriction of the boolean satisfiability problem, remains NP-complete, whereas the slightly more restricted 2-satisfiability problem is in P (specifically, NL-complete), and the slightly more general max. 2-sat. problem is again NP-complete. Determining whether a graph can be colored with 2 colors is in P, but with 3 colors is NP-complete, even when restricted to planar graphs. Determining if a graph is a cycle or is bipartite is very easy (in L), but finding a maximum bipartite or a maximum cycle subgraph is NP-complete. A solution of the knapsack problem within any fixed percentage of the optimal solution can be computed in polynomial time, but finding the optimal solution is NP-complete.

Solving NP-complete problems edit

At present, all known algorithms for NP-complete problems require time that is superpolynomial in the input size, in fact exponential in  [clarify] for some   and it is unknown whether there are any faster algorithms.

The following techniques can be applied to solve computational problems in general, and they often give rise to substantially faster algorithms:

  • Approximation: Instead of searching for an optimal solution, search for a solution that is at most a factor from an optimal one.
  • Randomization: Use randomness to get a faster average running time, and allow the algorithm to fail with some small probability. Note: The Monte Carlo method is not an example of an efficient algorithm in this specific sense, although evolutionary approaches like Genetic algorithms may be.
  • Restriction: By restricting the structure of the input (e.g., to planar graphs), faster algorithms are usually possible.
  • Parameterization: Often there are fast algorithms if certain parameters of the input are fixed.
  • Heuristic: An algorithm that works "reasonably well" in many cases, but for which there is no proof that it is both always fast and always produces a good result. Metaheuristic approaches are often used.

One example of a heuristic algorithm is a suboptimal   greedy coloring algorithm used for graph coloring during the register allocation phase of some compilers, a technique called graph-coloring global register allocation. Each vertex is a variable, edges are drawn between variables which are being used at the same time, and colors indicate the register assigned to each variable. Because most RISC machines have a fairly large number of general-purpose registers, even a heuristic approach is effective for this application.

Completeness under different types of reduction edit

In the definition of NP-complete given above, the term reduction was used in the technical meaning of a polynomial-time many-one reduction.

Another type of reduction is polynomial-time Turing reduction. A problem   is polynomial-time Turing-reducible to a problem   if, given a subroutine that solves   in polynomial time, one could write a program that calls this subroutine and solves   in polynomial time. This contrasts with many-one reducibility, which has the restriction that the program can only call the subroutine once, and the return value of the subroutine must be the return value of the program.

If one defines the analogue to NP-complete with Turing reductions instead of many-one reductions, the resulting set of problems won't be smaller than NP-complete; it is an open question whether it will be any larger.

Another type of reduction that is also often used to define NP-completeness is the logarithmic-space many-one reduction which is a many-one reduction that can be computed with only a logarithmic amount of space. Since every computation that can be done in logarithmic space can also be done in polynomial time it follows that if there is a logarithmic-space many-one reduction then there is also a polynomial-time many-one reduction. This type of reduction is more refined than the more usual polynomial-time many-one reductions and it allows us to distinguish more classes such as P-complete. Whether under these types of reductions the definition of NP-complete changes is still an open problem. All currently known NP-complete problems are NP-complete under log space reductions. All currently known NP-complete problems remain NP-complete even under much weaker reductions such as   reductions and   reductions. Some NP-Complete problems such as SAT are known to be complete even under polylogarithmic time projections.[3] It is known, however, that AC0 reductions define a strictly smaller class than polynomial-time reductions.[4]

Naming edit

According to Donald Knuth, the name "NP-complete" was popularized by Alfred Aho, John Hopcroft and Jeffrey Ullman in their celebrated textbook "The Design and Analysis of Computer Algorithms". He reports that they introduced the change in the galley proofs for the book (from "polynomially-complete"), in accordance with the results of a poll he had conducted of the theoretical computer science community.[5] Other suggestions made in the poll[6] included "Herculean", "formidable", Steiglitz's "hard-boiled" in honor of Cook, and Shen Lin's acronym "PET", which stood for "probably exponential time", but depending on which way the P versus NP problem went, could stand for "provably exponential time" or "previously exponential time".[7]

Common misconceptions edit

The following misconceptions are frequent.[8]

  • "NP-complete problems are the most difficult known problems." Since NP-complete problems are in NP, their running time is at most exponential. However, some problems have been proven to require more time, for example Presburger arithmetic. Of some problems, it has even been proven that they can never be solved at all, for example the Halting problem.
  • "NP-complete problems are difficult because there are so many different solutions." On the one hand, there are many problems that have a solution space just as large, but can be solved in polynomial time (for example minimum spanning tree). On the other hand, there are NP-problems with at most one solution that are NP-hard under randomized polynomial-time reduction (see Valiant–Vazirani theorem).
  • "Solving NP-complete problems requires exponential time." First, this would imply P ≠ NP, which is still an unsolved question. Further, some NP-complete problems actually have algorithms running in superpolynomial, but subexponential time such as O(2nn). For example, the independent set and dominating set problems for planar graphs are NP-complete, but can be solved in subexponential time using the planar separator theorem.[9]
  • "Each instance of an NP-complete problem is difficult." Often some instances, or even most instances, may be easy to solve within polynomial time. However, unless P=NP, any polynomial-time algorithm must asymptotically be wrong on more than polynomially many of the exponentially many inputs of a certain size.[10]
  • "If P=NP, all cryptographic ciphers can be broken." A polynomial-time problem can be very difficult to solve in practice if the polynomial's degree or constants are large enough. For example, ciphers with a fixed key length, such as Advanced Encryption Standard, can all be broken in constant time by trying every key (and are thus already known to be in P), though with current technology that time may exceed the age of the universe. In addition, information-theoretic security provides cryptographic methods that cannot be broken even with unlimited computing power.

Properties edit

Viewing a decision problem as a formal language in some fixed encoding, the set NPC of all NP-complete problems is not closed under:

It is not known whether NPC is closed under complementation, since NPC=co-NPC if and only if NP=co-NP, and whether NP=co-NP is an open question.[11]

See also edit

References edit

Citations edit

  1. ^ For example, simply assigning true to each variable renders the 18th conjunct   (and hence the complete formula) false.
  2. ^ Garey, Michael R.; Johnson, D. S. (1979). Victor Klee (ed.). Computers and Intractability: A Guide to the Theory of NP-Completeness. A Series of Books in the Mathematical Sciences. San Francisco, Calif.: W. H. Freeman and Co. pp. x+338. ISBN 978-0-7167-1045-5. MR 0519066.
  3. ^ Agrawal, M.; Allender, E.; Rudich, Steven (1998). "Reductions in Circuit Complexity: An Isomorphism Theorem and a Gap Theorem". Journal of Computer and System Sciences. 57 (2): 127–143. doi:10.1006/jcss.1998.1583. ISSN 1090-2724.
  4. ^ Agrawal, M.; Allender, E.; Impagliazzo, R.; Pitassi, T.; Rudich, Steven (2001). "Reducing the complexity of reductions". Computational Complexity. 10 (2): 117–138. doi:10.1007/s00037-001-8191-1. ISSN 1016-3328. S2CID 29017219.
  5. ^ Don Knuth, Tracy Larrabee, and Paul M. Roberts, Mathematical Writing Archived 2010-08-27 at the Wayback Machine § 25, MAA Notes No. 14, MAA, 1989 (also Stanford Technical Report, 1987).
  6. ^ Knuth, D. F. (1974). "A terminological proposal". SIGACT News. 6 (1): 12–18. doi:10.1145/1811129.1811130. S2CID 45313676.
  7. ^ See the poll, or [1] Archived 2011-06-07 at the Wayback Machine.
  8. ^ Ball, Philip. "DNA computer helps travelling salesman". doi:10.1038/news000113-10.
  9. ^ Bern (1990); Deĭneko, Klinz & Woeginger (2006); Dorn et al. (2005); Lipton & Tarjan (1980).
  10. ^ Hemaspaandra, L. A.; Williams, R. (2012). "SIGACT News Complexity Theory Column 76". ACM SIGACT News. 43 (4): 70. doi:10.1145/2421119.2421135. S2CID 13367514.
  11. ^ Talbot, John; Welsh, D. J. A. (2006), Complexity and Cryptography: An Introduction, Cambridge University Press, p. 57, ISBN 9780521617710, The question of whether NP and co-NP are equal is probably the second most important open problem in complexity theory, after the P versus NP question.

Sources edit

Further reading edit


Category:1971 in computing Category:Complexity classes Category:Mathematical optimization