"P = NP ?" It is generally considered the most important issue in computer science. There is Clay math The Institute, or even offered a reward of 1 million US dollars to the people who solve it. But what I want to tell you today is that this problem does not actually exist and does not need to be resolved at all. I am not the first person to think so. A mathematician pointed out very early that p = NP? It was a silly question. In order to laugh at it, he wrote a "paper" specifically in Yu, saying that he proved P = NP. There are some very smart people around me who basically don't take this issue seriously. If I talk to them about these things, I'm afraid it's too old. However, I found that computer science students in China have always considered this question as sacred and have no joke at all. So I plan to popularize it here. This is not a good explanation. First, you need to figure out what is "P = NP ?" For this reason, you must first understand what is"AlgorithmComplexity ". Therefore, you must first understand what an algorithm is ". You can simply think of an algorithm as a machine, just like a meat grinder. If you give it some "input", it will give you some "output ". For example, the input of a meat grinder is a minced meat, and the output is a meat residue. The input of a cow is grass, and the output is milk (or milk ). The input of the "sub-device" is two integers, and the output is the sum of the two integers. The question discussed in "algorithm theory" is how to design these machines to make them more effective. It is like how to cultivate high-quality cows, eat the same amount of grass, and produce more milk Faster. The so-called "computing problem" usually requires a certain amount of time (also called "computing") for the algorithm to get results. The time required for calculation is often related to the input size. The more grass your ox eats, the longer it takes to turn the grass into milk. 
The conversion speed between grass and milk is what is usually called "algorithm complexity". Algorithm complexity is usually expressed as a function f(n), where n is the size of the input. The value of this function is usually the amount of some resource needed, such as time or space. For example, if the time complexity of your algorithm is n^2, then when you input 10 things, it takes 100 units of time to complete the calculation. When you input 100 things, it takes 10000 units of time. When you input 1000 things, it takes 1000000 units of time.

Simple. The so-called "P time" means "polynomial time". In short, the complexity function f(n) is a polynomial. You know what a polynomial is, right? If not, go review your middle school mathematics textbook. The "P" in "P = NP?" refers to the "set" of all algorithms whose complexity is polynomial.

To keep the following description brief, I define some terms:

"f(n)-time algorithm" = "an algorithm that can solve a problem within f(n) time"

When f(n) is a polynomial (such as n^2), it is a "polynomial-time algorithm" (P-time algorithm). When f(n) is an exponential function (such as 2^n), it is an "exponential-time algorithm" (EXPTIME algorithm). Many people think that NP problems are the problems that need exponential time, but NP is not defined by exponential time at all; NP and EXPTIME are different classes. It is known that P is not equal to EXPTIME, but whether P is equal to NP is still unsettled.

Now let me explain what NP is. Ordinary computers are deterministic: they can only do one thing at a time. When a program you write reaches a conditional (a branch), it can only explore one of the paths at a time. For example:

    if (x == 0) {
        one();
    } else {
        two();
    }

Here, depending on whether the value of x is zero, only one of one() and two() is executed. However, someone imagined a machine called the "nondeterministic computer", which can run both branches of the program, one() and two(), at the same time. What is this good for? When you do not know the value of x, you can infer whether x is zero from which of one() and two() "runs successfully".

In the "theory of computation", this nondeterministic computer is called a "nondeterministic Turing machine". Its counterpart is the "deterministic Turing machine", which is what we usually call a "computer". In fact, the name "Turing machine" is completely irrelevant here. You only need to know that a nondeterministic computer can explore multiple possibilities at the same time. This is not ordinary "parallel computing", because every time it reaches a branch point, the nondeterministic computer spawns new computing units to explore all of the paths simultaneously. It is as if the machine could clone itself. When such a branch point sits inside a loop (or a recursion), it spawns new computing units again and again, and those units spawn still more units, just like cell division. Ordinary computers have no such superpower. They have only a fixed number of computing units, so they can only explore one path first.
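To make the contrast concrete, here is a minimal Python sketch of how an ordinary deterministic program copes with branch points: it follows one choice and only tries the other after the first has failed. The subset-sum puzzle and the name `find_subset` are illustrative assumptions, not something taken from the discussion above. A nondeterministic machine would, in effect, follow both recursive calls at once.

```python
def find_subset(numbers, target):
    """Deterministic search: at each branch point try one path, then back up.

    Returns a list of chosen numbers that sum to `target`, or None if no
    such choice exists. In the worst case it walks through every one of
    the 2**len(numbers) possible paths, one after another.
    """
    if target == 0:
        return []            # success: nothing more to choose
    if not numbers:
        return None          # dead end: this path failed
    first, rest = numbers[0], numbers[1:]
    # First branch (like one() in the text): include `first`.
    with_first = find_subset(rest, target - first)
    if with_first is not None:
        return [first] + with_first
    # That branch failed, so go back and explore the other one: skip `first`.
    return find_subset(rest, target)


print(find_subset([3, 34, 4, 12, 5, 2], 9))  # prints [3, 4, 2]
```

The two recursive calls play the role of the two branches; the deterministic version pays for trying them in sequence.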
If it fails, they go back and explore another path. So they seem to need more time to get a result.

At this point the basic concepts have been defined, so we can give a complete definition of P and NP. P and NP are two "sets of problems":

P = all the problems that a "deterministic computer" can solve in "polynomial time"
NP = all the problems that a "nondeterministic computer" can solve in "polynomial time"

(Note that the only difference between them is "deterministic" versus "nondeterministic".)

The definitions are complete. Now back to the "P = NP?" problem itself. The goal of "P = NP?" is to find out whether the sets P and NP are equal. To prove that two sets (A and B) are equal, we generally need to prove two directions:

1. A contains B
2. B contains A

As you may have seen, NP certainly contains P, because any nondeterministic machine can be used as a deterministic machine. It only needs to explore one path at each branch point, without using its "superpower". So the key to "P = NP?" is whether P also contains NP. That is, using only a deterministic computer, can we solve in polynomial time every problem that a nondeterministic computer can solve in polynomial time?

Let's take a closer look at what "polynomial time" is. We all know that n^2 is a polynomial, and n^1000000 is also a polynomial. There is a world of difference between one polynomial and another. It is very imprecise to describe the time needed to solve a problem with the blanket notion of "polynomial". In real large-scale applications, even an n^2 algorithm is considered slow. Finding a "polynomial-time" algorithm does not actually tell you much.

Theorists are fond of saying that even the largest polynomial (such as n^1000000) cannot compare with even the smallest exponential function (such as 1.0001^n), because there always "exists" an M such that, when n > M, 1.0001^n exceeds n^1000000. But the key to the matter is not M's "existence" but its "size". If your input has to reach an astronomical size before the exponential function exceeds the polynomial, you are better off using the exponential-complexity algorithm. So the mistake of "P = NP?" is that it does not start from our actual needs. Instead it first assumes that we have "infinitely" large inputs and "infinite" time and patience, which hands polynomial-time algorithms an "eventual" advantage. "Infinite" and "eventual" are the theorists' trump cards.

To illustrate this, we can plot curves comparing n^1000000 and 2^n, and solve for the n at which they are equal. I do not use 1.0001^n, so that nobody can call the comparison unfair. I like to be lazy, so I often use Mathematica to solve such formulas. Here are the results and curves it gave me:

As you can see, when 1 < n < 24549200, we always have 2^n < n^1000000 (the curve of n^1000000 shoots up to the sky as soon as n exceeds 1). So as long as the input size stays below about 24.5 million, the 2^n algorithm is faster than the n^1000000 algorithm.
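The crossover point can also be checked without Mathematica. Below is a minimal Python sketch, assuming we compare logarithms rather than the astronomically large values themselves; the function names `exp_minus_poly` and `crossover` are made up for this illustration.

```python
import math


def exp_minus_poly(n, c):
    """ln(2^n) - ln(n^c): negative while 2^n is still smaller than n^c."""
    return n * math.log(2) - c * math.log(n)


def crossover(c):
    """Smallest integer n > 2 at which 2^n >= n^c, found by bisection.

    Assumes c > 2, so that 2^n < n^c already holds at n = 2.
    """
    lo, hi = 2, 4
    # Double the upper bound until the exponential has overtaken.
    while exp_minus_poly(hi, c) < 0:
        lo, hi = hi, hi * 2
    # Invariant: 2^lo < lo^c and 2^hi >= hi^c.
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if exp_minus_poly(mid, c) < 0:
            lo = mid
        else:
            hi = mid
    return hi


print(crossover(1_000_000))  # roughly 2.45e7, in line with the figure above
```

Working with logarithms keeps every intermediate value small enough for ordinary floating-point numbers, which is why no big-number library is needed here.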
n^1000000 may not be enough to make the point, but the range of polynomials is enormous. n^(10^100), n^(10^(10^100)), ... are all polynomials. In fact, as long as c is a constant, any constant at all, n^c is a polynomial.
Can you imagine how big n has to be for 2^n to exceed n^(10^100)? When n = 2, n^(10^100) is 2^(10^100). You may have realized that this number is what an algorithm of complexity 2^n costs when it receives an input of size 10^100. If you know that 10^100 (1 followed by 100 zeros) is already greater than the number of elementary particles in the universe, you may realize that this amounts to computing the power set of all the particles in the universe, that is, every combination of all the particles in the universe. To put it simply, it means enumerating every possible object in the universe! By the time any supercomputer completed this task, the universe might no longer exist. Moreover, this computation cannot be completed at all, because even if every particle could provide the energy for one counting step, you would use up all the energy in the universe before you had counted to 10^100. Finally, because the two n's grow together, when the input of 2^n is 10^100, n^(10^100) equals (10^100)^(10^100). So even if we enumerate every possible object in the universe, 2^n still lags far behind n^(10^100)!
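As a rough sanity check of that last comparison, here is a small Python sketch that compares the two quantities at n = 10^100 through their base-2 logarithms, since the numbers themselves are far too large to write down. The constant `GOOGOL` and the two helper functions are names invented for this example.

```python
import math

GOOGOL = 10 ** 100  # 1 followed by 100 zeros


def log2_of_exponential(n):
    """log2(2^n), which is simply n."""
    return n


def log2_of_polynomial(n, c):
    """log2(n^c) = c * log2(n)."""
    return c * math.log2(n)


n = GOOGOL
ratio = log2_of_polynomial(n, GOOGOL) / log2_of_exponential(n)
# At n = 10^100 the polynomial n^(10^100) needs about 332 times as many
# binary digits to write down as 2^n does.
print(f"bit-length ratio at n = 10^100: {ratio:.1f}")
```

The ratio is just log2(10^100), about 332, so at this input size the "polynomial" is still the vastly larger of the two.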
You may have noticed that the discussion above does not even need a polynomial as big as n^(10^100). A large constant (such as 10^100) is enough, because constants are also polynomials. The only reason for using n^(10^100) is to demonstrate how big a polynomial can be.
When you grasp this critical point, the theorists will often say that your example is too outrageous, and that solving "P = NP?" brings benefits whether the answer is yes or no:

1. If P equals NP, it can help us find "fast" polynomial algorithms.
2. If P does not equal NP, we know that polynomial algorithms "do not exist", which saves us from wasted effort.

Both conclusions contain fatal logical errors.

Let's look at the first claim: "If P equals NP, it can help us find fast polynomial algorithms". This is actually sophistry that quietly swaps one concept for another. The "goal" of "P = NP?" is logically different from the "significance" the theorists claim for it. One thing you need to understand in particular is that the definition of this problem never mentions the word "fast" from start to finish. It only cares about whether a polynomial-time algorithm can be found, not a fast polynomial-time algorithm. Without the word "fast", it can be any polynomial, which means it can be something like n^(10^100). So the goal of "P = NP?" (to prove whether polynomial-time algorithms can be found for all NP problems) is actually of no use at all for finding fast polynomial algorithms. This is the difference between an "existence question" and a "numerical question". It is like someone announcing "I am a rich man", and then, when everyone turns to look at him, saying "I have one dollar!" This is basic logic, but many people are fooled by it.

Now let's look at the second claim: "If P does not equal NP, then we know polynomial algorithms do not exist, which saves us from wasted effort". If P is not equal to NP, we take the logical negation of the problem's definition and obtain: "It is not the case that for all NP problems we can find polynomial-time algorithms." In other words: "For some NP problems, we cannot find polynomial-time algorithms." Note that we are saying "some" here, not "all". That is, even if P is not equal to NP, for a given NP problem we still "may" be able to find a polynomial algorithm. Hope may remain, and someone may still spend time searching. Since we have failed to completely rule out the "hope" of finding polynomial algorithms for NP problems, how can we "avoid wasted effort"?

The theorists misunderstand this because they have not realized that the logical negation of "all" is "some". They mistakenly think that "P is not equal to NP" means "for all NP problems, no polynomial-time algorithm can be found". If that were so, the conclusion might be of some use.

Besides this key logical error, the second claim also shares the problem of the first. It assumes up front that a polynomial-time algorithm is always better than algorithms of other complexities. It subconsciously believes that everyone "wants" a polynomial-time algorithm. So it feels that if this "hope" could be eliminated, these people would be spared unnecessary trouble. However, the correct approach is not to look only for polynomial algorithms, but to find several kinds of algorithms and plot their curves as I did above.
Based on the input size, resource limits, computing goals, and implementation difficulty, analyze and compare them, and pick the best algorithm for the situation at hand.

My early research on algorithms taught me a rule of thumb: if I had spent a lot of effort and still could not find a simple, fast polynomial algorithm, then a fast polynomial algorithm usually did not exist. Of course, you can spend more time and find complicated polynomial algorithms by proving various upper and lower bounds. However, the actual efficiency of these algorithms is usually worse than that of simple exponential-time algorithms. In addition, the algorithm complexity discussed here is worst-case complexity. If your algorithm can adapt itself as the input changes, or if you add randomness or heuristics, you will often get unexpectedly good results.

Therefore, we can see that whether or not P equals NP, neither conclusion delivers the benefits people expect from it. So "P = NP?" does not need an answer, because it is a problem that does not exist.

Source: http://my.oschina.net/MrWizard/blog/116875