- Outline:
- Summary
- First, the definition of a prime number
- Second, a common method for finding the primes up to n
- Third, the optimization method
- Principle level
- Code level
- range and xrange
- Do while 1 and while True really matter?
Summary
This article mainly draws on the discussion of prime numbers in the first chapter of "programming Zhu Ji Nanxiong-renewal version". It gives the definition of a prime number, then a common method for finding the primes up to n, and finally some optimization methods. The code implements two optimization methods in Python and analyzes them at both the principle level and the code level. The most interesting part comes near the end: my first version of the code was written only to get the function working, with no thought for optimization, and the result was very surprising. On reflection I found the problem in the code and fixed it, which confirmed that the algorithm with the more advanced principle does perform better when the code is written with equal care. It also yields a side conclusion: even if the principle is advanced, performance will not follow if the code is badly written. There is a similar saying going around: "The theoretical performance of a programming language or an algorithm is just a floating cloud; the person who writes the code is the key." The lesson I take for myself is also important: once a function works, always think further about whether it can be optimized.

First, the definition of a prime number
A prime number (also called a prime) has no divisors other than 1 and itself, and there are infinitely many of them. According to the fundamental theorem of arithmetic, every integer greater than 1 is either a prime itself or can be written as a product of primes. The smallest prime is 2. (From Baidu Encyclopedia; for more discussion, search for the keyword "prime".)
From the definition we get some boundary information: 1 is not prime, and the smallest prime is 2. The definition also gives an easy way to find the primes up to n: for every x with 2<=x<=n, check whether x is divisible by any number other than 1 and x. The initial range of candidate divisors is [2, x), that is, every integer from 2 up to x-1, which is the most intuitive idea that follows from the definition. As we will see, narrowing this range of divisors already gives some optimization; that is the most traditional optimization, and more methods are discussed in the third section of this article.
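The narrowing of the divisor range mentioned above rests on a simple observation: if x = a * b with a <= b, then a * a <= x, so any composite x has a divisor no larger than sqrt(x). The following is a minimal sketch of that idea, not the article's own code. It is written for Python 3.8+ (it relies on math.isqrt), and the function names is_prime_full and is_prime_sqrt are made up for illustration.

```python
import math

def is_prime_full(x):
    """Definition-based test: try every divisor from 2 to x - 1."""
    if x < 2:
        return False
    return all(x % d != 0 for d in range(2, x))

def is_prime_sqrt(x):
    """Same test, but stop at floor(sqrt(x)): if x = a * b with a <= b, then a * a <= x."""
    if x < 2:
        return False
    return all(x % d != 0 for d in range(2, math.isqrt(x) + 1))

# Both tests give the same answers; only the amount of work differs.
agree = all(is_prime_full(x) == is_prime_sqrt(x) for x in range(0, 1001))
primes_below_30 = [x for x in range(2, 30) if is_prime_sqrt(x)]
print(agree, primes_below_30)
```

For x near 100000 the full test tries up to about 100000 divisors while the narrowed one tries at most 316, which is consistent with the large gap in the timings reported in the next section.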

Second, a common method for finding the primes up to n
First we implement the method found in textbooks, which I believe everyone is familiar with. Here is the code directly (with no code optimization for now):
    #!/bin/env python
    # -*- coding: utf-8 -*-
    import math
    import sys

    def prime(n):
        if n <= 1:
            return 0
        #for i in range(2, int(math.sqrt(n) + 1)):
        for i in range(2, n):
            if n % i == 0:
                return 0
        return 1

    if __name__ == "__main__":
        n = int(sys.argv[1])
        for i in range(2, n + 1):
            if prime(i):
                print i
The commented-out line in the code takes [2, sqrt(n)+1] as the divisor range. We test the performance with n=100000. The machine is an 8-core Red Hat 5.3 host with a decent configuration:
    $ time ./old_prime.py 100000 | wc -l
    9592
    real    2m49.129s
    user    2m48.567s
    sys     0m0.024s
We replace the for statement in the function prime with the commented-out line and test again:
    $ time ./old_prime.py 100000 | wc -l
    9592
    real    0m0.843s
    user    0m0.830s
    sys     0m0.012s
Obviously, with the range [2, sqrt(n)+1] the program is much faster, and the time is acceptable.

Third, the optimization method
Principle level
With the implementation and testing in the second section the problem is solved, but is there a cleverer way to find the primes up to n? The answer is yes. Here we mention two ideas:
1. Apart from 2, can any even number be prime? Obviously not, so on top of the [2, sqrt(n)+1] range we exclude half of the computation. Now look at the odd numbers: 1 is not prime, and starting from 3, every multiple of 3 other than 3 itself is composite, so one third of the remaining computation is excluded as well. Then look at 5: every multiple of 5 other than 5 itself is composite, so one fifth of what then remains is excluded too. Altogether, nearly 3/4 of the computation is excluded.
2. Looking at it from a different angle, we construct a list of size n with every entry initialized to 1. Each time we find a prime, we set all of its multiples to 0; the next entry that is still 1 is the next prime. This is the sieve of Eratosthenes mentioned in the book.
Code level
Let's start by implementing the first optimization idea:
    #!/bin/env python
    # -*- coding: utf-8 -*-
    import sys
    import math

    def prime(n):
        if n % 2 == 0:
            return n == 2
        if n % 3 == 0:
            return n == 3
        if n % 5 == 0:
            return n == 5
        for p in range(7, int(math.sqrt(n)) + 1, 2):  # only odd numbers are considered as possible factors
            if n % p == 0:
                return 0
        return 1

    if __name__ == "__main__":
        n = int(sys.argv[1])
        for i in range(2, n + 1):  # 1 is not prime, so start from 2
            if prime(i):
                print i
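The "nearly 3/4" estimate from the principle discussion can be checked with a little exact arithmetic. This is only a sketch of the counting argument, written for Python 3; the variable names remaining and excluded are chosen here for illustration.

```python
from fractions import Fraction

# Share of candidates that survive each filter: not a multiple of 2, then 3, then 5.
remaining = Fraction(1)
for q in (2, 3, 5):
    remaining *= Fraction(q - 1, q)

# Everything that does not survive is rejected by one of the three if statements.
excluded = 1 - remaining
print(remaining, excluded, float(excluded))
```

The surviving share is 1/2 * 2/3 * 4/5 = 4/15, so 11/15 (about 73%) of the candidates are settled without entering the trial-division loop, which matches the "nearly 3/4" figure.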
To implement the second idea, the code is as follows:
    #!/bin/env python
    # -*- coding: utf-8 -*-
    # Find the primes up to n and check the execution time; for example, the primes up to 100000
    import sys

    def prime(n):
        flag = [1] * (n + 2)
        flag[1] = 0      # 1 is not prime
        flag[n + 1] = 1  # sentinel that stops the inner while loop
        p = 2
        while (p <= n):
            print p
            for i in range(2 * p, n + 1, p):
                flag[i] = 0
            while 1:
                p += 1
                if (flag[p] == 1):
                    break

    # test
    if __name__ == "__main__":
        n = int(sys.argv[1])
        prime(n)
Unified test (each of the tests below was run several times, with similar results). Here we raise n by an order of magnitude, from 100000 to 1000000:
    $ time ./sushu_v0.1.py 1000000 | wc -l
    78498
    real    0m6.203s
    user    0m6.169s
    sys     0m0.031s
    $ time ./sushu2.py 1000000 | wc -l
    78498
    real    0m0.754s
    user    0m0.730s
    sys     0m0.033s
The difference is clear: the second method is better than the first. Next we optimize the code.
range and xrange
First, change range to xrange and test again:
    $ time ./sushu_v0.1.py 1000000 | wc -l
    78498
    real    0m5.440s
    user    0m5.404s
    sys     0m0.018s
    $ time ./sushu2.py 1000000 | wc -l
    78498
    real    0m0.624s
    user    0m0.615s
    sys     0m0.013s
Both methods get faster. As for the difference between range and xrange, my understanding is that range builds and returns the whole list at once, while xrange generates only one value at a time and does not keep the previously generated values. For more discussion, search for the relevant material yourself.
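The difference described above can be made visible by comparing memory footprints. The sketch below is written for Python 3, where the built-in range behaves like Python 2's xrange and list(range(...)) plays the role of Python 2's range; this mapping is an assumption of the sketch, since the article's own code is Python 2. The sizes reported by sys.getsizeof are specific to the interpreter in use.

```python
import sys

n = 1000000

lazy = range(n)         # Python 3 range: values are produced on demand, like Python 2 xrange
eager = list(range(n))  # a fully built list, like the result of Python 2 range

# getsizeof reports the size of the object itself (for the list, its array of references).
lazy_size = sys.getsizeof(lazy)
eager_size = sys.getsizeof(eager)
print(lazy_size, eager_size)
```

The lazy object stays the same small size no matter how large n is, while the list grows with n. Avoiding that allocation on every loop is where the modest speedup in the timings comes from.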
Fatal error: there is another way to write range(2*p, n+1, p), namely range(2*p, n+1)[::p], but the two are very different in cost. range(2*p, n+1, p) generates its list directly with step p, whereas range(2*p, n+1)[::p] first builds a list with step 1 and then slices it, keeping only every p-th value. This is clearly less direct than range(2*p, n+1, p). The return values are the same, but actual testing shows that the difference in efficiency is very large, enough to overturn the advantage of the algorithm. Here is the version that does exactly that:
    #!/bin/env python
    # -*- coding: utf-8 -*-
    # Find the primes up to n and check the execution time; for example, the primes up to 100000
    import sys

    def prime(n):
        flag = [1] * (n + 2)
        flag[1] = 0      # 1 is not prime
        flag[n + 1] = 1  # sentinel that stops the inner while loop
        p = 2
        while (p <= n):
            print p
            for i in range(2 * p, n + 1)[::p]:
                flag[i] = 0
            while 1:
                p += 1
                if (flag[p] == 1):
                    break

    # test
    if __name__ == "__main__":
        n = int(sys.argv[1])
        prime(n)
    $ time ./sushu2.py 100000 | wc -l
    9592
    real    0m16.049s
    user    0m15.836s
    sys     0m0.014s
Note that n here is 100000, not 1000000, and the result is still poor, enough to make us doubt the merit of the algorithm. So before doubting the theory, we must thoroughly check our own code for problems.
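The cost gap between the two spellings can be reproduced in isolation. The sketch below is written for Python 3, so list(range(...)) stands in for Python 2's list-returning range; the helper names multiples_by_step and multiples_by_slice and the values of p and n are arbitrary choices for illustration, and the measured times depend on the machine.

```python
import timeit

def multiples_by_step(p, n):
    """Generate only the multiples of p: about n / p values."""
    return list(range(2 * p, n + 1, p))

def multiples_by_slice(p, n):
    """Build all n - 2p + 1 values first, then keep every p-th one."""
    return list(range(2 * p, n + 1))[::p]

p, n = 997, 200000
same = multiples_by_step(p, n) == multiples_by_slice(p, n)

# Time 20 calls of each variant; the results are identical, the work is not.
t_step = timeit.timeit(lambda: multiples_by_step(p, n), number=20)
t_slice = timeit.timeit(lambda: multiples_by_slice(p, n), number=20)
print(same, t_step, t_slice)
```

With slicing, every prime p costs roughly n list elements instead of n / p. Summed over all 9592 primes below 100000, that is on the order of a billion element operations, which would account for the 16-second run.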
Do while 1 and while True really matter?
We changed while 1 in sushu2.py to while True and tested again. With while True:
    $ time ./sushu2.py 10000000 | wc -l
    664579
    real    0m7.648s
    user    0m7.538s
    sys     0m0.182s
In the case of while 1:
    $ time ./sushu2.py 10000000 | wc -l
    664579
    real    0m7.399s
    user    0m7.214s
    sys     0m0.160s
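The two runs differ by only about 3% at n=10000000, so this choice matters far less than the algorithm or the slicing mistake above. In Python 2, True is an ordinary built-in name that is looked up on every iteration, whereas 1 is a constant, which would explain the small advantage of while 1.

Summary
To avoid confusion, here is the final code. The first method: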
    #!/bin/env python
    # -*- coding: utf-8 -*-
    import sys
    import math

    def prime(n):
        if n % 2 == 0:
            return n == 2
        if n % 3 == 0:
            return n == 3
        if n % 5 == 0:
            return n == 5
        for p in xrange(7, int(math.sqrt(n)) + 1, 2):  # only odd numbers are considered as possible factors
            if n % p == 0:
                return 0
        return 1

    if __name__ == "__main__":
        n = int(sys.argv[1])
        for i in xrange(2, n + 1):  # 1 is not prime, so start from 2
            if prime(i):
                print i
The second method:
    #!/bin/env python
    # -*- coding: utf-8 -*-
    # Find the primes up to n and check the execution time; for example, the primes up to 100000
    import sys

    def prime(n):
        flag = [1] * (n + 2)
        flag[1] = 0      # 1 is not prime
        flag[n + 1] = 1  # sentinel that stops the inner while loop
        p = 2
        while (p <= n):
            print p
            for i in xrange(2 * p, n + 1, p):
                flag[i] = 0
            while 1:
                p += 1
                if (flag[p] == 1):
                    break

    # test
    if __name__ == "__main__":
        n = int(sys.argv[1])
        prime(n)
Some questions may remain. For example, why does the first method consider only 2, 3 and 5? Why not take 7, or even 11, into account as well? If 7 were included, would the code become awkward? Consider, for example, the case n=6, and whether n=7 would then need to be handled separately; thinking this through should make the answer clear. Also, look at this loop:
    for p in xrange(7, int(math.sqrt(n)) + 1, 2):  # only odd numbers are considered as possible factors
        if n % p == 0:
            return 0
    return 1
So for n below 49, does every odd number get counted as prime? I think this question can be settled by testing it yourself; just do not overlook the three if statements that come first.
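The self-test suggested above might look like the following sketch. It ports the first method to Python 3 (range instead of xrange, and math.isqrt from Python 3.8+ in place of int(math.sqrt(n)), both substitutions made here rather than taken from the article) and compares it with a definition-based reference test; prime_wheel and prime_naive are illustrative names.

```python
import math

def prime_wheel(n):
    """The article's first optimized test, ported to Python 3."""
    if n % 2 == 0:
        return n == 2
    if n % 3 == 0:
        return n == 3
    if n % 5 == 0:
        return n == 5
    for p in range(7, math.isqrt(n) + 1, 2):  # odd trial divisors only
        if n % p == 0:
            return False
    return True

def prime_naive(n):
    """Reference test straight from the definition."""
    return n >= 2 and all(n % d != 0 for d in range(2, n))

# Below 49 the trial-division range is empty, so only the three if statements decide.
loop_is_empty = all(len(range(7, math.isqrt(n) + 1, 2)) == 0 for n in range(2, 49))
matches = all(prime_wheel(n) == prime_naive(n) for n in range(2, 2000))
print(loop_is_empty, matches)
```

The loop never runs for n below 49, yet the answers are still right: every composite below 49 has a prime factor of 2, 3 or 5, so one of the three if statements catches it. The first composite that needs the loop is 49 = 7 * 7.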
Of these approaches, the last one is the fastest and most efficient, but it comes with a precondition: the numbers being searched must form an ordered, contiguous range. It is therefore better suited to finding all the numbers up to n that satisfy some condition. The earlier methods instead test each number individually and return a boolean. Perhaps that can be changed later.
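One possible way to bridge the two styles, offered only as a sketch and not as the article's method, is to have the sieve return its flag list instead of printing. A single lookup then answers the yes/no question for any number up to n. The code is written for Python 3, and the function name sieve_flags is hypothetical.

```python
def sieve_flags(n):
    """Sieve of Eratosthenes: flags[x] is True exactly when x is prime, for 0 <= x <= n."""
    flags = [True] * (n + 1)
    for x in range(min(2, n + 1)):  # 0 and 1 are not prime
        flags[x] = False
    for p in range(2, n + 1):
        if flags[p]:
            # p is prime, so cross out its multiples, starting at 2 * p as in the article.
            for multiple in range(2 * p, n + 1, p):
                flags[multiple] = False
    return flags

flags = sieve_flags(100)
primes = [x for x, is_p in enumerate(flags) if is_p]
print(len(primes), flags[97], flags[91])
```

The cost of building the list is paid once; after that, flags[x] behaves like the boolean-returning prime(x) of the earlier methods, but only for x within the sieved range.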