Comparison of basic sorting algorithms:
Summary: when the data volume n is large and both stability and the best performance are needed, merge sort is the best choice;
when n is small, bubble sort is stable and performs well enough;
when the list is already (nearly) sorted, insertion sort is best.
Common algorithm time complexities, from small to large: O(1) < O(log2n) < O(n) < O(nlog2n) < O(n^2) < O(n^3) < ... < O(2^n) < O(n!)
Usually, for a given algorithm we carry out two kinds of analysis. The first is to prove the algorithm's correctness mathematically; this step mainly uses formal proof techniques and their associated patterns of inference, such as loop invariants and mathematical induction. Once correctness is established, the second step is to analyze the algorithm's time complexity. Time complexity reflects how the program's running time grows as the input size increases, and it is a good indicator of an algorithm's quality. Every programmer should therefore master the basic methods of time complexity analysis.
The execution time of an algorithm is measured by the time it takes to run a program implementing it on a computer. There are usually two ways to measure this.
I. The post-hoc measurement method
This approach works, but it is not a good one, for two reasons: first, to evaluate an algorithm's running performance you must first implement it as a program and run it; second, the measured times depend on the computer's hardware, software, and other environmental factors, which can easily mask the intrinsic merits of the algorithm itself.
II. The pre-analysis and estimation method
Because post-hoc measurement depends heavily on hardware, software, and other environmental factors, and can easily conceal the intrinsic merits or demerits of an algorithm, people usually rely on pre-analysis and estimation instead.
Before the program is written, the algorithm is estimated analytically. The time a program written in a high-level language takes to run on a computer depends on the following factors:
(1). the strategy and method adopted by the algorithm;
(2). the quality of the code generated by the compiler;
(3). the input size of the problem;
(4). the speed at which the machine executes instructions.
An algorithm consists of control structures (the three kinds: sequence, branch, and loop) and primitive operations (operations on intrinsic data types), and its running time depends on the combined effect of the two. To compare different algorithms for the same problem, it is common practice to pick from the algorithm one primitive operation that is basic to the problem (or class of algorithms) under study, and use the number of repetitions of that basic operation as the time measure of the algorithm.
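To make the idea of a basic operation concrete, here is a minimal sketch (class and method names are my own) that uses the element comparison in a sequential search as the basic operation and counts its repetitions:

```java
public class BasicOpCount {
    // Sequential search; the comparison "a[i] == key" is chosen as the
    // basic operation, and we count how often it executes.
    static long comparisons;

    static int search(int[] a, int key) {
        for (int i = 0; i < a.length; i++) {
            comparisons++;             // one basic operation per element examined
            if (a[i] == key) return i;
        }
        return -1;                     // key absent: n comparisons in total
    }

    public static void main(String[] args) {
        int[] a = {5, 3, 8, 1, 9};
        comparisons = 0;
        search(a, 42);                   // worst case: key not present
        System.out.println(comparisons); // prints 5, i.e. n for n = 5
    }
}
```

For an absent key the comparison executes exactly n times, so the comparison count becomes the time measure of sequential search.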
1. Time complexity
(1) Time frequency. The time an algorithm takes cannot, in general, be calculated theoretically; it can only be known by running tests on a machine. But we neither can nor need to test every algorithm; we only need to know which algorithm takes more time and which takes less. The time an algorithm consumes is proportional to the number of statement executions in it: the algorithm with more statement executions takes more time. The number of times statements are executed in an algorithm is called its statement frequency or time frequency, denoted T(n).
(2) Time complexity. In the time frequency just defined, n is called the problem size, and as n varies, the time frequency T(n) varies with it. Sometimes we want to know the pattern of this variation, and for this we introduce the concept of time complexity. In general, the number of repetitions of the algorithm's basic operation is a function of the problem size n, denoted T(n). If there is an auxiliary function f(n) such that, as n approaches infinity, the limit of T(n)/f(n) is a nonzero constant, then f(n) is a function of the same order of magnitude as T(n). This is written T(n) = O(f(n)), and O(f(n)) is called the asymptotic time complexity of the algorithm, or simply its time complexity.
Incidentally, the Landau notation used in the formula above was introduced by the German number theorist Paul Bachmann in his 1892 book "Analytic Number Theory" and popularized by another German number theorist, Edmund Landau. The purpose of Landau notation is to describe the behavior of complicated functions with simple ones and to give (tight) upper or lower bounds. In algorithm complexity analysis, usually only the big-O symbol is used; the small-o symbol, the Θ symbol, and the rest of the Landau system are less common. The O was originally an uppercase Greek omicron but is now written as the uppercase Latin letter O; small o is likewise the lowercase Latin letter o, while Θ remains the uppercase Greek letter theta.
T(n) = O(f(n)) means there is a constant c such that T(n) ≤ c * f(n) as n tends to infinity. Simply put, T(n) grows at most about as fast as f(n) for large n; that is, as n approaches positive infinity, c * f(n) is an upper bound on T(n). Although no restriction is placed on f(n), it is usually taken to be as simple a function as possible. For example, O(2n^2 + n + 1) = O(3n^2 + n + 3) = O(7n^2 + n) = O(n^2); normally we just write O(n^2). Note that a constant c is hidden inside the big-O symbol, so f(n) generally carries no coefficient. If you think of T(n) as a tree, then O(f(n)) expresses its trunk: we care only about the trunk and discard all the other details.
Among different algorithms, if an algorithm executes a constant number of statements, its time complexity is O(1). Moreover, algorithms with different time frequencies may have the same time complexity: for example, T(n) = n^2 + 3n + 4 and T(n) = 4n^2 + 2n + 1 have different frequencies but the same time complexity, O(n^2). In order of increasing magnitude, the common time complexities are: constant order O(1), logarithmic order O(log2n), linear order O(n), linear-logarithmic order O(nlog2n), square order O(n^2), cubic order O(n^3), ..., k-th power order O(n^k), and exponential order O(2^n). As the problem size n grows, the higher the time complexity, the lower the efficiency of the algorithm.
As we can see, we should prefer polynomial-order O(n^k) algorithms over exponential-order ones.
Common algorithm time complexities, from small to large: O(1) < O(log2n) < O(n) < O(nlog2n) < O(n^2) < O(n^3) < ... < O(2^n) < O(n!)
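A small sketch (names mine) that tabulates these functions for a few values of n makes the ordering tangible:

```java
public class GrowthTable {
    // Logarithm base 2, used for the log2n and nlog2n columns.
    static double log2(double x) { return Math.log(x) / Math.log(2); }

    public static void main(String[] args) {
        // Print the common complexity functions for growing n to show
        // how quickly the higher orders dominate.
        System.out.printf("%8s %10s %10s %12s %14s %22s%n",
                "n", "log2n", "n", "nlog2n", "n^2", "2^n");
        for (int n = 2; n <= 32; n *= 2) {       // n = 2, 4, 8, 16, 32
            System.out.printf("%8d %10.1f %10d %12.1f %14d %22d%n",
                    n, log2(n), n, n * log2(n), (long) n * n, 1L << n);
        }
    }
}
```

Already at n = 32, the 2^n column dwarfs every polynomial column, which is why exponential-order algorithms are hopeless for all but tiny inputs.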
In general, for a given problem (or class of algorithms) one simply selects a single basic operation to discuss the algorithm's time complexity. Sometimes several basic operations need to be considered, and different operations may even be given different weights to reflect the relative time they require; this makes it easier to compare two completely different algorithms for solving the same problem.
(3) The concrete steps for determining the time complexity of an algorithm are:
⑴ Find the basic statement in the algorithm.
The most frequently executed statement in the algorithm is the basic statement; usually it is the loop body of the innermost loop.
⑵ Calculate the order of magnitude of the basic statement's execution count.
Only the order of magnitude needs to be computed: as long as the highest power in the function counting the basic statement's executions is correct, all lower powers and the coefficient of the highest power can be ignored. This simplifies the analysis and focuses attention on the most important point: the growth rate.
⑶ Express the algorithm's time performance in big-O notation.
Put the order of magnitude of the basic statement's execution count into the big-O notation.
If the algorithm contains nested loops, the basic statement is usually the innermost loop body; if the algorithm contains parallel loops, the time complexities of the parallel loops are added. For example:
    for (i = 1; i <= n; i++)
        x++;
    for (i = 1; i <= n; i++)
        for (j = 1; j <= n; j++)
            x++;
The time complexity of the first for loop is O(n) and that of the second is O(n^2), so the time complexity of the whole algorithm is O(n + n^2) = O(n^2).
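This analysis can be checked empirically; a sketch (class name mine) that counts the increments of x performed by the two parallel loops:

```java
public class ParallelLoops {
    // Returns how many times x++ runs in the two loops above: n + n^2.
    static long countIncrements(int n) {
        long x = 0;
        for (int i = 1; i <= n; i++)
            x++;                       // first loop: n increments
        for (int i = 1; i <= n; i++)
            for (int j = 1; j <= n; j++)
                x++;                   // second loop: n^2 increments
        return x;
    }

    public static void main(String[] args) {
        System.out.println(countIncrements(10)); // prints 110 = 10 + 10^2
    }
}
```

For n = 10 the count is 10 + 100 = 110: the n^2 term of the second loop dominates, which is why O(n + n^2) collapses to O(n^2).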
O(1) means the basic statement executes a constant number of times; in general, as long as an algorithm contains no loop statements, its time complexity is O(1). O(log2n), O(n), O(nlog2n), O(n^2), and O(n^3) are called polynomial time, while O(2^n) and O(n!) are called exponential time. Computer scientists generally regard the former (algorithms with polynomial time complexity) as efficient algorithms; such problems are called P (polynomial) problems, while the latter kind are discussed in connection with NP (non-deterministic polynomial) problems.
In general, polynomial-level complexity is acceptable. Many problems have polynomial-level solutions; that is, for an input of size n, a result can be obtained in n^k time, and such problems are called P problems. Some problems are harder: no polynomial-time solution is known, but a proposed answer can be verified in polynomial time. For example: is 4294967297 prime? Attacking it head-on, you would have to try dividing by every prime up to its square root. Fortunately, Euler tells us that this number equals the product of 641 and 6700417, so it is not prime; the verification is easy, and in passing it shows Fermat that his conjecture does not hold. Problems such as factoring large numbers and finding Hamiltonian circuits, where a proposed "solution" can be verified in polynomial time, are called NP problems.
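The asymmetry between verifying and finding is easy to demonstrate; a sketch (names mine) that checks Euler's factorization with one multiplication and then rediscovers the factor by trial division:

```java
public class FermatF5 {
    // Trial division up to sqrt(n); returns the smallest odd factor > 1,
    // or 0 if none is found.
    static long smallestOddFactor(long n) {
        for (long d = 3; d * d <= n; d += 2)
            if (n % d == 0) return d;
        return 0;
    }

    public static void main(String[] args) {
        long f5 = 4294967297L;                        // 2^32 + 1, Fermat's fifth number
        // Verifying Euler's factorization is a single multiplication:
        System.out.println(641L * 6700417L == f5);    // prints true
        // Finding the factor by trial division takes up to sqrt(n) steps:
        System.out.println(smallestOddFactor(f5));    // prints 641
    }
}
```

Verification costs O(1); trial division costs on the order of sqrt(n) divisions, which for truly large numbers becomes infeasible.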
(4) When calculating the time complexity of an algorithm, a few simple program analysis rules apply:
(1). Simple input/output statements and assignment statements can be assumed to take O(1) time.
(2). For a sequential structure, the time to execute a series of statements one after another is obtained with the big-O "summation rule".
Summation rule: if the time complexities of two parts of an algorithm are T1(n) = O(f(n)) and T2(n) = O(g(n)), then T1(n) + T2(n) = O(max(f(n), g(n))).
In particular, if T1(m) = O(f(m)) and T2(n) = O(g(n)), then T1(m) + T2(n) = O(f(m) + g(n)).
(3). For a selection structure such as an if statement, the main cost is executing the then branch or the else branch; note that testing the condition also takes O(1) time.
(4). For a loop structure, the running time consists mainly of executing the loop body over the iterations and checking the loop condition; in general the big-O "multiplication rule" applies.
Multiplication rule: if the time complexities of two parts of an algorithm are T1(n) = O(f(n)) and T2(n) = O(g(n)), then T1 * T2 = O(f(n) * g(n)).
(5). For a complex algorithm, split it into parts that are easy to estimate, then use the summation and multiplication rules to compute the time complexity of the whole algorithm.
Two further rules are also useful: (1) if g(n) = O(f(n)), then O(f(n)) + O(g(n)) = O(f(n)); (2) O(C * f(n)) = O(f(n)), where C is a positive constant.
(5) Examples of several common time complexities:
(1) O(1)

    temp = i; i = j; j = temp;

The frequency of each of the three statements above is 1, and the execution time of the fragment is a constant independent of the problem size n. Its time complexity is constant order, written T(n) = O(1). Note: if an algorithm's execution time does not grow as the problem size n grows, then even if the algorithm contains thousands of statements its execution time is only a large constant, and its time complexity is O(1).
(2) O(n^2)
2.1.

    sum = 0;                       (executed 1 time)
    for (i = 1; i <= n; i++)       (tested n+1 times)
        for (j = 1; j <= n; j++)   (tested n(n+1) times)
            sum++;                 (executed n^2 times)

Solution: T(n) = 1 + (n+1) + n(n+1) + n^2 = 2n^2 + 2n + 2; dropping the lower-order terms, the constant term, and the coefficient of the highest-order term gives T(n) = O(n^2).
2.2.

    for (i = 1; i < n; i++)
    {
        y = y + 1;                     ①
        for (j = 0; j <= 2 * n; j++)
            x++;                       ②
    }

Solution: the frequency of statement ① is n-1,
and the frequency of statement ② is (n-1)(2n+1) = 2n^2 - n - 1,
so f(n) = 2n^2 - n - 1 + (n-1) = 2n^2 - 2.
Since O(2n^2 - 2) = O(n^2),
the time complexity of this fragment is T(n) = O(n^2).
In general, for a for loop we only count the number of executions of the statements in the loop body, ignoring the loop variable's initialization, increment, final-value test, control transfer, and other components. When there are several nested loop statements, the algorithm's time complexity is determined by the frequency f(n) of the innermost statement of the most deeply nested loop.
(3) O(n)

    a = 0;
    b = 1;                     ①
    for (i = 1; i <= n; i++)   ②
    {
        s = a + b;             ③
        b = a;                 ④
        a = s;                 ⑤
    }
Solution: the frequency of statement ① is 2,
the frequency of statement ② is n+1,
the frequency of statement ③ is n,
the frequency of statement ④ is n,
and the frequency of statement ⑤ is n,
so T(n) = 2 + (n+1) + 3n = 4n + 3 = O(n).
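A runnable version of this fragment (names mine), which counts how many times the loop body executes; incidentally, the fragment steps through a Fibonacci-style recurrence:

```java
public class FibLoop {
    // Each pass performs a fixed amount of work, so the total is linear in n.
    static long run(int n) {
        long a = 0, b = 1, s;
        long bodyCount = 0;            // counts executions of the loop body
        for (int i = 1; i <= n; i++) {
            s = a + b;
            b = a;
            a = s;
            bodyCount++;
        }
        return bodyCount;              // n: one body execution per iteration
    }

    public static void main(String[] args) {
        System.out.println(run(10));   // prints 10
    }
}
```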
(4) O(log2n)

    i = 1;           ①
    while (i <= n)
        i = i * 2;   ②

Solution: the frequency of statement ① is 1.
Let the frequency of statement ② be f(n); then 2^f(n) <= n, so f(n) <= log2n.
Taking the maximum value f(n) = log2n gives
T(n) = O(log2n).
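Doubling i each pass is the same halving pattern that makes binary search run in O(log2n) time: each iteration cuts the remaining range in half. A sketch (names mine) that counts the iterations:

```java
public class BinarySearchSteps {
    // Binary search over a sorted array, counting loop iterations; the
    // count grows like log2n because the range halves on every pass.
    static int steps(int[] sorted, int key) {
        int lo = 0, hi = sorted.length - 1, count = 0;
        while (lo <= hi) {
            count++;
            int mid = (lo + hi) >>> 1;     // overflow-safe midpoint
            if (sorted[mid] == key) break;
            else if (sorted[mid] < key) lo = mid + 1;
            else hi = mid - 1;
        }
        return count;
    }

    public static void main(String[] args) {
        int n = 1 << 20;                       // about a million elements
        int[] a = new int[n];
        for (int i = 0; i < n; i++) a[i] = i;
        // Searching the last element takes roughly log2(n) = 20 iterations.
        System.out.println(steps(a, n - 1));
    }
}
```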
(5) O(n^3)

    for (i = 0; i < n; i++)
    {
        for (j = 0; j < i; j++)
        {
            for (k = 0; k < j; k++)
                x = x + 2;
        }
    }

Solution: when i = m and j = k, the innermost loop runs k times; for a given i = m, j takes the values 0, 1, ..., m-1, so the innermost statement runs 0 + 1 + ... + (m-1) = m(m-1)/2 times for that value of i. Letting i run from 0 to n-1, the innermost statement runs a total of the sum over i of i(i-1)/2 = n(n-1)(n-2)/6 times, so the time complexity is O(n^3).
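The closed form can be confirmed by direct counting; a sketch (names mine) comparing the actual number of innermost executions with n(n-1)(n-2)/6:

```java
public class TripleLoop {
    // Counts executions of the innermost statement in the triple loop above.
    static long count(int n) {
        long c = 0;
        for (int i = 0; i < n; i++)
            for (int j = 0; j < i; j++)
                for (int k = 0; k < j; k++)
                    c++;               // stands in for x = x + 2
        return c;
    }

    public static void main(String[] args) {
        int n = 20;
        long expected = (long) n * (n - 1) * (n - 2) / 6;  // closed form
        System.out.println(count(n) == expected);          // prints true
    }
}
```

Whatever the exact closed form, the count grows with the cube of n, confirming the O(n^3) order.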
(6) Time complexity and space complexity of common algorithms
An empirical rule: with C a constant, if an algorithm's complexity is C, log2n, n, or n*log2n, its time efficiency is high; if it is 2^n, 3^n, or n!, then even a slightly larger n will bring the algorithm to a halt; the orders in between are passable.
Algorithm time complexity analysis is a very important subject: every programmer should master its concepts and basic methods, and be able to explore its essence at the mathematical level in order to understand it accurately.
2. Space complexity of the algorithm
Similar to the discussion of time complexity, the space complexity S(n) of an algorithm is defined as the storage space the algorithm consumes; it too is a function of the problem size n. Asymptotic space complexity is also often referred to simply as space complexity.
Space complexity is a measure of the amount of storage an algorithm temporarily occupies while running. The storage an algorithm occupies in computer memory has three parts: the space for the algorithm itself, the space for its input and output data, and the space it occupies temporarily during execution. The space occupied by the input and output data is determined by the problem to be solved and is passed in via the calling function's parameter list; it does not vary with the algorithm. The space occupied by the algorithm itself is proportional to the length of its code; to compress this space, one must write a shorter algorithm. The temporary working space, however, varies from algorithm to algorithm. Some algorithms need only a small, constant number of temporary work units that do not grow with the problem size; such algorithms are called "in-place" and are memory-efficient, like the ones described in this section. Other algorithms need a number of temporary work units that depends on the problem size n and grows as n grows; when n is large, they occupy many storage units, as is the case for the quicksort and merge sort algorithms described in Chapter 9.
If an algorithm's space complexity is a constant, i.e. it does not change with the size n of the data being processed, it is written O(1); when an algorithm's space complexity is proportional to the base-2 logarithm of n, it is written O(log2n); and when it is linearly proportional to n, it is written O(n). If a formal parameter is an array, only enough space to hold one address pointer passed from the actual argument, i.e. one machine word, needs to be allocated for it; if a formal parameter is a reference, only enough space to store the address of the corresponding argument variable is needed, so that the system can access the argument variable through it automatically.
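The difference between O(1) and O(n) auxiliary space can be seen in two ways of reversing an array: in place with a single temporary variable, or into a fresh copy whose size grows with n. A sketch (names mine):

```java
import java.util.Arrays;

public class ReverseSpace {
    // In-place reversal: O(1) auxiliary space (one temp variable),
    // regardless of the array length n.
    static void reverseInPlace(int[] a) {
        for (int i = 0, j = a.length - 1; i < j; i++, j--) {
            int tmp = a[i]; a[i] = a[j]; a[j] = tmp;
        }
    }

    // Reversal into a copy: O(n) auxiliary space, since the extra
    // array grows with the input size.
    static int[] reverseCopy(int[] a) {
        int[] b = new int[a.length];
        for (int i = 0; i < a.length; i++)
            b[a.length - 1 - i] = a[i];
        return b;
    }

    public static void main(String[] args) {
        int[] a = {1, 2, 3, 4};
        reverseInPlace(a);
        System.out.println(Arrays.toString(a));               // [4, 3, 2, 1]
        System.out.println(Arrays.toString(reverseCopy(a)));  // [1, 2, 3, 4]
    }
}
```

The first method is "in-place" in the sense described above; the second trades memory for leaving the input untouched.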
Reference 1: http://www.cnblogs.com/songqq/archive/2009/10/20/1587122.html
Reference 2: http://www.cppblog.com/85940806/archive/2011/03/12/141672.html