Analysis of algorithm complexity


Summary
This article discusses a fundamental problem in the field of algorithm analysis: the basics of time complexity analysis. It first clarifies why time complexity matters, then develops its mathematical definition and the related derivations in a formal way, so that the concept can be understood at its essence.

Preface
Usually, for a given algorithm, we need to carry out two kinds of analysis. The first is to prove the correctness of the algorithm mathematically; this step mainly uses formal proof methods and related inference patterns, such as loop invariants and mathematical induction. Then, on the basis of having proved the algorithm correct, the second step is to analyze its time complexity. The time complexity of an algorithm reflects how the program's execution time grows as the input size increases, so to a great extent it reflects the quality of the algorithm. Therefore, every programmer should master the basic methods of time complexity analysis.
Yet many people do not understand the concept clearly, mainly because they have not grasped its essence at the mathematical level and are used to understanding it only intuitively. Below, we approach the mathematical nature of an algorithm's time complexity step by step.

Mathematical significance of algorithm time complexity
Mathematically, it is defined as follows: given an algorithm A, if there exists a function f(n) such that for n = k, f(k) is the running time of algorithm A on an input of size k, then f(n) is called the time complexity of algorithm A.
Here we first need to make the notion of input size explicit. Input size is not easy to define precisely; loosely speaking, it is the size of the natural independent unit of input that algorithm A accepts. For example, for a sorting algorithm the input size is generally the number of elements to be sorted, while for an algorithm that multiplies two square matrices of the same order, the input size can be taken as the dimension of a single matrix. For simplicity, in the discussion below we always assume that the input size of the algorithm is given by an integer greater than 0, i.e. n = 1, 2, 3, ..., k, ...
We also know that for the same algorithm, the time of each execution depends not only on the input size but also on the characteristics of the particular input and on the state of the hardware environment at execution time. So it is impossible to obtain a single, precise f(n). To resolve this, we make two stipulations:
1. Hardware and environmental factors are ignored; we assume the hardware and environment are exactly the same on every execution.
2. Differences in input characteristics are analyzed mathematically and incorporated into the analytic form of the function.

Algorithm time complexity analysis example
To make things easier to understand, I will not use the classic textbook examples such as quicksort or merge sort, but a very simple algorithm. Let us first define the problem.
Problem definition:
Input: a sequence with n elements, where n is an integer greater than 0. The elements of the sequence are the n integers from 1 to n, but their order is completely random.
Output: the position of the element n (the first element has position 1).

The problem is very simple. Below is a straightforward algorithm that solves it (pseudocode):

LocationN(A)
{
    for (int i = 1; i <= n; i++)        // T1
    {
        if (A[i] == n)                  // T2
        {
            return i;                   // T3
        }
    }
}

Let us examine this algorithm, where T1, T2, and T3 denote the time it takes to execute the corresponding line of code once.
First, the input size n is one factor affecting the algorithm's execution time. With n fixed, different input sequences also affect the execution time. In the best case, n sits in the first position of the sequence, and the running time is T1+T2+T3. In the worst case, n sits in the last position, and the running time is n*T1 + n*T2 + T3 = (T1+T2)*n + T3. As you can see, the best-case running time is a constant, while the worst-case running time is a linear function of the input size. So what about the average case?
The problem definition says the input sequence is completely random, i.e. n can appear in any of the positions 1..n with equal probability 1/n. The average number of loop executions is the mathematical expectation of that number; writing pos for the position where n appears, it works out as follows:

E = P(pos=1)*1 + P(pos=2)*2 + ... + P(pos=n)*n
  = (1/n) * (1 + 2 + ... + n)
  = (1/n) * (n*(1+n)/2)
  = (n+1)/2

That is, on average the for loop executes (n+1)/2 times, and the average running time is (T1+T2)*(n+1)/2 + T3.
Thus we conclude that:
T1+T2+T3 <= f(n) <= (T1+T2)*n + T3, and on average f(n) = (T1+T2)*(n+1)/2 + T3.
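
To see this average in action, here is a minimal C sketch of my own (not part of the original analysis) that runs LocationN's loop over many random permutations and measures the mean number of iterations; it should land near (n+1)/2:

#include <stdio.h>
#include <stdlib.h>

/* Fisher-Yates shuffle of a[0..n-1] */
static void shuffle(int a[], int n) {
    for (int i = n - 1; i > 0; i--) {
        int j = rand() % (i + 1);
        int t = a[i]; a[i] = a[j]; a[j] = t;
    }
}

int main(void) {
    enum { N = 1000, TRIALS = 100000 };
    static int a[N];
    long long total = 0;

    srand(42);
    for (int t = 0; t < TRIALS; t++) {
        for (int i = 0; i < N; i++) a[i] = i + 1;   /* the integers 1..N */
        shuffle(a, N);
        for (int i = 0; i < N; i++) {               /* LocationN's loop */
            total++;
            if (a[i] == N) break;
        }
    }
    /* theory predicts (N + 1) / 2 = 500.5 */
    printf("average iterations: %.2f\n", (double)total / TRIALS);
    return 0;
}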

Asymptotic time complexity of the algorithm
In the analysis above, we worked out the time complexity f(n) of the algorithm exactly. Often, however, such a precise analysis is unnecessary, for the following reasons:
1. For more complex algorithms, a precise analysis is itself very complicated.
2. In fact, most of the time we do not care about the exact form of f(n), only about its order of magnitude.
Based on this, the concept of asymptotic time complexity was proposed. Before formally defining it, a few mathematical definitions are needed:

Definition 1: Θ(g(n)) = { f(n) : there exist positive constants c1, c2 and a positive integer n0 such that 0 <= c1*g(n) <= f(n) <= c2*g(n) for all n >= n0 }
Definition 2: O(g(n)) = { f(n) : there exist a positive constant c and a positive integer n0 such that 0 <= f(n) <= c*g(n) for all n >= n0 }
Definition 3: Ω(g(n)) = { f(n) : there exist a positive constant c and a positive integer n0 such that 0 <= c*g(n) <= f(n) for all n >= n0 }

As you can see, each of the three definitions actually defines a set of functions; they differ only in the condition the functions in the set must satisfy. With the above definitions in hand, we can define asymptotic time complexity.
But one question remains: f(n) is not a single fixed function, it varies within a range, so which f(n) do we care about? In general, when analyzing an algorithm we use the worst-case f(n) to evaluate its efficiency, for two reasons:
1. If we know the worst case, we can guarantee that the algorithm never performs worse than that on any input.
2. In many cases the worst case actually occurs with substantial probability, for instance when searching for an element that does not exist in the sequence. Moreover, in many cases the average-case asymptotic time complexity and the worst-case asymptotic time complexity are of the same order of magnitude.

This gives the following definition: let f(n) be the worst-case running time of algorithm A. If f(n) is in Θ(g(n)), then the asymptotic time complexity of algorithm A is g(n), and g(n) is an asymptotically tight bound for f(n).

Taking the example above: by this definition f(n) = (T1+T2)*n + T3, and an asymptotically tight bound for f(n) is n, which is proved as follows:

Proof:
Let c1 = T1+T2, c2 = T1+T2+T3, n0 = 2.
Since T1, T2, T3 are all greater than 0,
for all n >= n0 we have 0 < c1*n <= f(n) <= c2*n, i.e. 0 < (T1+T2)*n <= (T1+T2)*n + T3 <= (T1+T2+T3)*n always holds.
So f(n) belongs to Θ(n),
and n is an asymptotically tight bound for f(n).
Q.E.D.

In practice, we usually use asymptotic time complexity rather than exact time complexity to analyze the efficiency of algorithms. An algorithm of asymptotic complexity n is generally considered better than one of asymptotic complexity n^2. This is not to say that the former is more efficient in every situation, but that once the input size is large enough (greater than the threshold n0), the former's worst case is always better than the latter's. Experience shows that this style of analysis is reasonable and effective in practice.
Similarly, upper and lower bounds on an algorithm's time complexity can be defined:
Let f(n) be the worst-case running time of algorithm A. If f(n) is in O(g(n)), then the asymptotic time complexity of algorithm A is bounded above by g(n), and g(n) is an asymptotic upper bound of f(n).
Let f(n) be the worst-case running time of algorithm A. If f(n) belongs to Ω(g(n)), then the asymptotic time complexity of algorithm A is bounded below by g(n), and g(n) is an asymptotic lower bound of f(n).

It must be noted that, since we analyze the worst-case f(n), we can guarantee with certainty that once the input size exceeds the threshold n0, the algorithm's running time never exceeds the asymptotic upper bound; but we cannot guarantee that the running time never falls below the asymptotic lower bound. We can only guarantee that the algorithm's worst-case running time does not fall below the asymptotic lower bound.

Summary
Algorithm time complexity analysis is a very important subject. Every programmer should master its concepts and basic methods, and be willing to explore its essence at the mathematical level in order to understand it accurately. In the analysis above we discussed only tight bounds; in reality bounds are further divided into tight and non-tight ones, and interested readers can consult the relevant literature.
That is all for this article; I hope its content is helpful to you.


The content above comes from: http://www.cnblogs.com/leoo2sk/archive/2008/11/14/1332381.html. Since I felt it was not very clear, I looked for another blog post to study, which follows.


One notion is time complexity, and the other is asymptotic time complexity. The former is the time an algorithm consumes, as a function of the problem size n; the latter refers to the order of magnitude of the algorithm's time complexity as the problem size tends to infinity.

When we evaluate the time performance of an algorithm, the main criterion is its asymptotic time complexity. Therefore, in algorithm analysis the two are often not distinguished, and the asymptotic time complexity T(n) = O(f(n)) is simply called the time complexity, where f(n) is generally the frequency of the most frequently executed statement in the algorithm.

In addition, the frequency of the statements in an algorithm is related not only to the problem size but also to the values of the elements in the input instance. However, we always consider the time complexity in the worst case, to guarantee that the algorithm's running time is never longer than that.

The common time complexities, in ascending order of magnitude, are: constant order O(1), logarithmic order O(log2n), linear order O(n), linear-logarithmic order O(nlog2n), square order O(n^2), cubic order O(n^3), k-th power order O(n^k), and exponential order O(2^n).
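
To get a feel for how far apart these orders are, here is a small C sketch of my own (not from the original article; compile with -lm) that tabulates them for a few values of n:

#include <stdio.h>
#include <math.h>

int main(void) {
    /* print each common order of growth for a few sample sizes */
    for (int n = 10; n <= 40; n += 10) {
        printf("n=%2d  log2n=%5.1f  n*log2n=%7.1f  n^2=%6.0f  n^3=%8.0f  2^n=%.3e\n",
               n, log2(n), n * log2(n), pow(n, 2), pow(n, 3), pow(2, n));
    }
    return 0;
}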

1. Big-O notation

Definition

Suppose the time complexity of a program is represented by a function T(n). Consider the following search algorithm:

int seqsearch(int a[], const int n, const int x)
{
    int i = 0;
    for (; i < n && a[i] != x; i++)   /* test the bound before reading a[i] */
        ;
    if (i == n) return -1;
    else return i;
}
This program compares the input value sequentially with the elements of the array until it finds an equal element.
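
A minimal driver for it might look like this (my own addition; it assumes the seqsearch definition above is compiled alongside, and the array contents are made up for illustration):

#include <stdio.h>

int seqsearch(int a[], const int n, const int x);   /* as defined above */

int main(void) {
    int a[] = { 7, 3, 9, 1, 5 };
    printf("%d\n", seqsearch(a, 5, 9));   /* 2: found at index 2 */
    printf("%d\n", seqsearch(a, 5, 4));   /* -1: not in the array */
    return 0;
}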

If the value is found at the first element, one comparison is needed; if at the second element, 2 comparisons; ...; if at the n-th element, n comparisons. For an array of n elements, if every element is equally likely to be the one sought, the average number of comparisons for a successful search is:

f(n) = (1/n)(n + (n-1) + (n-2) + ... + 1) = (n+1)/2 = O(n)

This is the origin of the legendary big-O function.

Expressing complexity with big O
To analyze an algorithm fully, one should consider its time cost in the worst case, the best case, and the average case. For the worst case, the usual formulation of big-O notation (note the phrase "usual formulation") is: if and only if there exist positive constants c and n0 such that T(n) <= c*f(n) for all n >= n0, we say the asymptotic time complexity of the algorithm is T(n) = O(f(n)). This rests on the notion of limits from first-year calculus. Here f(n) = (n+1)/2, so c*f(n) is a linear function: plotted on a graph, if T(n) stays below c*f(n), then T(n) = O(f(n)).

For the logarithmic order, we write the big-O notation as O(log2n).


Rules
1) Addition rule
T(n,m) = T1(n) + T2(m) = O(max(f(n), g(m)))
2) Multiplication rule
T(n,m) = T1(n) * T2(m) = O(f(n) * g(m))
(a short sketch after this list of rules illustrates rules 1 and 2)
3) A special case
Big-O notation has one special case: if T1(n) = O(c), where c is an arbitrary constant independent of n, and T2(n) = O(f(n)), then
T(n) = T1(n) * T2(n) = O(c*f(n)) = O(f(n)).
In other words, in big-O notation any nonzero positive constant is of the same order of magnitude, written O(1).
4) A rule of thumb
The following ordering of complexities holds:
c < log2n < n < n*log2n < n^2 < n^3 < 2^n < 3^n < n!
where c is a constant. If an algorithm's complexity is c, log2n, n, or n*log2n, its time efficiency is high; if it is 2^n, 3^n, or n!, even a slightly larger n makes the algorithm unusable; the orders in between are passable.
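
To make rules 1) and 2) concrete, here is a small sketch of my own: two loops run one after the other illustrate the addition rule, and the same loops nested illustrate the multiplication rule.

#include <stdio.h>

int main(void) {
    int n = 1000, m = 500;
    long ops = 0;

    /* addition rule: independent loops in sequence cost O(n) + O(m) = O(max(n, m)) */
    for (int i = 0; i < n; i++) ops++;
    for (int j = 0; j < m; j++) ops++;
    printf("sequential: %ld operations\n", ops);    /* n + m */

    /* multiplication rule: nested loops cost O(n) * O(m) = O(n * m) */
    ops = 0;
    for (int i = 0; i < n; i++)
        for (int j = 0; j < m; j++)
            ops++;
    printf("nested: %ld operations\n", ops);        /* n * m */
    return 0;
}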

1) Basic knowledge points: a program without loops has constant complexity; one level of looping gives complexity O(n); two nested levels give O(n^2). (I use ^2 to denote the square; likewise ^3 denotes the cube.)

2) Time complexity of the standard deviation, residual, and information entropy of a two-dimensional matrix, and of fft2, dwt2, dct2: the standard deviation and residual are roughly O(n); fft2 is O(n*log n), and dwt2 is probably also O(n*log n); information entropy first requires estimating probabilities; the DCT process is the same as in JPEG: as in JPEG, two transform matrices are applied, Y = T*X*T', Z = Y.*mask, and the image is likewise divided into 8*8 sub-blocks.

3) Examples:

1. Let three functions f, g, h be given by f(n) = 100n^3 + n^2 + 1000, g(n) = 25n^3 + 5000n^2, h(n) = n^1.5 + 5000n*lg n.

Determine whether each of the following relationships holds:

(1) f(n) = O(g(n))

(2) g(n) = O(f(n))

(3) h(n) = O(n^1.5)

(4) h(n) = O(n*lg n)

Here let us review the notation for asymptotic time complexity, T(n) = O(f(n)). The "O" here is a mathematical symbol whose strict definition is: "if T(n) and f(n) are two functions defined on the set of positive integers, then T(n) = O(f(n)) means that there exist positive constants c and n0 such that 0 <= T(n) <= c*f(n) whenever n >= n0." In plainer words: as the integer variable n tends to infinity, the ratio of the two functions is bounded by a constant. Seen this way, the questions are easy to settle.

(1) Holds. The highest-order term of both functions is n^3, so as n→∞ the ratio of the two functions tends to a constant, and the relationship holds.

(2) Holds. By the same reasoning.

(3) Holds. Since n*lg n grows more slowly than n^1.5, the ratio of h(n) to n^1.5 stays bounded as n→∞.

(4) Does not hold. Since n^1.5 grows faster than n*lg n, the ratio of h(n) to n*lg n is unbounded as n→∞, so the relationship does not hold.
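To convince yourself of (3) and (4) numerically, here is a sketch of mine that prints the two ratios (taking lg as the base-10 logarithm; the base does not change the conclusion; compile with -lm):

#include <stdio.h>
#include <math.h>

/* h(n) = n^1.5 + 5000*n*lg(n) from the example above */
static double h(double n) { return pow(n, 1.5) + 5000.0 * n * log10(n); }

int main(void) {
    for (double n = 1e3; n <= 1e9; n *= 100) {
        /* h/n^1.5 stays bounded, so (3) holds; h/(n*lg n) keeps growing, so (4) fails */
        printf("n=%.0e  h/n^1.5 = %8.2f   h/(n*lgn) = %8.2f\n",
               n, h(n) / pow(n, 1.5), h(n) / (n * log10(n)));
    }
    return 0;
}
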

2. Let n be a positive integer. Using big-O notation, express the execution time of each of the following program segments as a function of n.

(1) i = 1; k = 0;
while (i < n)
{
    k = k + 10 * i;
    i++;
}

Solution: T(n) = n - 1, so T(n) = O(n); the running time grows on the linear order.

(2) x = n;   // n > 1
y = 0;
while (x >= (y + 1) * (y + 1))
    y++;

Solution: T(n) = n^(1/2), so T(n) = O(n^(1/2)); in the worst case y starts at 0 and the loop runs about n^(1/2) times, so the running time grows on the square-root order.
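
An empirical check of the square-root claim (my own sketch; compile with -lm):

#include <stdio.h>
#include <math.h>

int main(void) {
    for (long n = 100; n <= 100000000; n *= 100) {
        long x = n, y = 0, count = 0;
        while (x >= (y + 1) * (y + 1)) {   /* the loop from segment (2) */
            y++;
            count++;
        }
        printf("n=%9ld  iterations=%6ld  sqrt(n)=%8.1f\n",
               n, count, sqrt((double)n));
    }
    return 0;
}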

(3) x = 91; y = 100;
while (y > 0)
    if (x > 100)
        { x = x - 10; y--; }
    else
        x++;

Answer: T(n) = O(1). This program looks a little scary, looping about 1,100 times in total, but do we see an n anywhere? No. The running time of this program is independent of n; even if it looped for ten thousand years we would not care. It is just a constant-order function.
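
Counting the iterations confirms both the constant behavior and the exact count (a sketch I added; it comes to about 1,100 rather than 1,000, since each decrement of y costs ten increments of x plus one reset):

#include <stdio.h>

int main(void) {
    int x = 91, y = 100;
    long count = 0;
    while (y > 0) {                        /* segment (3) from above */
        count++;
        if (x > 100) { x = x - 10; y--; }
        else         { x = x + 1; }
    }
    printf("total iterations: %ld\n", count);   /* prints 1100 */
    return 0;
}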

The same problem can be solved by different algorithms, and the quality of an algorithm affects the efficiency of the algorithm and even of the whole program. The purpose of algorithm analysis is to select a suitable algorithm or to improve one. An algorithm is evaluated mainly in terms of time complexity and space complexity.

1. Time complexity

(1) Time frequency
The time an algorithm takes to execute cannot be computed theoretically; it must be measured by running it on a machine. But we neither can nor need to test every algorithm; we only need to know which algorithm takes more time and which takes less. The time an algorithm takes is proportional to the number of times its statements are executed: the algorithm whose statements execute more times takes more time. The number of times the statements in an algorithm are executed is called the statement frequency or time frequency, denoted T(n).

(2) Time complexity
In the time frequency just mentioned, n is called the problem size, and as n keeps changing, the time frequency T(n) changes with it. But sometimes we want to know what law T(n) follows as it changes. For this purpose we introduce the concept of time complexity.

In general, the number of times the basic operations of an algorithm are repeated is a function of the problem size n, denoted T(n). If there is an auxiliary function f(n) such that, as n approaches infinity, the limit of T(n)/f(n) is a nonzero constant, then f(n) is said to be a function of the same order of magnitude as T(n), written T(n) = O(f(n)); O(f(n)) is called the asymptotic time complexity of the algorithm, or the time complexity for short.

Among algorithms, if the number of statement executions is a constant, the time complexity is O(1). Also, time frequencies may differ while the time complexity is the same; for example, T(n) = n^2 + 3n + 4 and T(n) = 4n^2 + 2n + 1 have different frequencies but the same time complexity, namely O(n^2).

In ascending order of magnitude, the common time complexities are:
constant order O(1), logarithmic order O(log2n), linear order O(n),
linear-logarithmic order O(nlog2n), square order O(n^2), cubic order O(n^3), ...,
k-th power order O(n^k), exponential order O(2^n). As the problem size n keeps increasing, the time complexity grows and the algorithm's efficiency falls.

2. Space complexity
Like time complexity, space complexity is a measure of the amount of storage space an algorithm requires while it executes in a computer. It is written:
S(n) = O(f(n))

Generally we count only the auxiliary storage units used beyond the normal memory overhead. The discussion is similar to that of time complexity and is not pursued further here.

(3) Evaluating an algorithm's time performance by its asymptotic time complexity
The time performance of an algorithm is evaluated mainly by the order of magnitude of its time complexity, i.e. its asymptotic time complexity.
Example 3.7: Two algorithms A1 and A2 solve the same problem, with time complexities T1(n) = 100n^2 and T2(n) = 5n^3.
(1) When the input size n < 20, T1(n) > T2(n), and algorithm A2 takes less time.
(2) As the problem size n grows, the ratio of the two algorithms' time costs, 5n^3 / 100n^2 = n/20, grows with n. That is, when the problem size is large, algorithm A1 is more effective than algorithm A2.
Their asymptotic time complexities O(n^2) and O(n^3) evaluate the time quality of the two algorithms at a macroscopic level. In algorithm analysis, the time complexity and asymptotic time complexity of an algorithm are often not distinguished: the asymptotic time complexity T(n) = O(f(n)) is simply called the time complexity, where f(n) is usually the frequency of the most frequently executed statement in the algorithm.
Example 3.8: The time complexity of the algorithm matrixmultiply is T(n) = O(n^3), where f(n) = n^3 is the frequency of statement (5) in that algorithm. The following examples show how to find the time complexity of an algorithm.

Example 3.9: Swap the contents of i and j.
temp = i;
i = j;
j = temp;

The frequency of each of the three statements above is 1. The execution time of this segment is a constant independent of the problem size n, so the algorithm's time complexity is of constant order, written T(n) = O(1).
If an algorithm's execution time does not grow with the problem size n, then even if the algorithm contains thousands of statements its execution time is merely a large constant. The time complexity of such algorithms is O(1).

Example 3.10: Variable counting, part one.
(1) x = 0; y = 0;
(2) for (k = 1; k <= n; k++)
(3)     x++;
(4) for (i = 1; i <= n; i++)
(5)     for (j = 1; j <= n; j++)
(6)         y++;

For a loop statement we generally consider only the number of executions of the statements in the loop body, ignoring the step increment, the termination test, control transfer, and the other components of the statement. So the most frequently executed statement in the segment above is (6), with frequency f(n) = n^2, and the time complexity of the segment is T(n) = O(n^2).
When there are several loop statements, the time complexity of the algorithm is determined by the frequency f(n) of the innermost statement of the loop with the most nesting levels.

Example 3.11: Variable counting, part two.
(1) x = 1;
(2) for (i = 1; i <= n; i++)
(3)     for (j = 1; j <= i; j++)
(4)         for (k = 1; k <= j; k++)
(5)             x++;

The most frequently executed statement in this segment is (5). The number of executions of the inner loops is not directly tied to the problem size n but to the values of the outer loop variables, while the outermost loop count is directly tied to n, so the number of executions of statement (5) can be analyzed from the inner loops outward:

f(n) = sum_{i=1..n} sum_{j=1..i} j = sum_{i=1..n} i(i+1)/2 = n(n+1)(n+2)/6

So the time complexity of the segment is T(n) = O(n^3/6 + lower-order terms) = O(n^3).
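
A quick check of that closed form (my own sketch):

#include <stdio.h>

int main(void) {
    for (long n = 1; n <= 100; n *= 10) {
        long count = 0;
        for (long i = 1; i <= n; i++)
            for (long j = 1; j <= i; j++)
                for (long k = 1; k <= j; k++)
                    count++;               /* stands in for statement (5) */
        printf("n=%3ld  count=%8ld  n(n+1)(n+2)/6=%8ld\n",
               n, count, n * (n + 1) * (n + 2) / 6);
    }
    return 0;
}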

(4) The time complexity of an algorithm depends not only on the problem size but also on the initial state of the input instance.
Example 3.12: An algorithm to find a given value k in the array a[0..n-1] is roughly as follows:
(1) i = n - 1;
(2) while (i >= 0 && (a[i] != k))
(3)     i--;
(4) return i;

The frequency of statement (3) in this algorithm is related not only to the problem size n but also to the values of the elements of a and of k in the input instance:
① if no element of a is equal to k, the frequency of statement (3) is f(n) = n;
② if the last element of a equals k, the frequency of statement (3) is the constant 0.
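
A runnable rendering of this search covering both boundary cases (the array values are mine):

#include <stdio.h>

/* search a[0..n-1] from the back for k; return its index, or -1 if absent */
int find_last(const int a[], int n, int k) {
    int i = n - 1;
    while (i >= 0 && a[i] != k)   /* statement (3) runs once per decrement */
        i--;
    return i;
}

int main(void) {
    int a[] = { 4, 8, 15, 16, 23, 42 };
    printf("%d\n", find_last(a, 6, 99));   /* -1: case ①, the loop body ran n times */
    printf("%d\n", find_last(a, 6, 42));   /*  5: case ②, the loop body ran 0 times */
    return 0;
}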

(5) Worst-case time complexity and average time complexity
The time complexity in the worst case is called the worst-case time complexity. In general, unless specifically stated otherwise, the time complexity under discussion is the worst-case time complexity.
The reason is that the worst-case time complexity is an upper bound on the algorithm's running time over all input instances, which guarantees that the algorithm never runs longer than that.
Example 3.13: The search algorithm of Example 3.12 has worst-case time complexity T(n) = O(n), which means that for any input instance the algorithm's running time cannot exceed O(n).
The average time complexity is the expected running time of the algorithm when all possible input instances appear with equal probability.
The common time complexities in ascending order of magnitude are: constant order O(1), logarithmic order O(log2n), linear order O(n), linear-logarithmic order O(nlog2n), square order O(n^2), cubic order O(n^3), ..., k-th power order O(n^k), exponential order O(2^n). Clearly, an algorithm with exponential time complexity O(2^n) is extremely inefficient and becomes unusable once n is even slightly large.
Similar to the discussion of time complexity, the space complexity S(n) of an algorithm is defined as the storage space the algorithm consumes; it too is a function of the problem size n. Asymptotic space complexity is also often referred to simply as space complexity. Together, the time complexity and space complexity of an algorithm are called the complexity of the algorithm.
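
For instance (an illustration of mine, not from the text), summing an array needs only O(1) auxiliary space, while building a reversed copy needs O(n):

#include <stdio.h>
#include <stdlib.h>

/* O(1) auxiliary space: a constant number of scalars, whatever n is */
long sum(const int a[], int n) {
    long s = 0;
    for (int i = 0; i < n; i++) s += a[i];
    return s;
}

/* O(n) auxiliary space: a second array proportional to the input */
int *reversed(const int a[], int n) {
    int *r = malloc(n * sizeof *r);
    if (r != NULL)
        for (int i = 0; i < n; i++) r[i] = a[n - 1 - i];
    return r;
}

int main(void) {
    int a[] = { 1, 2, 3, 4 };
    printf("%ld\n", sum(a, 4));                 /* 10 */
    int *r = reversed(a, 4);
    if (r) { printf("%d\n", r[0]); free(r); }   /* 4 */
    return 0;
}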

Copyright notice: Reprinting is welcome; please include the original address when you reprint. Thank you.

