Python data structures and algorithms: algorithm analysis

Source: Internet
Author: User

In computer science, algorithm analysis (analysis of algorithms) is the process of determining the amount of computing resources (such as computation time and memory usage) consumed by executing a given algorithm. The efficiency or complexity of an algorithm is expressed theoretically as a function. Its domain is the length of the input data, and its range is usually the number of steps (time complexity) or the number of memory locations (space complexity). Algorithm analysis is an important part of computational complexity theory.

Original article: http://www.cnblogs.com/archimedes/p/python-datastruct-algorithm-analysis.html. Please cite the source address when reprinting.

An interesting question often arises: given two seemingly different programs, which one is better?

To answer this question, we need to remember that there is an important difference between a program and the underlying algorithm that the program represents. An algorithm is a generic, step-by-step list of instructions for solving a problem: a method that, given any instance of the problem as input, produces the desired result. A program, on the other hand, is an algorithm encoded in some programming language. Many programs may implement the same algorithm, depending on the programmer and the programming language used.

To explore this difference further, consider the following function. It solves a familiar problem: computing the sum of the first n natural numbers. The solution iterates over the n integers, adding each one to an accumulator variable.

    def sumOfN(n):
        theSum = 0                    # accumulator
        for i in range(1, n + 1):
            theSum = theSum + i
        return theSum

    print(sumOfN(10))

Now look at the following code. At first glance it may look strange, but on closer inspection you will see that this function does essentially the same work as the previous one. The reason this is not obvious is poor coding: we did not use good variable names, which hurts readability, and we used an extra assignment statement in the loop that was not really necessary.

    def foo(tom):
        fred = 0
        for bill in range(1, tom + 1):
            barney = bill             # unnecessary extra assignment
            fred = fred + barney
        return fred

    print(foo(10))

Which code is better? The answer depends on your criteria. If you care only about readability, the function sumOfN is certainly better than foo. In fact, you have probably seen many examples in your introductory programming course whose goal was to help you write programs that are easy to read and understand. Here, however, we are also interested in characterizing the algorithm itself.

As an alternative to space requirements, we can analyze and compare algorithms based on the amount of time they need to execute. This measure is sometimes referred to as the "execution time" or "running time" of the algorithm. One way to measure the execution time of the function sumOfN is to do a benchmark analysis. In Python, we can benchmark a function by noting its starting time and ending time on the system we are using. The time module has a function called time that returns the current system time in seconds. By calling this function twice, at the beginning and at the end, and then computing the difference, we get the elapsed execution time in seconds.

Listing 1

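    import time

    def sumOfN2(n):
        start = time.time()           # clock reading before the summation
        theSum = 0
        for i in range(1, n + 1):
            theSum = theSum + i
        end = time.time()             # clock reading after the summation
        return theSum, end - start

Listing 1 shows the sumOfN function with the timing calls embedded before and after the summation. The function returns a tuple consisting of the result and the time (in seconds) required for the calculation. The test results are as follows: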

    >>> for i in range(5):
    ...     print("Sum is %d required %10.7f seconds" % sumOfN2(10000))
    Sum is 50005000 required 0.0018950 seconds
    Sum is 50005000 required 0.0018620 seconds
    Sum is 50005000 required 0.0019171 seconds
    Sum is 50005000 required 0.0019162 seconds
    Sum is 50005000 required 0.0019360 seconds

We find that the time is quite consistent: executing the code takes about 0.0019 seconds on average. What happens if we increase n to 100,000?

    >>> for i in range(5):
    ...     print("Sum is %d required %10.7f seconds" % sumOfN2(100000))
    Sum is 5000050000 required 0.0199420 seconds
    Sum is 5000050000 required 0.0180972 seconds
    Sum is 5000050000 required 0.0194821 seconds
    Sum is 5000050000 required 0.0178988 seconds
    Sum is 5000050000 required 0.0188949 seconds

Again, the time for each run, although longer, is very consistent, averaging about 10 times as long. With n increased to 1,000,000 we get:

    >>> for i in range(5):
    ...     print("Sum is %d required %10.7f seconds" % sumOfN2(1000000))
    Sum is 500000500000 required 0.1948988 seconds
    Sum is 500000500000 required 0.1850290 seconds
    Sum is 500000500000 required 0.1809771 seconds
    Sum is 500000500000 required 0.1729250 seconds
    Sum is 500000500000 required 0.1646299 seconds

In this case, the average execution time once again turns out to be about 10 times that of the previous run.
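The measurements above can be repeated with a small reusable helper. The following is a minimal sketch and not one of the article's listings: the names `sum_of_n` and `time_call` are hypothetical, and it uses `time.perf_counter()`, a clock intended for measuring short durations, in place of `time.time()`. The absolute times it prints will differ from the figures above depending on the machine.

```python
import time

def sum_of_n(n):
    """Sum the first n natural numbers with a loop (same idea as sumOfN)."""
    total = 0
    for i in range(1, n + 1):
        total += i
    return total

def time_call(func, arg):
    """Hypothetical helper: return (result, elapsed seconds) for func(arg)."""
    start = time.perf_counter()
    result = func(arg)
    elapsed = time.perf_counter() - start
    return result, elapsed

# Grow n by a factor of 10 each time and observe how the elapsed time grows
for n in (10_000, 100_000, 1_000_000):
    result, elapsed = time_call(sum_of_n, n)
    print("n = %9d  sum = %d  time = %.7f s" % (n, result, elapsed))
```

Keeping the timing code outside the function being measured means the same helper can be reused for any single-argument function.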

Now look at Listing 2, which proposes a different way to solve the summation problem. This function, sumOfN3, uses the closed-form equation $\sum_{i=1}^{n} i = \frac{n(n+1)}{2}$ to compute the sum of the first n natural numbers without iterating.

Listing 2

    def sumOfN3(n):
        # n * (n + 1) is always even, so integer division gives the exact sum
        return (n * (n + 1)) // 2

    print(sumOfN3(10))
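As a quick sanity check, the loop and the closed-form equation can be compared directly. This is only an illustrative sketch; the function names `sum_loop` and `sum_formula` are made up for this example.

```python
def sum_loop(n):
    """Iterative sum of 1..n."""
    total = 0
    for i in range(1, n + 1):
        total += i
    return total

def sum_formula(n):
    """Closed-form sum of 1..n; n * (n + 1) is always even."""
    return n * (n + 1) // 2

# The two approaches should agree for every n we try
for n in (0, 1, 10, 1000, 10_000):
    assert sum_loop(n) == sum_formula(n)

print(sum_formula(10))  # 55
```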

If we do the same benchmark measurement for sumOfN3, using five different values of n (10,000, 100,000, 1,000,000, 10,000,000, and 100,000,000), we get the following results:

    Sum is 50005000 required 0.00000095 seconds
    Sum is 5000050000 required 0.00000191 seconds
    Sum is 500000500000 required 0.00000095 seconds
    Sum is 50000005000000 required 0.00000095 seconds
    Sum is 5000000050000000 required 0.00000119 seconds

There are two things to notice about this output. First, the running times are much shorter than in any of the previous examples. Second, they are very consistent no matter what the value of n is.

But what does this benchmark really tell us? Intuitively, we can see that the iterative solution seems to be doing more work, since some program steps are repeated. This is likely the reason it takes longer. As we increase n, the execution time of the iterative solution increases as well. However, there is a problem. If we ran the same function on a different computer or used a different programming language, we would probably get different results. On an older computer, even sumOfN3 could take noticeably longer to execute.

We need a better way to characterize the execution time of these algorithms. Benchmarking measures the actual execution time. It does not really give us a useful measure, because it depends on the particular machine, the moment of the run, the compiler, and the programming language. Instead, we would like a characterization that is independent of the program or computer being used. Such a measure would be useful for judging the algorithm alone and for comparing algorithms across implementations.
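One machine-independent characterization is the number of basic steps an algorithm executes as a function of n. The sketch below counts assignment statements in the iterative summation; choosing assignments as the basic unit is an assumption made for illustration, and `sum_of_n_counted` is a hypothetical name.

```python
def sum_of_n_counted(n):
    """Return (sum, number of assignment statements executed)."""
    steps = 0
    total = 0
    steps += 1                # the initial assignment total = 0
    for i in range(1, n + 1):
        total = total + i
        steps += 1            # one assignment per loop iteration
    return total, steps

for n in (10, 100, 1000):
    total, steps = sum_of_n_counted(n)
    print("n = %4d  sum = %6d  steps = %d" % (n, total, steps))
```

The step count is 1 + n on any machine and in any language, which is the kind of measure that allows algorithms to be compared independently of their implementations.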

An anagram detection example

A good example of algorithms with different orders of magnitude is the classic anagram detection problem for strings. One string is an anagram of another if the second is simply a rearrangement of the letters of the first. For example, 'heart' and 'earth' are anagrams of each other. The strings 'python' and 'typhon' are anagrams as well. To simplify the discussion, we assume that the two strings have the same length and are made up of the 26 lowercase English letters. Our goal is to write a Boolean function that takes two strings and returns whether they are anagrams.

Method 1: Checking off

Our first solution to the anagram problem checks whether each letter of the first string occurs in the second string. If every letter can be checked off, the two strings are anagrams. A letter that has been checked off is replaced with Python's special value None. However, since strings are immutable in Python, the first step is to convert the second string to a list. Look at the following code:

    def anagramSolution1(s1, s2):
        alist = list(s2)              # mutable copy of s2 for checking off
        pos1 = 0
        stillOK = True

        while pos1 < len(s1) and stillOK:
            pos2 = 0
            found = False
            # search the list for the current letter of s1
            while pos2 < len(alist) and not found:
                if s1[pos1] == alist[pos2]:
                    found = True
                else:
                    pos2 = pos2 + 1

            if found:
                alist[pos2] = None    # check the letter off
            else:
                stillOK = False

            pos1 = pos1 + 1

        return stillOK

    print(anagramSolution1('abcd', 'dcba'))
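
To see how the work in the checking-off approach grows, the character comparisons can be counted. The sketch below is an instrumented variant written for illustration (the name `count_comparisons` is hypothetical). For a reversed string of n distinct letters, each letter of the first string is found one position earlier than the previous one, so the total is n + (n-1) + ... + 1 = n(n+1)/2 comparisons, which is O(n^2).

```python
import string

def count_comparisons(s1, s2):
    """Run the checking-off strategy and count character comparisons."""
    alist = list(s2)
    comparisons = 0
    for ch in s1:
        found = False
        for pos, other in enumerate(alist):
            comparisons += 1
            if ch == other:
                alist[pos] = None     # check off the matched letter
                found = True
                break
        if not found:
            return False, comparisons
    return True, comparisons

# Reversed strings of distinct letters force long searches
for n in (4, 8, 16):
    s1 = string.ascii_lowercase[:n]
    result, comparisons = count_comparisons(s1, s1[::-1])
    print(n, result, comparisons, n * (n + 1) // 2)
```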

Method 2: Sort and compare

Another solution is based on the observation that even though s1 and s2 are different strings, they are anagrams if and only if they contain exactly the same letters. So if we first sort the characters of each string alphabetically, two anagrams will produce exactly the same result. In Python we can use the built-in sort method of lists to do the sorting. Look at the following code:

    def anagramSolution2(s1, s2):
        alist1 = list(s1)
        alist2 = list(s2)

        alist1.sort()
        alist2.sort()

        pos = 0
        matches = True

        # compare the sorted lists position by position
        while pos < len(s1) and matches:
            if alist1[pos] == alist2[pos]:
                pos = pos + 1
            else:
                matches = False

        return matches

    print(anagramSolution2('abcde', 'edcba'))

At first glance, you may think that this algorithm is O(n), since there is only one simple loop that compares the n characters after sorting. However, the two calls to the Python sort method are not free. As we will see later, sorting typically takes O(n^2) or O(n log n) time, so the sorting operations dominate the loop.
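The same idea can be written more compactly with the built-in `sorted` function, which returns a new sorted list for any iterable. This is a sketch of an equivalent formulation rather than the listing above; the name `is_anagram_sorted` is hypothetical. The cost is still dominated by the two sorts.

```python
def is_anagram_sorted(s1, s2):
    # sorted() returns a sorted list of the characters of each string;
    # two lists compare equal only if they match element by element
    return sorted(s1) == sorted(s2)

print(is_anagram_sorted('abcde', 'edcba'))  # True
print(is_anagram_sorted('abcd', 'abce'))    # False
```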

Method 3: Brute force

A brute force technique tries to exhaust all possibilities. For this problem, we could simply generate every possible string from the letters of s1 and see whether s2 occurs among them. However, this approach has a difficulty. When enumerating all the strings that can be built from s1, there are n possible first letters, n-1 possibilities for the second position, n-2 for the third, and so on. The total number of candidate strings is n * (n-1) * (n-2) * ... * 3 * 2 * 1 = n!. As n gets large, n! grows very quickly, even faster than 2^n.
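For short strings the brute force idea can be sketched with `itertools.permutations` from the standard library. This is an illustration only (the name `is_anagram_brute_force` is hypothetical), and it becomes impractical quickly because up to n! orderings may be generated. The loop at the end prints n! next to 2**n to show how much faster the factorial grows.

```python
from itertools import permutations
from math import factorial

def is_anagram_brute_force(s1, s2):
    """Try every ordering of s1 until one equals s2 (up to n! candidates)."""
    target = tuple(s2)
    return any(candidate == target for candidate in permutations(s1))

print(is_anagram_brute_force('abcd', 'dcba'))  # True

# Compare the growth of n! with 2**n
for n in (5, 10, 20):
    print(n, factorial(n), 2 ** n)
```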

Method 4: Count and compare

The last solution takes advantage of the fact that any two anagrams have the same number of a's, the same number of b's, the same number of c's, and so on. To decide whether two strings are anagrams, we first count how many times each letter occurs. Since there are only 26 possible letters, we can use a list of 26 counters, one for each possible letter. Each time we see a particular letter, we increment the counter at the corresponding position. In the end, if the two lists of counters are identical, the two strings are anagrams. Look at the following code:

    def anagramSolution4(s1, s2):
        c1 = [0] * 26                 # letter counts for s1
        c2 = [0] * 26                 # letter counts for s2

        for i in range(len(s1)):
            pos = ord(s1[i]) - ord('a')
            c1[pos] = c1[pos] + 1

        for i in range(len(s2)):
            pos = ord(s2[i]) - ord('a')
            c2[pos] = c2[pos] + 1

        # compare the two lists of counters
        j = 0
        stillOK = True
        while j < 26 and stillOK:
            if c1[j] == c2[j]:
                j = j + 1
            else:
                stillOK = False

        return stillOK

    print(anagramSolution4('apple', 'pleap'))

Again, this solution contains several loops. However, unlike in the first solution, none of them are nested. The first two loops, which count the letters, each take n steps. The third loop, which compares the two lists of counts, always takes 26 steps, since there are only 26 possible letters in a string. Adding it all up gives T(n) = 2n + 26 steps, which is O(n). We have found a linear-time solution to this problem.

Before leaving this example, we need to say something about space requirements. Although the last solution runs in linear time, it does so only by using additional storage to keep the two lists of character counts. In other words, the algorithm trades space for time.

This is a common situation. On many occasions you will need to decide on a tradeoff between time and space. In the present case, the amount of extra space is not significant. However, if the underlying alphabet had millions of characters, the space overhead would deserve more attention. As a computer scientist, when choosing among algorithms, it is up to you to decide how best to use the computing resources available for a particular problem.
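The same counting strategy can also be expressed with `collections.Counter` from the standard library, which builds a mapping from each character to its count. This is an alternative sketch, not the listing above; the name `is_anagram_counter` is hypothetical. Unlike the 26-element lists, it does not assume lowercase English letters.

```python
from collections import Counter

def is_anagram_counter(s1, s2):
    # Counter makes one pass over each string to count its characters;
    # two Counters are equal when every character has the same count
    return Counter(s1) == Counter(s2)

print(is_anagram_counter('apple', 'pleap'))  # True
print(is_anagram_counter('apple', 'plead'))  # False
```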
