I recently moved, and my internet subscription lapsed, so this blog went two weeks without an update. I still intend to keep writing: when I run into a problem at work, checking a relevant post often gives me an idea or some inspiration. Writing a post takes a fair amount of time (usually between 1.5 and 2 hours), but typing it up and summarizing along the way benefits me too; reviewing the old teaches you the new. Sharing what I have learned also lets me meet like-minded friends, which is a bonus. Enough preamble: today's topic is how to improve Python performance.
Python is still at a disadvantage in performance compared with languages such as C, but if you master a few optimization techniques you can not only make your code more efficient but also make it more pythonic. When I first came to Python I found some blog posts online about improving Python performance, and they helped me a lot, for example http://www.jb51.net/article/56699.htm
That post is well written and worth consulting. Here I combine it with the books I have been reading recently and their recommendations to give a more detailed summary. If anything is wrong, criticism is welcome; let's improve together!
I. Basic techniques for loop optimization
Loop optimization should follow one principle: minimize the amount of computation done inside the loop, and with nested loops move as much computation as possible from the inner loop out to the enclosing layer.
(1) Reduce the computation inside the loop. Let's look at the following example
# coding=utf-8
import datetime
import math

# First method
def fun1(iter, x):
    j = 0
    for i in iter:
        d = math.sqrt(x)
        j += i * d
    return j

# Second method
def fun2(iter, x):
    j = 0
    d = math.sqrt(x)
    for i in iter:
        j += i * d
    return j

iter = range(1, 1000)
t0 = datetime.datetime.now()
a = fun1(iter, 9)
t1 = datetime.datetime.now()
b = fun2(iter, 9)
t2 = datetime.datetime.now()
print a, " ", b
print t1 - t0, t2 - t1
Running results: [screenshot: Selection_007.png]
The second method is faster. In the first method, d = math.sqrt(x) sits inside the loop, so it is recomputed on every iteration, which adds overhead. Here d = math.sqrt(x) is a relatively cheap calculation; with a complex, computationally heavy one, the first method fares even worse. In general, the second method is 40%-60% more efficient than the first.
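The comparison above can also be sketched with the standard timeit module, which repeats each call many times and is less sensitive to timer noise than a single datetime difference. This is only a sketch written for Python 3; the function names scaled_sum_inside and scaled_sum_hoisted are made up for illustration, and the measured gap will vary by machine and interpreter.

```python
import math
import timeit

def scaled_sum_inside(values, x):
    total = 0
    for v in values:
        d = math.sqrt(x)  # recomputed on every iteration
        total += v * d
    return total

def scaled_sum_hoisted(values, x):
    total = 0
    d = math.sqrt(x)  # computed once, before the loop
    for v in values:
        total += v * d
    return total

values = range(1, 1000)

# Both versions do the same arithmetic in the same order, so the results match.
print(scaled_sum_inside(values, 9) == scaled_sum_hoisted(values, 9))

# Run each function 200 times and report the total time for each.
t_inside = timeit.timeit(lambda: scaled_sum_inside(values, 9), number=200)
t_hoisted = timeit.timeit(lambda: scaled_sum_hoisted(values, 9), number=200)
print("inside: %.4fs  hoisted: %.4fs" % (t_inside, t_hoisted))
```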
(2) Replace an explicit loop with an implicit one. For example, the sum of an arithmetic progression can be computed directly with a loop, which is very simple:
# coding=utf-8
def sum_to(n):
    # Named sum_to so that the built-in sum() is not shadowed
    total = 0
    for i in xrange(n + 1):
        total += i
    return total
There is nothing wrong with this, but the sum of an arithmetic progression has a ready-made formula, n(n+1)/2, so the loop is unnecessary. Whenever a similar situation turns up in your program, you can replace the explicit loop with an implicit one.
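As a small sketch of that replacement (Python 3; the names sum_loop and sum_formula are chosen here for illustration), the closed form gives the same answer as the loop without iterating at all:

```python
def sum_loop(n):
    # Explicit loop: n + 1 additions.
    total = 0
    for i in range(n + 1):
        total += i
    return total

def sum_formula(n):
    # Closed form for 0 + 1 + ... + n; // keeps the result an integer.
    return n * (n + 1) // 2

print(sum_loop(100), sum_formula(100))
```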
(3) Use local variables in loops wherever possible. Under the "LEGB" rule (if you are not familiar with it, see my earlier post: http://11026142.blog.51cto.com/11016142/1840128), the local namespace is searched first, so looking up a local variable is faster than looking up a global one. If a loop refers to a name many times, try to bind it to a local variable first. See the example below
# coding=utf-8
import datetime
import math

x = [10, 20, 30, 40, 50, 60, 70, 80, 90]

# First method
def fun1(x):
    for i in xrange(len(x)):
        x[i] = math.sin(x[i])
    return x

# Second method
def fun2(x):
    loc_sin = math.sin
    for i in xrange(len(x)):
        x[i] = loc_sin(x[i])
    return x

# Both functions modify their argument in place, so each gets its own copy
x1 = list(x)
x2 = list(x)
t0 = datetime.datetime.now()
a = fun1(x1)
t1 = datetime.datetime.now()
b = fun2(x2)
t2 = datetime.datetime.now()
print a
print b
print t1 - t0, t2 - t1
The results of the run are as follows: [screenshot: Selection_008.png]
As you can see, method two is faster than method one. I think this technique applies whenever you routinely call a function from a library (the standard library or your own modules): looking up the library function also takes time, and with the first method that lookup is repeated on every iteration, which certainly costs more. I think this technique is well worth learning and borrowing.
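A minimal sketch of the same idea for Python 3, again using timeit. The helper names sin_all_global and sin_all_local are made up for this example, and unlike the code above they build a new list instead of modifying the input, so both can safely be run on the same data. How much the local alias helps depends on the interpreter version.

```python
import math
import timeit

def sin_all_global(values):
    out = []
    for v in values:
        out.append(math.sin(v))  # looks up "math", then ".sin", on every iteration
    return out

def sin_all_local(values):
    local_sin = math.sin  # looked up once and bound to a local name
    out = []
    for v in values:
        out.append(local_sin(v))
    return out

data = [10, 20, 30, 40, 50, 60, 70, 80, 90] * 100

# The two versions call the same function on the same inputs.
print(sin_all_global(data) == sin_all_local(data))

t_global = timeit.timeit(lambda: sin_all_global(data), number=200)
t_local = timeit.timeit(lambda: sin_all_local(data), number=200)
print("module lookup: %.4fs  local alias: %.4fs" % (t_global, t_local))
```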
(4) For nested loops, move computation from the inner loop up to the enclosing layer wherever possible. Look at the following example
# coding=utf-8
import datetime

# The first method
def fun1(iter1, iter2):
    max1 = 0
    for i in range(len(iter1)):
        for j in range(len(iter2)):
            x = iter1[i] + iter2[j]
            max1 = x if x > max1 else max1
    return max1

# The second method
def fun2(iter1, iter2):
    max1 = 0
    for i in range(len(iter1)):
        temp = iter1[i]
        for j in range(len(iter2)):
            x = temp + iter2[j]
            max1 = x if x > max1 else max1
    return max1

l1 = [1, 23, 4, 5, 34, 8, 10, 18, 42, 10, 6, 88]
l2 = [100, 102, 34, 15, 16, 56]
t0 = datetime.datetime.now()
a = fun1(l1, l2)
t1 = datetime.datetime.now()
b = fun2(l1, l2)
t2 = datetime.datetime.now()
print a
print b
print t1 - t0, t2 - t1
The results of the run are as follows: [screenshot: Selection_009.png]
As you can see, method two is faster. A nested for loop runs like this (taking the example above): i = 0, then j goes from 0 to its maximum; then i increases by 1 and j again goes from 0 to its maximum, and so on. So store the value in a temporary variable in the layer above the inner loop, and the inner loop can use it without recomputing it. Note that indexing, slicing, and similar operations on a list are computations too, and they take time.
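Since indexing itself has a cost, one variation (my own, not taken from the example above) is to iterate over the elements directly, so the outer element is fetched once per outer iteration and no index arithmetic is needed. A rough Python 3 sketch, with function names invented for illustration:

```python
import timeit

def max_pair_sum_indexed(xs, ys):
    best = 0
    for i in range(len(xs)):
        for j in range(len(ys)):
            s = xs[i] + ys[j]  # xs[i] is indexed again on every inner iteration
            if s > best:
                best = s
    return best

def max_pair_sum_direct(xs, ys):
    best = 0
    for x in xs:       # x is fetched once per outer iteration
        for y in ys:   # no index lookups in the inner loop
            s = x + y
            if s > best:
                best = s
    return best

l1 = [1, 23, 4, 5, 34, 8, 10, 18, 42, 10, 6, 88]
l2 = [100, 102, 34, 15, 16, 56]

print(max_pair_sum_indexed(l1, l2), max_pair_sum_direct(l1, l2))

t_indexed = timeit.timeit(lambda: max_pair_sum_indexed(l1, l2), number=2000)
t_direct = timeit.timeit(lambda: max_pair_sum_direct(l1, l2), number=2000)
print("indexed: %.4fs  direct: %.4fs" % (t_indexed, t_direct))
```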
II. Using different data structures to optimize performance
The most commonly used data structure is the list. Its memory management resembles that of the C++ vector: it pre-allocates a certain amount of memory, and when that is used up and you keep inserting elements, a new round of allocation begins. The list object requests a larger block according to its growth algorithm, copies all the existing elements over, releases the old block, and then inserts the new element; if that is still not enough, the steps repeat. Deleting elements works the same way in reverse: if the space in use drops below half of the pre-allocated space, the list requests a smaller block, copies again, and releases the larger one. (If you want to understand this part, I recommend the book "Python Source Code Anatomy". I have only just started it and have not finished, but I think it is very good and worth reading two or three, even four or five, times. Read the first and second chapters carefully, otherwise the later material gets confusing, unless you are an expert.)
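The pre-allocation can be observed indirectly with sys.getsizeof, which reports the size of the list object including its pre-allocated slots. The following Python 3 sketch assumes CPython (other interpreters may report sizes differently), and growth_steps is a helper name made up for this example. It records the list length each time the reported size changes, which happens far less often than once per append.

```python
import sys

def growth_steps(n):
    """Return (length, size_in_bytes) each time the list's reported size changes."""
    items = []
    steps = []
    last = sys.getsizeof(items)
    for i in range(n):
        items.append(i)
        size = sys.getsizeof(items)
        if size != last:  # a new, larger block was allocated
            steps.append((len(items), size))
            last = size
    return steps

steps = growth_steps(64)
for length, size in steps:
    print(length, size)
```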
As you can see, if the number of elements in a list changes dramatically and frequently, you should consider using deque instead. If you are not familiar with deque, see my earlier post: http://11026142.blog.51cto.com/11016142/1851791
A deque is a double-ended queue that combines the features of a stack and a queue. It provides insertion and deletion at both ends with O(1) complexity. Its greatest advantage over the list lies in memory management. When it runs out of room it does not behave like a list: it allocates a new block to hold the new elements and links that block to the existing ones, so the old elements are never copied. As a result, when the number of elements changes frequently and dramatically, deque can be several times faster than list.
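One place where the difference is easy to see is removing elements from the left end, which a list can only do by shifting everything that remains. The sketch below is for Python 3; the queue-style workload and the names drain_list and drain_deque are assumptions chosen for illustration, not the only case where deque helps.

```python
import timeit
from collections import deque

def drain_list(n):
    items = list(range(n))
    out = []
    while items:
        out.append(items.pop(0))  # shifts every remaining element one slot left
    return out

def drain_deque(n):
    items = deque(range(n))
    out = []
    while items:
        out.append(items.popleft())  # O(1) removal from the left end
    return out

# Both produce the elements in the same first-in, first-out order.
print(drain_list(1000) == drain_deque(1000))

t_list = timeit.timeit(lambda: drain_list(20000), number=5)
t_deque = timeit.timeit(lambda: drain_deque(20000), number=5)
print("list.pop(0): %.4fs  deque.popleft(): %.4fs" % (t_list, t_deque))
```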
array is a sequence data structure that looks much like a list, except that all of its members must be of the same basic type. When you create an array you must specify the type of element it stores. For example, 'c' (a typecode available only in Python 2) stores C char values, each taking 1 byte of memory. Seen from another angle, this optimizes the memory footprint of your code. Look at the following example
# coding=utf-8
import sys
import array

a = array.array('c', "hello,world")
c = list("hello,world")
print sys.getsizeof(a), sys.getsizeof(c)
The results of the run are as follows: [screenshot: Selection_010.png]
Obviously, the list object consumes more memory. The compact storage can also bring performance gains in some other operations, such as converting a container object to a string, where array outperforms list. Look at the following example
# coding=utf-8
import array
import datetime

a = array.array('c', 'hello,world')
c = list("hello,world")
t0 = datetime.datetime.now()
s1 = "".join(c)
t1 = datetime.datetime.now()
s2 = "".join(a)
t2 = datetime.datetime.now()
print t1 - t0, t2 - t1
The results of the run are as follows: [screenshot: Selection_011.png]
Not every operation gains much from array, though. For reordering operations such as reverse(), array does not outperform list; see the following example:
# coding=utf-8
import array
import datetime

a = array.array('c', 'hello,world')
c = list("hello,world")
t0 = datetime.datetime.now()
c.reverse()
t1 = datetime.datetime.now()
a.reverse()
t2 = datetime.datetime.now()
print t1 - t0, t2 - t1
The results are as follows: [screenshot: Selection_012.png]
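For readers on Python 3, where the 'c' typecode no longer exists, the memory comparison from this section can be sketched with the 'B' typecode (unsigned 1-byte integers) and a bytes value. This is an adaptation rather than the original example, and the exact byte counts depend on the interpreter and platform, so none are shown here.

```python
import sys
from array import array

text = b"hello,world" * 100

as_array = array('B', text)  # 1 byte per element, stored contiguously
as_list = list(text)         # one reference per element, each pointing to an int object

# getsizeof counts the list's references but not the int objects they point to.
print(sys.getsizeof(as_array), sys.getsizeof(as_list))

# Converting the array back to bytes is a single call.
print(as_array.tobytes() == text)
```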
III. Making good use of set
A set is a collection: in Python it is an unordered group of elements implemented with a hash algorithm, and it is created with set(). See: [screenshot: Selection_013.png]
A set also supports adding elements, and this can be several times faster than the equivalent list operation. I won't go into detail here; when I have time I will write a dedicated post about Python sets.
Set operations such as intersection, union, and difference are much faster on a set than on a list, so when you need these operations on lists, convert the lists to sets first.
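A short Python 3 sketch of that conversion follows. The sample lists and the helper names common_items_list and common_items_set are made up for illustration. Sets are unordered, so the results are sorted before printing to make them easy to compare.

```python
def common_items_list(a, b):
    # List-only version: each "in" test scans b from the start.
    return [x for x in a if x in b]

def common_items_set(a, b):
    # Set version: hash-based intersection, sorted for a stable result.
    return sorted(set(a) & set(b))

l1 = [1, 23, 4, 5, 34, 8, 10, 18, 42, 6, 88]
l2 = [100, 102, 34, 15, 16, 56, 8, 4]

print(sorted(common_items_list(l1, l2)))  # intersection via lists
print(common_items_set(l1, l2))           # intersection via sets
print(sorted(set(l1) | set(l2)))          # union
print(sorted(set(l1) - set(l2)))          # difference: in l1 but not in l2
```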
All right, that's it for today. Tomorrow I will summarize some more tips for improving Python performance, such as generators, processes, and threads (pools).
Some suggestions for improving Python performance (i)