The first trick: Strike the snake at its seven-inch spot: locating bottlenecks
The first step is to locate the bottleneck. A simple example: one function can be optimized from 1 second down to 0.9 seconds, another from 1 minute down to 30 seconds. If the cost is the same and the deadline only allows for one, which do you take on? By the shortest-plank principle, of course the second one.
An experienced programmer will hesitate here. Wait a minute, function? Shouldn't the number of calls be considered too? If the first function is called 100,000 times over the whole program and the second is called only once, the answer is no longer obvious. The point of this example is that a program's bottleneck can't always be spotted at a glance. Still, faced with the choice above, you probably had the feeling most of us do: a function that "can" go from 1 minute down to 30 seconds grabs our attention far more easily than one that "can" go from 1 second to 0.9 seconds, because it seems to have so much more room for improvement.
So, after all that rambling, the first trick: profile. This is Python's own bottleneck-locating tool! It actually comes in three flavors: profile, cProfile, and hotshot, and can be invoked either from inside the code or from the command line. Personally, I find one of them enough: cProfile from the command line. The incantation goes like this:
python -m cProfile some_program.py
This trick outputs a pile of statistics: how many times each function was called, the total time, how much of that was spent in sub-functions, the time per call, and so on. Here is what each column in the report means:
filename:lineno(function): the file name, line number, and function name
ncalls: how many times this guy was called
tottime: total time spent inside the function itself, excluding time spent in the sub-functions it calls
percall: average time per call, i.e. tottime divided by ncalls
cumtime: total time spent by the function together with all the sub-functions under it
percall: like the percall above, but cumtime divided by ncalls
Find the most promising spot to optimize, then go for it.
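By the way, besides the command line, you can also call the profiler from inside a script. A minimal sketch (the function names slow_sum and main are made up for illustration):

import cProfile

def slow_sum(n):
    # a deliberately wasteful helper, so it stands out in the report
    total = 0
    for i in range(n):
        total += i
    return total

def main():
    for _ in range(200):
        slow_sum(10000)

# prints ncalls, tottime, percall and cumtime for every function involved
cProfile.run("main()")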
The second trick: Snake Zen: only one way to do it
I remember when I first came to Python, a senior of mine told me that Python has a great ideal: it wants everyone who uses it to write the exact same program. The Zen of Python has a line:
There should be one-- and preferably only one --obvious way to do it.
So for many common tasks, the Python masters bless exactly one way of doing things. I went through the legendary PerformanceTips page on the Python wiki and summarized a few pairs of "don't write it like this" versus "write it like this".
When concatenating strings, don't do it like this (strings are immutable, so every += builds a brand-new string):
s = "" for substring in list:s + = substring
Do it like this:
s = "". Join (Slist)
When formatting a string, don't do it like this:
out = "" + head + Prologue + query + tail + ""
Do it like this:
out = "%s%s%s%s"% (head, prologue, query, tail)
Avoid explicit loops when you don't need them. For example, don't do it like this:
newlist = []
for word in oldlist:
    newlist.append(word.upper())
Do it like this:
newlist = map(str.upper, oldlist)  # in Python 3, map returns an iterator; wrap in list() if you need a list
Or like this:
newlist = [s.upper() for s in oldlist]
When building a dictionary of counts, the more common way is:
wdict = {}
for word in words:
    if word not in wdict:
        wdict[word] = 0
    wdict[word] += 1
If most of the words are repeats, consider this pattern instead, which saves a lot of membership tests:
wdict = {}
for word in words:
    try:
        wdict[word] += 1
    except KeyError:
        wdict[word] = 1
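For completeness, there is also a middle-ground idiom using dict.get that avoids both the membership test and the exception machinery; a small sketch with made-up sample data:

words = ["spam", "eggs", "spam"]  # hypothetical sample data
wdict = {}
for word in words:
    # get returns 0 when word is missing, so no if or try is needed
    wdict[word] = wdict.get(word, 0) + 1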
Minimize the number of function calls by moving the loop inside the function. For example, don't do it like this:
import time

x = 0
def doit1(i):
    global x
    x = x + i

mylist = range(100000)  # named mylist to avoid shadowing the built-in list
t = time.time()
for i in mylist:
    doit1(i)
Do it like this:
import time

x = 0
def doit2(values):
    global x
    for i in values:
        x = x + i

mylist = range(100000)
t = time.time()
doit2(mylist)
The third trick: Snake Sniper: high-speed lookups
This part comes from the IBM developerWorks article "Python code performance optimization techniques". The holy grail of searching is O(1) algorithmic complexity, that is, a hash table. Fortunately I learned some data structures back as an undergraduate, so I know that a membership test on a Python list is a linear scan through its items. If the list is large, using if x in list_a to search and test among a vast number of items is very inefficient.
I use Python's tuple very little, so no comment there. The other two I use a lot are set and dict, and both of these are implemented with something like a hash table.
So try not to do it like this:
k = [10, 20, 30, 40, 50, 60, 70, 80, 90]
for i in xrange(10000):
    if i in k:
        # do something
        continue
Do it like this:
k = [10, 20, 30, 40, 50, 60, 70, 80, 90]
k_dict = {i: 0 for i in k}  # convert the list to a dictionary first
for i in xrange(10000):
    if i in k_dict:
        # do something
        continue
To find the intersection of two lists, don't do it like this:
list_a = [1, 2, 3, 4, 5]
list_b = [4, 5, 6, 7, 8]
list_common = [a for a in list_a if a in list_b]
Do it like this:
list_a = [1, 2, 3, 4, 5]
list_b = [4, 5, 6, 7, 8]
list_common = set(list_a) & set(list_b)
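If you want to see the gap with your own eyes, here is a rough timeit sketch (the exact numbers depend on your machine; what matters is how the gap grows with the size of the container):

import timeit

# membership test on a list is a linear scan; on a set it is a hash lookup
setup = "data = list(range(10000)); data_set = set(data)"
print(timeit.timeit("9999 in data", setup=setup, number=1000))
print(timeit.timeit("9999 in data_set", setup=setup, number=1000))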
The fourth trick: Snake... snake...: I can't think of a name, it's a grab bag of tips
Swapping variables needs no intermediate variable: a, b = b, a (though there is one god-level pit I remember to this day: True, False = False, True, which Python 2 happily accepts).
If you are using Python 2.x, use xrange instead of range; in Python 3.x, range already behaves like xrange did, and xrange itself is gone. Unlike range in Python 2, which builds an entire list up front, xrange produces values lazily, which saves memory.
You can use x > y > z instead of x > y and y > z. It is more efficient and more readable. Of course, the chained form means exactly x > y and y > z, with y evaluated only once.
Is add(x, y) generally faster than x + y? I had my doubts, so I experimented. First, add can't be used directly; you have to import operator. Second, my results suggest operator.add(x, y) is no faster than x + y, never mind the readability it sacrifices.
while 1 really is a tiny bit faster than while True. I ran two experiments, and it came out roughly 15% faster. (A sketch for reproducing this follows below.)
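Those numbers are just from my machine; if you want to reproduce the comparison, here is a minimal timeit sketch. The gap exists under Python 2 because True is an ordinary name looked up on every pass, while 1 is a constant; in Python 3 True became a keyword, so expect little or no difference:

import timeit

# identical counting loops; only the loop condition differs
loop_true = "n = 0\nwhile True:\n    n += 1\n    if n == 100: break"
loop_one = "n = 0\nwhile 1:\n    n += 1\n    if n == 100: break"
print(timeit.timeit(loop_true, number=100000))
print(timeit.timeit(loop_one, number=100000))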
The fifth trick: No snake beats having a snake: performance outside the code
Outside the code, besides the hardware, there is the compiler, and here I grandly recommend PyPy. PyPy has a just-in-time (JIT) compiler. Its hallmark is compiling as it runs, in contrast to a static compiler. I saw a very vivid metaphor for this on Zhihu:
Suppose you are a director. Static compilation is making the actors memorize and thoroughly digest the entire script, then perform a continuous one-hour show. Dynamic compilation is letting the actors perform for two minutes, then pause to think it over, glance at the script again, then perform another two minutes...
Dynamic and static compilation each have their strengths; it depends on whether you are shooting a film or putting on a live play.
There is also Cython, which lets you embed bits of C code in Python. I use it very little, but at the critical moment it really delivers.
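To give a flavor, a minimal Cython sketch (assuming Cython is installed; the file name fib.pyx and the function are made up for illustration). The cdef type declarations are what let Cython compile the loop down to plain C arithmetic:

# fib.pyx -- build in place with: cythonize -i fib.pyx
def fib(int n):
    cdef int i
    cdef long a = 0, b = 1
    for i in range(n):
        a, b = b, a + b
    return a

After compiling, it imports like any other module: import fib, then call fib.fib(30).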