I believe many of you, like me, have always been fond of Python. There is no doubt that Python, as an interpreted dynamic language, is not especially fast, but its simplicity, readability, and extensibility have made it extremely popular.
Many colleagues use Python at work, but few pay much attention to its performance or idiomatic usage. Most of us learn it as we go, since Python is not our main language and we generally use it only for system administration tasks. But why not do it better? The Zen of Python says: "There should be one-- and preferably only one --obvious way to do it. Although that way may not be obvious at first unless you're Dutch." The idea is that Python encourages one best way of doing something, which sets it apart from Ruby. So I personally think good Python writing habits are very important. This article gives a brief summary, from a performance perspective, of some idiomatic Python practices; I hope it is useful to everyone.
When it comes to performance, the first thing that comes to mind is reducing complexity, which can typically be analyzed by measuring the code's cyclomatic complexity and by Big O notation (the Landau symbol). For example, a dict lookup is O(1) on average, while a lookup in a list is O(n). Clearly, the choice of how to store data directly affects the complexity of the algorithm.
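The difference between O(n) and O(1) lookups can be made visible by counting equality comparisons instead of timing them. The following is only a sketch: the Counted class and the comparisons_for_lookup helper are hypothetical names introduced for illustration, and the exact count for the set depends on the interpreter's hash table implementation.

```python
class Counted(object):
    """Illustrative wrapper that counts how often == is evaluated."""
    comparisons = 0

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        Counted.comparisons += 1
        return self.value == other.value

    def __hash__(self):
        return hash(self.value)


def comparisons_for_lookup(container, probe):
    """Return how many == calls one `probe in container` test needed."""
    Counted.comparisons = 0
    found = probe in container
    assert found
    return Counted.comparisons


n = 1000
items = [Counted(i) for i in range(n)]
probe = Counted(n - 1)   # equal to the last element, but a distinct object

# The list is scanned element by element until a match is found
list_cost = comparisons_for_lookup(items, probe)
# The set hashes the probe and only compares against the matching slot
set_cost = comparisons_for_lookup(set(items), probe)
```

With the probe equal to the last element, the list needs one comparison per element, while the set needs only a handful at most, regardless of n.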
I. Choice of data structures
1. Searching in a list:
For a sorted list, consider using the bisect module to look up an element; it is implemented with binary search:
from bisect import bisect

def find(seq, el):
    # bisect returns the insertion point to the right of any existing el
    pos = bisect(seq, el)
    if pos == 0 or seq[pos - 1] != el:
        return -1
    return pos - 1
To insert an element into a sorted list quickly, the bisect module's insort function can be used.
It inserts the element at the correct position, so there is no need to call sort() again to keep the list ordered; bear in mind that sorting a long list is expensive.
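As a minimal sketch of keeping a list sorted while inserting (the scores list and its values are made up for illustration):

```python
import bisect

scores = [10, 20, 30, 40]      # must already be sorted
bisect.insort(scores, 25)      # locates the slot by binary search, then inserts
bisect.insort(scores, 5)

# bisect_left gives the index where a value is, or would be, located
index_of_30 = bisect.bisect_left(scores, 30)
```

Note that the search step is O(log n), but the insertion itself still has to shift elements in the underlying array, so insort is O(n) overall; it simply avoids a full re-sort.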
2. Use a set instead of a list:
For example, to remove duplicates from a list, the implementation that most readily comes to mind is:
seq = ['a', 'a', 'b']
res = []
for i in seq:
    if i not in res:
        res.append(i)
Clearly the complexity of this implementation is O(n^2). If it is changed to:
seq = ['a', 'a', 'b']
res = set(seq)
the complexity immediately drops to O(n), assuming of course that a set is sufficient for subsequent use (a set does not preserve the original order).
In addition, set operations such as union, intersection, and difference are faster than iterating over lists, so problems involving the intersection, union, or difference of lists can be converted to sets. It pays to keep this in mind in everyday use: the larger the lists, the greater the impact on performance.
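A short sketch of these set operations, using made-up package names as sample data:

```python
installed = ['nginx', 'redis', 'mysql', 'redis']
required = ['redis', 'mysql', 'memcached']

installed_set = set(installed)
required_set = set(required)

both = installed_set & required_set       # intersection
either = installed_set | required_set     # union
missing = required_set - installed_set    # difference

# The list-based equivalent rescans `installed` for every required item
missing_slow = [pkg for pkg in required if pkg not in installed]
```

Both approaches give the same answer here; the set version avoids the nested scan, which matters once the lists grow large.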
3. Replace built-in container types with the collections module:
The collections module provides three types:
deque: a list-like type with enhanced features
defaultdict: a dict-like type
namedtuple: a tuple-like type
A list is based on an array, while a deque is based on a doubly linked list, so inserting or removing elements at the front is much faster with a deque (O(1) at either end, versus O(n) at the front of a list).
defaultdict adds a default factory for new keys, which avoids writing an extra test to initialize mapping entries and is more efficient than dict.setdefault. An example adapted from the Python documentation:
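A minimal sketch of the deque operations that benefit from this layout (the task names are placeholders):

```python
from collections import deque

tasks = deque(['b', 'c'])
tasks.appendleft('a')       # O(1); list.insert(0, x) would shift every element
tasks.append('d')           # O(1) at the right end as well
first = tasks.popleft()     # O(1); list.pop(0) is O(n)
last = tasks.pop()

# An optional maxlen turns the deque into a fixed-size sliding window
recent = deque(maxlen=3)
for n in range(5):
    recent.append(n)
```

Indexing into the middle of a deque is slower than for a list, so a deque fits queue-like access patterns rather than random access.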
# Use the profile and stats tools for performance analysis
>>> from pbp.scripts.profiler import profile, stats
>>> from collections import defaultdict
>>> s = [('yellow', 1), ('blue', 2), ('yellow', 3),
...      ('blue', 4), ('red', 1)]
>>> @profile('defaultdict')
... def faster():
...     d = defaultdict(list)
...     for k, v in s:
...         d[k].append(v)
...
>>> @profile('dict')
... def slower():
...     d = {}
...     for k, v in s:
...         d.setdefault(k, []).append(v)
...
>>> slower(); faster()
>>> stats['dict']
{'stones': 16.587882671716077, 'memory': 396, 'time': 0.35166311264038086}
>>> stats['defaultdict']
{'stones': 6.5733464259021686, 'memory': 552, 'time': 0.13935494422912598}
As the numbers show, the defaultdict version is roughly 2.5 times faster (0.35 versus 0.14 in the time field). defaultdict takes a factory such as list as its argument, and built-in types such as int or long can be used as well.
Beyond algorithms and architecture, Python advocates simplicity and elegance, so good syntax habits are necessary for writing elegant, readable code.
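To illustrate a numeric factory, and the namedtuple type that is listed above without an example, here is a small sketch. The word list and the Point type are invented for illustration.

```python
from collections import defaultdict, namedtuple

words = ['red', 'blue', 'red', 'green', 'blue', 'red']

counts = defaultdict(int)     # int() returns 0, the starting count
for word in words:
    counts[word] += 1         # no need to test whether the key exists

Point = namedtuple('Point', ['x', 'y'])
p = Point(x=3, y=4)
squared_norm = p.x ** 2 + p.y ** 2    # fields are readable by name
x, y = p                              # and it still unpacks like a tuple
```

A namedtuple costs no more memory than a plain tuple per instance, while making code that uses it easier to read than positional indexing.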
II. Syntax best practices
1. String operations: Python string objects are immutable, so any operation on a string, such as concatenation or modification, produces a new string object rather than changing the original. This continual copying affects Python's performance to some extent:
(1) Use join instead of the '+' operator, which incurs copying overhead;
(2) When a string can be processed with either a regular expression or a built-in method, choose the built-in method, such as str.isalpha(), str.isdigit(), str.startswith(('x', 'yz')), str.endswith(('x', 'yz'));
(3) String formatting is preferable to direct concatenation:
s = "%s%s%s%s" % (a, b, c, d)  # efficient
s = "" + a + b + c + d + ""  # slow
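The join and tuple-argument points can be sketched as follows (the sample strings and the file name are made up):

```python
parts = ['line %d' % n for n in range(5)]

# Repeated '+' may build a new intermediate string on every step
joined_slow = ''
for part in parts:
    joined_slow += part + '\n'

# join computes the final size once and copies each piece a single time
joined_fast = '\n'.join(parts) + '\n'

# startswith/endswith accept a tuple of candidates, so no regular expression is needed
is_config = 'settings.yaml'.endswith(('.yaml', '.yml'))
```

Both loops produce the same string; the gap between them only becomes noticeable when many pieces are joined.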
2. Make good use of list comprehensions, generators, and decorators, and get familiar with modules such as itertools:
(1) List comprehensions, which I think are the most impressive feature in Python 2. Example 1:
>>> # The following is not so Pythonic
>>> numbers = range(10)
>>> i = 0
>>> evens = []
>>> while i < len(numbers):
...     if i % 2 == 0: evens.append(i)
...     i += 1
...
>>> evens
[0, 2, 4, 6, 8]
>>> # The good way to iterate a range, elegant and efficient
>>> evens = [i for i in range(10) if i % 2 == 0]
>>> evens
[0, 2, 4, 6, 8]
Example 2:
def _treatment(pos, element):
    return '%d:%s' % (pos, element)

f = open('test.txt', 'r')

if __name__ == '__main__':
    # list comps 1
    print sum(len(word) for line in f for word in line.split())
    # list comps 2
    print [(x + 1, y + 1) for x in range(3) for y in range(4)]
    # func
    print filter(lambda x: x % 2 == 0, range(10))
    # list comps 3
    print [i for i in range(10) if i % 2 == 0]
    # list comps 4, pythonic
    print [_treatment(i, el) for i, el in enumerate(range(10))]

output:
24
[(1, 1), (1, 2), (1, 3), (1, 4), (2, 1), (2, 2), (2, 3), (2, 4), (3, 1), (3, 2), (3, 3), (3, 4)]
[0, 2, 4, 6, 8]
[0, 2, 4, 6, 8]
['0:0', '1:1', '2:2', '3:3', '4:4', '5:5', '6:6', '7:7', '8:8', '9:9']
Yes, it really is that elegant and simple.
(2) Generators were introduced in Python 2.2 and generator expressions in Python 2.4. They use lazy evaluation, so they are more memory-efficient. Consider the example from Core Python Programming that computes the length of the longest line in a file:
def longest_line():
    f = open('/etc/motd', 'r')
    # the generator expression consumes the file one line at a time
    longest = max(len(x.strip()) for x in f)
    f.close()
    return longest
This implementation is concise and does not need to read the entire file into memory.
(3) Decorators, introduced in Python 2.4, are another exciting feature. Put simply, a decorator is a function that receives a function and returns an enhanced version of it, which makes wrapped functions and methods easier to read and understand. The '@' symbol is the decorator syntax. For instance, you can decorate a function so that it remembers the results of calls for later use, a technique called memoization. The following decorator implements such a cache:
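Lazy evaluation can be observed directly by recording which values a generator has actually produced. This is a sketch written for Python 3; read_numbers and the produced list are hypothetical helpers, and an in-memory io.StringIO stands in for a real file.

```python
import io

text = io.StringIO(u"short\na much longer line\nmid-sized\n")

# Generator expression: lines are stripped and measured one at a time
longest = max(len(line.strip()) for line in text)

produced = []

def read_numbers(limit):
    """Hypothetical generator that logs each value as it is yielded."""
    for n in range(limit):
        produced.append(n)
        yield n

# next() pulls values only until the first match; the remaining
# values up to one million are never generated
first_even = next(n for n in read_numbers(1000000) if n % 2 == 0 and n > 3)
```

After this runs, produced holds only the five values that were needed to find the answer.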
import time
import hashlib
import pickle
from itertools import chain

cache = {}

def is_obsolete(entry, duration):
    return time.time() - entry['time'] > duration

def compute_key(function, args, kw):
    # Serialize the function name and arguments with the pickle module
    key = pickle.dumps((function.func_name, args, kw))
    # hashlib provides MD5 and SHA-1; the digest is used as the key in the global cache dictionary
    return hashlib.sha1(key).hexdigest()

def memoize(duration=10):
    def _memoize(function):
        def __memoize(*args, **kw):
            key = compute_key(function, args, kw)
            # do we have it already?
            if (key in cache and
                    not is_obsolete(cache[key], duration)):
                print 'we got a winner'
                return cache[key]['value']
            # computing
            result = function(*args, **kw)
            # storing the result
            cache[key] = {'value': result,
                          'time': time.time()}
            return result
        return __memoize
    return _memoize

@memoize()
def very_very_complex_stuff(a, b, c):
    return a + b + c

print very_very_complex_stuff(2, 2, 2)
print very_very_complex_stuff(2, 2, 2)

@memoize(1)
def very_very_complex_stuff(a, b):
    return a + b

print very_very_complex_stuff(2, 2)
time.sleep(2)
print very_very_complex_stuff(2, 2)
Output:
6
we got a winner
6
4
4
Decorators are useful in many scenarios, such as parameter checking, lock synchronization, and unit testing frameworks; interested readers can explore these on their own.
3. Make use of Python's powerful introspection capabilities (attributes and descriptors): since I started using Python, I have been genuinely surprised at how powerful and simple introspection can be. The topic is too large to cover here and may get a separate summary later, but anyone learning Python should gain a good understanding of its introspection.
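As one possible illustration of the parameter-checking use case, here is a minimal sketch. The require_positive decorator and rectangle_area function are hypothetical names, not part of any library; functools.wraps is the standard helper for preserving the wrapped function's metadata.

```python
import functools

def require_positive(function):
    """Hypothetical decorator: reject non-positive positional arguments."""
    @functools.wraps(function)     # keep the original name and docstring
    def wrapper(*args, **kw):
        for value in args:
            if value <= 0:
                raise ValueError('expected positive arguments, got %r' % (value,))
        return function(*args, **kw)
    return wrapper

@require_positive
def rectangle_area(width, height):
    """Return the area of a rectangle."""
    return width * height

area = rectangle_area(3, 4)

try:
    rectangle_area(-1, 4)
    rejected = False
except ValueError:
    rejected = True
```

The check lives in one place and can be reused on any function, while the decorated function body stays focused on its own job.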
III. Coding tips
1. Before Python 3, use xrange instead of range, because range() returns the complete list of elements at once, while xrange() produces only one integer per iteration, with little overhead. (xrange no longer exists in Python 3, where range itself is lazy and can cover a range of any length.)
2. if done is not None is faster than if done != None;
3. Use the "in" operator where possible; it is concise and fast: for i in seq: print i
4. Use 'x < y < z' instead of 'x < y and y < z';
5. while 1 is faster than while True in Python 2, because 1 is a constant while True is a name that must be looked up on every iteration (in Python 3, True is a keyword and the difference disappears);
6. Use built-in functions where possible, because they are usually very efficient; for example, passing operator.add to map or reduce is better than passing lambda a, b: a + b;
7. In a time-consuming loop, function calls can be replaced with inline code; the inner loop should be kept lean.
8. Use multiple assignment to swap elements:
x, y = y, x  # elegant and efficient
Instead of:
temp = x
x = y
y = temp
9. Ternary operator (Python 2.5 and later): v1 if x else v2. Avoid (x and v1) or v2, because the latter gives the wrong result when v1 is falsy, for example v1 = "".
10. Implementing switch/case in Python: because a switch/case statement can always be replaced with if/else, Python has no switch/case syntax, but the same effect can be achieved with a dictionary or lambdas:
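The pitfall with the and/or idiom is easy to reproduce; the values below are chosen purely for illustration:

```python
x = True
v1, v2 = '', 'fallback'

old_style = (x and v1) or v2    # '' is falsy, so this wrongly yields 'fallback'
new_style = v1 if x else v2     # yields '' as intended
```

The same problem occurs for any falsy v1, such as 0, None, or an empty list.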
Switch/case structure:
switch (var) {
    case v1: func1();
    case v2: func2();
    ...
    case vN: funcN();
    default: default_func();
}
Dictionary implementation:
values = {
    v1: func1,
    v2: func2,
    ...
    vN: funcN,
}
values.get(var, default_func)()
Lambda implementation:
{
    '1': lambda: func1(),
    '2': lambda: func2(),
    '3': lambda: func3(),
}[value]()
A case with a default can also be implemented with try...except; personally I recommend the dict approach.
This is only a summary of some Python practices. I hope these suggestions help everyone who uses Python. Performance optimization is not the main goal; solving problems efficiently and making your code easier to maintain is!
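Both ways of handling the default branch can be sketched side by side. The handler functions and command names here are placeholders invented for the example.

```python
def start():
    return 'starting'

def stop():
    return 'stopping'

def unknown():
    return 'unknown command'

handlers = {'start': start, 'stop': stop}

def dispatch(command):
    # dict.get supplies the "default" branch of the switch
    return handlers.get(command, unknown)()

def dispatch_with_except(command):
    # the same default branch expressed with try/except KeyError
    try:
        handler = handlers[command]
    except KeyError:
        handler = unknown
    return handler()
```

The dict.get form is shorter; the try/except form may be preferable when a missing key is rare and should be handled as an exceptional case.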