One, the choice of data structure:
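1. Searching in a list:
For a sorted list, consider using the bisect module to look up an element. It uses binary search, so a lookup takes O(log n) time:
from bisect import bisect

def find(seq, el):
    # bisect returns the insertion point to the right of any existing el
    pos = bisect(seq, el)
    # el is present only if the item just before that point equals el
    if pos == 0 or seq[pos - 1] != el:
        return -1
    return pos - 1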
To insert an element into a sorted list quickly, use:
import bisect
bisect.insort(seq, el)
This inserts the element at the right position, so there is no need to call sort() again to restore the order, which is expensive for a long list.
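The lookup and insertion described above can be tried together in a small sketch. It is written in Python 3 syntax; the list `scores` and the helper `contains` are hypothetical names used only for illustration.

```python
import bisect

scores = [10, 20, 30, 40]  # hypothetical data, already sorted


def contains(sorted_seq, item):
    """Binary-search membership test on a sorted list (illustrative helper)."""
    pos = bisect.bisect_left(sorted_seq, item)
    return pos < len(sorted_seq) and sorted_seq[pos] == item


bisect.insort(scores, 25)  # insert while keeping the list sorted
print(scores)                                       # [10, 20, 25, 30, 40]
print(contains(scores, 25), contains(scores, 26))   # True False
```

Note that insort still shifts elements inside the list, so each insertion is O(n) in the worst case; the gain is that the O(n log n) re-sort is avoided.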
2. Use a set instead of a list:
For example, to remove duplicates from a list, the implementation that most readily comes to mind is:
seq = ['a', 'a', 'b']
res = []
for i in seq:
    if i not in res:
        res.append(i)
Obviously the complexity of this implementation is O(n^2), because every 'in' test scans res. If it is changed to:
seq = ['a', 'a', 'b']
res = set(seq)
the complexity immediately drops to O(n), assuming of course that a set (which is unordered) is sufficient for subsequent use.
In addition, set operations such as union, intersection, and difference are faster than iterating over lists. Problems that involve the intersection, union, or difference of lists can therefore be converted to sets. This deserves attention in everyday use, especially when the lists are large, because the performance impact is then even greater.
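As a minimal sketch of the set operations mentioned above (Python 3 syntax; the lists `a` and `b` and the helper `unique_in_order` are hypothetical), the following also shows one way to remove duplicates in O(n) when the original order must be kept, which a plain set() does not guarantee:

```python
a = [1, 2, 3, 4, 2]  # hypothetical input lists
b = [3, 4, 5]

common = set(a) & set(b)  # intersection
either = set(a) | set(b)  # union
only_a = set(a) - set(b)  # difference


def unique_in_order(items):
    """Drop duplicates while keeping first-seen order (illustrative helper)."""
    seen = set()
    result = []
    for item in items:
        if item not in seen:  # O(1) average membership test on a set
            seen.add(item)
            result.append(item)
    return result


print(sorted(common), sorted(either), sorted(only_a))
print(unique_in_order(a))
```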
3. Replace built-in container types with types from the Python collections module:
The collections module offers three types of interest here:
- deque: a list-like type with enhanced features
- defaultdict: a dict-like type
- namedtuple: a tuple-like type
A list is based on an array, whereas a deque is based on a doubly linked list, so the deque inserts or removes elements at the front much faster: O(1) at either end, compared with O(n) at the front of a list. Inserting into the middle is still O(n) for both.
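A short sketch of the deque operations described above, in Python 3 syntax. The variable names are hypothetical; `maxlen` is a standard deque parameter that the original text does not discuss and is shown here only as a related convenience.

```python
from collections import deque

queue = deque([1, 2, 3])
queue.appendleft(0)      # O(1) insert at the front
queue.append(4)          # O(1) insert at the back
first = queue.popleft()  # O(1) removal from the front
last = queue.pop()       # O(1) removal from the back

recent = deque(maxlen=3)  # keeps only the three most recent items
for n in range(5):
    recent.append(n)

print(first, last, list(queue))  # 0 4 [1, 2, 3]
print(list(recent))              # [2, 3, 4]
```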
defaultdict adds a default factory for new keys, which avoids writing an extra test to initialize mapping entries and is more efficient than dict.setdefault. The following example is adapted from the Python documentation:
# Use the profile and stats tools for performance analysis
>>> from collections import defaultdict
>>> from pbp.scripts.profiler import profile, stats
>>> s = [('yellow', 1), ('blue', 2), ('yellow', 3),
...      ('blue', 4), ('red', 1)]
>>> @profile('defaultdict')
... def faster():
...     d = defaultdict(list)
...     for k, v in s:
...         d[k].append(v)
...
>>> @profile('dict')
... def slower():
...     d = {}
...     for k, v in s:
...         d.setdefault(k, []).append(v)
...
>>> slower(); faster()
>>> stats['dict']
{'stones': 16.587882671716077, 'memory': 396, 'time': 0.35166311264038086}
>>> stats['defaultdict']
{'stones': 6.5733464259021686, 'memory': 552, 'time': 0.13935494422912598}
As the timings show, the defaultdict version is about 2.5 times faster. Here defaultdict takes list as its factory argument; other built-in types such as int (or long in Python 2) work as well.
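To illustrate a factory other than list, here is a minimal counting sketch with defaultdict(int), followed by a namedtuple, the third collections type listed earlier. It uses Python 3 syntax; `words` and `Point` are hypothetical names.

```python
from collections import defaultdict, namedtuple

words = ['red', 'blue', 'red', 'green', 'blue', 'red']  # hypothetical input

counts = defaultdict(int)  # missing keys start at int() == 0
for word in words:
    counts[word] += 1

Point = namedtuple('Point', ['x', 'y'])  # hypothetical record type
p = Point(x=3, y=4)

print(dict(counts))
print(p.x + p.y, p[0])  # fields are readable by name or by index
```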
Beyond algorithms and architecture, Python advocates simplicity and elegance, so good syntax habits are needed to write elegant, readable code.
Second, best practices for syntax:
1. String manipulation: Python string objects are immutable, so any operation on a string, such as concatenation or modification, produces a new string object instead of changing the original. This repeated copying affects performance to some degree:
(1) Use join instead of the '+' operator, which incurs copy overhead;
(2) When a string can be processed with either a regular expression or a built-in method, choose the built-in method, such as str.isalpha(), str.isdigit(), str.startswith(('x', 'yz')), or str.endswith(('x', 'yz'));
(3) String formatting is preferable to direct concatenation:
s = "%s%s%s%s" % (a, b, c, d)  # efficient
s = "" + a + b + c + d + ""    # slow
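Points (1) and (2) above can be checked with a rough sketch like the one below (Python 3 syntax). The names `parts`, `concat_plus`, `concat_join`, and `name` are hypothetical, and the timings depend on the interpreter: CPython can optimize some in-place string concatenation, so the measured gap may be smaller than expected.

```python
import timeit

parts = [str(n) for n in range(1000)]  # hypothetical input


def concat_plus(items):
    """Build the string with repeated '+', creating intermediate strings."""
    result = ''
    for item in items:
        result = result + item
    return result


def concat_join(items):
    """Build the string in one pass with str.join."""
    return ''.join(items)


t_plus = timeit.timeit(lambda: concat_plus(parts), number=200)
t_join = timeit.timeit(lambda: concat_join(parts), number=200)
print('plus: %.4f s' % t_plus)
print('join: %.4f s' % t_join)

# Built-in string methods accept a tuple of alternatives, no regex needed.
name = 'yz_report.txt'
print(name.startswith(('x', 'yz')), name.endswith(('.csv', '.txt')))
```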
2. Use list comprehensions, generators, and decorators, and get familiar with modules such as itertools:
(1) List comprehensions, which I think are the most impressive feature of Python 2. Example 1:
>>> # the following is not so Pythonic
>>> numbers = range(10)
>>> i = 0
>>> evens = []
>>> while i < len(numbers):
...     if i % 2 == 0:
...         evens.append(i)
...     i += 1
...
>>> evens
[0, 2, 4, 6, 8]
>>> # the good way to iterate over a range, elegant and efficient
>>> evens = [i for i in range(10) if i % 2 == 0]
Example 2:
def _treament(pos, element):
    return '%d:%s' % (pos, element)

f = open('test.txt', 'r')

if __name__ == '__main__':
    # list comps 1: total length of all words in the file
    print(sum(len(word) for line in f for word in line.split()))
    # list comps 2: nested loops
    print([(x + 1, y + 1) for x in range(3) for y in range(4)])
    # func: the same filtering done with filter and a lambda
    print(list(filter(lambda x: x % 2 == 0, range(10))))
    # list comps 3
    print([i for i in range(10) if i % 2 == 0])
    # list comps 4, pythonic
    print([_treament(i, el) for i, el in enumerate(range(10))])
    f.close()

Output:
24
[(1, 1), (1, 2), (1, 3), (1, 4), (2, 1), (2, 2), (2, 3), (2, 4), (3, 1), (3, 2), (3, 3), (3, 4)]
[0, 2, 4, 6, 8]
[0, 2, 4, 6, 8]
['0:0', '1:1', '2:2', '3:3', '4:4', '5:5', '6:6', '7:7', '8:8', '9:9']
Yes, it really is that elegant and simple.
(2) Generator expressions were introduced in Python 2.4 (generators themselves arrived in Python 2.2). They use lazy evaluation, so they are more memory efficient. The following example, taken from Core Python Programming, computes the length of the longest line in a file:
def longest_line():
    f = open('/etc/motd', 'r')
    # the generator expression yields one line length at a time
    longest = max(len(x.strip()) for x in f)
    f.close()
    return longest
This implementation is simple and does not need to read the whole file into memory.
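The memory difference can be observed with a small sketch (Python 3 syntax, hypothetical variable names). sys.getsizeof reports only the size of the container object itself, so the numbers are indicative, not a full memory profile.

```python
import sys

squares_list = [n * n for n in range(100000)]  # all items built up front
squares_gen = (n * n for n in range(100000))   # items produced on demand

list_size = sys.getsizeof(squares_list)
gen_size = sys.getsizeof(squares_gen)
print(list_size, gen_size)

total = sum(squares_gen)  # consumes the generator
again = sum(squares_gen)  # a generator can be iterated only once
print(total == sum(squares_list), again)
```

The last two lines show the main caveat: once consumed, a generator is empty, so results that are needed more than once should be stored in a list.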
(3) Decorators, introduced in Python 2.4, are another exciting feature. A decorator is a function that receives a function and returns an enhanced version of it, which makes wrapped functions and methods easier to read and understand. The '@' symbol is the decorator syntax. One use is to decorate a function so that the results of its calls are remembered for later use; this technique is called memoization. The following decorator implements such a cache:
import time
import hashlib
import pickle

cache = {}

def is_obsolete(entry, duration):
    return time.time() - entry['time'] > duration

def compute_key(function, args, kw):
    # serialize the function name and the arguments with the pickle module
    key = pickle.dumps((function.__name__, args, kw))
    # hashlib provides MD5 and SHA-1; the digest is the key in the global dictionary
    return hashlib.sha1(key).hexdigest()

def memoize(duration=10):
    def _memoize(function):
        def __memoize(*args, **kw):
            key = compute_key(function, args, kw)
            # do we have it already?
            if key in cache and not is_obsolete(cache[key], duration):
                print('We got a winner')
                return cache[key]['value']
            # computing
            result = function(*args, **kw)
            # storing the result
            cache[key] = {'value': result, 'time': time.time()}
            return result
        return __memoize
    return _memoize

@memoize()
def very_very_complex_stuff(a, b, c):
    return a + b + c

print(very_very_complex_stuff(2, 2, 2))
print(very_very_complex_stuff(2, 2, 2))

@memoize(1)
def very_very_complex_stuff(a, b):
    return a + b

print(very_very_complex_stuff(2, 2))
time.sleep(2)
print(very_very_complex_stuff(2, 2))
Output:
6
We got a winner
6
4
4
Decorators are useful in many scenarios, such as parameter checking, lock synchronization, and unit testing frameworks; interested readers can explore these further on their own.
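As one illustration of the parameter-checking scenario, here is a minimal sketch in Python 3 syntax. The decorator `require_positive` and the function `area` are hypothetical names, not taken from any library; functools.wraps is the standard helper that preserves the wrapped function's name and docstring.

```python
import functools


def require_positive(function):
    """Hypothetical decorator: reject non-positive numeric arguments."""
    @functools.wraps(function)
    def wrapper(*args, **kw):
        for value in list(args) + list(kw.values()):
            if value <= 0:
                raise ValueError('expected a positive value, got %r' % (value,))
        return function(*args, **kw)
    return wrapper


@require_positive
def area(width, height):
    return width * height


print(area(2, 3))
try:
    area(-1, 3)
except ValueError as exc:
    print('rejected:', exc)
```

The check lives in one place and can be applied to any function with the same constraint, which is the main appeal of this pattern.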
3. Leverage Python's powerful introspection capabilities (attributes and descriptors): since I started using Python, I have been genuinely surprised at how powerful and simple introspection can be. The topic is too large to cover here, so I will summarize it separately when time permits. Anyone learning Python should gain a good understanding of its introspection.
Third, coding tips:
- In versions before Python 3, use xrange instead of range, because range() returns the complete list of elements directly, whereas xrange() produces only one integer at a time during iteration, with little overhead. (xrange no longer exists in Python 3, where range returns a lazy sequence object that can cover a range of any length without building a list.)
- 'if done is not None' is faster than 'if done != None';
- Use the 'in' operator as much as possible; it is concise and fast: for i in seq: print i
- Use 'x < y < z' instead of 'x < y and y < z';
- 'while 1' is faster than 'while True' in Python 2, because the former is a single-step operation, whereas the latter has to look up the name True (in Python 3 True is a keyword, so the two are equivalent);
- Use built-in functions as much as possible, because they are often efficient; for example, when a function object is needed, operator.add is better than an equivalent lambda a, b: a + b;
- In a time-consuming loop, function calls can be replaced with inline code, and the inner loop should be kept concise.
- Swap elements using multiple assignment:
x, y = y, x  # elegant and efficient
Instead of:
temp = x
x = y
y = temp
- Ternary operator (Python 2.5 and later): use 'v1 if x else v2' and avoid '(x and v1) or v2', because the latter gives the wrong result when v1 is falsy, for example v1 = "".
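The pitfall with the and/or idiom is easy to reproduce. The sketch below uses Python 3 syntax and hypothetical values for x, v1, and v2.

```python
x, v1, v2 = True, '', 'fallback'  # hypothetical values; v1 is falsy

old_style = (x and v1) or v2  # '' is falsy, so the expression falls through to v2
new_style = v1 if x else v2   # selects v1 because x is true

print(repr(old_style))  # 'fallback'
print(repr(new_style))  # ''
```

Even though x is true, the and/or form returns v2, which is why the conditional expression is the safer choice.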
- Switch-case in Python: because switch-case can be replaced entirely by if/elif/else, Python has no switch-case syntax, but it can be emulated with a dictionary or with lambdas:
Switch-case structure:
switch (var) {
    case v1: func1();
    case v2: func2();
    ...
    case vN: funcN();
    default: default_func();
}
Dictionary implementation:
values = {
    v1: func1,
    v2: func2,
    ...
    vN: funcN,
}
values.get(var, default_func)()
Lambda implementation:
{
    '1': lambda: func1,
    '2': lambda: func2,
    '3': lambda: func3
}[value]()
try...except can also be used to implement the case with a default; personally I recommend the dict approach.
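Both variants can be written out as a runnable sketch in Python 3 syntax. The handler functions and the command strings are hypothetical and stand in for func1, func2, and default_func above.

```python
def start():
    return 'starting'


def stop():
    return 'stopping'


def unknown():
    return 'unknown command'


handlers = {'start': start, 'stop': stop}  # maps a case value to its function


def dispatch(command):
    """Dictionary version: get() supplies the default branch."""
    return handlers.get(command, unknown)()


def dispatch_try(command):
    """try...except version: a KeyError selects the default branch."""
    try:
        handler = handlers[command]
    except KeyError:
        handler = unknown
    return handler()


print(dispatch('start'), dispatch('pause'))
print(dispatch_try('stop'), dispatch_try('pause'))
```

The dictionary version keeps the default in a single expression, which is one reason it tends to read more cleanly than the try...except form.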
Summary of Python coding best practices