Optimize the time complexity of the algorithm
The time complexity of an algorithm has the greatest influence on a program's execution efficiency. In Python, complexity can often be reduced by choosing the appropriate data structure: for example, the time complexity of finding an element in a list is O(n), while in a set it is O(1). Different scenarios call for different optimizations; in general, the common approaches include divide and conquer, branch and bound, greedy algorithms, and dynamic programming.
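To make the list/set gap concrete, here is a small measurement sketch using the standard timeit module (the element searched for and the sizes are arbitrary; absolute numbers will vary by machine):
import timeit

setup = "a = list(range(100000)); s = set(a)"
# Searching near the end of the list forces a full O(n) scan;
# the set does a single O(1) hash lookup.
print(timeit.timeit('99999 in a', setup=setup, number=100))
print(timeit.timeit('99999 in s', setup=setup, number=100))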
Reduce redundant data
For example, store a large symmetric matrix as only its upper or lower triangle, and use a sparse representation for a matrix that is mostly zeros.
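A minimal sketch of both ideas, assuming NumPy and SciPy are available; the size n and the helper sym_get are purely illustrative:
import numpy as np
from scipy import sparse

n = 1000
full = np.random.rand(n, n)
sym = (full + full.T) / 2            # a symmetric matrix
packed = sym[np.triu_indices(n)]     # keep only the n*(n+1)/2 upper-triangle values

def sym_get(packed, n, i, j):
    # Map (i, j) into the packed upper triangle, exploiting symmetry.
    if i > j:
        i, j = j, i
    return packed[i * n - i * (i - 1) // 2 + (j - i)]

# For a matrix that is mostly zeros, a sparse representation stores only nonzeros.
m = sparse.lil_matrix((n, n))
m[0, 1] = 3.0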
Use copy and deepcopy appropriately
For objects such as dicts and lists, direct assignment only binds a new reference. When an actual copy of the object is needed, use copy.copy or copy.deepcopy from the copy module; the difference between the two is that deepcopy copies recursively, so their efficiency differs. (The following code runs in IPython.)
import copy
a = range(100000)
%timeit -n 10 copy.copy(a)      # run copy.copy(a) 10 times per loop
%timeit -n 10 copy.deepcopy(a)
10 loops, best of 3: 1.55 ms per loop
10 loops, best of 3: 151 ms per loop
The -n option to %timeit sets the number of executions per timing loop, and the last two lines are the output of the two %timeit calls (the same applies below). The output shows that deepcopy is about two orders of magnitude slower.
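The behavioral difference in a few lines: copy.copy shares nested objects, while copy.deepcopy duplicates them recursively.
import copy
nested = [[1, 2], [3, 4]]
shallow = copy.copy(nested)
deep = copy.deepcopy(nested)
nested[0].append(9)
print(shallow[0])   # [1, 2, 9] -- the inner list is shared with nested
print(deep[0])      # [1, 2]    -- fully independent copy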
Use dict or set to look up elements
Python's dict and set are implemented with hash tables (similar to unordered_map in the C++11 standard library), so the time complexity of looking up an element is O(1).
a = range(1000)
s = set(a)
d = dict((i, 1) for i in a)
%timeit -n 10000 100 in d
%timeit -n 10000 100 in s
10000 loops, best of 3: 43.5 ns per loop
10000 loops, best of 3: 49.6 ns per loop
Here the dict is slightly more efficient (but takes up more space).
Use generators and yield appropriately
%timeit -n 100 a = (i for i in range(100000))
%timeit -n 100 b = [i for i in range(100000)]
100 loops, best of 3: 1.54 ms per loop
100 loops, best of 3: 4.56 ms per loop
Using () yields a generator object whose memory footprint is independent of the size of the sequence, so it is more memory-efficient. For certain applications, for example, set(i for i in range(100000)) is faster than set([i for i in range(100000)]).
However, for cases that require looping over the entire sequence:
%timeit -n 100 for x in (i for i in range(100000)): pass
%timeit -n 100 for x in [i for i in range(100000)]: pass
100 loops, best of 3: 6.51 ms per loop
100 loops, best of 3: 5.54 ms per loop
Here the list comprehension is slightly more efficient, but if the loop may break out early, the benefit of the generator is obvious. yield can also be used to create a generator:
def yield_func(ls):
    for i in ls:
        yield i + 1

def not_yield_func(ls):
    return [i + 1 for i in ls]

ls = range(1000000)
%timeit -n 10 for i in yield_func(ls): pass
%timeit -n 10 for i in not_yield_func(ls): pass
10 loops, best of 3: 63.8 ms per loop
10 loops, best of 3: 62.9 ms per loop
For a list that is not very large, returning the list directly is fine, but the yield version is more readable (a matter of preference).
Built-in generator facilities in Python 2.x include the xrange function, the itertools package, and so on.
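For example, itertools composes lazy iterators without materializing intermediate lists; a small illustrative sketch:
import itertools

# islice lazily takes a slice of an (even infinite) iterator.
evens = (i for i in itertools.count() if i % 2 == 0)
print(list(itertools.islice(evens, 5)))    # [0, 2, 4, 6, 8]

# chain walks several iterables without building a combined list.
for x in itertools.chain(xrange(3), xrange(3)):
    print(x)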
Optimizing loops
Whatever can be done outside a loop should not be done inside it. For example, hoisting the len call below makes the loop roughly twice as fast:
a = range(10000)
size_a = len(a)
%timeit -n 1000 for i in a: k = len(a)
%timeit -n 1000 for i in a: k = size_a
1000 loops, best of 3: 569 µs per loop
1000 loops, best of 3: 256 µs per loop
Optimize the order of conditions in compound boolean expressions
For and, put the condition that is satisfied least often first; for or, put the condition that is satisfied most often first. For example:
a = range(2000)
%timeit -n 100 [i for i in a if 10 < i < 20 or 1000 < i < 2000]
%timeit -n 100 [i for i in a if 1000 < i < 2000 or 10 < i < 20]
%timeit -n 100 [i for i in a if i % 2 == 0 and i > 1900]
%timeit -n 100 [i for i in a if i > 1900 and i % 2 == 0]
100 loops, best of 3: 287 µs per loop
100 loops, best of 3: 214 µs per loop
100 loops, best of 3: 128 µs per loop
100 loops, best of 3: 56.1 µs per loop
Use join to concatenate strings from an iterable
Assuming a is a list of strings:
In [1]: %%timeit
   ...: s = ''
   ...: for i in a:
   ...:     s += i
   ...:
10000 loops, best of 3: 59.8 µs per loop
In [2]: %%timeit
   ...: s = ''.join(a)
   ...:
100000 loops, best of 3: 11.8 µs per loop
join is about 5 times faster than cumulative += concatenation.
Choose the appropriate string formatting method
s1, s2 = 'ax', 'bx'
%timeit -n 100000 'abc%s%s' % (s1, s2)
%timeit -n 100000 'abc{0}{1}'.format(s1, s2)
%timeit -n 100000 'abc' + s1 + s2
100000 loops, best of 3: 183 ns per loop
100000 loops, best of 3: 169 ns per loop
100000 loops, best of 3: 103 ns per loop
Of the three, % is the slowest, but the differences are small (all three are very fast). (Personally, I find % the most readable.)
Swap two variables without an intermediate variable
In [3]: %%timeit -n 10000
   ...: a, b = 1, 2
   ...: c = a; a = b; b = c
   ...:
10000 loops, best of 3: 172 ns per loop
In [4]: %%timeit -n 10000
   ...: a, b = 1, 2
   ...: a, b = b, a
   ...:
10000 loops, best of 3: 86 ns per loop
Using a, b = b, a rather than c = a; a = b; b = c to swap the values of a and b is about twice as fast.
Use is instead of ==
a = range(10000)
%timeit -n 100 [i for i in a if i == True]
%timeit -n 100 [i for i in a if i is True]
100 loops, best of 3: 531 µs per loop
100 loops, best of 3: 362 µs per loop
Using if i is True is noticeably faster than if i == True (about 1.5x in the run above).
Use chained comparisons: x < y < z
x, y, z = 1, 2, 3
%timeit -n 1000000 if x < y < z: pass
%timeit -n 1000000 if x < y and y < z: pass
1000000 loops, best of 3: 101 ns per loop
1000000 loops, best of 3: 121 ns per loop
x < y < z is slightly more efficient, and more readable as well.
while 1 is faster than while True
def while_1():
    n = 100000
    while 1:
        n -= 1
        if n <= 0: break

def while_true():
    n = 100000
    while True:
        n -= 1
        if n <= 0: break

m, n = 1000000, 1000000
%timeit -n 100 while_1()
%timeit -n 100 while_true()
100 loops, best of 3: 3.69 ms per loop
100 loops, best of 3: 5.61 ms per loop
while 1 is much faster than while True because in Python 2.x, True is a global variable rather than a keyword, so it must be looked up on every iteration.
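You can confirm this with the dis module: in CPython 2.x, while True loads the global True and tests it on every iteration, whereas while 1 compiles down to an unconditional jump.
import dis
dis.dis(while_1)     # no test at the top of the loop
dis.dis(while_true)  # LOAD_GLOBAL (True) plus a conditional jump on every pass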
Use ** instead of pow
%timeit -n 10000 c = pow(2, 20)
%timeit -n 10000 c = 2 ** 20
10000 loops, best of 3: 284 ns per loop
10000 loops, best of 3: 16.9 ns per loop
** is more than 10 times faster!
Use the C implementations: cProfile, cStringIO, and cPickle (instead of profile, StringIO, and pickle)
import cPickle
import pickle
a = range(10000)
%timeit -n 100 x = cPickle.dumps(a)
%timeit -n 100 x = pickle.dumps(a)
100 loops, best of 3: 1.58 ms per loop
100 loops, best of 3: 17 ms per loop
The C implementation is more than 10 times faster!
Use the fastest way to deserialize
The following compares the efficiency of eval, cPickle, and json when deserializing the corresponding strings:
import json
import cPickle
a = range(10000)
s1 = str(a)
s2 = cPickle.dumps(a)
s3 = json.dumps(a)
%timeit -n 100 x = eval(s1)
%timeit -n 100 x = cPickle.loads(s2)
%timeit -n 100 x = json.loads(s3)
100 loops, best of 3: 16.8 ms per loop
100 loops, best of 3: 2.02 ms per loop
100 loops, best of 3: 798 µs per loop
As you can see, json is nearly 3 times faster than cPickle, and more than 20 times faster than eval.
Use C extensions
There are currently four approaches: the native API of CPython (the most common Python implementation), ctypes, Cython, and cffi. Their role is to let Python programs call dynamic link libraries compiled from C. Their characteristics are:
CPython native API: by including the Python.h header file, the C program can use Python data structures directly. The implementation is relatively cumbersome, but its scope of applicability is the widest.
ctypes: typically used to wrap C programs, letting pure Python call functions in a dynamic link library (a DLL on Windows or a .so file on Unix). If you want to use an existing C class library in Python, ctypes is a good choice; by some benchmarks, Python 2 + ctypes is the best-performing combination. (See the sketch after this list.)
Cython: Cython is a superset of the Python language that simplifies writing C extensions. Its advantages are concise syntax and good compatibility with NumPy and other libraries that contain a large amount of C extension code. The typical scenario for Cython is optimizing one algorithm or process within a project; in some tests it delivers a performance boost of hundreds of times.
cffi: cffi is the implementation of ctypes in PyPy (see below) and is also compatible with CPython. cffi provides a way to use C class libraries from Python: C code can be written directly in Python code, and linking to existing C libraries is supported.
These methods are generally used to optimize the modules that are performance bottlenecks in an existing project: with a small number of changes to the original code, the running efficiency of the whole program can be greatly improved.
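As an illustration of the ctypes route, here is a minimal sketch; libfast.so and its c_sum function are hypothetical placeholders for whatever C library you actually wrap:
import ctypes

# Hypothetical shared library built from C source such as:
#   double c_sum(double *xs, int n) { /* sum the array */ }
lib = ctypes.CDLL('./libfast.so')
lib.c_sum.argtypes = [ctypes.POINTER(ctypes.c_double), ctypes.c_int]
lib.c_sum.restype = ctypes.c_double

data = [1.0, 2.0, 3.0]
arr = (ctypes.c_double * len(data))(*data)   # build a C double[] from Python floats
print(lib.c_sum(arr, len(data)))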
Parallel programming
Because of the GIL, it is hard for Python to take full advantage of multi-core CPUs. However, several parallel modes can be implemented with the built-in multiprocessing module:
Multi-process: for CPU-intensive programs, multiprocessing's Process, Pool, and other packaged classes implement parallel computation with multiple processes. However, because communication between processes is expensive, programs that need a lot of data interaction between processes may not improve much.
Multithreading: for IO-intensive programs, the multiprocessing.dummy module wraps threading in the multiprocessing interface, which makes multithreaded programming easy (e.g., Pool's map interface: simple and efficient).
Distributed: the Managers class in multiprocessing provides a way to share data among different processes, on top of which distributed programs can be developed.
Different business scenarios can choose one of these, or a combination of several, to optimize program performance; a sketch of the first two modes follows.
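A minimal sketch of the multi-process and multithreaded modes; work and fetch are hypothetical stand-ins for your own CPU-bound and IO-bound functions:
import multiprocessing
import multiprocessing.dummy   # same Pool API, backed by threads
import urllib2

def work(n):
    # CPU-bound: benefits from multiple processes, each with its own GIL
    return sum(i * i for i in xrange(n))

def fetch(url):
    # IO-bound: threads overlap the network waits despite the GIL
    return len(urllib2.urlopen(url).read())

if __name__ == '__main__':
    procs = multiprocessing.Pool(4)
    print(procs.map(work, [10 ** 6] * 8))
    threads = multiprocessing.dummy.Pool(8)
    print(threads.map(fetch, ['http://example.com'] * 8))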
The ultimate weapon: PyPy
PyPy is Python implemented in RPython (a restricted subset of Python); according to the benchmark data on the official site, it is more than 6 times faster than CPython. It is fast because it has a just-in-time (JIT) compiler, a dynamic compiler which, unlike static compilers such as gcc and javac, optimizes using data collected from the running program. For historical reasons the GIL is still present in PyPy, but the ongoing STM project is attempting to turn PyPy into Python without the GIL.
If a Python program contains C extensions (written without cffi), the JIT's optimization effect is greatly reduced, and the program may even be slower than CPython (as is the case with NumPy). So in PyPy it is best to use pure Python or cffi extensions.
As STM, NumPy support, and other projects mature, I believe PyPy will replace CPython.
Using Profiling Tools
Besides the %timeit magic used in IPython above, there is cProfile. cProfile is also very simple to use: run python -m cProfile filename.py, where filename.py is the program file to profile. The standard output shows how many times each function is called and how long it runs, which lets you find the program's performance bottleneck and then optimize it in a targeted way.
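cProfile can also be driven from inside a program; a minimal sketch (slow and out.prof are just illustrative names):
import cProfile
import pstats

def slow():
    return sum(i ** 2 for i in xrange(10 ** 6))

cProfile.run('slow()', 'out.prof')             # profile the call and save the stats
stats = pstats.Stats('out.prof')
stats.sort_stats('cumulative').print_stats(5)  # top 5 entries by cumulative time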
Original: http://segmentfault.com/blog/defool/1190000000666603