Python becomes the first in the programming language! Send you 20 python performance optimization suggestions!

Source: Internet
Author: User

1. Optimization of the algorithm time complexity

The time complexity of the algorithm has the greatest impact on the execution efficiency of the program, and in Python it is possible to optimize the time complexity by selecting the appropriate data structure, such as the time complexity of the list and set to find an element, respectively O (n) and O (1). Different scenarios have different optimization methods, in general, there are divided, branch boundaries, greed, dynamic planning and other ideas.

The Timeit followed by-n indicates the number of runs, and the next two lines correspond to the output of two Timeit. This shows that the latter is one order of magnitude slower.

Enter the group: 548377875 can get dozens of sets of PDFs Oh!

But for situations where loop traversal is required:

For a list that is not very large in memory, you can return a list directly, but the readability yield is better (a person's preference).

Python2.x has xrange function, Itertools package, etc. built-in generator function.

6. Optimized circulation

What you can do outside of the loop is not in the loop, for example, the following optimizations can be faster:

8. Use join to merge strings in iterators

Joins have a 5 times-fold increase in the way they accumulate.

9. Choose the appropriate format character method

10. Exchange values of two variables without using intermediate variables

12. Use cascading to compare X < Y < Z

13, while 1 is faster than while True

14. Use * * instead of POW

17. Using C extension (Extension)

There are currently cpython (the most common way to implement Python) native API, Ctypes,cython,cffi three ways, their role is to make the Python program can invoke C compiled by the dynamic link library, the characteristics are:

CPython Native API: By introducing the Python.h header file, Python's data structure can be used directly in the corresponding C program. The implementation process is relatively cumbersome, but has a relatively large scope of application.

cTYPES: Typically used for encapsulating (wrap) C programs, allowing pure Python programs to invoke functions in a dynamic-link library (DLL in Windows or so files in Unix). If you want to use a C class library in Python already, using cTYPES is a good choice, and with some benchmarks, python2+ctypes is the best way to perform.

Cython:cython is a superset of CPython for simplifying the process of writing C extensions. The advantage of Cython is that the syntax is concise and can be well compatible with NumPy and other libraries that contain a large number of C extensions. The Cython scenario is typically optimized for an algorithm or process in the project. In some tests, you can have hundreds of times times the performance boost.

Cffi:cffi is ctypes in PyPy (see below) in the implementation of the same-in is also compatible with CPython. Cffi provides a way to use Class C libraries in Python, writing C code directly in Python code, and supporting links to existing C-class libraries.

Using these optimizations is generally optimized for existing project performance bottleneck modules, which can greatly improve the efficiency of the entire program in the case of minor changes to the original project.

18. Parallel Programming

Because of the Gil's presence, Python is difficult to take advantage of multicore CPUs. However, there are several parallel modes that can be implemented through the built-in module multiprocessing:

Multi-process: for CPU-intensive programs, you can use multiprocessing Process,pool and other packaged classes to implement parallel computing in a multi-process manner. However, because the communication cost in the process is relatively large, the efficiency of the program that requires a lot of data interaction between processes may not be greatly improved.

Multithreading: For IO-intensive programs, the Multiprocessing.dummy module uses multiprocessing's interface to encapsulate threading, making multithreaded programming very easy (such as the ability to use the pool's map interface for simplicity and efficiency).

Distributed: The managers class in multiprocessing provides a way to share data in different processes, on which a distributed program can be developed.

Different business scenarios can choose one or several of these combinations to achieve program performance optimization.

19, the final stage big kill device: PyPy

PyPy is a python implemented using Rpython (a subset of CPython), which is 6 times times faster than the CPython implementation of Python based on the benchmark data of the website. The reason for this is that the Just-in-time (JIT) compiler, a dynamic compiler, is different from a static compiler (such as GCC,JAVAC, etc.) and is optimized using data from the process that the program is running. The Gil is still in pypy for historical reasons, but the ongoing STM project attempts to turn PyPy into Python without Gil.

If the Python program contains a c extension (non-cffi), the JIT optimization effect will be greatly reduced, even slower than CPython (than NumPy). Therefore, it is best to use pure python or cffi extension in PyPy.

With the improvement of stm,numpy and other projects, I believe PyPy will replace CPython.

20. Using Performance analysis tools

In addition to the Timeit modules used above in Ipython, there are cprofile. CProfile is also very simple to use: Python-m cProfile filename.py,filename.py is the file name to run the program, you can see in the standard output the number of times each function is called and the elapsed time, so as to find the program's performance bottleneck, It can then be optimized in a targeted manner.

Python becomes the first in the programming language! Send you 20 python performance optimization suggestions!

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.