Comparison of several methods to improve Python running efficiency
In my opinion, the Python community is split into three camps: Python 2.x, Python 3.x, and PyPy. The split basically comes down to library compatibility and speed. This article focuses on some general code optimization techniques and on the significant performance gains from compiling to C. I will also give run times for all three Python camps. My goal is not to prove that one is better than another, but to show you how to use these examples to make comparisons in different environments.
Use Generators
A memory optimization that is often overlooked is the use of generators. A generator lets us write a function that returns one record at a time instead of all records at once. If you are using Python 2.x, this is why you use xrange instead of range, or ifilter instead of filter. A good example is creating a large list of numbers and adding them together.
import timeit
import random

def generate(num):
    while num:
        yield random.randrange(10)
        num -= 1

def create_list(num):
    numbers = []
    while num:
        numbers.append(random.randrange(10))
        num -= 1
    return numbers

print(timeit.timeit("sum(generate(999))", setup="from __main__ import generate", number=1000))
>>> 0.88098192215 #Python 2.7
>>> 1.416813850402832 #Python 3.2

print(timeit.timeit("sum(create_list(999))", setup="from __main__ import create_list", number=1000))
>>> 0.924163103104 #Python 2.7
>>> 1.5026731491088867 #Python 3.2
Not only is this slightly faster, it also avoids storing the whole list in memory!
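To make the memory point concrete, here is a minimal sketch that compares the size of a fully built list with the size of a generator object. The helper names count_up and measure are hypothetical, and sys.getsizeof reports only the container itself, not the integers it refers to, so treat the numbers as a rough indication rather than a precise measurement.

```python
import sys

def count_up(n):
    """Generator: produces one value at a time instead of a whole list."""
    i = 0
    while i < n:
        yield i
        i += 1

def measure(n):
    """Return (size of a list holding n items, size of a generator object)."""
    as_list = list(count_up(n))   # every element is stored at once
    as_gen = count_up(n)          # nothing is computed until it is iterated
    return sys.getsizeof(as_list), sys.getsizeof(as_gen)

list_size, gen_size = measure(100000)
print("list: %d bytes, generator: %d bytes" % (list_size, gen_size))
```

The generator's size stays the same no matter how large n is, while the list grows with it.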
Ctypes Introduction
For performance-critical code, Python also gives us an API for calling C functions, mainly through ctypes. You can use ctypes without writing any C code of your own, because a precompiled standard C library is already available on your system and ctypes can load it. Let's return to the generator example and see how long it takes when we use ctypes.
import timeit
from ctypes import cdll

def generate_c(num):
    #Load standard C library
    libc = cdll.LoadLibrary("libc.so.6") #Linux
    #libc = cdll.msvcrt #Windows
    while num:
        yield libc.rand() % 10
        num -= 1

print(timeit.timeit("sum(generate_c(999))", setup="from __main__ import generate_c", number=1000))
>>> 0.434374809265 #Python 2.7
>>> 0.7084300518035889 #Python 3.2
Just by switching to C's random function, the run time was cut roughly in half! If I told you we could do even better, would you believe me?
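As a small illustration of calling C functions through ctypes without writing any C, the sketch below looks up the C library with ctypes.util.find_library instead of hard-coding "libc.so.6", and declares the argument and return types of srand and rand. It assumes a POSIX system (Linux or macOS); the helper name seeded_rolls is made up for this example.

```python
import ctypes
import ctypes.util

# Locate the C library by name; on POSIX, CDLL(None) falls back to the
# symbols already loaded into the current process if nothing is found.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declaring the signatures lets ctypes check and convert the arguments.
libc.srand.argtypes = [ctypes.c_uint]
libc.srand.restype = None
libc.rand.argtypes = []
libc.rand.restype = ctypes.c_int

def seeded_rolls(seed, count):
    """Return `count` values in 0..9 from C's rand(), seeded with `seed`."""
    libc.srand(seed)
    return [libc.rand() % 10 for _ in range(count)]

first = seeded_rolls(42, 5)
second = seeded_rolls(42, 5)
print(first, second)
```

Seeding with the same value twice should give the same sequence, which is handy when you want repeatable benchmark input.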
Cython Introduction
Cython is a superset of Python. It lets us call C functions and declare variable types to improve performance. We need to install Cython before we can use it.
sudo pip install cython
Cython is essentially a fork of a similar library called Pyrex, which is no longer under development. It compiles our Python-like code into a C library that can be called from a Python file. Use the .pyx suffix instead of the .py suffix for your Python files. Let's take a look at how Cython runs our generator code.
#cython_generator.pyx
import random

def generate(num):
    while num:
        yield random.randrange(10)
        num -= 1
We need to create a setup.py so that Cython can compile our function.
from distutils.core import setup
from distutils.extension import Extension
from Cython.Distutils import build_ext

setup(
    cmdclass = {'build_ext': build_ext},
    ext_modules = [Extension("generator", ["cython_generator.pyx"])]
)
Compile and use:
python setup.py build_ext --inplace
You should now see two new files: cython_generator.c and generator.so. We test our program as follows:
import timeit
print(timeit.timeit("sum(generator.generate(999))", setup="import generator", number=1000))
>>> 0.835658073425
Not bad, but let's see whether we can improve on it. First we can declare "num" as an integer, and then we can import the standard C library to handle our random function.
#cython_generator.pyx
cdef extern from "stdlib.h":
    int c_libc_rand "rand"()

def generate(int num):
    while num:
        yield c_libc_rand() % 10
        num -= 1
If we compile and run it again, we see this amazing number.
>>> 0.033586025238
Not bad for just a few changes. However, making this kind of change can sometimes be tedious, so let's see what we can do with regular Python.
PyPy Introduction
PyPy is a just-in-time (JIT) compiler for Python 2.7.3. In plain terms, this means your code runs faster. Quora uses PyPy in production. PyPy has installation instructions on its download page, but if you are on Ubuntu you can install it through apt-get. It works out of the box, so there are no crazy bash or build scripts to run; you just download it and run it. Let's take a look at how our original generator code performs under PyPy.
import timeit
import random

def generate(num):
    while num:
        yield random.randrange(10)
        num -= 1

def create_list(num):
    numbers = []
    while num:
        numbers.append(random.randrange(10))
        num -= 1
    return numbers

print(timeit.timeit("sum(generate(999))", setup="from __main__ import generate", number=1000))
>>> 0.115154981613 #PyPy 1.9
>>> 0.118431091309 #PyPy 2.0b1

print(timeit.timeit("sum(create_list(999))", setup="from __main__ import create_list", number=1000))
>>> 0.140175104141 #PyPy 1.9
>>> 0.140514850616 #PyPy 2.0b1
Wow! Without modifying a single line of code, it runs about eight times as fast as it did under pure Python 2.7.
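Since the point of these examples is to compare run times across interpreters, it can help to have the script label its own output. The sketch below is one possible way to do that, using platform.python_implementation() to report whether it is running under CPython or PyPy. The helper name best_time and the repeat counts are arbitrary choices for illustration.

```python
import platform
import random
import timeit

def generate(num):
    # Same generator as in the examples above.
    while num:
        yield random.randrange(10)
        num -= 1

def best_time(func, repeat=3, number=100):
    """Run func `number` times per trial and return the fastest trial."""
    return min(timeit.repeat(func, repeat=repeat, number=number))

# For example "CPython 3.11.4" or "PyPy 3.9.18", depending on the interpreter.
label = "%s %s" % (platform.python_implementation(), platform.python_version())
elapsed = best_time(lambda: sum(generate(999)))
print("%s: %.6f s for 100 runs" % (label, elapsed))
```

Running the same file under each interpreter then gives labeled lines that can be compared directly.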
Further Testing
So why bother testing any further? PyPy is the champion, right? Not quite. Although most programs run on PyPy, some libraries are not fully supported. Furthermore, it is easier to write a C extension for your project than to switch compilers. Let's take a deeper look at how ctypes lets us write our own libraries in C. We will test the speed of a merge sort and of a Fibonacci calculation. Below is the C code (functions.c) we will use:
/* functions.c */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* http://rosettacode.org/wiki/Sorting_algorithms/Merge_sort#C */
/* static: under C99 inline semantics a plain `inline` function may get no
   external definition, which leaves `merge` undefined when the library loads */
static inline void
merge (int *left, int l_len, int *right, int r_len, int *out)
{
    int i, j, k;
    for (i = j = k = 0; i < l_len && j < r_len;)
        out[k++] = left[i] < right[j] ? left[i++] : right[j++];
    while (i < l_len)
        out[k++] = left[i++];
    while (j < r_len)
        out[k++] = right[j++];
}

/* inner recursion of merge sort */
void
recur (int *buf, int *tmp, int len)
{
    int l = len / 2;
    if (len <= 1)
        return;
    /* note that buf and tmp are swapped */
    recur (tmp, buf, l);
    recur (tmp + l, buf + l, len - l);
    merge (tmp, l, tmp + l, len - l, buf);
}

/* preparation work before recursion */
void
merge_sort (int *buf, int len)
{
    /* call alloc, copy and free only once */
    int *tmp = malloc (sizeof (int) * len);
    memcpy (tmp, buf, sizeof (int) * len);
    recur (buf, tmp, len);
    free (tmp);
}

int
fibRec (int n)
{
    if (n < 2)
        return n;
    else
        return fibRec (n - 1) + fibRec (n - 2);
}
On the Linux platform, we can compile it into a shared library using the following method:
gcc -Wall -fPIC -c functions.c
gcc -shared -o libfunctions.so functions.o
By loading the shared library "libfunctions.so" with ctypes, we can use it just as we did the standard C library. Here we compare the Python implementation with the C implementation. We start with the Fibonacci calculation:
# functions.py
from ctypes import *
import time

libfunctions = cdll.LoadLibrary("./libfunctions.so")

def fibRec(n):
    if n < 2:
        return n
    else:
        return fibRec(n-1) + fibRec(n-2)

start = time.time()
fibRec(32)
finish = time.time()
print("Python: " + str(finish - start))

# C Fibonacci
start = time.time()
x = libfunctions.fibRec(32)
finish = time.time()
print("C: " + str(finish - start))
As expected, C is faster than both Python and PyPy. We can compare merge sort in the same way.
We have not yet dug deep into the ctypes library, so these examples do not show off its more powerful side. ctypes has only a small number of standard types, such as int, char array, float, bytes and so on. There is no integer array type by default, but we can obtain one by multiplying c_int (the ctypes type for int) by a number, which creates an array of that size. This is what the c_numbers line in the code below does: we create a c_int array the same length as our numbers list and unpack the numbers into it.
It is important to remember that a C function cannot return an array, nor would you want it to. Instead, we use pointers so that the function modifies the array in place. To pass our c_numbers array, we must pass it to merge_sort by reference. After merge_sort runs, the c_numbers array is sorted. I added the following code to my functions.py file.
#Python Merge Sort
from random import shuffle, sample

#Generate 9999 random numbers between 0 and 100000
numbers = sample(range(100000), 9999)
shuffle(numbers)
c_numbers = (c_int * len(numbers))(*numbers)

from heapq import merge
def merge_sort(m):
    if len(m) <= 1:
        return m
    middle = len(m) // 2
    left = m[:middle]
    right = m[middle:]
    left = merge_sort(left)
    right = merge_sort(right)
    return list(merge(left, right))

start = time.time()
numbers = merge_sort(numbers)
finish = time.time()
print("Python: " + str(finish - start))

#C Merge Sort
start = time.time()
libfunctions.merge_sort(byref(c_numbers), len(numbers))
finish = time.time()
print("C: " + str(finish - start))

Python: 0.190635919571 #Python 2.7
Python: 0.11785483360290527 #Python 3.2
Python: 0.266992092133 #PyPy 1.9
Python: 0.265724897385 #PyPy 2.0b1
C: 0.00201296806335 #Python 2.7 + ctypes
C: 0.0019741058349609375 #Python 3.2 + ctypes
C: 0.0029308795929 #PyPy 1.9 + ctypes
C: 0.00287103652954 #PyPy 2.0b1 + ctypes
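The array handling described above can be tried even without compiling functions.c. The sketch below is an illustration only: it swaps in qsort from the standard C library for merge_sort, builds the c_int array the same way, and lets the C function sort it in place through a pointer. It assumes a POSIX system, and the comparison callback compare_ints is a helper introduced for this example.

```python
import ctypes
import ctypes.util
from random import sample

libc = ctypes.CDLL(ctypes.util.find_library("c"))

numbers = sample(range(100000), 999)
# Multiplying c_int by a length gives an array type; calling it with the
# unpacked list fills the array.
c_numbers = (ctypes.c_int * len(numbers))(*numbers)

# qsort needs a C callback: int (*compar)(const int *, const int *)
CMPFUNC = ctypes.CFUNCTYPE(ctypes.c_int,
                           ctypes.POINTER(ctypes.c_int),
                           ctypes.POINTER(ctypes.c_int))

def compare_ints(a, b):
    # a and b are pointers; index 0 dereferences them.
    return (a[0] > b[0]) - (a[0] < b[0])

libc.qsort.argtypes = [ctypes.POINTER(ctypes.c_int), ctypes.c_size_t,
                       ctypes.c_size_t, CMPFUNC]
libc.qsort.restype = None

# The array is passed as a pointer, so C sorts it in place and returns nothing.
libc.qsort(c_numbers, len(c_numbers), ctypes.sizeof(ctypes.c_int),
           CMPFUNC(compare_ints))

print(list(c_numbers[:5]))
```

Because the comparison callback runs in Python, this will not show the speed of the pure C merge_sort; it only demonstrates how the array is passed to C and modified there.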
Here we use tables and charts to compare the different results.