Python optimization techniques: using ctypes to improve execution speed
First, I would like to share a small pitfall I ran into when using Python's ctypes to call a C library.
The problem involved a C function that returns the address of a string allocated with malloc. The function appeared to work normally and had been in use for some time without raising any exception.
In a recent test, however, a segmentation fault occurred while this function was in use, and the program exited.
After troubleshooting, the cause turned out to be the function's return value. By default, ctypes assumes that a foreign function returns a C int, so on a 64-bit system a returned pointer can be truncated and later dereferenced at the wrong address.
You therefore need to set the return type explicitly before calling the function, for example:
func.restype = c_char_p
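As a minimal sketch of the same fix, the example below uses strdup from the C standard library, which also returns a string allocated with malloc. It assumes a POSIX system where the C library can be located through ctypes.util.find_library; the helper name duplicate is made up for illustration. The return type is declared as c_void_p rather than c_char_p here because c_char_p converts the result to a Python bytes object and discards the address, which would leave no way to free the memory.

```python
import ctypes
import ctypes.util

# Assumption: a POSIX system; on Linux this resolves to libc.so.6.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the real signatures instead of relying on the default int return.
libc.strdup.argtypes = [ctypes.c_char_p]
libc.strdup.restype = ctypes.c_void_p   # keep the full pointer value
libc.free.argtypes = [ctypes.c_void_p]
libc.free.restype = None


def duplicate(text):
    """Copy a bytes string through C's strdup and release the C copy."""
    address = libc.strdup(text)
    if not address:
        raise MemoryError("strdup returned NULL")
    try:
        # Read the NUL-terminated string at that address into Python bytes.
        return ctypes.string_at(address)
    finally:
        libc.free(address)


print(duplicate(b"hello"))
```

If the caller never needs to free the memory, setting restype to c_char_p as shown earlier is the shorter option.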
Next, let's look at how to use ctypes in more detail.
The ctypes library lets Python code call functions written in C. This C interface is useful in many situations, such as small hot spots where calling C code improves performance. With it you can load the kernel32.dll and msvcrt.dll dynamic link libraries on Windows and the libc.so.6 library on Linux. Of course, you can also load shared libraries that you compile yourself.
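Loading a system library might look like the following sketch. The helper name load_c_library is hypothetical, and the platform check assumes that msvcrt is available on Windows and that the C library can be found through ctypes.util.find_library elsewhere.

```python
import ctypes
import ctypes.util
import sys


def load_c_library():
    """Return a handle to the platform's C runtime (illustrative helper)."""
    if sys.platform.startswith("win"):
        return ctypes.cdll.msvcrt                      # msvcrt.dll
    return ctypes.CDLL(ctypes.util.find_library("c"))  # e.g. libc.so.6


libc = load_c_library()

# Declaring argument and return types keeps ctypes from guessing.
libc.abs.argtypes = [ctypes.c_int]
libc.abs.restype = ctypes.c_int
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t

print(libc.abs(-42))           # 42
print(libc.strlen(b"ctypes"))  # 6
```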
Let's look at a simple example. We use Python to find all the prime numbers below 1000000, repeat the process 10 times, and measure the total running time.
import math
from timeit import timeit

def check_prime(x):
    values = xrange(2, int(math.sqrt(x)) + 1)
    for i in values:
        if x % i == 0:
            return False
    return True

def get_prime(n):
    return [x for x in xrange(2, n) if check_prime(x)]

print timeit(stmt='get_prime(1000000)', setup='from __main__ import get_prime', number=10)
Output
42.8259568214
Now write the check_prime function in C, save it as prime.c, and build it as a shared library (dynamic link library) that Python can load.
#include <stdio.h>
#include <math.h>

int check_prime(int a)
{
    int c;
    for (c = 2; c <= sqrt(a); c++) {
        if (a % c == 0)
            return 0;
    }
    return 1;
}
Use the following command to generate a .so (shared object) file:
gcc -shared -o prime.so -fPIC prime.c
import ctypes
import math
from timeit import timeit

check_prime_in_c = ctypes.CDLL('./prime.so').check_prime

def check_prime_in_py(x):
    values = xrange(2, int(math.sqrt(x)) + 1)
    for i in values:
        if x % i == 0:
            return False
    return True

def get_prime_in_c(n):
    return [x for x in xrange(2, n) if check_prime_in_c(x)]

def get_prime_in_py(n):
    return [x for x in xrange(2, n) if check_prime_in_py(x)]

py_time = timeit(stmt='get_prime_in_py(1000000)', setup='from __main__ import get_prime_in_py', number=10)
c_time = timeit(stmt='get_prime_in_c(1000000)', setup='from __main__ import get_prime_in_c', number=10)

print "Python version: {} seconds".format(py_time)
print "C version: {} seconds".format(c_time)
Output
Python version: 43.4539749622 seconds
C version: 8.56250786781 seconds
The performance gap is obvious: the C version is roughly five times faster. There are, of course, other and faster ways to determine whether a number is prime.
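One well-known alternative is the sieve of Eratosthenes, which marks multiples instead of testing each number by trial division. The sketch below is not from the original benchmark and no timing is claimed for it; the function name primes_below is chosen only for illustration.

```python
def primes_below(n):
    """Return all primes smaller than n using the sieve of Eratosthenes."""
    if n < 2:
        return []
    # One flag per integer in [0, n); 1 means "still considered prime".
    is_prime = bytearray([1]) * n
    is_prime[0] = 0
    is_prime[1] = 0
    i = 2
    while i * i < n:
        if is_prime[i]:
            # Every multiple of i from i*i upwards is composite.
            for multiple in range(i * i, n, i):
                is_prime[multiple] = 0
        i += 1
    return [x for x in range(2, n) if is_prime[x]]


print(primes_below(30))
```

The same idea could also be moved into C and called through ctypes in the way shown for check_prime.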
Let's look at a more complex example: quicksort.
mylib.c
#include <stdio.h>

typedef struct _Range {
    int start, end;
} Range;

Range new_Range(int s, int e) {
    Range r;
    r.start = s;
    r.end = e;
    return r;
}

void swap(int *x, int *y) {
    int t = *x;
    *x = *y;
    *y = t;
}

void quick_sort(int arr[], const int len) {
    if (len <= 0)
        return;
    Range r[len];
    int p = 0;
    r[p++] = new_Range(0, len - 1);
    while (p) {
        Range range = r[--p];
        if (range.start >= range.end)
            continue;
        int mid = arr[range.end];
        int left = range.start, right = range.end - 1;
        while (left < right) {
            while (arr[left] < mid && left < right)
                left++;
            while (arr[right] >= mid && left < right)
                right--;
            swap(&arr[left], &arr[right]);
        }
        if (arr[left] >= arr[range.end])
            swap(&arr[left], &arr[range.end]);
        else
            left++;
        r[p++] = new_Range(range.start, left - 1);
        r[p++] = new_Range(left + 1, range.end);
    }
}
gcc -shared -o mylib.so -fPIC mylib.c
One difficulty with ctypes is that the types used by native C code do not map exactly onto Python's types. For example, what corresponds to a C array in Python? A list? An array from the array module? Neither can be passed directly, so the data has to be converted to a ctypes array first.
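The conversion can be sketched as follows. The helper names list_to_c_array and array_as_c_array are hypothetical, and the second one assumes that array type code 'i' has the same size as a C int, which holds on common platforms.

```python
import array
import ctypes


def list_to_c_array(values):
    """Copy a Python list of ints into a newly allocated C int array."""
    array_type = ctypes.c_int * len(values)
    return array_type(*values)


def array_as_c_array(values):
    """Wrap an array.array('i') without copying; both share one buffer."""
    array_type = ctypes.c_int * len(values)
    return array_type.from_buffer(values)


numbers = [30, 10, 20]

copied = list_to_c_array(numbers)
copied[0] = 99
print(list(copied))  # [99, 10, 20]
print(numbers)       # [30, 10, 20], the list itself is untouched

shared = array.array('i', numbers)
view = array_as_c_array(shared)
view[0] = 99
print(shared[0])     # 99, the change is visible through the array
```

Copying from a list is the simplest route, but a C function that sorts in place only modifies the copy, so the result has to be read back from the ctypes array. Sharing the buffer of an array.array avoids that extra step.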
test.py
import ctypes
import time
import random

quick_sort = ctypes.CDLL('./mylib.so').quick_sort

nums = []
for _ in range(100):
    r = [random.randrange(1, 100000000) for x in xrange(100000)]
    arr = (ctypes.c_int * len(r))(*r)
    nums.append((arr, len(r)))

init = time.clock()
for i in range(100):
    quick_sort(nums[i][0], nums[i][1])
print "%s" % (time.clock() - init)
Output
1.874907
For comparison, here is the same test using the sort method of Python lists:
import ctypes
import time
import random

quick_sort = ctypes.CDLL('./mylib.so').quick_sort

nums = []
for _ in range(100):
    nums.append([random.randrange(1, 100000000) for x in xrange(100000)])

init = time.clock()
for i in range(100):
    nums[i].sort()
print "%s" % (time.clock() - init)
Output
2.501257
As for C structs, you need to define a class derived from ctypes.Structure whose _fields_ attribute lists the corresponding field names and types.
class Point(ctypes.Structure):
    _fields_ = [('x', ctypes.c_double), ('y', ctypes.c_double)]
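A short sketch of how such a class behaves from Python is given below. It does not call any C function; the variable names are illustrative, and the size shown assumes the usual 8-byte C double.

```python
import ctypes


class Point(ctypes.Structure):
    _fields_ = [('x', ctypes.c_double), ('y', ctypes.c_double)]


# Fields can be set positionally (or by keyword) and read as attributes.
p = Point(1.5, 2.5)
p.x += 1.0
print("%s %s" % (p.x, p.y))

# Two 8-byte doubles give the same layout as the matching C struct.
print(ctypes.sizeof(Point))

# An array of structs, as a C function taking "Point *" would expect.
points = (Point * 3)(Point(0.0, 0.0), Point(1.0, 1.0), Point(2.0, 4.0))
print([pt.y for pt in points])

# byref(p) is what you would pass for a "Point *" parameter.
pointer_argument = ctypes.byref(p)
```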
In addition to loading our own C extension libraries, we can also load libraries provided by the system directly, such as glibc, the implementation of the C standard library on Linux:
import time
import random
from ctypes import cdll

libc = cdll.LoadLibrary('libc.so.6')  # Linux
# libc = cdll.msvcrt  # Windows

init = time.clock()
randoms = [random.randrange(1, 100) for x in xrange(1000000)]
print "Python version: %s seconds" % (time.clock() - init)

init = time.clock()
randoms = [(libc.rand() % 100) for x in xrange(1000000)]
print "C version: %s seconds" % (time.clock() - init)
Output
Python version: 0.850172 seconds
C version: 0.27645 seconds
These are the basic techniques of ctypes, and they are sufficient for most everyday development.
For more detailed instructions, see: http://docs.python.org/library/ctypes.html