Author: Perrygeo
Translator: Lai Yonghao (http://laiyonghao.com)
Original article: http://www.perrygeo.net/wordpress?p=116
What I like most about Python is that its code is elegant and practical. Unfortunately, it is slower than most languages. Most people also assume that speed and ease of use sit at opposite poles: writing C code really is painful. Cython tries to remove this duality by giving you Python syntax together with C data types and functions, the best of both worlds. Bear in mind that I am by no means an expert in this field; these are my notes from my first real experience with Cython:
Edit: judging by some feedback I received, there seems to be some confusion. Cython is used to generate C extensions, not standalone programs. All of the speedups here apply to a single function of an existing Python application. I did not rewrite the whole application in C or Lisp, and I did not hand-write a C extension. I only used a simple way to bring C speed and C data types into a Python function.
Now let's say we want to make the function great_circle faster. great_circle calculates the distance between two points along the Earth's surface:

    # p1.py
    import math

    def great_circle(lon1, lat1, lon2, lat2):
        radius = 3956  # miles
        x = math.pi / 180.0

        a = (90.0 - lat1) * (x)
        b = (90.0 - lat2) * (x)
        theta = (lon2 - lon1) * (x)
        c = math.acos((math.cos(a) * math.cos(b)) +
                      (math.sin(a) * math.sin(b) * math.cos(theta)))
        return radius * c

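As a quick sanity check on the formula (the spherical law of cosines applied to colatitudes), the sketch below re-implements the function in plain Python and tries it on cases with known answers. The radius keyword argument and the clamping of the cosine are additions made for this illustration; they are not part of the original function.

```python
import math

def great_circle(lon1, lat1, lon2, lat2, radius=3956.0):
    """Distance in miles along a sphere between two lon/lat points (degrees)."""
    x = math.pi / 180.0
    a = (90.0 - lat1) * x          # colatitude of point 1, in radians
    b = (90.0 - lat2) * x          # colatitude of point 2, in radians
    theta = (lon2 - lon1) * x      # longitude difference, in radians
    cos_c = (math.cos(a) * math.cos(b)
             + math.sin(a) * math.sin(b) * math.cos(theta))
    # Assumption for this sketch: clamp to [-1, 1] so rounding error
    # cannot push the value outside the domain of acos.
    cos_c = max(-1.0, min(1.0, cos_c))
    return radius * math.acos(cos_c)

# A quarter of the way around the equator is a quarter of a great circle.
quarter = great_circle(0.0, 0.0, 90.0, 0.0)
# Equator to the north pole is also a quarter of a great circle.
to_pole = great_circle(0.0, 0.0, 0.0, 90.0)
# A point is (almost exactly) zero miles from itself.
same_point = great_circle(10.0, 20.0, 10.0, 20.0)

print(quarter, to_pole, same_point)
```

Both quarter-circle cases should come out close to radius * pi / 2, which is a cheap way to confirm that a rewritten version still computes the same thing.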
Let's call it 500,000 times and time it:

    import timeit

    lon1, lat1, lon2, lat2 = -72.345, 34.323, -61.823, 54.826
    num = 500000

    t = timeit.Timer("p1.great_circle(%f,%f,%f,%f)" % (lon1, lat1, lon2, lat2),
                     "import p1")
    print "Pure python function", t.timeit(num), "sec"

About 2.2 seconds. Far too slow!
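For readers on Python 3, a comparable measurement can be sketched by passing a callable to timeit instead of a statement string. The helper name time_calls is hypothetical, and the call count is reduced here only to keep the example quick; absolute timings will differ from the figures quoted in this article.

```python
import math
import timeit

def great_circle(lon1, lat1, lon2, lat2):
    radius = 3956.0  # miles
    x = math.pi / 180.0
    a = (90.0 - lat1) * x
    b = (90.0 - lat2) * x
    theta = (lon2 - lon1) * x
    c = math.acos(math.cos(a) * math.cos(b)
                  + math.sin(a) * math.sin(b) * math.cos(theta))
    return radius * c

def time_calls(func, args, num):
    """Return the seconds taken to call func(*args) num times."""
    timer = timeit.Timer(lambda: func(*args))
    return timer.timeit(num)

lon1, lat1, lon2, lat2 = -72.345, 34.323, -61.823, 54.826
num = 50000  # fewer calls than the article's 500000, to keep this quick
elapsed = time_calls(great_circle, (lon1, lat1, lon2, lat2), num)
print("Pure Python function: %.3f sec for %d calls" % (elapsed, num))
```

The lambda adds a little call overhead of its own, so this slightly overstates the cost per call; it is good enough for comparing variants measured the same way.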
Let's quickly rewrite it in Cython and see whether that makes a difference:

    # c1.pyx
    import math

    def great_circle(float lon1, float lat1, float lon2, float lat2):
        cdef float radius = 3956.0
        cdef float pi = 3.14159265
        cdef float x = pi / 180.0
        cdef float a, b, theta, c

        a = (90.0 - lat1) * (x)
        b = (90.0 - lat2) * (x)
        theta = (lon2 - lon1) * (x)
        c = math.acos((math.cos(a) * math.cos(b)) +
                      (math.sin(a) * math.sin(b) * math.cos(theta)))
        return radius * c

Note that we still import math. Cython lets you mix and match Python and C data types to some extent. The conversion is automatic, but it is not free. In this example we define a Python function, declare its input parameters as floats, and declare every local variable as a C float. The computation itself still uses Python's math module.
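One side effect of declaring everything as a C float is reduced precision compared with Python's own floats. The sketch below imitates that rounding in plain Python using the struct module. It assumes that a C float is a 32-bit IEEE 754 value, which holds on common platforms; the helper name to_c_float is hypothetical.

```python
import struct

def to_c_float(value):
    """Round a Python float to the nearest value a C float can hold."""
    # Assumption: the native C float is a 32-bit IEEE 754 number.
    return struct.unpack("f", struct.pack("f", value))[0]

pi_literal = 3.14159265           # the constant used in the Cython code
pi_as_float = to_c_float(pi_literal)
rounding_error = abs(pi_as_float - pi_literal)

print("stored value: %.10f, rounding error: %.2e"
      % (pi_as_float, rounding_error))
```

For a distance in miles this loss is usually harmless, but it is worth knowing about when comparing results between the pure Python and Cython versions.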
Now we need to convert it to C code and compile that into a Python extension. The best way to do this is to write a setup.py distribution script. Here, however, we do it by hand to see what the magic consists of:

    # this will create a c1.c file - the C source code to build a python extension
    cython c1.pyx

    # compile the object file
    gcc -c -fPIC -I/usr/include/python2.5/ c1.c

    # link it into a shared library
    gcc -shared c1.o -o c1.so

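The include path in these commands is specific to one machine and one Python version. As a rough sketch, the same three commands could be assembled from Python by asking the standard sysconfig module where the running interpreter keeps its headers. The helper name build_commands is hypothetical, and the commands are only constructed here, not executed.

```python
import sysconfig

def build_commands(module="c1"):
    """Return the three manual build steps as argument lists."""
    # Directory holding Python.h for the interpreter running this script.
    include_dir = sysconfig.get_paths()["include"]
    return [
        # 1. Translate the .pyx file into C source.
        ["cython", module + ".pyx"],
        # 2. Compile the generated C file into an object file.
        ["gcc", "-c", "-fPIC", "-I" + include_dir, module + ".c"],
        # 3. Link the object file into a shared library.
        ["gcc", "-shared", module + ".o", "-o", module + ".so"],
    ]

commands = build_commands("c1")
for command in commands:
    print(" ".join(command))
```

Each list could then be handed to subprocess.check_call, assuming cython and gcc are installed and on the PATH.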
Now you should have a c1.so (or .dll) file that Python can import. Run it:

    t = timeit.Timer("c1.great_circle(%f,%f,%f,%f)" % (lon1, lat1, lon2, lat2),
                     "import c1")
    print "Cython function (still using python math)", t.timeit(num), "sec"

About 1.8 seconds. That is not the big performance gain we were hoping for. Python's math module is probably the bottleneck. Now let's use the C standard library instead:

    # c2.pyx
    cdef extern from "math.h":
        float cosf(float theta)
        float sinf(float theta)
        float acosf(float theta)

    def great_circle(float lon1, float lat1, float lon2, float lat2):
        cdef float radius = 3956.0
        cdef float pi = 3.14159265
        cdef float x = pi / 180.0
        cdef float a, b, theta, c

        a = (90.0 - lat1) * (x)
        b = (90.0 - lat2) * (x)
        theta = (lon2 - lon1) * (x)
        c = acosf((cosf(a) * cosf(b)) + (sinf(a) * sinf(b) * cosf(theta)))
        return radius * c

In place of import math, we use cdef extern to declare the functions we need from the specified header file (here the C standard library's math.h). With the expensive Python function calls replaced, we build the new shared library and test again:

    t = timeit.Timer("c2.great_circle(%f,%f,%f,%f)" % (lon1, lat1, lon2, lat2),
                     "import c2")
    print "Cython function (using trig function from math.h)", t.timeit(num), "sec"

Now we are getting somewhere: 0.4 seconds, about 5 times faster than the pure Python function. How can we go faster still? c2.great_circle() is still a Python function call, which means it incurs the overhead of the Python API (building the argument tuple and so on). If we could write a pure C function, we might be able to speed things up further.

    cdef extern from "math.h":
        float cosf(float theta)
        float sinf(float theta)
        float acosf(float theta)

    cdef float _great_circle(float lon1, float lat1, float lon2, float lat2):
        cdef float radius = 3956.0
        cdef float pi = 3.14159265
        cdef float x = pi / 180.0
        cdef float a, b, theta, c

        a = (90.0 - lat1) * (x)
        b = (90.0 - lat2) * (x)
        theta = (lon2 - lon1) * (x)
        c = acosf((cosf(a) * cosf(b)) + (sinf(a) * sinf(b) * cosf(theta)))
        return radius * c

    def great_circle(float lon1, float lat1, float lon2, float lat2, int num):
        cdef int i
        cdef float x
        for i from 0 <= i < num:
            x = _great_circle(lon1, lat1, lon2, lat2)
        return x

Note that we still have a Python function (def), which now takes an extra argument, num. The loop inside it is written as for i from 0 <= i < num: rather than the more Pythonic but much slower for i in range(num):. The real computation happens in the C function (cdef), which returns a float. This version needs only 0.2 seconds, a 10-fold speedup over the original Python function.
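To get a feel for how much of the time is pure call overhead, one can time a Python function that does nothing at all. The sketch below is an illustration only: noop is a hypothetical stand-in, and the figure it prints depends on the machine and Python version.

```python
import timeit

def noop(lon1, lat1, lon2, lat2):
    """Stand-in for a function whose body costs nothing."""
    return 0.0

num = 500000
timer = timeit.Timer(
    "noop(-72.345, 34.323, -61.823, 54.826)",
    globals={"noop": noop},  # make noop visible to the timed statement
)
call_overhead = timer.timeit(num)
print("%d empty calls: %.3f sec" % (num, call_overhead))
```

Whatever this prints is a floor that no amount of C inside the function body can remove when the function is called num times from Python. Moving the loop into the extension, as in the code above, pays that cost once instead.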
To check that we have optimized far enough, we can write a small application in pure C and time it:

    #include <math.h>
    #include <stdio.h>
    #define NUM 500000

    float great_circle(float lon1, float lat1, float lon2, float lat2) {
        float radius = 3956.0;
        float pi = 3.14159265;
        float x = pi / 180.0;
        float a, b, theta, c;

        a = (90.0 - lat1) * (x);
        b = (90.0 - lat2) * (x);
        theta = (lon2 - lon1) * (x);
        c = acos((cos(a) * cos(b)) + (sin(a) * sin(b) * cos(theta)));
        return radius * c;
    }

    int main() {
        int i;
        float x;
        /* i < NUM gives exactly NUM calls, matching the Python runs */
        for (i = 0; i < NUM; i++)
            x = great_circle(-72.345, 34.323, -61.823, 54.826);
        printf("%f\n", x);
        return 0;
    }

Compile it with gcc -lm -o ctest ctest.c and test it with time ./ctest ... about 0.2 seconds. This gives me confidence that my Cython extension is very efficient relative to my C code (which is not necessarily saying much about my C programming skills).
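The timings reported in this article can be put side by side with a few lines of arithmetic. The figures below are copied from the text; the label c3 for the third Cython version is hypothetical, since the article does not name that module.

```python
baseline = 2.2  # seconds for the pure Python version (p1)

timings = [
    ("p1: pure Python", 2.2),
    ("c1: Cython, Python math", 1.8),
    ("c2: Cython, math.h", 0.4),
    ("c3: Cython, loop in extension", 0.2),  # hypothetical module name
    ("ctest: pure C", 0.2),
]

# Speedup of each variant relative to the pure Python baseline.
speedups = {label: baseline / seconds for label, seconds in timings}

for label, seconds in timings:
    print("%-32s %.1f sec  %5.1fx" % (label, seconds, speedups[label]))
```

The ratios work out to roughly 1.2x, 5.5x and 11x for the three Cython versions, in line with the "5 times" and "10 times" figures quoted in the text.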
The performance gain you can get from Cython usually depends on how much the program is slowed down by loops, numeric operations, and Python function calls. Some people have reported speedups of 100 to 1000 times in certain cases. Other tasks may benefit far less. Before you frantically rewrite your Python code in Cython, keep this in mind:
"We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil." -- Donald Knuth
In other words, write your program in Python first and see whether it meets your needs. In most cases its performance will be good enough... but when it really is too slow, use a profiler to find the bottleneck functions, then rewrite them in Cython to get better performance quickly.
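As a minimal sketch of that profiling step, the standard library's cProfile and pstats modules can show where the time goes. Here run_workload is a hypothetical stand-in for a real application, and the call count is kept small so the example finishes quickly.

```python
import cProfile
import io
import math
import pstats

def great_circle(lon1, lat1, lon2, lat2):
    radius = 3956.0  # miles
    x = math.pi / 180.0
    a = (90.0 - lat1) * x
    b = (90.0 - lat2) * x
    theta = (lon2 - lon1) * x
    c = math.acos(math.cos(a) * math.cos(b)
                  + math.sin(a) * math.sin(b) * math.cos(theta))
    return radius * c

def run_workload(num=20000):
    """Hypothetical stand-in for the application being profiled."""
    total = 0.0
    for _ in range(num):
        total += great_circle(-72.345, 34.323, -61.823, 54.826)
    return total

# Collect profiling data only while the workload runs.
profiler = cProfile.Profile()
profiler.enable()
run_workload()
profiler.disable()

# Write the five most expensive entries (by cumulative time) to a string.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("cumulative").print_stats(5)
report = stream.getvalue()
print(report)
```

In a report like this, a function such as great_circle that dominates the cumulative time and is called many thousands of times is the kind of candidate worth rewriting in Cython.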
External links
WorldMill (http://trac.gispython.org/projects/PCL/wiki/WorldMill) -- a fast, concise Python interface module by Sean Gillies, written with Cython, that wraps the libgdal library.
Writing faster Pyrex code (http://www.sagemath.org:9001/writingfastpyrexcode) -- Pyrex is the predecessor of Cython. They have similar goals and syntax.