Transferred from: http://blog.csdn.net/gzlaiyonghao/article/details/1483728
This article was originally published on the Lanphaday ("Love Flower Butterflies") blog, http://blog.csdn.net/lanphaday. Reprinting is welcome, but please keep the text intact and retain this notice.
[Python]
Performance Optimization with profile

The programming sages say: "Choose a scripting language, and stop worrying about performance." I agree wholeheartedly: scripting is about development speed, extensibility, and maintainability. Unfortunately, sooner or later our programs run too slowly for our customers to bear, and at that point we have to optimize the code. A program can be slow for many reasons. Sometimes the cause is spread across a lot of mediocre code (for example, heavy use of the "." operator throughout the program), but more often the real cause is one or two badly designed passages, such as a custom type conversion applied to every element of a sequence. Program performance follows the 80/20 rule: 20% of the code accounts for 80% of the running time (in practice the ratio is far more extreme; often a few dozen lines of code account for more than 95% of the running time). That makes it hard to identify performance bottlenecks by experience alone, and that is where a tool comes in: profile! Recently my own project hit performance problems in a few critical areas, close to the delivery deadline. Fortunately, because the code was kept fairly modular, I could profile the independent modules concerned and essentially solve the problem. That experience convinced me to write this article about profile and share what I have learned from using it.
Getting to Know profile

profile is part of the Python standard library. It records the elapsed time of every function in a program and can produce a variety of reports. Using profile to analyze a program is simple. Suppose you have a program that reads:
def foo():
    sum = 0
    for i in range(10000):
        sum += i
    return sum

if __name__ == "__main__":
    foo()
Now, to analyze this program with profile, simply change the if block to the following:
if __name__ == "__main__":
    import profile
    profile.run("foo()")
We just import the profile module and pass the name of the program's entry function as a string argument to the profile.run() function. Running the program then produces output like this:
         5 function calls in 0.143 CPU seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.000    0.000 :0(range)
        1    0.143    0.143    0.143    0.143 :0(setprofile)
        1    0.000    0.000    0.000    0.000 <string>:1(?)
        1    0.000    0.000    0.000    0.000 prof1.py:1(foo)
        1    0.000    0.000    0.143    0.143 profile:0(foo())
        0    0.000             0.000          profile:0(profiler)
This shows the function calls in prof1.py. From the report we can clearly see that the foo() call accounts for 100% of the running time: foo() is the hot spot of this program. Besides this approach, the Python interpreter can also invoke the profile module directly to analyze a .py program; for example, enter the following on the command line:
python -m profile prof1.py
The resulting output is the same as modifying the script to call profile.run() directly. The statistics reported by profile are divided into the columns ncalls, tottime, percall, cumtime, percall, and filename:lineno(function):

ncalls
    The number of times the function was called.
tottime
    The total running time of the function itself, excluding time spent in the functions it calls.
percall
    The average time per call, equal to tottime/ncalls.
cumtime
    The cumulative running time of the function, including time spent in the functions it calls.
percall
    The average time per call, equal to cumtime/ncalls.
filename:lineno(function)
    The file containing the function, the function's line number, and the function's name.
Normally profile prints its output directly to the command line, sorted by file name by default. This is inconvenient: sometimes we want to save the output to a file and examine the results in different ways. profile supports these needs simply: we can pass a second argument to profile.run(), the name of a file in which to save the output; likewise, on the command line we can add one more argument to hold profile's output.
Customizing Reports with pstats

profile solves one of our needs, but another remains: viewing the output in many forms. We can solve that with another class, Stats. Here we need to introduce the pstats module, which defines a class Stats whose constructor takes one parameter: the file name of profile's output file. Stats provides functions for sorting and printing the results of the profile run. We change the previous program to read as follows:
# ... (as before)
if __name__ == "__main__":
    import profile
    profile.run("foo()", "prof.txt")
    import pstats
    p = pstats.Stats("prof.txt")
    p.sort_stats("time").print_stats()
With pstats brought in, the profile output is sorted by the time each function takes, and the output looks like this:
Sun Jan 00:03:12 2007    prof.txt

         5 function calls in 0.002 CPU seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.002    0.002    0.002    0.002 :0(setprofile)
        1    0.000    0.000    0.002    0.002 profile:0(foo())
        1    0.000    0.000    0.000    0.000 g:/prof1.py:1(foo)
        1    0.000    0.000    0.000    0.000 <string>:1(?)
        1    0.000    0.000    0.000    0.000 :0(range)
        0    0.000             0.000          profile:0(profiler)
Stats has a number of functions that produce different views of the profile data; it is quite powerful. Here is a quick introduction to these functions:
strip_dirs()
    Removes the leading path information from the file names in the report.
add(filename, [...])
    Adds additional profile output files to the Stats instance so their statistics are included.
dump_stats(filename)
    Saves the statistics in the Stats instance to a file.
sort_stats(key, [...])
    The most important function; sorts profile's output.
reverse_order()
    Rearranges the data in the Stats instance in reverse order.
print_stats([restriction, ...])
    Prints the Stats report to stdout.
print_callers([restriction, ...])
    Prints information about the functions that called each listed function.
print_callees([restriction, ...])
    Prints information about the functions that each listed function called.
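To make the utility functions above concrete, here is a small sketch of my own (the function and file names are made up for illustration) that profiles two entry points separately, merges the dumps with add(), cleans up the file names with strip_dirs(), and saves the combined statistics with dump_stats():

```python
import os
import profile
import pstats

def foo():
    return sum(range(10000))

def bar():
    return [i * i for i in range(10000)]

# Profile two entry points separately, saving each result to its own file.
# (runctx() works like run() but takes explicit namespaces.)
profile.runctx("foo()", globals(), locals(), "prof_foo.txt")
profile.runctx("bar()", globals(), locals(), "prof_bar.txt")

# Merge both dumps into one Stats instance, clean it up, and save it.
p = pstats.Stats("prof_foo.txt")
p.add("prof_bar.txt")            # fold in the second run's statistics
p.strip_dirs()                   # drop leading path info from file names
p.dump_stats("prof_merged.txt")  # persist the combined statistics

print(os.path.exists("prof_merged.txt"))
```

The merged dump can later be reloaded with pstats.Stats("prof_merged.txt") and sorted or printed like any other profile result.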
The most important functions here are sort_stats() and print_stats(); with these two we can browse almost all the information in whatever form we need, so let us look at them in detail. sort_stats() accepts one or more string arguments, such as "time" or "name", indicating which column to sort by. This is quite useful: for example, we can sort with time as the key to find the functions with the largest internal running time, or with cumtime to find the functions with the largest total running time, so that our optimization work is well targeted and more effective. sort_stats() accepts the following arguments:
'ncalls'
    Call count.
'cumulative'
    Cumulative running time of the function.
'file'
    File name.
'module'
    File name.
'pcalls'
    Primitive call count (kept for compatibility with old versions; recursive calls are not counted).
'line'
    Line number.
'name'
    Function name.
'nfl'
    Name/file/line.
'stdname'
    Standard function name.
'time'
    Internal running time of the function (excluding time spent in called subfunctions).
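As a quick illustration of the string keys above, here is a small sketch of my own (the function and file names are invented): it sorts by internal time and breaks ties by function name, and uses the stream parameter (available from Python 2.5) to capture the report rather than printing it:

```python
import io
import profile
import pstats

def foo():
    total = 0
    for i in range(10000):
        total += i
    return total

# runctx() works like run() but takes explicit namespaces.
profile.runctx("foo()", globals(), locals(), "prof.txt")

report = io.StringIO()
p = pstats.Stats("prof.txt", stream=report)  # stream= requires Python 2.5+

# Sort by internal time first, breaking ties by function name.
p.sort_stats("time", "name").print_stats()

print("foo" in report.getvalue())
```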
The other especially important function is print_stats(), which prints a report based on the most recent call to sort_stats(). print_stats() takes a number of optional arguments for filtering the output; each argument can be a number, a decimal fraction, or a Perl-style regular expression (details are available elsewhere, so I will not go into them here). Just three examples:

print_stats(.1, "foo:")
    Take the first 10% of the entries in the Stats instance, then print those whose description contains the string "foo:".
print_stats("foo:", .1)
    Take the entries whose description contains the string "foo:", then print the first 10% of those.
print_stats(10)
    Print the first 10 entries in the Stats instance.

In fact, profile's default output is equivalent to calling the Stats functions like this:
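The filtering arguments can be tried out directly; here is a minimal sketch of my own (function and file names are invented) that applies an entry-count limit, a regular-expression filter, and a fraction plus a pattern in sequence:

```python
import io
import profile
import pstats

def foo():
    return sum(range(10000))

# runctx() works like run() but takes explicit namespaces.
profile.runctx("foo()", globals(), locals(), "prof.txt")

report = io.StringIO()
p = pstats.Stats("prof.txt", stream=report)  # stream= requires Python 2.5+
p.sort_stats("cumulative")

p.print_stats(10)         # only the first 10 entries
p.print_stats("foo")      # only entries whose description matches the regex 'foo'
p.print_stats(0.5, "foo") # first 50% of entries, then those matching 'foo'

print("foo" in report.getvalue())
```

Note that a plain string restriction is treated as a regular expression, while an integer limits the number of entries and a float between 0 and 1 selects a fraction of them.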
p.strip_dirs().sort_stats(-1).print_stats()
Here sort_stats() takes the argument -1, which is likewise retained for compatibility with older versions. sort_stats() can accept one of the four numbers -1, 0, 1, or 2, corresponding to "stdname", "calls", "time", and "cumulative" respectively; but if you use a number as the argument, pstats sorts only by that first argument and ignores any others.
hotshot: A Better Profiler

Because of profile's own mechanism (for example, its use of a millisecond-precision timer), the "uncertainty" problem in the profile module is quite serious in many cases. hotshot is mostly implemented in C, so it disturbs the performance of the profiled program much less than the profile module does, and it supports measuring running time line by line. Its drawbacks: hotshot does not support multithreaded programs (in fact its timing core has a critical-section bug), and, more unfortunately, hotshot is no longer maintained and may be removed from the standard library in a future version of Python. Still, for programs that do not use multithreading, hotshot remains a very good profiler. hotshot has a Profile class whose constructor is:

class Profile(logfile[, lineevents[, linetimings]])

The logfile parameter is the file name in which to save the profiling statistics; lineevents indicates whether to measure the running time of each line of source code (the default is 0, i.e. statistics are kept at function granularity); linetimings indicates whether to record timing information (the default is 1). Here is the example again:
# ... (as before)
if __name__ == "__main__":
    import hotshot
    import hotshot.stats
    prof = hotshot.Profile("hs_prof.txt", 1)
    prof.runcall(foo)
    prof.close()
    p = hotshot.stats.load("hs_prof.txt")
    p.print_stats()
Output:
         1 function calls in 0.003 CPU seconds

   Random listing order was used

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.003    0.003    0.003    0.003 i:/prof1.py:1(foo)
        0    0.000             0.000          profile:0(profiler)
We can see that hotshot produces far less noise than profile, which also helps us find hot spots in the data. However, when I wrote prof = hotshot.Profile("hs_prof.txt", 1) in the code above, I found no difference between passing lineevents=1 and omitting the lineevents parameter; if you know why, please enlighten me. With hotshot you can measure a program's behavior more flexibly, because hotshot.Profile provides the following functions:
run(cmd)
    Executes a script; works like the run() function of the profile module.
runcall(func, *args, **keywords)
    Calls a function and collects its running statistics.
runctx(cmd, globals, locals)
    Executes a script in a specified environment and collects its running statistics.
With these functions we can easily build a test harness, without having to hand-write many driver modules as we would with profile. hotshot.Profile provides other useful functions as well; please refer to the relevant manual.
Python 2.5 Considerations

Because hotshot cannot be used with multithreaded programs, and its only advantage is speed, Python 2.5 declares the hotshot module unmaintained, and it may be removed in a later release. Its replacement is cProfile which, like cPickle and similar modules, is faster than the profile module it mirrors. cProfile's interface is the same as profile's, so existing projects can adopt it just by replacing profile with cProfile wherever it appears. pstats also changed subtly in Python 2.5: the pstats.Stats constructor gained a default parameter, becoming:

class Stats(filename[, stream=sys.stdout[, ...]])

This does us no harm; on the contrary, the stream parameter gives us the chance to direct the profile report to a file, which is exactly what we need. In summary, if you are using Python 2.5, I suggest you use cProfile.
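A minimal sketch of my own (function and file names are invented) showing cProfile as a drop-in replacement for profile, combined with the new stream parameter:

```python
import io
import cProfile
import pstats

def foo():
    total = 0
    for i in range(100000):
        total += i
    return total

# cProfile mirrors profile's interface: runctx() works like run()
# but takes explicit namespaces.
cProfile.runctx("foo()", globals(), locals(), "cprof.txt")

# The Python 2.5 Stats constructor accepts a stream to redirect the report.
report = io.StringIO()
p = pstats.Stats("cprof.txt", stream=report)
p.strip_dirs().sort_stats("cumulative").print_stats()

print("foo" in report.getvalue())
```

Because the interface is identical, swapping cProfile back to profile here would produce the same report, only more slowly.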
A Small, Practical Swiss Army Knife: timeit

If on a whim we want to know how much time it takes to append an element to a list, or how long raising an exception takes, using profile would be killing a chicken with a sledgehammer. The better choice here is the timeit module. Besides a very friendly programming interface, timeit also offers a friendly command-line interface. Let's look at the programming interface first. The timeit module contains one class, Timer, with this constructor:

class Timer([stmt='pass'[, setup='pass'[, timer=<timer function>]]])

The stmt parameter is a code snippet, given as a string, whose elapsed time will be measured; the setup parameter sets up the environment in which stmt runs; and timer lets the user supply a custom high-precision timing function. timeit.Timer has three member functions; briefly:

timeit([number=1000000])
    Executes the setup statement from the Timer constructor once, then runs the stmt statement number times and returns the total elapsed time.
repeat([repeat=3[, number=1000000]])
    Calls the timeit() function repeat times with the given number argument and returns the resulting list of timings.
print_exc([file=None])
    Replaces the standard traceback; print_exc() shows the source code of the offending line inside the timed snippet. For example:
>>> t = timeit.Timer("t = foo()\nprint t")   # <- the snippet to be timed
>>> t.timeit()
Traceback (most recent call last):
  File "<pyshell#12>", line 1, in -toplevel-
    t.timeit()
  File "E:/python23/lib/timeit.py", line 158, in timeit
    return self.inner(it, self.timer)
  File "<timeit-src>", line 6, in inner
    foo()                                    # <- the standard traceback looks like this
NameError: global name 'foo' is not defined
>>> try:
...     t.timeit()
... except:
...     t.print_exc()
...
Traceback (most recent call last):
  File "<pyshell#17>", line 2, in ?
  File "E:/python23/lib/timeit.py", line 158, in timeit
    return self.inner(it, self.timer)
  File "<timeit-src>", line 6, in inner
    t = foo()                                # <- print_exc() shows the failing source line, making the error easy to locate
NameError: global name 'foo' is not defined
Besides the programming interface, we can also use timeit on the command line, which is very handy:

python timeit.py [-n N] [-r N] [-s S] [-t] [-c] [-v] [-h] [statement ...]

The parameters are defined as follows:

-n N / --number=N
    How many times to execute the statement.
-r N / --repeat=N
    How many times to repeat the timing (i.e. call timeit()); the default is 3.
-s S / --setup=S
    A statement that sets up the environment in which the timed statement executes; the default is "pass".
-t / --time
    Use the time.time() timing function (the default on every platform except Windows).
-c / --clock
    Use the time.clock() timing function (the default on Windows).
-v / --verbose
    Print raw timing results with more precision.
-h / --help
    Print brief usage help.

Small and practical, timeit has hidden potential for you to explore; I will not give an example here.
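Since the article leaves timeit for the reader to explore, here is a small sketch of my own answering the question posed at the start of this section: how long does appending an element to a list take?

```python
import timeit

# Time one million list appends; setup runs once, stmt runs `number` times.
t = timeit.Timer(stmt="lst.append(1)", setup="lst = []")
elapsed = t.timeit(number=1000000)

# repeat() returns a list of timings, one per repetition.
timings = t.repeat(repeat=3, number=100000)

print(elapsed > 0)
print(len(timings) == 3)
```

Because stmt and setup are strings evaluated in their own namespace, any names the snippet needs (like lst here) must be created in setup.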
Postscript

Originally I intended only to write about using profile and some of my own experience applying it, but on carefully reading the relevant manuals I found so much practical material that I rambled on to this length; my own application experience will have to wait for a later article. While writing this introduction, I remembered an A* algorithm I once implemented in Python without any optimization at all, so I intend to use it as the example in a later, more practical article that works through everything covered here. This article was written with reference to the Python 2.3/2.4/2.5 manuals; most of its content applies to Python 2.3 and later, except that cProfile requires version 2.5.