Introduction to performance analysis and tuning tools
There always comes a time when you want to improve a program's efficiency: to find out which parts take the longest and become bottlenecks, and to see how much memory and CPU the program uses while it runs. At that point you need tools for profiling and tuning the program.
By context manager
A timer can be implemented with a context manager, as described in the earlier article on timeit: define the class's __enter__ and __exit__ methods so that the managed block is timed, similar to the following:
# timer.py
import time

class Timer(object):
    def __init__(self, verbose=False):
        self.verbose = verbose

    def __enter__(self):
        self.start = time.time()
        return self

    def __exit__(self, *args):
        self.end = time.time()
        self.secs = self.end - self.start
        self.msecs = self.secs * 1000  # milliseconds
        if self.verbose:
            print 'elapsed time: %f ms' % self.msecs
Use it like this:
from timer import Timer

with Timer() as t:
    foo()
print "=> foo() spends %s s" % t.secs
By decorator
But I think the decorator approach is more elegant:
import time
from functools import wraps

def timer(function):
    @wraps(function)
    def function_timer(*args, **kwargs):
        t0 = time.time()
        result = function(*args, **kwargs)
        t1 = time.time()
        print("Total time running %s: %s seconds" % (function.func_name, str(t1 - t0)))
        return result
    return function_timer
It's easy to use:
@timer
def my_sum(n):
    return sum([i for i in range(n)])

if __name__ == "__main__":
    my_sum(10000000)
The result:
➜ python profile.py
Total time running my_sum: 0.817697048187 seconds
The system's built-in time command
Example usage:
➜ time python profile.py
Total time running my_sum: 0.854454040527 seconds
python profile.py  0.79s user 0.18s system 98% cpu 0.977 total
The output above shows that the script spent 0.79s of CPU time in user mode and 0.18s executing kernel (system) calls, taking 0.977s in total.
The difference, total time - (user time + system time), is the time spent on input/output and on other tasks the system performed in the meantime; here that is 0.977 - (0.79 + 0.18) ≈ 0.007s.
The Python timeit module
timeit can be used for benchmarking: it conveniently runs a snippet a chosen number of times so you can measure how fast it is. Refer to the earlier article for details.
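For reference, here is a minimal sketch of calling timeit from code (the snippet and repetition counts are illustrative, not taken from the examples above):

import timeit

# Run the statement 1000 times and print the total elapsed time in seconds
print timeit.timeit("sum([i for i in range(1000)])", number=1000)

# repeat() performs the whole measurement several times; the minimum is
# usually the most stable number to report
print min(timeit.repeat("sum([i for i in range(1000)])", number=1000, repeat=3))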
cProfile
Let's look at an example directly; the usage is explained in the comments.
# coding=utf8

def sum_num(max_num):
    total = 0
    for i in range(max_num):
        total += i
    return total

def test():
    total = 0
    for i in range(40000):
        total += i
    t1 = sum_num(100000)
    t2 = sum_num(200000)
    t3 = sum_num(300000)
    t4 = sum_num(400000)
    t5 = sum_num(500000)
    test2()
    return total

def test2():
    total = 0
    for i in range(40000):
        total += i
    t6 = sum_num(600000)
    t7 = sum_num(700000)
    return total

if __name__ == "__main__":
    import cProfile

    # Print the analysis results directly to the console
    # cProfile.run("test()")

    # Save the results to a file
    # cProfile.run("test()", filename="result.out")

    # Save to a file and specify a sort order
    cProfile.run("test()", filename="result.out", sort="cumulative")
cProfile saves the analysis results to the result.out file, but the file is in binary form. To read it you use the pstats module provided with the standard library.
import pstats

# Create a Stats object from the saved results
p = pstats.Stats("result.out")

# strip_dirs():  remove irrelevant path information
# sort_stats():  sort, supporting the same keys as cProfile.run
# print_stats(): print the analysis; an argument limits how much is printed
# The next line gives the same result as running cProfile.run("test()") directly
p.strip_dirs().sort_stats(-1).print_stats()

# Sort by function name and print only the first 3 functions; the argument can
# also be a decimal, meaning the fraction of entries to show
p.strip_dirs().sort_stats("name").print_stats(3)

# Sort by cumulative time, then by function name
p.strip_dirs().sort_stats("cumulative", "name").print_stats(0.5)

# Which functions call sum_num?
p.print_callers(0.5, "sum_num")

# Which functions does test() call?
p.print_callees("test")
Here is a snippet of the output showing which functions test() called:
➜ python profile.py
   Random listing order was used
   List reduced from 6 to 2 due to restriction <'test'>

Function              called...
                          ncalls  tottime  cumtime
profile.py:24(test2)  ->       2    0.061    0.077  profile.py:3(sum_num)
                               1    0.000    0.000  {range}
profile.py:10(test)   ->       5    0.073    0.094  profile.py:3(sum_num)
                               1    0.002    0.079  profile.py:24(test2)
                               1    0.001    0.001  {range}
profile.Profile
cProfile also provides a Profile class that can be customized for more detailed analysis; see the documentation for details.
Its signature is: class profile.Profile(timer=None, timeunit=0.0, subcalls=True, builtins=True)
The following example comes from the official documentation:
import cProfile, pstats, StringIO

pr = cProfile.Profile()
pr.enable()
# ... do something ...
pr.disable()

s = StringIO.StringIO()
sortby = 'cumulative'
ps = pstats.Stats(pr, stream=s).sort_stats(sortby)
ps.print_stats()
print s.getvalue()
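A variant of the same idea (a sketch, not taken from the documentation): Profile.runcall() profiles a single function call and removes the need for the explicit enable()/disable() pair. The test() function here is a stand-in workload, not the one from the earlier example.

import cProfile, pstats, StringIO

def test():
    # Stand-in workload; replace with the function you want to profile
    return sum(i for i in range(500000))

pr = cProfile.Profile()
# runcall() profiles exactly one call, so no enable()/disable() bookkeeping is needed
pr.runcall(test)

s = StringIO.StringIO()
ps = pstats.Stats(pr, stream=s).sort_stats("cumulative")
ps.print_stats(10)  # show only the 10 most expensive entries
print s.getvalue()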
line_profiler
line_profiler is a tool for line-by-line performance analysis of functions; see the project description on GitHub: https://github.com/rkern/line ...
Example
# coding=utf8

def sum_num(max_num):
    total = 0
    for i in range(max_num):
        total += i
    return total

@profile  # add @profile to mark which function to analyze
def test():
    total = 0
    for i in range(40000):
        total += i
    t1 = sum_num(10000000)
    t2 = sum_num(200000)
    t3 = sum_num(300000)
    t4 = sum_num(400000)
    t5 = sum_num(500000)
    test2()
    return total

def test2():
    total = 0
    for i in range(40000):
        total += i
    t6 = sum_num(600000)
    t7 = sum_num(700000)
    return total

test()
Run it with the kernprof command, which injects the profiler; the results look like this:
➜ kernprof -l -v profile.py
Wrote profile results to profile.py.lprof
Timer unit: 1e-06 s

Total time: 3.80125 s
File: profile.py
Function: test at line 10

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
    10                                           @profile
    11                                           def test():
    12         1            5      5.0      0.0      total = 0
    13     40001        19511      0.5      0.5      for i in range(40000):
    14     40000        19066      0.5      0.5          total += i
    15
    16         1      2974373 2974373.0    78.2      t1 = sum_num(10000000)
    17         1        58702  58702.0      1.5      t2 = sum_num(200000)
    18         1        81170  81170.0      2.1      t3 = sum_num(300000)
    19         1       114901 114901.0      3.0      t4 = sum_num(400000)
    20         1       155261 155261.0      4.1      t5 = sum_num(500000)
    21         1       378257 378257.0     10.0      test2()
    22         1            2      2.0      0.0      return total
Lines with high Hits (execution counts) or Time values are the places with the most room for optimization.
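Besides the kernprof command, line_profiler can also be driven from Python code. The following is a minimal sketch using simplified stand-ins for sum_num() and test(), not the exact script from the example above:

from line_profiler import LineProfiler

def sum_num(max_num):
    total = 0
    for i in range(max_num):
        total += i
    return total

def test():
    return sum_num(500000) + sum_num(200000)

lp = LineProfiler()
lp.add_function(sum_num)   # also collect line-by-line timings for sum_num()
wrapped = lp(test)         # a LineProfiler instance can wrap a function directly
wrapped()
lp.print_stats()           # print the per-line report to stdout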
memory_profiler
Similar to line_profiler, but it reports a module's memory usage line by line. GitHub address: https://github.com/fabianp/me ... PS: installing psutil makes the analysis faster.
With the same code as in the line_profiler example above, running the python -m memory_profiler profile.py command produces the following results:
➜ python -m memory_profiler profile.py
Filename: profile.py

Line #    Mem usage    Increment   Line Contents
================================================
    10    24.473 MiB    0.000 MiB   @profile
    11                              def test():
    12    24.473 MiB    0.000 MiB       total = 0
    13    25.719 MiB    1.246 MiB       for i in range(40000):
    14    25.719 MiB    0.000 MiB           total += i
    15
    16   335.594 MiB  309.875 MiB       t1 = sum_num(10000000)
    17   337.121 MiB    1.527 MiB       t2 = sum_num(200000)
    18   339.410 MiB    2.289 MiB       t3 = sum_num(300000)
    19   342.465 MiB    3.055 MiB       t4 = sum_num(400000)
    20   346.281 MiB    3.816 MiB       t5 = sum_num(500000)
    21   356.203 MiB    9.922 MiB       test2()
    22   356.203 MiB    0.000 MiB       return total
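As a side note, the @profile decorator can also be imported explicitly from memory_profiler, so the script runs under plain python without the -m memory_profiler switch. A minimal sketch with a stand-in function:

from memory_profiler import profile

@profile
def test():
    data = [i for i in range(400000)]   # allocate something measurable
    return sum(data)

if __name__ == "__main__":
    test()   # the line-by-line memory report is printed to stdout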