Introduction to Python performance analysis tools
Sooner or later you will want to improve a program's execution efficiency: to find out which part of it has become the bottleneck, and to see how much CPU and memory it uses while it runs. At that point you need tools and methods for analyzing and tuning program performance.
By Context Manager
You can implement a timer yourself with a context manager. As in the earlier introduction to timeit, a class's __enter__ and __exit__ methods let you time whatever runs inside the with block, for example:

# timer.py
import time

class Timer(object):
    def __init__(self, verbose=False):
        self.verbose = verbose

    def __enter__(self):
        self.start = time.time()
        return self

    def __exit__(self, *args):
        self.end = time.time()
        self.secs = self.end - self.start
        self.msecs = self.secs * 1000  # milliseconds
        if self.verbose:
            print('elapsed time: %f ms' % self.msecs)
The usage is as follows:
from timer import Timer

with Timer() as t:
    foo()
print("=> foo() spends %s s" % t.secs)
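If you would rather not write a full class, the same idea also works with contextlib.contextmanager. Below is a minimal sketch; the timed name and its label argument are my own additions, not part of the Timer class above.

import time
from contextlib import contextmanager

@contextmanager
def timed(label="block"):
    # everything inside the with block runs between these two timestamps
    start = time.time()
    try:
        yield
    finally:
        end = time.time()
        print("%s took %.3f ms" % (label, (end - start) * 1000))

# usage:
# with timed("foo()"):
#     foo()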
By Decorator
However, I think a decorator is more elegant:

import time
from functools import wraps

def timer(function):
    @wraps(function)
    def function_timer(*args, **kwargs):
        t0 = time.time()
        result = function(*args, **kwargs)
        t1 = time.time()
        print("Total time running %s: %s seconds" % (function.__name__, str(t1 - t0)))
        return result
    return function_timer
It is easy to use:
@timer
def my_sum(n):
    return sum([i for i in range(n)])

if __name__ == "__main__":
    my_sum(10000000)
Running result:
➜ python profile.py
Total time running my_sum: 0.817697048187 seconds
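On Python 3, time.perf_counter() is usually a better clock for this kind of measurement than time.time(), because it is monotonic and has higher resolution. A variant of the decorator above using it might look like the sketch below; timer_pc is just an illustrative name.

import time
from functools import wraps

def timer_pc(function):
    @wraps(function)
    def function_timer(*args, **kwargs):
        # perf_counter is monotonic and high-resolution (Python 3.3+)
        t0 = time.perf_counter()
        result = function(*args, **kwargs)
        t1 = time.perf_counter()
        print("Total time running %s: %s seconds" % (function.__name__, str(t1 - t0)))
        return result
    return function_timer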
Built-in time command
The shell's built-in time command reports the wall-clock, user, and system time of a whole command. Example:

➜ time python profile.py
Total time running my_sum: 0.854454040527 seconds
python profile.py  0.79s user 0.18s system 98% cpu 0.977 total

The output shows that the script spent 0.79 s of CPU time in user mode and 0.18 s executing kernel (system) calls; the total elapsed time was 0.977 s.

total time - (user time + system time) = time spent waiting on I/O and on other tasks handled by the system
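If you want the same user/system split from inside Python instead of from the shell, the standard library's resource module exposes it on Unix-like systems. A minimal sketch (Unix only):

import resource

usage = resource.getrusage(resource.RUSAGE_SELF)
# ru_utime: CPU time spent in user mode; ru_stime: CPU time spent in the kernel.
# These correspond to the "user" and "system" numbers printed by the time command.
print("user: %.2fs, system: %.2fs" % (usage.ru_utime, usage.ru_stime))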
Python timeit module
It is handy for benchmarking: you can easily repeat a statement a given number of times and compare how different pieces of code perform. For more information, see the previous article.
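As a quick reminder, a minimal sketch of how it is called (the statement being timed here is an arbitrary example):

import timeit

# run the statement 10000 times and return the total elapsed time in seconds
print(timeit.timeit("sum(range(1000))", number=10000))

# repeat the whole measurement 3 times to see the variation between runs
print(timeit.repeat("sum(range(1000))", number=10000, repeat=3))

The same measurement is also available from the command line via python -m timeit "sum(range(1000))".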
cProfile
cProfile is the C-implemented profiler in the standard library; the annotated example below shows how to use it:

# coding=utf8

def sum_num(max_num):
    total = 0
    for i in range(max_num):
        total += i
    return total

def test():
    total = 0
    for i in range(40000):
        total += i
    t1 = sum_num(100000)
    t2 = sum_num(200000)
    t3 = sum_num(300000)
    t4 = sum_num(400000)
    t5 = sum_num(500000)
    test2()
    return total

def test2():
    total = 0
    for i in range(40000):
        total += i
    t6 = sum_num(600000)
    t7 = sum_num(700000)
    return total

if __name__ == "__main__":
    import cProfile

    # print the analysis result directly to the console
    # cProfile.run("test()")

    # save the analysis result to a file
    # cProfile.run("test()", filename="result.out")

    # additionally specify a sort order
    cProfile.run("test()", filename="result.out", sort="cumulative")
cProfile saves the analysis results to result.out, but in a binary format. To read them, use the accompanying pstats module:

import pstats

# create a Stats object
p = pstats.Stats("result.out")

# strip_dirs(): remove irrelevant path information
# sort_stats(): sort, with the same keys as supported above
# print_stats(): print the analysis results; you can limit how much is printed

# the same output as running cProfile.run("test()") directly
p.strip_dirs().sort_stats(-1).print_stats()

# sort by function name and print only the first 3 functions; the argument can
# also be a float, meaning the first few percent of entries
p.strip_dirs().sort_stats("name").print_stats(3)

# sort by cumulative time, then function name, and print the first 50%
p.strip_dirs().sort_stats("cumulative", "name").print_stats(0.5)

# if you want to know which functions called sum_num
p.print_callers(0.5, "sum_num")

# check which functions are called inside test()
p.print_callees("test")
As an example, here is the output showing which functions test() calls:

➜ python profile.py
   Random listing order was used
   List reduced from 6 to 2 due to restriction <'test'>

Function              called...
                          ncalls  tottime  cumtime
profile.py:24(test2)  ->      2    0.061    0.077  profile.py:3(sum_num)
                              1    0.000    0.000  {range}
profile.py:10(test)   ->      5    0.073    0.094  profile.py:3(sum_num)
                              1    0.002    0.079  profile.py:24(test2)
                              1    0.001    0.001  {range}
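You can also invoke cProfile from the command line without modifying the code at all; the module accepts -o to write the binary result file and -s to choose the sort order:

# profile the whole script and save the binary result to result.out
python -m cProfile -o result.out profile.py

# or print the report directly, sorted by cumulative time
python -m cProfile -s cumulative profile.py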
cProfile.Profile

cProfile also provides a customizable Profile class for finer-grained control over profiling. For more information, see the documentation.

Signature: class cProfile.Profile(timer=None, timeunit=0.0, subcalls=True, builtins=True)
The following example is adapted from the official documentation:

import cProfile
import pstats
import io

pr = cProfile.Profile()
pr.enable()
# ... do something ...
pr.disable()

s = io.StringIO()
sortby = "cumulative"
ps = pstats.Stats(pr, stream=s).sort_stats(sortby)
ps.print_stats()
print(s.getvalue())
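On newer Python versions (3.8+), Profile can also be used as a context manager, which saves the explicit enable()/disable() calls. A small sketch:

import cProfile
import pstats

with cProfile.Profile() as pr:
    # ... do something ...
    total = sum(i * i for i in range(1000000))

pstats.Stats(pr).sort_stats("cumulative").print_stats(10)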
line_profiler

line_profiler is a tool for line-by-line performance analysis of functions; see the project description on GitHub: https://github.com/rkern/line...
Example
# coding=utf8

def sum_num(max_num):
    total = 0
    for i in range(max_num):
        total += i
    return total

@profile  # add @profile to mark which function to analyze
def test():
    total = 0
    for i in range(40000):
        total += i
    t1 = sum_num(10000000)
    t2 = sum_num(200000)
    t3 = sum_num(300000)
    t4 = sum_num(400000)
    t5 = sum_num(500000)
    test2()
    return total

def test2():
    total = 0
    for i in range(40000):
        total += i
    t6 = sum_num(600000)
    t7 = sum_num(700000)
    return total

test()
Run the script through the kernprof command, which makes the @profile decorator available and performs the line-by-line analysis. The result looks like this:

➜ kernprof -l -v profile.py
Wrote profile results to profile.py.lprof
Timer unit: 1e-06 s

Total time: 3.80125 s
File: profile.py
Function: test at line 10

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
    10                                           @profile
    11                                           def test():
    12         1            5      5.0      0.0      total = 0
    13     40001        19511      0.5      0.5      for i in range(40000):
    14     40000        19066      0.5      0.5          total += i
    15
    16         1      2974373 2974373.0    78.2      t1 = sum_num(10000000)
    17         1        58702  58702.0      1.5      t2 = sum_num(200000)
    18         1        81170  81170.0      2.1      t3 = sum_num(300000)
    19         1       114901 114901.0      3.0      t4 = sum_num(400000)
    20         1       155261 155261.0      4.1      t5 = sum_num(500000)
    21         1       378257 378257.0     10.0      test2()
    22
    23         1            2      2.0      0.0      return total
Lines with high Hits (number of executions) and Time (time consumed) values are the ones with the most room for optimization.
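kernprof -l writes the raw data to profile.py.lprof; to re-read a saved result later without profiling again, line_profiler can format that file on its own:

python -m line_profiler profile.py.lprof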
memory_profiler

Similar to line_profiler, memory_profiler analyzes a program's memory usage line by line. GitHub: https://github.com/fabianp/me .... PS: install psutil to make the analysis faster.

Running the same code from the line_profiler example above with the python -m memory_profiler profile.py command produces the following results:
➜ python -m memory_profiler profile.py
Filename: profile.py

Line #    Mem usage    Increment   Line Contents
================================================
    10   24.473 MiB    0.000 MiB   @profile
    11                             def test():
    12   24.473 MiB    0.000 MiB       total = 0
    13   25.719 MiB    1.246 MiB       for i in range(40000):
    14   25.719 MiB    0.000 MiB           total += i
    15
    16  335.594 MiB  309.875 MiB       t1 = sum_num(10000000)
    17  337.121 MiB    1.527 MiB       t2 = sum_num(200000)
    18  339.410 MiB    2.289 MiB       t3 = sum_num(300000)
    19  342.465 MiB    3.055 MiB       t4 = sum_num(400000)
    20  346.281 MiB    3.816 MiB       t5 = sum_num(500000)
    21  356.203 MiB    9.922 MiB       test2()
    22
    23  356.203 MiB    0.000 MiB       return total
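memory_profiler also ships an mprof helper that samples the process's memory usage over time and plots it (plotting requires matplotlib). Roughly:

mprof run profile.py    # record memory usage while the script runs
mprof plot              # plot the most recent recording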