Introduction to Python performance analysis tools

Source: Internet
Author: User
Introduction to Python performance analysis tools

Introduction to performance analysis and tuning tools

There will always be a time when you want to improve the program execution efficiency, want to see which part of the time consumed has become a bottleneck, and want to know the memory and CPU usage during the program running. At this time, you will need some methods to analyze and optimize the program performance.

By Context Manager

You can use the context manager to implement a timer by yourself. refer to the previous introduction to timeit, you can use the _ enter _ and _ exit _ methods of classes to time the managed functions, such:

# Timer. pyimport timeclass Timer (object): def _ init _ (self, verbose = False): self. verbose = verbose def _ enter _ (self): self. start = time. time () return self def _ exit _ (self, * args): self. end = time. time () self. secs = self. end-self. start self. msecs = self. secs * 1000 # millisecond if self. verbose: print 'elapsed time: % FMs' % self. msecs

The usage is as follows:

from timer import Timerwith Timer() as t:    foo()print "=> foo() spends %s s" % t.secs

By Decorator

However, I think the decoration is more elegant.

import timefrom functools import wrapsdef timer(function):    @wraps(function)    def function_timer(*args, **kwargs):        t0 = time.time()        result = function(*args, **kwargs)        t1 = time.time()        print ("Total time running %s: %s seconds" %                (function.func_name, str(t1-t0))                )        return result    return function_timer

It is easy to use:

@timerdef my_sum(n):    return sum([i for i in range(n)])if __name__ == "__main__":    my_sum(10000000)

Running result:

➜  python profile.pyTotal time running my_sum: 0.817697048187 seconds

Built-in time command

Example:

➜ time python profile.pyTotal time running my_sum: 0.854454040527 secondspython profile.py  0.79s user 0.18s system 98% cpu 0.977 total

The above results show that the execution of the script consumes 0.79sCPU time and 0.18 seconds for kernel function execution. The total time is 0.977s.
Total time-(user time + system time) = the time consumed when the input and output and other tasks are executed by the system

Python timeit module

It can be used for benchmark, so that you can easily repeat the number of times a program is executed and check that the program can run multiple blocks. For more information, see the previous article.

CProfile

For more information, see the example with annotations.

# Coding = utf8def sum_num (max_num): total = 0 for I in range (max_num): total + = I return totaldef test (): total = 0 for I in range (40000): total + = I t1 = sum_num (100000) t2 = sum_num (200000) t3 = sum_num (300000) t4 = sum_num (400000) t5 = sum_num (500000) test2 () return totaldef test2 (): total = 0 for I in range (40000): total + = I t6 = sum_num (600000) t7 = sum_num (700000) return totalif _ name _ = "_ main _": import cProfile # print the analysis result directly to the console # cProfile. run ("test ()") # save the analysis result to the file # cProfile. run ("test ()", filename = "result. out ") # add the sorting method cProfile. run ("test ()", filename = "result. out ", sort =" cumulative ")

CProfile saves the analysis results to the result. out file, but stores them in binary format. if you want to view them directly, use the pstats provided for viewing.

Import pstats # create Stats object p = pstats. stats ("result. out ") # strip_dirs (): Remove irrelevant path information # sort_stats (): sorting, supported in the same way as above # print_stats (): print analysis results, you can specify the first few lines to print # and run cProfile directly. run ("test ()") returns the same p. strip_dirs (). sort_stats (-1 ). print_stats () # Sort by function name. only the information of the first three rows of functions is printed. the parameter can also be a decimal number, indicating the function information of the first few percent p. strip_dirs (). sort_stats ("name "). print_stats (3) # Sort By run time and function name p. strip_dirs (). sort_stats ("cumulative", "name "). print_stats (0.5) # If you want to know which functions have called sum_nump.print_callers (0.5, "sum_num") # check which functions are called in the test () function. print_callees ("test ")

Take an output example to check which functions are called by test:

➜  python python profile.py   Random listing order was used   List reduced from 6 to 2 due to restriction <'test'>Function              called...                          ncalls  tottime  cumtimeprofile.py:24(test2)  ->       2    0.061    0.077  profile.py:3(sum_num)                               1    0.000    0.000  {range}profile.py:10(test)   ->       5    0.073    0.094  profile.py:3(sum_num)                               1    0.002    0.079  profile.py:24(test2)                               1    0.001    0.001  {range}

Profile. Profile

CProfile also provides customizable classes for more detailed analysis. For more information, see the document.
Format: class profile. Profile (timer = None, timeunit = 0.0, subcils = True, builtins = True)
The following example is from the official document:

import cProfile, pstats, StringIOpr = cProfile.Profile()pr.enable()# ... do something ...pr.disable()s = StringIO.StringIO()sortby = 'cumulative'ps = pstats.Stats(pr, stream=s).sort_stats(sortby)ps.print_stats()print s.getvalue()

Lineprofiler

Lineprofiler is a tool for row-by-row performance analysis of functions, see github project description, address: https://github.com/rkern/line...

Example

# Coding = utf8def sum_num (max_num): total = 0 for I in range (max_num ): total + = I return total @ profile # Add @ profile to indicate which function to analyze def test (): total = 0 for I in range (40000 ): total + = I t1 = sum_num (10000000) t2 = sum_num (200000) t3 = sum_num (300000) t4 = sum_num (400000) t5 = sum_num (500000) test2 () return totaldef test2 (): total = 0 for I in range (40000): total + = I t6 = sum_num (600000) t7 = sum_num (700000) return totaltest ()

Run the kernprof command to inject the analysis. the running result is as follows:

➜ kernprof -l -v profile.pyWrote profile results to profile.py.lprofTimer unit: 1e-06 sTotal time: 3.80125 sFile: profile.pyFunction: test at line 10Line #      Hits         Time  Per Hit   % Time  Line Contents==============================================================    10                                           @profile    11                                           def test():    12         1            5      5.0      0.0      total = 0    13     40001        19511      0.5      0.5      for i in range(40000):    14     40000        19066      0.5      0.5          total += i    15    16         1      2974373 2974373.0     78.2      t1 = sum_num(10000000)    17         1        58702  58702.0      1.5      t2 = sum_num(200000)    18         1        81170  81170.0      2.1      t3 = sum_num(300000)    19         1       114901 114901.0      3.0      t4 = sum_num(400000)    20         1       155261 155261.0      4.1      t5 = sum_num(500000)    21         1       378257 378257.0     10.0      test2()    22    23         1            2      2.0      0.0      return total

There is a lot of room for optimization when the values of hits (execution times) and time (time consumed) are high.

Memoryprofiler

Similar to the "lineprofiler" module for row-based analytics program memory usage. Github: https://github.com/fabianp/me .... Ps: install psutil to make analysis faster.

Similarly, the code in "lineprofiler" above is run python-m memory_profiler profile. py command to generate the following results:

➜ python -m memory_profiler profile.pyFilename: profile.pyLine #    Mem usage    Increment   Line Contents================================================    10   24.473 MiB    0.000 MiB   @profile    11                             def test():    12   24.473 MiB    0.000 MiB       total = 0    13   25.719 MiB    1.246 MiB       for i in range(40000):    14   25.719 MiB    0.000 MiB           total += i    15    16  335.594 MiB  309.875 MiB       t1 = sum_num(10000000)    17  337.121 MiB    1.527 MiB       t2 = sum_num(200000)    18  339.410 MiB    2.289 MiB       t3 = sum_num(300000)    19  342.465 MiB    3.055 MiB       t4 = sum_num(400000)    20  346.281 MiB    3.816 MiB       t5 = sum_num(500000)    21  356.203 MiB    9.922 MiB       test2()    22    23  356.203 MiB    0.000 MiB       return total

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.