While not every Python program you write requires rigorous performance analysis, it is reassuring to know that the Python ecosystem has a wide variety of tools for when problems do arise.
Performance analysis boils down to answering four basic questions:
- How fast is it running?
- Where are the speed bottlenecks?
- How much memory is it using?
- Where is it leaking memory?
Below, we'll dig into the answers to these questions with some powerful tools.
Coarse-grained timing with time
Let's start with a quick and dirty way to time our code: the traditional Unix time utility.
$ time python yourprogram.py

real    0m1.028s
user    0m0.001s
sys     0m0.003s
The detailed meaning of these three measurements is described in this StackOverflow article, but in short:
- real refers to the actual elapsed (wall-clock) time
- user refers to CPU time spent outside the kernel
- sys refers to CPU time spent inside kernel-specific functions
By adding the sys and user times together, you get a sense of how many CPU cycles your program consumed, regardless of other programs running on the system. If the sum of sys and user time is much less than real time, then you can guess that most of your program's performance problems are likely related to IO waits.
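To make that heuristic concrete, here is a minimal sketch (the file name is my own invention for the example): a script that does nothing but sleep, standing in for a program blocked on IO.

io_wait_demo.py

import time

# sleeping stands in for blocking IO (network or disk reads); almost no
# CPU work happens, so user and sys stay near zero
time.sleep(2)

Running time python io_wait_demo.py should report a real time of about two seconds with user and sys near zero, which is exactly the IO-bound signature described above.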
Fine-grained timing with a timing context manager
Our next technique involves instrumenting the code directly to get fine-grained timing information. Here's a little snippet I use for time measurement:
timer.py

import time

class Timer(object):
    def __init__(self, verbose=False):
        self.verbose = verbose

    def __enter__(self):
        # record the start time when the with-block is entered
        self.start = time.time()
        return self

    def __exit__(self, *args):
        # record the end time and compute the elapsed duration
        self.end = time.time()
        self.secs = self.end - self.start
        self.msecs = self.secs * 1000  # millisecs
        if self.verbose:
            print 'elapsed time: %f ms' % self.msecs
To use it, wrap the code you want to time with Python's with keyword and the Timer context manager. It takes care of starting the timer when your code block begins executing and stopping the timer when your code block ends. Here's an example:
from timer import Timer
from redis import Redis

rdb = Redis()

with Timer() as t:
    rdb.lpush("foo", "bar")
print "=> elapsed lpush: %s s" % t.secs

with Timer() as t:
    rdb.lpop("foo")
print "=> elapsed lpop: %s s" % t.secs
To see how the performance of my programs evolves over time, I often record the output of these timers to a file.
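As a sketch of that habit (timings.log and the "lpush" label are my own inventions for the example), appending each measurement to a file is all it takes:

# a minimal sketch of logging timer output for later comparison;
# timings.log and the "lpush" label are hypothetical
from timer import Timer
from redis import Redis

rdb = Redis()
with Timer() as t:
    rdb.lpush("foo", "bar")
with open("timings.log", "a") as f:
    f.write("lpush %f\n" % t.secs)

Comparing these logs across runs makes regressions easy to spot.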
Line-by-line timing and execution frequency with line_profiler
Robert Kern has a nice project called line_profiler, which I often use to analyze how fast my scripts run and how often each line of code executes.
To use it, install it with pip:

$ pip install line_profiler
Once installation is complete, you will have a new module called line_profiler, as well as an executable script, kernprof.py.
To use this tool, first add the @profile decorator to the function you want to measure. Don't worry, you don't have to import anything to use this decorator; the kernprof.py script injects it into your script's runtime automatically.
primes.py

@profile
def primes(n):
    if n==2:
        return [2]
    elif n<2:
        return []
    s=range(3,n+1,2)
    mroot = n ** 0.5
    half=(n+1)/2-1
    i=0
    m=3
    while m <= mroot:
        if s[i]:
            j=(m*m-3)/2
            s[j]=0
            while j<half:
                s[j]=0
                j+=m
        i=i+1
        m=2*i+3
    return [2]+[x for x in s if x]

primes(100)  # run the function so the profiler has something to measure
Once you have your code set up with the @profile decorator, run the script using kernprof.py:

$ kernprof.py -l -v primes.py

The -l option tells kernprof to inject the @profile decorator into your script's builtins, and the -v option tells kernprof to display timing information once your script finishes. Here's a sample output for the script above:
Wrote profile results to primes.py.lprof
Timer unit: 1e-06 s

File: primes.py
Function: primes at line 2
Total time: 0.00019 s

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
     2                                           @profile
     3                                           def primes(n):
     4         1            2      2.0      1.1      if n==2:
     5                                                   return [2]
     6         1            1      1.0      0.5      elif n<2:
     7                                                   return []
     8         1            4      4.0      2.1      s=range(3,n+1,2)
     9         1           10     10.0      5.3      mroot = n ** 0.5
    10         1            2      2.0      1.1      half=(n+1)/2-1
    11         1            1      1.0      0.5      i=0
    12         1            1      1.0      0.5      m=3
    13         5            7      1.4      3.7      while m <= mroot:
    14         4            4      1.0      2.1          if s[i]:
    15         3            4      1.3      2.1              j=(m*m-3)/2
    16         3            4      1.3      2.1              s[j]=0
    17        31           31      1.0     16.3              while j<half:
    ...
Look for lines with a high Hits value or a high Time value. These are the places where optimizations yield the greatest improvements.
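If you'd rather not go through kernprof.py, line_profiler also exposes a LineProfiler class that can be driven from ordinary code. Below is a minimal sketch based on my reading of that API; it assumes a copy of primes() without the @profile decorator, since that name only exists when kernprof injects it:

# a minimal sketch driving line_profiler programmatically;
# assumes primes() is importable without the @profile decorator
from line_profiler import LineProfiler
from primes import primes

lp = LineProfiler()
wrapped = lp(primes)   # a LineProfiler instance works as a decorator
wrapped(1000)          # run the function under the profiler
lp.print_stats()       # prints the same per-line table shown above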
How much memory does it use?
Now that we have good timing information for our code, let's move on to figuring out how much memory our program uses. Fortunately, Fabian Pedregosa has implemented a nice memory analyzer, [memory profiler][5], modeled after Robert Kern's line_profiler.
First install it with pip:

$ pip install -U memory_profiler
$ pip install psutil

(Installing the psutil package is recommended here because it greatly improves memory_profiler's performance.)
Like line_profiler, memory_profiler requires you to decorate your function of interest with @profile:
@profile
def primes(n):
    ...
    ...
To see how much memory your function uses, run the following:

$ python -m memory_profiler primes.py

Once your program exits, you should see output that looks like this:
Filename: primes.py

Line #    Mem usage   Increment   Line Contents
===============================================
     2                            @profile
     3    7.9219 MB   0.0000 MB   def primes(n):
     4    7.9219 MB   0.0000 MB       if n==2:
     5                                    return [2]
     6    7.9219 MB   0.0000 MB       elif n<2:
     7                                    return []
     8    7.9219 MB   0.0000 MB       s=range(3,n+1,2)
     9    7.9258 MB   0.0039 MB       mroot = n ** 0.5
    10    7.9258 MB   0.0000 MB       half=(n+1)/2-1
    11    7.9258 MB   0.0000 MB       i=0
    12    7.9258 MB   0.0000 MB       m=3
    13    7.9297 MB   0.0039 MB       while m <= mroot:
    14    7.9297 MB   0.0000 MB           if s[i]:
    15    7.9297 MB   0.0000 MB               j=(m*m-3)/2
    16    7.9258 MB  -0.0039 MB               s[j]=0
    17    7.9297 MB   0.0039 MB               while j<half:
    ...
    22    7.9297 MB   0.0000 MB       return [2]+[x for x in s if x]
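For quick one-off checks, memory_profiler also documents a memory_usage() helper that samples a process's memory while a callable runs. Here is a sketch based on my reading of that API, again assuming primes() is importable without the @profile decorator:

# a minimal sketch using memory_profiler's memory_usage() helper
# to sample memory while primes(1000) runs
from memory_profiler import memory_usage
from primes import primes   # assumes the @profile decorator is removed

samples = memory_usage((primes, (1000,), {}))   # MB readings sampled over time
print("peak memory: %.4f MB" % max(samples))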
IPython shortcuts for line_profiler and memory_profiler
A little-known feature of line_profiler and memory_profiler is that both have shortcut commands available in IPython. All you have to do is type the following in an IPython session:
%load_ext memory_profiler
%load_ext line_profiler
When you do this, you have access to the magic commands %lprun and %mprun, which behave like their command-line counterparts. The main difference is that you don't need to decorate the functions you want to profile with @profile; just run the analysis directly in your IPython session:
In [1]: from primes import primes
In [2]: %mprun -f primes primes(1000)
In [3]: %lprun -f primes primes(1000)
This can save you a lot of time and effort, because none of these profiling commands requires you to modify your source code.
Where is the memory leak?
The CPython interpreter uses reference counting as its primary means of tracking memory. This means that every object holds a counter; when a reference to the object is stored somewhere, the counter is incremented, and when a reference is deleted, the counter is decremented. When the counter reaches zero, the CPython interpreter knows the object is no longer in use, so it deletes the object and frees the memory it held.
A memory leak often occurs when your program holds references to objects even after they are no longer in use.
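A classic shape for this bug is a long-lived container that quietly accumulates references; here is a contrived sketch (the names are my own inventions):

# a contrived sketch of a leak: the module-level list keeps a reference
# to every item it sees, so their refcounts never drop to zero
_seen = []

def handle(request):
    _seen.append(request)   # reference retained forever, even after use
    return len(_seen)

Every object passed to handle() stays alive as long as the module does, even though the rest of the program has finished with it.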
The fastest way to find such leaks is to use a wonderful tool called [objgraph][6], written by Marius Gedminas. This tool lets you see the number of objects in memory, and also locate all the different places in your code that hold references to them.
To get started, first install objgraph:

$ pip install objgraph

Once installed, insert a statement in your code that invokes the debugger:

import pdb; pdb.set_trace()
Which objects are the most common?
At runtime, you can inspect the top 20 most prevalent objects in your program:
(pdb) import objgraph
(pdb) objgraph.show_most_common_types()

MyBigFatObject             20000
tuple                      16938
function                   4310
dict                       2790
wrapper_descriptor         1181
builtin_function_or_method 934
weakref                    764
list                       634
method_descriptor          507
getset_descriptor          451
type                       439
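If a type in that list looks suspicious, objgraph's by_type() function will hand you the live instances themselves, so you can poke at them in the debugger; a short sketch:

(pdb) import objgraph
(pdb) objs = objgraph.by_type('MyBigFatObject')   # list of live instances
(pdb) len(objs)   # should match the count shown above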
Which objects have been added or deleted?
We can also see which objects have been added or deleted between two points in time:
(pdb) import objgraph
(pdb) objgraph.show_growth()
...
(pdb) objgraph.show_growth()   # this only shows objects that have been added or deleted since the last show_growth() call

traceback                4        +2
KeyboardInterrupt        1        +1
frame                             +1
list                   667        +1
tuple                16969        +1
What is referencing this leaky object?
Going a step further, we can also see where references to any given object are being held. Let's take the simple program below as an example:
x = [1]
y = [x, [x], {"a": x}]

import pdb; pdb.set_trace()
To see what is holding a reference to the variable x, run the objgraph.show_backrefs() function:
(pdb) import objgraph
(pdb) objgraph.show_backrefs([x], filename="/tmp/backrefs.png")
The output of this command is a PNG image stored at /tmp/backrefs.png, and it should look something like this:
[backrefs.png: the back-reference graph for x]
The box at the bottom with red lettering is our object of interest; we can see it is referenced by the symbol x once and by the list y three times. If x were the object causing a memory leak, we could use this method to chase down all of its references in order to see why it has not been automatically collected.
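When the graph gets too tangled to read, objgraph also documents a pair of helpers, find_backref_chain() and show_chain(), that extract a single shortest reference path from a module-level name down to the suspect object. Here is a sketch based on objgraph's documented usage, assuming MyBigFatObject from the earlier output is the leaky type:

(pdb) import objgraph, random
(pdb) obj = random.choice(objgraph.by_type('MyBigFatObject'))
(pdb) chain = objgraph.find_backref_chain(obj, objgraph.is_proper_module)
(pdb) objgraph.show_chain(chain, filename='/tmp/chain.png')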
To review, objgraph allows us to:
- show the top N objects occupying a Python program's memory
- show which objects have been added and which have been deleted over a period of time
- show all references to a given object in our script
Effort vs precision
In this article, I showed you how to use some tools to analyze a Python program's performance. Armed with these tools and techniques, you should have all the information you need to track down most memory leaks and quickly identify bottlenecks in your Python programs.
As with many other topics, running a performance analysis means trading off between effort and precision. When in doubt, use the simplest solution that meets your current needs.