"Python" Memory debugging

Source: Internet
Author: User

Full text copied from: http://blog.csdn.net/BaishanCloud/article/details/76422782

Walking through the problem-locating process

gdb-python: figuring out what the Python program is doing
First determine what the Python program is doing: whether a memory-hungry task is running, or whether there is a deadlock or other abnormal behavior.

Starting with gdb 7, GDB can be extended with Python. Just as with a C program, you can use GDB to inspect a Python program's threads, call stacks, and so on, and it can print the call stack of both the Python code and the underlying C code.

This is useful for locating problems either in the Python code or in the C code beneath it.

    • Prepare GDB
      First install the Python debuginfo:
# debuginfo-install python-2.7.5-39.el7_2.x86_64

If the debuginfo is missing, GDB prints the following hint when you run the later steps; install it as instructed:

Missing separate debuginfos, use: debuginfo-install python-2.7.5-39.el7_2.x86_64
    • Attach with GDB
      You can use GDB to attach directly to a Python process and inspect its running state:
# gdb python 11122

Once attached, the basic checks are as follows:

    • View Threads
(gdb) info threads
  Id   Target Id         Frame
  206  Thread 0x7febdbfe3700 (LWP 124916) "python2" 0x00007febe9b75413 in select () at ../sysdeps/unix/syscall-template.S:81
  205  Thread 0x7febdb7e2700 (LWP 124917) "python2" 0x00007febe9b75413 in select () at ../sysdeps/unix/syscall-template.S:81
  204  Thread 0x7febdafe1700 (LWP 124918) "python2" 0x00007febe9b75413 in select () at ../sysdeps/unix/syscall-template.S:81
  203  Thread 0x7febda7e0700 (LWP 124919) "python2" 0x00007febe9b7369d in poll () at ../sysdeps/unix/syscall-template.S:81

Typically, when lock contention or a deadlock is involved, some threads will be stuck in functions such as xx_wait.

We previously used this method to locate a deadlock caused by the Python logging module:
calling fork in a multithreaded process copies the logging lock into the new process in its locked state, but the thread that would release the lock is not copied into the new process, so the child deadlocks.
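The mechanism can be reproduced without logging. The following is a minimal sketch, assuming a POSIX system where os.fork is available; the function name is hypothetical, and a plain threading.Lock stands in for the lock held inside the logging module:

```python
import os
import threading


def child_inherits_locked_lock():
    """Return True if a forked child sees the lock as held, with no thread left to release it."""
    lock = threading.Lock()
    acquired = threading.Event()
    may_release = threading.Event()

    def holder():
        # A second thread takes the lock and holds it across the fork.
        with lock:
            acquired.set()
            may_release.wait()

    worker = threading.Thread(target=holder)
    worker.start()
    acquired.wait()

    pid = os.fork()
    if pid == 0:
        # Child: only the forking thread was copied, so nobody here will
        # ever release the lock. A blocking acquire() would hang forever;
        # a non-blocking attempt shows the state without deadlocking.
        got_it = lock.acquire(False)
        os._exit(0 if got_it else 1)

    # Parent: collect the child's verdict, then let the holder thread finish.
    _, status = os.waitpid(pid, 0)
    may_release.set()
    worker.join()
    return os.WEXITSTATUS(status) == 1
```

In gdb, a child in this state would show its only thread blocked in a wait function, which matches the symptom described above.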

    • Viewing the call stack
      If a thread looks problematic, switch to it and examine its call stack with the bt command to determine exactly what it is executing:
(gdb) bt
#16 0x00007febea8500bd in PyEval_EvalCodeEx (co=<optimized out>, globals=<optimized out>, locals@entry=0x0, args=<optimized out>, argcount@entry=1, kws=0x38aa668, kwcount=2, defs=0x3282a88, defcount=2, closure@entry=0x0) at /usr/src/debug/Python-2.7.5/Python/ceval.c:3330
...
#19 PyEval_EvalFrameEx (f@entry=Frame 0x38aa4d0, for file t.py, line 647, in run (part_num=2, consumer=<...

The bt command shows not only the C call stack but also the Python source call stack. Above, frame 16 is a C frame, and frame 19 shows the line of Python source being executed.

If you only want to see the call stack of the Python code, use the py-bt command:

(gdb) py-bt
#1 <built-in method poll of select.epoll object at remote 0x7febeacc5930>
#3 Frame 0x3952450, for file /usr/lib64/python2.7/site-packages/twisted/internet/epollreactor.py, line 379, in doPoll (self=<...
    l = self._poller.poll(timeout, len(self._selectables))
#7 Frame 0x39502a0, for file /usr/lib64/python2.7/site-packages/twisted/internet/base.py, line 1204, in mainLoop (self=<...

py-bt displays the Python call stack, the call arguments, and the source line of each frame.

    • Coredump
      For a lengthy investigation, it is best to dump the complete process state of the Python program to a core file first and then analyze the core file, so that the running program is not disturbed.
(gdb) generate-core-file

This command dumps the process GDB is currently attached to into a core file in the working directory. You can then load the core file with GDB to print stacks, inspect variables, and so on, without attaching to the running program:

# gdb python core.<pid>

    • Other Commands
      Other commands can be listed by typing py in gdb and pressing Tab; they correspond to GDB's own commands, for example:
(gdb) py
py-bt               py-list             py-print            python
py-down             py-locals           py-up               python-interactive

- py-up and py-down move to the previous or next frame of the Python call stack;
- py-locals prints local variables ...
The help command can also be used in gdb to get help:

(gdb) help py-print
Look up the given python variable name, and print it

In this investigation, gdb-python ruled out a problem in the program logic. Next, we went on to trace the memory leak.

pyrasite: connecting to a Python program

pyrasite can connect directly to a running Python program and open an IPython-like interactive terminal in which you can run commands and check the program's state.

This makes debugging much more convenient.
Installation:

# pip install pyrasite
...
# pip show pyrasite
Name: pyrasite
Version: 2.0
Summary: Inject code into a running Python process
Home-page: http://pyrasite.com
Author: Luke Macken
...

Connect to the problematic Python program and start collecting information:

pyrasite-shell <pid>
>>>

You can then call any Python code in the process to see the status of the process.
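As an illustration of what can be run from such a shell, the sketch below collects the Python stack of every thread in the current process using only the standard library. The function name is hypothetical; it is not part of pyrasite:

```python
import sys
import threading
import traceback


def format_thread_stacks():
    """Return a text dump of the Python call stack of every live thread."""
    # Map thread identifiers to names so the dump is easier to read.
    names = {t.ident: t.name for t in threading.enumerate()}
    chunks = []
    for ident, frame in sys._current_frames().items():
        chunks.append("Thread %s (%s)" % (ident, names.get(ident, "unknown")))
        chunks.append("".join(traceback.format_stack(frame)))
    return "\n".join(chunks)


print(format_thread_stacks())
```

This gives roughly the same information as py-bt for each thread, but from inside the process rather than from gdb.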

psutil: viewing the Python process status

# pip install psutil

First, look at the RSS (resident set size, the physical memory in use) of the Python process:

pyrasite-shell 11122
>>> import psutil, os
>>> psutil.Process(os.getpid()).memory_info().rss
29095232

This is basically consistent with what the ps command reports; note that psutil returns bytes while ps uses 1024-byte units:

rss the real memory (resident set) size of the process (in 1024 byte units)
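To compare the two figures directly, the psutil value has to be divided by 1024. A small sketch of that arithmetic, with a helper for comparing RSS samples taken over time; both function names are hypothetical and not part of psutil:

```python
def rss_to_ps_units(rss_bytes):
    """Convert an RSS value in bytes to the 1024-byte units used by ps."""
    return rss_bytes // 1024


def rss_growth_kib(samples):
    """Growth in KiB between the first and the last RSS sample (in bytes)."""
    return (samples[-1] - samples[0]) // 1024


# The value read above, expressed in ps units.
print(rss_to_ps_units(29095232))  # 28413
```

Sampling memory_info().rss periodically and feeding the values to rss_growth_kib is one simple way to confirm that memory is actually growing.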

guppy: memory usage broken down by object type
guppy can print the space occupied by each kind of object. If objects in the Python process are never released and memory usage keeps growing as a result, guppy will show it.

As before, the following steps are performed after attaching to the target process with pyrasite-shell.
# pip install guppy
from guppy import hpy
h = hpy()
h.heap()
# Partition of a set of 48477 objects. Total size = 3265516 bytes.
#  Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
#      0  25773      1612820      1612820
#      1  11699       483960      2096780
#      2    174   0   241584   7  2338364    dict of module
#      3   3478   7   222592   7  2560956    types.CodeType
#      4   3296   7   184576   6  2745532    function
#      5    401   1   175112   5  2920644 89 dict of class
#      6    108   0    81888   3  3002532    dict (no owner)
#      7    114   0    79632   2  3082164 94 dict of type
#      8    117   0    51336   2  3133500    type
#      9    667   1    24012   1  3157512    __builtin__.wrapper_descriptor
# <76 more rows. Type e.g. '_.more' to view.>
h.iso(1, [], {})
# Partition of a set of 3 objects. Total size = 176 bytes.
#  Index Count % Size % Cumulative % Kind (class / dict of class)
#  0 1 136 136 dict (no owner)
#  1 1 1 164 list
#  2 33 7 176 int

With the steps above we ruled out the possibility that unreleased objects were accumulating in the Python process.
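When guppy is not available, a rough version of the same check can be done with the standard library. The sketch below is a substitute technique, not guppy's API: it counts gc-tracked objects by type name at two points in time and reports which types grew. The function names are hypothetical:

```python
import gc
from collections import Counter


def type_counts():
    """Count the objects currently tracked by the garbage collector, by type name."""
    return Counter(type(obj).__name__ for obj in gc.get_objects())


def growth(before, after, top=10):
    """Return the types whose object count increased, largest increase first."""
    # Counter subtraction keeps only the positive differences.
    return (after - before).most_common(top)
```

Typical use in the attached shell would be to call type_counts() once, let the program run for a while, call it again, and pass both results to growth(). Unlike guppy it reports counts only, not sizes, and it does not see objects the collector does not track (for example plain ints and strings).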

Objects that cannot be reclaimed

Although Python has garbage collection, an object in a Python program becomes uncollectable when both of the following conditions are met:
- It is part of a circular reference
- A __del__ method is defined on an object in the reference cycle

The official explanation is that the gc module can recognize a group of objects in a reference cycle as reclaimable, but __del__ must be called on each object before it is reclaimed. When user-defined __del__ methods are present, the GC cannot determine which __del__ in the cycle should be called first, so it does not reclaim such objects. (This is the behavior of the Python 2.7 interpreter used here; Python 3.4 and later can collect such cycles.)

An uncollectable Python object keeps occupying memory, so we suspected that uncollectable objects were causing memory to keep growing.

In the end we determined that this was not what kept the memory from being freed. Uncollectable objects are still listed by gc.get_objects() and are added to the gc.garbage list after gc.collect() is called, but we did not find any such objects.
To find uncollectable objects:

pyrasite-shell 11122
>>> import gc
>>> gc.collect()  # first run gc to find uncollectable objects and put them in gc.garbage
                  # outputs the number of objects collected
>>> gc.garbage    # print all uncollectable objects
[]                # empty

If any uncollectable objects are printed, you need to look further to determine which object in the reference cycle defines the __del__ method.

Here is an example that demonstrates how uncollectable objects are created:
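One way to narrow this down is to filter a list of objects for those whose class defines __del__. The sketch below assumes it is run inside the target process; both function names are hypothetical:

```python
import gc


def objects_with_del(objs):
    """Return the objects whose class (or a base class) defines __del__."""
    return [obj for obj in objs if hasattr(type(obj), "__del__")]


def garbage_with_del():
    """Run a collection, then list the entries of gc.garbage that define __del__."""
    gc.collect()
    return objects_with_del(gc.garbage)
```

On Python 3.4 and later gc.garbage normally stays empty, so garbage_with_del() is mainly useful on Python 2 interpreters such as the one discussed here.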

from __future__ import print_function

import gc

'''
This snippet shows how to create an uncollectible object:
it is an object in a cycle reference chain, in which there is an object
with __del__ defined.
The simplest is an object that refers to itself and has a __del__ defined.

    > python uncollectible.py

    ======= collectible object =======

    *** init,     nr of referrers: 4
                  garbage:         []
                  created:         collectible: <__main__.One object at 0x102c01090>
                  nr of referrers: 5
                  delete:
    *** __del__ called
    *** after gc, nr of referrers: 4
                  garbage:         []

    ======= uncollectible object =======

    *** init,     nr of referrers: 4
                  garbage:         []
                  created:         uncollectible: <__main__.One object at 0x102c01110>
                  nr of referrers: 5
                  delete:
    *** after gc, nr of referrers: 5
                  garbage:         [<__main__.One object at 0x102c01110>]
'''


def dd(*msg):
    # Print all arguments on one line without separators.
    for m in msg:
        print(m, end='')
    print()


class One(object):

    def __init__(self, collectible):
        if collectible:
            self.typ = 'collectible'
        else:
            self.typ = 'uncollectible'

            # Make a reference to itself, to form a reference cycle.
            # A reference cycle with __del__ makes it uncollectible.
            self.me = self

    def __del__(self):
        dd('*** __del__ called')


def test_it(collectible):

    dd()
    dd('======= ', ('collectible' if collectible else 'uncollectible'), ' object =======')
    dd()

    gc.collect()
    dd('garbage: ', gc.garbage)

    one = One(collectible)
    dd('created: ', one.typ, ': ', one)

    dd('delete:')
    del one

    gc.collect()
    dd('garbage: ', gc.garbage)


if __name__ == "__main__":
    test_it(collectible=True)
    test_it(collectible=False)

The code above creates two objects, one collectable and one uncollectable. Both define the __del__ method; the only difference is whether the object refers to itself (and thus forms a reference cycle).

If you find a circular reference in this step, you need to identify which references form the cycle, then break the cycle so that the objects become collectable.
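Identifying the references that form a cycle can be sketched with the standard library alone. The function below is a hypothetical helper, not part of any tool mentioned here: it does a breadth-first search over gc.get_referents() and returns the shortest chain of objects leading from an object back to itself. The max_depth limit is an assumption to keep the search bounded:

```python
import gc
from collections import deque


def find_cycle(start, max_depth=6):
    """Return the shortest reference path start -> ... -> start, or None.

    The path is a list of objects; None means no cycle through `start`
    was found within max_depth hops.
    """
    queue = deque([(start, [start])])
    seen = {id(start)}
    while queue:
        obj, path = queue.popleft()
        if len(path) > max_depth:
            continue
        for ref in gc.get_referents(obj):
            if ref is start:
                # Found a reference back to the starting object.
                return path + [start]
            if id(ref) not in seen:
                seen.add(id(ref))
                queue.append((ref, path + [ref]))
    return None
```

For an entry taken from gc.garbage, the returned path names the intermediate objects (often an instance __dict__) that hold the reference to clear or replace with a weak reference.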

objgraph: finding circular references

# pip install objgraph
pyrasite-shell 11122
>>> import objgraph
>>> objgraph.show_refs([an_object], filename='sample-graph.png')

In the example above, an image is generated locally that shows the graph of objects referenced from an_object.

In this step we still did not find any uncollectable objects. Having ruled out all the causes above, we inferred that the problem lay in libc's malloc implementation. The problem was eventually fixed by replacing libc's default malloc with tcmalloc.
