Python Memory Leaks and gc Module Usage Analysis

Source: Internet
Author: User

In Python, memory is generally managed through object reference counting, and automatic garbage collection is built on top of those reference counts.
Because Python collects garbage automatically, many beginners assume that they are safe and will never be troubled by memory leaks. A careful look at the description of the __del__() function in the Python documentation shows that things are not quite so rosy. The relevant passage is excerpted below:

Some common situations that may prevent the reference count of an object from going to zero include: circular references between objects (e.g., a doubly-linked list or a tree data structure with parent and child pointers); a reference to the object on the stack frame of a function that caught an exception (the traceback stored in sys.exc_traceback keeps the stack frame alive); or a reference to the object on the stack frame that raised an unhandled exception in interactive mode (the traceback stored in sys.last_traceback keeps the stack frame alive).

It follows that circular references between objects that define the __del__() function are the main culprit behind memory leaks. This holds for Python 2, which the examples in this article use; since Python 3.4 the collector can also reclaim such cycles.
Note also that circular references between Python objects that do not define __del__() can be reclaimed automatically.
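The claim that a reference cycle without __del__() is reclaimed automatically can be checked with a small experiment. The following is only a sketch, written in Python 3 syntax and assuming CPython; the Node class and the cycle_is_collected() function are hypothetical names made up for this illustration, and a weak reference is used as a probe because it does not keep the object alive.

```python
import gc
import weakref


class Node(object):
    """Hypothetical class without __del__, used only for this illustration."""

    def __init__(self, name):
        self.name = name
        self.other = None


def cycle_is_collected():
    """Return (alive_before_collect, alive_after_collect) for a two-object ring."""
    was_enabled = gc.isenabled()
    gc.disable()  # keep the automatic collector from running mid-experiment
    try:
        a = Node('a')
        b = Node('b')
        a.other = b  # a -> b
        b.other = a  # b -> a, which closes the ring
        probe = weakref.ref(a)  # a weak reference does not keep `a` alive
        del a, b  # the ring is unreachable now, but its counts are not zero
        alive_before = probe() is not None
        gc.collect()  # the cycle detector finds the ring and frees it
        alive_after = probe() is not None
        return alive_before, alive_after
    finally:
        if was_enabled:
            gc.enable()


if __name__ == '__main__':
    print(cycle_is_collected())
```

On CPython the first value should be True (reference counting alone cannot free the ring) and the second False (the cycle detector has freed it).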

How do I know whether an object's memory has leaked?

Method 1: When you believe an object should already have been destroyed (that is, no references to it should remain), call sys.getrefcount(obj) to obtain its reference count and judge from the returned value whether memory has leaked. Note that the returned value is generally one higher than you might expect, because the argument passed to getrefcount() is itself a temporary reference. If the count shows that references other than your own still exist, the object obj cannot be reclaimed by the garbage collector at the moment.

Method 2: You can also use Python's gc module to view the details of objects that cannot be reclaimed.
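For Method 1, the absolute number returned by sys.getrefcount() is easy to misread, so it often helps to compare two readings instead. The sketch below is written in Python 3 syntax and assumes CPython; refcount_delta() is a hypothetical helper name. It measures how much the reported count grows when exactly one extra reference is stored in a list.

```python
import sys


def refcount_delta():
    """Return how much the reported count grows when one reference is added."""
    obj = object()
    holders = []
    before = sys.getrefcount(obj)
    holders.append(obj)  # one additional reference, held by the list
    after = sys.getrefcount(obj)
    return after - before


if __name__ == '__main__':
    print(refcount_delta())
```

Comparing readings in this way sidesteps the question of how many temporary references the interpreter adds during the call itself, which may differ between interpreter versions.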


First, let's look at a test program that behaves normally:

#--------------- code begin --------------
# -*- coding: utf-8 -*-
import gc
import sys

class CGcLeak(object):
  def __init__(self):
    self._text = '#'*10

  def __del__(self):
    pass

def make_circle_ref():
  _gcleak = CGcLeak()
#  _gcleak._self = _gcleak # test_code_1
  print '_gcleak ref count0:%d' % sys.getrefcount(_gcleak)
  del _gcleak
  try:
    print '_gcleak ref count1:%d' % sys.getrefcount(_gcleak)
  except UnboundLocalError:
    print '_gcleak is invalid!'

def test_gcleak():
  # Enable automatic garbage collection.
  gc.enable()
  # Set the garbage collection debugging flags.
  gc.set_debug(gc.DEBUG_COLLECTABLE | gc.DEBUG_UNCOLLECTABLE | \
    gc.DEBUG_INSTANCES | gc.DEBUG_OBJECTS)

  print 'begin leak test...'
  make_circle_ref()

  print 'begin collect...'
  _unreachable = gc.collect()
  print 'unreachable object num:%d' % _unreachable
  print 'garbage object num:%d' % len(gc.garbage)

if __name__ == '__main__':
  test_gcleak()
#--------------- code end ----------------

In test_gcleak(), the garbage collector's debugging flags are set and collect() is called to run a garbage collection. Finally, the number of unreachable garbage objects found by this collection and the total number of garbage objects held in the interpreter are printed.

gc.garbage is a list. Its items are objects that the garbage collector found to be unreachable (that is, garbage) but could not free (that is, uncollectable). The documentation describes it as: A list of objects which the collector found to be unreachable but could not be freed (uncollectable objects).
Normally, the objects in gc.garbage are part of a reference cycle. Because Python does not know a safe order in which to call the __del__() functions of the objects in the cycle, the objects stay in gc.garbage and their memory leaks. If you know a safe order, break the reference cycle and then run del gc.garbage[:] to empty the list of garbage objects.

The output of the above code is as follows (lines starting with # are the author's notes):

#-----------------------------------------
begin leak test...
# The reference count of the variable _gcleak is 2.
_gcleak ref count0:2
# _gcleak has become an invalid variable.
_gcleak is invalid!
# Start garbage collection.
begin collect...
# The number of unreachable garbage objects found by this collection is 0.
unreachable object num:0
# The number of garbage objects in the interpreter is 0.
garbage object num:0
#-----------------------------------------

As you can see, the reference count of the _gcleak object is correct, and no object has leaked memory.

If you uncomment the test_code_1 statement in make_circle_ref():

_gcleak._self = _gcleak

that is, if you make _gcleak hold a circular reference to itself, running the test program again produces the following output:

#-----------------------------------------
begin leak test...
_gcleak ref count0:3
_gcleak is invalid!
begin collect...
# Garbage object found that cannot be recycled: address 012AA090, type CGcLeak.
gc: uncollectable <CGcLeak 012AA090>
gc: uncollectable <dict 012AC1E0>
unreachable object num:2
#!! The number of garbage objects that cannot be recycled is 1, so memory has leaked!
garbage object num:1
#-----------------------------------------

The <CGcLeak 012AA090> object has leaked memory! The additional dict garbage object is the attribute dictionary of the leaked _gcleak object. Its contents are:

{'_self': <__main__.CGcLeak object at 0x012AA090>, '_text': '##########'}
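The listings in this article are Python 2. On Python 3.4 and later, a cycle like this one is normally reclaimed and gc.garbage stays empty, so a different flag is needed to look at what the collector found. The following sketch is written in Python 3 syntax and assumes CPython; it uses gc.DEBUG_SAVEALL, which asks the collector to append every unreachable object to gc.garbage instead of freeing it. The Leaky class and the collect_and_inspect() function are hypothetical names introduced only for this illustration.

```python
import gc


class Leaky(object):
    """Hypothetical stand-in for CGcLeak: an object that refers to itself."""

    def __init__(self):
        self._text = '#' * 10
        self._self = self  # self-reference, as in test_code_1


def collect_and_inspect():
    """Return how many Leaky instances the collector saved in gc.garbage."""
    old_flags = gc.get_debug()
    gc.collect()  # start from a clean state
    del gc.garbage[:]
    gc.set_debug(gc.DEBUG_SAVEALL)  # save unreachable objects, do not free them
    try:
        Leaky()  # unreachable right away, but kept alive by its own ring
        gc.collect()
        found = [o for o in gc.garbage if isinstance(o, Leaky)]
        return len(found)
    finally:
        gc.set_debug(old_flags)  # restore the previous debugging flags
        del gc.garbage[:]  # drop the saved objects so they can be freed later


if __name__ == '__main__':
    print('Leaky objects saved in gc.garbage: %d' % collect_and_inspect())
```

Emptying gc.garbage afterwards matters: while an object is in that list it is reachable again and will never be freed.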

Besides an object that refers to itself, circular references among multiple objects can also cause memory leaks. For example:

#--------------- code begin --------------
class CGcLeakA(object):
  def __init__(self):
    self._text = '#'*10

  def __del__(self):
    pass

class CGcLeakB(object):
  def __init__(self):
    self._text = '*'*10

  def __del__(self):
    pass

def make_circle_ref():
  _a = CGcLeakA()
  _b = CGcLeakB()
  _a._b = _b # test_code_2
  _b._a = _a # test_code_3
  print 'ref count0:a=%d b=%d' % \
    (sys.getrefcount(_a), sys.getrefcount(_b))
#  _b._a = None  # test_code_4
  del _a
  del _b
  try:
    print 'ref count1:a=%d' % sys.getrefcount(_a)
  except UnboundLocalError:
    print '_a is invalid!'
  try:
    print 'ref count2:b=%d' % sys.getrefcount(_b)
  except UnboundLocalError:
    print '_b is invalid!'
#--------------- code end ----------------

Running this test produces the following output:

#-----------------------------------------
begin leak test...
ref count0:a=3 b=3
_a is invalid!
_b is invalid!
begin collect...
gc: uncollectable <CGcLeakA 012AA110>
gc: uncollectable <CGcLeakB 012AA0B0>
gc: uncollectable <dict 012AC1E0>
gc: uncollectable <dict 012AC0C0>
unreachable object num:4
garbage object num:2
#-----------------------------------------

As you can see, both _a and _b have leaked memory. Because the two objects refer to each other in a cycle, the garbage collector does not know how to reclaim them; that is, it does not know which object's __del__() function to call first.

Any one of the following changes breaks the reference cycle and avoids the memory leak:

1. Comment out the test_code_2 statement in make_circle_ref();
2. Comment out the test_code_3 statement in make_circle_ref();
3. Uncomment the test_code_4 statement in make_circle_ref().

With any of these changes, the output becomes:

#-----------------------------------------
begin leak test...
ref count0:a=2 b=3 # Note: this output varies depending on which method is used.
_a is invalid!
_b is invalid!
begin collect...
unreachable object num:0
garbage object num:0
#-----------------------------------------
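The effect of breaking the ring by hand (the idea behind test_code_4) can also be observed directly. The sketch below is written in Python 3 syntax and assumes CPython; the classes A and B, the finalized list, and make_and_break_ring() are hypothetical names for this illustration. The cyclic collector is switched off for the duration of the experiment, so any __del__() call that is recorded must come from plain reference counting.

```python
import gc

finalized = []  # hypothetical log of which __del__ methods have run


class A(object):
    def __del__(self):
        finalized.append('A')


class B(object):
    def __del__(self):
        finalized.append('B')


def make_and_break_ring():
    """Build an A <-> B ring, break it, and rely on reference counting only."""
    del finalized[:]  # start each run with an empty log
    was_enabled = gc.isenabled()
    gc.disable()  # show that the cycle detector is not needed here
    try:
        a = A()
        b = B()
        a.b = b
        b.a = a
        b.a = None  # same idea as test_code_4: break the ring by hand
        del a  # the count of the A object drops to zero, so A.__del__ runs
        del b  # nothing else refers to the B object, so B.__del__ runs
        return list(finalized)
    finally:
        if was_enabled:
            gc.enable()


if __name__ == '__main__':
    print(make_and_break_ring())
```

Once the ring is broken, both objects are released as soon as their last name is deleted, without waiting for gc.collect().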

Conclusion: Python's gc module is powerful. For example, calling gc.set_debug(gc.DEBUG_LEAK) lets you check for memory leaks caused by circular references. If you check for leaks during development and make sure that the released program does not leak, you can lengthen Python's garbage collection interval, or even disable the garbage collector entirely, to improve runtime efficiency.
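The two tuning options mentioned in the conclusion can be sketched as follows, in Python 3 syntax. gc.get_threshold(), gc.set_threshold(), gc.disable(), gc.enable() and gc.isenabled() are functions of the standard gc module; the helper names gc_paused() and raise_threshold() and the default factor of 10 are assumptions made up for this illustration, not recommended values.

```python
import contextlib
import gc


@contextlib.contextmanager
def gc_paused():
    """Hypothetical helper: disable the cyclic collector inside a with-block."""
    was_enabled = gc.isenabled()
    gc.disable()
    try:
        yield
    finally:
        if was_enabled:
            gc.enable()  # restore the collector only if it was on before


def raise_threshold(factor=10):
    """Make automatic collections rarer; return the previous thresholds."""
    old = gc.get_threshold()
    # Only the first threshold is scaled; the others are passed back unchanged.
    gc.set_threshold(old[0] * factor, *old[1:])
    return old


if __name__ == '__main__':
    previous = raise_threshold()
    print('thresholds: %r -> %r' % (previous, gc.get_threshold()))
    with gc_paused():
        print('collector enabled inside the block: %r' % gc.isenabled())
    gc.set_threshold(*previous)  # put the original settings back
```

Disabling the collector only stops cycle detection; reference counting keeps freeing objects as usual, which is why this is safe only for code that has been checked for leaking cycles.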


