Python memory leaks and how to analyze them with the gc module

Source: Internet
Author: User
Python manages memory with reference counting: every object carries a reference count, and automatic garbage collection is built on top of it.
Because Python collects garbage automatically, many beginners assume that memory leaks are a thing of the past. A close look at the documentation for the __del__() method shows that this is not quite true. Here is a short excerpt from the documentation:

Some common situations that could prevent the reference count of an object from going to zero include: circular references between objects (e.g., a doubly-linked list or a tree data structure with parent and child pointers); a reference to the object on the stack frame of a function that caught an exception (the traceback stored in sys.exc_traceback keeps the stack frame alive); or a reference to the object on the stack frame that raised an unhandled exception in interactive mode (the traceback stored in sys.last_traceback keeps the stack frame alive).

As this shows, circular references between objects that define __del__() are the main cause of memory leaks. This applies to Python 2, which the examples here use, and to Python 3 before version 3.4.
Note also that circular references between Python objects that do not define __del__() can be garbage collected automatically.
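
The second point can be checked with a minimal sketch, assuming CPython 3 (the article's own listings are Python 2). The class `Node` and the helper `make_cycle` are hypothetical names introduced for illustration, and the weak reference is used only as a probe to observe whether the object is still alive.

```python
import gc
import weakref


class Node(object):
    """Hypothetical class with no __del__ method."""

    def __init__(self, name):
        self.name = name
        self.other = None


def make_cycle():
    # Build two objects that reference each other, then drop the local names.
    a = Node("a")
    b = Node("b")
    a.other = b
    b.other = a
    # Return only a weak reference, which does not keep the object alive.
    return weakref.ref(a)


gc.disable()                         # rule out an automatic collection during the demo
probe = make_cycle()
alive_before = probe() is not None   # the cycle keeps both objects alive
found = gc.collect()                 # run the cycle detector explicitly
alive_after = probe() is not None
gc.enable()

print("alive before collect:", alive_before)
print("unreachable objects found:", found)
print("alive after collect:", alive_after)
```

Reference counting alone cannot free the pair, but an explicit gc.collect() finds the cycle and reclaims it.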

How do I know if an object is leaking memory?

Method one: when you think an object should be destroyed (that is, nothing should reference it any more), call sys.getrefcount(obj) to check its reference count. The returned value is generally one higher than you might expect, because the call itself holds a temporary reference to its argument, so it never returns 0. If the count is higher than the number of references you know about, something else still holds the object, and it cannot be reclaimed at this point.
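
A minimal sketch of method one, assuming CPython 3 (the article's own listings are Python 2). `Payload` and `holders` are hypothetical names. Because the number returned by sys.getrefcount() generally includes the temporary reference created by the call itself, the sketch compares counts before and after adding references instead of checking for zero.

```python
import sys


class Payload(object):
    """Hypothetical stand-in for the object being investigated."""


obj = Payload()
baseline = sys.getrefcount(obj)   # generally includes the call's own temporary reference

holders = [obj, obj]              # two extra references stored in a list
with_holders = sys.getrefcount(obj)

del holders                       # drop the extra references again
after_release = sys.getrefcount(obj)

print("baseline:", baseline)
print("with two extra references:", with_holders)
print("after releasing them:", after_release)
```

A count that stays above the baseline after you have released every reference you know about points to a holder you have not found yet.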

Method two: you can also use Python's gc module to inspect the details of objects that cannot be reclaimed.


First, take a look at the normal test code:

#---------------code begin--------------
# -*- coding: utf-8 -*-
import gc
import sys

class CGcLeak(object):
    def __init__(self):
        self._text = '#' * 10

    def __del__(self):
        pass

def make_circle_ref():
    _gcleak = CGcLeak()
    # _gcleak._self = _gcleak  # test_code_1
    print '_gcleak ref count0:%d' % sys.getrefcount(_gcleak)
    del _gcleak
    try:
        print '_gcleak ref count1:%d' % sys.getrefcount(_gcleak)
    except UnboundLocalError:
        print '_gcleak is invalid!'

def test_gcleak():
    # Enable automatic garbage collection.
    gc.enable()
    # Set the garbage collection debugging flags.
    gc.set_debug(gc.DEBUG_COLLECTABLE | gc.DEBUG_UNCOLLECTABLE |
                 gc.DEBUG_INSTANCES | gc.DEBUG_OBJECTS)
    print 'begin leak test...'
    make_circle_ref()
    print 'begin collect...'
    _unreachable = gc.collect()
    print 'unreachable object num:%d' % _unreachable
    print 'garbage object num:%d' % len(gc.garbage)

if __name__ == '__main__':
    test_gcleak()
#---------------code end----------------

In test_gcleak(), after the garbage collector's debug flags are set, collect() runs a garbage collection. The code then prints the number of unreachable garbage objects found by that collection and the number of garbage objects held in the whole interpreter.

gc.garbage is a list of the objects that the garbage collector found to be unreachable (that is, garbage) but could not free. The documentation describes it as: A list of objects which the collector found to be unreachable but could not be freed (uncollectable objects).
In general, the objects in gc.garbage belong to reference cycles. Because Python does not know a safe order in which to call the __del__() methods of the objects in a cycle, the objects stay alive in gc.garbage indefinitely, which causes a memory leak. If you know a safe order, break the reference cycle and then execute del gc.garbage[:] to empty the list of garbage objects.
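
The inspect, break, and clear steps can be sketched as follows, assuming CPython 3.4 or later. On those versions cycles whose objects define __del__() are generally collected, so this sketch swaps in the gc.DEBUG_SAVEALL flag to make the collector keep unreachable objects in gc.garbage. That is a stand-in for the Python 2 behaviour described above, not the same mechanism. `Node` and `leaked_count` are hypothetical names.

```python
import gc


class Node(object):
    """Hypothetical class used only for this demonstration."""

    def __init__(self):
        self.ref = None


gc.collect()                      # start from a clean state
gc.set_debug(gc.DEBUG_SAVEALL)    # keep unreachable objects in gc.garbage

node = Node()
node.ref = node                   # self-referencing cycle
del node
gc.collect()

# Inspect what the collector found.
leaked = [o for o in gc.garbage if isinstance(o, Node)]
leaked_count = len(leaked)
print("Node instances in gc.garbage:", leaked_count)

# Break each cycle, then drop every remaining reference to the objects.
for o in leaked:
    o.ref = None
del o, leaked
del gc.garbage[:]                 # empty the list of garbage objects
gc.set_debug(0)                   # restore the default (no debug flags)

print("gc.garbage length after cleanup:", len(gc.garbage))
```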

The code above produces the following output (the strings after # are comments added by the author):

#-----------------------------------------
begin leak test...
# The reference count of variable _gcleak is 2.
_gcleak ref count0:2
# _gcleak has become an unreachable, invalid variable.
_gcleak is invalid!
# Start garbage collection.
begin collect...
# The number of unreachable garbage objects found by this collection is 0.
unreachable object num:0
# The total number of garbage objects in the interpreter is 0.
garbage object num:0
#-----------------------------------------

This shows that the reference count of the _gcleak object is correct, and no memory leaks occur for any objects.

Now uncomment the test_code_1 statement in make_circle_ref():

_gcleak._self = _gcleak

That is, make _gcleak hold a circular reference to itself. Run the code again and the output becomes:

#-----------------------------------------
begin leak test...
_gcleak ref count0:3
_gcleak is invalid!
begin collect...
# Found garbage objects that cannot be reclaimed: address 012AA090, type CGcLeak.
gc: uncollectable <CGcLeak 012AA090>
gc: uncollectable <dict ...>
unreachable object num:2
#!! The number of garbage objects that cannot be reclaimed is 1, causing a memory leak!
garbage object num:1
#-----------------------------------------

As you can see, the _gcleak object has leaked. The extra dict reported as garbage is the attribute dictionary of the leaked _gcleak object. Printing that dictionary gives:

{'_self': <__main__.CGcLeak object at 0x012AA090>, '_text': '##########'}

In addition to a circular reference to itself, a circular reference between multiple objects can also cause a memory leak. A simple example is as follows:

#---------------code begin--------------
class CGcLeakA(object):
    def __init__(self):
        self._text = '#' * 10

    def __del__(self):
        pass

class CGcLeakB(object):
    def __init__(self):
        self._text = '*' * 10

    def __del__(self):
        pass

def make_circle_ref():
    _a = CGcLeakA()
    _b = CGcLeakB()
    _a._b = _b  # test_code_2
    _b._a = _a  # test_code_3
    print 'ref count0:a=%d b=%d' % \
        (sys.getrefcount(_a), sys.getrefcount(_b))
    # _b._a = None  # test_code_4
    del _a
    del _b
    try:
        print 'ref count1:a=%d' % sys.getrefcount(_a)
    except UnboundLocalError:
        print '_a is invalid!'
    try:
        print 'ref count2:b=%d' % sys.getrefcount(_b)
    except UnboundLocalError:
        print '_b is invalid!'
#---------------code end----------------

This test results in the following output:

#-----------------------------------------
begin leak test...
ref count0:a=3 b=3
_a is invalid!
_b is invalid!
begin collect...
gc: uncollectable <...>
gc: uncollectable <...>
gc: uncollectable <...>
gc: uncollectable <...>
unreachable object num:4
garbage object num:2
#-----------------------------------------

As you can see, both the _a and _b objects have leaked. Because the two reference each other in a cycle, the garbage collector does not know which object's __del__() method to call first, so it cannot reclaim either of them.

Any one of the following changes breaks the reference cycle and avoids the memory leak:

1. Comment out the test_code_2 statement in make_circle_ref();
2. Comment out the test_code_3 statement in make_circle_ref();
3. Uncomment the test_code_4 statement in make_circle_ref().
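
The third option can be sketched as follows, assuming CPython 3 (the article's own listings are Python 2). The classes `A` and `B` and the helper `make_pair` are hypothetical counterparts of the listing above, and the weak references are only probes. With the cyclic collector disabled, only reference counting can free objects, so the sketch shows that a pair whose cycle was broken is freed as soon as the function returns, while an intact cycle stays alive.

```python
import gc
import weakref


class A(object):
    """Hypothetical counterpart of CGcLeakA."""

    def __del__(self):
        pass


class B(object):
    """Hypothetical counterpart of CGcLeakB."""

    def __del__(self):
        pass


def make_pair(break_cycle):
    a = A()
    b = B()
    a._b = b
    b._a = a
    if break_cycle:
        b._a = None        # same idea as test_code_4
    # Weak references let us observe the objects without keeping them alive.
    return weakref.ref(a), weakref.ref(b)


gc.disable()               # only reference counting can free objects now
kept_a, kept_b = make_pair(break_cycle=False)
freed_a, freed_b = make_pair(break_cycle=True)

cycle_still_alive = kept_a() is not None and kept_b() is not None
broken_pair_freed = freed_a() is None and freed_b() is None
gc.enable()
gc.collect()               # on CPython 3.4+, this reclaims the intact cycle

print("intact cycle still alive:", cycle_still_alive)
print("broken pair freed immediately:", broken_pair_freed)
```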

The corresponding output turns into:

#-----------------------------------------
begin leak test...
ref count0:a=2 b=3  # Note: the output here varies depending on which method is used.
_a is invalid!
_b is invalid!
begin collect...
unreachable object num:0
garbage object num:0
#-----------------------------------------

Conclusion: Python's gc module is powerful. For example, calling gc.set_debug(gc.DEBUG_LEAK) lets you check for memory leaks caused by circular references. If you check for leaks during development and make sure the released code has none, you can improve runtime efficiency by lengthening Python's garbage collection interval, or even by disabling the garbage collection mechanism altogether.
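
The two tuning options can be sketched with the gc module's threshold and enable/disable functions, assuming CPython 3. The value 10000 is an arbitrary illustration, not a recommendation, and the "hot section" is a placeholder for your own allocation-heavy code.

```python
import gc

original = gc.get_threshold()          # three-item tuple of thresholds
print("default thresholds:", original)

# Lengthen the collection interval: the youngest generation is now examined
# only after many more net allocations (10000 is an illustrative value).
gc.set_threshold(10000, original[1], original[2])
raised = gc.get_threshold()[0]

# Or switch automatic cycle collection off entirely around a hot section.
gc.disable()
try:
    disabled_during = not gc.isenabled()
    # ... allocation-heavy work that is known not to create cycles ...
finally:
    gc.enable()

gc.set_threshold(*original)            # restore the original settings
print("threshold0 while raised:", raised)
print("collector disabled during hot section:", disabled_during)
```

Disabling the collector only stops cycle detection. Reference counting still frees objects whose count drops to zero.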
