Python memory leaks and usage analysis of the gc module

Source: Internet
Author: User
Tags: garbage collection

Python manages memory with reference counting: every object carries a reference count, and automatic garbage collection is built on top of it.
Because Python collects garbage automatically, many beginners assume that memory leaks are a thing of the past. The description of the __del__() method in the Python documentation shows otherwise. Here is a short excerpt:

Some common situations that may prevent the reference count of an object from going to zero include: circular references between objects (e.g., a doubly-linked list or a tree data structure with parent and child pointers); a reference to the object on the stack frame of a function that caught an exception (the traceback stored in sys.exc_traceback keeps the stack frame alive); or a reference to the object on the stack frame that raised an unhandled exception in interactive mode (the traceback stored in sys.last_traceback keeps the stack frame alive).

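Thus, circular references between objects that define __del__() are the main cause of memory leaks.
Note that circular references between objects that do not define __del__() can be reclaimed by the automatic garbage collector. (This describes Python 2, which the code in this article targets, and Python 3 before version 3.4; from Python 3.4 on, PEP 442 lets the collector reclaim cycles containing __del__() methods as well.)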
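The second point can be checked directly. The following is a minimal sketch, assuming Python 3 on CPython (unlike the Python 2 listings in this article); the class Node and the function cycle_is_collected() are hypothetical names chosen for illustration. A weak reference is used as a probe because it does not keep the object alive.

```python
import gc
import weakref


class Node(object):
    """Hypothetical class that deliberately has no __del__ method."""


def cycle_is_collected():
    """Return (alive_before_collect, alive_after_collect) for a self-cycle."""
    was_enabled = gc.isenabled()
    gc.disable()                    # keep an automatic collection from interfering
    try:
        node = Node()
        node.me = node              # one-object reference cycle
        probe = weakref.ref(node)   # weak reference: does not keep node alive
        del node                    # the cycle keeps the reference count above zero
        alive_before = probe() is not None
        gc.collect()                # the cycle detector reclaims the object
        alive_after = probe() is not None
        return alive_before, alive_after
    finally:
        if was_enabled:
            gc.enable()


print(cycle_is_collected())
```

The object outlives the del statement because of the cycle, and disappears only once gc.collect() has run.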

How do you know if an object has a memory leak?

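Method one: when you believe an object should already have been destroyed (that is, nothing should refer to it any more), call sys.getrefcount(obj) to read its reference count and check whether other references remain. Note that the returned value is one higher than you might expect, because passing obj as an argument creates a temporary reference; this is why a freshly created object bound to a single name reports 2 in the output later in this article. If the count is higher than the references you know about, something else still holds the object, and obj cannot be reclaimed at the moment.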
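Because the absolute value returned by sys.getrefcount() includes that temporary reference and can vary between interpreter versions, it is often easier to compare two readings. The sketch below assumes Python 3; Payload and refcount_deltas() are hypothetical names used only for this illustration.

```python
import sys


class Payload(object):
    """Hypothetical stand-in for any object under investigation."""


def refcount_deltas():
    """Return how the count changes when containers take and drop references."""
    obj = Payload()
    base = sys.getrefcount(obj)     # baseline reading
    as_list = [obj]                 # +1: the list holds a reference
    as_dict = {'key': obj}          # +1: the dict holds a reference
    grown = sys.getrefcount(obj)
    as_list.clear()                 # -1
    as_dict.clear()                 # -1
    shrunk = sys.getrefcount(obj)
    return grown - base, shrunk - base


print(refcount_deltas())
```

A count that stays above the baseline after you have dropped every reference you know about suggests that some other container is still holding the object.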

Method two: use Python's standard gc module to view the details of objects that cannot be reclaimed.
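As one illustration of what the gc module can report, gc.get_referrers() lists the containers that refer directly to an object. This is a sketch under assumptions: it targets Python 3, it is not part of the test script in this article, and Leaky, registry and held_by_registry() are hypothetical names.

```python
import gc


class Leaky(object):
    """Hypothetical object that something keeps alive."""


def held_by_registry():
    """Return True if gc reports the registry dict as a referrer of target."""
    target = Leaky()
    registry = {'cached': target}           # a forgotten cache entry, for illustration
    referrers = gc.get_referrers(target)    # containers referring to target directly
    for ref in referrers:
        if ref is registry:
            return True
    return False


print(held_by_registry())
```

In a real investigation the referrers list usually contains dicts, lists or frames, and printing their types is a reasonable first step toward finding the reference that should have been dropped.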


First, take a look at the normal test code:

#---------------code begin--------------
# -*- coding: utf-8 -*-
# Python 2 code.
import gc
import sys

class CGcLeak(object):
  def __init__(self):
    self._text = '#' * 10

  def __del__(self):
    pass

def make_circle_ref():
  _gcleak = CGcLeak()
#  _gcleak._self = _gcleak  # test_code_1
  print '_gcleak ref count0:%d' % sys.getrefcount(_gcleak)
  del _gcleak
  try:
    print '_gcleak ref count1:%d' % sys.getrefcount(_gcleak)
  except UnboundLocalError:
    print '_gcleak is invalid!'

def test_gcleak():
  # Enable automatic garbage collection.
  gc.enable()
  # Set the garbage collection debugging flags.
  gc.set_debug(gc.DEBUG_COLLECTABLE | gc.DEBUG_UNCOLLECTABLE | \
    gc.DEBUG_INSTANCES | gc.DEBUG_OBJECTS)

  print 'begin leak test...'
  make_circle_ref()

  print 'begin collect...'
  _unreachable = gc.collect()
  print 'unreachable object num:%d' % _unreachable
  print 'garbage object num:%d' % len(gc.garbage)

if __name__ == '__main__':
  test_gcleak()
#---------------code end----------------

In test_gcleak(), the garbage collector's debug flags are set first, and collect() then runs a collection. Finally the script prints the number of unreachable objects found by that collection and the number of uncollectable objects held in gc.garbage.

gc.garbage is a list of objects that the garbage collector found to be unreachable (that is, garbage) but could not free (that is, could not collect). The documentation describes it as: A list of objects which the collector found to be unreachable but could not be freed (uncollectable objects).
Typically, the objects in gc.garbage are members of a reference cycle. Because Python does not know a safe order in which to call the __del__() methods of the objects in the cycle, the objects stay alive in gc.garbage and leak memory. If you do know a safe order, break the cycle yourself and then execute del gc.garbage[:] to empty the list of garbage objects.
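The inspect, break, and empty routine can be sketched as follows. This is an illustration under assumptions, written for Python 3: because Python 3.4 and later no longer leave __del__() cycles in gc.garbage, the sketch sets the gc.DEBUG_SAVEALL flag, which makes the collector append every unreachable object to gc.garbage instead of freeing it. Ring and inspect_saved_garbage() are hypothetical names.

```python
import gc


class Ring(object):
    """Hypothetical class whose instances will refer to themselves."""


def inspect_saved_garbage():
    """Collect a self-cycle into gc.garbage, break it by hand, then empty the list."""
    old_flags = gc.get_debug()
    gc.collect()                    # clear out unrelated garbage first
    del gc.garbage[:]               # note: discards anything already saved there
    gc.set_debug(gc.DEBUG_SAVEALL)  # save unreachable objects instead of freeing them
    try:
        ring = Ring()
        ring.me = ring              # reference cycle
        del ring
        gc.collect()
        found = [obj for obj in gc.garbage if isinstance(obj, Ring)]
        for obj in found:
            obj.me = None           # break the cycle manually
        return len(found)
    finally:
        gc.set_debug(old_flags)
        del gc.garbage[:]           # empty the list so the objects can be freed


print(inspect_saved_garbage())
```

Emptying the list with del gc.garbage[:] rather than rebinding the name keeps the same list object that the collector itself uses.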

The output of the test script is (the lines beginning with # are comments added by the author):

#-----------------------------------------
begin leak test...
# The reference count of the variable _gcleak is 2.
_gcleak ref count0:2
# _gcleak is now an unbound variable and can no longer be accessed.
_gcleak is invalid!
# Start garbage collection.
begin collect...
# The number of unreachable garbage objects found by this collection is 0.
unreachable object num:0
# The number of garbage objects in the entire interpreter is 0.
garbage object num:0
#-----------------------------------------

This shows that the reference count of the _gcleak object is correct, and no memory leaks occur for any objects.

If you uncomment the test_code_1 statement in make_circle_ref():

_gcleak._self = _gcleak

This makes _gcleak hold a circular reference to itself. Running the code again, the output becomes:

#-----------------------------------------
begin leak test...
_gcleak ref count0:3
_gcleak is invalid!
begin collect...
# Uncollectable garbage object found: address 012aa090, type CGcLeak.
gc: uncollectable <CGcLeak 012aa090>
gc: uncollectable <dict 012ac1e0>
unreachable object num:2
#!! The number of garbage objects that cannot be collected is 1, resulting in a memory leak!
garbage object num:1
#-----------------------------------------

As you can see, the <CGcLeak 012aa090> object has leaked. The additional dict garbage is the instance dictionary of the leaked _gcleak object; printing it gives:

{'_self': <__main__.CGcLeak object at 0x012aa090>, '_text': '##########'}

Besides self-references, circular references among multiple objects can also cause memory leaks. A simple example follows:

#---------------code begin--------------
# Replaces CGcLeak and make_circle_ref() in the script above;
# the imports and test_gcleak() stay the same.

class CGcLeakA(object):
  def __init__(self):
    self._text = '#' * 10

  def __del__(self):
    pass

class CGcLeakB(object):
  def __init__(self):
    self._text = '*' * 10

  def __del__(self):
    pass

def make_circle_ref():
  _a = CGcLeakA()
  _b = CGcLeakB()
  _a._b = _b  # test_code_2
  _b._a = _a  # test_code_3
  print 'ref count0:a=%d b=%d' % \
    (sys.getrefcount(_a), sys.getrefcount(_b))
#  _b._a = None  # test_code_4
  del _a
  del _b
  try:
    print 'ref count1:a=%d' % sys.getrefcount(_a)
  except UnboundLocalError:
    print '_a is invalid!'
  try:
    print 'ref count2:b=%d' % sys.getrefcount(_b)
  except UnboundLocalError:
    print '_b is invalid!'

#---------------code end----------------

The output after this test is:

#-----------------------------------------
begin leak test...
ref count0:a=3 b=3
_a is invalid!
_b is invalid!
begin collect...
gc: uncollectable <CGcLeakA 012aa110>
gc: uncollectable <CGcLeakB 012aa0b0>
gc: uncollectable <dict 012ac1e0>
gc: uncollectable <dict 012ac0c0>
unreachable object num:4
garbage object num:2
#-----------------------------------------

As you can see, both _a and _b have leaked. Because the two objects refer to each other in a cycle, the garbage collector does not know how to reclaim them; that is, it does not know which object's __del__() method to call first.

Use any one of the following methods to break the reference cycle and avoid the memory leak:

1. Comment out the test_code_2 statement in make_circle_ref();
2. Comment out the test_code_3 statement in make_circle_ref();
3. Uncomment the test_code_4 statement in make_circle_ref().

The corresponding output becomes:

#-----------------------------------------
begin leak test...
ref count0:a=2 b=3  # Note: the output here varies depending on which method you chose.
_a is invalid!
_b is invalid!
begin collect...
unreachable object num:0
garbage object num:0
#-----------------------------------------
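The effect of breaking the ring before deleting the names can also be observed without running the collector at all. The sketch below assumes Python 3 on CPython and mirrors the test_code_4 approach; PartA, PartB and break_ring_then_delete() are hypothetical names, and weak references serve as probes that do not keep the objects alive.

```python
import gc
import weakref


class PartA(object):
    """Hypothetical counterpart of the first class in the example above."""


class PartB(object):
    """Hypothetical counterpart of the second class in the example above."""


def break_ring_then_delete():
    """Return (a_freed, b_freed) after breaking the ring and deleting both names."""
    was_enabled = gc.isenabled()
    gc.disable()                    # rely on reference counting alone
    try:
        a = PartA()
        b = PartB()
        a._b = b                    # a refers to b
        b._a = a                    # b refers to a: the ring is closed
        probe_a = weakref.ref(a)
        probe_b = weakref.ref(b)
        b._a = None                 # break the ring, as test_code_4 does
        del a                       # a's count drops to zero, releasing its hold on b
        del b                       # b's count drops to zero as well
        return probe_a() is None, probe_b() is None
    finally:
        if was_enabled:
            gc.enable()


print(break_ring_then_delete())
```

With the ring broken, reference counting frees both objects as soon as the last name is deleted, so nothing is left for the cycle detector to find.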

Conclusion: Python's gc module is powerful. For example, setting gc.set_debug(gc.DEBUG_LEAK) lets you check for memory leaks caused by circular references. If you check for leaks during development and can make sure that the released program has none, you can gain efficiency by lengthening the interval between Python's garbage collections, or even by turning the garbage collection mechanism off altogether.
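The two tuning options mentioned in the conclusion map onto gc.set_threshold() and gc.disable(). The following is a minimal sketch, assuming Python 3; the helper names raise_gen0_threshold() and run_with_gc_disabled() and the default value 10000 are hypothetical and not recommendations.

```python
import gc


def raise_gen0_threshold(new_threshold0=10000):
    """Make automatic collections rarer; return the old thresholds for restoring."""
    old = gc.get_threshold()                    # a tuple of three thresholds
    gc.set_threshold(new_threshold0, old[1], old[2])
    return old


def run_with_gc_disabled(func):
    """Call func() with the automatic collector switched off, then restore it."""
    was_enabled = gc.isenabled()
    gc.disable()
    try:
        return func()
    finally:
        if was_enabled:
            gc.enable()
```

A caller might save the value returned by raise_gen0_threshold() and later pass it back with gc.set_threshold(*old). Disabling the collector only stops cycle detection; reference counting still frees objects that are not part of a cycle.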
