Understanding Python's garbage collection mechanism
I. Garbage Collection Mechanism
In Python, garbage collection relies primarily on reference counting, supplemented by generational collection. The weakness of reference counting is that it cannot handle circular references.
When an object's reference count drops to 0, the Python virtual machine reclaims that object's memory.
# encoding=utf-8
__author__ = 'kevinlu1010@qq.com'

class ClassA():
    def __init__(self):
        print 'object born,id:%s' % str(hex(id(self)))

    def __del__(self):
        print 'object del,id:%s' % str(hex(id(self)))

def f1():
    while True:
        c1 = ClassA()
        del c1
Calling f1() prints the following output over and over, and the memory used by the process stays constant.
object born,id:0x237cf58
object del,id:0x237cf58
c1 = ClassA() creates an object stored at memory address 0x237cf58, and the variable c1 points to it. At this point the object's reference count is 1.
After del c1, the variable c1 no longer points to 0x237cf58, so the object's reference count drops by 1 to 0. The object is therefore destroyed and its memory is released.
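The create-then-delete cycle described above can be sketched in a bounded, testable form. This is only an illustration in Python 3 syntax (the article's own code is Python 2); `Tracked` and `f1_bounded` are hypothetical names that stand in for ClassA and f1, and events are recorded in a list instead of being printed. It assumes CPython, where an object is destroyed as soon as its reference count reaches 0.

```python
class Tracked:
    """Hypothetical stand-in for ClassA that records events instead of printing."""
    events = []

    def __init__(self):
        Tracked.events.append('born')

    def __del__(self):
        Tracked.events.append('del')


def f1_bounded(rounds=3):
    """Bounded variant of f1: create an object, then drop its only reference."""
    for _ in range(rounds):
        c1 = Tracked()   # reference count of the new object is 1
        del c1           # count drops to 0, so __del__ runs right away


f1_bounded()
print(Tracked.events)
```

Each 'born' entry is followed directly by a 'del' entry, which matches the alternating output shown above.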
1. Cases where the reference count increases by 1
- The object is created, for example, a = 23
- The object is referenced, for example, b = a
- The object is passed into a function as an argument, for example, func(a)
- The object is stored in a container as an element, for example, list1 = [a, a]
2. Cases where the reference count decreases by 1
- An alias of the object is explicitly destroyed, for example, del a
- An alias of the object is assigned a new object, for example, a = 24
- The object leaves its scope, for example, the local variables of func when func finishes executing (global variables do not)
- The container where the object is located is destroyed or the object is deleted from the container
Demo
import sys

def func(c):
    print 'in func function', sys.getrefcount(c) - 1

print 'init', sys.getrefcount(11) - 1
a = 11
print 'after a=11', sys.getrefcount(11) - 1
b = a
print 'after b=1', sys.getrefcount(11) - 1
func(11)
print 'after func(a)', sys.getrefcount(11) - 1
list1 = [a, 12, 14]
print 'after list1=[a,12,14]', sys.getrefcount(11) - 1
a = 12
print 'after a=12', sys.getrefcount(11) - 1
del a
print 'after del a', sys.getrefcount(11) - 1
del b
print 'after del b', sys.getrefcount(11) - 1
# list1.pop(0)
# print 'after pop list1',sys.getrefcount(11)-1
del list1
print 'after del list1', sys.getrefcount(11) - 1
Output
init 24
after a=11 25
after b=1 26
in func function 28
after func(a) 26
after list1=[a,12,14] 27
after a=12 26
after del a 26
after del b 25
after del list1 24
Question: Why does calling a function increase the reference count by 2?
3. Viewing the reference count of an object
sys.getrefcount(a) returns the reference count of object a, but the result is 1 higher than the actual count, because passing a as the argument to getrefcount temporarily adds one reference.
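As a rough illustration of the rules above, the following sketch tracks how sys.getrefcount changes for a freshly created object. It is written in Python 3 syntax and assumes CPython; `Node` is a hypothetical class introduced only for this example. A fresh object is used instead of the integer 11 because small integers are shared by the interpreter and, on recent CPython versions, may report very different counts. Only differences between readings are compared, so the extra temporary reference held by getrefcount cancels out.

```python
import sys


class Node:
    """Hypothetical empty class used only for this illustration."""


obj = Node()
base = sys.getrefcount(obj)             # baseline reading

alias = obj                             # a second name for the same object: +1
after_alias = sys.getrefcount(obj)

container = [obj, obj]                  # stored twice in a list: +2
after_container = sys.getrefcount(obj)

del alias                               # alias destroyed: -1
container.clear()                       # removed from the container: -2
after_cleanup = sys.getrefcount(obj)

print(after_alias - base, after_container - base, after_cleanup - base)
```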
II. Memory leaks caused by circular references
def f2():
    while True:
        c1 = ClassA()
        c2 = ClassA()
        c1.t = c2
        c2.t = c1
        del c1
        del c2
When f2() runs, the memory used by the process keeps growing.
object born,id:0x237cf30
object born,id:0x237cf58
After c1 and c2 are created, the objects at 0x237cf30 (the memory for c1, called memory 1 below) and 0x237cf58 (the memory for c2, called memory 2 below) each have a reference count of 1. After c1.t = c2 and c2.t = c1 run, both reference counts become 2.
After del c1, the reference count of the object in memory 1 drops to 1. Because it is not 0, that object is not destroyed, so the reference count of the object in memory 2 is still 2. After del c2, likewise, the objects in memory 1 and memory 2 each have a reference count of 1.
Although neither object can be reached any more and both should be destroyed, the circular reference prevents the garbage collector from reclaiming them, which leaks memory.
III. Garbage Collection
import gc
import time

def f3():
    # print gc.collect()
    c1 = ClassA()
    c2 = ClassA()
    c1.t = c2
    c2.t = c1
    del c1
    del c2
    print gc.garbage
    print gc.collect()  # explicitly run garbage collection
    print gc.garbage
    time.sleep(10)

if __name__ == '__main__':
    gc.set_debug(gc.DEBUG_LEAK)  # enable debug logging for the gc module
    f3()
Output:
gc: uncollectable <ClassA instance at 0230E918>
gc: uncollectable <ClassA instance at 0230E940>
gc: uncollectable <dict 0230B810>
gc: uncollectable <dict 02301ED0>
object born,id:0x230e918
object born,id:0x230e940
4
- After a collection, the garbage objects are placed in the gc.garbage list.
- gc.collect() returns the number of unreachable objects. The 4 here is the two objects plus their corresponding dicts.
- Garbage collection is triggered in three cases:
1. gc.collect() is called.
2. A counter of the gc module reaches its threshold.
3. The program exits.
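The first trigger, an explicit gc.collect() call, can be observed with a small sketch. This is an illustration in Python 3 syntax under CPython, not the article's Python 2 code: `Node` and `make_cycle` are hypothetical names, the class deliberately has no __del__ method, and a weak reference from the standard weakref module is used only as a probe to see whether the object is still alive.

```python
import gc
import weakref


class Node:
    """Hypothetical class used only for this illustration (no __del__)."""


def make_cycle():
    """Build two objects that reference each other, then drop both names."""
    a = Node()
    b = Node()
    a.t = b
    b.t = a
    return weakref.ref(a)   # a weak reference does not keep the object alive


gc.disable()                            # keep automatic collection out of the picture
probe = make_cycle()
alive_before = probe() is not None      # the cycle keeps both objects alive
collected = gc.collect()                # explicit full collection
alive_after = probe() is not None       # the cycle has been reclaimed
gc.enable()

print(alive_before, alive_after)
```

The return value of gc.collect() is at least 2 here (the two Node objects); whether their attribute dicts are counted separately depends on the interpreter version.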
IV. Analysis of common functions of the gc module
The gc module provides an interface for developers to configure garbage collection. As mentioned above, a weakness of reference counting as a memory management method is circular references, and one of the main jobs of the gc module is to solve that problem.
Common functions:
1. gc.set_debug(flags)
Sets the gc debug flags; generally set to gc.DEBUG_LEAK.
2. gc.collect([generation])
Runs an explicit garbage collection. It takes an optional argument: 0 checks only first-generation objects, 1 checks first- and second-generation objects, and 2 checks first-, second-, and third-generation objects. If no argument is passed, a full collection is run, which is equivalent to passing 2.
Returns the number of unreachable objects.
3. gc.set_threshold(threshold0[, threshold1[, threshold2]])
Set the frequency of automatic garbage collection.
4. gc.get_count()
Returns the current automatic-collection counters as a tuple of length 3.
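A minimal sketch of how these functions might be called together, in Python 3 syntax. The threshold values (1000, 15, 15) are arbitrary numbers chosen for illustration, not a recommendation, and gc.isenabled() and gc.get_threshold() are standard gc functions used here to inspect the current state and restore it afterwards.

```python
import gc

enabled = gc.isenabled()            # whether automatic collection is switched on
original = gc.get_threshold()       # current thresholds as a 3-tuple

gc.set_threshold(1000, 15, 15)      # arbitrary example values
updated = gc.get_threshold()

gc.set_threshold(*original)         # restore the previous configuration
restored = gc.get_threshold()

counts = gc.get_count()             # current counters as a 3-tuple
print(enabled, original, updated, counts)
```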
5. Automatic garbage collection mechanism of the gc module
Automatic garbage collection runs only while it is enabled, that is, while gc.isenabled() returns True.
This mechanism is mainly used to find and dispose of unreachable garbage objects.
Garbage collection = garbage detection + garbage reclamation
Python uses generational collection. Objects are divided into three generations. Every object starts in the first generation when it is created. If an object survives a first-generation garbage check, it is moved to the second generation; likewise, if it survives a second-generation garbage check, it is moved to the third generation.
The gc module keeps a list of three counters, which can be read with gc.get_count().
For example, in (488, 3, 0), 488 is the number of allocations made by Python minus the number of deallocations since the last first-generation garbage check. Note that this counts memory allocations, not increases in reference counts. For example:
print gc.get_count()  # (590, 8, 0)
a = ClassA()
print gc.get_count()  # (591, 8, 0)
del a
print gc.get_count()  # (590, 8, 0)
3 is the number of first-generation garbage checks since the last second-generation check. Similarly, 0 is the number of second-generation garbage checks since the last third-generation check.
The gc module has thresholds for automatic garbage collection: the 3-tuple returned by gc.get_threshold(), for example (700, 10, 10).
Each time a counter increases, the gc module checks whether the count has reached its threshold. If it has, the gc module runs a garbage check on the corresponding generations and resets the counters.
For example, assume the thresholds are (700, 10, 10):
- When the counters go from (699, 3, 0) to (700, 3, 0), the gc module runs gc.collect(0), that is, checks the first generation for garbage, and resets the counters to (0, 4, 0)
- When the counters go from (699, 9, 0) to (700, 9, 0), the gc module runs gc.collect(1), that is, checks the first and second generations for garbage, and resets the counters to (0, 0, 1)
- When the counters go from (699, 9, 9) to (700, 9, 9), the gc module runs gc.collect(2), that is, checks all three generations for garbage, and resets the counters to (0, 0, 0)
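The counter bookkeeping in the three cases above can be written down as a toy model. This is only a sketch of the simplified description given in the text, not CPython's actual implementation (whose exact comparison and extra heuristics may differ); `simulate_allocation` is a hypothetical helper, and the default thresholds (700, 10, 10) are an assumption.

```python
def simulate_allocation(counts, thresholds=(700, 10, 10)):
    """Toy model of one allocation, following the description in the text.

    Returns (new_counts, generation), where generation is the argument that
    would be passed to gc.collect(), or None if no collection is triggered.
    """
    t0, t1, t2 = thresholds
    c0, c1, c2 = counts
    c0 += 1                              # one more allocation
    if c0 < t0:
        return (c0, c1, c2), None        # first threshold not reached yet
    if c1 + 1 < t1:
        # Only the first generation is checked; the second counter
        # records one more first-generation check.
        return (0, c1 + 1, c2), 0
    if c2 + 1 < t2:
        # First and second generations are checked together; the third
        # counter records one more second-generation check.
        return (0, 0, c2 + 1), 1
    # All three generations are checked and every counter is reset.
    return (0, 0, 0), 2


print(simulate_allocation((699, 3, 0)))
print(simulate_allocation((699, 9, 0)))
print(simulate_allocation((699, 9, 9)))
```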
Others
If both objects in a circular reference define the __del__ method, the gc module will not destroy these unreachable objects, because it does not know which object's __del__ method should be called first. To be safe, the gc module puts the objects in gc.garbage but does not destroy them.
V. Application
- Avoid circular references in projects
- Import the gc module and keep its automatic collection enabled so that circular references are cleaned up.
- Because collection is generational, manage long-lived variables centrally and get them into the second generation as early as possible to reduce the cost of GC checks.
- The one case the gc module cannot handle is a circular reference between classes that define the __del__ method, so avoid defining __del__ in your projects. If you must define it and it ends up in a circular reference, your code must explicitly call __del__ on the objects in gc.garbage to break the deadlock.
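One possible way to follow the first recommendation, avoiding circular references, is to make back-references weak. The sketch below is an assumption about how that could look, not something the article prescribes: it uses the standard weakref module, Python 3 syntax, and hypothetical `Parent` and `Child` classes. It relies on CPython destroying an object as soon as its reference count reaches 0.

```python
import gc
import weakref


class Parent:
    """Hypothetical owner object that holds strong references to its children."""

    def __init__(self):
        self.children = []


class Child:
    """Hypothetical child that points back to its parent only weakly."""

    def __init__(self, parent):
        self._parent = weakref.ref(parent)   # does not raise the parent's count

    @property
    def parent(self):
        return self._parent()                # None once the parent is gone


gc.disable()                                 # show that no cycle collection is needed
p = Parent()
c = Child(p)
p.children.append(c)

probe = weakref.ref(p)
del p                                        # last strong reference to the parent
freed_without_gc = probe() is None           # reference counting alone freed it
parent_seen_by_child = c.parent
gc.enable()

print(freed_without_gc, parent_seen_by_child)
```

Because the child never holds a strong reference to the parent, no cycle forms, and the parent is released by reference counting alone.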