Thoughts on python garbage collection mechanism and python garbage collection mechanism
I. Preface
Python is a high-level language, which is similar to a natural language. It is very convenient and convenient during development because Python has done a lot of things for us silently, one of which is garbage collection, to solve the problems of memory management and Memory leakage.
Memory leakage: when the program keeps running, some objects do not work, but the memory occupied is not released. As the server memory becomes fewer and fewer, the system crashes, therefore, memory leakage is a major concern.
Ii. Reference count
To mark an object in Python, you can use reference to count it. In the following cases, the Count of this object is + 1:
1. At Creation
2. When referenced
3. When passing in a function as a parameter
On the contrary, the following scenario is the count of this object-1:
1. del
2. rereference
3. Function execution completed
You can view the Count of an element through sys. getrefcount (). When the reference count is 0, the memory is released.
It can be thought that, compared with other garbage collection methods, Python has obvious advantages in terms of real-time performance. Python's gc module is an open interface for management.
It is also easy to guess that the disadvantage is that the performance is relatively low. After reading this report, instagram has improved the performance by 10% by disabling the gc module!
Iii. circular reference
In a special case, when two or more variables reference each other cyclically, the reference mechanism by Count cannot be processed.
a = []b = []a.append(b)b.append(a)print(a,b)
The reference count of a and B is 2, and the two memories cannot be recycled.
Iv. Solutions
1. Solve the Problem of loop calling through "mark-clear:
The garbage collector regularly searches for such loop calls and clears them.
Specifically, you can start searching from the root object set copy. The object count is not 0 and is not cleared.
Then, the detection is divided into reachable objects and inaccessible objects. The underlying layer is implemented through the data structure of the linked list, and the operation copy is used to clear the mark without affecting the original data, judge whether it is a loop call
Finally, the inaccessible objects are cleared and the memory is released, which is less efficient.
Garbage collection is triggered in three cases:
1. Callgc.collect()
,
2. When the counter of the gc module reaches the threshold value.
3. When the program exits
2. Use the "space-for-time" policy to improve efficiency in generation-based recovery:
Some memory blocks have a very short life cycle from the beginning to the end, so it is a waste to recycle them,
All objects are divided into zero generations. Python has three generations by default, and one generation is a linked list.
Objects in the young generation are preferentially processed, and the more spam is processed, the more old the qualification will rise and will be placed in the second generation.
Note:
Python's garbage collection mechanism determines whether to proceed by checking whether the quantity reaches the threshold.
In Python, the source code is written in c and cannot be understood for the moment. I will try again later to understand the linked list structure,
The gc module remains for future research.