JVM GC: Garbage Marking Algorithms (II)

Source: Internet
Author: User

In the previous article, we looked at the criteria the GC uses to decide whether an object can be marked as garbage, and at the most effective and most widely used approach, reachability analysis.
Today we introduce another very common marking algorithm, one with a wide range of applications:
Reference counting
In essence, this algorithm approaches the question from the previous article, whether an object can be reclaimed, from a different angle: if no other object references the current object, the current object can be reclaimed. There are two ways to find out how many references an object has. One is to examine the other objects and see how many of them hold a reference to the current object. The other is for the current object itself to maintain a counter that tracks references from outside. The first approach is the starting idea behind reachability analysis in the previous article. The second is the starting idea behind reference counting: we do not care who holds a reference to us, only how many holders there are.
In reference counting, each object records how many references point to it. When an object's counter reaches 0, no other object holds a reference to it, so nothing can call its methods or access its fields any longer. That is the moment the object can be reclaimed.
The book "Algorithm and Implementation of Garbage Collection" gives another very interesting description of this algorithm:
Each object is like a celebrity, and its reference count is like that celebrity's popularity index. When the popularity index drops to 0, the celebrity sadly leaves the stage.
In reference counting, each object's reference counter starts at 0. Whenever a new object takes a reference to the current object, the counter is incremented by 1. Whenever an object that holds a reference disappears or drops that reference, the counter is decremented by 1. When the counter reaches 0 again, the object it belongs to is reclaimed (more precisely, the object is linked into the free list, which marks the memory it occupies as available for reallocation). (Anti-theft notice: this article was first published at http://www.cnblogs.com/jilodream/)
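The counting rule above can be illustrated with a minimal Python sketch. The names Obj, inc_ref_cnt, dec_ref_cnt, and free_list are hypothetical and chosen only for this illustration; a real collector would manage raw memory chunks instead of Python objects.

```python
class Obj:
    """A heap object with an explicit reference counter (illustrative only)."""

    def __init__(self, name):
        self.name = name
        self.ref_cnt = 0  # the counter starts at 0


free_list = []  # reclaimed objects whose memory may be reallocated


def inc_ref_cnt(obj):
    # Called when some holder takes a reference to obj.
    obj.ref_cnt += 1


def dec_ref_cnt(obj):
    # Called when a holder disappears or drops its reference to obj.
    obj.ref_cnt -= 1
    if obj.ref_cnt == 0:
        free_list.append(obj)  # reclaimed at once, no global scan needed


x = Obj("x")
inc_ref_cnt(x)  # first holder
inc_ref_cnt(x)  # second holder
dec_ref_cnt(x)  # one holder lets go, x is still in use
still_in_use = x not in free_list
dec_ref_cnt(x)  # the last holder lets go, x goes onto the free list
```

After the last call, x sits on free_list with a counter of 0, which is the moment the text describes as reclamation.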
Next, the pros and cons of this algorithm.
Advantages:
1. Garbage is reclaimed as soon as it appears
When a counter drops to 0, the object is immediately placed on the free list as memory that can be reallocated. Other GC algorithms, such as the reachability analysis discussed earlier, need a global collection cycle to clean up all garbage known at that point.
2. The maximum pause time is short
The reference count is updated each time an object is created, destroyed, or has a reference changed, so each update interrupts the program only very briefly.
Other GC algorithms must sweep or copy in a single pass, so their pauses are longer and affect the program for longer. Sometimes the pause is too long for the application to tolerate, and continual tuning is needed to reduce the maximum length of a single pause.
3. The core idea is simple
Reference counting does not need to traverse the object graph from the root nodes. Each object only cares about whether the objects that directly reference it have changed. So when an object is reclaimed, only the objects it directly references are affected, not objects further down the path.
Suppose A holds B and C, and B holds D and E. When A is reclaimed, only the reference counters of B and C are affected. Only when B's counter drops to 0 are D and E affected. The overall computational cost is low. Other GC methods have to traverse from the roots, and the whole process is more complex (for example, with a mesh of references it is especially important to cut off pointless traversal).
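The A/B/C/D/E example can be traced with a short Python sketch. The helper names are hypothetical, and two details are assumptions added for the illustration: A is held by a single root reference, and C is also held from somewhere else so that it survives.

```python
class Obj:
    def __init__(self, name):
        self.name = name
        self.ref_cnt = 0
        self.children = []  # objects this object holds references to


reclaimed = []  # names of reclaimed objects, in reclamation order


def add_ref(holder, target):
    holder.children.append(target)
    target.ref_cnt += 1


def dec_ref_cnt(obj):
    obj.ref_cnt -= 1
    if obj.ref_cnt == 0:
        reclaimed.append(obj.name)
        # Only the directly referenced objects are touched here. Deeper
        # objects are reached only if a child's own counter hits 0.
        for child in obj.children:
            dec_ref_cnt(child)
        obj.children.clear()


a, b, c, d, e = (Obj(n) for n in "ABCDE")
a.ref_cnt = 1    # assumption: one reference from a root
add_ref(a, b)
add_ref(a, c)
add_ref(b, d)
add_ref(b, e)
c.ref_cnt += 1   # assumption: C is also held from elsewhere

dec_ref_cnt(a)   # the root drops A
```

Reclaiming A decrements only B and C. B reaches 0, so D and E are decremented and reclaimed in turn, while C keeps a count of 1 and nothing below it would be visited.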
Disadvantages:
1. Counters are updated very frequently
Each executed instruction may change several reference counts. The counts of objects referenced from root nodes change especially often. The bookkeeping workload is therefore heavy.
2. Counters take up a lot of memory
Every object needs its own reference counter. Storing it as an unsigned type saves only a little space. In the extreme case every other object could hold a reference to one object, so the maximum value the counter can represent must be very large, and the space it occupies is correspondingly large. This is a typical space-for-time trade-off.
3. The implementation is complex, and mistakes have serious consequences
Although the idea behind the algorithm is simple, implementing it is more complicated. Whenever an object is reclaimed, the counters of every object it references must be updated. A single missed update can cause a memory leak that is never recovered.
4. Circular references
This can be regarded as the hardest problem for reference counting: when two or more objects reference each other but are not referenced from anywhere outside, they form something like an isolated island. Every counter in the group stays above 0, so the GC can never reclaim the memory. Plain reference counting cannot resolve this case; it usually has to be combined with another collection algorithm.
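The circular reference problem can be observed directly in Python. The sketch below assumes the CPython interpreter, which uses reference counting as its primary mechanism and adds a separate cycle detector in the gc module. The Node class and the cycle_demo function are hypothetical names for this illustration.

```python
import gc
import weakref


class Node:
    """Plain object that can hold a reference to another Node."""

    def __init__(self):
        self.other = None


def cycle_demo():
    gc.disable()  # switch off the automatic cycle detector
    try:
        a, b = Node(), Node()
        a.other, b.other = b, a   # a and b now reference each other
        probe = weakref.ref(a)    # weak reference, does not keep a alive
        del a, b                  # drop every reference from outside
        alive_after_del = probe() is not None
        gc.collect()              # run the cycle detector explicitly
        alive_after_collect = probe() is not None
        return alive_after_del, alive_after_collect
    finally:
        gc.enable()


result = cycle_demo()
```

With reference counting alone, the pair stays alive after the outside references are gone, because each object still holds a count of 1 from the other. Only the additional tracing pass in gc.collect() reclaims it.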

To address these shortcomings, a number of improved algorithms have been proposed.
1. Deferred reference counting
Deferred (adj.): postponed, delayed.
References from the root objects are updated so frequently that maintaining their counters is very costly. Hence a special idea: do not count references that come from the roots at all. An object referenced only from the roots therefore has a counter of 0, while references between other objects are counted in the normal way. The problem is that a counter of 0 no longer tells the GC whether an object is reclaimable. So an object whose counter drops to 0 (in the decr_ref_cnt function) is temporarily placed in a container, and its reclamation is deferred. This container is called the ZCT (Zero Count Table); it records the objects whose counters have been decremented to 0.

So when are the garbage objects actually reclaimed?
In general, when we create an object (the new_obj function) and find that there is no free memory to allocate, a ZCT scan (the scan_zct function) is performed to clean up these objects. Allocation is then retried, and if it still fails, the situation is treated as out of memory.
The steps of the ZCT scan (scan_zct function) are as follows:
(1) First, the counters of objects referenced from the root are incremented so that they hold their true values;
(2) Then the ZCT is traversed, and each object whose counter is 0 is reclaimed (the delete(obj) function) and placed on the free list described earlier. (The space of these objects can then be used for new objects.)
(3) Finally, all the counters adjusted for root references are decremented back.
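The three scan steps can be put together in a small Python sketch. The function names new_obj, decr_ref_cnt, scan_zct, and delete follow the text. Everything else is an assumption made for the illustration: the Heap class, the fixed slot capacity, the freed list, the work list used inside the scan, and the choice to enter every new object into the ZCT because its counter starts at 0.

```python
class Obj:
    def __init__(self, name):
        self.name = name
        self.ref_cnt = 0      # counts references from other objects only
        self.children = []    # objects this object references


class Heap:
    """Toy heap with a fixed number of slots (an assumption for the sketch)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.live = []    # allocated objects
        self.roots = []   # root references, deliberately not counted
        self.zct = []     # Zero Count Table
        self.freed = []   # names of reclaimed objects, for inspection

    def new_obj(self, name):
        if len(self.live) >= self.capacity:
            self.scan_zct()                      # no free space: scan first
            if len(self.live) >= self.capacity:
                raise MemoryError("out of memory")
        obj = Obj(name)
        self.live.append(obj)
        self.zct.append(obj)                     # a new object has count 0
        return obj

    def add_ref(self, holder, target):
        holder.children.append(target)
        target.ref_cnt += 1

    def remove_ref(self, holder, target):
        holder.children.remove(target)
        self.decr_ref_cnt(target)

    def decr_ref_cnt(self, obj):
        obj.ref_cnt -= 1
        if obj.ref_cnt == 0 and obj not in self.zct:
            self.zct.append(obj)                 # defer instead of freeing

    def delete(self, obj, work):
        self.live.remove(obj)
        self.freed.append(obj.name)
        for child in obj.children:
            child.ref_cnt -= 1
            if child.ref_cnt == 0:
                work.append(child)
        obj.children.clear()

    def scan_zct(self):
        for r in self.roots:                     # (1) count root references
            r.ref_cnt += 1
        work = [o for o in self.zct if o.ref_cnt == 0]
        while work:                              # (2) reclaim real zeros
            self.delete(work.pop(), work)
        for r in self.roots:                     # (3) undo step (1)
            r.ref_cnt -= 1
        # After a scan, the only zero-count survivors are root-held objects.
        self.zct = [r for r in dict.fromkeys(self.roots) if r.ref_cnt == 0]


heap = Heap(capacity=3)
a = heap.new_obj("A")
heap.roots.append(a)          # A is held from a root
b = heap.new_obj("B")
heap.add_ref(a, b)
heap.new_obj("C")             # temporary object, never stored anywhere
heap.remove_ref(a, b)         # B's count drops to 0, reclamation is deferred
d = heap.new_obj("D")         # heap is full, so this triggers scan_zct
```

In this run B and C stay allocated until the heap is full. The scan then reclaims both, keeps A because a root holds it, and leaves room for D.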

The advantage of this algorithm is that it drops the frequent, tedious, and time-consuming counter updates for references from the roots, which greatly reduces the load on the processor.
Of course, its shortcomings are also obvious: memory is no longer reclaimed immediately. Only when memory runs short is the ZCT scanned and garbage reclaimed in one batch. The time a scan takes grows with the size of the ZCT, which increases the GC's maximum pause time. The maximum pause time can be reduced by making the ZCT smaller, but then the GC has to scan the ZCT more often (you cannot have both space and time), which lowers the throughput of memory reclamation.
2. Sticky reference counting
Sticky (adj.): adhesive; staying in place.
As mentioned above, counters take up a lot of space so that they keep working even in extreme cases. This algorithm does exactly the opposite: it shrinks the space occupied by the counter (and with it the upper limit of the count). For most objects the counter never reaches a large value, and the objects are reclaimed quickly. Giving every object a very wide counter just to guarantee that none ever overflows is wasteful.
For the few objects whose counters do overflow:
(1) Simply let them overflow.
An object referenced by that many others is unlikely ever to be reclaimed, so it can be treated as immortal by default.
(2) If leaving overflowed counters alone is not acceptable, a variant of tracing from the roots can be added. The general idea is:
<1> Reset all counters to 0
<2> Trace from the roots and increment the counter of each object reached
<3> Reclaim the objects whose counters are still 0
The benefits of this algorithm are:
<1> It reduces the space the counters occupy
<2> It can also clean up circular references
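A saturating counter and the recount from the roots can be sketched in Python as follows. MAX_COUNT, recount_from_roots, and the other names are hypothetical. The limit of 3 stands for an assumed 2-bit counter.

```python
MAX_COUNT = 3  # assumption: a 2-bit counter, so 3 is the largest value


class Obj:
    def __init__(self, name):
        self.name = name
        self.ref_cnt = 0
        self.children = []


def inc_ref_cnt(obj):
    if obj.ref_cnt < MAX_COUNT:
        obj.ref_cnt += 1          # saturates at MAX_COUNT instead of wrapping


def dec_ref_cnt(obj):
    """Return True when the object has become reclaimable."""
    if obj.ref_cnt == MAX_COUNT:
        return False              # stuck: the true count is unknown
    obj.ref_cnt -= 1
    return obj.ref_cnt == 0


def recount_from_roots(heap, roots):
    for obj in heap:              # <1> reset every counter to 0
        obj.ref_cnt = 0
    stack = list(roots)
    while stack:                  # <2> count each reference reached
        obj = stack.pop()
        first_visit = obj.ref_cnt == 0
        inc_ref_cnt(obj)
        if first_visit:
            stack.extend(obj.children)
    return [o for o in heap if o.ref_cnt == 0]   # <3> unreached objects


r, s, x, y = Obj("r"), Obj("s"), Obj("x"), Obj("y")
for _ in range(5):                # r references s five times
    r.children.append(s)
    inc_ref_cnt(s)
x.children.append(y)              # x and y form an unreachable cycle
inc_ref_cnt(y)
y.children.append(x)
inc_ref_cnt(x)

stuck_value = s.ref_cnt           # saturated at MAX_COUNT
freed_by_dec = dec_ref_cnt(s)     # has no effect on a stuck counter
garbage = [o.name for o in recount_from_roots([r, s, x, y], [r])]
```

The recount treats r as the only root. It leaves s saturated again and reports x and y as garbage, which shows why the text lists cleaning up circular references as a side benefit.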
3. 1-bit reference counting
This algorithm is the extreme form of sticky reference counting: the counter has only 1 bit. As soon as an object is shared, the counter "overflows".
Extreme as it is, it reflects how much of memory behaves: most objects are held by only one other object after creation and are never shared by several.
In this algorithm the counter is more like a flag: it records whether the object is referenced by one object or by several, rather than counting references.
4. Partial mark-sweep
This algorithm exists purely to solve the circular reference problem. The idea is to classify objects into four states:
a. objects that are definitely not garbage;
b. objects that are definitely garbage;
c. objects that may be part of a circular reference;
d. objects that have already been searched.
Different collection strategies can then be applied depending on an object's state. Most objects in memory are not involved in circular references, so they are still handled by plain reference counting. Only a small number of objects may be part of a cycle, and only those are searched to check whether they are still reachable.
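One common way to realize these four states is trial deletion over colored objects. The Python sketch below follows that formulation under stated assumptions: the color names BLACK, WHITE, HATCH, and GRAY stand for states a to d, and all function names are hypothetical. It is a simplified illustration, not a complete collector.

```python
from collections import deque

# Assumed color names for states a-d: not garbage, garbage,
# possibly in a cycle, already searched.
BLACK, WHITE, HATCH, GRAY = "black", "white", "hatch", "gray"


class Obj:
    def __init__(self, name):
        self.name = name
        self.ref_cnt = 0
        self.color = BLACK
        self.children = []


hatch_queue = deque()  # candidates that may belong to a garbage cycle
reclaimed = []


def add_ref(holder, target):
    holder.children.append(target)
    target.ref_cnt += 1


def delete(obj):
    obj.color = BLACK  # a stale queue entry for obj will be skipped
    children, obj.children = obj.children, []
    for child in children:
        dec_ref_cnt(child)
    reclaimed.append(obj.name)


def dec_ref_cnt(obj):
    obj.ref_cnt -= 1
    if obj.ref_cnt == 0:
        delete(obj)
    elif obj.color != HATCH:
        obj.color = HATCH          # count is not 0: it might sit in a cycle
        hatch_queue.append(obj)


def paint_gray(obj):
    # Trial deletion: remove the references held inside the candidate graph.
    if obj.color in (BLACK, HATCH):
        obj.color = GRAY
        for child in obj.children:
            child.ref_cnt -= 1
            paint_gray(child)


def paint_black(obj):
    # Still referenced from outside: restore the counts taken away above.
    obj.color = BLACK
    for child in obj.children:
        child.ref_cnt += 1
        if child.color != BLACK:
            paint_black(child)


def scan_gray(obj):
    if obj.color == GRAY:
        if obj.ref_cnt > 0:
            paint_black(obj)
        else:
            obj.color = WHITE
            for child in obj.children:
                scan_gray(child)


def collect_white(obj):
    if obj.color == WHITE:
        obj.color = BLACK
        for child in obj.children:
            collect_white(child)
        obj.children = []
        reclaimed.append(obj.name)


def scan_hatch_queue():
    while hatch_queue:
        obj = hatch_queue.popleft()
        if obj.color == HATCH:
            paint_gray(obj)
            scan_gray(obj)
            collect_white(obj)


a, b, s = Obj("A"), Obj("B"), Obj("S")
a.ref_cnt = 1        # assumption: one root reference to A
add_ref(a, b)
add_ref(b, a)        # A and B form a cycle
s.ref_cnt = 2        # assumption: S is held by two roots
dec_ref_cnt(a)       # the root drops A, which becomes a candidate
dec_ref_cnt(s)       # one root drops S, which also becomes a candidate
scan_hatch_queue()
```

Only the two candidates are searched. The A and B cycle is reclaimed, while S is painted black again and keeps its remaining count of 1.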
Although reference counting has many drawbacks, this collection algorithm is still used in many places: Python's GC, memory management in Flash Player, and so on. Unfortunately, because of circular references and the other problems above, mainstream JVMs have so far not adopted reference counting as their GC algorithm.

For more on the GC marking algorithms mentioned in this article and how they are implemented, see the book mentioned earlier, "Algorithm and Implementation of Garbage Collection". It is very detailed and its illustrations are vivid. Interested readers may want to look for a copy.
