Java theory and practice: Garbage collection in the 1.4.1 JVM

Last month, we looked at the classic garbage collection techniques: reference counting, copying, mark-sweep, and mark-compact. Each of these has advantages and disadvantages under certain conditions. For example, copying does well when a large proportion of objects are garbage, but does poorly when there are many long-lived objects (it copies them repeatedly). Conversely, mark-compact does well with long-lived objects (it copies them only once), but not so well with many short-lived objects. The technique used by JVM 1.2 and later is called generational garbage collection. It combines these two techniques to get the strengths of both, and as a result object allocation overhead is very low.

Old objects and young objects

In any application heap, some objects become garbage soon after they are created, while others remain alive for the entire run of the program. Empirical studies show that for most object-oriented languages, including the Java language, the vast majority of objects, as many as 98 percent depending on your metric for object youth, die young. An object's age can be measured in wall-clock seconds, in total bytes allocated by the memory management subsystem since the object was allocated, or in the number of garbage collections since the object was allocated. But no matter how you measure it, the studies show the same thing: most objects die young. This fact is significant for the choice of collector. In particular, when most objects die young, a copying collector can perform quite well, because a copying collector does not visit dead objects at all. It simply copies the live objects to another heap region and then reclaims all of the remaining space in one step.

Most of the objects that survive their first garbage collection go on to become long-lived or permanent objects. Depending on the mix of short-lived and long-lived objects, the performance of the various garbage collection strategies can differ greatly. A copying collector works well when most objects die young, because objects that die young never need to be copied. However, a copying collector handles long-lived objects poorly, copying them back and forth from one semi-space to the other. Conversely, a mark-compact collector does well with long-lived objects, because long-lived objects tend to accumulate at the bottom of the heap and then do not need to be copied again. Mark-sweep and mark-compact collectors, however, do a lot of extra work examining dead objects, because they must examine every object in the heap during the sweep phase.
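The trade-off described above can be made concrete with a deliberately crude cost model. The sketch below only counts how many objects each kind of collector has to touch; the class `ToyCollectorCost` and its methods are hypothetical names invented for this illustration and say nothing about how a real JVM measures or performs collection.

```java
/** Crude cost model only: counts object visits, not real collector work. */
final class ToyCollectorCost {

    /** Number of live objects in a simulated heap (true = live, false = dead). */
    private static int liveCount(boolean[] heap) {
        int n = 0;
        for (boolean live : heap) {
            if (live) {
                n++;
            }
        }
        return n;
    }

    /** A copying collection touches only the live objects it copies. */
    static int copyingVisits(boolean[] heap) {
        return liveCount(heap);
    }

    /**
     * A mark-sweep collection touches live objects while marking and then
     * examines every object in the heap, live or dead, while sweeping.
     */
    static int markSweepVisits(boolean[] heap) {
        return liveCount(heap) + heap.length;
    }
}
```

Under this model, a heap of 100 objects with 2 survivors costs 2 visits for the copying collector and 102 for mark-sweep. The model ignores the cost of the copy itself, which is what hurts a copying collector when most objects are long-lived.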

Generational collection

A generational collector divides the heap into multiple generations. Objects are created in the young generation, and objects that meet certain promotion criteria, such as having survived a certain number of collections, are promoted to the next older generation. A generational collector is free to use a different collection strategy for each generation and to collect the generations separately.
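As a rough illustration of age-based promotion, here is a minimal sketch of a two-generation heap. Everything in it is an assumption made for the example: the class `ToyGenerations`, the `live` flag that stands in for reachability, and the threshold `PROMOTION_AGE`. It is not how any particular JVM implements promotion.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Toy model of age-based promotion between two generations. */
final class ToyGenerations {

    /** A simulated heap object; the 'live' flag stands in for reachability. */
    static final class Obj {
        final String name;
        boolean live = true;
        int age = 0; // number of minor collections survived so far

        Obj(String name) {
            this.name = name;
        }
    }

    /** Hypothetical promotion criterion: survive this many minor collections. */
    static final int PROMOTION_AGE = 2;

    final List<Obj> young = new ArrayList<>();
    final List<Obj> old = new ArrayList<>();

    /** New objects always start in the young generation. */
    Obj allocate(String name) {
        Obj o = new Obj(name);
        young.add(o);
        return o;
    }

    /**
     * Collects only the young generation. A real copying collector would
     * never visit dead objects; iterating over them here is a simplification.
     */
    void minorCollect() {
        Iterator<Obj> it = young.iterator();
        while (it.hasNext()) {
            Obj o = it.next();
            if (!o.live) {
                it.remove();       // dead object: its space is reclaimed
            } else if (++o.age >= PROMOTION_AGE) {
                it.remove();       // old enough: promote to the old generation
                old.add(o);
            }
        }
    }
}
```

An object allocated with `allocate` and left live stays in `young` after the first `minorCollect()` and moves to `old` after the second, while an object whose `live` flag is cleared disappears at the next minor collection.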

Minor collections

One advantage of generational collection is that it can make garbage collection pauses shorter by not collecting all generations at once. When the allocator cannot satisfy an allocation request, it first triggers a minor collection, which collects only the youngest generation. Because many of the objects in the young generation are already dead, and a copying collector does not need to examine dead objects at all, minor collection pauses can be quite short and can often reclaim a large amount of heap space. If the minor collection frees enough heap space, the user program can resume immediately. If it does not, the collector goes on to collect older generations until enough memory has been reclaimed. (If the garbage collector cannot reclaim enough memory even after a full collection, it either expands the heap or throws an OutOfMemoryError.)
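The escalation order described above (minor collection first, older generations next, failure last) can be sketched as follows. The class `ToyAllocator`, its fields, and the unit-less sizes are hypothetical; a real JVM might also try to expand the heap before giving up, which this sketch omits, and it would throw `OutOfMemoryError` rather than the stand-in exception used here.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy allocator showing the order in which collections are attempted. */
final class ToyAllocator {
    private final int capacity;
    private int used;
    private int youngGarbage; // space a minor collection would free
    private int oldGarbage;   // space an old-generation collection would free

    /** Records which collections ran, for inspection. */
    final List<String> log = new ArrayList<>();

    ToyAllocator(int capacity, int used, int youngGarbage, int oldGarbage) {
        this.capacity = capacity;
        this.used = used;
        this.youngGarbage = youngGarbage;
        this.oldGarbage = oldGarbage;
    }

    private boolean fits(int size) {
        return used + size <= capacity;
    }

    void allocate(int size) {
        if (!fits(size)) {
            log.add("minor");          // first try the youngest generation only
            used -= youngGarbage;
            youngGarbage = 0;
        }
        if (!fits(size)) {
            log.add("major");          // then fall back to the older generation
            used -= oldGarbage;
            oldGarbage = 0;
        }
        if (!fits(size)) {
            // Stand-in for OutOfMemoryError in this toy model.
            throw new IllegalStateException("toy heap exhausted");
        }
        used += size;
    }
}
```

With `new ToyAllocator(100, 90, 30, 40)`, a call to `allocate(20)` is satisfied after a minor collection alone, so the older generation is never touched for that request.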

Intergenerational references

Tracing garbage collectors, such as copying, mark-sweep, and mark-compact, all start scanning from the root set and traverse references between objects until all live objects have been visited.

A generational tracing collector also starts from the root set, but does not traverse references that lead to objects in the older generation, which reduces the size of the object graph to be traced. But this creates a problem: what if an object in the older generation refers to a younger object that is not reachable through any other chain of references from a root?

To solve this problem, a generational collector must explicitly track references from old objects to young objects and add these old-to-young references to the root set of the minor collection. There are two ways an old-to-young reference can be created: either a reference contained in an old object is modified to point to a young object, or a young object that refers to other young objects is promoted to the older generation.
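The effect of adding old-to-young references to the root set can be sketched with a small tracing routine. The names `ToyMinorTrace`, `Node`, and `liveYoung` are invented for this example, and the remembered old objects are simply passed in as a parameter; how they are discovered is a separate question.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collection;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Toy minor-collection trace that never traverses old objects. */
final class ToyMinorTrace {

    static final class Node {
        final String name;
        final boolean old; // true if the object lives in the old generation
        final List<Node> refs = new ArrayList<>();

        Node(String name, boolean old) {
            this.name = name;
            this.old = old;
        }
    }

    /**
     * Returns the young objects found live by a minor collection.
     * The young targets of the remembered old objects are treated
     * as additional roots; old objects themselves are skipped.
     */
    static Set<Node> liveYoung(Collection<Node> roots, Collection<Node> rememberedOld) {
        Deque<Node> work = new ArrayDeque<>();
        for (Node r : roots) {
            if (!r.old) {
                work.push(r);
            }
        }
        for (Node o : rememberedOld) {
            for (Node target : o.refs) {
                if (!target.old) {
                    work.push(target);
                }
            }
        }
        Set<Node> live = new LinkedHashSet<>();
        while (!work.isEmpty()) {
            Node n = work.pop();
            if (live.add(n)) {              // first visit only
                for (Node target : n.refs) {
                    if (!target.old) {      // do not follow references into the old generation
                        work.push(target);
                    }
                }
            }
        }
        return live;
    }
}
```

If a young object is referenced only from an old object, calling `liveYoung` with an empty `rememberedOld` collection misses it, which is exactly the problem described above; passing the old object in `rememberedOld` makes the young object reachable again.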

Tracking intergenerational references

Whether an old-to-young reference is created by promotion or by pointer modification, the garbage collector needs a complete set of old-to-young references when it performs a minor collection. One way to get it would be to trace the old generation, but this clearly has significant overhead. A somewhat better approach is to scan the old generation linearly, looking for references to young objects. This is faster than tracing and has better locality, but it is still a lot of work.

The mutator (the user program) and the garbage collector can work together to maintain a complete list of old-to-young references as they are created. When an object is promoted to the older generation, the garbage collector can record any old-to-young references created by that promotion, which leaves only the intergenerational references created by pointer modification to be tracked.

There are several ways the garbage collector can track old-to-young references that arise from modifying references in existing objects. It could track them in the same way that reference counts are maintained in a reference-counting collector, with the compiler generating additional instructions around pointer assignments, or it could use virtual memory protection on the old generation to trap writes to old objects. Another, potentially more efficient, virtual memory approach is to use the page modification dirty bits in the old generation to determine which blocks to scan for objects containing old-to-young pointers.
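The first of these options, extra work performed around each pointer assignment, can be sketched as a write barrier that feeds a set of remembered old objects. The class `ToyWriteBarrier`, the `writeRef` and `promote` methods, and the single `ref` field per object are all assumptions made for this illustration; they do not describe the instructions any particular compiler emits.

```java
import java.util.LinkedHashSet;
import java.util.Set;

/** Toy write barrier that records old objects holding young references. */
final class ToyWriteBarrier {

    static final class Node {
        final String name;
        boolean inOldGen;
        Node ref; // a single reference field, for simplicity

        Node(String name, boolean inOldGen) {
            this.name = name;
            this.inOldGen = inOldGen;
        }
    }

    /** Old objects known to hold a reference into the young generation. */
    final Set<Node> rememberedSet = new LinkedHashSet<>();

    /**
     * What the extra instructions around "holder.ref = target" might do:
     * perform the store, then record the holder if the store created an
     * old-to-young reference. Stale entries are never removed here.
     */
    void writeRef(Node holder, Node target) {
        holder.ref = target;
        if (holder.inOldGen && target != null && !target.inOldGen) {
            rememberedSet.add(holder);
        }
    }

    /** Promotion can also create old-to-young references; the collector records them. */
    void promote(Node n) {
        n.inOldGen = true;
        if (n.ref != null && !n.ref.inOldGen) {
            rememberedSet.add(n);
        }
    }
}
```

In this sketch a young-to-young store leaves `rememberedSet` untouched, a store of a young object into an old object adds the old object, and promoting a young object that still points at another young object adds the promoted object, covering both ways an old-to-young reference can arise.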

With a little cleverness, it is possible to avoid the overhead of tracking every pointer modification and checking whether it crosses a generational boundary. For example, there is no need to track stores to local or static variables, because they are already part of the root set. It may also be possible to avoid tracking pointer stores within constructors that simply initialize the fields of newly created objects (so-called initializing stores), because (almost) all objects are allocated in the young generation. In any case, the runtime must maintain a set of references from old objects to young objects and add those references to the root set when collecting the young generation.

In Figure 1, the arrows represent references between objects in the heap. The red arrows represent old-to-young references that must be added to the root set for a minor collection. The blue arrows represent references to old objects, either from the root set or from young objects, and do not need to be traced when only the young generation is collected.

Figure 1. Intergenerational references
