Java Memory Recycling (2): garbage collection algorithms


If you have not read the first article, please see: Java Memory Recycling (1): Java garbage collection mechanism
 

Any garbage collection algorithm must do two things. First, it must detect the garbage object. Second, it must reclaim the heap space occupied by the garbage object and make it available to the program again.

Garbage detection is typically implemented by defining a set of root references and computing the set of objects reachable from them. An object is said to be reachable if the executing program can access it through a reference path that begins at a root reference. The root references are always accessible to the program. A reachable object is called an active (live) object; otherwise it is called garbage, because it can no longer affect the future execution of the program.
The composition of the root reference set depends on the specific JVM implementation, but it always contains the object references held in the local variables and operand stacks of all stack frames, and the references held in static variables. Another source of root references is the constant pool, which may hold references to strings in the heap, typically class names, superclass names, superinterface names, field names, field signatures, method names, method signatures, and so on. A further source is object references that have been passed to native methods and not yet released. Another potential source is the JVM's own runtime data areas, because some implementations place part of them on the heap, such as the class data in the method area.
The set of reachable objects contains all objects that the program can access directly or indirectly through the root references. Technically, it is the transitive closure of the root reference set under the points-to relation.
The two basic ways to distinguish active objects from garbage are reference counting and tracing. The standard garbage collectors in the JDK all use tracing, although the exact form varies.
  

I. Reference counting (Reference Counting Collector)

Reference counting is an early garbage collection strategy. In this approach, every object in the heap has a reference count. When an object is created and a reference to it is assigned to a variable, the object's reference count is set to 1. The count is incremented by 1 each time a reference to the object is assigned to another variable, and decremented by 1 when a variable holding a reference to it goes out of scope or is assigned a new value. An object becomes garbage as soon as its reference count reaches 0. When an object is garbage collected, the reference counts of all objects it refers to are decremented accordingly, so collecting one object can trigger the collection of a chain of other objects.
The advantage of this method is that a reference counting collector is simple and lends itself to incremental collection, which makes it especially suitable for real-time environments where the program cannot be interrupted for long, and the collection process helps improve locality of reference. The downside is that reference counting cannot detect unreachable cyclic structures (two or more objects that refer to each other), because their reference counts never reach 0. Another disadvantage is that every increment and decrement of a reference count adds overhead, and the algorithm requires a high degree of cooperation from the compiler. Because of these inherent flaws, reference counting is seldom used in production JVMs.
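
The counting rules and the cycle problem described above can be illustrated with a small simulation. This is only a sketch under stated assumptions: `RefCountDemo`, `Obj`, `retain`, `release`, and `link` are hypothetical names invented for illustration, and the JVM exposes no such API.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy reference-counted heap. All names are hypothetical, not a JVM API. */
class RefCountDemo {
    static final class Obj {
        final String name;
        int refCount = 0;
        boolean freed = false;
        final List<Obj> fields = new ArrayList<>(); // references this object holds
        Obj(String name) { this.name = name; }
    }

    /** A reference to target is stored somewhere: the count goes up by 1. */
    static void retain(Obj target) {
        target.refCount++;
    }

    /** A reference to target is dropped: the count goes down by 1; at 0 the object is reclaimed. */
    static void release(Obj target) {
        if (--target.refCount == 0) {
            target.freed = true;
            // Reclaiming an object drops every reference it holds, which may cascade.
            for (Obj child : target.fields) {
                release(child);
            }
            target.fields.clear();
        }
    }

    /** Stores a reference from holder to target. */
    static void link(Obj holder, Obj target) {
        holder.fields.add(target);
        retain(target);
    }

    public static void main(String[] args) {
        // Chain: variable -> a -> b. Dropping the variable frees both objects.
        Obj a = new Obj("a"), b = new Obj("b");
        retain(a);
        link(a, b);
        release(a);
        System.out.println("chain freed: " + (a.freed && b.freed)); // chain freed: true

        // Cycle: variable -> x <-> y. Dropping the variable frees neither object.
        Obj x = new Obj("x"), y = new Obj("y");
        retain(x);
        link(x, y);
        link(y, x);
        release(x);
        System.out.println("cycle freed: " + (x.freed || y.freed)); // cycle freed: false
    }
}
```

In the second case `x` still has a count of 1 (held by `y`) and `y` has a count of 1 (held by `x`), so neither count ever reaches 0 even though the program can no longer reach either object.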
 

II. Tracing algorithm (Tracing Collector)

A tracing collector traverses the object reference graph starting from the root references. Objects encountered during the traversal are marked in some way. Typically, the mark is stored either in the object itself or in a separate bitmap. Objects that remain unmarked after the traversal are unreachable and can be garbage collected.
The basic tracing algorithm is called "mark and sweep". The name identifies the two phases of the garbage collection process. During the mark phase, the garbage collector walks the reference graph and marks each object it encounters. During the sweep phase, unmarked objects are freed and their memory is made available again. In the JVM, the sweep phase must include finalization of objects.
The mark-and-sweep algorithm is simple to implement, and it needs neither the extra overhead of maintaining reference counts nor cooperation from the compiler. Its biggest drawback is that the sweep phase visits every object in the heap, reachable or not. On the one hand, this is bad for performance when the heap lives in virtual memory and parts of it may have been paged out; on the other hand, because a large share of the objects may be garbage, the collector spends much of its effort examining and disposing of garbage. Either way, the algorithm can lead to long collection pauses and high collection overhead. Another drawback of mark-and-sweep collectors is that they easily fragment the heap, which hurts locality of reference or causes allocation of large objects to fail.
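
The two phases can be sketched as follows. This is a minimal illustration, assuming the heap is just a list of objects and the mark bit is stored in the object itself; `MarkSweepDemo`, `Obj`, `mark`, `sweep`, and `collect` are hypothetical names, and finalization is left out.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

/** Toy mark-and-sweep collector. All names are hypothetical, not a JVM API. */
class MarkSweepDemo {
    static final class Obj {
        final String name;
        boolean marked = false;                   // mark bit kept in the object itself
        final List<Obj> refs = new ArrayList<>(); // outgoing references
        Obj(String name) { this.name = name; }
    }

    /** Mark phase: flag every object reachable from the roots. */
    static void mark(List<Obj> roots) {
        Deque<Obj> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Obj o = pending.pop();
            if (o.marked) continue; // already visited
            o.marked = true;
            pending.addAll(o.refs);
        }
    }

    /** Sweep phase: visit every heap object, free the unmarked ones, reset the marks. */
    static List<String> sweep(List<Obj> heap) {
        List<String> freed = new ArrayList<>();
        for (Iterator<Obj> it = heap.iterator(); it.hasNext(); ) {
            Obj o = it.next();
            if (o.marked) {
                o.marked = false; // ready for the next collection
            } else {
                freed.add(o.name);
                it.remove();
            }
        }
        return freed;
    }

    static List<String> collect(List<Obj> heap, List<Obj> roots) {
        mark(roots);
        return sweep(heap);
    }

    public static void main(String[] args) {
        Obj a = new Obj("a"), b = new Obj("b"), c = new Obj("c"), d = new Obj("d");
        a.refs.add(b); // a -> b, reachable from the root
        c.refs.add(d); // c -> d, but nothing reachable refers to c
        List<Obj> heap = new ArrayList<>(Arrays.asList(a, b, c, d));
        int examined = heap.size();
        List<String> freed = collect(heap, Arrays.asList(a));
        System.out.println("sweep examined " + examined + " objects, freed " + freed);
        // prints: sweep examined 4 objects, freed [c, d]
    }
}
```

Note that `sweep` iterates over the whole heap list, garbage included, which is the cost the paragraph above describes.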
Mainstream implementations of mainstream commercial programming languages (Java, C#, and even the venerable Lisp) use reachability analysis to determine whether an object is alive. The basic idea is to take a set of objects called "GC Roots" as starting points and search downward from them; the paths traversed are called reference chains. When no reference chain connects an object to the GC Roots (in graph-theory terms, the object is unreachable from the GC Roots), the object is considered unusable. As shown in Figure 3-1, objects 5, 6, and 7 refer to one another, but they are unreachable from the GC Roots, so they are judged to be reclaimable.
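
The situation with objects 5, 6, and 7 can be reproduced with a small graph search. The sketch below is an illustration only: the class `ReachabilityDemo` is hypothetical, and the exact edges among objects 1 to 4 are assumed, since the text only states that objects 5, 6, and 7 refer to one another and are cut off from the GC Roots.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Reachability analysis over a toy object graph. Names and edges are assumptions. */
class ReachabilityDemo {
    /** Returns the ids of all objects reachable from the given roots. */
    static Set<Integer> reachable(Map<Integer, List<Integer>> refs, List<Integer> roots) {
        Set<Integer> seen = new TreeSet<>();
        Deque<Integer> queue = new ArrayDeque<>(roots);
        while (!queue.isEmpty()) {
            int id = queue.poll();
            if (!seen.add(id)) continue; // already on some reference chain
            queue.addAll(refs.getOrDefault(id, Collections.emptyList()));
        }
        return seen;
    }

    public static void main(String[] args) {
        Map<Integer, List<Integer>> refs = new HashMap<>();
        refs.put(1, Arrays.asList(2, 3)); // assumed edges: 1 -> 2, 1 -> 3
        refs.put(2, Arrays.asList(4));    // assumed edge: 2 -> 4
        refs.put(5, Arrays.asList(6));    // 5, 6, 7 refer to one another
        refs.put(6, Arrays.asList(7));
        refs.put(7, Arrays.asList(5));

        // Assume the GC Roots refer to object 1 only.
        Set<Integer> live = reachable(refs, Arrays.asList(1));
        System.out.println("reachable: " + live);                  // reachable: [1, 2, 3, 4]
        System.out.println("5 reclaimable: " + !live.contains(5)); // 5 reclaimable: true
    }
}
```

Unlike reference counting, the search never looks at objects 5, 6, and 7 at all, so their mutual references do not keep them alive.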

In the Java language, the objects that can be used as GC roots include the following:

The object referenced in the virtual machine stack (the local variable table in the stack frame).

The object referenced by the class static property in the method area.

The object referenced by the constant in the method area.

The object referenced by JNI (that is, native methods in general) in the native method stack.

III. Compacting algorithm (Compacting Collector)

The JVM's garbage collector is likely to have a strategy against heap fragmentation. The two strategies commonly used by mark-and-sweep collectors are compacting and copying. Both reduce heap fragmentation by moving objects. A compacting collector slides the active objects over the free memory areas toward one end of the heap, so that the other end becomes one large contiguous free area. All references to the moved objects are updated to point to the new locations.
To better understand compaction, picture the heap as a bookshelf partly filled with books of different thicknesses. The free space is the gaps between the books. Compaction moves all the books in one direction to close every gap. It starts with the book nearest the end of the shelf and pushes it against that end, then pushes the next nearest book against the first one, then the third book against the second, and so on. In the end, all the books are at one end and all the free space is at the other.
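
The bookshelf picture maps directly onto an array. The following sketch assumes a heap of fixed-size slots in which `null` marks a gap; `CompactionDemo` and `compact` are hypothetical names, and a real collector works on variable-sized objects in raw memory.

```java
import java.util.Arrays;

/** Sliding compaction over a toy slot-based heap. All names are hypothetical. */
class CompactionDemo {
    /**
     * Slides every non-null entry toward index 0.
     * Returns a table mapping each old index to its new index (-1 for free slots),
     * which is what a collector would use to update references.
     */
    static int[] compact(String[] heap) {
        int[] newIndex = new int[heap.length];
        Arrays.fill(newIndex, -1);
        int free = 0; // next free slot at the packed end of the shelf
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] == null) continue; // a gap between books
            newIndex[i] = free;
            if (i != free) {
                heap[free] = heap[i]; // push the book against its neighbour
                heap[i] = null;
            }
            free++;
        }
        return newIndex;
    }

    public static void main(String[] args) {
        String[] heap = {"A", null, "B", null, null, "C"};
        int refToC = 5; // a reference held by the program, as a slot index

        int[] newIndex = compact(heap);
        refToC = newIndex[refToC]; // update the reference to the new location

        System.out.println(Arrays.toString(heap));     // [A, B, C, null, null, null]
        System.out.println(Arrays.toString(newIndex)); // [0, -1, 1, -1, -1, 2]
        System.out.println(heap[refToC]);              // C
    }
}
```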
Updating references to moved objects can be simplified by adding a level of indirection to object references. An object reference then no longer points directly to an object in the heap but to an entry in an object handle table, and the handle in that entry points to the actual object in the heap. When an object is moved, only its handle in the handle table has to be updated; the references to that object held by the executing program stay unchanged. This approach simplifies defragmenting the heap but adds overhead to every object access.
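
A handle table can be sketched in a few lines. The names here (`HandleTableDemo`, `HandleHeap`, `allocateAt`, `move`, `slotOf`) are hypothetical and chosen for illustration; the point is only that a move touches one table entry while the handle held by the program stays the same.

```java
import java.util.ArrayList;
import java.util.List;

/** Object access through a handle table. All names are hypothetical. */
class HandleTableDemo {
    static final class HandleHeap {
        private final String[] slots;                            // simulated heap storage
        private final List<Integer> handles = new ArrayList<>(); // handle -> slot index

        HandleHeap(int size) { slots = new String[size]; }

        /** Stores value in the given slot and returns a handle to it. */
        int allocateAt(int slot, String value) {
            slots[slot] = value;
            handles.add(slot);
            return handles.size() - 1;
        }

        /** Every access pays for one extra lookup through the handle table. */
        String get(int handle) {
            return slots[handles.get(handle)];
        }

        int slotOf(int handle) {
            return handles.get(handle);
        }

        /** Moves the object to newSlot; only the handle table entry changes. */
        void move(int handle, int newSlot) {
            int oldSlot = handles.get(handle);
            if (oldSlot == newSlot) return;
            slots[newSlot] = slots[oldSlot];
            slots[oldSlot] = null;
            handles.set(handle, newSlot);
        }
    }

    public static void main(String[] args) {
        HandleHeap heap = new HandleHeap(4);
        int h = heap.allocateAt(3, "payload"); // the program keeps only the handle h

        heap.move(h, 0); // the collector relocates the object during compaction

        System.out.println(heap.get(h));    // payload
        System.out.println(heap.slotOf(h)); // 0
    }
}
```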
  

IV. Copying algorithm (Copying Collector)

A copying collector also uses tracing. It moves all active objects to a new area, and the original area becomes free space. Because the moved objects are placed next to one another in the new area, any gaps that existed between objects in the original area are eliminated. Objects can be copied on the fly during the trace, so separate mark and sweep phases are not needed. When an object is copied to the new area, its copy in the original area is replaced by a forwarding pointer that points to the object's copy in the new area. The forwarding pointer lets the garbage collector detect references to objects that have already been moved and update them with the value of the forwarding pointer, so that they point to the object's new location.
A common copying collector algorithm is known as "stop and copy". In this scheme, the heap is divided into two areas, and only one of them is in use at any time. Objects are allocated in that area until its space is exhausted. At that point, program execution is suspended, the heap is traversed, and the active objects encountered during the traversal are copied to the other area. When the stop-and-copy process is complete, the program resumes, and memory is allocated from the new area until it in turn is exhausted. The program is then suspended again, the heap is traversed, and the active objects are copied back to the original area. The cost of this scheme is that it needs twice the memory of the specified heap size, since only half of the memory is in use at any time.
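
The interplay of copying and forwarding pointers can be sketched as below. This is a simplified illustration in the spirit of Cheney's algorithm (the to-space list doubles as the work queue), not a description of any particular JVM: `StopAndCopyDemo`, `Obj`, `forward`, and `collect` are hypothetical names, and the two semispaces are modelled as Java lists rather than raw memory.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy stop-and-copy collector with forwarding pointers. All names are hypothetical. */
class StopAndCopyDemo {
    static final class Obj {
        final String name;
        final List<Obj> refs = new ArrayList<>();
        Obj forwarding; // set on the from-space original once it has been copied
        Obj(String name) { this.name = name; }
    }

    /** Copies o into to-space unless that already happened; returns the to-space copy. */
    static Obj forward(Obj o, List<Obj> toSpace) {
        if (o.forwarding == null) {
            Obj copy = new Obj(o.name);
            copy.refs.addAll(o.refs); // still from-space references; fixed during the scan
            o.forwarding = copy;      // leave a forwarding pointer behind
            toSpace.add(copy);
        }
        return o.forwarding;
    }

    /** Copies everything reachable from roots; roots are updated in place. */
    static List<Obj> collect(List<Obj> roots) {
        List<Obj> toSpace = new ArrayList<>();
        for (int i = 0; i < roots.size(); i++) {
            roots.set(i, forward(roots.get(i), toSpace));
        }
        // Scan the copies in order; newly copied objects are appended behind the scan index.
        for (int scan = 0; scan < toSpace.size(); scan++) {
            Obj copy = toSpace.get(scan);
            for (int i = 0; i < copy.refs.size(); i++) {
                copy.refs.set(i, forward(copy.refs.get(i), toSpace));
            }
        }
        return toSpace;
    }

    public static void main(String[] args) {
        Obj a = new Obj("a"), b = new Obj("b"), c = new Obj("c");
        a.refs.add(b);
        b.refs.add(a); // a live cycle
        c.refs.add(a); // c is garbage: nothing reachable refers to it
        List<Obj> roots = new ArrayList<>(Arrays.asList(a));

        List<Obj> toSpace = collect(roots);
        Obj newA = roots.get(0);
        System.out.println(toSpace.size() + " objects copied; cycle preserved: "
                + (newA.refs.get(0).refs.get(0) == newA));
        // prints: 2 objects copied; cycle preserved: true
    }
}
```

The garbage object `c` is never visited, and the forwarding pointer on `a` is what stops the cycle from being copied twice.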
The advantage of the copying algorithm is that only active objects are visited. Garbage objects are never examined, so they never have to be paged into memory or brought into the cache, and the time spent on a collection depends only on the number of active objects. This avoids unnecessary collection overhead and keeps collection pauses short. However, besides the additional memory consumption, a copying collector also has to bear the cost of copying objects and updating references. This cost is more noticeable when there are many long-lived objects, because they are copied back and forth on every collection.
  

V. Generational algorithm (Generational Collector)

Observation and experiment show that, for most applications written in a variety of languages, the objects they create have the following characteristics: 1) most objects have a short lifespan, but there are always some objects that live for a long time; 2) older objects seldom refer to younger objects. These facts are known as the "weak generational hypothesis", which is the premise and basis of generational collection algorithms.
In this approach, objects are grouped by age into generations, and the heap is divided into two or more sub-heaps, each serving one generation, so a sub-heap is itself often called a generation. The youngest generation is collected most frequently. Because most objects are short-lived, only a small fraction of young objects survive their first collection. An object in the youngest generation that survives several collections is promoted to an older generation (moved to another sub-heap). Older generations are collected less frequently than younger ones. As objects mature in their current generation (surviving multiple collections), they are eventually moved to the next older generation.
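
Promotion by age can be sketched with two lists. This is a deliberately simplified model: whether an object is alive is given by a `live` flag instead of being computed by tracing, the threshold `PROMOTION_AGE` is an arbitrary value picked for the example, and `GenerationalDemo`, `allocate`, and `minorCollect` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Toy two-generation heap with age-based promotion. All names are hypothetical. */
class GenerationalDemo {
    static final int PROMOTION_AGE = 2; // assumed threshold, chosen for the example

    static final class Obj {
        final String name;
        int age = 0;         // number of young collections survived
        boolean live = true; // stands in for "reachable"; a real collector would trace
        Obj(String name) { this.name = name; }
    }

    final List<Obj> young = new ArrayList<>();
    final List<Obj> old = new ArrayList<>();

    /** New objects always start in the youngest generation. */
    Obj allocate(String name) {
        Obj o = new Obj(name);
        young.add(o);
        return o;
    }

    /** Collects the young generation only; the old generation is not examined. */
    void minorCollect() {
        for (Iterator<Obj> it = young.iterator(); it.hasNext(); ) {
            Obj o = it.next();
            if (!o.live) {
                it.remove(); // short-lived object dies young
            } else if (++o.age >= PROMOTION_AGE) {
                it.remove(); // survived long enough: promote
                old.add(o);
            }
        }
    }

    public static void main(String[] args) {
        GenerationalDemo heap = new GenerationalDemo();
        Obj cache = heap.allocate("cache");  // long-lived
        heap.allocate("temp1").live = false; // short-lived
        heap.minorCollect();                 // temp1 dies, cache survives with age 1
        heap.allocate("temp2").live = false;
        heap.minorCollect();                 // temp2 dies, cache reaches age 2 and is promoted

        System.out.println("young=" + heap.young.size()
                + " old=" + heap.old.size() + " cacheAge=" + cache.age);
        // prints: young=0 old=1 cacheAge=2
    }
}
```

After promotion, `cache` is no longer touched by `minorCollect`, which is how a generational scheme avoids handling long-lived objects over and over.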
Generational collection can be applied to the copying algorithm, where it addresses the inefficiency of handling long-lived objects, and it can also be applied to mark and sweep. In either case, dividing the heap into generations helps improve the efficiency of the basic garbage collection algorithms.
 

VI. Adaptive algorithm (Adaptive Collector)

Adaptive collection takes advantage of the fact that some collection algorithms work better in some situations while others work better in other situations. An adaptive collector monitors the current state of the heap and adjusts its garbage collection technique accordingly. It may merely tune the parameters of a single collection algorithm while the program runs, switch from one algorithm to another on the fly, or even divide the heap into sub-heaps and use different algorithms on different sub-heaps at the same time.
With an adaptive approach, the designer of a JVM implementation no longer has to choose a single garbage collection technique. They can employ several techniques and give each algorithm the work it suits best.
In fact, in modern JVM implementations most garbage collection subsystems are adaptive to some degree. Often we only need to select a strategy and set a goal to get satisfactory results, and can leave the choice of specific collection algorithm and parameter configuration to the garbage collection subsystem. This is how the garbage collection subsystem in Sun's HotSpot JVM works; the specifics are left for another time.
References:
1) Cheney's algorithm
2) Garbage Collection, Chapter 9 of Inside the Java Virtual Machine
3) Java theory and practice: A brief history of garbage collection

Copyright notice: This is an original article by the blogger and may not be reproduced without the blogger's permission.
