Learn more about Java virtual machines: garbage collection algorithms and garbage collectors

Source: Internet
Author: User
Tags throwable

Today we will study the garbage collection algorithms used by the Java virtual machine and the common garbage collectors. Three regions of the Java virtual machine's memory, the program counter, the virtual machine stack, and the native method stack, are created and destroyed together with their thread, and stack frames are pushed onto and popped off the stack as methods are entered and exited. The amount of memory allocated in each stack frame is basically known once the class structure is determined, so memory allocation and reclamation in these regions are deterministic. There is no need to think much about reclamation here, because the memory is reclaimed naturally when the method or the thread ends. The Java heap and the method area are different: the implementation classes of one interface may need different amounts of memory, different branches of one method may need different amounts of memory, and we only know which objects will be created while the program is running. Allocation and reclamation of this memory are dynamic, and this is the part of memory that the garbage collector is concerned with.


I. Determining whether an object is alive

Before the garbage collector reclaims objects, it is first necessary to determine which objects are alive.

1. Reference counting algorithm

Add a reference counter to the object. Whenever something references it, the counter is incremented by 1; when a reference becomes invalid, the counter is decremented by 1. An object whose counter is 0 can no longer be used. The problem with this algorithm is that it has difficulty handling circular references between objects, and none of the current mainstream Java virtual machines uses it. Consider the following example: after the testGC() method executes, will objA and objB be reclaimed by the GC?


package gc;

public class ReferenceCountingGC {
    public Object instance = null;

    private static final int _1MB = 1024 * 1024;

    // Takes up some memory so the GC log shows whether the objects were reclaimed
    private byte[] bigSize = new byte[2 * _1MB];

    public static void testGC() {
        ReferenceCountingGC objA = new ReferenceCountingGC();
        ReferenceCountingGC objB = new ReferenceCountingGC();

        // The two objects reference each other
        objA.instance = objB;
        objB.instance = objA;

        // Drop the only references held by the program
        objA = null;
        objB = null;

        System.gc();
    }

    public static void main(String[] args) {
        testGC();
    }
}

GC log output:

[GC (System.gc()) [PSYoungGen: 5735K->584K(18944K)] 5735K->592K(62976K), 0.0008309 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
[Full GC (System.gc()) [PSYoungGen: 584K->0K(18944K)] [ParOldGen: 8K->514K(44032K)] 592K->514K(62976K), [Metaspace: 2502K->2502K(1056768K)], 0.0058089 secs] [Times: user=0.02 sys=0.00, real=0.01 secs]
Heap
 PSYoungGen      total 18944K, used 491K [0x00000000eb400000, 0x00000000ec900000, 0x0000000100000000)
  eden space 16384K, 3% used [0x00000000eb400000,0x00000000eb47aff0,0x00000000ec400000)
  from space 2560K, 0% used [0x00000000ec400000,0x00000000ec400000,0x00000000ec680000)
  to   space 2560K, 0% used [0x00000000ec680000,0x00000000ec680000,0x00000000ec900000)
 ParOldGen       total 44032K, used 514K [0x00000000c1c00000, 0x00000000c4700000, 0x00000000eb400000)
  object space 44032K, 1% used [0x00000000c1c00000,0x00000000c1c808e8,0x00000000c4700000)
 Metaspace       used 2511K, capacity 4486K, committed 4864K, reserved 1056768K
  class space    used 273K, capacity 386K, committed 512K, reserved 1048576K

Objects objA and objB both have an instance field. After the assignments objA.instance = objB and objB.instance = objA, and once objA and objB are set to null, nothing else references the two objects. The program can no longer reach them, but because they reference each other their reference counts are not 0, so a reference counting algorithm could not tell the collector to reclaim them. Yet the log shows 5735K->584K, which means the objects were reclaimed, so our virtual machine does not use the reference counting algorithm.
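To make the circular-reference problem concrete, here is a minimal sketch of what a naive reference counter would see for the objA/objB example above. The classes RefCountedNode and RefCountDemo are hypothetical names made up for this illustration; the counters are maintained by hand and have nothing to do with how a real JVM tracks objects.

```java
/** Toy model of an object with a hand-maintained reference counter. */
class RefCountedNode {
    int refCount = 0;        // number of references currently pointing at this node
    RefCountedNode field;    // one outgoing reference, like the instance field above

    /** Simulates "this.field = target" and keeps both counters up to date. */
    void setField(RefCountedNode target) {
        if (field != null) {
            field.refCount--;
        }
        field = target;
        if (target != null) {
            target.refCount++;
        }
    }
}

class RefCountDemo {
    /** Replays the testGC() steps and returns the two counters at the end. */
    static int[] countsAfterDroppingLocals() {
        RefCountedNode a = new RefCountedNode();
        RefCountedNode b = new RefCountedNode();
        a.refCount++;        // local variable objA points at a
        b.refCount++;        // local variable objB points at b
        a.setField(b);       // objA.instance = objB
        b.setField(a);       // objB.instance = objA
        a.refCount--;        // objA = null
        b.refCount--;        // objB = null
        return new int[] { a.refCount, b.refCount };
    }

    public static void main(String[] args) {
        int[] counts = countsAfterDroppingLocals();
        // prints "objA count = 1, objB count = 1"
        System.out.println("objA count = " + counts[0] + ", objB count = " + counts[1]);
    }
}
```

Both counters stay at 1 even though the program can no longer reach either object, so a collector that only frees objects whose counter is 0 would never free them.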


2. Reachability analysis algorithm

In the mainstream implementations of mainstream commercial programming languages, reachability analysis is used to determine whether an object is alive. The basic idea of this algorithm is to take a series of objects called "GC Roots" as starting points and search downward from these nodes. The paths traversed by the search are called reference chains (Reference Chain). When an object has no reference chain connecting it to any GC Root, the object is proven to be unusable and is judged to be reclaimable. It is not reclaimed immediately, however; it must be marked twice before it is actually reclaimed.

2.1 In the Java language, objects that can serve as GC Roots are:

1) The object referenced in the virtual machine stack (local variable table in the stack frame).

2) The object referenced by the class static property in the method area.

3) The object referenced by the constant in the method area.

4) The object referenced by JNI (that is, by native methods in general) in the native method stack.
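The search from the GC Roots can be sketched as an ordinary graph traversal. The example below is a toy model under stated assumptions: objects are plain strings, references are an adjacency map, and the names ReachabilityDemo, stackLocal, staticField, order, customer, and cache are all invented for illustration. A real collector walks actual object headers and fields, not a map.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy reachability analysis over an object graph given as adjacency lists. */
class ReachabilityDemo {
    /** Returns every object that can be reached from the roots by following references. */
    static Set<String> reachable(Map<String, List<String>> references, List<String> gcRoots) {
        Set<String> marked = new HashSet<>(gcRoots);
        Deque<String> pending = new ArrayDeque<>(gcRoots);
        while (!pending.isEmpty()) {
            String current = pending.pop();
            for (String target : references.getOrDefault(current, List.of())) {
                if (marked.add(target)) {    // first visit: mark it and explore it later
                    pending.push(target);
                }
            }
        }
        return marked;
    }

    public static void main(String[] args) {
        Map<String, List<String>> references = Map.of(
                "stackLocal", List.of("order"),      // root: local variable in a stack frame
                "staticField", List.of("cache"),     // root: class static field
                "order", List.of("customer"),
                "objA", List.of("objB"),             // objA and objB only reference each other
                "objB", List.of("objA"));
        Set<String> live = reachable(references, List.of("stackLocal", "staticField"));
        System.out.println("customer reachable: " + live.contains("customer")); // true
        System.out.println("objA reachable: " + live.contains("objA"));         // false
    }
}
```

Unlike reference counting, the mutually referencing pair is never visited, so it is correctly treated as reclaimable.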


II. References

Whether the reference counting algorithm counts the references to an object or reachability analysis checks whether a reference chain reaches it, deciding whether an object is alive always comes down to "references." References in Java are divided into four kinds: strong references (Strong Reference), soft references (Soft Reference), weak references (Weak Reference), and phantom references (Phantom Reference), in decreasing order of strength.

1) Strong references are the ones commonly found in program code, such as "Object obj = new Object()". As long as a strong reference exists, the garbage collector will never reclaim the referenced object.

2) Soft references describe objects that are useful but not necessary. Objects associated only with soft references are included in the scope of a second collection before the system throws an out-of-memory error. If that collection still does not free enough memory, the out-of-memory error is thrown. Since JDK 1.2, the SoftReference class has been provided to implement soft references.

3) Weak references also describe non-essential objects, but they are weaker than soft references: an object associated only with weak references survives only until the next garbage collection. When the garbage collector runs, such objects are reclaimed regardless of whether memory is currently sufficient. Since JDK 1.2, the WeakReference class has been provided to implement weak references.

4) A phantom reference (sometimes translated as "virtual reference") is the weakest reference relationship. Whether an object has phantom references has no effect on its lifetime, and an object instance cannot be obtained through a phantom reference. The only purpose of associating a phantom reference with an object is to receive a system notification when the object is reclaimed by the collector. Since JDK 1.2, the PhantomReference class has been provided to implement phantom references.
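The reference classes named above all live in java.lang.ref. The following sketch only checks the behavior that is deterministic: a soft or weak reference still returns its referent while a strong reference to it exists, and a phantom reference never returns its referent. The class name ReferenceKindsDemo is made up for this example. When exactly a weakly or softly reachable object is cleared depends on the collector, so the sketch makes no claim about that.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;

class ReferenceKindsDemo {
    /** A weak reference keeps returning its referent while a strong reference exists. */
    static boolean weakGetWhileStronglyHeld() {
        Object strong = new Object();
        WeakReference<Object> weak = new WeakReference<>(strong);
        System.gc();                      // only a hint; the object is still strongly reachable
        return weak.get() == strong;
    }

    /** The same holds for a soft reference. */
    static boolean softGetWhileStronglyHeld() {
        Object strong = new Object();
        SoftReference<Object> soft = new SoftReference<>(strong);
        System.gc();
        return soft.get() == strong;
    }

    /** A phantom reference never hands out its referent; get() always returns null. */
    static Object phantomGet() {
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        PhantomReference<Object> phantom = new PhantomReference<>(new Object(), queue);
        return phantom.get();
    }

    public static void main(String[] args) {
        System.out.println("weak.get() while strongly held: " + weakGetWhileStronglyHeld());
        System.out.println("soft.get() while strongly held: " + softGetWhileStronglyHeld());
        System.out.println("phantom.get(): " + phantomGet());
    }
}
```

The ReferenceQueue passed to the PhantomReference is where the collector enqueues the reference after the referent has been reclaimed, which is the "system notification" mentioned above.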


III. Object self-rescue

Even objects that are unreachable in reachability analysis are not reclaimed immediately; they must be marked twice. If the analysis finds that an object has no reference chain connected to the GC Roots, it is marked for the first time and then filtered according to whether it is necessary to execute its finalize() method. If the object does not override finalize(), or its finalize() has already been called by the virtual machine, the virtual machine treats both cases as "not necessary to execute."

If the object is judged to need its finalize() method executed, it is placed in a queue called the F-Queue, and the method is later executed by a low-priority Finalizer thread that the virtual machine creates automatically. If, inside finalize(), the object re-associates itself with any object on a reference chain, for example by assigning itself (the this keyword) to a class variable or to a member variable of another object, then at the second marking it is removed from the set of objects about to be reclaimed. If the object has not escaped by this time, it is essentially reclaimed. Consider the following code:


package gc;

/**
 * This code demonstrates two points:
 * 1. An object can rescue itself when it is about to be collected.
 * 2. This chance of self-rescue comes only once.
 * @author Administrator
 */
public class FinalizeEscapeGC {
    public static FinalizeEscapeGC SAVE_HOOK = null;

    public void isAlive() {
        System.out.println("yes, i am still alive");
    }

    @Override
    protected void finalize() throws Throwable {
        super.finalize();
        System.out.println("finalize method executed");
        // Re-attach this object to a reference chain through a static field
        FinalizeEscapeGC.SAVE_HOOK = this;
    }

    public static void main(String[] args) throws Throwable {
        SAVE_HOOK = new FinalizeEscapeGC();

        // The object rescues itself the first time
        SAVE_HOOK = null;
        System.gc();
        // finalize() runs at low priority, so pause 0.5 seconds to wait for it
        Thread.sleep(500);
        if (SAVE_HOOK != null) {
            SAVE_HOOK.isAlive();
        } else {
            System.out.println("no, i am dead");
        }

        // Same steps again, but this time the rescue fails
        SAVE_HOOK = null;
        System.gc();
        Thread.sleep(500);
        if (SAVE_HOOK != null) {
            SAVE_HOOK.isAlive();
        } else {
            System.out.println("no, i am dead");
        }
    }
}

Output Result:

finalize method executed
yes, i am still alive
no, i am dead

The first rescue succeeds and the second fails. The finalize() method of any object is called automatically by the system only once; if the object faces another collection, its finalize() method is not executed again. Note that it is best not to use this method to rescue objects.


IV. Reclaiming the method area

In the heap, especially in the young generation, one garbage collection of a typical application can generally reclaim 70%~95% of the space, while garbage collection in the method area is much less efficient. Garbage collection in the method area mainly reclaims two things: obsolete constants and useless classes. Reclaiming obsolete constants is very similar to reclaiming objects in the Java heap. Take literals in the constant pool as an example: suppose the string "abc" has entered the constant pool, but no String object in the current system is "abc"; in other words, no String object references the "abc" constant in the constant pool, and nothing else references this literal either. If memory reclamation occurs at this point, and it is necessary, the "abc" constant is cleaned out of the constant pool by the system. Symbolic references to other classes, methods, and fields in the constant pool are handled similarly.

A class is judged to be a useless class only when it meets all 3 of the following conditions:

1) All instances of the class have been reclaimed, i.e. no instance of the class exists in the Java heap.

2) The ClassLoader that loaded the class has been reclaimed.

3) The java.lang.Class object corresponding to the class is not referenced anywhere, so the methods of the class cannot be accessed anywhere through reflection.


V. Garbage collection algorithms

1. Mark-Sweep algorithm

The algorithm has two phases: first mark all objects that need to be reclaimed, then, once marking is complete, reclaim all marked objects in one pass. The marking process is the one described above. This algorithm has two main shortcomings. The first is efficiency: neither marking nor sweeping is very efficient. The second is space: sweeping leaves behind a large number of discontinuous memory fragments, and too much fragmentation may mean that, when the program later needs to allocate a large object, no contiguous block of sufficient size can be found and another garbage collection has to be triggered early.
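The fragmentation problem can be seen in a toy model. The sketch below assumes a heap made of equally sized slots held in an array, with null meaning "free", and takes the result of the mark phase as a given set of live object names; MarkSweepDemo and its methods are hypothetical names for this illustration only.

```java
import java.util.Set;

/** Toy mark-sweep over a heap modelled as an array of slots (null = free). */
class MarkSweepDemo {
    /** Sweep phase: frees every slot whose object was not marked as live. */
    static void sweep(String[] heap, Set<String> marked) {
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] != null && !marked.contains(heap[i])) {
                heap[i] = null;
            }
        }
    }

    /** Total number of free slots. */
    static int freeSlots(String[] heap) {
        int free = 0;
        for (String slot : heap) {
            if (slot == null) {
                free++;
            }
        }
        return free;
    }

    /** Length of the longest run of adjacent free slots. */
    static int largestFreeRun(String[] heap) {
        int best = 0;
        int run = 0;
        for (String slot : heap) {
            run = (slot == null) ? run + 1 : 0;
            best = Math.max(best, run);
        }
        return best;
    }

    public static void main(String[] args) {
        String[] heap = {"A", "B", "C", "D", "E", "F", null, null};
        sweep(heap, Set.of("A", "C", "E"));   // the marked set is assumed to come from the mark phase
        // prints "free slots: 5, largest contiguous run: 3"
        System.out.println("free slots: " + freeSlots(heap)
                + ", largest contiguous run: " + largestFreeRun(heap));
    }
}
```

Five slots are free after the sweep, but an object that needs four adjacent slots still cannot be placed, which is the situation that forces an early collection.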

2. Copying algorithm

It divides the available memory by capacity into two blocks of equal size and uses only one of them at a time. When that block is used up, the surviving objects are copied to the other block, and the used block is then cleaned up in one go. Each collection thus reclaims an entire half of the memory, and allocation does not need to deal with complications such as fragmentation: memory is allocated sequentially by simply moving the heap-top pointer. This algorithm is efficient, at the cost of shrinking the usable memory to half of the original. Today's commercial virtual machines use this copying algorithm to collect the young generation. In the HotSpot virtual machine the default size ratio of Eden to a Survivor space is 8:1. Of course, there is no guarantee that no more than 10% of the objects survive each collection; when the Survivor space is not enough, other memory (the old generation) is relied on as an allocation guarantee.
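A minimal sketch of the copying step, again on a slot-array heap, is shown below. CopyingDemo and its method names are invented for illustration, and the live set is assumed to be known already. The usableFraction helper works out the arithmetic behind the 8:1 ratio, on the assumption that the young generation consists of Eden plus two Survivor spaces of which one is always kept empty as the copy target.

```java
import java.util.Set;

/** Toy copying collection between two equally sized spaces. */
class CopyingDemo {
    /**
     * Copies live objects from fromSpace into toSpace using a bump pointer,
     * clears fromSpace, and returns the new top-of-heap index in toSpace.
     */
    static int collect(String[] fromSpace, String[] toSpace, Set<String> live) {
        int top = 0;
        for (int i = 0; i < fromSpace.length; i++) {
            String obj = fromSpace[i];
            if (obj != null && live.contains(obj)) {
                toSpace[top++] = obj;      // survivors end up packed with no gaps
            }
            fromSpace[i] = null;           // the whole old space is released at once
        }
        return top;
    }

    /** Share of the young generation usable for allocation: Eden plus one Survivor. */
    static double usableFraction(int edenParts, int survivorParts) {
        return (double) (edenParts + survivorParts) / (edenParts + 2 * survivorParts);
    }

    public static void main(String[] args) {
        String[] from = {"A", "B", "C", "D"};
        String[] to = new String[4];
        int top = collect(from, to, Set.of("A", "C"));
        System.out.println("next allocation index: " + top);   // prints 2
        System.out.println("usable share at 8:1: " + usableFraction(8, 1));
    }
}
```

With an 8:1 ratio only one Survivor space, one tenth of the young generation, is held in reserve, instead of the half that the plain two-block scheme would waste.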

3. Mark-Compact algorithm

With a high object survival rate, the copying algorithm has to perform many copy operations and its efficiency drops. Based on the characteristics of the old generation, the mark-compact algorithm was proposed. Its marking process is the same as in the mark-sweep algorithm, but the next step does not clean up the reclaimable objects directly; instead, all surviving objects are moved toward one end, and then the memory beyond the end boundary is cleaned up directly.
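The "move survivors toward one end, then clear everything past the boundary" step can be sketched on the same kind of slot-array heap. MarkCompactDemo is a hypothetical name, and the marked set is assumed to be the output of a marking phase that is not shown.

```java
import java.util.Set;

/** Toy compaction phase: slides live objects toward index 0. */
class MarkCompactDemo {
    /** Moves marked objects to the front in their original order; returns the end boundary. */
    static int compact(String[] heap, Set<String> marked) {
        int boundary = 0;
        for (int i = 0; i < heap.length; i++) {
            String obj = heap[i];
            if (obj != null && marked.contains(obj)) {
                heap[boundary++] = obj;    // boundary never overtakes i, so nothing unread is lost
            }
        }
        for (int i = boundary; i < heap.length; i++) {
            heap[i] = null;                // everything beyond the boundary is freed in one go
        }
        return boundary;
    }

    public static void main(String[] args) {
        String[] heap = {"A", "B", "C", "D", "E", "F"};
        int boundary = compact(heap, Set.of("A", "C", "E"));
        System.out.println("boundary: " + boundary);   // prints 3
    }
}
```

After compaction the free space is one contiguous block, so the fragmentation seen with mark-sweep does not arise, at the price of moving objects.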

4. Generational collection algorithm

The garbage collection of contemporary commercial virtual machines uses the "generational collection" algorithm, which divides memory into several blocks according to the lifetimes of objects. The Java heap is generally divided into the young generation and the old generation, so that the most appropriate collection algorithm can be used for each generation. In the young generation, a large number of objects are found dead at every collection and only a few survive, so the copying algorithm is chosen: the collection completes at the cost of copying only the few survivors. In the old generation, the object survival rate is high and there is no extra space to serve as an allocation guarantee, so the "mark-sweep" or "mark-compact" algorithm must be used.


VI. Garbage collectors (for the HotSpot virtual machine)

1. Serial Collector

This collector is a single-threaded collector: while it collects garbage, all other worker threads must be paused until the collection ends. It is currently the default young generation collector for virtual machines running in client mode. Its advantage is that it is simple and efficient (compared with the single-threaded mode of other collectors); in an environment limited to a single CPU, the Serial collector has no thread-interaction overhead and is therefore highly efficient.

2. ParNew Collector

The ParNew collector is the multi-threaded version of the Serial collector. Apart from using multiple threads for garbage collection, the rest of its behavior, including all the control parameters available to the Serial collector, the collection algorithm, Stop The World, object allocation rules, and so on, is the same as the Serial collector's. It is the preferred young generation collector in many virtual machines running in server mode. One reason, unrelated to performance but important, is that apart from the Serial collector it is the only one that can work with the CMS (Concurrent Mark Sweep) collector.


3. Parallel Scavenge Collector

The Parallel Scavenge collector is a young generation collector that also uses the copying algorithm and is a parallel, multi-threaded collector. Its goal is to achieve a controllable throughput. Throughput is the ratio of the CPU time spent running user code to the total CPU time consumed, that is, throughput = time running user code / (time running user code + garbage collection time). If the virtual machine runs for a total of 100 minutes, of which garbage collection takes 1 minute, the throughput is 99%.

High throughput makes efficient use of CPU time and completes the program's computation as soon as possible, so it mainly suits tasks that run in the background and do not need much interaction. The Parallel Scavenge collector provides two parameters for precise control of throughput: -XX:MaxGCPauseMillis, which controls the maximum garbage collection pause time, and -XX:GCTimeRatio, which sets the throughput directly.
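The throughput formula and its link to -XX:GCTimeRatio can be worked through in a few lines. This is a small arithmetic sketch; ThroughputDemo is a made-up class name, and the second helper relies on the documented HotSpot meaning of the flag, where a value of N asks for garbage collection to take at most 1 / (1 + N) of the total time.

```java
/** Arithmetic behind the throughput goal of the Parallel Scavenge collector. */
class ThroughputDemo {
    /** throughput = user code time / (user code time + garbage collection time) */
    static double throughput(double userMinutes, double gcMinutes) {
        return userMinutes / (userMinutes + gcMinutes);
    }

    /** Largest share of total time that GC may take under -XX:GCTimeRatio=n. */
    static double maxGcFraction(int gcTimeRatio) {
        return 1.0 / (1 + gcTimeRatio);
    }

    public static void main(String[] args) {
        // 100 minutes in total, 1 of them spent in garbage collection
        System.out.println("throughput: " + throughput(99, 1));
        // -XX:GCTimeRatio=19 allows up to 5% of the time for GC
        System.out.println("max GC share at ratio 19: " + maxGcFraction(19));
    }
}
```

A command line such as java -XX:+UseParallelGC -XX:GCTimeRatio=19 -XX:MaxGCPauseMillis=200 would combine both goals; the values 19 and 200 are arbitrary examples, not recommendations.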


4. Serial Old Collector

The Serial Old collector is the old generation version of the Serial collector. It is also single-threaded and uses the "mark-compact" algorithm. Its main purpose is likewise to be used by virtual machines in client mode. In server mode it has two main uses: one is to be paired with the Parallel Scavenge collector in JDK 1.5 and earlier versions; the other is to serve as a backup plan for the CMS collector, used when a Concurrent Mode Failure occurs during a concurrent collection.


5. Parallel Old Collector

Parallel Old is the old generation version of the Parallel Scavenge collector, using multiple threads and the "mark-compact" algorithm. Where throughput matters and CPU resources are sensitive, the combination of Parallel Scavenge and Parallel Old can be given priority.
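To see which young and old generation collectors a running JVM has actually paired, the standard java.lang.management API can be queried. The sketch below uses only ManagementFactory.getGarbageCollectorMXBeans(); the class name CollectorNamesDemo is made up. The names printed depend on the JVM version and flags, for example a HotSpot JVM started with -XX:+UseParallelGC typically reports "PS Scavenge" and "PS MarkSweep".

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

/** Lists the garbage collectors registered in the current JVM. */
class CollectorNamesDemo {
    /** Returns the names the JVM reports for its collectors. */
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(bean.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            // getCollectionCount() and getCollectionTime() may return -1 if undefined
            System.out.println(bean.getName()
                    + ": collections=" + bean.getCollectionCount()
                    + ", time(ms)=" + bean.getCollectionTime());
        }
    }
}
```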


6. CMS Collector

The CMS (Concurrent Mark Sweep) collector is a collector whose goal is the shortest collection pause time. At present, a large share of Java applications run on the server side of Internet sites or B/S systems. Such applications pay particular attention to service response time and want system pauses to be as short as possible to give users a good experience. CMS is implemented on the "mark-sweep" algorithm, and its operation is divided into four steps:

1) Initial Mark

2) Concurrent mark

3) Remark

4) Concurrent sweep

The initial mark and remark steps still require "Stop The World". The initial mark only marks the objects that GC Roots can reach directly, and it is fast. The concurrent mark phase is the process of tracing from the GC Roots. The remark phase corrects the mark records of objects whose marks changed during concurrent marking because the user program kept running; its pause is generally slightly longer than the initial mark but much shorter than the concurrent mark.

Because the two longest phases of the whole process, concurrent mark and concurrent sweep, are carried out by collector threads working alongside the user threads, the memory reclamation of the CMS collector is, on the whole, performed concurrently with the user threads.


CMS has the following disadvantages:

1) The CMS collector is very sensitive to CPU resources. During the concurrent phases it takes up part of the CPU, which slows the application and reduces total throughput; this is especially noticeable when there are few CPUs.


2) The CMS collector cannot handle floating garbage, and a "Concurrent Mode Failure" may occur, leading to another Full GC. Because user threads are still running during the CMS concurrent sweep phase, new garbage is naturally produced as the program runs. This garbage appears after the marking process, so CMS cannot deal with it in the current collection. This part is called "floating garbage."


3) Because CMS is a collector based on the "mark-sweep" algorithm, it inevitably produces a lot of space fragmentation, which causes great trouble for allocating large objects: often the old generation still has plenty of free space, yet no contiguous space large enough for the current object can be found, and a Full GC has to be triggered. To address this, the CMS collector provides the -XX:+UseCMSCompactAtFullCollection switch (on by default), which makes the CMS collector run a memory compaction pass when a Full GC cannot be avoided. Memory compaction cannot run concurrently; the fragmentation problem goes away, but the pause time inevitably becomes longer.


7. G1 Collector

The G1 (Garbage-First) collector is one of the most cutting-edge results of today's collector technology. G1 is a garbage collector aimed at server-side applications. Compared with other GC collectors, G1 has the following features:


1) Parallel and concurrent: G1 can take full advantage of multi-CPU, multi-core hardware, using multiple CPUs to shorten Stop-The-World pauses. Some GC actions for which other collectors would have to pause the Java threads can still be performed by G1 concurrently while the Java program keeps running.


2) Generational collection: as with other collectors, the generational concept is retained in G1. Although G1 can manage the entire GC heap on its own without cooperating with other collectors, it handles newly created objects differently from old objects that have already survived for some time, in order to obtain better collection results.


3) Space compaction: unlike CMS's mark-sweep algorithm, G1 viewed as a whole is a collector based on the mark-compact algorithm, while viewed locally it is based on the copying algorithm. Either way, G1 does not generate memory fragmentation while running and can provide regular, usable memory after a collection.


4) Predictable pauses: this is another advantage of G1 over CMS. Reducing pause time is a common concern of G1 and CMS, but besides pursuing low pauses G1 can also build a predictable pause-time model, letting the user explicitly specify that within a time slice of M milliseconds, the time spent on garbage collection must not exceed N milliseconds.

The collectors before G1 collect the entire young generation or the entire old generation; with G1 this is no longer the case. When the G1 collector is used, the memory layout of the Java heap differs greatly from that of the other collectors: it divides the whole Java heap into multiple independent regions (Region) of equal size. Although the concepts of young generation and old generation are retained, the two are no longer physically isolated; each is a collection of Regions.

The G1 collector can build a predictable pause-time model because it can avoid, in a planned way, garbage collection over the whole Java heap. G1 tracks the value of the garbage accumulated in each Region, maintains a priority list in the background, and each time, according to the allowed collection time, reclaims the Region with the highest value first.
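The idea of a priority list under a pause budget can be illustrated with a greedy selection. This is a deliberately simplified sketch and not HotSpot's actual heuristic: the Region fields, the definition of "value" as reclaimable space per millisecond of estimated collection time, and all the numbers are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Toy version of "collect the most valuable regions that fit in the pause budget". */
class RegionSelectionDemo {
    static final class Region {
        final String id;
        final long garbageKb;     // hypothetical: space that collecting this region would free
        final double costMs;      // hypothetical: estimated time needed to collect it

        Region(String id, long garbageKb, double costMs) {
            this.id = id;
            this.garbageKb = garbageKb;
            this.costMs = costMs;
        }

        /** Assumed notion of value: reclaimed space per millisecond of pause. */
        double value() {
            return garbageKb / costMs;
        }
    }

    /** Picks regions in descending order of value while they still fit in the budget. */
    static List<String> select(List<Region> regions, double pauseBudgetMs) {
        List<Region> byValue = new ArrayList<>(regions);
        byValue.sort(Comparator.comparingDouble(Region::value).reversed());
        List<String> chosen = new ArrayList<>();
        double remaining = pauseBudgetMs;
        for (Region region : byValue) {
            if (region.costMs <= remaining) {
                chosen.add(region.id);
                remaining -= region.costMs;
            }
        }
        return chosen;
    }

    public static void main(String[] args) {
        List<Region> regions = List.of(
                new Region("r1", 900, 4.0),
                new Region("r2", 300, 1.0),
                new Region("r3", 800, 8.0),
                new Region("r4", 100, 2.0));
        System.out.println(select(regions, 6.0));   // prints [r2, r1]
    }
}
```

With a 6 ms budget the sketch collects r2 and r1 and leaves r3 and r4 for a later cycle, which is the sense in which a region-based collector avoids collecting the whole heap at once.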


That is all about garbage collectors for now; the finer details are not covered here.

Reference:

"Deep Java Virtual machine"






























