Overview of the Java garbage collection mechanism

Source: Internet
Author: User

Summary:

Automatic memory management in Java ultimately comes down to automating two tasks: allocating memory to objects and reclaiming the memory allocated to them. Both tasks target the heap in the Java memory model. My earlier post, "JVM Memory Model Overview", explained how free space is divided up during allocation and the thread-safety issues involved; this article goes further and presents the memory allocation rules in light of the garbage collection strategy. Garbage collection helps prevent memory leaks, keeps memory use efficient, and frees Java programmers from most manual memory management. This article covers two classic algorithms for deciding whether an object can be reclaimed, explains the basic ideas behind four typical garbage collection algorithms and their direct application, the garbage collectors, and finally introduces the memory allocation rules together with the reclamation strategy.

Copyright Notice:

This is an original article by nerd Rico.
Author's blog: http://blog.csdn.net/justloveyou_/

Tip:

To better understand Java's garbage collection mechanism, readers should first understand the JVM memory model. For a thorough treatment of the JVM memory model, see my post "JVM Memory Model Overview".

One. The significance of the garbage collection mechanism

As described in my earlier post, "JVM Memory Model Overview", the JVM memory model has three parts: the heap (the Java heap, which serves Java code, and the method area, which the JVM uses itself), the stacks (the virtual machine stack, which serves Java methods, and the native method stack, which serves native methods), and the program counter, which lets execution continue correctly in a multithreaded environment. The Java heap is the main area managed by the garbage collector, so it is also called the GC heap; the method area is sometimes loosely called the permanent generation. Overall, the heap (Java heap plus method area) is the main target of garbage collection, the Java heap above all.

In fact, automatic memory management in Java ultimately comes down to automating two tasks: allocating memory to objects and reclaiming the memory allocated to them. Both tasks target the heap in the Java memory model. My post "JVM Memory Model Overview" explained how free space is divided up during allocation and the thread-safety issues involved; this article goes further and presents the memory allocation rules in light of the garbage collection strategy. Garbage collection is also a defining feature of the Java language: it helps prevent memory leaks and keeps memory use efficient, so Java programmers rarely need to manage memory by hand. The problems a garbage collector must handle are complex; this article covers three core questions:

    • Which memory needs to be reclaimed? (Two classic algorithms for deciding whether an object can be reclaimed: reference counting and reachability analysis.)

    • When is it reclaimed? (When the young, old, and permanent generations of the heap are collected; Minor GC and Full GC.)

    • How is it reclaimed? (Three classic garbage collection algorithms, namely mark-sweep, copying, and mark-compact; the generational collection algorithm; and seven garbage collectors.)

Before exploring Java's garbage collection mechanism, remember one term: stop-the-world. Stop-the-world means that the JVM suspends execution of the application because it is performing a GC, and it occurs with every GC algorithm. When a stop-the-world pause occurs, all threads except those needed by the GC wait until the GC task finishes. In practice, GC tuning usually means reducing stop-the-world time so that the system achieves high throughput and short pauses.

PS: A memory leak means that memory is not reclaimed after it is no longer needed. In the common case, leaving complex data structures aside, a Java memory leak shows up as an object whose lifetime exceeds the time the program actually needs it.

Two. How do we determine whether an object can be reclaimed?

1. Reference counting algorithm: counting the references to an object

  The reference counting algorithm decides whether an object can be reclaimed by checking how many references point to it.

Reference counting is an early garbage collection strategy. In this approach, every object instance in the heap has a reference count. When an object is created and assigned to a reference variable, its count is set to 1. Whenever another variable is assigned a reference to the object, the count is incremented by 1 (for a = b, the counter of the instance referenced by b is incremented by 1). When a reference to the instance goes out of scope or is set to a new value, the count is decremented by 1. In particular, when an object instance is garbage collected, the count of every instance it references is decremented by 1. Any instance whose count is 0 can be garbage collected.

A reference counting collector runs quickly and is interleaved with program execution, which suits real-time environments where the program must not be interrupted for long. However, it cannot easily handle circular references between objects. In the following program, the reference counts of objA and objB can never reach 0, so under reference counting the two objects would never be reclaimed.

            

    public class ReferenceCountingGC {
        public Object instance = null;

        public static void testGC() {
            ReferenceCountingGC objA = new ReferenceCountingGC();
            ReferenceCountingGC objB = new ReferenceCountingGC();

            // The objects reference each other, so under reference counting
            // neither count can ever drop to 0.
            objB.instance = objA;
            objA.instance = objB;

            objA = null;
            objB = null;

            System.gc();
        }
    }

The last two assignments set objA and objB to null, so the objects they pointed to can no longer be accessed. But because the two objects still reference each other, their reference counts never reach 0, and a collector based on reference counting would never reclaim them.
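The failure mode can be reproduced with a toy model. The sketch below is only an illustration and not how any JVM manages memory: `CountedNode`, `retain`, `release`, and `link` are hypothetical names invented here. It replays the objA/objB example and shows that both counts are still 1 after the two local variables are cleared.

```java
/** Toy model of reference counting; an illustration only, not how HotSpot manages memory. */
class RefCountDemo {

    /** Hypothetical heap object that tracks how many references point at it. */
    static final class CountedNode {
        int refCount = 0;
        boolean freed = false;
        CountedNode instance;   // one outgoing reference, like `instance` in the example above
    }

    /** A new reference to target has been created. */
    static void retain(CountedNode target) {
        target.refCount++;
    }

    /** A reference to target has gone away; free the object once its count reaches zero. */
    static void release(CountedNode target) {
        target.refCount--;
        if (target.refCount == 0) {
            target.freed = true;
            if (target.instance != null) {
                CountedNode child = target.instance;
                target.instance = null;
                release(child);   // freeing an object drops the references it holds
            }
        }
    }

    /** Stores a reference to `to` in `from.instance` (assumes the field was null). */
    static void link(CountedNode from, CountedNode to) {
        retain(to);
        from.instance = to;
    }

    /** Rebuilds the objA/objB cycle and returns both counts after the locals are cleared. */
    static int[] countsAfterDroppingCycle() {
        CountedNode a = new CountedNode();
        CountedNode b = new CountedNode();
        retain(a);      // local variable objA
        retain(b);      // local variable objB
        link(b, a);     // objB.instance = objA
        link(a, b);     // objA.instance = objB
        release(a);     // objA = null
        release(b);     // objB = null
        return new int[] { a.refCount, b.refCount };
    }

    public static void main(String[] args) {
        int[] counts = countsAfterDroppingCycle();
        System.out.println("objA count = " + counts[0] + ", objB count = " + counts[1]);
    }
}
```

Neither count reaches 0, so neither object is ever freed in this model, whereas a simple chain without a cycle is released link by link.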

2. Reachability analysis algorithm: checking whether the object's reference chain is reachable

  The reachability analysis algorithm decides whether an object can be reclaimed by checking whether it can be reached through a chain of references.

Reachability analysis comes from graph theory in discrete mathematics. The program treats all reference relationships as a graph and starts from a set of objects called "GC Roots", searching outward from them; each path the search follows is called a reference chain (Reference Chain). When no reference chain connects an object to any GC Root (in graph-theory terms, the object is unreachable from the GC Roots), the object is considered unusable. In Java, the objects that can serve as GC Roots include the following:

    • Objects referenced from the virtual machine stack (the local variable tables in stack frames);

    • Objects referenced by static class fields in the method area;

    • Objects referenced by constants in the method area;

    • Objects referenced by native methods in the native method stack.
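Reachability analysis is, at its core, a graph traversal. The following minimal sketch assumes objects are identified by name and references are stored in a map; `ReachabilityDemo`, `reachableFrom`, and `sampleGraph` are hypothetical names for illustration. The sample graph contains an isolated cycle (c and d reference each other) to show that, unlike reference counting, unreachable cycles are not kept alive.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy reachability analysis over a graph of named objects. */
class ReachabilityDemo {

    /** Returns every object reachable from the roots by following reference edges. */
    static Set<String> reachableFrom(List<String> roots, Map<String, List<String>> references) {
        Set<String> reachable = new HashSet<>();
        Deque<String> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            String current = pending.pop();
            if (reachable.add(current)) {   // first visit: follow its outgoing references
                pending.addAll(references.getOrDefault(current, Collections.<String>emptyList()));
            }
        }
        return reachable;
    }

    /** Hypothetical object graph: root -> a -> b, plus an isolated cycle c <-> d. */
    static Map<String, List<String>> sampleGraph() {
        Map<String, List<String>> references = new HashMap<>();
        references.put("root", Arrays.asList("a"));
        references.put("a", Arrays.asList("b"));
        references.put("c", Arrays.asList("d"));
        references.put("d", Arrays.asList("c"));
        return references;
    }

    public static void main(String[] args) {
        Set<String> live = reachableFrom(Arrays.asList("root"), sampleGraph());
        System.out.println("b reachable? " + live.contains("b"));
        System.out.println("c reachable? " + live.contains("c"));
    }
}
```

Everything outside the returned set, here c and d, is a candidate for reclamation.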

              

Three. Garbage collection algorithms

1. Mark-sweep algorithm

  The mark-sweep algorithm has two phases: mark and sweep. It first scans from the root set and marks the surviving objects, then scans the entire space and reclaims the objects that were not marked.

            

The mark-sweep algorithm has two major shortcomings:

    • Efficiency: neither the marking phase nor the sweeping phase is very efficient;

    • Space: mark-sweep does not move objects and only reclaims the dead ones, so a sweep leaves behind many discontiguous memory fragments. With too much fragmentation, a later allocation of a large object may fail to find enough contiguous memory and force another garbage collection to be triggered early.
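The fragmentation problem is easy to see in a toy heap. The sketch below assumes a heap of equally sized slots represented by a boolean array; `MarkSweepDemo` and its methods are hypothetical names. After the sweep, half of the heap is free, yet the largest contiguous free run is a single slot, so an object needing two adjacent slots cannot be placed.

```java
/** Toy mark-sweep over a heap of equally sized slots. */
class MarkSweepDemo {

    /** Sweep phase: a slot stays occupied only if it was occupied and marked live. */
    static boolean[] sweep(boolean[] occupied, boolean[] marked) {
        boolean[] after = new boolean[occupied.length];
        for (int i = 0; i < occupied.length; i++) {
            after[i] = occupied[i] && marked[i];
        }
        return after;
    }

    /** Total number of free slots. */
    static int freeSlots(boolean[] occupied) {
        int free = 0;
        for (boolean slot : occupied) {
            if (!slot) {
                free++;
            }
        }
        return free;
    }

    /** Length of the longest run of adjacent free slots. */
    static int largestFreeRun(boolean[] occupied) {
        int best = 0;
        int run = 0;
        for (boolean slot : occupied) {
            run = slot ? 0 : run + 1;
            best = Math.max(best, run);
        }
        return best;
    }

    public static void main(String[] args) {
        boolean[] occupied = { true, true, true, true, true, true, true, true };
        boolean[] marked   = { true, false, true, false, true, false, true, false };
        boolean[] after = sweep(occupied, marked);
        System.out.println("free slots: " + freeSlots(after));
        System.out.println("largest contiguous run: " + largestFreeRun(after));
    }
}
```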

                  

2. Copying algorithm

  The copying algorithm divides the available memory into two blocks of equal size and uses only one of them at a time. When that block is exhausted, the surviving objects are copied to the other block, and the used block is then cleaned up in one pass. This algorithm suits scenarios with a low object survival rate, such as the young generation. Because a whole half of the memory is collected each time, allocation does not have to deal with fragmentation: memory is handed out in order simply by moving the heap-top pointer, which is simple to implement and efficient to run.
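As a rough sketch of the idea, assume each half of the heap is an array of slots and liveness has already been determined. `SemispaceDemo` and `copyLive` are hypothetical names; a real collector copies raw memory and updates references, which this toy omits.

```java
import java.util.Arrays;

/** Toy semispace copy: live objects move from the from-space into an empty to-space. */
class SemispaceDemo {

    /** Copies live objects into a new to-space, packing them from index 0 upward. */
    static String[] copyLive(String[] fromSpace, boolean[] live) {
        String[] toSpace = new String[fromSpace.length];
        int top = 0;   // bump pointer: next free slot in the to-space
        for (int i = 0; i < fromSpace.length; i++) {
            if (fromSpace[i] != null && live[i]) {
                toSpace[top++] = fromSpace[i];
            }
        }
        return toSpace;
    }

    public static void main(String[] args) {
        String[] fromSpace = { "a", "b", "c", "d" };
        boolean[] live = { true, false, false, true };
        System.out.println(Arrays.toString(copyLive(fromSpace, live)));
    }
}
```

The survivors end up packed at the start of the to-space, so the next allocation only has to advance the pointer past them.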

           

  In fact, today's commercial virtual machines use this algorithm to collect the young generation. Studies have found that only about 10% of young-generation objects survive each collection, so few objects need to be copied and efficiency is good. As described in "JVM Memory Model Overview", in practice the young generation is divided into one larger Eden space and two smaller Survivor spaces, and Eden plus one Survivor space are in use at any time. During a collection, the objects still alive in Eden and the in-use Survivor space are copied to the other Survivor space in one pass, and then Eden and the Survivor space just used are cleared. The HotSpot virtual machine's default Eden-to-Survivor size ratio is 8:1, so the usable space in the young generation is 90% of its capacity (80% + 10%) and only 10% is "wasted".
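The 90% figure follows directly from the 8:1:1 split. The small calculation below assumes a young generation of 100 MB purely for illustration; `YoungGenLayout` and its methods are hypothetical names, and a real JVM may round region sizes differently.

```java
/** Arithmetic for an Eden : Survivor : Survivor split of ratio : 1 : 1. */
class YoungGenLayout {

    /** Size of one Survivor space. */
    static long survivorSize(long youngGenBytes, int edenToSurvivorRatio) {
        return youngGenBytes / (edenToSurvivorRatio + 2);
    }

    /** Size of Eden: whatever remains after the two Survivor spaces. */
    static long edenSize(long youngGenBytes, int edenToSurvivorRatio) {
        return youngGenBytes - 2 * survivorSize(youngGenBytes, edenToSurvivorRatio);
    }

    /** Space usable for objects at any moment: Eden plus one Survivor space. */
    static long usableSize(long youngGenBytes, int edenToSurvivorRatio) {
        return edenSize(youngGenBytes, edenToSurvivorRatio)
                + survivorSize(youngGenBytes, edenToSurvivorRatio);
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long young = 100 * mb;   // assumed young generation size
        System.out.println("Eden:     " + edenSize(young, 8) / mb + " MB");
        System.out.println("Survivor: " + survivorSize(young, 8) / mb + " MB each");
        System.out.println("Usable:   " + usableSize(young, 8) / mb + " MB");
    }
}
```

With a ratio of 8, the 100 MB splits into 80 MB of Eden and two 10 MB Survivor spaces, of which 90 MB is usable at any time.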

                

3. Mark-compact algorithm

The copying algorithm has to perform more copy operations when the object survival rate is high, and its efficiency drops. More importantly, unless you are willing to waste 50% of the space, extra space is needed as an allocation guarantee to cover the extreme case in which every object in the used memory survives. For these reasons the old generation generally cannot use this algorithm directly. The marking phase of the mark-compact algorithm is the same as in mark-sweep, but instead of reclaiming the dead objects in place, it moves all surviving objects toward one end and then clears the memory beyond the end boundary, much like disk defragmentation. This algorithm suits scenarios with a high object survival rate (the old generation).

The most notable difference between mark-compact and mark-sweep is that mark-sweep does not move objects and only reclaims the dead ones, whereas mark-compact moves all surviving objects to one end and then reclaims the rest, so it produces no memory fragmentation.
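The compaction step can be sketched as an in-place slide. This toy assumes the marking phase has already produced a `marked` array; `MarkCompactDemo` and `compact` are hypothetical names, and updating references to the moved objects, which a real collector must do, is left out.

```java
import java.util.Arrays;

/** Toy mark-compact: slide marked objects to the low end of the heap, in place. */
class MarkCompactDemo {

    /** Moves marked objects toward index 0 and returns the new top (first free slot). */
    static int compact(String[] heap, boolean[] marked) {
        int top = 0;
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] != null && marked[i]) {
                heap[top++] = heap[i];   // top <= i, so no live object is overwritten
            }
        }
        Arrays.fill(heap, top, heap.length, null);   // clear everything past the boundary
        return top;
    }

    public static void main(String[] args) {
        String[] heap = { "a", "b", "c", "d", "e" };
        boolean[] marked = { true, false, true, false, true };
        int top = compact(heap, marked);
        System.out.println(Arrays.toString(heap) + ", top = " + top);
    }
}
```

All free space ends up as one contiguous block after the boundary, which is exactly what mark-sweep cannot guarantee.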

            

4. Generational collection algorithm

In a large system that creates many objects and method variables, the heap holds a large number of objects, and analyzing them one by one to decide whether each should be reclaimed would inevitably be inefficient. The generational collection algorithm rests on the observation that objects have different lifetimes and that objects with different lifetimes live in different regions of the heap, so collecting each region with a different strategy improves JVM performance. Modern commercial virtual machines all use generational collection: the young generation has a low survival rate and uses the copying algorithm, while the old generation has a high survival rate and uses the mark-sweep or mark-compact algorithm. The Java heap is generally divided into three parts: the young generation, the old generation, and the permanent generation.

               

1). Young generation (Young Generation)

  The goal of the young generation is to collect short-lived objects as quickly as possible. In general, all newly created objects are first placed in the young generation. Its memory is split into one Eden space and two Survivor spaces (Survivor0 and Survivor1) in the ratio 8:1:1, and most objects are created in Eden. During a garbage collection, the survivors in Eden are first copied to Survivor0 and Eden is cleared. When Survivor0 is full, the survivors in Eden and Survivor0 are copied to Survivor1, and Eden and Survivor0 are cleared. Survivor0 is now empty, so the roles of Survivor0 and Survivor1 are swapped (the next collection scans Eden and Survivor1), which keeps one Survivor space empty at all times; the cycle then repeats. In particular, when Survivor1 cannot hold all the surviving objects from Eden and Survivor0, those objects are stored directly in the old generation. If the old generation is also full, a Full GC is triggered, which collects both the young and the old generation. A GC of the young generation is called a Minor GC; Minor GCs are relatively frequent and do not necessarily wait until Eden is full.
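The copy-and-swap cycle can be sketched with a few lists. This is a simplified model under stated assumptions: capacities are counted in objects rather than bytes, liveness is passed in as a set, and `MinorGcDemo` with its fields is a hypothetical construction rather than anything HotSpot exposes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy young generation with Eden, two Survivor spaces, and an old generation. */
class MinorGcDemo {
    final List<String> eden = new ArrayList<>();
    List<String> fromSurvivor = new ArrayList<>();   // Survivor space currently in use
    List<String> toSurvivor = new ArrayList<>();     // Survivor space kept empty
    final List<String> oldGen = new ArrayList<>();
    final int survivorCapacity;                      // hypothetical capacity, in objects

    MinorGcDemo(int survivorCapacity) {
        this.survivorCapacity = survivorCapacity;
    }

    /** Copies live objects out of Eden and the in-use Survivor space, then swaps Survivor roles. */
    void minorGc(Set<String> live) {
        List<String> candidates = new ArrayList<>(eden);
        candidates.addAll(fromSurvivor);
        for (String object : candidates) {
            if (!live.contains(object)) {
                continue;                    // dead objects are simply not copied
            }
            if (toSurvivor.size() < survivorCapacity) {
                toSurvivor.add(object);
            } else {
                oldGen.add(object);          // Survivor space is full: store in the old generation
            }
        }
        eden.clear();
        fromSurvivor.clear();
        List<String> emptied = fromSurvivor; // the emptied space becomes the next copy target
        fromSurvivor = toSurvivor;
        toSurvivor = emptied;
    }

    public static void main(String[] args) {
        MinorGcDemo heap = new MinorGcDemo(2);
        heap.eden.addAll(Arrays.asList("a", "b", "c", "d"));
        heap.minorGc(new HashSet<>(Arrays.asList("a", "b", "c")));
        System.out.println("survivor: " + heap.fromSurvivor + ", old: " + heap.oldGen);
    }
}
```

In the run above, a and b fit in the Survivor space, c overflows into the old generation, and d is dead and disappears with Eden.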

2). Old generation (Old Generation)

  The old generation stores objects with longer lifetimes. As described above, objects that survive N garbage collections in the young generation are moved to the old generation. The old generation is also much larger than the young generation (the young-to-old ratio is roughly 1:2). When the old generation is full, a Major GC (Full GC) is triggered. Because objects in the old generation live relatively long, Full GCs are relatively infrequent.

3). Permanent generation (Permanent Generation)

  The permanent generation mainly stores static data such as Java classes and methods. It has no significant effect on garbage collection. However, some applications generate or load classes dynamically, for example through reflection, dynamic proxies, or bytecode frameworks such as CGLib, and they need a relatively large permanent generation to hold the classes added at run time.

5. Summary

           

Because objects are handled by generation, the regions and timing of garbage collection differ as well. There are two types of GC: Minor GC and Full GC.

    • Minor GC: collects the young generation without affecting the old generation. Most young-generation objects die quickly, so Minor GCs are very frequent; a fast, efficient algorithm is generally used here so that the collection finishes as soon as possible.

    • Full GC: also called Major GC; collects the entire heap, including the young generation, the old generation, and the permanent generation. Because it has to collect the whole heap, a Full GC is slower than a Minor GC, so the number of Full GCs should be kept as low as possible. Causes of a Full GC include: the old generation fills up, the permanent generation (Perm) fills up, and System.gc() is called explicitly.
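One way to watch how often collections happen is the standard java.lang.management API, which exposes one bean per collector. The sketch below only reads counters; `GcActivityDemo` is a hypothetical class name, and because the bean names and their number depend on the JVM and the collector in use, no particular output is assumed.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

/** Reports collection counts and accumulated times for the collectors of the running JVM. */
class GcActivityDemo {

    /** One line per collector: name, collections so far, accumulated collection time. */
    static List<String> collectorSummaries() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            lines.add(bean.getName() + ": count=" + bean.getCollectionCount()
                    + ", timeMs=" + bean.getCollectionTime());
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : collectorSummaries()) {
            System.out.println(line);
        }
    }
}
```

On a generational collector, the bean for the young generation typically shows a much higher count than the one for the old generation, which matches the frequencies described above.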

Four. Garbage collectors

If garbage collection algorithms are the methodology of memory reclamation, garbage collectors are its concrete implementation. There are seven collectors for the different generations: Serial, ParNew, and Parallel Scavenge collect the young generation; Serial Old, Parallel Old, and CMS collect the old generation; and G1 collects the entire Java heap. In the original figure, a line between two collectors indicates that they can be used together.

              

    • Serial collector (copying algorithm): a single-threaded young-generation collector; both marking and cleanup are single-threaded. Its advantage is that it is simple and efficient;

    • Serial Old collector (mark-compact algorithm): a single-threaded old-generation collector; the old-generation version of the Serial collector;

    • ParNew collector (copying algorithm): a parallel young-generation collector; essentially a multithreaded version of the Serial collector that outperforms Serial on multi-core CPUs;

    • Parallel Scavenge collector (copying algorithm): a parallel young-generation collector that aims for high throughput and efficient CPU use. Throughput = user thread time / (user thread time + GC thread time). High throughput uses CPU time efficiently and finishes the program's computation as soon as possible, which suits background applications and other scenarios with few interactivity requirements;

    • Parallel Old collector (mark-compact algorithm): a parallel old-generation collector that prioritizes throughput; the old-generation version of the Parallel Scavenge collector;

    • CMS (Concurrent Mark Sweep) collector (mark-sweep algorithm): a concurrent old-generation collector whose goal is the shortest collection pause time; it is characterized by high concurrency and short pauses;

    • G1 (Garbage First) collector (mark-compact algorithm): a parallel collector for the whole Java heap, introduced in JDK 1.7. G1 is based on the mark-compact algorithm, which means it does not produce memory fragmentation. Another important difference from the earlier collectors is its scope: G1 collects the entire Java heap (both young and old generations), whereas the other six collectors are limited to either the young or the old generation.
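The throughput formula given for the Parallel Scavenge collector is simple to evaluate. As a worked example with assumed figures, a program that spends 99 minutes in user threads and 1 minute in GC threads has a throughput of 99 / (99 + 1) = 0.99. `ThroughputDemo` is a hypothetical helper.

```java
/** Throughput = user thread time / (user thread time + GC thread time). */
class ThroughputDemo {

    static double throughput(double userTimeMillis, double gcTimeMillis) {
        return userTimeMillis / (userTimeMillis + gcTimeMillis);
    }

    public static void main(String[] args) {
        double userTime = 99 * 60_000.0;   // assumed: 99 minutes of application work
        double gcTime = 1 * 60_000.0;      // assumed: 1 minute of garbage collection
        System.out.println("throughput = " + throughput(userTime, gcTime));
    }
}
```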

Five. Memory allocation and reclamation strategy

  Automatic memory management in Java ultimately comes down to automating two tasks: allocating memory to objects and reclaiming the memory allocated to them. In general, objects are allocated mainly in Eden in the young generation; if thread-local allocation buffers (TLAB) are enabled, allocation happens in the thread's TLAB first. In a few cases an object may be allocated directly in the old generation. Overall, the allocation rules are not fixed: the details depend on which combination of garbage collectors is in use and on the virtual machine's memory-related parameters.

1) Objects are allocated in Eden first. When Eden does not have enough space for an allocation, the virtual machine starts a Minor GC. Today's commercial virtual machines generally use the copying algorithm to collect the young generation, dividing it into one larger Eden space and two smaller Survivor spaces and using Eden plus one Survivor space at a time. During a collection, the objects still alive in Eden and the in-use Survivor space are copied to the other Survivor space in one pass, and Eden and the Survivor space just used are then cleared. (HotSpot's default Eden-to-Survivor size ratio is 8:1.) When the Survivor space is not large enough, the old generation is relied on as an allocation guarantee.

2) Large objects go directly into the old generation. A large object is a Java object that needs a large amount of contiguous memory; the most typical examples are very long strings and arrays.

3) Long-lived objects move into the old generation. Once an object has survived a certain number of Minor GCs in the young generation (15 by default), it is promoted to the old generation.

4) Object age is determined dynamically. To adapt better to the memory behavior of different programs, the virtual machine does not always require an object's age to reach MaxTenuringThreshold before promoting it. If the total size of all objects of the same age in the Survivor space is greater than half the size of the Survivor space, objects of that age or older can enter the old generation directly, without waiting for the age required by MaxTenuringThreshold.
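The dynamic age rule in item 4 can be written down as a short function. The sketch follows the rule literally as worded above (one age group on its own exceeding half the Survivor space); it is an illustration rather than the JVM's code, and HotSpot's actual computation accumulates sizes from the youngest age upward, so results can differ. `TenuringThresholdDemo` and `promotionAge` are hypothetical names.

```java
/** Literal reading of the dynamic age rule; not the JVM's actual implementation. */
class TenuringThresholdDemo {

    /**
     * sizeByAge[age] is the total size of Survivor-space objects of that age (index 0 is unused).
     * Returns the age from which objects may be promoted to the old generation.
     */
    static int promotionAge(long[] sizeByAge, long survivorSize, int maxTenuringThreshold) {
        for (int age = 1; age < sizeByAge.length && age < maxTenuringThreshold; age++) {
            if (2 * sizeByAge[age] > survivorSize) {
                return age;   // this age group alone fills more than half of the Survivor space
            }
        }
        return maxTenuringThreshold;   // no group is large enough: fall back to the configured age
    }

    public static void main(String[] args) {
        long[] sizeByAge = { 0, 10, 20, 60, 5 };   // assumed sizes for ages 1 to 4
        System.out.println("promote from age " + promotionAge(sizeByAge, 100, 15));
    }
}
```

With the assumed sizes, the age-3 group occupies 60 of 100 units, so objects aged 3 or more are promoted without waiting for age 15.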

Note that Java's garbage collection mechanism is a capability of the Java virtual machine: it dynamically reclaims, at unscheduled idle moments, the memory occupied by objects that no longer have any references. In other words, what the garbage collector reclaims is the memory occupied by unreferenced objects, not the objects themselves.

Six. Memory leak issues in Java

Although Java has a garbage collection mechanism, memory leaks can still occur, for example in the following cases:

(1). Static collection classes such as HashMap and Vector are the most common source of memory leaks. A static variable lives as long as the application does, so none of the objects it holds can be freed, because the Vector (or other collection) still references them.

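    import java.util.Vector;

    public class VectorLeak {
        // A static collection lives as long as the application does.
        private static Vector<Object> v = new Vector<>();

        public void test(Vector<Object> v) {
            for (int i = 1; i < 100; i++) {
                Object o = new Object();
                v.add(o);   // the Vector now holds a reference to the object
                o = null;   // clears only the local reference
            }
        }
    }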

In this example, the virtual machine stack holds a reference v to the Vector object and a reference o to each Object. In the for loop we keep creating new objects, adding them to the Vector, and then setting o to null. The problem is that although o has been cleared, the objects we created cannot be reclaimed when a garbage collection occurs. When the collector traces references from the stack, it finds v; following v, it finds that the memory v points to still holds references to those objects. In other words, even though o is null, the objects are still reachable through another reference, so the GC cannot free them. If the objects are of no further use to the program after the loop, we consider this Java program to have a memory leak.
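One possible remedy, assuming the objects really are no longer needed after the loop, is to remove them from the collection so that no reference chain leads to them. The sketch below uses a hypothetical `StaticCollectionDemo` class with a static list standing in for the Vector.

```java
import java.util.ArrayList;
import java.util.List;

/** A static collection keeps its elements reachable until they are removed. */
class StaticCollectionDemo {
    static final List<Object> CACHE = new ArrayList<>();

    /** Mirrors the loop above: the local is cleared, but CACHE still references each object. */
    static int fill() {
        for (int i = 1; i < 100; i++) {
            Object o = new Object();
            CACHE.add(o);
            o = null;
        }
        return CACHE.size();
    }

    /** Drops the collection's references so the objects become unreachable. */
    static int release() {
        CACHE.clear();
        return CACHE.size();
    }

    public static void main(String[] args) {
        System.out.println("held after fill: " + fill());
        System.out.println("held after release: " + release());
    }
}
```

Clearing the collection (or setting the static field to null) is what makes the objects eligible for collection; setting the local variable to null does not.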

(2). Resource connections of all kinds, including database connections, network connections, and I/O connections, that are not explicitly shut down with close() are not reclaimed by the GC, which causes a memory leak.

(3). Listeners can also cause memory leaks when an object is released without removing its corresponding listeners.
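Both cases have standard remedies: close resources with try-with-resources and remove listeners when their owner is disposed. The sketch below uses stand-ins only; `TrackedConnection` and `EventSource` are hypothetical classes invented for this example, not real database or UI APIs.

```java
import java.util.ArrayList;
import java.util.List;

/** Explicit cleanup for resources and listeners, using hypothetical stand-in classes. */
class CleanupDemo {

    /** Stand-in for a database, network, or I/O connection. */
    static final class TrackedConnection implements AutoCloseable {
        boolean closed = false;
        int uses = 0;

        void query() {
            uses++;
        }

        @Override
        public void close() {
            closed = true;
        }
    }

    /** Stand-in for an event source that keeps strong references to its listeners. */
    static final class EventSource {
        final List<Runnable> listeners = new ArrayList<>();

        void addListener(Runnable listener) {
            listeners.add(listener);
        }

        void removeListener(Runnable listener) {
            listeners.remove(listener);
        }
    }

    /** try-with-resources calls close() when the block exits, even if an exception is thrown. */
    static TrackedConnection useAndClose() {
        TrackedConnection opened = new TrackedConnection();
        try (TrackedConnection connection = opened) {
            connection.query();
        }
        return opened;
    }

    /** Registers a listener and removes it again; returns how many listeners remain. */
    static int registerThenRemove(EventSource source) {
        Runnable listener = () -> { };
        source.addListener(listener);
        source.removeListener(listener);   // without this, source would keep the listener alive
        return source.listeners.size();
    }

    public static void main(String[] args) {
        System.out.println("closed: " + useAndClose().closed);
        System.out.println("listeners left: " + registerThenRemove(new EventSource()));
    }
}
```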

Seven. Supplementary notes

1. References

1). Reference overview

  Whether the reference counting algorithm counts an object's references or the reachability analysis algorithm checks whether its reference chain is reachable, deciding whether an object is alive always comes down to "references". Before JDK 1.2, Java defined references in the traditional way: if the value stored in a piece of reference-type data is the starting address of another block of memory, that piece of data is said to represent a reference. This definition is pure but too narrow: an object is either referenced or not referenced, and there is no way to describe objects that are "not very useful, yet a pity to throw away". We want to describe a class of objects that stay in memory while there is enough space but can be discarded if memory is still tight after a garbage collection. Many caching features fit this scenario.

For this reason, since JDK 1.2 Java has extended the concept into four kinds of reference: strong references (Strong Reference), soft references (Soft Reference), weak references (Weak Reference), and phantom references (Phantom Reference), in decreasing order of strength.

2). Types of references and their definitions
  
  A strong reference is the ordinary kind of reference found throughout program code, such as "Object obj = new Object()". As long as a strong reference exists, the garbage collector never reclaims the referenced object.

  A soft reference describes an object that is still useful but not essential. Objects that are only softly referenced are added to the collection scope for a second round of collection just before the system would throw an out-of-memory error. If that collection still does not free enough memory, the out-of-memory error is thrown. Since JDK 1.2, the SoftReference class implements soft references.

  A weak reference also describes a non-essential object, but it is weaker than a soft reference: an object that is only weakly referenced survives only until the next garbage collection. When the garbage collector runs, such objects are reclaimed whether or not memory is currently sufficient. Since JDK 1.2, the WeakReference class implements weak references.

  A phantom reference is the weakest kind of reference. Whether an object has a phantom reference does not affect its lifetime at all, and an object instance cannot be obtained through a phantom reference. The only purpose of attaching a phantom reference to an object is to receive a system notification when the object is reclaimed by the collector. Since JDK 1.2, the PhantomReference class implements phantom references.
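The four kinds of reference map onto classes in java.lang.ref. The sketch below shows only the behavior that is deterministic: while a strong reference to the object is still held, soft and weak references return it, and a phantom reference never returns its referent. What happens after the strong reference is dropped depends on when the collector runs, so it is not demonstrated. `ReferenceKindsDemo` is a hypothetical class name.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;

/** Shows what each reference type returns while the referent is still strongly reachable. */
class ReferenceKindsDemo {

    /** Returns { soft sees object, weak sees object, phantom sees object }. */
    static boolean[] observe() {
        Object strong = new Object();   // strong reference: keeps the object alive
        SoftReference<Object> soft = new SoftReference<>(strong);
        WeakReference<Object> weak = new WeakReference<>(strong);
        ReferenceQueue<Object> queue = new ReferenceQueue<>();   // receives the phantom after collection
        PhantomReference<Object> phantom = new PhantomReference<>(strong, queue);

        boolean softSees = soft.get() == strong;
        boolean weakSees = weak.get() == strong;
        boolean phantomSees = phantom.get() != null;   // get() on a phantom reference is always null
        return new boolean[] { softSees, weakSees, phantomSees };
    }

    public static void main(String[] args) {
        boolean[] seen = observe();
        System.out.println("soft: " + seen[0] + ", weak: " + seen[1] + ", phantom: " + seen[2]);
    }
}
```

A cache would typically hold its values through SoftReference or WeakReference and treat a null result from get() as a cache miss.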

2. Reclaiming the method area

Garbage collection in the method area mainly targets obsolete constants in the constant pool and the unloading of types. Reclaiming obsolete constants is very similar to reclaiming objects in the Java heap. Take a constant-pool literal as an example: suppose the string "abc" has entered the constant pool, but no String object in the system is currently "abc"; in other words, no String object references the "abc" constant in the constant pool, and nothing else references this literal either. If a memory reclamation happens at this point and it is necessary, the "abc" constant will be "evicted" from the constant pool. Symbolic references to other classes (interfaces), methods, and fields in the constant pool are handled similarly.

Deciding whether a constant is an "obsolete constant" is relatively simple; the conditions for deciding that a class is a "useless class" are much stricter. A class must meet all three of the following conditions to count as a "useless class":

    • All instances of the class have been reclaimed, i.e. no instance of the class exists in the Java heap;

    • The ClassLoader that loaded the class has been reclaimed;

    • The java.lang.Class object for the class is not referenced anywhere, so the class's methods cannot be accessed from anywhere through reflection.

      A virtual machine may reclaim (unload) a useless class that meets the three conditions above, but this only means it "may": unlike objects, a class is not necessarily reclaimed once it is no longer used. In particular, scenarios that make heavy use of reflection, dynamic proxies, and bytecode frameworks such as CGLib, or that frequently define custom ClassLoaders, such as dynamically generated JSPs and OSGi, require the virtual machine to support class unloading so that the permanent generation does not overflow.

Eight. More

For more about the structure of the JVM memory model, how Java objects are created and located in the virtual machine, and how to analyze memory errors, please see my post "JVM Memory Model Overview".

For more on advanced Java SE topics, please follow my column "Java SE advanced path". The column summarizes advanced Java knowledge such as JVM fundamentals, Java source code, and design patterns, from beginner to advanced level, and analyzes the internal logic of each topic across the whole of Java, so that I can improve step by step while sharing what I learn with everyone. Tall buildings rise from the ground, and your foundation determines your limit; let us climb to the top of Java together ...

