Java Garbage Collector standard explanation and usage, java garbage collection

Source: Internet
Author: User

Overview

When people talk about Garbage Collection (GC), most regard it as a by-product of the Java language. In fact, GC has a longer history than Java. Lisp, born at MIT in 1960, was the first language to truly use dynamic memory allocation and garbage collection. While Lisp was still in its infancy, people were already thinking about the three things GC has to accomplish:

Which memory needs to be recycled?

When will it be recycled?

How to recycle it?

After half a century of development, dynamic memory allocation and reclamation have become quite mature, and everything seems to have entered an "automated" era. So why do we still need to understand GC and memory allocation? The answer is simple: when we have to troubleshoot memory overflows and memory leaks, or when garbage collection becomes the bottleneck that keeps a system from reaching higher concurrency, we need to monitor and tune these "automated" technologies.

Let us move from half a century ago back to the present, and to the familiar Java language. Chapter 2 described the various parts of the Java runtime memory area. The program counter, the virtual machine stack, and the native method stack are created with a thread and destroyed with it; stack frames are pushed and popped in an orderly way as methods are entered and exited. The amount of memory allocated in each stack frame is basically known once the class structure is determined (the JIT compiler performs some optimizations at run time, but in the conceptual model discussed in this chapter it can generally be considered known at compile time). Memory allocation and reclamation in these areas are therefore deterministic, and there is little need to worry about reclaiming them: when the method or the thread ends, the memory is reclaimed along with it. The Java heap and the method area are different. Different implementations of one interface may need different amounts of memory, and so may different branches of one method; only while the program is running do we know which objects will be created. Allocation and reclamation of this memory are dynamic, and this is the memory the garbage collector cares about. In the rest of this book, memory "allocation" and "reclamation" refer only to this part of memory.

 

Is the object dead?

Almost all object instances in the Java world are stored in the heap. Before the garbage collector reclaims the heap, the first thing it has to do is determine which objects are still "alive" and which are "dead" (that is, objects that can no longer be used by any means).

  

Reference Counting Algorithm

Many textbooks describe the following algorithm for deciding whether an object is alive: add a reference counter to the object; each time a reference to it is created, increment the counter; each time a reference becomes invalid, decrement it. An object whose counter is 0 can no longer be used. I have interviewed many new graduates and developers with years of working experience, and they all gave this answer to this question.

Objectively speaking, reference counting is simple to implement and efficient at deciding liveness. In most cases it is a good algorithm, and it has well-known applications: Microsoft's COM (Component Object Model) technology, Flash Player with ActionScript 3, the Python language, and Squirrel, which is widely used for game scripting, all use reference counting for memory management. However, Java does not use reference counting to manage memory, mainly because it has difficulty handling circular references between objects.

For a simple example, see the testGC() method in the code below. The objects objA and objB both have a field named instance, and the assignments objA.instance = objB and objB.instance = objA are made. Apart from this, nothing else references the two objects, so in fact they can no longer be accessed. But because they reference each other, their reference counts are not 0, and a reference counting algorithm could not tell the collector to reclaim them. The code is as follows:

    public class ReferenceCountingGC {

        public static void main(String[] args) {
            testGC();
        }

        public Object instance = null;

        private static final int _1MB = 1024 * 1024;

        /**
         * The only purpose of this field is to occupy memory, so that the GC log
         * shows whether the object has been reclaimed.
         */
        private byte[] bigSize = new byte[2 * _1MB];

        public static void testGC() {
            ReferenceCountingGC objA = new ReferenceCountingGC();
            ReferenceCountingGC objB = new ReferenceCountingGC();
            objA.instance = objB;
            objB.instance = objA;

            objA = null;
            objB = null;

            // If a GC happens on this line, can objA and objB be reclaimed?
            System.gc();
        }
    }

  Running result:

    [Full GC (System) [Tenured: 0K->210K(10240K), 0.0149142 secs] 4603K->210K(19456K), [Perm : 2999K->2999K(21248K)], 0.0150007 secs] [Times: user=0.01 sys=0.00, real=0.02 secs]
    Heap
     def new generation   total 9216K, used 82K [0x00000000055e0000, 0x0000000005fe0000, 0x0000000005fe0000)
      Eden space 8192K,   1% used [0x00000000055e0000, 0x00000000055f4850, 0x0000000005de0000)
      from space 1024K,   0% used [0x0000000005de0000, 0x0000000005de0000, 0x0000000005ee0000)
      to   space 1024K,   0% used [0x0000000005ee0000, 0x0000000005ee0000, 0x0000000005fe0000)
     tenured generation   total 10240K, used 210K [0x0000000005fe0000, 0x00000000069e0000, 0x00000000069e0000)
       the space 10240K,   2% used [0x0000000005fe0000, 0x0000000006014a18, 0x0000000006014c00, 0x00000000069e0000)
     compacting perm gen  total 21248K, used 3016K [0x00000000069e0000, 0x0000000007ea0000, 0x000000000bde0000)
       the space 21248K,  14% used [0x00000000069e0000, 0x0000000006cd2398, 0x0000000006cd2400, 0x0000000007ea0000)
    No shared spaces configured.

In the running result, the GC log contains "4603K->210K", which means the virtual machine did reclaim the two objects even though they reference each other. This also shows that the virtual machine does not use the reference counting algorithm to determine whether an object is alive.
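To see why a pure reference counter would get stuck on this example, the bookkeeping can be replayed by hand. The following is only a toy model under stated assumptions: RcObject, RefCountDemo, and setInstance are hypothetical names invented for illustration, and no real JVM keeps counters like this.

```java
/** Toy reference-counted object; a hypothetical model, not JVM internals. */
class RcObject {
    int refCount = 0;          // number of references currently pointing here
    RcObject instance = null;  // single outgoing reference, like the field in the example
}

class RefCountDemo {
    /** Points holder.instance at target and keeps both counters up to date. */
    static void setInstance(RcObject holder, RcObject target) {
        if (holder.instance != null) {
            holder.instance.refCount--;
        }
        holder.instance = target;
        if (target != null) {
            target.refCount++;
        }
    }

    /** Replays testGC() and returns the final counters of objA and objB. */
    static int[] simulate() {
        RcObject objA = new RcObject();
        RcObject objB = new RcObject();
        objA.refCount++;            // the local variable objA
        objB.refCount++;            // the local variable objB
        setInstance(objA, objB);    // objA.instance = objB
        setInstance(objB, objA);    // objB.instance = objA
        objA.refCount--;            // objA = null
        objB.refCount--;            // objB = null
        return new int[] { objA.refCount, objB.refCount };
    }

    public static void main(String[] args) {
        int[] counts = simulate();
        // Neither counter reaches 0, so a pure counter would never free the pair.
        System.out.println("objA=" + counts[0] + ", objB=" + counts[1]);
    }
}
```

Both counters end at 1 although nothing outside the pair can reach either object, which is exactly the situation the GC log shows the real virtual machine handling without trouble.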

 

Root Search Algorithm

In mainstream commercial programming languages (Java, C#, and even the venerable Lisp mentioned above), the root search algorithm (GC Roots Tracing) is used to determine whether an object is alive. The basic idea is to take a set of objects called "GC Roots" as starting points and search downward from these nodes. The path traversed by the search is called a reference chain. When no reference chain connects an object to the GC Roots (in graph-theory terms, the object is unreachable from the GC Roots), the object is proven to be unusable. As shown in Figure 3-1, although objects 5, 6, and 7 are connected to one another, they are not reachable from the GC Roots, so they are judged to be reclaimable.

  In Java, the following kinds of objects can serve as GC Roots:

    Objects referenced from the virtual machine stack (the local variable table in a stack frame).

Objects referenced by static class attributes in the method area.

Objects referenced by constants in the method area.

Objects referenced by JNI (that is, native methods) in the native method stack.
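The reachability test described above is an ordinary graph traversal. The sketch below is a minimal illustration, not HotSpot's implementation: objects are plain strings, the edge map is made up, and the names RootSearchDemo, reachable, and garbage are assumptions introduced here. The graph loosely follows the description of Figure 3-1, in which objects 5, 6, and 7 refer to each other but are cut off from the roots.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class RootSearchDemo {
    /** Returns every node reachable from the roots by following reference edges. */
    static Set<String> reachable(List<String> roots, Map<String, List<String>> refs) {
        Set<String> live = new HashSet<>();
        Deque<String> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            String node = pending.pop();
            if (live.add(node)) {  // first visit: follow its outgoing references
                pending.addAll(refs.getOrDefault(node, Collections.<String>emptyList()));
            }
        }
        return live;
    }

    /** Objects that no reference chain connects to any root, in sorted order. */
    static Set<String> garbage(Set<String> all, List<String> roots, Map<String, List<String>> refs) {
        Set<String> dead = new TreeSet<>(all);
        dead.removeAll(reachable(roots, refs));
        return dead;
    }

    /** Builds the hypothetical example graph and returns its unreachable objects. */
    static Set<String> demo() {
        Map<String, List<String>> refs = new HashMap<>();
        refs.put("gcRoot", Arrays.asList("object1"));
        refs.put("object1", Arrays.asList("object2", "object3"));
        refs.put("object2", Arrays.asList("object4"));
        refs.put("object5", Arrays.asList("object6", "object7"));
        refs.put("object6", Arrays.asList("object5"));  // a cycle among unreachable objects
        Set<String> all = new HashSet<>(Arrays.asList(
                "object1", "object2", "object3", "object4", "object5", "object6", "object7"));
        return garbage(all, Arrays.asList("gcRoot"), refs);
    }

    public static void main(String[] args) {
        System.out.println(demo());  // prints [object5, object6, object7]
    }
}
```

Note that the cycle between object5 and object6 makes no difference here: the traversal never enters it, so the circular-reference problem of reference counting does not arise.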

        

 

References Revisited

Whether the reference counting algorithm counts the references to an object, or the root search algorithm checks whether an object's reference chain is reachable, deciding whether an object is alive always comes down to "references". Before JDK 1.2, the definition of a reference in Java was very traditional: if the value stored in a piece of reference-type data represents the starting address of another block of memory, that data is said to represent a reference. This definition is pure but too narrow. Under it, an object is either referenced or not referenced, and there is no way to describe objects that are "tasteless to eat, yet a pity to throw away". We would like to describe objects of this kind: while memory is sufficient they are kept in memory, and if memory is still tight after a garbage collection they can be discarded. Many system cache features fit this scenario.

After JDK 1.2, Java extended the concept of a reference and divided references into four kinds: Strong Reference, Soft Reference, Weak Reference, and Phantom Reference. The strength of these four kinds decreases in that order.

Strong references are the ones commonly found in program code, such as "Object obj = new Object()". As long as a strong reference still exists, the garbage collector will never reclaim the referenced object.

  Soft references describe objects that are useful but not essential. Before the system is about to throw an out-of-memory error, objects associated with soft references are included in the reclamation scope and a second collection is performed. If that collection still does not free enough memory, the out-of-memory error is thrown. Since JDK 1.2, the SoftReference class has been provided to implement soft references.

Weak references also describe non-essential objects, but they are weaker than soft references: an object associated only with weak references survives only until the next garbage collection. When the garbage collector runs, such objects are reclaimed regardless of whether the current memory is sufficient. Since JDK 1.2, the WeakReference class has been provided to implement weak references.

A phantom reference, also called a ghost reference, is the weakest kind of reference. Whether an object has a phantom reference has no effect on its lifetime, and an object instance cannot be obtained through a phantom reference. The only purpose of associating a phantom reference with an object is to receive a system notification when the object is reclaimed by the collector. Since JDK 1.2, the PhantomReference class has been provided to implement phantom references.
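As a small illustration of the four kinds of reference, the sketch below creates one of each using the standard java.lang.ref classes. The class and method names (ReferenceKindsDemo, whileStronglyReachable) are made up for this example. Only the checks made while a strong reference still exists are deterministic; what happens after System.gc() depends on the virtual machine, so the last line prints whatever it observes.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;

class ReferenceKindsDemo {
    /** Checks made while a strong reference keeps the referent alive. */
    static boolean[] whileStronglyReachable() {
        Object strong = new Object();  // strong reference
        SoftReference<Object> soft = new SoftReference<>(strong);
        WeakReference<Object> weak = new WeakReference<>(strong);
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        PhantomReference<Object> phantom = new PhantomReference<>(strong, queue);

        boolean softResolves = soft.get() == strong;
        boolean weakResolves = weak.get() == strong;
        // A phantom reference never hands out its referent.
        boolean phantomHidden = phantom.get() == null;
        return new boolean[] { softResolves, weakResolves, phantomHidden };
    }

    public static void main(String[] args) {
        boolean[] r = whileStronglyReachable();
        System.out.println("soft resolves: " + r[0]);
        System.out.println("weak resolves: " + r[1]);
        System.out.println("phantom hidden: " + r[2]);

        WeakReference<Object> weak = new WeakReference<>(new Object());
        System.gc();  // only a request; the JVM is free to ignore it
        // Usually null once a collection has run, but this is not guaranteed.
        System.out.println("weak after gc: " + weak.get());
    }
}
```

The ReferenceQueue passed to the PhantomReference is where the collector's notification would arrive; polling it is omitted here because its timing is not deterministic.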

 

To live or to die?

Objects that are unreachable in the root search algorithm are not necessarily doomed. At this point they are only on "probation". To truly declare an object dead, it must go through at least two marking passes. If root search finds no reference chain connecting the object to the GC Roots, the object is marked for the first time and filtered. The filter condition is whether it is necessary to execute the finalize() method for this object. If the object has not overridden finalize(), or its finalize() method has already been called by the virtual machine, the virtual machine treats both cases as "not necessary to execute".

If the object is judged to need its finalize() method executed, it is placed in a queue called the F-Queue and is later executed by a low-priority Finalizer thread that the virtual machine creates automatically. "Execute" here means that the virtual machine will trigger the method, but it does not promise to wait for it to finish. The reason is that if an object's finalize() method runs slowly, or (in the extreme case) loops forever, the other objects in the F-Queue could wait permanently, and the whole memory reclamation system could even crash. The finalize() method is an object's last chance to escape death. Later, the GC performs a second, small-scale marking of the objects in the F-Queue. If an object wants to save itself in finalize(), it only needs to re-associate itself with any object on a reference chain, for example by assigning itself (the this keyword) to a class variable or to a member variable of some object; at the second marking it is then removed from the "about to be reclaimed" set. If the object has not escaped by then, it is not far from death. The code that follows (code listing 3-2) shows an object whose finalize() is executed but which still survives.

From the running result of code listing 3-2, we can see that the finalize() method of the SAVE_HOOK object was indeed triggered by the collector, and that the object escaped before being collected.

    public class FinalizeEscapeGC {

        public static FinalizeEscapeGC SAVE_HOOK = null;

        public void isAlive() {
            System.out.println("yes, i am still alive!");
        }

        @Override
        protected void finalize() throws Throwable {
            super.finalize();
            System.out.println("finalize method executed!");
            // Self-rescue: re-attach this object to a class variable
            FinalizeEscapeGC.SAVE_HOOK = this;
        }

        public static void main(String[] args) throws Throwable {
            SAVE_HOOK = new FinalizeEscapeGC();

            // The object saves itself successfully the first time
            SAVE_HOOK = null;
            System.gc();
            // The Finalizer thread has a low priority, so pause for 0.5 seconds to wait for it
            Thread.sleep(500);
            if (SAVE_HOOK != null) {
                SAVE_HOOK.isAlive();
            } else {
                System.out.println("no, i am dead!");
            }

            // The following code is exactly the same as above, but this time the self-rescue fails
            SAVE_HOOK = null;
            System.gc();
            // The Finalizer thread has a low priority, so pause for 0.5 seconds to wait for it
            Thread.sleep(500);
            if (SAVE_HOOK != null) {
                SAVE_HOOK.isAlive();
            } else {
                System.out.println("no, i am dead!");
            }
        }
    }

Running result:

    finalize method executed!
    yes, i am still alive!
    no, i am dead!

Another thing worth noting is that the code contains two identical segments, yet one run ends in a successful escape and the other in failure. This is because the finalize() method of any object is called automatically by the system only once. If the object faces reclamation a second time, its finalize() method is not executed again, so the self-rescue in the second segment fails. It should be stressed that the description above of an object's death and finalize() may carry a touch of tragic drama, but I do not encourage anyone to save objects this way. On the contrary, I suggest avoiding it as much as possible, because it is not the destructor of C/C++ but a compromise made when Java was born to make the language easier for C/C++ programmers to accept. It is expensive to run, highly nondeterministic, and cannot guarantee the order in which objects are finalized. Some textbooks say it is suitable for work such as "closing external resources", which is little more than a consolation for the method's existence. Everything finalize() can do can be done better and in a more timely way with try-finally or other means, and you can safely forget that this method exists in the Java language.
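As one of the "other means" of releasing external resources deterministically, the sketch below uses try-with-resources (available since Java 7), which the text above does not name explicitly; it behaves like a try-finally that always calls close(). ManagedResource and CleanupDemo are hypothetical classes written for this illustration, and the StringBuilder log merely records the order of events.

```java
/** A hypothetical resource that records what happens to it. */
class ManagedResource implements AutoCloseable {
    private final StringBuilder log;

    ManagedResource(StringBuilder log) {
        this.log = log;
        log.append("open;");
    }

    void use() {
        log.append("use;");
    }

    @Override
    public void close() {
        // Runs as soon as the try block exits, not at some later GC cycle.
        log.append("close;");
    }
}

class CleanupDemo {
    /** Uses the resource, optionally failing midway, and returns the event log. */
    static String run(boolean fail) {
        StringBuilder log = new StringBuilder();
        try (ManagedResource resource = new ManagedResource(log)) {
            resource.use();
            if (fail) {
                throw new IllegalStateException("simulated failure");
            }
        } catch (IllegalStateException e) {
            log.append("caught;");
        }
        return log.toString();
    }

    public static void main(String[] args) {
        System.out.println(run(false));  // prints open;use;close;
        System.out.println(run(true));   // prints open;use;close;caught;
    }
}
```

In both runs close() is called exactly once and at a known point, which is the timeliness and determinism that finalize() cannot offer.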

 

Reclaiming the Method Area

Many people think that there is no garbage collection in the method area (or the permanent generation in the HotSpot virtual machine). The Java Virtual Machine Specification does say that a virtual machine is not required to implement garbage collection in the method area, and the "cost-effectiveness" of collecting garbage there is generally low: in the heap, especially in the new generation, a single garbage collection in a typical application can reclaim 70% to 95% of the space, while the efficiency of collection in the permanent generation is far lower than that.

  Garbage collection in the permanent generation mainly reclaims two things: obsolete constants and useless classes. Reclaiming obsolete constants is very similar to reclaiming objects in the Java heap. Take literals in the constant pool as an example: suppose the string "abc" has entered the constant pool, but no String object in the current system has the value "abc"; in other words, no String object references the "abc" constant in the constant pool, and nothing else references this literal either. If memory reclamation happens at this point, and it is necessary, the "abc" constant will be "invited" out of the constant pool by the system. Symbolic references to other classes (interfaces), methods, and fields in the constant pool are handled similarly. Deciding whether a constant is obsolete is relatively simple, but the conditions for deciding that a class is useless are much stricter. A class must satisfy all three of the following conditions to count as a "useless class":

All instances of the class have been reclaimed, that is, no instance of the class exists in the Java heap.

The ClassLoader that loaded the class has been reclaimed.

The java.lang.Class object corresponding to the class is not referenced anywhere, so the class's methods cannot be accessed through reflection anywhere.

A virtual machine may reclaim useless classes that meet the three conditions above. Note that this is only "may": unlike objects, classes are not necessarily reclaimed once they are no longer used. Whether classes are reclaimed can be controlled in the HotSpot virtual machine with the -Xnoclassgc parameter, and class loading and unloading information can be viewed with -verbose:class, -XX:+TraceClassLoading, and -XX:+TraceClassUnloading.

Scenarios that make heavy use of reflection, dynamic proxies, CGLib, and other bytecode frameworks, as well as scenarios that dynamically generate JSPs or frequently define custom ClassLoaders, such as OSGi, need the virtual machine to support class unloading so that the permanent generation does not overflow.

 

Garbage Collection Algorithms

Implementing a garbage collection algorithm involves a great many program details, and virtual machines on different platforms manipulate memory in different ways. This section therefore does not discuss implementations in depth; it only introduces the ideas behind several algorithms and how they evolved.

 

Mark-Sweep Algorithm

The most basic collection algorithm is the Mark-Sweep algorithm. As its name suggests, it has two phases, "mark" and "sweep": first all objects that need to be reclaimed are marked, and after marking is complete all marked objects are reclaimed in one pass. How objects are marked was described in the previous section on deciding object liveness. It is called the most basic collection algorithm because the later collection algorithms are all based on this idea and improve on its shortcomings. It has two main drawbacks. One is efficiency: neither marking nor sweeping is very efficient. The other is space: sweeping leaves behind a large number of discontinuous memory fragments. With too much fragmentation, when the program later needs to allocate a large object, it may fail to find enough contiguous memory and have to trigger another garbage collection early. The execution of the mark-sweep algorithm is shown in Figure 3-2.
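The fragmentation drawback can be made concrete with a toy heap. In the sketch below, which is an illustration under stated assumptions rather than any real collector, the heap is an array of one-slot objects named by letters, '.' denotes a free slot, and the reachable array stands in for the result of the liveness analysis. MarkSweepDemo and its methods are hypothetical names.

```java
class MarkSweepDemo {
    /** Sweep phase: frees every slot that was not found reachable; survivors stay in place. */
    static char[] sweep(char[] heap, boolean[] reachable) {
        char[] after = heap.clone();
        for (int i = 0; i < after.length; i++) {
            if (!reachable[i]) {
                after[i] = '.';
            }
        }
        return after;
    }

    /** Total number of free slots. */
    static int freeSlots(char[] heap) {
        int free = 0;
        for (char slot : heap) {
            if (slot == '.') {
                free++;
            }
        }
        return free;
    }

    /** Length of the longest run of adjacent free slots. */
    static int largestFreeRun(char[] heap) {
        int best = 0;
        int run = 0;
        for (char slot : heap) {
            run = (slot == '.') ? run + 1 : 0;
            best = Math.max(best, run);
        }
        return best;
    }

    public static void main(String[] args) {
        char[] heap = "ABCDEFGH".toCharArray();
        boolean[] reachable = { true, false, false, true, false, true, false, true };
        char[] after = sweep(heap, reachable);
        System.out.println(new String(after));                   // prints A..D.F.H
        System.out.println("free: " + freeSlots(after));          // prints free: 4
        System.out.println("largest run: " + largestFreeRun(after)); // prints largest run: 2
    }
}
```

After the sweep, four slots are free but no more than two are adjacent, so a three-slot object could not be allocated even though half of the heap is empty.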

        

 

Copying Algorithm

  To solve the efficiency problem, a collection algorithm called "Copying" appeared. It divides the available memory into two blocks of equal size and uses only one of them at a time. When that block is used up, the surviving objects are copied to the other block, and the used block is then cleaned up in one go. This way an entire half is reclaimed each time, and allocation does not need to consider complications such as fragmentation: it only needs to move the heap-top pointer and allocate memory in order, which is simple to implement and efficient to run. The price of this algorithm is that usable memory shrinks to half of the original, which is rather too high. The execution of the copying algorithm is shown in Figure 3-3.
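The two halves and the heap-top pointer can be sketched in a few lines. This is a toy model, not a real collector: objects are single characters occupying one slot each, the set of survivors is supplied by the caller, and SemispaceDemo with its methods is a hypothetical name chosen for this example.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class SemispaceDemo {
    private char[] from;  // the half currently used for allocation
    private char[] to;    // the idle half that receives survivors
    private int top = 0;  // heap-top pointer: next free slot in the active half

    SemispaceDemo(int halfSize) {
        from = new char[halfSize];
        to = new char[halfSize];
        Arrays.fill(from, '.');
        Arrays.fill(to, '.');
    }

    /** Bump-pointer allocation; returns false when the active half is full. */
    boolean allocate(char id) {
        if (top == from.length) {
            return false;
        }
        from[top++] = id;
        return true;
    }

    /** Copies survivors to the idle half, wipes the used half, then swaps roles. */
    void collect(Set<Character> survivors) {
        int newTop = 0;
        for (int i = 0; i < top; i++) {
            if (survivors.contains(from[i])) {
                to[newTop++] = from[i];
            }
        }
        Arrays.fill(from, '.');
        char[] previous = from;
        from = to;
        to = previous;
        top = newTop;
    }

    String activeHalf() {
        return new String(from);
    }

    /** Fills the active half, collects with B and D alive, then allocates again. */
    static String demo() {
        SemispaceDemo heap = new SemispaceDemo(4);
        for (char id : "ABCD".toCharArray()) {
            heap.allocate(id);
        }
        boolean fifthFits = heap.allocate('E');  // false: the active half is full
        heap.collect(new HashSet<>(Arrays.asList('B', 'D')));
        heap.allocate('E');                      // fits after the collection
        return fifthFits + " " + heap.activeHalf();
    }

    public static void main(String[] args) {
        System.out.println(demo());  // prints false BDE.
    }
}
```

The survivors end up packed at the start of the new active half, so the free space is always one contiguous block, at the cost of keeping the other half idle.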

        

Today's commercial virtual machines all use this algorithm to collect the new generation. Dedicated research by IBM shows that 98% of objects in the new generation are short-lived ("born in the morning, dead by evening"), so there is no need to split the memory 1:1. Instead, the memory is divided into one larger Eden space and two smaller Survivor spaces, and at any time Eden and one of the Survivor spaces are in use. At collection time, the objects still alive in Eden and in that Survivor space are copied in one go to the other Survivor space, and then Eden and the Survivor space just used are cleaned up. By default, the ratio of Eden to a Survivor space is 8:1, so the usable memory in the new generation is 90% (80% + 10%) of its total capacity, and only 10% is "wasted". Of course, 98% of objects being reclaimable is only the figure for typical scenarios; we cannot guarantee that no more than 10% of objects survive each collection. When the Survivor space is not large enough, other memory (here, the old generation) is relied on for an allocation guarantee (Handle Promotion).

The memory allocation guarantee is like borrowing money from a bank. If we have a good reputation and can repay on time in 98% of cases, the bank may assume by default that we will repay on time and in full next time as well; all it needs is a guarantor who promises that, if we cannot repay, the money can be deducted from the guarantor's account, and the bank then considers the loan risk-free. The same holds for the memory allocation guarantee: if the other Survivor space does not have enough room for the objects that survived the last new-generation collection, those objects enter the old generation directly through the allocation guarantee mechanism. The allocation guarantee for the new generation is explained in more detail later in this chapter.
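The 90% figure follows directly from the split into one Eden and two Survivor spaces. The sketch below does the arithmetic; YoungGenLayout and layout are hypothetical names, the 10240K input is an assumption inferred from the Eden, from, and to sizes in the GC log earlier in this article, and real virtual machines also apply alignment rules that this calculation ignores. In HotSpot the ratio is set with the -XX:SurvivorRatio option.

```java
class YoungGenLayout {
    /**
     * Splits a new generation as Eden : Survivor : Survivor = ratio : 1 : 1.
     * Returns sizes in KB as { eden, oneSurvivor, usable }.
     */
    static long[] layout(long youngGenKb, int survivorRatio) {
        long survivor = youngGenKb / (survivorRatio + 2);
        long eden = youngGenKb - 2 * survivor;
        long usable = eden + survivor;  // one Survivor space always stays empty
        return new long[] { eden, survivor, usable };
    }

    public static void main(String[] args) {
        long[] sizes = layout(10240, 8);
        System.out.println("eden=" + sizes[0] + "K");      // prints eden=8192K
        System.out.println("survivor=" + sizes[1] + "K");  // prints survivor=1024K
        System.out.println("usable=" + sizes[2] + "K");    // prints usable=9216K
    }
}
```

The usable 9216K is 90% of 10240K and matches the "def new generation total 9216K" line of that log, which reports only Eden plus one Survivor space.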

 

Mark-Compact Algorithm

When the object survival rate is high, the copying algorithm has to perform many copy operations, and its efficiency drops. More importantly, if we do not want to waste 50% of the space, extra space is needed as an allocation guarantee to cope with the extreme case in which 100% of the objects in the used memory survive. For this reason the copying algorithm generally cannot be used directly in the old generation.

Based on the characteristics of the old generation, another algorithm called Mark-Compact was proposed. Its marking phase is the same as in the mark-sweep algorithm, but the following step does not clean up reclaimable objects directly. Instead, all surviving objects are moved toward one end, and then the memory beyond the end boundary is cleaned up directly. The "mark-compact" algorithm is shown in Figure 3-4.
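The "move survivors to one end, then clear past the boundary" step can be sketched on a toy heap. As before, this is only an illustration under stated assumptions: each letter is a one-slot object, '.' is a free slot, the reachable array stands in for the marking result, and MarkCompactDemo is a hypothetical name.

```java
import java.util.Arrays;

class MarkCompactDemo {
    /** Slides every reachable object toward index 0, then clears all slots past the boundary. */
    static char[] compact(char[] heap, boolean[] reachable) {
        char[] after = heap.clone();
        int boundary = 0;  // first slot after the compacted survivors
        for (int i = 0; i < after.length; i++) {
            if (reachable[i]) {
                after[boundary++] = after[i];  // boundary <= i, so nothing live is overwritten
            }
        }
        Arrays.fill(after, boundary, after.length, '.');
        return after;
    }

    public static void main(String[] args) {
        char[] heap = "ABCDEFGH".toCharArray();
        boolean[] reachable = { true, false, false, true, false, true, false, true };
        System.out.println(new String(compact(heap, reachable)));  // prints ADFH....
    }
}
```

All free space ends up in one contiguous block, and unlike copying no second half of the heap has to be kept empty; the price is that every survivor may have to be moved.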

         

 

Generational Collection Algorithm

Current commercial virtual machines all use the Generational Collection algorithm for garbage collection. It contains no new idea; it just divides memory into several blocks according to the lifetimes of objects. The Java heap is generally divided into the new generation and the old generation, so that the most suitable collection algorithm can be used for each. In the new generation, where each collection finds that a large number of objects have died and only a few survive, the copying algorithm is chosen: the collection costs only the copying of a small number of survivors. In the old generation, where objects have a high survival rate and there is no extra space to act as an allocation guarantee, the "mark-sweep" or "mark-compact" algorithm must be used.

 

Garbage Collector

  If collection algorithms are the methodology of memory reclamation, garbage collectors are its concrete implementation. The Java Virtual Machine Specification says nothing about how a garbage collector should be implemented, so the collectors provided by different vendors and different virtual machine versions can differ considerably, and they generally offer parameters that let users combine the collectors used for each generation according to their application's characteristics and requirements. The collectors discussed here are based on the Sun HotSpot virtual machine, version 1.6 Update 22; the collectors it includes are shown in Figure 3-5.

        

 
