Java Virtual Machine series: garbage collection and memory allocation policies in Java

Source: Internet
Author: User

In the JVM overview article of this Java Virtual Machine series, we explained the Java memory model in detail and covered the basics of memory management in the JVM. This post walks through garbage collection and memory allocation policies in Java.


Garbage collection (Garbage Collection, GC) is a built-in feature of Java: programmers do not need to manage dynamic memory allocation and reclamation themselves, because the JVM handles both. In C++, the programmer must release memory explicitly; in Java, the JVM does it automatically. For this to be automated, the JVM has to settle the following questions:


1 Which memory needs to be reclaimed? (That is, which objects can be considered "garbage"?)

2 How is it reclaimed? (That is, the commonly used garbage collection algorithms)

3 What is the memory allocation policy?

The rest of this post works through these three questions in order.


Which memory needs to be reclaimed?

From the JVM overview article earlier in this series, we know that Java's memory areas are mainly the runtime data areas. Among them, the program counter, the Java virtual machine stack, and the native method stack are created together with a thread and destroyed when the thread ends. These three areas need little thought about reclamation, because their memory is released naturally once the thread finishes. The Java heap and the method area are different: the implementation classes of one interface may need different amounts of memory, and only while the program is running do we know which objects will be created. Allocation and reclamation in these areas is therefore dynamic, and this is the memory the garbage collector focuses on.


So which objects in the heap or method area can be considered "garbage"? Which objects should be reclaimed?

This is decided by the JVM's garbage determination algorithm. The two commonly discussed algorithms are reference counting and reachability analysis, described in turn below.


Reference counting algorithm:

Memory in the Java heap is accessed through references. Under reference counting, each object carries a counter: whenever a reference is made to point to the object, the counter is incremented by 1, and whenever such a reference becomes invalid, the counter is decremented by 1. When an object's counter reaches 0, nothing refers to it any more, so it can be treated as "garbage" and reclaimed.

As the principle suggests, the algorithm is simple and the check is efficient. Still, mainstream JVMs do not use it to manage memory, mainly because it cannot easily handle circular references between objects. Here's an example:

    class TestX { public TestY y; }
    class TestY { public TestX x; }

    public class Main {
        public static void main(String[] args) {
            TestX x = new TestX();
            TestY y = new TestY();
            x.y = y;
            y.x = x;       // after these two assignments, x and y reference each other
            x = null;
            y = null;
            System.gc();   // ask the virtual machine to run a collection
        }
    }

The statements x = null; y = null; clear the references held by the local variables, so nothing in the program can reach the two objects any more and the garbage collector should reclaim them. But because the two objects still refer to each other, neither reference count is 0, so under reference counting their memory could never be reclaimed. If you run the program and inspect memory, you will find that both objects are in fact reclaimed, which indirectly shows that current mainstream JVMs do not use reference counting as their garbage determination algorithm.
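To make the counting concrete, here is a minimal sketch of the same scenario with hand-maintained counters. It is only a toy model, not how any JVM tracks objects; the names CountedNode, setField, and RefCountCycleDemo are made up for this illustration.

```java
/** Toy object with a manually maintained reference count (hypothetical, for illustration only). */
final class CountedNode {
    int refCount = 0;
    CountedNode field; // the single outgoing reference this toy object can hold

    /** Points this.field at target and adjusts the counters the way a ref-counting runtime would. */
    void setField(CountedNode target) {
        if (field != null) {
            field.refCount--;
        }
        field = target;
        if (target != null) {
            target.refCount++;
        }
    }
}

final class RefCountCycleDemo {
    /** Returns the counters of the two objects after both local references are dropped. */
    static int[] run() {
        CountedNode x = new CountedNode();
        CountedNode y = new CountedNode();
        x.refCount++;   // the local variable x refers to the first object
        y.refCount++;   // the local variable y refers to the second object
        x.setField(y);  // x.y = y
        y.setField(x);  // y.x = x
        x.refCount--;   // models x = null
        y.refCount--;   // models y = null
        return new int[] { x.refCount, y.refCount };
    }

    public static void main(String[] args) {
        int[] counts = run();
        // Both counters stay at 1 because of the cycle, so neither object would ever be freed.
        System.out.println("x=" + counts[0] + ", y=" + counts[1]); // prints x=1, y=1
    }
}
```

Neither counter ever drops to 0, although the program can no longer reach either object. That is exactly the leak a pure reference-counting collector would suffer here.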


Reachability analysis algorithm:

Reachability analysis is the algorithm Java uses to decide whether an object is still alive. Its basic idea is to take a set of objects called "GC Roots" as starting points and search downward from these nodes; the paths traversed are called reference chains (Reference Chain). When no reference chain connects an object to any GC Root (in graph-theory terms, the object is unreachable from the GC Roots), the object is no longer usable. For example, objects 5, 6, and 7 may reference one another, but if none of them is reachable from the GC Roots, all three are judged reclaimable.


So the key to this algorithm is the choice of "GC Roots" objects. In Java, the objects that can serve as GC Roots include the following:

Objects referenced from the virtual machine stack (the local variable table in a stack frame).

Objects referenced by static class fields in the method area.

Objects referenced by constants in the method area.

Objects referenced by JNI (that is, native methods) in the native method stack.
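The reachability analysis described above can be sketched as a plain graph search. This is a minimal illustration and not the JVM's implementation: objects are represented by integer ids, the reference graph is a hypothetical one in which object 1 is the only GC Root, and the class and method names are invented. It assumes Java 9 or later for List.of and Map.of.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class ReachabilityDemo {
    /** Returns every object id reachable from the roots by following reference edges. */
    static Set<Integer> reachable(Map<Integer, List<Integer>> refs, List<Integer> roots) {
        Set<Integer> marked = new HashSet<>();
        Deque<Integer> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            int id = pending.pop();
            if (marked.add(id)) { // first visit: follow this object's outgoing references
                pending.addAll(refs.getOrDefault(id, List.of()));
            }
        }
        return marked;
    }

    /** Hypothetical heap: 1 -> 2, 3; 2 -> 4; and a cycle 5 -> 6 -> 7 -> 5 that no root reaches. */
    static Set<Integer> demo() {
        Map<Integer, List<Integer>> refs = Map.of(
                1, List.of(2, 3),
                2, List.of(4),
                5, List.of(6),
                6, List.of(7),
                7, List.of(5));
        return reachable(refs, List.of(1));
    }

    public static void main(String[] args) {
        System.out.println(new TreeSet<>(demo())); // prints [1, 2, 3, 4]
    }
}
```

Objects 5, 6, and 7 keep each other "referenced", yet the search never visits them, so they would be treated as reclaimable.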


Note: an object found unreachable by the reachability analysis is not yet "dead"; it is only on "probation". Declaring an object dead takes at least two marking passes. If the analysis finds no reference chain connecting the object to the GC Roots, the object is marked a first time and filtered, and the filter criterion is whether the object needs to execute its finalize() method. If the object does not override finalize(), or finalize() has already been called by the virtual machine, the virtual machine treats both cases as "no need to execute". Note that the system automatically runs the finalize() method of any object at most once.

If the object is judged to need finalize(), it is placed in a queue called the F-Queue, and finalize() is later executed by a low-priority Finalizer thread that the virtual machine creates automatically. "Executed" here means that the virtual machine triggers the method but does not promise to wait for it to finish: if an object's finalize() runs slowly or, in the extreme case, loops forever, the other objects in the F-Queue could wait indefinitely, and the whole memory reclamation system could even crash. So the fact that finalize() is called does not guarantee that its code runs to completion.


The finalize() method is an object's last chance to escape death. Later the GC performs a second, small-scale marking of the objects in the F-Queue. To save itself in finalize(), the object only needs to re-attach itself to any object on a reference chain, for example by assigning itself (the this keyword) to a class variable or to a field of another object; it is then removed from the "about to be reclaimed" set during the second marking. If the object has not escaped by then, it is effectively reclaimed. The code below shows an object whose finalize() is executed but which still survives.

    /**
     * This code demonstrates two points:
     * 1. An object can save itself when it is about to be collected.
     * 2. It gets only one such chance, because the system automatically
     *    calls an object's finalize() method at most once.
     */
    public class FinalizeEscapeGC {

        public static FinalizeEscapeGC SAVE_HOOK = null;

        public void isAlive() {
            System.out.println("Yes, I am still alive :)");
        }

        @Override
        protected void finalize() throws Throwable {
            super.finalize();
            System.out.println("finalize method executed!");
            FinalizeEscapeGC.SAVE_HOOK = this;
        }

        public static void main(String[] args) throws Throwable {
            SAVE_HOOK = new FinalizeEscapeGC();

            // First attempt: the object successfully saves itself
            SAVE_HOOK = null;
            System.gc();
            // finalize() runs at low priority, so pause for 0.5 seconds to wait for it
            Thread.sleep(500);
            if (SAVE_HOOK != null) {
                SAVE_HOOK.isAlive();
            } else {
                System.out.println("No, I am dead :(");
            }

            // Exactly the same code as above, but this time the rescue fails
            SAVE_HOOK = null;
            System.gc();
            // finalize() runs at low priority, so pause for 0.5 seconds to wait for it
            Thread.sleep(500);
            if (SAVE_HOOK != null) {
                SAVE_HOOK.isAlive();
            } else {
                System.out.println("No, I am dead :(");
            }
        }
    }

The output is as follows:

    finalize method executed!
    Yes, I am still alive :)
    No, I am dead :(

As the output shows, the finalize() method of the object referenced by SAVE_HOOK was indeed triggered by the garbage collector, and the object escaped before it could be collected.


Also note that the code contains two identical fragments, yet one escape succeeds and the other fails. This is because the system automatically calls an object's finalize() method only once; when the object faces collection again, finalize() is not executed a second time, so the second rescue attempt fails.




How to reclaim? (Common garbage collection algorithms)

Commonly used garbage collection algorithms include the mark-sweep algorithm, the copying algorithm, the mark-compact algorithm, and the generational collection algorithm. The principles behind each are described below.

Mark-sweep algorithm (Mark-Sweep): the most basic garbage collection algorithm. As the name suggests, it has two phases, marking and sweeping. First all objects that need to be reclaimed are marked, and once marking is complete, all marked objects are reclaimed together. The marking phase is the reachability analysis described earlier.


Drawbacks:

1 Low efficiency: neither the marking phase nor the sweeping phase is efficient.

2 Poor space utilization: sweeping leaves many discontiguous memory fragments. With too much fragmentation, a later attempt to allocate a large object may fail to find enough contiguous memory and force another garbage collection ahead of time.
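The fragmentation drawback can be seen in a small simulation. This is a minimal sketch under simplifying assumptions: the heap is an array of equally sized slots, the result of the marking phase is supplied by hand, and the sketch marks the surviving slots rather than the garbage, which is a common way to implement the same idea. MarkSweepDemo and its methods are hypothetical names.

```java
import java.util.Arrays;

final class MarkSweepDemo {
    /** Sweep phase: frees every occupied slot that was not marked live; returns the number freed. */
    static int sweep(boolean[] occupied, boolean[] markedLive) {
        int freed = 0;
        for (int i = 0; i < occupied.length; i++) {
            if (occupied[i] && !markedLive[i]) {
                occupied[i] = false;
                freed++;
            }
            markedLive[i] = false; // reset the mark for the next collection cycle
        }
        return freed;
    }

    /** Longest run of adjacent free slots, i.e. the largest object that still fits. */
    static int largestFreeRun(boolean[] occupied) {
        int best = 0;
        int run = 0;
        for (boolean used : occupied) {
            run = used ? 0 : run + 1;
            best = Math.max(best, run);
        }
        return best;
    }

    /** Returns {slots freed, largest contiguous free run} for a full 8-slot heap. */
    static int[] demo() {
        boolean[] occupied = new boolean[8];
        Arrays.fill(occupied, true);
        // Assumed marking result: every second slot holds a surviving object.
        boolean[] markedLive = { true, false, true, false, true, false, true, false };
        int freed = sweep(occupied, markedLive);
        return new int[] { freed, largestFreeRun(occupied) };
    }

    public static void main(String[] args) {
        int[] result = demo();
        System.out.println("freed=" + result[0] + ", largest free run=" + result[1]);
        // prints freed=4, largest free run=1
    }
}
```

Half of the heap is free after the sweep, yet an object needing two adjacent slots cannot be placed, so another collection (or a compaction) would be needed.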


Copying algorithm: the copying algorithm was introduced to fix the fragmentation left behind by mark-sweep. It divides the available memory into two halves of equal size and uses only one of them at a time. When that half is used up, the surviving objects are copied to the other half and the used half is cleaned up in one pass. Each collection therefore reclaims an entire half, and allocation no longer has to deal with fragmentation: memory is handed out in order simply by moving the heap-top pointer. This is simple to implement and efficient to run.

Today's commercial virtual machines use this algorithm to collect the new generation. Research by IBM shows that about 98% of objects in the new generation die young, so there is no need to split the memory 1:1. Instead, the memory is divided into one larger Eden space and two smaller Survivor spaces, and at any time Eden and one of the Survivor spaces are in use. During collection, the objects still alive in Eden and in the Survivor space in use are copied in one pass to the other Survivor space, and then Eden and the Survivor space just used are cleaned up.


Clearly, the efficiency of this algorithm depends heavily on the number of surviving objects. If many objects survive and the Survivor space is too small to hold them, other memory (here, the old generation) has to provide an allocation guarantee (Handle Promotion).
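The basic two-half version of the copying algorithm, with allocation by moving a top pointer, can be sketched as follows. This is only an illustration: "objects" are single int values, liveness is passed in by hand, and the Eden plus two Survivor layout and the allocation guarantee are deliberately left out. SemiSpaceDemo and its members are hypothetical names.

```java
import java.util.Arrays;

final class SemiSpaceDemo {
    private int[] from;  // the half currently used for allocation
    private int[] to;    // the idle half that receives survivors
    private int top = 0; // bump pointer: index of the next free slot in 'from'

    SemiSpaceDemo(int halfSize) {
        from = new int[halfSize];
        to = new int[halfSize];
    }

    /** Bump-pointer allocation; returns the slot index, or -1 if the active half is full. */
    int allocate(int value) {
        if (top == from.length) {
            return -1;
        }
        from[top] = value;
        return top++;
    }

    /** Copies the slots flagged live into the idle half, then swaps the roles of the halves. */
    void collect(boolean[] live) {
        int newTop = 0;
        for (int i = 0; i < top; i++) {
            if (live[i]) {
                to[newTop++] = from[i];
            }
        }
        int[] tmp = from;
        from = to;
        to = tmp;
        top = newTop; // everything beyond newTop is free again, with no fragmentation
    }

    int get(int index) { return from[index]; }

    int used() { return top; }

    /** Returns {first survivor, second survivor, index of next allocation, slots used}. */
    static int[] demo() {
        SemiSpaceDemo heap = new SemiSpaceDemo(4);
        for (int v : new int[] { 10, 20, 30, 40 }) {
            heap.allocate(v);
        }
        heap.collect(new boolean[] { false, true, false, true }); // assume 20 and 40 survive
        int next = heap.allocate(50);
        return new int[] { heap.get(0), heap.get(1), next, heap.used() };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(demo())); // prints [20, 40, 2, 3]
    }
}
```

The cost of a collection is proportional to the number of survivors copied, which is why the approach suits regions where most objects die quickly.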


Mark-compact algorithm (Mark-Compact): the copying algorithm needs many copy operations when the object survival rate is high, so its efficiency drops. More importantly, if you do not want to waste 50% of the space, extra space is needed as an allocation guarantee to cover the extreme case where 100% of the objects in the used memory survive. For this reason the old generation generally cannot use the copying algorithm directly and uses the mark-compact algorithm instead.

Its marking phase is the same as in the mark-sweep algorithm, but instead of cleaning up the reclaimable objects directly, it moves all surviving objects toward one end and then clears all memory beyond the end boundary.
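The compaction step can be sketched on a toy heap as well. This assumes equally sized slots and a marking result supplied by hand, and it ignores the reference updates a real collector has to perform when objects move. MarkCompactDemo is a hypothetical name.

```java
import java.util.Arrays;

final class MarkCompactDemo {
    /**
     * Slides every marked object toward index 0 and clears everything behind the
     * last survivor. Returns the end boundary: the index of the first free slot.
     */
    static int compact(String[] heap, boolean[] markedLive) {
        int free = 0;
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] != null && markedLive[i]) {
                heap[free++] = heap[i]; // move the survivor toward the start
            }
        }
        Arrays.fill(heap, free, heap.length, null); // clear memory beyond the boundary
        return free;
    }

    public static void main(String[] args) {
        String[] heap = { "A", "B", "C", "D", "E" };
        // Assumed marking result: A, C, and E are still reachable.
        boolean[] markedLive = { true, false, true, false, true };
        int boundary = compact(heap, markedLive);
        System.out.println(boundary + " " + Arrays.toString(heap));
        // prints 3 [A, C, E, null, null]
    }
}
```

After compaction all free memory forms one contiguous block, so no half of the heap has to be kept idle and large objects can still be placed.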




Generational collection algorithm (Generational Collection): this is the approach used by most current virtual machines. It combines the algorithms above, dividing memory into several regions according to object lifetimes. The Java heap is generally divided into the new generation and the old generation, so that each can use the collection algorithm best suited to its characteristics.

In the new generation, each collection finds that a large number of objects have died and only a few survive, so the copying algorithm is chosen: a collection costs only the copying of the few survivors. In the old generation, objects have a high survival rate and there is no extra space to act as an allocation guarantee, so it must be collected with the mark-sweep or mark-compact algorithm.
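One way to look at the generational layout of a running JVM is to list its heap memory pools through the standard java.lang.management API. The sketch below only prints what the current JVM reports; the pool names depend on the JVM vendor and on the collector in use (on HotSpot they typically contain words such as "Eden", "Survivor", and "Old Gen"), so no particular output should be assumed. HeapPoolsDemo is a hypothetical class name.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.util.ArrayList;
import java.util.List;

final class HeapPoolsDemo {
    /** Returns the names of all memory pools that the JVM classifies as heap memory. */
    static List<String> heapPoolNames() {
        List<String> names = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == MemoryType.HEAP) {
                names.add(pool.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // Output varies by JVM and collector, so it is printed rather than assumed.
        for (String name : heapPoolNames()) {
            System.out.println(name);
        }
    }
}
```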


Note that there is one more generation outside the heap, the permanent generation (Permanent Generation, HotSpot's implementation of the method area before JDK 8), which stores class metadata, constants, method descriptions, and so on. Collection of the permanent generation mainly reclaims two things: obsolete constants and useless classes.


Memory allocation and reclamation policy in Java

Automatic memory management in Java boils down to solving two problems automatically: allocating memory to objects and reclaiming the memory allocated to them. Reclamation was covered in detail above, so this part focuses on how memory is allocated.

Broadly speaking, objects are allocated on the heap. Most are allocated in Eden in the new generation, and in a few cases they are allocated directly in the old generation. The rules are not completely fixed, but allocation generally follows these principles:

1 Objects are allocated in Eden first:

In most cases, objects are allocated in the Eden space of the new generation. When Eden does not have enough room for an allocation, the virtual machine triggers a Minor GC. The new generation consists of Eden and two Survivor spaces, From and To.

During the GC, the surviving objects in Eden and the From space are moved to the To space, and then Eden and the From space are cleaned up. If the To space does not have enough room for an object, that object is moved to the old generation.


2 Large objects go straight into the old generation:

Large objects are objects that need a large amount of contiguous memory; the most common are long strings and large arrays. They are allocated directly in the old generation to avoid heavy memory copying between Eden and the two Survivor spaces. Recall that the new generation is collected with the copying algorithm.


3 Long-lived objects move into the old generation:

As mentioned earlier, the virtual machine manages memory by generational collection, so it must be able to tell which objects belong in the new generation and which in the old generation. To do this, it gives every object an age counter. If an object is born in Eden, is still alive after its first Minor GC, and fits into a Survivor space, it is moved there and its age is set to 1. Each time an object in the Survivor space survives another Minor GC, its age increases by 1, and when its age reaches a certain threshold (15 by default), it is promoted to the old generation.
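The age counter rule can be illustrated with a small simulation. This is a toy model under stated assumptions: every object stays alive, Survivor space never overflows, and an object is promoted as soon as its age reaches the threshold; the exact boundary condition and any dynamic age adjustment in a real JVM may differ. TenuringDemo, Obj, and the other names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

final class TenuringDemo {
    static final int TENURING_THRESHOLD = 15; // the default age mentioned in the text

    /** Toy object that only carries its age counter. */
    static final class Obj {
        int age = 0;
    }

    final List<Obj> eden = new ArrayList<>();
    final List<Obj> survivor = new ArrayList<>();
    final List<Obj> oldGen = new ArrayList<>();

    Obj allocate() {
        Obj o = new Obj();
        eden.add(o); // new objects start in Eden
        return o;
    }

    /** One simulated Minor GC in which every object is assumed to survive. */
    void minorGc() {
        Iterator<Obj> it = survivor.iterator();
        while (it.hasNext()) {
            Obj o = it.next();
            o.age++; // survived another Minor GC
            if (o.age >= TENURING_THRESHOLD) {
                it.remove();
                oldGen.add(o); // promoted to the old generation
            }
        }
        for (Obj o : eden) {
            o.age = 1; // first survival: moved from Eden to Survivor with age 1
            survivor.add(o);
        }
        eden.clear();
    }

    /** Counts the Minor GCs a freshly allocated object survives before it is promoted. */
    static int gcsUntilPromoted() {
        TenuringDemo heap = new TenuringDemo();
        Obj o = heap.allocate();
        int gcs = 0;
        while (!heap.oldGen.contains(o)) {
            heap.minorGc();
            gcs++;
        }
        return gcs;
    }

    public static void main(String[] args) {
        System.out.println(gcsUntilPromoted()); // prints 15
    }
}
```

In this model the first Minor GC sets the age to 1 and each of the next fourteen adds one more year, so the object reaches the old generation on its fifteenth collection.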


That is the main content of this post. If you found it useful, please give it a like! You are also welcome to follow my blog, where I share technical articles from time to time.

