Java garbage collection mechanism

Source: Internet
Author: User

I learned and understood the Java garbage collection mechanism long ago, but after a long time without using it, most of it has slipped away. I am recording it here so that it is convenient to look up later without wasting too much time.

To understand the Java garbage collection mechanism, you need to know how it differs between Java versions, because each JDK release improves on the previous ones. In particular, JDK 1.7 added G1, a new collector; an experimental version of G1 had already been added in one of the later JDK 1.6 update releases.

What follows is two blog posts that I am reposting, since I do not have the time to write this much myself. Both are good, and each complements the other.

The first article: covers more ground, but not in great detail

http://www.cnblogs.com/laoyangHJ/articles/java_gc.html

Garbage collection (GC) is one of the core technologies of the Java language. We previously explored G1, the garbage collector newly added in Java 7, but the principles and mechanisms of garbage collection inside the JVM have not changed. The purpose of garbage collection is to clear objects that are no longer in use. The GC decides whether an object can be collected by determining whether it is still referenced by a live object. Two common ways of doing this are reference counting and object reference traversal.

In general, collectors can be divided into two major categories:

Reference counting collector

Reference counting is an early garbage collection strategy. In this approach, each object in the heap (the object, not the reference) has a reference count. When an object is created and assigned to a variable, its count is set to 1. Whenever another variable is assigned a reference to the object, the count is incremented by 1 (after a = b, the count of the object that b refers to goes up by 1). When a reference to the object goes out of scope or is set to a new value, the count is decremented by 1. Any object with a reference count of 0 can be garbage collected. When an object is garbage collected, the count of every object it references is decremented by 1.

Advantage: a reference counting collector runs very quickly and is interleaved with program execution. This suits real-time environments in which the program must not be interrupted for long.

Disadvantage: it cannot detect circular references. If a parent object holds a reference to a child object and the child in turn references the parent, their reference counts never reach 0.
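The counting rules and the circular-reference weakness can be illustrated with a small toy model. This is only a sketch: the classes RefCounted and RefCountDemo and their methods are made up for illustration and do not correspond to anything inside a real JVM.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of a reference-counted object (hypothetical, for illustration only).
class RefCounted {
    final String name;
    int refCount = 0;                                   // number of incoming references
    boolean freed = false;                              // set once the count drops to 0
    final List<RefCounted> fields = new ArrayList<>();  // outgoing references

    RefCounted(String name) { this.name = name; }

    // A variable or field starts referring to this object.
    void retain() { refCount++; }

    // A reference goes away; at 0 the object is freed and releases what it references.
    void release() {
        refCount--;
        if (refCount == 0 && !freed) {
            freed = true;
            for (RefCounted child : fields) child.release();
        }
    }

    // Store a reference to target in one of this object's fields.
    void addField(RefCounted target) {
        fields.add(target);
        target.retain();
    }
}

class RefCountDemo {
    // parent -> child only: dropping the external references frees both.
    static boolean chainIsFreed() {
        RefCounted parent = new RefCounted("parent");
        RefCounted child = new RefCounted("child");
        parent.retain();
        child.retain();
        parent.addField(child);
        child.release();
        parent.release();
        return parent.freed && child.freed;
    }

    // parent <-> child: the counts never reach 0, so neither object is freed.
    static boolean cycleIsFreed() {
        RefCounted parent = new RefCounted("parent");
        RefCounted child = new RefCounted("child");
        parent.retain();
        child.retain();
        parent.addField(child);
        child.addField(parent);
        parent.release();
        child.release();
        return parent.freed && child.freed;
    }

    public static void main(String[] args) {
        System.out.println("chain freed: " + chainIsFreed());
        System.out.println("cycle freed: " + cycleIsFreed());
    }
}
```

In the cyclic case each object still holds one count from the other after both external references are released, which is exactly the leak described above.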

Tracing collector

Early JVMs used reference counting; most JVMs now use object reference traversal. Object reference traversal starts from a set of root objects and recursively follows every link in the object graph to determine which objects are reachable. If an object cannot be reached from any of these root objects, it is garbage collected. During the traversal phase, the GC must remember which objects are reachable so that it can delete the unreachable ones; this is known as marking.

Next, the GC deletes the unreachable objects. Some GCs simply scan the heap, remove the unmarked objects, and free their memory for new objects; this is called sweeping. The problem with this approach is that memory becomes fragmented into small chunks that are individually too small for a new object, even though together they add up to a lot of space. For this reason, many GCs can rearrange the objects in memory and compact them to form usable free space.

To do this, the GC needs to stop other activity. This approach means that all application work is suspended while only the GC runs. As a result, response times fluctuate, because requests that arrive during a collection have to wait for it to finish. More sophisticated GCs work incrementally or run concurrently with the application to reduce or eliminate these interruptions. Some GCs do their work on a single thread, while others use multiple threads to increase efficiency.

In practice, the GC decides whether an object is reachable on the basis of strong references. More precisely, an object is strongly reachable if it is referenced, directly or through a chain of other strongly referenced objects, from a local variable, a static variable, or a similar root. In other words, if a group of objects is strongly referenced only by an object that is itself no longer alive, they are all reclaimed together.

Some specific garbage collection algorithms

(1) Mark-sweep

The collector first traverses the object graph and marks the reachable objects, then scans the heap for unmarked objects and frees their memory. This kind of collector typically works on a single thread and stops all other operations. Because it only clears the unmarked objects and does not compact the marked ones, it can produce a lot of memory fragmentation, which wastes memory.
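The two phases can be sketched over a toy object graph. The classes Node and MarkSweepDemo below are hypothetical and only model the idea; a real collector works on raw heap memory rather than on Java collections.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

// A heap object with outgoing references (hypothetical model).
class Node {
    final String name;
    final List<Node> refs = new ArrayList<>();
    Node(String name) { this.name = name; }
}

class MarkSweepDemo {
    // Mark phase: collect every node reachable from the roots.
    static Set<Node> mark(List<Node> roots) {
        Set<Node> marked = new HashSet<>();
        Deque<Node> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Node n = pending.pop();
            if (marked.add(n)) {            // first visit only
                for (Node r : n.refs) pending.push(r);
            }
        }
        return marked;
    }

    // Sweep phase: remove every unmarked node from the heap and report its name.
    static List<String> sweep(List<Node> heap, Set<Node> marked) {
        List<String> freed = new ArrayList<>();
        Iterator<Node> it = heap.iterator();
        while (it.hasNext()) {
            Node n = it.next();
            if (!marked.contains(n)) {
                freed.add(n.name);
                it.remove();
            }
        }
        return freed;
    }

    // a -> b is reachable from the root a; c <-> d is an unreachable cycle.
    static List<String> run() {
        Node a = new Node("a"), b = new Node("b"), c = new Node("c"), d = new Node("d");
        a.refs.add(b);
        c.refs.add(d);
        d.refs.add(c);
        List<Node> heap = new ArrayList<>(Arrays.asList(a, b, c, d));
        return sweep(heap, mark(Arrays.asList(a)));
    }

    public static void main(String[] args) {
        System.out.println("freed: " + run());
    }
}
```

Unlike reference counting, the traversal frees the c/d cycle, because neither object is reachable from a root.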

(2) Mark-compact

Sometimes also called a mark-sweep-compact collector, it has the same marking phase as the mark-sweep collector. In the second phase, the marked objects are copied to a new area of the heap in order to compact the heap. This collector also stops other operations.

(3) Copying

This collector divides the heap into two areas, often called semispaces. Only one semispace is in use at a time, and new objects created by the JVM are placed in it. When the GC runs, it copies the reachable objects into the other semispace, which compacts the heap as a side effect. This method suits short-lived objects; repeatedly copying long-lived objects reduces efficiency. In addition, a heap of a given usable size needs twice that much memory, because only half of it is in use at any time.
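A minimal sketch of the copying step, assuming each semispace is modelled as an array of slots and the set of live objects is already known. The class SemiSpaceDemo and its methods are invented for this illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class SemiSpaceDemo {
    // Copy the live entries of fromSpace to the start of a fresh toSpace.
    // A null slot is free space; 'live' stands in for the result of a reachability scan.
    static String[] collect(String[] fromSpace, Set<String> live) {
        String[] toSpace = new String[fromSpace.length];
        int free = 0;                        // next free slot in toSpace
        for (String obj : fromSpace) {
            if (obj != null && live.contains(obj)) {
                toSpace[free++] = obj;       // survivors end up contiguous
            }
        }
        return toSpace;                      // fromSpace can now be reused as a whole
    }

    static String[] run() {
        String[] from = {"a", "b", "c", "d", null, null};
        Set<String> live = new HashSet<>(Arrays.asList("a", "d"));
        return collect(from, live);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(run()));
    }
}
```

The cost of the copy is proportional to the number of survivors, not to the size of the space, which is why the approach suits areas where most objects die young.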

(4) Incremental collector

The incremental collector divides the heap into multiple areas and collects garbage from only one area at a time. In other words, the heap is split into many small blocks, and each collection handles just one block. This keeps application pauses short, so users generally do not notice that the garbage collector is working.

(5) Generational

The disadvantage of the copying collector is that every live object is copied at every collection, so long-lived objects are copied back and forth many times, which consumes a lot of time. The generational collector solves this problem by dividing the heap into two or more areas that hold objects of different lifetimes. New objects created by the JVM are typically placed in one of these areas. Over time, objects that continue to survive (that is, objects that are not short-lived) are promoted to an area for longer-lived objects. The generational collector uses different algorithms for different areas to optimize performance.

(6) Parallel collector

The parallel collector uses some of the traditional algorithms but runs them on multiple threads in parallel. On multi-CPU machines, multithreading can significantly improve the scalability of Java applications.

Finally, here is a very simple example of a tracing collector to deepen your understanding of collectors:

Points to note when using the garbage collector

There is a great deal to know about the garbage collector; the following are just a few essential points:

(1) The finalize() method of each object is called at most once. If finalize() throws an exception while it runs, the object can still be collected by the garbage collector.

(2) The garbage collector tracks every object, collects the unreachable ones (objects that the program no longer references), and reclaims the memory they occupy. During garbage collection, however, the collector calls the object's finalize() method if it has one. If the object becomes referenced by the program again inside finalize() (commonly known as "resurrection"), it becomes reachable again and is not collected for the time being. Because finalize() is called only once per object, each object can be "resurrected" only once.

(3) The Java language allows programmers to add a finalize() method to any class; the method is called before the garbage collector reclaims the object. Do not rely on this method to release and reuse system resources, though, because its execution is unpredictable.

(4) Garbage collection cannot be forced, but the programmer can suggest that it run by calling System.gc(). Remember, this is only a suggestion. Writing System.gc() is generally not recommended, because it increases the garbage collection workload.

How the Java GC works

Summary: The JVM memory structure consists of the heap, the stack, the native method stack, the method area, and other parts. In addition, the JVM uses different garbage collection mechanisms for the young generation and the old generation.

1. First, look at the JVM memory structure, which consists of the heap, the stack, the native method stack, the method area, and other parts. The structure diagram is as follows.

1) Heap

The memory of every object created with new is allocated in the heap, whose size can be controlled with -Xmx and -Xms. The heap is divided into the young generation and the old generation. The young generation is further divided into Eden and the Survivor area, and the Survivor area in turn consists of From space and To space. The structure diagram is as follows:

Young generation. New objects are allocated in the young generation. When Eden does not have enough space, the surviving objects are moved to the Survivor area. The size of the young generation can be controlled with -Xmn, and -XX:SurvivorRatio controls the ratio between Eden and Survivor.

Old generation. It stores objects that are still alive after multiple garbage collections in the young generation.
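To check how a running JVM has actually laid out these areas, the standard java.lang.management API can list its memory pools. The sketch below is only an illustration; the class name MemoryPoolsDemo is made up, and the pool names that are printed depend on the JVM version and on the collector in use, so they may not match the Eden/Survivor/old naming exactly.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.util.ArrayList;
import java.util.List;

class MemoryPoolsDemo {
    // Names and types (heap or non-heap) of the memory pools of this JVM.
    static List<String> poolNames() {
        List<String> names = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            names.add(pool.getName() + " (" + pool.getType() + ")");
        }
        return names;
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // maxMemory() reflects the upper heap limit, which -Xmx controls.
        System.out.println("max heap bytes:     " + rt.maxMemory());
        System.out.println("current heap bytes: " + rt.totalMemory());
        for (String name : poolNames()) {
            System.out.println(name);
        }
    }
}
```

Running it with different -Xmx, -Xmn, or collector flags is a quick way to see the effect of the options mentioned above.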

2) Stack

Each thread allocates a stack frame on its stack for every method it executes. Each frame includes a local variable area and an operand stack, which hold the temporary variables, parameters, and intermediate results of the method call.

3) Native method stack

Used to support the execution of native methods; it stores the state of each native method call.

4) Method area

It holds the information of loaded classes, static variables, constants of final type, and field and method information. The JVM uses the permanent generation (Permanent Generation) to store the method area; its minimum and maximum sizes can be specified with -XX:PermSize and -XX:MaxPermSize. Having introduced the structure of JVM memory, let's look at the JVM garbage collection mechanism.

2. JVM garbage collection mechanism

The JVM uses different garbage collection mechanisms for the young generation and the old generation.

Young generation GC:

Objects in the young generation usually have short lifetimes, so the copying algorithm is used: the collector finds the surviving objects and copies them into a completely unused space. For the young generation this means copying between Eden and From space or To space. The young generation uses a free pointer to control when GC is triggered. The pointer marks the position of the last object allocated in the young generation; when a new object needs memory, the pointer is used to check whether enough space remains, and if not, a GC is triggered. As objects keep being allocated, they gradually move from Eden to Survivor and finally to the old generation.

Watching with Java VisualVM, you can clearly see that when the young generation fills up, objects are moved to the old generation and the young generation is emptied and filled again. When the old generation is full, an OutOfMemory error is reported, as shown:

For the execution mechanism, the JVM provides the serial GC (SerialGC), the parallel scavenging GC (ParallelScavenge), and the parallel GC (ParNew).

1) Serial GC

The whole scanning and copying process runs on a single thread. It suits single-CPU machines, a small young generation, and applications without strict pause-time requirements. It is the default GC on client-class machines and can be forced with -XX:+UseSerialGC.

2) Parallel scavenging GC

The whole scanning and copying process runs on multiple threads. It suits multi-CPU machines and applications that need short pause times. It is the default GC on server-class machines and can be forced with -XX:+UseParallelGC; -XX:ParallelGCThreads=4 sets the number of threads.

3) Parallel GC

Used together with the concurrent GC of the old generation.

Old generation GC:

Compared with the young generation, objects in the old generation live longer and are more stable, so a mark-based algorithm is used for collection. Marking means scanning for the surviving objects and then reclaiming the unmarked ones; after reclamation, the freed space is either merged or recorded for the next allocation. The overall aim is to reduce the efficiency loss caused by memory fragmentation. For the execution mechanism, the JVM provides the serial GC (Serial MSC), the parallel GC (Parallel MSC), and the concurrent GC (CMS); the details of these algorithms need further study.
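Which collectors a given JVM is actually running can be inspected at runtime through the standard GarbageCollectorMXBean interface. The following is a small sketch (the class name ActiveCollectorsDemo is invented); the bean names it prints vary with the JVM version and with flags such as -XX:+UseSerialGC or -XX:+UseParallelGC.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

class ActiveCollectorsDemo {
    // Names of the collectors this JVM registered, typically one per generation.
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Count and accumulated time (ms) of collections so far; -1 means undefined.
            System.out.println(gc.getName()
                    + ": collections=" + gc.getCollectionCount()
                    + ", timeMs=" + gc.getCollectionTime());
        }
    }
}
```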

The young-generation and old-generation GC mechanisms above must be used in combination, as specified in the following table:

The second article: introduces the algorithms of the garbage collection mechanism in detail.

Reposted from: http://www.cnblogs.com/dolphin0520/p/3783345.html

When garbage collection (GC) is mentioned, many people naturally think of Java. In Java, programmers do not have to worry about dynamic memory allocation or garbage collection; all of that is left to the JVM. Garbage collection, as the name implies, frees the space occupied by garbage. So which objects in Java are considered "garbage"? What strategies are used to reclaim (free) the space once objects have been identified as garbage? What are the typical garbage collectors in today's commercial virtual machines? Let's discuss these questions in turn. The outline of this article is as follows:

I. How to determine whether an object is "garbage"?

II. Typical garbage collection algorithms

III. Typical garbage collectors

I. How to determine whether an object is "garbage"?

In this section we first look at one of the most basic questions: how is an object determined to be "garbage"? Since the garbage collector's task is to reclaim the space occupied by garbage objects so that new objects can use it, how does the collector decide that an object is "garbage", that is, by what criterion can an object be reclaimed?

In Java, objects are accessed through references: to operate on an object, you must go through a reference to it. An obvious and simple approach, then, is to decide whether an object can be reclaimed by counting references. Generally speaking, if no reference is associated with an object, the object is very unlikely to be used anywhere else, so it becomes reclaimable. This approach is called reference counting.

This approach is simple to implement and efficient, but it cannot solve the problem of circular references, so Java does not use it (Python uses reference counting). Look at the following code:

    public class Main {
        public static void main(String[] args) {
            MyObject object1 = new MyObject();
            MyObject object2 = new MyObject();

            object1.object = object2;
            object2.object = object1;

            object1 = null;
            object2 = null;
        }
    }

    class MyObject {
        public Object object = null;
    }

The last two statements set object1 and object2 to null, which means that the objects they pointed to can no longer be accessed. Under reference counting, however, the two objects reference each other, so their reference counts never drop to 0 and the garbage collector would never reclaim them.

To solve this problem, Java uses reachability analysis. The basic idea is to search starting from a series of "GC Roots" objects; if there is no path between the GC Roots and an object, the object is said to be unreachable. Note, however, that an object judged unreachable does not necessarily become reclaimable right away. It must go through at least two marking passes before it becomes reclaimable; if it still has not escaped by the end of these two passes, it is reclaimed.

I am not entirely clear on the details of how reachability analysis is carried out; if any reader knows more, please feel free to share.

Let's look at an example:

    Object aobj = new Object();
    Object bobj = new Object();
    Object cobj = new Object();
    aobj = bobj;
    aobj = cobj;
    cobj = null;
    aobj = null;

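Which lines can make an object reclaimable? Line 7 (aobj = null) does: after it, nothing refers to the third object any more. Note that line 4 (aobj = bobj) does so as well, because it leaves the first object without any reference. Working through the reasons in detail is left to the reader.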

Let's look at an example:

    String str = new String("Hello");
    SoftReference<String> sr = new SoftReference<String>(new String("Java"));
    WeakReference<String> wr = new WeakReference<String>(new String("world"));

Which of these three statements makes a String object reclaimable? The second and the third. The String object in the second statement is judged reclaimable when memory is insufficient, while the one in the third can be judged reclaimable at any collection, regardless of memory.
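The behaviour of the reference kinds can be probed with a short program. This is a hedged sketch (the class name ReferenceKindsDemo is made up): the only guaranteed part is that a weak reference is not cleared while a strong reference to the same object is still in use. What happens after System.gc() to objects that are only softly or weakly referenced depends on the JVM, so no particular output is promised for main.

```java
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;

class ReferenceKindsDemo {
    // While 'strong' is still in use, the weak reference cannot be cleared.
    static boolean weakSurvivesWhileStronglyHeld() {
        String strong = new String("world");
        WeakReference<String> wr = new WeakReference<>(strong);
        System.gc();                       // only a request; the JVM may ignore it
        return wr.get() == strong;         // 'strong' is used here, so the object is reachable
    }

    public static void main(String[] args) {
        System.out.println("held strongly: " + weakSurvivesWhileStronglyHeld());

        // These referents have no strong reference at all.
        SoftReference<String> sr = new SoftReference<>(new String("Java"));
        WeakReference<String> wr = new WeakReference<>(new String("world"));
        System.gc();
        // Often the weak referent is gone and the soft one is kept while memory
        // is plentiful, but neither outcome is guaranteed.
        System.out.println("soft: " + sr.get());
        System.out.println("weak: " + wr.get());
    }
}
```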

Finally, here is a summary of the common cases in which an object is judged reclaimable:

1) A reference is explicitly set to null, or a reference that pointed to the object is pointed at a new object, as in the following code:

    Object obj = new Object();
    obj = null;
    Object obj1 = new Object();
    Object obj2 = new Object();
    obj1 = obj2;

2) Objects referred to only by local references, as in the following code:

    void fun() {
        // ...
        for (int i = 0; i < 10; i++) {
            Object obj = new Object();
            System.out.println(obj.getClass());
        }
    }

Each time an iteration of the loop finishes, the Object created in it becomes reclaimable.

3) Objects that have only weak references associated with them, such as:

    WeakReference<String> wr = new WeakReference<String>(new String("world"));

II. Typical garbage collection algorithms

Once it has been determined which garbage can be reclaimed, the garbage collector has to start collecting it, and here a question arises: how to reclaim it efficiently. Because the Java Virtual Machine specification does not prescribe how a garbage collector must be implemented, each vendor's virtual machine can implement it differently, so only the core ideas of several common garbage collection algorithms are discussed here.

1. Mark-Sweep algorithm

This is the most basic garbage collection algorithm; it is the most basic because it is the easiest to implement and its idea is the simplest. The Mark-Sweep algorithm has two phases: a mark phase and a sweep phase. The mark phase marks all objects that need to be reclaimed, and the sweep phase reclaims the space occupied by the marked objects. The exact process is as follows:

It is easy to see that the Mark-Sweep algorithm is simple to implement, but it has one serious problem: it readily produces memory fragmentation. Too many fragments can mean that, when a large object must be allocated later, not enough contiguous space can be found, which triggers another garbage collection ahead of time.

2. Copying algorithm

To address the shortcomings of the Mark-Sweep algorithm, the Copying algorithm was proposed. It divides the available memory into two blocks of equal size and uses only one of them at a time. When that block is used up, the surviving objects are copied to the other block, and the used block is then cleared in one go, so memory fragmentation does not readily occur. The exact process is as follows:

This algorithm is simple to implement, runs efficiently, and is not prone to fragmentation, but it is expensive in memory, because the usable memory is cut to half of the original.

Clearly, the efficiency of the Copying algorithm depends heavily on the number of surviving objects: if many objects survive, its efficiency drops sharply.

3. Mark-Compact algorithm

To address the shortcomings of the Copying algorithm and make full use of memory, the Mark-Compact algorithm was proposed. Its mark phase is the same as in Mark-Sweep, but after marking it does not reclaim the dead objects directly. Instead, it moves all surviving objects toward one end and then clears the memory beyond the end boundary. The exact process is as follows:
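The sliding step of Mark-Compact can be sketched in a few lines. This is only an illustration under simplifying assumptions: the heap is an array of equally sized slots, marking has already happened, and the names MarkCompactDemo and compact are invented here. A real collector must also update every reference to a moved object, which this sketch leaves out.

```java
import java.util.Arrays;

class MarkCompactDemo {
    // Slide marked (live) objects toward index 0, then clear everything beyond them.
    // Returns the index of the first free slot after compaction.
    static int compact(String[] heap, boolean[] marked) {
        int free = 0;
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] != null && marked[i]) {
                heap[free] = heap[i];      // move the survivor down (no-op if already in place)
                free++;
            }
        }
        Arrays.fill(heap, free, heap.length, null);   // memory beyond the boundary is cleared
        return free;
    }

    static String[] run() {
        String[] heap = {"a", "b", "c", "d", "e"};
        boolean[] marked = {true, false, true, false, true};
        compact(heap, marked);
        return heap;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(run()));
    }
}
```

Compared with the Copying algorithm, no second block of memory is needed; the price is the extra work of moving objects within the same space.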

  

4. Generational Collection algorithm

The generational collection algorithm is the one used by most JVM garbage collectors today. Its core idea is to divide memory into several regions according to how long objects live. In general, the heap is divided into the old generation (Tenured Generation) and the young generation (Young Generation). The old generation is characterized by only a small number of objects needing to be reclaimed at each collection, while the young generation is characterized by a large number of objects being reclaimed at each collection. The most suitable collection algorithm can then be chosen for each generation according to its characteristics.

At present, most garbage collectors use the Copying algorithm for the young generation, because most objects there are reclaimed at every collection, so few objects need to be copied. In practice, however, the young generation is not split in a 1:1 ratio. Instead, it is usually divided into one larger Eden space and two smaller Survivor spaces. Eden and one of the Survivor spaces are in use at any time; at collection time, the objects still alive in Eden and that Survivor space are copied to the other Survivor space, and then Eden and the Survivor space just used are cleared.

Because only a small number of objects are reclaimed from the old generation at each collection, the Mark-Compact algorithm is generally used there.

Note that outside the heap there is one more generation, the permanent generation (Permanent Generation), which stores classes, constants, method descriptions, and so on. Collection of the permanent generation mainly reclaims two things: obsolete constants and unused classes.

III. Typical garbage collectors

Garbage collection algorithms are the theoretical basis of memory reclamation, and garbage collectors are its concrete implementation. Below are some of the garbage collectors provided by the HotSpot (JDK 7) virtual machine; users can combine the collectors used for each generation according to their needs.

1. Serial/Serial Old

The Serial/Serial Old collectors are the most basic and oldest collectors. They are single-threaded and must suspend all user threads while collecting. Serial is the young-generation collector and uses the Copying algorithm; Serial Old is the old-generation collector and uses the Mark-Compact algorithm. Their advantage is simplicity and efficiency; their disadvantage is the pause they impose on users.

2. ParNew

The ParNew collector is the multithreaded version of the Serial collector; it uses multiple threads for garbage collection.

3. Parallel Scavenge

The Parallel Scavenge collector is a multithreaded (parallel) young-generation collector that uses the Copying algorithm. Like the two collectors above, it suspends user threads while it collects; it differs from them mainly in that its goal is a controllable throughput.

4. Parallel Old

Parallel Old is the old-generation version of the Parallel Scavenge collector (a parallel collector); it uses multiple threads and the Mark-Compact algorithm.

5. CMS

The CMS (Concurrent Mark Sweep) collector aims for the shortest collection pause time. It is a concurrent collector and uses the Mark-Sweep algorithm.

6. G1

The G1 collector is at the forefront of today's collector technology. It is a collector for server-side applications that takes full advantage of multi-CPU, multicore environments. It is therefore both a parallel and a concurrent collector, and it can build a predictable pause-time model.

A few more words on memory allocation:

Objects are, broadly speaking, allocated on the heap: mainly in Eden space and From space of the young generation, and in a few cases directly in the old generation. If Eden space and From space do not have enough room, a GC is triggered; if after the GC Eden space and From space can hold the object, it is placed there. During the GC, the surviving objects in Eden space and From space are moved to To space, and then Eden space and From space are cleared. If To space cannot hold an object during this process, the object is moved to the old generation. After the GC, Eden space and To space are the ones in use; the next GC copies the surviving objects back to From space, and the cycle repeats. Each time an object survives a GC in the Survivor area, its age increases by 1; by default, an object that reaches age 15 is moved to the old generation.
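The aging and promotion rules can be made concrete with a toy simulation. Everything below (the classes Obj and YoungGenDemo, counting capacity in objects rather than bytes, the exact promotion test) is a simplifying assumption for illustration and not how HotSpot is implemented.

```java
import java.util.ArrayList;
import java.util.List;

// A simulated object with an age and a liveness flag (hypothetical model).
class Obj {
    final String name;
    int age = 0;
    boolean live = true;
    Obj(String name) { this.name = name; }
}

class YoungGenDemo {
    final int survivorCapacity;    // how many objects one Survivor space can hold
    final int tenuringThreshold;   // age at which an object is promoted
    List<Obj> eden = new ArrayList<>();
    List<Obj> from = new ArrayList<>();
    List<Obj> to = new ArrayList<>();
    final List<Obj> old = new ArrayList<>();

    YoungGenDemo(int survivorCapacity, int tenuringThreshold) {
        this.survivorCapacity = survivorCapacity;
        this.tenuringThreshold = tenuringThreshold;
    }

    Obj allocate(String name) {
        Obj o = new Obj(name);
        eden.add(o);
        return o;
    }

    // One young-generation GC: survivors of Eden and From go to To or to the old generation.
    void minorGc() {
        List<Obj> candidates = new ArrayList<>(eden);
        candidates.addAll(from);
        for (Obj o : candidates) {
            if (!o.live) continue;                    // dead objects are simply dropped
            o.age++;
            if (o.age >= tenuringThreshold || to.size() >= survivorCapacity) {
                old.add(o);                           // old enough, or To space is full
            } else {
                to.add(o);
            }
        }
        eden.clear();
        from.clear();
        List<Obj> tmp = from;                         // From and To swap roles
        from = to;
        to = tmp;
    }

    // Returns the names in the old generation after three collections.
    static List<String> run() {
        YoungGenDemo heap = new YoungGenDemo(2, 3);
        heap.allocate("a");
        Obj b = heap.allocate("b");
        heap.allocate("c");
        heap.allocate("d");
        b.live = false;
        heap.minorGc();   // a and c fit into To; d overflows into the old generation
        heap.minorGc();   // a and c reach age 2
        heap.minorGc();   // a and c reach age 3 and are promoted
        List<String> names = new ArrayList<>();
        for (Obj o : heap.old) names.add(o.name);
        return names;
    }

    public static void main(String[] args) {
        System.out.println("old generation: " + run());
    }
}
```

The simulation uses a threshold of 3 to keep the run short; with the default of 15 mentioned above, an object would have to survive 15 collections before being promoted by age.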

In general, large objects are allocated directly in the old generation. A large object is one that needs a large amount of contiguous memory; the most common large objects are big arrays, for example:

    byte[] data = new byte[4*1024*1024];

Such an array is typically allocated directly in the old generation.

Of course, the allocation rules are not completely fixed; they depend on which garbage collector combination is in use and on the JVM parameters.
