Java garbage collection mechanism

Source: Internet
Author: User

Object reference

Garbage collection in Java mostly takes place in the Java heap, because almost all object instances are stored there. Any discussion of garbage collection in the heap naturally starts with references. Before JDK 1.2, the definition of a reference was narrow: if the value stored in a reference-type variable is the starting address of another block of memory, that value is said to be a reference to that memory. Since JDK 1.2, Java distinguishes four kinds of references, in decreasing order of strength: strong references (Strong Reference), soft references (Soft Reference), weak references (Weak Reference), and phantom references (Phantom Reference).

    • Strong references: references such as "Object obj = new Object()" are the most common kind in Java programs. As long as a strong reference to an object exists, the garbage collector never reclaims that object.
    • Soft references: used for objects that are still useful but not essential. When system memory runs low, the objects associated with soft references are reclaimed by the garbage collector. Since JDK 1.2, the SoftReference class implements soft references.
    • Weak references: also used for non-essential objects, but weaker than soft references. An object associated only with weak references survives only until the next garbage collection. When the collector runs, such objects are reclaimed whether or not memory is currently sufficient. Since JDK 1.2, the WeakReference class implements weak references.
    • Phantom references: the weakest kind of reference. A phantom reference has no effect on the lifetime of an object, and the object cannot be obtained through it. The only purpose of associating a phantom reference with an object is to receive a system notification when the object is reclaimed by the collector. Since JDK 1.2, the PhantomReference class implements phantom references.
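
The reference kinds listed above map onto classes in java.lang.ref. The following is a minimal sketch, not a definitive test of collector behavior: the class name ReferenceKinds and its helper methods are made up for illustration, and what System.gc() actually reclaims depends on the virtual machine.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;

class ReferenceKinds {
    /** While a strong reference keeps the referent alive, a weak reference still returns it. */
    static boolean weakReturnsLiveReferent() {
        Object strong = new Object();                      // strong reference
        WeakReference<Object> weak = new WeakReference<>(strong);
        System.gc();                                       // only a request; 'strong' keeps the object reachable
        return weak.get() == strong;
    }

    /** A phantom reference never hands out its referent: get() always returns null. */
    static boolean phantomGetReturnsNull() {
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        PhantomReference<Object> phantom = new PhantomReference<>(new Object(), queue);
        return phantom.get() == null;
    }

    public static void main(String[] args) {
        // Soft reference: the referent is kept until memory runs low.
        SoftReference<byte[]> soft = new SoftReference<>(new byte[1024]);
        System.out.println("soft referent present: " + (soft.get() != null));

        System.out.println("weak returns live referent: " + weakReturnsLiveReferent());
        System.out.println("phantom get() is null: " + phantomGetReturnsNull());

        // With no strong reference left, the weakly referenced object may be
        // cleared by the next collection; the printed value is VM-dependent.
        WeakReference<Object> weak = new WeakReference<>(new Object());
        System.gc();
        System.out.println("weakly referenced object after gc: " + weak.get());
    }
}
```

The ReferenceQueue passed to the PhantomReference is where the notification mentioned above would arrive once the referent has been reclaimed.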

Determination of garbage objects

Because almost all object instances live in the Java heap, the garbage collector must first determine which objects in the heap are still in use before it reclaims anything. The following algorithms are used to decide whether an object is garbage:

Reference counting algorithm

Add a reference counter to each object. Each time a reference to the object is created, the counter is incremented by 1; each time a reference becomes invalid, it is decremented by 1. An object whose counter is 0 at any moment can never be used again.

The reference counting algorithm is simple to implement and very efficient at deciding liveness, and in most cases it is a good choice. Java does not use it for garbage collection, however, mainly because it has difficulty handling objects that reference each other in a cycle.
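
The circular reference problem can be illustrated with a toy counter. This is only a sketch under stated assumptions: the Node class, its refCount field, and the pointTo method are hypothetical and maintained by hand; they do not reflect how any real virtual machine tracks references.

```java
class RefCountDemo {
    /** Toy object with a manually maintained reference count (illustrative only). */
    static final class Node {
        final String name;
        int refCount;   // number of references currently pointing at this node
        Node peer;      // the single outgoing reference of this node

        Node(String name) { this.name = name; }

        /** Redirects the outgoing reference and adjusts both counters. */
        void pointTo(Node target) {
            if (peer != null) peer.refCount--;      // the old reference becomes invalid
            peer = target;
            if (target != null) target.refCount++;  // a new reference is created
        }
    }

    /** Builds a two-node cycle, drops the external references, and returns the counters. */
    static int[] countsAfterDroppingExternalRefs() {
        Node a = new Node("a");
        Node b = new Node("b");
        a.refCount++;   // models the local variable 'a'
        b.refCount++;   // models the local variable 'b'
        a.pointTo(b);   // a -> b
        b.pointTo(a);   // b -> a
        a.refCount--;   // models 'a = null'
        b.refCount--;   // models 'b = null'
        return new int[] { a.refCount, b.refCount };
    }

    public static void main(String[] args) {
        int[] counts = countsAfterDroppingExternalRefs();
        // Neither counter reaches 0, although nothing outside the cycle refers to the nodes.
        System.out.println("a=" + counts[0] + ", b=" + counts[1]);  // prints "a=1, b=1"
    }
}
```

Both counters stay at 1, so a pure reference-counting collector would never reclaim the pair.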

Root Search algorithm

Both Java and C# use the root search algorithm (also known as reachability analysis) to determine whether an object is alive. The basic idea is to take a set of objects called "GC Roots" as starting points and search downward from them. The paths traversed by the search are called reference chains. When no reference chain connects an object to any GC Root, the object is unreachable and therefore no longer usable. In Java, the objects that can serve as GC Roots include the following:

    • Objects referenced from the virtual machine stack (the local variable table in a stack frame).
    • Objects referenced by static class fields in the method area.
    • Objects referenced by constants in the method area.
    • Objects referenced by JNI (native methods) in the native method stack.
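
The search from the GC Roots can be sketched as a plain graph traversal. The Obj class and the reachableFrom method below are hypothetical names chosen for illustration; a real collector works on raw heap memory rather than on Java collections.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class Reachability {
    /** Toy heap object holding outgoing references to other objects. */
    static final class Obj {
        final String name;
        final List<Obj> refs = new ArrayList<>();
        Obj(String name) { this.name = name; }
    }

    /** Returns every object connected to at least one root by a reference chain. */
    static Set<Obj> reachableFrom(List<Obj> gcRoots) {
        Set<Obj> marked = new HashSet<>();
        Deque<Obj> pending = new ArrayDeque<>(gcRoots);
        while (!pending.isEmpty()) {
            Obj current = pending.pop();
            if (marked.add(current)) {         // first visit: follow its references
                pending.addAll(current.refs);
            }
        }
        return marked;
    }

    /** A cycle that no root points to is unreachable, unlike under reference counting. */
    static boolean cycleIsUnreachable() {
        Obj root = new Obj("root");
        Obj kept = new Obj("kept");
        Obj a = new Obj("a");
        Obj b = new Obj("b");
        root.refs.add(kept);   // root -> kept
        a.refs.add(b);         // a -> b
        b.refs.add(a);         // b -> a, but nothing reaches a or b from the root
        Set<Obj> live = reachableFrom(List.of(root));
        return live.contains(kept) && !live.contains(a) && !live.contains(b);
    }

    public static void main(String[] args) {
        System.out.println("cycle is unreachable: " + cycleIsUnreachable());  // prints "cycle is unreachable: true"
    }
}
```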

In fact, an object that the root search finds unreachable is not reclaimed immediately. To be declared dead, it must go through at least two marking passes. If the root search finds no reference chain connecting the object to the GC Roots, the object is marked for the first time and then filtered, the criterion being whether it is necessary to execute its finalize() method. If the object does not override finalize(), or the virtual machine has already called its finalize() method, the virtual machine treats execution as unnecessary. If execution is judged necessary, the object is placed in a queue called the F-Queue, and its finalize() method is later run by a low-priority Finalizer thread that the virtual machine creates automatically. The finalize() method is the object's last chance to escape death, because the system calls it at most once per object. Later, the GC performs a second, smaller-scale marking pass over the objects in the F-Queue. To save itself in finalize(), the object only needs to re-attach itself to any object on a reference chain, for example by assigning this to a static field. If the object is still not attached to any reference chain at that point, it is reclaimed.
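
The self-rescue described above can be tried with a small experiment. This is a hedged sketch: the class name FinalizeEscape and the 300 ms wait are arbitrary choices, finalize() is deprecated in recent JDK releases, and whether the finalizer runs at all depends on the virtual machine and its settings, so no particular output is promised.

```java
class FinalizeEscape {
    /** Static field that acts as the reference chain the object re-attaches to. */
    static FinalizeEscape saved;

    @Override
    @SuppressWarnings({"deprecation", "removal"})
    protected void finalize() {
        saved = this;   // the object rescues itself by becoming reachable again
    }

    /** Drops the only reference twice and records whether the object came back each time. */
    static boolean[] run() {
        boolean[] alive = new boolean[2];
        saved = new FinalizeEscape();
        for (int attempt = 0; attempt < alive.length; attempt++) {
            saved = null;      // the object is now unreachable
            System.gc();       // request a collection
            try {
                Thread.sleep(300);   // give the low-priority Finalizer thread time to run
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            alive[attempt] = saved != null;
        }
        return alive;
    }

    public static void main(String[] args) {
        boolean[] alive = run();
        System.out.println("alive after first attempt:  " + alive[0]);
        System.out.println("alive after second attempt: " + alive[1]);
    }
}
```

If finalization is active, the first attempt can succeed while the second cannot, because finalize() is invoked at most once per object.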

Garbage collection algorithm

Once garbage objects have been identified, they can be collected. The sections below describe several garbage collection algorithms. Because real implementations involve a great deal of detail, the focus here is on the idea behind each algorithm rather than on implementation specifics.

Mark-sweep algorithm

Mark-sweep is the most basic collection algorithm. It has two phases, "mark" and "sweep": first all objects that need to be reclaimed are marked, and once marking is complete, all marked objects are reclaimed in one pass. The marking process is the same one the root search algorithm uses to identify garbage objects.

(Figures: heap state before collection; heap state after collection.)
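
The two phases can be sketched on a toy heap. The assumptions are loud here: the heap is modeled as a String array where null means a free slot, and the marked set contains the objects found reachable by the root search, so everything outside it is reclaimed. This is one common way to realize the idea, not the code of any real collector.

```java
import java.util.Arrays;
import java.util.Set;

class MarkSweepDemo {
    /** Frees every occupied slot whose object is not in the marked (reachable) set. */
    static String[] sweep(String[] heap, Set<String> marked) {
        String[] result = heap.clone();
        for (int i = 0; i < result.length; i++) {
            if (result[i] != null && !marked.contains(result[i])) {
                result[i] = null;   // reclaimed in place; survivors do not move
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String[] heap = {"A", "B", "C", "D", "E", null};
        Set<String> marked = Set.of("A", "C", "E");   // result of the mark phase
        // prints "[A, null, C, null, E, null]"
        System.out.println(Arrays.toString(sweep(heap, marked)));
    }
}
```

Because survivors stay where they are, the free slots end up scattered between them, which is the fragmentation that the other algorithms try to avoid.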

Mark-compact algorithm

The copying algorithm suits the young generation. In the old generation, object survival rates are high, and performing many copy operations would be inefficient, so the old generation generally uses other algorithms, such as mark-compact. Its marking phase is the same as in mark-sweep, but the subsequent step differs: instead of reclaiming the garbage objects in place, it moves all surviving objects toward one end of the memory region and then frees all memory beyond the boundary of that end.

(Figures: heap state before collection; heap state after collection.)
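
The compaction step can be sketched on a toy heap as well. The model is an assumption made for illustration: a String array where null means a free slot, plus a set of marked (surviving) objects. Real collectors must also update every reference to a moved object, which this sketch leaves out.

```java
import java.util.Arrays;
import java.util.Set;

class MarkCompactDemo {
    /** Slides all surviving objects to the start of the heap and frees everything after them. */
    static String[] compact(String[] heap, Set<String> marked) {
        String[] result = heap.clone();
        int free = 0;   // next slot at the compacted end
        for (int i = 0; i < result.length; i++) {
            String obj = result[i];
            if (obj != null && marked.contains(obj)) {
                result[free++] = obj;   // move the survivor toward one end
            }
        }
        Arrays.fill(result, free, result.length, null);   // clear memory beyond the boundary
        return result;
    }

    public static void main(String[] args) {
        String[] heap = {"A", "B", "C", "D", "E", null};
        Set<String> marked = Set.of("A", "C", "E");
        // prints "[A, C, E, null, null, null]"
        System.out.println(Arrays.toString(compact(heap, marked)));
    }
}
```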

Generational collection

Current commercial virtual machines all use generational collection, which divides memory into several regions according to object lifetimes. The Java heap is generally divided into the young generation and the old generation. In the young generation, each collection finds that most objects have died and only a few survive, so the copying algorithm is a good fit: only the few survivors need to be copied. In the old generation, the survival rate is high and there is no extra space to serve as an allocation guarantee, so it must be collected with the mark-sweep or mark-compact algorithm.
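
The division into generations can be modeled roughly as follows. Everything in this sketch is hypothetical: the two lists stand in for the young and old generations, TENURE_AGE is an invented promotion threshold, and liveness is supplied by a predicate instead of a root search.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

class GenerationalDemo {
    static final int TENURE_AGE = 2;   // hypothetical: collections survived before promotion

    static final class Obj {
        final String name;
        int age;   // number of young-generation collections survived
        Obj(String name) { this.name = name; }
    }

    final List<Obj> young = new ArrayList<>();
    final List<Obj> old = new ArrayList<>();

    /** New objects start in the young generation. */
    Obj allocate(String name) {
        Obj obj = new Obj(name);
        young.add(obj);
        return obj;
    }

    /** Young-generation collection: copy the few survivors, drop the rest, promote aged objects. */
    void minorGc(Predicate<Obj> isLive) {
        List<Obj> survivors = new ArrayList<>();
        for (Obj obj : young) {
            if (!isLive.test(obj)) continue;   // dead objects are simply not copied
            obj.age++;
            if (obj.age >= TENURE_AGE) old.add(obj); else survivors.add(obj);
        }
        young.clear();
        young.addAll(survivors);
    }

    /** Runs two collections and returns {young size, old size}. */
    static int[] demo() {
        GenerationalDemo heap = new GenerationalDemo();
        heap.allocate("cache");   // long-lived
        heap.allocate("tmp1");    // short-lived
        heap.minorGc(o -> o.name.equals("cache"));
        heap.allocate("tmp2");    // short-lived
        heap.minorGc(o -> o.name.equals("cache"));
        return new int[] { heap.young.size(), heap.old.size() };
    }

    public static void main(String[] args) {
        int[] sizes = demo();
        System.out.println("young=" + sizes[0] + ", old=" + sizes[1]);   // prints "young=0, old=1"
    }
}
```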

Garbage collector

A garbage collector is a concrete implementation of a memory reclamation algorithm. The Java Virtual Machine Specification does not prescribe how a collector must be implemented, so the collectors offered by different vendors and different virtual machine versions can differ greatly. The Sun HotSpot virtual machine, version 1.6, includes the following collectors: Serial, ParNew, Parallel Scavenge, CMS, Serial Old, and Parallel Old. They are combined in various pairings to collect the different generations.
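
Which collectors a running virtual machine actually uses can be inspected through the standard java.lang.management API. The sketch below only lists what the current JVM reports; the class name ListCollectors is made up, and the names printed depend on the JVM version and the options it was started with.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

class ListCollectors {
    /** Returns the names of the collectors active in this JVM. */
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Count and accumulated time (in milliseconds) of collections so far.
            System.out.println(gc.getName()
                    + ": collections=" + gc.getCollectionCount()
                    + ", timeMs=" + gc.getCollectionTime());
        }
    }
}
```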

Garbage collection analysis

Before analyzing the code, note the following three points about the memory allocation strategy:

    • Objects are allocated in Eden first.
    • Large objects go directly into the old generation.
    • Long-lived objects are moved into the old generation.

Two points about the garbage collection strategy are also worth stating:

    • Young generation GC (Minor GC): a collection that takes place in the young generation. Because most Java objects die young, Minor GCs are very frequent and usually fast.
    • Old generation GC (Major GC / Full GC): a collection that takes place in the old generation. A Major GC is often accompanied by at least one Minor GC. Because objects in the old generation are long-lived, Major GCs are infrequent and generally happen only when the old generation fills up. A Major GC is typically more than 10 times slower than a Minor GC. In addition, if direct memory has been allocated, the discarded objects in direct memory are also cleaned up when a Full GC runs in the old generation.

Let's look at the following code:

public class SlotGc {
    public static void main(String[] args) {
        byte[] holder = new byte[32 * 1024 * 1024];
        System.gc();
    }
}

The code simply fills 32 MB of memory and then asks the virtual machine to perform a garbage collection. After compiling with javac, we run java -verbose:gc SlotGc to view the garbage collection results, and obtain the following output:

[GC 208K->134K(5056K), 0.0017306 secs]
[Full GC 134K->134K(5056K), 0.0121194 secs]
[Full GC 32902K->32902K(37828K), 0.0094149 sec

In the third line, the number before "->" is the amount of memory occupied by objects in the heap before the collection, the number after "->" is the amount occupied after the collection, the number in parentheses is the total capacity of the heap, and the final value (here 0.0094149 seconds) is the time the collection took.
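
The fields of such a line can be pulled apart with a regular expression. This is a small sketch that assumes exactly the log format shown in this article; the class GcLogLine and its method reclaimedKb are illustrative names, and other JVM versions print GC logs in different formats.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GcLogLine {
    // Groups: 1 = GC kind, 2 = used before (K), 3 = used after (K), 4 = heap capacity (K), 5 = seconds
    private static final Pattern LINE = Pattern.compile(
            "\\[(GC|Full GC) (\\d+)K->(\\d+)K\\((\\d+)K\\), ([0-9.]+) secs\\]");

    /** Returns how many kilobytes the collection described by the line freed. */
    static long reclaimedKb(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.matches()) {
            throw new IllegalArgumentException("unrecognized GC log line: " + line);
        }
        long before = Long.parseLong(m.group(2));
        long after = Long.parseLong(m.group(3));
        return before - after;
    }

    public static void main(String[] args) {
        // prints "74"
        System.out.println(reclaimedKb("[GC 208K->134K(5056K), 0.0017306 secs]"));
        // prints "0"
        System.out.println(reclaimedKb("[Full GC 134K->134K(5056K), 0.0121194 secs]"));
    }
}
```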

As the output shows, System.gc() did not reclaim the 32 MB. This is expected: the variable holder is still in scope, so the virtual machine will not reclaim the memory of the object it references.

We modify the code as follows:

public class SlotGc {
    public static void main(String[] args) {
        {
            byte[] holder = new byte[32 * 1024 * 1024];
        }
        System.gc();
    }
}

With the added braces, the scope of holder is limited to the block, so by the time System.gc() runs, holder can no longer be accessed. Logically, the memory of the object it referenced should now be reclaimed. However, the garbage collection output is as follows:

[GC 208K->134K(5056K), 0.0017100 secs]
[Full GC 134K->134K(5056K), 0.0125887 secs]
[Full GC 32902K->32902K(37828K), 0.0089226 secs]

Clearly, the 32 MB of data still has not been reclaimed. Let's make one more change:

public class SlotGc {
    public static void main(String[] args) {
        {
            byte[] holder = new byte[32 * 1024 * 1024];
            holder = null;
        }
        System.gc();
    }
}

The garbage collection information obtained this time is as follows:

[GC 208K->134K(5056K), 0.0017194 secs]
[Full GC 134K->134K(5056K), 0.0124656 secs]
[Full GC 32902K->134K(37828K), 0.0091637 secs]

This shows that the memory of the object referenced by holder has now been reclaimed. Let's analyze this step by step.

First, be clear about the root cause: whether the array can be reclaimed depends on whether a slot in the local variable table still holds a reference to the holder array object.

In the first modification, although the collection is triggered after holder has gone out of scope, nothing reads from or writes to the local variable table afterward, and the slot that holder occupied has not been reused by any other variable (see the discussion of slots in the article on Java memory areas and memory overflow). The local variable table, as part of the GC Roots, therefore still holds a reference to the array. Because this association is not broken in time, the collector does not reclaim the memory of the object referenced by holder. In the second modification, holder is manually set to null before the collector runs, which clears the slot it occupies in the local variable table, so the collector can reclaim the memory of the object that holder previously referenced.

Of course, there are other ways to have the memory of the object referenced by holder reclaimed. All that matters is that the slot holder occupied gets reused, for example by a read or write of the local variable table after the scope of holder ends, such as declaring and assigning a new local variable.
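
One way to reuse the slot is to declare another local variable after the block. The following variant is a sketch only: the class name SlotReuseGc and the variable reuse are made up, and whether the 32 MB is actually reclaimed depends on how the compiler assigns slots and on whether the method runs interpreted or JIT-compiled. It can be run with java -verbose:gc SlotReuseGc to compare against the earlier outputs.

```java
class SlotReuseGc {
    public static void main(String[] args) {
        {
            byte[] holder = new byte[32 * 1024 * 1024];
        }
        // A new local variable declared after the block; the compiler may give it
        // the slot that holder occupied, which overwrites the stale reference.
        int reuse = 0;
        System.gc();
    }
}
```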

Assigning null to a variable is not the best way to control when an object is collected; controlling it through appropriately narrow variable scopes is the most elegant solution. Moreover, the null assignment is eliminated when the code is optimized by the virtual machine's JIT compiler, and after JIT compilation System.gc() can reclaim the memory correctly even without the null assignment.

Performance tuning

Memory management and garbage collection are among the most important parts of the Java virtual machine architecture and have a major influence on the performance and stability of programs, especially on the server side. Performance tuning requires case-by-case analysis and may need to take many factors into account. Only a few simple, common cases are introduced briefly here.

    • We can improve server responsiveness by allocating a very large heap to the Java virtual machine (provided the physical machine has enough memory). The precondition for doing so is that the application's Full GC frequency can be kept low enough, because a single Full GC on a large heap causes a long pause. The key to keeping Full GCs rare is to ensure that most objects in the application are short-lived, and in particular that large objects do not live long, so that the old generation stays stable.
    • Direct memory is allocated outside the heap. Heap and direct memory are both bounded by the physical machine's memory, and the two trade off against each other, so when a large heap is allocated and the NIO mechanism is used to allocate a lot of direct memory, a direct-memory OutOfMemoryError may result. The size of direct memory can be adjusted with the -XX:MaxDirectMemorySize parameter.
    • Besides the Java heap, the permanent generation, and direct memory, the following areas also consume a significant amount of memory. The sum of all of these is limited by the maximum memory of an operating system process:

1. Thread stacks: the size can be adjusted with -Xss. When memory is insufficient, a StackOverflowError is thrown (vertical allocation fails, that is, no new stack frame can be allocated) or an OutOfMemoryError is thrown (horizontal allocation fails, that is, no new thread can be created).

2. Socket buffers: each socket connection has a receive buffer and a send buffer, occupying approximately 37 KB and 25 KB of memory respectively. If they cannot be allocated, an IOException: Too many open files exception may be thrown. For a detailed introduction to socket buffers, see the in-depth analysis of sockets in my Java network programming series.

3. JNI code: if the code uses JNI to call native libraries, the memory used by those libraries is not in the heap.

4. The virtual machine and the GC: the virtual machine's own code execution and the GC also consume a certain amount of memory.
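
The difference between heap memory and direct memory can be seen with the NIO ByteBuffer API. This is a minimal sketch: the class name DirectMemoryDemo, the buffer sizes, and the option values in the comment are arbitrary examples, not recommendations.

```java
import java.nio.ByteBuffer;

// Example launch (values are illustrative only):
//   java -Xmx512m -Xss512k -XX:MaxDirectMemorySize=64m DirectMemoryDemo
class DirectMemoryDemo {
    /** Allocates a buffer outside the Java heap through NIO. */
    static ByteBuffer allocateOffHeap(int bytes) {
        return ByteBuffer.allocateDirect(bytes);
    }

    public static void main(String[] args) {
        ByteBuffer direct = allocateOffHeap(1024 * 1024);       // counts against direct memory
        ByteBuffer onHeap = ByteBuffer.allocate(1024 * 1024);   // counts against the Java heap
        System.out.println("direct.isDirect(): " + direct.isDirect());
        System.out.println("onHeap.isDirect(): " + onHeap.isDirect());
        System.out.println("max heap (bytes): " + Runtime.getRuntime().maxMemory());
    }
}
```

Requesting more direct memory than the configured limit allows is what leads to the direct-memory OutOfMemoryError mentioned above.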
