"Reprint" Java performance optimization of the JVM GC (garbage collection Mechanism)

Source: Internet
Author: User

Chapter Source: https://zhuanlan.zhihu.com/p/25539690

Notes on Java performance optimization, collected into one article for later review.

JVM GC (garbage collection mechanism)

Before learning about Java GC, we need to remember one term: stop-the-world. It occurs in every GC algorithm. Stop-the-world means that the JVM stops executing the application in order to perform a GC. When it occurs, every thread except those needed by the GC is put into a wait state until the GC task completes. Most GC tuning is about reducing the time spent in stop-the-world pauses.



Which areas does the JVM GC reclaim?

Note that the JVM GC reclaims only objects in the heap and the method area. Data on the stack is freed automatically by the JVM when it goes out of scope, so it is not managed by the GC.

How does the JVM GC determine that objects can be recycled?

· The object no longer has any references

· An uncaught exception occurs in the scope

· The scope finishes executing normally

· The program calls System.exit()

· The program terminates unexpectedly (a thread is killed, etc.)

There is no way to explicitly allocate and free memory in a Java program, because the JVM does this for us through the GC.

Sometimes we set a reference to null to try to release memory explicitly, but an object whose reference is set to null is not necessarily marked as recyclable: it may have escaped, that is, it may still be reachable through another reference.

Setting a reference to null is at least harmless, but calling System.gc() is not advisable. System.gc() does not run a GC immediately; it may wait for a while or not run at all. If it does run, it triggers a full GC, which greatly affects performance.
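The escape case can be illustrated with a small Java sketch. The class name NullingDemo and the static list REGISTRY are made up for this example; a WeakReference is used only as a probe to see whether the object is still reachable.

```java
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.List;

class NullingDemo {
    // Hypothetical second strong reference that lets the object "escape".
    static final List<Object> REGISTRY = new ArrayList<>();

    /** Returns true if the object is still reachable after one reference is set to null. */
    static boolean survivesAfterNulling() {
        Object data = new byte[1024];
        REGISTRY.add(data);                         // the object escapes into a static list
        WeakReference<Object> probe = new WeakReference<>(data);
        data = null;                                // clears only this local reference
        System.gc();                                // only a request; the JVM may ignore it
        return probe.get() != null;                 // still strongly reachable via REGISTRY
    }

    public static void main(String[] args) {
        System.out.println("still reachable: " + survivesAfterNulling());
    }
}
```

Because the static list still holds a strong reference, the object cannot be collected no matter whether the requested GC actually runs.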

When does the JVM GC execute?

When Eden does not have enough space for a new object, a minor GC is executed. A full GC is executed when the objects being promoted to the old generation are larger than the space remaining there, or, when they are smaller, if the HandlePromotionFailure parameter forces it. Tuning is mainly about reducing the number of full GCs. This can be done with NewRatio, which controls the ratio between the young and old generations, and with MaxTenuringThreshold, which sets the age at which objects move to the old generation (described later).

Generational garbage collection

Young generation: the vast majority of newly created objects are allocated here. Since most of them become unreachable soon after they are created, many objects are created in the young generation and then "disappear". The process by which objects disappear from this area is called a minor GC.

Old generation: objects that have not become unreachable and have survived the young generation are copied here. It is given more space than the young generation, and because of that, GC happens much less often here than in the young generation. The process by which objects disappear from the old generation is called a major GC or full GC.

Permanent generation, also known as the method area: it holds class constants and string constants (HotSpot replaced it with Metaspace in JDK 8). Note that this area is not used to store objects that survived the old generation. GC can also occur here, and GC events in this area are counted as major GC. The conditions for reclaiming a class in this area are strict, however; all three of the following must hold:

1. All instances of the class have been reclaimed

2. The ClassLoader that loaded the class has been reclaimed

3. The Class object cannot be accessed by any means (including reflection)

This may raise a question:

What happens if an object in the old generation needs to reference an object in the young generation?

To handle this, the old generation has a card table, organized in 512-byte chunks. Every reference from an old-generation object to a young-generation object is recorded in this table. When a GC is performed on the young generation, only the card table needs to be consulted to decide whether an object can be reclaimed, instead of scanning the entire old generation. The card table is maintained by a write barrier. The write barrier adds some overhead, but it makes minor GC much faster, so the cost is well worth it.
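The card table idea can be sketched in a few lines of Java. This is a toy model, not HotSpot's implementation: the class CardTableSketch, its methods, and the use of plain byte offsets are all assumptions made for illustration.

```java
class CardTableSketch {
    static final int CARD_SIZE = 512;           // bytes of old generation covered by one card
    private final boolean[] dirty;

    CardTableSketch(int oldGenBytes) {
        dirty = new boolean[(oldGenBytes + CARD_SIZE - 1) / CARD_SIZE];
    }

    /** Write barrier: called whenever a reference field at this old-generation offset is updated. */
    void writeBarrier(int fieldOffset) {
        dirty[fieldOffset / CARD_SIZE] = true;
    }

    /** A minor GC would scan only the dirty cards instead of the whole old generation. */
    int dirtyCardCount() {
        int count = 0;
        for (boolean d : dirty) {
            if (d) count++;
        }
        return count;
    }

    int cardCount() {
        return dirty.length;
    }

    public static void main(String[] args) {
        CardTableSketch table = new CardTableSketch(4096);
        table.writeBarrier(0);
        table.writeBarrier(100);    // same card as offset 0
        table.writeBarrier(1024);   // a different card
        System.out.println(table.dirtyCardCount() + " of " + table.cardCount() + " cards dirty");
    }
}
```

In this example only 2 of the 8 cards would need to be scanned during a minor GC.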

By default, the ratio of young generation space to old generation space is 1:2.

Composition and logic of the young generation

To understand the GC better, let's look at how the young generation is organized. The young generation holds objects when they are first created, and it is divided into three spaces:

· One Eden space

· Two survivor spaces (From Survivor, To Survivor)

Default allocation of young generation space: Eden:From:To = 8:1:1
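As a worked example of the 1:2 ratio and the 8:1:1 split described above, the following Java sketch computes idealized region sizes. The class and method names are hypothetical, and a real JVM also applies alignment and adaptive sizing, so actual numbers will differ.

```java
class GenerationSizes {
    /**
     * Returns {old, eden, from, to} in MB.
     * newRatio is old:young (2 means young is one third of the heap);
     * survivorRatio is eden:one survivor space (8 gives 8:1:1).
     */
    static long[] split(long heapMb, int newRatio, int survivorRatio) {
        long young = heapMb / (newRatio + 1);
        long old = heapMb - young;
        long survivor = young / (survivorRatio + 2);
        long eden = young - 2 * survivor;
        return new long[] {old, eden, survivor, survivor};
    }

    public static void main(String[] args) {
        long[] s = split(3000, 2, 8);
        System.out.println("old=" + s[0] + " eden=" + s[1] + " from=" + s[2] + " to=" + s[3]);
    }
}
```

For a 3000 MB heap this gives old=2000, eden=800, from=100 and to=100.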

Objects move through these spaces in the following order:

1. Most newly created objects are placed in the Eden space.

2. After the first GC (minor GC) in Eden, the surviving objects are moved to one of the survivor spaces.

3. After that, each time a GC is performed in Eden, the surviving objects pile up in the same survivor space.

4. When that survivor space is full, the surviving objects are moved to the other survivor space, and the full one is then emptied.

5. Objects that are still alive after the steps above have been repeated n times (n = MaxTenuringThreshold, the age threshold, default 15) are moved to the old generation.

As the steps show, one of the two survivor spaces must always be empty. If both survivor spaces contain data, or both are empty, something is wrong with your system.

The important point to remember is that a newly created object is stored in Eden, and long-lived objects reach the old generation by way of the survivor spaces.

There are exceptions: some large objects (which need a large contiguous block of memory) go straight to the old generation. This generally happens when the survivor space is too small.
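The aging and promotion logic can be sketched as a small simulation. This is a simplification under stated assumptions: every object is treated as still alive, survivors are copied to the empty survivor space on every minor GC, and an object is promoted once its age reaches the threshold. The class YoungGenSketch and its fields are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

class YoungGenSketch {
    final int maxTenuringThreshold;
    List<int[]> eden = new ArrayList<>();    // each object is modeled as {id, age}
    List<int[]> from = new ArrayList<>();
    List<int[]> to = new ArrayList<>();      // always empty between collections
    final List<int[]> old = new ArrayList<>();

    YoungGenSketch(int maxTenuringThreshold) {
        this.maxTenuringThreshold = maxTenuringThreshold;
    }

    void allocate(int id) {
        eden.add(new int[] {id, 0});
    }

    /** One minor GC in which every object is assumed to be still alive. */
    void minorGc() {
        List<int[]> survivors = new ArrayList<>(eden);
        survivors.addAll(from);
        for (int[] obj : survivors) {
            obj[1]++;                                   // the object has survived one more GC
            if (obj[1] >= maxTenuringThreshold) {
                old.add(obj);                           // old enough: promote
            } else {
                to.add(obj);                            // otherwise copy to the empty survivor space
            }
        }
        eden.clear();
        from.clear();
        List<int[]> swap = from;                        // the two survivor spaces trade roles
        from = to;
        to = swap;
    }

    public static void main(String[] args) {
        YoungGenSketch heap = new YoungGenSketch(2);
        heap.allocate(1);
        heap.minorGc();
        heap.allocate(2);
        heap.minorGc();
        System.out.println("old=" + heap.old.size() + " from=" + heap.from.size() + " to=" + heap.to.size());
    }
}
```

With a threshold of 2, the first object is promoted on its second minor GC, and one survivor space is always empty after each collection.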

Composition and logic of the old generation

The old generation is actually very simple. Unlike the young generation, it is not divided into several areas; it has only one region. And unlike the vast majority of young-generation objects, the objects stored in it are not "born in the morning and dead by evening". Almost all of them have come through the survivor spaces and will not die easily. As a result, full GC (major GC) happens less often than minor GC, and a major GC takes longer (about 10 times longer) than a minor GC.

JVM GC Algorithm Explained

1. Root Search algorithm

The root search algorithm comes from graph theory in discrete mathematics. The program treats all reference relationships as a graph. Starting from a GC root node, it looks for the nodes that node references, and after finding them it continues with the nodes they reference. Once all referenced nodes have been searched, the remaining nodes are considered unreferenced, that is, useless.

(In the original figure, the red nodes are the useless ones, which can be reclaimed.)

Currently, the objects that can serve as GC roots in Java are:

1. Objects referenced from the virtual machine stack (the local variable table)

2. Objects referenced by static fields in the method area

3. Objects referenced by constants in the method area

4. Objects referenced from the native method stack (JNI)

Basically all GC algorithms build on the root search concept.
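The root search can be sketched as a breadth-first traversal of a reference graph. The string node names and the map-based graph are assumptions for illustration; a real collector walks actual object references.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class ReachabilitySketch {
    /** Returns every node reachable from the given roots by following references. */
    static Set<String> reachable(Map<String, List<String>> refs, Collection<String> roots) {
        Set<String> marked = new HashSet<>(roots);
        Deque<String> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            String node = pending.poll();
            for (String next : refs.getOrDefault(node, Collections.<String>emptyList())) {
                if (marked.add(next)) {          // add() is false if already visited
                    pending.add(next);
                }
            }
        }
        return marked;
    }

    public static void main(String[] args) {
        Map<String, List<String>> refs = new HashMap<>();
        refs.put("root", Arrays.asList("a"));
        refs.put("a", Arrays.asList("b"));
        refs.put("c", Arrays.asList("d"));       // c and d reference each other
        refs.put("d", Arrays.asList("c"));       // but nothing reachable points to them
        System.out.println("reachable: " + reachable(refs, Arrays.asList("root")));
    }
}
```

Note that c and d form a cycle yet are still unreachable from the root, which is why root search handles circular references that simple reference counting cannot.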

2. Mark-Sweep algorithm

The mark-sweep algorithm scans from the root set and marks the surviving objects. Once marking is complete, it scans the whole space and reclaims the unmarked objects directly.

The mark-sweep algorithm does not move objects and processes only the dead ones, so it is very efficient when many objects survive. However, because it reclaims dead objects in place and does not compact the survivors, it causes memory fragmentation.
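The fragmentation effect can be shown with a toy heap of equal-sized slots. This is only a sketch: the boolean arrays and the class MarkSweepSketch are assumptions, and the mark bits are given as input rather than computed from roots.

```java
class MarkSweepSketch {
    /** Sweep: free every slot that is allocated but not marked; live objects stay where they are. */
    static boolean[] sweep(boolean[] allocated, boolean[] marked) {
        boolean[] after = new boolean[allocated.length];
        for (int i = 0; i < allocated.length; i++) {
            after[i] = allocated[i] && marked[i];
        }
        return after;
    }

    /** Length of the longest run of free slots, i.e. the largest object that still fits. */
    static int largestFreeRun(boolean[] allocated) {
        int best = 0;
        int run = 0;
        for (boolean used : allocated) {
            run = used ? 0 : run + 1;
            best = Math.max(best, run);
        }
        return best;
    }

    public static void main(String[] args) {
        boolean[] allocated = {true, true, true, true, true, true};
        boolean[] marked = {true, false, true, false, true, false};
        boolean[] after = sweep(allocated, marked);
        System.out.println("largest free run: " + largestFreeRun(after));
    }
}
```

Three slots are freed in this example, but no two of them are adjacent, so an object needing two contiguous slots cannot be placed.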

3. Copying algorithm

The copying algorithm divides memory into two halves. All dynamically allocated objects are placed in only one half (the active half), while the other half is kept idle.

The copying algorithm scans from the root set and copies the surviving objects into the idle half. When the active half has been fully scanned, it is reclaimed all at once, and the formerly idle half becomes the active half. The next GC repeats the same operation, and so on in a loop.

The copying algorithm is very efficient when few objects survive, but the price is that half of the memory must be set aside as the target for moving objects. It therefore suits only cases where the survival rate is very low, and above all it requires accepting that 50% of the memory is wasted.
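A minimal sketch of one copying collection follows, assuming objects are plain strings and the set of live objects has already been determined from the roots. The class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class CopyingSketch {
    /** Copies live objects from the active half into a fresh idle half and returns it. */
    static String[] collect(String[] active, Set<String> live) {
        String[] idle = new String[active.length];
        int next = 0;
        for (String obj : active) {
            if (obj != null && live.contains(obj)) {
                idle[next++] = obj;              // survivors end up packed together
            }
        }
        return idle;                             // the old active half is discarded as a whole
    }

    public static void main(String[] args) {
        String[] active = {"a", "b", "c", "d"};
        Set<String> live = new HashSet<>(Arrays.asList("b", "d"));
        System.out.println(Arrays.toString(collect(active, live)));
    }
}
```

The work done is proportional to the number of survivors, which is why the approach pays off when few objects survive.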

4. Mark-compact algorithm

The mark-compact algorithm marks objects in the same way as mark-sweep, but after reclaiming the space occupied by dead objects it moves all surviving objects toward the free space at the left end and updates the corresponding pointers. It builds on mark-sweep by additionally moving objects, so it costs more, but it solves the memory fragmentation problem.
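The compaction step can be sketched as an in-place slide to the left. As with the other sketches, the string heap and the precomputed live set are assumptions, and pointer updating is not modeled.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class MarkCompactSketch {
    /** Slides live objects to the left end of the heap in place; returns how many survived. */
    static int compact(String[] heap, Set<String> live) {
        int next = 0;                            // next free slot at the left end
        for (int i = 0; i < heap.length; i++) {
            String obj = heap[i];
            heap[i] = null;                      // vacate the current slot
            if (obj != null && live.contains(obj)) {
                heap[next++] = obj;              // next <= i, so no live object is overwritten
            }
        }
        return next;
    }

    public static void main(String[] args) {
        String[] heap = {"a", "b", "c", "d"};
        int survivors = compact(heap, new HashSet<>(Arrays.asList("a", "c")));
        System.out.println(survivors + " live: " + Arrays.toString(heap));
    }
}
```

Unlike the copying approach, no second half of memory is needed, and the free space ends up as one contiguous block on the right.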

To optimize memory reclamation, the JVM uses generational collection: the young generation (minor GC) is mainly collected with the copying algorithm, while the old generation (major GC) is mostly collected with the mark-compact algorithm.

Introduction to the garbage collectors

Note that every collector has stop-the-world pauses. Collectors differ in how they optimize the pause time and in the algorithms they use, so choose the one that fits your needs.

1. Serial (-XX:+UseSerialGC)

As the name suggests, this is a serial collector.

The Serial collector is the most basic and oldest collector in the Java virtual machine. Before JDK 1.3 it was the only choice for the young-generation collector. It is also currently the default garbage collector under the client VM and, under the server VM, on machines below 4 cores and 4 GB. "Serial" does not simply mean that it collects using a single CPU; it means that when the JVM needs to collect garbage, it pauses all user threads until the collection is complete.

Algorithm used: copying

JVM stands for Java Virtual Machine, so it works like a virtual computer in which each thread is treated as a processor of the JVM. The CPU0 and CPU1 shown in the original diagram are therefore user threads, not real machine CPUs; do not confuse the two.

Although the Serial collector is the oldest, in a single-CPU environment it is the most efficient compared with the other collectors, because it has no thread-interaction overhead and can concentrate entirely on garbage collection.

2. Serial Old (-XX:+UseSerialGC)

Serial Old is the old-generation version of the Serial collector. It is also a single-threaded collector and is currently used mainly in client mode. In server mode it has two main uses: one is to pair with the Parallel Scavenge collector in JDK 1.5 and earlier, and the other is to serve as the fallback for the CMS collector. If CMS hits a Concurrent Mode Failure, Serial Old is used as the backup collector.

Algorithm used: mark-compact

Its execution pattern is the same as that of the Serial collector.

3. ParNew (-XX:+UseParNewGC)

ParNew is essentially a multithreaded version of the Serial collector. Apart from the Serial collector, it is the only young-generation collector that can work with the CMS collector.

Algorithm used: copying

ParNew is the preferred young-generation collector for many JVMs running in server mode. On a single CPU, however, it is far less efficient than the Serial collector, so pay attention to the environment in which it is used.

4. Parallel Scavenge (-XX:+UseParallelGC)

Parallel Scavenge, also known as the throughput-first collector, is a young-generation collector similar to ParNew.

Algorithm used: copying

The goal of the Parallel Scavenge collector is to reach a controllable throughput. Throughput is the ratio of CPU time spent running user code to total CPU time consumed, that is, throughput = user code time / (user code time + garbage collection time). If the virtual machine runs for 100 minutes in total and garbage collection takes 1 minute, throughput is 99%.
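The throughput formula above can be written as a one-line Java function. The class and method names are made up for this example.

```java
class ThroughputSketch {
    /** throughput = user code time / (user code time + garbage collection time) */
    static double throughput(double userMinutes, double gcMinutes) {
        return userMinutes / (userMinutes + gcMinutes);
    }

    public static void main(String[] args) {
        // 100 minutes in total, of which 1 minute is garbage collection
        System.out.println(throughput(99, 1) * 100 + "%");
    }
}
```

For 99 minutes of user code and 1 minute of GC, the function returns 0.99, matching the 99% in the text.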

5. Parallel Old (-XX:+UseParallelOldGC)

Parallel Old is a parallel collector. Like Serial Old, it is an old-generation collector; it is the throughput-first collector for the old generation. It is available only from JDK 1.6 onward. Before that, Parallel Scavenge could only be paired with Serial Old for the old generation, which seriously dragged down the overall speed of Parallel Scavenge. With Parallel Old, the "throughput first" collector finally lives up to its name.

Algorithm used: mark-compact

When throughput is the priority and more than one CPU is available, consider the Parallel Scavenge + Parallel Old combination first.

6. CMS (-XX:+UseConcMarkSweepGC)

CMS (Concurrent Mark Sweep) is an old-generation collector, also called the Concurrent Low Pause Collector. It was introduced in later releases of JDK 1.4 and further improved in JDK 1.5 and 1.6. It is designed for applications where response time matters more than throughput, which makes it a good fit for servers with strict response-time requirements.

A major feature of CMS is that it replaces the one long pause of the serial or parallel marking algorithms with two short pauses.

Algorithm used: mark-sweep

CMS runs through the following phases:

· Initial mark (STW initial mark)

In this phase the virtual machine must pause the running application threads, which is officially called STW (Stop The World). The phase scans and marks the objects directly referenced from the roots. It completes very quickly.

· Concurrent mark (Concurrent marking)

This phase follows the initial mark and continues tracing downward from the objects marked there. Because it is concurrent, the user threads can run at the same time as the GC threads; this phase does not pause the user threads.

· Concurrent preclean (Concurrent precleaning)

This phase is still concurrent. The JVM looks for objects that entered the old generation while the concurrent mark phase was running (objects promoted from the young generation, or objects allocated directly in the old generation). Rescanning them here reduces the work of the next phase, the remark, which is STW.

· Remark (STW remark)

This phase pauses the running application threads again. It rescans from the roots to find and mark the objects that were missed during the concurrent phases (objects whose state changed after concurrent marking ended) and processes object associations. It takes longer than the initial mark, and the marking in this phase can be done in parallel.

· Concurrent sweep (Concurrent sweeping)

This phase is concurrent: the application threads and the GC sweeping threads can run at the same time.

· Concurrent reset (Concurrent reset)

This phase is still concurrent. It resets the data structures of the CMS collector and waits for the next garbage collection.

Disadvantages of CMS:

1. Memory fragmentation. Because CMS uses the mark-sweep algorithm, it fragments the memory space. CMS applies a small optimization: it gathers the unallocated space into a free list, and when the JVM needs to allocate memory it searches this list for a block that is large enough. The fragmentation problem remains, however. If an object needs three contiguous blocks and, because of fragmentation, no such space can be found, a full GC results.

2. It needs more CPU resources. Because the GC threads and the application threads often run concurrently, more CPU is required, and some throughput is sacrificed.

3. It needs a larger heap. Because application threads keep running during the CMS marking phase, they keep allocating from the heap, so some space must be reserved to guarantee that new objects can be allocated before CMS finishes reclaiming. By default CMS starts collecting when the old generation is 68% full. The threshold can be set with -XX:CMSInitiatingOccupancyFraction=n.

7. Garbage First (G1)

This is a new garbage collector that can collect both the young generation and the old generation. Sun HotSpot 1.6u14 and later ship an early-access version of it, and Sun planned to release the official version in HotSpot 1.7. It integrates and improves on CMS by re-partitioning the memory regions, and it pays attention to both throughput and response time. According to the original author, the collector became a paid commercial feature after Oracle acquired Sun, so at the time of writing few companies were using it and a detailed discussion was left for later. (G1 has since become freely available and is the default collector from JDK 9 onward.)

To summarize the collectors for the young and old generations:

Young-generation collectors:

Serial (-XX:+UseSerialGC)

ParNew (-XX:+UseParNewGC)

Parallel Scavenge (-XX:+UseParallelGC)

G1 collector

Old-generation collectors:

Serial Old (-XX:+UseSerialGC)

Parallel Old (-XX:+UseParallelOldGC)

CMS (-XX:+UseConcMarkSweepGC)

G1 collector
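To check which collectors a running JVM has actually selected, the standard java.lang.management API can be queried. The class name CollectorNames is made up for this sketch, and the names printed depend on the JVM version and flags, so no particular output is assumed.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

class CollectorNames {
    /** Returns the names of the collectors active in this JVM, as reported by the JVM itself. */
    static List<String> names() {
        List<String> out = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            out.add(gc.getName());
        }
        return out;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // getCollectionCount() and getCollectionTime() return -1 if undefined for a collector
            System.out.println(gc.getName() + ": count=" + gc.getCollectionCount()
                    + ", time(ms)=" + gc.getCollectionTime());
        }
    }
}
```

Running the same program with different -XX collector flags is a simple way to confirm that a flag took effect.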

The difference between memory overflow and memory leak

1. Memory overflow

A memory overflow occurs when a program requests memory and there is not enough space left to allocate.

2. Memory leak

A memory leak occurs when a program, after allocating memory, has no way to release it, so the memory stays occupied; that is, the allocated objects are reachable but useless. Memory leaks are typically caused by large objects that remain in memory and cannot be freed.

As the definitions show, a memory leak will eventually cause a memory overflow.

Note that the first step in diagnosing a JVM memory problem is to determine whether it is a memory overflow or a memory leak. The former is easy to identify by following the stack trace. The latter is more complex and is usually caused by large objects in the old generation that are never released, so you need to find which large objects in the old generation are not being freed.
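A typical "reachable but useless" leak can be sketched as a static collection that only ever grows. The class LeakSketch, the CACHE field, and the handleRequest method are hypothetical names for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class LeakSketch {
    // Reachable from a static field, so the GC can never reclaim its contents.
    private static final List<byte[]> CACHE = new ArrayList<>();

    static void handleRequest() {
        byte[] buffer = new byte[1024];
        CACHE.add(buffer);       // added on every call, never removed, never read again
    }

    static int retained() {
        return CACHE.size();
    }

    public static void main(String[] args) {
        for (int i = 0; i < 3; i++) {
            handleRequest();
        }
        System.out.println("buffers retained: " + retained());
    }
}
```

If handleRequest were called indefinitely, the retained buffers would eventually fill the old generation and end in a memory overflow, which is the progression described above.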

The difference between parallel and concurrent

Both terms come from concurrent programming. In the context of garbage collectors, they can be understood as follows:

1. Parallel

Multiple garbage collection threads work in parallel, while the user threads remain in a waiting state.

2. Concurrent

The user threads and the garbage collection threads execute at the same time (not necessarily in parallel; they may alternate): the user program keeps running while the garbage collector runs on another CPU.

The difference between minor GC and full GC

1. Young-generation GC (minor GC)

Garbage collection that takes place in the young generation. Because most Java objects are short-lived, minor GC is very frequent and is generally fast.

2. Old-generation GC (major GC / full GC)

Garbage collection that takes place in the old generation. A major GC is often accompanied by at least one minor GC (though not always). A major GC is generally more than 10 times slower than a minor GC.

The difference between client mode and server mode

In some commercial virtual machines, a Java program is initially executed by an interpreter that interprets the .class files. When the virtual machine finds that a method or block of code runs particularly often, it identifies that code as "hot spot code" (which is also where the name of the HotSpot virtual machine comes from). To execute hot code more efficiently, the virtual machine compiles it at run time into machine code for the local platform and applies various levels of optimization. The compiler that does this is called the just-in-time compiler (JIT compiler). The JIT compiler is not a required part of a virtual machine: the Java Virtual Machine Specification neither requires one nor prescribes how it should be implemented. Nevertheless, the performance of the JIT compiler and the quality of its optimizations are among the most important measures of a commercial virtual machine.

The interpreter and the compiler each have their own advantages:

1. When the program needs to start and run quickly, the interpreter comes first: it skips compilation and executes immediately.

2. As the program keeps running, the compiler gradually takes over: more and more code is compiled into native code, which gives higher execution efficiency.

The HotSpot VM we use has two built-in JIT compilers, the C1 compiler and the C2 compiler, and by default the interpreter works together with one of the compilers. At startup, HotSpot automatically chooses its operating mode based on its version and the hardware of the host, for example by detecting whether the host is a server-class machine; J2SE, for instance, checks whether the host has at least 2 CPUs and at least 2 GB of memory.

1. If so, the virtual machine runs in server mode, which uses the C2 compiler. This mode puts more emphasis on the quality of the compiled code: startup is slower but execution is more efficient, so it suits server and production environments.

2. If not, the virtual machine runs in client mode, which uses the C1 compiler. This mode puts more emphasis on compilation speed: startup is faster, so it suits client applications and is optimized for GUIs.

There are two ways to see whether a virtual machine is running in client mode or in server mode:

1. Run "java -version" on the command line to see information about your locally installed virtual machine.

2. If, for example, we run the program with Eclipse or MyEclipse, the tool generally uses its own bundled JRE, so the virtual machine is not the locally installed one. In that case, you can view the virtual machine information by running the following statement in the program:

System.out.println(System.getProperty("java.vm.name"));

The result on my machine is:

Java HotSpot(TM) 64-Bit Server VM

It is also possible to change the mode in which the virtual machine runs; you only need to edit jvm.cfg. The file can be found in the following places:

1. For a 32-bit JDK, the path is JAVA_HOME/jre/lib/i386/jvm.cfg

2. For a 64-bit JDK, the path is JAVA_HOME/jre/lib/amd64/jvm.cfg

3. For MyEclipse, it is in .../common/binary/com.sun.java.jdk.win32.x86_64_1.6.0.013/jre/lib/amd64/jvm.cfg

Currently the 64-bit JDK supports only server mode. The file contents are the same in each case; ignoring the comments at the top, what remains is:

-server KNOWN
-client IGNORE
-hotspot ALIASED_TO -server
-classic WARN
-native ERROR
-green ERROR

Since my computer has a 64-bit JDK, the entry is "-client IGNORE". A JDK that supports both server mode and client mode should have "-server KNOWN" and "-client KNOWN", and generally you only need to swap the order of these two entries. This works only if the JAVA_HOME/jre/bin directory contains both a server and a client folder, each corresponding to its virtual machine; if one is missing, switching will cause an error.
