Translation GC Expert Series 1: Understanding Java Garbage Collection


Original link: http://www.cubrid.org/blog/dev-platform/understanding-java-garbage-collection/

What do we gain from understanding how Java garbage collection (GC) works? For software engineers, satisfying technical curiosity is one benefit, but more importantly, understanding GC helps us write better Java applications.

This is a personal and subjective view, but I believe that mastering GC is an essential skill for a good Java programmer. If you are interested in how GC is executed, you probably have experience developing applications of a certain size. If you have thought carefully about choosing the right GC algorithm, you thoroughly understand the application you are developing. Of course, this may not be a common standard for judging a good programmer, but few will object when I say that understanding GC is an essential skill for a good Java programmer.

This article is the first in a series by a Java GC expert; later articles in the series cover GC tuning.

The purpose of this article is to introduce GC concepts in an accessible way. I hope it will be of some help to you. My colleagues have also published some excellent articles that were shared on Twitter, and you can refer to them as well.

Back to garbage collection. There is one term you should know before you start learning about GC: stop-the-world. Whichever GC algorithm you choose, stop-the-world pauses will occur. Stop-the-world means that the JVM stops the application from running in order to execute a GC. When a stop-the-world pause occurs, every thread except the threads needed for the GC stops working, and the interrupted threads resume only after the GC task has finished. GC tuning usually means reducing this stop-the-world time.

Generational garbage collection

There is no need to explicitly allocate and free memory in a Java program. Some people try to free memory explicitly by setting an object reference to null or by calling System.gc(). Setting a reference to null is not a big problem, but calling System.gc() can severely affect system performance, even though the call does not necessarily trigger a garbage collection immediately. (Fortunately, I have never seen an engineer at NHN do this.)

Because Java programmers do not free memory explicitly in program code, the garbage collector finds objects that are no longer needed (garbage) and removes them. The garbage collector was designed on the basis of the following two hypotheses (it may be more accurate to call them suppositions or preconditions):

    • Most objects will soon become unreachable.

    • Only a few references from old objects to new objects exist.

These two hypotheses are called the "weak generational hypothesis". To take advantage of this hypothesis, the heap in the HotSpot VM is physically divided into two parts: the young generation and the old generation.

Young generation: Most newly created objects are allocated here. Because most objects soon become unreachable, many objects are created in the young generation and then disappear. When objects are removed from this area, we say that a "minor GC" has occurred.

Old generation: Objects that survived in the young generation without becoming unreachable are copied here. The old generation is generally larger than the young generation, so GC occurs less frequently here than in the young generation. When objects are removed from this area, we say that a "major GC" (or "full GC") has occurred.

The following figure illustrates this:


Figure 1: GC regions and data flow

The permanent generation in the figure is also called the method area. It stores metadata for classes and interfaces as well as interned strings. So this area is not a place where objects that survived the old generation stay permanently. GC can also occur in this area, and a GC here is counted as a major GC.
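To see how a running JVM actually divides its memory, the standard java.lang.management API can list the memory pools. The following is a minimal sketch; the class name ListMemoryPools and the helper heapPoolCount are made up for illustration. Pool names depend on the JVM version and the selected collector, so they will not necessarily match the names in the figure (for example, since JDK 8 HotSpot reports a Metaspace pool instead of a permanent generation).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;

// Illustrative helper class (hypothetical name): lists the JVM's memory pools.
class ListMemoryPools {

    // Counts the pools that belong to the Java heap (young and old generation spaces).
    static long heapPoolCount() {
        return ManagementFactory.getMemoryPoolMXBeans().stream()
                .filter(pool -> pool.getType() == MemoryType.HEAP)
                .count();
    }

    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();          // may be null if the pool is no longer valid
            long used = (usage == null) ? -1 : usage.getUsed();
            System.out.println(pool.getType() + "  " + pool.getName() + "  used=" + used + " bytes");
        }
    }
}
```

Pools of type HEAP correspond to the young and old generation spaces; pools of type NON_HEAP include the area that holds class metadata.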

Some people may wonder:

What if an object in the old generation needs to hold a reference to an object in the young generation?

To handle this case, the old generation has a "card table", in which each entry covers a 512-byte chunk of memory. Whenever an object in the old generation comes to hold a reference to an object in the young generation, this is recorded in the table. When a GC is executed in the young generation, only this table is searched to decide whether a young object is a GC target, which avoids the cost of checking the references of every object in the old generation. The card table is maintained by a write barrier. The write barrier is a mechanism that makes minor GC faster: it adds a small overhead to each reference write, but it significantly reduces the overall GC time.
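The idea can be sketched as a toy model. This is not HotSpot's actual implementation: the class CardTableSketch, its methods, and the use of plain integer offsets for addresses are assumptions made for illustration. One byte stands for each 512-byte chunk of the old generation, and the write barrier marks the chunk dirty when a reference stored in it is updated.

```java
// Toy model of a card table (hypothetical names, not HotSpot code).
class CardTableSketch {
    static final int CARD_SHIFT = 9;          // 2^9 = 512 bytes covered by each card

    private final byte[] cards;               // one entry per 512-byte chunk of the old generation

    CardTableSketch(int oldGenBytes) {
        int cardSize = 1 << CARD_SHIFT;
        cards = new byte[(oldGenBytes + cardSize - 1) >>> CARD_SHIFT];
    }

    // Write barrier: runs when a reference field located at oldGenOffset
    // is updated to point to a young-generation object.
    void writeBarrier(int oldGenOffset) {
        cards[oldGenOffset >>> CARD_SHIFT] = 1;   // mark the covering card as dirty
    }

    boolean isDirty(int cardIndex) {
        return cards[cardIndex] != 0;
    }

    int cardCount() {
        return cards.length;
    }

    // A minor GC would scan only the dirty cards instead of the whole old generation.
    int dirtyCardCount() {
        int dirty = 0;
        for (byte card : cards) {
            if (card != 0) dirty++;
        }
        return dirty;
    }

    public static void main(String[] args) {
        CardTableSketch table = new CardTableSketch(4096);   // 4096 / 512 = 8 cards
        table.writeBarrier(100);     // falls in card 0
        table.writeBarrier(300);     // card 0 again
        table.writeBarrier(1030);    // falls in card 2 (offsets 1024..1535)
        System.out.println(table.cardCount() + " cards, " + table.dirtyCardCount() + " dirty");
        // prints "8 cards, 2 dirty"
    }
}
```

In this model a minor GC has to inspect 2 of 8 chunks rather than every old object, which is the saving the paragraph above describes.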


Figure 2: Card table structure

The structure of the young generation

To understand GC in depth, we start with the young generation. All objects are allocated in the young generation when they are first created. The young generation is divided into three spaces:

    • One Eden space

    • Two Survivor spaces

There are three spaces in total, two of which are Survivor spaces. An object passes through them as follows:

    1. Most newly created objects are allocated in the Eden space.

    2. After the first GC in the Eden space, the surviving objects are moved to one of the Survivor spaces.

    3. After later GCs in the Eden space, the surviving objects keep piling up in the Survivor space that already holds surviving objects.

    4. Once that Survivor space is full, the surviving objects are moved to the other, empty Survivor space. The full Survivor space is then reset to an empty state.

    5. Objects that survive after these moves have been repeated a number of times are finally moved to the old generation.

As these steps show, one of the two Survivor spaces must be empty at any given time. If data exists in both Survivor spaces, or if the usage of both is 0, take it as a sign that something is wrong with your system.
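The flow through Eden, the two Survivor spaces, and the old generation can be sketched as a small simulation. This is a simplified model under stated assumptions, not the HotSpot algorithm: the class YoungGenSketch, the reachable flag, and the promotion threshold MAX_AGE are invented for illustration, and the sketch copies survivors on every minor GC rather than only when a Survivor space fills up.

```java
import java.util.ArrayList;
import java.util.List;

// Toy simulation of object aging in the young generation (hypothetical names).
class YoungGenSketch {
    static final int MAX_AGE = 3;     // assumed number of survived GCs before promotion

    static final class Obj {
        final String name;
        boolean reachable = true;     // stands in for "still referenced by the application"
        int age = 0;                  // number of minor GCs survived
        Obj(String name) { this.name = name; }
    }

    final List<Obj> eden = new ArrayList<>();
    List<Obj> from = new ArrayList<>();   // Survivor space currently holding objects
    List<Obj> to = new ArrayList<>();     // Survivor space that is kept empty
    final List<Obj> old = new ArrayList<>();

    Obj allocate(String name) {
        Obj obj = new Obj(name);
        eden.add(obj);                // new objects start in Eden
        return obj;
    }

    void minorGc() {
        copySurvivors(eden);
        copySurvivors(from);
        eden.clear();                 // everything left behind is garbage
        from.clear();
        List<Obj> tmp = from;         // swap roles so that "to" is empty again
        from = to;
        to = tmp;
    }

    private void copySurvivors(List<Obj> space) {
        for (Obj obj : space) {
            if (!obj.reachable) continue;          // unreachable objects are simply not copied
            obj.age++;
            if (obj.age >= MAX_AGE) old.add(obj);  // promoted to the old generation
            else to.add(obj);                      // copied to the empty Survivor space
        }
    }

    public static void main(String[] args) {
        YoungGenSketch heap = new YoungGenSketch();
        heap.allocate("cache");                    // stays reachable
        heap.allocate("temp").reachable = false;   // becomes garbage immediately
        for (int i = 1; i <= MAX_AGE; i++) {
            heap.minorGc();
            System.out.println("after GC " + i + ": eden=" + heap.eden.size()
                    + " from=" + heap.from.size() + " to=" + heap.to.size()
                    + " old=" + heap.old.size());
        }
    }
}
```

After every call to minorGc the "to" space is empty, which mirrors the rule that one Survivor space must always be empty.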

The following figure shows how data is moved to the old generation through minor GCs:


Figure 3: Before and after a GC

In the HotSpot VM, two techniques are used to achieve faster memory allocation: "bump-the-pointer" and "TLABs (Thread-Local Allocation Buffers)".

The bump-the-pointer technique tracks the last object allocated in the Eden space. That object sits at the top of the used part of Eden, so when a new object is created, the VM only has to check whether the new object fits into the remaining Eden space. If it fits, it is placed in Eden and becomes the new top. Because only the most recently allocated position has to be checked, memory allocation is very fast.

In a multithreaded environment, however, the situation is different. To keep allocation thread-safe when multiple threads create objects in Eden, locks would be unavoidable, and allocation performance would suffer because lock contention is expensive. HotSpot solves this problem with TLABs. Each thread gets its own small portion of the Eden space, and a thread can only access its own TLAB, so bump-the-pointer allocation inside a TLAB works without any lock.
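The two techniques can be illustrated with a small sketch. It models addresses as integer offsets and is only an illustration under that assumption; the names BumpPointerSketch, Region, Eden, and newTlab are hypothetical and do not correspond to HotSpot internals.

```java
// Toy model of bump-the-pointer allocation with per-thread buffers (hypothetical names).
class BumpPointerSketch {

    // A contiguous memory range in which allocation only moves a "top" pointer forward.
    static final class Region {
        private final int end;
        private int top;

        Region(int start, int end) {
            this.top = start;
            this.end = end;
        }

        // Returns the start offset of the new object, or -1 if it does not fit.
        int allocate(int size) {
            if (size > end - top) return -1;   // the only check: does it fit in the remaining space?
            int address = top;
            top += size;                       // bump the pointer
            return address;
        }

        int remaining() {
            return end - top;
        }
    }

    // Eden is shared by all threads, so handing out a TLAB is synchronized.
    // This happens rarely compared with individual object allocations.
    static final class Eden {
        private final Region space;

        Eden(int size) {
            space = new Region(0, size);
        }

        synchronized Region newTlab(int tlabSize) {
            int start = space.allocate(tlabSize);
            return (start < 0) ? null : new Region(start, start + tlabSize);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Eden eden = new Eden(1024);
        Runnable worker = () -> {
            Region tlab = eden.newTlab(256);   // one synchronized call per thread
            int first = tlab.allocate(64);     // no lock: only this thread uses the TLAB
            int second = tlab.allocate(64);
            System.out.println(Thread.currentThread().getName()
                    + " allocated at offsets " + first + " and " + second);
        };
        Thread t1 = new Thread(worker, "t1");
        Thread t2 = new Thread(worker, "t2");
        t1.start();
        t2.start();
        t1.join();
        t2.join();
    }
}
```

The design point is that the lock is paid once per buffer rather than once per object; inside a TLAB, allocation is a size check and an addition.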

This section gave a quick tour of GC in the young generation. You do not need to memorize the two techniques above. Just remember that objects are created in the Eden space and that long-lived objects, after being moved between the Survivor spaces a number of times, eventually end up in the old generation.

Garbage collection in the old generation

A garbage collection in the old generation is basically performed when the old generation is full. The execution procedure differs depending on the GC type, so garbage collection in the old generation is much easier to understand once you know the characteristics of each GC type.

JDK 7 has five built-in GC types:

    1. Serial GC

    2. Parallel GC

    3. Parallel Old GC (Parallel compacting GC)

    4. Concurrent Mark & Sweep GC (or "CMS")

    5. Garbage First (G1) GC

Of these, the serial GC must not be used on a production server. It was designed for desktop applications running on single-core CPUs, and using it on a server can significantly degrade application performance.
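To check which collectors a particular JVM process is actually using, the standard java.lang.management API exposes one GarbageCollectorMXBean per collector. The sketch below uses only that API; the class name ListCollectors and the helper collectorNames are illustrative. The reported names depend on the JVM version and the GC option in effect, so no specific output is assumed here.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Illustrative helper class (hypothetical name): reports the collectors of the running JVM.
class ListCollectors {

    // Returns the names of the collectors the current JVM reports.
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Collection count and accumulated time since JVM start; -1 means "undefined".
            System.out.println(gc.getName()
                    + ": collections=" + gc.getCollectionCount()
                    + ", total time=" + gc.getCollectionTime() + " ms");
        }
    }
}
```

Running the same program with different GC options and comparing the reported names is a quick way to confirm that an option took effect.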

The characteristics of each GC are described below.

Serial GC (-XX:+UseSerialGC)

With this GC type, the young generation is collected in the way described in the previous section. The old generation is collected with an algorithm called "mark-sweep-compact".

    1. First, the algorithm marks the surviving objects in the old generation.

    2. Then it checks the heap from the front and keeps only the surviving objects, clearing the space of objects that are no longer alive (sweep).

    3. Finally, it moves the surviving objects to the front of the heap so that the used space is contiguous, which divides the heap into two parts: one with objects and one without (compact).

The serial GC is suitable for systems with few CPU cores and a small amount of memory.
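The three steps of mark-sweep-compact can be sketched on a toy heap. This is a simplified illustration, not the serial collector's implementation: the heap is modeled as an array of slots, the class MarkSweepCompactSketch and its Obj type are hypothetical, and because objects refer to each other through ordinary Java references, the sketch does not need the pointer-fixing work a real compactor performs when it moves objects.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

// Toy mark-sweep-compact over an array of slots (hypothetical names).
class MarkSweepCompactSketch {

    static final class Obj {
        final String name;
        Obj[] refs = new Obj[0];      // outgoing references to other objects
        boolean marked;
        Obj(String name) { this.name = name; }
    }

    // Runs the three phases on the heap in place and returns the number of live objects.
    static int collect(Obj[] heap, Obj... roots) {
        // 1. Mark: everything reachable from the roots is alive.
        Deque<Obj> pending = new ArrayDeque<>(Arrays.asList(roots));
        while (!pending.isEmpty()) {
            Obj obj = pending.pop();
            if (obj.marked) continue;
            obj.marked = true;
            for (Obj ref : obj.refs) pending.push(ref);
        }

        // 2. Sweep: walk the heap from the front and clear unmarked objects.
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] != null && !heap[i].marked) heap[i] = null;
        }

        // 3. Compact: slide survivors to the front so that used space is contiguous.
        int free = 0;
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] == null) continue;
            Obj live = heap[i];
            live.marked = false;      // reset the mark for the next collection
            heap[i] = null;
            heap[free++] = live;
        }
        return free;
    }

    public static void main(String[] args) {
        Obj a = new Obj("a"), b = new Obj("b"), c = new Obj("c"), d = new Obj("d");
        a.refs = new Obj[] { c };     // a -> c; b and d are unreachable
        Obj[] heap = { a, b, c, d };

        int live = collect(heap, a);

        StringBuilder layout = new StringBuilder();
        for (Obj slot : heap) layout.append(slot == null ? "-" : slot.name);
        System.out.println(live + " live, layout: " + layout);   // prints "2 live, layout: ac--"
    }
}
```

After the run, the live objects occupy the front of the array and the free slots form one contiguous block at the end, which is the two-part layout described in step 3.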

Parallel GC (-XX:+UseParallelGC)


Figure 4: Difference between the serial GC and the parallel GC

The figure makes the difference between the serial GC and the parallel GC easy to see. The serial GC uses a single thread to perform a GC, while the parallel GC uses several threads working in parallel, so the parallel GC is faster than the serial GC. The parallel GC is suitable for systems with multiple CPU cores and a large amount of memory. It is also known as the "throughput GC".

Parallel Old GC (-XX:+UseParallelOldGC)

The Parallel Old GC was introduced in JDK 5. Compared with the parallel GC, the only difference is the GC algorithm used for the old generation. It runs in three steps: mark, summary, and compaction. The summary step identifies the surviving objects separately for the areas in which a GC has already been performed. It therefore differs from the sweep step of the mark-sweep-compact algorithm and involves somewhat more complicated processing.

CMS GC (-XX:+UseConcMarkSweepGC)


Figure 5: Serial GC vs. CMS GC

As the figure shows, the Concurrent Mark-Sweep (CMS) GC is more complex than any of the GC types described so far. The initial mark step at the beginning is simple: only the surviving objects closest to the class loader are marked, so the pause (stop-the-world) time is very short. In the concurrent mark step, the objects referenced by the surviving objects that were just marked are traced and checked. What distinguishes this step is that it runs at the same time as the application threads. In the remark step, objects that were newly added or stopped being referenced during the concurrent mark step are checked. Finally, in the concurrent sweep step, the garbage is actually collected, again while other threads keep running. Because the CMS GC works this way, the pause time during GC is very short. The CMS GC is also known as the low-latency GC and is suitable for applications in which response time is critical.

Although the CMS GC has the advantage of short pauses, its disadvantages are also clear:

    • It requires more memory and CPU resources than other GC types.

    • It does not perform compaction by default.

You need to analyze your system thoroughly before using the CMS GC. In addition, when compaction has to be carried out because memory has become heavily fragmented, the stop-the-world time can be longer than with any other GC type, so you need to determine how often compaction runs and how long it takes.

G1 GC

Finally, let's look at the Garbage First (G1) GC.


Figure 6: Layout of the G1 GC

If you want to understand the G1 GC clearly, forget what you have read above about the young generation and the old generation. As the figure shows, objects are allocated into the cells of a grid, and GC is performed on those cells. Once one region is full, newly created objects are allocated in another region and a GC is executed. The movement of objects through the three young-generation spaces and into the old generation, which other GC types rely on, does not appear in this GC type. G1 was created to replace the CMS GC, which revealed many problems and drew many complaints over long-term use.

The biggest advantage of G1 is its performance: it is faster than any of the GC types above. In JDK 6 it was available only as an early-access version for testing, and it was officially released in JDK 7. In my personal opinion, NHN needs at least one year of testing before it can use JDK 7 in production, so there is still some time to wait. I have also heard that using G1 on JDK 6 occasionally caused JVM crashes, so a stable version will still take time.

GC tuning is covered in the next article, but I would like to ask one thing of you first. If all objects created by the application were of the same type and size, every web application server (WAS) could use the same GC options. However, the sizes and lifespans of the objects created on a WAS differ from service to service, so the appropriate GC options differ as well. In other words, just because one service gets good results with GC option "A" does not mean that another service will get the best performance with the same option. To find the best values for the WAS threads, each WAS instance needs continuous tuning and monitoring until the optimal configuration and GC options are found. This is not just my personal experience; it is also the consensus of the Oracle JVM engineers in the discussions at JavaOne 2010.

In this article I have only briefly described the basics of GC in Java. In the next article, I will discuss how to monitor GC status and how to tune GC performance.

This article draws on the book Java Performance, published in December 2011, and on the white paper "Memory Management in the Java HotSpot™ Virtual Machine", available on the Oracle website.

Sangmin Lee, senior engineer of Performance Lab, NHN Company
