Become a Java GC Expert (1): Java Garbage Collection in Plain Terms


This article was translated by ImportNew (Wang Xiaojie). Reprinting without permission is prohibited!

What are the benefits of understanding how garbage collection (GC) works in Java? The first is satisfying your curiosity as a software engineer. The second is that knowing how GC works helps you write better Java applications.

This is only my personal opinion, but I believe that someone who understands GC well is usually a good Java developer. If you are interested in the GC process, you probably have experience developing applications of a certain scale. If you have thought carefully about choosing the right GC algorithm, you fully understand the characteristics of the application you are developing. Of course, this cannot be generalized, and it is not a common standard for judging a good developer. My point is that an in-depth understanding of GC is a requirement for becoming a great Java programmer.

This is the first article in the "Become a Java GC Expert" series. It introduces the GC mechanism; the following articles will focus on analyzing GC status and on GC tuning, with examples from NHN.

The purpose of this article is to introduce GC in a simple way, and I hope it helps you. My students have also published some very good articles on Java internals, which became popular on Twitter. If you are interested, you can follow them as well.

Back to the point: garbage collection. Before you learn about GC, there is one term you should remember: "stop-the-world". Stop-the-world occurs no matter which GC algorithm you choose. It means that the JVM stops running the application in order to perform a GC. When stop-the-world occurs, every thread except the threads needed for the GC stops its work and waits until the GC task is complete. GC tuning usually means reducing this stop-the-world time.
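To get a rough feel for how much time a running JVM spends on GC, the standard java.lang.management API can be queried. The sketch below is only an illustration: the class name GcStats and the helper totalGcMillis are made up for this example, and the reported figure is each collector's approximate accumulated elapsed time, which for concurrent collectors is not the same thing as stop-the-world pause time.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcStats {
    /** Sums the approximate accumulated collection time (ms) over all collectors. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 if the collector does not report it
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // Collector names depend on the JVM and the GC type selected at startup.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName()
                    + ": collections=" + gc.getCollectionCount()
                    + ", time(ms)=" + gc.getCollectionTime());
        }
        System.out.println("total GC time (ms): " + totalGcMillis());
    }
}
```

Sampling these counters before and after a workload gives a coarse picture of how much GC activity the workload triggered.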

Generational Garbage Collection

In Java, program code does not explicitly designate memory and release it. Some people set the relevant object to null or call System.gc() to try to release memory explicitly. Setting a reference to null is at least harmless, but calling System.gc() can significantly degrade system performance and must be avoided entirely (fortunately, I have not yet seen any developer at NHN call this method).

Because developers do not release memory in program code, the garbage collector finds objects that are no longer needed (garbage) and removes them. The garbage collector was designed around the following two hypotheses (it is more accurate to call them hypotheses than suppositions or preconditions).

    • Most objects soon become unreachable.
    • References from old objects (objects created long ago) to young objects exist only in small numbers.

These hypotheses are called the weak generational hypothesis. To take advantage of it, the HotSpot virtual machine physically divides the heap into two parts: the young generation and the old generation.

Young generation: The vast majority of newly created objects are allocated here. Because most objects become unreachable soon after they are created, many objects are created in the young generation and then disappear. When objects disappear from this area, we say that a "minor GC" has occurred.

Old generation: Objects that did not become unreachable and survived the young generation are copied here. It is generally larger than the young generation, and because of its larger size, GC occurs here less often than in the young generation. When objects disappear from the old generation, we say that a "major GC" (or "full GC") has occurred.
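The generations can also be observed from inside a running JVM. The following is a minimal sketch using the standard MemoryPoolMXBean API; the class name HeapPools and the helper heapPoolNames are made up for this example. The pool names it prints are not fixed: they depend on the JVM version and on the GC type in use, so no particular output should be expected.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.util.ArrayList;
import java.util.List;

public class HeapPools {
    /** Returns the names of all memory pools that belong to the Java heap. */
    static List<String> heapPoolNames() {
        List<String> names = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == MemoryType.HEAP) {
                names.add(pool.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // Typically lists the young-generation spaces and the old generation,
        // under collector-specific names.
        for (String name : heapPoolNames()) {
            System.out.println(name);
        }
    }
}
```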

Take a look at the chart below.

Figure 1: GC Space & Data Flow

The permanent generation in the figure above is also known as the "method area". It stores class data and string constants, so it is not a place where objects that survived the old generation stay permanently. GC may also occur in this area, and a GC that occurs here is counted as a major GC.

Some people may ask:
What happens if an object in the old generation needs to reference an object in the young generation?
To handle this case, the old generation has a "card table", in which each entry covers a 512-byte chunk of the old generation. Whenever an object in the old generation references an object in the young generation, this is recorded in the table. When a GC runs for the young generation, only the card table is searched to determine which objects can be collected, instead of checking the references of every object in the old generation. The card table is maintained by a write barrier. The write barrier adds a little overhead to each reference store, but it significantly reduces the overall time of a minor GC.
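The card table idea can be illustrated with a toy model. This is a sketch under stated assumptions, not HotSpot's implementation: the class CardTableSketch and its methods are hypothetical, the old generation is modeled as a plain range of byte offsets, each card is assumed to cover 512 bytes, and the write barrier simply dirties the card on every reference store, leaving it to the collector to inspect dirty cards for references into the young generation.

```java
import java.util.ArrayList;
import java.util.List;

public class CardTableSketch {
    static final int CARD_SHIFT = 9; // 2^9 = 512 bytes per card (assumed card size)
    static final byte CLEAN = 0;
    static final byte DIRTY = 1;

    private final byte[] cards; // one byte per card

    CardTableSketch(int oldGenBytes) {
        int cardSize = 1 << CARD_SHIFT;
        cards = new byte[(oldGenBytes + cardSize - 1) >>> CARD_SHIFT];
    }

    /** Write barrier: runs on every reference store into the old generation. */
    void onReferenceStore(int oldGenOffset) {
        cards[oldGenOffset >>> CARD_SHIFT] = DIRTY;
    }

    /** A minor GC scans only these cards instead of the whole old generation. */
    List<Integer> dirtyCards() {
        List<Integer> result = new ArrayList<>();
        for (int i = 0; i < cards.length; i++) {
            if (cards[i] == DIRTY) {
                result.add(i);
            }
        }
        return result;
    }

    /** After the minor GC has processed the dirty cards, they are reset. */
    void clear() {
        java.util.Arrays.fill(cards, CLEAN);
    }

    public static void main(String[] args) {
        CardTableSketch table = new CardTableSketch(4096); // 8 cards
        table.onReferenceStore(700);  // falls into card 1
        table.onReferenceStore(4000); // falls into card 7
        System.out.println(table.dirtyCards()); // prints [1, 7]
    }
}
```

The trade-off is visible even in this toy: every store pays for one extra byte write, while the minor GC looks at 2 of 8 cards rather than all 4096 bytes.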

Figure 2: Card Table Structure

Composition of the Young Generation

To understand GC better, let's look at the young generation, which holds objects when they are first created. It is divided into three spaces.

    • One Eden space
    • Two survivor spaces

There are three spaces in total, two of which are survivor spaces. Objects pass through them in the following order:

    1. Most newly created objects are placed in the Eden space.
    2. After the first GC runs on the Eden space, the surviving objects are moved to one of the survivor spaces.
    3. After each subsequent GC on the Eden space, the surviving objects keep piling up in that same survivor space.
    4. When that survivor space becomes full, the surviving objects are moved to the other survivor space, and the full survivor space is then emptied.
    5. Objects that are still alive after these steps have been repeated a number of times are moved to the old generation.
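The flow through Eden, the survivor spaces, and the old generation can be modeled with a small simulation. This is a simplified sketch, not the JVM's algorithm: the class YoungGenSketch, the TENURE_AGE threshold of 3, and the reachability predicate are all assumptions made for illustration. It also simplifies steps 3 and 4 by switching survivor spaces on every minor GC rather than waiting for one to fill up.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

public class YoungGenSketch {
    static final int TENURE_AGE = 3; // assumed number of survived GCs before promotion

    static final class Obj {
        final String name;
        int age; // number of minor GCs survived
        Obj(String name) { this.name = name; }
    }

    final List<Obj> eden = new ArrayList<>();
    List<Obj> from = new ArrayList<>(); // survivor space currently holding objects
    List<Obj> to = new ArrayList<>();   // survivor space kept empty between GCs
    final List<Obj> old = new ArrayList<>();

    /** Step 1: new objects are placed in Eden. */
    Obj allocate(String name) {
        Obj o = new Obj(name);
        eden.add(o);
        return o;
    }

    /** Copies reachable objects out of Eden and the occupied survivor space. */
    void minorGc(Predicate<Obj> isReachable) {
        List<Obj> candidates = new ArrayList<>(eden);
        candidates.addAll(from);
        for (Obj o : candidates) {
            if (!isReachable.test(o)) {
                continue; // unreachable: simply not copied, so it disappears
            }
            o.age++;
            if (o.age >= TENURE_AGE) {
                old.add(o); // step 5: promoted to the old generation
            } else {
                to.add(o);  // steps 2-4: kept in a survivor space
            }
        }
        eden.clear();
        from.clear();
        List<Obj> tmp = from; // swap roles so that one survivor space is always empty
        from = to;
        to = tmp;
    }

    public static void main(String[] args) {
        YoungGenSketch heap = new YoungGenSketch();
        Obj longLived = heap.allocate("cache");
        heap.allocate("temp");
        for (int i = 0; i < TENURE_AGE; i++) {
            heap.minorGc(o -> o == longLived);
        }
        System.out.println("old generation size: " + heap.old.size()); // prints 1
    }
}
```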

If you look closely at these steps, you will see that one of the survivor spaces must always remain empty. If both survivor spaces contain data, or both are empty at the same time, that indicates something is wrong with your system.
The following figure shows how data accumulates in the old generation through repeated minor GCs:

Figure 3: Before and After a GC

Note that the HotSpot virtual machine uses two techniques to speed up memory allocation: "bump-the-pointer" and "TLABs" (Thread-Local Allocation Buffers).

The bump-the-pointer technique tracks the last object allocated in the Eden space, which sits at the top of that space. When a new object is created, the JVM only needs to check whether the object's size fits in the space remaining in Eden. If it fits, the object is placed in the Eden space and becomes the new top. Each allocation therefore only has to look at the most recently allocated object, which makes memory allocation very fast. In a multithreaded environment, however, things are different. To store objects from multiple threads in the Eden space safely, the allocation would have to be locked, and the resulting lock contention would hurt performance badly. TLABs are HotSpot's solution to this problem. Each thread gets its own small portion of the Eden space, and because a thread only touches its own TLAB, it can allocate memory with bump-the-pointer without any lock.
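The combination of the two techniques can be sketched as follows. This is a toy model rather than HotSpot code: the classes Eden and Tlab, the helper newTlab, and the use of integer offsets in place of real addresses are assumptions for illustration. The shared Eden hands out whole buffers under a lock, which happens rarely, while each thread-local buffer serves individual allocations by only bumping a pointer.

```java
public class BumpPointerSketch {
    /** Shared Eden space: buffers are reserved under a lock, but only occasionally. */
    static final class Eden {
        private final int capacity;
        private int top; // offset of the first free byte

        Eden(int capacity) { this.capacity = capacity; }

        synchronized int reserve(int size) {
            if (top + size > capacity) {
                return -1; // Eden is full: a minor GC would be needed
            }
            int start = top;
            top += size; // bump the pointer
            return start;
        }
    }

    /** Thread-local allocation buffer: used by one thread only, so no lock is needed. */
    static final class Tlab {
        private final int end;
        private int top;

        Tlab(int start, int size) {
            this.top = start;
            this.end = start + size;
        }

        int allocate(int size) {
            if (top + size > end) {
                return -1; // buffer exhausted: the thread would request a new TLAB
            }
            int address = top;
            top += size; // bump the pointer
            return address;
        }
    }

    /** Carves a new TLAB out of Eden, or returns null when Eden has no room left. */
    static Tlab newTlab(Eden eden, int size) {
        int start = eden.reserve(size);
        return start < 0 ? null : new Tlab(start, size);
    }

    public static void main(String[] args) {
        Eden eden = new Eden(1024);
        Tlab first = newTlab(eden, 256);  // covers offsets 0..255
        Tlab second = newTlab(eden, 256); // covers offsets 256..511
        System.out.println(first.allocate(100));  // prints 0
        System.out.println(first.allocate(100));  // prints 100
        System.out.println(first.allocate(100));  // prints -1 (only 56 bytes left)
        System.out.println(second.allocate(16));  // prints 256
    }
}
```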
That was a brief overview of GC in the young generation. You do not need to memorize the two techniques just mentioned, and nothing bad will happen if you do not know them. But do remember that objects are first created in the Eden space, and that long-lived objects reach the old generation by way of the survivor spaces.

GC for the Old Generation

A GC in the old generation basically occurs when the space is full. The procedure differs by GC type, so knowing the different GC types will make this section easier to understand.
As of JDK 7, there are five GC types:

    1. Serial GC
    2. Parallel GC
    3. Parallel Old GC (Parallel compacting GC)
    4. Concurrent Mark & Sweep GC (or "CMS")
    5. Garbage first (G1) GC

Of these, the serial GC must not be used on a production server. It was created for desktop machines with a single CPU core, and using it will noticeably degrade application performance.
Now, let's look at each GC type.

1. Serial GC (-XX:+UseSerialGC)

The GC for the young generation works as described in the previous section. The GC for the old generation uses an algorithm called "mark-sweep-compact".

    1. The first step is to mark the surviving objects in the old generation (mark).
    2. The second step is to check the heap from the front and keep only the surviving objects (sweep).
    3. The last step is to fill the heap from the front so that the surviving objects are packed together, dividing the heap into two parts: one that holds objects and one that is empty (compact).
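The three phases (mark, sweep, compact) can be shown on a toy heap. The sketch below is an illustration of the general technique under simplifying assumptions, not the JVM's implementation: the heap is modeled as an array of slots where null means free, and the names MarkSweepCompactSketch, Obj, and collect are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

public class MarkSweepCompactSketch {
    static final class Obj {
        final String name;
        final List<Obj> refs = new ArrayList<>(); // outgoing references
        boolean marked;
        Obj(String name) { this.name = name; }
    }

    /** Runs mark, sweep, and compact over a toy heap; a null slot is free space. */
    static void collect(Obj[] heap, List<Obj> roots) {
        // 1. Mark: everything reachable from the roots is a surviving object.
        Deque<Obj> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Obj o = pending.pop();
            if (o.marked) {
                continue; // already visited (also handles reference cycles)
            }
            o.marked = true;
            for (Obj ref : o.refs) {
                pending.push(ref);
            }
        }
        // 2. Sweep: walk the heap from the front and drop unmarked objects.
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] != null && !heap[i].marked) {
                heap[i] = null;
            }
        }
        // 3. Compact: slide survivors to the front so the free space is contiguous.
        int free = 0;
        for (int i = 0; i < heap.length; i++) {
            Obj o = heap[i];
            if (o == null) {
                continue;
            }
            heap[i] = null;
            o.marked = false; // reset the mark for the next collection
            heap[free++] = o;
        }
    }

    public static void main(String[] args) {
        Obj a = new Obj("a"), b = new Obj("b"), c = new Obj("c"), d = new Obj("d");
        a.refs.add(c); // only a (the root) and c are reachable
        Obj[] heap = {a, b, c, d, null};
        collect(heap, Arrays.asList(a));
        for (Obj o : heap) {
            System.out.print(o == null ? "- " : o.name + " "); // prints: a c - - -
        }
        System.out.println();
    }
}
```

After the collection, the heap is split into an occupied front part and an empty back part, which is what makes the next allocation cheap.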

2. Parallel GC (-XX:+UseParallelGC)

Figure 4: Difference between the Serial GC and the Parallel GC

The figure makes the difference between the serial GC and the parallel GC easy to see. The serial GC uses only one thread to perform a GC, while the parallel GC uses several, so it finishes faster. This GC is useful when there is plenty of memory and a large number of cores. It is also called the "throughput GC".

3. Parallel Old GC (-XX:+UseParallelOldGC)

The parallel old GC has been supported since JDK 5. Compared to the parallel GC, the only difference is the GC algorithm for the old generation. It goes through three steps: mark, summary, and compaction. The summary step differs from the sweep step of mark-sweep-compact in that it identifies the surviving objects separately for each area on which GC has already been performed, so it is slightly more complicated.

4. CMS GC (-XX:+UseConcMarkSweepGC)

Figure 5: Serial GC & CMS GC

As you can see from the figure, the CMS GC is much more complicated than any of the GC types explained so far. The first step, initial mark, is simple: it only searches for the surviving objects closest to the class loader, so the pause is very short. In the concurrent mark step, the objects referenced by the surviving objects just confirmed are traced and checked. What sets this step apart is that it runs while the other threads keep executing. In the remark step, the objects that were newly added or stopped being referenced during the concurrent mark step are checked. Finally, in the concurrent sweep step, the garbage is collected, again while other threads are still running. Because of the way this GC type works, the pause time caused by GC is extremely short. The CMS GC is also called the low-latency GC, and it is used when response time is critical for the application.

While this GC type has the advantage of a short stop-the-world time, it also has the following drawbacks:

    • It uses more memory and CPU than other GC types.
    • The compaction step is not provided by default.

Review this GC type carefully before using it. If compaction has to run because memory has become too fragmented, the stop-the-world time can be longer than with any other GC type, so check how often compaction runs and how long it takes.

5. G1 GC

Finally, let's look at the Garbage First (G1) GC.

Figure 6: Structure of the G1 GC

To understand G1, first set aside everything you have learned about the young generation and the old generation. As the figure shows, each object is allocated to a cell of a grid, and GC is then performed on it. Once one area is full, objects are allocated to another area and a GC is executed. The steps in which data moves from the three spaces of the young generation to the old generation do not appear in this GC type. G1 was created to replace the CMS GC, which has many problems when it runs for a long time.

The biggest advantage of G1 is its performance: it is faster than any other GC type discussed above. In JDK 6, however, it was only available as an early trial version; it was officially released with JDK 7. In my personal opinion, NHN needs a long test period (at least one year) before JDK 7 can be put into production, so you may have to wait a while. I have also heard several times of the JVM crashing when G1 was used on JDK 6. Please wait until it is more stable.

Next time I will discuss GC tuning, but first I want to make one thing clear. If the size and type of all objects created by an application were uniform, every WAS (web application server) used in the company could share the same GC options. However, the size and lifespan of the objects a WAS creates vary with the service and the hardware. In other words, the fact that one service runs well with GC option "A" does not mean the same option will give the best results for another service. Tuning has to fit the situation: we need to find the parameters that suit each WAS, and continuously monitor and tune the WAS instances on each machine. This is not just my own view; engineers responsible for developing the Oracle Java virtual machine discussed it at JavaOne 2010.

This article briefly described the GC mechanism in Java. The follow-up articles will discuss how to monitor Java GC status and how to tune GC.

In addition, I recommend the book Java Performance, released in December 2011 (available on Amazon and readable online through Safari), and the white paper "Memory Management in the Java HotSpot Virtual Machine" published on Oracle's website. (The book is not the same as Java Performance Tuning.)

Author: Sangmin Lee, senior engineer at the Performance Engineering Lab, NHN.

English original: CUBRID. Translation: ImportNew (Wang Xiaojie)

Address: http://www.importnew.com/1993.html

If you reprint this article, please note it in the text and keep the original link, the translation link, the translator, and other relevant information. Thank you for your cooperation!
