JVM GC Policy

Source: Internet
Author: User

1. Preface

GC (Garbage Collection) is the core memory-management function of the JVM. It is precisely what frees Java programmers from the drudgery of releasing memory by hand, so as a programmer it is worth understanding its principles.
I was once asked about the specific implementation of GC. I knew only the basic ideas, and was looked down on for it. I still hold my own opinion on this: even after using Java for a long time, if you are not doing performance tuning of Java programs, you do not need to know GC in such depth. However, a great deal of wisdom has accumulated in the ideas behind GC, so they are worth a careful look. I was never able to find a reasonably systematic treatment of GC, so I give a more systematic introduction here, hoping it helps later readers.

2. Memory Division

Before talking about GC, we must first understand how the JVM memory space is divided. This article uses only the Sun JDK as an example, which divides memory into the Permanent Generation (also known as the method area), the Heap, the Native Method Stack, the JVM Method Stack, and the PC register.
The Permanent Generation stores information about loaded classes, including method information and static variables. If the specified size is exceeded, an OutOfMemoryError is thrown.
The Heap stores object instances and arrays. It consists of the New Generation and the Old Generation: the former holds newly created objects, and the latter holds objects that are still alive after several GCs. If the specified size is exceeded, an OutOfMemoryError is thrown.
The JVM Method Stack and PC register are created together with each thread and occupy only operating-system memory or registers. Because they are private to a thread, access is very efficient. If a stack exceeds the specified size, a StackOverflowError is thrown.
In the Sun JDK implementation, the Native Method Stack is combined with the JVM Method Stack.
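To see this division on a real JVM, the standard java.lang.management API can list the memory pools the running JVM reports. This is only a small sketch; the class name MemoryPools is made up for illustration, and the pool names you see depend on the JVM version and collector (for example, JDK 8 and later report a Metaspace pool instead of a Permanent Generation).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, named for illustration only.
final class MemoryPools {
    /** Returns one "name (type)" line per memory pool reported by the running JVM. */
    static List<String> describePools() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            // getType() is either heap or non-heap; names vary by JVM and collector
            lines.add(pool.getName() + " (" + pool.getType() + ")");
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : describePools()) {
            System.out.println(line);
        }
    }
}
```

Thread stacks and the PC register do not show up in this list, since they are per-thread areas rather than managed pools.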

3. Memory Allocation

To use memory, a Java program must first obtain it, and the JVM does the allocating. As mentioned above, new objects are created in the New Generation of the Heap, and the Heap is shared by all threads, so some locking mechanism is needed to keep allocation safe. Locking costs efficiency, so there is an optimization: each thread is given a TLAB (Thread Local Allocation Buffer). A TLAB is exclusive to its thread and is only a small space. By default, allocation is tried in the TLAB first, which is one reason it pays for Java programs to create small objects. Another optimization worth mentioning is that the JVM analyzes the program as it actually runs: if escape analysis finds that an object created in a method is never accessed outside that method, the object can be allocated directly on the stack without using the Heap.
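The TLAB idea can be sketched as a toy model: a shared heap whose reservations take a lock, and a per-thread buffer that hands out space by simply bumping a pointer. All names here (TlabSketch, SharedHeap, LocalBuffer, simulate) are hypothetical, and this is not how HotSpot is actually coded; it only shows why most allocations avoid the lock.

```java
// Toy model of TLAB allocation; every name here is made up for illustration.
final class TlabSketch {
    /** Shared heap: every reservation takes the lock. */
    static final class SharedHeap {
        private int top = 0;
        int lockedReservations = 0;

        synchronized int reserve(int size) {
            lockedReservations++;
            int start = top;
            top += size;
            return start;
        }
    }

    /** Per-thread buffer: allocation is a plain pointer bump, no lock. */
    static final class LocalBuffer {
        private final SharedHeap heap;
        private final int chunkSize;
        private int next = 0;
        private int limit = 0; // empty until the first refill

        LocalBuffer(SharedHeap heap, int chunkSize) {
            this.heap = heap;
            this.chunkSize = chunkSize;
        }

        int allocate(int size) {
            if (size > chunkSize) {
                return heap.reserve(size); // too big for the buffer: go to the shared heap
            }
            if (next + size > limit) {
                next = heap.reserve(chunkSize); // refill: the only locked step
                limit = next + chunkSize;
            }
            int address = next;
            next += size;
            return address;
        }
    }

    /** Runs the given number of allocations and returns how many times the lock was taken. */
    static int simulate(int allocations, int objectSize, int chunkSize) {
        SharedHeap heap = new SharedHeap();
        LocalBuffer tlab = new LocalBuffer(heap, chunkSize);
        for (int i = 0; i < allocations; i++) {
            tlab.allocate(objectSize);
        }
        return heap.lockedReservations;
    }

    public static void main(String[] args) {
        // 100 objects of 16 bytes with 1024-byte buffers: only 2 locked reservations
        System.out.println("locked reservations: " + simulate(100, 16, 1024));
    }
}
```

In this model an object larger than the buffer goes straight to the shared heap, which is one way to read the advice about small objects.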

4. GC Algorithm

Now for the main event. Let's look at the specific GC algorithms. In fact there is nothing profound about them: each is a solution found for a concrete problem that was encountered.

4.1 Reference counting method

The simplest idea: if an object is not referenced by anything, reclaim it. This is easy to implement but has a problem: if two objects reference each other and nothing else references them, their counts never reach zero and they leak. Reference counting is therefore suitable only for simple reference scenarios.
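The cycle problem is easy to reproduce in a toy reference counter. The sketch below is only a model with made-up names (RefCountSketch, Node, addRef, release), not anything from a real JVM: a chain of objects is freed when the outside reference goes away, but a two-object cycle is not.

```java
import java.util.ArrayList;
import java.util.List;

// Toy reference counter; all names are hypothetical.
final class RefCountSketch {
    static final class Node {
        int refCount = 0;
        boolean freed = false;
        final List<Node> fields = new ArrayList<>();
    }

    /** Adds a reference to 'to'; a null 'from' stands for a reference held from outside. */
    static void addRef(Node from, Node to) {
        if (from != null) {
            from.fields.add(to);
        }
        to.refCount++;
    }

    /** Drops one reference; at zero the object is freed and releases what it points to. */
    static void release(Node target) {
        target.refCount--;
        if (target.refCount == 0) {
            target.freed = true;
            for (Node child : target.fields) {
                release(child);
            }
        }
    }

    /** a <-> b reference each other; dropping the outside reference frees neither. */
    static boolean cycleIsFreed() {
        Node a = new Node();
        Node b = new Node();
        addRef(null, a);
        addRef(a, b);
        addRef(b, a);
        release(a); // drop the outside reference; a's count is still 1 because of b
        return a.freed || b.freed;
    }

    /** c -> d with no cycle; dropping the outside reference frees both. */
    static boolean chainIsFreed() {
        Node c = new Node();
        Node d = new Node();
        addRef(null, c);
        addRef(c, d);
        release(c);
        return c.freed && d.freed;
    }

    public static void main(String[] args) {
        System.out.println("cycle freed: " + cycleIsFreed()); // false
        System.out.println("chain freed: " + chainIsFreed()); // true
    }
}
```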

4.2 Tracing collection method

This treats all references as a tree-like structure that may contain cycles (in effect a graph) but has a single root node, and checks whether each object is reachable from the root. The drawback is that the program must be suspended so that the scene stays unchanged during a scan.
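The reachability check itself is an ordinary graph traversal. Here is a minimal sketch, assuming the object graph is given as a map from an object name to the names it references; ReachabilitySketch and sampleGraph are hypothetical names. Note that the unreachable cycle C <-> D, which reference counting could not reclaim, is simply never visited.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy reachability trace; names and graph representation are assumptions.
final class ReachabilitySketch {
    /** Returns every object reachable from the root, following references transitively. */
    static Set<String> reachable(Map<String, List<String>> refs, String root) {
        Set<String> marked = new TreeSet<>();
        Deque<String> pending = new ArrayDeque<>();
        pending.push(root);
        while (!pending.isEmpty()) {
            String obj = pending.pop();
            if (marked.add(obj)) { // process each object once, so cycles terminate
                for (String target : refs.getOrDefault(obj, Collections.<String>emptyList())) {
                    pending.push(target);
                }
            }
        }
        return marked;
    }

    static Map<String, List<String>> sampleGraph() {
        Map<String, List<String>> refs = new HashMap<>();
        refs.put("root", Arrays.asList("A"));
        refs.put("A", Arrays.asList("B"));
        refs.put("B", Arrays.asList("A")); // cycle, but reachable from the root
        refs.put("C", Arrays.asList("D"));
        refs.put("D", Arrays.asList("C")); // cycle that nothing reachable points to
        return refs;
    }

    public static void main(String[] args) {
        System.out.println(reachable(sampleGraph(), "root")); // [A, B, root]
    }
}
```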

4.2.1 Copying

Open up a second memory space, copy the objects reached by the scan into it, and then clear the entire original space. This method suits cases where few objects survive.
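As a rough sketch of Copying, assume the list of surviving objects has already been produced by a trace from the root. The work done is then proportional to the number of survivors, which is why the method suits spaces where few objects survive. CopyingSketch and collect are hypothetical names, and arrays of strings stand in for memory.

```java
import java.util.Arrays;
import java.util.List;

// Toy copying collector; names are hypothetical and String[] stands in for memory.
final class CopyingSketch {
    /** Evacuates the survivors into a new to-space and reclaims the whole from-space. */
    static String[] collect(String[] fromSpace, List<String> survivors) {
        String[] toSpace = new String[fromSpace.length];
        int free = 0; // bump pointer into to-space
        for (String obj : survivors) {
            toSpace[free++] = obj; // survivors end up packed together
        }
        Arrays.fill(fromSpace, null); // everything left behind is garbage
        return toSpace;
    }

    public static void main(String[] args) {
        String[] from = {"a", "b", "c", "d"};
        String[] to = collect(from, Arrays.asList("a", "d"));
        System.out.println(Arrays.toString(to));   // [a, d, null, null]
        System.out.println(Arrays.toString(from)); // [null, null, null, null]
    }
}
```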

4.2.2 Mark-Sweep

Mark every object the scan can reach, then clear all unmarked objects. This method causes memory fragmentation. It suits cases where many objects survive.
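The fragmentation is easy to see in a toy model. In the sketch below (MarkSweepSketch, sweep, and largestFreeRun are made-up names, and the marked set is assumed to come from a prior trace), half the heap is freed but no two free slots are adjacent.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Toy sweep phase; names are hypothetical and String[] stands in for memory.
final class MarkSweepSketch {
    /** Frees every slot whose object was not marked; survivors stay where they are. */
    static void sweep(String[] heap, Set<String> marked) {
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] != null && !marked.contains(heap[i])) {
                heap[i] = null;
            }
        }
    }

    /** Length of the longest run of adjacent free slots. */
    static int largestFreeRun(String[] heap) {
        int best = 0;
        int run = 0;
        for (String slot : heap) {
            run = (slot == null) ? run + 1 : 0;
            best = Math.max(best, run);
        }
        return best;
    }

    public static void main(String[] args) {
        String[] heap = {"a", "b", "c", "d", "e", "f"};
        sweep(heap, new HashSet<>(Arrays.asList("a", "c", "e")));
        System.out.println(Arrays.toString(heap)); // [a, null, c, null, e, null]
        System.out.println(largestFreeRun(heap));  // 1
    }
}
```

Three slots are free here, yet an object needing two adjacent slots cannot be placed.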

4.2.3 Mark-Compact

Building on Mark-Sweep, tidy up the memory afterwards so that surviving objects are stored contiguously, which eliminates fragmentation.
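A minimal sketch of the compaction step, again with hypothetical names (MarkCompactSketch, compact) and assuming the marked set comes from a prior trace: survivors slide toward the start of the heap, leaving one contiguous free area at the end.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Toy sliding compaction; names are hypothetical and String[] stands in for memory.
final class MarkCompactSketch {
    /** Slides marked objects to the front in their original order; returns the first free index. */
    static int compact(String[] heap, Set<String> marked) {
        int free = 0; // next slot to fill; never ahead of the slot being examined
        for (int i = 0; i < heap.length; i++) {
            String obj = heap[i];
            heap[i] = null;
            if (obj != null && marked.contains(obj)) {
                heap[free++] = obj;
            }
        }
        return free;
    }

    public static void main(String[] args) {
        String[] heap = {"a", "b", "c", "d", "e", "f"};
        int firstFree = compact(heap, new HashSet<>(Arrays.asList("a", "c", "e")));
        System.out.println(Arrays.toString(heap)); // [a, c, e, null, null, null]
        System.out.println(firstFree);             // 3
    }
}
```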

5. JVM Implementation

After so much background, let's see how the Sun JDK does it.
First, work out which areas the JVM needs to garbage-collect. Of the memory areas listed above, the Native Method Stack, JVM Method Stack, and PC register use operating-system memory that is released directly when no longer needed, so we do not have to worry about them. The Permanent Generation holds information about loaded classes; because classes can be loaded dynamically, it still needs GC. The Heap is the soil in which objects are created and live, so it is of course the main target of GC.
Let me explain two terms first. Remember the New Generation and Old Generation mentioned above? Why divide the Heap into these two parts? Because the objects they store differ completely in one characteristic: survival time. Naturally we want to use different GC policies for them. A GC of the New Generation is called a Minor GC, and a GC of both generations is called a Full GC. It is also common knowledge that the New Generation is usually not very large while the Old Generation is relatively large, so a Full GC is costly and should be avoided as much as possible.

5.1 GC policy of New Generation

Serial GC. Uses the Copying algorithm in single-threaded mode. This also explains why the New Generation is further divided into Eden Space, S0, and S1: as you have probably guessed, the Copying algorithm needs extra memory space. S0 and S1 are also called From Space and To Space. Think about the details.
Parallel Scavenge. Divides the memory space among multiple threads, and also uses the Copying algorithm.
ParNew. Compared with Parallel Scavenge, it adds special handling so that it can be used together with CMS GC in the Old Generation.

5.2 GC policy of Old Generation

Serial GC. Also single-threaded, of course. The implementation combines Mark-Sweep and Mark-Compact, with some improvements.
Parallel Mark-Sweep and Parallel Mark-Compact. The Old Generation space is also divided into regions, but at a finer granularity. Think about why these two algorithms are used here (hint: section 4.2.2 says when Mark-Sweep applies).
CMS (Concurrent Mark-Sweep) GC. I admit I did not really understand this one. Its purpose is to collect concurrently with the running program, and the specific implementation is quite involved. If you are interested, read the book named at the end of this article. What can be said here is that it uses the Mark-Sweep algorithm. To deal with memory fragmentation, CMS provides a defragmentation feature that runs once after a number of Full GCs.

6. How to Use

Knowing how the JVM works, how do we use it? This is the most practical question. Each GC policy can be selected with a parameter at startup, but for the details you should still read the book. Let me mention the client and server modes. The default is nominally client mode, but the JVM actually selects a mode automatically based on the machine configuration. You probably won't remember the rules, so in real use it is better to specify the mode explicitly. Interestingly, the two modes can be seen as targeting different usage scenarios, so they come with different GC policies, as the following table shows:
+--------+-------------------+------------------------+
| Mode   | New Gen GC        | Old Gen GC             |
+--------+-------------------+------------------------+
| Client | Serial GC         | Serial GC              |
+--------+-------------------+------------------------+
| Server | Parallel Scavenge | Parallel Mark-Sweep GC |
+--------+-------------------+------------------------+
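To check which collectors a given JVM actually ended up with, the standard java.lang.management API can list them. A small sketch follows; the class name ActiveCollectors is made up. The startup options in the comment (-client, -server, -XX:+UseSerialGC, -XX:+UseParallelGC, -XX:+UseConcMarkSweepGC) are HotSpot options from the Sun JDK era this article describes; newer JDKs have removed some of them, so treat the command lines as assumptions about your JVM.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class. Example launches on an older HotSpot JVM (assumed available):
//   java -client -XX:+UseSerialGC ActiveCollectors
//   java -server -XX:+UseParallelGC ActiveCollectors
//   java -server -XX:+UseConcMarkSweepGC ActiveCollectors
final class ActiveCollectors {
    /** Names of the collectors the running JVM reports, usually one per generation. */
    static List<String> names() {
        List<String> result = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            result.add(gc.getName());
        }
        return result;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // getCollectionCount() is the number of collections so far, or -1 if undefined
            System.out.println(gc.getName() + ": " + gc.getCollectionCount() + " collections");
        }
    }
}
```

The names printed differ by collector and JVM version, so they are the quickest way to confirm that an explicit setting took effect.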

7. Supplement

Finally, a word on what S0 and S1 are for, so you can check whether what you worked out earlier is right. I got it wrong at first too. Remember the condition for promoting an object from the New Generation to the Old Generation? It has to survive several GCs, and counting those sounds troublesome. New objects are all created in Eden Space, while S0 and S1 hold the objects that survived the previous GC. So after each Minor GC, the JVM knows which objects from Eden are still alive, and by comparing with what was kept in S0 or S1 it can tell whether an object has survived both GCs. After the comparison, one Sx is cleared and the survivors are saved to the other Sx. The next GC repeats the process.
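The promotion process can be sketched as a toy model. The bookkeeping is modeled here as an age stored per surviving object, which is an assumption of this sketch rather than something stated above; TenuringSketch, its fields, and the threshold of 3 are all made up for illustration.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Toy model of Eden, the survivor spaces, and promotion; all names are hypothetical.
final class TenuringSketch {
    final int threshold;                               // GCs an object must survive before promotion
    final Set<String> eden = new LinkedHashSet<>();
    Map<String, Integer> from = new LinkedHashMap<>(); // survivor space in use: object -> age
    final Set<String> old = new LinkedHashSet<>();

    TenuringSketch(int threshold) {
        this.threshold = threshold;
    }

    void allocate(String name) {
        eden.add(name); // new objects always start in Eden
    }

    /** One Minor GC; 'live' is the set of objects still reachable. */
    void minorGc(Set<String> live) {
        Map<String, Integer> to = new LinkedHashMap<>();
        for (String obj : eden) {
            if (live.contains(obj)) {
                to.put(obj, 1); // survived its first GC
            }
        }
        for (Map.Entry<String, Integer> entry : from.entrySet()) {
            if (!live.contains(entry.getKey())) {
                continue; // died since the last GC
            }
            int age = entry.getValue() + 1;
            if (age >= threshold) {
                old.add(entry.getKey()); // survived enough GCs: promote
            } else {
                to.put(entry.getKey(), age);
            }
        }
        eden.clear();
        from = to; // the two survivor spaces swap roles
    }

    public static void main(String[] args) {
        TenuringSketch heap = new TenuringSketch(3);
        heap.allocate("x");
        heap.allocate("y");
        Set<String> live = Collections.singleton("x");
        for (int gc = 1; gc <= 3; gc++) {
            heap.minorGc(live);
            System.out.println("after GC " + gc + ": survivor=" + heap.from + " old=" + heap.old);
        }
    }
}
```

With a threshold of 3, object x sits in a survivor space after the first two collections and moves to the Old Generation on the third, while y is gone after the first.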

8. Thanks

This article is my reading notes on the "JVM memory management" part of the book Distributed Java Applications. I would like to thank the author of the book, Lin Hao (pen name Bi Xuan). After reading it, my knowledge of this area felt much more orderly. If you are interested in JVM internals and distributed applications, read this book!
 
 
