Detailed description of JVM memory regions and the garbage collection mechanism

Source: Internet
Author: User

When writing Java code, you do not need to worry about whether, or when, an object created with new is released, because the JVM has an automatic garbage collection mechanism. In a previous blog post we talked about the MRC (manual reference counting) and ARC (automatic reference counting) memory management methods in Objective-C, and we will review them below. The JVM's memory reclamation does not use reference counting; it mainly uses "copying collection" and "adaptive collection".

Besides these two algorithms there are others, which are introduced below. In this post we first briefly cover the JVM memory regions and then introduce the JVM garbage collection mechanism.

 

I. Introduction to JVM memory region division

This section briefly covers the division of JVM memory areas, paving the way for the discussion of garbage collection below. There is plenty of detailed material on the JVM memory areas on the Internet; please search for it on your own.

The JVM memory area is divided into two major parts, as the following figure shows in simplified form. The first is the heap. All objects created with new are allocated on the heap, much as malloc in C obtains memory from the heap. The garbage collector mainly reclaims memory in the heap area.

The other part is the non-heap area. It mainly includes the "Code Cache", which stores compiled native code; the "Perm Gen" (permanent generation), which stores the JVM's own static data (in Java 8, Metaspace takes its place); and the "Java Virtual Machine Stack (JVM Stack)" and "Native Method Stack", which store method parameters, local variables, and references, and record the order of method calls.

  

The garbage collector reclaims unused memory in the heap and tidies up the areas that remain in use. Based on object lifetime or object size, the heap is divided into the "Young Generation" and the "Old Generation". Objects in the young generation are short-lived and frequently become garbage; objects in the old generation are relatively stable and rarely become garbage. They are separated in order to divide and conquer: each region gets a collection algorithm suited to the characteristics of its memory blocks, which improves the efficiency of garbage collection in the heap. The later sections describe this in detail.
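To see the heap and non-heap split on a running JVM, the standard java.lang.management API can list the memory pools. This is only a small sketch: the class name MemoryRegionsDemo and the method poolsByType are made up for illustration, and the pool names that get printed depend on the JVM version and the collector in use.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

// Lists the running JVM's memory pools, grouped into heap and non-heap.
// MemoryRegionsDemo and poolsByType are illustrative names, not from the article.
class MemoryRegionsDemo {

    static Map<MemoryType, List<String>> poolsByType() {
        Map<MemoryType, List<String>> pools = new EnumMap<>(MemoryType.class);
        // Start with an empty list for each type (HEAP and NON_HEAP).
        for (MemoryType type : MemoryType.values()) {
            pools.put(type, new ArrayList<>());
        }
        // Each pool reports whether it belongs to the heap or the non-heap area.
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            pools.get(pool.getType()).add(pool.getName());
        }
        return pools;
    }

    public static void main(String[] args) {
        // Pool names vary with the JVM version and the selected collector.
        poolsByType().forEach((type, names) -> System.out.println(type + ": " + names));
    }
}
```

On a typical HotSpot JVM the heap group contains the young- and old-generation pools, while the non-heap group contains pools such as the code cache.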

 

 

II. Introduction to common memory recovery Algorithms

The above is a brief introduction to the division of memory regions in the JVM. Next we look at several common memory reclamation algorithms. These algorithms are not used only in the JVM, and we will also review the memory reclamation method used in OC. The following sections cover "reference counting collection", "copying collection", "mark-compact collection", and "generational collection".

 

1. Reference counting memory recovery

Reference counting is the memory reclamation mechanism currently used in Objective-C and Swift; our previous blog post discussed it in detail. Each new reference to an object increases its reference count by 1, and when the count drops to 0 the memory is reclaimed. This approach, however, easily forms a "reference cycle".

In Objective-C, the memory leak caused by a reference cycle is avoided by declaring references as strong or weak. When a "strong reference cycle" occurs, we can make one of the references weak, which breaks the cycle so that no memory leaks. For more details about reference counting, refer to the previous blog posts on OC.

To better understand how reference counting works, we drew the following figure. The references a, b, and c on the stack on the left each point to a different memory block in the heap. For a block in the heap, each strong reference to it adds 1 to its retainCount; a weak reference does not.

First look at memory block 1, referenced by a. This block is strongly referenced only by a, so its retainCount = 1. When a no longer references it, retainCount = 0 and the memory is reclaimed. No memory leak occurs in this case.

Next look at memory block 2, which b points to. Both b and memory block 3 strongly reference block 2, so the retainCount of block 2 is 2. Block 2 also strongly references block 3, so the retainCount of block 3 is 1. The memory that b points to therefore contains a "strong reference cycle": when b no longer points to block 2, its rc changes from 2 to 1. Because the retainCount is not zero, block 2 is not released, and since block 2 still holds its reference to block 3, block 3 is not released either. Neither block can ever be used again, so this is a memory leak. If the two blocks are very large, the consequences are serious.

The blocks referenced by c do not form a "strong reference cycle", because one link in the reference chain is a weak reference. When c no longer references block 4, its rc changes from 1 to 0 and the block is released immediately. After block 4 is released, the rc of block 5 changes from 1 to 0, and block 5 is released as well. No memory leak occurs. This is how Objective-C reclaims memory. Besides strong and weak references, OC also has autorelease pools: an autoreleased object is not released immediately, but only when its autorelease pool is drained. We will not go into it here.
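The two situations around b and c can be imitated with a toy model. The sketch below is written in Java only to stay in one language; it is not how Objective-C implements retain and release. The names Block, retain, release, and referenceStrongly are hypothetical, and it assumes that the weak link in the c chain is the one from block 5 back to block 4.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of reference counting; all names here are hypothetical.
class RefCountDemo {

    static class Block {
        final String name;
        int retainCount = 0;
        boolean freed = false;
        final List<Block> strongRefs = new ArrayList<>();
        Block weakRef; // a weak reference never changes the target's count

        Block(String name) { this.name = name; }

        void retain() { retainCount++; }

        // Dropping a strong reference; at zero the block is freed and
        // releases every block it strongly references.
        void release() {
            retainCount--;
            if (retainCount == 0) {
                freed = true;
                List<Block> children = new ArrayList<>(strongRefs);
                strongRefs.clear();
                for (Block child : children) child.release();
            }
        }

        void referenceStrongly(Block target) {
            strongRefs.add(target);
            target.retain();
        }
    }

    // Blocks 2 and 3: a strong cycle that leaks once b goes away.
    static boolean[] strongCycle() {
        Block block2 = new Block("2");
        Block block3 = new Block("3");
        block2.retain();                  // stack reference b -> 2, rc(2) = 1
        block3.referenceStrongly(block2); // 3 -> 2, rc(2) = 2
        block2.referenceStrongly(block3); // 2 -> 3, rc(3) = 1
        block2.release();                 // b goes away, rc(2) = 1, nothing freed
        return new boolean[] { block2.freed, block3.freed };
    }

    // Blocks 4 and 5: the back reference is weak, so nothing leaks.
    static boolean[] weakBackReference() {
        Block block4 = new Block("4");
        Block block5 = new Block("5");
        block4.retain();                  // stack reference c -> 4, rc(4) = 1
        block4.referenceStrongly(block5); // 4 -> 5, rc(5) = 1
        block5.weakRef = block4;          // weak 5 -> 4, rc(4) stays 1
        block4.release();                 // rc(4) = 0, frees 4 and then 5
        return new boolean[] { block4.freed, block5.freed };
    }

    public static void main(String[] args) {
        boolean[] cycle = strongCycle();
        System.out.println("cycle: block 2 freed=" + cycle[0] + ", block 3 freed=" + cycle[1]);
        boolean[] weak = weakBackReference();
        System.out.println("weak:  block 4 freed=" + weak[0] + ", block 5 freed=" + weak[1]);
    }
}
```

In the cycle case neither block is freed, while in the weak case both are, which matches the walk-through above.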

  

 

2. Copy-type memory recovery

Having covered reference counting, we know that it easily produces "circular references", and that OC introduces "strong" and "weak" references to avoid the resulting memory leaks. Next, let's look at copying collection, which does not need to worry about circular references. Simply put, the core of copying collection is "copying", but the copying is conditional. During garbage collection, the "live objects" are copied to another, empty heap area, and the previous area is then cleared as a whole. A "live object" is one that can be reached from the stack by following the chain of object references. After a live object is copied to the new heap area, the references on the stack must be updated to point to the copy.

Below is the diagram of copying collection that we drew. The heap is divided into two parts, and during garbage collection the live objects in one heap area are copied to the other. Heap area 1 is the area currently in use, and heap area 2 is idle. The unmarked blocks in heap area 1, namely 2 and 3, are garbage to be reclaimed; blocks 1, 4, and 5 are the live objects to be copied, because block 1 can be reached from a on the stack, and blocks 4 and 5 from c. Although blocks 2 and 3 are referenced, the references do not come from the non-heap area; they come only from within the heap, so these blocks are reclaimed.

  

After the live objects are found, the next step is to copy them to heap area 2. The copied objects occupy contiguous memory addresses in heap area 2, so new memory can be allocated directly from the remaining free part of the heap, which makes allocation efficient. After an object is copied, the references from the non-heap area are updated to its new address, as shown below.

  

After the copy, we can reclaim all the memory in heap area 1 directly. The following figure shows the final result. Once heap area 1 is cleared, it can receive copied objects in turn: when garbage collection is next performed on heap area 2, its live objects are copied to heap area 1.

From this example we can see that copying collection is efficient when there is a lot of garbage, because few objects need to be copied and the old heap area is simply cleared. When there is little garbage, however, it has to copy a large number of live objects, and its efficiency is low. This method also splits the heap in half: one half is always idle, so heap utilization is low.
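The copying process can be sketched with two lists standing in for the two heap areas. This is a simplified model rather than the JVM's actual collector: the names CopyingGcDemo, Obj, fromSpace, toSpace, and roots are illustrative, and the object graph is the one from the walk-through (a points to block 1, c points to block 4, block 4 references block 5, and blocks 2 and 3 reference only each other).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

// Simplified copying collector; all names are illustrative.
class CopyingGcDemo {

    static class Obj {
        final int id;
        final List<Obj> refs = new ArrayList<>();
        Obj(int id) { this.id = id; }
    }

    List<Obj> fromSpace = new ArrayList<>();   // area currently in use
    List<Obj> toSpace = new ArrayList<>();     // idle area
    final List<Obj> roots = new ArrayList<>(); // stand-in for stack references

    Obj allocate(int id) {
        Obj o = new Obj(id);
        fromSpace.add(o);
        return o;
    }

    void collect() {
        Map<Obj, Obj> forwarding = new IdentityHashMap<>();
        // Copy everything the roots point to and update the roots.
        for (int i = 0; i < roots.size(); i++) {
            roots.set(i, copy(roots.get(i), forwarding));
        }
        // Scan the copies; toSpace grows while it is scanned, so it
        // doubles as the work queue.
        for (int scan = 0; scan < toSpace.size(); scan++) {
            Obj o = toSpace.get(scan);
            for (int i = 0; i < o.refs.size(); i++) {
                o.refs.set(i, copy(o.refs.get(i), forwarding));
            }
        }
        // Whatever was not copied is garbage: clear the old area at once.
        fromSpace.clear();
        List<Obj> emptied = fromSpace;
        fromSpace = toSpace;
        toSpace = emptied;
    }

    private Obj copy(Obj old, Map<Obj, Obj> forwarding) {
        Obj moved = forwarding.get(old);
        if (moved == null) {
            moved = new Obj(old.id);
            moved.refs.addAll(old.refs); // still old addresses; fixed during the scan
            forwarding.put(old, moved);
            toSpace.add(moved);
        }
        return moved;
    }

    static List<Integer> liveIdsAfterCollect() {
        CopyingGcDemo heap = new CopyingGcDemo();
        Obj b1 = heap.allocate(1), b2 = heap.allocate(2), b3 = heap.allocate(3);
        Obj b4 = heap.allocate(4), b5 = heap.allocate(5);
        b2.refs.add(b3);
        b3.refs.add(b2);                 // 2 and 3 reference only each other
        b4.refs.add(b5);
        heap.roots.addAll(Arrays.asList(b1, b4)); // a -> 1, c -> 4
        heap.collect();
        List<Integer> ids = new ArrayList<>();
        for (Obj o : heap.fromSpace) ids.add(o.id);
        return ids;
    }

    public static void main(String[] args) {
        System.out.println("live after collection: " + liveIdsAfterCollect());
    }
}
```

Note that the cycle between blocks 2 and 3 needs no special handling: they are simply never reached from a root, so they are never copied.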

  

 

3. Mark-compact collection algorithm

From the copying collection process above, we know that it is efficient when there is a lot of garbage and inefficient when there is little. Next we introduce another algorithm, "mark-compact" (mark-compression). It is more efficient when there is little garbage and less efficient when there is a lot, which makes it complementary to copying collection.

The first phase of mark-compact is marking: the "live objects" in the heap are marked. We explained what a live object is above, so we will not repeat it here. By that definition, the live objects in the figure below are memory blocks 1 and 3, so we mark them.

  

After marking is complete, we compact: the live objects are moved together into one section of the heap, and the rest is cleared. In the figure below, the two live objects 1 and 3 have been compacted, and the space after them is then cleaned. The cleaned part can be used to allocate new objects.

Below is the state after mark-compact has finished. Mark-compact collection makes full use of the heap space, and it is efficient when there is little garbage. When there is a lot of scattered garbage, many live objects have to be moved, and efficiency is relatively low. Because it complements copying collection, the two can be combined, choosing the method according to the garbage situation in each heap area. The scheme that integrates copying and mark-compact is "generational" garbage collection, described in detail below.
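The mark and compact phases can be sketched on an array whose slot indices play the role of memory addresses. This is a minimal model and not the JVM's implementation: the names MarkCompactDemo, Obj, and collect are hypothetical, and the reference from block 1 to block 3 is an assumption added so that the address update is visible.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Minimal mark-compact sketch; slot indices stand in for addresses.
class MarkCompactDemo {

    static class Obj {
        final String name;
        final int[] refs;   // slot indices of the objects this one references
        boolean marked;
        Obj(String name, int... refs) { this.name = name; this.refs = refs; }
    }

    // Collects the heap in place and returns the number of live objects.
    static int collect(Obj[] heap, int[] roots) {
        // Phase 1: mark everything reachable from the roots.
        Deque<Integer> pending = new ArrayDeque<>();
        for (int r : roots) pending.push(r);
        while (!pending.isEmpty()) {
            int slot = pending.pop();
            Obj o = heap[slot];
            if (o == null || o.marked) continue;
            o.marked = true;
            for (int r : o.refs) pending.push(r);
        }
        // Phase 2: compute the new slot of every live object.
        int[] forward = new int[heap.length];
        int next = 0;
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] != null && heap[i].marked) forward[i] = next++;
        }
        // Phase 3: rewrite roots and object references to the new slots.
        for (int i = 0; i < roots.length; i++) roots[i] = forward[roots[i]];
        for (Obj o : heap) {
            if (o == null || !o.marked) continue;
            for (int i = 0; i < o.refs.length; i++) o.refs[i] = forward[o.refs[i]];
        }
        // Phase 4: slide live objects to the front and clear the rest.
        for (int i = 0; i < heap.length; i++) {
            Obj o = heap[i];
            heap[i] = null;
            if (o != null && o.marked) {
                o.marked = false;
                heap[forward[i]] = o; // forward[i] <= i, so the target is already free
            }
        }
        return next;
    }

    static String layoutAfterCollect() {
        Obj[] heap = {
            new Obj("1", 2), // slot 0: live, references slot 2 (assumed link)
            new Obj("2"),    // slot 1: garbage
            new Obj("3"),    // slot 2: live
            new Obj("4"),    // slot 3: garbage
            null             // slot 4: free
        };
        int[] roots = { 0 };
        collect(heap, roots);
        StringBuilder sb = new StringBuilder();
        for (Obj o : heap) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(o == null ? "-" : o.name);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(layoutAfterCollect()); // prints "1 3 - - -"
    }
}
```

Unlike the copying approach, no second heap area is needed; the cost is that every surviving object may have to be moved and every reference to it rewritten.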

  

 

4. Generational garbage collection

"Generational" means that objects are divided into generations according to how likely they are to become garbage, or according to their size: the "Young Generation", the "Old Generation", and the "Permanent Generation". The permanent generation is not in the heap, and we will not discuss it further. The following figure illustrates generational garbage collection.

The heap is divided into the "Young Generation" and the "Old Generation". Objects in the young generation were created recently and turn over quickly, so they readily become garbage; the young generation therefore uses the collection method that is efficient when there is a lot of garbage. The young generation is further divided into Eden Space and Survivor Space. Eden Space mainly holds newly created objects, while Survivor Space holds the live objects that survived collection in Eden Space. There are two Survivor spaces, From and To, between which objects are copied back and forth during garbage collection.

The "Old Generation" holds "large objects" and objects that have survived in Survivor Space. Old-generation objects are generally stable and produce little garbage, so mark-compact collection is the more efficient choice here. Generational garbage collection is thus a divide-and-conquer approach: it classifies objects by their characteristics and selects a suitable collection algorithm for each class.

  

 

III. Working Principle of generational garbage collection

In an actual JVM, garbage collection can be classified by threading into "serial garbage collection", performed by a single thread, and "parallel garbage collection", performed by multiple threads. By whether the application is paused during collection, it can be classified into "exclusive collection" and "concurrent collection". I have discussed "parallelism" and "concurrency" many times before; they are not the same concept and must not be confused. This post does not elaborate on these methods. If you are interested, please search for them on your own.

Next, let's walk through the complete steps of generational garbage collection to get an intuitive feel for how it is executed.

 

1. Before garbage collection

The figure shows a heap awaiting generational garbage collection. Some of the allocated objects in the heap are not referenced from the stack; these are the objects to be reclaimed. The heap is divided into the "Young Generation" and the "Old Generation", and the young generation is subdivided into three areas: Eden Space, From, and To. We described the role of each region when introducing generational garbage collection above, so we will not repeat it here.

  

 

2. Generational garbage collection

The figure shows the garbage collection process for the heap above. The To area is empty and can receive copied objects. Because the young generation readily produces garbage, it uses copying collection: the live objects in Eden Space and From are copied to the To area, and the stack references to the copied objects are updated to their new addresses. "Large objects" in From or Eden are copied directly to the old generation, because copying a large object back and forth between From and To is inefficient; placing it straight in the old generation improves collection efficiency.

The old generation is collected with mark-compact. The first step is to mark the live objects.

  

 

3. Results After garbage collection

The following figure shows the result of generational garbage collection. The live objects in Eden Space and From have all been copied to the To area, and the layout of the old generation has changed considerably as well. The old generation also now contains the large objects copied from the From area.
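The young-generation step of this walk-through can be modeled in a few lines. The sketch follows the simplified description in this post, not the exact behavior of a real JVM, which among other things also promotes objects that have survived several young collections. The names GenerationalGcDemo, Obj, and minorGc, the live flag that stands in for reachability from the stack, and the 1024-byte large-object threshold are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model of one young-generation collection; names are hypothetical.
class GenerationalGcDemo {
    static final int LARGE_OBJECT_BYTES = 1024; // assumed threshold, for illustration only

    static class Obj {
        final String name;
        final int sizeBytes;
        final boolean live; // stand-in for "reachable from the stack"
        Obj(String name, int sizeBytes, boolean live) {
            this.name = name;
            this.sizeBytes = sizeBytes;
            this.live = live;
        }
        @Override public String toString() { return name; }
    }

    final List<Obj> eden = new ArrayList<>();
    List<Obj> from = new ArrayList<>();
    List<Obj> to = new ArrayList<>();
    final List<Obj> old = new ArrayList<>();

    void minorGc() {
        evacuate(eden);
        evacuate(from);
        // The filled To area becomes From for the next collection.
        List<Obj> emptied = from;
        from = to;
        to = emptied;
    }

    // Copies live objects out of a young area, then clears the whole area.
    private void evacuate(List<Obj> space) {
        for (Obj o : space) {
            if (!o.live) continue;                 // garbage is simply left behind
            if (o.sizeBytes >= LARGE_OBJECT_BYTES) {
                old.add(o);                        // large objects go to the old generation
            } else {
                to.add(o);                         // the rest are copied to To
            }
        }
        space.clear();
    }

    static String demo() {
        GenerationalGcDemo heap = new GenerationalGcDemo();
        heap.eden.add(new Obj("a", 64, true));
        heap.eden.add(new Obj("b", 64, false));
        heap.eden.add(new Obj("big", 4096, true));
        heap.from.add(new Obj("c", 128, true));
        heap.from.add(new Obj("d", 128, false));
        heap.minorGc();
        return "eden=" + heap.eden + " from=" + heap.from
                + " to=" + heap.to + " old=" + heap.old;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "eden=[] from=[a, c] to=[] old=[big]"
    }
}
```

After the collection Eden is empty, the survivors a and c sit in the area that now serves as From, and the large object has moved to the old generation.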

  

 

 

IV. Configuration and Analysis of GC logs in Eclipse

Having covered all this, let's see how to view the garbage collection process in Eclipse and analyze the garbage collection log. By default, no garbage collection log is printed; you need to add the relevant options to the run configuration. In this section we first configure garbage collection logging in Eclipse and then analyze the log. This post uses Java 8; with other Java versions the log output will differ somewhat.

1. Configure Eclipse runtime settings

Add the corresponding options to the run settings of Eclipse; the log is printed only when a garbage collection takes place. Select the project and find the Run Configurations... option.

  

 

In the dialog box that this option opens, find the (x)= Arguments tab and add the virtual machine parameters under VM arguments as the project's runtime parameters. Here we added two parameters, -XX:+PrintGCTimeStamps and -XX:+PrintGCDetails. Their names make their functions clear: one prints a timestamp for each garbage collection, and the other prints the details of each collection. There are many other parameters, such as those that select the garbage collection algorithm, those that choose "serial" or "parallel" collection, and those that choose "exclusive" or "concurrent" collection. I will not go into them here; please search for them on your own.

  

 

2. Printing and parsing of recycled logs

With these parameters configured, the log is printed when we trigger a garbage collection with System.gc(). First we need some test code; below is the test class we created. The code is simple: it creates a String with new, sets the reference to null, and finally calls System.gc(). The code is as follows:

package com.zeluli.gclog;

public class GCLogTest {
    public static void main(String[] args) {
        String s = new String("Value");
        s = null;
        System.gc();
    }
}

 

Below is the output of the code above; next we explain the main content of the log.

  • [PSYoungGen: 1997K->416K(38400K)] 1997K->424K(125952K), 0.0010277 secs]

    • PSYoungGen indicates that the "Young Generation" is collected in parallel. 1997K->416K is the young generation's usage before and after collection, and (38400K) is the total size of the young generation. The following 1997K->424K(125952K) describes the whole heap: 1997K used before collection, 424K used after collection, and 125952K total heap size.
  • [ParOldGen: 8K->328K(87552K)]

    • ParOldGen indicates that the "Old Generation" is collected in parallel; the fields have the same meaning as for the young generation above.
  • [Metaspace: 2669K->2669K(1056768K)]
    • This is the collection of the "metadata area". Metaspace, like the "permanent generation", is used to store static data and system methods.
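The before, after, and total figures in these entries can also be pulled out programmatically. The following is a rough sketch using java.util.regex; the class GcLogParser and the method parseRegion are hypothetical names, and the pattern only handles entries of the name: beforeK->afterK(totalK) shape shown above, not every GC log format.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Rough parser for "[Name: beforeK->afterK(totalK)" entries; names are hypothetical.
class GcLogParser {
    private static final Pattern REGION =
            Pattern.compile("\\[(\\w+): (\\d+)K->(\\d+)K\\((\\d+)K\\)");

    // Returns {before, after, total} in KB for the named region, or null if absent.
    static long[] parseRegion(String line, String region) {
        Matcher m = REGION.matcher(line);
        while (m.find()) {
            if (m.group(1).equals(region)) {
                return new long[] {
                    Long.parseLong(m.group(2)),
                    Long.parseLong(m.group(3)),
                    Long.parseLong(m.group(4))
                };
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String line = "[PSYoungGen: 1997K->416K(38400K)] 1997K->424K(125952K), 0.0010277 secs]";
        long[] young = parseRegion(line, "PSYoungGen");
        // 1997K before minus 416K after
        System.out.println("young generation freed " + (young[0] - young[1]) + "K");
    }
}
```

For the young-generation entry above this reports 1581K freed, which is 1997K minus 416K.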

  

 

The above is a simple analysis of a garbage collection log. That is all for this post. There is much more to garbage collection in the JVM, and we will cover it later as the need arises.

 
