Details about the CMS garbage collection mechanism

Source: Internet
Author: User

Original work is not easy. Do not reproduce without permission.

 

In general, the execution of CMS can be divided into the following phases:

3.1 initial mark (STW)

3.2 concurrent mark

3.3 concurrent precleaning

3.4 remark (STW)

3.5 concurrent sweep

3.6 reset

3.1 The initial mark phase requires STW.

This phase performs reachability analysis and marks only the objects directly referenced by GC roots.

Note that objects reachable only indirectly are marked in the next phase.

3.2 The concurrent mark phase runs concurrently with user threads.

In this phase, GC root tracing is performed, and the threads paused in the first phase resume running.

Starting from the objects marked in the previous phase, all reachable objects are marked in this phase.
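
The split between the two marking phases can be illustrated with a small sketch. It is not HotSpot code: the Node class, the tiny object graph, and the method names are hypothetical, and everything runs on a single thread. It only shows that the initial mark touches the objects directly referenced by GC roots, while the concurrent mark traces everything reachable from them.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class MarkPhasesSketch {
    // Hypothetical heap object: a name plus its outgoing references.
    static final class Node {
        final String name;
        final List<Node> refs = new ArrayList<>();
        Node(String name) { this.name = name; }
    }

    // Initial mark (STW in CMS): mark only the objects directly referenced by roots.
    static Set<Node> initialMark(List<Node> rootRefs) {
        return new LinkedHashSet<>(rootRefs);
    }

    // Concurrent mark: trace from the initially marked objects to everything reachable.
    static Set<Node> concurrentMark(Set<Node> initiallyMarked) {
        Set<Node> marked = new LinkedHashSet<>(initiallyMarked);
        Deque<Node> work = new ArrayDeque<>(initiallyMarked);
        while (!work.isEmpty()) {
            Node current = work.pop();
            for (Node ref : current.refs) {
                if (marked.add(ref)) {   // add() returns true only for newly marked objects
                    work.push(ref);
                }
            }
        }
        return marked;
    }

    static List<String> names(Set<Node> nodes) {
        List<String> out = new ArrayList<>();
        for (Node n : nodes) {
            out.add(n.name);
        }
        return out;
    }

    public static void main(String[] args) {
        Node a = new Node("A");
        Node b = new Node("B");
        Node c = new Node("C");
        Node d = new Node("D");  // never referenced: garbage
        a.refs.add(b);           // root -> A -> B -> C
        b.refs.add(c);

        Set<Node> first = initialMark(Arrays.asList(a));
        Set<Node> all = concurrentMark(first);
        System.out.println("initial mark:    " + names(first)); // [A]
        System.out.println("concurrent mark: " + names(all));   // [A, B, C]
    }
}
```

In the real collector the second step runs while user threads keep changing references, which is exactly why the later precleaning and remark phases are needed.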

3.3 The concurrent precleaning phase still does marking work, similar to the remark in 3.4.

If they are similar, why is this step needed?

As mentioned earlier, CMS aims for the shortest possible GC pause time.

Remark requires STW (Stop The World), so as much of the remark work as possible should be done in the concurrent phase to shorten the STW time.

This phase marks objects promoted from the young generation, objects newly allocated in the old generation, and objects modified during the concurrent phase.

This phase is complex. Let's approach it through a question that beginners easily overlook or fail to understand:

  • How can we determine that objects in the old generation are alive?

The answer is simple: objects that can be reached through GC root tracing are alive.

Let's take this further. What if the following scenario exists:

[Figure: Current Obj in the old generation is referenced only by an object in the young generation]

How can we ensure that Current Obj is marked as alive during a GC of the old generation?

(Determining whether young generation objects are alive raises the same problem. Think about it; the answer is given later in the article.)

The answer is to scan the young generation. This is why CMS, although it is an old generation GC, still scans the young generation. (Note that the initial mark also scans the young generation.)

In the CMS log, we can clearly see the scan log:

[GC[YG occupancy: 820 K (6528 K)]

[Rescan (parallel) , 0.0024157 secs]

[weak refs processing, 0.0000143 secs]

[scrub string table, 0.0000258 secs] 

[1 CMS-remark: 479379K(515960K)] 480200K(522488K), 0.0025249 secs] 

[Times: user=0.01 sys=0.00, real=0.00 secs]

The Rescan step (a subphase of the remark phase) scans objects in both the young and old generations. The log shows it as Rescan (parallel), which indicates that it is performed in parallel.

(If questions are forming in your mind, you are starting to get it.)

The point is: will a full scan of the young generation and the old generation be slow? Yes.

CMS claims to be the GC with the shortest pause time. Such a long pause time is certainly unacceptable.

How can this problem be solved?

Think about it first.

There must be a mechanism for quickly identifying the living objects in the young generation and the old generation.

Let's start with the young generation.

As you probably know, the objects left after a young generation collection are all alive, and there are few of them.

If a Minor GC runs before the young generation is scanned, won't the situation be much better?

CMS has two parameters, CMSScheduleRemarkEdenSizeThreshold and CMSScheduleRemarkEdenPenetration, whose default values are 2 MB and 50% respectively. Together they mean: if eden usage exceeds 2 MB after precleaning, the abortable concurrent preclean (CMS-concurrent-abortable-preclean) starts, and it runs until eden usage reaches 50%, at which point it is aborted and the remark phase begins.

If a Minor GC happens during the abortable preclean phase, all is well.

There is a small problem here: how long must the abortable preclean run to guarantee that a Minor GC occurs?

The answer is that there is no guarantee. The reason is simple: garbage collection is scheduled automatically by the JVM, and we cannot control when a GC happens.

But this phase must have some upper bound on its running time, right? Yes.

CMS provides the parameter CMSMaxAbortablePrecleanTime, whose default value is 5 s.

Once 5 s is reached, this phase stops and remark begins, whether or not a Minor GC has occurred and whether or not CMSScheduleRemarkEdenPenetration has been reached.
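
The interplay of the three parameters can be written down as two small predicates. This is a simplified sketch, not the HotSpot implementation: the class and method names are hypothetical, and the constants are simply the default values quoted above.

```java
public class AbortablePrecleanSketch {
    // Defaults quoted in the text; the constant names are illustrative.
    static final long EDEN_SIZE_THRESHOLD_BYTES = 2L * 1024 * 1024; // CMSScheduleRemarkEdenSizeThreshold
    static final int EDEN_PENETRATION_PERCENT = 50;                 // CMSScheduleRemarkEdenPenetration
    static final long MAX_ABORTABLE_PRECLEAN_MILLIS = 5_000;        // CMSMaxAbortablePrecleanTime

    // The abortable preclean only starts once eden holds more than the size threshold.
    static boolean shouldStartAbortablePreclean(long edenUsedBytes) {
        return edenUsedBytes > EDEN_SIZE_THRESHOLD_BYTES;
    }

    // The phase ends when eden is full enough or the time budget is used up.
    static boolean shouldAbortPreclean(long edenUsedBytes, long edenCapacityBytes, long elapsedMillis) {
        boolean edenFullEnough = edenUsedBytes * 100 >= edenCapacityBytes * EDEN_PENETRATION_PERCENT;
        boolean outOfTime = elapsedMillis >= MAX_ABORTABLE_PRECLEAN_MILLIS;
        return edenFullEnough || outOfTime;
    }

    public static void main(String[] args) {
        long mb = 1024 * 1024;
        System.out.println(shouldStartAbortablePreclean(1 * mb));          // false: eden below 2 MB
        System.out.println(shouldStartAbortablePreclean(10 * mb));         // true
        System.out.println(shouldAbortPreclean(10 * mb, 100 * mb, 1_000)); // false: 10% full, 1 s elapsed
        System.out.println(shouldAbortPreclean(50 * mb, 100 * mb, 1_000)); // true: eden reached 50%
        System.out.println(shouldAbortPreclean(10 * mb, 100 * mb, 5_000)); // true: 5 s budget spent
    }
}
```

Neither predicate guarantees that a Minor GC has actually happened before remark, which is the gap discussed next.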

What if no Minor GC runs within those 5 s?

CMS provides the CMSScavengeBeforeRemark parameter to force a Minor GC before remark.

This has both an upside and a downside. The upside is a shorter pause in the remark phase. The downside is that the Minor GC pause is immediately followed by the remark pause, so the combined pause is still relatively long.

The CMS log is as follows:

7688.150: [CMS-concurrent-preclean-start]

7688.186: [CMS-concurrent-preclean: 0.034/0.035 secs]

7688.186: [CMS-concurrent-abortable-preclean-start]

7688.465: [GC 7688.465: [ParNew: 1040940K->1464K(1044544K), 0.0165840 secs] 1343593K->304365K(2093120K), 0.0167509 secs]

7690.093: [CMS-concurrent-abortable-preclean: 1.012/1.907 secs]

7690.095: [GC[YG occupancy: 522484 K (1044544 K)]7690.095: [Rescan (parallel) , 0.3665541 secs]7690.462: [weak refs processing, 0.0003850 secs] [1 CMS-remark: 302901K(1048576K)] 825385K(2093120K), 0.3670690 secs]

The abortable preclean starts at 7688.186, a Minor GC (ParNew) runs at 7688.465, and the remark phase begins at 7690.095.

In fact, to reduce the STW time of the remark phase, the precleaning phase does as much work as it can ahead of time.

The rescan step of remark is multi-threaded. To make multi-threaded scanning of the young generation easier, the precleaning phase splits the young generation into blocks.

Each block contains multiple objects, so the remark phase does not have to locate the start of every object from scratch.

The blocks are handed out to multiple threads, each with a clearly defined share of the work, so the scan finishes quickly.

Unfortunately, this approach still depends on a Minor GC having occurred.

If no Minor GC occurs, all the space below top (the next address available for allocation) is treated as a single block (which contains most of the young generation).

Such a block is of little help in the remark phase, so parallel efficiency drops.
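
The effect of having or not having block boundaries can be sketched as follows. This is only an illustration under stated assumptions: the class, the chunks method, and the offsets are hypothetical, and the boundaries are assumed to be sorted object-start offsets strictly between 0 and top.

```java
import java.util.ArrayList;
import java.util.List;

public class YoungGenChunkSketch {
    // Split [0, top) at the recorded boundaries; each result is {begin, end}.
    static List<long[]> chunks(long[] boundaries, long top) {
        List<long[]> out = new ArrayList<>();
        long begin = 0;
        for (long boundary : boundaries) {
            out.add(new long[] {begin, boundary});
            begin = boundary;
        }
        out.add(new long[] {begin, top});   // the tail up to top is always one chunk
        return out;
    }

    public static void main(String[] args) {
        // With recorded boundaries, rescan work can be spread over four threads.
        System.out.println(chunks(new long[] {256, 512, 768}, 1000).size()); // 4
        // Without boundaries, everything below top is one chunk for one thread.
        System.out.println(chunks(new long[0], 1000).size());                // 1
    }
}
```

With a single chunk, only one rescan thread has useful work in the young generation, which is the loss of parallel efficiency described above.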

OK, that covers the young generation mechanism. Now let's talk about the old generation.

The old generation mechanism is the CARD TABLE (this is actually an array in which each element stores one byte).

CMS divides the old generation space into blocks of 512 bytes. Each element of the card table corresponds to one block.

During concurrent marking, if an object's references change, the block containing that object is marked as a dirty card.

In the concurrent precleaning phase, that block is rescanned and the objects referenced by the object are marked as reachable.

For example:

Object state during concurrent marking:


Then the reference held by current obj changes:

 

The block where current obj is located is marked as a dirty card.

Then, in the precleaning phase (remember that one of its tasks is to mark objects modified during concurrent marking), the objects reachable through current obj are marked as follows:


The dirty card flag is also cleared.

This is the old generation mechanism.
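
A card table of this kind can be sketched in a few lines. This is not HotSpot's implementation: the class and method names are hypothetical, the CLEAN and DIRTY byte values are an invented encoding, and the 512-byte card size is HotSpot's usual value, assumed here.

```java
import java.util.ArrayList;
import java.util.List;

public class CardTableSketch {
    static final int CARD_SIZE_BYTES = 512; // assumed card size
    static final byte CLEAN = 0;            // hypothetical encoding, not HotSpot's actual values
    static final byte DIRTY = 1;

    final byte[] cards;                     // one byte per 512-byte block of the old generation

    CardTableSketch(int oldGenBytes) {
        cards = new byte[(oldGenBytes + CARD_SIZE_BYTES - 1) / CARD_SIZE_BYTES];
    }

    // Write barrier: runs whenever a reference field at this old generation offset is updated.
    void onReferenceWrite(int fieldOffset) {
        cards[fieldOffset / CARD_SIZE_BYTES] = DIRTY;
    }

    // Precleaning: visit only the dirty cards, then clear their flags.
    List<Integer> precleanDirtyCards() {
        List<Integer> rescanned = new ArrayList<>();
        for (int i = 0; i < cards.length; i++) {
            if (cards[i] == DIRTY) {
                rescanned.add(i);   // a real collector would re-mark the objects on this card
                cards[i] = CLEAN;
            }
        }
        return rescanned;
    }

    public static void main(String[] args) {
        CardTableSketch table = new CardTableSketch(4096); // 8 cards
        table.onReferenceWrite(100);   // falls on card 0
        table.onReferenceWrite(1500);  // falls on card 2
        table.onReferenceWrite(1535);  // card 2 again
        System.out.println(table.precleanDirtyCards()); // [0, 2]
        System.out.println(table.precleanDirtyCards()); // []
    }
}
```

The point of the design is that precleaning looks at two cards instead of the whole old generation, at the price of one extra byte store on every reference update.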

However, the card table has another use.

Do you remember the question raised earlier? During a Minor GC, how can a young generation object be identified as alive if it is referenced only from the old generation?

(Studies have shown that fewer than 1% of all references point from the old generation to the young generation. You can work out the reason yourself.)

When an old generation object references a young generation object, the corresponding card table entry is set to a corresponding value (each card table entry is one byte, that is, eight bits, and by convention the bits distinguish an old-to-young reference from a modification made during concurrent marking).

Therefore, by scanning the card table, a Minor GC can quickly find the young generation objects referenced from the old generation.

The HotSpot virtual machine maintains the card table through the bytecode interpreter, the JIT compiler, and a write barrier.

When the bytecode interpreter or the JIT compiler updates a reference, the write barrier is triggered to update the card table.

One more prompt: given that the card table exists, what happens when the old generation is very large? (Give your imagination free rein.)

That completes the precleaning phase.

3.4 Remark (STW): all user threads are paused, the objects in the heap are rescanned, reachability analysis is performed, and the living objects are marked.

With the groundwork above, the workload of this phase is greatly reduced, and so is the pause time.

Note that this phase is multi-threaded.

3.5 Concurrent sweep: user threads are resumed, and dead objects are swept at the same time.

3.6 Reset: CMS clears its internal state to prepare for the next collection.

That completes the CMS execution process. The focus was on the work done during concurrent precleaning and on several key CMS parameters. Take your time to digest it, then take a rest, because we are not finished yet.

 4. What's wrong with CMS?

Every coin has two sides, a sentence I often used in high school English compositions.

In my opinion, the three letters of CMS hint at the problems: concurrency plus the mark-sweep algorithm is the source of them.

Let's start with concurrency.

4.1 Concurrency means that multiple threads compete for CPU resources; that is, GC threads and user threads compete with each other. This can reduce the execution efficiency of user threads.

The default number of CMS collector threads is (number of CPUs + 3) / 4. This formula means that with four or more CPUs the collector threads take at least 25% of the CPU resources, leaving up to 75% to user threads, which is acceptable.

But what if CPU resources are scarce, for example only two CPUs? By the formula above, CMS starts one GC thread, which occupies 50% of the CPU resources. The user program may suddenly run 50% slower, which is a very noticeable drop.
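
The arithmetic behind these percentages is easy to check. The sketch below uses the formula exactly as quoted in the text; the class and method names are hypothetical, and the division is integer division.

```java
public class CmsThreadShareSketch {
    // The formula from the text: (number of CPUs + 3) / 4, with integer division.
    static int gcThreads(int cpus) {
        return (cpus + 3) / 4;
    }

    // Share of the CPUs taken by the GC threads, in whole percent.
    static int gcCpuSharePercent(int cpus) {
        return gcThreads(cpus) * 100 / cpus;
    }

    public static void main(String[] args) {
        for (int cpus : new int[] {2, 4, 8, 16}) {
            System.out.println(cpus + " CPUs -> " + gcThreads(cpus)
                    + " GC thread(s), " + gcCpuSharePercent(cpus) + "% of CPU");
        }
    }
}
```

For 2, 4, 8, and 16 CPUs this gives 1, 1, 2, and 4 threads, that is 50%, 25%, 25%, and 25% of the CPUs.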

How can this scenario be handled?

The answer is that you rarely need to consider it: today's PCs have at least dual-core processors, let alone large servers.

The solution CMS offers is incremental mode (i-CMS).

In this mode, GC threads and user threads run alternately during concurrent marking and sweeping, minimizing the time GC threads monopolize CPU resources.

This makes each GC take longer, but reduces the impact on user threads.

However, practice has shown that CMS performs only moderately in this mode, with no major improvement.

i-CMS has been declared "deprecated" (https://docs.oracle.com/javase/8/docs/technotes/guides/vm/gctuning/cms.html#concurrent_mark_sweep_cms_collector).

 

4.2 In the concurrent sweep phase, user threads are still running and may produce new garbage. This new garbage cannot be cleared in the current GC and has to wait for the next one. It has a name: floating garbage.

Because user threads keep running during garbage collection, memory must be reserved for them. Therefore, unlike other collectors, CMS cannot wait until the old generation is full before collecting.

CMS provides the CMSInitiatingOccupancyFraction parameter to set the percentage of old generation space in use at which garbage collection starts.

The default value of this parameter is 92%. The right value depends on the specific application scenario.

Too low a value leads to frequent CMS GCs and many pauses. Conversely, what happens if the value is too high?

Suppose the value is set to 99%, leaving 1% free space.

If user threads need more than that 1% during the concurrent sweep phase, a Concurrent Mode Failure occurs, meaning that concurrent mode has failed.

In this case, the virtual machine falls back to its backup plan: the Serial Old collector collects the old generation again. As a result, the pause time becomes longer.
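
A back-of-the-envelope check makes the trade-off concrete. The sketch below is a deliberately crude model, not how the JVM decides anything: the class, the methods, and all numbers (a 1000 MB old generation, 20 MB/s of promotions, a 2 s concurrent cycle) are hypothetical, and it ignores the space freed by the sweep itself.

```java
public class CmsHeadroomSketch {
    // Free space left in the old generation when a cycle starts at the given occupancy percent.
    static long headroomBytes(long oldGenCapacityBytes, int initiatingOccupancyPercent) {
        return oldGenCapacityBytes * (100 - initiatingOccupancyPercent) / 100;
    }

    // True if the objects promoted while the cycle runs would not fit into the headroom.
    static boolean wouldFailConcurrentMode(long oldGenCapacityBytes, int initiatingOccupancyPercent,
                                           long promotedBytesPerSecond, double cycleSeconds) {
        double needed = promotedBytesPerSecond * cycleSeconds;
        return needed > headroomBytes(oldGenCapacityBytes, initiatingOccupancyPercent);
    }

    public static void main(String[] args) {
        long mb = 1024 * 1024;
        long oldGen = 1000 * mb;
        // Starting at 99% leaves 10 MB, but 40 MB arrive during the cycle.
        System.out.println(wouldFailConcurrentMode(oldGen, 99, 20 * mb, 2.0)); // true
        // Starting at 92% leaves 80 MB, which is enough for the same load.
        System.out.println(wouldFailConcurrentMode(oldGen, 92, 20 * mb, 2.0)); // false
    }
}
```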

So CMSInitiatingOccupancyFraction has to be set case by case.

There are formulas for setting this parameter on the Internet. I personally find them not very rigorous (the reason involves another CMS problem), so I will not write them out here, to avoid confusing you.

 

In fact, CMS has a dynamic check mechanism.

Based on historical records, CMS predicts how long it will take for the old generation to fill up and how long a collection of the old generation will take.

Before the old generation space is used up, CMS can start garbage collection automatically based on its own predictions.

This behavior is controlled by the UseCMSInitiatingOccupancyOnly parameter; enabling it turns the prediction off, so that only CMSInitiatingOccupancyFraction is used.

 

Here is a question for readers to think about: if you were the designer, how would you predict when to start a collection automatically?

 

4.3 The first two problems are caused by concurrency. The next one is caused by the mark-sweep algorithm.

The mark-sweep algorithm can leave a large amount of space fragmentation. With too many fragments, allocating large objects becomes troublesome.

The old generation often has plenty of free space in total, yet no contiguous region large enough for the object being allocated can be found, so a Full GC has to be triggered.
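
The situation is easy to reproduce with a toy free list. The sketch is hypothetical (the class, the first-fit search, and the chunk sizes are made up for illustration) and only demonstrates that total free space and allocatable space are different things.

```java
public class FragmentationSketch {
    // Total free space across all free chunks.
    static int totalFree(int[] freeChunks) {
        int sum = 0;
        for (int chunk : freeChunks) {
            sum += chunk;
        }
        return sum;
    }

    // First-fit search: index of the first chunk large enough for the request, or -1.
    static int firstFit(int[] freeChunks, int request) {
        for (int i = 0; i < freeChunks.length; i++) {
            if (freeChunks[i] >= request) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        int[] freeChunksKb = {3, 2, 4};   // free chunks left between live objects after a sweep
        int requestKb = 6;
        System.out.println("total free: " + totalFree(freeChunksKb) + " KB"); // 9 KB
        System.out.println("fits at:    " + firstFit(freeChunksKb, requestKb)); // -1
    }
}
```

Nine kilobytes are free, yet a six-kilobyte object cannot be placed until the live objects are compacted.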

CMS's solution is the UseCMSCompactAtFullCollection parameter (enabled by default), which turns on compaction of memory fragments when a Full GC cannot be avoided.

This process requires STW. The fragmentation problem is solved, but the pause time becomes longer.

The virtual machine also provides the parameter CMSFullGCsBeforeCompaction, which sets how many Full GCs without compaction are performed before one with compaction (the default is 0, meaning compaction happens on every Full GC).
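
The counting rule described for CMSFullGCsBeforeCompaction can be sketched as a small counter. This models only the rule as stated in the text; the class and method names are hypothetical and no real compaction takes place.

```java
public class CompactionScheduleSketch {
    private final int fullGcsBeforeCompaction;   // mirrors CMSFullGCsBeforeCompaction
    private int uncompactedSinceLast = 0;

    CompactionScheduleSketch(int fullGcsBeforeCompaction) {
        this.fullGcsBeforeCompaction = fullGcsBeforeCompaction;
    }

    // Called once per Full GC; returns true if this Full GC should compact.
    boolean onFullGc() {
        if (uncompactedSinceLast >= fullGcsBeforeCompaction) {
            uncompactedSinceLast = 0;
            return true;
        }
        uncompactedSinceLast++;
        return false;
    }

    public static void main(String[] args) {
        CompactionScheduleSketch everyTime = new CompactionScheduleSketch(0);
        System.out.println(everyTime.onFullGc() + " " + everyTime.onFullGc()); // true true

        CompactionScheduleSketch everyThird = new CompactionScheduleSketch(2);
        System.out.println(everyThird.onFullGc() + " " + everyThird.onFullGc()
                + " " + everyThird.onFullGc());                               // false false true
    }
}
```

A larger value spreads the compaction cost over more Full GCs, but lets fragmentation build up in between.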

As further reading, look up the "foreground collector". It has also been declared deprecated (https://bugs.openjdk.java.net/browse/JDK-8027132).

That covers the problems with CMS.

That also completes the CMS-related content.

 

Summary:

CMS uses multiple methods to minimize GC pause time and reduce user program pauses.

Reducing the pause time comes at the cost of CPU throughput.

This is a trade-off between pause time and performance. It can be simply understood as trading "space" (performance) for time.

 

Try putting yourself in the designer's position.

 

Once again: do not reproduce without permission!

References:

https://docs.oracle.com/javase/8/docs/technotes/guides/vm/gctuning/cms.html#concurrent_mark_sweep_cms_collector

https://blogs.oracle.com/jonthecollector/entry/did_you_know

http://dept.cs.williams.edu/~freund/cs434/hotspot-gc.pdf

https://plumbr.eu/handbook/garbage-collection-algorithms-implementations

https://blogs.msdn.microsoft.com/abhinaba/2009/03/02/back-to-basics-generational-garbage-collection/

https://bugs.openjdk.java.net/browse/JDK-8027132

Understanding the Java Virtual Machine in Depth: Advanced JVM Features and Best Practices

 
