Java garbage collector

Source: Internet
Author: User

A garbage collector is a concrete implementation of a memory reclamation algorithm. This article covers seven collectors that act on different generations: the new-generation collectors (Serial, ParNew, Parallel Scavenge), the old-generation collectors (Serial Old, Parallel Old, CMS), and G1, which covers the whole heap. In the usual diagram of these collectors, the new-generation collectors sit on top and the old-generation collectors below, and a line connecting two collectors means they can be used together.

    • Serial collector

The Serial collector is a single-threaded new-generation collector that uses a copying algorithm. "Single-threaded" means more than using one CPU or one thread for garbage collection: more importantly, while it collects, all other worker threads (user threads) must be paused until the collection finishes ("Stop the World").

Advantage: it is simple and efficient. In an environment limited to a single CPU, the Serial collector has no thread-interaction overhead and can therefore reach the highest single-threaded collection efficiency.

It is the default new-generation collector for a JVM running in client mode, where it is a good choice.
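Which collectors a running JVM is actually using can be checked from inside the program. The following is a minimal sketch, assuming a HotSpot JVM; the class name ActiveCollectors is illustrative, and -XX:+UseSerialGC is HotSpot's flag for selecting the Serial and Serial Old pair (it is not mentioned in the text above).

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

/** Lists the garbage collectors registered in the current JVM (illustrative helper). */
final class ActiveCollectors {

    /** Returns the names the JVM reports for its new- and old-generation collectors. */
    static List<String> names() {
        List<String> result = new ArrayList<>();
        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            result.add(bean.getName());
        }
        return result;
    }

    public static void main(String[] args) {
        // Try running with different collector flags, for example -XX:+UseSerialGC,
        // and compare the names that are printed.
        for (String name : names()) {
            System.out.println(name);
        }
    }
}
```

The reported names are chosen by the JVM and differ between collectors and JDK versions, so the sketch prints them rather than assuming particular values.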

    • ParNew collector (parallel GC)

The ParNew collector is essentially a multithreaded version of the Serial collector; apart from using multiple threads for garbage collection, it behaves the same as the Serial collector.

ParNew is the preferred new-generation collector for many virtual machines running in server mode. One important reason, unrelated to performance, is that apart from the Serial collector, only ParNew can work with the CMS collector (a concurrent old-generation collector).

Two terms are used throughout the rest of this article:

    • Parallel: multiple garbage collection threads work at the same time, while user threads remain paused.
    • Concurrent: user threads and garbage collection threads run at the same time (not necessarily in parallel; they may alternate). The user program keeps running while the garbage collector runs on another CPU.

    • Parallel Scavenge collector (parallel GC), throughput first

The Parallel Scavenge collector is also a new-generation collector that uses the copying algorithm and collects with multiple threads in parallel. What sets it apart is its focus: collectors such as CMS try to minimize the time user threads are paused during garbage collection, whereas Parallel Scavenge aims for a controllable throughput. It targets maximum throughput (that is, the smallest share of total run time spent in GC) and accepts longer STW pauses in exchange.

Throughput = time running user code / (time running user code + garbage collection time). For example, if the virtual machine runs for 100 minutes in total and garbage collection takes 1 minute of that, the throughput is 99%.
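The throughput formula can be checked with a few lines of arithmetic. This is a minimal sketch; the class and method names are illustrative. The second method relies on HotSpot's -XX:GCTimeRatio=N flag, which is not mentioned in the text above: it asks the collector to keep GC time at or below 1 / (1 + N) of total time.

```java
/** Worked example of the throughput formula (names are illustrative). */
final class GcThroughput {

    /** Throughput = user-code time / (user-code time + GC time). */
    static double throughput(double userCodeMinutes, double gcMinutes) {
        return userCodeMinutes / (userCodeMinutes + gcMinutes);
    }

    /** Largest fraction of total time allowed for GC under -XX:GCTimeRatio=N. */
    static double maxGcFraction(int gcTimeRatio) {
        return 1.0 / (1 + gcTimeRatio);
    }

    public static void main(String[] args) {
        // 100 minutes in total, 1 of them spent in GC, so 99 minutes of user code.
        System.out.println("throughput = " + throughput(99, 1));
        // A ratio of 99 allows at most 1/100 of the time for GC.
        System.out.println("max GC fraction = " + maxGcFraction(99));
    }
}
```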

Short pauses suit programs that interact with users, where good response time improves the user experience. High throughput makes the most efficient use of CPU time and finishes the program's work as soon as possible, which suits background computation that needs little interaction.

    • Serial Old collector (serial GC)

Serial Old is the old-generation version of the Serial collector. It is also single-threaded and uses the mark-compact algorithm. It is mainly used by virtual machines in client mode.

In server mode it has two main uses: it is paired with the Parallel Scavenge collector in JDK 1.5 and earlier, and it serves as the fallback for the CMS collector when a concurrent collection ends in a Concurrent Mode Failure.

    • Parallel Old collector (parallel GC)

Parallel Old is the old-generation version of the Parallel Scavenge collector, using multiple threads and the mark-compact algorithm. It was introduced in JDK 1.6. Before that, a new generation collected by Parallel Scavenge (PS) had no old-generation choice other than Serial Old, because PS cannot work with the CMS collector.

    • CMS (Concurrent Mark Sweep) collector

CMS is a collector that targets the shortest pause time. It does not achieve the highest GC efficiency (the smallest overall GC time), but it minimizes the time the service is paused during GC, which is critical for real-time or highly interactive applications such as securities trading; such applications generally cannot tolerate long STW pauses. CMS uses the mark-sweep algorithm, so it produces space fragmentation as it runs; the virtual machine therefore provides parameters to compact memory after a CMS collection.

Its operation is broadly divided into four steps:

① Initial mark (CMS initial mark): marks only the objects that GC Roots reference directly; this is fast.

② Concurrent mark (CMS concurrent mark): traces from GC Roots to determine which objects are alive.

③ Remark (CMS remark): corrects the mark records of objects whose marks changed because the user program kept running during concurrent marking. This takes somewhat longer than the initial mark but far less time than concurrent marking.

④ Concurrent sweep (CMS concurrent sweep): reclaims the objects found to be dead.

The two longest steps, concurrent mark and concurrent sweep, run alongside user threads, so overall the memory reclamation of the CMS collector can be regarded as concurrent with the user threads.
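The four CMS steps can be summarized as a small table of which ones pause user threads. The sketch below is only a descriptive model, not HotSpot code; the type names are illustrative, and the stop-the-world flags on initial mark and remark reflect how CMS is generally documented rather than anything stated explicitly above.

```java
import java.util.ArrayList;
import java.util.List;

/** Descriptive model of the CMS cycle (illustrative, not HotSpot code). */
final class CmsPhases {

    enum Phase {
        INITIAL_MARK(true),      // short pause: mark objects directly reachable from GC Roots
        CONCURRENT_MARK(false),  // long, runs alongside user threads
        REMARK(true),            // short pause: fix marks changed during concurrent marking
        CONCURRENT_SWEEP(false); // long, runs alongside user threads

        final boolean stopTheWorld;

        Phase(boolean stopTheWorld) {
            this.stopTheWorld = stopTheWorld;
        }
    }

    /** Returns the phases during which user threads are paused. */
    static List<Phase> pausingPhases() {
        List<Phase> result = new ArrayList<>();
        for (Phase phase : Phase.values()) {
            if (phase.stopTheWorld) {
                result.add(phase);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println("Phases that pause user threads: " + pausingPhases());
    }
}
```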

The advantages of the CMS collector are concurrent collection and low pauses. It is still far from perfect, however, and has three significant shortcomings:

First, the CMS collector is very sensitive to CPU resources. During the concurrent phases it does not pause user threads, but it occupies CPU resources, which slows the application down and lowers total throughput. The number of collection threads CMS starts by default is (number of CPUs + 3) / 4.
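To see why this matters on small machines, the formula can be evaluated for a few CPU counts. A minimal sketch, assuming the division is integer division (an assumption; the text does not say); the class name is illustrative.

```java
/** Evaluates the default CMS thread-count formula quoted above (illustrative). */
final class CmsThreads {

    /** (number of CPUs + 3) / 4, using integer division. */
    static int defaultThreads(int cpus) {
        return (cpus + 3) / 4;
    }

    public static void main(String[] args) {
        int[] cpuCounts = {1, 2, 4, 8, 16};
        for (int cpus : cpuCounts) {
            int threads = defaultThreads(cpus);
            // Share of the machine taken by the collector during concurrent phases.
            double share = 100.0 * threads / cpus;
            System.out.println(cpus + " CPUs -> " + threads + " CMS threads (" + share + "% of CPUs)");
        }
    }
}
```

With 4 or more CPUs the collector threads take at most about a quarter of the machine, but with 2 CPUs a single collector thread already takes half, which is where the slowdown becomes noticeable.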

Second, the CMS collector cannot handle floating garbage and may hit a "Concurrent Mode Failure", which leads to another full GC. Because user threads are still running during the concurrent sweep phase, the program keeps producing new garbage. Garbage that appears after the marking process cannot be handled in the current collection and has to wait for the next GC; this is called "floating garbage". For the same reason, enough memory must be reserved for the user threads while a collection is in progress, so CMS cannot wait until the old generation is almost full before collecting, as other collectors do. By default (in JDK 1.5) CMS is activated when 68% of the old generation is in use. The -XX:CMSInitiatingOccupancyFraction parameter can raise this trigger percentage to reduce the number of collections and improve performance.

If the memory reserved during a CMS run cannot satisfy the program's needs, a "Concurrent Mode Failure" occurs and the virtual machine starts its fallback plan: it temporarily switches to the Serial Old collector to collect the old generation again, so the pause is very long. Setting -XX:CMSInitiatingOccupancyFraction too high therefore easily leads to frequent "Concurrent Mode Failure" events and lower performance.
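The trade-off behind the trigger percentage can be made concrete with a toy model. This sketch is not how HotSpot decides when to start a cycle; it only shows the arithmetic, and the class and method names are hypothetical.

```java
/** Toy model of an occupancy-based trigger (hypothetical names, not HotSpot logic). */
final class CmsTrigger {

    /** True once old-generation usage reaches the given percentage of its capacity. */
    static boolean shouldStartCycle(long usedBytes, long capacityBytes, int occupancyPercent) {
        // Integer comparison of used/capacity against occupancyPercent/100.
        return usedBytes * 100 >= capacityBytes * occupancyPercent;
    }

    /** Memory still free for user threads at the moment the cycle starts. */
    static long headroomBytes(long capacityBytes, int occupancyPercent) {
        return capacityBytes * (100 - occupancyPercent) / 100;
    }

    public static void main(String[] args) {
        long capacity = 1000L * 1024 * 1024; // a 1000 MB old generation, chosen for illustration
        for (int percent : new int[] {68, 80, 92}) {
            long headroomMb = headroomBytes(capacity, percent) / (1024 * 1024);
            System.out.println("trigger at " + percent + "% leaves " + headroomMb + " MB of headroom");
        }
    }
}
```

A higher percentage means fewer collections but less headroom for allocations made while the collector is running, which is exactly the condition that produces a "Concurrent Mode Failure".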

Third, CMS is based on the mark-sweep algorithm, so a collection leaves a lot of space fragmentation. Too many fragments make object allocation troublesome, especially for large objects: the old generation may have plenty of free space yet no contiguous block large enough, which forces a full GC ahead of time. To address this, CMS provides the -XX:+UseCMSCompactAtFullCollection switch, which adds a defragmentation pass after a full GC, and the -XX:CMSFullGCsBeforeCompaction parameter, which sets how many uncompacted full GCs are performed before one that includes compaction.
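The fragmentation problem is easy to reproduce on a toy heap. In the sketch below each array slot stands for one unit of memory; this is a deliberately simplified model with illustrative names, not the layout CMS really uses.

```java
import java.util.Arrays;

/** Toy heap: each slot is one unit of memory; true means "occupied by a live object". */
final class FragmentationDemo {

    /** Length of the longest run of adjacent free slots. */
    static int largestFreeRun(boolean[] heap) {
        int best = 0;
        int run = 0;
        for (boolean used : heap) {
            run = used ? 0 : run + 1;
            best = Math.max(best, run);
        }
        return best;
    }

    /** Number of free slots, adjacent or not. */
    static int totalFree(boolean[] heap) {
        int free = 0;
        for (boolean used : heap) {
            if (!used) {
                free++;
            }
        }
        return free;
    }

    /** Compaction: returns a heap with all live slots moved to the front. */
    static boolean[] compact(boolean[] heap) {
        int live = heap.length - totalFree(heap);
        boolean[] compacted = new boolean[heap.length];
        Arrays.fill(compacted, 0, live, true);
        return compacted;
    }

    public static void main(String[] args) {
        // State after a sweep: every other slot is free, so no two free slots are adjacent.
        boolean[] swept = {true, false, true, false, true, false, true, false};
        System.out.println("total free slots: " + totalFree(swept));
        System.out.println("largest free run after sweep: " + largestFreeRun(swept));
        System.out.println("largest free run after compaction: " + largestFreeRun(compact(swept)));
    }
}
```

Half of the toy heap is free after the sweep, yet an object needing two adjacent slots cannot be placed until the live slots are compacted.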

    • G1 collector

The G1 (Garbage First) collector is a new collector provided in JDK 1.7. It is based on the mark-compact algorithm, so it does not produce memory fragmentation. Another difference is scope: earlier collectors each cover the whole new generation or the whole old generation, whereas G1 covers the entire Java heap (both the new and old generations). Its main features are:

    • Parallelism and concurrency: takes advantage of multi-CPU, multi-core hardware to shorten Stop-the-World pauses.
    • Generational collection: the concepts of new generation and old generation are kept, although G1 manages the whole heap.
    • Space compaction: based on the mark-compact algorithm, it does not produce memory fragmentation at run time and leaves regular, usable memory after a collection.
    • Predictable pauses: besides pursuing low pauses, G1 builds a predictable pause-time model that lets the user specify that, within a time slice of M milliseconds, no more than N milliseconds are spent on garbage collection.

With G1, the memory layout of the Java heap differs greatly from that of other collectors. G1 divides the entire Java heap into multiple independent regions of equal size (Region). The concepts of new generation and old generation are preserved, but the two are no longer physically isolated; each is a collection of regions, which need not be contiguous.

G1 tracks the value of the garbage accumulated in each region (the amount of space that would be reclaimed and the time it would take to reclaim it), maintains a priority list in the background, and, within the allowed collection time, reclaims the most valuable regions first.
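The idea of reclaiming the most valuable regions first within a time budget can be sketched as a greedy selection. This is an illustration of the principle only: the Region class, its fields, the sample numbers, and the greedy rule are all assumptions, and G1's real selection logic is considerably more involved.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Greedy "most garbage per millisecond first" selection (illustrative, not G1's code). */
final class RegionSelection {

    /** Hypothetical per-region statistics. */
    static final class Region {
        final String name;
        final long reclaimableBytes;
        final double costMillis;

        Region(String name, long reclaimableBytes, double costMillis) {
            this.name = name;
            this.reclaimableBytes = reclaimableBytes;
            this.costMillis = costMillis;
        }

        /** Value of collecting this region: bytes reclaimed per millisecond spent. */
        double valuePerMilli() {
            return reclaimableBytes / costMillis;
        }
    }

    /** Picks regions in descending value order while they still fit in the pause budget. */
    static List<Region> chooseCollectionSet(List<Region> regions, double pauseBudgetMillis) {
        List<Region> sorted = new ArrayList<>(regions);
        sorted.sort((a, b) -> Double.compare(b.valuePerMilli(), a.valuePerMilli()));
        List<Region> chosen = new ArrayList<>();
        double spent = 0;
        for (Region region : sorted) {
            if (spent + region.costMillis <= pauseBudgetMillis) {
                chosen.add(region);
                spent += region.costMillis;
            }
        }
        return chosen;
    }

    /** Made-up sample data for the demonstration. */
    static List<Region> sampleRegions() {
        return Arrays.asList(
                new Region("A", 800, 4.0),  // 200 bytes per ms
                new Region("B", 900, 9.0),  // 100 bytes per ms
                new Region("C", 300, 1.0),  // 300 bytes per ms
                new Region("D", 100, 5.0)); // 20 bytes per ms
    }

    public static void main(String[] args) {
        for (Region region : chooseCollectionSet(sampleRegions(), 6.0)) {
            System.out.println("collect region " + region.name);
        }
    }
}
```

With a 6 ms budget the sketch selects regions C and A; region B holds the most garbage in absolute terms but is skipped because it is too expensive for the remaining budget.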

In G1, references between regions (like references between the new and old generations in other collectors) are tracked by the virtual machine with Remembered Sets to avoid scanning the full heap. Each region has its own Remembered Set. When the program writes to a reference-typed field, the virtual machine triggers a write barrier that briefly interrupts the write and checks whether the referenced object is in a different region. If so, the reference information is recorded, via the CardTable, in the Remembered Set of the region that holds the referenced object. During memory reclamation, adding the Remembered Sets to the enumeration range of the GC root nodes guarantees that nothing is missed even without a full heap scan.
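The write-barrier mechanism described above can be modeled in a few lines. This is a simplified sketch under stated assumptions: objects carry an explicit region number, the Remembered Set stores the names of referring objects instead of cards, and every name in the code is hypothetical.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Simplified model of a write barrier feeding per-region remembered sets. */
final class RememberedSetDemo {

    /** A toy object that lives in a numbered region and has one reference field. */
    static final class Obj {
        final String name;
        final int region;
        Obj field;

        Obj(String name, int region) {
            this.name = name;
            this.region = region;
        }
    }

    /** Region number -> names of objects in OTHER regions that point into it. */
    private final Map<Integer, Set<String>> rememberedSets = new HashMap<>();

    /** Stores a reference and runs the barrier check described in the text. */
    void writeField(Obj holder, Obj target) {
        holder.field = target; // the actual reference store
        // Write barrier: record the store only when it crosses a region boundary.
        if (target != null && holder.region != target.region) {
            rememberedSets.computeIfAbsent(target.region, k -> new HashSet<>()).add(holder.name);
        }
    }

    /** Extra roots to scan when collecting the given region alone. */
    Set<String> incomingReferences(int region) {
        return rememberedSets.getOrDefault(region, Collections.<String>emptySet());
    }

    public static void main(String[] args) {
        RememberedSetDemo heap = new RememberedSetDemo();
        Obj a = new Obj("a", 0);
        Obj b = new Obj("b", 1);
        Obj c = new Obj("c", 1);
        heap.writeField(a, b); // crosses from region 0 into region 1: recorded
        heap.writeField(b, c); // stays inside region 1: not recorded
        System.out.println("references into region 1 from outside: " + heap.incomingReferences(1));
    }
}
```

When region 1 is collected on its own, only the recorded referrers need to be treated as additional roots, which is what removes the need for a full heap scan.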

Operation steps of the G1 collector:

    • Initial marking
    • Concurrent marking
    • Final marking
    • Live data counting and evacuation

Initial marking: marks only the objects that GC Roots reference directly and updates the value of TAMS (Next Top at Mark Start), so that when the user program runs concurrently in the next phase it creates new objects in the correct available regions. This phase requires pausing user threads, but it is short.

Concurrent marking: starting from GC Roots, performs reachability analysis on the objects in the heap to find the live ones. This takes a long time but runs concurrently with the user program.

Final marking: corrects the marks that changed because the user program kept running during concurrent marking. The virtual machine records these changes in the Remembered Set Logs, and final marking merges the data in the Remembered Set Logs into the Remembered Sets. This phase requires pausing user threads, but the work can be done in parallel by multiple collector threads.

Live data counting and evacuation: first sorts the regions by reclamation value and cost, then draws up a reclamation plan based on the GC pause time the user expects. This phase could in principle run concurrently with the user program, but because only a subset of the regions is reclaimed and the time is user-controllable, pausing user threads makes the collection much more efficient.
