Java Virtual Machine (2)--garbage collection

Source: Internet
Author: User


Garbage collection algorithms

1. Mark-Sweep

Mark the surviving objects, then reclaim every object that was not marked.

Drawbacks:

    • Neither the mark phase nor the sweep phase is efficient;
    • It leaves a large amount of non-contiguous memory fragmentation, which can make it impossible to allocate memory for large objects.
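The two phases can be sketched on a toy object graph. This is a minimal illustration only: the `Obj` class, the list standing in for the heap, and the method names are all hypothetical, and a real collector works on raw memory rather than Java collections.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Toy model of mark-sweep over a simulated heap; not how HotSpot is implemented.
class MarkSweepSketch {
    static class Obj {
        final String name;
        final List<Obj> refs = new ArrayList<>();
        boolean marked;
        Obj(String name) { this.name = name; }
    }

    // Mark phase: flag every object reachable from the roots.
    static void mark(List<Obj> roots) {
        Deque<Obj> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Obj o = pending.pop();
            if (o.marked) continue;          // already visited (handles cycles)
            o.marked = true;
            pending.addAll(o.refs);
        }
    }

    // Sweep phase: drop unmarked objects, reset marks on the survivors,
    // and return how many objects were reclaimed.
    static int sweep(List<Obj> heap) {
        int before = heap.size();
        heap.removeIf(o -> !o.marked);
        heap.forEach(o -> o.marked = false);
        return before - heap.size();
    }

    public static void main(String[] args) {
        Obj a = new Obj("a"), b = new Obj("b"), c = new Obj("c");
        a.refs.add(b);                       // a -> b; c is unreachable
        List<Obj> heap = new ArrayList<>(List.of(a, b, c));
        mark(List.of(a));
        System.out.println("reclaimed: " + sweep(heap)); // prints "reclaimed: 1"
    }
}
```

Note that the survivors stay where they are, which is exactly why this approach leaves holes behind.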

2. Mark-Compact

Move all surviving objects toward one end of the memory, then directly reclaim everything beyond the end boundary.
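The sliding step can be sketched on an array that stands in for the heap. This assumes marking has already happened and represents dead objects as `null` slots; both are simplifications for illustration, and updating references to the moved objects is left out.

```java
import java.util.Arrays;

// Toy model of the compaction step: null slots stand for dead objects.
class MarkCompactSketch {
    // Slides live entries toward index 0 (preserving their order),
    // clears everything past the boundary, and returns the boundary index.
    static int compact(String[] heap) {
        int free = 0;                        // next slot a survivor moves into
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] != null) {
                heap[free++] = heap[i];
            }
        }
        Arrays.fill(heap, free, heap.length, null);
        return free;
    }

    public static void main(String[] args) {
        String[] heap = {"a", null, "b", null, null, "c"};
        int boundary = compact(heap);
        // prints "3 [a, b, c, null, null, null]"
        System.out.println(boundary + " " + Arrays.toString(heap));
    }
}
```

After compaction all free space is one contiguous run, so a large allocation only has to check the space past the boundary.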

3. Copying

Divide the memory into two blocks of equal size and use only one block at a time. When that block is exhausted, copy the surviving objects to the other block, then clean up the used block in a single pass.

The main drawback is that only half of the memory is usable at any time.
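A minimal sketch of the two-block scheme follows. The class name, the `String` objects, and the liveness predicate are hypothetical stand-ins; a real collector discovers live objects by tracing from the roots rather than being told.

```java
import java.util.Arrays;
import java.util.function.Predicate;

// Toy semi-space heap: objects are strings, liveness is supplied by the caller.
class SemiSpaceSketch {
    private String[] active;   // the half currently used for allocation
    private String[] reserve;  // the idle half that receives survivors
    private int top;           // next free slot in the active half

    SemiSpaceSketch(int halfSize) {
        active = new String[halfSize];
        reserve = new String[halfSize];
    }

    // Bump-pointer allocation; returns false when the active half is full.
    boolean allocate(String obj) {
        if (top == active.length) return false;
        active[top++] = obj;
        return true;
    }

    // Copies survivors into the reserve half, wipes the old half in one step,
    // then swaps the roles of the two halves. Returns the survivor count.
    int collect(Predicate<String> isLive) {
        int copied = 0;
        for (int i = 0; i < top; i++) {
            if (isLive.test(active[i])) reserve[copied++] = active[i];
        }
        Arrays.fill(active, null);
        String[] tmp = active;
        active = reserve;
        reserve = tmp;
        top = copied;
        return copied;
    }

    int used() { return top; }

    public static void main(String[] args) {
        SemiSpaceSketch heap = new SemiSpaceSketch(4);
        for (String s : new String[]{"a", "b", "c", "d"}) heap.allocate(s);
        System.out.println("full: " + !heap.allocate("e"));
        System.out.println("survivors: " + heap.collect(s -> s.equals("a") || s.equals("c")));
        System.out.println("room again: " + heap.allocate("e"));
    }
}
```

Allocation is a simple pointer bump and survivors end up packed together, but the sketch also makes the cost visible: `reserve` sits unused between collections.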

Today's commercial virtual machines use this algorithm to collect the young generation. Instead of dividing the young generation into two equal blocks, they divide it into one larger Eden space and two smaller Survivor spaces, and use Eden plus one Survivor space at a time. During a collection, the objects still alive in Eden and the in-use Survivor space are copied in one pass to the other Survivor space, and then Eden and the previously used Survivor space are cleared.

In the HotSpot virtual machine the default size ratio of Eden to each Survivor space is 8:1, so 90% of the young generation is usable at any time. If the objects surviving a collection need more than the remaining 10%, one Survivor space is not enough, and the virtual machine must rely on the old generation as an allocation guarantee, that is, it borrows old-generation space to hold the objects that do not fit.
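The 90% figure follows from the layout: with a ratio of 8:1 the young generation is split into 8 + 1 + 1 parts, and Eden plus one Survivor space account for 9 of them. The arithmetic can be sketched as below; the class and method names are hypothetical, and the check is a simplification of the real allocation guarantee logic.

```java
// Arithmetic behind the Eden/Survivor layout; names are illustrative only.
class YoungGenLayoutSketch {
    // Fraction of the young generation usable for allocation:
    // Eden plus one Survivor space out of Eden plus two Survivor spaces.
    static double usableFraction(int edenToSurvivorRatio) {
        return (edenToSurvivorRatio + 1.0) / (edenToSurvivorRatio + 2.0);
    }

    // Size of a single Survivor space for a given young-generation size.
    static long survivorBytes(long youngGenBytes, int edenToSurvivorRatio) {
        return youngGenBytes / (edenToSurvivorRatio + 2);
    }

    // True when the survivors of a collection do not fit into one Survivor
    // space, so old-generation space has to be borrowed.
    static boolean needsOldGenGuarantee(long survivingBytes, long youngGenBytes, int ratio) {
        return survivingBytes > survivorBytes(youngGenBytes, ratio);
    }

    public static void main(String[] args) {
        long youngGen = 100L * 1024 * 1024;  // an assumed 100 MB young generation
        System.out.println(usableFraction(8));
        System.out.println(survivorBytes(youngGen, 8));
        System.out.println(needsOldGenGuarantee(12L * 1024 * 1024, youngGen, 8));
    }
}
```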

4. Generational Collection

Today's commercial virtual machines use generational collection, which divides the memory into several blocks according to object lifetime and applies the most suitable collection algorithm to each block.

The heap is generally divided into the young generation and the old generation.

    • Young generation: copying algorithm
    • Old generation: mark-sweep or mark-compact algorithm

Garbage collectors

The HotSpot virtual machine has the 7 garbage collectors described below. A young-generation collector and an old-generation collector can be paired to work together, though not every pairing is supported. Two distinctions help when comparing them:

    • Single-threaded vs. multi-threaded: a single-threaded collector uses only one thread for collection, while a multi-threaded collector uses several;
    • Serial vs. parallel: serial here means that the garbage collector and the user program take turns, so the user program must be paused while garbage collection runs; parallel (more commonly called concurrent) means that the garbage collector and the user program execute at the same time. Except for CMS and G1, all collectors run serially in this sense.
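To see which collectors a running JVM has actually paired, the standard `java.lang.management` API can be queried. The sketch below only assumes the class name `ActiveCollectorsSketch`; the names it prints depend on the JVM version and the flags it was started with, so no particular output is implied.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Lists the collectors active in the current JVM and the memory pools each manages.
class ActiveCollectorsSketch {
    // One line per collector: "<collector name> -> [<pool names>]".
    static List<String> describeCollectors() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            lines.add(gc.getName() + " -> " + Arrays.toString(gc.getMemoryPoolNames()));
        }
        return lines;
    }

    public static void main(String[] args) {
        describeCollectors().forEach(System.out::println);
    }
}
```

Typically one entry covers the young-generation pools and another covers the old generation, which mirrors the pairing described above.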

1. Serial Collector

As its name suggests, the Serial collector executes serially.

It is a single-threaded collector that uses only one thread for garbage collection work.

Its advantage is that it is simple and efficient: in a single-CPU environment it has no thread-interaction overhead, so it achieves the highest single-threaded collection efficiency.

It is the default young-generation collector in Client mode, because in that scenario the memory managed by the virtual machine is generally not very large. When the Serial collector collects a young generation of a few tens of megabytes, or even one to two hundred megabytes, the pause can be kept to little more than 100 milliseconds; as long as collections are not too frequent, this pause is acceptable.

2. ParNew Collector

It is a multithreaded version of the Serial collector.

It is the preferred young-generation collector for virtual machines in Server mode. Besides performance, the main reason is that, apart from the Serial collector, it is the only collector that can work with the CMS collector.

By default it starts as many collection threads as there are CPUs; the -XX:ParallelGCThreads parameter sets the number of threads.
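A program can check from the inside how many CPUs the JVM sees and whether the thread count was set explicitly. The sketch below uses the standard `Runtime` and `RuntimeMXBean` APIs; the class name and the parsing helper are hypothetical, and the helper only looks at the startup arguments, not at the value the JVM finally chose.

```java
import java.lang.management.ManagementFactory;
import java.util.List;
import java.util.OptionalInt;

// Inspects the JVM startup arguments for an explicit GC thread count.
class GcThreadsSketch {
    private static final String FLAG = "-XX:ParallelGCThreads=";

    // Returns the explicitly configured value (the last occurrence wins),
    // or empty if the flag does not appear in the argument list.
    static OptionalInt configuredGcThreads(List<String> jvmArgs) {
        OptionalInt result = OptionalInt.empty();
        for (String arg : jvmArgs) {
            if (arg.startsWith(FLAG)) {
                result = OptionalInt.of(Integer.parseInt(arg.substring(FLAG.length())));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("CPUs visible to the JVM: " + Runtime.getRuntime().availableProcessors());
        System.out.println("Explicit GC thread setting: " + configuredGcThreads(jvmArgs));
    }
}
```

Starting the program with `-XX:ParallelGCThreads=4` on the `java` command line should make the second line report that value; without the flag it reports an empty result.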

3. Parallel Scavenge Collector

Like ParNew, it is a multi-threaded collector.

Other collectors focus on keeping user-thread pauses during garbage collection as short as possible, whereas its goal is to reach a controllable throughput, which is why it is known as the "throughput first" collector. Throughput here means the ratio of the time the CPU spends running user code to the total CPU time consumed.
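Throughput is usually expressed as a ratio: time spent running user code divided by that time plus the time spent in garbage collection. A small sketch of the calculation, with a hypothetical class and method name:

```java
// Throughput = user-code time / (user-code time + GC time).
class ThroughputSketch {
    static double throughput(long userCodeMillis, long gcMillis) {
        if (userCodeMillis < 0 || gcMillis < 0 || userCodeMillis + gcMillis == 0) {
            throw new IllegalArgumentException("times must be non-negative and not both zero");
        }
        return (double) userCodeMillis / (userCodeMillis + gcMillis);
    }

    public static void main(String[] args) {
        // Illustrative numbers: 99 minutes of user code and 1 minute of GC.
        long userCode = 99L * 60_000;
        long gc = 1L * 60_000;
        System.out.println(throughput(userCode, gc)); // prints 0.99
    }
}
```

With these illustrative numbers the JVM ran for 100 minutes in total and spent 1 of them collecting, giving a throughput of 99%.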

The shorter the pause time, the better suited a collector is to programs that interact with users, since good response times improve the user experience. High throughput uses CPU time efficiently and finishes the program's computation as quickly as possible, which suits background jobs with little interaction.

Shorter pauses are bought at the cost of throughput and young-generation space: the young generation is made smaller, so garbage collection happens more often, and throughput drops.

GC adaptive tuning (GC Ergonomics) can be turned on with a single switch parameter. With it there is no need to manually specify the young-generation size (-Xmn), the ratio of Eden to the Survivor spaces, or the age at which objects are promoted to the old generation. The virtual machine collects performance monitoring information as the system runs and dynamically adjusts these parameters to provide the most suitable pause time or the maximum throughput.

4. Serial Old Collector

It is the old-generation version of the Serial collector and is also intended for virtual machines in Client mode. In Server mode it has two main uses:

    • Paired with the Parallel Scavenge collector in JDK 1.5 and earlier (before Parallel Old existed).
    • As the fallback for the CMS collector, used when a Concurrent Mode Failure occurs during concurrent collection.

5. Parallel Old Collector

It is the old-generation version of the Parallel Scavenge collector.

Where throughput matters most and CPU resources are tight, Parallel Scavenge plus Parallel Old is the combination to consider first.

6. CMS Collector

CMS stands for Concurrent Mark Sweep; "Mark Sweep" refers to the mark-sweep algorithm.

A collection goes through the following four phases:

    • Initial mark: Only marks the objects that GC Roots reference directly. It is fast and requires a pause.
    • Concurrent mark: Traces the object graph from GC Roots. It takes the longest of all the phases in a collection and does not require a pause.
    • Remark: Corrects the mark records of objects whose marking changed during concurrent marking because the user program kept running. It requires a pause.
    • Concurrent sweep: Does not require a pause.

The two longest phases, concurrent mark and concurrent sweep, run alongside the user threads without pausing them.
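The four phases and their pause behaviour can be written down as a small enum. This is only a way to organize the description above; the type and constant names are hypothetical and do not correspond to HotSpot internals.

```java
import java.util.ArrayList;
import java.util.List;

// The four CMS phases described above, tagged by whether user threads are paused.
class CmsPhasesSketch {
    enum Phase {
        INITIAL_MARK(true),       // marks objects directly reachable from GC Roots
        CONCURRENT_MARK(false),   // traces the object graph alongside user threads
        REMARK(true),             // fixes marks that changed during concurrent marking
        CONCURRENT_SWEEP(false);  // reclaims dead objects alongside user threads

        final boolean pausesUserThreads;

        Phase(boolean pausesUserThreads) {
            this.pausesUserThreads = pausesUserThreads;
        }
    }

    // Phases, in order, during which the user program is stopped.
    static List<Phase> pausingPhases() {
        List<Phase> result = new ArrayList<>();
        for (Phase p : Phase.values()) {
            if (p.pausesUserThreads) result.add(p);
        }
        return result;
    }

    public static void main(String[] args) {
        for (Phase p : Phase.values()) {
            System.out.println(p + ": " + (p.pausesUserThreads ? "pause" : "concurrent"));
        }
    }
}
```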

It has the following drawbacks:

    • Low throughput: Short pauses come at the expense of throughput, so CPU utilization is not high enough.
    • It cannot handle floating garbage, and a Concurrent Mode Failure may occur. Floating garbage is the garbage produced during the concurrent sweep phase because the user threads keep running; it cannot be reclaimed until the next GC. Because floating garbage exists, part of the memory has to be held in reserve, which means CMS cannot wait until the old generation is almost full before collecting, as other collectors do. If the reserved memory is not enough to hold the floating garbage, a Concurrent Mode Failure occurs and the virtual machine temporarily falls back to Serial Old in place of CMS.
    • The fragmentation left by the mark-sweep algorithm often means the old generation still has free space overall but no contiguous block large enough for the current object, so a Full GC has to be triggered early.

7. G1 Collector

G1 (Garbage-First) is a garbage collector for server-side applications that performs well with multiple CPUs and large memory. The mission the HotSpot development team gave it is to replace the CMS collector in the future.

The heap is divided into the young generation and the old generation. The other collectors each collect either the whole young generation or the whole old generation, whereas G1 can collect across both generations directly.

G1 divides the heap into multiple independent regions of equal size, so the young generation and the old generation are no longer physically separated.

Introducing regions splits what used to be one whole block of memory into many small spaces, each of which can be garbage collected separately. This partitioning brings a lot of flexibility and makes a predictable pause-time model possible. G1 records, for each region, the time a collection takes and the amount of space it would free (both values come from past collections), maintains a priority list, and on each collection reclaims the most valuable regions that fit within the allowed collection time.
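The selection idea can be sketched as a greedy choice over regions ranked by bytes freed per millisecond of collection time. Everything here is a simplification under stated assumptions: the `Region` fields, the value formula, and the greedy loop are illustrative and are not G1's actual policy code.

```java
import java.util.ArrayList;
import java.util.List;

// Greedy sketch of choosing regions to collect within a pause budget.
class RegionSelectionSketch {
    static class Region {
        final String id;
        final long reclaimableBytes;   // estimated space freed by collecting it
        final double estimatedMillis;  // estimated time to collect it

        Region(String id, long reclaimableBytes, double estimatedMillis) {
            this.id = id;
            this.reclaimableBytes = reclaimableBytes;
            this.estimatedMillis = estimatedMillis;
        }

        // Assumed "value" metric: bytes reclaimed per millisecond spent.
        double value() { return reclaimableBytes / estimatedMillis; }
    }

    // Takes the most valuable regions first, skipping any that would
    // push the estimated total past the pause budget.
    static List<Region> chooseCollectionSet(List<Region> regions, double pauseBudgetMillis) {
        List<Region> sorted = new ArrayList<>(regions);
        sorted.sort((x, y) -> Double.compare(y.value(), x.value()));
        List<Region> chosen = new ArrayList<>();
        double spent = 0;
        for (Region r : sorted) {
            if (spent + r.estimatedMillis <= pauseBudgetMillis) {
                chosen.add(r);
                spent += r.estimatedMillis;
            }
        }
        return chosen;
    }

    public static void main(String[] args) {
        List<Region> regions = List.of(
                new Region("A", 80, 10),   // value 8.0
                new Region("B", 30, 2),    // value 15.0
                new Region("C", 50, 20));  // value 2.5
        for (Region r : chooseCollectionSet(regions, 15)) {
            System.out.println(r.id);      // prints B, then A
        }
    }
}
```

With a 15 ms budget the sketch picks B and A (12 ms in total) and leaves C for a later collection, since adding it would exceed the budget.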

Each region has a Remembered Set, which records the regions containing objects that reference objects in this region. With the Remembered Set, a full heap scan can be avoided during reachability analysis.
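The bookkeeping can be sketched as a map from each region to the set of other regions that point into it. Region ids as integers, the `recordReference` hook standing in for a write barrier, and the class name are all assumptions made for illustration; the real structure is considerably finer-grained.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy remembered sets: for each region, which other regions hold references into it.
class RememberedSetSketch {
    private final Map<Integer, Set<Integer>> incoming = new HashMap<>();

    // Stand-in for a write barrier: called when an object in fromRegion
    // is made to reference an object in toRegion.
    void recordReference(int fromRegion, int toRegion) {
        if (fromRegion == toRegion) return;  // same-region references need no entry
        incoming.computeIfAbsent(toRegion, k -> new HashSet<>()).add(fromRegion);
    }

    // Regions that must be scanned to find references into the target region,
    // instead of scanning the whole heap.
    Set<Integer> regionsToScan(int targetRegion) {
        return incoming.getOrDefault(targetRegion, Set.of());
    }

    public static void main(String[] args) {
        RememberedSetSketch rsets = new RememberedSetSketch();
        rsets.recordReference(1, 3);
        rsets.recordReference(2, 3);
        rsets.recordReference(3, 3);
        System.out.println(rsets.regionsToScan(3)); // only regions 1 and 2 need scanning
    }
}
```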

Leaving aside the work of maintaining the Remembered Set, the operation of the G1 collector can be broadly divided into the following steps:

    • Initial marking
    • Concurrent marking
    • Final marking: To correct the mark records that changed during concurrent marking because the user program kept running, the virtual machine records these object changes in each thread's Remembered Set Logs; the final marking phase merges the Remembered Set Logs data into the Remembered Set. This phase pauses the user threads but can be executed in parallel.
    • Selective collection: First the regions are sorted by their reclamation value and cost, and a collection plan is drawn up based on the GC pause time the user expects. This phase could in principle run concurrently with the user program, but because only some of the regions are reclaimed and the time is user-controllable, pausing the user threads greatly improves collection efficiency.

It has the following characteristics:

    • Space compaction: Viewed as a whole, G1 is a collector based on the mark-compact algorithm; viewed locally (between two regions), it is based on the copying algorithm. Either way, it produces no memory fragmentation while running.
    • Predictable pauses: Lets the user explicitly specify that, within any time slice of M milliseconds, no more than N milliseconds may be spent on GC.
