Summary of JVM garbage collection (GC)

Last Update:2014-09-09 Source: Internet

Author: User

Basic collection algorithms

1. Reference counting
An older collection algorithm. Each object keeps a count of the references to it: adding a reference increments the count, and removing a reference decrements it. During garbage collection, only objects whose count is zero are reclaimed. The most critical weakness of this algorithm is that it cannot handle circular references.
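The circular-reference weakness can be shown with a toy model. This is a minimal sketch, not how any JVM works internally; the `Obj` class and the `addRef`, `release`, and `link` helpers are hypothetical names made up for the illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of reference counting; all names are hypothetical, not JVM internals.
class RefCountDemo {
    static final class Obj {
        final String name;
        int refCount = 0;
        final List<Obj> fields = new ArrayList<>();
        Obj(String name) { this.name = name; }
    }

    // Adding a reference increments the target's count.
    static void addRef(Obj target) { target.refCount++; }

    // Stores a reference to 'to' inside 'from'.
    static void link(Obj from, Obj to) { from.fields.add(to); addRef(to); }

    // Removing a reference decrements the count; at zero the object is
    // reclaimed and the references it holds are released in turn.
    static void release(Obj target, List<String> reclaimed) {
        if (--target.refCount == 0) {
            reclaimed.add(target.name);
            for (Obj child : target.fields) release(child, reclaimed);
            target.fields.clear();
        }
    }

    // a -> b only: dropping the external references reclaims both objects.
    static List<String> reclaimedWithoutCycle() {
        Obj a = new Obj("a"), b = new Obj("b");
        addRef(a); addRef(b);      // external (root) references
        link(a, b);
        List<String> reclaimed = new ArrayList<>();
        release(a, reclaimed);
        release(b, reclaimed);
        return reclaimed;
    }

    // a -> b -> a: each object keeps the other's count above zero forever.
    static List<String> reclaimedAfterCycle() {
        Obj a = new Obj("a"), b = new Obj("b");
        addRef(a); addRef(b);      // external (root) references
        link(a, b); link(b, a);    // the cycle
        List<String> reclaimed = new ArrayList<>();
        release(a, reclaimed);
        release(b, reclaimed);
        return reclaimed;
    }

    public static void main(String[] args) {
        System.out.println("without cycle: " + reclaimedWithoutCycle());
        System.out.println("with cycle:    " + reclaimedAfterCycle());
    }
}
```

In the cyclic case nothing is reclaimed even though no root refers to either object, which is the leak the paragraph describes.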
2. Mark-sweep
This algorithm runs in two phases. In the first phase, all referenced objects are marked, starting from the reference root nodes. In the second phase, the whole heap is traversed and unmarked objects are cleared. This algorithm has to suspend the entire application, and it produces memory fragmentation.
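The two phases can be sketched over a small object graph. This is an illustrative model under simplifying assumptions (the heap is a plain list, and `Obj`, `mark`, and `sweep` are hypothetical names), not a real collector.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

// Toy mark-sweep over an object graph; names are hypothetical.
class MarkSweepDemo {
    static final class Obj {
        final String name;
        boolean marked;
        final List<Obj> refs = new ArrayList<>();
        Obj(String name) { this.name = name; }
    }

    // Phase 1: mark everything reachable from the roots.
    static void mark(List<Obj> roots) {
        Deque<Obj> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Obj o = pending.pop();
            if (o.marked) continue;
            o.marked = true;
            pending.addAll(o.refs);
        }
    }

    // Phase 2: walk the whole heap, free unmarked objects, reset the marks.
    static List<String> sweep(List<Obj> heap) {
        List<String> freed = new ArrayList<>();
        for (Iterator<Obj> it = heap.iterator(); it.hasNext(); ) {
            Obj o = it.next();
            if (o.marked) {
                o.marked = false;
            } else {
                freed.add(o.name);
                it.remove();
            }
        }
        return freed;
    }

    static List<String> collectExample() {
        Obj root = new Obj("root"), a = new Obj("a"), x = new Obj("x"), y = new Obj("y");
        root.refs.add(a);
        x.refs.add(y);
        y.refs.add(x);             // an unreachable cycle
        List<Obj> heap = new ArrayList<>(Arrays.asList(root, a, x, y));
        mark(Collections.singletonList(root));
        return sweep(heap);
    }

    public static void main(String[] args) {
        System.out.println("freed: " + collectExample());
    }
}
```

Unlike reference counting, the unreachable cycle `x`/`y` is freed, because reachability is decided from the roots rather than from per-object counts.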
3. Copying
This algorithm divides the memory space into two equal regions and uses only one of them at a time. During garbage collection, it traverses the region in use and copies the live objects to the other region. Because it only processes live objects, the copying cost is relatively small. The surviving objects are also laid out contiguously as they are copied, so there is no "fragmentation" problem. The disadvantage is equally obvious: the algorithm needs twice the memory space.
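A rough sketch of the copying step follows. It assumes a simplified heap where the to-space is a list and a forwarding table maps each old object to its copy; `Obj`, `forward`, and `copyLive` are hypothetical names for this illustration only.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

// Toy semispace copy; Obj and the helper names are hypothetical.
class CopyingDemo {
    static final class Obj {
        final String name;
        final List<Obj> refs = new ArrayList<>();
        Obj(String name) { this.name = name; }
    }

    // Returns the copy of 'old' in to-space, creating it on the first visit.
    static Obj forward(Obj old, List<Obj> toSpace, Map<Obj, Obj> forwarded) {
        Obj moved = forwarded.get(old);
        if (moved == null) {
            moved = new Obj(old.name);
            moved.refs.addAll(old.refs); // still from-space references; fixed by the scan
            forwarded.put(old, moved);
            toSpace.add(moved);
        }
        return moved;
    }

    // Copies everything reachable from the roots into a fresh to-space.
    // Unreachable objects are never visited, so the cost depends on live objects only.
    static List<Obj> copyLive(List<Obj> roots) {
        List<Obj> toSpace = new ArrayList<>();
        Map<Obj, Obj> forwarded = new IdentityHashMap<>();
        for (Obj root : roots) forward(root, toSpace, forwarded);
        // Scan the copies in order and redirect their references into to-space.
        for (int scan = 0; scan < toSpace.size(); scan++) {
            List<Obj> refs = toSpace.get(scan).refs;
            for (int i = 0; i < refs.size(); i++) {
                refs.set(i, forward(refs.get(i), toSpace, forwarded));
            }
        }
        return toSpace;
    }

    static List<String> liveNamesExample() {
        Obj root = new Obj("root"), a = new Obj("a"), dead = new Obj("dead");
        root.refs.add(a);
        a.refs.add(root);   // cycles are handled by the forwarding table
        dead.refs.add(a);   // 'dead' is unreachable and is simply left behind
        List<String> names = new ArrayList<>();
        for (Obj o : copyLive(Collections.singletonList(root))) names.add(o.name);
        return names;
    }

    public static void main(String[] args) {
        System.out.println("to-space after copy: " + liveNamesExample());
    }
}
```

The to-space ends up holding only the live objects, packed one after another, which is why this scheme leaves no gaps behind.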
4. Mark-compact
This algorithm combines the advantages of the "mark-sweep" and "copying" algorithms. It also runs in two phases. In the first phase, all referenced objects are marked, starting from the root nodes. In the second phase, the whole heap is traversed, unmarked objects are cleared, and the surviving objects are compacted toward one end of the heap and laid out in order. This algorithm avoids the fragmentation problem of "mark-sweep" and also avoids the space cost of the "copying" algorithm.
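The compaction phase can be pictured as sliding marked slots toward the front of the heap. The sketch below assumes a heap of fixed-size slots and an already computed mark array; both are simplifications, and `compact` is a hypothetical helper.

```java
import java.util.Arrays;

// Toy sliding compaction over fixed-size slots; names are hypothetical.
class CompactDemo {
    // heap[i] holds an object name, marked[i] is the result of the mark phase.
    // Moves the marked objects to the front in their original order and
    // returns the index of the first free slot. The mark array is not updated.
    static int compact(String[] heap, boolean[] marked) {
        int free = 0;
        for (int i = 0; i < heap.length; i++) {
            if (marked[i]) {
                heap[free++] = heap[i];
            }
        }
        Arrays.fill(heap, free, heap.length, null); // everything behind is one free block
        return free;
    }

    public static void main(String[] args) {
        String[] heap = {"a", "b", "c", "d", "e"};
        boolean[] marked = {true, false, true, false, true};
        int firstFree = compact(heap, marked);
        System.out.println(Arrays.toString(heap) + ", first free slot: " + firstFree);
    }
}
```

After compaction the free space is a single contiguous block, without needing a second region of the same size.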
5. Incremental collecting
Garbage collection is carried out while the application is running. I do not know why the collectors in JDK 5.0 do not use this algorithm.
6. Generational collecting

A garbage collection algorithm derived from analyzing object lifecycles. Objects are divided into the young generation, the old generation, and the permanent generation, and a different algorithm (one of those above) is used to reclaim objects in each. Garbage collectors since J2SE 1.2 have used this algorithm.

A. Young (young generation)
The young generation is divided into three areas: one Eden area and two Survivor areas. Most objects are created in Eden. When Eden fills up, the surviving objects are copied to one of the two Survivor areas. When that Survivor area fills up, its surviving objects are copied to the other Survivor area. When that one fills up in turn, objects that were copied from the first Survivor area and are still alive are copied to the old generation ("tenured"). Note that the two Survivor areas are symmetric and have no fixed order, so the same Survivor area may hold both objects copied from Eden and objects copied from the other Survivor area; only objects that came from the first Survivor area are copied to the old generation. In addition, one of the Survivor areas is always empty.
B. Tenured (old generation)
The old generation stores objects that have survived the young generation. In general, it holds objects with long lifetimes.
C. Perm (permanent generation)
Used to store static data, currently Java classes, methods, and the like. The permanent generation has no significant impact on garbage collection, but some applications, such as those using Hibernate, generate or load classes dynamically. In that case a fairly large permanent generation is needed to hold the classes added at run time. Its size is set with -XX:MaxPermSize=<n>.
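To see how a running JVM actually divides its memory, the standard `java.lang.management` API can list the memory pools. This is a minimal sketch; the pool names it prints depend on the JVM version and the collector in use, and on JDK 8 and later the permanent generation no longer exists (class metadata lives in Metaspace instead), so the output will not match the three generations above exactly.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

// Lists the memory pools of the running JVM. Pool names vary by JVM and collector.
class MemoryPoolsDemo {
    static List<String> describePools() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();
            if (usage == null) continue; // the pool is no longer valid
            // getMax() returns -1 when no maximum is defined for the pool.
            String max = usage.getMax() < 0 ? "undefined" : (usage.getMax() / 1024) + "K";
            lines.add(String.format("%-35s %-16s used=%dK max=%s",
                    pool.getName(), pool.getType(), usage.getUsed() / 1024, max));
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : describePools()) System.out.println(line);
    }
}
```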

GC types

1. Scavenge GC
Generally, a Scavenge GC is triggered when a new object is created and allocating space for it in Eden fails. It collects the Eden area of the heap, clears the dead objects, and moves the surviving objects to a Survivor area. It then tidies up the two Survivor areas.
2. Full GC
Collects the entire heap, including the young, tenured, and perm areas. A Full GC is slower than a Scavenge GC, so the number of Full GCs should be kept as low as possible. A Full GC may occur for the following reasons:
* The tenured area is full.
* The perm area is full.
* System.gc() is called explicitly.
* The heap's allocation policy for the individual areas changed dynamically after the last GC.
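The effect of an explicit `System.gc()` call can be observed through the collection counters exposed by `GarbageCollectorMXBean`. The sketch below is only an illustration: `System.gc()` is a request that the JVM may ignore (for example when started with `-XX:+DisableExplicitGC`), so the counter is not guaranteed to change.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Compares the total number of collections before and after System.gc().
class ExplicitGcDemo {
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount(); // -1 if the count is undefined
            if (count > 0) total += count;
        }
        return total;
    }

    public static void main(String[] args) {
        long before = totalCollections();
        System.gc(); // only a request; the JVM is free to ignore it
        long after = totalCollections();
        System.out.println("collections before: " + before + ", after: " + after);
    }
}
```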

Garbage collectors

1. Serial collector
A single thread handles all garbage collection work. Because no interaction between threads is needed, it is relatively efficient. However, it cannot take advantage of multiple processors, so this collector suits single-processor machines. It can also be used on a multi-processor machine when the amount of data is small (in the MB range). Enable it with -XX:+UseSerialGC.
2. Parallel collector
1. Collects the young generation in parallel, which reduces garbage collection time. It is generally used on multi-threaded, multi-processor machines. Enable it with -XX:+UseParallelGC. The parallel collector was introduced in a J2SE 5.0 update and enhanced in Java SE 6.0 so that it can also collect the old generation in parallel. If parallel collection is not enabled for the old generation, the old generation is collected by a single thread, which limits scalability. Enable it with -XX:+UseParallelOldGC.
2. Use -XX:ParallelGCThreads=<n> to set the number of parallel garbage collection threads. This value can be set equal to the number of processors in the machine.
3. This collector can be configured as follows:
* Maximum garbage collection pause: the longest pause allowed for a garbage collection, specified in milliseconds with -XX:MaxGCPauseMillis=<n>. If this value is set, the heap size and the garbage collection parameters are adjusted to try to meet it. Setting this value may reduce application throughput.
* Throughput: the ratio of garbage collection time to non-collection time, set with -XX:GCTimeRatio=<n>. The fraction of time spent in garbage collection is 1/(1+n). For example, -XX:GCTimeRatio=19 means that 5% of the time is used for garbage collection. The default value is 99, that is, 1% of the time is used for garbage collection.
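The throughput formula is easy to check with a few lines of arithmetic. The helper name `gcTimeFraction` is made up for this sketch; it only evaluates 1/(1+n) and does not read any JVM setting.

```java
// Evaluates the 1/(1+n) formula for -XX:GCTimeRatio=<n>.
class GcTimeRatioDemo {
    // Fraction of total time the collector is allowed to use for a given n.
    static double gcTimeFraction(int n) {
        return 1.0 / (1 + n);
    }

    public static void main(String[] args) {
        int[] ratios = {19, 99};
        for (int n : ratios) {
            System.out.printf("GCTimeRatio=%d -> %.1f%% of time in GC%n",
                    n, 100 * gcTimeFraction(n));
        }
    }
}
```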

3. Concurrent collector
Most of its work is done concurrently (the application keeps running), and garbage collection pauses the application only briefly. This collector suits medium and large applications with strict response time requirements. Enable it with -XX:+UseConcMarkSweepGC.
1. The concurrent collector mainly reduces old generation pause times. It uses a separate garbage collection thread to trace reachable objects without stopping the application. In each old generation collection cycle, the concurrent collector briefly pauses the entire application at the start of the collection and pauses it again partway through. The second pause is somewhat longer than the first, and multiple threads perform garbage collection work at the same time during it.
2. The concurrent collector trades processor time for shorter pauses. On a system with N processors, the concurrent part of the collection uses K of the N available processors, where generally 1 <= K <= N/4.
3. On a host with only one processor, the concurrent collector can still achieve short pauses if it is set to incremental mode.
4. Floating garbage: because the application keeps running during garbage collection, some garbage may be produced just as the collection finishes. This "floating garbage" can only be reclaimed in the next collection cycle. For this reason the concurrent collector generally needs about 20% of the space reserved for floating garbage.
5. Concurrent mode failure: the concurrent collector collects while the application is running, so the heap must have enough space for the application during the collection. Otherwise the heap fills up before the collection completes. When that happens, a "concurrent mode failure" occurs and the entire application is paused for garbage collection.
6. Starting the concurrent collector: because concurrent collection happens while the application is running, there must be enough memory left for the program to use until the collection completes, otherwise a "concurrent mode failure" occurs. Set -XX:CMSInitiatingOccupancyFraction=<n> to specify the old generation occupancy, as a percentage, at which concurrent collection starts.
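The meaning of the initiating occupancy threshold can be sketched as a simple percentage check. This is a toy model of the idea only, not the actual HotSpot heuristic; `shouldStartConcurrentCycle` and its parameters are hypothetical.

```java
// Toy model of an occupancy threshold such as -XX:CMSInitiatingOccupancyFraction=<n>.
class CmsTriggerDemo {
    // True once the old generation is at least 'initiatingOccupancyPercent' full.
    static boolean shouldStartConcurrentCycle(long usedBytes, long capacityBytes,
                                              int initiatingOccupancyPercent) {
        return usedBytes * 100 >= capacityBytes * (long) initiatingOccupancyPercent;
    }

    public static void main(String[] args) {
        long capacity = 1000L * 1024 * 1024; // a hypothetical 1000 MB old generation
        long[] usedMb = {500, 699, 700, 900};
        for (long mb : usedMb) {
            boolean start = shouldStartConcurrentCycle(mb * 1024 * 1024, capacity, 70);
            System.out.println(mb + " MB used -> start concurrent cycle: " + start);
        }
    }
}
```

A lower threshold starts collection earlier and leaves more headroom for allocation and floating garbage while the cycle runs, at the cost of collecting more often.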

4. Summary
* Serial collector:
-- Applicability: applications with a small data volume (in the MB range), running on a single processor, with no response time requirements.
-- Disadvantage: can only be used for small applications.
* Parallel collector:
-- Applicability: medium and large applications with high throughput requirements, multiple CPUs, and no response time requirements. Examples: background processing and scientific computing.
-- Disadvantage: application response times may be long.
* Concurrent collector:
-- Applicability: medium and large applications with multiple CPUs and strict response time requirements. Examples: web servers/application servers, telecom switching, and integrated development environments.

Optimization Summary

1. Young generation size selection
* Response-time-first applications: make it as large as possible, up to the point where it approaches the system's minimum response time limit (depending on the actual situation). At that size, young generation collections are also least frequent, and fewer objects reach the old generation.
* Throughput-first applications: make it as large as possible, possibly into the gigabyte range. Because there is no response time requirement, garbage collection can run in parallel, which generally suits machines with 8 or more CPUs.

2. Old generation size selection
* Response-time-first applications: the old generation uses the concurrent collector, so its size needs to be set carefully, generally taking into account parameters such as the concurrent session rate and session duration. If the heap is set too small, it can cause memory fragmentation, a high collection frequency, and application pauses while the collector falls back to the traditional mark-sweep method. If the heap is too large, collections take a long time. To find the best setting, you generally need the following data:
-- Concurrent garbage collection information
-- Number of concurrent collections of the permanent generation
-- Traditional GC information
-- Proportion of time spent collecting the young generation and the old generation
Reducing the time spent collecting the young and old generations generally improves application efficiency.
* Throughput-first applications: these generally have a large young generation and a smaller old generation. This way most short-lived objects are reclaimed early and fewer medium-lived objects are promoted, while the old generation holds only long-lived objects.
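Some of the data listed above for response-time-first applications, in particular the time spent per collector, can be read from a running JVM with the standard `java.lang.management` API. This is a rough sketch: which collector names appear, and which of them corresponds to the young or old generation, depends on the JVM and the collector selected.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.LinkedHashMap;
import java.util.Map;

// Reports collection count and accumulated collection time per collector.
class GcStatsDemo {
    // Maps collector name -> {collection count, collection time in ms}.
    static Map<String, long[]> snapshot() {
        Map<String, long[]> stats = new LinkedHashMap<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Both getters return -1 when the value is undefined; clamp to 0.
            long count = Math.max(0, gc.getCollectionCount());
            long timeMs = Math.max(0, gc.getCollectionTime());
            stats.put(gc.getName(), new long[] {count, timeMs});
        }
        return stats;
    }

    public static void main(String[] args) {
        long uptimeMs = ManagementFactory.getRuntimeMXBean().getUptime();
        for (Map.Entry<String, long[]> e : snapshot().entrySet()) {
            long count = e.getValue()[0];
            long timeMs = e.getValue()[1];
            double share = uptimeMs > 0 ? 100.0 * timeMs / uptimeMs : 0.0;
            System.out.printf("%-25s collections=%d time=%dms (%.2f%% of uptime)%n",
                    e.getKey(), count, timeMs, share);
        }
    }
}
```

Sampling these values periodically and comparing the per-collector times gives the young versus old proportion mentioned in the list.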

3. Fragmentation problems caused by small heaps
Because the old generation's concurrent collector uses the mark-sweep algorithm, the heap is not compacted. When the collector reclaims objects, it merges adjacent free space so that it can be allocated to larger objects. However, when the heap is small, "fragments" appear after the application has run for a while. If the concurrent collector cannot find enough space, it stops and the traditional mark-sweep method is used for collection instead. If "fragments" appear, the following settings may be needed:
* -XX:+UseCMSCompactAtFullCollection: enables compaction of the old generation when the concurrent collector is used.
* -XX:CMSFullGCsBeforeCompaction=0: with the option above enabled, sets how many Full GCs occur before the old generation is compacted.
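Why fragmentation makes allocation fail even when plenty of memory is free can be shown with a slot-based toy heap. The model below is an assumption made for illustration (fixed-size slots, a boolean per slot); `largestFreeRun` and `compact` are hypothetical helpers.

```java
import java.util.Arrays;

// Toy heap of fixed-size slots: true means the slot is in use.
class FragmentationDemo {
    static int totalFree(boolean[] used) {
        int free = 0;
        for (boolean u : used) if (!u) free++;
        return free;
    }

    // Length of the longest run of adjacent free slots, i.e. the largest
    // object that can be allocated without compaction.
    static int largestFreeRun(boolean[] used) {
        int best = 0, run = 0;
        for (boolean u : used) {
            run = u ? 0 : run + 1;
            best = Math.max(best, run);
        }
        return best;
    }

    // Returns the layout after compaction: all used slots packed at the front.
    static boolean[] compact(boolean[] used) {
        boolean[] result = new boolean[used.length];
        Arrays.fill(result, 0, used.length - totalFree(used), true);
        return result;
    }

    public static void main(String[] args) {
        boolean[] heap = {true, false, true, false, true, false, true, false};
        System.out.println("free slots: " + totalFree(heap)
                + ", largest free run: " + largestFreeRun(heap));
        System.out.println("largest free run after compaction: "
                + largestFreeRun(compact(heap)));
    }
}
```

Half of this heap is free, yet no object larger than one slot fits until the live slots are compacted, which is the situation the two options above are meant to address.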

Reference: http://chenchendefeng.iteye.com/blog/455883

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
