JVM garbage collection algorithm summary

Last Update:2014-09-10 Source: Internet

Author: User


Let's first look at the memory areas of the JVM at run time:

1. Method area (Perm, the permanent generation; non-heap)

2. Virtual machine stack

3. Native method stack

4. Heap

5. Program counter

The first question is: how does the JVM know which objects need to be recycled?

Currently there are two identification algorithms, three collection algorithms, two clearing algorithms, and three collectors.

Reference counting method

Each object has a reference counter. Every time a reference to the object is created, the counter is incremented by 1; every time a reference is released, the counter is decremented by 1. When the counter reaches 0, the object can be recycled.

The references can be represented as a directed graph. An object that another object still references has a non-zero count and is not recycled. An object that nothing references cannot be reached by any path; its reference count is 0, so it is recycled.

However, this algorithm has an obvious defect: objects involved in a circular reference are never recycled. For example, if objects A and B reference each other and no other object references A or B, neither A nor B will be recycled.
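The circular-reference defect can be illustrated with a minimal sketch. The RcObject class and its addRef/removeRef methods are hypothetical names invented for this example; they model a counter by hand and are not part of any JVM API.

```java
import java.util.ArrayList;
import java.util.List;

class RefCountDemo {
    // Toy object with a hand-maintained reference counter (hypothetical model).
    static class RcObject {
        final String name;
        int refCount = 0;                                // number of incoming references
        final List<RcObject> fields = new ArrayList<>(); // outgoing references

        RcObject(String name) { this.name = name; }

        // Storing a reference to target increments target's counter.
        void addRef(RcObject target) {
            fields.add(target);
            target.refCount++;
        }

        // Releasing a reference to target decrements target's counter.
        void removeRef(RcObject target) {
            if (fields.remove(target)) {
                target.refCount--;
            }
        }

        boolean isCollectable() { return refCount == 0; }
    }

    public static void main(String[] args) {
        RcObject root = new RcObject("root");
        RcObject a = new RcObject("A");
        RcObject b = new RcObject("B");

        root.addRef(a);    // root -> A
        a.addRef(b);       // A -> B
        b.addRef(a);       // B -> A, closing the cycle

        root.removeRef(a); // the only reference from outside the cycle is released

        // Both counters stay at 1 although nothing outside the cycle reaches A or B.
        System.out.println("A collectable: " + a.isCollectable()); // prints "A collectable: false"
        System.out.println("B collectable: " + b.isCollectable()); // prints "B collectable: false"
    }
}
```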

Root Search Algorithm

This algorithm defines several root objects that the JVM never recycles. Objects referenced, directly or indirectly, by these roots are in use; objects that cannot be reached from them are the objects to be recycled. Simply put: if an object can be reached from a root, it will not be recycled; if it cannot, it will be recycled.

For example, if object D cannot be reached from any root object, it will be recycled.

The following objects are considered root objects:
* Classes loaded by the bootstrap class loader and the objects they create
* Objects referenced by static variables in the method area of the JVM runtime
* Objects referenced by the constant pool in the method area of the JVM runtime
* Objects referenced by the local variable table of the VM stack of the currently running JVM threads
* Objects referenced by the native method stack (JNI)

Even if two objects reference each other, they will still be recycled if neither can be reached from a root. For example, objects C and D reference each other, but they are recycled because they cannot be reached from a root.

The JVM uses the root search algorithm to determine whether to recycle objects.
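As a rough illustration of the root search idea, the sketch below walks a hand-built reference graph from a root and collects every object it can reach. The graph, the object names, and the reachable method are assumptions made for this example; a real JVM traces actual heap objects rather than strings.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class RootSearchDemo {
    // Returns the names of all objects reachable from the given roots.
    static Set<String> reachable(Map<String, List<String>> refs, List<String> roots) {
        Set<String> marked = new HashSet<>();
        Deque<String> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            String current = pending.pop();
            if (marked.add(current)) { // first visit: follow its outgoing references
                pending.addAll(refs.getOrDefault(current, List.of()));
            }
        }
        return marked;
    }

    public static void main(String[] args) {
        Map<String, List<String>> refs = new HashMap<>();
        refs.put("root", List.of("A"));
        refs.put("A", List.of("B"));
        refs.put("C", List.of("D")); // C and D reference each other,
        refs.put("D", List.of("C")); // but no root leads to either of them

        Set<String> live = reachable(refs, List.of("root"));
        for (String name : List.of("A", "B", "C", "D")) {
            System.out.println(name + (live.contains(name) ? " is kept" : " is recycled"));
        }
    }
}
```

Unlike the reference counting method, the cycle between C and D does not keep them alive here, because liveness is decided by reachability from a root only.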

The references discussed so far in the root search algorithm are all strong references. A strong reference is established when an object is created with new and explicitly assigned to a variable.

Object obj = new Object();

The statement above establishes a strong reference: the variable obj strongly references an instance of Object.

Java has four kinds of references, from strong to weak:

Strong Reference -> Soft Reference -> Weak Reference -> Phantom Reference

Strong Reference: the referenced object is identified as recyclable, and can be garbage collected, only when it can no longer be reached from a root.

Soft Reference: an object reachable only through soft references is kept while memory is sufficient, but it will be recycled if the JVM heap memory is insufficient.

Weak Reference: an object reachable only through weak references is recycled the next time the garbage collector finds it, regardless of how much memory is available.

Phantom Reference: it never gives access to the object. After the collector determines that the referenced object can be recycled, the reference is added to its reference queue.
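The reference kinds map onto classes in the standard java.lang.ref package. The sketch below only creates one reference of each kind and prints what get() returns. Because System.gc() is just a request, the values printed after it depend on the JVM and are deliberately not annotated with an expected result.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;

class ReferenceKindsDemo {
    public static void main(String[] args) {
        Object strong = new Object();                                     // strong reference
        SoftReference<byte[]> soft = new SoftReference<>(new byte[1024]); // kept while memory allows
        WeakReference<Object> weak = new WeakReference<>(strong);
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        PhantomReference<Object> phantom = new PhantomReference<>(strong, queue);

        // While a strong reference exists, the weak reference still returns the object.
        System.out.println("weak sees object: " + (weak.get() == strong));
        // A phantom reference never hands out its referent.
        System.out.println("phantom.get(): " + phantom.get());

        strong = null; // only the weak and phantom references remain
        System.gc();   // a request only; the JVM may ignore it

        System.out.println("weak.get() after gc request: " + weak.get());
        System.out.println("soft still present: " + (soft.get() != null));
        System.out.println("phantom enqueued: " + (queue.poll() != null));
    }
}
```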

Mark-clear

Mark-clear (commonly called mark-sweep) is the simplest collection algorithm. It marks the objects to be recycled and then clears those objects from memory. How objects are identified for recycling is described in the previous section.

Mark-clear-compression

This algorithm builds on mark-clear by additionally moving objects to compact the space. Mark-clear leaves many memory fragments behind, and as the number of fragments grows, memory read/write performance suffers seriously. Therefore, after the mark-clear step, the fragments are tidied up. The simplest arrangement is to compact the surviving objects toward one side and leave the free space on the other side. Because compacting takes time, the garbage collection time increases.
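A toy model can make the difference between clearing and compacting concrete. In the sketch below the heap is an array of slots, and the set of live objects is assumed to come from an earlier marking step. All names are invented for this example.

```java
import java.util.Arrays;
import java.util.Set;

class MarkCompactDemo {
    // Clear: empty every slot whose object was not marked live. Gaps are left behind.
    static void sweep(String[] heap, Set<String> live) {
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] != null && !live.contains(heap[i])) {
                heap[i] = null;
            }
        }
    }

    // Compact: slide survivors to the low end so the free space is contiguous.
    // Returns the index of the first free slot.
    static int compact(String[] heap) {
        int next = 0;
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] != null) {
                String obj = heap[i];
                heap[i] = null;
                heap[next++] = obj;
            }
        }
        return next;
    }

    public static void main(String[] args) {
        String[] heap = {"A", "B", "C", "D", "E", null};
        Set<String> live = Set.of("A", "C", "E"); // result of a previous mark step

        sweep(heap, live);
        System.out.println(Arrays.toString(heap)); // prints [A, null, C, null, E, null]

        int firstFree = compact(heap);
        System.out.println(Arrays.toString(heap)); // prints [A, C, E, null, null, null]
        System.out.println(firstFree);             // prints 3
    }
}
```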

Mark-clear-copy

In this algorithm, memory is divided into two spaces. One space (A) holds the objects in normal use, and the other (B) is reserved for garbage collection. During collection, all surviving objects in space A are copied to space B, and space A is then cleared in one pass. This algorithm is more efficient than mark-clear-compression, but it requires two spaces, so it needs more memory and memory utilization is low. It suits short-lived objects; long-lived objects would be copied again and again, which lowers efficiency.
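The two-space scheme can be sketched as follows. SemiSpaceHeap is a hypothetical class written for this example, and the set of live objects is again assumed to be known from a marking step.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

class CopyingDemo {
    // Toy heap split into two spaces, only one of which holds objects at a time.
    static class SemiSpaceHeap {
        private List<String> active = new ArrayList<>();  // space A: objects in use
        private List<String> reserve = new ArrayList<>(); // space B: empty until a collection

        void allocate(String obj) { active.add(obj); }

        // Copy survivors to the reserve space, clear the old space in one step,
        // then swap the roles of the two spaces.
        void collect(Set<String> live) {
            for (String obj : active) {
                if (live.contains(obj)) {
                    reserve.add(obj);
                }
            }
            active.clear();
            List<String> emptied = active;
            active = reserve;
            reserve = emptied;
        }

        List<String> objects() { return new ArrayList<>(active); }
    }

    public static void main(String[] args) {
        SemiSpaceHeap heap = new SemiSpaceHeap();
        heap.allocate("A");
        heap.allocate("B");
        heap.allocate("C");

        heap.collect(Set.of("A", "C"));     // B did not survive
        System.out.println(heap.objects()); // prints [A, C]
    }
}
```

The cost of a collection is proportional to the number of survivors, which is why the approach fits spaces where most objects die young.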

Because current processors are multi-core and far more powerful than before, multi-threaded variants of the garbage collection algorithms have been developed on this basis. There are two main ones:

Parallel mark-clear

Parallel means that garbage collection, formerly done by a single thread, is now done by multiple garbage collection threads that mark and clear in parallel. For example, the objects to be marked are divided evenly among multiple threads, and after marking is complete, multiple threads do the clearing.
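The idea of dividing the marking work evenly among threads can be sketched with a thread pool. This is a simplification under stated assumptions: objects are plain integer ids, and "marking" is reduced to a membership check against a known set of live ids, whereas a real collector traces references. The method name markInParallel is invented for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ParallelMarkDemo {
    // Splits the candidates into equal slices and lets each worker mark its own slice.
    static Set<Integer> markInParallel(List<Integer> candidates, Set<Integer> liveIds, int workers) {
        Set<Integer> marked = ConcurrentHashMap.newKeySet(); // safe for concurrent adds
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<?>> tasks = new ArrayList<>();
            int slice = (candidates.size() + workers - 1) / workers;
            for (int w = 0; w < workers; w++) {
                int from = Math.min(w * slice, candidates.size());
                int to = Math.min(from + slice, candidates.size());
                List<Integer> part = candidates.subList(from, to);
                tasks.add(pool.submit(() -> {
                    for (Integer id : part) {
                        if (liveIds.contains(id)) {
                            marked.add(id);
                        }
                    }
                }));
            }
            for (Future<?> task : tasks) {
                task.get(); // wait until every marker thread has finished
            }
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("marking failed", e);
        } finally {
            pool.shutdown();
        }
        return marked;
    }

    public static void main(String[] args) {
        List<Integer> ids = List.of(1, 2, 3, 4, 5, 6, 7);
        Set<Integer> marked = markInParallel(ids, Set.of(2, 4, 6), 3);
        System.out.println("marked " + marked.size() + " of " + ids.size() + " objects");
    }
}
```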

Concurrent mark-clear

Concurrent means that the application and garbage collection can execute at the same time. In the plain mark-clear algorithms, the application must be paused while objects are marked, cleared, and compacted. The concurrent mark-clear algorithm builds on the mark-clear algorithm and is divided into the following phases:

Initial mark -> concurrent mark -> re-mark -> concurrent clear -> reset

The algorithms above are the basic garbage collection algorithms on which JVM garbage collection is built. Based on them, Java garbage collectors can be divided into the following types:

Serial collector

A single thread handles all garbage collection work. This is efficient because no interaction between threads is required, but it cannot take advantage of multiple processors, so this collector suits single-processor machines.

Because it is single-threaded, it is relatively inefficient on today's multi-core servers. It is suitable when the heap is small.

Parallel collector

Multiple threads handle the garbage collection work, taking advantage of multi-core processors. However, if there are too many threads, frequent scheduling among them can also hurt performance. The number of parallel collection threads is generally the number of processors.

Suitable for: medium and large applications with high throughput requirements, multiple CPUs, and no application response time requirements. Example: background processing and scientific computing.

Concurrent collector

The concurrent collector mainly reduces the pause time of old generation collections. It uses independent garbage collection threads to trace reachable objects without stopping the application. In each old generation collection cycle, the concurrent collector pauses the entire application briefly at the start of collection (the initial mark phase) and pauses it again during collection (the re-mark phase). The second pause is slightly longer than the first, and during it multiple threads perform garbage collection work at the same time.

The concurrent collector trades processor resources for shorter pauses. On a system with N processors, the concurrent part of the collection uses K/N of the available processors, where generally 1 <= K <= N/4.

When the concurrent collector is used on a host with only one processor, you can set it to the incremental mode to get a short pause time.

Floating garbage: because the application keeps running while garbage collection is in progress, new garbage may have been produced by the time the collection finishes. This "floating garbage" can only be recycled in the next garbage collection cycle. Therefore, the concurrent collector generally needs about 20% of the space reserved for floating garbage.

Concurrent mode failure: the concurrent collector collects while the application is running, so the heap must have enough space for the program during the collection. If the heap fills up before the collection completes, a "concurrent mode failure" occurs, and the entire application is paused for garbage collection.

Concurrent collector: the collector that uses the concurrent mark-clear algorithm during garbage collection.

Suitable for: large applications with strict response time requirements and multiple CPUs, for example web servers handling page requests. It is commonly used in front-end business systems.

Serial collector:

-- Applicability: applications with a small data volume (on the order of megabytes) running on a single processor, with no response time requirements.

-- Disadvantage: it can only be used for small applications.

Parallel collector:

-- Applicability: medium and large applications with high throughput requirements, multiple CPUs, and no application response time requirements. Example: background processing and scientific computing.

-- Disadvantage: application response time may be extended during garbage collection.

Concurrent collector:

-- Applicability: medium and large applications with multiple CPUs and strict requirements on application response time. Example: web server/application server, telecom exchange, and integrated development environment.
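To check which collectors a running JVM actually uses, the standard java.lang.management API can list them. The sketch below is only an inspection aid. The collector names it prints differ between JVM versions. The HotSpot flags -XX:+UseSerialGC, -XX:+UseParallelGC, and -XX:+UseConcMarkSweepGC, which selected the three collector types on JVMs of that era, are mentioned here as an assumption about the reader's JVM and not as something this article specifies.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

class CollectorInfo {
    // Names of the collectors the current JVM reports; values vary by JVM and flags.
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName()
                    + ": collections=" + gc.getCollectionCount()
                    + ", total time ms=" + gc.getCollectionTime());
        }
    }
}
```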

Generational garbage collection algorithm as applied in JDK 5.0

The generational garbage collection policy is based on the fact that different objects have different lifecycles. Objects with different lifecycles can therefore be collected in different ways to improve recycling efficiency.

While a Java program runs, it creates a large number of objects. Some are tied to business information, such as Session objects, threads, and Socket connections in HTTP requests; because they are directly linked to the business, their lifecycles are relatively long. Others are mainly temporary variables produced while the program runs, and their lifecycles are short. String objects are an example: because strings are immutable, the system produces a large number of them, and some can be recycled after being used only once.

Imagine that, without distinguishing objects by survival time, every garbage collection recycled the entire heap. It would take a long time, because every collection would have to traverse all surviving objects. For long-lived objects this traversal is wasted effort: they may be traversed many times and still be alive. Generational garbage collection therefore takes a divide-and-conquer approach: objects with different lifecycles are placed in different generations, and each generation is collected with the method that suits it best.

Generation Division

The virtual machine divides memory into three generations: Young Generation, Old Generation, and Permanent Generation. The permanent generation mainly stores the class information of Java classes and has little to do with the Java objects that garbage collection reclaims. The division between the young and old generations has a large impact on garbage collection.

Young generation:

All newly created objects are first placed in the young generation. The goal of the young generation is to collect short-lived objects as quickly as possible. The young generation is generally divided into three zones: one Eden zone and two Survivor zones. Most objects are created in the Eden zone. When the Eden zone is full, the surviving objects are copied to one of the two Survivor zones. When that Survivor zone is full, its surviving objects are copied to the other Survivor zone. When the second Survivor zone is also full, objects that were copied from the first Survivor zone and are still alive are copied to the old generation (Tenured). Note that the two Survivor zones are symmetric and have no fixed order, so a Survivor zone may hold both objects copied from Eden and objects copied from the other Survivor zone; only objects that came from the other Survivor zone are copied to the old generation. In addition, one of the Survivor zones is always empty. Depending on the program's needs, more than two Survivor zones can be configured, which lengthens the time an object stays in the young generation and reduces the chance of it being moved to the old generation.

Old generation:

Objects that are still alive after N garbage collections in the young generation are moved to the old generation. The old generation can therefore be regarded as holding long-lived objects.
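The movement of objects between Eden, the Survivor zones, and the old generation can be modelled roughly as below. This is a deliberately simplified sketch: promotion is driven only by an age threshold (the N mentioned above, called tenuringThreshold here), zone capacities are ignored, and all class and method names are invented for this example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

class GenerationalDemo {
    final int tenuringThreshold;                 // the "N" collections an object must survive
    final List<String> eden = new ArrayList<>();
    List<String> survivorFrom = new ArrayList<>();
    List<String> survivorTo = new ArrayList<>(); // always empty between collections
    final List<String> old = new ArrayList<>();
    final Map<String, Integer> age = new HashMap<>();

    GenerationalDemo(int tenuringThreshold) { this.tenuringThreshold = tenuringThreshold; }

    void allocate(String name) {
        eden.add(name);
        age.put(name, 0);
    }

    // One young generation collection: survivors of Eden and of the occupied
    // Survivor zone are copied to the empty one, or promoted once old enough.
    void youngGc(Set<String> live) {
        List<String> candidates = new ArrayList<>(eden);
        candidates.addAll(survivorFrom);
        for (String obj : candidates) {
            if (!live.contains(obj)) {
                age.remove(obj);                 // dead object: simply dropped
                continue;
            }
            int newAge = age.merge(obj, 1, Integer::sum);
            if (newAge >= tenuringThreshold) {
                old.add(obj);
            } else {
                survivorTo.add(obj);
            }
        }
        eden.clear();
        survivorFrom.clear();
        List<String> emptied = survivorFrom;     // swap the roles of the two Survivor zones
        survivorFrom = survivorTo;
        survivorTo = emptied;
    }

    public static void main(String[] args) {
        GenerationalDemo heap = new GenerationalDemo(2);
        heap.allocate("session");
        heap.allocate("temp");
        heap.youngGc(Set.of("session"));          // temp dies, session survives once

        heap.allocate("cache");
        heap.youngGc(Set.of("session", "cache")); // session survives a second time

        System.out.println("old generation: " + heap.old);       // prints old generation: [session]
        System.out.println("survivor zone: " + heap.survivorFrom); // prints survivor zone: [cache]
    }
}
```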

Permanent generation:

Used to store static data, currently Java classes and methods. The permanent generation has no significant impact on garbage collection, but some applications, such as those using Hibernate, dynamically generate or load classes at run time. In this case, you need to set a larger permanent generation to hold the classes added while the program runs. The permanent generation size is set through -XX:MaxPermSize=<N>.

Under what circumstances will garbage collection be triggered?

Because objects are divided into generations, different areas are collected at different times. There are two types of GC: Scavenge GC and Full GC.


1. Scavenge GC
Generally, when a new object is created and space for it cannot be allocated in Eden, Scavenge GC is triggered. It collects the Eden zone of the heap, clears the objects that did not survive, and moves the survivors to a Survivor zone. It then tidies up the two Survivor zones.
2. Full GC
Full GC collects the entire heap, including Young, Tenured, and Perm. Full GC is slower than Scavenge GC, so it should be kept to a minimum. Full GC may occur for the following reasons:
* Tenured is full.
* Perm is full.
* System.gc() is called explicitly.
* After the last GC, the heap's allocation policy for the generations changed dynamically.
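The explicit System.gc() trigger listed above can be tried out with a small program. The sketch allocates short-lived arrays, requests a collection, and prints heap usage before and after. System.gc() is only a request, and the numbers depend entirely on the JVM and its settings, so no particular output is claimed here.

```java
class ExplicitGcDemo {
    // Heap memory currently in use, as reported by the runtime.
    static long usedBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        // Produce short-lived garbage: each array becomes unreachable right away.
        for (int i = 0; i < 10_000; i++) {
            byte[] temp = new byte[10 * 1024];
            temp[0] = 1;
        }

        long before = usedBytes();
        System.gc(); // explicit request for a collection; the JVM may ignore it
        long after = usedBytes();

        System.out.println("used before: " + before + " bytes");
        System.out.println("used after:  " + after + " bytes");
    }
}
```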

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
