JVM Garbage Collection Algorithms (Most Complete)

Source: Internet
Author: User

The JVM runtime memory model consists of the following areas:

1. Method area (Perm: permanent generation, non-heap)

2. Virtual machine stack

3. Native method stack

4. Heap

5. Program counter

The first question is: how does the JVM know which objects need to be recycled?

This article covers two algorithms for identifying garbage, three reclamation algorithms, two multi-threaded variants of them, and three collectors.

    • Reference counting method

Each object has a reference counter. Every time the object is referenced, the counter is incremented by 1; every time a reference to it is released, the counter is decremented by 1. When the counter reaches 0, the object is identified as recyclable.

This can be pictured as a graph, as in data structures and algorithms: objects that are still referenced, such as those along a chain object A -> object B -> object C, are not recycled, while an object that nothing references (no path leads to it) has a reference count of 0 and will be recycled.

However, this algorithm has an obvious flaw: objects that reference each other in a cycle are never recycled. For example, if object A and object B reference each other and no other object references A or B, neither A nor B is recycled.
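The circular-reference flaw can be reproduced with a small simulation. The sketch below is a toy model, not how any real JVM manages memory; the class `RefCountDemo` and its helper methods are hypothetical names introduced only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of reference counting (illustrative only, not JVM internals). */
class RefCountDemo {
    /** A simulated heap object that tracks how many references point at it. */
    static final class Node {
        final String name;
        int refCount = 0;
        final List<Node> fields = new ArrayList<>();

        Node(String name) { this.name = name; }

        boolean isRecyclable() { return refCount == 0; }
    }

    /** Stores a reference to target inside holder: target's counter + 1. */
    static void addReference(Node holder, Node target) {
        holder.fields.add(target);
        target.refCount++;
    }

    /** A reference from outside the object graph, e.g. a local variable. */
    static void addExternalReference(Node target) { target.refCount++; }

    /** The outside reference goes away: counter - 1. */
    static void dropExternalReference(Node target) { target.refCount--; }

    /** Builds A <-> B, then drops every outside reference. Returns {A, B}. */
    static Node[] buildOrphanedCycle() {
        Node a = new Node("A");
        Node b = new Node("B");
        addExternalReference(a);
        addExternalReference(b);
        addReference(a, b);   // A references B
        addReference(b, a);   // B references A
        dropExternalReference(a);
        dropExternalReference(b);
        return new Node[] { a, b };
    }

    public static void main(String[] args) {
        for (Node n : buildOrphanedCycle()) {
            // Both counters stay at 1, so neither object is ever recyclable.
            System.out.println(n.name + " refCount=" + n.refCount
                    + " recyclable=" + n.isRecyclable());
        }
    }
}
```

Nothing outside the cycle can reach A or B any more, yet each counter remains at 1 because the two objects keep each other alive.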

    • Root Search algorithm

This algorithm defines a set of root objects, which the JVM never reclaims. Any object referenced, directly or indirectly, by a root object is in use; any object that cannot be reached from a root is about to be reclaimed. Put simply: if an object is reachable from a root, it will not be recycled; if it is not reachable from any root, it will be recycled.

For example, object D cannot be reached from any root object, so it will be recycled.

The following objects are considered to be root objects:

  • Classes loaded by the bootstrap class loader, and the objects they create
  • Objects referenced by static variables of classes in the JVM runtime method area
  • Objects referenced from the constant pool in the JVM runtime method area
  • Objects referenced from the local variable tables of the virtual machine stacks of currently running threads
  • Objects referenced in the native method stack (JNI)

With this algorithm, even objects that reference each other are recycled if they cannot be reached from a root object. For example, object C and object D reference each other, but because neither can be reached from a root, both are recycled.

The JVM uses the root search algorithm to determine whether an object can be reclaimed.
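The root search idea can be sketched as a graph traversal that starts at the roots and marks everything it reaches. This is a simplified illustration under assumed names (`RootSearchDemo`, `Obj`, `reachableFrom`); a real collector works on raw heap memory rather than on Java collections.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy reachability analysis from a set of roots (illustrative only). */
class RootSearchDemo {
    static final class Obj {
        final String name;
        final List<Obj> refs = new ArrayList<>();
        Obj(String name) { this.name = name; }
    }

    /** Returns every object reachable from the given roots (the live set). */
    static Set<Obj> reachableFrom(List<Obj> roots) {
        Set<Obj> marked = new HashSet<>();      // identity-based: Obj has no equals()
        Deque<Obj> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Obj current = pending.pop();
            if (marked.add(current)) {          // first visit: follow its references
                pending.addAll(current.refs);
            }
        }
        return marked;
    }

    /** root -> A -> B is reachable; C <-> D reference each other but have no path from root. */
    static String describe() {
        Obj root = new Obj("root");
        Obj a = new Obj("A"), b = new Obj("B"), c = new Obj("C"), d = new Obj("D");
        root.refs.add(a);
        a.refs.add(b);
        c.refs.add(d);
        d.refs.add(c);
        Set<Obj> live = reachableFrom(Arrays.asList(root));
        StringBuilder sb = new StringBuilder();
        for (Obj o : Arrays.asList(a, b, c, d)) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(o.name).append('=').append(live.contains(o) ? "live" : "garbage");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(describe()); // A=live B=live C=garbage D=garbage
    }
}
```

Unlike reference counting, the mutual references between C and D make no difference here, because the traversal never arrives at either of them.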

In the root search algorithm, the references discussed here are all strong references. A strong reference is what results when an object is created with new and explicitly assigned to a variable:

    Object obj = new Object();

The line above establishes a strong reference: the variable obj strongly references an instance of Object.

There are four kinds of reference relationships in Java, from strongest to weakest:

Strong Reference -> Soft Reference -> Weak Reference -> Phantom Reference

Strong Reference: The object can be identified as recyclable, and then possibly collected, only when it is unreachable from every root.

Soft Reference: An object reachable only through soft references is kept while memory is plentiful. The garbage collector decides whether to reclaim it in response to memory demand, and it is reclaimed before the JVM reports that heap memory is insufficient.

Weak Reference: An object reachable only through weak references is reclaimed at the next garbage collection, whether or not memory is short.

Phantom Reference: The referent can never be retrieved through the reference. After the collector determines that the object can be reclaimed, the reference is added to its reference queue, which lets the program learn about the reclamation.
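The reference kinds map onto classes in the standard `java.lang.ref` package. The following sketch shows only the behaviour that is deterministic; what happens after `System.gc()` depends on the JVM, so no output is promised for that part. The class name `ReferenceKindsDemo` and its helper methods are hypothetical.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;

/** Small tour of java.lang.ref (a sketch; GC timing is never guaranteed). */
class ReferenceKindsDemo {
    /** A weak reference still returns its referent while a strong reference exists. */
    static boolean weakVisibleWhileStronglyHeld() {
        Object strong = new Object();
        WeakReference<Object> weak = new WeakReference<>(strong);
        return weak.get() == strong;
    }

    /** PhantomReference.get() returns null by specification. */
    static Object phantomGet() {
        Object strong = new Object();
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        PhantomReference<Object> phantom = new PhantomReference<>(strong, queue);
        return phantom.get();
    }

    public static void main(String[] args) {
        System.out.println("weak visible while strongly held: " + weakVisibleWhileStronglyHeld());
        System.out.println("phantom.get(): " + phantomGet());

        // Referents with no strong reference at all.
        SoftReference<Object> soft = new SoftReference<>(new Object());
        WeakReference<Object> weak = new WeakReference<>(new Object());

        System.gc(); // only a request; the JVM is free to ignore it

        // Typically the weak referent is gone and the soft one survives,
        // but neither outcome is guaranteed.
        System.out.println("weak after GC request: " + weak.get());
        System.out.println("soft after GC request: " + soft.get());
    }
}
```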

    • Mark-sweep

The simplest algorithm. Mark-sweep first marks the objects that need to be recycled and then erases those objects from memory. How the objects to be recycled are identified is described above.
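A minimal sketch of the sweep step, assuming the heap is modelled as an array of named slots and the marking step has already produced the set of live names. `MarkSweepDemo`, `sweep`, and `largestFreeRun` are illustrative names, not JVM APIs.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Toy mark-sweep over an array "heap"; a null slot means free (illustrative only). */
class MarkSweepDemo {
    /** Sweep phase: frees every occupied slot whose object was not marked live. */
    static String[] sweep(String[] heap, Set<String> marked) {
        String[] result = heap.clone();
        for (int i = 0; i < result.length; i++) {
            if (result[i] != null && !marked.contains(result[i])) {
                result[i] = null;
            }
        }
        return result;
    }

    /** Length of the longest run of adjacent free slots. */
    static int largestFreeRun(String[] heap) {
        int best = 0;
        int run = 0;
        for (String slot : heap) {
            run = (slot == null) ? run + 1 : 0;
            best = Math.max(best, run);
        }
        return best;
    }

    public static void main(String[] args) {
        String[] heap = { "A", "B", "C", "D", "E", "F" };
        Set<String> marked = new HashSet<>(Arrays.asList("A", "C", "E"));
        String[] after = sweep(heap, marked);
        System.out.println(Arrays.toString(after));  // [A, null, C, null, E, null]
        System.out.println("largest free run: " + largestFreeRun(after)); // 1
    }
}
```

Three slots are free after the sweep, but no two of them are adjacent, so an object needing two slots could not be placed. That is the fragmentation problem in miniature.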

    • Mark-sweep-compact

This algorithm builds on mark-sweep by adding a step that moves objects to compact the space. Mark-sweep leaves behind a large number of memory fragments, and as fragmentation grows, memory read and write performance suffers badly, so the fragments are tidied up after the sweep. The simplest arrangement is to compact the surviving objects toward one end, leaving the free space at the other end. Because compaction takes time, it lengthens garbage collection.
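The "compact everything toward one end" arrangement can be sketched on the same kind of toy heap, where an array slot holds an object name and null means free. `MarkCompactDemo` and `compact` are hypothetical names; a real collector must also update every reference to each moved object, which this sketch leaves out.

```java
import java.util.Arrays;

/** Toy compaction: slide surviving objects to the low end of the heap (illustrative only). */
class MarkCompactDemo {
    /** Returns a heap of the same size with live objects packed at the front, order preserved. */
    static String[] compact(String[] heap) {
        String[] result = new String[heap.length];
        int next = 0;
        for (String slot : heap) {
            if (slot != null) {
                result[next++] = slot;
            }
        }
        return result; // remaining slots stay null, i.e. one contiguous free block
    }

    public static void main(String[] args) {
        String[] fragmented = { "A", null, "C", null, "E", null };
        System.out.println(Arrays.toString(compact(fragmented))); // [A, C, E, null, null, null]
    }
}
```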

    • Mark-sweep-copy

This algorithm divides memory into two spaces: space A holds the objects in normal use, and space B is kept empty. At each collection, the objects still alive in space A are copied into space B, space A is then cleared in one pass, and the two spaces swap roles. This is more efficient than mark-sweep-compact but needs two blocks of space, so memory requirements are larger and memory utilization is low. It suits short-lived objects; repeatedly copying long-lived objects reduces efficiency.
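A sketch of one copying collection, again on a toy array heap. The names `CopyingDemo` and `collect` are assumptions made for this example, and the set of live objects is taken as given rather than computed.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Toy copying collector with two equally sized spaces (illustrative only). */
class CopyingDemo {
    /**
     * Copies live objects from fromSpace into a fresh toSpace, then wipes
     * fromSpace in one pass. The caller treats the returned array as the
     * new active space.
     */
    static String[] collect(String[] fromSpace, Set<String> live) {
        String[] toSpace = new String[fromSpace.length];
        int next = 0;
        for (String obj : fromSpace) {
            if (obj != null && live.contains(obj)) {
                toSpace[next++] = obj;   // survivors end up contiguous
            }
        }
        Arrays.fill(fromSpace, null);    // the whole old space is freed at once
        return toSpace;
    }

    public static void main(String[] args) {
        String[] spaceA = { "A", "B", "C", "D", "E", "F" };
        Set<String> live = new HashSet<>(Arrays.asList("A", "C", "E"));
        String[] spaceB = collect(spaceA, live);
        System.out.println("B: " + Arrays.toString(spaceB)); // B: [A, C, E, null, null, null]
        System.out.println("A: " + Arrays.toString(spaceA)); // A: [null, null, null, null, null, null]
    }
}
```

The cost is proportional to the number of survivors, not to the size of the space, which is why the approach pays off when most objects die young.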

Processors are now multi-core and their performance has improved greatly, so several further garbage collection algorithms have been built on this basis. There are two main kinds:

    • Parallel mark-sweep

"Parallel" means that where garbage collection used to run on a single thread, multiple garbage collection threads are now created to mark and sweep in parallel. For example, the objects to be marked are distributed evenly across several threads, and when marking is complete, the threads sweep in parallel as well.
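The idea of spreading the marking work over several threads can be sketched as follows. This is a simplified model under assumed names (`ParallelMarkDemo`, `parallelMark`): roots are dealt out round-robin to worker threads that share one thread-safe mark set. Production collectors use far more elaborate work-stealing schemes.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Set;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Toy parallel marking: several threads share one concurrent mark set (illustrative only). */
class ParallelMarkDemo {
    static final class Obj {
        final List<Obj> refs = new ArrayList<>();
    }

    static Set<Obj> parallelMark(List<Obj> roots, int threads) {
        Set<Obj> marked = ConcurrentHashMap.newKeySet();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        List<Callable<Void>> tasks = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            final int offset = t;
            tasks.add(() -> {
                Deque<Obj> pending = new ArrayDeque<>();
                // Each worker starts from its own share of the roots.
                for (int i = offset; i < roots.size(); i += threads) {
                    pending.push(roots.get(i));
                }
                while (!pending.isEmpty()) {
                    Obj current = pending.pop();
                    // add() is atomic: only the thread that marks first follows the references.
                    if (marked.add(current)) {
                        pending.addAll(current.refs);
                    }
                }
                return null;
            });
        }
        try {
            pool.invokeAll(tasks);   // blocks until every worker has finished
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("marking interrupted", e);
        } finally {
            pool.shutdown();
        }
        return marked;
    }

    /** Four roots, each heading a chain of 10 objects, plus one unreachable object. */
    static int demoMarkedCount() {
        List<Obj> roots = new ArrayList<>();
        for (int r = 0; r < 4; r++) {
            Obj head = new Obj();
            Obj current = head;
            for (int i = 0; i < 9; i++) {
                Obj next = new Obj();
                current.refs.add(next);
                current = next;
            }
            roots.add(head);
        }
        new Obj(); // never reachable from a root, so never marked
        return parallelMark(roots, 4).size();
    }

    public static void main(String[] args) {
        System.out.println("marked objects: " + demoMarkedCount()); // 40
    }
}
```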

    • Concurrent mark-sweep

"Concurrent" means that the application and garbage collection can run at the same time. In the plain mark-sweep algorithm, the application must be paused while objects are marked, swept, and compacted. The concurrent mark-sweep algorithm builds on mark-sweep-compact and splits the work into the following phases:

Initial mark, concurrent mark, remark, concurrent sweep, reset

The algorithms above are the basic garbage collection algorithms, and JVM garbage collection is built on them. On that basis, the Java garbage collectors can be divided into the following:

    • Serial collector

A single thread handles all garbage collection work. Because no coordination between threads is needed, the collector itself is efficient; however, it cannot take advantage of multiple processors, so it suits single-processor machines.

On today's multi-core servers this single-threaded collector is comparatively inefficient. It is suitable where the heap is small.

    • Parallel collector

Multiple threads handle the garbage collection work, taking advantage of multi-core processors. Performance can still suffer if there are too many threads, because of frequent scheduling between them. In general, the number of parallel collection threads equals the number of processors.

Suitable for medium and large applications with high throughput requirements, multiple CPUs, and no strict response-time requirements, for example background processing and scientific computing.

    • Concurrent collector

The concurrent collector mainly reduces old-generation pause times. It uses separate garbage collection threads to trace reachable objects without stopping the application. In each old-generation collection cycle, the concurrent collector briefly pauses the whole application at the start of the collection (the initial mark) and pauses again partway through (the remark). The second pause is slightly longer than the first, and during it multiple threads perform garbage collection work at the same time.

The concurrent collector trades processor resources for short pause times. On a system with N processors, the concurrent part of the collection uses K/N of the available processors, where typically 1 <= K <= N/4.

On a host with only one processor, the concurrent collector can be set to incremental mode to obtain short pause times as well.

Floating garbage: because garbage collection runs while the application is running, some garbage may be produced too late to be caught by the current cycle. This "floating garbage" can only be reclaimed in the next garbage collection cycle. The concurrent collector therefore typically needs about 20% of the space reserved for floating garbage.

Concurrent mode failure: the concurrent collector collects while the application is running, so the heap must have enough space for the program to use during the collection; otherwise the heap fills up before the collection completes. In that case a "concurrent mode failure" occurs and the whole application is paused for garbage collection.

The concurrent collector is the collector that uses the concurrent mark-sweep algorithm for garbage collection.

Suitable for large applications with strict response-time requirements and multiple CPUs, for example page-request handling and web servers. It is used mostly in front-end business systems.

Serial collector:

-- Applicable: applications with a small data volume (around 100 MB), a single processor, and no response-time requirements.

-- Cons: only suitable for small applications

Parallel collector:

-- Applicable: medium and large applications with high throughput requirements, multiple CPUs, and no strict response-time requirements. Examples: background processing, scientific computing.

-- Cons: application response time may be lengthened during garbage collection

Concurrent collector:

-- Applicable: medium and large applications with strict response-time requirements and multiple CPUs. Examples: web servers and application servers, telecom switching, integrated development environments.

Generational garbage collection as used in JDK 5.0

The generational garbage collection strategy rests on one fact: different objects have different life cycles. Objects with different life cycles can therefore be collected in different ways to make collection more efficient.

A running Java program creates a large number of objects. Some are tied to business information, such as HTTP session objects, threads, and socket connections; they are linked directly to the business and therefore live longer. Others are mainly temporary values produced while the program runs, and their life cycles are short. String objects are an example: because String is immutable, the system produces large numbers of them, and some can be reclaimed after a single use.

Imagine collecting without distinguishing objects by how long they live. Every collection would reclaim the entire heap and take a relatively long time, and every collection would have to traverse all surviving objects. For long-lived objects this traversal achieves nothing, because they may have been traversed many times and are still alive. Generational garbage collection therefore applies divide and conquer: it splits the heap into generations, places objects with different life cycles in different generations, and uses the collection method best suited to each generation.

How the generations are divided

The virtual machine's memory is divided into three generations: the young generation, the old generation, and the permanent generation. The permanent generation mainly stores class information for Java classes and has little to do with the Java objects that garbage collection reclaims. The division between the young generation and the old generation matters much more for garbage collection.

Young generation:

All newly created objects are first placed in the young generation. The goal of the young generation is to collect short-lived objects as quickly as possible. The young generation is divided into three areas: one Eden area and (in general) two Survivor areas. Most objects are created in Eden. When Eden is full, the surviving objects are copied to one of the two Survivor areas. When that Survivor area is full, its surviving objects are copied to the other Survivor area. When that one is full as well, objects that were copied from the first Survivor area and are still alive are copied to the old generation (Tenured).

Note that the two Survivor areas are symmetrical and have no fixed order, so the same area may hold, at the same time, objects copied from Eden and objects copied from the other Survivor area, whereas only objects that came from the first Survivor area are copied to the old generation. One Survivor area is always empty. Depending on the program's needs, the Survivor area can also be configured as more than two areas, which lengthens the time objects spend in the young generation and reduces the chance of their being placed in the old generation.

Old generation:

Objects that are still alive after N garbage collections in the young generation are placed in the old generation. The old generation can therefore be regarded as holding objects with long life cycles.
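The movement of objects from Eden through the Survivor areas into the old generation can be sketched with a toy model. Everything here is an assumption made for illustration: the class `GenerationalDemo`, the fixed `TENURING_THRESHOLD` of 2, and the simplification that objects are promoted purely by age (Survivor overflow is ignored).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy young/old generation model with age-based promotion (illustrative only). */
class GenerationalDemo {
    static final int TENURING_THRESHOLD = 2; // hypothetical "N" collections

    final List<String> eden = new ArrayList<>();
    List<String> fromSurvivor = new ArrayList<>();
    List<String> toSurvivor = new ArrayList<>();  // empty between collections
    final List<String> old = new ArrayList<>();
    final Map<String, Integer> age = new HashMap<>();

    void allocate(String name) {
        eden.add(name);
        age.put(name, 0);
    }

    /** One young-generation collection; 'live' is the set of still-reachable names. */
    void minorGc(Set<String> live) {
        List<String> candidates = new ArrayList<>(eden);
        candidates.addAll(fromSurvivor);
        for (String name : candidates) {
            if (!live.contains(name)) {
                age.remove(name);                 // dead: simply not copied
                continue;
            }
            int newAge = age.merge(name, 1, Integer::sum);
            if (newAge >= TENURING_THRESHOLD) {
                old.add(name);                    // promoted to the old generation
            } else {
                toSurvivor.add(name);             // copied to the empty Survivor area
            }
        }
        eden.clear();
        fromSurvivor.clear();
        List<String> tmp = fromSurvivor;          // swap roles of the two Survivor areas
        fromSurvivor = toSurvivor;
        toSurvivor = tmp;
    }

    /** A long-lived "session" object and two short-lived temporaries. */
    static String run() {
        GenerationalDemo heap = new GenerationalDemo();
        heap.allocate("session");
        heap.allocate("temp1");
        heap.minorGc(Collections.singleton("session"));
        String first = "survivor=" + heap.fromSurvivor + " old=" + heap.old;
        heap.allocate("temp2");
        heap.minorGc(Collections.singleton("session"));
        String second = "survivor=" + heap.fromSurvivor + " old=" + heap.old;
        return first + " | " + second;
    }

    public static void main(String[] args) {
        System.out.println(run());
        // survivor=[session] old=[] | survivor=[] old=[session]
    }
}
```

The temporaries never leave the young generation, while the session object is copied once between Survivor areas and is promoted on its second survival.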

Permanent generation:

Used to store static data, nowadays mainly Java classes, methods, and so on. The permanent generation has no significant effect on garbage collection, but some applications, such as Hibernate, generate or load classes dynamically, in which case a larger permanent generation is needed to hold the classes added at run time. The size of the permanent generation is set with -XX:MaxPermSize=<N>.
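To see how a particular JVM actually lays out these areas, the standard `java.lang.management` API can list its memory pools. The sketch below only prints what the running JVM reports; pool names differ between collectors and versions, and on JDK 8 and later the permanent generation has been replaced by Metaspace, so no "Perm" pool should be expected there. `MemoryPoolsDemo` is a hypothetical class name.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.util.ArrayList;
import java.util.List;

/** Lists the memory pools of the running JVM (names vary by JVM and collector). */
class MemoryPoolsDemo {
    static List<String> poolDescriptions() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            // getType() distinguishes heap pools from non-heap pools.
            lines.add(pool.getType() + ": " + pool.getName());
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : poolDescriptions()) {
            System.out.println(line);
        }
    }
}
```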

When garbage collection is triggered

Because objects are handled by generation, the regions collected and the timing of collection differ.

(1) Young-generation GC implementations

    • Serial: A single-threaded collector. It collects using only one thread and pauses all other worker threads (Stop the World) while it does so. It is the default young-generation collector in client mode.
    • ParNew: The multi-threaded version of the Serial collector. In a single-CPU or even dual-CPU environment it is not guaranteed to outperform the Serial collector, because of the overhead of thread interaction.
    • Parallel Scavenge: Also a multi-threaded collector. It differs from ParNew in being a throughput-first collector, where throughput = time running user code / (time running user code + garbage collection time). Another difference is that with -XX:+UseAdaptiveSizePolicy the virtual machine automatically adjusts parameters such as the Eden/Survivor ratio to deliver the throughput the user asks for; all that needs configuring is the memory size (-Xmx) and the throughput goal (-XX:GCTimeRatio).

(2) Old-generation GC implementations

    • Serial Old: The old-generation version of the Serial collector.
    • Parallel Old: The old-generation version of Parallel Scavenge. Previously, if the young generation used the Parallel Scavenge collector, only Serial Old could work with it in the old generation. With Parallel Old, the pair can be used where throughput and CPU resources are a concern.
    • CMS: Uses mark-sweep rather than mark-compact and is a low-pause collector. Because it uses mark-sweep, memory fragmentation is unavoidable. -XX:CMSFullGCsBeforeCompaction can be used to run several collections and then perform one memory compaction.
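The throughput formula given for Parallel Scavenge can be written out directly. The sketch below also shows the goal implied by -XX:GCTimeRatio=N, which HotSpot documents as a target GC time fraction of 1 / (1 + N). The class `ThroughputDemo` and its method names are hypothetical, and the figure computed in `main` from the management beans is only a rough approximation, since reported collection time is not the same as pause time for every collector.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

/** Throughput arithmetic plus a rough reading from the running JVM (a sketch). */
class ThroughputDemo {
    /** throughput = user code time / (user code time + garbage collection time) */
    static double throughput(double userMillis, double gcMillis) {
        return userMillis / (userMillis + gcMillis);
    }

    /** Fraction of time allowed for GC under -XX:GCTimeRatio=N, i.e. 1 / (1 + N). */
    static double gcFractionGoal(int gcTimeRatio) {
        return 1.0 / (1 + gcTimeRatio);
    }

    public static void main(String[] args) {
        System.out.println(throughput(99, 1));     // 99 ms of user code, 1 ms of GC
        System.out.println(gcFractionGoal(99));    // goal: about 1% of time in GC

        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        long gcTime = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + ": " + gc.getCollectionCount() + " collections");
            gcTime += Math.max(0, gc.getCollectionTime()); // -1 means "undefined"
        }
        double userTime = Math.max(1, uptime - gcTime);
        System.out.printf("approximate throughput so far: %.4f%n", throughput(userTime, gcTime));
    }
}
```

The collector names printed by the loop show which young-generation and old-generation implementations the current JVM is running with.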

GC Type
There are two types of GC: Scavenge GC and Full GC.

1. Scavenge GC

In general, a Scavenge GC is triggered when a new object is created and the request for space in Eden fails. The Eden area of the heap is collected: dead objects are cleared and surviving objects are moved to a Survivor area. The two Survivor areas are then tidied up.
2. Full GC
Collects the entire heap, including Young, Tenured, and Perm. A Full GC is slower than a Scavenge GC, so Full GCs should be kept to a minimum. A Full GC may be caused by the following:
* Tenured is full
* Perm is full
* System.gc() is called explicitly
* The heap's per-generation allocation policy changed dynamically after the previous GC
