JVM garbage collection mechanism

Source: Internet
Author: User

Scope: which areas are collected

Of the JVM's five runtime data areas, three do not need garbage collection: the program counter, the JVM stack, and the native method stack. Their lifecycles are tied to their threads, so the memory they occupy is freed automatically when the thread is destroyed. Only the method area and the heap need GC.


Premise: how to tell whether an object is dead

All garbage collection algorithms face the same problem: finding the memory blocks that the application can no longer reach and releasing them. "Unreachable" here means that the application holds no reference to the block. In Java, an object is reachable if it is referenced by a root (mainly static variables of classes and references held in the stacks of all live threads) or by another reachable object.

Reference counting algorithm

Reference counting is the simplest and most direct approach. A reference count is added to each object to record how many references currently point to it; when the count drops to 0, the object becomes a target for the garbage collector.

Pros: simple and direct; no need to pause the entire application.

Cons: 1. Requires cooperation from the compiler, which must generate extra instructions to maintain the reference counts; 2. Cannot handle circular references.

For these reasons reference counting is an early garbage collection strategy that is now rarely used. Sun's JVM does not use reference counting; its garbage collection is based on the root search algorithm.
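As a rough illustration of the circular reference problem, the following Java sketch maintains reference counts by hand. The Node class and its pointTo method are hypothetical names invented for this example; no real JVM exposes counts like this.

```java
import java.util.ArrayList;
import java.util.List;

final class RefCountCycleDemo {
    /** Toy reference-counted object (hypothetical; not how a JVM manages memory). */
    static final class Node {
        int refCount = 0;                          // number of incoming references
        final List<Node> fields = new ArrayList<>();

        /** Stores a reference to target in this object and increments target's count. */
        void pointTo(Node target) {
            fields.add(target);
            target.refCount++;
        }
    }

    /** Builds a two-object cycle, drops the only outside reference, and returns both counts. */
    static int[] countsAfterRootDropped() {
        Node a = new Node();
        Node b = new Node();
        a.refCount++;   // a root (say, a local variable) refers to a
        a.pointTo(b);   // a -> b
        b.pointTo(a);   // b -> a closes the cycle
        a.refCount--;   // the root reference goes away
        return new int[] { a.refCount, b.refCount };
    }

    public static void main(String[] args) {
        int[] counts = countsAfterRootDropped();
        // Neither count reaches 0, so a pure reference counter would never free a or b.
        System.out.println("a=" + counts[0] + ", b=" + counts[1]);
    }
}
```

Both objects are unreachable from any root once the outside reference is dropped, yet each still holds a count of 1 because of the other.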

Root search algorithm

Starting from a set of objects called "GC Roots", the collector searches downward through references. The path taken by the search is called a reference chain. When no reference chain connects an object to any GC Root, the object is unreachable and therefore unusable, and the garbage collector reclaims the memory it occupies.

In Java, the objects that can serve as GC Roots include:

A. Objects referenced from the Java virtual machine stack (the local variable table in a stack frame).

B. Objects referenced by static class fields in the method area.

C. Objects referenced by constants in the method area.

D. Objects referenced by JNI (native methods) in the native method stack.
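The search can be sketched as a graph traversal. In this minimal example the object graph is a plain map from a name to the names it references; the names stackLocal and staticField stand in for GC Roots and, like the reachable method, are assumptions made for illustration only.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class RootSearchDemo {
    /** Returns every node reachable from the roots by following reference edges. */
    static Set<String> reachable(List<String> roots, Map<String, List<String>> refs) {
        Set<String> marked = new HashSet<>();
        Deque<String> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            String obj = pending.pop();
            if (marked.add(obj)) {                       // first visit only
                pending.addAll(refs.getOrDefault(obj, List.of()));
            }
        }
        return marked;
    }

    public static void main(String[] args) {
        Map<String, List<String>> refs = Map.of(
            "stackLocal", List.of("A"),    // root: local variable in a stack frame
            "staticField", List.of("B"),   // root: static field of a class
            "A", List.of("C"),
            "D", List.of("E"),             // D and E reference each other...
            "E", List.of("D"));            // ...but no root leads to them
        Set<String> live = reachable(List.of("stackLocal", "staticField"), refs);
        System.out.println("live: " + live);
        System.out.println("D collectable: " + !live.contains("D"));
    }
}
```

Unlike reference counting, the traversal never visits the D and E cycle, so both are treated as garbage.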

A class is considered useless only when all of the following hold:

(1). All instances of the class have been reclaimed, that is, no instance of the class exists in the Java heap.

(2). The class loader that loaded the class has been reclaimed.

(3). The java.lang.Class object of the class is not referenced anywhere, so the methods of the class cannot be accessed anywhere through reflection.

Four types of references

When collecting an object, the GC checks what kind of reference points to it. Java has four main kinds of reference:

⑴ Strong reference (Strong Reference)

Strong references are the most commonly used. As long as an object is strongly reachable, the garbage collector will not reclaim it. When memory runs out, the Java virtual machine would rather throw an OutOfMemoryError, terminating the program abnormally, than reclaim strongly referenced objects to relieve the shortage.

⑵ Soft reference (Soft Reference)

If an object has only soft references, the garbage collector leaves it alone while memory is sufficient and reclaims it when memory runs short. Until it is reclaimed, the program can keep using the object. Soft references can be used to implement memory-sensitive caches.

For example, suppose an application needs to read a large number of local images. Reading them from disk every time would seriously hurt performance, but loading them all into memory could cause an out-of-memory error. Soft references solve this problem.

The design idea is to use a HashMap that maps each image path to a soft reference to the corresponding image object. When memory runs low, the JVM automatically reclaims the space occupied by the cached image objects, which effectively avoids running out of memory.
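A minimal sketch of that design follows. The class name SoftImageCache, the loads counter, and the loadFromDisk stand-in are assumptions for illustration; a real cache would read the file and probably also prune cleared entries from the map.

```java
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.Map;

final class SoftImageCache {
    private final Map<String, SoftReference<byte[]>> cache = new HashMap<>();
    int loads = 0;   // how many times the simulated disk read ran

    /** Stand-in for reading an image file; real code would do disk I/O here. */
    private byte[] loadFromDisk(String path) {
        loads++;
        return new byte[1024];
    }

    /** Returns the cached image, reloading it if it was never cached or has been reclaimed. */
    byte[] get(String path) {
        SoftReference<byte[]> ref = cache.get(path);
        byte[] image = (ref == null) ? null : ref.get();   // null once the GC has cleared it
        if (image == null) {
            image = loadFromDisk(path);
            cache.put(path, new SoftReference<>(image));
        }
        return image;
    }

    public static void main(String[] args) {
        SoftImageCache cache = new SoftImageCache();
        byte[] first = cache.get("/images/a.png");    // hypothetical path
        byte[] second = cache.get("/images/a.png");   // served from the cache
        System.out.println("same object: " + (first == second));
        System.out.println("disk loads: " + cache.loads);
    }
}
```

The caller always goes through get, so a reclaimed image is simply reloaded on the next request instead of causing a failure.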

A soft reference can be used together with a reference queue (ReferenceQueue): if the object it refers to is reclaimed by the garbage collector, the Java virtual machine adds the soft reference to the associated reference queue.

⑶ Weak reference (Weak Reference)

The difference between a weak reference and a soft reference is that an object with only weak references has a shorter life cycle. When the garbage collector thread scans the memory it manages and finds an object that has only weak references, it reclaims that object whether or not memory is currently sufficient. However, because the garbage collector runs as a low-priority thread, it will not necessarily find such objects quickly.

A weak reference can be used together with a reference queue (ReferenceQueue): if the object it refers to is garbage collected, the Java virtual machine adds the weak reference to the associated reference queue.
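The behaviour can be tried with the java.lang.ref classes. This is only a sketch: System.gc() is a request that the JVM may ignore, so the output of main is typical rather than guaranteed. The class and method names are made up for the example.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;

final class WeakReferenceDemo {
    /** While a strong reference exists, the weak reference must still return the object. */
    static boolean survivesWhileStronglyHeld() {
        Object strong = new Object();
        WeakReference<Object> weak = new WeakReference<>(strong);
        System.gc();                   // only a request; the JVM may ignore it
        return weak.get() == strong;   // strong is still in use here, so it cannot be cleared
    }

    public static void main(String[] args) {
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        Object target = new Object();
        WeakReference<Object> weak = new WeakReference<>(target, queue);

        target = null;                 // drop the only strong reference
        System.gc();                   // hint only; clearing is likely but not guaranteed

        System.out.println("referent after GC request: " + weak.get());
        Reference<?> enqueued = queue.poll();   // null if nothing has been enqueued yet
        System.out.println("weak reference enqueued: " + (enqueued == weak));
    }
}
```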

⑷ Phantom reference (Phantom Reference)

A "phantom reference", as the name implies, exists in name only. Unlike the other kinds of reference, a phantom reference does not affect the object's life cycle. If an object is held only by phantom references, it can be reclaimed by the garbage collector at any time, just as if it had no references at all.

Phantom references are mainly used to track the reclamation of an object by the garbage collector. One difference from soft and weak references is that a phantom reference must be used together with a reference queue (ReferenceQueue). When the garbage collector is about to reclaim an object and finds that it has a phantom reference, it adds the phantom reference to the associated reference queue before the object's memory is reclaimed.

ReferenceQueue<Object> queue = new ReferenceQueue<>();
PhantomReference<Object> pr = new PhantomReference<>(object, queue);


Policy: garbage collection strategies in the JVM

Mark-sweep algorithm

The mark-sweep collector stops all application work, scans every live object starting from the roots and marks each one it reaches, and, once marking is finished, clears the objects that were not marked.

Advantages:

1. Handles circular references

2. Needs no compiler cooperation, so no extra instructions are executed

Disadvantages:

1. Every live object has to be scanned, so collection pauses are long.

The mark-sweep algorithm scans from the root set, marks the surviving objects, and then scans the entire space and reclaims the objects that were not marked.

The mark-sweep algorithm does not move objects and only processes dead ones, so it is very efficient when many objects survive. However, because it reclaims dead objects in place, it causes memory fragmentation.
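The sweep step and the resulting fragmentation can be shown on a toy heap. In this sketch the heap is an array of slots and the mark bits are supplied as input; both representations, and the method names, are assumptions chosen for brevity.

```java
import java.util.Arrays;

final class MarkSweepDemo {
    /** Sweep phase: frees every slot whose object was not marked. */
    static String[] sweep(String[] heap, boolean[] marked) {
        String[] after = heap.clone();
        for (int i = 0; i < after.length; i++) {
            if (!marked[i]) {
                after[i] = null;       // reclaimed in place; survivors do not move
            }
        }
        return after;
    }

    /** Length of the longest run of adjacent free slots. */
    static int largestFreeRun(String[] heap) {
        int best = 0;
        int run = 0;
        for (String slot : heap) {
            run = (slot == null) ? run + 1 : 0;
            best = Math.max(best, run);
        }
        return best;
    }

    public static void main(String[] args) {
        String[] heap = { "A", "B", "C", "D", "E", "F" };
        boolean[] marked = { true, false, true, false, false, true };
        String[] after = sweep(heap, marked);
        System.out.println(Arrays.toString(after));
        System.out.println("largest free run: " + largestFreeRun(after));
    }
}
```

Three slots are freed, but the largest contiguous free run is only two slots, so an object needing three adjacent slots would not fit.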

Copying algorithm

The copying collector divides memory into two spaces of equal size, only one of which is active at any time. When the active space is full, the GC copies the live objects into the unused space, and the previously inactive space becomes the active one.

Advantages:

1. Only reachable objects are scanned, not all objects, which shortens application pauses

Disadvantages:

1. Extra space is consumed: at any moment one block of memory is always unused

2. Copying objects has a certain cost

The copying algorithm scans from the root set and copies the surviving objects into a new, unused space. It is extremely efficient when relatively few objects survive, but the price is a block of memory set aside as the space objects are moved into.
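A toy version of the copy step, using the same array-of-slots model (an assumption for illustration; the liveness flags are taken as given rather than computed from roots):

```java
import java.util.Arrays;

final class CopyingCollectorDemo {
    /** Copies the live objects of the active space into a fresh space, packed from index 0. */
    static String[] copyLive(String[] from, boolean[] live) {
        String[] to = new String[from.length];   // the idle half
        int next = 0;
        for (int i = 0; i < from.length; i++) {
            if (from[i] != null && live[i]) {
                to[next++] = from[i];            // survivors end up contiguous
            }
        }
        return to;   // the caller now treats this space as the active one
    }

    public static void main(String[] args) {
        String[] from = { "A", "B", "C", "D", "E", "F" };
        boolean[] live = { true, false, true, false, false, true };
        System.out.println(Arrays.toString(copyLive(from, live)));
    }
}
```

Only the three survivors are touched, and the free space in the new active half is one contiguous block.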

Mark-compact algorithm

The mark-compact collector combines the strengths of the mark-sweep and copying collectors. It runs in two phases: the first scans and marks all live objects; the second clears the unmarked objects and then moves the live objects to the bottom of the heap.

This algorithm greatly reduces memory fragmentation and does not need twice the space, as the copying algorithm does.

The mark-compact algorithm marks objects in the same way as mark-sweep. When clearing, however, after reclaiming the space of dead objects it moves all surviving objects toward the free space at the left end and updates the pointers that refer to them. Because it moves objects on top of what mark-sweep does, mark-compact costs more, but it solves the memory fragmentation problem.
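The sliding step can be sketched in place on the same toy heap. The forwarding table returned here is one simple way to represent the "update the pointers" step and is an assumption of this example, not a description of HotSpot's implementation.

```java
import java.util.Arrays;

final class MarkCompactDemo {
    /**
     * Slides marked objects toward index 0 in place and frees the rest.
     * Returns a forwarding table: old index -> new index, or -1 if reclaimed.
     */
    static int[] compact(String[] heap, boolean[] marked) {
        int[] forward = new int[heap.length];
        int next = 0;
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] != null && marked[i]) {
                forward[i] = next;
                heap[next++] = heap[i];    // move the survivor left (or leave it in place)
            } else {
                forward[i] = -1;
            }
        }
        Arrays.fill(heap, next, heap.length, null);   // everything after the survivors is free
        return forward;
    }

    public static void main(String[] args) {
        String[] heap = { "A", "B", "C", "D", "E", "F" };
        boolean[] marked = { true, false, true, false, false, true };
        int[] forward = compact(heap, marked);
        System.out.println(Arrays.toString(heap));
        System.out.println(Arrays.toString(forward));
    }
}
```

A reference that used to point at slot 5 would be rewritten to slot 2 using the forwarding table; no second space is required.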

Generational collection algorithm

Generational collection (Generational Collecting) is a garbage collection algorithm derived from analysis of object life cycles.

As introduced earlier, memory is divided mainly into three parts: the young generation, the old generation, and the permanent generation. Their different characteristics determine the GC algorithms they use. The young generation suits objects with short life cycles that are created and destroyed frequently; the old generation suits objects with relatively long life cycles; the permanent generation is what Sun HotSpot calls the method area (some JVMs have no permanent generation at all). The concepts and characteristics of the three generations are introduced first.


Young (young generation): part of the heap defined in the JVM specification. The young generation is divided into three areas: one Eden area and two Survivor areas. Most objects are created in Eden. When Eden fills up, the surviving objects are copied to one of the two Survivor areas. When that Survivor area fills up, its surviving objects are copied to the other Survivor area; when that one is full too, the objects that came from the first Survivor area and are still alive are copied to the old generation. Note that the two Survivor areas are symmetric, with no ordering between them, so the same area can hold both objects copied from Eden and objects copied from the other Survivor area, while only objects that came from the first Survivor area are copied to the old generation. One of the Survivor areas is always empty.

The young generation uses the copying algorithm. About 98% of young-generation objects are short-lived, dying almost as soon as they are created, so there is no need to split the young generation into two halves of equal size. Instead it is divided into three parts, Eden, Survivor From (Survivor 0), and Survivor To (Survivor 1), with a default capacity ratio of 8:1:1. One of Survivor From and Survivor To is always empty, so Eden plus one Survivor area, 90% of the young generation, is available for allocating new objects, and only the 10% in the empty Survivor area is "wasted". When the young generation needs to be collected, the surviving objects are copied into the empty Survivor area, Eden and the previously used Survivor area are cleared, and the two Survivor areas swap roles.

If the empty Survivor area cannot hold all the surviving objects, the allocation guarantee mechanism copies them directly into the old generation. Likewise, if the young generation lacks enough contiguous memory to create a large object, the object is allocated directly in the old generation.

Garbage collection of the young generation is called Minor GC. It happens frequently, and each collection is relatively short.

The -Xmn JVM parameter specifies the size of the young generation.
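The 8:1:1 split is simple arithmetic. The sketch below assumes a hypothetical young generation of 100 MB and ignores the alignment and rounding a real JVM applies; the split method and its survivorRatio parameter (Eden size divided by one Survivor size) are names chosen for this example.

```java
final class YoungGenSizingDemo {
    /**
     * Splits a young-generation size into {eden, survivorFrom, survivorTo},
     * where survivorRatio is eden : one survivor (8 gives 8:1:1).
     */
    static long[] split(long youngBytes, int survivorRatio) {
        long survivor = youngBytes / (survivorRatio + 2);   // 1 part out of ratio + 2
        long eden = youngBytes - 2 * survivor;              // the remaining parts
        return new long[] { eden, survivor, survivor };
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long[] parts = split(100 * mb, 8);                  // hypothetical -Xmn100m
        System.out.println("Eden:          " + parts[0] / mb + " MB");
        System.out.println("Survivor From: " + parts[1] / mb + " MB");
        System.out.println("Survivor To:   " + parts[2] / mb + " MB");
        System.out.println("Usable for allocation: " + (parts[0] + parts[1]) / mb + " MB");
    }
}
```

With these numbers Eden gets 80 MB and each Survivor area 10 MB, so 90 MB is usable for allocation at any time.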

Tenured (old generation): part of the heap defined in the JVM specification. The old generation stores objects that have survived collection in the young generation; in general, it holds long-lived objects.

Objects in the old generation generally have long life cycles and a high survival rate, so the old generation uses the mark-compact algorithm.

Garbage collection of the old generation is called Major GC/Full GC. It happens relatively rarely, and each collection takes longer.

The -Xms JVM parameter specifies the minimum heap size and -Xmx the maximum. Subtracting the young-generation size given by -Xmn from each yields the minimum and maximum capacity of the old generation.
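As a worked example of that subtraction, assume the hypothetical settings -Xms512m -Xmx1024m -Xmn256m. The sketch also prints the heap figures the running JVM reports through Runtime, which will differ from the assumed values unless the program is started with those flags.

```java
final class OldGenSizingDemo {
    /** Old-generation capacity is what remains of the heap after the young generation. */
    static long oldGenBytes(long heapBytes, long youngBytes) {
        return heapBytes - youngBytes;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long xms = 512 * mb;    // assumed -Xms512m
        long xmx = 1024 * mb;   // assumed -Xmx1024m
        long xmn = 256 * mb;    // assumed -Xmn256m
        System.out.println("min old generation: " + oldGenBytes(xms, xmn) / mb + " MB");
        System.out.println("max old generation: " + oldGenBytes(xmx, xmn) / mb + " MB");

        Runtime rt = Runtime.getRuntime();
        System.out.println("current heap (this JVM): " + rt.totalMemory() / mb + " MB");
        System.out.println("maximum heap (this JVM): " + rt.maxMemory() / mb + " MB");
    }
}
```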

Perm (permanent generation): the method area in the JVM specification, used to store static data such as Java classes and methods. The permanent generation has no significant impact on garbage collection, but some applications, Hibernate for example, generate or load classes dynamically; in that case a larger permanent generation is needed to hold the classes added at run time.

The method area, called the permanent generation in the Sun HotSpot virtual machine, is a memory area shared by all threads. It stores data for classes loaded by the virtual machine: class information, constants, static variables, JIT-compiled code, and so on. Garbage collection in the permanent generation is not very effective, but it is still necessary; if the permanent generation runs out of memory, an OutOfMemoryError is thrown.

The permanent generation also uses the mark-compact algorithm for garbage collection. The JVM parameters -XX:PermSize and -XX:MaxPermSize set its initial size and maximum capacity.

Garbage collection process

Having looked at how the JVM partitions memory, we now turn to how its garbage collection works.

First, when a Java application server starts, the JVM starts with it. The class and interface definitions of the JDK, of the application server runtime, and of the Java EE application are loaded into the JVM's permanent generation from compiled class files or from class files inside jar packages. The JVM then creates in Eden the Java objects the application server needs at run time and the objects the Java EE application must create at startup; once that is done, the server can provide services.

The JVM creates Java objects in Eden for each user request. When Eden does not have enough space for a new object, the garbage collector collects Eden: it destroys the Java objects that are no longer referenced by other objects (an object referenced only by objects that are themselves unreferenced is also treated as garbage, and so on) and moves the objects that are still referenced into Survivor 0.

If Survivor 0 has enough space, the objects are placed there directly. If it does not, the garbage collector collects Survivor 0, destroying the Java objects that are no longer referenced and moving those that are still referenced into Survivor 1.

If Survivor 1 has enough space, the objects are placed there directly. If it does not, the garbage collector collects Survivor 1, destroying the Java objects that are no longer referenced and moving those that are still referenced into the old generation.

If the old generation has enough space, the objects are placed there directly. If it does not, the garbage collector collects the old generation, destroying the objects that are no longer referenced and keeping those that are.

If in the end the old generation, Survivor 1, Survivor 0, and Eden all have no space left, the JVM reports a heap space overflow (java.lang.OutOfMemoryError: Java heap space): the heap has no room to create the object.

This is the JVM's partitioned memory management. Compared with an unpartitioned heap, garbage collection is generally much faster, because most collections do not need to scan the entire memory.

Space allocation and promotion of objects

(1) Objects are allocated in Eden first

(2) Large objects go straight into the old generation

The virtual machine provides the -XX:PretenureSizeThreshold parameter; objects larger than this value are allocated directly in the old generation. Because the young generation uses a copying strategy, allocating large objects in Eden would cause a lot of memory copying between Eden and the two Survivor areas.

(3) Long-lived objects enter the old generation

Each time an object in a Survivor area survives a Minor GC, its age increases by 1. When its age reaches a threshold (15 by default), it is promoted to the old generation.
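The aging rule can be simulated with a list of ages. This is a simplified model that follows the description above (promote when the age reaches the threshold) and assumes every object survives the collection; the minorGc method is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class TenuringDemo {
    /**
     * Simulates one Minor GC for objects that all survive: each age grows by 1,
     * and objects whose age reaches the threshold are added to oldGen.
     * Returns the ages of the objects that stay in the Survivor area.
     */
    static List<Integer> minorGc(List<Integer> survivorAges, List<Integer> oldGen, int threshold) {
        List<Integer> stay = new ArrayList<>();
        for (int age : survivorAges) {
            int newAge = age + 1;
            if (newAge >= threshold) {
                oldGen.add(newAge);    // promoted to the old generation
            } else {
                stay.add(newAge);      // copied to the other Survivor area
            }
        }
        return stay;
    }

    public static void main(String[] args) {
        List<Integer> oldGen = new ArrayList<>();
        List<Integer> survivors = minorGc(List.of(1, 14, 7), oldGen, 15);
        System.out.println("still in Survivor: " + survivors);
        System.out.println("promoted:          " + oldGen);
    }
}
```

Only the object that was already 14 collections old crosses the threshold; the other two stay in the young generation with their ages increased.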


Trigger: when GC starts

The trigger for Minor GC (young-generation collection) is simple: when Eden does not have enough space, a Minor GC is started to collect the young generation.

Full GC (old-generation collection, usually accompanied by a Minor GC) has several trigger conditions:

(1) The old generation runs out of space

(2) The PermSpace runs out of space

(3) The average size of the objects promoted to the old generation by Minor GCs is greater than the free space remaining in the old generation

Note: PermSpace is not the same thing as the method area. The HotSpot JVM uses PermSpace to implement the method area; some virtual machines have no PermSpace and implement the method area by other means.
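Condition (3) amounts to comparing an average with the free space. A minimal sketch, assuming the promotion sizes of past Minor GCs are available as an array (the method name and inputs are hypothetical, and a real collector tracks this statistic differently):

```java
final class FullGcTriggerDemo {
    /** True if the average amount promoted per Minor GC exceeds the old generation's free space. */
    static boolean promotionWouldOverflow(long[] promotedBytesHistory, long oldGenFreeBytes) {
        if (promotedBytesHistory.length == 0) {
            return false;                          // no history, nothing to predict from
        }
        long sum = 0;
        for (long promoted : promotedBytesHistory) {
            sum += promoted;
        }
        long average = sum / promotedBytesHistory.length;
        return average > oldGenFreeBytes;
    }

    public static void main(String[] args) {
        long[] history = { 10, 20, 30 };           // assumed sizes, in MB, of past promotions
        System.out.println(promotionWouldOverflow(history, 15));   // average 20 > 15 free
        System.out.println(promotionWouldOverflow(history, 25));   // average 20 <= 25 free
    }
}
```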


Implementation: types of collector in the JVM

Serial collector (Serial Collector)

Performs collection with a single thread and suspends all application threads while collecting. It is the default collector in client mode.

Young-generation collection (Minor Collection): surviving objects in Eden are moved to the To area; those that do not fit in To are moved directly to the old generation. Surviving objects in the From area are also moved to To, or directly to the old generation if To cannot hold them, and objects in From that are old enough are promoted to the old generation. After the collection, Eden and From are empty. From and To then swap roles, From becoming To and To becoming From, so that To is empty before every collection. The design choice here is copying.

Old-generation collection (Full Collection): collection of the old generation has three steps: mark (Mark), sweep (Sweep), and compact (Compact). The mark phase marks all surviving objects, the sweep phase clears all dead objects, and the compact phase moves all live objects to the front of the old generation, leaving the free space behind them. The design choice here is compaction, which reduces memory fragmentation.

Parallel collector (Parallel Collector)

Uses multiple threads for garbage collection at the same time. On multi-core machines this makes full use of the CPU, shortens collection time, and raises JVM throughput. It is the default collector in server mode. Like the serial collector, it suspends all application threads while collecting.

Young-generation collection (Minor Collection): garbage is collected by multiple threads, each using the same algorithm as the serial collector.

Old-generation collection (Full Collection): the old generation is still collected by a single thread, in the same way as the serial collector.

Parallel compacting collector (Parallel Compacting Collector)

Both the young generation and the old generation are collected with multiple threads. Old-generation collection takes less time than with the parallel collector, which reduces pause time.

Young-generation collection (Minor Collection): same as the parallel collector (Parallel Collector)

Old-generation collection (Full Collection): collection of the old generation has three steps: mark, summary, and compact. The idea is to divide the old generation into a number of fixed-size regions. In the mark phase, all surviving objects are divided into N groups (presumably as many as there are collector threads); each thread is independently responsible for its own group and records the locations of surviving objects and the survival information of each region. Marking runs in parallel. In the summary phase, the survival rate of each region is computed. In principle, regions nearer the front have higher survival rates, so the collector looks for the starting position from which compaction is worthwhile (regions where most objects survive are not worth compacting). The summary phase is serial (single-threaded). In the compact phase, based on the information from the summary phase, multiple threads copy surviving objects from one region to another in parallel.

Concurrent mark-sweep collector (Concurrent Mark-Sweep Collector)

Also known as the low-latency collector (Low-Latency Collector): it uses various means to keep application pauses as short as possible. It performs most of its collection work concurrently with the application and does no compacting or copying.

Young-generation collection (Minor Collection): same as the parallel collector (Parallel Collector)

Old-generation collection (Full Collection): divided into four steps: initial mark (Initial Mark), concurrent mark (Concurrent Mark), remark (Remark), and concurrent sweep (Concurrent Sweep). Note that there is no compaction, so fragmentation occurs.

Initial mark: pauses the application threads and finds the surviving objects directly reachable from the roots. This takes a short time, and the collector uses a single thread.

Concurrent mark: the collector's marking runs concurrently with the application, and the collector uses a single thread to mark surviving objects.

Remark: because the application keeps running during the concurrent mark phase, objects may have been added or modified. The application threads are therefore paused again, all modified objects are found, and marking is done with multiple threads.

Concurrent sweep: the collector's sweeping runs concurrently with the application, and the collector uses a single thread to clear dead objects.
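To see which collectors a running JVM is actually using, the standard java.lang.management API can be queried. This sketch only reads what the JVM reports; the names printed depend on the JVM version and the collector selected at startup, so no particular output is assumed. The CollectorInfoDemo class is made up for the example.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

final class CollectorInfoDemo {
    /** Names of the garbage collectors registered in this JVM. */
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Count and time are cumulative since JVM start; -1 means "undefined".
            System.out.println(gc.getName()
                + ": collections=" + gc.getCollectionCount()
                + ", time=" + gc.getCollectionTime() + " ms");
        }
    }
}
```

Typically one bean covers young-generation collections and another covers old-generation collections, which matches the Minor Collection and Full Collection split used above.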

Copyright notice: This is an original article by the blogger and may not be reproduced without the blogger's permission.
