Java garbage collection mechanism

Source: Internet
Author: User
Tags: garbage collection

1. Preface

This article summarizes my understanding of the Java memory management mechanism, explained as plainly as possible. Most of the material comes from the book Deep Understanding of the Java Virtual Machine.

2. Origins

A computer, viewed simply, takes parameters and produces results according to an execution plan. The execution plan is the program, the parameters are the actual variables, and running the program yields the result we want. Because disks are cheap and persistent, they are used to store programs and data, but they are slow, and this is where memory comes in. Memory is fast but expensive, loses its contents when power is cut, and has far less capacity than disk storage; yet speed is what computation is all about. Because of the capacity limit it is impossible to load all data and program code into memory at once, and as operating systems evolved into time-sharing systems, memory management became even more important.

Some languages require the code itself to allocate and release memory; C and C++ are the typical examples. This is efficient and explicit, since developers decide exactly how memory is used, but for large projects it is a disaster: memory allocated improperly or never released, plus the friction of collaborative development, leads to a whole series of problems. Memory bugs are also hard to find and test for, which places very high demands on the individual programmer, and clearly not everyone can be expected to get it right.

To solve the problems above, someone came up with the idea of letting another program manage and release memory; the JVM is the typical example. It frees developers from worrying too much about memory allocation and use, although of course they cannot ignore it entirely. This is the origin of writing a dedicated program for memory allocation and reclamation.

You can think about how you would write a program that manages memory and what issues it would have to consider:

1. How to manage memory to make memory usage more efficient

2. Under what circumstances can objects be recycled

3. When and how should reclamation happen

This article works through the questions above.

3. Exploration

3.1 Memory management

Managing memory is a delicate business, because it involves allocation, reclamation, and reallocation, and a good design can significantly improve efficiency.

One basic fact to keep in mind is that sequential reads and writes are faster than scattered ones, so for efficiency what we want to preserve is the contiguity of memory. The straightforward approach is to reserve one large block of memory and hand it out piece by piece. That works fine at first, but after reclamation the memory is riddled with fragments, which clearly hurts subsequent allocations. Several algorithms are discussed below:

3.1.1 Mark-Sweep algorithm

This is the simplest algorithm, the one alluded to above. It marks all objects that need to be reclaimed and then reclaims the marked objects in one pass after marking is complete. This leaves memory with a large number of fragments; the space is not contiguous enough, which in particular gets in the way of allocating large objects.

3.1.2 Copying algorithm

This approach is a simple, brute-force solution to the fragmentation problem above: divide memory into two equally sized blocks, and when one block is being cleaned up, move all surviving objects sequentially onto the other block. This guarantees that the data in memory is always contiguous.

But this method has two problems: it wastes too much memory, and it copies too much. Here it helps to introduce the basic division of virtual machine memory into the young generation and the old generation. The split exists for a reason: most objects in the virtual machine are temporary, dying almost as soon as they are born (around 98%), but some objects survive much longer, even for the whole run of the program. For these two cases the virtual machine defines two regions, so that each can use a collection strategy suited to its characteristics and speed up reclamation.

The copying algorithm is well suited to the young generation, and that also means memory does not have to be split in half. A modern JVM divides the young generation into one Eden space and two Survivor spaces, with a default size ratio of 8:1:1, that is, Eden takes 80% of the space and each Survivor takes 10%. The concrete operation is: at collection time, the surviving objects in Eden and the Survivor space currently in use are moved to the other, currently unused Survivor space, so only one Survivor space, 10% of the young generation, is "wasted", which saves a great deal of space compared with wasting half.

A few more details: the 10% Survivor space is not always large enough, and when it is not, some objects are promoted directly to the old generation. If the old generation itself runs out of space, an exception really is thrown. This also touches on the distinction between a full GC and an ordinary (minor) GC, which is explained later. A small sketch below prints these young-generation pool sizes in a running VM.
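To see this split for yourself, here is a minimal sketch (the class name is mine, purely illustrative) that prints the committed sizes of the young-generation pools via the standard java.lang.management API; the pool names vary by collector, for example "PS Eden Space" and "PS Survivor Space" under Parallel Scavenge, and the ratio can be changed with -XX:SurvivorRatio.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;

// Minimal sketch: print the committed size of each young-generation space
// so the default 8:1:1 Eden/Survivor split can be observed directly.
public class YoungGenSpaces {
    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            String name = pool.getName();
            if (name.contains("Eden") || name.contains("Survivor")) {
                System.out.printf("%s: %d KB committed%n",
                        name, pool.getUsage().getCommitted() / 1024);
            }
        }
    }
}
```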

3.1.3 Mark-Compact algorithm

Of the two problems with the copying algorithm, the first, wasted memory, is solved by exploiting the young generation's characteristics so that only 10% is lost, but the cost of copying many surviving objects remains. This is where the old generation comes in: objects in that region will, with high probability, survive to the next GC, so a copying algorithm is obviously a poor fit. A mark-compact algorithm is usually used instead. In plain terms, surviving objects are moved toward one end of the memory region, and then everything beyond the boundary of the surviving objects is cleaned out in one pass.

3.1.4 Generational Collection algorithm

This is not a new algorithm but rather the idea already mentioned: group objects by lifetime, minimize the work done, and apply the appropriate reclamation algorithm to each group. Memory is therefore divided into the young generation and the old generation; the young generation usually uses the copying algorithm, and the old generation usually uses the mark-compact algorithm. Of course this is not absolute; it depends on the collector actually in use.

3.2 Which objects can be reclaimed

The short answer is easy: objects that can no longer be used, that is, useless objects, can be reclaimed. The real question is how to tell whether an object is no longer in use.

3.2.1 Reference Counting method

The method is easy to understand: each object carries a counter; whenever something references the object the counter is incremented by 1, and whenever a reference goes away it is decremented by 1. When an object is created the count is not 0, and once the count drops to 0 the object can no longer be found, so it can be reclaimed. The method is simple and the check is very efficient; Redis uses this scheme, and some languages such as Python use it too, but the mainstream JVMs do not.

The reason is circular references: a references b and b references a, so neither count is 0, yet nothing else in the program refers to either of them. These are exactly the objects that should be cleaned up, but reference counting cannot handle this case, as the sketch below illustrates.
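As a minimal illustration (class and field names are mine, purely for demonstration), the following two objects keep each other's hypothetical reference counts above zero even though neither can be reached any more; a reachability-based collector such as HotSpot's can still reclaim both.

```java
// Minimal sketch of the circular-reference case that defeats reference counting.
public class CircularReferenceDemo {
    Object ref;

    public static void main(String[] args) {
        CircularReferenceDemo a = new CircularReferenceDemo();
        CircularReferenceDemo b = new CircularReferenceDemo();
        a.ref = b;        // a -> b: b's count would be 1
        b.ref = a;        // b -> a: a's count would be 1
        a = null;         // drop the external references;
        b = null;         // with pure reference counting neither count ever reaches 0
        System.gc();      // a reachability-based collector can still reclaim both objects
    }
}
```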

3.2.2 Reachability analysis

To solve the problem above, the JVM uses reachability analysis. Starting from a set of objects called "GC Roots", it searches downward through every object it can reach; any object that cannot be reached from a root must be unusable. The objects a and b may reference each other, but if they cannot be reached from the GC Roots they are still considered dead, so the problem of circularly referenced but useless objects is solved.

A GC Roots object is, roughly speaking, one that definitely cannot be cleaned up at the moment. Several kinds of objects meet this condition (a small sketch follows the list):

1. Objects referenced from the virtual machine stack (the local variable table in a stack frame). They are about to be used right away, so they obviously cannot be cleaned up.

2. Objects referenced by class static properties in the method area, and objects referenced by constants there. Statics and finals in the method area essentially live as long as the loaded class, and such long-lived objects cannot be cleaned up.

3. Objects referenced by JNI (native methods) in the native method stack. Even when no Java code keeps a reference, the object may still be held by native code through the native method stack, and of course it cannot be cleaned up then.
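A small, purely illustrative sketch of the first two kinds of roots (all names are mine):

```java
// Minimal sketch: objects kept alive by typical GC Roots.
public class GcRootsDemo {
    static Object staticRoot = new Object();   // reachable via a class static field in the method area (root #2)

    public static void main(String[] args) {
        Object localRoot = new Object();       // reachable via the local variable table of this stack frame (root #1)
        Object garbage = new Object();
        garbage = null;                        // no root can reach this object any more: eligible for collection
        System.gc();                           // only a hint; the unreachable object may be reclaimed here
        System.out.println(staticRoot + " and " + localRoot + " are still reachable");
    }
}
```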

3.2.3 More on references

If an object could only be "referenced" or "not referenced", the notion of reference would be too narrow, and it would be hard to express objects that are merely nice to have. Think of a cache service: while memory is plentiful you want the cache kept, but when memory gets tight you want to be able to shed those cached entries. Since the cache is still referenced, the collector cannot reclaim it, and there is no way to let it go. So from JDK 1.2 onward, four kinds of references were defined: strong, soft, weak, and phantom. There are plenty of detailed explanations of their use online, so only a brief summary and a short code sketch are given here.

Strong references: the ubiquitous kind, anything assigned with new and the like. As long as a strong reference exists, the object will not be collected.

Soft references: for objects that are useful but not strictly required. They are collected just before a memory overflow would occur; if memory is still insufficient after that collection, an exception is thrown. Implemented by the SoftReference class. (Useful for caches.)

Weak references: for objects that are not required at all. Weaker than soft references: the object survives only until the next garbage collection and is cleared whenever a GC runs. Implemented by the WeakReference class. (One-shot use with automatic reclamation; WeakHashMap is built on it, and Tomcat uses it for generational caching.)

Phantom references: also called ghost references, the weakest relationship of all. They have no effect on the object's lifetime; their only purpose is to deliver a notification after the object has been collected. Implemented by the PhantomReference class. (Useful for post-destruction cleanup; finalize can do something similar, but because of its various problems it is best avoided.)
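A minimal sketch of soft and weak references using the standard java.lang.ref classes; the exact behavior depends on heap pressure and GC timing, so the printed results are typical rather than guaranteed.

```java
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;

// Minimal sketch: soft references survive a GC while memory is plentiful,
// weak references are cleared as soon as a GC runs and no strong reference remains.
public class ReferenceDemo {
    public static void main(String[] args) {
        SoftReference<byte[]> soft = new SoftReference<>(new byte[1024 * 1024]); // cache-like: cleared only under memory pressure
        WeakReference<Object> weak = new WeakReference<>(new Object());          // cleared at the next GC

        System.gc();
        System.out.println("soft still alive: " + (soft.get() != null)); // usually true: memory is not tight
        System.out.println("weak still alive: " + (weak.get() != null)); // usually false after a GC
    }
}
```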

3.2.4 The reclamation process

An unreachable object is not necessarily collected right away; the main reason is that it is marked twice.

If an object is found unreachable, it is marked a first time and checked to see whether its finalize method needs to run. If the object does not override finalize, or finalize has already been executed, it does not need to run again, and the object is cleaned up directly.

If finalize does need to run, the object is placed on a queue called the F-Queue and executed by a low-priority Finalizer thread. "Executed" here means triggered, not necessarily run to completion: a slow finalize or an endless loop would make the other objects in the F-Queue wait and could bring the whole collection system down. Inside finalize, the object can make itself referenced by some other object again and so escape collection, because the GC later performs a second marking pass over the F-Queue: if the object is now held by something else it is removed from the set to be collected, and if it did not escape it is actually reclaimed. The sketch below shows such an escape.
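The classic demonstration of this escape looks roughly like the sketch below (class and field names are mine; finalize is deprecated since JDK 9, and the outcome depends on GC and Finalizer-thread timing, so the sleeps are only a crude way to give that thread a chance to run).

```java
// Minimal sketch: an object "escapes" collection once by re-registering itself in finalize().
public class FinalizeEscape {
    static FinalizeEscape savedHook;

    @Override
    protected void finalize() throws Throwable {
        super.finalize();
        savedHook = this;            // re-establish a reference from a GC Root: escape this collection
    }

    public static void main(String[] args) throws InterruptedException {
        savedHook = new FinalizeEscape();

        savedHook = null;
        System.gc();
        Thread.sleep(500);           // give the low-priority Finalizer thread time to run
        System.out.println(savedHook != null ? "escaped the first GC" : "collected");

        savedHook = null;
        System.gc();
        Thread.sleep(500);           // finalize() runs at most once per object, so no escape this time
        System.out.println(savedHook != null ? "escaped again" : "collected the second time");
    }
}
```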

3.2.5 Reclaiming the method area

Many people think the method area (the permanent generation) needs no collection, and indeed the virtual machine specification does not require it and the payoff is relatively low, but collection does happen there.

Two kinds of things are collected: obsolete constants and unused classes. JDK 7 moved the string constant pool out of the permanent generation (PermGen), which overflowed easily, and JDK 8 removed the permanent generation altogether, replacing it with the Metaspace.

Whether a constant can be reclaimed is easy to determine, but classes are more troublesome; all three of the following conditions must be met:

1. All instances of the class have been collected (ensuring no object still needs the class's bytecode, methods, and so on).

2. The ClassLoader that loaded the class has been collected (ensuring no new instance can be created through it).

3. The corresponding java.lang.Class object is not referenced anywhere, so the class's methods cannot be accessed through reflection either. (This complements condition 2 on the reflection side.)

Even when all three are satisfied, the class is not necessarily unloaded. HotSpot provides the -Xnoclassgc parameter to control this, and class loading and unloading can be observed with -verbose:class together with -XX:+TraceClassLoading and -XX:+TraceClassUnloading. The first two work in product builds of the VM; -XX:+TraceClassUnloading requires a FastDebug build. A small sketch of driving a class toward unloadability follows.
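A hypothetical sketch of making a class eligible for unloading; the class path and class name are placeholders, not real artifacts, and unloading is never guaranteed even when all three conditions hold.

```java
import java.io.File;
import java.net.URL;
import java.net.URLClassLoader;

// Minimal sketch: load a class with a throwaway ClassLoader, then drop every reference
// so all three unloading conditions can be met. Run with -verbose:class to observe.
public class ClassUnloadDemo {
    public static void main(String[] args) throws Exception {
        URLClassLoader loader = new URLClassLoader(
                new URL[]{ new File("/tmp/classes/").toURI().toURL() });   // hypothetical class path
        Class<?> clazz = loader.loadClass("com.example.Temp");             // hypothetical class
        Object instance = clazz.getDeclaredConstructor().newInstance();

        instance = null;   // 1. no instances of the class remain
        loader.close();
        loader = null;     // 2. the defining ClassLoader is unreachable
        clazz = null;      // 3. the java.lang.Class object is unreachable

        System.gc();       // unloading may now happen; it is never guaranteed
    }
}
```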

3.3 HotSpot implementation

During reachability analysis the system must effectively be frozen, that is, the reference relationships must not change during that period. This is the famous "Stop The World", in which all user operations are paused and the object graph stays unchanged. A great deal of current effort goes into shortening this pause.

HotSpot uses a data structure called the OopMap to know directly where object references are stored. When a class is loaded, HotSpot records what type of data sits at which offset inside the object, and during JIT compilation it also records which stack slots and registers hold references at particular points. This lets the GC obtain the information directly when it scans.

With the help of OopMaps, the enumeration of GC Roots can be completed very quickly. The next problem is that a huge number of instructions can change references, and generating an OopMap for every instruction would be far too expensive. In practice an OopMap is not generated for every instruction; the information is only recorded at specific locations called safepoints, and the program can only be paused, and GC started, once it reaches a safepoint.

Safepoints are essentially chosen by asking whether the program has "a feature that makes it execute for a long time": individual instructions execute very briefly, so a program does not run too long merely because an instruction sequence is long. The clearest sign of "long execution" is instruction-sequence reuse, such as method calls, loop jumps, and exception jumps, so instructions with these functions generate safepoints.

The next problem is how to get every thread (not counting those executing JNI calls) to run to a safepoint and then stop. There are two approaches:

1. Preemptive suspension: no cooperation from the thread's own code is required. When a GC is needed, all threads are interrupted first; if a thread turns out not to be at a safepoint, it is resumed and allowed to run on to one. Almost no virtual machine uses this method any more.

2. Voluntary suspension: when the GC needs to interrupt threads it does not act on them directly; it simply sets a flag, and each thread actively polls the flag as it executes, suspending itself when the flag is true. The polling points coincide with the safepoints, plus the places where memory needs to be allocated for new objects. The generated polling instruction is a test instruction; when a thread executes it, a self-trap exception is raised and the thread suspends itself, waiting inside a pre-registered exception handler.

Safepoints seem to solve the problem of how to enter GC, but not entirely. The safepoint mechanism guarantees that a running program will reach a safepoint from which GC can start within a short time. But what about a program that is not running? Not running means not being scheduled any CPU time; the typical examples are threads that are sleeping or blocked. Such a thread cannot respond to the JVM's interrupt request and walk to a safe place to suspend itself, and the JVM clearly cannot just wait for the thread to be given CPU time again. This case needs another mechanism, the safe region, to solve.

A safe region is a piece of code within which the reference relationships do not change; starting a GC anywhere inside the region is safe. When a thread executes into a safe region it marks itself as being in one, and the GC then does not need to care about that thread. When the thread is about to leave the safe region it checks whether the system has finished the root-node enumeration; if so it simply continues, otherwise it must wait until it receives the signal that it may leave the safe region.

4. Garbage collector

Having said all that, let us look at the garbage collection implementations in the JDK 7 HotSpot virtual machine. As mentioned earlier, the JVM splits memory into the young generation and the old generation precisely so that different algorithms can be applied to objects of different lifetimes; accordingly there are several garbage collectors, each covering one generation and combinable with certain collectors of the other generation.

Young generation: Serial, ParNew, Parallel Scavenge

Old generation: CMS, Serial Old, Parallel Old

G1 covers both the young and the old generation. The other collectors can only be combined across generations: you pick one for the young generation and a compatible one for the old generation; since G1 covers both, choosing it means choosing nothing else. Note that this is the set of collectors provided by earlier JVM versions; there has been great progress in recent years, for example the ZGC introduced in JDK 11, which is said to be very powerful. JDK 9 improved the G1 collector and removed several combinations: DefNew + CMS, ParNew + Serial Old, and incremental CMS. The sketch below shows how to see which combination the running JVM has picked.
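If you want to see which combination your own JVM has chosen, here is a minimal sketch using the standard GarbageCollectorMXBean API; the names printed depend on the JVM and flags, for example "PS Scavenge"/"PS MarkSweep" for Parallel Scavenge + Parallel Old, "ParNew"/"ConcurrentMarkSweep", or "G1 Young Generation"/"G1 Old Generation".

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Minimal sketch: print the collector combination chosen by the running JVM
// and the memory pools each collector manages.
public class ShowCollectors {
    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + " manages: "
                    + String.join(", ", gc.getMemoryPoolNames()));
        }
    }
}
```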

4.1 Serial Collector

This is the most basic collector and has a long history. It is single-threaded, which means not only that it collects with one thread but that it pauses all other threads while doing so. This is the classic Stop The World, and it is very unpleasant: imagine a program that has run for an hour suddenly stopping for five minutes. This is the Serial / Serial Old combination.

Even so, this collector is still the default young-generation collector in client mode as of version 1.7. Its advantages are simplicity and efficiency.

4.2 ParNew Collector

This is the multi-threaded evolution of Serial; apart from using multiple collection threads it behaves the same as Serial, including the control parameters such as -XX:SurvivorRatio, -XX:PretenureSizeThreshold, and -XX:HandlePromotionFailure.

Apart from multithreading the collector adds little over Serial, yet it is the preferred young-generation collector for many virtual machines running in server mode, because, apart from Serial, it is currently the only one able to work with the CMS collector (the first truly meaningful concurrent collector, the first to let garbage collection threads run at the same time as user threads).

Unfortunately CMS, as an old-generation collector, cannot work with Parallel Scavenge, so with CMS you can only choose ParNew or Serial for the young generation. When the parameter -XX:+UseConcMarkSweepGC is given, ParNew is the default young-generation collector, and -XX:+UseParNewGC can be used to force it explicitly.

On a single CPU, ParNew is certainly no better than Serial because of thread-switching overhead. The number of collection threads is by default related to the number of CPU cores and can be limited with -XX:ParallelGCThreads.

4.3 Parallel Scavenge Collector

This is also a young-generation collector using the copying algorithm with parallel multithreading, so what distinguishes it from ParNew? The difference lies in its goal.

The usual goal of a collector is to finish the cleanup as quickly as possible, the shorter the pause the better. Parallel Scavenge instead focuses on throughput, the proportion of total elapsed time spent running user code. The formula is: throughput = time running user code / (time running user code + garbage collection time). For example, if the program runs for 10 minutes in total and garbage collection takes 1 minute, the throughput is 90%.

Strongly interactive tasks want the pauses as short as possible; nobody will tolerate a mouse click stalling for a minute while garbage is collected. A high-throughput collector is not suited to interactive work, but it uses the CPU more efficiently, so batch-style computation gets done faster.

-XX:MaxGCPauseMillis sets the maximum garbage collection pause; setting it very low sacrifices throughput and young-generation size and makes collections more frequent.

-XX:GCTimeRatio sets the throughput, an integer greater than 0 and less than 100. Setting 19, for example, means GC may take 5% of the time (1/(1+19)); the default is 99, i.e. GC time of 1%.

-XX:+UseAdaptiveSizePolicy removes the need to set detailed parameters such as -Xmn, -XX:SurvivorRatio, and -XX:PretenureSizeThreshold; the collector adapts them itself. You only need to set -Xmx for the maximum heap plus the two throughput parameters above.

4.4 Serial Old Collector

This is the old-generation counterpart of Serial: equally single-threaded, using the mark-compact algorithm, and mainly used in client mode.

There are two other uses:

1. Working together with the Parallel Scavenge collector

2. Serving as the fallback for CMS when a Concurrent Mode Failure occurs.

4.5 Parallel Old Collector

This collector ends the dilemma that existed before it: if Parallel Scavenge was chosen for the young generation, the old generation could only use the lower-performing Serial Old collector.

The throughput of Parallel Scavenge + Serial Old was not necessarily better than the ParNew + CMS combination; Parallel Old removes that embarrassment and makes the pair a good fit for applications where throughput comes first.

4.6 CMS Collector

The Concurrent Mark Sweep collector aims for the shortest possible collection pauses. Many Java programs today are concentrated on Internet sites or the server side of B/S systems, where response time matters most: the shorter the pauses, the better the user experience.

Unlike the old-generation collectors described earlier, which use mark-compact, CMS is based on the mark-sweep algorithm. The whole process is divided into 4 steps:

1. Initial mark: pause all user threads and mark the objects directly reachable from the GC Roots; fast

2. Concurrent mark: no pause needed; this is the GC Roots tracing phase

3. Remark: pause the user threads again to correct the marks of objects that changed while the user program kept running during the concurrent mark; this takes somewhat longer than the initial mark but far less than the concurrent mark

4. Concurrent sweep: no pause needed; clear out the marked objects

The CMS has 3 notable drawbacks:

1. It is sensitive to CPU resources. The concurrent phases do not pause user threads, but they do take a share of the CPU, slowing the application down and reducing overall throughput. The default number of collection threads is (number of CPUs + 3) / 4, i.e. no less than 25% of CPU resources. An incremental variant of CMS was developed to let GC threads and user threads run alternately during the concurrent phases, reducing the time the GC monopolizes the CPU, but it did not perform well and, as mentioned above, was removed in JDK 9.

2. It cannot handle floating garbage, and a Concurrent Mode Failure may occur, causing a fallback full GC. During the concurrent sweep phase the user threads are still producing garbage; that garbage appears after marking and is left to the next GC, and is called floating garbage. For the same reason CMS cannot wait until the old generation is nearly full before collecting, as other collectors can; it must keep some space in reserve for the user threads. In JDK 5 the collection is triggered when the old generation is 68% used, and the trigger percentage can be raised with -XX:CMSInitiatingOccupancyFraction. If the reserved space turns out to be insufficient, a Concurrent Mode Failure occurs and the backup plan, the Serial Old collector mentioned earlier, takes over the collection with a much longer pause.

3. Mark-sweep leaves a lot of fragmented space, which makes allocating large objects very inconvenient and can trigger a full GC ahead of time. The switch -XX:+UseCMSCompactAtFullCollection (enabled by default) makes the CMS collector compact the memory when it has to perform a full GC; defragmentation cannot be done concurrently, so that pause is longer. -XX:CMSFullGCsBeforeCompaction sets how many uncompacted full GCs are allowed before a compacting one; the default of 0 means every full GC compacts.

4.7 G1 Collector

The G1 collector became the default collector in JDK 9 and has developed well over the years. It is a garbage collector aimed at server-side applications, with the mission of eventually replacing the CMS collector released in JDK 1.5. You can also see from this how long the collector took to mature.

G1 has the following features:

1. Concurrency and parallelism: it exploits multi-CPU, multi-core environments to shorten Stop-The-World time

2. Generational collection: the generational concept is preserved in G1, even though it manages the young and old generations by itself.

3. Space compaction: as a whole the collector is based on the mark-compact algorithm, and locally (between two regions) on the copying algorithm, so it does not produce large amounts of fragmentation.

4. Predictable pauses: beyond pursuing low pauses, G1 builds a predictable pause-time model, letting the user specify that within a time slice of length M milliseconds no more than N milliseconds may be spent on garbage collection, which is almost the hallmark of a real-time Java (RTSJ) garbage collector.

Although the generational concept is preserved, the generations are no longer physically separated; the heap is divided into multiple independent regions of equal size. The pause-time model is possible precisely because G1 can plan its work and avoid collecting garbage across the entire Java heap at once. G1 tracks how much garbage has accumulated in each region, maintains a priority list in the background, and at each collection, within the allowed pause time, collects the most valuable regions first, which guarantees the best possible efficiency within a limited time. The idea is simple, but the implementation is very complex, because the regions are not isolated: you cannot scan only one region, or how would you know about references from other regions? From the first G1 papers in 2004 until it became the default collector, the difficulty is easy to imagine.

In G1 the Remembered Set is used to avoid scanning the whole heap; each region has its own. When the virtual machine finds the program writing to a field of reference type, it generates a write barrier that briefly interrupts the write and checks whether the referenced object lives in a different region; if so, the relevant information is recorded via a card table into the Remembered Set of the region the referenced object belongs to. At collection time, including the Remembered Sets in the scope of the GC Roots ensures that no full-heap scan is needed and no cross-region references are missed. A purely conceptual sketch of such a write barrier follows.
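Purely as a concept, not HotSpot code (the real barrier is emitted by the JIT and G1's details differ), a dirty-card write barrier boils down to something like this sketch, where all names and sizes are illustrative.

```java
// Conceptual sketch of a card-table write barrier; not HotSpot's implementation.
class CardTableSketch {
    private static final int CARD_SHIFT = 9;    // one card covers 2^9 = 512 bytes of heap (a typical choice)
    private static final byte DIRTY = 0;
    private final byte[] cardTable;              // one byte per card

    CardTableSketch(long heapSizeBytes) {
        this.cardTable = new byte[(int) (heapSizeBytes >>> CARD_SHIFT) + 1];
    }

    // Conceptually run on every store of a reference field: mark the card holding the field as dirty.
    // G1's real barrier also filters out stores where source and target live in the same region;
    // a refinement step later scans dirty cards and updates the Remembered Set of the referenced region.
    void onReferenceStore(long fieldAddress) {
        cardTable[(int) (fieldAddress >>> CARD_SHIFT)] = DIRTY;
    }
}
```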

The G1 collection process is similar to CMS and is divided into the following stages:

1. Initial mark: mark the objects directly reachable from the GC Roots and modify the TAMS (Top at Mark Start) value, so that in the next stage the concurrently running user program can create new objects in a correctly available region

2. Concurrent mark: perform reachability analysis on the objects in the heap to find the surviving objects; can run concurrently with the user program

3. Final mark: fix the marking records that changed because the user program kept running during the concurrent mark; the changes each thread recorded in its Remembered Set log are merged into the Remembered Set here.

4. Screening and evacuation: rank the regions by collection value and cost, and draw up a collection plan based on the GC pause time the user expects.

5. GC Logs

Time: "GC type" zone: GC before using capacity->GC (total capacity in this region), GC time consuming "GC before Java heap usage capacity->GC Java heap usage capacity (heap total capacity), total GC time consuming"

Example: 33.125: [GC [DefNew: 3324K->152K(3712K), 0.0025925 secs] 3324K->152K(11904K), 0.0031680 secs]

Which generation names appear depends on the collector in use.

DefNew: the young generation under the Serial collector

ParNew: the young generation under the ParNew collector

PSYoungGen: the young generation under the Parallel Scavenge collector

There is another format, such as:

100.667: [Full GC [Tenured: 0K->210K(10240K), 0.0149142 secs] 4603K->210K(19456K), [Perm: 2999K->2999K(21248K)], 0.0150007 secs] [Times: user=0.01 sys=0.00, real=0.02 secs]

In the Times section, user is the CPU time spent in user mode, sys is the CPU time spent in kernel mode, and real is the wall-clock time elapsed from start to finish. The difference between CPU time and wall-clock time is that wall-clock time includes all kinds of non-computational waiting, such as disk I/O, while CPU time does not. On a multi-core CPU, however, the times of multiple threads are added together, so you may see user or sys time exceed the real time.

The memory reclamation log is printed with the parameter -XX:+PrintGCDetails. A small program for producing such a log is sketched below.
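A minimal program for producing a log like the examples above, under assumed flags (on JDK 8 and earlier; JDK 9+ replaces -XX:+PrintGCDetails with -Xlog:gc*). Heap sizes and the class name are illustrative.

```java
// Minimal sketch. Run with, for example:
//   java -Xms20m -Xmx20m -Xmn10m -XX:SurvivorRatio=8 -XX:+PrintGCDetails GcLogDemo
public class GcLogDemo {
    private static final int _1MB = 1024 * 1024;

    public static void main(String[] args) {
        byte[] a1 = new byte[2 * _1MB];
        byte[] a2 = new byte[2 * _1MB];
        byte[] a3 = new byte[2 * _1MB];
        byte[] a4 = new byte[4 * _1MB];   // Eden (8 MB here) cannot also hold this, so a Minor GC is triggered
    }
}
```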

6. Memory allocation process

In most cases objects are allocated in Eden in the young generation; when there is not enough space a Minor GC is triggered, and if memory is still insufficient, objects are moved to the old generation under the allocation guarantee.

Large objects, such as long byte[] arrays, need a lot of contiguous space; they can trigger garbage collection early, and a burst of short-lived large objects is even worse. The virtual machine provides the -XX:PretenureSizeThreshold parameter: objects larger than this value are allocated directly in the old generation, avoiding large amounts of copying between Eden and the two Survivor spaces. The parameter must be given as a number of bytes and cannot be written directly in MB.

Another way into the old generation is to reach the age threshold: an object that survives a Minor GC has its age increased by 1, and once the threshold is reached it is promoted to the old generation. The threshold is set with -XX:MaxTenuringThreshold.

To make better use of memory, an object does not necessarily have to reach that age to be promoted: if the total size of all objects of the same age in the Survivor space is greater than half of the Survivor space, objects of that age and older can enter the old generation directly.

Before a Minor GC, the virtual machine checks whether the largest contiguous free space in the old generation is greater than the total size of all objects in the young generation; if it is, the Minor GC is guaranteed to be safe, because the old generation acts as the guarantor when the young generation runs short, and the worst case is that every object gets promoted. If the check fails, the HandlePromotionFailure setting is consulted to see whether guarantee failure is allowed; if it is, the virtual machine checks whether the largest contiguous free space in the old generation is greater than the average size promoted in previous collections, and if so it attempts the Minor GC anyway, accepting some risk. If it is smaller, or guarantee failure is not allowed, a Full GC is performed instead. From JDK 6 Update 24 on, this parameter no longer has any effect: a Minor GC is attempted whenever the free space exceeds the historical average promotion size, otherwise a Full GC is performed. A conceptual sketch of this check follows.
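Rendered as code purely for clarity (this is my conceptual sketch, not the JVM's implementation; all names are illustrative), the check reads roughly as follows.

```java
// Conceptual rendering of the promotion-guarantee check performed before a Minor GC.
final class PromotionGuaranteeSketch {
    static boolean minorGcConsideredSafe(long oldGenMaxContiguousFree,
                                         long youngGenUsed,
                                         long averagePromotedBytes,
                                         boolean handlePromotionFailureAllowed) {
        if (oldGenMaxContiguousFree >= youngGenUsed) {
            return true;    // worst case (everything promoted) still fits: the Minor GC is definitely safe
        }
        if (handlePromotionFailureAllowed && oldGenMaxContiguousFree >= averagePromotedBytes) {
            return true;    // risky attempt based on the historical average promotion size
        }
        return false;       // fall back to a Full GC instead
    }
}
```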
