JVM garbage collection: algorithms, garbage collectors, memory allocation, and collection strategies

Source: Internet
Author: User
Tags: garbage collection

Is the object dead?

There are two ways to determine whether an object is still alive:

  1. Reference counting: each object carries a reference counter. The counter is incremented when a new reference to the object is created and decremented when a reference is released. When the counter reaches 0, the object can be reclaimed.

  2. Reachability analysis: the collector starts from a set of GC roots and searches downward; the paths it follows are called reference chains. When no reference chain connects an object to any GC root, the object is unreachable and therefore no longer usable.

  In Java, the objects that can serve as GC roots include the following:
(1) Objects referenced from the virtual machine stack (the local variable table in a stack frame);
(2) Objects referenced by static class fields in the method area;
(3) Objects referenced by constants in the method area;
(4) Objects referenced by JNI (that is, native methods) in the native method stack.
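Reachability analysis can be sketched as a graph search. The following is a minimal illustration, not how HotSpot implements it: the Node class, the Reachability class, and the tiny object graph are all hypothetical. It also shows a case that plain reference counting cannot handle, because b and c reference each other and their counters would never drop to 0.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Hypothetical toy object: it only holds outgoing references to other nodes.
class Node {
    final String name;
    final List<Node> refs = new ArrayList<>();

    Node(String name) {
        this.name = name;
    }
}

class Reachability {
    // Breadth-first search from the GC roots; every node visited is alive.
    static Set<Node> reachableFrom(List<Node> roots) {
        Set<Node> seen = Collections.newSetFromMap(new IdentityHashMap<Node, Boolean>());
        Deque<Node> queue = new ArrayDeque<>(roots);
        seen.addAll(roots);
        while (!queue.isEmpty()) {
            Node current = queue.poll();
            for (Node next : current.refs) {
                if (seen.add(next)) {   // add() returns false if already visited
                    queue.add(next);
                }
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        Node root = new Node("root");
        Node a = new Node("a");
        Node b = new Node("b");
        Node c = new Node("c");
        root.refs.add(a);
        // b and c form a cycle, but no reference chain leads to them from a root.
        b.refs.add(c);
        c.refs.add(b);

        Set<Node> live = reachableFrom(Collections.singletonList(root));
        System.out.println("a alive: " + live.contains(a)); // true
        System.out.println("b alive: " + live.contains(b)); // false
        System.out.println("c alive: " + live.contains(c)); // false
    }
}
```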

  To truly declare an object dead, it has to be marked at least twice. If reachability analysis finds that no reference chain connects the object to a GC root, it is marked for the first time.

  Garbage collection in the permanent generation mainly reclaims two things: obsolete constants and useless classes.

  A class counts as a "useless class" only if it satisfies all 3 of the following conditions:

(1) All instances of the class have been reclaimed, that is, no instance of the class exists in the Java heap;
(2) The ClassLoader that loaded the class has been reclaimed;
(3) The java.lang.Class object for the class is not referenced anywhere, so the methods of the class cannot be accessed anywhere through reflection.

Whether such classes are actually reclaimed is controlled in the HotSpot virtual machine by the -Xnoclassgc parameter, and you can use -verbose:class and -XX:+TraceClassUnloading to view class loading and unloading information.

A brief introduction to references:

The HotSpot virtual machine uses reachability analysis to decide whether an object is alive, so it is worth looking at Java references. Java has four kinds: strong references (Strong Reference), soft references (Soft Reference), weak references (Weak Reference), and phantom references (Phantom Reference).

Soft, weak, and phantom references are implemented by the SoftReference, WeakReference, and PhantomReference classes in the java.lang.ref package; a strong reference is an ordinary reference and needs no wrapper class.

  

Strong references (Strong Reference)

These are the ordinary references found throughout program code, such as Object obj = new Object(). As long as a strong reference exists, the garbage collector never reclaims the referenced object.

Strong references have the following three features:

1. A strong reference gives direct access to the target object;
2. The object that a strong reference points to is never reclaimed by the system. The JVM would rather throw an OutOfMemoryError than reclaim an object that a strong reference points to;
3. Strong references can lead to memory leaks.

Soft references (Soft Reference)

Soft references describe objects that are useful but not essential. Before the system is about to throw an out-of-memory error, objects that are only softly reachable are included in the collection scope for a second collection. If that collection still does not free enough memory, the out-of-memory error is thrown. In other words, as long as memory is sufficient the garbage collector does not reclaim softly referenced objects; if memory runs short, their memory is reclaimed.
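A common use of this behaviour is a memory-sensitive cache. The sketch below assumes a hypothetical SoftCache class whose load() method stands in for an expensive computation; only java.lang.ref.SoftReference is a real JDK class. When the collector clears the reference under memory pressure, get() simply reloads the value.

```java
import java.lang.ref.SoftReference;

// Hypothetical single-entry cache backed by a soft reference.
class SoftCache {
    private SoftReference<byte[]> cached = new SoftReference<>(null);
    int loads = 0; // how many times the value had to be (re)built

    byte[] get() {
        byte[] data = cached.get();       // null if never loaded or already cleared
        if (data == null) {
            data = load();
            cached = new SoftReference<>(data);
        }
        return data;
    }

    // Stand-in for an expensive operation such as reading a file.
    private byte[] load() {
        loads++;
        return new byte[1024];
    }

    public static void main(String[] args) {
        SoftCache cache = new SoftCache();
        byte[] first = cache.get();
        byte[] second = cache.get();
        // While 'first' is strongly held the referent cannot be cleared,
        // so the second call returns the same array without reloading.
        System.out.println("same instance: " + (first == second));
        System.out.println("loads: " + cache.loads);
    }
}
```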

Weak references (Weak Reference)

Weak references also describe non-essential objects, but they are weaker than soft references: an object that is only weakly reachable survives only until the next garbage collection. When the garbage collector runs, it reclaims such objects regardless of whether memory is currently sufficient. Once the referent has been reclaimed, the weak reference is added to the reference queue it was registered with, if any.
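The following sketch shows a weak reference registered with a reference queue. The WeakDemo class is hypothetical; WeakReference and ReferenceQueue are the real java.lang.ref classes. Note that System.gc() is only a request, so the program does not assume that the reference is actually cleared and enqueued within the timeout.

```java
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;

class WeakDemo {
    // A weak reference still resolves as long as a strong reference exists,
    // even if a collection runs in between.
    static boolean resolvesWhileStronglyHeld() {
        Object referent = new Object();
        WeakReference<Object> weak = new WeakReference<>(referent);
        System.gc();
        return weak.get() == referent;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("resolves while strongly held: " + resolvesWhileStronglyHeld());

        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        // No strong reference to the new Object is kept anywhere.
        WeakReference<Object> weak = new WeakReference<>(new Object(), queue);
        System.gc(); // only a request; the JVM may ignore it

        // Wait up to one second for the cleared reference to show up in the queue.
        boolean enqueued = queue.remove(1000) == weak;
        System.out.println("enqueued after GC request: " + enqueued);
        System.out.println("referent cleared: " + (weak.get() == null));
    }
}
```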

Phantom references (Phantom Reference)

A phantom reference is the weakest kind of reference. An object that is referenced only by phantom references is treated almost as if it had no references at all and can be reclaimed by the garbage collector at any time. A phantom reference can never be used to obtain a strong reference: its get() method always returns null. A phantom reference must also be used together with a reference queue; its purpose is to track the garbage collection process.
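The two rules in this paragraph can be seen directly in the API: the only PhantomReference constructor takes a ReferenceQueue, and get() returns null even while the referent is still strongly reachable. PhantomDemo is a hypothetical class name used for this small sketch.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.ReferenceQueue;

class PhantomDemo {
    // Returns whatever get() yields while the referent is still strongly held.
    static Object tryGet() {
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        Object referent = new Object();
        // The constructor requires a queue; there is no queue-less variant.
        PhantomReference<Object> phantom = new PhantomReference<>(referent, queue);
        Object result = phantom.get();
        // Touch the referent afterwards so it is certainly alive during get().
        if (referent.hashCode() == 0 && result != null) {
            throw new IllegalStateException("unreachable in practice");
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println("phantom.get(): " + tryGet()); // null
    }
}
```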

Garbage collection algorithms:

1. Mark-Sweep algorithm

The mark-sweep algorithm has two phases, "mark" and "sweep": all objects that need to be reclaimed are marked first, and once marking is complete all marked objects are reclaimed together. It is the most basic collection algorithm, because the later algorithms all build on this idea and improve on its shortcomings.

It has two main shortcomings. The first is efficiency: neither the marking nor the sweeping process is very efficient. The second is space: sweeping leaves behind a large number of discontiguous memory fragments. With too much fragmentation, a later attempt to allocate a large object may fail to find enough contiguous memory and force another garbage collection to be triggered early.
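The fragmentation problem can be illustrated with a toy heap. In this sketch, which is an assumption-laden model and not JVM code, the heap is an int array where 0 means a free slot, and the survived array stands for the outcome of the mark phase (true means the object stays). Sweeping frees slots in place, so free memory ends up scattered.

```java
import java.util.Arrays;

class MarkSweep {
    // Sweep phase: reclaim every occupied slot that did not survive marking.
    static int[] sweep(int[] heap, boolean[] survived) {
        int[] after = heap.clone();
        for (int i = 0; i < after.length; i++) {
            if (after[i] != 0 && !survived[i]) {
                after[i] = 0; // freed in place; nothing is moved
            }
        }
        return after;
    }

    // Length of the longest run of contiguous free slots.
    static int largestFreeRun(int[] heap) {
        int best = 0;
        int run = 0;
        for (int slot : heap) {
            run = (slot == 0) ? run + 1 : 0;
            best = Math.max(best, run);
        }
        return best;
    }

    public static void main(String[] args) {
        int[] heap = {1, 2, 3, 4, 5, 6};
        boolean[] survived = {true, false, true, false, true, false};
        int[] after = sweep(heap, survived);
        System.out.println(Arrays.toString(after));                    // [1, 0, 3, 0, 5, 0]
        System.out.println("largest free run: " + largestFreeRun(after)); // 1
    }
}
```

Three slots are free after the sweep, yet an object that needs two contiguous slots cannot be allocated.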

2. Copying algorithm

The copying algorithm divides the available memory into two blocks of equal size and uses only one of them at a time. When that block is exhausted, the surviving objects are copied to the other block, and the used block is then cleaned up in one pass.

Each collection therefore reclaims an entire half of the memory, and allocation never has to deal with fragmentation: it only needs to move the heap-top pointer and allocate memory in order, which is simple to implement and efficient to run. The cost is that usable memory shrinks to half of the original, and repeatedly copying long-lived objects makes the algorithm inefficient.
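A minimal sketch of the idea, using a hypothetical CopyingHeap class: objects are plain int ids, allocation is a bump of the top pointer, and the set of live ids passed to collect() stands in for the result of reachability analysis.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class CopyingHeap {
    private int[] fromSpace; // the half currently used for allocation
    private int[] toSpace;   // the idle half
    private int top = 0;     // heap-top (bump) pointer into fromSpace

    CopyingHeap(int halfSize) {
        fromSpace = new int[halfSize];
        toSpace = new int[halfSize];
    }

    // Allocation only moves the pointer; returns the slot index or -1 if full.
    int allocate(int objectId) {
        if (top == fromSpace.length) {
            return -1;
        }
        fromSpace[top] = objectId;
        return top++;
    }

    // Copy survivors to the idle half, wipe the used half, then swap roles.
    void collect(Set<Integer> liveIds) {
        int newTop = 0;
        for (int i = 0; i < top; i++) {
            if (liveIds.contains(fromSpace[i])) {
                toSpace[newTop++] = fromSpace[i];
            }
        }
        Arrays.fill(fromSpace, 0);
        int[] tmp = fromSpace;
        fromSpace = toSpace;
        toSpace = tmp;
        top = newTop;
    }

    int[] snapshot() {
        return Arrays.copyOf(fromSpace, top);
    }

    public static void main(String[] args) {
        CopyingHeap heap = new CopyingHeap(4);
        for (int id = 1; id <= 4; id++) {
            heap.allocate(id);
        }
        System.out.println("allocate when full: " + heap.allocate(5)); // -1
        heap.collect(new HashSet<>(Arrays.asList(2, 4)));
        System.out.println(Arrays.toString(heap.snapshot()));          // [2, 4]
        System.out.println("allocate after GC: " + heap.allocate(5));  // 2
    }
}
```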

3. Mark-Compact algorithm

The copying algorithm has to perform more copy operations as the object survival rate rises, and its efficiency drops accordingly. More importantly, if you do not want to waste 50% of the space, you need additional space as an allocation guarantee to cope with the extreme case in which 100% of the objects in the used memory survive. For this reason the copying algorithm generally cannot be used directly in the old generation.

Based on the characteristics of the old generation, the "mark-compact" algorithm was proposed. Its marking phase is the same as in mark-sweep, but instead of cleaning up the reclaimable objects directly, it moves all surviving objects toward one end and then clears all memory beyond the end boundary.
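The compaction step can be sketched on a toy heap. The model here is an assumption for illustration only: the heap is an int array where 0 is a free slot, and the survived array is the outcome of the marking phase. Survivors slide toward index 0, and everything past the boundary is cleared.

```java
import java.util.Arrays;

class MarkCompact {
    // Slides survivors to the low end of the heap in place and returns the
    // boundary index: slots [0, boundary) are live, the rest are free.
    static int compact(int[] heap, boolean[] survived) {
        int boundary = 0;
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] != 0 && survived[i]) {
                heap[boundary++] = heap[i]; // boundary <= i, so nothing live is overwritten
            }
        }
        Arrays.fill(heap, boundary, heap.length, 0); // clear memory beyond the boundary
        return boundary;
    }

    public static void main(String[] args) {
        int[] heap = {1, 2, 3, 4, 5, 6};
        boolean[] survived = {true, false, true, false, true, false};
        int boundary = compact(heap, survived);
        System.out.println(Arrays.toString(heap));   // [1, 3, 5, 0, 0, 0]
        System.out.println("boundary: " + boundary); // 3
    }
}
```

After compaction the three free slots are contiguous, so a large allocation that needs them can succeed.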

                                              

4. Generational collection algorithm

The basic hypothesis behind generational GC: most objects have very short lifetimes.

The "generational collection" algorithm divides the Java heap into a young generation and an old generation so that the most suitable collection algorithm can be used for each. In the young generation, every collection finds that most objects have died and only a few survive, so the copying algorithm is chosen: the collection costs only the copying of a small number of survivors. In the old generation, objects have a high survival rate and there is no additional space to act as an allocation guarantee, so it must be collected with the mark-sweep or mark-compact algorithm.
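As a rough illustration of how objects move between the two generations, the sketch below ages objects on every young-generation collection and promotes them once they reach a threshold. The GenerationalHeap class, the threshold of 2, and the use of names to identify objects are all assumptions made for this example; real collectors track ages in object headers and use tunable thresholds.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

class GenerationalHeap {
    static final int TENURING_THRESHOLD = 2; // hypothetical promotion age

    final Map<String, Integer> young = new LinkedHashMap<>(); // object name -> age
    final Set<String> old = new LinkedHashSet<>();

    void allocate(String name) {
        young.put(name, 0); // new objects always start in the young generation
    }

    // Young-generation collection: dead objects vanish, survivors age by one,
    // and survivors that reach the threshold are promoted to the old generation.
    void minorGc(Set<String> live) {
        Iterator<Map.Entry<String, Integer>> it = young.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Integer> entry = it.next();
            if (!live.contains(entry.getKey())) {
                it.remove();
                continue;
            }
            int age = entry.getValue() + 1;
            if (age >= TENURING_THRESHOLD) {
                old.add(entry.getKey());
                it.remove();
            } else {
                entry.setValue(age);
            }
        }
    }

    public static void main(String[] args) {
        GenerationalHeap heap = new GenerationalHeap();
        heap.allocate("a");
        heap.allocate("b");
        heap.allocate("c");
        heap.minorGc(new HashSet<>(Arrays.asList("a")));      // b and c die young
        heap.allocate("d");
        heap.minorGc(new HashSet<>(Arrays.asList("a", "d"))); // a survives a second time
        System.out.println("young=" + heap.young + " old=" + heap.old);
        // young={d=1} old=[a]
    }
}
```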

Garbage collectors:

                                              

1. Serial Collector:

The Serial collector is the oldest and most stable collector, and it is efficient, but it uses only one thread for collection and can cause long pauses. Both the young generation and the old generation are collected serially: the young generation uses the copying algorithm and the old generation uses mark-compact. The garbage collection process stops the world (the service pauses).

The young-generation collector is named "Default New Generation" and appears in the GC log as "DefNew". It is the default young-generation collector in client mode.

Parameter control: -XX:+UseSerialGC enables the Serial collector

2. ParNew Collector:

The ParNew collector is essentially a multithreaded version of the Serial collector. The young generation is collected in parallel and the old generation serially; the young generation uses the copying algorithm and the old generation uses mark-compact.

    The young-generation collector is named "Parallel New Generation" and appears in the GC log as "ParNew". It is the preferred young-generation collector in server mode, and apart from Serial it is the only young-generation collector that can work together with the CMS collector.

    Parameter control:

-XX:+UseParNewGC enables the ParNew collector

-XX:ParallelGCThreads limits the number of GC threads

3. Parallel Scavenge Collector:

The Parallel Scavenge collector is a parallel, multithreaded collector similar to ParNew, but it pays more attention to system throughput. An adaptive tuning strategy can be switched on with a parameter: the virtual machine then collects performance monitoring data from the running system and dynamically adjusts its parameters to provide the most suitable pause time or the maximum throughput, where throughput = time running user code / (time running user code + garbage collection time). Alternatively, parameters can be used to keep GC time below a given number of milliseconds or a given ratio. The young generation uses the copying algorithm and the old generation uses mark-compact.

    The name in the GC log is "PSYoungGen".

    Adaptive tuning is also an important difference between the Parallel Scavenge collector and the ParNew collector.

Parameter control:

    -XX:+UseParallelGC uses the Parallel Scavenge collector for the young generation with a serial old-generation collector

Parameters for precise throughput control:

     -XX:MaxGCPauseMillis sets the target maximum garbage collection pause time; the value is a number of milliseconds greater than 0

     -XX:GCTimeRatio sets the throughput target directly. Its value is an integer N greater than 0 and less than 100 that sets the ratio of garbage collection time to application time to 1:N, so garbage collection may take at most 1/(1+N) of the total time

     -XX:+UseAdaptiveSizePolicy when this switch is on, you do not need to specify by hand the young generation size (-Xmn), the ratio of Eden to Survivor (-XX:SurvivorRatio), the size threshold for allocating objects directly in the old generation (-XX:PretenureSizeThreshold), and other detail parameters.
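The arithmetic behind throughput and -XX:GCTimeRatio can be checked with a few lines. The Throughput class and its method names are hypothetical; the formulas are the throughput definition given above and HotSpot's documented meaning of GCTimeRatio=N, a garbage collection share of 1/(1+N) of total time.

```java
class Throughput {
    // throughput = user code time / (user code time + GC time)
    static double throughput(double userMillis, double gcMillis) {
        return userMillis / (userMillis + gcMillis);
    }

    // With -XX:GCTimeRatio=N the GC time target is 1/(1+N) of total time.
    static double maxGcFraction(int gcTimeRatio) {
        return 1.0 / (1 + gcTimeRatio);
    }

    public static void main(String[] args) {
        // 99 ms of user code and 1 ms of GC give 99% throughput.
        System.out.println(throughput(99, 1));  // 0.99
        // GCTimeRatio=19 allows GC to use 1/20 = 5% of total time.
        System.out.println(maxGcFraction(19));  // 0.05
        // GCTimeRatio=99 allows GC to use 1/100 = 1% of total time.
        System.out.println(maxGcFraction(99));  // 0.01
    }
}
```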

4. Serial Old Collector

The old-generation version of the Serial collector: a single-threaded collector that uses the mark-compact algorithm. Its main use is with the virtual machine in client mode.

5. Parallel Old Collector

Parallel Old is the old-generation version of Parallel Scavenge. It uses multiple threads and the mark-compact algorithm.

    Parameter control: -XX:+UseParallelOldGC uses the Parallel Scavenge collector together with the parallel old-generation collector

6. CMS Collector

The CMS (Concurrent Mark Sweep) collector aims for the shortest possible collection pause and is based on the mark-sweep algorithm. A large share of Java applications today run on the server side of Internet sites or B/S (browser/server) systems. Such applications care greatly about response time and want pauses to be as short as possible in order to give users a better experience.

The whole collection process has 4 steps:

(1) Initial mark (CMS initial mark)

(2) Concurrent mark (CMS concurrent mark)

(3) Remark (CMS remark)

(4) Concurrent sweep (CMS concurrent sweep)

Of these, the initial mark and remark steps still need to stop the world. The initial mark only marks the objects that GC roots reference directly, so it is fast. The concurrent mark phase traces the object graph from the GC roots. The remark phase corrects the marks of objects whose state changed because the user program kept running during concurrent marking; its pause is generally somewhat longer than the initial mark pause but far shorter than the concurrent mark phase.

The two longest phases, concurrent mark and concurrent sweep, run alongside the user threads, so overall the memory reclamation of the CMS collector is performed concurrently with the user program. CMS is an old-generation collector (the young generation uses ParNew).

    Pros: concurrent collection, low pauses
    Cons: produces a large amount of memory fragmentation; the concurrent phases reduce throughput

     Parameter settings:

    -XX:+UseConcMarkSweepGC uses the CMS collector
    -XX:+UseCMSCompactAtFullCollection turns on memory defragmentation when the CMS collector has to fall back to a full GC; the compaction cannot run concurrently, so the pause becomes longer (on by default)
    -XX:CMSFullGCsBeforeCompaction sets how many full GCs run without compaction before one that compacts
    -XX:ParallelCMSThreads sets the number of CMS threads (typically approximately equal to the number of available CPUs)

7. G1 Collector

The G1 collector is a server-oriented collector. Compared with CMS it has the following characteristics:

    1. Space compaction: G1 uses a mark-compact approach and does not produce memory fragmentation, so allocating a large object does not trigger a full GC early for lack of contiguous space.

    2. Predictable pauses: this is another major advantage of G1. Reducing pause time is a goal that G1 and CMS share, but G1 also builds a predictable pause-time model that lets the user specify that, within any time slice of M milliseconds, garbage collection should take no more than N milliseconds. This is almost a characteristic of a real-time Java (RTSJ) garbage collector.

The collectors described above each work on the whole young generation or the whole old generation; G1 does not. With G1 the memory layout of the Java heap is very different from that of the other collectors: the entire heap is divided into many independent regions (Region) of equal size. The concepts of young generation and old generation are retained, but the two are no longer physically separated; each is a set of regions that need not be contiguous.

                                              

G1's young-generation collection is similar to that of ParNew: a collection starts when the occupancy of the young generation reaches a certain proportion. Like CMS, G1 collects old-generation objects with only short pauses.

Collection steps:

    1. Marking phase: first comes the initial mark (Initial-mark). This phase is a pause (a stop-the-world event) and is triggered together with a normal minor GC. Corresponding GC log entry: GC pause (initial-mark)

    2. Root region scanning: while the application keeps running, the survivor regions are scanned for references into the old generation. This phase must finish before the next young GC can start.

    3. Concurrent marking: the entire heap is marked concurrently with the application; this phase may be interrupted by a young GC. If a region is found during concurrent marking to contain nothing but garbage, it is reclaimed immediately (marked X in the figure). The liveness of each region (the proportion of live objects in it) is also calculated during concurrent marking.

    4. Remark: there is a short pause (STW). The remark phase completes the marking of live objects, covering the changes made while concurrent marking ran alongside the application. G1 uses an algorithm that is faster than the one in CMS: snapshot-at-the-beginning (SATB).

    5. Copy/cleanup: multiple threads remove dead objects, with an STW pause. G1 copies the surviving objects of the reclaimed regions to new regions, clears the remembered sets, empties the reclaimed regions, and returns them to the free-region list.

    6. After the copy/cleanup phase: the live objects of the reclaimed regions have been compacted into the regions shown in dark blue and dark green in the figure.
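The per-region liveness computed during concurrent marking, combined with a pause-time budget, suggests how a collector could decide which regions to evacuate. The sketch below is a deliberately simplified assumption and not G1's actual policy: the HeapRegion and CollectionSetChooser classes and the fixed per-region evacuationMs cost are hypothetical, whereas the real collector predicts costs from runtime statistics. Regions with the least live data are picked first until the budget is used up.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical description of one heap region.
class HeapRegion {
    final String name;
    final double liveRatio;  // proportion of live objects, from concurrent marking
    final int evacuationMs;  // assumed cost of evacuating this region

    HeapRegion(String name, double liveRatio, int evacuationMs) {
        this.name = name;
        this.liveRatio = liveRatio;
        this.evacuationMs = evacuationMs;
    }
}

class CollectionSetChooser {
    // Greedy choice: most garbage first, skipping regions that would
    // push the total past the pause budget.
    static List<HeapRegion> choose(List<HeapRegion> regions, int pauseBudgetMs) {
        List<HeapRegion> byGarbage = new ArrayList<>(regions);
        byGarbage.sort(Comparator.comparingDouble((HeapRegion r) -> r.liveRatio));
        List<HeapRegion> chosen = new ArrayList<>();
        int spent = 0;
        for (HeapRegion region : byGarbage) {
            if (spent + region.evacuationMs <= pauseBudgetMs) {
                chosen.add(region);
                spent += region.evacuationMs;
            }
        }
        return chosen;
    }

    public static void main(String[] args) {
        List<HeapRegion> regions = new ArrayList<>();
        regions.add(new HeapRegion("A", 0.9, 30));
        regions.add(new HeapRegion("B", 0.1, 20));
        regions.add(new HeapRegion("C", 0.3, 40));
        regions.add(new HeapRegion("D", 0.5, 50));
        for (HeapRegion region : choose(regions, 70)) {
            System.out.println(region.name); // prints B, then C
        }
    }
}
```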

Default JVM garbage collector combinations:

JDK 1.7 default garbage collectors: Parallel Scavenge (young generation) + Parallel Old (old generation)

JDK 1.8 default garbage collectors: Parallel Scavenge (young generation) + Parallel Old (old generation)

JDK 9 default garbage collector: G1
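To see which collectors a running JVM actually uses, you can query the standard management API. The ActiveCollectors class name is hypothetical; ManagementFactory.getGarbageCollectorMXBeans() and the GarbageCollectorMXBean methods are part of java.lang.management. The reported names depend on the JVM and its flags; on HotSpot they are typically strings such as "PS Scavenge" and "PS MarkSweep" for the parallel collectors or "G1 Young Generation" and "G1 Old Generation" for G1.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

class ActiveCollectors {
    // Names of the collectors registered in this JVM.
    static List<String> names() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName()
                    + ": collections=" + gc.getCollectionCount()
                    + ", timeMs=" + gc.getCollectionTime());
        }
    }
}
```

Running the same program with different -XX:+Use...GC flags is a quick way to confirm which combination a given flag selects on your JDK.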

Common collector combinations:

                                                  

