This article is based on Zhou Zhiming's book "In-Depth Understanding of the Java Virtual Machine".
If collection algorithms are the methodology of memory reclamation, garbage collectors are its concrete implementation. The Java Virtual Machine specification does not prescribe how a garbage collector should be implemented, so the collectors offered by different vendors and different versions of the virtual machine can differ greatly, and they generally provide parameters that let users combine the collectors used for each generation according to their application's characteristics and requirements.
The figure shows seven collectors that act on different generations (including the early-access G1 collector introduced after JDK 1.6 Update 14); if two collectors are connected by a line, they can be used together.
Before introducing the characteristics of these collectors, one point should be made clear: although we compare the individual collectors, the goal is not to pick the best one. No best collector has appeared so far, and there is no universal collector, so we can only choose the collector most appropriate for a specific application. This needs little proof: if there were a perfect collector suitable for every scenario, the HotSpot virtual machine would have no reason to implement so many different ones.
--------------------------New Generation Collector------------------------
1. Serial collector: the most basic collector, and the one with the longest history
Applicable to: It may look like a collector nobody needs any more, but it is still the default new-generation collector for virtual machines running in client mode.
Features:
1). Single-threaded collector: it uses only one CPU or one collection thread to do garbage collection.
2). While it collects, all other worker threads must be paused (Sun calls this "Stop the World") until the collection ends. The virtual machine starts and completes this work automatically in the background, stopping all of the user's normally running threads without the user seeing it, which is unacceptable for many applications.
Collection algorithm: copying algorithm
Pros: Simple and efficient (compared with other collectors running single-threaded). In an environment limited to a single CPU, the Serial collector has no thread-interaction overhead, so it naturally achieves the highest single-threaded collection efficiency. In desktop scenarios the memory given to the virtual machine is usually not large. Collecting a new generation of a few tens of megabytes, or even one or two hundred megabytes (this is only the memory used by the new generation; desktop applications rarely need more), keeps the pause within a few dozen milliseconds to a little over 100 milliseconds, and as long as it does not happen often, such a pause is acceptable. The Serial collector is therefore a good choice for virtual machines running in client mode.
Cons: Pausing user threads during GC gives users a poor experience.
Pairs with: CMS or Serial Old (MSC)
-----------------------------------------------------------------------------------------------------------
2. ParNew collector
The ParNew collector is essentially a multithreaded version of the Serial collector. Apart from using multiple threads for garbage collection, its behavior is exactly the same as Serial's, and the two collectors share a considerable amount of code.
Applicable to: The new generation in virtual machines running in server mode
Features:
1). Multithreaded GC (parallel): ParNew is the multithreaded version of Serial and reuses much of its code.
2). Pauses all user threads during GC
Algorithm: copying algorithm
Advantages: High efficiency
Disadvantages: Pausing user threads during GC gives users a poor experience, and in a single-CPU environment its results are not necessarily better than Serial's.
Pairs with: CMS or Serial Old (MSC)
One reason to choose ParNew has nothing to do with performance but is still important: apart from the Serial collector, it is currently the only collector that can work with CMS. Once CMS is chosen to collect the old generation in JDK 1.5, the new generation can only use ParNew or Serial.
---------------------------------------------------------------------------------------------------------
3. Parallel Scavenge collector
Applicable to: The new generation, for work that runs in the background and does not need much interaction.
Features: 1. Multithreaded GC (parallel)
2. Pauses all user threads during GC
Differences from other collectors:
1). Collectors such as ParNew and CMS focus on keeping the pause of user threads during garbage collection as short as possible, while the goal of Parallel Scavenge is to reach a controllable throughput.
Throughput = time running user code / (time running user code + garbage collection time)
The shorter the pause, the better suited the collector is to programs that interact with users, where good response times improve the user experience. High throughput makes the most efficient use of CPU time and finishes the program's computation as soon as possible, which mainly suits background work that does not need much interaction.
2). Parallel Scavenge can use a GC adaptive tuning strategy (this is another important difference).
Parameters: for precise control of throughput
-XX:MaxGCPauseMillis: the maximum garbage collection pause time
-XX:GCTimeRatio: the ratio of user-code time to garbage collection time. With a value of N, garbage collection may take at most 1/(1+N) of the total running time, so a larger value means higher throughput.
Trade-off: a shorter GC pause is bought at the cost of throughput and new-generation space (a smaller new generation is collected faster but more often, so throughput drops).
GC adaptive tuning strategy
-XX:+UseAdaptiveSizePolicy: with the adaptive strategy there is no need to specify the size of the new generation, the ratio of Eden to Survivor, the age at which objects are promoted to the old generation, or other such details. The virtual machine adjusts these parameters automatically according to the current state of the system to provide the most suitable pause time or the maximum throughput.
Algorithm: copying algorithm
Advantages: Efficient
Pairs with: Parallel Old or Serial Old (MSC)
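The throughput definition above, and the 1/(1+N) bound that the parameter summary at the end of this article gives for GCTimeRatio, can be checked with a little arithmetic. The following is only an illustrative sketch; the class and method names are made up for this example and are not part of any JVM API.

```java
// Illustrative arithmetic only; ThroughputSketch and its methods are hypothetical names.
final class ThroughputSketch {

    // Throughput = user-code time / (user-code time + GC time).
    static double throughput(double userMillis, double gcMillis) {
        return userMillis / (userMillis + gcMillis);
    }

    // With GCTimeRatio = N, GC should take at most 1 / (1 + N) of total running time.
    static double maxGcFraction(int gcTimeRatio) {
        return 1.0 / (1 + gcTimeRatio);
    }

    public static void main(String[] args) {
        // 99 ms of user code and 1 ms of GC out of 100 ms in total.
        System.out.println("throughput = " + throughput(99, 1));
        // N = 19 allows up to 5% of the time for GC, N = 99 allows up to 1%.
        System.out.println("N=19 -> " + maxGcFraction(19));
        System.out.println("N=99 -> " + maxGcFraction(99));
    }
}
```

Under these assumptions, a program that runs user code for 99 ms and collects for 1 ms has a throughput of 99%, which corresponds to a GCTimeRatio of 99.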
---------------------------------------------------------------------------------------------------------
-------------------------Old Generation Collectors----------------------------
4. Serial Old collector
Applicable to:
1). The old generation of virtual machines running in client mode
2). In server mode, it has two main uses:
①. Paired with Parallel Scavenge
②. As the fallback for the CMS collector, used when a Concurrent Mode Failure occurs during concurrent collection
Features: 1. Single-threaded GC; the old-generation version of the Serial collector
2. Pauses all user threads during GC
Algorithm: mark-compact algorithm
Advantages: Simple and efficient
Cons: Pausing user threads during GC gives users a poor experience.
Pairs with: Serial, ParNew, or Parallel Scavenge
-----------------------------------------------------------------------------------------------------
5. Parallel Old collector
Applicable to: The old generation in virtual machines running in server mode. Where throughput matters and CPU resources are sensitive, Parallel Scavenge plus Parallel Old is the combination to consider first.
Features:
1). Multithreaded GC (parallel): the old-generation version of Parallel Scavenge
2). Pauses all user threads during GC
3). This collector has been available only since JDK 1.6.
Algorithm: mark-compact algorithm
Advantages: High efficiency
Disadvantages: Pausing user threads during GC gives users a poor experience, and in a single-CPU environment its results are not necessarily better than Serial Old's.
Pairs with: Parallel Scavenge
-------------------------------------------------------------------------------------
6. CMS (Concurrent Mark Sweep) collector: the first truly concurrent collector in HotSpot. CMS is a collector whose goal is the shortest collection pause. Today a large share of Java applications run on the server side of Internet sites or B/S (browser/server) systems. Such applications care greatly about response time and want system pauses to be as short as possible to give users a better experience.
Applicable to: The old generation in virtual machines running in server mode; suits applications with strict response-time requirements.
Algorithm: "mark-sweep" algorithm
Features: Multithreaded and concurrent
Process:
1). Initial mark: pauses user threads and marks only the objects directly reachable from the GC Roots; this is fast.
2). Concurrent mark: GC Roots tracing, with the marking threads running concurrently with user threads.
3). Remark: corrects the marking records of objects whose marking changed while user threads kept running during concurrent marking. This pause is usually slightly longer than the initial mark but much shorter than concurrent marking.
4). Concurrent sweep: the sweeping threads run concurrently with user threads.
Of these, the initial mark and remark steps still require "Stop the World".
Because the collector threads can work alongside user threads during concurrent marking and concurrent sweeping, the two longest phases of the whole process, the CMS collector's memory reclamation as a whole runs concurrently with user threads.
Figure 3-10 shows more clearly which steps of the CMS collector run concurrently and which need a pause.
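The four steps above can be summarized as a small model that records which phases stop user threads. This is a sketch for illustration only; the enum and its names are hypothetical and do not correspond to HotSpot source code.

```java
import java.util.EnumSet;

// Hypothetical model of the four CMS phases described above; not HotSpot code.
final class CmsPhasesSketch {

    enum Phase {
        INITIAL_MARK(true),      // marks objects directly reachable from GC Roots
        CONCURRENT_MARK(false),  // traces from GC Roots alongside user threads
        REMARK(true),            // corrects marks changed during concurrent marking
        CONCURRENT_SWEEP(false); // reclaims garbage alongside user threads

        final boolean stopTheWorld;

        Phase(boolean stopTheWorld) {
            this.stopTheWorld = stopTheWorld;
        }
    }

    // Returns the phases during which user threads are paused.
    static EnumSet<Phase> pausingPhases() {
        EnumSet<Phase> pausing = EnumSet.noneOf(Phase.class);
        for (Phase phase : Phase.values()) {
            if (phase.stopTheWorld) {
                pausing.add(phase);
            }
        }
        return pausing;
    }

    public static void main(String[] args) {
        for (Phase phase : Phase.values()) {
            System.out.println(phase + (phase.stopTheWorld
                    ? " pauses user threads"
                    : " runs concurrently with user threads"));
        }
    }
}
```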
Advantages: Concurrent collection and low pauses. Pauses are very short because concurrent marking and concurrent sweeping, the longest phases, run concurrently with user threads.
Disadvantages:
1). Very sensitive to CPU resources.
Cause: Programs designed for concurrency are all sensitive to CPU resources. During the concurrent phases CMS does not pause user threads, but it occupies some threads (or CPU resources), so the application slows down and overall throughput drops. This is especially noticeable when there are few CPUs.
Workaround: The incremental concurrent collector (i-CMS) lets GC threads and user threads run alternately during concurrent marking and sweeping, to reduce the time GC threads monopolize the CPU. The whole collection then takes longer, and the results in practice are mediocre, so it is generally no longer used.
2). It cannot process floating garbage, so a "Concurrent Mode Failure" may occur and trigger another full GC.
Floating garbage: user threads are still running during the concurrent sweep phase, and the garbage they produce cannot be handled in the current collection; it has to wait for the next one.
Also, because user threads keep running, memory has to be reserved for them, so CMS cannot wait until the old generation is almost full before collecting. In JDK 1.5, CMS is triggered by default when 68% of the old generation is in use; in JDK 1.6 the threshold is 92%.
A "Concurrent Mode Failure" occurs when the memory reserved while CMS is running cannot satisfy the program's needs. The virtual machine then starts its fallback plan: it temporarily uses the Serial Old collector to collect the old generation again, and the pause becomes long.
3). It produces memory fragmentation, which affects the allocation of large objects.
This is because the collector is based on the "mark-sweep" algorithm. There is often plenty of free space left in total, but when no contiguous block is large enough for the current object, a full GC has to be triggered early.
Solutions:
1. The switch -XX:+UseCMSCompactAtFullCollection (on by default) makes CMS defragment memory when it is about to perform a full GC. Defragmentation cannot run concurrently, so the pause becomes longer.
2. -XX:CMSFullGCsBeforeCompaction sets how many full GCs without compaction are performed before one with compaction. The default is 0, which means every full GC defragments.
Pairs with: Serial or ParNew
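The fragmentation problem of mark-sweep can be seen in a toy model: the heap below has enough free space in total, but no contiguous run large enough for the requested object until it is compacted. This is a simplified sketch with hypothetical names; a real heap tracks byte ranges and objects, not boolean slots.

```java
import java.util.Arrays;

// Toy heap: true = slot holds a live object, false = free slot. Hypothetical model only.
final class FragmentationSketch {

    static int totalFree(boolean[] heap) {
        int free = 0;
        for (boolean occupied : heap) {
            if (!occupied) {
                free++;
            }
        }
        return free;
    }

    // Length of the longest run of adjacent free slots.
    static int largestFreeRun(boolean[] heap) {
        int best = 0;
        int run = 0;
        for (boolean occupied : heap) {
            run = occupied ? 0 : run + 1;
            best = Math.max(best, run);
        }
        return best;
    }

    // An object needs `size` contiguous free slots.
    static boolean canAllocate(boolean[] heap, int size) {
        return largestFreeRun(heap) >= size;
    }

    // Compaction: slide all live slots to the start, leaving one free block at the end.
    static boolean[] compact(boolean[] heap) {
        boolean[] compacted = new boolean[heap.length];
        Arrays.fill(compacted, 0, heap.length - totalFree(heap), true);
        return compacted;
    }

    public static void main(String[] args) {
        boolean[] afterSweep = {true, false, true, false, true, false, true, false};
        System.out.println("free slots: " + totalFree(afterSweep));
        System.out.println("can allocate 3 after sweep: " + canAllocate(afterSweep, 3));
        System.out.println("can allocate 3 after compaction: " + canAllocate(compact(afterSweep), 3));
    }
}
```

In this model the swept heap has four free slots but cannot hold an object of size three, while the compacted heap can, which is the situation the compaction switches above are meant to address.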
-----------------------------------------------------------------------------------------
---------------------------Collector for Both the New and Old Generations---------------------
7. G1 Collector
Applicable to: Server-side applications, for both the new generation and the old generation. It is one of the most advanced results of current collector technology.
Features:
1. Parallel and concurrent. Makes full use of CPU resources.
2. Generational collection.
3. Space compaction. Viewed as a whole, G1 is based on the "mark-compact" algorithm; viewed locally (between two regions), it is based on the "copying" algorithm. It does not produce memory fragmentation.
4. Predictable pauses. G1 builds a predictable pause-time model that lets the user explicitly specify that, within a time slice of M milliseconds, no more than N milliseconds may be spent on garbage collection. This is almost a feature of the real-time Java (RTSJ) garbage collectors.
Origin of the name Garbage First
The G1 collector can achieve low-pause memory reclamation without sacrificing throughput because it works hard to avoid collecting the entire heap. G1 divides memory into regions, tracks the value of the garbage accumulated in each region (the space that would be reclaimed and an empirical estimate of the time reclaiming it would take), maintains a priority list in the background, and, within the allowed collection time, first reclaims the regions with the highest value.
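The prioritization described above can be sketched as a greedy selection: rank regions by reclaimable space per millisecond of estimated work, then take regions while the pause budget allows. This is a deliberately simplified illustration with hypothetical names and numbers; the real G1 policy is considerably more involved.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified, hypothetical model of "garbage first" region selection; not G1's actual policy.
final class GarbageFirstSketch {

    static final class Region {
        final String name;
        final long reclaimableBytes;   // space that collecting this region would free
        final double estimatedMillis;  // empirical estimate of the collection time

        Region(String name, long reclaimableBytes, double estimatedMillis) {
            this.name = name;
            this.reclaimableBytes = reclaimableBytes;
            this.estimatedMillis = estimatedMillis;
        }

        double valuePerMilli() {
            return reclaimableBytes / estimatedMillis;
        }
    }

    // Picks the most valuable regions that fit into the allowed pause time.
    static List<Region> chooseCollectionSet(List<Region> regions, double pauseBudgetMillis) {
        List<Region> byValue = new ArrayList<>(regions);
        byValue.sort((a, b) -> Double.compare(b.valuePerMilli(), a.valuePerMilli()));

        List<Region> chosen = new ArrayList<>();
        double spentMillis = 0;
        for (Region region : byValue) {
            if (spentMillis + region.estimatedMillis <= pauseBudgetMillis) {
                chosen.add(region);
                spentMillis += region.estimatedMillis;
            }
        }
        return chosen;
    }

    public static void main(String[] args) {
        List<Region> regions = new ArrayList<>();
        regions.add(new Region("A", 800, 10));
        regions.add(new Region("B", 300, 2));
        regions.add(new Region("C", 500, 20));
        for (Region region : chooseCollectionSet(regions, 15)) {
            System.out.println("collect region " + region.name);
        }
    }
}
```

With a 15 ms budget in this example, region B (highest value per millisecond) is chosen first, then A; region C is left for a later collection because it would exceed the budget.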
Difficulty: Although memory is divided into regions, garbage collection cannot simply proceed region by region, because regions are not isolated: an object in one region may be referenced by objects in any other region. When reachability analysis decides whether an object is alive, does the whole heap have to be scanned? Note: this problem exists in every collector, for example with references between the new generation and the old generation.
Resolution: Use a Remembered Set to avoid full-heap scanning.
Process: Every region in G1 has a corresponding Remembered Set. When the virtual machine detects that the program writes to reference-typed data, it issues a write barrier that temporarily interrupts the write and checks whether the referenced object is in a different region (in the generational case, whether an object in the old generation references an object in the new generation). If so, the reference information is recorded, via the CardTable, in the Remembered Set of the region to which the referenced object belongs. During memory reclamation, adding the Remembered Set to the enumeration range of the GC root nodes guarantees that nothing is missed even without a full-heap scan.
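The bookkeeping in this process can be sketched as follows. The sketch assumes a heap addressed by plain numbers and a fixed, made-up region size, and it records field addresses directly rather than going through a card table; all names are hypothetical.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical model of per-region remembered sets maintained by a write barrier.
final class RememberedSetSketch {

    static final int REGION_SIZE = 1024; // assumed region size, in abstract address units

    // Region index -> addresses of fields in OTHER regions that point into that region.
    private final Map<Integer, Set<Long>> rememberedSets = new HashMap<>();

    static int regionOf(long address) {
        return (int) (address / REGION_SIZE);
    }

    // Runs on every reference store: the field at fieldAddress now points to targetAddress.
    void writeBarrier(long fieldAddress, long targetAddress) {
        int fromRegion = regionOf(fieldAddress);
        int toRegion = regionOf(targetAddress);
        if (fromRegion != toRegion) {
            // Cross-region reference: record it in the set of the region being pointed to.
            rememberedSets.computeIfAbsent(toRegion, k -> new HashSet<>()).add(fieldAddress);
        }
    }

    // Extra roots to scan when collecting one region, instead of scanning the whole heap.
    Set<Long> extraRootsFor(int region) {
        return rememberedSets.getOrDefault(region, Collections.<Long>emptySet());
    }

    public static void main(String[] args) {
        RememberedSetSketch heap = new RememberedSetSketch();
        heap.writeBarrier(100, 200);   // same region, nothing recorded
        heap.writeBarrier(100, 2053);  // region 0 -> region 2, recorded
        System.out.println("extra roots for region 2: " + heap.extraRootsFor(2));
        System.out.println("extra roots for region 0: " + heap.extraRootsFor(0));
    }
}
```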
Memory layout: The G1 heap layout differs from that of other collectors. G1 divides the entire heap into regions of equal size. The concepts of the new generation and the old generation remain, but the two are no longer physically separated; each is a collection of regions that need not be contiguous.
Process (similar to CMS):
1. Initial mark: pauses user threads and marks the objects directly reachable from the GC Roots.
2. Concurrent mark: traces from the GC Roots, with the marking threads running concurrently with user threads.
3. Final mark: corrects the marking records of objects whose marking changed because user threads kept running during concurrent marking.
4. Screening and reclamation: regions are selected for collection by their reclamation value and cost, within the pause time the user allows.
Algorithm: mark-compact overall, copying locally
Advantages: High efficiency; controllable and predictable pause times
8. Summary of GC-related parameters
1. Parameters related to the Serial collector
-XX:+UseSerialGC: Use the serial collector in both the new generation and the old generation.
-XX:SurvivorRatio: Sets the ratio of the Eden area size to the Survivor area size.
-XX:PretenureSizeThreshold: Sets the threshold above which large objects go directly into the old generation. An object larger than this value is allocated directly in the old generation.
-XX:MaxTenuringThreshold: Sets the maximum age at which an object enters the old generation. An object's age increases by 1 after every Minor GC it survives, and any object older than this value is certain to enter the old generation.
2. Parameters related to parallel GC
-XX:+UseParNewGC: Use the parallel collector (ParNew) in the new generation.
-XX:+UseParallelOldGC: Use the Parallel Old collector in the old generation.
-XX:ParallelGCThreads: Sets the number of threads used for garbage collection. It can usually equal the number of CPUs, but with many CPUs a somewhat smaller value is also reasonable.
-XX:MaxGCPauseMillis: Sets the maximum garbage collection pause time. Its value is an integer greater than 0. While working, the collector adjusts the Java heap size or other parameters to keep pauses within MaxGCPauseMillis as far as possible.
-XX:GCTimeRatio: Sets the throughput target; its value is an integer between 0 and 100. If the value of GCTimeRatio is N, the system spends no more than 1/(1+N) of its time on garbage collection.
-XX:+UseAdaptiveSizePolicy: Turns on the adaptive GC policy. In this mode the size of the new generation, the ratio of Eden to Survivor, and the age at which objects are promoted to the old generation are adjusted automatically to balance heap size, throughput, and pause time.
3. Parameters related to the CMS collector
-XX:+UseConcMarkSweepGC: The new generation uses the parallel collector (ParNew); the old generation uses CMS, with the serial collector (Serial Old) as its fallback.
-XX:ParallelCMSThreads: Sets the number of threads for CMS.
-XX:CMSInitiatingOccupancyFraction: Sets the percentage of old-generation space in use at which the CMS collector is triggered; the default is 68%.
-XX:CMSFullGCsBeforeCompaction: Sets how many CMS garbage collections are performed before memory is compacted once.
-XX:+CMSClassUnloadingEnabled: Allows class metadata to be collected.
-XX:+CMSParallelRemarkEnabled: Enables parallel remark.
-XX:CMSInitiatingPermOccupancyFraction: When occupancy of the permanent generation reaches this percentage, a CMS collection starts (provided -XX:+CMSClassUnloadingEnabled is active).
-XX:+UseCMSInitiatingOccupancyOnly: CMS collects only when the threshold is reached.
-XX:+CMSIncrementalMode: Uses incremental mode, which is better suited to a single CPU.
4. Parameters related to the G1 collector
-XX:+UseG1GC: Use the G1 collector.
-XX:+UnlockExperimentalVMOptions: Allows the use of experimental parameters.
-XX:MaxGCPauseMillis: Sets the maximum garbage collection pause time.
-XX:GCPauseIntervalMillis: Sets the interval between pauses.
5. Other parameters
-XX:+DisableExplicitGC: Disables explicit GC, that is, collections requested by calling System.gc().
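To see which collectors a given combination of these flags actually selects, one option is to ask the running JVM through the standard java.lang.management API. The class name below is made up for this example; the names the JVM reports differ between collectors and JVM versions, so no particular output is assumed.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Lists the collectors the current JVM is running with. ActiveCollectors is a hypothetical name.
final class ActiveCollectors {

    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Collection count and accumulated collection time (ms) since JVM start.
            System.out.println(gc.getName()
                    + ": collections=" + gc.getCollectionCount()
                    + ", timeMillis=" + gc.getCollectionTime());
        }
    }
}
```

Running the same class with different collector flags (for example with and without -XX:+UseSerialGC) is a quick way to confirm which new-generation and old-generation collectors were paired.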
In-depth understanding of jvm--Chapter III-Garbage collector