This series is a set of reading notes on Deep Understanding of the Java Virtual Machine (by Zhou Zhiming).
Garbage Collector
The JVM specification places no requirements on how the garbage collector is implemented, so the collectors provided by different vendors and different versions of the virtual machine may differ greatly. The discussion here is based on the Sun HotSpot virtual machine, version 1.6 update 22. The garbage collectors contained in this virtual machine are as follows:
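There are 7 garbage collectors in HotSpot 1.6: Serial, ParNew, and Parallel Scavenge for the new generation; Serial Old, Parallel Old, and CMS for the old generation; and G1. In the original figure, a line connecting two collectors means that they can be used together. These collectors are introduced one by one below.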
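To check which collectors a particular JVM is actually running with, one option is the standard `java.lang.management` API. The following is a minimal sketch; the class name `ListCollectors` and the helper `collectorNames` are made up for illustration, and the names printed depend on the JVM version and startup flags, so no output is shown.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, named for illustration only.
class ListCollectors {
    // Returns the names of the collectors registered in the current JVM.
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Count and time are cumulative since JVM start; either is -1 if undefined.
            System.out.println(gc.getName()
                    + ": collections=" + gc.getCollectionCount()
                    + ", time(ms)=" + gc.getCollectionTime());
        }
    }
}
```

Running the same program with different collector flags is a quick way to confirm which pairing is in effect.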
Serial collector
The Serial collector is one of the oldest garbage collectors and is single-threaded. While it collects garbage, it must pause all other working threads until the collection ends. Its garbage collection is initiated and completed by the virtual machine in the background, and it suspends all working threads without the user seeing or controlling it. This makes for a bad user experience. The following figure shows how it works:
[Figure: Serial / Serial Old collector running]
Despite its poor user experience, the Serial collector is still the default new generation collector for virtual machines running in client mode, up to and including 1.6. This is mainly because it is simple and efficient. In desktop applications, the memory managed by the JVM is generally not very large. Collecting a new generation of a few dozen MB, or even one or two hundred MB, keeps the pause within a few dozen milliseconds, or about one hundred milliseconds at most. As long as such pauses do not happen frequently, they are completely acceptable.
ParNew collector
The ParNew collector is actually a multi-threaded version of the Serial collector. Apart from using multiple threads for collection, its other behaviors, including the collection algorithm, Stop The World, object allocation rules, and collection policies, are exactly the same as those of the Serial collector. In fact, the two collectors share a lot of code. The following figure shows how the ParNew collector works:
[Figure: ParNew / Serial Old collector running]
ParNew is the preferred new generation collector for many virtual machines running in server mode. An important reason is that, apart from the Serial collector, it is the only new generation collector that can work with the CMS collector. CMS is the first truly concurrent collector in the HotSpot virtual machine and is discussed in detail below.
Concurrent and parallel, in the context of garbage collectors:
- Parallel: multiple garbage collector threads work in parallel, but user threads are still waiting.
- Concurrent: user threads and garbage collector threads work at the same time.
Parallel Scavenge collector
Parallel Scavenge is similar to ParNew: it is a parallel, multi-threaded new generation collector that uses the copying algorithm. So how does it differ from ParNew? The biggest difference is the focus. Collectors such as ParNew aim to minimize the pause time of user threads during garbage collection, whereas Parallel Scavenge aims to achieve a controllable throughput. Throughput is the ratio of the CPU time spent running user code to the total CPU time consumed, that is, throughput = time running user code / (time running user code + garbage collection time).
Another feature of Parallel Scavenge is its adaptive adjustment policy. With this policy, the virtual machine collects performance monitoring information based on the current running conditions of the system and dynamically adjusts detailed parameters such as the new generation size, the Eden to Survivor ratio, and the age at which objects are promoted to the old generation.
[Figure: working process of the Parallel Scavenge collector]
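The throughput formula above is easy to check with a small calculation. The sketch below is only an illustration of the definition; the class and method names are hypothetical, and the time values are made-up inputs (any unit works as long as both use the same one).

```java
// Hypothetical helper illustrating the throughput definition.
class ThroughputSketch {
    // throughput = user code time / (user code time + garbage collection time)
    static double throughput(double userTime, double gcTime) {
        if (userTime < 0 || gcTime < 0 || userTime + gcTime == 0) {
            throw new IllegalArgumentException("times must be non-negative and not both zero");
        }
        return userTime / (userTime + gcTime);
    }

    public static void main(String[] args) {
        // 99 minutes of user code and 1 minute of garbage collection.
        System.out.println(throughput(99, 1)); // prints 0.99
    }
}
```

In other words, a virtual machine that runs for 100 minutes and spends 1 minute of that on garbage collection has a throughput of 99%.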
Serial Old collector
Serial Old is the old generation version of the Serial collector, as shown in the following figure:
[Figure: Serial / Serial Old collector running]
Parallel Old collector
Parallel Old is the old generation version of the Parallel Scavenge collector.
[Figure: working process of the Parallel Old collector]
CMS collector
The CMS (Concurrent Mark Sweep) collector is a concurrent collector based on the mark-sweep algorithm. It is designed to achieve the shortest collection pause time. Its operation is more complex than that of the collectors described above, and the whole process is divided into four steps:
- Initial mark
- Concurrent mark
- Remark
- Concurrent sweep
Initial mark and remark still need to suspend all user threads, that is, Stop The World. However, initial mark only marks the objects directly reachable from the GC roots, and remark only corrects the marks of objects that changed because the user program kept running during concurrent mark. The pauses in these phases are therefore much shorter than the Stop The World pauses introduced earlier. Concurrent mark and concurrent sweep, which take the longest time in the whole collection, run together with user threads. Overall, then, the GC threads in CMS execute concurrently with user threads. The following figure illustrates the process:
[Figure: CMS collector running]
CMS is a breakthrough collector: it greatly shortens the pause time of user threads and can be considered to implement concurrent garbage collection. However, CMS still has the following defects:
- Sensitive to CPU resources. Almost all parallel or concurrent systems are CPU-sensitive. Although CMS rarely causes user threads to pause, it slows down the application and reduces total throughput because it occupies some of the threads.
- Floating garbage cannot be processed, and a "Concurrent Mode Failure" may occur, resulting in another Full GC. Because user threads are still running during the CMS concurrent sweep phase, they keep generating new garbage that CMS cannot process in the current collection and must leave for the next GC. This garbage is called "floating garbage". Also, because user threads keep running during garbage collection, CMS must reserve enough space for them to use; unlike other collectors, it cannot wait until the old generation is almost full before collecting. By default, the CMS collector is activated once 68% of the old generation is in use.
- Memory fragmentation. This disadvantage comes from the mark-sweep algorithm used by CMS, which tends to generate a large number of fragments. With too many fragments, the old generation may have plenty of free space overall and still lack a contiguous block large enough for a large object, so a GC has to be triggered early.
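Two of the points in the list above, the occupancy threshold and fragmentation, can be illustrated with a toy model. This is a sketch under simplifying assumptions, not HotSpot's actual logic: the class `CmsSketch`, its methods, and the block sizes are all hypothetical, and the free space is modeled as a plain array of free block sizes.

```java
// Toy model of two CMS drawbacks; not HotSpot code.
class CmsSketch {
    // True once old generation occupancy reaches the given percentage (68 by default in the text).
    static boolean shouldStartCycle(long usedBytes, long capacityBytes, int thresholdPercent) {
        return usedBytes * 100 >= capacityBytes * (long) thresholdPercent;
    }

    // Sum of all free blocks left behind by a mark-sweep pass.
    static long totalFree(long[] freeBlocks) {
        long sum = 0;
        for (long block : freeBlocks) {
            sum += block;
        }
        return sum;
    }

    // An object needs one contiguous block; free blocks cannot be combined without compaction.
    static boolean canAllocate(long[] freeBlocks, long size) {
        for (long block : freeBlocks) {
            if (block >= size) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(shouldStartCycle(680, 1000, 68)); // prints true
        long[] freeBlocks = {64, 32, 48};                    // assumed fragment sizes
        System.out.println(totalFree(freeBlocks));           // prints 144
        System.out.println(canAllocate(freeBlocks, 100));    // prints false
    }
}
```

The last two lines show the fragmentation problem: 144 units are free in total, yet a 100-unit object cannot be placed because no single block is large enough.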
G1 collector
The G1 (Garbage First) collector is currently the most cutting-edge achievement in the development of collector technology. It has two significant improvements over CMS:
- It uses the mark-compact algorithm, which avoids fragmentation.
- Pauses can be controlled precisely. The user can specify a parameter that limits the time N spent on garbage collection within a time slice of length M.
G1 can complete garbage collection without sacrificing throughput because it avoids collecting an entire generation at once. Previous collectors work on the whole new generation or the whole old generation, whereas G1 divides the entire Java heap (including the new and old generations) into multiple independent regions of fixed size. It tracks the degree of garbage accumulation in each region, maintains a priority list in the background, and each time selects regions from the list for collection according to priority.
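The region priority idea can be sketched with a priority queue. This is a simplified model, not G1's real algorithm: the `Region` fields, the per-region cost estimate, and the greedy selection under a pause budget are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Simplified model of "garbage first" region selection; not G1's actual implementation.
class G1RegionSketch {
    static final class Region {
        final int id;
        final long garbageBytes;    // assumed: tracked garbage in this region
        final long estimatedCostMs; // assumed: predicted time to collect this region

        Region(int id, long garbageBytes, long estimatedCostMs) {
            this.id = id;
            this.garbageBytes = garbageBytes;
            this.estimatedCostMs = estimatedCostMs;
        }
    }

    // Visits regions in order of most garbage first and keeps those that still fit the pause budget.
    static List<Region> selectRegions(List<Region> regions, long pauseBudgetMs) {
        PriorityQueue<Region> byGarbage = new PriorityQueue<>(
                Comparator.comparingLong((Region r) -> r.garbageBytes).reversed());
        byGarbage.addAll(regions);

        List<Region> chosen = new ArrayList<>();
        long remainingMs = pauseBudgetMs;
        while (!byGarbage.isEmpty()) {
            Region next = byGarbage.poll();
            if (next.estimatedCostMs <= remainingMs) {
                chosen.add(next);
                remainingMs -= next.estimatedCostMs;
            }
            // Regions that do not fit are skipped; cheaper ones may still fit later.
        }
        return chosen;
    }

    public static void main(String[] args) {
        List<Region> regions = Arrays.asList(
                new Region(0, 900, 4), new Region(1, 100, 1),
                new Region(2, 600, 5), new Region(3, 300, 2));
        for (Region r : selectRegions(regions, 7)) {
            System.out.println("collect region " + r.id);
        }
        // prints regions 0, 3, 1 in that order
    }
}
```

With a budget of 7 ms, region 0 (most garbage, 4 ms) is taken first, region 2 is skipped because its 5 ms cost no longer fits, and regions 3 and 1 fill the remaining 3 ms.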