Java Programmers, from Clumsy Bird to Rookie (): Java Virtual Machine (4) - The Garbage Collection Mechanism of the Java Virtual Machine


The Java language has been at the forefront of programming languages since it first appeared. One major reason is the platform on which Java applications run. As we all know, Java applications run on the Java Virtual Machine (JVM), which greatly reduces how often an application has to deal directly with the underlying operating system and gives Java programs a good foundation for running across platforms. One important mechanism the JVM provides is automatic memory management, which is what we usually call the garbage collection mechanism. It frees developers from managing the application's memory themselves. C/C++ developers must use functions and operators such as malloc/free and new/delete to allocate and release memory explicitly. This places high demands on developers and easily leads to problems such as memory access errors and memory leaks. Today, let's take a look at automatic garbage collection, a powerful feature of the Java Virtual Machine.


C/C++ programs have no automatic garbage collection mechanism like Java's, so developers must allocate and release memory by hand, which requires care and experience in memory management. When memory is managed poorly, two problems are most common: dangling references and memory leaks. A dangling reference arises when the memory block that an object reference points to has been reclaimed by mistake and reallocated to a new object; if the program keeps using that reference, the results are unpredictable. A memory leak is easier to understand: the developer explicitly requests memory and then forgets to release it after use. Over a long run, the leaked memory accumulates until the program runs out of memory. In a language like Java that has automatic memory management, developers only need to think about how references are used and can hand memory management over to the language runtime, without caring about the underlying details of allocation and reclamation. The Java platform uses the garbage collector for automatic memory management, which greatly reduces developers' workload.


I. Java garbage collection mechanism

The Java garbage collector is responsible for completing three tasks:

1. Allocate memory.

2. Ensure that the memory of objects that are still referenced is not reclaimed by mistake.

3. Reclaim the memory space of objects that are no longer referenced.

Garbage collection is a complex and time-consuming operation. If the JVM spends too much time on garbage collection, the application's performance inevitably suffers. In general, while the garbage collector is reclaiming memory, the execution of the entire application is temporarily suspended (stop-the-world). This is because the garbage collector needs to update the actual memory addresses that the object references in the application point to. Different hardware platforms support different garbage collection methods. For example, on a multi-CPU platform garbage collection can be performed in parallel, while a single-CPU platform can only collect serially. Different applications also expect different things from garbage collection. A server-side application may want the total time spent on garbage collection over its whole run to be as small as possible. An application that interacts with users may want each pause caused by garbage collection to be as short as possible. For these reasons, the JVM provides a variety of garbage collection methods and corresponding tuning parameters, so that applications can be configured as needed.


II. Determining whether an object can be reclaimed


1. Reference Counting Algorithm

A reference counter is added to each object. Each time the object is referenced from somewhere, the counter is incremented by 1; when a reference becomes invalid, the counter is decremented by 1. An object whose counter is 0 at any moment can no longer be used. This algorithm is simple to implement, but it has difficulty handling circular references between objects. For this reason, Java does not use this algorithm, which is something many people misunderstand.
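As a minimal sketch of why circular references defeat reference counting, the toy classes below keep the counters by hand. RcObject and RefCountDemo are hypothetical names made up for this illustration; they do not correspond to anything inside a real JVM.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy reference-counted object (hypothetical, for illustration only). */
class RcObject {
    final String name;
    int refCount = 0;                                 // number of incoming references
    final List<RcObject> fields = new ArrayList<>();  // outgoing references

    RcObject(String name) { this.name = name; }

    /** Stores a reference to target in this object, so target's counter goes up. */
    void addReference(RcObject target) {
        fields.add(target);
        target.refCount++;
    }
}

class RefCountDemo {
    /** Builds a <-> b, drops the program's own references, returns the counters left over. */
    static int[] cycleCountsAfterRelease() {
        RcObject a = new RcObject("a");
        RcObject b = new RcObject("b");
        a.refCount++;          // the program holds a reference to a
        b.refCount++;          // the program holds a reference to b
        a.addReference(b);     // a.field -> b
        b.addReference(a);     // b.field -> a

        a.refCount--;          // the program drops its reference to a
        b.refCount--;          // the program drops its reference to b
        return new int[] { a.refCount, b.refCount };
    }

    public static void main(String[] args) {
        int[] counts = cycleCountsAfterRelease();
        // Both counters stay at 1 although the program can no longer use a or b.
        System.out.println("a=" + counts[0] + ", b=" + counts[1]);  // a=1, b=1
    }
}
```

Neither counter ever reaches 0, so a pure reference-counting collector would never reclaim the pair.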


2. Root Search Algorithm

Starting from a set of objects called "GC roots", the collector searches downward from these nodes. The path traversed by the search is called a reference chain. When no reference chain connects an object to any GC root, the object is proved to be unavailable. If the root search finds no reference chain connecting the object to a GC root, the object is marked for the first time and checked to see whether its finalize() method needs to be executed (it is best to forget the finalize() method; try-finally or other approaches can replace it). If the object is still unreachable when it is marked for the second time, it is reclaimed.
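The root search can be sketched as a plain graph traversal. The example below is only an illustration under simplifying assumptions: objects are represented by strings, references by a map, and RootSearchDemo is a hypothetical name; a real JVM walks actual object graphs from roots such as stack slots and static fields.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class RootSearchDemo {
    /** Returns every object reachable from the roots by following reference chains. */
    static Set<String> reachable(Map<String, List<String>> refs, List<String> roots) {
        Set<String> seen = new HashSet<>(roots);
        Deque<String> work = new ArrayDeque<>(roots);
        while (!work.isEmpty()) {
            String current = work.pop();
            for (String next : refs.getOrDefault(current, Collections.emptyList())) {
                if (seen.add(next)) {   // first visit: remember it and explore it later
                    work.push(next);
                }
            }
        }
        return seen;
    }

    /** Example graph: root -> a -> b, plus a cycle c <-> d that no root leads to. */
    static Set<String> demo() {
        Map<String, List<String>> refs = new HashMap<>();
        refs.put("root", Arrays.asList("a"));
        refs.put("a", Arrays.asList("b"));
        refs.put("c", Arrays.asList("d"));
        refs.put("d", Arrays.asList("c"));
        return reachable(refs, Arrays.asList("root"));
    }

    public static void main(String[] args) {
        Set<String> live = demo();
        System.out.println("b reachable: " + live.contains("b"));  // b reachable: true
        System.out.println("c reachable: " + live.contains("c"));  // c reachable: false
    }
}
```

The cycle between c and d is unreachable from the root, so this approach treats both as garbage even though they still reference each other.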

III. Basic garbage collection algorithms of the Java Virtual Machine


1. Mark-sweep


This algorithm executes in two phases. In the first phase, all referenced objects are marked, starting from the root nodes. In the second phase, the whole heap is traversed and unmarked objects are cleared. The collector stops all other work while it runs: it visits every live node starting from the roots and marks each node it visits. Once all references have been followed, marking is complete, and then the heap is swept (that is, every object in the heap is examined); all unmarked objects are collected as garbage and returned to the free list. The figure shows the heap before garbage collection; the shaded blocks are garbage because the user program cannot reach them:

[Figure: Reachable and unreachable objects]


Mark-sweep is simple to implement. It can easily reclaim cyclic structures and, unlike reference counting, places no burden on the compiler or on assignment operations. But it also has shortcomings: collection pauses can be long, and the entire heap is visited during the sweep phase, which can have a very negative effect on the performance of virtual memory systems where the heap may be paged out.

The biggest problem with mark-sweep is that every active (that is, allocated) object, whether reachable or not, is visited during the sweep phase. Because many objects may have become garbage, the collector spends much of its effort examining and handling garbage. Mark-sweep collectors also tend to fragment the heap, which can cause locality problems and can make allocations fail even when there appears to be enough free memory. In short, this algorithm suspends the entire application and produces memory fragmentation.
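The sweep phase and the fragmentation it leaves behind can be illustrated with a toy heap. This is a sketch under simplifying assumptions: the heap is an array of equally sized cells, the mark phase is assumed to have run already, and MarkSweepDemo is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;

class MarkSweepDemo {
    /** Sweep phase: visits every cell of the heap and frees allocated cells that are unmarked. */
    static List<Integer> sweep(boolean[] allocated, boolean[] marked) {
        List<Integer> freeList = new ArrayList<>();
        for (int i = 0; i < allocated.length; i++) {
            if (allocated[i] && !marked[i]) {
                allocated[i] = false;     // unmarked, so it is garbage: reclaim it
            }
            if (!allocated[i]) {
                freeList.add(i);          // free cells go onto the free list
            }
            marked[i] = false;            // reset the mark for the next cycle
        }
        return freeList;
    }

    /** Length of the longest run of adjacent free cells. */
    static int largestFreeRun(boolean[] allocated) {
        int best = 0;
        int run = 0;
        for (boolean used : allocated) {
            run = used ? 0 : run + 1;
            best = Math.max(best, run);
        }
        return best;
    }

    public static void main(String[] args) {
        boolean[] allocated = { true, true, true, true, true, true, true, true };
        // Assumed result of the mark phase: only cells at even indexes are reachable.
        boolean[] marked = { true, false, true, false, true, false, true, false };
        List<Integer> free = sweep(allocated, marked);
        System.out.println("free cells: " + free);                              // free cells: [1, 3, 5, 7]
        System.out.println("largest free run: " + largestFreeRun(allocated));   // largest free run: 1
    }
}
```

Four cells are free after the sweep, yet no two of them are adjacent, so a request for two contiguous cells would fail. This is the fragmentation problem described above.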


 

2. Copying


This algorithm divides the memory space into two equal regions and uses only one of them at a time. During garbage collection, it traverses the region currently in use and copies the objects still in use to the other region. Because the algorithm handles only the objects in use each time, the copying cost is relatively small. In addition, memory can be tidied up as objects are copied, so there is no fragmentation problem. The disadvantage of this algorithm is also obvious: it requires twice the memory space.
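A rough sketch of the copying idea, assuming a toy heap of string cells and a precomputed liveness array (CopyingDemo and its method are hypothetical names). A real copying collector discovers live objects by tracing from the roots and must also update every reference to a moved object, which this sketch leaves out.

```java
import java.util.Arrays;

class CopyingDemo {
    /**
     * Copies the live objects of fromSpace into toSpace, packed from index 0.
     * Returns the number of objects copied, which is also the next free index.
     */
    static int copyLive(String[] fromSpace, boolean[] live, String[] toSpace) {
        int next = 0;
        for (int i = 0; i < fromSpace.length; i++) {
            if (fromSpace[i] != null && live[i]) {
                toSpace[next++] = fromSpace[i];   // survivors end up adjacent to each other
            }
            fromSpace[i] = null;                  // the whole from-space becomes free
        }
        return next;
    }

    public static void main(String[] args) {
        String[] from = { "a", "b", "c", "d", "e", "f" };
        boolean[] live = { true, false, false, true, false, true };
        String[] to = new String[from.length];    // the second, equally sized region
        int used = copyLive(from, live, to);
        System.out.println(used + " live objects: " + Arrays.toString(Arrays.copyOf(to, used)));
        // 3 live objects: [a, d, f]
    }
}
```

The survivors sit contiguously at the start of the target region, and everything after index `used` is one unbroken free area.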


3. Mark-compact


This algorithm combines the advantages of the mark-sweep and copying algorithms. It also has two phases. In the first phase, all referenced objects are marked, starting from the root nodes. In the second phase, the whole heap is traversed, unmarked objects are cleared, and the surviving objects are compacted into one end of the heap and laid out in order. This algorithm avoids the fragmentation problem of mark-sweep and also avoids the space problem of the copying algorithm.
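The compaction step can be sketched on the same kind of toy heap. The assumptions are the same as before: cells hold strings, the mark phase has already filled the marked array, and CompactDemo is a hypothetical name. Reference updating, which a real collector must do after moving objects, is omitted.

```java
import java.util.Arrays;

class CompactDemo {
    /** Slides marked objects toward index 0 in place and returns the first free index. */
    static int compact(String[] heap, boolean[] marked) {
        int free = 0;
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] != null && marked[i]) {
                String survivor = heap[i];
                heap[i] = null;
                heap[free++] = survivor;   // survivors keep their relative order
            } else {
                heap[i] = null;            // unmarked objects are discarded
            }
            marked[i] = false;             // reset the mark for the next cycle
        }
        return free;
    }

    public static void main(String[] args) {
        String[] heap = { "a", "b", "c", "d", "e" };
        boolean[] marked = { false, true, false, true, true };
        int free = compact(heap, marked);
        System.out.println(Arrays.toString(heap) + ", first free index: " + free);
        // [b, d, e, null, null], first free index: 3
    }
}
```

Unlike the copying approach, no second region is needed, and unlike mark-sweep, the free space ends up in one contiguous block.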


 

4. Incremental collecting


This algorithm performs garbage collection while the application is running. I do not know why the collectors in JDK 5.0 do not use this algorithm.


5. Generational collecting

The heap is divided into the young generation (Eden, From Survivor, To Survivor) and the old generation. The young generation is collected with the copying algorithm (minor GC). Objects that are still alive in the young generation after many minor GCs are moved to the old generation. The old generation is collected with the mark-sweep or mark-compact algorithm; this is a major GC/full GC.


-XX:PretenureSizeThreshold=1024: objects larger than this value are allocated directly in the old generation (so avoid writing "short-lived large objects" as much as possible!).

-XX:MaxTenuringThreshold=15: an object that has survived 15 minor GCs in the Survivor spaces is moved to the old generation.

If the total size of all objects of the same age in the Survivor space is greater than half of the Survivor space, objects of that age or older can enter the old generation directly.

During a minor GC, the virtual machine checks whether the average size of previous promotions to the old generation is greater than the remaining space in the old generation. If it is, a full GC is performed directly.
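The promotion rules above can be restated as a small calculation. The sketch below follows the rules exactly as worded in this article and is not HotSpot's actual implementation; TenuringDemo, its method names, and the sample sizes are hypothetical.

```java
class TenuringDemo {
    /**
     * sizeByAge[i] is the total size in bytes of the survivor objects of age i.
     * Returns the age at or above which objects are promoted: the first age whose
     * objects alone exceed half of the Survivor space, capped by maxTenuringThreshold.
     */
    static int promotionAge(long[] sizeByAge, long survivorCapacity, int maxTenuringThreshold) {
        for (int age = 1; age < sizeByAge.length && age < maxTenuringThreshold; age++) {
            if (sizeByAge[age] > survivorCapacity / 2) {
                return age;   // this age group alone fills more than half of the space
            }
        }
        return maxTenuringThreshold;
    }

    /** Objects larger than the pretenure threshold skip the young generation. */
    static boolean allocateInOldGeneration(long objectSize, long pretenureSizeThreshold) {
        return objectSize > pretenureSizeThreshold;
    }

    public static void main(String[] args) {
        long[] sizeByAge = { 0, 100, 600, 50 };   // bytes per age; index 0 is unused
        System.out.println(promotionAge(sizeByAge, 1000, 15));       // 2
        System.out.println(allocateInOldGeneration(2048, 1024));     // true
    }
}
```

With a 1000-byte Survivor space, the 600 bytes of age-2 objects exceed half of the space, so objects of age 2 and older would be promoted even though the configured threshold is 15.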



When evaluating a garbage collection algorithm, we may consider all of the following criteria:

· Pause time. Does the collector stop all work to collect garbage? For how long? Is there an upper bound on pauses?

· Pause predictability. Can garbage collection pauses be scheduled at times convenient for the user program rather than for the garbage collector?

· CPU usage. What percentage of the total available CPU time is spent on garbage collection?

· Memory footprint. Many garbage collection algorithms need to split the heap into separate memory spaces, some of which are inaccessible to the user program at certain times. This means that the actual heap size may be several times larger than the maximum heap residency of the user program.

· Virtual memory interaction. On a system with limited physical memory, a full garbage collection may fault nonresident pages into memory in order to examine them. Because page faults are expensive, it is desirable for the garbage collector to manage locality of reference properly.

· Cache interaction. Even when the whole heap fits in main memory, which is true of almost all Java applications, garbage collection often flushes the data used by the user program out of the cache, which hurts the program's performance.

· Impact on program locality. Although some people think the garbage collector's job is only to reclaim unreachable memory, others think it should also try to improve the locality of reference of the user program. Compacting and copying collectors rearrange objects during collection, which may improve locality.

· Compiler and runtime impact. Some garbage collection algorithms require significant cooperation from the compiler or runtime environment, such as updating reference counts whenever a pointer is assigned. This adds work for the compiler, which must generate these bookkeeping instructions, and adds overhead to the runtime environment, which must execute them. How do these requirements affect performance? Do they interfere with compile-time optimizations?


Regardless of the algorithm, advances in hardware and software have made garbage collection more practical. Early empirical studies show that for large LISP programs, garbage collection consumed 25% to 40% of the runtime. Garbage collection still cannot be made completely invisible, but it has clearly come a long way.


IV. Three garbage collectors


Currently, three collectors are available: the serial collector, the parallel collector, and the concurrent collector.


1. Serial collector


A single thread handles all garbage collection work. Because no multi-thread interaction is needed, its efficiency is relatively high. However, it cannot take advantage of multiple processors, so this collector suits single-processor machines. It can also be used on multi-processor machines when the amount of data is small (about MB). Enable it with -XX:+UseSerialGC.
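To check which collectors a running JVM has actually selected, for example after starting it with a flag such as -XX:+UseSerialGC, the standard java.lang.management API can be queried. The following is a minimal sketch; CollectorInfo is a hypothetical class name, and the collector names printed depend on the JVM version and flags in use.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

class CollectorInfo {
    /** Names of the collectors reported by the running JVM. */
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // getCollectionCount and getCollectionTime return -1 when the value is undefined.
            System.out.println(gc.getName()
                    + ": collections=" + gc.getCollectionCount()
                    + ", time(ms)=" + gc.getCollectionTime());
        }
    }
}
```

Running the same program with different collector flags and comparing the printed names is a quick way to confirm that a flag took effect.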


2. Parallel collector


1) Parallel garbage collection of the young generation reduces garbage collection time. It is generally used on multi-threaded, multi-processor machines. Enable it with -XX:+UseParallelGC. The parallel collector was introduced in update 6 of J2SE 5.0 and enhanced in Java SE 6.0, where it can also collect the old generation in parallel. If the old generation is not collected in parallel, it is collected by a single thread, which limits scalability. Enable parallel old generation collection with -XX:+UseParallelOldGC.

2) Use -XX:ParallelGCThreads=<n> to set the number of parallel garbage collection threads. This value can be set equal to the number of processors in the machine.

3) This collector can be configured as follows:

Maximum garbage collection pause: the maximum pause time for garbage collection, specified with -XX:MaxGCPauseMillis=<n>, where <n> is in milliseconds. If this value is specified, the heap size and garbage collection parameters are adjusted to try to meet it. Setting this value may reduce application throughput.

Throughput: the ratio of garbage collection time to non-garbage collection time, set with -XX:GCTimeRatio=<n>. The fraction of time spent on garbage collection is 1/(1 + n). For example, -XX:GCTimeRatio=19 means that 5% of the time is used for garbage collection. The default value is 99, that is, 1% of the time is used for garbage collection.
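The 1/(1 + n) formula is easy to check with a few lines of code. GcTimeRatioDemo is a hypothetical helper written only for this arithmetic; it does not read or set any JVM option.

```java
import java.util.Locale;

class GcTimeRatioDemo {
    /** Fraction of total time allowed for garbage collection: 1 / (1 + n). */
    static double gcTimeFraction(int gcTimeRatio) {
        return 1.0 / (1 + gcTimeRatio);
    }

    public static void main(String[] args) {
        for (int n : new int[] { 19, 99 }) {
            System.out.println(String.format(Locale.ROOT,
                    "GCTimeRatio=%d -> %.1f%% of time in GC", n, gcTimeFraction(n) * 100));
        }
        // GCTimeRatio=19 -> 5.0% of time in GC
        // GCTimeRatio=99 -> 1.0% of time in GC
    }
}
```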


3. Concurrent collector


It ensures that most of the work is performed concurrently (the application does not stop), so garbage collection pauses the application only briefly. This collector suits medium and large applications with demanding response time requirements. Enable it with -XX:+UseConcMarkSweepGC.

1) The concurrent collector mainly reduces pause times in the old generation. It uses a separate garbage collection thread to trace reachable objects without stopping the application. In each old generation collection cycle, the concurrent collector pauses the whole application briefly at the start of the collection and pauses it again partway through. The second pause is slightly longer than the first, and during it multiple threads perform garbage collection work at the same time.

2) The concurrent collector trades processor resources for shorter pauses. On a system with N processors, the concurrent part of the collection uses K/N of the available processors, where generally 1 <= K <= N/4.

3) When the concurrent collector is used on a host with only one processor, setting it to incremental mode can also give short pause times.

4) Floating garbage: because garbage is collected while the application is running, some new garbage may have been produced by the time a collection finishes. This "floating garbage" can only be reclaimed in the next garbage collection cycle. For this reason, the concurrent collector generally needs about 20% of the space reserved for floating garbage.

5) Concurrent mode failure: the concurrent collector collects while the application is running, so the heap must have enough space for the application during the collection. Otherwise the heap fills up before the collection completes. In that case a "concurrent mode failure" occurs, and the entire application is paused for garbage collection.

6) Starting the concurrent collector: because concurrent collection runs while the application is running, there must be enough memory for the program to use until the collection completes; otherwise a "concurrent mode failure" occurs. Set -XX:CMSInitiatingOccupancyFraction=<n> to specify how full the old generation must be, as a percentage, before a concurrent collection starts.
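To get a feel for occupancy percentages, the heap memory pools of a running JVM can be inspected through java.lang.management. The sketch below prints the occupancy of each heap pool and includes a toy check that mirrors the idea of an initiating occupancy; OccupancyDemo and shouldStartConcurrentCycle are hypothetical names, and the check is an illustration of the concept, not the collector's real trigger logic.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;

class OccupancyDemo {
    /** Occupancy as a percentage, or -1 when the maximum is undefined. */
    static double occupancyPercent(long used, long max) {
        return max <= 0 ? -1 : 100.0 * used / max;
    }

    /** Toy check: has occupancy reached the initiating percentage? */
    static boolean shouldStartConcurrentCycle(long used, long max, int initiatingPercent) {
        double percent = occupancyPercent(used, max);
        return percent >= 0 && percent >= initiatingPercent;
    }

    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP) {
                continue;                      // skip non-heap pools such as the code cache
            }
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue;                      // the pool is no longer valid
            }
            // getMax() returns -1 when the pool has no defined maximum.
            System.out.println(pool.getName() + ": "
                    + occupancyPercent(usage.getUsed(), usage.getMax()) + "%");
        }
    }
}
```

Pool names differ between collectors, so the output is best read alongside the collector flags the JVM was started with.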


V. Additional notes on garbage collection

From the descriptions above, garbage collection has the following characteristics:

(1) When garbage collection happens is unpredictable. Because different garbage collection algorithms and collection mechanisms are implemented, collection may occur at regular intervals, when the system's CPU is idle, or, as in the original garbage collectors, when memory consumption reaches a limit. This depends on which garbage collector is selected and how it is configured.

(2) Accuracy of garbage collection. This covers two aspects: (a) the garbage collector can accurately mark live objects; (b) the garbage collector can accurately locate the reference relationships between objects. The former is the prerequisite for completely reclaiming all discarded objects; without it, memory leaks may result. The latter is a necessary condition for implementing algorithms such as compacting and copying. With both, all unreachable objects can be reliably reclaimed and all objects can be relocated, which allows objects to be copied and object memory to be compacted, effectively preventing memory fragmentation.

(3) There are many different kinds of garbage collectors, each with its own algorithm and its own performance characteristics. Some stop the application when garbage collection starts, some allow application threads to keep running while garbage collection proceeds, and some run garbage collection in multiple threads at the same time.

(4) The implementation of garbage collection is closely tied to the specific JVM and its memory model. Different JVMs may adopt different garbage collection approaches, and the JVM's memory model determines which kinds of garbage collection it can use. Today, the memory systems in the HotSpot family of JVMs use an advanced object-oriented framework design, which allows this family of JVMs to adopt the most advanced garbage collection techniques.

(5) As technology develops, modern garbage collection offers many optional garbage collectors, each configurable with different parameters, which makes it possible to obtain the best application performance for each application environment.






