Java performance optimization [3]: Garbage Collection (GC)

The previous post covered performance problems caused by the different ways memory is allocated for reference types and primitive types. Today, let's talk about topics related to memory release (mainly GC).
A note in advance: although Sun has been acquired by Oracle, this post still says "Sun", partly out of habit and partly because it is shorter to type.

JVM memory
In the Java Virtual Machine specification (For details, refer to "here"), the following types of memory space are mentioned:
◇ Stack memory (stack): Private to each thread.
◇ Heap memory (heap): shared by all threads.
◇ Method area: stores the reflection information of each loaded class, the code of the class's methods, compile-time constants, and so on.
◇ Native method stack: mainly used for native code called through JNI; it is rarely involved.

The previous post already covered the basics of stack memory and heap memory. Today's topic, garbage collection, mainly concerns heap memory. The other memory areas are not the focus of today's discussion; we may come back to them later.

Introduction to the garbage collection mechanism
In fact, the Java Virtual Machine specification does not specify the details of garbage collection. How garbage collection is implemented is left entirely to JVM designers, so GC behavior may differ between JVMs. Below, we take Sun's official JVM as the example and briefly introduce its GC mechanism.
◇ When will garbage collection be performed?
Generally, the JVM starts garbage collection when it finds that heap memory is running short. But everyone needs to accept a cruel fact: the timing of JVM GC is unpredictable. When GC starts depends on many factors in the running environment and is highly variable.
Although we cannot predict it accurately, it is easy to find out what each garbage collection did: the JVM command line option "-XX:+PrintGC" prints the relevant information.
In addition, calling System.gc() only suggests that the JVM run a GC; whether the JVM actually does so is hard to say. Generally, calling System.gc() manually is not recommended; let the JVM decide on its own. You can also use the JVM command line option "-XX:+DisableExplicitGC" to disable System.gc().
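To see for yourself that System.gc() is only a request, the following minimal sketch compares the JVM's reported collection counts before and after the call. It uses the standard java.lang.management API; the class and method names (ExplicitGcHint, totalCollections, collectionsAfterHint) are made up for this illustration, and the result depends on the JVM and its options.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

final class ExplicitGcHint {
    // Sums the collection counts reported by all collectors in this JVM.
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount(); // -1 means "not available"
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    // Requests a GC and returns how many collections were counted around the call.
    static long collectionsAfterHint() {
        long before = totalCollections();
        System.gc(); // a request only; it has no effect under -XX:+DisableExplicitGC
        return totalCollections() - before;
    }

    public static void main(String[] args) {
        System.out.println("Collections counted around System.gc(): " + collectionsAfterHint());
    }
}
```

Running it once normally and once with "-XX:+DisableExplicitGC" should show the difference; a result of 0 means the request was not honored during the call.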
◇ Who is responsible for garbage collection?
In general, the JVM has one or more dedicated garbage collection threads, which are responsible for reclaiming garbage memory.
◇ How are garbage objects discovered?
The garbage collection thread traverses object references starting from the root set. The so-called "root set" is the set of reference variables that running threads can access (for example, the parameters and local variables of the current function of every thread, the member variables of the current class, and so on). The garbage collection thread first finds all objects directly referenced by the root set (set 1), then all objects directly referenced by set 1 (set 2), then all objects directly referenced by set 2, and so on, until every object that can be reached has been visited.
All objects that can be reached from the root set through this traversal are called reachable objects or live objects; all others are unreachable objects or dead objects (that is, garbage).
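The traversal described above can be sketched as a breadth-first walk over an object graph. This is a toy model, not how the JVM is implemented: objects are represented by names, the root set is given as the names of the objects the roots refer to, and ReachabilitySketch and reachable are hypothetical names chosen for this illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class ReachabilitySketch {
    // references.get(x) lists the objects that x refers to directly.
    static Set<String> reachable(Set<String> rootSet, Map<String, List<String>> references) {
        Set<String> marked = new LinkedHashSet<>(rootSet);
        Deque<String> pending = new ArrayDeque<>(rootSet);
        while (!pending.isEmpty()) {
            String current = pending.removeFirst();
            for (String target : references.getOrDefault(current, List.of())) {
                if (marked.add(target)) { // first visit: mark it and scan it later
                    pending.addLast(target);
                }
            }
        }
        return marked;
    }

    public static void main(String[] args) {
        Map<String, List<String>> references = Map.of(
                "A", List.of("B"),
                "B", List.of("C"),
                "X", List.of("Y"),  // X and Y refer to each other,
                "Y", List.of("X")); // but nothing reachable refers to them
        System.out.println(reachable(Set.of("A"), references));
    }
}
```

With A as the only root, A, B and C are reachable. X and Y are garbage even though they reference each other, which is why tracing from the root set copes with reference cycles.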
◇ How to clear/recycle garbage?
The stage above finds all the garbage objects. The garbage collection thread then cleans up and reclaims the garbage, which includes turning the garbage memory back into available memory, compacting memory to eliminate fragmentation, and so on. This process involves several algorithms. If you are interested, refer to "here". We will not go into further detail due to space limitations.
◇ Generational
Early JVMs did not use generational technology, and all GC-managed objects were stored in the same heap. The disadvantage is obvious: every GC has to traverse all objects, which is expensive. In practice, most objects have a short life cycle (short-lived objects), and only a few objects live long. Among the short-lived objects, only a few occupy a large amount of memory; the great majority are small objects (which conforms to the principle).
With this in mind, since JDK 1.2 the JVM has used generational garbage collection. The JVM divides GC-related memory into the old generation (tenured), the young generation (nursery), and the permanent generation (permanent). The majority of objects are created in the young generation. If an object is still alive after several rounds of GC, it is moved to the old generation. In addition, if an object is large when it is created, it may be placed directly in the old generation. With this policy, the young generation mostly holds small, short-lived objects. In terms of size, the young generation is relatively small and the old generation is relatively large.
With generational technology, JVM GC is also divided into two types: major collection and minor collection. A major collection cleans up both the old and the young generation, so its overhead is large and it runs infrequently; a minor collection cleans up only the young generation, so its overhead is small and it runs frequently.
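The promotion policy and the two kinds of collection can be illustrated with a toy model. This sketch is an assumption-laden simplification, not the JVM's actual algorithm: reachability is replaced by a manual live flag, the promotion threshold of 3 is an arbitrary value, and all names (GenerationalSketch, Obj, TENURING_THRESHOLD) are hypothetical.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

final class GenerationalSketch {
    static final int TENURING_THRESHOLD = 3; // arbitrary value for this sketch

    static final class Obj {
        final String name;
        boolean live = true; // stands in for "still reachable from the root set"
        int age = 0;         // number of minor collections survived
        Obj(String name) { this.name = name; }
    }

    final List<Obj> young = new ArrayList<>();
    final List<Obj> old = new ArrayList<>();

    // New objects start out in the young generation.
    Obj allocate(String name) {
        Obj o = new Obj(name);
        young.add(o);
        return o;
    }

    // Minor collection: only the young generation is examined.
    void minorCollection() {
        Iterator<Obj> it = young.iterator();
        while (it.hasNext()) {
            Obj o = it.next();
            if (!o.live) {
                it.remove();                 // garbage: reclaim it
            } else if (++o.age >= TENURING_THRESHOLD) {
                it.remove();                 // survived long enough: promote it
                old.add(o);
            }
        }
    }

    // Major collection: both generations are examined.
    void majorCollection() {
        minorCollection();
        old.removeIf(o -> !o.live);
    }

    public static void main(String[] args) {
        GenerationalSketch heap = new GenerationalSketch();
        heap.allocate("cache");           // long-lived object
        Obj temp = heap.allocate("temp"); // short-lived object
        temp.live = false;
        for (int i = 0; i < TENURING_THRESHOLD; i++) {
            heap.minorCollection();
        }
        System.out.println("young=" + heap.young.size() + ", old=" + heap.old.size());
    }
}
```

Note that a dead object in the old list is not reclaimed by minorCollection(); it stays until the next majorCollection(), which mirrors why major collections are the expensive, infrequent ones.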

What is the impact of GC on performance?
I just introduced the general principle of GC. What are the impacts of GC on performance? There are several main aspects:
◇ Pausing the running threads
Early GC was rather crude. While it ran, all other threads were suspended (so that they could not interfere with garbage collection), and they resumed only after the GC had finished. Therefore, once the GC of an early JDK started working, the whole program appeared frozen and stopped responding.
After years of technical improvement (including the adoption of generational technology), GC has become smarter since JDK 1.4. While it runs, it suspends other threads only occasionally (going from long freezes to brief pauses).
◇ Overhead of object reference traversal
Imagine a JVM that contains a very large number of objects: traversing all reachable objects takes considerable effort. This overhead is not small.
◇ Overhead of clearing and reclaiming garbage
After the object references have been traversed, clearing and reclaiming the garbage is also expensive. This part of the overhead may include copying memory blocks and updating object references.

Several collectors
◇ Two performance indicators
Since today's topic is performance, we must mention the two important indicators used to measure GC performance: throughput and pause time. The term throughput is not very intuitive; it can be understood as the percentage of total time that the JVM does not spend on GC. The higher the throughput, the better; the shorter the pause time, the better.
Different applications care about these two indicators to different degrees (described later); as the saying goes, it is hard to please everyone. To cater to everyone's taste, many JVM vendors provide several garbage collectors for users to choose from. Different collectors use different collection policies.
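As a rough illustration of the throughput definition above, the following sketch computes the percentage from a GC time and a total time, and estimates both for the current JVM through the standard java.lang.management API. GcThroughput and its method names are hypothetical, and the collection time reported by the JVM is an approximation that is not the same thing as pause time.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

final class GcThroughput {
    // Throughput: the share of total time NOT spent in GC, in percent.
    static double throughputPercent(long gcMillis, long totalMillis) {
        if (totalMillis <= 0) {
            throw new IllegalArgumentException("totalMillis must be positive");
        }
        return 100.0 * (totalMillis - gcMillis) / totalMillis;
    }

    // Approximate accumulated collection time reported by all collectors.
    static long gcMillisSoFar() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 means "not available"
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long uptime = Math.max(1, ManagementFactory.getRuntimeMXBean().getUptime());
        System.out.printf("GC throughput so far: %.2f%%%n",
                throughputPercent(gcMillisSoFar(), uptime));
    }
}
```

For example, 50 ms of GC in a 1000 ms run gives a throughput of 95%.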
◇ Serial collector
Use the command line option "-XX:+UseSerialGC" to select it.
This is the most traditional collector. It uses a single thread for garbage collection, which suits single-CPU machines. The serial collector can also be used for small applications that have no special requirements for the two indicators above.
◇ Parallel collector (parallel throughput collector)
As the name suggests, this collector uses multiple threads for garbage collection to achieve high throughput. The number of garbage collection threads is specified by the command line option "-XX:ParallelGCThreads=N". You can set this value to make full use of multiple CPUs/cores.
With the command line option "-XX:+UseParallelGC", it uses multiple garbage collection threads for the young generation and the single-threaded serial mode for the old generation. This option was first introduced in JDK 1.5.
With the command line option "-XX:+UseParallelOldGC", it uses multiple garbage collection threads for both the young and the old generation. However, this option was introduced only in JDK 1.6.
◇ Concurrent collector (concurrent low pause collector)
Use the command line option "-XX:+UseConcMarkSweepGC" to select it.
This type of collector gives priority to program response. It tries its best to run the garbage collection thread and the application thread at the same time, thus reducing the pause time. This option is supported from JDK 1.4.1.
◇ Incremental collector
Sun stopped maintaining this collector as of JDK 1.4.2, so I will save my breath.
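To check which collectors a given set of command line options actually selected, a program can ask the JVM at run time. The sketch below uses the standard java.lang.management API; the class name ActiveCollectors is made up for this illustration, and the names it prints depend on the JVM vendor and version.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

final class ActiveCollectors {
    // One line per collector: its name and the memory pools it manages.
    static List<String> describeCollectors() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            lines.add(gc.getName() + " -> " + String.join(", ", gc.getMemoryPoolNames()));
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : describeCollectors()) {
            System.out.println(line);
        }
    }
}
```

Running it with different collector options should show different collector names for the young and old generations.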

How can we reduce the impact of GC?
◇ Minimize heap memory usage
GC deals with objects stored in heap memory. If we reduce the allocation of reference-type objects in the program (and thereby reduce heap allocation), GC performance improves noticeably. The previous post "string filtering practices" gave an example of improving performance by reducing the number of heap allocations.
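As a small illustration of reducing heap allocation (not the example from the earlier post), compare a boxed accumulator with a primitive one. Both methods return the same sum, but the boxed version may create a new Long object on each iteration unless the JIT compiler optimizes the allocation away. The names AllocationSketch, sumBoxed and sumPrimitive are hypothetical.

```java
final class AllocationSketch {
    // Boxed accumulator: each "+=" unboxes, adds, and boxes the result again,
    // which may allocate a new Long on the heap.
    static long sumBoxed(int n) {
        Long total = 0L;
        for (int i = 0; i < n; i++) {
            total += i;
        }
        return total;
    }

    // Primitive accumulator: no objects are created in the loop.
    static long sumPrimitive(int n) {
        long total = 0L;
        for (int i = 0; i < n; i++) {
            total += i;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sumBoxed(1000) + " " + sumPrimitive(1000));
    }
}
```

Running both in a long loop with "-XX:+PrintGC" enabled is one way to check whether the boxed version triggers more minor collections on your JVM.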

◇ Set the appropriate heap memory size
Sizing the JVM heap takes some care: it can be neither too large nor too small. If the heap is too small, the JVM constantly finds memory insufficient, which may lead to frequent garbage collection and hurt performance. If the heap is too large, the JVM itself occupies most of the physical memory of the operating system, which may hurt the performance of other applications and even of the operating system itself.
In addition, the size of the young generation (or the ratio of the young generation to the old generation) has a significant impact on GC performance. If the young generation is too small, minor collections may become frequent; if the young generation is too large, the pauses of minor collections may become noticeable.
The JVM provides several command line options related to heap size, listed below:
------------------------------
-Xms: set the initial heap size.
-Xmx: set the maximum heap size.
-Xmn: set the size of the young generation.
-XX:NewRatio=N: set the ratio of the young generation to the old generation to 1:N.
-XX:NewSize=N: set the young generation size to N.
------------------------------
Generally, the JVM's default parameter values are good enough, so do not use the options above lightly. If you really must tune them, run in-depth performance comparison tests to make sure that performance after tuning is indeed better than with the default values.
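Before and after changing any heap size option, it helps to confirm what the JVM actually ended up with. A minimal sketch using the standard Runtime API follows; the class name HeapSettings and its method names are made up for this illustration.

```java
final class HeapSettings {
    // Upper bound the heap may grow to (influenced by the maximum heap option).
    static long maxHeapBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    // Memory currently held by the JVM for the heap.
    static long committedHeapBytes() {
        return Runtime.getRuntime().totalMemory();
    }

    // Part of the currently held memory that is not in use.
    static long freeHeapBytes() {
        return Runtime.getRuntime().freeMemory();
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        System.out.println("max heap:       " + maxHeapBytes() / mb + " MB");
        System.out.println("committed heap: " + committedHeapBytes() / mb + " MB");
        System.out.println("free heap:      " + freeHeapBytes() / mb + " MB");
    }
}
```

Printing these values at startup of each test run makes it easier to tell which heap configuration a measurement belongs to.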

◇ Throughput and pause
As mentioned above, different applications have different tastes, and it is hard to please them all. Two tastes are common: (1) throughput matters and pause time does not; (2) pause time matters.
Compute-intensive background applications, such as some scientific computing applications, belong to the first kind. The parallel collector is recommended for them.
The second kind covers applications with user interface interaction or high real-time requirements, where the program must respond quickly, such as some desktop games and some telecommunications switching systems. The concurrent collector is recommended for them.

References
◇ GC optimization data
Sun provides several guides on JVM garbage collection tuning. For JDK 1.4.2, see "here"; for JDK 1.5, see "here"; for JDK 1.6, see "here".
◇ JVM command line option description
"Here" is a list of the various command line options, carefully organized by Sun, including the JDK versions each option applies to.
◇ VM specifications
"Here" is Sun's official JVM specification.

Copyright Notice
All original articles in this blog are copyrighted by the author. Reprints must keep this statement and the article intact, and must credit the author and the original address in the form of a hyperlink:

http://program-think.blogspot.com/2009/04/java-performance-tuning-3-gc.html
