Java Garbage Collection Distilled, Part 1

Source: Internet
Author: User

Java Garbage Collection Distilled is divided into 4 parts; this is part 1. It covers the tradeoffs, object lifetimes, and stop-the-world events.

Serial, Parallel, Concurrent, CMS, G1, Young Gen, New Gen, Old Gen, Perm Gen, Eden, Tenured, Survivor Spaces, Safepoints, and hundreds of JVM startup flags. Does it all overwhelm you when you try to tune the garbage collector to get the throughput and latency your Java application needs? If so, don't worry, you are not alone. The documentation on garbage collection reads like an aircraft operating manual: every knob and dial is described in detail, but nowhere does it explain how to fly. This article attempts to explain the tradeoffs involved in choosing and tuning garbage collection algorithms for a particular workload.

This article focuses on Oracle's HotSpot JVM and OpenJDK, which are the most commonly used. Other commercial JVMs are discussed at the end as alternatives.

The Tradeoffs

As the saying goes, you don't get something for nothing. If we want one thing, we usually have to give up something else. When it comes to garbage collection, we set GC targets in terms of three main variables:

    1. Throughput: The amount of work an application does relative to the time spent on garbage collection. Set the target throughput with "-XX:GCTimeRatio=99"; 99 is the default, which means that 1% of the time is spent on garbage collection.
    2. Latency: The time for which the system is unresponsive because of garbage collection pauses. Set the target for GC pause time with "-XX:MaxGCPauseMillis=<n>".
    3. Memory: The amount of memory our system uses to store state, which is often copied and moved around while being managed. The set of active objects retained by the application at any point in time is called the live set. "-Xmx<n>" sets the maximum heap size and so tunes the heap available to the application.

Note: HotSpot often cannot meet these targets. When that happens it gives no warning and silently keeps running, even when it has missed a target by a wide margin.
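Because HotSpot does not report a missed throughput target, one option is to check it from inside the application. The sketch below assumes the usual reading of "-XX:GCTimeRatio=N" as a target GC share of 1 / (1 + N), and uses the standard java.lang.management beans to estimate the observed share. The class and method names are illustrative only, and the observed figure is an approximation: for concurrent collectors the reported collection time may include work done while the application was running.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

final class GcOverheadCheck {
    // Target fraction of time in GC implied by -XX:GCTimeRatio=N, i.e. 1 / (1 + N).
    static double targetGcFraction(int gcTimeRatio) {
        return 1.0 / (1.0 + gcTimeRatio);
    }

    // Approximate observed fraction: accumulated collection time over JVM uptime.
    static double observedGcFraction() {
        long gcMillis = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 when the collector does not report it
            if (t > 0) {
                gcMillis += t;
            }
        }
        long uptimeMillis = ManagementFactory.getRuntimeMXBean().getUptime();
        return uptimeMillis > 0 ? (double) gcMillis / uptimeMillis : 0.0;
    }

    public static void main(String[] args) {
        double target = targetGcFraction(99); // the default ratio
        double observed = observedGcFraction();
        System.out.printf("target %.2f%%, observed %.2f%%%n", target * 100, observed * 100);
        if (observed > target) {
            System.out.println("GC overhead is above the configured target");
        }
    }
}
```

A check like this could run periodically in a long-lived service and feed a metric or a log line.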

Latency is a distribution across events. It may be acceptable to give up a little average latency in order to reduce the worst-case latency or to make it occur less often. We should not interpret the term "real-time" as the lowest possible latency, but rather as a deterministic latency regardless of throughput.
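To make the distinction between average and worst-case latency concrete, here is a minimal sketch that summarizes two sets of pause times. The numbers are made up for illustration and are not measurements; the class name and the nearest-rank percentile method are assumptions of this example.

```java
import java.util.Arrays;

final class PauseStats {
    static double mean(long[] pausesMs) {
        return Arrays.stream(pausesMs).average().orElse(0.0);
    }

    // Nearest-rank percentile of a non-empty sample: the smallest value such
    // that at least p percent of the samples are less than or equal to it.
    static long percentile(long[] pausesMs, double p) {
        long[] sorted = pausesMs.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p / 100.0 * sorted.length);
        return sorted[Math.max(rank, 1) - 1];
    }

    public static void main(String[] args) {
        // Hypothetical pause times in milliseconds.
        long[] steady = {12, 12, 13, 12, 13, 12, 13, 12, 13, 12};
        long[] spiky = {5, 5, 5, 5, 5, 5, 5, 5, 5, 60};

        System.out.printf("steady: mean=%.1f ms, worst=%d ms%n",
                mean(steady), percentile(steady, 100));
        System.out.printf("spiky:  mean=%.1f ms, worst=%d ms%n",
                mean(spiky), percentile(spiky, 100));
    }
}
```

The spiky profile has the lower mean but a far worse maximum, which is why a latency-sensitive system may prefer the steady one.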

For some workloads, throughput is the most important target. A long-running batch job is one example: it does not matter if the job is occasionally paused for a few seconds of garbage collection, as long as the job as a whole finishes sooner.

For virtually all other workloads, from human-facing interactive applications to financial trading systems, it can spell trouble if the system stops responding for more than a few seconds, or in some cases even a few milliseconds. In a financial trading system it is often worth sacrificing some throughput in exchange for consistent latency. An application may also be limited by the amount of physical memory available and have to stay within a given footprint, in which case we have to give up performance on both latency and throughput.

The tradeoffs usually play out as follows:

    • To a large extent, the amortized cost of garbage collection can be reduced by giving the garbage collection algorithms more memory.
    • The observed worst-case pauses caused by garbage collection can be reduced by containing the live set and keeping the heap small.
    • The frequency of pauses can be reduced by managing the heap and generation sizes, and by controlling the application's object allocation rate.
    • The frequency of long pauses can be reduced by running the GC concurrently with the application, sometimes at the expense of throughput.

Object Lifetimes

Garbage collection algorithms are usually optimized on the assumption that most objects live for a very short time and only a few live for long. In most applications, the objects that survive for a significant period make up a very small percentage of all the objects allocated over time. In garbage collection theory, this observed behavior is often called "infant mortality" or the "weak generational hypothesis". For example, most loop iterators are short-lived, while static strings are effectively immortal.

Experiments have shown that generational garbage collectors can usually sustain an order of magnitude more throughput than non-generational collectors, which is why they are used in almost all server JVMs. By separating objects into generations, we know that a region of newly allocated objects is likely to contain very few live objects. A collector that scavenges the few live objects in this new region and copies them to another region for older objects can therefore be very efficient. HotSpot's garbage collectors record the age of an object as the number of GC cycles it has survived.
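The idea of aging and promotion can be sketched with a toy model. This is not how HotSpot is implemented: the class, the explicit "reachable" flag, and the fixed tenuring threshold are simplifications assumed here purely to show objects being aged by the number of collections survived and then promoted.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

final class ToyGenerationalHeap {
    static final class Obj {
        final String name;
        boolean reachable = true; // stand-in for "still referenced from a GC root"
        int age = 0;              // number of minor collections survived
        Obj(String name) { this.name = name; }
    }

    final int tenuringThreshold;
    final List<Obj> young = new ArrayList<>();
    final List<Obj> old = new ArrayList<>();

    ToyGenerationalHeap(int tenuringThreshold) {
        this.tenuringThreshold = tenuringThreshold;
    }

    Obj allocate(String name) {
        Obj o = new Obj(name);
        young.add(o); // new objects always start in the young generation
        return o;
    }

    // Minor collection: drop unreachable young objects, age the survivors,
    // and promote those that have reached the tenuring threshold.
    void minorCollect() {
        Iterator<Obj> it = young.iterator();
        while (it.hasNext()) {
            Obj o = it.next();
            if (!o.reachable) {
                it.remove();
                continue;
            }
            o.age++;
            if (o.age >= tenuringThreshold) {
                it.remove();
                old.add(o);
            }
        }
    }

    public static void main(String[] args) {
        ToyGenerationalHeap heap = new ToyGenerationalHeap(2);
        heap.allocate("cache"); // long-lived object
        for (int cycle = 0; cycle < 3; cycle++) {
            heap.allocate("temp" + cycle).reachable = false; // dies before the next collection
            heap.minorCollect();
            System.out.printf("cycle %d: young=%d old=%d%n",
                    cycle, heap.young.size(), heap.old.size());
        }
    }
}
```

In this model the short-lived temporaries never leave the young generation, while the long-lived object is promoted after surviving two collections.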

Note: If your application consistently produces a large number of objects that live for a fairly long time, expect it to spend a significant share of its time on garbage collection, and expect to spend a significant share of your own time tuning HotSpot's garbage collectors. In this case the generational "filter" is less effective, so the older generations have to be collected more often, which costs a lot of time and makes GC inefficient. Older generations are less sparse, so the algorithms that collect them tend to be much less efficient.

Generational garbage collectors tend to operate in two distinct collection cycles: minor collections, in which short-lived objects are collected, and the less frequent major collections, in which the old generation regions are collected.
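On a running JVM, the two kinds of cycle usually show up as separate collector beans. The sketch below lists them with the standard java.lang.management API. Bean names depend on the collector in use (for example, "G1 Young Generation" and "G1 Old Generation" under G1), so the code makes no assumption about them; the class name and the temporary allocation loop are only there for illustration.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

final class CollectorReport {
    // One line per collector: its name, how many collections it has run,
    // and the accumulated collection time in milliseconds.
    static List<String> describeCollectors() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            lines.add(String.format("%s: collections=%d, timeMs=%d",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime()));
        }
        return lines;
    }

    public static void main(String[] args) {
        // Create some short-lived garbage so that young collections are likely to run.
        long checksum = 0;
        for (int i = 0; i < 200_000; i++) {
            byte[] temp = new byte[1024];
            checksum += temp.length;
        }
        System.out.println("allocated bytes: " + checksum);
        describeCollectors().forEach(System.out::println);
    }
}
```

In a healthy generational workload, the young collector's count would typically be much higher than the old collector's.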

Stop-the-World Events

The pauses an application suffers during garbage collection are called "stop-the-world events". For practical engineering reasons, a garbage collector has to stop the running application periodically so that memory can be managed. Depending on the algorithm, different collectors stop the world at different points of execution and for different lengths of time. To bring the application to a complete stop, all running threads must be paused. The garbage collector does this by signaling the threads to stop when they reach a "safepoint", a point in program execution at which all GC roots are known and the contents of all heap objects are consistent. Depending on what a thread is doing, it may take some time to reach a safepoint. Safepoint checks are normally performed at method returns and loop back edges, but they can be optimized away in some places, which makes them rarer at run time. For example, if a thread is copying a large array, cloning a large object, or running a counted loop with a finite bound, it may take several milliseconds to reach a safepoint. Time To Safepoint (TTSP) is an important consideration for low-latency applications. This time can be reported by enabling the "-XX:+PrintGCApplicationStoppedTime" flag in addition to the other GC flags.

Note: For applications with a large number of running threads, a stop-the-world event puts the system under significant scheduling pressure when the threads resume after being released from the safepoint. Algorithms that rely less on stop-the-world events can therefore be more efficient.
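The counted-loop case can be illustrated with two loops that compute the same sum. The comments describe behavior commonly reported for HotSpot's JIT compiler and should be read as an assumption: whether a safepoint poll is actually kept or removed depends on the JVM version and flags, and newer releases handle counted loops differently. The class and method names are illustrative.

```java
final class SafepointLoops {
    // An int-indexed loop with a fixed bound is a "counted loop". The JIT
    // compiler may remove the safepoint poll from its back edge, so a thread
    // inside a very long loop of this kind can delay a stop-the-world pause.
    static long sumCounted(int[] data) {
        long sum = 0;
        for (int i = 0; i < data.length; i++) {
            sum += data[i];
        }
        return sum;
    }

    // With a long index the loop has traditionally not been treated as a
    // counted loop, so a poll is more likely to remain on the back edge.
    static long sumWithLongIndex(int[] data) {
        long sum = 0;
        for (long i = 0; i < data.length; i++) {
            sum += data[(int) i];
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] data = new int[1_000_000];
        java.util.Arrays.fill(data, 1);
        System.out.println(sumCounted(data) == sumWithLongIndex(data));
    }
}
```

Both methods return the same result; the difference is only in how quickly the thread may be able to respond to a safepoint request while the loop is running.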
