Java Memory area and GC mechanism

Last Update:2015-09-23 Source: Internet

Author: User


Java Garbage Collection

The Java GC mechanism (GC stands for garbage collection, and also for the garbage collector) is one of the main differences between Java and C++. The Java virtual machine provides automatic memory management and garbage cleanup. In a nutshell, this mechanism marks memory in the JVM, determines which memory needs to be reclaimed, and reclaims it automatically according to a collection strategy. It does this continuously to keep memory available in the JVM and to prevent memory leaks and overflow problems.

The Java GC mechanism does three things:

1 Determine which memory needs to be reclaimed

2 Determine when to perform GC

3 Determine how to perform GC

Learning the Java GC mechanism helps you troubleshoot memory overflow and leak issues, address performance bottlenecks, achieve higher concurrency, and write more efficient programs.

To understand the Java GC mechanism, you must first understand how memory is partitioned in the JVM. In the Java runtime data area, the memory managed by the JVM is divided into several areas:

1) Program counter (Program Counter Register)

The program counter is a small area of memory that indicates which line of bytecode the current thread has reached; it can be understood as the line-number indicator of the current thread. When the bytecode interpreter works, it fetches the next instruction by changing the value of the counter. Each program counter records the position of only one thread, so it is thread-private (each thread has its own program counter).

If the thread is executing a Java method, the counter records the address of the virtual machine bytecode instruction being executed. If it is executing a native method (for example, one written in C), the value of the counter is undefined. Because the program counter only records the current instruction address, it cannot run out of memory, so the program counter is the only JVM memory area for which no OutOfMemoryError is defined.

2) Virtual machine stack (JVM stack)

Each time a thread executes a method, a stack frame is created. The stack frame stores the local variable table, the operand stack, the dynamic link, the method exit, and so on. When the method is called, its stack frame is pushed onto the JVM stack, and when the method completes, the stack frame is popped.

The local variable table stores the method's local variables, including the primitive data types, object references, return addresses, and so on. In the local variable table, only the long and double types occupy 2 local variable slots (on 32-bit machines, one slot is 32 bits); all other types occupy 1 slot. Note that the size of the local variable table is determined at compile time: the space a method needs in its stack frame is fully known in advance and does not change while the method runs.

Two exceptions are defined for the virtual machine stack. A StackOverflowError is thrown if the stack depth requested by a thread exceeds the maximum depth the virtual machine allows. However, most Java virtual machines allow the virtual machine stack to be extended dynamically (a few use a fixed length), so a thread can keep requesting stack space until memory runs out, at which point an OutOfMemoryError is thrown. Each thread has its own virtual machine stack, so the virtual machine stack is also thread-private.
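The stack-depth limit described above can be observed with a minimal sketch like the following. The class and method names are made up for illustration, and the depth reached depends on the JVM, the frame size, and the thread stack size (for example as set with -Xss), so no particular number should be expected.

```java
class StackDepthDemo {
    private static int depth = 0;

    // Recurse without a base case so that every call pushes a new stack frame.
    private static void recurse() {
        depth++;
        recurse();
    }

    // Returns the depth reached before the JVM threw StackOverflowError.
    static int measureDepth() {
        depth = 0;
        try {
            recurse();
        } catch (StackOverflowError e) {
            // The thread's stack is exhausted; frames unwind as the error propagates.
        }
        return depth;
    }

    public static void main(String[] args) {
        System.out.println("Stack overflowed at depth " + measureDepth());
    }
}
```

Catching StackOverflowError is done here only to report the depth; production code would normally fix the unbounded recursion instead.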

3) Native method stack (Native Method Stack)

The native method stack is the same as the virtual machine stack in function, operating mechanism, and exception types. The only difference is that the virtual machine stack executes Java methods, while the native method stack executes native methods. In many virtual machines (such as HotSpot, the default virtual machine of Sun's JDK), the native method stack and the virtual machine stack are combined.

4) Heap area

The heap is, bar none, the most important area for understanding the Java GC mechanism. Of the memory managed by the JVM, the heap is the largest chunk and the main area managed by the Java GC mechanism. It is shared by all threads and created when the virtual machine starts. The heap exists to store object instances; in principle, all objects are allocated on the heap. According to the Java Virtual Machine specification, heap memory must be logically contiguous (it need not be physically contiguous). An implementation may make the heap fixed-size or extensible; current mainstream virtual machines make it extensible. If there is still not enough memory for an allocation after garbage collection and the heap cannot be expanded further, an OutOfMemoryError: Java heap space exception is thrown.
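As a small illustration of the fixed-size versus extensible heap, the standard Runtime API reports the currently committed heap and the maximum the JVM will try to use. The class name below is hypothetical; the values printed depend entirely on the machine and on JVM settings such as -Xms and -Xmx.

```java
class HeapInfoDemo {
    // Returns {free, total, max} heap sizes in bytes as reported by the running JVM.
    static long[] heapSizes() {
        Runtime rt = Runtime.getRuntime();
        return new long[] { rt.freeMemory(), rt.totalMemory(), rt.maxMemory() };
    }

    public static void main(String[] args) {
        long[] s = heapSizes();
        long mb = 1024 * 1024;
        System.out.println("free  = " + s[0] / mb + " MB"); // unused part of the committed heap
        System.out.println("total = " + s[1] / mb + " MB"); // heap currently committed
        System.out.println("max   = " + s[2] / mb + " MB"); // limit the heap may grow to
    }
}
```

When total is smaller than max, the heap can still be extended before an OutOfMemoryError becomes possible.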

5) Method area

In the Java Virtual Machine specification, the method area is treated as a logical part of the heap, but in practice the method area is not the heap (it is referred to as non-heap). In addition, many blogs divide the Java GC generational collection mechanism into 3 generations: the young generation, the old generation, and the permanent generation, and define the method area as the "permanent generation". This is because earlier implementations of the HotSpot Java virtual machine extended the idea of generational collection to the method area and designed the method area as a permanent generation. However, most virtual machines other than HotSpot do not treat the method area as a permanent generation, and HotSpot itself also plans to remove the permanent generation. Because the author mainly uses Oracle JDK 6.0, this article still uses the term permanent generation.

The method area is shared by all threads. It stores the class information loaded by the virtual machine (the information that must be loaded when a class is loaded, such as version, fields, methods, and interfaces), final constants, static variables, code compiled by the just-in-time compiler, and so on.

The method area need not be physically contiguous, and it can be fixed-size or extensible. Compared with the heap, an implementation has one more choice for the method area: whether to perform garbage collection on it at all. Garbage collection on the method area is generally rare, which is one of the reasons the method area is known as the permanent generation (in HotSpot). This does not mean that no garbage collection happens there: collection on the method area mainly reclaims memory in the constant pool and unloads loaded classes.

Garbage collection on the method area has harsh conditions, is very difficult, and gives unsatisfactory results, so it usually does not need much consideration and can be left for more in-depth study later.

An OutOfMemoryError: PermGen space exception is defined for the method area and is thrown when memory is insufficient.

The runtime constant pool is part of the method area. It stores the literal constants, symbolic references, and direct references produced at compile time. (A symbolic reference identifies the location of a variable or an interface by a string; a direct reference is the address translated from a symbolic reference, and the translation happens in the class linking phase.) Besides compile-time constants, the runtime constant pool can also store constants generated at run time. One example is the intern() method of the String class, which works against a constant pool maintained by String: if the string it is called on, such as "ABC", is already in the pool, the address of the pooled string is returned; otherwise a new constant is added to the pool and its address is returned.
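The intern() behavior can be checked with a short sketch. The class name is illustrative only; the comparison with == is used deliberately here to compare object identity rather than content.

```java
class InternDemo {
    // Returns three comparisons: {literal == heap copy, literal == interned copy, equals()}.
    static boolean[] compare() {
        String literal = "ABC";              // compile-time constant, taken from the pool
        String onHeap = new String("ABC");   // a distinct object created at run time
        return new boolean[] {
            literal == onHeap,               // false: two different objects
            literal == onHeap.intern(),      // true: intern() returns the pooled instance
            literal.equals(onHeap)           // true: same characters
        };
    }

    public static void main(String[] args) {
        boolean[] r = compare();
        System.out.println(r[0] + " " + r[1] + " " + r[2]); // prints "false true true"
    }
}
```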

6) Direct memory: Direct memory is not memory managed by the JVM. It can be understood as machine memory outside the JVM: for example, if the machine has 4 GB of memory and the JVM occupies 1 GB, the remaining 3 GB is direct memory. The JDK provides a way to allocate memory based on channels (Channel) and buffers (Buffer); memory allocated this way by native libraries implemented in C lives in direct memory and is referenced through a DirectByteBuffer stored in the JVM heap. Because direct memory is limited by the memory of the machine, an OutOfMemoryError can also occur.
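A minimal sketch of allocating direct memory through the buffer API follows. The class and method names are hypothetical; ByteBuffer.allocateDirect is the standard java.nio call, and the buffer object itself lives on the heap while the bytes it wraps live outside it.

```java
import java.nio.ByteBuffer;

class DirectBufferDemo {
    // Allocates a buffer whose backing memory is outside the Java heap.
    static ByteBuffer allocate(int bytes) {
        return ByteBuffer.allocateDirect(bytes);
    }

    public static void main(String[] args) {
        ByteBuffer buf = allocate(1024);
        buf.putInt(42);   // write into direct memory
        buf.flip();       // switch from writing to reading
        System.out.println("direct = " + buf.isDirect() + ", value = " + buf.getInt());
    }
}
```

HotSpot also caps this memory with the -XX:MaxDirectMemorySize option, so an OutOfMemoryError can occur before the machine itself runs out.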

How Java objects are accessed

In general, accessing an object through a reference in Java involves 3 memory areas: the JVM stack, the heap, and the method area.

Take the simplest local variable as an example: Object obj = new Object();

Object obj is a local reference, stored in the local variable table of the JVM stack as reference-type data;
new Object() is stored in the heap as instance object data;
The heap also records the addresses of the type information of the Object class (interfaces, methods, fields, object type, and so on); the data these addresses point to is stored in the method area.

The Java Virtual Machine specification does not prescribe how a reference locates a specific object. Current mainstream implementations use two approaches:

1. Access through a handle

With handle access, the JVM heap sets aside an area as a handle pool, which stores the addresses of the instance data that each handle points to (including addresses in the heap and addresses in the method area). This implementation is very stable because the reference holds a handle rather than the object's address.

2. Access through a direct pointer

With direct pointer access, the reference stores the actual address of the object in the heap, and the object information stored in the heap includes the address of the corresponding type data in the method area. The biggest advantage of this approach is speed; it is the approach used in the HotSpot virtual machine.

Java memory allocation mechanism

The memory allocation discussed here is mainly allocation on the heap. Java memory allocation and reclamation can be summed up as: generational allocation, generational collection.

Objects are divided by how long they survive into the young generation (Young Generation), the old generation (Old Generation), and the permanent generation (Permanent Generation, that is, the method area).

1) Young generation (Young Generation): When an object is created, memory is first allocated in the young generation (large objects can be created directly in the old generation). Most objects are no longer used soon after they are created, so they quickly become unreachable and are cleaned up by the young generation's GC mechanism (IBM's research shows that 98% of objects die quickly). This GC is called Minor GC or Young GC. Note that a Minor GC does not mean that the young generation is out of memory; it actually represents only a GC on the Eden area.

(1) The young generation's memory is divided into 3 areas: the Eden area and two survivor areas (Survivor 0 and Survivor 1).
(2) Most newly created objects are allocated in the Eden area, and most of them die soon. The Eden area is a contiguous memory space, so allocating memory in it is extremely fast;
Initially, when the Eden area is full, a Minor GC is performed: dead objects are cleared and the remaining objects are copied to one survivor area, Survivor0 (at this point Survivor1 is empty; one of the two survivor areas is always empty);
(3) The next time the Eden area is full, another Minor GC is performed: dead objects are cleared, surviving objects are copied to Survivor1, and the Eden area is emptied;
(4) Dead objects in Survivor0 are cleared, objects that qualify are promoted to the old generation, the remaining surviving objects are also copied to Survivor1, and Survivor0 is then emptied;
(5) After the two survivor areas have swapped roles several times (15 by default in the HotSpot virtual machine, controlled with -XX:MaxTenuringThreshold; objects whose age exceeds this value enter the old generation), the objects that are still alive (actually only a small fraction, for example, objects we define ourselves) are copied to the old generation.
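The copy-and-age cycle in steps (1) to (5) can be modeled with a toy simulation. This is only a sketch under simplifying assumptions: the class, its fields, and the reachable flag are invented for illustration, areas are plain lists with no size limit, and a real collector discovers reachability by tracing references rather than reading a flag.

```java
import java.util.ArrayList;
import java.util.List;

class YoungGenSim {
    static final class Obj {
        final String name;
        boolean reachable = true; // stands in for "still referenced from a GC root"
        int age = 0;              // number of minor GCs survived
        Obj(String name) { this.name = name; }
    }

    final int tenuringThreshold;
    final List<Obj> eden = new ArrayList<>();
    List<Obj> from = new ArrayList<>();      // survivor area currently holding objects
    List<Obj> to = new ArrayList<>();        // survivor area kept empty between GCs
    final List<Obj> old = new ArrayList<>();

    YoungGenSim(int tenuringThreshold) { this.tenuringThreshold = tenuringThreshold; }

    Obj allocate(String name) {
        Obj o = new Obj(name);
        eden.add(o);
        return o;
    }

    void minorGc() {
        copySurvivors(eden);
        copySurvivors(from);
        eden.clear();            // everything left behind is garbage
        from.clear();
        List<Obj> tmp = from;    // the two survivor areas swap roles
        from = to;
        to = tmp;
    }

    private void copySurvivors(List<Obj> area) {
        for (Obj o : area) {
            if (!o.reachable) continue;   // dead objects are simply not copied
            o.age++;
            if (o.age > tenuringThreshold) old.add(o); else to.add(o);
        }
    }

    public static void main(String[] args) {
        YoungGenSim heap = new YoungGenSim(2);
        heap.allocate("long-lived");
        heap.allocate("short-lived").reachable = false;
        for (int i = 1; i <= 3; i++) {
            heap.minorGc();
            System.out.println("after GC " + i + ": survivor=" + heap.from.size()
                    + " old=" + heap.old.size());
        }
    }
}
```

With a threshold of 2, the long-lived object stays in a survivor area for two collections and is promoted on the third, while the unreachable object disappears at the first collection.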

As the process above shows, the Eden area is a contiguous space and one survivor area is always empty. After a GC and copy, one survivor area holds the currently live objects, while the contents of the Eden area and the other survivor area are no longer needed and can be cleared outright. At the next GC, the two survivor areas swap roles. This way of allocating and cleaning up memory is therefore extremely efficient. This garbage collection method is the well-known "stop-and-copy" method: it copies the objects still alive in the Eden area and one survivor area to the other survivor area. This does not mean that stop-and-copy is always efficient. It is efficient only in this situation; if it were used in the old generation, it would perform very poorly.

Because most objects are short-lived and few even make it into a survivor area, the Eden area is much larger than a survivor area. The HotSpot default ratio is 8:1, so the Eden area and the two survivor areas take 80%, 10%, and 10% of the young generation respectively. If the objects surviving in Eden plus one survivor area during a collection take up more than 10% of the young generation, some of them must be placed in the old generation. Use the -XX:SurvivorRatio parameter to configure the capacity ratio of the Eden area to a survivor area; the default is 8, meaning Eden:Survivor1:Survivor2 = 8:1:1.
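The 8:1:1 arithmetic can be written out as a short calculation. This is only a sketch of the ratio itself: the class and method are hypothetical, and a real JVM additionally rounds the areas to its own alignment, so actual sizes may differ slightly.

```java
class SurvivorRatioDemo {
    // Splits a young generation into {eden, oneSurvivor} for Eden:S0:S1 = ratio:1:1.
    static long[] split(long youngSize, int survivorRatio) {
        long survivor = youngSize / (survivorRatio + 2); // young = ratio + 1 + 1 parts
        long eden = youngSize - 2 * survivor;
        return new long[] { eden, survivor };
    }

    public static void main(String[] args) {
        long[] parts = split(100, 8); // a 100 MB young generation with the default ratio
        System.out.println("eden = " + parts[0] + " MB, each survivor = " + parts[1] + " MB");
    }
}
```

For a 100 MB young generation and a ratio of 8 this gives 80 MB for Eden and 10 MB for each survivor area, matching the 80%, 10%, 10% split.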

In the Eden area, the HotSpot virtual machine uses two techniques to speed up memory allocation: bump-the-pointer and TLAB (Thread-Local Allocation Buffers). Because the Eden area is contiguous, the core of bump-the-pointer is to track the end of the last object created; when a new object is created, the JVM only needs to check whether enough memory remains after the last object, which greatly speeds up allocation. TLAB addresses multithreading: the Eden area is divided into several segments, and each thread uses its own segment so that threads do not interfere with one another. Combining TLAB with bump-the-pointer ensures that each thread uses its own segment of the Eden area and allocates memory quickly.
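The two techniques can be sketched together as follows. Addresses are modeled as plain integers, and the Region class, the 256-unit buffer size, and the use of ThreadLocal are assumptions made for illustration; this is not how HotSpot is implemented internally.

```java
class BumpPointerDemo {
    // A contiguous region [top, end) with a pointer to the first free address.
    static final class Region {
        private int top;
        private final int end;
        Region(int start, int end) { this.top = start; this.end = end; }

        // Returns the start address of the new block, or -1 if it does not fit.
        int allocate(int size) {
            if (size > end - top) return -1; // only one comparison is needed
            int address = top;
            top += size;                     // bump the pointer past the new object
            return address;
        }
    }

    public static void main(String[] args) {
        Region eden = new Region(0, 1024);
        // Each thread carves a private buffer out of Eden once (under a lock),
        // then allocates inside it without any synchronization.
        ThreadLocal<Region> tlab = ThreadLocal.withInitial(() -> {
            synchronized (eden) {
                int start = eden.allocate(256); // a real VM would collect if Eden were full
                return new Region(start, start + 256);
            }
        });
        System.out.println(tlab.get().allocate(16)); // 0
        System.out.println(tlab.get().allocate(16)); // 16
    }
}
```

The design point is that the only shared, locked operation is handing out a whole buffer; individual object allocations touch thread-private state only.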

2) Old generation (Old Generation): Objects that survive long enough in the young generation without being cleaned up (that is, that survive several Young GCs) are copied to the old generation. The old generation generally has more space than the young generation and can hold more objects, and GC occurs less often in the old generation than in the young generation. When the old generation runs out of memory, a Major GC is performed, also called a Full GC.

Use the -XX:+UseAdaptiveSizePolicy switch to control whether a dynamic adjustment policy is used; when it is on, the JVM dynamically adjusts the size of each area in the Java heap and the age at which objects enter the old generation.

If an object is large (such as a long string or a large array) and the young generation does not have enough space, the large object is allocated directly in the old generation (large objects may trigger GC early, so they should be used sparingly, and short-lived large objects should be avoided). Use -XX:PretenureSizeThreshold to control the size above which objects go directly to the old generation; objects larger than this value are allocated directly in the old generation.

Objects in the old generation may reference objects in the young generation. When a Young GC is performed, the entire old generation might then have to be scanned to decide whether a young object can be collected, which is obviously inefficient. The solution is to maintain a block called the "card table" in the old generation, which records all references from old generation objects to young generation objects. During a Young GC, only the card table needs to be checked instead of the whole old generation, so performance improves greatly.
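A card table can be sketched as one flag per fixed-size chunk of the old generation. Everything in this example is illustrative: the class, the method names, and the 512-byte card size are assumptions, and a real VM marks cards from a write barrier emitted by the compiler rather than from an explicit method call.

```java
import java.util.ArrayList;
import java.util.List;

class CardTableDemo {
    static final int CARD_SIZE = 512;  // bytes covered by one card (assumed for this sketch)
    private final boolean[] dirty;

    CardTableDemo(int oldGenBytes) {
        dirty = new boolean[(oldGenBytes + CARD_SIZE - 1) / CARD_SIZE];
    }

    // Called when a field at this old generation offset is set to reference a young object.
    void recordOldToYoungWrite(int oldGenOffset) {
        dirty[oldGenOffset / CARD_SIZE] = true;
    }

    // A Young GC scans only the chunks behind these cards, not the whole old generation.
    List<Integer> dirtyCards() {
        List<Integer> cards = new ArrayList<>();
        for (int i = 0; i < dirty.length; i++) {
            if (dirty[i]) cards.add(i);
        }
        return cards;
    }

    public static void main(String[] args) {
        CardTableDemo table = new CardTableDemo(2048); // 4 cards
        table.recordOldToYoungWrite(600);              // falls in card 1
        System.out.println(table.dirtyCards());        // [1]
    }
}
```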

The old generation stores more objects than the young generation, including large objects, so using the stop-and-copy algorithm to clean up old generation memory would be quite inefficient. In general, the old generation uses the mark-compact algorithm: mark the surviving objects (those that are still referenced), then move all surviving objects to one end so that free memory is contiguous.
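The compaction step can be illustrated on a toy heap. In this sketch the heap is an array of object ids with 0 meaning a free slot, and the mark phase is assumed to have already produced the marked array; a real collector would also update every reference to a moved object, which is omitted here.

```java
import java.util.Arrays;

class MarkCompactDemo {
    // Slides every marked (live) object toward the start of the heap, preserving order.
    static int[] compact(int[] heap, boolean[] marked) {
        int[] result = new int[heap.length]; // all slots start as 0, i.e. free
        int free = 0;                        // next free slot at the compacted end
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] != 0 && marked[i]) {
                result[free++] = heap[i];
            }
        }
        return result;
    }

    public static void main(String[] args) {
        int[] heap = { 1, 2, 0, 3, 4 };
        boolean[] marked = { true, false, false, true, false }; // objects 1 and 3 are live
        System.out.println(Arrays.toString(compact(heap, marked))); // [1, 3, 0, 0, 0]
    }
}
```

After compaction the live objects sit together at one end and the free space is a single contiguous block, which is what makes later allocation in the old generation simple.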

When a Minor GC occurs, the virtual machine checks whether the size of the objects promoted into the old generation by each collection is greater than the remaining space in the old generation. If it is greater, a Full GC is triggered directly. Otherwise, the virtual machine checks whether -XX:+HandlePromotionFailure is set (that is, whether a promotion guarantee failure is allowed). If it is allowed, only the Minor GC is performed and a memory allocation failure is tolerated; if it is not allowed, a Full GC is still performed. This means that when promotion failure is not allowed (-XX:-HandlePromotionFailure), triggering a Minor GC also triggers a Full GC even if the old generation still has plenty of memory, so it is best not to do that.

Method area (permanent generation):

The permanent generation has two kinds of reclamation: constants in the constant pool and useless class information. Reclaiming constants is simple: a constant with no references can be reclaimed. For a useless class to be reclaimed, 3 conditions must hold:

1 All instances of the class have been reclaimed
2 The ClassLoader that loaded the class has been reclaimed
3 The Class object of the class is not referenced anywhere (that is, the class cannot be accessed through reflection)

Reclaiming the permanent generation is not mandatory, and a parameter controls whether classes are reclaimed. HotSpot provides -Xnoclassgc for this.
Use -verbose:class, -XX:+TraceClassLoading, and -XX:+TraceClassUnloading to view class loading and unloading information.
-verbose:class and -XX:+TraceClassLoading can be used in the product build of HotSpot;
-XX:+TraceClassUnloading requires a fastdebug build of HotSpot.

Garbage collectors

The garbage collector plays the key role in the GC mechanism: it is the concrete implementation of GC. The Java Virtual Machine specification makes no provisions about the garbage collector, so the collectors implemented by different vendors differ. The garbage collectors of HotSpot 1.6 are shown in a figure from "Deep Understanding of the Java Virtual Machine: JVM Advanced Features and Best Practices"; a line between two collectors in the figure indicates that they can be used together.

Before introducing the garbage collectors, one point must be clear: in the young generation's stop-and-copy algorithm, "stop" (stop-the-world) means suspending all other threads while memory is reclaimed. This is inefficient. The various young generation collectors now optimize this point, but they only shorten the pause; they do not eliminate it.

(1) Serial collector: a young generation collector that uses the stop-and-copy algorithm and performs GC serially with a single thread while the other worker threads are paused. Use -XX:+UseSerialGC to reclaim memory with the Serial + Serial Old combination (this is also the default when the virtual machine runs in client mode).

(2) ParNew collector: a young generation collector that uses the stop-and-copy algorithm. It is the multithreaded version of the Serial collector: it performs GC in parallel with multiple threads while the other worker threads are paused, and it focuses on shortening garbage collection time. Use the -XX:+UseParNewGC switch to reclaim memory with the ParNew + Serial Old combination, and use -XX:ParallelGCThreads to set the number of threads that perform memory reclamation.

(3) Parallel Scavenge collector: a young generation collector that uses the stop-and-copy algorithm and focuses on CPU throughput, that is, time spent running user code divided by total time. For example, if the JVM runs for 100 minutes, of which 99 minutes run user code and 1 minute is garbage collection, the throughput is 99%. This collector uses the CPU most efficiently and suits background computation (collectors that focus on shortening garbage collection pauses, such as CMS, have very short waits and so suit user interaction and improve user experience). Use the -XX:+UseParallelGC switch to collect with the Parallel Scavenge + Serial Old combination (this is also the default in server mode). Use -XX:GCTimeRatio to set the ratio of user execution time to garbage collection time; the default is 99, which leaves 1% of the time for garbage collection. Use -XX:MaxGCPauseMillis to set the maximum GC pause time (this parameter is valid only for Parallel Scavenge). The switch -XX:+UseAdaptiveSizePolicy enables dynamic control, such as automatically adjusting the Eden/Survivor ratio, the tenuring age, and the young generation size; ParNew does not have this parameter.

(4) Serial Old collector: an old generation collector; a single-threaded, serial collector that uses the mark-compact algorithm (the compaction consists of Sweep and Compact: sweep kills the discarded objects and leaves only the surviving ones; compact moves objects to fill the gaps so that memory is divided into 2 parts, one full of objects and one free). It performs GC with a single thread while the other worker threads are paused (note that cleanup with a mark-based algorithm in the old generation also requires pausing other threads). Before JDK 1.5, the Serial Old collector was used together with Parallel Scavenge.

(5) Parallel Old collector: an old generation collector; multithreaded and parallel, with a multithreading mechanism similar to that of Parallel Scavenge. It uses the mark-compact algorithm (unlike Serial Old, the compaction here consists of Summary and Compact, meaning that surviving objects are copied into a prepared area, rather than discarded objects being cleaned up as in Sweep). Other threads still need to be paused while Parallel Old runs. Parallel Old is useful on multicore machines. After Parallel Old appeared (JDK 1.6), it worked well with Parallel Scavenge and fully realized the throughput-first goal of the Parallel Scavenge collector. Use the -XX:+UseParallelOldGC switch to collect with the Parallel Scavenge + Parallel Old combination.

(6) CMS (Concurrent Mark Sweep) collector: an old generation collector dedicated to achieving the shortest collection pause time (that is, shortening the time taken by garbage collection). It uses the mark-sweep algorithm and multiple threads; its advantages are concurrent collection (user threads can work at the same time as the GC thread) and short pauses. Use -XX:+UseConcMarkSweepGC to reclaim memory with ParNew + CMS + Serial Old: ParNew + CMS is used preferentially, and when memory runs out for user threads, Serial Old is used as the fallback.

(7) G1 collector: officially released in JDK 1.7. It differs greatly from the current concepts of young generation and old generation and is currently little used, so it is not covered here.

Note the difference between concurrent (Concurrent) and parallel (Parallel). Concurrent means that user threads and GC threads execute at the same time (not necessarily in parallel; they may alternate, but overall they run simultaneously) without pausing the user threads (in fact, user threads still need to be paused in CMS, just very briefly, with the GC thread executing on another CPU). Parallel collection means that multiple GC threads work in parallel while the user threads are paused. Therefore, the Serial collector is serial, the Parallel collectors are parallel, and the CMS collector is concurrent.

Excerpt from: http://www.cnblogs.com/zhguang/p/3257367.html
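To see which collector combination a given JVM is actually running with, the standard java.lang.management API can be queried, as in the sketch below. The class name is hypothetical, and the collector names printed are chosen by the JVM, so they vary with the JVM version and with the -XX collector switches in effect.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

class CollectorListDemo {
    // Returns the names of the collectors the running JVM has registered.
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Count and accumulated time (ms) of collections performed so far.
            System.out.println(gc.getName() + ": " + gc.getCollectionCount()
                    + " collections, " + gc.getCollectionTime() + " ms");
        }
    }
}
```

Running it once with -XX:+UseSerialGC and once with -XX:+UseParallelGC is a simple way to compare the young and old generation collectors that each switch selects.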


