A detailed explanation of the Java memory areas and GC mechanism (JVM)

Source: Internet
Author: User

This article explains the memory management mechanism of the Java Virtual Machine. I found the content very good and clear, so I have reproduced it here, and I suggest that readers take the time to read it calmly. The original article is titled "The most detailed explanation of the Java memory areas and the GC mechanism".

Here I also write down my own understanding for readers' reference:

When it comes to memory, what we programmers think of first is the stack and the heap.


Stack:

Local data of the basic types in a program, such as int and float (String is not a basic type), is stored directly on the stack. The stack is organized into stack frames, and each stack frame stores a local variable table, an operand stack, dynamic linking information, and a return address.

The local variable table mainly holds values of the basic data types boolean, byte, char, short, int, float, long, and double, as well as reference and returnAddress values. A reference lets the virtual machine directly or indirectly locate the instance it refers to. A returnAddress is the address of a bytecode instruction.

The operand stack holds the values being operated on: the values of variables in the code are pushed onto the stack, and instructions then pop them to perform arithmetic and other operations.
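The role of the operand stack can be illustrated with a small sketch. This is only a conceptual model: the class name `OperandStackSketch` is hypothetical, and a real JVM operand stack lives inside the stack frame rather than in a Java collection. The comments name the bytecode instructions that a compiler would typically emit for `a + b * c`.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Conceptual model only: a real operand stack is part of the stack frame
// and is manipulated by bytecode instructions, not by a Java collection.
class OperandStackSketch {

    // Evaluates a + b * c by pushing and popping values the way the
    // instruction sequence iload, iload, iload, imul, iadd would.
    static int evaluate(int a, int b, int c) {
        Deque<Integer> operands = new ArrayDeque<>();
        operands.push(a);                               // iload a
        operands.push(b);                               // iload b
        operands.push(c);                               // iload c
        operands.push(operands.pop() * operands.pop()); // imul
        operands.push(operands.pop() + operands.pop()); // iadd
        return operands.pop();                          // ireturn
    }

    public static void main(String[] args) {
        System.out.println(evaluate(2, 3, 4)); // prints 14
    }
}
```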

Dynamic linking: each stack frame contains a reference to the method it belongs to in the constant pool, which is what allows linking to happen dynamically while the program runs. A class contains many symbolic references, and method invocation instructions in the bytecode take these symbolic references as their parameters. Some of these symbolic references are converted directly into addresses during class loading or on first use, so the target can then be found directly. Others are converted into actual addresses only at run time, which is dynamic linking.

Return address: a method can exit in two ways. One is that the method runs to completion and returns its result, which is a normal completion exit. The other is that an exception is encountered during execution, which is an abnormal completion exit. When a method returns, its stack frame is popped from the top of the stack. If there is a return value, it is pushed onto the operand stack of the caller's stack frame (the frame the return address points back to), and the rest of the popped frame is discarded.

Heap:

The heap mainly stores object data. In general, a reference on the stack points to the object. As long as some reference still points to an object, the object cannot be reclaimed; otherwise, it will be reclaimed at an appropriate time. An object in the heap can be located in one of two ways: 1. Handle mode: the handle of an instance (a pointer to the object's type and a pointer to the object's instance data) is stored in a handle pool. The pointer to the object's type generally points to data in the method area. The pointer to the instance data points to the instance data in the instance pool. 2. Direct pointer mode: the reference holds the actual address of the instance, and the object data stored in the heap contains the corresponding type information pointer into the method area.

Memory reclamation

Memory is generally divided into the young generation, the old generation, and the permanent generation.

The young generation is divided into three parts: Eden, Survivor0, and Survivor1, sized in the ratio 8:1:1. The two survivor areas are also referred to as From and To.

New objects are placed in the Eden area. When Eden is full, a minor GC is triggered and most of the memory is released; the objects that are not released move into Survivor0.

When Survivor0 is full, its live data is copied into Survivor1, and the objects that survive in Eden are also copied into Survivor1.

Objects that still have not been reclaimed after a specified number of collections are moved into the old generation.

Old generation: it takes up a large share of memory. If it runs out of memory, a major GC, also called a full GC, is triggered.

Permanent generation: reclamation mainly targets constants in the constant pool and useless class information. Reclaiming constants is very simple: a constant with no references can be reclaimed. For a useless class to be reclaimed, 3 conditions must hold:

1. All instances of the class have been reclaimed.

2. The ClassLoader that loaded the class has been reclaimed.

3. The class's Class object is not referenced anywhere (that is, the class cannot be accessed by reflection anywhere).


Contents:

Overview of Java garbage collection
Java memory area
How Java objects are accessed
Java memory allocation mechanism
Java GC mechanism
Garbage collector

Overview of Java Garbage collection

The Java GC (garbage collection, garbage collector) mechanism is one of the main differences between Java and C/C++. Java developers generally do not need to write code for memory reclamation and garbage cleanup, and do not need to be as nervous about memory leaks and overflows as C programmers. This is because the Java virtual machine provides automatic memory management and garbage cleanup. In a nutshell, the mechanism marks memory in the JVM (Java Virtual Machine), determines which memory needs to be reclaimed, and reclaims it automatically according to a certain strategy. It runs without stopping (never stop) to keep memory available in the JVM and to prevent memory leaks and overflows.

With regard to the JVM, note that in Sun's JDK, currently the most widely used, the default virtual machine has been HotSpot from JDK 1.2 in 1999 through the now widely used JDK 6. Oracle acquired Sun in 2009; together with its earlier acquisition of BEA, Oracle then owned two of the three major virtual machines: JRockit and HotSpot. Oracle has also stated its intention to merge the two, but in the newly released JDK 7 the default virtual machine is still HotSpot. The virtual machine discussed in this article is therefore HotSpot, and the mechanisms described are mainly HotSpot's GC mechanisms.

The Java GC mechanism mainly accomplishes 3 things: determining which memory needs to be reclaimed, determining when GC needs to be performed, and determining how to perform it. After a long period of development (GC mechanisms existed before the Java language appeared, for example in Lisp), the Java GC mechanism has matured and does most of the work for us almost automatically. However, if we develop larger applications that need memory optimization, we must study the Java GC mechanism.

Learning the Java GC mechanism can help us to troubleshoot various memory overflow or leak problems in our daily work, solve performance bottlenecks, achieve higher concurrency, and write more efficient programs.

We will study the Java GC mechanism from 4 aspects: 1, how memory is allocated; 2, how to ensure that memory is not reclaimed by mistake (that is, which memory needs to be reclaimed); 3, under what circumstances GC is performed and how it is performed; and 4, how to monitor and tune the GC mechanism.

Java Memory Area

  

To understand the Java GC mechanism, you must first understand how memory areas are partitioned in the JVM. In the Java runtime data area, the memory managed by the JVM is divided into the following areas:

1, Program counter (Program Counter Register): the program counter is a relatively small memory area that indicates which bytecode instruction the current thread has reached. It can be understood as the line number indicator of the current thread. The bytecode interpreter works by changing the value of this counter to fetch the next instruction to execute.

Each program counter records the position of only one thread, so it is thread-private (each thread has its own program counter).

If the thread is executing a Java method, the counter records the address of the virtual machine bytecode instruction being executed. If it is executing a native method (for example, one written in C), the counter's value is undefined. Because the program counter only records the address of the current instruction, it cannot run out of memory, so it is the only JVM memory area for which no OutOfMemoryError is defined.

2, Virtual machine stack (JVM stack): each method a thread executes creates a stack frame (Stack Frame) when it is invoked. The stack frame stores the local variable table, operand stack, dynamic link, method exit, and so on. When the method is called, its stack frame is pushed onto the JVM stack; when the method completes, the stack frame is popped.

The local variable table stores the method's local variables, including the various basic data types, object references, return addresses, and so on. In the local variable table, only the long and double types occupy 2 local variable slots (Slot; on a 32-bit machine, one Slot is 32 bits); all other types occupy 1 Slot. Note that the local variable table is determined at compile time: the space a method needs in its stack frame is fully determined in advance and does not change during the method's lifetime.

Two exceptions are defined for the virtual machine stack. If the stack depth requested by a thread exceeds the maximum depth the virtual machine allows, a StackOverflowError (stack overflow) is thrown. However, most Java virtual machines allow the virtual machine stack to expand dynamically (a small number use a fixed length), so a thread can keep requesting stack space until there is not enough memory, at which point an OutOfMemoryError (memory overflow) is thrown.

Each thread corresponds to a virtual machine stack, so the virtual machine stack is also thread-private.
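The stack overflow case described above is easy to reproduce. The following is a minimal sketch: the class name `StackDepthSketch` is hypothetical, and the depth it reports depends on the JVM, the frame size, and the configured thread stack size (for example via the `-Xss` option), so no particular number should be expected.

```java
// Minimal sketch: unbounded recursion pushes stack frames until the JVM
// refuses to grow the thread's stack and throws StackOverflowError.
class StackDepthSketch {
    private static int depth = 0;

    private static void recurse() {
        depth++;      // one more stack frame has been pushed
        recurse();    // never returns normally
    }

    // Recurses until the stack is exhausted and returns the depth reached.
    static int measureDepth() {
        depth = 0;
        try {
            recurse();
        } catch (StackOverflowError expected) {
            // The frames have been popped again by the time we get here.
        }
        return depth;
    }

    public static void main(String[] args) {
        System.out.println("Stack overflowed at depth " + measureDepth());
    }
}
```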

3, Native method stack (Native Method Stack): the native method stack has the same role, operating mechanism, and exception types as the virtual machine stack. The only difference is that the virtual machine stack executes Java methods, while the native method stack executes native methods. In many virtual machines (such as the HotSpot virtual machine that is the default in Sun's JDK), the native method stack and the virtual machine stack are combined into one.

The native method stack is also thread-private.

4, Heap area (Heap): the heap is the single most important area for understanding the Java GC mechanism. Of the memory managed by the JVM, the heap is the largest piece, and it is the main memory area managed by the Java GC mechanism. The heap is shared by all threads and is created when the virtual machine starts. The heap exists to store object instances; in principle, all objects are allocated on the heap (although with modern techniques this is no longer absolute, and some objects are allocated directly on the stack).

According to the Java Virtual Machine specification, heap memory needs to be logically contiguous (it does not need to be physically contiguous), and an implementation may give it a fixed size or make it extensible; the current mainstream virtual machines make it extensible. If there is still not enough memory to allocate or extend after garbage collection has been performed, an OutOfMemoryError: Java heap space exception is thrown.
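As a small illustration of a heap that can grow up to a limit, the standard `java.lang.Runtime` API reports the current and maximum heap sizes. This is only a sketch: the class name `HeapInfoSketch` is hypothetical, and the numbers printed depend on the JVM and on options such as `-Xms` and `-Xmx`.

```java
// Minimal sketch: query the heap sizes the running JVM reports.
class HeapInfoSketch {

    // Bytes of the currently committed heap that are in use.
    static long usedBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        long mb = 1024L * 1024L;
        // maxMemory: the upper bound the heap may grow to.
        System.out.println("max heap       : " + rt.maxMemory() / mb + " MB");
        // totalMemory: the heap currently committed by the JVM.
        System.out.println("committed heap : " + rt.totalMemory() / mb + " MB");
        System.out.println("used heap      : " + usedBytes() / mb + " MB");
    }
}
```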

There is much more to say about the heap area, which is described in detail in the section "Java memory allocation mechanism" below.

5, Method area: the Java Virtual Machine specification treats the method area as a logical part of the heap, but in practice the method area is regarded as separate from the heap (Non-Heap). In addition, many blog authors divide the generational collection mechanism of the Java GC into 3 generations: the young generation, the old generation, and the permanent generation. These authors call the method area the "permanent generation" because earlier implementations of the HotSpot Java virtual machine extended the idea of generational collection to the method area and designed the method area as a permanent generation. However, most virtual machines other than HotSpot do not treat the method area as a permanent generation, and HotSpot itself also plans to drop the permanent generation. Because the author mainly uses Oracle JDK 6.0, this article still uses the term permanent generation.

The method area is shared by all threads. It stores class information that has been loaded by the virtual machine (that is, the information that has to be loaded when a class is loaded, including version, fields, methods, interfaces, and so on), final constants, static variables, code compiled by the just-in-time compiler, and so on.

The method area does not need to be physically contiguous, and it may have a fixed size or be extensible. It also differs from the heap in one more respect: an implementation may choose whether to perform garbage collection on it. Garbage collection in the method area is generally rare, which is one of the reasons the method area is called the permanent generation (in HotSpot). This does not mean that no garbage collection happens there; collection in the method area mainly reclaims memory from the constant pool and unloads loaded classes.

Garbage collection in the method area has harsh conditions, is very difficult, and gives unsatisfactory results, so it generally does not deserve much attention here and can be left for more in-depth study later.

The OutOfMemoryError: PermGen space exception is defined for the method area and is thrown when it runs out of memory.

The runtime constant pool (Runtime Constant Pool) is part of the method area. It stores the literals and symbolic references generated at compile time, together with the direct references translated from them. (A symbolic reference is a string that encodes the location of a variable, an interface, and so on. A direct reference is the address translated from a symbolic reference; the translation is completed during the class linking phase.) Besides compile-time constants, the runtime constant pool can also store constants generated at run time. An example is the intern() method of the String class: String maintains a constant pool, and if the string on which intern() is called, such as "ABC", is already in the pool, the address of the pooled string is returned; otherwise the new constant is added to the pool and its address is returned.
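The behavior of `String.intern()` described above can be observed by comparing references with `==`. A minimal sketch follows; the class name `InternSketch` is hypothetical, while `intern()` and the pooling of string literals are standard Java behavior.

```java
// Minimal sketch: a string built at run time is a distinct heap object,
// but intern() returns the pooled instance shared with the literal.
class InternSketch {

    static boolean sameAsLiteralBeforeIntern() {
        String built = new String("ABC"); // new object, not the pooled one
        return built == "ABC";            // reference comparison on purpose
    }

    static boolean sameAsLiteralAfterIntern() {
        String built = new String("ABC");
        return built.intern() == "ABC";   // intern() yields the pooled "ABC"
    }

    public static void main(String[] args) {
        System.out.println(sameAsLiteralBeforeIntern()); // prints false
        System.out.println(sameAsLiteralAfterIntern());  // prints true
    }
}
```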

6, Direct memory (Direct Memory): direct memory is not memory managed by the JVM; it can be understood as the machine memory outside the JVM. For example, if the machine has 4G of memory and the JVM occupies 1G, the remaining 3G is direct memory. The JDK provides a memory allocation method based on channels (Channel) and buffers (Buffer): a native library implemented in C allocates the memory in direct memory, and a DirectByteBuffer object stored in the JVM heap refers to it. Because direct memory is limited by the memory of the machine, an OutOfMemoryError may also occur.
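Direct memory can be requested from Java code through `java.nio.ByteBuffer.allocateDirect`. The sketch below is only an illustration: the class name `DirectBufferSketch` and the 16-byte size are arbitrary choices, and how much direct memory a JVM allows depends on its configuration (HotSpot, for instance, has a `-XX:MaxDirectMemorySize` option).

```java
import java.nio.ByteBuffer;

// Minimal sketch: the buffer object lives on the Java heap, while the
// bytes it manages are allocated outside the heap in direct memory.
class DirectBufferSketch {

    // Writes one int into a freshly allocated direct buffer and
    // returns the buffer positioned for reading.
    static ByteBuffer roundTrip(int value) {
        ByteBuffer buffer = ByteBuffer.allocateDirect(16);
        buffer.putInt(value);
        buffer.flip(); // switch from writing to reading
        return buffer;
    }

    public static void main(String[] args) {
        ByteBuffer buffer = roundTrip(42);
        System.out.println(buffer.isDirect()); // prints true
        System.out.println(buffer.getInt());   // prints 42
    }
}
```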

How Java objects are accessed

In general, accessing an object through a Java reference involves 3 memory areas: the JVM stack, the heap, and the method area.

Take the simplest local variable reference as an example: Object obj = new Object(). Object obj denotes a local reference, which is stored in the local variable table of the JVM stack as reference-type data. new Object() produces the instance data, which is stored in the heap. The heap also records the addresses of the type information of the Object class (interfaces, methods, fields, object type, and so on), and the data those addresses point to is stored in the method area.

The Java Virtual Machine specification leaves open how a reference locates a specific object. There are two main implementations:

1, Access via a handle (figure from "Understanding the Java Virtual Machine in Depth: JVM Advanced Features and Best Practices"):

With handle access, a special area of the JVM heap is used as a handle pool. The reference stores the address of a handle, and the handle stores the addresses of the object's data (both the instance data address in the heap and the type data address in the method area). This implementation is stable because the reference holds a handle address, which does not change when the object is moved.

2, Access via a direct pointer (figure from "Understanding the Java Virtual Machine in Depth: JVM Advanced Features and Best Practices"):

With direct pointer access, the reference stores the actual address of the object in the heap, and the object data stored in the heap contains a pointer to the corresponding type data in the method area. The biggest advantage of this approach is speed, and it is the approach used by the HotSpot virtual machine.
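The difference between the two access modes can be modeled in a few lines. This is a toy model under loud assumptions: the class `ObjectAccessSketch`, the integer "addresses", and the two maps are all hypothetical stand-ins and do not reflect how any real JVM lays out memory. It only shows why a handle stays valid when the object moves, while a raw address held in a reference would have to be updated.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model: integer keys stand in for memory addresses.
class ObjectAccessSketch {
    // Fake heap: address -> instance data.
    final Map<Integer, String> heap = new HashMap<>();
    // Handle pool: handle id -> current address of the instance data.
    final Map<Integer, Integer> handlePool = new HashMap<>();

    void allocate(int handle, int address, String data) {
        heap.put(address, data);
        handlePool.put(handle, address);
    }

    // Handle access: two lookups (handle, then address).
    String readViaHandle(int handle) {
        return heap.get(handlePool.get(handle));
    }

    // Direct pointer access: a single lookup.
    String readViaDirectPointer(int address) {
        return heap.get(address);
    }

    // Moves the object, as a compacting collector might, and updates
    // only the handle; references that hold the handle are untouched.
    void move(int handle, int newAddress) {
        int oldAddress = handlePool.get(handle);
        heap.put(newAddress, heap.remove(oldAddress));
        handlePool.put(handle, newAddress);
    }

    public static void main(String[] args) {
        ObjectAccessSketch vm = new ObjectAccessSketch();
        vm.allocate(1, 100, "instance data");
        vm.move(1, 200);
        System.out.println(vm.readViaHandle(1));          // prints instance data
        System.out.println(vm.readViaDirectPointer(100)); // prints null
    }
}
```

In this model the direct pointer saves one lookup per access, which matches the speed advantage mentioned above, at the cost that every reference holding the old address must be rewritten when the object moves.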

Java memory allocation mechanism

The memory allocation discussed here is mainly allocation on the heap. Object memory is generally allocated on the heap, but modern techniques can also split an object into scalars (a scalar type is an atomic type that represents a single value, such as a basic type or a String) and allocate them on the stack. Allocation on the stack is rarely seen, so we do not consider it here.

In general, the Java memory allocation and reclamation mechanism is generational: memory is allocated by generation and reclaimed by generation. Objects are divided according to how long they survive into the young generation (Young Generation), the old generation (Old Generation), and the permanent generation (Permanent Generation, that is, the method area), as shown in the following figure (from "Become a Java GC Expert, Part I", http://www.importnew.com/1993.html):

    

Young generation (Young Generation): when an object is created, its memory is first allocated in the young generation (large objects can be created directly in the old generation). Most objects are no longer used soon after they are created, so they quickly become unreachable and are then cleared by the young generation's GC mechanism (IBM research shows that 98% of objects die soon). This GC is called a minor GC or young GC. Note that a minor GC does not mean that the young generation is out of memory; it really only indicates a GC on the Eden area.

The young generation is divided into 3 areas: the Eden area (Eden, the garden where Adam and Eve ate the forbidden fruit), which is where memory is first allocated, and two survivor areas (Survivor 0 and Survivor 1). The memory allocation process is as follows (from "Become a Java GC Expert, Part I", http://www.importnew.com/1993.html):

1, Most newly created objects are allocated in the Eden area, and most of them die soon. The Eden area is a contiguous memory space, so allocating memory in it is extremely fast.

2, The first time Eden fills up, a minor GC is performed: dead objects are cleaned out and the remaining objects are copied to one survivor area, Survivor0 (at this point Survivor1 is empty; one of the two survivor areas is always empty).

3, Thereafter, each time Eden fills up, a minor GC is performed and the remaining objects are added to Survivor0.

4, When Survivor0 is also full, the objects in it that are still alive are copied to Survivor1. From then on, the objects that remain after a minor GC of Eden are added to Survivor1 (at this point Survivor0 is empty).

5, After the two survivor areas have been switched several times (15 by default in the HotSpot virtual machine, controlled with -XX:MaxTenuringThreshold; objects whose age exceeds this value enter the old generation), the objects that are still alive (in practice only a small number, such as objects we define ourselves) are copied to the old generation.

As the process above shows, the Eden area is a contiguous space and one of the survivor areas is always empty. After a GC and copy, one survivor area holds the currently live objects, while the contents of Eden and the other survivor area are no longer needed and can simply be cleared. At the next GC, the two survivor areas swap roles. This way of allocating and cleaning up memory is therefore highly efficient. This kind of garbage collection is the well-known "stop-and-copy" cleanup (the live objects in Eden and one survivor area are copied to the other survivor area). This does not mean that stop-and-copy is efficient in general; it is efficient only in this situation. Using stop-and-copy in the old generation would perform very badly.
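The copy-and-swap cycle described above can be sketched as a small simulation. Everything here is a simplification for illustration: the class `MinorGcSketch`, the `reachable` flag (a real collector discovers reachability by tracing references), and the unbounded lists used as memory areas are all assumptions, and survivor overflow is not modeled.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified simulation of young-generation collection with aging.
class MinorGcSketch {
    static final class Obj {
        final String name;
        boolean reachable = true; // stand-in for real reachability tracing
        int age = 0;              // number of minor GCs survived
        Obj(String name) { this.name = name; }
    }

    final int tenuringThreshold;
    final List<Obj> eden = new ArrayList<>();
    List<Obj> from = new ArrayList<>(); // survivor area holding live objects
    List<Obj> to = new ArrayList<>();   // survivor area kept empty
    final List<Obj> old = new ArrayList<>();

    MinorGcSketch(int tenuringThreshold) {
        this.tenuringThreshold = tenuringThreshold;
    }

    Obj allocate(String name) {
        Obj o = new Obj(name);
        eden.add(o);
        return o;
    }

    // Copies live objects out of Eden and the occupied survivor area,
    // clears both, and swaps the roles of the two survivor areas.
    void minorGc() {
        copyLive(eden);
        copyLive(from);
        eden.clear();
        from.clear();
        List<Obj> tmp = from;
        from = to;
        to = tmp;
    }

    private void copyLive(List<Obj> area) {
        for (Obj o : area) {
            if (!o.reachable) continue;      // dead objects are not copied
            o.age++;
            if (o.age > tenuringThreshold) {
                old.add(o);                  // promoted to the old generation
            } else {
                to.add(o);
            }
        }
    }

    public static void main(String[] args) {
        MinorGcSketch gc = new MinorGcSketch(2);
        Obj longLived = gc.allocate("longLived");
        gc.allocate("shortLived").reachable = false;
        for (int i = 0; i < 3; i++) gc.minorGc();
        System.out.println(gc.old.contains(longLived)); // prints true
    }
}
```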

In the Eden area, the HotSpot virtual machine uses two techniques to speed up memory allocation: bump-the-pointer and TLAB (Thread-Local Allocation Buffers). Because the Eden area is contiguous, the core of bump-the-pointer is to track the last object created; when a new object is created, the virtual machine only needs to check whether there is enough memory after that last object, which makes allocation much faster. TLAB addresses multithreading: the Eden area is divided into several segments, and each thread uses its own segment so that threads do not interfere with one another. Combining TLAB with bump-the-pointer ensures that each thread uses its own segment of the Eden area and allocates memory quickly.
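A bump-the-pointer allocator is short enough to sketch directly. This is an illustration only: the class `BumpPointerSketch` and its offsets are hypothetical, and returning -1 stands in for the point at which a real virtual machine would trigger a minor GC.

```java
// Sketch of bump-the-pointer allocation over one contiguous region.
class BumpPointerSketch {
    private final int capacity;
    private int top = 0; // offset just past the last allocated object

    BumpPointerSketch(int capacity) {
        this.capacity = capacity;
    }

    // Returns the start offset of the new object, or -1 if the region
    // does not have enough room left after the last object.
    int allocate(int size) {
        if (size < 0 || size > capacity - top) {
            return -1;
        }
        int start = top;
        top += size; // "bump" the pointer past the new object
        return start;
    }

    public static void main(String[] args) {
        BumpPointerSketch eden = new BumpPointerSketch(32);
        System.out.println(eden.allocate(16)); // prints 0
        System.out.println(eden.allocate(8));  // prints 16
        System.out.println(eden.allocate(16)); // prints -1
    }
}
```

A TLAB can be pictured as giving each thread its own such region, so that the pointer can be bumped without coordinating with other threads.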

Old generation (Old Generation): if an object survives long enough in the young generation without being cleaned up (that is, it survives several young GCs), it is copied to the old generation. The old generation is generally larger than the young generation and can hold more objects, and GC occurs less often in the old generation than in the young generation. When the old generation runs out of memory, a major GC, also called a full GC, is performed. The -XX:+UseAdaptiveSizePolicy switch controls whether a dynamic policy is used; if it is, the sizes of the areas in the Java heap and the age at which objects enter the old generation are adjusted dynamically.

If an object is large (such as a long string or a large array) and the young generation does not have enough space, the large object is allocated directly in the old generation (large objects may trigger GC early, so they should be used sparingly, and short-lived large objects in particular should be avoided). -XX:PretenureSizeThreshold sets the size above which objects go straight to the old generation: objects larger than this value are allocated directly in the old generation.

An object in the old generation may refer to an object in the young generation. If a young GC had to scan the entire old generation to decide whether young objects can be cleaned up, it would be very inefficient. The solution is to maintain a block of bytes in the old generation, the "card table", in which every reference from an old generation object to a young generation object is recorded. A young GC then only needs to check the card table instead of the whole old generation, which greatly improves performance.
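The card table idea can be sketched as a byte array with one entry per fixed-size chunk of the old generation. This is a rough model under stated assumptions: the class `CardTableSketch`, the method names, and the 512-byte card size are illustrative choices, and a real virtual machine marks cards from a write barrier emitted by the JIT compiler or interpreter rather than through an explicit method call.

```java
// Rough model of a card table covering the old generation.
class CardTableSketch {
    static final int CARD_SIZE = 512; // bytes covered by one card (illustrative)
    private final byte[] cards;

    CardTableSketch(int oldGenBytes) {
        cards = new byte[(oldGenBytes + CARD_SIZE - 1) / CARD_SIZE];
    }

    // Called when a field at the given old-generation offset is updated
    // to point at a young-generation object: mark its card as dirty.
    void recordWrite(int oldGenOffset) {
        cards[oldGenOffset / CARD_SIZE] = 1;
    }

    int totalCards() {
        return cards.length;
    }

    // Number of cards a young GC has to scan instead of the whole
    // old generation.
    int dirtyCards() {
        int dirty = 0;
        for (byte card : cards) {
            if (card != 0) dirty++;
        }
        return dirty;
    }

    public static void main(String[] args) {
        CardTableSketch table = new CardTableSketch(4096);
        table.recordWrite(0);
        table.recordWrite(100);  // same card as offset 0
        table.recordWrite(1024); // a different card
        System.out.println(table.dirtyCards() + " of " + table.totalCards()); // prints 2 of 8
    }
}
```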

Java GC Mechanism

The basic algorithm of the GC mechanism is generational collection, which needs no repeating here. The collection method used for each generation is described below.

  

Young generation:

The previous section has in fact already introduced the main garbage collection method of the young generation. The young generation uses the "stop-and-copy" algorithm for cleanup. Its memory is divided into 2 parts: a larger Eden area, and a smaller survivor part that is itself split into two equal halves. At each cleanup, the live objects in Eden and one survivor area are copied to the other survivor area, and then Eden and the first survivor area are cleared.

Note also that in this stop-and-copy algorithm the two parts used for copying are not always equal in size. (The traditional stop-and-copy algorithm splits memory into two equal halves, so half of the memory is always unused; the young generation avoids this problem by using 1 large Eden area and 2 small survivor areas.)

Because most objects are short-lived and do not even survive long enough to reach a survivor area, the ratio of Eden to survivor space is large. The HotSpot default is 8:1, that is, Eden and the two survivor areas take 80%, 10%, and 10% of the young generation respectively. If the objects surviving in one survivor area plus Eden exceed 10% of the young generation, some of them have to be allocated to the old generation. The -XX:SurvivorRatio parameter sets the ratio of the Eden area to one survivor area; its default of 8 means Eden:Survivor1:Survivor2 = 8:1:1.
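The 8:1:1 split follows from simple arithmetic: with a ratio of r, the young generation is divided into r + 2 equal shares, of which Eden gets r and each survivor area gets 1. The sketch below shows only this arithmetic; the class `SurvivorRatioSketch` is hypothetical, and a real virtual machine also applies alignment and minimum-size rules that are not modeled here.

```java
// Arithmetic sketch: how a survivor ratio divides the young generation.
class SurvivorRatioSketch {

    // Returns {eden, survivor0, survivor1} sizes in the same unit as
    // youngSize, for Eden : Survivor = survivorRatio : 1.
    static long[] split(long youngSize, int survivorRatio) {
        long survivor = youngSize / (survivorRatio + 2);
        long eden = youngSize - 2 * survivor;
        return new long[] { eden, survivor, survivor };
    }

    public static void main(String[] args) {
        long[] sizes = split(100, 8); // a 100 MB young generation, ratio 8
        System.out.println(sizes[0] + " " + sizes[1] + " " + sizes[2]); // prints 80 10 10
    }
}
```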

Old generation: the old generation stores many more objects than the young generation, including large objects, so cleaning it up with the stop-and-copy algorithm would be very inefficient. The algorithm generally used in the old generation is mark-compact: mark the surviving objects (those that are still referenced), then move all surviving objects to one end so that the free memory stays contiguous.

When a minor GC occurs, the virtual machine checks whether the size of the objects promoted to the old generation each time is greater than the space remaining in the old generation. If it is greater, a full GC is triggered directly. Otherwise, the virtual machine checks whether -XX:+HandlePromotionFailure is set (allow the promotion guarantee to fail). If it is allowed, only the minor GC is performed and a memory allocation failure can be tolerated. If it is not allowed, a full GC is still performed. (This means that when promotion failure is not allowed, triggering a minor GC also triggers a full GC, even if the old generation still has plenty of memory, so it is best not to configure it that way.)
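The "mark the survivors, then move them to one end" step (commonly called mark-compact) can be sketched over an array that stands in for old-generation memory. This is an illustration only: the class `MarkCompactSketch` is hypothetical, the `marked` array is assumed to be the result of an earlier marking phase, and a real collector must also update every reference to the objects it moves.

```java
import java.util.Arrays;

// Sketch of the compaction step: slide marked objects to the front.
class MarkCompactSketch {

    // slots holds object ids (0 means free); marked[i] says whether the
    // object in slots[i] survived marking. Compacts in place and returns
    // the index of the first free slot afterwards.
    static int compact(int[] slots, boolean[] marked) {
        int free = 0;
        for (int i = 0; i < slots.length; i++) {
            if (slots[i] != 0 && marked[i]) {
                slots[free++] = slots[i]; // move the survivor toward the start
            }
        }
        Arrays.fill(slots, free, slots.length, 0); // the tail is one free block
        return free;
    }

    public static void main(String[] args) {
        int[] slots = { 1, 2, 3, 4, 5 };
        boolean[] marked = { true, false, true, false, true };
        int free = compact(slots, marked);
        System.out.println(Arrays.toString(slots) + " free from " + free);
        // prints [1, 3, 5, 0, 0] free from 3
    }
}
```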

Method Area (permanent generation):

Reclamation in the permanent generation targets two kinds of data: constants in the constant pool and useless class information. Reclaiming constants is simple: a constant with no references can be reclaimed. For a useless class to be reclaimed, 3 conditions must hold:

1. All instances of the class have been reclaimed.

2. The ClassLoader that loaded the class has been reclaimed.

3. The class's Class object is not referenced anywhere (that is, the class cannot be accessed by reflection anywhere).
