JVM Memory Management and GC Mechanism

Source: Internet
Author: User
I. Overview

The Java GC (garbage collection) mechanism is one of the main differences between Java and C/C++. Java developers generally do not need to write code to allocate and free memory, so they do not have to worry about memory leaks and overflows as much as C programmers do. After many years of development the GC mechanism has matured, and it does most of this work for us automatically.

Java does not require developers to allocate and reclaim memory explicitly, which makes programming easier, but it can also have side effects:

1. It is possible to unknowingly waste a lot of memory
2. The JVM spends too much time on memory recycling
3. Memory leaks

Therefore, Java programmers need to understand the JVM's memory management and reclamation mechanisms. This knowledge helps us troubleshoot memory overflows and leaks, remove performance bottlenecks, achieve higher concurrency, and write more efficient programs in our daily work.

II. JVM Memory Space Management

According to the JVM specification, the JVM divides memory into the following areas:

1. Method area
2. Heap
3. Native method stack
4. Virtual machine stack
5. Program counter

Of these, the method area and the heap are shared by all threads; the other three areas are private to each thread.

2.1 Method Area

The method area holds information about loaded classes: the class name and modifiers, static variables, constants defined as final, fields, and method information. When a developer calls methods such as getName or isInterface on a Class object, the data comes from the method area. The method area is shared globally and, under certain conditions, is also subject to GC. When the method area needs more memory than it is allowed to use, the JVM throws an OutOfMemoryError: PermGen space error.

In the HotSpot virtual machine this area corresponds to the permanent generation (PermGen). Garbage collection rarely happens in the method area, which is one of the reasons it is called the permanent generation. This does not mean that the method area is never collected: collection there mainly reclaims entries in the constant pool and unloads classes that are no longer used. The conditions for collecting the method area are strict, and they are described later. (JDK 8 and later replace the permanent generation with Metaspace, so the PermGen details in this article apply to earlier versions.)

The runtime constant pool is part of the method area. It stores the literals and symbolic references generated at compile time, together with the direct references translated from them. A symbolic reference is an encoded string that identifies the location of a variable, an interface, and so on; a direct reference is the address translated from a symbolic reference, and the translation is completed during the class linking stage. Besides compile-time constants, the runtime constant pool can also hold constants generated at run time. For example, the intern() method of the String class works against a pool of strings: if the string on which it is called, such as "ABC", is already in the pool, the address of the pooled string is returned; otherwise the string is added to the pool and its address is returned.
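
The behavior of intern() described above can be observed directly. The following is a minimal sketch; the class name InternDemo and the helper samePooledInstance are made up for illustration.

```java
// Demonstrates String.intern(): equal strings share one pooled instance.
// InternDemo and samePooledInstance are hypothetical names used only here.
class InternDemo {
    // Returns true when both strings resolve to the same pooled instance.
    static boolean samePooledInstance(String a, String b) {
        return a.intern() == b.intern();
    }

    public static void main(String[] args) {
        String literal = "ABC";             // a literal is already in the pool
        String built = new String("ABC");   // a distinct object with the same characters

        System.out.println(literal == built);            // false: two different objects
        System.out.println(literal == built.intern());   // true: intern() returns the pooled one
        System.out.println(samePooledInstance(built, new String("ABC")));   // true
    }
}
```

Comparing with == is deliberate here: it compares addresses rather than contents, which is exactly what the pool guarantees.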

The relevant JVM parameters for the method area are -XX:PermSize, which sets the initial (minimum) size, and -XX:MaxPermSize, which sets the maximum size.

2.2 Heap Area

The heap is the most important area for understanding the Java GC mechanism. It is the largest block of memory managed by the JVM and the main area that GC works on. It is shared by all threads and is created when the virtual machine starts. The heap stores object instances and arrays; you can assume that all objects created in Java are allocated here.

The heap size is controlled by the -Xms and -Xmx parameters. -Xms is the minimum heap size requested when the JVM starts; it defaults to 1/64 of physical memory, up to a maximum of 1 GB. -Xmx is the maximum heap size the JVM may request; it defaults to 1/4 of physical memory, also up to 1 GB. These defaults vary between JDK versions. By default, when less than 40% of the heap is free the JVM grows the heap, up to the -Xmx limit; this ratio is set with -XX:MinHeapFreeRatio. When more than 70% of the heap is free the JVM shrinks the heap, down to the -Xms limit; this ratio is set with -XX:MaxHeapFreeRatio. On server systems -Xms and -Xmx are usually set to the same value, to avoid frequent heap resizing at run time.
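
To see how these settings play out in a running JVM, the standard Runtime class reports the current heap figures. The sketch below is only an illustration: the class name HeapInfo and its helper methods are made up, and maxMemory() corresponds only roughly to -Xmx, since some collectors exclude part of the heap from that figure.

```java
// Prints the heap figures of the running JVM. HeapInfo is a hypothetical name.
class HeapInfo {
    // Returns {max, committed, free} heap sizes in bytes as reported by the JVM.
    static long[] snapshot() {
        Runtime rt = Runtime.getRuntime();
        return new long[] { rt.maxMemory(), rt.totalMemory(), rt.freeMemory() };
    }

    // Percentage of the committed heap that is currently free.
    static double freePercent(long committedBytes, long freeBytes) {
        return 100.0 * freeBytes / committedBytes;
    }

    public static void main(String[] args) {
        long[] s = snapshot();
        long mb = 1024L * 1024L;
        System.out.println("max (roughly -Xmx): " + s[0] / mb + " MB");
        System.out.println("committed:          " + s[1] / mb + " MB");
        System.out.println("free in committed:  " + s[2] / mb + " MB");
        System.out.printf("free ratio:         %.1f%%%n", freePercent(s[1], s[2]));
    }
}
```

Running it with different -Xms and -Xmx values, for example java -Xms64m -Xmx64m HeapInfo.java, shows how the committed size follows the two limits.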

To make memory reclamation more efficient (the reason for this division is explained later), the heap has been managed in generations since Sun JDK 1.2, as shown in the following illustration:

Young Generation

When an object is created, its memory is first allocated in the young generation (large objects may be allocated directly in the old generation). When the young generation needs to be collected, a Minor GC (also known as a Young GC) is triggered.

The young generation consists of the Eden space and two Survivor spaces of equal size (also called S0 and S1). The size of the young generation is set with the -Xmn parameter, and the ratio between Eden and a Survivor space is set with -XX:SurvivorRatio. Different collectors divide Eden and the Survivor spaces according to this value in different ways, and some collectors adjust the sizes of Eden, S0, and S1 dynamically according to runtime conditions.
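
As a worked example of these parameters, the sketch below computes the nominal sizes of the spaces. It assumes a collector that applies the ratio literally, with -XX:SurvivorRatio being the size of Eden divided by the size of one Survivor space and the old generation taking whatever part of the heap the young generation does not use. The class name GenerationSizes and the example values (-Xmx30m -Xmn10m -XX:SurvivorRatio=8) are made up, and a real JVM also rounds sizes for alignment.

```java
// Nominal generation sizes derived from -Xmx, -Xmn and -XX:SurvivorRatio.
// GenerationSizes is a hypothetical helper, not part of any JDK API.
class GenerationSizes {
    // young = Eden + 2 * Survivor and Eden = ratio * Survivor,
    // so one Survivor space is young / (ratio + 2).
    static long survivorBytes(long youngBytes, int survivorRatio) {
        return youngBytes / (survivorRatio + 2);
    }

    // Eden is what remains of the young generation after both Survivor spaces.
    static long edenBytes(long youngBytes, int survivorRatio) {
        return youngBytes - 2 * survivorBytes(youngBytes, survivorRatio);
    }

    // The old generation is the heap minus the young generation.
    static long oldBytes(long maxHeapBytes, long youngBytes) {
        return maxHeapBytes - youngBytes;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long maxHeap = 30 * mb;   // assumed -Xmx30m
        long young = 10 * mb;     // assumed -Xmn10m
        int ratio = 8;            // assumed -XX:SurvivorRatio=8

        System.out.println("Eden:     " + edenBytes(young, ratio) / mb + " MB");       // 8 MB
        System.out.println("Survivor: " + survivorBytes(young, ratio) / mb + " MB");   // 1 MB each
        System.out.println("Old:      " + oldBytes(maxHeap, young) / mb + " MB");      // 20 MB
    }
}
```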

Memory in the Eden space is contiguous, so allocation is very fast. Eden is also collected very quickly, because most objects in Eden live only a short time and Eden is collected with a copying algorithm, which is very efficient when few objects survive (described in detail later).

If, after a garbage collection, there is still not enough memory for the allocation and the heap cannot be extended, the JVM throws an OutOfMemoryError: Java heap space error.

Old Generation

The old generation stores objects that are still alive after several garbage collections in the young generation; they can be thought of as long-lived objects, such as cached objects. A newly created object may also be allocated directly in the old generation, mainly in two cases. The first is a large object: the startup parameter -XX:PretenureSizeThreshold=1024 (in bytes) means that objects larger than this value are not allocated in the young generation but directly in the old generation. This parameter has no effect when the young generation uses the Parallel Scavenge collector, because that collector decides by itself which objects are allocated directly in the old generation. The second is a large array whose elements do not reference any other objects.

When the old generation is full it must be collected. A garbage collection of the old generation is called a Major GC (also known as a Full GC).

The size of the old generation is the value of -Xmx minus the value of -Xmn.

2.3 Native Method Stack

The native method stack supports the execution of native methods and stores the state of each native method call. It works the same way as the virtual machine stack; the only difference is that the virtual machine stack executes Java methods while the native method stack executes native methods. In many virtual machines (such as HotSpot, the default virtual machine of Sun's JDK), the native method stack and the virtual machine stack are combined.

2.4 Program Counter (Program Counter Register)

The program counter is a small area of memory, possibly a CPU register or operating system memory. It indicates which bytecode instruction the current thread has reached and can be thought of as a line number indicator for the current thread. The bytecode interpreter fetches the next instruction to execute by changing the value of this counter. Each program counter records the position of only one thread, so it is thread-private (each thread has its own program counter).

If the thread is executing a Java method, the counter records the address of the bytecode instruction being executed; if it is executing a native method (for example, one written in C), the value of the counter is undefined. Because the program counter only records the current instruction address, it cannot run out of memory, so it is the only JVM memory area for which no OutOfMemoryError is defined.

2.5 Virtual Machine Stack (JVM Stack)

The virtual machine stack occupies operating system memory. Each thread has its own virtual machine stack, so it is thread-private, and allocation on it is very efficient. Each time a thread executes a method, a stack frame is created to store the local variable table, the operand stack, the dynamic link, the method exit, and so on. When the method is called, its stack frame is pushed onto the JVM stack; when the method completes, the frame is popped.

The local variable table stores the method's local variables, including the primitive data types, object references, and return addresses. In the local variable table only the long and double types occupy two local variable slots (on a 32-bit machine a slot is 32 bits); all other types occupy one slot. Note that the local variable table is fixed at compile time: the space a method needs in its stack frame is fully determined before the method runs and does not change during the lifetime of the method.

Two errors are defined for the virtual machine stack. If the call depth of a thread exceeds the maximum depth the virtual machine allows, a StackOverflowError is thrown. However, most Java virtual machines allow the virtual machine stack to grow dynamically (a few use a fixed size), so a thread can keep requesting stack space until there is not enough memory, at which point an OutOfMemoryError is thrown.

2.6 Java Object Access

A Java reference access usually involves three memory areas: the JVM stack, the heap, and the method area. Take the simplest local variable reference, Object objRef = new Object(), as an example. Object objRef is a local reference; it is stored in the local variable table of the JVM stack as reference-type data. new Object() is the instance data, stored in the heap. The heap also records the address of the type data for this object (its interfaces, methods, fields, object type, and so on), while the type data itself is stored in the method area.
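
The three areas can be pointed out in a few lines of code. This is only a sketch to make the roles concrete; the class name ObjectAccess and the method describe are made up, and the comments restate the layout described above rather than anything the code can verify.

```java
// Shows a local reference, the instance it points to, and the type data behind it.
// ObjectAccess and describe are hypothetical names used only in this sketch.
class ObjectAccess {
    // Follows the reference to the instance, then to the type data of its class.
    static String describe(Object objRef) {
        Class<?> type = objRef.getClass();   // type data reached through the instance
        return type.getName() + " interface=" + type.isInterface();
    }

    public static void main(String[] args) {
        Object objRef = new Object();   // objRef: local variable table; the instance: heap
        System.out.println(describe(objRef));   // java.lang.Object interface=false
    }
}
```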

The Java Virtual Machine specification defines only the reference to an object; it does not specify how the object is reached through that reference. Mainstream implementations currently use one of two approaches:

2.6.1 Access Through a Handle

With handle access, a block of memory in the Java heap is set aside as a handle pool. The reference stores the address of a handle, and the handle holds pointers to the object's instance data (in the heap) and to its type data (in the method area). This approach is stable: the reference stores only the handle address, so when an object is moved only the pointer inside the handle has to change.

2.6.2 Access Through a Direct Pointer

With direct pointer access, the reference stores the actual address of the object in the heap, and the object data in the heap contains a pointer to the corresponding type data in the method area. The biggest advantage of this approach is speed, and it is the approach used by the HotSpot virtual machine.

III. JVM Memory Allocation

Memory for Java objects is allocated mainly on the heap. Because the heap is shared by all threads, allocating on it requires locking, which makes object creation more expensive. When the heap does not have enough space a GC is triggered, and if space is still insufficient after the GC an OutOfMemoryError is thrown.

To make allocation more efficient, the HotSpot virtual machine uses two techniques in the Eden space of the young generation: bump-the-pointer and TLABs (thread-local allocation buffers). Because Eden is contiguous, bump-the-pointer only has to keep track of the end of the last object created; when a new object is created, it only needs to check whether enough memory remains after the last object, which greatly speeds up allocation. TLABs address multithreading: each newly created thread gets its own small region of Eden, called a TLAB, whose size the JVM calculates from runtime conditions. -XX:TLABWasteTargetPercent sets the percentage of Eden that a TLAB may occupy, 1% by default. Allocating in a TLAB does not require a lock, so the JVM normally allocates objects there; if the object is too large or the TLAB is exhausted, the object is allocated on the shared heap instead. For this reason, allocating several small objects is more efficient than allocating one large object. Adding -XX:+PrintTLAB to the startup parameters shows how the TLAB space is used.
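
Bump-the-pointer allocation is simple enough to model in a few lines. The sketch below is a toy model, not HotSpot's implementation: the class name BumpAllocator is made up, the buffer is represented only by offsets, and a return value of -1 stands in for falling back to another allocation path.

```java
// Toy bump-the-pointer allocator over a contiguous buffer of `capacity` bytes.
// BumpAllocator is a hypothetical name; only offsets are tracked, not real memory.
class BumpAllocator {
    private final int capacity;
    private int top = 0;   // offset just past the last allocated object

    BumpAllocator(int capacity) {
        this.capacity = capacity;
    }

    // Returns the start offset of the new object, or -1 if it does not fit.
    int allocate(int size) {
        if (size <= 0 || size > capacity - top) {
            return -1;   // caller would fall back to a slower path
        }
        int start = top;
        top += size;     // the whole allocation is one comparison and one addition
        return start;
    }

    int used() {
        return top;
    }

    public static void main(String[] args) {
        BumpAllocator buffer = new BumpAllocator(64);
        System.out.println(buffer.allocate(16));   // 0
        System.out.println(buffer.allocate(32));   // 16
        System.out.println(buffer.allocate(32));   // -1, only 16 bytes are left
        System.out.println(buffer.allocate(16));   // 48
    }
}
```

The allocator is deliberately not thread-safe. Giving every thread its own buffer, as a TLAB does, is what makes it possible to skip the lock.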

If an object stays alive in the young generation long enough (that is, it survives several Minor GCs), it is copied to the old generation. The old generation is generally larger than the young generation and can hold more objects, and it is collected less often. When the old generation runs out of memory, a Major GC, also called a Full GC, is performed.

The -XX:+UseAdaptiveSizePolicy switch controls whether an adaptive sizing policy is used. When it is enabled, the JVM dynamically adjusts the size of each area of the Java heap and the age at which objects are promoted to the old generation.

If an object is large (such as a long string or a large array) and the young generation does not have enough space for it, it is allocated directly in the old generation. Large objects may trigger a GC, so they should be used sparingly, and short-lived large objects in particular should be avoided. -XX:PretenureSizeThreshold sets the size above which objects are allocated directly in the old generation.

IV. Memory Reclamation Methods

The JVM uses GC to reclaim memory in the heap and the method area, and the process runs automatically. The Java GC mechanism essentially does three things: it determines which memory needs to be reclaimed, determines when to perform a GC, and determines how to perform it. The JVM implements GC mainly through collectors, which fall into two kinds: reference counting collectors and tracing collectors.

4.1 Reference Counting Collector

Reference counting manages objects in a decentralized way: each object has a counter that records how many references point to it. When the counter reaches 0 the object is no longer in use and can be reclaimed, as shown in the figure:

In the figure above, when ObjectA releases its reference to ObjectB, ObjectB's reference counter drops to 0 and the memory occupied by ObjectB is reclaimed.
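
The counting scheme can be modelled in a few lines. The sketch below is a toy model with made-up names (RefCounted, addReference, removeReference); "reclaiming" is represented by a flag. It also builds two objects that refer to each other, to show a case in which the counters never reach 0.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of reference counting. All names are hypothetical.
class RefCounted {
    final String name;
    int count = 0;               // number of references pointing at this object
    boolean reclaimed = false;   // set once the counter has dropped to 0
    private final List<RefCounted> fields = new ArrayList<>();

    RefCounted(String name) {
        this.name = name;
    }

    // This object starts referencing target, so target's counter goes up.
    void addReference(RefCounted target) {
        fields.add(target);
        target.count++;
    }

    // This object drops its reference to target, so target's counter goes down.
    void removeReference(RefCounted target) {
        if (fields.remove(target)) {
            target.release();
        }
    }

    private void release() {
        if (reclaimed) {
            return;
        }
        count--;
        if (count == 0) {
            reclaimed = true;
            for (RefCounted child : fields) {
                child.release();   // a reclaimed object no longer references anything
            }
            fields.clear();
        }
    }

    public static void main(String[] args) {
        RefCounted a = new RefCounted("ObjectA");
        RefCounted b = new RefCounted("ObjectB");
        a.addReference(b);
        a.removeReference(b);
        System.out.println("B reclaimed: " + b.reclaimed);   // true

        RefCounted x = new RefCounted("X");
        RefCounted y = new RefCounted("Y");
        a.addReference(x);
        a.addReference(y);
        x.addReference(y);
        y.addReference(x);      // x and y now refer to each other
        a.removeReference(x);
        a.removeReference(y);
        System.out.println("X reclaimed: " + x.reclaimed + ", count=" + x.count);   // false, count=1
    }
}
```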

Reference counting has to increment or decrement the counter every time a reference to an object is assigned, which has a cost. In addition, reference counting cannot reclaim objects that reference each other in a cycle. For example, if ObjectB and ObjectC above refer to each other, then even after ObjectA releases its references to ObjectB and ObjectC, neither of them can be reclaimed. Reference counting is therefore poorly suited to Java, a language in which complex reference relationships are common, and the Sun JDK does not use it to implement GC.

4.2 Tracing Collector

A tracing collector uses centralized management and records the state of references globally. It is triggered by certain conditions, such as a timer or a shortage of space, and has to scan object references starting from the root set, which may pause the application. There are three main implementations: copying, mark-sweep, and mark-compact.

Copying

A copying collector scans for live objects starting from the root set and copies each live object it finds into a new, completely unused space, as shown in the figure:
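
The copying step can be sketched on a small object graph. This is a toy model under stated assumptions: objects are plain Java objects with a list of outgoing references, the "new space" is a list, and the names CopyingSketch, Obj and collect are made up. A real collector copies raw memory and leaves forwarding pointers in the old copies; the map below plays that role.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

// Toy copying collector: everything reachable from the roots is copied into toSpace.
class CopyingSketch {
    static final class Obj {
        final String name;
        final List<Obj> refs = new ArrayList<>();
        Obj(String name) { this.name = name; }
    }

    // Copies all objects reachable from roots into toSpace and returns the new roots.
    static List<Obj> collect(List<Obj> roots, List<Obj> toSpace) {
        Map<Obj, Obj> forwarded = new IdentityHashMap<>();   // old copy -> new copy
        List<Obj> newRoots = new ArrayList<>();
        for (Obj root : roots) {
            newRoots.add(copy(root, forwarded, toSpace));
        }
        return newRoots;
    }

    private static Obj copy(Obj from, Map<Obj, Obj> forwarded, List<Obj> toSpace) {
        Obj existing = forwarded.get(from);
        if (existing != null) {
            return existing;            // already copied, reuse the new copy
        }
        Obj to = new Obj(from.name);
        forwarded.put(from, to);        // register before recursing so cycles terminate
        toSpace.add(to);
        for (Obj ref : from.refs) {
            to.refs.add(copy(ref, forwarded, toSpace));
        }
        return to;
    }

    public static void main(String[] args) {
        Obj a = new Obj("a");
        Obj b = new Obj("b");
        Obj garbage = new Obj("garbage");   // never reachable from the root
        a.refs.add(b);

        List<Obj> toSpace = new ArrayList<>();
        collect(Arrays.asList(a), toSpace);
        System.out.println("copied objects: " + toSpace.size());   // 2
    }
}
```

Only the live objects are ever visited; the unreachable one costs nothing, which is why the approach pays off when few objects survive.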

A copying collector only has to scan the live objects reachable from the root set, so it is efficient when the space being collected contains few live objects (this is the algorithm used for Eden in the young generation). The cost is an additional empty region of memory and the work of moving the objects.

Mark-Sweep

A mark-sweep collector scans from the root set and marks the live objects, then scans the whole space and reclaims the unmarked objects. The mark and sweep phases are shown in the following illustration:

In the image above, blue marks live objects that are still referenced and brown marks unreferenced objects that can be reclaimed. In the mark phase all objects are scanned in order to mark them, and this scan can be time-consuming.

The sweep phase reclaims the objects that are not referenced and keeps the live ones. The memory allocator maintains a list of references to the free blocks, and when an allocation request arrives it searches this free list for space.
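
The two phases can be sketched as follows. This is a toy model with made-up names (MarkSweepSketch, Obj, collect): the heap is a list of objects, marking is a recursive walk from the roots, and sweeping removes unmarked objects from the list. It does not model memory addresses, so the free list is not represented.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy mark-sweep collector over a list-based heap. All names are hypothetical.
class MarkSweepSketch {
    static final class Obj {
        final String name;
        final List<Obj> refs = new ArrayList<>();
        boolean marked;
        Obj(String name) { this.name = name; }
    }

    // Mark phase: flag every object reachable from obj.
    static void mark(Obj obj) {
        if (obj.marked) {
            return;
        }
        obj.marked = true;
        for (Obj ref : obj.refs) {
            mark(ref);
        }
    }

    // Sweep phase: drop unmarked objects, clear the marks, return the number reclaimed.
    static int sweep(List<Obj> heap) {
        int before = heap.size();
        heap.removeIf(obj -> !obj.marked);
        for (Obj obj : heap) {
            obj.marked = false;   // reset for the next collection
        }
        return before - heap.size();
    }

    static int collect(List<Obj> roots, List<Obj> heap) {
        for (Obj root : roots) {
            mark(root);
        }
        return sweep(heap);
    }

    public static void main(String[] args) {
        Obj a = new Obj("a"), b = new Obj("b"), c = new Obj("c"), d = new Obj("d");
        a.refs.add(b);
        c.refs.add(d);
        d.refs.add(c);   // c and d form an unreachable cycle
        List<Obj> heap = new ArrayList<>(Arrays.asList(a, b, c, d));

        System.out.println("reclaimed: " + collect(Arrays.asList(a), heap));   // 2
        System.out.println("live:      " + heap.size());                        // 2
    }
}
```

Because liveness is decided by reachability from the roots, the cycle between c and d is reclaimed, unlike under reference counting.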

Mark-sweep does not need to move objects and only processes the objects that are not alive. It is efficient when the space contains many live objects, but because it reclaims the memory of dead objects in place, it leads to memory fragmentation.

Mark-Compact

Mark-compact marks live objects in the same way as mark-sweep but treats them differently afterwards: after reclaiming the memory of the dead objects, it moves all live objects toward the free space at the left end and then updates the pointers that refer to them, as shown in the following illustration:
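
The compaction step can be sketched on an array in which each slot holds one object. This is a toy model with made-up names (CompactSketch, compact): a slot value of 0 means the object in it is dead, and the returned table maps each old index to the new one so that references can be updated.

```java
import java.util.Arrays;

// Toy compaction: slide live slots to the left and record where each one went.
// CompactSketch is a hypothetical name; 0 marks a dead or free slot.
class CompactSketch {
    // Compacts heap in place. Returns forwarding[oldIndex] = newIndex, or -1 for dead slots.
    static int[] compact(int[] heap) {
        int[] forwarding = new int[heap.length];
        int free = 0;                      // next free slot at the left end
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] == 0) {
                forwarding[i] = -1;
                continue;
            }
            forwarding[i] = free;
            heap[free] = heap[i];          // move the live object left
            if (free != i) {
                heap[i] = 0;               // its old slot becomes free
            }
            free++;
        }
        return forwarding;
    }

    public static void main(String[] args) {
        int[] heap = {7, 0, 8, 0, 0, 9};   // three live objects with gaps between them
        int reference = 5;                 // some reference to the object in slot 5

        int[] forwarding = compact(heap);
        reference = forwarding[reference]; // update the pointer after the move

        System.out.println(Arrays.toString(heap));   // [7, 8, 9, 0, 0, 0]
        System.out.println(reference);               // 2
    }
}
```

After compaction the free space is one contiguous run at the right end, so allocation can again proceed by simply advancing a pointer.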

Clearly, mark-compact adds an orderly move of the live objects on top of mark-sweep. This solves the fragmentation problem and yields more contiguous memory, which makes allocation more efficient, but moving the objects makes it relatively costly.

V. The GC Process in the Virtual Machine

5.1 Why Collect by Generation

In the beginning, the JVM's GC used mark-sweep-compact, which is not very efficient: as more and more objects are allocated, the object list grows, and scanning and moving objects take longer and longer, so memory reclamation becomes slower. Analysis of Java applications, however, shows that most objects live for a very short time and only a small number live for a long time, as the following statistics on the lifetime of Java objects in memory show:

As the chart shows, most objects live only a very short time, and the longer the lifetime, the fewer the objects that reach it.

5.2 The GC Process in the Virtual Machine

Now that we know why the JVM collects by generation, let's walk through the whole collection process.

In the initial phase, newly created objects are allocated in Eden, and both Survivor spaces are empty.

When Eden is full, a Minor GC is triggered.

After scanning and marking, the surviving objects are copied to S0, and the objects that did not survive are reclaimed.

In the next Minor GC, Eden is handled as before: unreferenced objects are reclaimed and surviving objects are copied to a Survivor space. This time, however, the Survivor spaces are treated differently: all surviving objects in S0 are copied to S1, and the two objects that were moved to S0 during the previous Minor GC become one generation older when they are copied to S1. At this point Eden and S0 are emptied, all surviving data has been copied to S1, and S1 contains objects of different ages, as shown in the following illustration:

In the next Minor GC the process repeats with the roles of the two Survivor spaces swapped: surviving objects are copied to S0, their age is incremented by 1, and Eden and the other Survivor space are emptied.

The following demonstrates promotion. After several Minor GCs, when the age of a surviving object reaches a threshold (configurable with the -XX:MaxTenuringThreshold parameter; this walkthrough uses 8, while the HotSpot default is typically 15), it is promoted from the young generation to the old generation.

As Minor GCs continue, more objects are promoted to the old generation.

This covers essentially the whole collection process in the young generation. Eventually a Major GC occurs in the old generation, and the old generation's space is cleaned up and compacted.

As the process above shows, Eden is a contiguous space and one of the Survivor spaces is always empty. After a GC and the copying it involves, one Survivor space holds the objects that are currently alive, and the contents of Eden and the other Survivor space are no longer needed and can simply be emptied; at the next GC the two Survivor spaces swap roles. Allocating and reclaiming memory in this way is therefore extremely efficient. This form of garbage collection is the well-known stop-and-copy collection (the surviving objects in Eden and one Survivor space are copied to the other Survivor space). This does not mean that stop-and-copy is always efficient: it is efficient only under this condition, namely that most objects have a very short lifetime. Using stop-and-copy in the old generation would be very inappropriate.
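
The walkthrough above can be condensed into a small simulation. It is a sketch under simplifying assumptions: the class and method names (MinorGcSketch, allocate, minorGc) are made up, liveness is supplied by the caller as a predicate instead of being found by scanning from the root set, and an object is promoted as soon as its age reaches the threshold, which may differ in detail from what HotSpot does.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Toy model of the young generation: Eden, two Survivor spaces, aging and promotion.
class MinorGcSketch {
    static final class Obj {
        final String name;
        int age = 0;   // number of Minor GCs survived
        Obj(String name) { this.name = name; }
    }

    final int tenuringThreshold;
    final List<Obj> eden = new ArrayList<>();
    List<Obj> from = new ArrayList<>();   // Survivor space that currently holds objects
    List<Obj> to = new ArrayList<>();     // Survivor space that is currently empty
    final List<Obj> old = new ArrayList<>();

    MinorGcSketch(int tenuringThreshold) {
        this.tenuringThreshold = tenuringThreshold;
    }

    Obj allocate(String name) {
        Obj obj = new Obj(name);
        eden.add(obj);
        return obj;
    }

    // One Minor GC: copy survivors out of Eden and `from`, empty both, swap the roles.
    void minorGc(Predicate<Obj> isLive) {
        evacuate(eden, isLive);
        evacuate(from, isLive);
        eden.clear();
        from.clear();
        List<Obj> tmp = from;
        from = to;
        to = tmp;
    }

    private void evacuate(List<Obj> space, Predicate<Obj> isLive) {
        for (Obj obj : space) {
            if (!isLive.test(obj)) {
                continue;                 // dead objects vanish when the space is cleared
            }
            obj.age++;
            if (obj.age >= tenuringThreshold) {
                old.add(obj);             // promoted to the old generation
            } else {
                to.add(obj);              // copied to the empty Survivor space
            }
        }
    }

    public static void main(String[] args) {
        MinorGcSketch heap = new MinorGcSketch(3);   // assumed threshold of 3 for brevity
        Obj keeper = heap.allocate("keeper");
        heap.allocate("temp");
        for (int gc = 1; gc <= 3; gc++) {
            heap.minorGc(obj -> obj == keeper);
            System.out.println("after GC " + gc + ": age=" + keeper.age
                    + " survivor=" + heap.from.size() + " old=" + heap.old.size());
        }
        // The third line printed is: after GC 3: age=3 survivor=0 old=1
    }
}
```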

The old generation stores more objects than the young generation, including large ones, so collecting it with a stop-and-copy algorithm would be very inefficient. The old generation generally uses the mark-compact algorithm instead: live objects (those that are still referenced) are marked, and all live objects are then moved to one end so that memory stays contiguous. When a Minor GC occurs, the virtual machine checks whether the size of the objects promoted to the old generation in each collection is greater than the space remaining in the old generation. If it is, a Full GC is triggered directly. Otherwise the virtual machine checks whether -XX:+HandlePromotionFailure is set (that is, whether the promotion guarantee is allowed to fail). If it is allowed, only the Minor GC is performed and a memory allocation failure can be tolerated; if it is not allowed, a Full GC is performed as well. This means that when promotion failure is not allowed, every Minor GC also triggers a Full GC, even if the old generation still has plenty of free memory, so it is best not to configure the JVM that way.

For the method area, that is, the permanent generation, two kinds of content are collected: constants in the constant pool and class information that is no longer used. Collecting constants is simple: a constant that is not referenced can be reclaimed. For a class to be considered unused and reclaimed, three conditions must all hold:

1. All instances of the class have been reclaimed.
2. The ClassLoader that loaded the class has been reclaimed.
3. The Class object of the class is not referenced anywhere (that is, the class cannot be reached through reflection anywhere).

Collecting the permanent generation is not mandatory, and whether classes are reclaimed can be set with a parameter.

VI. Garbage Collectors

The sections above described the JVM's memory reclamation process. In the virtual machine GC is carried out by a garbage collector, so in practice we need to choose a collector that suits the application. The garbage collectors are introduced below.

6.1 Serial Collector

The serial collector is the default for client virtual machines in Java SE 5 and 6. It is the simplest collector and is best suited to systems with only one processor. With the serial collector, both Minor and Major GCs are performed by a single thread.

Usage Scenarios

First, the serial GC is typically used for applications that do not have strict pause-time requirements and run in client mode; it uses only one CPU core for garbage collection. On current hardware a serial GC can manage many applications with small memory footprints and keep pauses relatively short (a Full GC takes about a few seconds). Second, the serial GC is commonly used when one machine runs several JVMs (more JVMs than CPU cores); in that case a JVM that uses only one processor for garbage collection has little impact on the other JVMs. Finally, the serial collector is a good fit for hardware with little memory and few CPU cores.

Related Command Parameters

1. Enable the serial collector: -XX:+UseSerialGC

2. Command line example:

java -Xmx12m -Xms3m -Xmn1m -XX:PermSize=20m -XX:MaxPermSize=20m -XX:+UseSerialGC -jar c:\javademos\demo\jfc\Java2D\Java2demo.jar

6.2 Parallel Collector

The parallel collector uses multiple threads for garbage collection, which can deliver high throughput. It has no effect on the running application when no garbage collection is in progress, and during a GC it uses multiple threads to speed up collection, so it is well suited to batch processing. If the application has strict pause-time requirements, the concurrent collector described below is recommended instead. By default, on a machine with N CPUs the number of parallel GC threads is N. The number of threads can be set with a parameter: -XX:ParallelGCThreads=<desired number>. The parallel collector is the default collector on server-class machines (at least 2 CPUs and at least 2 GB of memory).

On a single-core machine the default collector is still used for the actual collection, even if the parallel collector is configured. On a machine with only two CPUs the parallel collector performs about the same as the default collector; young-generation pause times are reduced only when there are more than two CPUs.

Application Scenarios

The parallel collector is suitable for machines with multiple CPUs and for applications that do not have strict pause-time requirements. Typical examples are batch applications such as report printing and database queries.

Multithreaded in the Young Generation, Single-Threaded in the Old Generation

1. Enable with: -XX:+UseParallelGC

2. Command line example:

java -Xmx12m -Xms3m -Xmn1m -XX:PermSize=20m -XX:MaxPermSize=20m -XX:+UseParallelGC -jar c:\javademos\demo\jfc\Java2D\Java2demo.jar

Multithreaded in Both the Young and Old Generations

1. Enable with: -XX:+UseParallelOldGC

When the -XX:+UseParallelOldGC option is enabled, garbage collection in both the young and old generations is multithreaded, and so is the compaction phase. Because HotSpot uses the stop-and-copy algorithm in the young generation, the young generation has no compaction step; the old generation uses the mark-sweep-compact algorithm, so compaction happens only in the old generation.

2. Command line example:

java -Xmx12m -Xms3m -Xmn1m -XX:PermSize=20m -XX:MaxPermSize=20m -XX:+UseParallelOldGC -jar c:\javademos\demo\jfc\Java2D\Java2demo.jar

6.3 CMS (Concurrent Mark Sweep) Collector

The CMS collector is used mainly for the old generation. It tries to reduce pauses during garbage collection by doing most of its work concurrently with the application threads. The CMS collector does not copy or move live objects.

Application Scenarios

The CMS collector is used mainly by applications with strict pause-time requirements, such as desktop UI applications that must respond to user actions promptly, servers that must respond quickly to client requests, or databases that must answer queries quickly.

Related Command Parameters

1. Enable the CMS collector: -XX:+UseConcMarkSweepGC

2. Set the number of threads: -XX:ParallelCMSThreads=<n>

3. Command line example:

java -Xmx12m -Xms3m -Xmn1m -XX:PermSize=20m -XX:MaxPermSize=20m -XX:+UseConcMarkSweepGC -XX:ParallelCMSThreads=2 -jar c:\javademos\demo\jfc\Java2D\Java2demo.jar

6.4 G1 Collector

G1 (Garbage First) is a new collector introduced in Java 7 whose goal is to replace the CMS collector. G1 is a parallel, concurrent, incrementally compacting, low-pause collector; it is not described in detail here.

Related Command Parameters

1. Enable the G1 collector: -XX:+UseG1GC

2. Command line example:

java -Xmx12m -Xms3m -XX:+UseG1GC -jar c:\javademos\demo\jfc\Java2D\Java2demo.jar
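
To check which collectors a JVM actually ended up using after one of the options above was passed, the standard java.lang.management API can list them. The sketch below is only an illustration: the class name ActiveCollectors is made up, and the names that are printed depend on the JVM version and on the selected collector.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Lists the garbage collectors active in the running JVM.
// ActiveCollectors is a hypothetical name; the MXBean API is part of the JDK.
class ActiveCollectors {
    // Returns the names of the collectors reported by the JVM.
    static List<String> names() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName()
                    + ": collections=" + gc.getCollectionCount()
                    + ", time(ms)=" + gc.getCollectionTime());
        }
    }
}
```

Running the same class with different collector options and comparing the output is a quick way to confirm that an option took effect.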
