JVM Runtime Memory Structure
1. JVM Memory Model
JVM runtime memory = shared memory zone + thread memory zone
1). Shared Memory Zone
Shared memory zone = Permanent Space + heap
Permanent Space = method area + others
Heap = Old Space + Young Space
Young Space = Eden + S0 + S1
(1) Permanent Space
JVM uses Permanent Space to implement the method area, which stores all loaded class information, method information, and constant pools.
You can use -XX:PermSize and -XX:MaxPermSize to specify the initial and maximum size of the Permanent Space.
Permanent Space is not the same thing as the method area; it is only that the HotSpot JVM uses Permanent Space to implement the method area. Some virtual machines implement the method area through other mechanisms and have no Permanent Space.
(2) Heap
The heap is mainly used to store object instances.
The heap is divided into Old Space (also known as the Tenured Generation) and Young Space.
Old Space stores long-lived objects;
Eden stores newly created objects;
S0 and S1 are two memory regions of the same size. They hold the objects that survive each garbage collection of Eden and act as a buffer for the transition from Eden to Old Space (S stands for Survivor Space).
The heap is divided in this way to make object allocation and garbage collection easier, as explained in the garbage collection section below.
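The regions described above can be inspected on a running JVM. The following is a minimal sketch that uses the standard java.lang.management API; the class name MemoryPoolsDemo is hypothetical. The pool names printed depend on the JVM version and the collector in use, and JVMs from Java 8 onward report a Metaspace pool rather than a permanent generation.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.util.ArrayList;
import java.util.List;

class MemoryPoolsDemo {
    // Collects "<type>: <pool name>" for every memory pool the running JVM reports.
    static List<String> poolNames() {
        List<String> names = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            names.add(pool.getType() + ": " + pool.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        for (String name : poolNames()) {
            System.out.println(name);
        }
    }
}
```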
2). Thread Memory Zone
Thread memory zone = single thread memory + ...
Single thread memory = PC register + JVM stack + native method stack
JVM stack = stack frame + ...
Stack frame = local variable area + operand stack + frame data area
In Java, a thread corresponds to a JVM Stack, which records the running status of the thread.
The JVM stack is composed of stack frames, and each stack frame represents one method call. A stack frame has three parts: the local variable area, the operand stack, and the frame data area.
(1) Local variable area
The local variable area can be understood as a memory area managed as an array. Its index starts at 0, and each local variable slot is 32 bits, that is, 4 bytes.
Values of type byte, char, short, boolean, int, and float, as well as object references, each occupy one slot; short, byte, and char values are converted to int before they are stored in the array. long and double values occupy two consecutive slots and are accessed through the index of the first slot.
For example:
    public static int runClassMethod(int i, long l, float f, double d, Object o, byte b) {
        return 0;
    }

    public int runInstanceMethod(char c, double d, short s, boolean b) {
        return 0;
    }
In the corresponding local variable areas, the first slot of runInstanceMethod holds a reference to the object itself, commonly known as this. runClassMethod has no such reference because it is a static method.
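The slot rules above can be sketched as a small calculation. The names SlotLayout, slotsFor, and firstSlots are hypothetical; the code only models the two rules stated in the text, namely that long and double take two slots and that an instance method reserves slot 0 for this.

```java
import java.util.ArrayList;
import java.util.List;

class SlotLayout {
    // Number of 32-bit slots a parameter of the given type occupies.
    static int slotsFor(String type) {
        return (type.equals("long") || type.equals("double")) ? 2 : 1;
    }

    // Returns the index of the first slot of each parameter, in declaration order.
    // Instance methods reserve slot 0 for the implicit "this" reference.
    static List<Integer> firstSlots(boolean isStatic, String... paramTypes) {
        List<Integer> indexes = new ArrayList<>();
        int next = isStatic ? 0 : 1;
        for (String type : paramTypes) {
            indexes.add(next);
            next += slotsFor(type);
        }
        return indexes;
    }

    public static void main(String[] args) {
        // runClassMethod(int, long, float, double, Object, byte)
        System.out.println(firstSlots(true, "int", "long", "float", "double", "Object", "byte"));
        // runInstanceMethod(char, double, short, boolean)
        System.out.println(firstSlots(false, "char", "double", "short", "boolean"));
    }
}
```

For the two methods in the example, this places the parameters of runClassMethod at slots 0, 1, 3, 4, 6, 7 and those of runInstanceMethod at slots 1, 2, 4, 5, with slot 0 holding this.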
(2) Operand stack
Like the local variable area, the operand stack is organized as an array of word-sized entries. Unlike the former, it is not accessed by index but through push and pop operations. The operand stack is the storage area for temporary data.
For example:
    int a = 100;
    int b = 5;
    int c = a + b;
As these statements execute, values are pushed onto and popped from the operand stack. In other words, the operand stack is a temporary data storage area that is operated on only through push and pop.
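The push and pop traffic for the three statements can be imitated with an ordinary stack. This is a sketch, not the JVM's implementation: the class name OperandStackDemo is hypothetical, and the slot numbers 1 to 3 for a, b, and c are an assumption (they depend on the enclosing method). The comments name the bytecode instruction each step stands for.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class OperandStackDemo {
    // Simulates: int a = 100; int b = 5; int c = a + b;
    static int run() {
        int[] locals = new int[4];                 // local variable area (slots assumed)
        Deque<Integer> stack = new ArrayDeque<>(); // operand stack

        stack.push(100);              // bipush 100
        locals[1] = stack.pop();      // istore_1 (a)
        stack.push(5);                // iconst_5
        locals[2] = stack.pop();      // istore_2 (b)
        stack.push(locals[1]);        // iload_1
        stack.push(locals[2]);        // iload_2
        int right = stack.pop();      // iadd pops both operands...
        int left = stack.pop();
        stack.push(left + right);     // ...and pushes their sum
        locals[3] = stack.pop();      // istore_3 (c)
        return locals[3];
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints 105
    }
}
```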
PS: Virtual machine implementations use either a stack-based instruction set (HotSpot, the Oracle JVM) or a register-based instruction set (Dalvik VM, the Android virtual machine). What is the difference between the two?
A stack-based instruction set is simple to access, independent of the hardware, and compact, and it does not need to consider physical register allocation. However, the same operation requires more push and pop instructions, and therefore more memory accesses. The biggest advantage of a register-based instruction set is that it needs fewer instructions and runs faster, but each instruction is more complex because it must name its operands explicitly.
Example:
    package com.demo3;

    public class Test {
        public static void foo() {
            int a = 1;
            int b = 2;
            int c = (a + b) * 5;
        }

        public static void main(String[] args) {
            foo();
        }
    }
The stack-based HotSpot execution process is as follows:
The register-based Dalvik VM execution process is as follows:
Both forms are ultimately handled by the virtual machine's execution engine. The assembly instructions received by the CPU are:
(3) Frame data area
The frame data area stores a pointer to the constant pool. When an instruction needs data from the constant pool, it reaches it through this pointer. The frame data area also stores the data needed for normal method return and for abrupt termination by an exception.
2. Garbage Collection Mechanism
1) Why garbage collection?
JVM automatically detects and releases memory that is no longer in use, improving memory utilization.
The JVM executes GC during Java runtime, so that the programmer no longer needs to explicitly release the object.
2) Memory areas to recycle
Because thread memory is allocated when a thread is created and released when it exits, garbage collection concentrates on the shared memory zone, that is, Permanent Space and the heap.
3) How to determine whether an object is dead (object marking)
(1) Reference counting
Reference counting uses a counter to record how many references point to an object. The method is simple and efficient, but it cannot handle circular references. For example, object A holds a reference to object B, and object B holds a reference to object A, but nothing else references A or B. Under reference counting, A and B are never reclaimed because each still has a count of 1. The JVM does not use this method.
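The circular reference problem can be reproduced with a toy counter. This sketch does not reflect how the JVM manages memory; the class RefCountDemo and its Node type are hypothetical and only keep the bookkeeping a reference-counting collector would keep.

```java
import java.util.ArrayList;
import java.util.List;

class RefCountDemo {
    // Toy object with an explicit reference counter.
    static class Node {
        final String name;
        int refCount = 0;
        final List<Node> fields = new ArrayList<>();
        Node(String name) { this.name = name; }
    }

    // Stores a reference to "to" inside "from" and bumps the target's counter.
    static void addRef(Node from, Node to) {
        from.fields.add(to);
        to.refCount++;
    }

    // Builds the A <-> B cycle, drops the external references,
    // and returns the counters that remain.
    static int[] cycleCounts() {
        Node a = new Node("A");
        Node b = new Node("B");
        a.refCount++;       // external variable pointing at A
        b.refCount++;       // external variable pointing at B
        addRef(a, b);       // A references B
        addRef(b, a);       // B references A
        a.refCount--;       // external variable goes away
        b.refCount--;       // external variable goes away
        return new int[] { a.refCount, b.refCount };
    }

    public static void main(String[] args) {
        int[] counts = cycleCounts();
        // Both counters stay at 1, so a pure reference counter never frees A or B.
        System.out.println(counts[0] + " " + counts[1]);
    }
}
```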
(2) Root search (reachability analysis)
Root search (reachability analysis) solves the circular reference problem. The basic idea is to start from a set of root objects called GC roots and follow references downward from them. The path traversed is called a reference chain. If no reference chain connects an object to any GC root, the object is unreachable and is judged to be reclaimable.
In short, the GC collects objects that are neither GC roots nor reachable from a GC root. An object can be reachable from more than one GC root.
GC roots include the following types:
Objects referenced from the virtual machine stack (the local variable table in a stack frame)
Objects referenced by static class fields in the method area
Objects referenced by constants in the method area
Objects referenced by JNI (native methods) in the native method stack
Objects used internally by the JVM, such as the system class loader
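Root search can be sketched as a graph traversal that starts from the roots. The heap below is a hypothetical map from object names to the names they reference; ReachabilityDemo and its methods are illustrative only. Objects A and B reference each other but hang off no root, so they stay unmarked.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class ReachabilityDemo {
    // Marks every object reachable from the given roots by following references.
    static Set<String> reachable(Map<String, List<String>> refs, List<String> roots) {
        Set<String> marked = new HashSet<>();
        Deque<String> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            String obj = pending.pop();
            if (marked.add(obj)) { // first visit: queue everything it references
                pending.addAll(refs.getOrDefault(obj, Collections.<String>emptyList()));
            }
        }
        return marked;
    }

    // root -> X -> Y, plus an isolated cycle A <-> B.
    static Map<String, List<String>> sampleHeap() {
        Map<String, List<String>> refs = new HashMap<>();
        refs.put("root", Arrays.asList("X"));
        refs.put("X", Arrays.asList("Y"));
        refs.put("A", Arrays.asList("B"));
        refs.put("B", Arrays.asList("A"));
        return refs;
    }

    public static void main(String[] args) {
        Set<String> live = reachable(sampleHeap(), Arrays.asList("root"));
        System.out.println(new TreeSet<>(live)); // A and B are absent
    }
}
```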
Reachability analysis determines an object's state, but it is not by itself the condition for reclaiming the object; the actual condition is more involved. An object that cannot be reached from any GC root is not reclaimed immediately. If the object has not yet received its second mark (mark2), it enters a grace period: it is marked for the first time (mark1) and placed in a queue called the F-Queue. If the object has already received mark2, it is reclaimed.
The F-Queue is processed by a low-priority Finalizer thread, and each mark1 object waits there for its finalize() method to be executed. (The JVM does not promise to wait for finalize() to finish, because a slow finalize() method or one stuck in an endless loop would block the other objects in the queue.) Once the finalize() method of a mark1 object has been run, the object is marked a second time (mark2). Later GC cycles repeat the same "search, mark1, mark2" logic.
This marking process is the basis of the garbage collection algorithms that follow.
PS:
If the object becomes referenced again inside its finalize() method, the object is revived.
The finalize() method is executed at most once per object, so an object has only one chance to be revived.
3) Garbage collection algorithms
There are three main garbage collection algorithms:
Mark-sweep (mark-clear)
Mark-copy
Mark-compact (mark-organize)
All three include a "mark" phase, which is the root search (reachability analysis) described above. The subsequent "sweep", "copy", and "compact" actions are the specific ways in which the marked objects are reclaimed.
(1) Mark-sweep
After the root search (reachability analysis) has marked the objects, the memory occupied by objects identified as garbage is released in place. The disadvantage of this algorithm is that it leaves many memory fragments.
Although the disadvantage is obvious, this policy is the basis of the other two, which were created to address its shortcomings.
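The fragmentation problem can be seen in a toy heap of fixed-size cells. This is a sketch under simplifying assumptions: every object occupies exactly one cell, and the class MarkSweepDemo with its helper methods is hypothetical. After sweeping, half of the heap is free, yet no two free cells are adjacent.

```java
class MarkSweepDemo {
    // heap[i] == true means cell i holds an object; live[i] is the mark bit.
    // Sweeping frees every occupied cell whose object was not marked live.
    static boolean[] sweep(boolean[] heap, boolean[] live) {
        boolean[] after = new boolean[heap.length];
        for (int i = 0; i < heap.length; i++) {
            after[i] = heap[i] && live[i];
        }
        return after;
    }

    // Total number of free cells.
    static int freeCells(boolean[] heap) {
        int free = 0;
        for (boolean used : heap) {
            if (!used) free++;
        }
        return free;
    }

    // Length of the longest run of adjacent free cells.
    static int largestFreeRun(boolean[] heap) {
        int best = 0;
        int run = 0;
        for (boolean used : heap) {
            run = used ? 0 : run + 1;
            best = Math.max(best, run);
        }
        return best;
    }

    public static void main(String[] args) {
        boolean[] heap = { true, true, true, true, true, true, true, true };
        boolean[] live = { true, false, true, false, true, false, true, false };
        boolean[] after = sweep(heap, live);
        System.out.println("free cells: " + freeCells(after));
        System.out.println("largest contiguous free run: " + largestFreeRun(after));
    }
}
```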
(2) Mark-copy
Memory is divided into two halves. After the root search (reachability analysis) has marked the objects, all surviving objects in one half are copied to the other, free half.
The disadvantage of this algorithm is that only half of the memory is usable. How can this problem be solved?
The JVM divides the heap into a young area and an old area. The young area consists of Eden, S0, and S1, whose sizes follow a fixed ratio; for example, with the default ratio of 8:1:1 it is split into Eden and two smaller Survivor blocks. During each GC, the surviving objects in Eden and S0 are copied to the idle S1.
Garbage collection in the young area happens often and is called Minor GC. Generally, Minor GC is triggered when a new object is created and the allocation in Eden fails. It collects the Eden area, clears dead objects, moves the surviving objects to a Survivor area, and then tidies up the two Survivor areas. This GC works only on the Young Space and does not affect the Old Space. Because most objects are created in Eden and Eden is not large, GC of Eden is frequent, so it needs a fast and efficient algorithm that frees Eden as soon as possible.
Main steps of Minor GC:
A. Newly created objects are allocated in the Eden area;
B. When Eden is full and another object is created, the allocation fails and Minor GC is triggered to collect garbage in the young area (Eden + 1 Survivor). (Why Eden + 1 Survivor: one of the two Survivor areas is always empty, and the empty one is labeled the To Survivor);
C. During Minor GC, Eden objects that cannot be reclaimed are placed in the empty Survivor (the To Survivor), so Eden is always emptied. Objects in the other Survivor (the From Survivor) that cannot be reclaimed are also placed in the To Survivor, which guarantees that one Survivor is always empty. (After Minor GC completes, the To Survivor and From Survivor labels are swapped);
D. In step C, if the To Survivor fills up, the objects that do not fit are copied to the old area. Objects are also moved to the old area when the Survivor is not full but they are old enough (as set by the -XX:MaxTenuringThreshold parameter). (The age of an object increases by one for every Minor GC it survives in the Survivor area. When its age reaches the threshold, 15 by default, it is promoted to the old generation.)
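Steps A to D can be walked through with a toy model. This is a sketch, not HotSpot's implementation: the class MinorGcDemo, the capacity counted in objects, and the exact promotion rule (promote when the age reaches the threshold or the To Survivor is full) are simplifying assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class MinorGcDemo {
    static class Obj {
        final String name;
        boolean live = true; // stands in for the result of the mark phase
        int age = 0;
        Obj(String name) { this.name = name; }
    }

    final int survivorCapacity;   // how many objects one Survivor can hold
    final int tenuringThreshold;  // plays the role of -XX:MaxTenuringThreshold
    List<Obj> eden = new ArrayList<>();
    List<Obj> from = new ArrayList<>();
    List<Obj> to = new ArrayList<>();
    final List<Obj> old = new ArrayList<>();

    MinorGcDemo(int survivorCapacity, int tenuringThreshold) {
        this.survivorCapacity = survivorCapacity;
        this.tenuringThreshold = tenuringThreshold;
    }

    void minorGc() {
        List<Obj> candidates = new ArrayList<>(eden);
        candidates.addAll(from);
        for (Obj o : candidates) {
            if (!o.live) continue;  // garbage is simply not copied
            o.age++;                // survived one more Minor GC
            if (o.age >= tenuringThreshold || to.size() >= survivorCapacity) {
                old.add(o);         // step D: promote to the old area
            } else {
                to.add(o);          // step C: copy into the To Survivor
            }
        }
        eden.clear();
        from.clear();
        List<Obj> tmp = from;       // swap the From and To labels
        from = to;
        to = tmp;
    }

    // Three objects in Eden, one of them dead, threshold 3, then three Minor GCs.
    // Returns the sizes of {eden, from, to, old} afterwards.
    static int[] demo() {
        MinorGcDemo heap = new MinorGcDemo(2, 3);
        Obj a = new Obj("a");
        Obj b = new Obj("b");
        Obj c = new Obj("c");
        heap.eden.add(a);
        heap.eden.add(b);
        heap.eden.add(c);
        b.live = false;
        heap.minorGc(); // a and c move to a Survivor with age 1
        heap.minorGc(); // a and c move to the other Survivor with age 2
        heap.minorGc(); // a and c reach age 3 and are promoted
        return new int[] { heap.eden.size(), heap.from.size(), heap.to.size(), heap.old.size() };
    }

    public static void main(String[] args) {
        int[] sizes = demo();
        System.out.println("eden=" + sizes[0] + " from=" + sizes[1]
                + " to=" + sizes[2] + " old=" + sizes[3]);
    }
}
```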
(3) Mark-compact
Can the Old Space use the mark-copy policy? It is a poor fit.
Most objects in the Young Space are short-lived, so few live objects remain after each GC. Most objects in the Old Space have a particularly long life cycle, so many live objects remain even after GC. Copying them all would make collection very inefficient.
Given these characteristics of the Old Space, compaction is used instead: the objects to be cleared are removed, and the surviving objects are moved toward one end of the heap and laid out contiguously.
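The compaction step can be sketched on a toy heap of named cells. The class MarkCompactDemo is hypothetical, each object is assumed to occupy one cell, and the choice to slide survivors toward index 0 is arbitrary; a real collector must also update every reference to a moved object, which this sketch omits.

```java
import java.util.Arrays;

class MarkCompactDemo {
    // Slides live objects toward index 0, preserving their order.
    // A null entry is a free cell.
    static String[] compact(String[] heap, boolean[] live) {
        String[] after = new String[heap.length];
        int next = 0;
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] != null && live[i]) {
                after[next++] = heap[i];
            }
        }
        return after;
    }

    public static void main(String[] args) {
        String[] heap = { "a", "b", "c", "d", "e", "f" };
        boolean[] live = { true, false, true, false, false, true };
        // Survivors end up adjacent, leaving one contiguous free area.
        System.out.println(Arrays.toString(compact(heap, live)));
    }
}
```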
Garbage collection of the Old Space (plus Permanent Space) happens only occasionally and is called Full GC (major garbage collection). Full GC is slower than Minor GC because the whole heap is collected, including Young, Old, and Perm. The number of Full GCs should therefore be kept to a minimum; a major part of JVM tuning is adjusting Full GC behavior.
Full GC may occur due to the following reasons:
4) Garbage Collector
The garbage collection algorithm is the theoretical basis of memory collection, and the garbage collector is the specific implementation of memory collection.
Generational heap management is used by most JVMs today. The core idea is to divide memory into regions according to object lifetime, generally the Old Space and the Young Space. In the Old Space only a few objects need to be reclaimed at each collection, whereas in the Young Space a large number of objects are reclaimed each time. The most suitable algorithm can therefore be chosen for each generation.
Currently, most garbage collectors use the mark-copy algorithm for the Young Space. Because only a small number of objects are reclaimed from the Old Space each time, the mark-compact algorithm is generally used there.
(1) GC implementations for the Young Space:
Serial: The Serial collector is the most basic and oldest collector. It is single-threaded and must suspend all user threads while it collects. It is a young-generation collector that uses the mark-copy algorithm. Its advantage is that it is simple and efficient; its disadvantage is the pause it imposes on the application. It is best suited to single-core client machines, and on multi-core servers it can noticeably reduce application performance.
ParNew (parallel): The ParNew collector is the multi-threaded version of the Serial collector. It uses multiple threads for garbage collection.
Parallel Scavenge (parallel): The Parallel Scavenge collector is a multi-threaded young-generation collector that uses the mark-copy algorithm. Like the previous two, it suspends user threads while it collects; it differs from them in that its goal is a controllable throughput.
(2) GC implementations for the Old Space:
Serial Old: The Old Space version of the Serial collector; it uses the mark-compact algorithm. It is best suited to single-core client machines, and on multi-core servers it can noticeably reduce application performance.
Parallel Old (parallel): The Old Space version of the Parallel Scavenge collector. It uses multiple threads and the mark-compact algorithm.
CMS (concurrent): The CMS (Concurrent Mark Sweep) collector is designed to achieve the shortest collection pause time. It is a concurrent collector that uses the mark-sweep algorithm.
(3) G1
The G1 (Garbage First) collector was introduced in JDK 1.7. Viewed as a whole, G1 is based on the mark-compact algorithm, so it does not leave memory fragments. Another difference is scope: earlier collectors work on either the whole young generation or the whole old generation, while G1 covers the entire Java heap (both the young and the old generation).
3. JVM Parameters
1). Heap
-Xmx: maximum heap size, for example -Xmx512m
-Xms: initial heap size, for example -Xms256m
-XX:MaxNewSize: maximum size of the young area
-XX:NewSize: initial size of the young area, usually 1/3 or 1/4 of -Xmx. Young generation = Eden + 2 Survivor spaces; the space actually available is Eden + 1 Survivor, that is, 90%
-XX:MaxPermSize: maximum size of the Permanent Space
-XX:PermSize: initial size of the Permanent Space
-XX:+PrintGCDetails: print GC information
-XX:NewRatio: ratio of the old generation to the young generation. For example, with -XX:NewRatio=2 the young generation takes 1/3 of the total heap and the old generation takes 2/3.
-XX:SurvivorRatio: ratio of Eden to one Survivor space in the young generation. The default value is 8, which means Eden takes 8/10 of the young generation and each of the two Survivor spaces takes 1/10.
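The arithmetic behind -XX:NewRatio and -XX:SurvivorRatio can be worked through in a few lines. This is a sketch of the ratios only: the class GenerationSizes and its split method are hypothetical, and a real JVM additionally rounds and aligns the region sizes.

```java
class GenerationSizes {
    // Returns {young, old, eden, oneSurvivor} in MB for the given heap size.
    // newRatio      = old : young      (as in -XX:NewRatio)
    // survivorRatio = eden : survivor  (as in -XX:SurvivorRatio)
    static long[] split(long heapMb, int newRatio, int survivorRatio) {
        long young = heapMb / (newRatio + 1);
        long old = heapMb - young;
        long survivor = young / (survivorRatio + 2); // Eden plus two Survivors
        long eden = young - 2 * survivor;
        return new long[] { young, old, eden, survivor };
    }

    public static void main(String[] args) {
        // A 600 MB heap with NewRatio=2 and SurvivorRatio=8.
        long[] s = split(600, 2, 8);
        System.out.println("young=" + s[0] + "MB old=" + s[1]
                + "MB eden=" + s[2] + "MB survivor=" + s[3] + "MB each");
    }
}
```

With these inputs the young generation is 200 MB (1/3 of the heap), the old generation is 400 MB, Eden is 160 MB, and each Survivor is 20 MB, so the usable young space (Eden + 1 Survivor) is 180 MB, or 90%.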
2). Stack
-Xss: sets the stack size of each thread. From JDK 1.5 onward the default stack size of each thread is 1 MB. If call stacks are not very deep, 1 MB is generally enough.
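The effect of the stack size can be observed by recursing until the stack is exhausted. This is a rough sketch with a hypothetical class name, StackDepthDemo; the depth reached varies between runs, JVM versions, and platforms, so the number is only meaningful for comparison.

```java
class StackDepthDemo {
    static int depth = 0;

    // Each call adds one stack frame until the thread's stack is exhausted.
    static void recurse() {
        depth++;
        recurse();
    }

    // Returns how many frames were pushed before StackOverflowError was thrown.
    static int measure() {
        depth = 0;
        try {
            recurse();
        } catch (StackOverflowError e) {
            // Expected: the JVM stack of this thread is full.
        }
        return depth;
    }

    public static void main(String[] args) {
        System.out.println("frames before overflow: " + measure());
    }
}
```

Running it once with a small value such as -Xss256k and once with a larger value such as -Xss2m should show a correspondingly smaller and larger depth.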
3). Garbage Collection
4). JVM client mode and server Mode
The JAVA_HOME/bin/java command accepts a -server and a -client parameter, which start the JVM in server or client mode.
The main difference between the two modes is that a JVM in server mode starts more slowly but performs much better once it has warmed up. In client mode the virtual machine uses a lightweight compiler code-named C1, while in server mode it uses a heavier compiler code-named C2. C2 optimizes more thoroughly than C1 and delivers higher performance after warm-up.
(1) View the current default JVM startup mode
java -version: shows whether the client or the server VM is used by default.
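The same information can be read from inside a program through standard system properties. This is a small sketch with a hypothetical class name, VmModeDemo; on HotSpot the java.vm.name property usually contains "Server VM" or "Client VM", but the exact strings depend on the vendor and version.

```java
class VmModeDemo {
    // Name of the running virtual machine, as reported by the JVM itself.
    static String vmName() {
        return System.getProperty("java.vm.name");
    }

    public static void main(String[] args) {
        System.out.println("java.vm.name = " + vmName());
        System.out.println("java.vm.info = " + System.getProperty("java.vm.info"));
    }
}
```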
(2) Automatic detection of the default startup mode
From JDK 5, if neither -client nor -server is given explicitly, the JVM chooses the mode automatically based on the machine configuration and the JDK version.
A server-class machine is defined as one with at least 2 CPUs and at least 2 GB of physical memory.
On Windows, the 64-bit JDK does not provide client mode; server mode is used directly.
(3) Change the JVM startup mode through the configuration file
To switch between the two modes, edit the jvm.cfg configuration file:
The 32-bit JVM configuration file is JAVA_HOME/jre/lib/i386/jvm.cfg,
The 64-bit JVM configuration file is JAVA_HOME/jre/lib/amd64/jvm.cfg. Currently, 64-bit supports only server mode.
For example:
The content of the jvm.cfg file in the 32-bit JDK 5:
    -client KNOWN
    -server KNOWN
    -hotspot ALIASED_TO -client
    -classic WARN
    -native ERROR
    -green ERROR
The jvm.cfg file of the 64-bit JDK 7 is as follows:
    -server KNOWN
    -client IGNORE
    -hotspot ALIASED_TO -server
    -classic WARN
    -native ERROR
    -green ERROR
4. Heap vs. Stack
The JVM stack is the runtime unit, while the JVM heap is the storage unit.
The JVM stack represents the processing logic, while the JVM heap represents the data.
The JVM heap stores objects. The JVM stack stores primitive values and references to objects in the JVM heap.
The JVM heap is shared by all threads, while each JVM stack is private to its thread.
PS: In Java, are parameters passed by value or by address?
Function parameters in C and C++ can be passed in three forms: by value, by address (pointer), and by reference. In Java, however, there is only one way to pass method parameters: by value. Passing by value means that a copy of the actual argument's value is passed into the method, so the argument itself is not affected.
To make this clear, two points must be understood first:
1. A reference is a data type in Java, just like the primitive type int.
2. A program always runs on the JVM stack. Therefore, when parameters are passed, only primitive values and object references are passed; the object itself is never passed directly.
On the JVM stack, primitive values and references are handled in the same way: both are passed by value. What looks like passing by reference is really passing the "reference value" by value: the reference value is copied and the copy is assigned to the parameter, exactly as for a primitive type. Inside the called method, however, the program follows the passed reference value to the real object in the JVM heap. A modification made at that point changes the object itself, not the reference, so the data in the JVM heap is modified and the change persists after the method returns.
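The difference between changing the object and changing the copied reference can be checked with a short sketch. The names PassByValueDemo, Box, mutate, reassign, and increment are hypothetical and exist only for this illustration.

```java
class PassByValueDemo {
    static class Box {
        int value;
        Box(int value) { this.value = value; }
    }

    // Follows the copied reference to the heap object and changes that object.
    static void mutate(Box box) {
        box.value = 42;
    }

    // Only redirects the local copy of the reference; the caller's reference is untouched.
    static void reassign(Box box) {
        box = new Box(99);
    }

    // Only changes the local copy of the primitive value.
    static void increment(int n) {
        n++;
    }

    // Returns {n after increment, box.value after reassign, box.value after mutate}.
    static int[] run() {
        Box box = new Box(1);
        int n = 1;
        increment(n);
        reassign(box);
        int afterReassign = box.value;
        mutate(box);
        return new int[] { n, afterReassign, box.value };
    }

    public static void main(String[] args) {
        int[] r = run();
        System.out.println(r[0] + " " + r[1] + " " + r[2]); // prints 1 1 42
    }
}
```

Only mutate has a visible effect for the caller, because it is the only method that writes to the shared heap object instead of to its own copy of the argument.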
For example:
    // DataWrap.java
    package com.demo3;

    public class DataWrap {
        public int a;
        public int b;
    }

    // ReferenceTransferTest.java
    package com.demo3;

    public class ReferenceTransferTest {
        // Swaps the two fields of the object that dw refers to.
        public static void swap(DataWrap dw) {
            int tmp = dw.a;
            dw.a = dw.b;
            dw.b = tmp;
        }

        public static void main(String[] args) {
            DataWrap dw = new DataWrap();
            dw.a = 6;
            dw.b = 9;
            swap(dw);
        }
    }
Corresponding memory diagram:
Appendix: