Http://liuwangshu.cn/java/jvm/1-runtime-data-area.html
Preface
I originally planned to write about Android memory optimization, but that first requires some background on the Java virtual machine, and the Java virtual machine cannot be covered in a few words. So I am starting a Java Virtual Machine series. In this article we look at the structure of the Java virtual machine and its runtime data areas.
1.Java Virtual Machine Overview
Oracle's officially defined Java technology architecture consists of the following components:
- Java programming language
- Java virtual machines for various platforms
- class file format
- Java API Class Library
- Third-party Java class libraries
The Java programming language, the Java virtual machine and the Java API class library are together referred to as the JDK (Java Development Kit), the minimal environment for developing Java programs. The Java SE API subset of the Java API and the Java virtual machine are together referred to as the JRE (Java Runtime Environment), the standard environment for running Java programs.
This shows how important the Java virtual machine is: it is the cornerstone of the entire Java platform and the platform on which compiled Java code runs. You can think of the Java virtual machine as an abstract computer with its own instruction set and a number of runtime data areas.
1.1 Java Virtual machine family
Many readers may assume that the Java virtual machine is a single virtual machine and be surprised that there is a whole family of them, or may take "the Java virtual machine" to mean Oracle's HotSpot VM. Here is a brief introduction to the family. Since the Sun Classic VM shipped with JDK 1.0, released by Sun in 1996, many virtual machines have appeared and disappeared. We only describe the relatively mainstream Java virtual machines that survive today.
HotSpot VM
This is the virtual machine in Oracle JDK and OpenJDK, and the most mainstream and widely used Java virtual machine. Technical articles about the Java virtual machine usually refer to the HotSpot VM unless stated otherwise. The HotSpot VM was not originally developed by Sun; it was designed by Longview Technologies, a small company that Sun acquired in 1997. Sun itself was acquired by Oracle in 2009.
J9 VM
The J9 VM is developed by IBM and is currently the main Java virtual machine that IBM develops. Its market positioning is close to that of the HotSpot VM: it is a multi-purpose virtual machine designed to cover server, desktop and embedded use. The performance of the J9 VM is currently roughly on par with that of the HotSpot VM.
Zing VM
The Zing VM is based on Oracle's HotSpot VM, with many details that affect latency improved. Its three biggest selling points are:
- 1. Low latency: with the "pauseless" C4 GC, GC pauses can be kept below 10 ms, and Java heaps of up to 1 TB are supported;
- 2. Fast warm-up after start-up;
- 3. Manageability: Zing Vision, a monitoring tool integrated into the JVM, has zero overhead and can be left on full-time in production.
1.2 Java Virtual machine execution process
What happens when we execute a Java program? The Java source files are first compiled into class files, which the Java virtual machine then loads and executes.
The Java virtual machine has no inherent connection to the Java language; it is only tied to a particular binary format, the class file.
2.Java Virtual Machine Architecture
The architecture described here is the abstract behavior of the Java virtual machine as defined by the Java Virtual Machine Specification, not the concrete implementation of a particular VM such as HotSpot.
2.1 class file format
Java source files are compiled into class files, which do not depend on any specific hardware or operating system. Each class file holds the definition of exactly one class or interface, but a class or interface does not have to be defined in a file; for example, classes and interfaces can be generated directly by a class loader.
The structure of a class file (ClassFile) is shown below.
ClassFile {
    u4 magic;                                     // magic number, fixed at 0xCAFEBABE; identifies a class file that the Java virtual machine can process
    u2 minor_version;                             // minor version number
    u2 major_version;                             // major version number
    u2 constant_pool_count;                       // constant pool counter
    cp_info constant_pool[constant_pool_count-1]; // constant pool
    u2 access_flags;                              // class- and interface-level access flags
    u2 this_class;                                // class index
    u2 super_class;                               // superclass index
    u2 interfaces_count;                          // interface counter
    u2 interfaces[interfaces_count];              // interface table
    u2 fields_count;                              // field counter
    field_info fields[fields_count];              // field table
    u2 methods_count;                             // method counter
    method_info methods[methods_count];           // method table
    u2 attributes_count;                          // attribute counter
    attribute_info attributes[attributes_count];  // attribute table
}
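As a rough illustration of the first three fields of this structure, the sketch below parses the magic number and version numbers from the first 8 bytes of a class file. The class name ClassFileHeader and its methods are hypothetical helpers written for this article, not part of any JDK API; the only assumptions are that the class file is big-endian and that the fields appear in the order listed above.

```java
import java.nio.ByteBuffer;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper for illustration: reads only magic, minor_version and major_version.
final class ClassFileHeader {
    final int magic;
    final int minorVersion;
    final int majorVersion;

    private ClassFileHeader(int magic, int minorVersion, int majorVersion) {
        this.magic = magic;
        this.minorVersion = minorVersion;
        this.majorVersion = majorVersion;
    }

    static ClassFileHeader parse(byte[] classBytes) {
        if (classBytes.length < 8) {
            throw new IllegalArgumentException("a class file header needs at least 8 bytes");
        }
        // ByteBuffer is big-endian by default, which matches the class file format.
        ByteBuffer in = ByteBuffer.wrap(classBytes);
        int magic = in.getInt();             // u4 magic
        int minor = in.getShort() & 0xFFFF;  // u2 minor_version (read as unsigned)
        int major = in.getShort() & 0xFFFF;  // u2 major_version (read as unsigned)
        return new ClassFileHeader(magic, minor, major);
    }

    boolean isClassFile() {
        return magic == 0xCAFEBABE;
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.out.println("usage: java ClassFileHeader <path-to-.class-file>");
            return;
        }
        ClassFileHeader h = parse(Files.readAllBytes(Paths.get(args[0])));
        System.out.printf("magic=0x%08X minor=%d major=%d%n", h.magic, h.minorVersion, h.majorVersion);
    }
}
```

Run against a class file compiled for Java 8, this should report major version 52. Anything after the version fields (the constant pool and the tables) needs a full parser and is deliberately left out here.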
2.2 Class loader subsystem
The class loader subsystem finds class files and loads them into the Java virtual machine through various class loaders. The Java virtual machine has two kinds of loaders: system loaders and user-defined loaders. There are three system loaders:
- Bootstrap class loader (Bootstrap Class Loader): implemented in C/C++ code, it loads the system classes that the Java virtual machine needs at run time, which are in the {jre_home}/lib directory. The Java virtual machine starts up by having the bootstrap class loader create an initial class. Because this loader is implemented in platform-dependent native C/C++ code, it cannot be accessed from Java code, although we can query whether a class was loaded by it. The bootstrap class loader does not inherit from java.lang.ClassLoader.
- Extension class loader (Extensions Class Loader): loads Java extension classes, which are generally placed in the {jre_home}/lib/ext/ directory and provide functionality beyond the system classes.
- Application class loader (Application Class Loader): loads user code and is the entry point to it. The application class loader has the extension class loader as its parent. When asked to load a class, it first delegates to the extension class loader, which in turn first asks the bootstrap class loader. If a parent loads the class successfully, the resulting class instance is returned directly; only if neither parent can load the class does the application class loader try to load it itself.
A user-defined loader is implemented by inheriting from the java.lang.ClassLoader class.
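The delegation described above can be observed from Java code. The following is a minimal sketch, assuming a standard JDK: LoaderChainDemo and RecordingLoader are hypothetical names, and RecordingLoader is a deliberately useless user-defined loader that only records which class names reach its findClass method, that is, which names its parents could not load. Note that on Java 9 and later the extension class loader has been replaced by a platform class loader, so the printed chain differs between JDK versions.

```java
import java.util.ArrayList;
import java.util.List;

final class LoaderChainDemo {

    // Hypothetical user-defined loader: it never defines a class, it only records requests.
    static final class RecordingLoader extends ClassLoader {
        final List<String> reachedFindClass = new ArrayList<>();

        RecordingLoader(ClassLoader parent) {
            super(parent);
        }

        @Override
        protected Class<?> findClass(String name) throws ClassNotFoundException {
            // loadClass only calls findClass after the parent chain has failed.
            reachedFindClass.add(name);
            throw new ClassNotFoundException(name);
        }
    }

    // Lists the loader and all of its parents; the bootstrap loader is represented by null.
    static List<String> chainOf(ClassLoader start) {
        List<String> names = new ArrayList<>();
        for (ClassLoader cl = start; cl != null; cl = cl.getParent()) {
            names.add(cl.getClass().getName());
        }
        names.add("<bootstrap>");
        return names;
    }

    static boolean parentLoadsSystemClass() {
        RecordingLoader loader = new RecordingLoader(ClassLoader.getSystemClassLoader());
        try {
            return loader.loadClass("java.lang.String") == String.class
                    && loader.reachedFindClass.isEmpty();
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    static boolean unknownClassReachesFindClass() {
        RecordingLoader loader = new RecordingLoader(ClassLoader.getSystemClassLoader());
        try {
            loader.loadClass("no.such.Clazz");
            return false;
        } catch (ClassNotFoundException e) {
            return loader.reachedFindClass.contains("no.such.Clazz");
        }
    }

    public static void main(String[] args) {
        System.out.println("loader chain: " + chainOf(ClassLoader.getSystemClassLoader()));
        // Classes loaded by the bootstrap loader report null as their class loader.
        System.out.println("String loader: " + String.class.getClassLoader());
        System.out.println("parent loads java.lang.String: " + parentLoadsSystemClass());
        System.out.println("unknown class reaches findClass: " + unknownClassReachesFindClass());
    }
}
```

The null result of String.class.getClassLoader() is how Java code can tell that a class came from the bootstrap class loader even though the loader itself is not accessible.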
Besides loading class files into the Java virtual machine, the class loader subsystem is also responsible for verifying the correctness of the loaded classes, allocating and initializing memory for class variables, and resolving symbolic references. These actions must be performed in the following order:
1. Loading: find and load the class file.
2. Linking: verification, preparation, and resolution.
- Verification: ensure that the imported type is correct.
- Preparation: allocate memory for the static fields of the class and initialize them with default values.
- Resolution: convert the symbolic references in the runtime constant pool into direct references.
3. Initialization: initialize the class variables to their correct initial values.
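The difference between preparation (default values) and initialization (the values written in the source) can be made visible with a small experiment. This is a sketch with hypothetical names (InitOrderDemo, Lazy): a static field initializer reads another static field before that field's own initializer has run, so it sees the default value assigned during preparation.

```java
final class InitOrderDemo {
    static final StringBuilder LOG = new StringBuilder();

    static final class Lazy {
        // Static initializers run in textual order, so peek() runs before 'counter' is assigned
        // and observes the default value 0 that preparation gave it.
        static int seenEarly = peek();
        static int counter = 7;

        static {
            LOG.append("init;");
        }

        static int peek() {
            return counter;
        }
    }

    static String run() {
        // A class literal makes the class available but does not trigger initialization.
        String name = Lazy.class.getSimpleName();
        LOG.append("referenced ").append(name).append(';');

        // Reading a non-constant static field is a first active use: initialization runs here.
        int early = Lazy.seenEarly;
        LOG.append("seenEarly=").append(early)
           .append(";counter=").append(Lazy.counter).append(';');
        return LOG.toString();
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

On the first call, the log shows the class being referenced before "init;" appears, and seenEarly is 0 while counter is 7.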
2.3 Data Types
The data types of the Java virtual machine are similar to those of the Java language and fall into two categories: primitive types and reference types. The Java virtual machine expects the compiler to do as much type checking as possible at compile time, so that the virtual machine does not have to repeat it at run time.
2.4 Runtime data areas
Many people divide Java's memory into heap memory (heap) and stack memory (stack). That is not accurate enough; the actual partitioning of Java's memory is considerably more complex.
While executing a Java program, the Java virtual machine divides the memory it manages into several data areas. According to the Java Virtual Machine Specification (Java SE 7 Edition), these are the program counter, the Java virtual machine stack, the native method stack, the Java heap and the method area. We introduce them one by one below.
2.4.1 Program Counter
To keep a program executing continuously, the processor needs some way to determine the address of the next instruction; this is what the program counter does.
The program counter register (Program Counter Register), also called the PC register, is a small memory area. In the virtual machine's conceptual model, the bytecode interpreter works by changing the program counter to select the next bytecode instruction to execute. Multithreading in the Java virtual machine is implemented by switching between threads and allocating processor time among them, so at any given moment one processor executes instructions from only one thread. To resume at the correct position after a thread switch, each thread has its own program counter; the program counter is therefore thread-private. If the thread is executing a non-native method, the program counter holds the address of the bytecode instruction being executed; if it is executing a native method, the value of the program counter is undefined. The program counter is the only data area for which the Java Virtual Machine Specification does not specify any OutOfMemoryError condition.
2.4.2 Java Virtual machine stack
Each Java virtual machine thread has its own Java virtual machine stack (Java Virtual Machine Stack), which is thread-private. It is created together with the thread and has the same life cycle. The Java virtual machine stack stores the state of the Java method calls in a thread, including local variables, parameters, return values and intermediate results. It contains multiple stack frames, and each stack frame stores information such as the local variable table, the operand stack, dynamic linking information and the method exit. When a thread calls a Java method, the virtual machine pushes a new stack frame onto the thread's Java stack; when the method completes, the frame is popped. What we normally call stack memory is the Java virtual machine stack.
When the program code is compiled, the size of the local variable table and the maximum depth of the operand stack needed by a stack frame are fully determined and written into the Code attribute of the method table. The amount of memory a stack frame needs therefore does not depend on run-time data, only on the specific virtual machine implementation.
The Java Virtual Machine Specification defines two exceptional conditions:
- If the stack depth a thread requests exceeds the maximum the Java virtual machine allows, the Java virtual machine throws a StackOverflowError.
- If the Java virtual machine stack can be dynamically expanded (most Java virtual machines allow this) but the expansion cannot obtain enough memory, or if there is not enough memory to create the Java virtual machine stack for a new thread, an OutOfMemoryError is thrown.
The two conditions overlap: when stack space cannot be allocated any further, whether the memory is too small or the stack space in use is too large is essentially two descriptions of the same thing. With a single thread, whether the stack frames are too large or the virtual machine stack is too small, the virtual machine throws a StackOverflowError when stack space runs out, not an OutOfMemoryError. In a multithreaded environment, creating too many threads can lead to an OutOfMemoryError.
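The single-threaded case is easy to reproduce. The sketch below, using the hypothetical class name StackDepthProbe, recurses without a termination condition until the virtual machine throws a StackOverflowError, and reports how many frames were pushed. The depth reached is not a fixed number: it depends on the frame size, the virtual machine and the configured thread stack size (on HotSpot, the -Xss option).

```java
final class StackDepthProbe {
    private static int depth;

    // Each call pushes one more stack frame; there is intentionally no base case.
    private static void recurse() {
        depth++;
        recurse();
    }

    // Returns the number of frames pushed before the stack was exhausted.
    static int maxDepth() {
        depth = 0;
        try {
            recurse();
        } catch (StackOverflowError expected) {
            // The frames have been unwound by the time we get here, so it is safe to continue.
        }
        return depth;
    }

    public static void main(String[] args) {
        System.out.println("frames before StackOverflowError: " + maxDepth());
    }
}
```

Running it with a smaller stack, for example `java -Xss256k StackDepthProbe`, should report a noticeably smaller depth.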
The function and data structure of each part of a stack frame are described in detail below.
1. Local variable table
The local variable table is a set of storage slots for variable values; it holds the method's parameters and the local variables defined inside the method. The data stored there are the primitive types, object references (reference) and the returnAddress type (which points to the address of a bytecode instruction), all known at compile time. The space the local variable table needs is determined at compile time: when a Java program is compiled into a class file, the maximum capacity of the local variable table a method needs is fixed. When a method is entered, the amount of local variable space it needs in the frame is fully determined, and the size of the local variable table does not change while the method runs.
The capacity of the local variable table is measured in variable slots (slot), the smallest unit. The virtual machine specification does not state exactly how much memory a slot occupies (it may vary with the processor, operating system or virtual machine), but a slot can hold any data type of 32 bits or fewer: boolean, byte, char, short, int, float, reference and returnAddress. A reference is an object reference; returnAddress serves the bytecode instructions jsr, jsr_w and ret and points to the address of a bytecode instruction. For the 64-bit data types (long and double), the virtual machine allocates two consecutive slots.
The virtual machine accesses the local variable table by index. Index values start at 0 and run up to, but do not include, the number of slots in the table. For a variable of a 32-bit data type, index n refers to the nth slot; for a 64-bit type, index n refers to the two slots n and n+1.
When a method is invoked, the virtual machine uses the local variable table to pass argument values to the parameter list. For an instance method (one that is not static), the slot at index 0 holds by default a reference to the object instance the method belongs to, and the method can access this implicit parameter through the keyword "this". The remaining parameters follow in declaration order, occupying slots starting from 1. Once the parameters have been allocated, the rest of the slots are assigned to the variables defined in the method body according to their order and scope.
Slots in the local variable table are reusable. The scope of a variable defined in the method body does not necessarily cover the whole method body; once the bytecode PC counter has moved past a variable's scope, that variable's slot can be reused by other variables. This design does more than save space: in some cases slot reuse directly affects the garbage collection behavior of the system.
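To make the slot layout concrete, here is a small sketch with the slot assignments written as comments. SlotLayout is a hypothetical class; the parameter slots follow directly from the rules above, while the slots chosen for body-local variables are what a compiler such as javac would typically do and are an assumption, not something the code itself can check. The real numbers can be inspected with `javap -v SlotLayout` after compiling.

```java
final class SlotLayout {

    // Instance method: slot 0 holds 'this', parameters follow in declaration order.
    //   slot 1     : a (int, one slot)
    //   slots 2-3  : b (long, two slots)
    //   slots 4-5  : c (double, two slots)
    long sum(int a, long b, double c) {
        long total = a + b + (long) c;   // typically slots 6-7
        {
            int doubled = a * 2;         // typically slot 8, only live inside this block
            total += doubled;
        }
        {
            int tripled = a * 3;         // its scope does not overlap 'doubled', so slot 8 can be reused
            total += tripled;
        }
        return total;
    }

    // Static method: there is no 'this', so the first parameter starts at slot 0.
    static int twice(int x) {
        return x * 2;
    }

    public static void main(String[] args) {
        System.out.println(new SlotLayout().sum(1, 2L, 3.0));
        System.out.println(twice(21));
    }
}
```

In the javap output, the `locals=` value of each method's Code attribute is the compile-time capacity of its local variable table.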
2. Operand stack
The operand stack is also often called the operation stack, and its maximum depth is determined at compile time. A 32-bit data type occupies one unit of stack capacity and a 64-bit data type occupies two. When a method starts executing, its operand stack is empty. During execution, various bytecode instructions (for example addition or assignment) write to and read from the operand stack, that is, they push and pop values.
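The push and pop behavior can be imitated with a toy interpreter. This is only a sketch of the idea, not how any real virtual machine is implemented: TinyStackMachine is a hypothetical class, and its textual instructions "push N", "iadd" and "imul" merely borrow the names of real bytecode instructions. For comparison, a method body `return (a + b) * c;` on three int parameters of a static method typically compiles to iload_0, iload_1, iadd, iload_2, imul, ireturn.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class TinyStackMachine {

    // Executes the instructions in order and returns the value left on top of the operand stack.
    static int eval(String... program) {
        Deque<Integer> operands = new ArrayDeque<>();
        for (String instruction : program) {
            if (instruction.startsWith("push ")) {
                operands.push(Integer.parseInt(instruction.substring("push ".length())));
            } else if (instruction.equals("iadd")) {
                int right = operands.pop();   // operands are popped in reverse order
                int left = operands.pop();
                operands.push(left + right);  // the result goes back onto the stack
            } else if (instruction.equals("imul")) {
                int right = operands.pop();
                int left = operands.pop();
                operands.push(left * right);
            } else {
                throw new IllegalArgumentException("unknown instruction: " + instruction);
            }
        }
        return operands.pop();
    }

    public static void main(String[] args) {
        // (1 + 2) * 3
        System.out.println(eval("push 1", "push 2", "iadd", "push 3", "imul"));
    }
}
```

The stack never holds more than two values while evaluating this program, which corresponds to the maximum operand stack depth that a compiler would record for the method.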
The interpreting execution engine of the Java virtual machine is called a "stack-based execution engine", where the "stack" is the operand stack. So we also say that the Java virtual machine is stack-based, unlike the Android virtual machine, which is register-based.
The main advantage of a stack-based instruction set is portability; its main disadvantage is relatively slow execution. Because registers are provided directly by the hardware, a register-based instruction set has fast execution as its main advantage and poor portability as its main disadvantage.
3. Dynamic linking
Each stack frame contains a reference to the method it belongs to in the runtime constant pool (described later, under the method area); this reference is held to support dynamic linking during method invocation. The constant pool of a class file contains a large number of symbolic references, and the method invocation instructions in the bytecode take symbolic references to methods in the constant pool as operands. Some of these symbolic references are converted to direct references during class loading or on first use (for example, for final and static methods); this is called static resolution. The others are converted to direct references at each run; this is called dynamic linking.
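The two kinds of conversion show up in ordinary Java code as the difference between static and virtual method calls. In the sketch below (DispatchDemo, Animal and Dog are hypothetical names), the call to the static method is bound to Animal.kind regardless of any object, while the target of speak() is selected at run time from the actual class of the receiver.

```java
final class DispatchDemo {

    static class Animal {
        // Static method: the call target is fixed when the symbolic reference is resolved.
        static String kind() {
            return "animal";
        }

        // Instance method: the target is chosen per call from the receiver's run-time class.
        String speak() {
            return "...";
        }
    }

    static class Dog extends Animal {
        // Hides Animal.kind; static methods are not overridden.
        static String kind() {
            return "dog";
        }

        @Override
        String speak() {
            return "woof";
        }
    }

    static String run() {
        Animal a = new Dog();
        return Animal.kind() + "/" + a.speak();
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

The variable is declared as Animal, yet speak() runs the Dog version, whereas the static call stays with Animal.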
4. Method return address
Once a method starts executing, there are only two ways to exit it: the execution engine encounters one of the return bytecode instructions, or an exception is raised and not handled within the method body. Whichever way it exits, execution must return to the location where the method was called before the program can continue. When the method returns, some information may need to be saved in the stack frame to help restore the execution state of the calling method. In general, when a method exits normally, the caller's PC counter value can serve as the return address, and the stack frame is likely to store this value. When a method exits because of an exception, the return address is determined by the exception handler, and the stack frame generally does not store this information.
Exiting a method is in effect popping the current stack frame, so the actions that may be performed on exit are: restoring the caller's local variable table and operand stack; pushing the return value, if there is one, onto the operand stack of the caller's frame; and adjusting the PC counter to point to the instruction after the method invocation instruction.
2.4.3 Native Method Stack
A Java virtual machine implementation may use conventional stacks, often called C stacks, to support native methods (methods written in a language other than Java); these C stacks are the native method stack (Native Method Stack). It is similar to the Java virtual machine stack, except that it serves native methods. If a Java virtual machine does not support native methods and does not itself rely on C stacks, it need not provide native method stacks. The Java Virtual Machine Specification does not mandate the language or data structure used for the native method stack, so a virtual machine can implement it freely; the HotSpot VM, for example, combines the native method stack with the Java virtual machine stack.
Like the Java virtual machine stack, the native method stack can throw StackOverflowError and OutOfMemoryError.
2.4.4 Java Heap
The Java heap (Java Heap) is a run-time memory area shared by all threads. It holds object instances; almost all object instances are allocated here. Objects in the Java heap are managed by the garbage collector and never need to be destroyed explicitly. From the point of view of memory reclamation, the Java heap can be roughly divided into the young generation and the old generation. From the point of view of memory allocation, the Java heap may be divided into multiple thread-private allocation buffers. However it is partitioned, what the heap stores does not change: it still holds object instances. The partitioning exists only to reclaim or allocate memory faster.
The capacity of the Java heap can be fixed or dynamically expandable. The memory the Java heap uses need not be physically contiguous, as long as it is logically contiguous.
The Java Virtual Machine Specification defines one exceptional condition:
- An OutOfMemoryError is thrown if there is not enough memory in the heap to complete an instance allocation and the heap cannot be expanded.
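How much heap the virtual machine may use, and how much it has reserved so far, can be queried at run time through java.lang.Runtime. The sketch below wraps those calls in a hypothetical HeapInfo class; the values depend entirely on the machine and on the heap settings the virtual machine was started with (on HotSpot, the -Xms and -Xmx options).

```java
final class HeapInfo {

    // Upper limit the heap may grow to.
    static long maxBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    // Heap memory currently reserved by the virtual machine; may grow up to maxBytes().
    static long committedBytes() {
        return Runtime.getRuntime().totalMemory();
    }

    // Approximate amount of the reserved heap that is occupied by objects right now.
    static long usedBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        long mb = 1024 * 1024;
        System.out.println("max heap:       " + maxBytes() / mb + " MB");
        System.out.println("committed heap: " + committedBytes() / mb + " MB");
        System.out.println("used heap:      " + usedBytes() / mb + " MB");
    }
}
```

When an allocation needs more than the maximum can provide, the heap cannot be expanded any further and the OutOfMemoryError described above is thrown.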
The Java heap can be divided into two areas, the young generation and the old generation. The young generation is further divided into one Eden space and two Survivor spaces; to tell them apart, the two Survivor spaces are named From and To. By default the ratio of the young generation to the old generation is 1:2, and together they make up the heap, so the young generation takes 1/3 and the old generation 2/3; this ratio can be changed. The young generation and the old generation are introduced in turn below.
1. Young generation
The young generation is divided into three areas, one Eden space and two Survivor spaces, with a default ratio of 8:1:1 that can also be changed. Objects are normally allocated in Eden, and in rare cases directly in the old generation. At any time the Java virtual machine uses Eden and one of the Survivor spaces (From). After a minor GC, the surviving objects in Eden and From are copied in one pass to the other Survivor space, To (the GC algorithm used here is the copying algorithm), and finally Eden and the From space that was just in use are cleared. An object that survives into a Survivor space has its age set to 1; each further GC it survives in the Survivor area increases its age by 1, and when its age reaches a threshold (15 by default) it is moved to the old generation.
During a young generation GC, the other Survivor space may not have enough room for all the surviving objects; these objects then go directly into the old generation through the allocation guarantee mechanism.
Summary:
1. Minor GC is garbage collection of the young generation and uses the copying algorithm;
2. At any time no more than 90% of the young generation is in use; it mainly stores newly created objects;
3. After each minor GC, Eden and one Survivor space are emptied.
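The ratios above translate into concrete sizes with a little arithmetic. The sketch below is only that arithmetic, under the assumption that the ratios are expressed the way HotSpot's -XX:NewRatio (old generation : young generation) and -XX:SurvivorRatio (Eden : one Survivor space) options express them; the class GenerationSizes is hypothetical, and a real collector also applies alignment and may resize the areas adaptively.

```java
final class GenerationSizes {
    final long young;
    final long old;
    final long eden;
    final long survivor;   // size of ONE survivor space

    private GenerationSizes(long young, long old, long eden, long survivor) {
        this.young = young;
        this.old = old;
        this.eden = eden;
        this.survivor = survivor;
    }

    // newRatio = old : young (2 means 1:2), survivorRatio = eden : survivor (8 means 8:1:1).
    static GenerationSizes of(long heapBytes, int newRatio, int survivorRatio) {
        long young = heapBytes / (newRatio + 1);
        long old = heapBytes - young;
        long survivor = young / (survivorRatio + 2);
        long eden = young - 2 * survivor;
        return new GenerationSizes(young, old, eden, survivor);
    }

    // Only Eden and one survivor space are in use at a time; the other survivor space stays empty.
    long usableYoung() {
        return eden + survivor;
    }

    public static void main(String[] args) {
        long mb = 1024 * 1024;
        GenerationSizes s = of(600 * mb, 2, 8);
        System.out.println("young=" + s.young / mb + "MB old=" + s.old / mb + "MB eden="
                + s.eden / mb + "MB survivor=" + s.survivor / mb + "MB usable young="
                + s.usableYoung() / mb + "MB");
    }
}
```

For a 600 MB heap this gives a 200 MB young generation (160 MB Eden plus two 20 MB Survivor spaces) and a 400 MB old generation, with 180 MB, or 90%, of the young generation usable at any one time.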
2. Old generation
The old generation holds objects with long life cycles. Some larger objects (those that need a large contiguous block of memory) are placed directly in the old generation; many of its objects are promoted from the Survivor spaces of the young generation.
The old generation is collected by full GC, which uses the mark-sweep algorithm. Full GC in the old generation is less frequent than minor GC, and a full GC takes longer than a minor GC.
Summary:
1. The old generation is collected by full GC, which uses the mark-sweep algorithm.
2.4.5 Method Area
The method area is a run-time memory area shared by all threads. It stores the structural information of classes that have been loaded by the Java virtual machine, including the runtime constant pool, field and method information, static variables and other data. The method area is logically part of the Java heap. It need not be physically contiguous, and an implementation may choose not to garbage collect it. The method area is not equivalent to the permanent generation; the two are often equated only because the HotSpot VM uses the permanent generation to implement the method area. Other Java virtual machines, such as J9 and JRockit, have no concept of a permanent generation.
The Java Virtual Machine Specification defines one exceptional condition:
- If the method area cannot satisfy a memory allocation request, the Java virtual machine throws an OutOfMemoryError.
Runtime constant pool
The runtime constant pool is part of the method area. In section 2.1 on the class file format, we saw that a class file contains not only information such as the version, interfaces, fields and methods of the class, but also a constant pool. The constant pool holds the literals and symbolic references generated at compile time, which are stored in the runtime constant pool of the method area after the class is loaded. The runtime constant pool can be understood as the run-time representation of the constant pool of a class or interface.
The Java Virtual Machine Specification defines one exceptional condition:
- When a class or interface is created, the Java virtual machine throws an OutOfMemoryError if constructing the runtime constant pool requires more memory than the method area can provide.
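One visible effect of literals being kept as constants is the behavior of string literals. The sketch below, with the hypothetical class name ConstantPoolDemo, relies only on what the Java language guarantees: string literals with the same content refer to the same interned String instance, and String.intern returns that shared instance. Where the interned strings physically live is an implementation detail that differs between virtual machines and versions.

```java
final class ConstantPoolDemo {

    // Two identical literals resolve to one shared String instance.
    static boolean literalsShared() {
        String a = "jvm";
        String b = "jvm";
        return a == b;
    }

    // 'new String' always creates a separate object on the heap, even with equal content.
    static boolean newStringIsDistinct() {
        String a = "jvm";
        String b = new String("jvm");
        return a != b && a.equals(b);
    }

    // intern() maps an equal string back to the shared instance used by the literal.
    static boolean internReturnsShared() {
        return new String("jvm").intern() == "jvm";
    }

    public static void main(String[] args) {
        System.out.println(literalsShared());
        System.out.println(newStringIsDistinct());
        System.out.println(internReturnsShared());
    }
}
```

All three methods return true.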
Direct memory
Direct memory is not part of the virtual machine's runtime data areas, nor is it a memory area defined in the Java Virtual Machine Specification. It is allocated directly from the operating system, so it is not limited by the Java heap size, but it is limited by the machine's total memory and the processor's address space, and can therefore also cause an OutOfMemoryError. JDK 1.4 introduced NIO, a new I/O approach based on channels and buffers, which can allocate direct memory from the operating system, that is, memory outside the heap. This can improve performance in some scenarios because it avoids copying data back and forth between the Java heap and the native heap. Detailed usage of NIO is covered in the NIO articles of my Java network programming series.
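A minimal sketch of allocating direct memory through NIO follows. DirectBufferDemo is a hypothetical class; ByteBuffer.allocateDirect and ByteBuffer.allocate are the standard NIO calls for buffers outside and inside the Java heap respectively. On HotSpot the total amount of direct memory can be capped with the -XX:MaxDirectMemorySize option.

```java
import java.nio.ByteBuffer;

final class DirectBufferDemo {

    // Allocates a buffer whose storage lives outside the Java heap.
    static ByteBuffer allocate(int bytes) {
        return ByteBuffer.allocateDirect(bytes);
    }

    // Writes an int into direct memory and reads it back.
    static int roundTrip(int value) {
        ByteBuffer buffer = ByteBuffer.allocateDirect(Integer.BYTES);
        buffer.putInt(value);
        buffer.flip();          // switch from writing to reading
        return buffer.getInt();
    }

    public static void main(String[] args) {
        ByteBuffer direct = allocate(1024);
        ByteBuffer onHeap = ByteBuffer.allocate(1024);
        System.out.println("direct buffer is direct:  " + direct.isDirect());
        System.out.println("on-heap buffer is direct: " + onHeap.isDirect());
        System.out.println("round trip: " + roundTrip(42));
    }
}
```

The buffer object itself is a small heap object; only the memory it refers to is allocated outside the heap.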
Resources
Understanding the Java Virtual Machine in Depth, 2nd edition
Java Virtual Machine Specification (Java SE 7 Edition)
Understanding the Java virtual machine architecture
What are the current mainstream Java virtual machines? (Zhihu)
JVM runtime data area analysis
Schematic diagram of the Java virtual machine
Exploring the Java class loader in depth
Java virtual machine: structure, principles and runtime data areas