Chapter 6: Bytecode execution methods: interpreted execution and JIT compilation
NOTE: This chapter draws on "Distributed Java Applications: Fundamentals and Practice" and "Understanding the Java Virtual Machine in Depth (2nd edition)".
1. Two execution methods:
- Interpreted execution (the bytecode is interpreted and executed at runtime)
- Compiled execution (the bytecode is compiled into machine code, which is then executed; this compilation happens at runtime and is called JIT compilation)
  - To force this mode: -Xcomp. Two compilers are available:
    - Client (C1): performs only a few optimizations, with low overhead and low memory use; suited to desktop programs.
    - Server (C2): performs many optimizations and uses more memory; suited to server programs. It collects a large amount of runtime information.
Note:
- On a 32-bit machine the default is C1; pass -client or -server at startup to choose. On a 64-bit machine the default is C2 if the machine has at least 2 CPUs and at least 2 GB of physical memory; otherwise it is C1.
- HotSpot JVM execution mechanism: code that runs frequently is compiled during execution, while code that runs infrequently continues to be interpreted.
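The mixed mode described above can be observed with a small program that has one obviously hot method. This is only a minimal sketch: the class HotLoopDemo and its methods are hypothetical names made up for illustration, and the timings it prints will differ from machine to machine.

```java
// Hypothetical demo class; not part of the original notes.
class HotLoopDemo {
    // A small method that becomes "hot" because it is called many times.
    static long square(int n) {
        return (long) n * n;
    }

    // Calls square() once per iteration and sums the results.
    static long sumOfSquares(int iterations) {
        long sum = 0;
        for (int i = 0; i < iterations; i++) {
            sum += square(i);
        }
        return sum;
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        long result = sumOfSquares(1_000_000);
        long elapsedMicros = (System.nanoTime() - start) / 1_000;
        System.out.println("result = " + result + ", elapsed = " + elapsedMicros + " us");
    }
}
```

Running it with `java -Xint HotLoopDemo` (interpreter only), `java -Xcomp HotLoopDemo` (forced compilation), and the default mode gives a rough feel for the trade-off. Adding `-XX:+PrintCompilation` lists methods as the JIT compiles them.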
2. Interpreted execution
See the class file structure in Chapter 3 and the walkthrough of the inc() method in the javap usage notes.
Alternatively, see Understanding the Java Virtual Machine in Depth (2nd edition), P272-P275.
3. Compiled execution
- What gets compiled
  - A method
  - A loop body within a method
    - OSR (on-stack replacement) compilation: the entire code segment is compiled, but only the loop part runs as machine code.
- Trigger condition (the execution frequency exceeds a threshold)
  - Method invocation counter: the number of times a method is called
    - Defaults: client 1500, server 10000
    - The threshold can be set with -XX:CompileThreshold.
    - "Number of calls" here means the number of calls within a period of time (the half-life period). If the count has not reached the threshold by the end of a half-life period, it is halved.
    - -XX:-UseCounterDecay: disables this decay, which is equivalent to an infinite half-life period.
    - -XX:CounterHalfLifeTime: sets the half-life period.
  - Back-edge counter: the number of times the code in a loop body is executed (that is, the number of loop iterations)
    - Defaults: client 13995, server 10700
    - The threshold is adjusted through -XX:OnStackReplacePercentage (OSRP below; note that OSRP is only an intermediate value used to calculate the back-edge counter threshold).
      - Client: CompileThreshold * OSRP / 100
      - Server: CompileThreshold * (OSRP - InterpreterProfilePercentage) / 100
      - Server defaults: OnStackReplacePercentage = 140, InterpreterProfilePercentage = 33 (the client figure of 13995 corresponds to OSRP = 933)
- Method compilation and execution
  - When the interpreter calls a method, it first checks whether a compiled version exists. If so, it runs the machine code. If not, it increments the method invocation counter and checks whether the counter exceeds the threshold. If it does, a compilation request is submitted and a background thread compiles the code, while the foreground thread keeps interpreting (it does not block). On the next call to the method, the machine code runs if compilation has finished; otherwise the method is interpreted again.
- Loop body compilation and execution
  - When the interpreter reaches a loop body, it first checks whether a compiled version exists. If so, it runs the machine code. If not, it increments the back-edge counter and checks whether the counter exceeds the threshold. If it does, a background thread compiles the code, while the foreground thread keeps interpreting (it does not block). The next time execution reaches the loop body, the machine code runs if compilation has finished; otherwise the loop body is interpreted again.
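The threshold formulas and the check-counter-then-compile flow above can be sketched as a toy model. Everything here is an assumption made for illustration: the class JitCounterSketch and its methods are hypothetical, the compiler thread is replaced by an explicit finishCompilation() call so the behavior is deterministic, and real HotSpot logic is considerably more involved. The client percentage 933 is inferred from the 13995 figure above rather than stated in the text.

```java
// Toy model of a counter-based compilation trigger (hypothetical, simplified).
class JitCounterSketch {
    // Client back-edge threshold: CompileThreshold * OSRP / 100
    static int clientBackEdgeThreshold(int compileThreshold, int osrPercentage) {
        return compileThreshold * osrPercentage / 100;
    }

    // Server back-edge threshold:
    // CompileThreshold * (OSRP - InterpreterProfilePercentage) / 100
    static int serverBackEdgeThreshold(int compileThreshold, int osrPercentage,
                                       int profilePercentage) {
        return compileThreshold * (osrPercentage - profilePercentage) / 100;
    }

    private final int threshold;
    private int invocationCounter = 0;
    private boolean compileRequested = false;
    private boolean compiled = false;

    JitCounterSketch(int threshold) {
        this.threshold = threshold;
    }

    // Simulates one call and reports how that call was executed.
    String invoke() {
        if (compiled) {
            return "machine code";
        }
        invocationCounter++;
        if (invocationCounter > threshold) {
            compileRequested = true; // handed off; the caller does not wait
        }
        return "interpreted";
    }

    // Stands in for the background compiler thread finishing its work.
    void finishCompilation() {
        if (compileRequested) {
            compiled = true;
        }
    }

    // Counter decay: halve the counter at the end of a half-life period.
    void decay() {
        invocationCounter /= 2;
    }

    int counter() {
        return invocationCounter;
    }

    public static void main(String[] args) {
        System.out.println(clientBackEdgeThreshold(1500, 933));      // 13995
        System.out.println(serverBackEdgeThreshold(10000, 140, 33)); // 10700

        JitCounterSketch method = new JitCounterSketch(3); // tiny threshold for the demo
        for (int call = 1; call <= 5; call++) {
            System.out.println("call " + call + ": " + method.invoke()); // interpreted
        }
        method.finishCompilation();
        System.out.println("call 6: " + method.invoke()); // machine code
    }
}
```

Calls 4 and 5 are still interpreted even though the threshold has been crossed, which is the non-blocking behavior described above: the caller keeps interpreting until the compiled version is ready.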
4. C1 optimization
(For the full list of optimization techniques, see Understanding the Java Virtual Machine in Depth (2nd edition), P346-P347.)
C1 performs only a small number of optimizations, keeping performance overhead and memory usage low. The main ones are:
- Method inlining
- Redundancy elimination
- Copy propagation
- Dead code elimination
- Class hierarchy analysis (CHA, a supporting analysis)
- Devirtualization
4.1 Method inlining, redundancy elimination, copy propagation, and dead code elimination
4.1.1 Method inlining
What method inlining means: if method A calls method B, the instructions of method B are embedded directly into method A.

    static class B {
        int value;
        final int get() {
            return value;
        }
    }

    public void foo() {
        y = b.get();
        // do something
        z = b.get();
        sum = y + z;
    }

Note: in the code above, b is an instance of B.
After inlining:

    public void foo() {
        y = b.value;
        // do something
        z = b.value;
        sum = y + z;
    }

Condition for inlining:
- The bytecode of get() is no larger than 35 bytes (the default, set with -XX:MaxInlineSize=35)
Why inlining matters:
- It is the first optimization in the series, because many other optimizations build on it
- It eliminates the cost of a method call (no stack frame creation, no parameter passing, no return value passing, no jump)
4.1.2 Redundancy elimination
Redundancy elimination: in the code above, the two reads of b.value are redundant, provided that "do something" does not modify b.value (this is why data must be collected before optimizing).
Assuming "do something" does not modify b.value, the code after redundancy elimination is:

    public void foo() {
        y = b.value;
        // do something
        z = y;
        sum = y + z;
    }
4.1.3 Copy propagation
After redundancy elimination, the JIT analyzes the code above and finds that the variable z is unnecessary (it can be replaced by y everywhere), so it applies copy propagation:

    public void foo() {
        y = b.value;
        // do something
        y = y;
        sum = y + y;
    }
4.1.4 Dead code elimination
After copy propagation, "y = y" is dead code, so dead code elimination can remove it:

    public void foo() {
        y = b.value;
        // do something
        sum = y + y;
    }

Note that the dead code elimination here builds on the three preceding optimizations. The dead code elimination that javac performs during its semantic analysis phase only removes code that is written as dead in the source (for example, if (false) {}).
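The four stages can be written out by hand to check that each one preserves the result. This is a sketch under stated assumptions: the JIT applies these transformations to its internal representation rather than to source code, the class InlineStepsDemo and its method names are hypothetical, and the "do something" step is left out so the premise that it does not modify b.value holds trivially.

```java
// Hand-written equivalents of each optimization stage (illustrative only).
class InlineStepsDemo {
    static class B {
        int value;

        B(int value) {
            this.value = value;
        }

        final int get() {
            return value;
        }
    }

    // Original form: two calls to get().
    static int original(B b) {
        int y = b.get();
        int z = b.get();
        return y + z;
    }

    // After method inlining: the calls become direct field reads.
    static int afterInlining(B b) {
        int y = b.value;
        int z = b.value;
        return y + z;
    }

    // After redundancy elimination: the second field read reuses y.
    static int afterRedundancyElimination(B b) {
        int y = b.value;
        int z = y;
        return y + z;
    }

    // After copy propagation and dead code elimination: z is gone entirely.
    static int afterCopyPropagationAndDeadCodeElimination(B b) {
        int y = b.value;
        return y + y;
    }

    public static void main(String[] args) {
        B b = new B(21);
        System.out.println(original(b));                                   // 42
        System.out.println(afterInlining(b));                              // 42
        System.out.println(afterRedundancyElimination(b));                 // 42
        System.out.println(afterCopyPropagationAndDeadCodeElimination(b)); // 42
    }
}
```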
4.2 Class hierarchy analysis and devirtualization

    public interface Animal {
        public void eat();
    }

    public class Cat implements Animal {
        public void eat() {
            System.out.println("cat eat fish");
        }
    }

    public class Test {
        public void methodA(Animal animal) {
            animal.eat();
        }
    }

First, the JIT analyzes the whole type hierarchy of Animal and finds only one implementing class, Cat. The code in methodA(Animal animal) can then be optimized to:

    public void methodA(Animal animal) {
        System.out.println("cat eat fish");
    }

However, if class hierarchy analysis later finds, while the program is running, that Animal has gained another implementing class, Dog, the previously optimized machine code is no longer executed and the method is interpreted instead. This is the deoptimization described next.
Deoptimization:
When compiled machine code no longer satisfies the conditions under which it was optimized, the corresponding part falls back to interpreted execution.
For example, with devirtualization: if more than one implementing class is found after compilation, deoptimization is required.
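A small program can set up the situation in which devirtualization followed by deoptimization may occur. This is a hedged sketch: DevirtualizationDemo and Dog's return string are hypothetical, eat() returns a string instead of printing so the warm-up loop stays quiet, and whether the JIT actually devirtualizes and later deoptimizes depends on the JVM version and settings. The program's visible output is the same either way.

```java
// Sets up a call site with one implementation, then introduces a second one.
class DevirtualizationDemo {
    interface Animal {
        String eat();
    }

    static class Cat implements Animal {
        public String eat() {
            return "cat eat fish";
        }
    }

    static class Dog implements Animal {
        public String eat() {
            return "dog eat bone";
        }
    }

    // The virtual call site that CHA may bind directly to Cat.eat().
    static String methodA(Animal animal) {
        return animal.eat();
    }

    // Makes methodA hot while Cat is the only implementation in use.
    static int warmUp(int iterations) {
        Animal cat = new Cat();
        int totalLength = 0;
        for (int i = 0; i < iterations; i++) {
            totalLength += methodA(cat).length();
        }
        return totalLength;
    }

    public static void main(String[] args) {
        System.out.println("warm-up total: " + warmUp(100_000));
        // Dog is typically loaded only here, on first use. A compiled methodA
        // that assumed a single implementation would then have to be discarded.
        System.out.println(methodA(new Dog()));
        System.out.println(methodA(new Cat()));
    }
}
```

Running with `-XX:+PrintCompilation` may show an entry for methodA marked "made not entrant" around the point where Dog first appears, which is how HotSpot reports a compiled method being invalidated.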
5. C2 optimization
C2 performs a large number of optimizations and uses more memory, which suits server programs. In addition to the C1 optimizations, it applies many others.
Escape analysis (a supporting analysis):
Enable with: -XX:+DoEscapeAnalysis
Based on runtime state, escape analysis determines whether a variable in a method can be accessed by other methods or by other threads. If not, the variable does not escape. Based on this, C2 does the following during compilation:
- Scalar replacement: enabled with -XX:+EliminateAllocations
- Stack allocation
- Synchronization elimination: enabled with -XX:+EliminateLocks
5.1 Scalar replacement
Meaning: break a Java object apart and treat the fields that the program actually uses as separate scalar values.

    Point point = new Point(1, 2);
    System.out.println("point.x: " + point.x + ", point.y: " + point.y);
    // do after

If no code in "// do after" (that is, everything after the first two statements) accesses the point object, the object is split up and replaced with scalars:

    int x = 1;
    int y = 2;
    System.out.println("point.x: " + x + ", point.y: " + y);

Benefits:
- If not all of the fields defined in the object are used, scalar replacement saves memory.
- At run time there is no need to follow an object reference, so execution is faster.
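A runnable version of the Point example makes it possible to compare runs with escape analysis on and off. This is a minimal sketch: ScalarReplacementDemo and its methods are hypothetical names, and sumCoordinatesScalar is a hand-written stand-in for what scalar replacement would produce, not actual JIT output.

```java
// A Point that never escapes its method, plus a hand-written scalar equivalent.
class ScalarReplacementDemo {
    static final class Point {
        final int x;
        final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }
    }

    // point is created, read, and dropped inside this method: it does not escape.
    static int sumCoordinates(int x, int y) {
        Point point = new Point(x, y);
        return point.x + point.y;
    }

    // What the method effectively becomes if the object is replaced by scalars.
    static int sumCoordinatesScalar(int x, int y) {
        return x + y;
    }

    // Calls sumCoordinates often enough for the JIT to consider it hot.
    static long run(int iterations) {
        long total = 0;
        for (int i = 0; i < iterations; i++) {
            total += sumCoordinates(i, 1);
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(run(10_000_000));
    }
}
```

Comparing `java -verbose:gc -XX:+DoEscapeAnalysis ScalarReplacementDemo` with the same command using `-XX:-DoEscapeAnalysis` may show a difference in garbage collection activity, though the outcome depends on the JVM version and heap settings.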
5.2 Stack allocation
Meaning: if a variable in a method is determined not to escape the current method (that is, it is not referenced by other methods), it can be allocated directly on the stack. When the method finishes, the stack frame disappears and the variable disappears with it, which reduces GC pressure.
Benefits:
- At run time there is no need to follow an object reference to find the object in the heap, so execution is faster.
- Because the variable lives on the stack, it disappears with the stack frame when the method finishes, which reduces GC pressure.
5.3 Synchronization elimination
Meaning: if a variable in a method does not escape the current thread (that is, it is not used by other threads), synchronization on that variable is removed. For example:

    synchronized (cat) {
        // do xxx
    }

If cat does not escape the current thread, the synchronized block can be removed, leaving:

    // do xxx
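Two common shapes of code where the lock object stays local to one thread are sketched below. The class LockEliminationDemo and its methods are hypothetical; the only library fact relied on is that StringBuffer.append is a synchronized method. Whether the JIT actually removes the locks depends on the JVM and on -XX:+EliminateLocks, and the results are identical either way.

```java
// Locks on objects that never leave the current thread.
class LockEliminationDemo {
    // sb is local and never escapes, so the locking inside each synchronized
    // append() call protects nothing and is a candidate for elimination.
    static String concat(String a, String b) {
        StringBuffer sb = new StringBuffer();
        sb.append(a);
        sb.append(b);
        return sb.toString();
    }

    // An explicit synchronized block on a lock that no other thread can see.
    static int incrementUnderLocalLock(int n) {
        Object lock = new Object();
        synchronized (lock) {
            return n + 1;
        }
    }

    public static void main(String[] args) {
        System.out.println(concat("cat", "fish"));      // catfish
        System.out.println(incrementUnderLocalLock(1)); // 2
    }
}
```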
Summary:
Interpreter:
- Programs start faster, because there is no wait for compilation
- Saves memory (nothing is compiled, so no compiled machine code has to be stored)
JIT compiler:
- Once the program has been running for a while, hot code executes faster
Note:
- The reason for using JIT rather than compiling straight to machine code at build time is, besides keeping the two interpreter advantages above, to collect data at runtime and compile in a targeted way based on that data.