010 Advanced (Runtime) Optimization

Source: Internet
Author: User

1. The interpreter and the compiler: throughout the virtual machine's execution architecture, the interpreter and the compiler often work together.

Tiered compilation divides compilation into different tiers according to the compiler's scale of compilation and optimization and how time-consuming it is, including:
    • Tier 0: the program is interpreted, and the interpreter does not enable the performance monitoring function (profiling); this tier may trigger Tier 1 compilation.
    • Tier 1, also known as C1 compilation: bytecode is compiled into native code with simple, reliable optimizations applied and, if necessary, the logic for performance monitoring is added.
    • Tier 2 (and above), also known as C2 compilation: bytecode is also compiled into native code, but some time-consuming optimizations are enabled, and even some unreliable aggressive optimizations based on the performance monitoring information.
After tiered compilation is implemented, the client compiler and the server compiler work together, and much of the code may be compiled more than once: the client compiler provides higher compilation speed, the server compiler provides better compilation quality, and interpreted execution no longer needs to take on the task of collecting performance monitoring information.
2. Compilation objects and trigger conditions: two kinds of "hot code" will be compiled by the just-in-time compiler:
    • A method that is called many times.
    • A loop body that is executed many times.
For the latter, although the compilation action is triggered by the loop body, the compiler still uses the whole method (rather than the individual loop body) as the compilation object. Because this compilation happens while the method is executing, it is vividly called on-stack replacement (OSR compilation): the method's stack frame is still on the stack when the method is replaced.
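As a concrete illustration, here is a minimal program whose hot loop can trigger OSR compilation (the class name and loop bound are arbitrary choices for this sketch; `-XX:+PrintCompilation` is a standard HotSpot diagnostic flag, and it marks OSR compilations with a `%`):

```java
public class OsrDemo {
    // A loop hot enough to overflow the back-edge counter: HotSpot can compile
    // the enclosing method while its frame is still on the stack (OSR).
    static long sumTo(long n) {
        long sum = 0;
        for (long i = 1; i <= n; i++) {
            sum += i;          // each back edge bumps the back-edge counter
        }
        return sum;
    }

    public static void main(String[] args) {
        // Run with: java -XX:+PrintCompilation OsrDemo
        // and look for a compilation line marked with '%' (the OSR marker).
        System.out.println(sumTo(10_000_000L));
    }
}
```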
There are two main ways of performing hotspot detection:
    • Sample-based hotspot detection: the virtual machine periodically checks the top of each thread's stack, and if a certain method (or methods) frequently appears at the top of the stack, that method is a "hot method". The advantage of sample-based detection is that it is simple and efficient, and it can easily obtain the method call relationships (by expanding the call stack); the disadvantage is that it is hard to determine a method's "heat" precisely, and the sampling is easily disturbed by thread blocking or other external factors.
    • Counter-based hotspot detection: the virtual machine establishes a counter for each method (or even each code block) and counts how many times it executes; if the count exceeds a certain threshold, the method is considered a "hot method". This statistical approach is more cumbersome to implement, since a counter must be established and maintained for each method, and it cannot directly obtain method call relationships, but its results are more precise and rigorous.
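The counter-based approach can be sketched with a toy bookkeeper (the class, the threshold value, and the API here are purely illustrative, not HotSpot internals):

```java
import java.util.HashMap;
import java.util.Map;

// Toy sketch of counter-based hotspot detection.
public class HotspotCounter {
    private final Map<String, Integer> counters = new HashMap<>();
    private final int threshold;

    public HotspotCounter(int threshold) {
        this.threshold = threshold;
    }

    /** Records one invocation; returns true exactly when the method crosses the threshold. */
    public boolean recordInvocation(String method) {
        int count = counters.merge(method, 1, Integer::sum);
        return count == threshold;   // fire once, at the crossing, to request compilation
    }
}
```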

Without any special settings, the method invocation counter does not count the absolute number of times a method has been called, but rather a relative execution frequency: the number of times the method was called within a period of time. When a certain time limit is exceeded and the method's call count is still not enough to submit it to the just-in-time compiler, the method's invocation counter is halved. This process is called the decay of the invocation counter's heat (counter decay), and the time period is called the half-life of the method's statistics (counter half-life time). The heat decay action is performed while the virtual machine does garbage collection. The parameter -XX:-UseCounterDecay can be used to turn off heat decay, so that the counter counts the absolute number of method calls; in that case, as long as the system runs long enough, most methods will eventually be compiled into native code. In addition, the -XX:CounterHalfLifeTime parameter can be used to set the half-life period, in seconds.
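The decay behavior can be sketched with a similar toy model (again illustrative only, not the real implementation, which halves counters during garbage collection safepoints):

```java
import java.util.HashMap;
import java.util.Map;

// Toy sketch of invocation-counter heat decay: at every "half-life" tick the
// counter is halved, so only a sustained call frequency crosses the threshold.
public class DecayingCounter {
    private final Map<String, Integer> counters = new HashMap<>();

    public int record(String method) {
        return counters.merge(method, 1, Integer::sum);
    }

    /** Simulates the half-life tick: every counter loses half of its heat. */
    public void decay() {
        counters.replaceAll((method, count) -> count / 2);
    }

    public int count(String method) {
        return counters.getOrDefault(method, 0);
    }
}
```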

As for the threshold of the back-edge counter: although the HotSpot virtual machine provides a parameter analogous to the method invocation counter's threshold -XX:CompileThreshold, namely -XX:BackEdgeThreshold, for the user to set, the current virtual machine does not actually use this parameter, so the parameter -XX:OnStackReplacePercentage must be set to adjust the back-edge counter threshold indirectly. The formulas are as follows.
When the virtual machine runs in client mode, the back-edge counter threshold is:
    CompileThreshold × OnStackReplacePercentage / 100
When the virtual machine runs in server mode, the back-edge counter threshold is:
    CompileThreshold × (OnStackReplacePercentage − InterpreterProfilePercentage) / 100
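The two formulas can be evaluated with the default values commonly cited for HotSpot (client: CompileThreshold = 1500, OnStackReplacePercentage = 933; server: CompileThreshold = 10000, OnStackReplacePercentage = 140, InterpreterProfilePercentage = 33; these defaults vary by VM version, so treat them as assumptions):

```java
// Back-edge counter threshold formulas, with commonly cited HotSpot defaults.
public class BackEdgeThreshold {
    static int client(int compileThreshold, int osrPercentage) {
        return compileThreshold * osrPercentage / 100;
    }

    static int server(int compileThreshold, int osrPercentage,
                      int interpreterProfilePercentage) {
        return compileThreshold * (osrPercentage - interpreterProfilePercentage) / 100;
    }

    public static void main(String[] args) {
        System.out.println(client(1500, 933));      // 13995 (client mode)
        System.out.println(server(10000, 140, 33)); // 10700 (server mode)
    }
}
```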
Unlike the invocation counter, the back-edge counter has no heat-decay process, so it counts the absolute number of times the loop in the method has executed. When the counter overflows, it also adjusts the invocation counter's value to the overflow state, so that the standard compilation process will be performed the next time the method is entered.

3. The compilation process: by default, whether for a just-in-time compilation request produced by a method invocation or for an OSR compilation request, the virtual machine continues interpreting the code as long as the compiler has not finished, and the compilation action proceeds in a background compilation thread. The user can disable background compilation with the parameter -XX:-BackgroundCompilation; after background compilation is disabled, once the JIT compilation condition is reached, the executing thread submits the compilation request to the virtual machine and waits until the compilation completes, then begins executing the native code output by the compiler.

The Client Compiler is a simple, fast three-stage compiler:
    • In the first phase, a platform-independent front end constructs a high-level intermediate representation (HIR) from the bytecode. Before the HIR is constructed, the compiler completes a portion of the basic optimizations on the bytecode, such as method inlining and constant propagation.
    • In the second phase, a platform-dependent back end produces a low-level intermediate representation (LIR) from the HIR. Before this, other optimizations, such as null-check elimination and range-check elimination, are completed on the HIR, so that the HIR reaches a more efficient code representation.
    • In the final phase, the platform-dependent back end allocates registers on the LIR using the linear scan register allocation algorithm, performs peephole optimization on the LIR, and then generates machine code.
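To make the constant propagation mentioned in the first phase concrete, the two methods below are semantically identical; a sketch of what the optimizer does is to reduce the first to the second (the class and method names are illustrative):

```java
// Illustration of constant propagation and folding, one of the basic
// optimizations performed on bytecode before the HIR is built.
public class ConstProp {
    static int before() {
        int x = 3;
        int y = x * 4;     // x is known to be 3, so y is known to be 12
        return y + 2;      // the whole expression folds to the constant 14
    }

    static int after() {
        return 14;         // what the compiler effectively emits
    }
}
```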

4. Compiler optimization techniques
① Method inlining is more important than other optimization measures. Its main purposes are two: one is to remove the cost of the method call (such as establishing a stack frame), and the other is to lay a good foundation for other optimizations; after a method is inlined, further optimizations can be applied on a larger scope to obtain better results.
In many cases, because most Java methods are virtual, the virtual machine's inlining is an aggressive optimization: it must be prepared to fall back and deoptimize if classes loaded later invalidate the inlining assumption.
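A minimal sketch of why inlining enables further optimization (the class and methods are illustrative, not a JDK API):

```java
// After inlining getX(), the JIT sees the whole expression in one place
// and can apply further optimizations that are invisible across a call.
public class InlineDemo {
    private final int x;

    InlineDemo(int x) {
        this.x = x;
    }

    int getX() {
        return x;          // tiny and monomorphic: an ideal inlining candidate
    }

    static int sumTwice(InlineDemo p) {
        // After inlining this becomes p.x + p.x, which the compiler
        // can simplify to 2 * p.x.
        return p.getX() + p.getX();
    }
}
```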
② Common subexpression elimination: if this optimization is limited to a basic block of the program, it is called local common subexpression elimination; if its scope covers multiple basic blocks, it is called global common subexpression elimination.
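A sketch of the transformation: in the first method the subexpression c * b appears twice, and a compiler applying common subexpression elimination computes it only once (the methods here are hand-written to show the before/after shapes):

```java
// Common subexpression elimination, shown as equivalent source-level rewrites.
public class CseDemo {
    static int original(int a, int b, int c) {
        return c * b * 12 + a + (a + b * c);   // c*b and b*c are the same value
    }

    static int optimized(int a, int b, int c) {
        int e = c * b;                 // E = c * b, computed once
        return e * 12 + a + a + e;     // further reducible to e * 13 + a * 2
    }
}
```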

③ Escape analysis: the basic behavior of escape analysis is to analyze an object's dynamic scope. When an object is defined inside a method, it may be referenced by external methods, for example passed as a call argument to another method; this is called method escape. It may even be accessed by external threads, for example assigned to a class variable or to an instance variable that can be accessed from other threads; this is called thread escape. If it can be proved that an object does not escape the method or the thread, that is, no other method or thread can access the object by any means, some efficient optimizations may be performed for that variable, as follows.
  • Stack allocation: in the Java virtual machine, it is almost common sense that memory for newly created objects is allocated on the Java heap. Objects in the heap are shared and visible to all threads; any thread holding a reference to an object can access the object data stored in the heap. The virtual machine's garbage collection system can reclaim objects that are no longer used in the heap, but recycling takes time, whether for marking and filtering out the recyclable objects or for reclaiming and defragmenting the memory. If it can be determined that an object will not escape the method, it is a good idea to allocate its memory on the stack, so that the memory the object occupies is destroyed along with the stack frame. In typical applications, the proportion of local objects that do not escape is very large; if stack allocation can be used, this large number of objects will be destroyed automatically as methods end, and the pressure on the garbage collection system will be much smaller.
  • Synchronization elimination: thread synchronization is itself a relatively time-consuming process. If escape analysis can determine that a variable does not escape the thread and cannot be accessed by other threads, then there can be no contention on reads and writes of this variable, and the synchronization measures applied to it can be eliminated.
  • Scalar replacement: a scalar is a piece of data that can no longer be decomposed into smaller data; primitive types in the Java virtual machine (numeric types such as int and long, plus reference types) cannot be decomposed further and can be called scalars. In contrast, if a piece of data can be decomposed further, it is called an aggregate, and the object is the most typical aggregate in Java. Breaking up a Java object and restoring its member variables to the original types for access, according to how the program uses them, is called scalar replacement. If escape analysis proves that an object will not be accessed externally and the object can be disassembled, then when the program actually executes it may not create the object at all, and instead directly create the several member variables used by the method. After the object is split, its member variables can be allocated and read/written on the stack (and data stored on the stack has a high probability of being placed by the virtual machine into the physical machine's high-speed registers), which also creates conditions for further optimizations.
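A minimal sketch of a scalar-replacement candidate (Point is an illustrative class, not a JDK type; -XX:+DoEscapeAnalysis is a real HotSpot flag, enabled by default in modern versions):

```java
// The Point below never escapes sumOfCoordinates: it is never stored to a
// field, passed out, or returned. With escape analysis the JIT may
// scalar-replace it, so no heap allocation happens at all.
public class EscapeDemo {
    static final class Point {
        final int x, y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }
    }

    static int sumOfCoordinates(int x, int y) {
        Point p = new Point(x, y);   // candidate for scalar replacement
        return p.x + p.y;            // after replacement: simply x + y
    }

    public static void main(String[] args) {
        System.out.println(sumOfCoordinates(3, 4));  // 7
    }
}
```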

5. Comparison of the Java just-in-time compiler and C++ static compilers. First, because the just-in-time compiler runs during the user program's runtime, it is under heavy time pressure, and the optimizations it can provide are severely constrained by compilation cost. Second, the Java language is a dynamically type-safe language, which means the virtual machine must ensure that the program does not violate language semantics or access unstructured memory. Third, although the Java language has no virtual keyword, virtual methods are used far more frequently than in C++, which means the runtime performs polymorphic selection of the method receiver far more frequently than in C++, and therefore some optimizations (such as the method inlining mentioned earlier) are much harder for the just-in-time compiler than for C++'s static optimizing compilers. Fourth, the Java language is dynamically extensible: loading new classes at runtime can change the inheritance relationships among program types, making many global optimizations difficult because the compiler cannot see the full picture of the program; many global optimizations can only be performed aggressively, with the compiler keeping watch at runtime and undoing or redoing some optimizations as the types change. Fifth, objects in the Java language are allocated on the heap, and only local variables in a method can be allocated on the stack.












