I have wanted to write about lambda bytecode for a long time. Java 8 lambda expressions, used together with the RxJava reactive programming framework, make code more concise, easier to maintain, and more convenient to call. This article introduces the JVM bytecode instructions for method invocation, focuses on how the invokedynamic instruction introduced in JDK 7 (JSR-292) supports the dynamic invocation of lambda expressions, and ends with a discussion of lambda performance.
Bytecode instructions for method calls
Before introducing invokedynamic, let's review all of the method invocation instructions in the JVM specification. For other aspects of bytecode execution, see the author's earlier article, JVM bytecode execution model and bytecode instruction set.
In a class file, a method invocation is a symbolic reference into the constant pool. It is resolved to a direct reference either during the resolution phase of class loading or at run time.
invokestatic: invokes static methods, that is, methods marked with the static keyword.
invokespecial: invokes private methods, constructors, and superclass methods.
invokevirtual: invokes virtual methods, where the implementing class is not known until run time, such as a call to an overridden method in Java. For an example, see the referenced article on how overriding is implemented in the JVM, viewed from the bytecode instructions.
invokeinterface: invokes interface methods. The object implementing the interface is only known at run time, so the direct reference to the method is determined at run time rather than in the resolution phase.
invokedynamic: each occurrence of this opcode is linked to a dynamic call site object (CallSite) produced by a specific bootstrap method, which is recorded in the BootstrapMethods attribute of the class file. An invokedynamic call therefore goes through its own call chain instead of naming the target method directly as the other four instructions do, and its execution is more complex than theirs. The examples that follow should make this instruction easier to understand.
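To make the five instructions concrete, here is a minimal sketch of Java source in which each kind of call appears once. The class and method names (InvokeKinds, Base, Derived, Greeter) are made up for this illustration, and the opcode named in each comment is what javac for Java 8 typically emits; other compiler versions may choose differently in some cases.

```java
import java.util.function.Supplier;

class InvokeKinds {
    static String run() {
        String a = Base.kind();                 // invokestatic
        Base b = new Derived();                 // new + dup + invokespecial <init>
        String c = b.greet();                   // invokevirtual (static type is a class)
        Greeter g = b;
        String d = g.greet();                   // invokeinterface (static type is an interface)
        Supplier<String> s = () -> "lambda";    // invokedynamic produces the Supplier
        String e = s.get();                     // invokeinterface on the produced object
        return a + "," + c + "," + d + "," + e;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}

interface Greeter {
    String greet();
}

class Base implements Greeter {
    static String kind() {
        return "static";
    }

    @Override
    public String greet() {
        return "base";
    }
}

class Derived extends Base {
    Derived() {
        super();                                // invokespecial: superclass constructor
    }

    @Override
    public String greet() {
        return "derived+" + super.greet();      // invokespecial: superclass method
    }
}
```

Compiling this file and inspecting InvokeKinds with javap -c should show one instruction of each kind in run().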
For more detail on method calls, see the official documentation: The Java® Virtual Machine Specification, Java SE 8 Edition, section 2.11.8, Method Invocation and Return Instructions.
How lambda expressions work
Before looking at the bytecode details, let's look at how a lambda expression is desugared. In the compiled class file, the lambda's syntactic sugar becomes an invokedynamic instruction that points to a bootstrap method, which is a static method of java.lang.invoke.LambdaMetafactory. You can watch it execute in a debugger. Its source is as follows:
    public static CallSite metafactory(MethodHandles.Lookup caller,
                                       String invokedName,
                                       MethodType invokedType,
                                       MethodType samMethodType,
                                       MethodHandle implMethod,
                                       MethodType instantiatedMethodType)
            throws LambdaConversionException {
        AbstractValidatingLambdaMetafactory mf;
        mf = new InnerClassLambdaMetafactory(caller, invokedType,
                                             invokedName, samMethodType,
                                             implMethod, instantiatedMethodType,
                                             false, EMPTY_CLASS_ARRAY, EMPTY_MT_ARRAY);
        mf.validateMetafactoryArgs();
        return mf.buildCallSite();
    }
At run time, the virtual machine calls this method to obtain a CallSite (call site) object. In brief, the method first creates an InnerClassLambdaMetafactory object, whose buildCallSite method converts the lambda expression into an inner class. This inner class belongs to the class of the MethodHandles.Lookup caller, which is the class that contains the lambda expression. It is generated with a bytecode generation library (jdk.internal.org.objectweb.asm) and then loaded into the JVM through the Unsafe class. Finally, a CallSite object bound to this inner class is returned. The source for this step is:
    CallSite buildCallSite() throws LambdaConversionException {
        // Generate, with bytecode generation (the JDK's internal ASM), a Class object for the
        // inner class that represents the lambda body. Because it is generated at run time,
        // the compiled class file contains no bytes for this inner class.
        final Class<?> innerClass = spinInnerClass();
        // invokedType is the type of the invokedynamic call site; in the example below
        // it takes no arguments and returns a Consumer.
        if (invokedType.parameterCount() == 0) {
            final Constructor<?>[] ctrs = AccessController.doPrivileged(
                    new PrivilegedAction<Constructor<?>[]>() {
                @Override
                public Constructor<?>[] run() {
                    Constructor<?>[] ctrs = innerClass.getDeclaredConstructors();
                    if (ctrs.length == 1) {
                        // The constructor of the lambda's inner class is private,
                        // so it has to be made accessible first.
                        ctrs[0].setAccessible(true);
                    }
                    return ctrs;
                }
            });
            if (ctrs.length != 1) {
                throw new LambdaConversionException("Expected one lambda constructor for "
                        + innerClass.getCanonicalName() + ", got " + ctrs.length);
            }
            try {
                // Create an instance of the inner class through the constructor's newInstance method.
                Object inst = ctrs[0].newInstance();
                // MethodHandles.constant binds this instance to a MethodHandle, which becomes the
                // "target" parameter of the ConstantCallSite constructor. Later evaluations of the
                // lambda go straight through this MethodHandle; no new CallSite is generated.
                return new ConstantCallSite(MethodHandles.constant(samBase, inst));
            }
            catch (ReflectiveOperationException e) {
                throw new LambdaConversionException("Exception instantiating lambda object", e);
            }
        } else {
            try {
                UNSAFE.ensureClassInitialized(innerClass);
                return new ConstantCallSite(
                        MethodHandles.Lookup.IMPL_LOOKUP
                                .findStatic(innerClass, NAME_FACTORY, invokedType));
            }
            catch (ReflectiveOperationException e) {
                throw new LambdaConversionException("Exception finding constructor", e);
            }
        }
    }
This procedure generates an inner class that represents the lambda expression (the innerClass on the first line of the method, which implements the functional interface). The class bytes are produced with the ClassWriter and MethodVisitor of the JDK's internal ASM. An instance of the inner class is then created by calling Constructor.newInstance and bound to a MethodHandle, and the MethodHandle is passed to the CallSite through the CallSite constructor. In this way the lambda expression becomes an inner class object that is bound to a CallSite through a MethodHandle. The CallSite acts as a hook for the lambda expression: the invokedynamic instruction is linked to it to achieve run-time binding, so that when the instruction executes, the hook yields the functional interface object that the lambda represents. Bootstrapping a lambda is therefore the process of turning the method's bytecode information into a MethodHandle at run time.
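The same linking step can be tried by hand through the public LambdaMetafactory.metafactory API. The following is only a sketch of what the JVM does when it links an invokedynamic instruction: the class ManualBootstrap and its greet method are hypothetical names chosen here, with greet standing in for the synthetic method that javac would generate for a lambda body.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.LambdaMetafactory;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.function.Function;

class ManualBootstrap {
    // Stands in for the synthetic method that holds the lambda body.
    static String greet(String name) {
        return "Hello, " + name;
    }

    @SuppressWarnings("unchecked")
    static Function<String, String> link() {
        try {
            MethodHandles.Lookup caller = MethodHandles.lookup();
            MethodHandle impl = caller.findStatic(ManualBootstrap.class, "greet",
                    MethodType.methodType(String.class, String.class));
            CallSite site = LambdaMetafactory.metafactory(
                    caller,
                    "apply",                                            // invokedName: interface method to implement
                    MethodType.methodType(Function.class),               // invokedType: nothing captured, returns Function
                    MethodType.methodType(Object.class, Object.class),   // samMethodType: erased Function.apply
                    impl,                                                // implMethod: the lambda body
                    MethodType.methodType(String.class, String.class));  // instantiatedMethodType
            // The call site target takes no arguments and returns the functional interface object.
            return (Function<String, String>) site.getTarget().invokeExact();
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(link().apply("Lambda"));
    }
}
```

The first three arguments are the ones the JVM would normally supply itself; here they are passed explicitly so that the roles of invokedName, invokedType, and the remaining method types can be seen side by side.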
Printing the class name of the consumer object (greeter.getClass().getName()) gives eight.Functionnal$$Lambda$1/659748578. The part before the slash is the class name generated for the lambda expression, and the trailing 659748578 is the hash code of the inner class just described.
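A quick way to see such a generated class name is the sketch below. LambdaClassName is a hypothetical class for this illustration, and the exact form of the suffix after $$Lambda depends on the JDK version and on the individual run, so no fixed output is given.

```java
import java.util.function.Consumer;

class LambdaClassName {
    static String nameOfLambdaClass() {
        Consumer<String> greeter = s -> System.out.println("Hello, " + s);
        // The class is created at run time, so no matching .class file exists on disk.
        return greeter.getClass().getName();
    }

    public static void main(String[] args) {
        System.out.println(nameOfLambdaClass());
    }
}
```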
The following sections analyze lambda desugaring in detail through concrete bytecode instructions and show how invokedynamic makes the lambda implementation possible in the JVM. If the process described above is still unclear, see the Oracle engineers' design notes for Java 8 lambda expressions: Translation of Lambda Expressions.
Lambda expression bytecode: a worked example
Let's look at a simple example that uses Consumer from the java.util.function package.
Example code (the Person class used below has a single String property, name, and a constructor that takes it):
    package eight;

    import java.util.function.Consumer;

    /**
     * Created by lijingyao on 15/11/2 19:13.
     */
    public class Functionnal {
        public static void main(String[] args) {
            Consumer<Person> greeter = (p) -> System.out.println("Hello, " + p.getName());
            greeter.accept(new Person("Lambda"));
        }
    }
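The Person class itself is not shown in the article. A minimal sketch that matches the description (one String property and a one-argument constructor) could look like this; the field layout and the PersonDemo class are assumptions made for illustration, and the package declaration is left out so the file runs on its own.

```java
class PersonDemo {
    public static void main(String[] args) {
        System.out.println(new Person("Lambda").getName());
    }
}

// Hypothetical reconstruction: one String property and a constructor that takes it.
class Person {
    private final String name;

    public Person(String name) {
        this.name = name;
    }

    public String getName() {
        return name;
    }
}
```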
Use javap with the -verbose option to look at the bytecode of the method body. The constant pool is omitted for now; the entries that the instructions reference symbolically are shown later where they are needed.
    public static void main(java.lang.String[]);
      descriptor: ([Ljava/lang/String;)V
      flags: ACC_PUBLIC, ACC_STATIC
      Code:
        stack=4, locals=2, args_size=1
           0: invokedynamic #2,  0    // InvokeDynamic #0:accept:()Ljava/util/function/Consumer;
           5: astore_1
           6: aload_1
           7: new           #3        // class eight/Person
          10: dup
          11: ldc           #4        // String Lambda
          13: invokespecial #5        // Method eight/Person."<init>":(Ljava/lang/String;)V
          16: invokeinterface #6,  2  // InterfaceMethod java/util/function/Consumer.accept:(Ljava/lang/Object;)V
          21: return
        LineNumberTable:
          line 11: 0
          line 12: 6
          line 13: 21
        LocalVariableTable:
          Start  Length  Slot  Name     Signature
              0      22     0  args     [Ljava/lang/String;
              6      16     1  greeter  Ljava/util/function/Consumer;
        LocalVariableTypeTable:
          Start  Length  Slot  Name     Signature
              6      16     1  greeter  Ljava/util/function/Consumer<Leight/Person;>;
invokedynamic Instruction Features
You can see that the first instruction, invokedynamic, is the one that implements the lambda expression. The instruction was introduced by JSR-292, and for reasons of compatibility and extensibility (see the Oracle engineers' explanation of why invokedynamic was chosen), JSR-337 implements lambda expressions through it. In other words, every lambda expression corresponds to an invokedynamic instruction.
Take a look at the first bytecode instruction:
    0: invokedynamic #2, 0
0: is the byte offset of the instruction's opcode within the method.
invokedynamic is the opcode mnemonic of the instruction.
#2, 0 are the operands of the instruction. The # notation means that the operand is a symbolic reference into the class's constant pool. The 0 after the comma is a reserved operand of invokedynamic, which under the current JSR-337 specification must always be 0. So the next step is to look at the constant pool entry.
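The offsets printed in front of each instruction are byte positions, so they can be recomputed by adding up instruction lengths. The sketch below does that for the nine instructions of main; the lengths are taken from the instruction formats in the JVM specification (for example, invokedynamic and invokeinterface are five bytes each), and the class name InstructionOffsets is made up for this illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class InstructionOffsets {
    static Map<Integer, String> offsets() {
        String[] mnemonics = {"invokedynamic", "astore_1", "aload_1", "new", "dup",
                              "ldc", "invokespecial", "invokeinterface", "return"};
        // Opcode byte plus operand bytes for each instruction above.
        int[] lengths = {5, 1, 1, 3, 1, 2, 3, 5, 1};

        Map<Integer, String> result = new LinkedHashMap<>();
        int offset = 0;
        for (int i = 0; i < mnemonics.length; i++) {
            result.put(offset, mnemonics[i]);
            offset += lengths[i];   // the next instruction starts right after this one
        }
        return result;
    }

    public static void main(String[] args) {
        offsets().forEach((offset, name) -> System.out.println(offset + ": " + name));
    }
}
```

The resulting offsets are 0, 5, 6, 7, 10, 11, 13, 16, and 21, which match the javap listing.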
invokedynamic has its own dedicated structure in the constant pool, unlike the other invocation instructions, which refer to CONSTANT_Methodref_info or CONSTANT_InterfaceMethodref_info structures.
invokedynamic refers to a CONSTANT_InvokeDynamic_info structure in the constant pool, which specifies the bootstrap method of the invokedynamic instruction together with the name and descriptor of the dynamic call.
The constant pool entry at that index is:
    #2 = InvokeDynamic      #0:#44         // #0:accept:()Ljava/util/function/Consumer;
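The string ()Ljava/util/function/Consumer; is a method descriptor: parameter types in parentheses followed by the return type. As a small sketch, java.lang.invoke.MethodType can print the same notation, which helps when reading the listings. The Descriptors class and its method names are hypothetical.

```java
import java.lang.invoke.MethodType;
import java.util.function.Consumer;

class Descriptors {
    // Type of the invokedynamic call site: no arguments, returns a Consumer.
    static String callSite() {
        return MethodType.methodType(Consumer.class).toMethodDescriptorString();
    }

    // Erased signature of Consumer.accept: takes an Object, returns void.
    static String erasedAccept() {
        return MethodType.methodType(void.class, Object.class).toMethodDescriptorString();
    }

    // Signature of main: takes a String array, returns void.
    static String mainMethod() {
        return MethodType.methodType(void.class, String[].class).toMethodDescriptorString();
    }

    public static void main(String[] args) {
        System.out.println(callSite());
        System.out.println(erasedAccept());
        System.out.println(mainMethod());
    }
}
```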
Reading this entry alongside the layout of CONSTANT_InvokeDynamic_info shows what it contains.
The CONSTANT_InvokeDynamic_info structure is as follows:
    CONSTANT_InvokeDynamic_info {
        u1 tag;
        u2 bootstrap_method_attr_index;
        u2 name_and_type_index;
    }
A brief explanation of the fields of CONSTANT_InvokeDynamic_info:
tag: a one-byte (u1) tag that identifies the entry as an invokedynamic entry. The tag values are listed in the constant pool tag table of the JVM specification; the value here is 18.
bootstrap_method_attr_index: a valid index into the bootstrap_methods array of the BootstrapMethods attribute, which is also part of the binary class file. The BootstrapMethods entry for index 0 is:
    BootstrapMethods:
      0: #40 invokestatic java/lang/invoke/LambdaMetafactory.metafactory:(Ljava/lang/invoke/MethodHandles$Lookup;Ljava/lang/String;Ljava/lang/invoke/MethodType;Ljava/lang/invoke/MethodType;Ljava/lang/invoke/MethodHandle;Ljava/lang/invoke/MethodType;)Ljava/lang/invoke/CallSite;
        Method arguments:
          #41 (Ljava/lang/Object;)V
          #42 invokestatic eight/Functionnal.lambda$main$0:(Leight/Person;)V
          #43 (Leight/Person;)V
This shows that the bootstrap method is LambdaMetafactory.metafactory, whose source was shown earlier. Stepping through it in a debugger shows the parameter values at run time.
The first three parameters of this method are supplied automatically by the JVM when it links the call site. The method returns a CallSite object, which is what the invokedynamic instruction is linked to.
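The second bootstrap argument, eight/Functionnal.lambda$main$0, is a method that the compiler adds to the enclosing class to hold the lambda body. As a rough sketch, reflection can make such a method visible. DesugaredBody is a hypothetical class, and the lambda$ name prefix is a compiler convention rather than something the language guarantees.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.function.Consumer;

class DesugaredBody {
    static void greet(String name) {
        Consumer<String> greeter = s -> System.out.println("Hello, " + s);
        greeter.accept(name);
    }

    // Looks for the compiler-generated method that carries the lambda body.
    static Method findLambdaBody() {
        for (Method m : DesugaredBody.class.getDeclaredMethods()) {
            if (m.isSynthetic() && m.getName().startsWith("lambda$")) {
                return m;
            }
        }
        return null;
    }

    static boolean hasStaticSyntheticLambdaMethod() {
        Method m = findLambdaBody();
        return m != null && Modifier.isStatic(m.getModifiers());
    }

    public static void main(String[] args) {
        greet("Lambda");
        System.out.println(findLambdaBody());
    }
}
```

Because the lambda captures nothing and does not use this, the generated method is static, which is why the bootstrap argument above is an invokestatic method handle.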
name_and_type_index: a valid index into the constant pool. The entry it points to must be a CONSTANT_NameAndType_info structure, which holds the method name and method descriptor. Following index #44 leads to these constant pool entries:
    #44 = NameAndType        #64:#65        // accept:()Ljava/util/function/Consumer;
    #64 = Utf8               accept
    #65 = Utf8               ()Ljava/util/function/Consumer;
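To tie the three fields together, the sketch below decodes the five bytes of a CONSTANT_InvokeDynamic_info entry the way a class file reader would. The byte array is hand-built to match entry #2 above (tag 18, bootstrap index 0, name-and-type index 44), and the class name InvokeDynamicInfo is an assumption of this example, not a JDK type.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class InvokeDynamicInfo {
    static final int CONSTANT_INVOKE_DYNAMIC = 18;

    final int bootstrapMethodAttrIndex;
    final int nameAndTypeIndex;

    InvokeDynamicInfo(int bootstrapMethodAttrIndex, int nameAndTypeIndex) {
        this.bootstrapMethodAttrIndex = bootstrapMethodAttrIndex;
        this.nameAndTypeIndex = nameAndTypeIndex;
    }

    // Class files are big-endian, which is also what DataInputStream reads.
    static InvokeDynamicInfo parse(byte[] entry) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(entry))) {
            int tag = in.readUnsignedByte();                  // u1 tag
            if (tag != CONSTANT_INVOKE_DYNAMIC) {
                throw new IllegalArgumentException("unexpected tag " + tag);
            }
            int bootstrapIndex = in.readUnsignedShort();      // u2 bootstrap_method_attr_index
            int nameAndType = in.readUnsignedShort();         // u2 name_and_type_index
            return new InvokeDynamicInfo(bootstrapIndex, nameAndType);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        InvokeDynamicInfo info = parse(new byte[] {18, 0, 0, 0, 44});
        System.out.println("#" + info.bootstrapMethodAttrIndex + ":#" + info.nameAndTypeIndex);
    }
}
```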
These entries together give a complete description of the invokedynamic instruction.
Parsing the remaining bytecode instructions
That concludes how lambda expressions are implemented in bytecode. The rest of this section walks through the other instructions; continue reading if you are interested in bytecode, and skip what you already know, since it has little to do with lambdas themselves.
1. Second: 5: astore_1 stores the Consumer reference left on the operand stack by invokedynamic into slot 1 of the local variable table (greeter). The instruction starts at offset 5 because the preceding invokedynamic consists of a one-byte opcode followed by two operands of two bytes (u2) each (later offsets are not explained again). After this instruction executes, the stack frame of the current method is as follows (note: the figure omits the dynamic link and return address of the frame; the local variable table is on the left and the operand stack on the right):
The cells are drawn according to the space actually allocated to the local variable table and the operand stack, which the bytecode already states as [stack=4, locals=2, args_size=1]. At run time the local variable table occupies at most two slots (long and double variables need two slots each), the operand stack at most 4 slots, and the arguments one slot. The args here is the String[] args parameter of the main method. Because main is static, slot 0 holds args rather than this, so there is no aload_0 instruction to load this.
2. Third: 6: aload_1 loads greeter from the local variable table and pushes it onto the operand stack.
3. Fourth: 7: new #3 is the instruction that creates the Person object. It is not the same as the new keyword: the new opcode only resolves the symbolic reference in the constant pool, allocates an object on the heap with its fields set to default values (null for reference types), and pushes a reference to that uninitialized object onto the operand stack. The #3 operand points to a Class entry in the constant pool: #3 = Class #45 // eight/Person. The run-time stack frame at this point is as follows:
4. Fifth: 10: dup copies the value on top of the operand stack and pushes the copy onto the stack. The compiler emits dup as part of the object creation sequence. Because the new opcode only allocates the object and pushes a reference without running the constructor, the duplicated reference on top of the stack serves as the receiver for the invokespecial call to the initialization method (the constructor), which consumes it, while the original reference remains available to later instructions. The stack frame at this point is as follows:
5. Sixth: 11: ldc #4 pushes a constant pool value onto the operand stack, here the string Lambda. The constant pool entries for #4 are:
    #4 = String             #46            // Lambda
    #46 = Utf8               Lambda
The run-time stack frame at this point is as follows:
6. Seventh: 13: invokespecial #5 is the instruction that initializes the Person object by calling its constructor (#5 is the initialization method eight/Person."<init>":(Ljava/lang/String;)V in the constant pool). The reference to the Lambda string constant and the Person reference copied by dup are popped off the operand stack. Once this instruction has executed, the Person object on the heap is fully initialized. The stack frame at this point is as follows:
7. Eighth: 16: invokeinterface #6, 2 invokes the accept interface method of Consumer, that is, greeter.accept(person). The 2 after the comma is the count operand of invokeinterface: the number of arguments of the interface method plus 1 for the object reference. The accept method has one argument, so the value is 1 + 1 = 2. The constant pool entries for #6 are:
    #6 = InterfaceMethodref #48.#49        // java/util/function/Consumer.accept:(Ljava/lang/Object;)V
    #48 = Class              #67            // java/util/function/Consumer
    #49 = NameAndType        #64:#62        // accept:(Ljava/lang/Object;)V
    #67 = Utf8               java/util/function/Consumer
    #62 = Utf8               (Ljava/lang/Object;)V
    #64 = Utf8               accept
These entries show that the generic type of the Consumer interface has been erased during compilation, so the descriptor carries no generic information and the actual argument type cannot be read from it. The references to the actual objects are still available, however: when accept executes, the greeter and person references are popped off the stack, as shown in the following figure:
8. Ninth: 21: return returns from the method; because main is a void method, the opcode is return. At this point the operand stack is empty, and the frame is discarded when the method returns. One last figure:
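Since the stack frame figures are not reproduced here, the walk-through above can be replayed with a small simulation. This is only a sketch: FrameSketch is a made-up class, values are represented by plain strings, and each line mirrors the effect on the local variable table and operand stack that the corresponding step describes.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

class FrameSketch {
    final String[] locals = {"args", null};          // locals=2: slot 0 is args, slot 1 is greeter
    final Deque<String> stack = new ArrayDeque<>();  // operand stack, top of stack first
    int maxDepth = 0;

    private void push(String value) {
        stack.push(value);
        maxDepth = Math.max(maxDepth, stack.size());
    }

    private void show(String instruction) {
        System.out.println(instruction + "  locals=" + Arrays.toString(locals) + "  stack=" + stack);
    }

    static FrameSketch runMain() {
        FrameSketch f = new FrameSketch();
        f.push("greeter");                 f.show(" 0: invokedynamic  ");
        f.locals[1] = f.stack.pop();       f.show(" 5: astore_1       ");
        f.push(f.locals[1]);               f.show(" 6: aload_1        ");
        f.push("person");                  f.show(" 7: new            ");
        f.push(f.stack.peek());            f.show("10: dup            ");
        f.push("\"Lambda\"");              f.show("11: ldc            ");
        f.stack.pop(); f.stack.pop();      f.show("13: invokespecial  ");  // pops the string and the dup'd reference
        f.stack.pop(); f.stack.pop();      f.show("16: invokeinterface");  // pops person and greeter
        f.show("21: return         ");
        return f;
    }

    public static void main(String[] args) {
        FrameSketch f = runMain();
        System.out.println("deepest operand stack: " + f.maxDepth);
    }
}
```

The deepest the simulated operand stack gets is four entries, right after ldc, which agrees with stack=4 in the javap output.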
Conclusion
This article analyzed the bytecode instructions behind lambda expressions and their desugaring at run time, using the Consumer interface as an example. It was also a chance to review opcodes I had nearly forgotten. Tracing a lambda back to its source this way also makes the run-time memory model easier to understand.
A lambda expression corresponds to an invokedynamic instruction. Through the instruction's symbolic reference into the constant pool, the JVM finds the bootstrap method recorded in the BootstrapMethods attribute. At run time the JVM calls this bootstrap method to create a CallSite, which holds a MethodHandle (the target of the CallSite) and serves as the callback point for the lambda. The body of the lambda is turned into an inner class by bytecode generation inside the JVM, and that class is bound to the MethodHandle. Each time the lambda expression is executed, its CallSite is looked up and used. A CallSite can be used many times over multiple invocations, while there is only one invokedynamic instruction. In the following case, for example, when comparator calls Comparator.compare or Comparator.reversed, it finds its internal MethodHandle through the CallSite and invokes the lambda's internal representation, LambdaForm, through that MethodHandle.
    public static void main(String[] args) {
        Comparator<Person> comparator = (p1, p2) -> p1.getFirstName().compareTo(p2.getFirstName());
        Person p1 = new Person("John", "Doe");
        Person p2 = new Person("Alice", "Wonderland");
        comparator.compare(p1, p2);             // > 0
        comparator.reversed().compare(p1, p2);  // < 0
    }
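That a call site is linked once and then reused can be observed indirectly. In the buildCallSite source shown earlier, a lambda with no captured values gets a ConstantCallSite that always returns the same instance, while a capturing lambda goes through a factory method. The sketch below shows the resulting difference; CallSiteReuse is a hypothetical class, and the identity behavior is what OpenJDK's implementation does, not something the Java language specification promises.

```java
import java.util.function.Supplier;

class CallSiteReuse {
    // One invokedynamic instruction, nothing captured.
    static Supplier<String> nonCapturing() {
        return () -> "constant";
    }

    // One invokedynamic instruction that captures the parameter.
    static Supplier<String> capturing(String value) {
        return () -> value;
    }

    static boolean sameInstanceWithoutCapture() {
        return nonCapturing() == nonCapturing();
    }

    static boolean sameInstanceWithCapture() {
        return capturing("a") == capturing("a");
    }

    public static void main(String[] args) {
        System.out.println("non-capturing reused: " + sameInstanceWithoutCapture());
        System.out.println("capturing reused:     " + sameInstanceWithCapture());
    }
}
```

On OpenJDK builds the first comparison is expected to be true and the second false, which is one reason capturing lambdas allocate more than non-capturing ones.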
Lambdas are not only easy to use; in most cases they also perform better than anonymous inner classes. For figures, see the lambda performance report published by Oracle's Sergey Kuksenko. As shown above, although the lambda has to be converted at run time (see how the form field of MethodHandle is generated) and a CallSite has to be created, performance improves significantly through JIT compilation as the call site is invoked frequently. Desugaring at run time also leaves more flexibility at compile time (before looking at the bytecode, I had assumed that a lambda would be compiled to an anonymous inner class at compile time, rather than linked to a call site at run time through a bootstrap method). Generating the call site at run time also uses less memory in most cases than an anonymous inner class (as of Java 8). So wherever lambda expressions can be used, we should write concise expressions, backed by actual performance tests, and keep the number of variables a lambda captures to a minimum, because capturing creates extra objects. If an expression does need to capture a variable, consider whether it can become a member variable of the class, and pass the lambda as few extra arguments as possible. I hope this article is a useful reference for users of lambdas.
Resources
Translation of Lambda Expressions
JVM byte code execution model and bytecode instruction set
The Java® Virtual Machine Specification, Java SE 8 Edition (JSR-337)
Invoke a interface method
Java 8 Lambdas - A Peek Under the Hood
JDK 8: Lambda Performance Study