As one of the most widely used languages in the industry, Java has earned the support of many software vendors and developers and is actively promoted by many JCP members, including Oracle. In-depth discussion of the Java language and its application is, however, comparatively rare. The InfoQ Chinese site invited Cheng Fu, a senior engineer at IBM, to write this "Java Deep Adventure" column and share his experience with some of Java's deeper and more advanced features.
In everyday Java application development, using Java is fairly straightforward: open your usual IDE, write the Java source code, and run the program directly with the functions the IDE provides. Behind this workflow, the developer writes Java source files (.java), the IDE calls the Java compiler to compile the source into platform-independent byte code, and the result is saved on disk as class files (.class). The Java Virtual Machine (JVM) is responsible for loading and executing the byte code. This is how Java achieves "write once, run anywhere": the byte code contained in a class file can be used by JVMs on different platforms.
Java byte code does not have to exist as a file on disk. It can also be downloaded over the network, or exist only in memory. The class loader in the JVM defines a Java class from a byte array (byte[]) that contains the byte code. In some cases you may need to generate Java byte code dynamically or modify existing byte code, and that is where the techniques described in this article come in. We start with how to compile Java source files dynamically.
Dynamic compilation of Java source files
In general, developers write all the Java source code and compile it successfully before running the program. For some applications, however, the content of the Java source code is only known at runtime. In that case the source code has to be compiled dynamically to produce Java byte code, which the JVM then loads and executes. A typical scenario is an online judge system (such as PKU JudgeOnline) that lets users upload Java code, which the system compiles, runs, and judges in the background. To compile Java source files dynamically, you can call the Java compiler directly from your program.
JSR 199 introduced the Java compiler API. With JDK 6 or later, you can use this API to compile Java code dynamically. For example, the following code dynamically compiles the simplest possible Hello World class, whose source code is held in a string.
    import java.io.IOException;
    import java.net.URI;
    import java.util.Arrays;
    import javax.tools.JavaCompiler;
    import javax.tools.JavaCompiler.CompilationTask;
    import javax.tools.JavaFileObject;
    import javax.tools.SimpleJavaFileObject;
    import javax.tools.StandardJavaFileManager;
    import javax.tools.ToolProvider;

    public class CompilerTest {
        public static void main(String[] args) throws Exception {
            String source = "public class Main { public static void main(String[] args) { System.out.println(\"Hello World!\"); } }";
            // Returns null when running on a JRE that does not ship a compiler
            JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
            StandardJavaFileManager fileManager = compiler.getStandardFileManager(null, null, null);
            StringSourceJavaObject sourceObject = new CompilerTest.StringSourceJavaObject("Main", source);
            Iterable<? extends JavaFileObject> fileObjects = Arrays.asList(sourceObject);
            // Arguments: output writer, file manager, diagnostic listener, options, classes, compilation units
            CompilationTask task = compiler.getTask(null, fileManager, null, null, null, fileObjects);
            boolean result = task.call();
            if (result) {
                System.out.println("Compiled successfully.");
            }
        }

        // A Java source "file" whose content is held in a string
        static class StringSourceJavaObject extends SimpleJavaFileObject {
            private String content = null;

            public StringSourceJavaObject(String name, String content) {
                super(URI.create("string:///" + name.replace('.', '/') + Kind.SOURCE.extension), Kind.SOURCE);
                this.content = content;
            }

            public CharSequence getCharContent(boolean ignoreEncodingErrors) throws IOException {
                return content;
            }
        }
    }
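The example above only reports whether compilation succeeded. For a scenario such as the online judge system, it is usually also necessary to tell the user why compilation failed. The following is a minimal sketch of collecting compiler errors with the `DiagnosticCollector` class of the same `javax.tools` API. The names `DiagnosticsDemo`, `StringSource`, and `compileErrors` are illustrative assumptions, not part of the original article, and writing the class files to a temporary directory is just one possible choice.

```java
import java.net.URI;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import javax.tools.Diagnostic;
import javax.tools.DiagnosticCollector;
import javax.tools.JavaCompiler;
import javax.tools.JavaFileObject;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;

public class DiagnosticsDemo {

    // Hypothetical helper: a source "file" whose content is held in a string
    static class StringSource extends SimpleJavaFileObject {
        private final String content;

        StringSource(String name, String content) {
            super(URI.create("string:///" + name.replace('.', '/') + Kind.SOURCE.extension), Kind.SOURCE);
            this.content = content;
        }

        @Override
        public CharSequence getCharContent(boolean ignoreEncodingErrors) {
            return content;
        }
    }

    // Compiles the source and returns one message per compiler error (empty list on success)
    public static List<String> compileErrors(String className, String source) throws Exception {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        if (compiler == null) {
            throw new IllegalStateException("No system Java compiler available; run on a JDK");
        }
        DiagnosticCollector<JavaFileObject> diagnostics = new DiagnosticCollector<>();
        // Assumption: generated class files go to a temporary directory
        List<String> options = Arrays.asList("-d", Files.createTempDirectory("classes").toString());
        // Passing null as the file manager makes the compiler use its standard one
        compiler.getTask(null, null, diagnostics, options, null,
                Arrays.asList(new StringSource(className, source))).call();

        List<String> errors = new ArrayList<>();
        for (Diagnostic<? extends JavaFileObject> d : diagnostics.getDiagnostics()) {
            if (d.getKind() == Diagnostic.Kind.ERROR) {
                errors.add("line " + d.getLineNumber() + ": " + d.getMessage(Locale.ENGLISH));
            }
        }
        return errors;
    }

    public static void main(String[] args) throws Exception {
        // The initializer of x is missing, so the compiler reports at least one error
        for (String error : compileErrors("Broken", "public class Broken { int x = ; }")) {
            System.out.println(error);
        }
    }
}
```

The exact wording of the messages depends on the compiler version, so a caller should treat them as text for display rather than parse them.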
If you cannot use the Java compiler API provided by JDK 6, you can use the JDK's internal tool class com.sun.tools.javac.Main instead. However, this class can only compile files stored on disk, much like invoking the javac command directly.
Another option is the compiler provided by Eclipse JDT Core. This is the incremental Java compiler used in the Eclipse Java development environment. It can run and debug code that still contains errors, and it can also be used on its own. The Play framework uses the JDT compiler internally to compile Java source code dynamically. In development mode, the Play framework periodically scans the project's Java source files and recompiles them automatically as soon as it detects a change, so after modifying the code you only need to refresh the page to see the result. When using these dynamic compilation methods, make sure that tools.jar from the JDK is on the application's classpath.
The following example evaluates arithmetic expressions in Java, such as computing the value of (3 + 4) * 7 - 10. The usual approach is to parse the input expression and simulate the calculation yourself. Because of parentheses and operator precedence, this is complicated and error-prone. Another approach is to use the scripting support introduced by JSR 223 and execute the input expression directly as JavaScript or JavaFX Script to obtain the result. The code below takes a third route: it dynamically generates and compiles Java source code, then loads the resulting class and executes it to obtain the result. This approach is implemented entirely in Java.
    private static double calculate(String expr) throws CalculationException {
        String className = "CalculatorMain";
        String methodName = "calculate";
        String source = "public class " + className
            + " { public static double " + methodName + "() { return " + expr + "; } }";
        // The code for dynamically compiling the Java source is omitted. For details, refer to the previous section.
        boolean result = task.call();
        if (result) {
            ClassLoader loader = Calculator.class.getClassLoader();
            try {
                Class<?> clazz = loader.loadClass(className);
                Method method = clazz.getMethod(methodName, new Class<?>[] {});
                // The generated method is static, so no target instance is needed
                Object value = method.invoke(null, new Object[] {});
                return (Double) value;
            } catch (Exception e) {
                throw new CalculationException("Internal error.");
            }
        } else {
            throw new CalculationException("Incorrect expression.");
        }
    }
The code above shows the basic pattern for using dynamically generated Java byte code: load the byte code through a class loader to obtain the corresponding Class object, then call its methods through the Java reflection API.
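The calculator snippet relies on the compiled class file ending up somewhere the application's class loader can find it. As an alternative, the byte code can be kept entirely in memory, as mentioned at the beginning of the article. The following is a minimal, self-contained sketch of that variant, assuming the program runs on a JDK. The names `InMemoryCalculator`, `StringSource`, and `ByteCodeSink` are illustrative and do not come from the original code.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.lang.reflect.Method;
import java.net.URI;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import javax.tools.FileObject;
import javax.tools.ForwardingJavaFileManager;
import javax.tools.JavaCompiler;
import javax.tools.JavaFileManager;
import javax.tools.JavaFileObject;
import javax.tools.SimpleJavaFileObject;
import javax.tools.StandardJavaFileManager;
import javax.tools.ToolProvider;

public class InMemoryCalculator {

    // Source code held in a string
    static class StringSource extends SimpleJavaFileObject {
        private final String content;

        StringSource(String name, String content) {
            super(URI.create("string:///" + name.replace('.', '/') + Kind.SOURCE.extension), Kind.SOURCE);
            this.content = content;
        }

        @Override
        public CharSequence getCharContent(boolean ignoreEncodingErrors) {
            return content;
        }
    }

    // Compiler output captured in a byte array instead of a .class file
    static class ByteCodeSink extends SimpleJavaFileObject {
        final ByteArrayOutputStream bytes = new ByteArrayOutputStream();

        ByteCodeSink(String name) {
            super(URI.create("bytes:///" + name.replace('.', '/') + Kind.CLASS.extension), Kind.CLASS);
        }

        @Override
        public OutputStream openOutputStream() {
            return bytes;
        }
    }

    public static double calculate(String expr) throws Exception {
        String className = "CalculatorMain";
        String source = "public class " + className
                + " { public static double calculate() { return " + expr + "; } }";

        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        if (compiler == null) {
            throw new IllegalStateException("No system Java compiler available; run on a JDK");
        }
        final Map<String, ByteCodeSink> sinks = new HashMap<>();
        StandardJavaFileManager standard = compiler.getStandardFileManager(null, null, null);
        // Redirect every generated class into an in-memory sink
        JavaFileManager manager = new ForwardingJavaFileManager<StandardJavaFileManager>(standard) {
            @Override
            public JavaFileObject getJavaFileForOutput(JavaFileManager.Location location, String name,
                    JavaFileObject.Kind kind, FileObject sibling) {
                ByteCodeSink sink = new ByteCodeSink(name);
                sinks.put(name, sink);
                return sink;
            }
        };
        boolean ok = compiler.getTask(null, manager, null, null, null,
                Arrays.asList(new StringSource(className, source))).call();
        if (!ok) {
            throw new IllegalArgumentException("Incorrect expression: " + expr);
        }

        // Define the class directly from the captured byte array
        ClassLoader loader = new ClassLoader(InMemoryCalculator.class.getClassLoader()) {
            @Override
            protected Class<?> findClass(String name) throws ClassNotFoundException {
                ByteCodeSink sink = sinks.get(name);
                if (sink == null) {
                    throw new ClassNotFoundException(name);
                }
                byte[] b = sink.bytes.toByteArray();
                return defineClass(name, b, 0, b.length);
            }
        };
        Method method = loader.loadClass(className).getMethod("calculate");
        return (Double) method.invoke(null);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(calculate("(3 + 4) * 7 - 10")); // prints 39.0
    }
}
```

Because the expression is pasted into Java source code unchecked, a sketch like this should only be fed trusted input.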
Java Byte Code Enhancement
Java byte code enhancement means modifying and enhancing Java byte code after it has been generated, which amounts to modifying the application's binary files. This technique can be found in many Java frameworks. It is usually combined with annotations in the Java source files: the annotations declare the behavior to be enhanced and the related metadata, and the framework performs the byte code enhancement at runtime. Byte code enhancement applies to many scenarios, most of which aim to reduce redundant code and to shield developers from low-level implementation details. Anyone who has used JavaBeans knows how tedious and hard to maintain the mandatory getter/setter methods can be. With byte code enhancement, developers only need to declare the properties of the bean, and the getter/setter methods can be added automatically by modifying the byte code. Anyone who has used JPA will notice, when debugging a program, that entity classes contain some additional fields and methods. These fields and methods are added dynamically by the JPA implementation at runtime. Byte code enhancement is also used in some implementations of aspect-oriented programming (AOP).
Before discussing how to enhance the byte code, we will first introduce the organization of the byte code that represents a Java class or interface.
Class file {
    0xCAFEBABE, minor version number, major version number, constant pool size, constant pool array,
    access flags, current class information, parent class information, number of implemented interfaces, array of implemented interface information, number of fields,
    array of field information, number of methods, array of method information, number of attributes, array of attribute information
}
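To make the beginning of this layout concrete, the following minimal sketch reads the first four items of a class file with a `DataInputStream`. The class name `ClassHeaderReader` and the method `readHeader` are illustrative assumptions. The sketch stops before the constant pool array, because the entries of that array have varying sizes and need real parsing.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

public class ClassHeaderReader {

    // Returns { magic, minor version, major version, constant pool size }
    public static int[] readHeader(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        int magic = data.readInt();             // 4 bytes, always 0xCAFEBABE
        int minor = data.readUnsignedShort();   // 2 bytes
        int major = data.readUnsignedShort();   // 2 bytes
        // 2 bytes; the class file format stores the number of entries plus one
        int constantPoolSize = data.readUnsignedShort();
        return new int[] { magic, minor, major, constantPoolSize };
    }

    public static void main(String[] args) throws IOException {
        // Read the class file of this very class from the classpath
        try (InputStream in = ClassHeaderReader.class.getResourceAsStream("ClassHeaderReader.class")) {
            int[] header = readHeader(in);
            System.out.printf("magic=0x%08X minor=%d major=%d constantPoolSize=%d%n",
                    header[0], header[1], header[2], header[3]);
        }
    }
}
```

All multi-byte values in a class file are stored in big-endian order, which is also the order `DataInputStream` expects, so no byte swapping is needed.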
As the layout shows, the byte code of a class or interface is loosely organized, with its contents simply arranged one after another. Content that may have multiple entries, such as implemented interfaces, fields, methods, and attributes, is represented as an array, and each array is preceded by the number of entries it contains. Different kinds of content have different internal structures. Manipulating the byte array that contains the byte code directly is inefficient and error-prone for developers. Fortunately, there are many open-source libraries that can modify existing byte code or create the byte code of a new Java class from scratch, including ASM, cglib, Serp, and BCEL. Using these libraries reduces the complexity of byte code enhancement to some extent. For example, consider the following simple requirement: print a log message before any method of a Java class is executed. Anyone familiar with AOP knows that this can be solved with before advice. With ASM, the code looks like this:
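    // Uses org.objectweb.asm.ClassReader and ClassWriter, the org.objectweb.asm.tree classes,
    // and static imports of Opcodes.GETSTATIC and Opcodes.INVOKEVIRTUAL.
    // "is" is an InputStream over the byte code of the class to enhance.
    ClassReader cr = new ClassReader(is);
    ClassNode cn = new ClassNode();
    cr.accept(cn, 0);
    for (Object object : cn.methods) {
        MethodNode mn = (MethodNode) object;
        // Skip constructors and static initializers
        if ("<init>".equals(mn.name) || "<clinit>".equals(mn.name)) {
            continue;
        }
        InsnList insns = mn.instructions;
        InsnList il = new InsnList();
        il.add(new FieldInsnNode(GETSTATIC, "java/lang/System", "out", "Ljava/io/PrintStream;"));
        il.add(new LdcInsnNode("Enter method -> " + mn.name));
        // Newer ASM versions take an additional trailing boolean isInterface argument here
        il.add(new MethodInsnNode(INVOKEVIRTUAL, "java/io/PrintStream", "println", "(Ljava/lang/String;)V"));
        // Insert the new instructions at the beginning of the method body
        insns.insert(il);
        mn.maxStack += 3;
    }
    ClassWriter cw = new ClassWriter(0);
    cn.accept(cw);
    byte[] b = cw.toByteArray();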
The ClassWriter yields a byte array that contains the enhanced byte code. This byte code can be written back to disk or handed directly to a class loader. The enhancement logic in this example is fairly simple: it iterates over all the methods of the Java class and adds a call to System.out.println to each of them. In byte code, the body of a Java method consists of a sequence of instructions. What we need to do is generate the instructions that call System.out.println and insert them at the beginning of that sequence. ASM provides abstractions for these instructions, but becoming familiar with all of them is difficult. ASM therefore offers a tool class, ASMifierClassVisitor, which prints the structure of the byte code of a Java class. When you need to enhance a class, you can first make the change in the source code, then use this tool class to compare the byte code before and after the change, and from the differences work out how to write the enhancement code.
Class files are enhanced after the Java source code has been compiled and before the JVM executes the byte code. Common approaches include:
Let the IDE do it as part of compilation. For example, the Google App Engine plug-in for Eclipse runs DataNucleus after compilation to enhance entity classes.
Do it during the build process, for example by having Ant or Maven perform the enhancement.
Implement your own Java class loader. After obtaining the byte code of a Java class, enhance it first and then define the Java class from the modified byte code.
Use the java.lang.instrument package introduced in JDK 5.
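The class loader approach from the list above can be sketched as follows. This is only an illustration under stated assumptions: `EnhancingClassLoader` and its `enhance` hook are hypothetical names, the hook returns the byte code unchanged (a real implementation would apply a library such as ASM there), and only one named class is intercepted while everything else is delegated to a non-null parent loader.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

public class EnhancingClassLoader extends ClassLoader {
    private final String targetClass;

    public EnhancingClassLoader(ClassLoader parent, String targetClass) {
        super(parent);
        this.targetClass = targetClass;
    }

    // Hypothetical hook: subclasses would modify the byte code here; this sketch leaves it as is
    protected byte[] enhance(String name, byte[] original) {
        return original;
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        if (!name.equals(targetClass)) {
            // Everything else follows the normal parent-first delegation
            return super.loadClass(name, resolve);
        }
        synchronized (getClassLoadingLock(name)) {
            Class<?> c = findLoadedClass(name);
            if (c == null) {
                // Obtain the byte code, enhance it, then define the class from the result
                byte[] bytes = enhance(name, readBytes(name));
                c = defineClass(name, bytes, 0, bytes.length);
            }
            if (resolve) {
                resolveClass(c);
            }
            return c;
        }
    }

    // Reads the original class file through the parent loader's resources
    private byte[] readBytes(String name) throws ClassNotFoundException {
        String resource = name.replace('.', '/') + ".class";
        try (InputStream in = getParent().getResourceAsStream(resource)) {
            if (in == null) {
                throw new ClassNotFoundException(name);
            }
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[4096];
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new ClassNotFoundException(name, e);
        }
    }
}
```

A class defined this way is distinct from the class of the same name loaded by the parent loader, so it is normally used through reflection or through an interface that the parent loader provides.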
java.lang.instrument
Because modifying Java byte code is such a common requirement, JDK 5 introduced the java.lang.instrument package, which was further enhanced in JDK 6. The basic idea is to add one or more agents when the JVM starts. Each agent is a JAR file whose manifest specifies an agent class, and this class contains a premain method. At startup, the JVM first executes the premain method of the agent class and only then the main method of the Java program itself. The premain method is where the program's byte code can be modified. JDK 6 also allows agents to be added dynamically after the JVM has started. The java.lang.instrument package supports two kinds of modification. One is redefining a Java class, that is, completely replacing the byte code of the class. The other is transforming an existing Java class, which corresponds to the byte code enhancement discussed above. Taking transformation as an example, we need to implement the java.lang.instrument.ClassFileTransformer interface to transform existing Java classes.
    static class MethodEntryTransformer implements ClassFileTransformer {
        public byte[] transform(ClassLoader loader, String className,
                Class<?> classBeingRedefined, ProtectionDomain protectionDomain, byte[] classfileBuffer)
                throws IllegalClassFormatException {
            try {
                ClassReader cr = new ClassReader(classfileBuffer);
                ClassNode cn = new ClassNode();
                // The code for transforming the byte code with ASM is omitted
                ClassWriter cw = new ClassWriter(0);
                cn.accept(cw);
                return cw.toByteArray();
            } catch (Exception e) {
                // Returning null tells the JVM to keep the original byte code
                return null;
            }
        }
    }
This transformer class can then be registered in the agent's premain method.
    public static void premain(String args, Instrumentation inst) {
        inst.addTransformer(new MethodEntryTransformer());
    }
Package the agent class into a JAR file and declare the name of the agent class with the Premain-Class attribute in the JAR's manifest file. When running the Java program, add the JVM startup parameter -javaagent:myagent.jar. The JVM then performs the transformation before it loads the byte code of each Java class.
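To try out the agent mechanism without depending on ASM, the following sketch shows a complete agent class that only logs the classes being loaded and leaves their byte code untouched. The names `LoggingAgent` and `NameLoggingTransformer` and the package prefix `com/example/` are assumptions chosen for illustration.

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;

public class LoggingAgent {

    static class NameLoggingTransformer implements ClassFileTransformer {
        @Override
        public byte[] transform(ClassLoader loader, String className,
                Class<?> classBeingRedefined, ProtectionDomain protectionDomain,
                byte[] classfileBuffer) {
            // Class names are passed in internal form, with '/' instead of '.'
            if (className != null && className.startsWith("com/example/")) {
                System.err.println("Loading " + className + " (" + classfileBuffer.length + " bytes)");
            }
            // Returning null means that no transformation was performed
            return null;
        }
    }

    // Called by the JVM before the main method of the application
    public static void premain(String agentArgs, Instrumentation inst) {
        inst.addTransformer(new NameLoggingTransformer());
    }
}
```

For this sketch, the manifest of myagent.jar would contain the line `Premain-Class: LoggingAgent`, and the application would be started with `java -javaagent:myagent.jar` followed by its usual arguments.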
Summary
Manipulating Java byte code is an interesting topic. It makes it easy to modify Java programs that are distributed in binary form, and it is well suited to tasks such as performance analysis, debugging and tracing, and logging. Another important benefit is that it frees developers from tedious Java syntax: they only need to write the code that matters for the business logic, while code that exists purely for syntactic reasons or follows a fixed pattern can be generated dynamically as byte code. Byte code enhancement differs from source code generation. Generated source code becomes part of the program and has to be maintained by developers, who must either edit it by hand or regenerate it. Byte code enhancement, in contrast, is completely transparent to developers. Used appropriately, Java byte code manipulation is a good solution to a certain class of development problems.