As one of the most widely used languages in the industry, Java has earned the support of many software vendors and developers, and JCP members, including Oracle, actively drive its development. A deep understanding of the Java language, however, is a topic that few people engage with. InfoQ China invited Fu Cheng, a senior engineer at IBM, to write the "Java Deep Adventures" column and share his experience with some of Java's deeper and more advanced features.
In ordinary Java application development, the way developers use Java is fairly simple: open a familiar IDE, write the Java source code, and run the program with the functionality the IDE provides. Behind this workflow, the developer writes Java source files (.java), the IDE invokes the Java compiler to compile the source into platform-independent byte code, and the byte code is stored on disk as class files (.class). The Java virtual machine (JVM) loads and executes the byte code. This is how Java achieves its goal of "write once, run anywhere": the byte code in a class file can be used by JVMs on different platforms. Java byte code does not have to live in files on disk. It can also be obtained over the network, or exist only in memory. A class loader in the JVM defines a Java class from a byte array (byte[]) that contains the byte code. In some cases it is necessary to generate Java byte code dynamically or to modify existing byte code, and that is where the techniques introduced in this article come in. We start with how to compile Java source files dynamically.
Dynamically compiling Java source files
In general, developers write all of the Java source code and compile it successfully before the program runs. For some applications, however, the content of the Java source code is known only at run time. The source code then has to be compiled dynamically into byte code, which the JVM loads and executes. A typical scenario is the online judge system used by many algorithm contests (such as PKU JudgeOnline), which lets users upload Java code that the system compiles, runs, and judges in the background. Compiling Java source dynamically means invoking the Java compiler directly from within the program.
JSR 199 introduced the Java Compiler API. With JDK 6 or later, you can use this API to compile Java code dynamically. For example, the following code compiles the simplest Hello World class, whose source code is held in a string.
    import java.io.IOException;
    import java.net.URI;
    import java.net.URISyntaxException;
    import java.util.Arrays;
    import javax.tools.JavaCompiler;
    import javax.tools.JavaCompiler.CompilationTask;
    import javax.tools.JavaFileObject;
    import javax.tools.SimpleJavaFileObject;
    import javax.tools.StandardJavaFileManager;
    import javax.tools.ToolProvider;

    public class CompilerTest {
        public static void main(String[] args) throws Exception {
            String source = "public class Main { public static void main(String[] args) { System.out.println(\"Hello world!\"); } }";
            JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
            StandardJavaFileManager fileManager = compiler.getStandardFileManager(null, null, null);
            StringSourceJavaObject sourceObject = new CompilerTest.StringSourceJavaObject("Main", source);
            Iterable<? extends JavaFileObject> fileObjects = Arrays.asList(sourceObject);
            CompilationTask task = compiler.getTask(null, fileManager, null, null, null, fileObjects);
            boolean result = task.call();
            if (result) {
                System.out.println("Compilation succeeded.");
            }
        }

        // A JavaFileObject whose source code comes from a string rather than a file
        static class StringSourceJavaObject extends SimpleJavaFileObject {

            private String content = null;

            public StringSourceJavaObject(String name, String content) throws URISyntaxException {
                super(URI.create("string:///" + name.replace('.', '/') + Kind.SOURCE.extension), Kind.SOURCE);
                this.content = content;
            }

            public CharSequence getCharContent(boolean ignoreEncodingErrors) throws IOException {
                return content;
            }
        }
    }
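In a scenario such as the online judge, a failed compilation usually has to be reported back to the user. The following is a minimal sketch, not part of the original example, that passes a DiagnosticCollector from the same javax.tools API to the compiler and returns what it reported. The class name DiagnosticsDemo, the helper StringSource, and the deliberately broken class Broken are hypothetical names chosen for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.util.Collections;
import java.util.List;
import java.util.Locale;
import javax.tools.Diagnostic;
import javax.tools.DiagnosticCollector;
import javax.tools.JavaCompiler;
import javax.tools.JavaFileObject;
import javax.tools.SimpleJavaFileObject;
import javax.tools.StandardJavaFileManager;
import javax.tools.ToolProvider;

final class DiagnosticsDemo {

    /** Java source code held in a string (hypothetical helper). */
    static final class StringSource extends SimpleJavaFileObject {
        private final String content;

        StringSource(String name, String content) {
            super(URI.create("string:///" + name.replace('.', '/') + Kind.SOURCE.extension), Kind.SOURCE);
            this.content = content;
        }

        @Override
        public CharSequence getCharContent(boolean ignoreEncodingErrors) {
            return content;
        }
    }

    /** Compiles one class held in a string and returns everything the compiler reported. */
    static List<Diagnostic<? extends JavaFileObject>> compile(String className, String source) {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        if (compiler == null) {
            throw new IllegalStateException("No system Java compiler; a JDK is required");
        }
        DiagnosticCollector<JavaFileObject> diagnostics = new DiagnosticCollector<>();
        try (StandardJavaFileManager fileManager =
                compiler.getStandardFileManager(diagnostics, null, null)) {
            // Errors and warnings go to the collector instead of System.err.
            // A successful compilation still writes the class file to disk.
            compiler.getTask(null, fileManager, diagnostics, null, null,
                    Collections.singletonList(new StringSource(className, source))).call();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return diagnostics.getDiagnostics();
    }

    public static void main(String[] args) {
        // The method returns a String where an int is expected, so compilation fails.
        String broken = "public class Broken { int f() { return \"text\"; } }";
        for (Diagnostic<? extends JavaFileObject> d : compile("Broken", broken)) {
            System.out.println(d.getKind() + " at line " + d.getLineNumber()
                    + ": " + d.getMessage(Locale.ENGLISH));
        }
    }
}
```

Each Diagnostic carries the kind, position, and message of one problem, which is typically enough to show the user where the uploaded code went wrong.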
If you cannot use the Java Compiler API that ships with JDK 6, you can use the tool class com.sun.tools.javac.Main in the JDK. This class, however, can only compile files stored on disk, much like running the javac command directly.
Another option is the compiler provided by Eclipse JDT Core. It is an incremental Java compiler that the Eclipse Java development environment uses, and it supports running and debugging code that contains errors. The compiler can also be used standalone. The Play framework uses the JDT compiler internally to compile Java source code dynamically. In development mode, Play periodically scans the Java source files in the project and recompiles them as soon as it detects changes, so after modifying the code you only need to refresh the page to see the result. When you use these dynamic compilation methods, make sure that the JDK's tools.jar is on your application's classpath (this applies to JDK 8 and earlier; later JDKs no longer ship tools.jar).
Consider evaluating an arithmetic expression in Java, such as (3+4)*7-10. The usual approach is to parse the input expression and simulate the calculation yourself. Because of parentheses and operator precedence, such code can be complex and error-prone. Another approach is to use the scripting language support introduced by JSR 223 and execute the input expression directly as JavaScript or JavaFX Script to get the result. The code below takes a third approach: it generates Java source code dynamically, compiles it, then loads the resulting Java class and executes it to obtain the result. This approach uses nothing but Java.
    private static double calculate(String expr) throws CalculationException {
        String className = "CalculatorMain";
        String methodName = "calculate";
        String source = "public class " + className
            + " { public static double " + methodName + "() { return " + expr + "; } }";
        // The code for dynamically compiling the Java source is omitted; see the previous section
        boolean result = task.call();
        if (result) {
            ClassLoader loader = Calculator.class.getClassLoader();
            try {
                Class<?> clazz = loader.loadClass(className);
                Method method = clazz.getMethod(methodName, new Class<?>[] {});
                // The generated method is static, so no instance is needed
                Object value = method.invoke(null, new Object[] {});
                return (Double) value;
            } catch (Exception e) {
                throw new CalculationException("Internal error.");
            }
        } else {
            throw new CalculationException("Error expression.");
        }
    }
The code above shows the basic pattern for using dynamically generated Java byte code: load the byte code through a class loader, obtain the Java class (creating an instance of it if necessary), and invoke its methods through the Java Reflection API.
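The calculator fragment leaves out the compilation step and relies on the compiled class file being visible to an existing class loader. As a sketch of one way to fill that gap, the following self-contained variant keeps the compiler output in memory and defines the class directly from the byte array. The names InMemoryCalculator, StringSource, and ByteCode are hypothetical, and capturing the output through a ForwardingJavaFileManager is an assumption of this sketch rather than the approach taken by the original code.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.net.URI;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import javax.tools.FileObject;
import javax.tools.ForwardingJavaFileManager;
import javax.tools.JavaCompiler;
import javax.tools.JavaFileManager;
import javax.tools.JavaFileObject;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;

final class InMemoryCalculator {

    /** Java source code held in a string. */
    static final class StringSource extends SimpleJavaFileObject {
        private final String content;

        StringSource(String name, String content) {
            super(URI.create("string:///" + name.replace('.', '/') + Kind.SOURCE.extension), Kind.SOURCE);
            this.content = content;
        }

        @Override
        public CharSequence getCharContent(boolean ignoreEncodingErrors) {
            return content;
        }
    }

    /** Compiler output (a class file) captured in a byte array instead of on disk. */
    static final class ByteCode extends SimpleJavaFileObject {
        final ByteArrayOutputStream bytes = new ByteArrayOutputStream();

        ByteCode(String name) {
            super(URI.create("bytes:///" + name.replace('.', '/') + Kind.CLASS.extension), Kind.CLASS);
        }

        @Override
        public OutputStream openOutputStream() {
            return bytes;
        }
    }

    static double calculate(String expr) {
        final String className = "CalculatorMain";
        String source = "public class " + className
                + " { public static double calculate() { return " + expr + "; } }";
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        if (compiler == null) {
            throw new IllegalStateException("No system Java compiler; a JDK is required");
        }
        final Map<String, ByteCode> output = new HashMap<>();
        // Redirect every class file the compiler produces into the map above.
        JavaFileManager fileManager = new ForwardingJavaFileManager<JavaFileManager>(
                compiler.getStandardFileManager(null, null, null)) {
            @Override
            public JavaFileObject getJavaFileForOutput(Location location, String name,
                    JavaFileObject.Kind kind, FileObject sibling) {
                ByteCode byteCode = new ByteCode(name);
                output.put(name, byteCode);
                return byteCode;
            }
        };
        boolean ok = compiler.getTask(null, fileManager, null, null, null,
                Collections.singletonList(new StringSource(className, source))).call();
        if (!ok) {
            throw new IllegalArgumentException("Error expression: " + expr);
        }
        // Define classes straight from the captured byte arrays.
        ClassLoader loader = new ClassLoader(InMemoryCalculator.class.getClassLoader()) {
            @Override
            protected Class<?> findClass(String name) throws ClassNotFoundException {
                ByteCode byteCode = output.get(name);
                if (byteCode == null) {
                    throw new ClassNotFoundException(name);
                }
                byte[] b = byteCode.bytes.toByteArray();
                return defineClass(name, b, 0, b.length);
            }
        };
        try {
            return (Double) loader.loadClass(className).getMethod("calculate").invoke(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Internal error", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(calculate("(3+4)*7-10")); // prints 39.0
    }
}
```

Because the expression is compiled and run as ordinary Java code, this pattern should only be given input that is trusted or sandboxed.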
Java byte code enhancement
Java byte code enhancement means modifying Java byte code after it has been generated in order to extend its functionality. It is equivalent to modifying the application's binaries, and many Java frameworks rely on it. Byte code enhancement is typically combined with annotations in the Java source files: the annotations declare the behavior to be enhanced and the related metadata, and the framework enhances the byte code at run time. Enhancement has many applications, most of them aimed at reducing redundant code and shielding developers from low-level implementation details. Anyone who has written JavaBeans knows how tedious and hard to maintain the required getter and setter methods are. With byte code enhancement, developers only need to declare the properties of the bean, and the getters and setters can be added automatically by modifying the byte code. Anyone who has used JPA will have noticed while debugging that extra fields and methods have been added to the entity classes. The JPA implementation adds these fields and methods dynamically at run time. Byte code enhancement is also used by some implementations of aspect-oriented programming (AOP).
Before discussing how to enhance byte code, we first describe how the byte code that represents a Java class or interface is organized.
class file { 0xCAFEBABE, minor version, major version, constant pool size, constant pool array, access flags, this class, super class, number of implemented interfaces, array of implemented interfaces, number of fields, array of fields, number of methods, array of methods, number of attributes, array of attributes }
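The first few fields of this layout have a fixed size and can be read directly from the bytes of any class file. The following sketch reads the magic number, the two version numbers, and the constant pool size. The class name ClassHeader and its field names are hypothetical, chosen only for illustration.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.file.Files;
import java.nio.file.Paths;

final class ClassHeader {
    final int minorVersion;
    final int majorVersion;
    final int constantPoolCount;

    private ClassHeader(int minorVersion, int majorVersion, int constantPoolCount) {
        this.minorVersion = minorVersion;
        this.majorVersion = majorVersion;
        this.constantPoolCount = constantPoolCount;
    }

    /** Parses the fixed-size fields at the start of a class file. */
    static ClassHeader parse(byte[] classBytes) {
        // ByteBuffer is big-endian by default, which matches the class file format.
        ByteBuffer in = ByteBuffer.wrap(classBytes);
        if (in.getInt() != 0xCAFEBABE) {
            throw new IllegalArgumentException("Not a class file: bad magic number");
        }
        int minor = in.getShort() & 0xFFFF;             // unsigned 16-bit value
        int major = in.getShort() & 0xFFFF;             // for example, 52 for Java 8
        int constantPoolCount = in.getShort() & 0xFFFF; // number of pool entries plus one
        return new ClassHeader(minor, major, constantPoolCount);
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the path to any .class file
        ClassHeader h = parse(Files.readAllBytes(Paths.get(args[0])));
        System.out.println("version " + h.majorVersion + "." + h.minorVersion
                + ", constant pool count " + h.constantPoolCount);
    }
}
```

Everything after the constant pool size is variable in length, which is why libraries are normally used instead of parsing by hand.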
As the layout shows, the byte code of a class or interface uses a loosely organized structure whose contents are arranged sequentially. Content that can have multiple entries, such as implemented interfaces, fields, methods, and attributes, is represented as an array, and the number of entries precedes the array. Different kinds of content have different internal structures. For developers, directly manipulating a byte array that contains byte code is inefficient and error-prone. Many open source libraries can modify byte code or create the byte code of a new Java class from scratch, including ASM, cglib, serp, and BCEL. Using them reduces the complexity of enhancing byte code to some extent. Consider a simple requirement: print a log message before each method of a Java class executes. Those familiar with AOP will recognize this as a job for before advice. With ASM, the code looks like this:
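    // Uses org.objectweb.asm.* and org.objectweb.asm.tree.*, with the constants
    // from org.objectweb.asm.Opcodes; "is" is an InputStream over the class file
    ClassReader cr = new ClassReader(is);
    ClassNode cn = new ClassNode();
    cr.accept(cn, 0);
    for (Object object : cn.methods) {
        MethodNode mn = (MethodNode) object;
        // Skip constructors and static initializers
        if ("<init>".equals(mn.name) || "<clinit>".equals(mn.name)) {
            continue;
        }
        // Abstract and native methods have no body to insert instructions into
        if ((mn.access & (ACC_ABSTRACT | ACC_NATIVE)) != 0) {
            continue;
        }
        InsnList insns = mn.instructions;
        InsnList il = new InsnList();
        // Equivalent to System.out.println("Enter method " + name)
        il.add(new FieldInsnNode(GETSTATIC, "java/lang/System", "out", "Ljava/io/PrintStream;"));
        il.add(new LdcInsnNode("Enter method " + mn.name));
        il.add(new MethodInsnNode(INVOKEVIRTUAL, "java/io/PrintStream", "println", "(Ljava/lang/String;)V"));
        insns.insert(il);
        mn.maxStack += 3;
    }
    ClassWriter cw = new ClassWriter(0);
    cn.accept(cw);
    byte[] b = cw.toByteArray();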
From the ClassWriter you obtain a byte array containing the enhanced byte code, which you can write back to disk or hand directly to a class loader. The enhancement logic in this example is simple: it iterates over all methods in the Java class and adds a call to System.out.println. In byte code, the body of a Java method is a sequence of instructions, so all that is needed is to generate the instructions that call System.out.println and insert them at the beginning of the instruction list. ASM provides abstractions for these instructions, but it is hard to be familiar with all of them. ASM offers a tool class, ASMifierClassVisitor (ASMifier in ASM 4 and later), that prints out the structure of a Java class's byte code. When you need to enhance a class, you can first make the change in the source code, then use this tool to compare the byte code before and after the change to work out how to write the enhancement code.
A class file must be enhanced after the Java source code is compiled and before the JVM executes it. Common approaches are:
- Have the IDE do it after compilation completes. For example, the Eclipse plugin for Google App Engine runs DataNucleus after compiling to enhance the entity classes.
- Do it during the build process, for example with Ant or Maven.
- Implement your own Java class loader. After obtaining the byte code of a Java class, enhance it, then define the Java class from the modified byte code.
- Use the java.lang.instrument package introduced in JDK 5.
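The class loader approach from the list above can be sketched as follows. The loader reads the original byte code of one chosen class, passes it through an enhancer function, and defines the class from the result. The names EnhancingClassLoader, Target, and load are hypothetical, and the enhancer used here is the identity function standing in for a real transformation such as the ASM code shown earlier.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.function.UnaryOperator;

final class EnhancingClassLoader extends ClassLoader {

    /** Hypothetical class to be enhanced; stands in for an application class. */
    public static class Target {
        public static String greet() {
            return "hello";
        }
    }

    private final String targetName;
    private final UnaryOperator<byte[]> enhancer;

    EnhancingClassLoader(ClassLoader parent, String targetName, UnaryOperator<byte[]> enhancer) {
        super(parent);
        this.targetName = targetName;
        this.enhancer = enhancer;
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        if (!name.equals(targetName)) {
            return super.loadClass(name, resolve); // everything else: normal parent delegation
        }
        synchronized (getClassLoadingLock(name)) {
            Class<?> c = findLoadedClass(name);
            if (c == null) {
                // Obtain the byte code, enhance it, then define the class from the result.
                byte[] enhanced = enhancer.apply(readClassBytes(name));
                c = defineClass(name, enhanced, 0, enhanced.length);
            }
            if (resolve) {
                resolveClass(c);
            }
            return c;
        }
    }

    /** Reads the unmodified class file through the parent loader's resources. */
    private byte[] readClassBytes(String name) throws ClassNotFoundException {
        String resource = name.replace('.', '/') + ".class";
        try (InputStream in = getParent().getResourceAsStream(resource)) {
            if (in == null) {
                throw new ClassNotFoundException(name);
            }
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[4096];
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new ClassNotFoundException(name, e);
        }
    }

    /** Convenience wrapper that hides the checked exception. */
    static Class<?> load(String className, UnaryOperator<byte[]> enhancer) {
        ClassLoader parent = EnhancingClassLoader.class.getClassLoader();
        try {
            return new EnhancingClassLoader(parent, className, enhancer).loadClass(className);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) throws Exception {
        Class<?> enhanced = load(Target.class.getName(), bytes -> {
            System.out.println("enhancing " + bytes.length + " bytes");
            return bytes; // identity: a real enhancer would rewrite the byte code here
        });
        System.out.println(enhanced.getMethod("greet").invoke(null));
    }
}
```

The class defined this way is distinct from the copy loaded by the parent class loader, so code that should see the enhanced behavior has to reach the class through the enhancing loader.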
java.lang.instrument
Because of the many requirements for modifying Java byte code, JDK 5 introduced the java.lang.instrument package, and JDK 6 enhanced it further. The basic idea is to add agents when the JVM starts. Each agent is a JAR file whose manifest specifies an agent class, and this class contains a premain method. At startup the JVM first executes the premain method of the agent class and then the main method of the Java program itself. Inside premain, the byte code of the program can be modified. JDK 6 also allows agents to be added dynamically after the JVM has started. The java.lang.instrument package supports two kinds of modification: redefining a Java class, which replaces its byte code completely, and transforming an existing Java class, which corresponds to the byte code enhancement described earlier. Taking the method-entry logging scenario from above as an example, we first implement the java.lang.instrument.ClassFileTransformer interface to transform existing Java classes.
    static class MethodEntryTransformer implements ClassFileTransformer {
        public byte[] transform(ClassLoader loader, String className,
                Class<?> classBeingRedefined, ProtectionDomain protectionDomain, byte[] classfileBuffer)
                throws IllegalClassFormatException {
            try {
                ClassReader cr = new ClassReader(classfileBuffer);
                ClassNode cn = new ClassNode();
                cr.accept(cn, 0);
                // The code that uses ASM to transform the byte code is omitted
                ClassWriter cw = new ClassWriter(0);
                cn.accept(cw);
                return cw.toByteArray();
            } catch (Exception e) {
                // Returning null tells the JVM to keep the class unchanged
                return null;
            }
        }
    }
With this transformer class in place, you can register it in the agent's premain method.
    public static void premain(String args, Instrumentation inst) {
        inst.addTransformer(new MethodEntryTransformer());
    }
Package the agent class into a JAR file and declare the name of the agent class in the JAR manifest with the Premain-Class attribute. When running the Java program, add the JVM startup option -javaagent:myagent.jar. The JVM then performs the transformation before it loads the byte code of each Java class.
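To try the agent mechanism without any byte code library, the following sketch registers a transformer that only records the names of the classes passed to it and leaves them unchanged. The names LoggingAgent and RecordingTransformer are hypothetical. The agentmain method and the Agent-Class manifest attribute, which the text above does not name, are the standard entry point and attribute for agents added after JVM startup.

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Manifest entries assumed for the agent JAR:
//   Premain-Class: LoggingAgent   (agent passed with -javaagent at startup)
//   Agent-Class: LoggingAgent     (agent attached to a running JVM)
// Left package-private to keep the sketch in one snippet; an agent class
// is usually declared public in its own source file.
final class LoggingAgent {

    /** Records every class name offered for transformation and changes nothing. */
    static final class RecordingTransformer implements ClassFileTransformer {
        final List<String> seen = new CopyOnWriteArrayList<>();

        @Override
        public byte[] transform(ClassLoader loader, String className,
                Class<?> classBeingRedefined, ProtectionDomain protectionDomain,
                byte[] classfileBuffer) {
            seen.add(className); // internal form, for example "java/lang/String"
            return null;         // null means the class is left unchanged
        }
    }

    /** Called by the JVM before the application's main method. */
    public static void premain(String args, Instrumentation inst) {
        inst.addTransformer(new RecordingTransformer());
    }

    /** Called by the JVM when the agent is attached after startup. */
    public static void agentmain(String args, Instrumentation inst) {
        inst.addTransformer(new RecordingTransformer());
    }
}
```

Replacing the body of transform with the ASM code shown earlier turns this into the method-entry logging agent described in this section.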
Summary
Manipulating Java byte code is interesting work. It makes it possible to modify Java programs that are distributed in binary form, and it suits tasks such as performance profiling, debug tracing, and logging. Another important use is freeing developers from tedious Java syntax. Developers should only have to write the important code related to business logic. Code that exists only because the syntax demands it, or that follows a fixed pattern, can be generated dynamically as byte code. Byte code enhancement and source code generation are different concepts. Generated source code becomes part of the program and must be maintained by the developer, either by editing it manually or by regenerating it. Byte code enhancement, in contrast, is completely transparent to the developer. Used properly, Java byte code manipulation can solve a certain class of development problems well.