Abstract:
This article describes how to apply dynamic code generation in ordinary Java programs, and tests and compares the performance of several implementation approaches.
Outline:
I. Overview / II. Expression Calculator / III. Interpretation Method
IV. Resolution Method / V. Compilation Method / VI. Generation Method / VII. Performance and Application
Body:
I. Overview
Java's performance is often criticized, and some people believe that Java programs cannot compete with C or C++ programs. In response, a great deal of work has gone into Java performance optimization, especially runtime optimization, and this has answered much of the criticism. However, no matter how much Java's performance improves, the appetite for faster code never ends.
Clearly, some Java operations cannot match C/C++ in performance. This follows from characteristics of the Java language itself, for example its use of an intermediate language (bytecode) to achieve platform independence. On the other hand, because Java has many distinctive features, it can use optimization techniques that are hard to apply in many other languages. Dynamic code generation is one of them.
Dynamic code generation means that a program generates code at runtime. The generated code runs in the same JVM as the program that generated it and is accessed in much the same way as ordinary code. Like other optimization techniques, dynamic code generation suits only certain kinds of tasks.
JSP is perhaps the most familiar example of dynamic code generation. A servlet engine dispatches client requests to servlets for processing, but a servlet is a static structure: it must be compiled and configured before the server starts. Although servlets have many advantages, they are somewhat lacking in flexibility. JSP removes this restriction by allowing servlets to be created dynamically from JSP files at runtime.
When a client requests a JSP file, the servlet engine forwards the request to the JSP engine, which processes the JSP file and returns the result. A JSP file is a textual description of a series of actions, and the result of executing those actions is the page returned to the user. Clearly, interpreting the JSP page every time a user request arrives would be relatively expensive. The JSP engine therefore compiles each JSP page into a dynamically created servlet. When the JSP page changes, the JSP engine dynamically creates a new servlet.
Here the advantage of dynamic code generation is obvious: it meets the need for flexibility without hurting performance much. The system's behavior does not have to be completely fixed when the servlets are compiled, or even when the server starts. At the same time, because the JSP file does not have to be interpreted for every request, response time is reduced.
II. Expression Calculator
Let's look at how to use dynamic code generation in an ordinary Java program. The example in this article is a simple arithmetic expression calculator that evaluates postfix (suffix) expressions such as "4 $0 + $1 *", where $0 and $1 denote variable 0 and variable 1. An expression may contain three kinds of symbols: variables, constants, and operators.
A postfix expression is evaluated with a stack, processing the symbols from left to right. Taking the expression above as an example: first push 4 and the value of variable 0 onto the stack. The next symbol is the operator "+", so we add the two values at the top of the stack (4 and the value of $0) and replace them with the sum. Then we push the value of variable 1. The next symbol is the operator "*", so we multiply the two values at the top of the stack. Converted into ordinary algebraic notation (that is, an infix expression), this expression is "(4 + $0) * $1". If the two variables have the values [3, 6], the expression evaluates to (4 + 3) * 6 = 42.
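The walk-through above can be sketched as a few lines of Java. This is only an illustration of the stack discipline, not one of the calculators measured in this article; the class name PostfixTrace is hypothetical, and an ArrayDeque is assumed as the stack.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class PostfixTrace {
    /** Evaluates a space-separated postfix expression against the given variable values. */
    static int evaluate(String expression, int[] vars) {
        Deque<Integer> stack = new ArrayDeque<>();
        for (String tok : expression.trim().split("\\s+")) {
            if (tok.startsWith("$")) {
                // variable: push the value of the numbered variable
                stack.push(vars[Integer.parseInt(tok.substring(1))]);
            } else if (tok.length() == 1 && "+-*/".indexOf(tok.charAt(0)) >= 0) {
                // operator: replace the two topmost values with the result
                int right = stack.pop();
                int left = stack.pop();
                switch (tok.charAt(0)) {
                    case '+': stack.push(left + right); break;
                    case '-': stack.push(left - right); break;
                    case '*': stack.push(left * right); break;
                    default:  stack.push(left / right); break;
                }
            } else {
                // constant
                stack.push(Integer.parseInt(tok));
            }
            // show the stack after each symbol (top of the stack is printed first)
            System.out.println(tok + " -> " + stack);
        }
        return stack.pop();
    }

    public static void main(String[] args) {
        // (4 + 3) * 6
        System.out.println(evaluate("4 $0 + $1 *", new int[] {3, 6})); // prints 42
    }
}
```

Running main prints the stack contents after each symbol, which makes it easy to follow the push and replace steps described in the text.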
To compare the performance differences between dynamic code generation and conventional programming, we will implement the expression calculator in different ways, and then test the performance of each calculator.
All expression calculators in this article implement the Calculator interface, either directly or through the objects they produce. The Calculator interface has a single method, evaluate. Its parameter is an integer array holding the variable values, and its return value is an integer holding the result of the calculation.
// Calculator.java
public interface Calculator {
    int evaluate(int[] arguments);
}
III. Interpretation Method
First, let's look at a simple but inefficient expression calculator that uses a Stack object to evaluate the expression. The expression is analyzed again on every evaluation, so this approach can be called the interpretation method. However, splitting the expression into symbols is done only once, when the object is created, which avoids the overhead of the StringTokenizer class on each evaluation.
// SimpleCalculator.java
import java.util.ArrayList;
import java.util.List;
import java.util.Stack;
import java.util.StringTokenizer;

public class SimpleCalculator implements Calculator {
    private final String[] _toks; // symbol list

    public SimpleCalculator(String expression) {
        // Build the symbol list once, at construction time
        List<String> list = new ArrayList<>();
        StringTokenizer tokenizer = new StringTokenizer(expression);
        while (tokenizer.hasMoreTokens()) {
            list.add(tokenizer.nextToken());
        }
        _toks = list.toArray(new String[0]);
    }

    // Substitute the variable values into the expression,
    // then return the result of evaluating it
    public int evaluate(int[] args) {
        Stack<Integer> stack = new Stack<>();
        for (int i = 0; i < _toks.length; i++) {
            String tok = _toks[i];
            if (tok.startsWith("$")) {
                // variable
                int varnum = Integer.parseInt(tok.substring(1));
                stack.push(args[varnum]);
            } else {
                char opchar = tok.charAt(0);
                int op = "+-*/".indexOf(opchar);
                if (op == -1) {
                    // constant
                    stack.push(Integer.valueOf(tok));
                } else {
                    // operator: apply it to the two values at the top of the stack
                    int arg2 = stack.pop();
                    int arg1 = stack.pop();
                    switch (op) {
                        case 0:
                            stack.push(arg1 + arg2);
                            break;
                        case 1:
                            stack.push(arg1 - arg2);
                            break;
                        case 2:
                            stack.push(arg1 * arg2);
                            break;
                        case 3:
                            stack.push(arg1 / arg2);
                            break;
                        default:
                            throw new RuntimeException("Illegal operator: " + tok);
                    }
                }
            }
        }
        return stack.pop();
    }
}
The performance test data later in this article show that this way of evaluating expressions is quite inefficient. It may be adequate for expressions that are evaluated only occasionally, but better approaches are available.
IV. Resolution Method
If an expression is evaluated often, a better approach is to parse it first and apply the Composite design pattern to build an expression tree. We call this the resolution (parsing) method. As the following code shows, the structure of the tree represents the computing logic of the expression, so that logic does not have to be analyzed again on every evaluation.
// CalculatorParser.java
import java.util.Stack;
import java.util.StringTokenizer;

public class CalculatorParser {
    public Calculator parse(String expression) {
        // Analyze the expression and build the tree
        // made up of the symbols of the expression
        Stack<Calculator> stack = new Stack<>();
        StringTokenizer toks = new StringTokenizer(expression);
        while (toks.hasMoreTokens()) {
            String tok = toks.nextToken();
            if (tok.startsWith("$")) {
                // a variable starts with '$'
                int varnum = Integer.parseInt(tok.substring(1));
                stack.push(new VariableValue(varnum));
            } else {
                int op = "+-*/".indexOf(tok.charAt(0));
                if (op == -1) {
                    // constant
                    int val = Integer.parseInt(tok);
                    stack.push(new ConstantValue(val));
                } else {
                    // operator
                    Calculator node2 = stack.pop();
                    Calculator node1 = stack.pop();
                    stack.push(new Operation(tok.charAt(0), node1, node2));
                }
            }
        }
        return stack.pop();
    }

    // constant
    static class ConstantValue implements Calculator {
        private final int _value;

        ConstantValue(int value) {
            _value = value;
        }

        public int evaluate(int[] args) {
            return _value;
        }
    }

    // variable
    static class VariableValue implements Calculator {
        private final int _varnum;

        VariableValue(int varnum) {
            _varnum = varnum;
        }

        public int evaluate(int[] args) {
            return args[_varnum];
        }
    }

    // operator
    static class Operation implements Calculator {
        final char _op;
        final Calculator _arg1;
        final Calculator _arg2;

        Operation(char op, Calculator arg1, Calculator arg2) {
            _op = op;
            _arg1 = arg1;
            _arg2 = arg2;
        }

        public int evaluate(int[] args) {
            int val1 = _arg1.evaluate(args);
            int val2 = _arg2.evaluate(args);
            if (_op == '+') {
                return val1 + val2;
            } else if (_op == '-') {
                return val1 - val2;
            } else if (_op == '*') {
                return val1 * val2;
            } else if (_op == '/') {
                return val1 / val2;
            } else {
                throw new RuntimeException("Illegal operator: " + _op);
            }
        }
    }
}
Because the calculation logic of the expression has been parsed in advance, CalculatorParser performs significantly better than the first calculator, which interprets the expression on every call. However, we can use dynamic code generation to optimize the code further.
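The "parse once, evaluate many times" idea can be shown in a compact, self-contained form. The following is a minimal sketch, not the article's CalculatorParser: the names TreeDemo and Node are hypothetical, Node is redeclared with the same shape as Calculator so that the sketch compiles on its own, and lambdas stand in for the ConstantValue, VariableValue, and Operation classes.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class TreeDemo {
    /** Same shape as the Calculator interface, redeclared so the sketch is self-contained. */
    interface Node {
        int evaluate(int[] args);
    }

    /** Parses a postfix expression once and returns the root of the expression tree. */
    static Node parse(String expression) {
        Deque<Node> stack = new ArrayDeque<>();
        for (String tok : expression.trim().split("\\s+")) {
            if (tok.startsWith("$")) {
                int index = Integer.parseInt(tok.substring(1));
                stack.push(a -> a[index]);              // variable leaf
            } else if (tok.length() == 1 && "+-*/".indexOf(tok.charAt(0)) >= 0) {
                Node right = stack.pop();
                Node left = stack.pop();
                switch (tok.charAt(0)) {                // operator node with two children
                    case '+': stack.push(a -> left.evaluate(a) + right.evaluate(a)); break;
                    case '-': stack.push(a -> left.evaluate(a) - right.evaluate(a)); break;
                    case '*': stack.push(a -> left.evaluate(a) * right.evaluate(a)); break;
                    default:  stack.push(a -> left.evaluate(a) / right.evaluate(a)); break;
                }
            } else {
                int value = Integer.parseInt(tok);
                stack.push(a -> value);                 // constant leaf
            }
        }
        return stack.pop();
    }

    public static void main(String[] args) {
        Node tree = parse("4 $0 + $1 *");               // parsed exactly once
        for (int x = 0; x < 3; x++) {
            // each call only walks the tree; no tokenizing or string parsing happens here
            System.out.println(tree.evaluate(new int[] {x, 6}));
        }
    }
}
```

The string is tokenized only inside parse; every later call to evaluate works purely on the tree, which is where the saving over the interpretation method comes from.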
V. Compilation Method
To optimize the expression calculator further, we compile the expression directly: first we dynamically generate Java code from the logic of the expression, and then we execute that generated code. This approach can be called the compilation method.
Translating a postfix expression into a Java expression is simple. For example, "$0 $1 $2 * +" can be expressed as the Java expression "args[0] + (args[1] * args[2])". We choose a unique name for the generated Java class and write its code to a temporary file. The dynamically generated Java class takes the following form:
public class [class name] implements Calculator {
    public int evaluate(int[] args) {
        return args[0] + (args[1] * args[2]);
    }
}
The complete code of the compilation calculator is as follows.
// CalculatorCompiler.java
import java.io.*;
import java.util.Stack;
import java.util.StringTokenizer;

// Custom class loader
public class CalculatorCompiler extends ClassLoader {
    String _compiler;
    String _classpath;

    public CalculatorCompiler() {
        super(ClassLoader.getSystemClassLoader());
        // compiler type
        _compiler = System.getProperty("calc.compiler");
        // default compiler
        if (_compiler == null) {
            _compiler = "javac";
        }
        _classpath = ".";
        String extraClasspath = System.getProperty("calc.classpath");
        if (extraClasspath != null) {
            _classpath = _classpath + System.getProperty("path.separator") + extraClasspath;
        }
    }

    public Calculator compile(String expression) {
        String filename = "";
        String classname = "";
        try {
            // create a temporary file
            File javafile = File.createTempFile("compiled_", ".java", new File("."));
            filename = javafile.getName();
            classname = filename.substring(0, filename.lastIndexOf("."));
            generateJavaFile(javafile, classname, expression);
            // compile the file
            invokeCompiler(javafile);
            // create the Java class from the compiled bytes
            byte[] buf = readBytes(classname + ".class");
            Class<?> c = defineClass(classname, buf, 0, buf.length);
            try {
                // A3: create and return an instance of the class
                return (Calculator) c.getDeclaredConstructor().newInstance();
            } catch (ReflectiveOperationException e) {
                throw new RuntimeException(e.getMessage(), e);
            }
        } catch (IOException e) {
            throw new RuntimeException(e.getMessage(), e);
        }
    }

    // generate the Java file
    void generateJavaFile(File javafile, String classname, String expression) throws IOException {
        String text = "public class " + classname + " implements Calculator {"
                + " public int evaluate(int[] args) { "
                + javaExpression(expression)
                + " }"
                + " }";
        try (FileOutputStream out = new FileOutputStream(javafile)) {
            out.write(text.getBytes());
        }
    }

    // compile the Java file
    void invokeCompiler(File javafile) throws IOException {
        String[] cmd = {_compiler, "-classpath", _classpath, javafile.getName()};
        // A1: execute the compilation command
        Process process = Runtime.getRuntime().exec(cmd);
        try {
            // wait until the compiler finishes
            process.waitFor();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException("Interrupted while waiting for the compiler", e);
        }
        int val = process.exitValue();
        if (val != 0) {
            throw new RuntimeException("Compilation error: error code " + val);
        }
    }

    // A2: read the class file into a byte array
    byte[] readBytes(String filename) throws IOException {
        File classfile = new File(filename);
        byte[] buf = new byte[(int) classfile.length()];
        try (DataInputStream in = new DataInputStream(new FileInputStream(classfile))) {
            in.readFully(buf);
        }
        return buf;
    }

    // translate the postfix expression into a Java return statement
    String javaExpression(String expression) {
        Stack<String> stack = new Stack<>();
        StringTokenizer toks = new StringTokenizer(expression);
        while (toks.hasMoreTokens()) {
            String tok = toks.nextToken();
            if (tok.startsWith("$")) {
                stack.push("args[" + Integer.parseInt(tok.substring(1)) + "]");
            } else {
                int op = "+-*/".indexOf(tok.charAt(0));
                if (op == -1) {
                    stack.push(tok);
                } else {
                    String arg2 = stack.pop();
                    String arg1 = stack.pop();
                    stack.push("(" + arg1 + " " + tok.charAt(0) + " " + arg2 + ")");
                }
            }
        }
        return "return " + stack.pop() + ";";
    }
}
Once the code has been generated, it must be compiled. We assume that the system uses the javac compiler and that the system's PATH environment variable contains the directory of the javac executable. If javac is not on the PATH, or if you want to use another compiler, you can specify it through the calc.compiler property, for example "-Dcalc.compiler=jikes". If the compiler is not javac, you generally need to put the Java runtime library JAR (rt.jar in the jre/lib directory) on the compiler's classpath. We use the calc.classpath property to specify additional classpath entries, for example "-Dcalc.classpath=C:/java/jre/lib/rt.jar".
The compiler can be run as an external process through Runtime.exec(String[] cmd). The result of Runtime.exec is a Process object (see comment "A1" in the code; the references below follow the same convention). The cmd array contains the system command to execute: the first element must be the name of the program to run, and the remaining elements are the arguments passed to that program. After starting the compiler process, we wait for it to finish and then obtain the compiler's exit value. An exit value of 0 means that the compilation succeeded. One last compiler-related issue: because the compiler runs as an external process, it is best to read its standard output and error output. If the compiler reports a large number of errors, the compiler process may block while it waits for its output to be read. The example in this article exists only to test performance, so for simplicity it does not deal with this problem. In a production Java project, however, the problem must be handled. After a successful compilation there is a class file in the current directory, which we load with the class loader (comment "A2"). The class loader defines a class from a byte array, so we first read the contents of the class file into a byte array and then create the class. The class loader here is the simplest possible custom class loader, but it is sufficient for this task. After the class has been loaded, we create an instance of it and return the instance (comment "A3").
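One way to handle the blocking problem described above is to drain the compiler's output while it runs. The following is a minimal sketch under stated assumptions, not part of the article's CalculatorCompiler: the names CompilerRunner, Result, and run are hypothetical, it uses ProcessBuilder rather than Runtime.exec, and it merges the error stream into standard output so that a single reader can drain both.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.List;

final class CompilerRunner {
    /** Exit code of an external command plus everything it printed. */
    static final class Result {
        final int exitCode;
        final String output;

        Result(int exitCode, String output) {
            this.exitCode = exitCode;
            this.output = output;
        }
    }

    /** Runs the command, reads all of its output, then waits for it to exit. */
    static Result run(List<String> command) {
        try {
            ProcessBuilder builder = new ProcessBuilder(command);
            builder.redirectErrorStream(true); // merge stderr into stdout
            Process process = builder.start();
            StringBuilder output = new StringBuilder();
            try (BufferedReader reader = new BufferedReader(
                    new InputStreamReader(process.getInputStream()))) {
                String line;
                // reading until end of stream keeps the child from blocking on a full pipe
                while ((line = reader.readLine()) != null) {
                    output.append(line).append('\n');
                }
            }
            return new Result(process.waitFor(), output.toString());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for the process", e);
        }
    }

    public static void main(String[] args) {
        // Stand-in command for the demo: the launcher of the running JVM instead of a compiler
        String launcher = System.getProperty("java.home")
                + File.separator + "bin" + File.separator + "java";
        Result result = run(Arrays.asList(launcher, "-version"));
        System.out.println("exit code " + result.exitCode);
        System.out.print(result.output);
    }
}
```

In a calculator like the one above, the captured output could be appended to the exception message when the exit code is not 0, so that compiler errors are reported instead of lost.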
The test results show that the compiled calculator performs significantly better: the same 1,000,000 evaluations now take only 100-200 ms instead of 1-2 seconds. However, compilation itself carries a large time cost. Calling the javac compiler takes about 1-2 seconds, which cancels out the calculator's performance gain. Then again, javac is not a high-performance compiler. With a fast compiler such as jikes, the compilation time drops sharply, to 100-200 ms.
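Measurements of this kind can be taken with a simple timing loop. The sketch below is an assumption about how such a test might look, not the harness used for the figures in this article; TimingSketch and timeMillis are hypothetical names, and Calculator is redeclared so that the sketch compiles on its own. A hand-written calculator stands in for the generated one.

```java
final class TimingSketch {
    /** Same shape as the Calculator interface, redeclared so the sketch is self-contained. */
    interface Calculator {
        int evaluate(int[] args);
    }

    /** Calls calc the given number of times and returns the elapsed time in milliseconds. */
    static long timeMillis(Calculator calc, int[] args, int iterations) {
        long checksum = 0;
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            checksum += calc.evaluate(args);
        }
        long elapsedNanos = System.nanoTime() - start;
        // Use the checksum so the JIT compiler cannot discard the loop as dead code
        if (checksum == Long.MIN_VALUE) {
            System.out.println("unexpected checksum");
        }
        return elapsedNanos / 1_000_000;
    }

    public static void main(String[] args) {
        Calculator handWritten = a -> (4 + a[0]) * a[1]; // equivalent of "4 $0 + $1 *"
        int[] vars = {3, 6};
        timeMillis(handWritten, vars, 1_000_000);        // warm-up run, result ignored
        System.out.println(timeMillis(handWritten, vars, 1_000_000) + " ms");
    }
}
```

A loop like this gives only a rough figure, since the result depends on JIT warm-up and on the machine. The time to create the calculator object would be measured separately, around the call that creates it.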
VI. Generation Method
The ideal solution would keep the runtime performance of the compilation method while avoiding the overhead of calling an external compiler. In the following example we generate Java bytecode directly in memory, which avoids the external compiler call. We call this the generation method.
The Java class file format is complicated, so we use a third-party bytecode library to generate the class. This example uses BCEL, the Bytecode Engineering Library. BCEL is a free, open source library (http://sourceforge.net/projects/bcel/) that helps us analyze, create, and manipulate binary Java bytecode. The following listing shows a calculator that uses BCEL to generate bytecode directly.
// CalculatorGenerator.java
import java.util.StringTokenizer;

// Download the BCEL library from sourceforge.net/projects/bcel.
// (Later Apache releases of BCEL use the package prefix org.apache.bcel.)
import de.fub.bytecode.classfile.*;
import de.fub.bytecode.generic.*;
import de.fub.bytecode.Constants;

public class CalculatorGenerator extends ClassLoader {
    public Calculator generate(String expression) {
        String className = "Calc_" + System.currentTimeMillis();
        // B1: declare the class
        ClassGen classGen = new ClassGen(className, "java.lang.Object", "",
                Constants.ACC_PUBLIC | Constants.ACC_SUPER,
                new String[] {"Calculator"});
        // B2: constructor
        classGen.addEmptyConstructor(Constants.ACC_PUBLIC);
        // B3: add the method that evaluates the expression
        addEvalMethod(classGen, expression);
        byte[] data = classGen.getJavaClass().getBytes();
        Class<?> c = defineClass(className, data, 0, data.length);
        try {
            return (Calculator) c.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new RuntimeException(e.getMessage(), e);
        }
    }

    private void addEvalMethod(ClassGen classGen, String expression) {
        // B4: collect the instructions in execution order
        ConstantPoolGen cp = classGen.getConstantPool();
        InstructionList il = new InstructionList();
        StringTokenizer toks = new StringTokenizer(expression);
        while (toks.hasMoreTokens()) {
            String tok = toks.nextToken();
            if (tok.startsWith("$")) {
                int varnum = Integer.parseInt(tok.substring(1));
                // array reference
                il.append(InstructionConstants.ALOAD_1);
                // array index
                il.append(new PUSH(cp, varnum));
                il.append(InstructionConstants.IALOAD);
            } else {
                int op = "+-*/".indexOf(tok.charAt(0));
                // generate the instruction that corresponds to the symbol
                switch (op) {
                    case -1: {
                        int val = Integer.parseInt(tok);
                        il.append(new PUSH(cp, val));
                        break;
                    }
                    case 0:
                        il.append(InstructionConstants.IADD);
                        break;
                    case 1:
                        il.append(InstructionConstants.ISUB);
                        break;
                    case 2:
                        il.append(InstructionConstants.IMUL);
                        break;
                    case 3:
                        il.append(InstructionConstants.IDIV);
                        break;
                    default:
                        throw new RuntimeException("Illegal operator: " + tok);
                }
            }
        }
        il.append(InstructionConstants.IRETURN);
        // B5: create the method
        MethodGen method = new MethodGen(Constants.ACC_PUBLIC, Type.INT,
                new Type[] {Type.getType("[I")}, new String[] {"args"},
                "evaluate", classGen.getClassName(), il, cp);
        // B6: let BCEL compute the stack and local variable sizes
        method.setMaxStack();
        method.setMaxLocals();
        // add the method to the class
        classGen.addMethod(method.getMethod());
    }
}
When using BCEL, we first create a ClassGen object that represents the Java class (comment "B1"). As with the compilation method, we need to define a unique class name. Unlike in ordinary Java source code, the superclass java.lang.Object must be declared explicitly. ACC_PUBLIC declares the class public. All classes for Java 1.0.2 or later must set the ACC_SUPER access flag. Finally, we specify that the class implements the Calculator interface.
Second, we make sure that the class has a default constructor (comment "B2"). An ordinary Java compiler automatically inserts a default constructor when a class defines no constructor of its own. Because we are generating bytecode directly with BCEL, the constructor must be declared explicitly. Generating a default constructor with BCEL is simple: we only need to call ClassGen.addEmptyConstructor.
Finally, we generate the evaluate(int[] arguments) method that computes the expression (comments "B3" and "B4"). The JVM itself is stack-based, so converting the expression into bytecode is very simple: the stack-based calculator translates almost directly into bytecode. The instructions are collected into an InstructionList in execution order. We also need a ConstantPoolGen reference to the constant pool.
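To make the correspondence between postfix symbols and JVM instructions concrete, the sketch below prints a symbolic instruction listing for an expression. It does not use BCEL and produces no real bytecode; the names MnemonicSketch and translate are hypothetical, and "push n" is a placeholder for BCEL's PUSH helper, which chooses a concrete constant-loading instruction for the value.

```java
import java.util.ArrayList;
import java.util.List;

final class MnemonicSketch {
    /** Translates a postfix expression into a symbolic instruction listing. */
    static List<String> translate(String expression) {
        List<String> code = new ArrayList<>();
        for (String tok : expression.trim().split("\\s+")) {
            if (tok.startsWith("$")) {
                code.add("aload_1");                                    // the int[] argument
                code.add("push " + Integer.parseInt(tok.substring(1))); // the array index
                code.add("iaload");                                     // load args[index]
            } else if (tok.length() == 1 && "+-*/".indexOf(tok.charAt(0)) >= 0) {
                switch (tok.charAt(0)) {
                    case '+': code.add("iadd"); break;
                    case '-': code.add("isub"); break;
                    case '*': code.add("imul"); break;
                    default:  code.add("idiv"); break;
                }
            } else {
                code.add("push " + Integer.parseInt(tok));              // constant
            }
        }
        code.add("ireturn");
        return code;
    }

    public static void main(String[] args) {
        for (String instruction : translate("4 $0 + $1 *")) {
            System.out.println(instruction);
        }
    }
}
```

For "4 $0 + $1 *" the listing is: push 4, aload_1, push 0, iaload, iadd, aload_1, push 1, iaload, imul, ireturn. Each symbol of the expression maps to a fixed group of instructions, with no reordering needed.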
Once the InstructionList is ready, we can create a MethodGen object (comment "B5"). We want a public method whose return value is int and whose parameter is an integer array (note that we use "[I", the internal representation of the integer array type). We also supply the parameter names, although this is not required. Here the parameter name is args and the method name is evaluate. The last few arguments are the class name, the InstructionList, and the constant pool.
The rules for defining Java methods in BCEL are strict (comment "B6"). For example, a Java method must declare how much space the operand stack needs and how much space is allocated for local variables. If these values are wrong, the JVM refuses to execute the method. In this example it would not be hard to calculate them by hand, but BCEL provides methods that analyze the bytecode for us: we only need to call setMaxStack() and setMaxLocals().
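For this example, the manual calculation mentioned above might look like the following sketch. It assumes the instruction pattern used by the generator (aload_1, a pushed index, and iaload for a variable; one pushed value for a constant; one arithmetic instruction for an operator). The names StackDepthSketch and maxStack are hypothetical.

```java
final class StackDepthSketch {
    /** Computes the maximum operand stack depth needed to evaluate a postfix expression. */
    static int maxStack(String expression) {
        int depth = 0;
        int max = 0;
        for (String tok : expression.trim().split("\\s+")) {
            if (tok.startsWith("$")) {
                depth += 2;                 // array reference plus index
                max = Math.max(max, depth);
                depth -= 1;                 // iaload pops both and pushes the element
            } else if (tok.length() == 1 && "+-*/".indexOf(tok.charAt(0)) >= 0) {
                depth -= 1;                 // pops two operands, pushes one result
            } else {
                depth += 1;                 // constant
                max = Math.max(max, depth);
            }
        }
        return max;
    }

    public static void main(String[] args) {
        System.out.println(maxStack("4 $0 + $1 *"));   // prints 3
        System.out.println(maxStack("$0 $1 $2 * +"));  // prints 4
    }
}
```

The local variable count is simpler: the generated evaluate method uses only two slots, one for this and one for the args array.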
At this point the entire class has been constructed. The remaining task is to load it into the JVM. Once the class exists in memory as a byte array, we can call the class loader just as in the compilation method.
Directly generated code runs as fast as code produced by the compilation method, but the time needed to create the initial object is greatly reduced. Calling an external compiler takes far longer even in the best case (the fast jikes compiler above needed 100-200 ms), whereas creating a class with BCEL takes only about 4 ms on average.
VII. Performance and Application
Table 1 shows the average object creation time for each of the four methods; the compilation method was tested with two different compilers. Table 2 lists the five expressions used in the tests. Table 3 shows the time required to evaluate each of these expressions 1,000,000 times.
Obviously, the example in this article exists purely to test performance; in real applications it is rare to evaluate one expression 1,000,000 times. However, the need to parse data at runtime (XML, scripting languages, query statements, and so on) comes up frequently. Dynamic code generation does not suit every kind of task, but it is worth considering in the following situations:
· The processing is determined mainly by definition information that becomes available only at runtime.
· The processing must be repeated many times.
· Re-parsing the definition information on every run would incur a large overhead.
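When these conditions hold, the creation cost pays off only if each generated object is reused. One possible way to arrange that is a small cache keyed by the expression text, sketched below. This is an illustration rather than part of the calculators in this article: CalculatorCache is a hypothetical name, Calculator is redeclared so that the sketch compiles on its own, and the factory function stands for any of the creation methods (parsing, compiling, or generating).

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

final class CalculatorCache {
    /** Same shape as the Calculator interface, redeclared so the sketch is self-contained. */
    interface Calculator {
        int evaluate(int[] args);
    }

    private final Map<String, Calculator> cache = new ConcurrentHashMap<>();
    private final Function<String, Calculator> factory;

    /** The factory creates a calculator for an expression, for example by generating a class. */
    CalculatorCache(Function<String, Calculator> factory) {
        this.factory = factory;
    }

    /** Returns the cached calculator for the expression, creating it on first use. */
    Calculator get(String expression) {
        return cache.computeIfAbsent(expression, factory);
    }

    public static void main(String[] args) {
        // Stand-in factory for the demo: ignores the expression and always adds two variables
        CalculatorCache cache = new CalculatorCache(expr -> a -> a[0] + a[1]);
        System.out.println(cache.get("$0 $1 +").evaluate(new int[] {2, 3})); // prints 5
    }
}
```

With a cache like this, the expensive creation step runs once per distinct expression, and later requests pay only for the evaluation itself.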
If a problem is suited to dynamic code generation, the next question is whether to use the compilation method or the generation method. In general, generating Java source code and then calling an external compiler is the simpler approach. Most people are more familiar with Java code than with JVM instructions, and debugging a program that has source code is easier than debugging raw bytecode. In addition, a good compiler optimizes the code as it compiles, and such optimizations are hard to reproduce when writing bytecode by hand. On the other hand, calling an external compiler is a very expensive operation, and configuring the compiler and classpath adds to the complexity of maintaining the application. The generation method has a clear performance advantage, but it requires developers to understand the Java class file format and the JVM bytecode instructions in depth. A compiler does a great deal of work behind the scenes when it generates code, and hand-written bytecode may not match the result of the compiler's automatic compilation. If the code to be generated is complex, think carefully before choosing the generation method.