1. Introduction
A Java Virtual Machine instruction consists of a one-byte number that identifies the operation, called the opcode, followed by zero or more operands that supply the parameters the operation needs.
Because the Java Virtual Machine uses an architecture built around an operand stack rather than registers, most instructions consist of an opcode alone, with no operands.
An opcode is one byte long (0 to 255), so the instruction set can contain at most 256 opcodes.
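The opcode-plus-operands layout can be sketched as a small decoder. This is an illustrative sketch rather than a real disassembler: the class name TinyDecoder is hypothetical and it recognizes only six opcodes, using their standard numeric values (bipush 0x10, iload 0x15, iload_0 0x1A, iload_1 0x1B, iadd 0x60, ireturn 0xAC).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical decoder for a tiny subset of JVM opcodes.
final class TinyDecoder {
    // Walks the byte stream, reading one opcode byte and then its operands, if any.
    static List<String> decode(byte[] code) {
        List<String> out = new ArrayList<>();
        int pc = 0;
        while (pc < code.length) {
            int opcode = code[pc++] & 0xFF; // one unsigned byte: 0..255
            switch (opcode) {
                case 0x10: out.add("bipush " + code[pc++]); break;          // one signed byte operand
                case 0x15: out.add("iload " + (code[pc++] & 0xFF)); break;  // one unsigned index operand
                case 0x1A: out.add("iload_0"); break;                       // opcode only
                case 0x1B: out.add("iload_1"); break;                       // opcode only
                case 0x60: out.add("iadd"); break;                          // opcode only
                case 0xAC: out.add("ireturn"); break;                       // opcode only
                default:
                    throw new IllegalArgumentException("opcode not covered by this sketch: " + opcode);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] code = {0x1A, 0x10, 42, 0x60, (byte) 0xAC};
        System.out.println(decode(code)); // prints [iload_0, bipush 42, iadd, ireturn]
    }
}
```

Only bipush and iload consume an extra byte here; the other four instructions are a single opcode byte, which is the common case on a stack-based machine.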
2. Bytecode and data types
In the Java Virtual Machine instruction set, most instructions encode the data type they operate on. For example, the iload instruction loads an int from a local variable onto the operand stack, while the fload instruction loads a float.
For most type-related bytecode instructions, the opcode mnemonic begins with a letter that indicates the data type it serves: i for int, l for long, s for short, b for byte, c for char, f for float, d for double, and a for reference. Some instructions carry no type letter in their mnemonic, such as arraylength, whose operand is always an array.
If every type-related instruction supported every runtime data type of the Java Virtual Machine, the number of instructions would exceed what a single byte can encode. The instruction set therefore provides only a limited set of type-specific instructions for each operation. In other words, the instruction set is deliberately not fully orthogonal: not every combination of data type and operation has its own instruction. Separate conversion instructions turn an unsupported type into a supported one when necessary.
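The effect of this design is visible in Java source. In the minimal sketch below (the class and method names are hypothetical), there is no byte addition instruction, so the sum is computed as an int and must be narrowed back explicitly. A compiler such as javac would typically emit iadd followed by i2b for this method, although the exact bytecode is an assumption about compiler output, not something the source text states.

```java
// Hypothetical example: byte arithmetic is carried out with int instructions.
final class ByteArithmetic {
    static byte addBytes(byte a, byte b) {
        // a + b has type int; without the cast this line does not compile.
        return (byte) (a + b);
    }

    public static void main(String[] args) {
        System.out.println(addBytes((byte) 100, (byte) 100)); // prints -56
    }
}
```

The int sum is 200, which does not fit in a byte, so the narrowing step keeps only the low 8 bits and yields -56.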
3. Bytecode classification
3.1 Load and store instructions
Load and store instructions transfer data between the local variable table and the operand stack of a stack frame.
- Load a local variable onto the operand stack: iload, iload_<n>
- Store a value from the operand stack into the local variable table: istore, istore_<n>
- Load a constant onto the operand stack: bipush
- Extend the index range used to access the local variable table: wide
Here iload_<n> stands for a group of instructions: iload_0, iload_1, iload_2, and iload_3. Each group is a special form of a general instruction that takes one operand. The special forms omit the explicit operand because it is implied by the instruction itself. For example, iload_0 has exactly the same semantics as iload with an operand of 0.
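The equivalence between iload_0 and iload with operand 0 can be checked with a toy interpreter. The sketch below is hypothetical (TinyInterpreter is not part of any JVM) and handles only five opcodes, but it runs the same addition encoded in both forms and returns the same result.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical interpreter for five opcodes, with an operand stack and a local variable table.
final class TinyInterpreter {
    static int run(byte[] code, int[] locals) {
        Deque<Integer> stack = new ArrayDeque<>();
        int pc = 0;
        while (true) {
            int opcode = code[pc++] & 0xFF;
            switch (opcode) {
                case 0x15: stack.push(locals[code[pc++] & 0xFF]); break; // iload <index>: index is an operand
                case 0x1A: stack.push(locals[0]); break;                 // iload_0: index 0 is implied
                case 0x1B: stack.push(locals[1]); break;                 // iload_1: index 1 is implied
                case 0x60: {                                             // iadd: pop two ints, push the sum
                    int right = stack.pop();
                    int left = stack.pop();
                    stack.push(left + right);
                    break;
                }
                case 0xAC: return stack.pop();                           // ireturn: return the top of the stack
                default:
                    throw new IllegalArgumentException("opcode not covered by this sketch: " + opcode);
            }
        }
    }

    public static void main(String[] args) {
        int[] locals = {3, 4};
        byte[] shortForm = {0x1A, 0x1B, 0x60, (byte) 0xAC};          // iload_0, iload_1, iadd, ireturn
        byte[] generalForm = {0x15, 0, 0x15, 1, 0x60, (byte) 0xAC};  // iload 0, iload 1, iadd, ireturn
        System.out.println(run(shortForm, locals));   // prints 7
        System.out.println(run(generalForm, locals)); // prints 7
    }
}
```

The short form needs four bytes and the general form six, which is the practical reason the special forms exist.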
3.2 Arithmetic instructions
An arithmetic instruction performs a specific operation on values on the operand stack (typically two) and pushes the result back onto the top of the stack. Arithmetic instructions fall into two broad groups: those that operate on integer data and those that operate on floating-point data. In both groups the instructions work on the data types of the Java Virtual Machine. Because no arithmetic instruction directly supports byte, short, char, or boolean, operations on those types are carried out with the int instructions. Integer and floating-point instructions also behave differently on overflow and on division by zero. The arithmetic instructions are as follows:
- Addition instructions: iadd, ladd, fadd, dadd
- Subtraction instructions: isub, lsub, fsub, dsub
- Multiplication instructions: imul, lmul, fmul, dmul
- Division instructions: idiv, ldiv, fdiv, ddiv
- Remainder instructions: irem, lrem, frem, drem
- Negation instructions: ineg, lneg, fneg, dneg
- Shift instructions: ishl, ishr, iushr, lshl, lshr, lushr
- Bitwise OR instructions: ior, lor
- Bitwise AND instructions: iand, land
- Bitwise XOR instructions: ixor, lxor
- Local variable increment instruction: iinc
- Comparison instructions: dcmpg, dcmpl, fcmpg, fcmpl, lcmp
Arithmetic can overflow: adding two large positive integers, for example, can yield a negative number. Integer overflow does not raise an exception; the result simply wraps around in two's complement. When processing integer data, only the division instructions (idiv and ldiv) and the remainder instructions (irem and lrem) cause the virtual machine to throw an ArithmeticException, and only when the divisor is zero. No other integer operation should throw a run-time exception.
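The overflow and divide-by-zero behavior can be observed from Java source. The following is a minimal sketch with hypothetical class and method names; operands are passed as parameters so that the arithmetic happens at run time rather than being folded by the compiler.

```java
// Hypothetical demo of integer overflow and division by zero.
final class IntegerArithmeticDemo {
    // Integer addition silently wraps around on overflow.
    static int add(int a, int b) {
        return a + b;
    }

    // Integer division by zero is the case that throws ArithmeticException.
    static boolean integerDivisionThrows(int dividend, int divisor) {
        try {
            int unused = dividend / divisor;
            return false;
        } catch (ArithmeticException e) {
            return true;
        }
    }

    // Floating-point division by zero does not throw; it yields an infinity or NaN.
    static double divide(double dividend, double divisor) {
        return dividend / divisor;
    }

    public static void main(String[] args) {
        System.out.println(add(Integer.MAX_VALUE, 1) == Integer.MIN_VALUE); // prints true
        System.out.println(integerDivisionThrows(1, 0));                    // prints true
        System.out.println(divide(1.0, 0.0));                               // prints Infinity
    }
}
```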
3.3 Type conversion instructions
Type conversion instructions convert one numeric type to another. They are typically used to implement explicit type conversions in user code.
The Java Virtual Machine directly supports the following widening numeric conversions, that is, safe conversions from a smaller-range type to a larger-range type:
- int to long, float, or double
- long to float or double
- float to double
In contrast, narrowing numeric conversions must be performed explicitly with conversion instructions: i2b, i2c, i2s, l2i, f2i, f2l, d2i, d2l, and d2f. A narrowing conversion can produce a result with a different sign or a different order of magnitude, and it is likely to lose precision.
Although a narrowing conversion can overflow, underflow, or lose precision, the Java Virtual Machine specification states explicitly that a narrowing numeric conversion instruction never causes the virtual machine to throw a run-time exception.
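A few casts make the narrowing rules concrete. This is a minimal sketch with hypothetical names; the instruction named in each comment is the one a compiler would typically use for that cast, which is an assumption about compiler output.

```java
// Hypothetical demo: narrowing conversions change values but never throw.
final class NarrowingDemo {
    static byte intToByte(int value) {
        return (byte) value;   // typically i2b: keeps the low 8 bits
    }

    static int doubleToInt(double value) {
        return (int) value;    // typically d2i: truncates toward zero, saturates, maps NaN to 0
    }

    public static void main(String[] args) {
        System.out.println(intToByte(300));           // prints 44
        System.out.println(intToByte(200));           // prints -56
        System.out.println(doubleToInt(-3.99));       // prints -3
        System.out.println(doubleToInt(1e20));        // prints 2147483647
        System.out.println(doubleToInt(Double.NaN));  // prints 0
    }
}
```

The second line shows the sign change mentioned above: the positive int 200 becomes a negative byte.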
3.4 Object creation and access instructions
Although class instances and arrays are both objects, the Java Virtual Machine uses different bytecode instructions to create and operate on them:
- Create a class instance: new
- Create an array: newarray, anewarray, multianewarray
- Access class fields (static fields) and instance fields: getfield, putfield, getstatic, putstatic
- Load an array element onto the operand stack: baload, caload, saload, iaload, laload, faload, daload, aaload
- Store a value from the operand stack into an array element: bastore, castore, sastore, iastore, lastore, fastore, dastore, aastore
- Get the length of an array: arraylength
- Check the type of a class instance: instanceof, checkcast
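The instructions in this list map fairly directly onto everyday Java statements. In the sketch below (all names are hypothetical), each comment names the instruction a compiler such as javac would typically emit for that line; the mapping is an assumption about compiler output and can be inspected with javap -c.

```java
// Hypothetical demo pairing Java statements with the object and array instructions.
final class ObjectInstructionsDemo {
    static int counter;   // class (static) field
    int value;            // instance field

    static int demo() {
        ObjectInstructionsDemo obj = new ObjectInstructionsDemo(); // new
        obj.value = 5;                                             // putfield
        counter = obj.value + 1;                                   // getfield, putstatic
        int[] numbers = new int[3];                                // newarray
        String[] names = new String[2];                            // anewarray
        int[][] grid = new int[2][3];                              // multianewarray
        numbers[0] = counter;                                      // getstatic, iastore
        names[0] = "x";                                            // aastore
        Object untyped = names;
        boolean isStringArray = untyped instanceof String[];       // instanceof
        String[] typed = (String[]) untyped;                       // checkcast
        // iaload, arraylength, aaload
        return numbers[0] + numbers.length + typed.length + grid[1].length + (isStringArray ? 1 : 0);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 15
    }
}
```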
3.5 Operand Stack Management instructions
Just as with a stack in an ordinary data structure, the Java Virtual Machine provides instructions that manipulate the operand stack directly:
- Pop the top one or two elements off the operand stack: pop, pop2
- Duplicate the top one or two values and push the copy, or both copies, back onto the stack: dup, dup2, dup_x1, dup2_x1, dup_x2, dup2_x2
- Swap the top two values of the stack: swap
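The effect of three of these instructions can be simulated on an ordinary Deque. This is a hypothetical model that treats every value as a single stack slot; it is not how a JVM implements its operand stack.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical simulation of dup, swap, and dup_x1 on a stack of single-slot values.
final class OperandStackDemo {
    // dup: ..., v1 becomes ..., v1, v1
    static void dup(Deque<Integer> stack) {
        stack.push(stack.peek());
    }

    // swap: ..., v2, v1 becomes ..., v1, v2
    static void swap(Deque<Integer> stack) {
        int v1 = stack.pop();
        int v2 = stack.pop();
        stack.push(v1);
        stack.push(v2);
    }

    // dup_x1: ..., v2, v1 becomes ..., v1, v2, v1
    static void dupX1(Deque<Integer> stack) {
        int v1 = stack.pop();
        int v2 = stack.pop();
        stack.push(v1);
        stack.push(v2);
        stack.push(v1);
    }

    // Returns the stack contents with the top element first.
    static List<Integer> topFirst(Deque<Integer> stack) {
        return new ArrayList<>(stack);
    }

    public static void main(String[] args) {
        Deque<Integer> stack = new ArrayDeque<>();
        stack.push(1);
        stack.push(2);                        // top-first: [2, 1]
        dupX1(stack);
        System.out.println(topFirst(stack));  // prints [2, 1, 2]
        swap(stack);
        System.out.println(topFirst(stack));  // prints [1, 2, 2]
        dup(stack);
        System.out.println(topFirst(stack));  // prints [1, 1, 2, 2]
    }
}
```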
3.6 Control Transfer Instructions
A control transfer instruction makes the Java Virtual Machine continue execution, conditionally or unconditionally, from a specified instruction rather than from the instruction that follows the control transfer instruction. Conceptually, it can be thought of as conditionally or unconditionally modifying the value of the PC register. The control transfer instructions are as follows:
- Conditional branch: ifeq, iflt, ifle, ifne, ifgt, ifge, ifnull, ifnonnull, if_icmpeq, if_icmpne, if_icmplt, if_icmpgt, if_icmple, if_icmpge, if_acmpeq, and if_acmpne
- Compound conditional branch: tableswitch, lookupswitch
- Unconditional branch: goto, goto_w, jsr, jsr_w, ret
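In Java source these instructions sit behind if, loop, and switch statements. The sketch below uses hypothetical names; the comments describe what javac typically generates (tableswitch for densely packed case values, lookupswitch for sparse ones, a conditional branch plus goto for a loop), which is an assumption about compiler behavior rather than a guarantee.

```java
// Hypothetical demo of source constructs that compile to control transfer instructions.
final class ControlTransferDemo {
    // Dense case values: typically compiled to tableswitch (a jump table indexed by value).
    static String dense(int day) {
        switch (day) {
            case 1: return "Mon";
            case 2: return "Tue";
            case 3: return "Wed";
            default: return "?";
        }
    }

    // Sparse case values: typically compiled to lookupswitch (sorted key and offset pairs).
    static String sparse(int amount) {
        switch (amount) {
            case 1: return "one";
            case 1000: return "thousand";
            case 1000000: return "million";
            default: return "?";
        }
    }

    // A counting loop: typically an if_icmp* conditional branch plus a goto back to the test.
    static int sumTo(int n) {
        int sum = 0;
        for (int i = 1; i <= n; i++) {
            sum += i;
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(dense(2));     // prints Tue
        System.out.println(sparse(1000)); // prints thousand
        System.out.println(sumTo(4));     // prints 10
    }
}
```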
3.7 Method invocation and return instructions
- invokevirtual: invokes an instance method of an object, dispatching on the actual type of the object (virtual dispatch). This is the most common form of method dispatch in the Java language.
- invokeinterface: invokes an interface method. At run time it searches the object that implements the interface for the appropriate method and calls it.
- invokespecial: invokes instance methods that require special handling, including instance initialization methods, private methods, and superclass methods.
- invokestatic: invokes a class method (static method).
- invokedynamic: resolves the method referenced by the call site specifier dynamically at run time and then executes it. The dispatch logic of the preceding four invocation instructions is fixed inside the Java Virtual Machine, whereas the dispatch logic of invokedynamic is determined by a user-supplied bootstrap method.
Method invocation instructions are independent of data type, whereas method return instructions are distinguished by the type of the return value: ireturn (used when the return value is boolean, byte, char, short, or int), lreturn, freturn, dreturn, and areturn. There is also a return instruction for methods declared void, instance initialization methods, and the class initialization methods of classes and interfaces.
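Each of the five invocation instructions has a familiar counterpart in Java source. The sketch below uses hypothetical types (Shape, Square, InvokeDemo); the instruction named in each comment is what javac typically emits for that call on Java 8 or later, and is an assumption that can be checked with javap -c.

```java
import java.util.function.IntUnaryOperator;

// Hypothetical interface and class used only to produce the different call kinds.
interface Shape {
    double area();
}

final class Square implements Shape {
    private final double side;

    Square(double side) {
        this.side = side;
    }

    @Override
    public double area() {
        return side * side; // the double result is typically returned with dreturn
    }

    @Override
    public String toString() {
        return "Square " + area(); // the reference result is typically returned with areturn
    }
}

final class InvokeDemo {
    static int twice(int x) {
        return 2 * x; // the int result is typically returned with ireturn
    }

    static String demo() {
        Square square = new Square(2.0);          // invokespecial (instance initialization method)
        Shape shape = square;
        double area = shape.area();               // invokeinterface (receiver typed as an interface)
        String text = square.toString();          // invokevirtual (receiver typed as a class)
        int doubled = twice(3);                   // invokestatic
        IntUnaryOperator increment = x -> x + 1;  // invokedynamic (lambda creation)
        return text + " " + area + " " + doubled + " " + increment.applyAsInt(doubled);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints Square 4.0 4.0 6 7
    }
}
```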
3.8 Exception Handling Instructions
An explicit throw of an exception in a Java program (a throw statement) is implemented by the athrow instruction.
The Java Virtual Machine specification also specifies that many run-time exceptions are thrown automatically when other Java Virtual Machine instructions detect an abnormal condition.
In the Java Virtual Machine, exception handling (the catch clause) is not implemented with bytecode instructions but with exception tables.
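The three cases above, an explicit throw, an exception raised by another instruction, and a catch clause, can all be seen in one small method. This is a minimal sketch with hypothetical names; the comments describe typical compiler output, which is an assumption.

```java
// Hypothetical demo of explicit and implicit exceptions handled by catch clauses.
final class ExceptionDemo {
    static String classify(int divisor) {
        try {
            if (divisor < 0) {
                // Explicit throw statement: typically ends in an athrow instruction.
                throw new IllegalArgumentException("negative");
            }
            // No throw statement here: the division instruction itself raises
            // ArithmeticException when the divisor is zero.
            return "ok " + (10 / divisor);
        } catch (IllegalArgumentException e) {
            // Each catch clause becomes an entry in the method's exception table,
            // not an instruction of its own.
            return "explicit: " + e.getMessage();
        } catch (ArithmeticException e) {
            return "implicit";
        }
    }

    public static void main(String[] args) {
        System.out.println(classify(5));  // prints ok 2
        System.out.println(classify(-1)); // prints explicit: negative
        System.out.println(classify(0));  // prints implicit
    }
}
```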
3.9 Synchronization Instructions
The Java Virtual Machine supports both method-level synchronization and synchronization of a sequence of instructions within a method. Both are implemented with a monitor.
The instruction set has two instructions, monitorenter and monitorexit, to support the semantics of the synchronized keyword.
The compiler must ensure that, however the method completes, every monitorenter instruction executed in the method is matched by a corresponding monitorexit instruction, whether the method ends normally or abnormally.
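Both forms of synchronization can be written side by side. In the hypothetical sketch below, the synchronized block is the form that typically compiles to a monitorenter and monitorexit pair, while the synchronized method is marked with the ACC_SYNCHRONIZED flag and contains no monitor instructions of its own. Both lock the same monitor (this), so the two threads never lose an update.

```java
// Hypothetical demo of method-level and block-level synchronization on the same monitor.
final class SyncDemo {
    private int count;

    // Method-level synchronization: the JVM acquires the monitor on entry and releases it on exit.
    synchronized void incrementViaMethod() {
        count++;
    }

    // Block-level synchronization: typically monitorenter before the body and
    // monitorexit on both the normal and the exceptional exit path.
    void incrementViaBlock() {
        synchronized (this) {
            count++;
        }
    }

    synchronized int get() {
        return count;
    }

    // Runs two threads, one per synchronization style, and returns the final count.
    static int runTwoThreads(int incrementsPerThread) {
        SyncDemo demo = new SyncDemo();
        Thread first = new Thread(() -> {
            for (int i = 0; i < incrementsPerThread; i++) demo.incrementViaMethod();
        });
        Thread second = new Thread(() -> {
            for (int i = 0; i < incrementsPerThread; i++) demo.incrementViaBlock();
        });
        first.start();
        second.start();
        try {
            first.join();
            second.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return demo.get();
    }

    public static void main(String[] args) {
        System.out.println(runTwoThreads(1000)); // prints 2000
    }
}
```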