In-Depth Understanding of the JVM (06): Introduction to Bytecode Instructions

Last Update:2016-05-22 Source: Internet

Author: User


This article is based on Zhou Zhiming's book "In-Depth Understanding of the Java Virtual Machine".

A Java virtual machine instruction consists of a one-byte opcode that identifies a particular operation, followed by zero or more operands that supply the parameters the operation needs. Many instructions have no operands and consist of an opcode only.

Ignoring exception handling, the basic execution model of the Java virtual machine interpreter can be understood as the following pseudocode loop:

do {
    atomically compute the new PC register value and fetch the opcode at the PC position;
    if (the opcode has operands) fetch the operands;
    perform the action defined by the opcode;
} while (there is more to do);
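
The loop above can be sketched in Java for a handful of opcodes. This is only an illustration of the fetch, decode, execute cycle, not how any real virtual machine is implemented. The opcode values are the ones defined by the Java Virtual Machine Specification; the class name InterpreterSketch and the method run are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Toy interpreter for five real JVM opcodes (illustrative sketch only). */
class InterpreterSketch {
    // Opcode values as defined by the Java Virtual Machine Specification.
    static final int ICONST_1 = 0x04;
    static final int BIPUSH   = 0x10;
    static final int IADD     = 0x60;
    static final int IMUL     = 0x68;
    static final int IRETURN  = 0xAC;

    /** Executes the code array until ireturn and returns the int on top of the stack. */
    static int run(byte[] code) {
        Deque<Integer> stack = new ArrayDeque<>();
        int pc = 0;
        while (true) {
            int opcode = code[pc++] & 0xFF;        // fetch the opcode and advance the PC
            switch (opcode) {
                case ICONST_1:
                    stack.push(1);                 // no operand: the constant is implied
                    break;
                case BIPUSH:
                    stack.push((int) code[pc++]);  // one signed operand byte follows
                    break;
                case IADD: {
                    int b = stack.pop(), a = stack.pop();
                    stack.push(a + b);
                    break;
                }
                case IMUL: {
                    int b = stack.pop(), a = stack.pop();
                    stack.push(a * b);
                    break;
                }
                case IRETURN:
                    return stack.pop();
                default:
                    throw new IllegalStateException(
                            "unsupported opcode 0x" + Integer.toHexString(opcode));
            }
        }
    }

    public static void main(String[] args) {
        // bipush 20; bipush 2; imul; iconst_1; iadd; ireturn  (computes 20 * 2 + 1)
        byte[] code = {0x10, 20, 0x10, 2, 0x68, 0x04, 0x60, (byte) 0xAC};
        System.out.println(run(code));
    }
}
```

Note that bipush is the only instruction here that takes the "fetch the operands" branch; the others consist of an opcode alone.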

The number and size of the operands are determined by the opcode. An operand longer than one byte is stored in big-endian order, that is, high-order byte first. For example, a 16-bit unsigned integer stored in two unsigned bytes (call them byte1 and byte2) has the value:

(byte1 << 8) | byte2
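
A minimal Java sketch of this reconstruction follows. Java's byte type is signed, so each byte has to be masked with 0xFF before it is combined. The class and method names are hypothetical.

```java
/** Reassembles two-byte big-endian operands from a code array (illustrative sketch). */
class BigEndianOperand {
    /** Reads an unsigned 16-bit operand, high-order byte first. */
    static int readU2(byte[] code, int offset) {
        int byte1 = code[offset] & 0xFF;      // mask, because Java bytes are signed
        int byte2 = code[offset + 1] & 0xFF;
        return (byte1 << 8) | byte2;
    }

    /** Reads the same two bytes as a signed 16-bit value, as used for branch offsets. */
    static int readS2(byte[] code, int offset) {
        return (short) readU2(code, offset);
    }
}
```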

The bytecode instruction stream is aligned only to single bytes. The only exceptions are the tableswitch and lookupswitch instructions: their operands are unusual in that they must start on a 4-byte boundary, so padding bytes are inserted after the opcode of these two instructions to achieve the alignment.
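
The amount of padding can be computed from the position of the opcode. The sketch below assumes that offsets are counted from the start of the method's code array, so between zero and three padding bytes follow the opcode. The class and method names are hypothetical.

```java
/** Computes the padding that follows a tableswitch or lookupswitch opcode (sketch). */
class SwitchPadding {
    /**
     * Returns how many padding bytes (0 to 3) follow a switch opcode located at
     * opcodeOffset, so that its first 4-byte operand starts at a multiple of four.
     */
    static int paddingAfter(int opcodeOffset) {
        int next = opcodeOffset + 1;      // first byte after the opcode
        return (4 - (next % 4)) % 4;
    }
}
```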

Limiting the opcode to one byte and giving up operand alignment in compiled code are both meant to keep compiled code as compact as possible, even if this costs a particular Java virtual machine implementation some performance. Because each opcode is only one byte long, the instruction set can contain at most 256 opcodes. And because operands are not assumed to be aligned, a virtual machine that handles data longer than one byte has to rebuild the value from individual bytes at run time, which costs some performance.

Data types and the Java virtual machine

In the Java virtual machine instruction set, most instructions encode the data type they operate on. For example, the iload instruction loads an int from the local variable table onto the operand stack, while fload does the same for a float. The two instructions may be implemented by the same code inside the virtual machine, but they must have distinct opcodes.

For most type-related bytecode instructions, the opcode mnemonic contains a letter that indicates the data type it serves: i for int, l for long, s for short, b for byte, c for char, f for float, d for double, and a for reference. Some instructions have no type letter in their mnemonic but are still tied to a type: arraylength, for example, always operates on an array-type object. Other instructions, such as the unconditional jump goto, are independent of data types.

Because the opcode is only one byte long, encoding the data type in the opcode puts great pressure on the design of the instruction set: if every type-related instruction supported every run-time data type of the Java virtual machine, the number of instructions would exceed what one byte can represent. The instruction set therefore provides only a limited set of type-specific instructions for each operation. In other words, it is deliberately designed to be non-orthogonal: not every combination of data type and operation has a corresponding instruction. Where necessary, separate conversion instructions can convert an unsupported type to a supported one.

The following table lists the type-specific bytecode instructions supported by the Java virtual machine. A concrete instruction is obtained by replacing the T in the instruction template in the opcode column with the letter for the data type of a column. If the cell determined by a template and a data type is empty, the virtual machine has no instruction for that operation on that type. For example, the load template yields iload for int, but there is no corresponding load instruction for byte.

Note in the table that most instructions do not support the integral types byte, char, and short, and none supports boolean. The compiler sign-extends byte and short values to int at compile time or run time, and zero-extends boolean and char values to int. Likewise, values loaded from arrays of boolean, byte, short, and char are extended to int and then processed with int instructions. Most operations on boolean, byte, short, and char data therefore actually use int as the computational type.
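
The difference between sign extension and zero extension is visible from ordinary Java code, because widening a byte, short, or char to int follows the same rules. The sketch below is only an illustration; the class and method names are hypothetical.

```java
/** Shows how sub-int values are extended to the int computational type (sketch). */
class IntPromotion {
    static int widenByte(byte b)   { return b; }   // sign-extended to int
    static int widenShort(short s) { return s; }   // sign-extended to int
    static int widenChar(char c)   { return c; }   // zero-extended to int

    /** Adds two bytes; the addition itself is performed on int values. */
    static int addBytes(byte a, byte b) {
        return a + b;                               // result type is int, not byte
    }
}
```

Because the addition in addBytes is carried out on int values, adding the bytes 127 and 1 yields 128 rather than wrapping around to -128.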

The data types supported by the Java virtual machine instruction set (a dash marks an empty cell):

opcode	byte	short	int	long	float	double	char	reference
Tipush	bipush	sipush	-	-	-	-	-	-
Tconst	-	-	iconst	lconst	fconst	dconst	-	aconst
Tload	-	-	iload	lload	fload	dload	-	aload
Tstore	-	-	istore	lstore	fstore	dstore	-	astore
Tinc	-	-	iinc	-	-	-	-	-
Taload	baload	saload	iaload	laload	faload	daload	caload	aaload
Tastore	bastore	sastore	iastore	lastore	fastore	dastore	castore	aastore
Tadd	-	-	iadd	ladd	fadd	dadd	-	-
Tsub	-	-	isub	lsub	fsub	dsub	-	-
Tmul	-	-	imul	lmul	fmul	dmul	-	-
Tdiv	-	-	idiv	ldiv	fdiv	ddiv	-	-
Trem	-	-	irem	lrem	frem	drem	-	-
Tneg	-	-	ineg	lneg	fneg	dneg	-	-
Tshl	-	-	ishl	lshl	-	-	-	-
Tshr	-	-	ishr	lshr	-	-	-	-
Tushr	-	-	iushr	lushr	-	-	-	-
Tand	-	-	iand	land	-	-	-	-
Tor	-	-	ior	lor	-	-	-	-
Txor	-	-	ixor	lxor	-	-	-	-
i2T	i2b	i2s	-	i2l	i2f	i2d	-	-
l2T	-	-	l2i	-	l2f	l2d	-	-
f2T	-	-	f2i	f2l	-	f2d	-	-
d2T	-	-	d2i	d2l	d2f	-	-	-
Tcmp	-	-	-	lcmp	-	-	-	-
Tcmpl	-	-	-	-	fcmpl	dcmpl	-	-
Tcmpg	-	-	-	-	fcmpg	dcmpg	-	-
if_TcmpOP	-	-	if_icmpOP	-	-	-	-	if_acmpOP
Treturn	-	-	ireturn	lreturn	freturn	dreturn	-	areturn

In the Java virtual machine, actual types map to computational types as shown in the following table:

actual type	computational type	category
boolean	int	1
byte	int	1
char	int	1
short	int	1
int	int	1
float	float	1
reference	reference	1
returnAddress	returnAddress	1
long	long	2
double	double	2

Some instructions that operate on the operand stack (such as pop and swap) are independent of specific types, but they are still restricted to values of certain categories of computational type, which is why the table also lists the category.

Load and store instructions

Load and store instructions transfer data between the local variable table of a stack frame and the operand stack:

Instructions that load a local variable onto the operand stack: iload, iload_<n>, lload, lload_<n>, fload, fload_<n>, dload, dload_<n>, aload, aload_<n>
Instructions that store a value from the operand stack into the local variable table: istore, istore_<n>, lstore, lstore_<n>, fstore, fstore_<n>, dstore, dstore_<n>, astore, astore_<n>
Instructions that load a constant onto the operand stack: bipush, sipush, ldc, ldc_w, ldc2_w, aconst_null, iconst_m1, iconst_<i>, lconst_<l>, fconst_<f>, dconst_<d>
Instruction that extends the access index of the local variable table: wide

Instructions that access object fields or array elements also transfer data to and from the operand stack.

Some of the mnemonics listed above end with angle brackets (for example, iload_<n>). Each of these denotes a family of instructions (iload_<n> stands for iload_0, iload_1, iload_2, and iload_3). Such a family is a set of special forms of a generic instruction that takes one operand (here, iload). The special forms appear to have no operand and need no operand fetch, because the operand is implicit in the instruction itself. Apart from that, their semantics are identical to those of the generic instruction (for example, iload_0 means exactly the same as iload with operand 0). The letter between the angle brackets indicates the data type of the implicit operand: <n> stands for a non-negative integer, <i> for an int, <l> for a long, <f> for a float, and <d> for a double. Data of type byte, char, and short is likewise represented as int.

This notation is used throughout the Java Virtual Machine Specification.
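
To make the relationship between the short forms, the generic form, and wide concrete, here is a sketch of how a bytecode writer might choose the encoding for loading an int local variable. The opcode values (iload 0x15, iload_0 0x1A, wide 0xC4) come from the Java Virtual Machine Specification; the class LoadEncoder and its method are hypothetical and do not reflect any particular compiler.

```java
/** Picks the most compact encoding for loading an int local variable (sketch). */
class LoadEncoder {
    static final int ILOAD   = 0x15;   // generic form: one unsigned index byte follows
    static final int ILOAD_0 = 0x1A;   // iload_0 .. iload_3 are 0x1A .. 0x1D
    static final int WIDE    = 0xC4;   // prefix that widens the index to two bytes

    static byte[] encodeIload(int index) {
        if (index < 0 || index > 0xFFFF) {
            throw new IllegalArgumentException("index out of range: " + index);
        }
        if (index <= 3) {
            // The operand is implied by the opcode itself.
            return new byte[] {(byte) (ILOAD_0 + index)};
        }
        if (index <= 0xFF) {
            return new byte[] {(byte) ILOAD, (byte) index};
        }
        // wide iload: the 16-bit index is stored big-endian.
        return new byte[] {(byte) WIDE, (byte) ILOAD, (byte) (index >> 8), (byte) index};
    }
}
```

With this sketch, index 2 takes one byte, index 4 takes two, and index 300 takes four.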

Arithmetic instructions

An arithmetic instruction performs a specific operation on two values on the operand stack and pushes the result back onto the top of the stack. Broadly, there are two kinds: instructions that operate on integer data and instructions that operate on floating-point data. Both kinds use the numeric types of the Java virtual machine. There are no arithmetic instructions that operate directly on byte, short, char, or boolean data (§2.11.1); such data is handled with the int instructions.

Integer and floating-point arithmetic instructions behave differently on overflow and on division by zero. The arithmetic instructions are:

Addition: iadd, ladd, fadd, dadd
Subtraction: isub, lsub, fsub, dsub
Multiplication: imul, lmul, fmul, dmul
Division: idiv, ldiv, fdiv, ddiv
Remainder: irem, lrem, frem, drem
Negation: ineg, lneg, fneg, dneg
Shift: ishl, ishr, iushr, lshl, lshr, lushr
Bitwise OR: ior, lor
Bitwise AND: iand, land
Bitwise XOR: ixor, lxor
Local variable increment: iinc
Comparison: dcmpg, dcmpl, fcmpg, fcmpl, lcmp

The Java virtual machine instruction set directly supports the semantics of integer and floating-point operations described in the Java Language Specification.

The Java virtual machine does not signal integer overflow. When processing integer data, only the division instructions (idiv and ldiv) and the remainder instructions (irem and lrem) can throw an exception: if the divisor is zero, the virtual machine throws an ArithmeticException. No other integer arithmetic instruction throws a run-time exception.
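
Both behaviors can be observed from Java code. In the sketch below, addition overflows silently and wraps around, while division by zero throws ArithmeticException. The class and method names are hypothetical; the comments name the instruction javac would normally emit for each operator.

```java
/** Demonstrates integer overflow and division by zero (illustrative sketch). */
class IntegerArithmetic {
    static int add(int a, int b)       { return a + b; }   // iadd: wraps on overflow
    static int divide(int a, int b)    { return a / b; }   // idiv: throws if b == 0
    static int remainder(int a, int b) { return a % b; }   // irem: throws if b == 0

    /** Returns true if dividing by zero raised ArithmeticException. */
    static boolean divisionByZeroThrows(int dividend) {
        try {
            divide(dividend, 0);
            return false;
        } catch (ArithmeticException e) {
            return true;
        }
    }
}
```

Integer.MIN_VALUE divided by -1 also overflows, but like addition it wraps around to Integer.MIN_VALUE without throwing.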

When processing floating-point numbers, the Java virtual machine must strictly follow the behavior specified in IEEE 754. This means it must fully support the denormalized floating-point numbers (§2.3.2) and the gradual underflow defined in IEEE 754. These features make certain numerical algorithms easier to implement.

The Java virtual machine requires that all floating-point results be rounded to the appropriate precision. An inexact result must be rounded to the representable value nearest to it; if two representable values are equally near, the one whose least significant bit is zero is chosen. This is the default rounding mode of IEEE 754, known as round to nearest.
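
The tie-breaking rule can be seen with float values just above 2^24 (16777216), where adjacent float values are 2 apart, so adding 1 always lands exactly halfway between two representable values. The sketch below is illustrative only; the class name is hypothetical.

```java
/** Shows round to nearest with ties going to the even significand (sketch). */
class RoundToNearest {
    /** Single-precision addition; javac normally compiles this to fadd. */
    static float add(float a, float b) {
        return a + b;
    }
}
```

Here add(16777216f, 1f) gives 16777216f (the tie rounds down to the even neighbor), while add(16777218f, 1f) gives 16777220f (the tie rounds up to the even neighbor).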

When converting a floating-point value to an integer, the Java virtual machine uses the IEEE 754 round toward zero mode. The value is truncated and all significant bits of the fractional part are discarded. Round toward zero selects, as the most accurate result, the value in the target type that is nearest to the original value but no greater in magnitude.

The Java virtual machine throws no run-time exception when performing floating-point arithmetic (this refers to Java exceptions; do not confuse them with the floating-point exceptions of IEEE 754). An operation that overflows produces a signed infinity. An operation whose result has no well-defined mathematical value produces NaN. Any arithmetic operation with NaN as an operand returns NaN.
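
A short Java sketch of these special results follows. None of the calls throws an exception. The class and method names are hypothetical.

```java
/** Demonstrates infinities and NaN in floating-point arithmetic (sketch). */
class FloatingPointSpecials {
    static double divide(double a, double b)   { return a / b; }   // ddiv
    static double multiply(double a, double b) { return a * b; }   // dmul

    public static void main(String[] args) {
        System.out.println(divide(1.0, 0.0));               // signed infinity, no exception
        System.out.println(divide(0.0, 0.0));               // no mathematical value: NaN
        System.out.println(multiply(Double.MAX_VALUE, 2));  // overflow: infinity
    }
}
```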

Long values are compared as signed integers (lcmp). Floating-point comparisons (dcmpg, dcmpl, fcmpg, fcmpl) use the nonsignaling comparisons defined by IEEE 754.
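
The g and l variants of the floating-point comparison instructions differ only in how they treat NaN: the specification has dcmpg and fcmpg push 1, and dcmpl and fcmpl push -1, when either operand is NaN. The following Java methods model that behavior; they are a hypothetical sketch of the semantics, not an API of the virtual machine.

```java
/** Models the results pushed by dcmpg and dcmpl (illustrative sketch). */
class FloatCompare {
    /** Like dcmpg: 1, 0 or -1, and 1 if either operand is NaN. */
    static int dcmpg(double a, double b) {
        if (a > b)  return 1;
        if (a == b) return 0;
        if (a < b)  return -1;
        return 1;                 // unordered: at least one operand is NaN
    }

    /** Like dcmpl: 1, 0 or -1, and -1 if either operand is NaN. */
    static int dcmpl(double a, double b) {
        if (a > b)  return 1;
        if (a == b) return 0;
        if (a < b)  return -1;
        return -1;                // unordered: at least one operand is NaN
    }
}
```

Having both variants lets a compiler pick the one that makes any comparison involving NaN evaluate to false, whichever way the following branch is written.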

Type conversion instructions

Type conversion instructions convert between two Java virtual machine numeric types. They are typically used to implement explicit type conversions in user code, or to work around the non-orthogonality of the bytecode instruction set, in which not every type has an instruction for every operation.

The Java virtual machine directly supports the following widening numeric conversions, that is, safe conversions from a type with a smaller range to a type with a larger range. ("Directly supports" here means that user code needs no explicit cast; the compiler emits i2l, i2f, i2d, l2f, l2d, or f2d as required.)

int to long, float, or double
long to float or double
float to double

The narrowing numeric conversion instructions are i2b, i2c, i2s, l2i, f2i, f2l, d2i, d2l, and d2f. A narrowing conversion may produce a result with a different sign or a different order of magnitude, and it may lose precision.

When an int or long is narrowed to an integral type T, the conversion simply discards all but the N lowest-order bits, where N is the number of bits used to represent T. Because the discarded high-order bits include the sign bit, the result may have a different sign from the input value.
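
Java casts expose this truncation directly. In the sketch below, each method keeps only the low-order bits of its argument; the comments name the instruction javac would normally emit for the cast. The class and method names are hypothetical.

```java
/** Demonstrates narrowing of integral values by discarding high-order bits (sketch). */
class IntegerNarrowing {
    static byte  toByte(int v)  { return (byte) v; }    // i2b: keeps the low 8 bits
    static short toShort(int v) { return (short) v; }   // i2s: keeps the low 16 bits
    static char  toChar(int v)  { return (char) v; }    // i2c: keeps the low 16 bits, unsigned
    static int   toInt(long v)  { return (int) v; }     // l2i: keeps the low 32 bits
}
```

For example, toByte(200) is -56: the low eight bits of 200 are 0xC8, whose top bit is set, so the positive input becomes a negative byte.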

When converting a floating-point value to an integral type T (T is either int or long), the following rules apply:

If the floating-point value is NaN, the result is an int or long 0.
Otherwise, if the floating-point value is not an infinity, it is rounded to an integer value V using IEEE 754 round toward zero mode (§2.8.1). There are then two cases:
- If T is long and V can be represented as a long, the result is the long value V.
- If T is int and V can be represented as an int, the result is the int value V.
Otherwise:
- If the value is too small to be represented in T (a negative value of large magnitude or negative infinity), the result is the smallest value representable by int or long.
- If the value is too large to be represented in T (a positive value of large magnitude or positive infinity), the result is the largest value representable by int or long.
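
The rules above can be checked with Java casts, which follow the same conversion semantics. The class and method names in this sketch are hypothetical.

```java
/** Demonstrates floating-point to integer conversion rules (illustrative sketch). */
class FloatToInteger {
    static int  toInt(double v) { return (int) v; }    // d2i
    static long toLong(float v) { return (long) v; }   // f2l

    public static void main(String[] args) {
        System.out.println(toInt(Double.NaN));   // NaN converts to 0
        System.out.println(toInt(-3.9));         // rounds toward zero
        System.out.println(toInt(1e20));         // too large: clamps to Integer.MAX_VALUE
    }
}
```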

Narrowing from double to float follows IEEE 754: the value is rounded with round to nearest mode (§2.8.1) to a number representable as a float. If the absolute value of the result is too small to be represented as a float, a positive or negative zero of type float is returned. If it is too large, a positive or negative infinity of type float is returned. A double NaN is converted to a float NaN.
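
A brief Java sketch of the double to float cases described above; the class and method names are hypothetical.

```java
/** Demonstrates narrowing from double to float (illustrative sketch). */
class DoubleToFloat {
    /** A (float) cast; javac normally compiles it to d2f. */
    static float narrow(double v) {
        return (float) v;
    }

    /** True if the value is negative zero, which == cannot distinguish from 0.0f. */
    static boolean isNegativeZero(float f) {
        return Float.floatToIntBits(f) == Float.floatToIntBits(-0.0f);
    }
}
```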

Although narrowing numeric conversions may overflow, underflow, or lose precision, they never cause the Java virtual machine to throw a run-time exception. (Exceptions here means those defined in the Java Virtual Machine Specification; do not confuse them with the floating-point exception signals defined in IEEE 754.)

Object creation and manipulation

Although both class instances and arrays are objects, the Java virtual machine uses different bytecode instructions to create and operate on them:

Instruction that creates a class instance: new
Instructions that create arrays: newarray, anewarray, multianewarray
Instructions that access class fields (static fields, or class variables) and instance fields (non-static fields, or instance variables): getfield, putfield, getstatic, putstatic
Instructions that load an array element onto the operand stack: baload, caload, saload, iaload, laload, faload, daload, aaload
Instructions that store a value from the operand stack into an array element: bastore, castore, sastore, iastore, lastore, fastore, dastore, aastore
Instruction that gets the length of an array: arraylength
Instructions that check the type of a class instance: instanceof, checkcast

Operand stack management instructions

The Java virtual machine provides instructions that manipulate the operand stack directly: pop, pop2, dup, dup2, dup_x1, dup2_x1, dup_x2, dup2_x2, and swap.
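
The effect of three of these instructions can be modeled on a plain Java stack. The sketch below handles only category 1 values (here, ints) and follows the stack diagrams in the specification; the class and method names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Models dup, swap and dup_x1 for category 1 values (illustrative sketch). */
class OperandStackOps {
    /** dup:  ..., v1  becomes  ..., v1, v1 */
    static void dup(Deque<Integer> stack) {
        stack.push(stack.peek());
    }

    /** swap:  ..., v2, v1  becomes  ..., v1, v2 */
    static void swap(Deque<Integer> stack) {
        int v1 = stack.pop(), v2 = stack.pop();
        stack.push(v1);
        stack.push(v2);
    }

    /** dup_x1:  ..., v2, v1  becomes  ..., v1, v2, v1 */
    static void dupX1(Deque<Integer> stack) {
        int v1 = stack.pop(), v2 = stack.pop();
        stack.push(v1);
        stack.push(v2);
        stack.push(v1);
    }

    public static void main(String[] args) {
        Deque<Integer> stack = new ArrayDeque<>();
        stack.push(1);
        stack.push(2);            // top of stack is 2
        dupX1(stack);
        System.out.println(stack); // printed from the top of the stack downwards
    }
}
```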

Control transfer instructions

A control transfer instruction makes the Java virtual machine continue execution, conditionally or unconditionally, from a specified instruction rather than from the instruction that follows the control transfer instruction. Control transfer instructions include:

Conditional branches: ifeq, iflt, ifle, ifne, ifgt, ifge, ifnull, ifnonnull, if_icmpeq, if_icmpne, if_icmplt, if_icmpgt, if_icmple, if_icmpge, if_acmpeq, and if_acmpne
Compound conditional branches: tableswitch, lookupswitch
Unconditional branches: goto, goto_w, jsr, jsr_w, ret

The Java virtual machine has dedicated instruction sets for conditional branches that compare int and reference values. In addition, dedicated instructions (ifnull and ifnonnull) test for null, so that it is always clear whether a reference is null.

Conditional branches on boolean, byte, char, and short values are performed with the int comparison instructions. For long, float, and double values, the comparison instruction for that type (lcmp, fcmpl, fcmpg, dcmpl, or dcmpg) is executed first; it pushes an int result onto the operand stack, and an int conditional branch then performs the jump. Because every comparison ultimately becomes an int comparison, the Java virtual machine provides an especially rich set of int conditional branch instructions.

All int conditional branch instructions perform signed comparisons.
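
The two-step scheme for long comparisons can be modeled in Java as follows. The method lcmp mimics the int result that the lcmp instruction pushes, and greater then branches on that int. For a source expression such as a > b on longs, javac typically emits lload, lload, lcmp followed by an int branch such as ifle, though the exact sequence is up to the compiler. The class and method names are hypothetical.

```java
/** Models "compare first, then branch on an int" for long values (sketch). */
class LongBranch {
    /** Like the lcmp instruction: 1 if a > b, 0 if equal, -1 if a < b (signed). */
    static int lcmp(long a, long b) {
        return a > b ? 1 : (a == b ? 0 : -1);
    }

    /** Step 1 produces an int; step 2 is an ordinary int conditional branch. */
    static boolean greater(long a, long b) {
        int result = lcmp(a, b);   // step 1: type-specific comparison
        return result > 0;         // step 2: int comparison against zero
    }
}
```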

Method invocation and return instructions

The following four instructions are used for method invocation:

The invokevirtual instruction invokes an instance method of an object, dispatching on the actual type of the object (virtual method dispatch). This is the most common form of method dispatch in the Java language.
The invokeinterface instruction invokes an interface method: at run time it searches for the object that implements the interface method and finds the appropriate method to invoke.
The invokespecial instruction invokes instance methods that need special handling, including instance initialization methods, private methods, and superclass methods.
The invokestatic instruction invokes a class method (static method).
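
The following Java sketch contains one call site for each of the four instructions. The comments indicate the instruction javac typically emits for that call; the actual bytecode depends on the compiler and target version and can be inspected with javap -c. All class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** One call site per invoke instruction (illustrative sketch). */
class InvokeKinds {
    static class Base {
        String name() { return "base"; }
    }

    static class Derived extends Base {
        @Override
        String name() {
            return "derived+" + super.name();   // super.name(): invokespecial
        }
    }

    static int twice(int x) {
        return 2 * x;
    }

    static String demo() {
        Base b = new Derived();                 // constructor call: invokespecial <init>
        List<String> out = new ArrayList<>();
        out.add(b.name());                      // b.name(): invokevirtual, out.add(): invokeinterface
        out.add(String.valueOf(twice(21)));     // twice(21): invokestatic
        return String.join(",", out);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Although b is declared as Base, the call b.name() runs Derived.name(), which is the virtual dispatch on the actual type of the object described above.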

Method return instructions are distinguished by the type of the return value: ireturn (used when the return value is boolean, byte, char, short, or int), lreturn, freturn, dreturn, and areturn. In addition, the return instruction is used by methods declared void, instance initialization methods, and the class initialization methods of classes and interfaces.

Throwing exceptions

Explicitly throwing an exception in a program (the throw statement) is implemented by the athrow instruction. Apart from this explicit case, other exceptions are thrown automatically by the virtual machine when other instructions detect an abnormal condition.

Synchronization

The Java virtual machine supports both method-level synchronization and synchronization of a sequence of instructions within a method. Both are implemented with monitors.

Method-level synchronization is implicit: it needs no bytecode instructions and is carried out as part of method invocation and return. The virtual machine can tell whether a method is synchronized from the ACC_SYNCHRONIZED access flag in the method's method_info structure. When a method is invoked, the invoking instruction checks whether ACC_SYNCHRONIZED is set; if so, the executing thread first acquires the monitor, then executes the method, and finally releases the monitor when the method completes, whether normally or abruptly. While the method executes, the executing thread holds the monitor and no other thread can acquire it. If an exception is thrown during execution of a synchronized method and is not handled inside the method, the monitor held by the method is released automatically as the exception propagates out of the method.

Synchronizing a sequence of instructions is usually expressed in the Java language by a synchronized block. The Java virtual machine instruction set provides the monitorenter and monitorexit instructions to support the semantics of the synchronized keyword. Implementing the keyword correctly requires cooperation between the compiler and the Java virtual machine.
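
Both forms of synchronization can be written side by side in Java. In the sketch below, the synchronized method relies on the ACC_SYNCHRONIZED flag, which reflection reports through Modifier.isSynchronized, while the synchronized block is the form compiled to monitorenter and monitorexit. Both lock the same monitor (this), so they exclude each other. The class and method names are hypothetical.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

/** Method-level and block-level synchronization on the same monitor (sketch). */
class SyncSketch {
    private int count;

    // Method-level: marked with ACC_SYNCHRONIZED, no monitor instructions in the body.
    synchronized void incrementMethod() {
        count++;
    }

    // Block-level: the compiler brackets the block with monitorenter / monitorexit.
    void incrementBlock() {
        synchronized (this) {
            count++;
        }
    }

    synchronized int get() {
        return count;
    }

    /** Reports whether the named no-argument method carries the synchronized flag. */
    static boolean isFlaggedSynchronized(String methodName) {
        try {
            Method m = SyncSketch.class.getDeclaredMethod(methodName);
            return Modifier.isSynchronized(m.getModifiers());
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException(e);
        }
    }

    /** Runs both increments from several threads and returns the final count. */
    static int runConcurrently(int threads, int perThread) {
        SyncSketch s = new SyncSketch();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    s.incrementMethod();
                    s.incrementBlock();
                }
            });
            workers[i].start();
        }
        try {
            for (Thread t : workers) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return s.get();
    }
}
```

Because every increment runs while holding the monitor, no update is lost: the final count equals threads * perThread * 2.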

Structured locking means that, during a method invocation, every monitor exit matches a preceding monitor entry. Because there is no guarantee that all code submitted to the Java virtual machine obeys structured locking, implementations are permitted (but not required) to enforce the following two rules. Let T be a thread and M a monitor:

The number of times T acquires M during a method invocation must equal the number of times T releases M during that invocation, whether the method completes normally or abruptly.
At no point during the method invocation may T have released M more times than it has acquired M.

Note that the automatic acquisition and release of the monitor when a synchronized method is invoked also counts as occurring during the method invocation.
