Before we begin, a word about the language independence that the virtual machine provides.
The Java Virtual Machine only parses bytecode files; it does not care which high-level language produced them. Let's look at the structure of the class bytecode file.
A class file is a binary stream whose basic unit is the 8-bit byte. The data items are packed tightly in a strict order, with no delimiters between them. When a data item needs more than 8 bits of space, it is split into several 8-bit bytes and stored with the high-order byte first (big-endian). So what is the storage structure of the class file, and how can you tell where one data item ends and the next begins? The class file uses a pseudo-structure similar to a C struct, with only two data types: unsigned numbers and tables. A table is a composite data item made up of multiple unsigned numbers or other tables, and the entire class file is essentially one table, with the following format:
Type | Name | Count
u4 | magic | 1
u2 | minor_version | 1
u2 | major_version | 1
u2 | constant_pool_count | 1
cp_info | constant_pool | constant_pool_count - 1
u2 | access_flags | 1
u2 | this_class | 1
u2 | super_class | 1
u2 | interfaces_count | 1
u2 | interfaces | interfaces_count
u2 | fields_count | 1
field_info | fields | fields_count
u2 | methods_count | 1
method_info | methods | methods_count
u2 | attributes_count | 1
attribute_info | attributes | attributes_count
Note: u1, u2, u4, and u8 denote unsigned numbers of 1, 2, 4, and 8 bytes; types whose names end in "_info" are tables.
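The high-order-byte-first layout of the unsigned types can be sketched in Java as follows. This is only an illustration: the class name BigEndian and the helpers u1, u2, and u4 are hypothetical names chosen to mirror the pseudo-types in the table, not part of any JDK API.

```java
// Hypothetical helpers that mirror the u1/u2/u4 pseudo-types of the class file format.
final class BigEndian {
    // Reads one unsigned byte at the given offset.
    static int u1(byte[] data, int offset) {
        return data[offset] & 0xFF;
    }

    // Reads two bytes, high-order byte first.
    static int u2(byte[] data, int offset) {
        return (u1(data, offset) << 8) | u1(data, offset + 1);
    }

    // Reads four bytes, high-order byte first; returned as long so the value stays unsigned.
    static long u4(byte[] data, int offset) {
        return ((long) u2(data, offset) << 16) | u2(data, offset + 2);
    }

    public static void main(String[] args) {
        // The first eight bytes of a class file: magic, minor_version, major_version.
        byte[] header = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0x00, 0x00, 0x00, 0x33};
        System.out.printf("magic=0x%08X minor=%d major=%d%n",
                u4(header, 0), u2(header, 4), u2(header, 6));
    }
}
```

Because no delimiters exist between items, a reader has to know each item's width in advance and advance its offset by exactly that many bytes.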
1. Magic number: magic
The first 4 bytes identify the file as a class file that the virtual machine can accept. The value must be 0xCAFEBABE; a file with any other value is rejected.
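A minimal sketch of that check, assuming the file content has already been read into a byte array. The class MagicCheck and the method looksLikeClassFile are hypothetical names used for illustration; a real virtual machine performs many more checks than this.

```java
import java.nio.ByteBuffer;

final class MagicCheck {
    static final int CLASS_MAGIC = 0xCAFEBABE;

    // Returns true when the first four bytes equal 0xCAFEBABE.
    static boolean looksLikeClassFile(byte[] data) {
        // ByteBuffer is big-endian by default, which matches the class file byte order.
        return data.length >= 4 && ByteBuffer.wrap(data).getInt() == CLASS_MAGIC;
    }
}
```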
2. Minor and major version numbers: minor_version and major_version
Class file version numbers start at 45. JDK 1.1 supports versions 45.0 to 45.65535, and JDK 1.2 supports 45.0 to 46.65535. Each later major JDK release raises the highest supported major version by one, up to 51 for JDK 1.7.
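The relationship between major version and JDK release can be sketched as below. The class ClassVersion and its methods are hypothetical, and the sketch assumes the simple rule "JDK 1.n has major version n + 44" for the range JDK 1.1 to 1.7 only; it ignores the minor version.

```java
final class ClassVersion {
    // Assumption: for major versions 45 through 51 the matching release is JDK 1.(major - 44).
    static String jdkFor(int major) {
        if (major < 45 || major > 51) {
            throw new IllegalArgumentException("outside the JDK 1.1 to 1.7 range covered here: " + major);
        }
        return "JDK 1." + (major - 44);
    }

    // A virtual machine accepts class files up to its own major version and rejects newer ones.
    static boolean canLoad(int vmMajor, int classMajor) {
        return classMajor >= 45 && classMajor <= vmMajor;
    }
}
```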
3. Constant pool
constant_pool_count is the capacity count of the constant pool. The count starts at 1, not 0: if constant_pool_count = 21, the constant pool holds 20 constants, and index 0 is left empty. The constant pool holds two kinds of constants: literals, such as text strings and constants declared final, and symbolic references, which cover the fully qualified names of classes and interfaces, the names and descriptors of fields, and the names and descriptors of methods. Each constant in the constant pool is a table, and there are 11 kinds of table, each with its own structure (JDK 1.7 adds three more for dynamic-language support). Some of them are described here:
(1) CONSTANT_Class_info
Type | Name | Count
u1 | tag | 1
u2 | name_index | 1
tag is the flag byte that identifies the constant type. name_index is an index into the constant pool that points to a constant of type CONSTANT_Utf8_info; that constant holds the fully qualified name of the class or interface.
(2) CONSTANT_Utf8_info
Type | Name | Count
u1 | tag | 1
u2 | length | 1
u1 | bytes | length
length gives the length of the UTF-8 encoded string in bytes, and the length bytes that follow it are the UTF-8 encoded string itself. The largest value a u2 can express is 65535, so a Java program that defines a variable or method name longer than 64 KB of English characters will not compile.
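The 65535-byte ceiling follows directly from length being a u2. A small sketch of the check, using a hypothetical class Utf8Limit and method fitsInUtf8Constant; it measures standard UTF-8, which agrees with the class file encoding for ASCII identifiers but is only an approximation for a few other characters.

```java
import java.nio.charset.StandardCharsets;

final class Utf8Limit {
    static final int MAX_U2 = 0xFFFF; // 65535, the largest value a u2 length can hold

    // Returns true when the encoded text fits in a single CONSTANT_Utf8_info entry.
    static boolean fitsInUtf8Constant(String text) {
        // Approximation: standard UTF-8 length; class files use a modified UTF-8
        // that differs for a few characters (for example U+0000).
        return text.getBytes(StandardCharsets.UTF_8).length <= MAX_U2;
    }
}
```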
(3) CONSTANT_Integer_info
Type | Name | Count
u1 | tag | 1
u4 | bytes | 1
bytes holds an int value stored with the high-order byte first.
CONSTANT_Float_info, CONSTANT_Long_info, and CONSTANT_Double_info are similar to CONSTANT_Integer_info.
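The three entry layouts described above can be read with a small loop. This is a sketch, not a full parser: the class MiniConstantPool, the nested ClassRef type, and the choice of Object[] as the pool representation are all hypothetical, and only tags 1 (Utf8), 3 (Integer), and 7 (Class) are handled.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class MiniConstantPool {
    // Parses constantPoolCount - 1 entries; index 0 stays null because it is reserved.
    static Object[] parse(ByteBuffer in, int constantPoolCount) {
        Object[] pool = new Object[constantPoolCount];
        for (int i = 1; i < constantPoolCount; i++) {
            int tag = in.get() & 0xFF;
            switch (tag) {
                case 1: { // CONSTANT_Utf8_info: u2 length, then that many bytes
                    byte[] bytes = new byte[in.getShort() & 0xFFFF];
                    in.get(bytes);
                    // Simplification: plain UTF-8, which matches the class file encoding for ASCII text.
                    pool[i] = new String(bytes, StandardCharsets.UTF_8);
                    break;
                }
                case 3: // CONSTANT_Integer_info: u4 value, high-order byte first
                    pool[i] = in.getInt();
                    break;
                case 7: // CONSTANT_Class_info: u2 name_index
                    pool[i] = new ClassRef(in.getShort() & 0xFFFF);
                    break;
                default:
                    throw new IllegalArgumentException("tag not handled in this sketch: " + tag);
            }
        }
        return pool;
    }

    // Follows name_index to the Utf8 entry that holds the fully qualified name.
    static String className(Object[] pool, int classIndex) {
        return (String) pool[((ClassRef) pool[classIndex]).nameIndex];
    }

    static final class ClassRef {
        final int nameIndex;

        ClassRef(int nameIndex) {
            this.nameIndex = nameIndex;
        }
    }
}
```

The tag byte is what lets the loop decide how many bytes belong to the current entry, since the entries themselves have no separators.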
4. Access flags: access_flags
5. Class index this_class, parent class index super_class, and interface index collection interfaces
The class file uses these three items to determine the inheritance relationships of the class. The class index determines the fully qualified name of the class, and the parent class index determines the fully qualified name of its parent class. Both point to a constant of type CONSTANT_Class_info; the name_index in that constant in turn leads to the CONSTANT_Utf8_info constant that holds the fully qualified name string.
6. Field table collection: field_info
The field table (field_info) describes the variables declared in an interface or class. Fields include class-level and instance-level variables, but not local variables declared inside methods. The field table structure is as follows:
Type | Name | Count
u2 | access_flags | 1
u2 | name_index | 1
u2 | descriptor_index | 1
u2 | attributes_count | 1
attribute_info | attributes | attributes_count
Field modifiers are stored in the access_flags item, which works much like the access_flags item of the class. name_index and descriptor_index are references into the constant pool; they give the simple name of the field and its descriptor. Descriptors describe the data type of a field, or the parameter list and return value of a method. For example, given the code
private int m;
public void inc() { }
the simple names of the field m and the method inc() are m and inc, and their descriptors are I and ()V.
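Method descriptors such as ()V can be explored with the JDK's own java.lang.invoke.MethodType, which parses the same descriptor syntax. The wrapper class DescriptorDemo and its parse method are hypothetical names for this sketch; note that MethodType only handles method descriptors, so a field descriptor such as I cannot be passed to it.

```java
import java.lang.invoke.MethodType;

final class DescriptorDemo {
    // Parses a method descriptor such as "()V" using the JDK's descriptor parser.
    static MethodType parse(String descriptor) {
        // A null loader means the system class loader resolves any class names.
        return MethodType.fromMethodDescriptorString(descriptor, null);
    }

    public static void main(String[] args) {
        // No parameters, void return.
        System.out.println(parse("()V").returnType());
        // An int and a String[] parameter, int return.
        System.out.println(parse("(I[Ljava/lang/String;)I").parameterList());
    }
}
```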
7. Method table collection: method_info
The structure is the same as that of the field table: access_flags, name_index, descriptor_index, attributes_count, and attributes. The code inside a method is compiled into bytecode instructions by the compiler and stored in an attribute named Code in the method's attribute table collection.
8. Attribute table collection: attribute_info
Class files, field tables, and method tables can each carry their own attribute table collection. The Java Virtual Machine Specification predefines 9 attributes that virtual machine implementations should recognize: Code, Exceptions, LineNumberTable, LocalVariableTable, SourceFile, ConstantValue, InnerClasses, Deprecated, and Synthetic.
The name of every attribute is a reference to a CONSTANT_Utf8_info constant in the constant pool. The structure of the attribute value is entirely up to the attribute; the only requirement is a u4 length that states how many bytes the value occupies. An attribute table that follows this rule looks like this:
Type | Name | Count
u2 | attribute_name_index | 1
u4 | attribute_length | 1
u1 | info | attribute_length
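Because every attribute states its own length, a reader can step over attributes it does not understand. A minimal sketch of that idea, with a hypothetical class AttributeSkipper and method skipAttribute:

```java
import java.nio.ByteBuffer;

final class AttributeSkipper {
    // Skips one attribute and returns the constant pool index of its name.
    static int skipAttribute(ByteBuffer in) {
        int nameIndex = in.getShort() & 0xFFFF;   // u2 index of the attribute name
        long length = in.getInt() & 0xFFFFFFFFL;  // u4 length of the attribute value
        if (length > in.remaining()) {
            throw new IllegalArgumentException("attribute length exceeds remaining data");
        }
        in.position(in.position() + (int) length); // jump over the value bytes
        return nameIndex;
    }
}
```

After the call, the buffer is positioned at the first byte following the attribute, ready for the next item.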
(1) Code attribute
The Code attribute appears in the attribute collection of a method table. Abstract methods, such as those in interfaces or abstract classes, have no Code attribute. The Code attribute table is structured as follows:
Type | Name | Count | Description
u2 | attribute_name_index | 1 | Index of a CONSTANT_Utf8_info constant whose value is "Code"
u4 | attribute_length | 1 | Length of the attribute value
u2 | max_stack | 1 | Maximum depth of the operand stack
u2 | max_locals | 1 | Storage space required for local variables
u4 | code_length | 1 | Length of the bytecode
u1 | code | code_length | The bytecode itself
u2 | exception_table_length | 1 | Length of the exception table
exception_info | exception_table | exception_table_length | Exception table, which implements Java's exception and finally handling
u2 | attributes_count | 1 |
attribute_info | attributes | attributes_count |
(2) Exceptions attribute
Lists the exceptions a method may throw, that is, the exceptions named after the throws keyword in the method declaration. Its structure is as follows:
Type | Name | Count | Description
u2 | attribute_name_index | 1 |
u4 | attribute_length | 1 |
u2 | number_of_exceptions | 1 | Number of exceptions
u2 | exception_index_table | number_of_exceptions |
(3) LineNumberTable attribute
Describes the correspondence between Java source line numbers and bytecode line numbers (bytecode offsets). Its structure is as follows:
Type | Name | Count | Description
u2 | attribute_name_index | 1 |
u4 | attribute_length | 1 |
u2 | line_number_table_length | 1 |
line_number_info | line_number_table | line_number_table_length | Each entry holds start_pc and line_number: the former is the bytecode offset, the latter is the Java source line number
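A lookup from bytecode offset to source line, of the kind a debugger or stack trace printer needs, can be sketched as follows. The class LineLookup and the int[][] representation are hypothetical, and the sketch assumes the entries are sorted by start_pc.

```java
final class LineLookup {
    // Each row is {start_pc, line_number}; rows are assumed to be sorted by start_pc.
    static int lineFor(int[][] table, int pc) {
        int line = -1; // -1 means no entry covers the given offset
        for (int[] entry : table) {
            if (entry[0] <= pc) {
                line = entry[1]; // the last entry that starts at or before pc wins
            } else {
                break;
            }
        }
        return line;
    }
}
```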
(4) LocalVariableTable attribute
Describes the relationship between the variables in the local variable table of a stack frame and the variables defined in the Java source code.
(5) SourceFile attribute
Records the name of the source file from which the class file was generated.
(6) ConstantValue attribute
Its purpose is to tell the virtual machine to assign a value to a static variable automatically. Only variables modified by the static keyword can use this attribute. For the declarations int x = 123; and static int x = 123;, the virtual machine assigns the value in different ways and at different moments.
(7) InnerClasses attribute
Records the association between an inner class and its host class. If a class defines an inner class, the compiler generates an InnerClasses attribute for it and for the inner classes it contains.
(8) Deprecated and Synthetic attributes
Deprecated marks a class, field, or method that the program's author no longer recommends using; it is set with the @deprecated tag in a comment in the source code.
Synthetic indicates that a field or method was not produced directly from the Java source code but was added by the compiler.
9. Practical example
Sample code
package org.fenixsoft.clazz;

public class TestClass {
    private int m;

    public int inc() {
        return m + 1;
    }
}
Open the compiled class file in WinHex to inspect its bytes.
Magic number and version number
The first four bytes are the magic number. The next four bytes are the version number, 0x00000033: the minor version is 0x0000 and the major version is 0x0033, which is 51 in decimal, so the class file can be executed by a JDK 1.7 virtual machine.
Constant pool
The next two bytes are 0x0016, which is 22 in decimal, so the constant pool holds 21 constants with index values 1 to 21. There are 11 constant types in the constant pool, and the first byte of every type is its tag byte.
The first constant has the tag byte 0x07 (decimal 7), which stands for CONSTANT_Class_info. As explained earlier, a constant of this type is followed by a 2-byte name_index that points to a CONSTANT_Utf8_info constant. Here the 2 bytes are 0x0002, pointing to the second constant in the constant pool.
The second constant has the tag byte 0x01, which stands for CONSTANT_Utf8_info. The next two bytes are the length, 0x001d, which is 29 in decimal, so a string of 29 contiguous bytes follows.
For characters in the ASCII range 1 to 127, a UTF-8 byte is identical to the ASCII code. The first byte, 0x6f, is 111 in decimal, which is the character o in the ASCII table. Decoding the remaining bytes the same way gives the 29 characters org/fenixsoft/clazz/TestClass.
The third constant has the tag byte 0x07, which stands for CONSTANT_Class_info. Its 2-byte name_index is 0x0004, pointing to the fourth constant in the constant pool.
The fourth constant has the tag byte 0x01, so it is a CONSTANT_Utf8_info constant. The next two bytes are 0x0010, which is 16 in decimal, so a string of 16 contiguous bytes follows.
Those 16 bytes decode to java/lang/Object.
The fifth constant has the tag byte 0x01, so it is a CONSTANT_Utf8_info constant. The next two bytes are 0x0001, so a 1-byte string follows: m.
The sixth constant has the tag byte 0x01, so it is a CONSTANT_Utf8_info constant. The next two bytes are 0x0001, so a 1-byte string follows: I.
The seventh constant has the tag byte 0x01, so it is a CONSTANT_Utf8_info constant. The next two bytes are 0x0006, so a 6-byte string follows: <init>.
The eighth item is ()V.
The ninth item is Code.
The tenth constant has the tag byte 0x0a, which stands for CONSTANT_Methodref_info. Its first 2-byte index is 0x0003, which points to the CONSTANT_Class_info entry for the class that declares the method. Its second 2-byte index is 0x000b, which points to the CONSTANT_NameAndType_info entry for the method's name and type. Together they give java/lang/Object."<init>":()V.
The 11th constant has the tag byte 0x0c, which stands for CONSTANT_NameAndType_info. Its first 2-byte index is 0x0007, which points to the constant holding the field or method name. Its second 2-byte index is 0x0008, which points to the constant holding the field or method descriptor. Together they give "<init>":()V.
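The chain of indices behind the tenth constant can be followed in code. In this sketch the pool is modeled as an Object[] in which a String stands for a Utf8 entry and an int[] stands for the index fields of any other entry; that representation, the class MethodrefDemo, and its methods are hypothetical and only the six entries needed here are filled in.

```java
final class MethodrefDemo {
    // pool[i] is either a String (Utf8 entry) or an int[] of indices into the pool.
    static String describeMethodref(Object[] pool, int index) {
        int[] ref = (int[]) pool[index];           // {class index, name-and-type index}
        int[] classInfo = (int[]) pool[ref[0]];    // {name_index}
        int[] nameAndType = (int[]) pool[ref[1]];  // {name index, descriptor index}
        return pool[classInfo[0]] + ".\"" + pool[nameAndType[0]] + "\":" + pool[nameAndType[1]];
    }

    // Entries 3, 4, 7, 8, 10, and 11 of the example class file's constant pool.
    static Object[] examplePool() {
        Object[] pool = new Object[12];
        pool[3] = new int[] {4};        // Class entry, name at #4
        pool[4] = "java/lang/Object";
        pool[7] = "<init>";
        pool[8] = "()V";
        pool[10] = new int[] {3, 11};   // Methodref entry: class #3, name and type #11
        pool[11] = new int[] {7, 8};    // NameAndType entry: name #7, descriptor #8
        return pool;
    }
}
```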
By the same reasoning,
The 12th item is LineNumberTable
The 13th item is LocalVariableTable
The 14th item is this
The 15th item is Lorg/fenixsoft/clazz/TestClass;
The 16th item is inc
The 17th item is ()I
The 18th item is org/fenixsoft/clazz/TestClass.m:I
The 19th item is m:I
The 20th item is SourceFile
The 21st item is TestClass.java
Access flags
The two bytes that follow the constant pool have the value 0x0021, which combines ACC_PUBLIC and ACC_SUPER.
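Each flag occupies one bit, so the value can be decoded with bit masks. The sketch below uses a hypothetical class ClassAccessFlags and covers only five of the class-level flags: ACC_PUBLIC (0x0001), ACC_FINAL (0x0010), ACC_SUPER (0x0020), ACC_INTERFACE (0x0200), and ACC_ABSTRACT (0x0400).

```java
import java.util.ArrayList;
import java.util.List;

final class ClassAccessFlags {
    // Returns the names of the flags that are set, in ascending bit order.
    static List<String> decode(int flags) {
        List<String> names = new ArrayList<>();
        if ((flags & 0x0001) != 0) names.add("ACC_PUBLIC");
        if ((flags & 0x0010) != 0) names.add("ACC_FINAL");
        if ((flags & 0x0020) != 0) names.add("ACC_SUPER");
        if ((flags & 0x0200) != 0) names.add("ACC_INTERFACE");
        if ((flags & 0x0400) != 0) names.add("ACC_ABSTRACT");
        return names;
    }
}
```

For the example value, 0x0021 = 0x0001 | 0x0020, so the two flags reported are ACC_PUBLIC and ACC_SUPER.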
Class index, parent class index, interface index
The values are 0x0001, 0x0003, and 0x0000: the class index is 1, the parent class index is 3, and the size of the interface index collection is 0. In the constant pool, the first item resolves to org/fenixsoft/clazz/TestClass and the third item resolves to java/lang/Object, so the class is org/fenixsoft/clazz/TestClass, its parent class is java/lang/Object, and it implements no interfaces.
Field
The next two bytes are 0x0001, so the number of fields is 1. Following the structure of the field table, the next items are:
access_flags: 0x0002, which is ACC_PRIVATE
name_index: 0x0005, which is m
descriptor_index: 0x0006, which is I, that is, int
attributes_count: 0x0000, so the number of attributes is 0
That is, the field is private int m.
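The same eight bytes can be decoded programmatically. The class FieldInfoDemo and the method readFixedPart are hypothetical; the sketch reads only the four fixed-width items and does not parse any attributes that might follow.

```java
import java.nio.ByteBuffer;

final class FieldInfoDemo {
    // Reads the four fixed u2 items of one field table entry.
    static int[] readFixedPart(ByteBuffer in) {
        int accessFlags = in.getShort() & 0xFFFF;
        int nameIndex = in.getShort() & 0xFFFF;
        int descriptorIndex = in.getShort() & 0xFFFF;
        int attributesCount = in.getShort() & 0xFFFF;
        return new int[] {accessFlags, nameIndex, descriptorIndex, attributesCount};
    }

    public static void main(String[] args) {
        // The field entry from the example: 0x0002, 0x0005, 0x0006, 0x0000.
        byte[] raw = {0x00, 0x02, 0x00, 0x05, 0x00, 0x06, 0x00, 0x00};
        int[] f = readFixedPart(ByteBuffer.wrap(raw));
        System.out.printf("flags=0x%04X name=#%d descriptor=#%d attributes=%d%n", f[0], f[1], f[2], f[3]);
    }
}
```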
Method
The next two bytes are 0x0002, so the number of methods is 2. Following the structure of the method table, the next items are:
The first method
access_flags: 0x0001, which is ACC_PUBLIC
name_index: 0x0007, which is <init>
descriptor_index: 0x0008, which is ()V, that is, no parameters and a void return
attributes_count: 0x0001, so the number of attributes is 1
Following the attribute structure:
attribute_name_index: 0x0009, which is Code
attribute_length:0x0000, Length 0
The second method is inc(). Methods inherited from the parent class do not appear in the method table collection unless the subclass overrides them, but methods added by the compiler do appear, such as the instance constructor <init> seen here; this class has no class initializer <clinit>.