1. A class file is a binary stream whose basic unit is the 8-bit byte. The data items are laid out in strict order with no separators between them, so almost everything stored in the file is data the program needs in order to run, with no padding. A data item that needs more than 8 bits is split into several bytes and stored big-endian, that is, with the high-order byte first.
2. According to the Java Virtual Machine Specification, the class file format is described by a pseudo-structure similar to a C struct. This pseudo-structure has only two kinds of data type: unsigned numbers and tables.
3. Unsigned numbers are the basic data types. u1, u2, u4, and u8 denote unsigned numbers of 1, 2, 4, and 8 bytes respectively. An unsigned number can describe a number, an index reference, a quantity, or a string value encoded in UTF-8.
4. A table is a composite data type made up of unsigned numbers or other tables as its data items. By convention, table names end in "_info". Tables describe hierarchically composed data, and the entire class file is essentially one table made up of the following data items.
Class file format

| Type | Name | Quantity |
|---|---|---|
| u4 | magic | 1 |
| u2 | minor_version | 1 |
| u2 | major_version | 1 |
| u2 | constant_pool_count | 1 |
| cp_info | constant_pool | constant_pool_count - 1 |
| u2 | access_flags | 1 |
| u2 | this_class | 1 |
| u2 | super_class | 1 |
| u2 | interfaces_count | 1 |
| u2 | interfaces | interfaces_count |
| u2 | fields_count | 1 |
| field_info | fields | fields_count |
| u2 | methods_count | 1 |
| method_info | methods | methods_count |
| u2 | attributes_count | 1 |
| attribute_info | attributes | attributes_count |
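The first four items of the table can be read with a few big-endian reads. The following is a minimal sketch, not a full parser: `ClassFileHeader` is a hypothetical helper name, and the sample bytes in `main` (magic `0xCAFEBABE`, major version 52) are illustrative values that the text above does not state. It relies on `java.io.DataInputStream`, which reads multi-byte values high-order byte first.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

// Hypothetical helper for illustration: holds the fixed-size items at the start of a class file.
final class ClassFileHeader {
    final long magic;            // u4, kept in a long so the value stays unsigned
    final int minorVersion;      // u2
    final int majorVersion;      // u2
    final int constantPoolCount; // u2

    private ClassFileHeader(long magic, int minor, int major, int cpCount) {
        this.magic = magic;
        this.minorVersion = minor;
        this.majorVersion = major;
        this.constantPoolCount = cpCount;
    }

    static ClassFileHeader read(DataInputStream in) throws IOException {
        // DataInputStream reads big-endian, matching the class file byte order.
        long magic = in.readInt() & 0xFFFFFFFFL;
        int minor = in.readUnsignedShort();
        int major = in.readUnsignedShort();
        int cpCount = in.readUnsignedShort();
        return new ClassFileHeader(magic, minor, major, cpCount);
    }

    public static void main(String[] args) throws IOException {
        // Sample bytes (an assumption for the demo, not taken from a real file).
        byte[] sample = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0, 0, 0, 0x34, 0, 0x10};
        ClassFileHeader h = read(new DataInputStream(new ByteArrayInputStream(sample)));
        System.out.printf("magic=%08X version=%d.%d constant_pool_count=%d%n",
                h.magic, h.majorVersion, h.minorVersion, h.constantPoolCount);
    }
}
```

Keeping `magic` in a `long` is a design choice of this sketch: Java has no unsigned 32-bit type, so masking with `0xFFFFFFFFL` preserves the u4 value.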
5. cp_info: the constant pool mainly stores two kinds of constants: literals and symbolic references. Literals are close to the notion of a constant at the Java language level, such as text strings and values declared final. Symbolic references are a concept from compiler theory and cover the following three kinds of constant:
- Fully qualified names of classes and interfaces
- Field names and descriptors
- Method names and descriptors
Every constant in the constant pool is a table. There are 11 kinds of table, each with its own structure, but they share one feature: the first item of every table is a u1 tag (values 1 to 12, with no type defined for tag 2) that identifies the constant type of the entry. The 11 constant types are as follows:
Constant pool item types

| Type | Tag | Description |
|---|---|---|
| CONSTANT_Utf8_info | 1 | UTF-8 encoded string |
| CONSTANT_Integer_info | 3 | Integer literal |
| CONSTANT_Float_info | 4 | Floating-point literal |
| CONSTANT_Long_info | 5 | Long integer literal |
| CONSTANT_Double_info | 6 | Double-precision floating-point literal |
| CONSTANT_Class_info | 7 | Symbolic reference to a class or interface |
| CONSTANT_String_info | 8 | String literal |
| CONSTANT_Fieldref_info | 9 | Symbolic reference to a field |
| CONSTANT_Methodref_info | 10 | Symbolic reference to a method of a class |
| CONSTANT_InterfaceMethodref_info | 11 | Symbolic reference to a method of an interface |
| CONSTANT_NameAndType_info | 12 | Partial symbolic reference to a field or method |
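The tag-to-type mapping in this table can be written as a simple lookup. This is only a sketch of the 11 types listed here; `ConstantTag` and `nameOf` are hypothetical names chosen for illustration, and any tag outside the table (including the unused tag 2) is rejected.

```java
// Hypothetical lookup for illustration: maps a u1 tag to the constant type name from the table.
final class ConstantTag {
    static String nameOf(int tag) {
        switch (tag) {
            case 1:  return "CONSTANT_Utf8_info";
            case 3:  return "CONSTANT_Integer_info";
            case 4:  return "CONSTANT_Float_info";
            case 5:  return "CONSTANT_Long_info";
            case 6:  return "CONSTANT_Double_info";
            case 7:  return "CONSTANT_Class_info";
            case 8:  return "CONSTANT_String_info";
            case 9:  return "CONSTANT_Fieldref_info";
            case 10: return "CONSTANT_Methodref_info";
            case 11: return "CONSTANT_InterfaceMethodref_info";
            case 12: return "CONSTANT_NameAndType_info";
            default:
                // Tag 2 has no type, and anything else is outside the 11 types covered here.
                throw new IllegalArgumentException("unknown constant pool tag: " + tag);
        }
    }

    public static void main(String[] args) {
        System.out.println(nameOf(7));
    }
}
```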
The constant pool is the most tedious part of the class file because each of the 11 constant types has its own structure. Their structures are summarized below:
Structure of the 11 constant pool types

| Constant | Item | Type | Description |
|---|---|---|---|
| CONSTANT_Utf8_info | tag | u1 | Value is 1 |
| | length | u2 | Number of bytes occupied by the UTF-8 encoded string |
| | bytes | u1 | The UTF-8 encoded string, `length` bytes long |
| CONSTANT_Integer_info | tag | u1 | Value is 3 |
| | bytes | u4 | int value, stored big-endian (high-order byte first) |
| CONSTANT_Float_info | tag | u1 | Value is 4 |
| | bytes | u4 | float value, stored big-endian |
| CONSTANT_Long_info | tag | u1 | Value is 5 |
| | bytes | u8 | long value, stored big-endian |
| CONSTANT_Double_info | tag | u1 | Value is 6 |
| | bytes | u8 | double value, stored big-endian |
| CONSTANT_Class_info | tag | u1 | Value is 7 |
| | index | u2 | Index of the constant holding the fully qualified name |
| CONSTANT_String_info | tag | u1 | Value is 8 |
| | index | u2 | Index of the string literal |
| CONSTANT_Fieldref_info | tag | u1 | Value is 9 |
| | index | u2 | Index of the CONSTANT_Class_info entry for the class or interface that declares the field |
| | index | u2 | Index of the CONSTANT_NameAndType_info entry for the field |
| CONSTANT_Methodref_info | tag | u1 | Value is 10 |
| | index | u2 | Index of the CONSTANT_Class_info entry for the class that declares the method |
| | index | u2 | Index of the CONSTANT_NameAndType_info entry for the method's name and type |
| CONSTANT_InterfaceMethodref_info | tag | u1 | Value is 11 |
| | index | u2 | Index of the CONSTANT_Class_info entry for the interface that declares the method |
| | index | u2 | Index of the CONSTANT_NameAndType_info entry for the method's name and type |
| CONSTANT_NameAndType_info | tag | u1 | Value is 12 |
| | index | u2 | Index of the constant holding the field or method name |
| | index | u2 | Index of the constant holding the field or method descriptor |
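The structures above can be walked entry by entry by reading the tag first and then the items that belong to that tag. The sketch below is one possible way to do it, limited to these 11 types; `ConstantPoolReader` is a hypothetical name. It makes two assumptions beyond the tables: it relies on the specification rule that a long or double constant takes up two constant pool slots, and it decodes Utf8 bytes as standard UTF-8, which is a simplification because class files use a modified UTF-8 encoding.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

// Hypothetical reader for illustration: turns each constant pool entry into a printable string.
final class ConstantPoolReader {
    private static final String[] REF_KINDS = {"Fieldref", "Methodref", "InterfaceMethodref", "NameAndType"};

    /** Slot 0 is unused; the slot following a long or double entry stays null. */
    static String[] read(DataInputStream in, int constantPoolCount) throws IOException {
        String[] pool = new String[constantPoolCount];
        for (int i = 1; i < constantPoolCount; i++) {
            int tag = in.readUnsignedByte();
            switch (tag) {
                case 1: {
                    byte[] bytes = new byte[in.readUnsignedShort()]; // length, then the bytes
                    in.readFully(bytes);
                    // Simplification: real class files use modified UTF-8.
                    pool[i] = "Utf8 " + new String(bytes, StandardCharsets.UTF_8);
                    break;
                }
                case 3: pool[i] = "Integer " + in.readInt(); break;
                case 4: pool[i] = "Float " + in.readFloat(); break;
                case 5: pool[i] = "Long " + in.readLong(); i++; break;     // occupies two slots
                case 6: pool[i] = "Double " + in.readDouble(); i++; break; // occupies two slots
                case 7: pool[i] = "Class #" + in.readUnsignedShort(); break;
                case 8: pool[i] = "String #" + in.readUnsignedShort(); break;
                case 9: case 10: case 11: case 12: {
                    int first = in.readUnsignedShort();  // first u2 index
                    int second = in.readUnsignedShort(); // second u2 index
                    pool[i] = REF_KINDS[tag - 9] + " #" + first + ".#" + second;
                    break;
                }
                default:
                    throw new IOException("unsupported constant pool tag: " + tag);
            }
        }
        return pool;
    }

    public static void main(String[] args) throws IOException {
        // Hand-built sample pool: Utf8 "Foo", Class #1, Long 42 (constant_pool_count = 5).
        byte[] data = {1, 0, 3, 0x46, 0x6F, 0x6F, 7, 0, 1, 5, 0, 0, 0, 0, 0, 0, 0, 42};
        String[] pool = read(new DataInputStream(new ByteArrayInputStream(data)), 5);
        for (int i = 1; i < pool.length; i++) {
            System.out.println("#" + i + " = " + pool[i]);
        }
    }
}
```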
You can use the -verbose option of the javap tool to print the bytecode content of the TestClass.class file.
6. The field table (field_info) describes the variables declared in an interface or class. Fields include class-level (static) variables and instance-level variables, but not local variables declared inside methods. The field table structure is as follows:
Field table structure

| Type | Name | Quantity |
|---|---|---|
| u2 | access_flags | 1 |
| u2 | name_index | 1 |
| u2 | descriptor_index | 1 |
| u2 | attributes_count | 1 |
| attribute_info | attributes | attributes_count |
7. The method table (method_info) has the same structure as the field table. The two differ only in which access flags and which attributes are available.
Method table structure

| Type | Name | Quantity |
|---|---|---|
| u2 | access_flags | 1 |
| u2 | name_index | 1 |
| u2 | descriptor_index | 1 |
| u2 | attributes_count | 1 |
| attribute_info | attributes | attributes_count |
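Because field_info and method_info share one layout, a single reader can serve both. The sketch below is a minimal illustration under that assumption: `MemberInfo` is a hypothetical name, and the attributes are skipped rather than interpreted, using the fact that every attribute starts with a u2 name index and a u4 length.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;

// Hypothetical type for illustration: the four u2 items shared by field_info and method_info.
final class MemberInfo {
    final int accessFlags;
    final int nameIndex;
    final int descriptorIndex;
    final int attributesCount;

    private MemberInfo(int accessFlags, int nameIndex, int descriptorIndex, int attributesCount) {
        this.accessFlags = accessFlags;
        this.nameIndex = nameIndex;
        this.descriptorIndex = descriptorIndex;
        this.attributesCount = attributesCount;
    }

    static MemberInfo read(DataInputStream in) throws IOException {
        int access = in.readUnsignedShort();
        int name = in.readUnsignedShort();
        int descriptor = in.readUnsignedShort();
        int attrCount = in.readUnsignedShort();
        for (int i = 0; i < attrCount; i++) {
            in.readUnsignedShort();                   // attribute_name_index, unused in this sketch
            long length = in.readInt() & 0xFFFFFFFFL; // attribute_length is a u4
            skipFully(in, length);                    // jump over the attribute body
        }
        return new MemberInfo(access, name, descriptor, attrCount);
    }

    private static void skipFully(DataInputStream in, long n) throws IOException {
        while (n > 0) {
            int skipped = in.skipBytes((int) Math.min(n, Integer.MAX_VALUE));
            if (skipped <= 0) throw new EOFException("truncated attribute");
            n -= skipped;
        }
    }

    public static void main(String[] args) throws IOException {
        // Hand-built sample member with one 2-byte attribute.
        byte[] data = {0, 1, 0, 5, 0, 6, 0, 1, 0, 7, 0, 0, 0, 2, 0, 8};
        MemberInfo m = read(new DataInputStream(new ByteArrayInputStream(data)));
        System.out.println("name_index=" + m.nameIndex + " descriptor_index=" + m.descriptorIndex);
    }
}
```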
8. Attribute tables (attribute_info): the class file, the field table, and the method table can each carry their own set of attribute tables to describe information specific to certain scenarios.
Unlike the other data items in the class file, which have strict requirements on order, length, and content, the rules for attribute tables are somewhat looser. Attribute tables need not appear in any particular order, and any compiler may write attribute information of its own into the attribute table set as long as the name does not clash with an existing attribute name. The Java Virtual Machine ignores attributes it does not recognize at run time. So that class files can be parsed correctly, the Java Virtual Machine Specification (Second Edition) predefines 9 attributes that a virtual machine implementation should recognize, as shown below:
Attributes predefined by the virtual machine specification

| Attribute name | Where used | Description |
|---|---|---|
| Code | Method table | Bytecode instructions compiled from Java code |
| ConstantValue | Field table | Constant value defined with the final keyword |
| Deprecated | Class file, method table, and field table | Methods and fields declared as deprecated |
| Exceptions | Method table | Exceptions thrown by the method |
| InnerClasses | Class file | List of inner classes |
| LineNumberTable | Code attribute | Mapping between Java source line numbers and bytecode instructions |
| LocalVariableTable | Code attribute | Descriptions of the method's local variables |
| SourceFile | Class file | Name of the source file |
| Synthetic | Class file, method table, and field table | Marks a method or field as generated automatically by the compiler |
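The rule that unrecognized attributes are ignored can be sketched as a filter over an attribute table set. This is an illustration rather than how any particular virtual machine does it: `AttributeFilter` is a hypothetical name, the `utf8` array is an assumed stand-in for a parsed constant pool (entry `i` holds the text of constant `i`), and only the 9 names from the table count as recognized. A later attribute with the same name replaces an earlier one in this simplified version.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical filter for illustration: keeps predefined attributes and drops everything else.
final class AttributeFilter {
    static final Set<String> PREDEFINED = new HashSet<>(Arrays.asList(
            "Code", "ConstantValue", "Deprecated", "Exceptions", "InnerClasses",
            "LineNumberTable", "LocalVariableTable", "SourceFile", "Synthetic"));

    /** Reads attributes_count followed by that many attributes; returns the recognized ones by name. */
    static Map<String, byte[]> read(DataInputStream in, String[] utf8) throws IOException {
        Map<String, byte[]> known = new LinkedHashMap<>();
        int count = in.readUnsignedShort();
        for (int i = 0; i < count; i++) {
            String name = utf8[in.readUnsignedShort()]; // attribute_name_index
            int length = in.readInt();                  // attribute_length (u4)
            if (length < 0) throw new IOException("attribute too large for this sketch");
            byte[] info = new byte[length];
            in.readFully(info);                         // the body must be consumed either way
            if (PREDEFINED.contains(name)) {
                known.put(name, info);
            }
            // Otherwise the attribute is ignored, as the text above describes.
        }
        return known;
    }

    public static void main(String[] args) throws IOException {
        String[] utf8 = {null, "SourceFile", "MyVendorAttr"}; // assumed constant pool texts
        byte[] data = {0, 2, 0, 2, 0, 0, 0, 3, 1, 2, 3, 0, 1, 0, 0, 0, 2, 0, 5};
        System.out.println(read(new DataInputStream(new ByteArrayInputStream(data)), utf8).keySet());
    }
}
```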
Code attribute. After javac compiles the body of a Java method, the resulting bytecode instructions are stored in the Code attribute. The Code attribute appears in the attribute set of the method table, but not every method has one: methods declared in an interface or declared abstract, for example, have no Code attribute. When a method table does have a Code attribute, its structure is as follows:
Structure of the Code attribute

| Type | Name | Quantity |
|---|---|---|
| u2 | attribute_name_index | 1 |
| u4 | attribute_length | 1 |
| u2 | max_stack | 1 |
| u2 | max_locals | 1 |
| u4 | code_length | 1 |
| u1 | code | code_length |
| u2 | exception_table_length | 1 |
| exception_info | exception_table | exception_table_length |
| u2 | attributes_count | 1 |
| attribute_info | attributes | attributes_count |
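Reading a Code attribute follows the table item by item. The sketch below is a simplified illustration: `CodeAttribute` is a hypothetical name, each exception_table entry is assumed to consist of four u2 values (start_pc, end_pc, handler_pc, catch_type) as laid out in the specification, and the nested attributes after attributes_count are not parsed.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

// Hypothetical type for illustration: the leading items of a Code attribute.
final class CodeAttribute {
    final int maxStack;
    final int maxLocals;
    final byte[] code;
    final int exceptionTableLength;
    final int attributesCount;

    private CodeAttribute(int maxStack, int maxLocals, byte[] code, int exceptionTableLength, int attributesCount) {
        this.maxStack = maxStack;
        this.maxLocals = maxLocals;
        this.code = code;
        this.exceptionTableLength = exceptionTableLength;
        this.attributesCount = attributesCount;
    }

    static CodeAttribute read(DataInputStream in) throws IOException {
        in.readUnsignedShort();              // attribute_name_index, not resolved in this sketch
        in.readInt();                        // attribute_length, not needed when reading item by item
        int maxStack = in.readUnsignedShort();
        int maxLocals = in.readUnsignedShort();
        int codeLength = in.readInt();       // u4
        if (codeLength < 0) throw new IOException("code too large for this sketch");
        byte[] code = new byte[codeLength];
        in.readFully(code);                  // code_length bytes of bytecode
        int exceptionTableLength = in.readUnsignedShort();
        for (int i = 0; i < exceptionTableLength; i++) {
            in.readUnsignedShort();          // start_pc
            in.readUnsignedShort();          // end_pc
            in.readUnsignedShort();          // handler_pc
            in.readUnsignedShort();          // catch_type
        }
        int attributesCount = in.readUnsignedShort(); // nested attributes are left unread
        return new CodeAttribute(maxStack, maxLocals, code, exceptionTableLength, attributesCount);
    }

    public static void main(String[] args) throws IOException {
        // Hand-built sample: 5 bytes of code (aload_0, invokespecial #1, return), no exception handlers.
        byte[] data = {0, 9, 0, 0, 0, 17, 0, 1, 0, 1, 0, 0, 0, 5,
                0x2A, (byte) 0xB7, 0, 1, (byte) 0xB1, 0, 0, 0, 0};
        CodeAttribute c = read(new DataInputStream(new ByteArrayInputStream(data)));
        System.out.println("max_stack=" + c.maxStack + " max_locals=" + c.maxLocals + " code_length=" + c.code.length);
    }
}
```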