The class file format stores data in a structure similar to a C struct. It has only two data types: unsigned numbers and tables.
Unsigned numbers are the basic data type. The types u1, u2, u4, and u8 denote data items that occupy 1, 2, 4, and 8 bytes in the class file, respectively.
A table is a composite data type made up of multiple unsigned numbers or other tables as data items. By convention, all table names end with "_info". In this sense, the entire class file is itself one table.
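As a rough illustration of the unsigned types described above, the following sketch reads u1, u2, and u4 values from a byte array. The class name UnsignedReader and its method names are made up for this example. It relies on the fact that class files store multi-byte items in big-endian order, which is also the default byte order of java.nio.ByteBuffer. u8 is left out because Java has no unsigned 64-bit primitive.

```java
import java.nio.ByteBuffer;

// Hypothetical helper (not part of any library): reads class-file unsigned numbers.
final class UnsignedReader {
    // ByteBuffer defaults to big-endian, the same byte order class files use.
    private final ByteBuffer buf;

    UnsignedReader(byte[] data) {
        this.buf = ByteBuffer.wrap(data);
    }

    // u1: one byte, widened to int so that values above 127 stay positive
    int u1() {
        return buf.get() & 0xFF;
    }

    // u2: two bytes
    int u2() {
        return buf.getShort() & 0xFFFF;
    }

    // u4: four bytes, widened to long so that the top bit is not read as a sign
    long u4() {
        return buf.getInt() & 0xFFFFFFFFL;
    }

    public static void main(String[] args) {
        byte[] data = {
            (byte) 0xFF,                                        // u1
            (byte) 0xFF, (byte) 0xFE,                           // u2
            (byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE  // u4
        };
        UnsignedReader r = new UnsignedReader(data);
        System.out.println(r.u1());                   // 255
        System.out.println(r.u2());                   // 65534
        System.out.println(Long.toHexString(r.u4())); // cafebabe
    }
}
```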
The following table lists the data items in the class file, in the order in which they appear:
Type | Name | Number
u4 | magic | 1
u2 | minor_version | 1
u2 | major_version | 1
u2 | constant_pool_count | 1
cp_info | constant_pool | constant_pool_count-1
u2 | access_flags | 1
u2 | this_class | 1
u2 | super_class | 1
u2 | interfaces_count | 1
u2 | interfaces | interfaces_count
u2 | fields_count | 1
field_info | fields | fields_count
u2 | methods_count | 1
method_info | methods | methods_count
u2 | attributes_count | 1
attribute_info | attributes | attributes_count
Magic number and class file version
magic
The first four bytes of the class file hold the magic number, which identifies the file as a class file and has the fixed value 0xCAFEBABE. In other words, it is the criterion for deciding whether a file is in class format: if the first four bytes are not 0xCAFEBABE, the file is not a class file and the JVM will not accept it.
minor_version and major_version
The four bytes following the magic number hold the minor version number (minor_version) and the major version number (major_version) of the class file. As Java evolves, the class file format changes with it, and the version number indicates which revision of the format, with its added or changed features, the file uses. Different versions of the javac compiler may produce class files with different version numbers, and different versions of the JVM may recognize different ranges of version numbers. In general, a newer JVM can load class files compiled by an older javac, but an older JVM cannot load class files compiled by a newer javac. If an older JVM is asked to execute a class file with a newer version number, it throws java.lang.UnsupportedClassVersionError.
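The magic number check and the two version fields can be sketched as follows. This is a minimal example and not how any particular JVM does it. The class name ClassHeader and the hand-built eight-byte header are assumptions made for illustration. The sample uses major version 52, which is the value javac emits for Java 8.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch: validates the magic number and reads the version fields.
final class ClassHeader {
    static final int MAGIC = 0xCAFEBABE;

    final int minorVersion;
    final int majorVersion;

    private ClassHeader(int minorVersion, int majorVersion) {
        this.minorVersion = minorVersion;
        this.majorVersion = majorVersion;
    }

    static ClassHeader parse(byte[] classBytes) {
        ByteBuffer buf = ByteBuffer.wrap(classBytes); // big-endian by default
        // u4 magic: anything other than 0xCAFEBABE is not a class file
        if (buf.remaining() < 8 || buf.getInt() != MAGIC) {
            throw new IllegalArgumentException("not a class file: bad magic number");
        }
        int minor = buf.getShort() & 0xFFFF; // u2 minor_version
        int major = buf.getShort() & 0xFFFF; // u2 major_version
        return new ClassHeader(minor, major);
    }

    public static void main(String[] args) {
        // Hand-built header for illustration: magic, minor_version 0, major_version 52
        byte[] header = {
            (byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE,
            0x00, 0x00,
            0x00, 0x34
        };
        ClassHeader h = parse(header);
        System.out.println(h.majorVersion + "." + h.minorVersion); // 52.0
    }
}
```

To try it on a real file, the bytes of any compiled .class file can be passed to parse instead of the hand-built array.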
Constant pool
In the class file, the version numbers are followed by the constant pool data items. The constant pool can be thought of as the resource repository of the class file. It is the data item most often referenced by other items in the class file structure, and it is also one of the items that takes up the most space in the class file.
The constant pool holds two main kinds of constants: literals and symbolic references. Literals are close to the notion of constants at the Java language level, such as text strings and the values of constants declared final. Symbolic references cover the following three kinds of constants:
- The fully qualified names of classes and interfaces (that is, the class name together with the package name, such as org.lxh.test.TestClass)
- The names and descriptors of fields (a descriptor encodes the field's data type)
- The names and descriptors of methods (a descriptor encodes the method's parameter types and return type)
Java has no separate link step at compile time. Instead, the virtual machine links dynamically when it loads the class file. In other words, the class file does not store the final memory layout of methods and fields, so the virtual machine cannot use the symbolic references to these fields and methods directly without conversion. At run time, the virtual machine obtains the corresponding symbolic reference from the constant pool and, during the resolution phase of class loading, replaces it with a direct reference that maps to a concrete memory address.
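To give a feel for what the names and descriptors in symbolic references look like, the sketch below prints a class name in the slash-separated form used inside the class file, plus two method descriptors. The class DescriptorDemo and its helper methods are hypothetical. The descriptor strings come from the standard java.lang.invoke.MethodType API.

```java
import java.lang.invoke.MethodType;

// Hypothetical demo: shows the textual form of names and descriptors.
final class DescriptorDemo {
    // Inside the class file, fully qualified names use '/' instead of '.'.
    static String internalName(Class<?> c) {
        return c.getName().replace('.', '/');
    }

    // A method descriptor lists the parameter types in parentheses, then the return type.
    static String methodDescriptor(Class<?> returnType, Class<?>... parameterTypes) {
        return MethodType.methodType(returnType, parameterTypes).toMethodDescriptorString();
    }

    public static void main(String[] args) {
        System.out.println(internalName(String.class));                            // java/lang/String
        System.out.println(methodDescriptor(int.class, String.class, long.class)); // (Ljava/lang/String;J)I
        System.out.println(methodDescriptor(void.class));                          // ()V
    }
}
```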
Access flags
Flag name | Flag value | Meaning | Applies to
ACC_PUBLIC | 0x0001 | Public type | All types
ACC_FINAL | 0x0010 | Final type | Classes
ACC_SUPER | 0x0020 | Uses the new invokespecial semantics | Classes and interfaces
ACC_INTERFACE | 0x0200 | Interface type | Interfaces
ACC_ABSTRACT | 0x0400 | Abstract type | Classes and interfaces
ACC_SYNTHETIC | 0x1000 | Not generated from user code | All types
ACC_ANNOTATION | 0x2000 | Annotation type | Annotations
ACC_ENUM | 0x4000 | Enum type | Enumerations
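Because each flag in the table above occupies its own bit, a u2 access_flags value can be decoded with bitwise AND. The following is a small sketch. The class ClassAccessFlags and its decode method are made-up names, and only the flags listed in the table are handled.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical decoder for the class-level access flags listed in the table.
final class ClassAccessFlags {
    private static final int[] VALUES = {
        0x0001, 0x0010, 0x0020, 0x0200, 0x0400, 0x1000, 0x2000, 0x4000
    };
    private static final String[] NAMES = {
        "ACC_PUBLIC", "ACC_FINAL", "ACC_SUPER", "ACC_INTERFACE",
        "ACC_ABSTRACT", "ACC_SYNTHETIC", "ACC_ANNOTATION", "ACC_ENUM"
    };

    // Returns the names of all flags whose bit is set in accessFlags.
    static List<String> decode(int accessFlags) {
        List<String> set = new ArrayList<>();
        for (int i = 0; i < VALUES.length; i++) {
            if ((accessFlags & VALUES[i]) != 0) {
                set.add(NAMES[i]);
            }
        }
        return set;
    }

    public static void main(String[] args) {
        System.out.println(decode(0x0021)); // [ACC_PUBLIC, ACC_SUPER]
        System.out.println(decode(0x0601)); // [ACC_PUBLIC, ACC_INTERFACE, ACC_ABSTRACT]
    }
}
```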
Class index, superclass index, and interface index collection
The class index (this_class) and the superclass index (super_class) are each a single u2 value, and the interface index collection (interfaces) is a set of u2 values. The class file uses these three items to determine the inheritance relationships of the class. They follow the access flags in that order. The class index and the superclass index each point to a class description constant of type CONSTANT_Class_info, and the index value stored in that constant leads to the fully qualified name string, which is defined in a constant of type CONSTANT_Utf8_info. The interface index collection describes which interfaces the class implements. The implemented interfaces are listed in the collection from left to right in the order in which they appear after the implements keyword (or after the extends keyword, if the class itself is an interface).
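The two-step lookup described above, from a class index to a CONSTANT_Class_info entry and then to a CONSTANT_Utf8_info entry, can be sketched with a toy constant pool. The types ClassInfo and Utf8Info, the sample pool, and the index values are simplified stand-ins invented for this example, not the real binary layout.

```java
// Hypothetical model of the class index lookup, using a toy constant pool.
final class ClassIndexDemo {
    // Simplified stand-in for CONSTANT_Utf8_info
    static final class Utf8Info {
        final String value;
        Utf8Info(String value) { this.value = value; }
    }

    // Simplified stand-in for CONSTANT_Class_info: holds only the name index
    static final class ClassInfo {
        final int nameIndex;
        ClassInfo(int nameIndex) { this.nameIndex = nameIndex; }
    }

    // Follows classIndex to a ClassInfo entry, then its name index to a Utf8Info entry.
    static String resolveClassName(Object[] pool, int classIndex) {
        ClassInfo classInfo = (ClassInfo) pool[classIndex];
        Utf8Info name = (Utf8Info) pool[classInfo.nameIndex];
        return name.value;
    }

    // Sample pool for illustration; slot 0 is unused because pool indexes start at 1.
    static Object[] samplePool() {
        Object[] pool = new Object[5];
        pool[1] = new ClassInfo(2);
        pool[2] = new Utf8Info("org/lxh/test/TestClass");
        pool[3] = new ClassInfo(4);
        pool[4] = new Utf8Info("java/lang/Object");
        return pool;
    }

    public static void main(String[] args) {
        Object[] pool = samplePool();
        int thisClass = 1;  // assumed value of this_class
        int superClass = 3; // assumed value of super_class
        System.out.println(resolveClassName(pool, thisClass));  // org/lxh/test/TestClass
        System.out.println(resolveClassName(pool, superClass)); // java/lang/Object
    }
}
```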
Field table collection
The field table (field_info) describes the variables declared in an interface or class. Fields include class-level (static) variables and instance-level variables, but not local variables declared inside methods. A field's modifiers are recorded as flag bits, while its name and data type have no fixed length and can only be described by referring to constants in the constant pool. The field table has the following format:
Type | Name | Number
u2 | access_flags | 1
u2 | name_index | 1
u2 | descriptor_index | 1
u2 | attributes_count | 1
attribute_info | attributes | attributes_count
The access_flags item is very similar to the access_flags of a class: it is a set of modifier flags for the field, such as public, static, and volatile. The name_index and descriptor_index items that follow are both references into the constant pool, giving the simple name of the field and the descriptor of the field, respectively. After them comes an attribute table collection that stores additional information.
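The fixed part of a field_info entry is four u2 values, which the following sketch reads from a byte buffer. The class FieldInfoHeader and the sample bytes are assumptions for illustration. The sample uses flags 0x001A (private, static, and final in the field flag set) and no attributes, and any attribute data that follows is deliberately left unparsed here.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch: reads the four fixed u2 items at the start of a field_info entry.
final class FieldInfoHeader {
    final int accessFlags;
    final int nameIndex;
    final int descriptorIndex;
    final int attributesCount;

    private FieldInfoHeader(int accessFlags, int nameIndex, int descriptorIndex, int attributesCount) {
        this.accessFlags = accessFlags;
        this.nameIndex = nameIndex;
        this.descriptorIndex = descriptorIndex;
        this.attributesCount = attributesCount;
    }

    // Reads in file order; the attribute tables (if any) follow and are not parsed here.
    static FieldInfoHeader read(ByteBuffer buf) {
        int accessFlags = buf.getShort() & 0xFFFF;     // u2 access_flags
        int nameIndex = buf.getShort() & 0xFFFF;       // u2 name_index
        int descriptorIndex = buf.getShort() & 0xFFFF; // u2 descriptor_index
        int attributesCount = buf.getShort() & 0xFFFF; // u2 attributes_count
        return new FieldInfoHeader(accessFlags, nameIndex, descriptorIndex, attributesCount);
    }

    public static void main(String[] args) {
        // Sample bytes for illustration: flags 0x001A, name #5, descriptor #6, no attributes
        byte[] bytes = {0x00, 0x1A, 0x00, 0x05, 0x00, 0x06, 0x00, 0x00};
        FieldInfoHeader f = read(ByteBuffer.wrap(bytes));
        System.out.println("access=0x" + Integer.toHexString(f.accessFlags)
                + " name=#" + f.nameIndex
                + " descriptor=#" + f.descriptorIndex
                + " attributes=" + f.attributesCount);
    }
}
```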
One last point: fields inherited from a superclass or superinterface are not listed in the field table collection, but fields that do not exist in the original Java source may appear. For example, to maintain access to its enclosing class, an inner class automatically gets a field that points to the instance of the outer class.
Method table collection
The structure of the method table (method_info) is the same as that of the field table, so it is not repeated here. The Java code in a method body is compiled into bytecode instructions and stored in an attribute named "Code" in the method's attribute table collection. The attribute table is described later.
As with the field table collection, methods from the superclass do not appear in the method table collection unless the subclass overrides them. However, methods added automatically by the compiler may appear, most typically the class initializer "<clinit>" and the instance constructor "<init>".
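Reflection gives an approximate view of this rule without parsing the class file: Class.getDeclaredMethods reports only the methods a class itself declares. The sketch below uses two made-up nested classes, Parent and Child. Note that reflection is only a stand-in here, since it does not report "<init>" or "<clinit>" as methods, and the helper filters out synthetic methods so that only source-level declarations are shown.

```java
import java.lang.reflect.Method;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical demo: an inherited method that is not overridden is not declared by the subclass.
final class MethodTableDemo {
    static class Parent {
        void inherited() { }
        void overridden() { }
    }

    static class Child extends Parent {
        @Override
        void overridden() { }
    }

    // Collects the names of the methods declared directly by c, sorted alphabetically.
    static Set<String> declaredMethodNames(Class<?> c) {
        Set<String> names = new TreeSet<>();
        for (Method m : c.getDeclaredMethods()) {
            if (!m.isSynthetic()) { // skip compiler- or tool-generated methods
                names.add(m.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(declaredMethodNames(Parent.class)); // [inherited, overridden]
        System.out.println(declaredMethodNames(Child.class));  // [overridden]
    }
}
```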
In the Java language, overloading a method requires, in addition to the same simple name as the original method, a signature that differs from the original method's. The signature is the collection of constant pool symbolic references for each of the method's parameters. Because the return value is not part of the signature, the Java language cannot overload an existing method by changing only its return type.
Attribute table collection
The attribute table (attribute_info) has already appeared many times above. Class files, field tables, and method tables can each carry their own attribute table collection to describe information specific to certain scenarios.
Unlike the other data items in the class file, which have strict requirements on order, length, and content, the attribute table collection is slightly looser. The individual attribute tables do not have to appear in a strict order, and as long as a name does not clash with an existing attribute name, any compiler implementation can write attribute information of its own definition into the attribute table. The Java virtual machine ignores attributes that it does not recognize at run time.
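Skipping unrecognized attributes is straightforward because every attribute_info entry starts with a u2 attribute_name_index and a u4 attribute_length, so a reader can step over a body it does not understand. The sketch below illustrates the idea. The class AttributeSkipper, the set of "known" names, the map standing in for the constant pool, and the sample bytes (including the custom attribute name MyCustomAttr) are all assumptions made for this example.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: walks a run of attribute_info entries and ignores unknown ones.
final class AttributeSkipper {
    // Attribute names this toy reader claims to understand (an assumption for the demo)
    private static final Set<String> KNOWN = new HashSet<>(Arrays.asList("Code", "ConstantValue"));

    // utf8ByIndex stands in for the constant pool: index -> CONSTANT_Utf8_info string.
    static List<String> readRecognized(ByteBuffer buf, int attributesCount, Map<Integer, String> utf8ByIndex) {
        List<String> recognized = new ArrayList<>();
        for (int i = 0; i < attributesCount; i++) {
            int nameIndex = buf.getShort() & 0xFFFF; // u2 attribute_name_index
            int length = buf.getInt();               // u4 attribute_length (assumed to fit in an int)
            String name = utf8ByIndex.get(nameIndex);
            if (KNOWN.contains(name)) {
                recognized.add(name);
            }
            // This sketch never interprets the body, so it steps over it in both cases.
            buf.position(buf.position() + length);
        }
        return recognized;
    }

    // Sample data: one custom attribute (3-byte body), then ConstantValue (2-byte body).
    static byte[] sampleBytes() {
        return new byte[] {
            0x00, 0x09, 0x00, 0x00, 0x00, 0x03, 0x01, 0x02, 0x03,
            0x00, 0x07, 0x00, 0x00, 0x00, 0x02, 0x00, 0x08
        };
    }

    static Map<Integer, String> sampleNames() {
        Map<Integer, String> names = new HashMap<>();
        names.put(7, "ConstantValue");
        names.put(9, "MyCustomAttr");
        return names;
    }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.wrap(sampleBytes());
        System.out.println(readRecognized(buf, 2, sampleNames())); // [ConstantValue]
    }
}
```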
Reference
- "In-depth understanding of Java virtual machines"
- http://blog.csdn.net/ns_code/article/details/17675609