Java class file description

Source: Internet
Author: User

I. What is a Java class file?
The Java Virtual Machine Specification (Second Edition) states that a Java class file consists of a stream of 8-bit bytes. All 16-bit, 32-bit, and 64-bit quantities are constructed by reading 2, 4, and 8 consecutive bytes respectively. Multi-byte data is always stored in big-endian order, that is, the high-order byte comes first (at the lower address). Each class file contains exactly one Java type (a class or an interface).

Perhaps the statement in the Java Virtual Machine Specification is not clear enough, so we can also refer to the statement in Inside the Java Virtual Machine (Second Edition): a Java class file is specifically a file with the .class extension that can be loaded by a Java virtual machine.

I think neither statement is comprehensive or clear enough, so I define it as follows: a Java class file is a binary file consisting of a byte stream in a specific format. This specific format is the class file format discussed in section II, that is, the class file format defined in the Java Virtual Machine Specification. From another perspective, it is the format that the JVM can recognize and load. Why? Because the JVM verifies a class file when loading it, to ensure that the content of the loaded file has the correct internal structure, and this internal structure is exactly that format. Any class file that complies with this format is a legal, well-formed class file that the JVM can load. If this definition is still not clear enough, I suggest reading the rest of this article and then coming back to it.

For convenience, the two references are abbreviated as follows in the rest of this article:

1) The Java Virtual Machine Specification (Second Edition) is abbreviated as JVM Spec (2nd ed.).

2) Inside the Java Virtual Machine (Second Edition) is abbreviated as Inside JVM (2nd ed.).

II. Java class file format

Before talking about the format of the class file, we should introduce three concepts:

1) Data type: JVM Spec (2nd ed.) states that the data in a Java class file is represented by a defined set of data types, namely u1, u2, and u4, which denote unsigned quantities of 1, 2, and 4 bytes respectively. In Inside JVM (2nd ed.), the author calls this set of data types the basic types of the class file. I find that term more vivid and easier to understand, so this article also uses "basic types" for the data in a Java class file.

2) Table: according to JVM Spec (2nd ed.), a table is composed of items (defined in 3 below) and is used in several class file structures. JVM Spec (2nd ed.) describes the Java class file format with pseudo structures written in a C-like structure notation. Such a pseudo structure is what is called a table here; the ClassFile table below is a typical example, and every table mentioned in this article is a pseudo structure of this kind. The size of a table is variable because the sizes of its items are variable. Note that "variable" applies across class files: the size of an item may differ from one class file to another, but within any specific class file the size of each item is fixed, and so is the size of the table. Why the size of an item can vary is explained next.

3) Item: the contents that make up the structure of the Java class file format are called items. Each item has a type and a name. The type of an item is either a basic type or a table name. An item whose type is a table name is an array item, and each element of an array item is a table. Such a table, like the top-level ClassFile table, is a pseudo structure composed of items. The tables in one array are not necessarily in the same format, so an array item can also be regarded as a stream of variable-size structures. These tables are the subitems of the array item, and a subitem may itself have subitems (currently subitems nest at most two levels deep). Item names need little explanation: they are simply the names specified in JVM Spec (2nd ed.). Items also have sizes. An item without subitems has a fixed size; an item with subitems has a variable size. In a specific class file, the number of elements in an array item is given by the count item that immediately precedes it, because that is how JVM Spec (2nd ed.) defines the format. Within a class file, the items are stored in the order defined in the specification. There is no gap between adjacent items, and the elements of an array item are also stored consecutively, without padding or alignment, which keeps the class file compact.
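As a minimal sketch of how the basic types could be read, the Java helper below assembles u2 and u4 values from consecutive bytes in big-endian order, with no padding between them. The class and method names (BigEndian, readU2, readU4) are illustrative assumptions and do not come from the specification.

```java
final class BigEndian {
    // u2: two consecutive bytes, high-order byte first.
    static int readU2(byte[] data, int offset) {
        return ((data[offset] & 0xFF) << 8) | (data[offset + 1] & 0xFF);
    }

    // u4: four consecutive bytes; returned as a long so the value stays unsigned.
    static long readU4(byte[] data, int offset) {
        return ((long) readU2(data, offset) << 16) | readU2(data, offset + 2);
    }
}
```

Masking each byte with 0xFF is needed because Java's byte type is signed, while u1, u2, and u4 are unsigned.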

With these three concepts explained, we can now analyze the format of the class file itself.

First, we look at the ClassFile table structure. This is the outermost structure of the class file defined in JVM Spec (2nd ed.), that is, the format of the class file.

ClassFile table structure
ClassFile {
    u4 magic;
    u2 minor_version;
    u2 major_version;
    u2 constant_pool_count;
    cp_info constant_pool[constant_pool_count-1];
    u2 access_flags;
    u2 this_class;
    u2 super_class;
    u2 interfaces_count;
    u2 interfaces[interfaces_count];
    u2 fields_count;
    field_info fields[fields_count];
    u2 methods_count;
    method_info methods[methods_count];
    u2 attributes_count;
    attribute_info attributes[attributes_count];
}

The ClassFile table consists of 16 items, which are analyzed one by one below:

(1) magic

The first four bytes of every class file are the magic number, 0xCAFEBABE. The magic number makes it easy to tell Java class files from other files: a file that does not start with 0xCAFEBABE is certainly not a Java class file, because it does not comply with the specification. The magic number was fixed when Java was still called "Oak", and it foreshadowed the name Java. The details of its origin are not covered here.
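The magic number check can be sketched in a few lines of Java. This is only an illustration: the class name MagicCheck and the method hasClassFileMagic are hypothetical, and the file is assumed to be already loaded into a byte array. java.nio.ByteBuffer reads in big-endian order by default, which matches the class file format.

```java
import java.nio.ByteBuffer;

final class MagicCheck {
    static final int MAGIC = 0xCAFEBABE;

    // True if the data starts with the four bytes CA FE BA BE.
    static boolean hasClassFileMagic(byte[] fileBytes) {
        if (fileBytes.length < 4) {
            return false; // too short to hold the magic item
        }
        return ByteBuffer.wrap(fileBytes).getInt() == MAGIC;
    }
}
```

Passing the magic check only shows that a file is not obviously something else; the rest of the structure still has to be verified.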

(2) minor_version and major_version

The next four bytes of the class file hold the minor and major version numbers. In general, a Java virtual machine can read a class file only if its major version number is supported and its minor version number falls within a supported range for that major version. If the version number of the class file is outside the range that the Java virtual machine can handle, the virtual machine will refuse to process the class file. For example, a J2SE 5.0 virtual machine cannot execute class files that a Java SE 6 compiler produced for its own version.
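A rough sketch of reading the two version items follows. The names ClassVersion and isSupportedBy are hypothetical, and the acceptance rule is deliberately simplified: it compares only the major version against an upper bound, whereas a real virtual machine also applies a lower bound and looks at the minor version. For reference, J2SE 5.0 compilers emit major version 49 and Java SE 6 compilers emit major version 50.

```java
import java.nio.ByteBuffer;

final class ClassVersion {
    final int minor;
    final int major;

    ClassVersion(int minor, int major) {
        this.minor = minor;
        this.major = major;
    }

    // Reads bytes 4..7 of a class file: u2 minor_version, then u2 major_version.
    static ClassVersion read(byte[] fileBytes) {
        ByteBuffer buf = ByteBuffer.wrap(fileBytes); // big-endian by default
        buf.position(4);                             // skip the magic item
        int minor = buf.getShort() & 0xFFFF;
        int major = buf.getShort() & 0xFFFF;
        return new ClassVersion(minor, major);
    }

    // Simplified, assumed policy: accept any major version up to the given maximum.
    boolean isSupportedBy(int maxSupportedMajor) {
        return major <= maxSupportedMajor;
    }
}
```

With this policy, a file with major version 50 is rejected when the maximum is 49, which mirrors the J2SE 5.0 example above.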

(3) constant_pool_count

The item after the version numbers is constant_pool_count, the count item of the constant pool. Its value must be greater than zero. It gives the number of entries in the constant pool list of this class file, counting the constant_pool entry with index 0. That entry does not actually appear in the constant_pool list of the class file, because index 0 is reserved for the internal use of the Java virtual machine implementation. The constant pool list therefore contains constant_pool_count-1 entries, with index values from 1 to constant_pool_count-1.

Note: a few terms need explaining here. The constant pool is constant_pool, and the constant pool list is constant_pool[]. A constant pool entry is one specific element of the constant pool list. The possible types of constant pool entries are shown in the following cp_type table:

cp_type
Entry type                     Tag value
CONSTANT_Class                 7
CONSTANT_Fieldref              9
CONSTANT_Methodref             10
CONSTANT_InterfaceMethodref    11
CONSTANT_String                8
CONSTANT_Integer               3
CONSTANT_Float                 4
CONSTANT_Long                  5
CONSTANT_Double                6
CONSTANT_NameAndType           12
CONSTANT_Utf8                  1

(4) constant_pool []

The constant_pool_count item is followed by the constant_pool[] item, the constant pool list. It stores the various constants referenced by the ClassFile structure and its substructures, such as string literals, final variable values, class names, and method names. In the Java class file, each constant pool entry is described by a cp_info structure, and the constant pool list is the array constant_pool[], made up of constant_pool_count-1 consecutive, variable-length cp_info tables. Why the number is constant_pool_count-1 was explained above. Each constant pool entry is a variable-length structure in the following format:

cp_info
cp_info {
    u1 tag;
    u1 info[];
}

The tag item of the cp_info table is an unsigned byte that indicates the type and format of the cp_info table. The possible tag values are listed in the cp_type table above.

Note that cp_info is only an abstraction. In the class file it appears as a series of concrete constant pool structures of the form CONSTANT_xxxx_info, and the concrete format is determined by the tag item (the first byte) of the cp_info table. Different cp_info tables have different info[] items. For example, the info[] item of the CONSTANT_Class_info table is "u2 name_index;", while the info[] item of the CONSTANT_Utf8_info table is "u2 length; u1 bytes[length];". The two tables clearly differ in layout and in size, so the size of a constant pool entry is variable. Because the entries in the constant pool list have different structures, the size of the constant pool list is also variable. In the class file, the constant pool list is a stream of variable-length structures.

From the cp_info table and the cp_type table we can tell that if the tag item of a cp_info table is 1, the current cp_info is a CONSTANT_Utf8_info table; if the tag is 3, it is a CONSTANT_Integer_info table; and so on. For the structure of each of these tables, see chapter 4 of JVM Spec (2nd ed.) or chapter 6 of Inside JVM (2nd ed.).
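The tag-driven layout can be illustrated with a small scanner that walks the constant pool and records only each entry's tag, skipping the info[] bytes. This is a sketch under assumptions: the class name ConstantPoolScanner is hypothetical, the buffer is assumed to be positioned at constant_pool_count, and the entry sizes are taken from chapter 4 of the specification rather than from the text above. One detail from the specification that the sketch relies on is that CONSTANT_Long and CONSTANT_Double entries occupy two constant pool indexes.

```java
import java.nio.ByteBuffer;

final class ConstantPoolScanner {
    // Returns an array where tags[i] is the tag of entry i; tags[0] is unused,
    // as is the slot that follows a CONSTANT_Long or CONSTANT_Double entry.
    static int[] readTags(ByteBuffer buf) {
        int count = buf.getShort() & 0xFFFF;       // u2 constant_pool_count
        int[] tags = new int[count];
        for (int i = 1; i < count; i++) {          // valid indexes are 1..count-1
            int tag = buf.get() & 0xFF;            // u1 tag
            tags[i] = tag;
            switch (tag) {
                case 1:                            // Utf8: u2 length, u1 bytes[length]
                    skip(buf, buf.getShort() & 0xFFFF);
                    break;
                case 3: case 4:                    // Integer, Float: u4 bytes
                case 9: case 10: case 11: case 12: // Fieldref, Methodref, InterfaceMethodref, NameAndType: two u2 indexes
                    skip(buf, 4);
                    break;
                case 5: case 6:                    // Long, Double: 8 bytes of data
                    skip(buf, 8);
                    i++;                           // these entries take two indexes
                    break;
                case 7: case 8:                    // Class, String: one u2 index
                    skip(buf, 2);
                    break;
                default:
                    throw new IllegalArgumentException("unknown constant pool tag " + tag);
            }
        }
        return tags;
    }

    private static void skip(ByteBuffer buf, int n) {
        buf.position(buf.position() + n);
    }
}
```

After readTags returns, the buffer is positioned at the item that follows the constant pool, access_flags.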

(5) access_flags

The two bytes following the constant pool are the access_flags item, which describes the access flags of this Java type. For example, the access flags indicate whether the file defines a class or an interface, and which modifiers were used in the class or interface declaration, such as abstract or public. The value of access_flags is a mask of the access flags used in the declaration of the Java type: it is the bitwise OR (equivalently, the sum) of the values of all flags that are set, and unused flag bits are set to 0 in the class file. For example, if access_flags is 0x0001, the only access flag of this Java type is ACC_PUBLIC; if access_flags is 0x0011, the access flags are ACC_PUBLIC and ACC_FINAL, because 0x0001 and 0x0010 are the only flag values that combine to 0x0011; and so on.

All access_flags of a Java type are shown in the following table:

access_flags
Flag name        Value     Meaning
ACC_PUBLIC       0x0001    Declared public; can be accessed from outside its package
ACC_FINAL        0x0010    Declared final; no subclasses allowed
ACC_SUPER        0x0020    Superclass method calls are handled specially by the invokespecial instruction
ACC_INTERFACE    0x0200    Is an interface, not a class
ACC_ABSTRACT     0x0400    Declared abstract; cannot be instantiated

Note that this is the list of access flags for a Java type in general. Some of the flags can be used only by classes and some only by interfaces; for details, see JVM Spec (2nd ed.).
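Decoding the mask amounts to testing each flag bit in turn. The following sketch uses the five flag values from the table above; the class name ClassAccessFlags and the method decode are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;

final class ClassAccessFlags {
    static final int ACC_PUBLIC    = 0x0001;
    static final int ACC_FINAL     = 0x0010;
    static final int ACC_SUPER     = 0x0020;
    static final int ACC_INTERFACE = 0x0200;
    static final int ACC_ABSTRACT  = 0x0400;

    // Returns the names of all flags set in the mask, in table order.
    static List<String> decode(int accessFlags) {
        List<String> names = new ArrayList<>();
        if ((accessFlags & ACC_PUBLIC) != 0)    names.add("ACC_PUBLIC");
        if ((accessFlags & ACC_FINAL) != 0)     names.add("ACC_FINAL");
        if ((accessFlags & ACC_SUPER) != 0)     names.add("ACC_SUPER");
        if ((accessFlags & ACC_INTERFACE) != 0) names.add("ACC_INTERFACE");
        if ((accessFlags & ACC_ABSTRACT) != 0)  names.add("ACC_ABSTRACT");
        return names;
    }
}
```

For the value 0x0011 discussed above, decode returns ACC_PUBLIC and ACC_FINAL. The sketch does not check which combinations are legal for classes or interfaces.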

(6) this_class

The next two bytes are the this_class item. Its value is an index into the constant pool, that is, it points to a constant pool entry, and that entry must be a CONSTANT_Class_info table. This table has a name_index item that points to another constant pool entry, which contains the fully qualified name of this class or interface.
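The two-step lookup from this_class to the name can be sketched with a deliberately simplified, hypothetical in-memory model of the constant pool: an Object array in which a String stands for a CONSTANT_Utf8_info entry and an Integer stands for the name_index of a CONSTANT_Class_info entry. Neither the model nor the name ClassNameResolver comes from the specification.

```java
final class ClassNameResolver {
    // pool[0] is unused; a String models CONSTANT_Utf8_info,
    // an Integer models the name_index of a CONSTANT_Class_info.
    static String resolveClassName(Object[] pool, int classIndex) {
        Object classEntry = entryAt(pool, classIndex);
        if (!(classEntry instanceof Integer)) {
            throw new IllegalArgumentException("entry " + classIndex + " is not a class entry");
        }
        int nameIndex = (Integer) classEntry;
        Object nameEntry = entryAt(pool, nameIndex);
        if (!(nameEntry instanceof String)) {
            throw new IllegalArgumentException("entry " + nameIndex + " is not a Utf8 entry");
        }
        return (String) nameEntry; // stored with '/' separators, e.g. java/lang/Object
    }

    private static Object entryAt(Object[] pool, int index) {
        if (index < 1 || index >= pool.length) {
            throw new IllegalArgumentException("invalid constant pool index " + index);
        }
        return pool[index];
    }
}
```

The two type checks correspond to the format rules: this_class must point to a class entry, and its name_index must point to a Utf8 entry.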

(7) super_class

The two bytes after this_class are the super_class item. Its value must be either 0 or a valid index into the constant pool. If the value of super_class is 0, the class file must represent the class java.lang.Object. If the value is not 0, there are two cases. If the class file represents a class, super_class must be the index of the CONSTANT_Class_info entry for the superclass of that class, and neither this superclass nor any of its superclasses may be a final class. If the class file represents an interface, super_class must be the index of a CONSTANT_Class_info entry that represents the class java.lang.Object.

(8) interfaces_count and interfaces []

The two bytes after the super_class item are the interfaces_count item, which gives the number of superinterfaces directly implemented by this class or directly extended by this interface.

The interfaces_count item is followed by the interfaces list item, which contains the constant pool indexes of the superinterfaces directly implemented by this class or extended by this interface. There are interfaces_count indexes in total. The indexes in the interfaces list appear in the order, from left to right, in which the superinterfaces are given in the source code of this type.

(9) fields_count and fields []

Next comes the fields_count item. Its value gives the number of field_info tables in the fields list item, that is, the total number of class variables and instance variables declared by this Java type.

The fields list item contains the complete description of all fields declared by this Java type. Each field_info table in the fields list fully describes one field, including its name, descriptor, and modifiers. Some of this information, such as the modifiers, is stored in the field_info table itself; the rest, such as the name and the descriptor, is stored in the constant pool entries that the field_info table points to. As in the earlier analysis, the fields list item is a variable-length structure.

Note that the fields list contains only the fields declared by this Java type itself. It does not include fields inherited from superclasses or superinterfaces.

(10) methods_count and methods []

In the class file, the fields are followed by a description of the methods declared by this Java type. First comes the methods_count item, which occupies two bytes; its value is the total number of methods declared by this Java type. The methods_count item is followed by the methods list item, which consists of methods_count consecutive method_info tables. Each method_info table contains the information about one method, such as the method name and the descriptor (that is, the return type and parameter types of the method). If a method is neither abstract nor native, the method_info table also contains the amount of stack space required for the method's local variables, the table of exceptions caught by the method, the bytecode sequence, and the optional line number table and local variable table.

Note that the methods list contains only the methods explicitly defined by this Java type itself. It does not include methods inherited from superclasses or superinterfaces.

(11) attributes_count and attributes []

The last part of the class file is the attributes, which give basic information about the attributes defined for this Java type. First comes attributes_count, which occupies two bytes; its value is the total number of attribute_info tables in the attributes list that follows. The first item of each attribute_info table is the index of a CONSTANT_Utf8_info table in the constant pool, which gives the name of the attribute.
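As an illustration, the attributes list can be walked without understanding the individual attributes. The sketch below assumes the attribute_info layout from chapter 4 of the specification (u2 attribute_name_index; u4 attribute_length; u1 info[attribute_length]); only the name index is mentioned in the text above, so the other two items are taken from the specification. The class name AttributeScanner is hypothetical, and the buffer is assumed to be positioned at attributes_count.

```java
import java.nio.ByteBuffer;

final class AttributeScanner {
    // Returns the attribute_name_index of every attribute, skipping the attribute bodies.
    static int[] readNameIndexes(ByteBuffer buf) {
        int count = buf.getShort() & 0xFFFF;              // u2 attributes_count
        int[] nameIndexes = new int[count];
        for (int i = 0; i < count; i++) {
            nameIndexes[i] = buf.getShort() & 0xFFFF;     // u2 attribute_name_index
            int length = buf.getInt();                    // u4 attribute_length
            buf.position(buf.position() + length);        // skip info[attribute_length]
        }
        return nameIndexes;
    }
}
```

Because every attribute carries its own length, a reader can skip attributes it does not recognize. The sketch treats the u4 length as a Java int, so lengths above 2^31-1 are not handled.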

Note that there are many kinds of attributes and that attributes appear in several places in the class file. There is an attributes item in the top-level ClassFile table, another in the field_info table, and another in the method_info table, each with its own purpose. Among the attributes that JVM Spec (2nd ed.) defines, the SourceFile attribute belongs to the attributes list of the ClassFile table, the ConstantValue attribute belongs to the attributes list of the field_info table, and the Code and Exceptions attributes belong to the attributes list of the method_info table.

All in all, the class file format is a standardized format. The standardization covers both the table structures described above and the containment relationships between them. In effect, JVM Spec (2nd ed.) organizes the class file format through the concepts of tables and items. First, the ClassFile table is the outermost structure of the class file; in other words, it is the format of the class file. Second, the ClassFile table is composed of items, and the content of each item must comply with the rules defined in JVM Spec (2nd ed.). Specifically, if the type of an item is a basic type, its value must comply with the specification: for example, the magic item must be 0xCAFEBABE, and the value of access_flags must be a valid combination of flags. If the type of an item is a table name, that is, the item is an array item, then every table in its list must be a legal table defined in the specification and cannot be a new, undefined kind of table. This is the standardization of the containment relationship. Likewise, every table in a list must itself follow its definition in the specification. For example, if the name_index of a CONSTANT_Class_info table in the constant pool list is not the index of a CONSTANT_Utf8_info table, that constant pool entry is not legal; the constant pool list then does not conform to the specification, and neither does the file as a whole.
