Class 006 File Structure

Source: Internet
Author: User

1. Class file structure
Every class file holds the definition of exactly one class or interface. The converse does not hold: a class or interface is not necessarily defined in a file (for example, it can be generated directly by a ClassLoader), so a "class file" does not have to exist on disk.

A class file is a binary stream based on 8-bit bytes, with only two data types: unsigned numbers and tables.
Unsigned numbers are the basic data types: u1, u2, u4, and u8 represent unsigned numbers of 1, 2, 4, and 8 bytes. They can describe numbers, index references, quantity values, or string values encoded in UTF-8.

A table is a composite data type made up of multiple unsigned numbers or other tables as data items. By convention, all table names end with "_info".


The first 4 bytes of every class file are the magic number, whose sole purpose is to identify the file as a class file that the virtual machine can accept. Its value is 0xCAFEBABE.
The 4 bytes after the magic number store the version number of the class file: the 5th and 6th bytes are the minor version (minor_version), and the 7th and 8th bytes are the major version (major_version). Java version numbers start at 45, and each major JDK release after JDK 1.1 raises the major version by 1 (JDK 1.1: 45, JDK 1.2: 46, ..., JDK 1.7: 51).
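
The header layout described above can be sketched in a few lines of Java. This is a minimal illustration, not part of the original notes: the class name ClassHeader and its parse method are hypothetical, and the input is a hand-written 8-byte array rather than a real class file.

```java
import java.nio.ByteBuffer;

// Hypothetical helper for illustration: reads only the first 8 bytes of a class file.
final class ClassHeader {
    static final int MAGIC = 0xCAFEBABE;

    final int minor;
    final int major;

    private ClassHeader(int minor, int major) {
        this.minor = minor;
        this.major = major;
    }

    /** Reads u4 magic, u2 minor_version, u2 major_version. */
    static ClassHeader parse(byte[] classBytes) {
        ByteBuffer in = ByteBuffer.wrap(classBytes); // ByteBuffer is big-endian by default
        if (in.getInt() != MAGIC) {
            throw new IllegalArgumentException("not a class file: bad magic number");
        }
        int minor = in.getShort() & 0xFFFF; // mask because u2 values are unsigned
        int major = in.getShort() & 0xFFFF;
        return new ClassHeader(minor, major);
    }

    public static void main(String[] args) {
        // Hand-written header: magic number, minor version 0, major version 51 (JDK 1.7).
        byte[] header = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0, 0, 0, 0x33};
        ClassHeader h = parse(header);
        System.out.println("major=" + h.major + " minor=" + h.minor); // prints "major=51 minor=0"
    }
}
```

The same method should work on the bytes of a real class file, since it only looks at the first 8 bytes.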

2. Constant pool
The constant pool entry immediately follows the major and minor version numbers. The constant pool can be thought of as the resource warehouse of the class file: it is the data structure most closely tied to the other items in the class file, it is one of the items that takes up the most space, and it is the first table-type data item to appear in a class file.

Because the number of constants in the pool is not fixed, the constant pool begins with a u2 value that gives the constant pool capacity count (constant_pool_count). Contrary to the usual Java convention, this count starts at 1 rather than 0. For example, if the constant pool capacity (at offset 0x00000008) is hexadecimal 0x0016, which is decimal 22, the constant pool holds 21 constants with index values 1 to 21. The designers of the class file format deliberately left constant 0 empty so that data items that point into the constant pool can, in particular cases, express "does not refer to any constant pool item" by setting the index to 0.
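
The arithmetic in the example above can be checked with a short sketch. The class and method names here are hypothetical, and the byte array is filled in by hand so that offset 8 holds 0x0016 as in the text.

```java
import java.nio.ByteBuffer;

// Hypothetical helper for illustration only.
final class ConstantPoolCount {
    /** Reads the u2 constant_pool_count stored at offset 8, right after the version numbers. */
    static int readCount(byte[] classBytes) {
        return ByteBuffer.wrap(classBytes).getShort(8) & 0xFFFF;
    }

    /** Index 0 is reserved, so valid indices run from 1 to count - 1. */
    static int maxIndex(int constantPoolCount) {
        return constantPoolCount - 1;
    }

    public static void main(String[] args) {
        byte[] bytes = new byte[10]; // only offsets 8 and 9 matter for this sketch
        bytes[8] = 0x00;
        bytes[9] = 0x16;
        int count = readCount(bytes);
        System.out.println(count + " -> indices 1.." + maxIndex(count)); // prints "22 -> indices 1..21"
    }
}
```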

The constant pool holds two major kinds of constants: literals and symbolic references. Literals are close to the concept of constants at the Java language level, such as text strings and constant values declared final. Symbolic references are a concept from compiler theory and include the following three kinds of constants:
    • Fully qualified names of classes and interfaces
    • Names and descriptors of fields
    • Names and descriptors of methods

A constant of type CONSTANT_Class_info represents a symbolic reference to a class or interface.

Its name_index is an index value that points to a constant of type CONSTANT_Utf8_info in the constant pool, which holds the fully qualified name of the class (or interface).

In a CONSTANT_Utf8_info constant, the length value gives the length of the encoded string in bytes. The length bytes that follow it hold the string itself, in the abbreviated UTF-8 encoding (called modified UTF-8 in the Java Virtual Machine Specification).
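
How a CONSTANT_Class_info entry leads to its name can be sketched as follows. This is a simplified illustration under stated assumptions: the class MiniConstantPool is hypothetical, it handles only tag 1 (CONSTANT_Utf8_info) and tag 7 (CONSTANT_Class_info), and it decodes the bytes as standard UTF-8, which matches the class file encoding for ordinary names but not for every character.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical, deliberately incomplete constant pool reader for illustration.
final class MiniConstantPool {
    static final int TAG_UTF8 = 1;  // CONSTANT_Utf8_info
    static final int TAG_CLASS = 7; // CONSTANT_Class_info

    /**
     * Parses constantPoolCount - 1 entries into a table indexed like the pool.
     * Utf8 entries are stored as String, Class entries as Integer (their name_index).
     */
    static Object[] parse(byte[] poolBytes, int constantPoolCount) {
        ByteBuffer in = ByteBuffer.wrap(poolBytes);
        Object[] pool = new Object[constantPoolCount]; // slot 0 stays null
        for (int i = 1; i < constantPoolCount; i++) {
            int tag = in.get() & 0xFF;
            if (tag == TAG_UTF8) {
                int length = in.getShort() & 0xFFFF; // u2 length in bytes
                byte[] bytes = new byte[length];
                in.get(bytes);
                pool[i] = new String(bytes, StandardCharsets.UTF_8);
            } else if (tag == TAG_CLASS) {
                pool[i] = in.getShort() & 0xFFFF;    // u2 name_index
            } else {
                throw new IllegalArgumentException("tag not handled in this sketch: " + tag);
            }
        }
        return pool;
    }

    /** Follows a CONSTANT_Class_info entry to the Utf8 entry that holds its name. */
    static String className(Object[] pool, int classIndex) {
        int nameIndex = (Integer) pool[classIndex];
        return (String) pool[nameIndex];
    }

    public static void main(String[] args) {
        // Entry #1: Utf8 "Foo" (tag 1, length 3). Entry #2: Class with name_index 1.
        byte[] raw = {1, 0, 3, 'F', 'o', 'o', 7, 0, 1};
        Object[] pool = parse(raw, 3);
        System.out.println(className(pool, 2)); // prints "Foo"
    }
}
```

A real constant pool contains further entry types that this sketch rejects, so it is only suitable for hand-built input like the one in main.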

3. Access flags
The two bytes after the constant pool hold the access flags (access_flags), which identify class-level or interface-level access information, including: whether this is a class or an interface, whether it is public, whether it is abstract, and, if it is a class, whether it is declared final.
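
Decoding the flags is a matter of testing individual bits. The sketch below uses the flag values from the Java Virtual Machine Specification for the four properties named above; the class ClassAccessFlags is hypothetical, and any other bits in the value are simply ignored.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical decoder for illustration; covers only the four flags discussed in the text.
final class ClassAccessFlags {
    static final int ACC_PUBLIC = 0x0001;
    static final int ACC_FINAL = 0x0010;
    static final int ACC_INTERFACE = 0x0200;
    static final int ACC_ABSTRACT = 0x0400;

    /** Returns the names of the flags that are set, in a fixed order. */
    static List<String> describe(int accessFlags) {
        List<String> names = new ArrayList<>();
        if ((accessFlags & ACC_PUBLIC) != 0) names.add("public");
        if ((accessFlags & ACC_FINAL) != 0) names.add("final");
        if ((accessFlags & ACC_INTERFACE) != 0) names.add("interface");
        if ((accessFlags & ACC_ABSTRACT) != 0) names.add("abstract");
        return names;
    }

    public static void main(String[] args) {
        System.out.println(describe(0x0601)); // prints "[public, interface, abstract]"
        System.out.println(describe(0x0011)); // prints "[public, final]"
    }
}
```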

4. Class index, parent class index, and interface index collection
The class index (this_class) and the parent class index (super_class) are each a u2 value, and the interface index collection (interfaces) is a set of u2 values. Together, these three items determine the inheritance relationships of the class.
The class index, parent class index, and interface index collection follow the access flags in that order. The class index and the parent class index each point to a class description constant of type CONSTANT_Class_info; the index value in that constant leads to the fully qualified name string stored in a constant of type CONSTANT_Utf8_info.
For the interface index collection, the first item, a u2 value, is the interface counter (interfaces_count), which gives the capacity of the index table. If the class implements no interfaces, the counter is 0 and the interface index table that would follow takes up no bytes.
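
Reading these three items in order might look like the sketch below. The class InheritanceIndexes is hypothetical, and the input is assumed to start exactly at this_class, that is, right after the access flags.

```java
import java.nio.ByteBuffer;

// Hypothetical reader for illustration; the input must start at this_class.
final class InheritanceIndexes {
    final int thisClass;
    final int superClass;
    final int[] interfaces;

    private InheritanceIndexes(int thisClass, int superClass, int[] interfaces) {
        this.thisClass = thisClass;
        this.superClass = superClass;
        this.interfaces = interfaces;
    }

    /** Reads u2 this_class, u2 super_class, u2 interfaces_count, then that many u2 indices. */
    static InheritanceIndexes parse(byte[] bytes) {
        ByteBuffer in = ByteBuffer.wrap(bytes);
        int thisClass = in.getShort() & 0xFFFF;
        int superClass = in.getShort() & 0xFFFF;
        int count = in.getShort() & 0xFFFF;
        int[] interfaces = new int[count]; // empty when the class implements no interfaces
        for (int i = 0; i < count; i++) {
            interfaces[i] = in.getShort() & 0xFFFF;
        }
        return new InheritanceIndexes(thisClass, superClass, interfaces);
    }

    public static void main(String[] args) {
        // this_class = 3, super_class = 4, two interfaces at constant pool indices 5 and 6.
        byte[] bytes = {0, 3, 0, 4, 0, 2, 0, 5, 0, 6};
        InheritanceIndexes idx = parse(bytes);
        System.out.println(idx.thisClass + " " + idx.superClass + " " + idx.interfaces.length); // prints "3 4 2"
    }
}
```

Each index returned here is a constant pool index, so resolving it to a name still requires following the CONSTANT_Class_info entry to its CONSTANT_Utf8_info entry.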

5. Field table collection
The field table (field_info) describes the variables declared in an interface or class. Fields include class-level variables and instance-level variables, but not local variables declared inside a method.
The information a field can carry includes: its scope (the public, private, and protected modifiers), whether it is an instance variable or a class variable (the static modifier), its mutability (final), its concurrency visibility (the volatile modifier, which determines whether reads and writes are forced to go through main memory), whether it is serialized (the transient modifier), its data type (primitive type, object, or array), and its name. The field name and data type have no fixed form and can only be described by referencing constants in the constant pool.

Following access_flags are two index values: name_index and descriptor_index. Both are references into the constant pool; they give the simple name of the field and its descriptor. Descriptors are used for both fields and methods.
The primitive types (byte, char, double, float, int, long, short, boolean) and the void type, which represents no return value, are each represented by one uppercase character (B, C, D, F, I, J, S, Z, and V respectively). An object type is represented by the character L followed by the fully qualified name of the object and a semicolon.

For an array type, each dimension is described by a leading "[" character. A two-dimensional array defined as "java.lang.String[][]" is recorded as "[[Ljava/lang/String;", and an integer array "int[]" is recorded as "[I".
When a descriptor describes a method, the parameter list comes first and the return value follows. The parameter list is placed inside a pair of parentheses "()" in the strict order of the parameters. For example, the descriptor of the method void inc() is "()V", the descriptor of java.lang.String toString() is "()Ljava/lang/String;", and the descriptor of int indexOf(char[] source, int sourceOffset, int sourceCount, char[] target, int targetOffset, int targetCount, int fromIndex) is "([CII[CIII)I".
Fields inherited from a superclass or parent interface are not listed in the field table collection, but the collection may contain fields that do not exist in the original Java code. For example, the compiler adds a field to an inner class that references the instance of its enclosing class.
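
The descriptor rules above can be reproduced with a small recursive function. This is a sketch with hypothetical names (Descriptors, of, ofMethod). It builds descriptors from java.lang.Class objects and is not how a compiler actually produces them.

```java
// Hypothetical helper for illustration: builds descriptors from Class objects.
final class Descriptors {
    /** Field descriptor of a type: one character for primitives, "[" per array dimension, L...; for objects. */
    static String of(Class<?> type) {
        if (type.isArray()) return "[" + of(type.getComponentType());
        if (type == void.class) return "V";
        if (type == boolean.class) return "Z";
        if (type == byte.class) return "B";
        if (type == char.class) return "C";
        if (type == short.class) return "S";
        if (type == int.class) return "I";
        if (type == long.class) return "J";
        if (type == float.class) return "F";
        if (type == double.class) return "D";
        return "L" + type.getName().replace('.', '/') + ";";
    }

    /** Method descriptor: parameter descriptors in order inside "()", then the return type. */
    static String ofMethod(Class<?> returnType, Class<?>... params) {
        StringBuilder sb = new StringBuilder("(");
        for (Class<?> p : params) {
            sb.append(of(p));
        }
        return sb.append(')').append(of(returnType)).toString();
    }

    public static void main(String[] args) {
        System.out.println(of(String[][].class));   // prints "[[Ljava/lang/String;"
        System.out.println(of(int[].class));        // prints "[I"
        System.out.println(ofMethod(String.class)); // prints "()Ljava/lang/String;"
    }
}
```

The standard library offers a comparable facility in java.lang.invoke.MethodType.toMethodDescriptorString(); the hand-written version is used here only to make each rule visible.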

6. Method table collection
After the compiler turns the Java code in a method into bytecode instructions, they are stored in an attribute named "Code" in the method's attribute table collection. The attribute table is the most extensible data item in the class file format.
If a subclass does not override a parent class method, no method information from the parent class appears in the subclass's method table collection.

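7. Attribute table collection
Every attribute table that conforms to the rules must follow the same basic structure: a u2 attribute_name_index that points to a CONSTANT_Utf8_info constant holding the attribute's name, a u4 attribute_length, and then attribute_length bytes of attribute-specific data (info).
See the book for the specific attributes.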
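
One consequence of this layout is that a reader can skip any attribute it does not understand. The sketch below assumes the generic attribute layout from the Java Virtual Machine Specification (u2 attribute_name_index, u4 attribute_length, then attribute_length bytes of data); the class AttributeReader is hypothetical and assumes the length fits in an int.

```java
import java.nio.ByteBuffer;

// Hypothetical helper for illustration: skips attributes without interpreting them.
final class AttributeReader {
    /** Reads one attribute header, skips its body, and returns the attribute_name_index. */
    static int skipAttribute(ByteBuffer in) {
        int nameIndex = in.getShort() & 0xFFFF;    // u2 attribute_name_index
        long length = in.getInt() & 0xFFFFFFFFL;   // u4 attribute_length, treated as unsigned
        in.position(in.position() + (int) length); // skip the attribute-specific bytes
        return nameIndex;
    }

    public static void main(String[] args) {
        // Two hand-written attributes: name index 5 with 2 data bytes, name index 7 with none.
        ByteBuffer in = ByteBuffer.wrap(new byte[] {0, 5, 0, 0, 0, 2, 9, 9, 0, 7, 0, 0, 0, 0});
        System.out.println(skipAttribute(in)); // prints "5"
        System.out.println(skipAttribute(in)); // prints "7"
    }
}
```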
8. Bytecode instructions
A Java Virtual Machine instruction consists of a one-byte number that represents a particular operation (the opcode), followed by zero or more parameters that the operation requires (the operands). Because the Java Virtual Machine uses an architecture oriented toward the operand stack rather than registers (the differences between the two architectures and their effects are discussed in Chapter 8), most instructions contain no operands, only an opcode.
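
The opcode-plus-operands layout can be illustrated with a toy decoder. This is a sketch only: the class MiniDisassembler is hypothetical and recognizes just eight opcodes, with values taken from the Java Virtual Machine Specification. Note that only bipush and sipush carry operands here; the rest are a single byte.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy decoder for illustration; it knows only a handful of opcodes.
final class MiniDisassembler {
    static List<String> decode(byte[] code) {
        List<String> out = new ArrayList<>();
        int pc = 0;
        while (pc < code.length) {
            int opcode = code[pc++] & 0xFF;
            switch (opcode) {
                case 0x04: out.add("iconst_1"); break;
                case 0x10: // bipush: one signed byte operand
                    out.add("bipush " + code[pc++]);
                    break;
                case 0x11: // sipush: two-byte signed operand, high byte first
                    out.add("sipush " + (short) (((code[pc] & 0xFF) << 8) | (code[pc + 1] & 0xFF)));
                    pc += 2;
                    break;
                case 0x1B: out.add("iload_1"); break;
                case 0x3C: out.add("istore_1"); break;
                case 0x60: out.add("iadd"); break;
                case 0xAC: out.add("ireturn"); break;
                case 0xB1: out.add("return"); break;
                default:
                    throw new IllegalArgumentException(
                            "opcode not in this sketch: 0x" + Integer.toHexString(opcode));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] code = {0x10, 0x64, 0x3C, 0x1B, 0x04, 0x60, (byte) 0xAC};
        System.out.println(decode(code));
        // prints "[bipush 100, istore_1, iload_1, iconst_1, iadd, ireturn]"
    }
}
```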

In the Java Virtual Machine instruction set, most instructions carry information about the data type they operate on.
For most bytecode instructions that are tied to a data type, the opcode mnemonic contains a special character that indicates which type it serves: i for int, l for long, s for short, b for byte, c for char, f for float, d for double, and a for reference. Some instructions have no type letter in their mnemonic, such as arraylength: it has no character representing a data type, yet its operand can only ever be an object of an array type.
See the book for specific instructions.
9. Public design and private implementation
The Java Virtual Machine Specification describes the common program storage format that Java virtual machines should share: the class file format and the bytecode instruction set.
It is important to understand the dividing line between public design and private implementation. A Java Virtual Machine implementation must be able to read class files and precisely implement the semantics of the Java Virtual Machine code they contain.
How this is achieved and which optimizations are applied are not constrained, as long as class files can be loaded and their instructions executed correctly. Virtual machine implementations mainly take the following two forms:
    • Translate the input Java Virtual Machine code into the instruction set of another virtual machine when it is loaded or executed.
    • Translate the input Java Virtual Machine code into the native instruction set of the host CPU (that is, JIT code generation) when it is loaded or executed.

The class file is the data entry point for the Java Virtual Machine execution engine and one of the fundamental pillars of the Java technology architecture.
