I. Platform independence and language independence
Bytecode is the cornerstone of Java's platform independence and language independence.
Platform independence means that machines with different CPU instruction sets and different operating systems can all recognize the same bytecode, which achieves "write once, run anywhere".
Language independence means that a class file compiled from any development language can run on a Java virtual machine, as long as it conforms to the class file structure.
Java was designed to be both platform-independent and language-independent, so the Java specification is split into two parts: The Java Language Specification and The Java Virtual Machine Specification.
A schematic diagram of Java language independence:
II. Structure of the class file
A class file is a binary stream made up of 8-bit bytes. A data item that needs more than 8 bits is split into several bytes and stored in big-endian order, with the high-order byte first.
There are only two kinds of data in a class file: unsigned numbers and tables.
Unsigned numbers are the basic data types. u1, u2, u4, and u8 denote unsigned numbers of 1, 2, 4, and 8 bytes, and they can describe numbers, index references, quantity values, or UTF-8 encoded string values.
Tables describe hierarchical composite structures. They include the following types: cp_info, field_info, method_info, and attribute_info.
Note: When a variable number of data items of the same type has to be described, a capacity counter is placed in front, followed by that many consecutive data items.
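The byte order and the "counter first, items after" convention described above can be sketched in a few lines of Java. This is only an illustration: the class name BigEndianDemo and its helper methods are hypothetical, and the input is a hand-written byte array rather than a real class file.

```java
class BigEndianDemo {
    // Reads a u2 item at the given offset: two bytes, high-order byte first.
    static int readU2(byte[] data, int offset) {
        return ((data[offset] & 0xFF) << 8) | (data[offset + 1] & 0xFF);
    }

    // Reads a u4 item into a long so that values above Integer.MAX_VALUE stay positive.
    static long readU4(byte[] data, int offset) {
        return ((long) readU2(data, offset) << 16) | readU2(data, offset + 2);
    }

    // Reads a count-prefixed collection: a u2 counter followed by that many u2 items.
    static int[] readU2Collection(byte[] data, int offset) {
        int count = readU2(data, offset);
        int[] items = new int[count];
        for (int i = 0; i < count; i++) {
            items[i] = readU2(data, offset + 2 + 2 * i);
        }
        return items;
    }

    public static void main(String[] args) {
        // Counter 0x0002, then the two items 0x0005 and 0x0100.
        byte[] data = {0x00, 0x02, 0x00, 0x05, 0x01, 0x00};
        int[] items = readU2Collection(data, 0);
        System.out.println(items.length + " items: " + items[0] + ", " + items[1]);
    }
}
```

The masking with 0xFF is needed because Java's byte type is signed, while every item in a class file is unsigned.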
1. Magic number
The first 4 bytes of every class file are called the magic number. Its only purpose is to identify whether the file is a class file that a virtual machine can accept.
The magic number of a class file is 0xCAFEBABE ("cafe babe").
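As a minimal sketch of that identification step, the check below compares the first 4 bytes of a byte array with 0xCAFEBABE. The class and method names are hypothetical, and a real virtual machine performs many more checks than this one.

```java
class MagicNumberDemo {
    static final int MAGIC = 0xCAFEBABE;

    // Returns true if the first 4 bytes, read high-order byte first, equal 0xCAFEBABE.
    static boolean hasClassFileMagic(byte[] data) {
        if (data.length < 4) {
            return false;
        }
        int value = ((data[0] & 0xFF) << 24) | ((data[1] & 0xFF) << 16)
                  | ((data[2] & 0xFF) << 8) | (data[3] & 0xFF);
        return value == MAGIC;
    }

    public static void main(String[] args) {
        byte[] classLike = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE};
        byte[] zipLike = {0x50, 0x4B, 0x03, 0x04};
        System.out.println(hasClassFileMagic(classLike)); // true
        System.out.println(hasClassFileMagic(zipLike));   // false
    }
}
```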
2. Class file version number
The 4 bytes immediately after the magic number store the version number of the class file: the 5th and 6th bytes are the minor version (minor_version), and the 7th and 8th bytes are the major version (major_version).
Note: A newer JDK stays backward compatible with class files of earlier versions, but a JDK cannot run class files of a later version than its own. Class file version numbers start at 45: JDK 1.0 to 1.1 use versions 45.0 to 45.3, JDK 1.2 can use 46.0 to 46.65535, and JDK 1.6 can use 50.0 to 50.65535.
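The layout of the version bytes and the compatibility rule can be illustrated as follows. This is a simplified sketch: the names are hypothetical, the header is hand-written, and the canRun check compares major versions only, which is an assumption made for brevity rather than the full rule a virtual machine applies.

```java
class ClassVersionDemo {
    // Bytes 5 and 6 (array indexes 4 and 5) hold the minor version.
    static int minorVersion(byte[] header) {
        return ((header[4] & 0xFF) << 8) | (header[5] & 0xFF);
    }

    // Bytes 7 and 8 (array indexes 6 and 7) hold the major version.
    static int majorVersion(byte[] header) {
        return ((header[6] & 0xFF) << 8) | (header[7] & 0xFF);
    }

    // Simplified rule: a runtime accepts class files whose major version
    // is not higher than the highest major version it supports.
    static boolean canRun(int maxSupportedMajor, byte[] header) {
        return majorVersion(header) <= maxSupportedMajor;
    }

    public static void main(String[] args) {
        // Magic number followed by minor version 0 and major version 50 (0x32).
        byte[] header = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0x00, 0x00, 0x00, 0x32};
        System.out.println(majorVersion(header) + "." + minorVersion(header));
        System.out.println("major 50 runtime: " + canRun(50, header));
        System.out.println("major 46 runtime: " + canRun(46, header));
    }
}
```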
3. Constant pool
Immediately after the minor and major version numbers comes the constant pool. Because the number of constants in the pool is not fixed, the pool begins with a u2 item that holds the constant pool capacity count (constant_pool_count). This count starts at 1, so the number of constants in the pool equals the count minus one. Index 0 is reserved for a special purpose: an index item can be set to 0 to express "refers to no constant pool entry".
Note: Only the constant pool's capacity count starts at 1. For the other collection types in the class file, including the interface index collection, the field table collection, and the method table collection, the count starts at 0.
The constant pool holds two main kinds of constants: literals and symbolic references.
Literals: close to the notion of a constant at the Java language level, such as text strings and constant values declared final.
Note: For literals, the constant pool stores not only the literals defined by the class itself but also the literals of other classes that the class references. This is an access optimization; see the third "passive reference" example in Note 4.
Symbolic references cover the following three kinds of constants:
* Fully qualified names of classes and interfaces
* Names and descriptors of fields
* Names and descriptors of methods
Each constant in the constant pool is itself a table. The first item of each table is a u1 tag that indicates which constant type the entry belongs to. There are 11 table structures in total, each with a different layout, of the following types:
The structures of these 11 constant types are defined as follows:
4. Access flags
After the constant pool, the next 2 bytes hold the access flags (access_flags), which identify access information at the class or interface level.
The flags and their meanings are as follows:
Note: access_flags is a u2 item, so 16 flag bits are available in total. Only 8 of them are currently defined, and every unused flag bit must be 0.
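A short sketch of how such a flag word might be decoded is given below. The class and method names are hypothetical; the eight masks are the class-level flag values from the Java Virtual Machine Specification as of Java SE 6, listed here on the assumption that they match the table this section refers to.

```java
import java.util.ArrayList;
import java.util.List;

class AccessFlagsDemo {
    // Class-level flag masks and their names, in ascending bit order.
    static final int[] MASKS = {0x0001, 0x0010, 0x0020, 0x0200, 0x0400, 0x1000, 0x2000, 0x4000};
    static final String[] NAMES = {
        "ACC_PUBLIC", "ACC_FINAL", "ACC_SUPER", "ACC_INTERFACE",
        "ACC_ABSTRACT", "ACC_SYNTHETIC", "ACC_ANNOTATION", "ACC_ENUM"
    };

    // Returns the names of all flags whose bit is set in the given access_flags value.
    static List<String> decode(int accessFlags) {
        List<String> names = new ArrayList<>();
        for (int i = 0; i < MASKS.length; i++) {
            if ((accessFlags & MASKS[i]) != 0) {
                names.add(NAMES[i]);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // 0x0021 is the value commonly seen for an ordinary public class.
        System.out.println(decode(0x0021));
        // 0x0601 combines public, interface, and abstract.
        System.out.println(decode(0x0601));
    }
}
```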
5. Class index, superclass index, and interface index collection
Immediately after the access flags come, in order, the class index (this_class), the superclass index (super_class), and the interface index collection (interfaces).
Both the class index and the superclass index are single u2 items.
The interface index collection is a set of u2 items.
The class file uses these three items to determine the inheritance relationships of the class. The class index identifies the fully qualified name of this class, and the superclass index identifies the fully qualified name of its superclass (note: every Java class except java.lang.Object has a superclass, so only java.lang.Object has a superclass index of 0). The interface index collection describes which interfaces the class implements; the implemented interfaces are stored in the collection from left to right in the order in which they are declared.
The class index and the superclass index are each a u2 index value that points to a CONSTANT_Class_info constant. The index value inside that CONSTANT_Class_info constant in turn leads to the fully qualified name string stored in a CONSTANT_Utf8_info constant. The process of finding the fully qualified name from the class index is as follows:
The first item of the interface index collection is a u2 interface counter (interfaces_count), which gives the capacity of the index table.
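The two-step lookup from a class index to a fully qualified name can be sketched with a small in-memory model. Everything here is a simplification for illustration: the constant pool is an Object array in which a String stands in for a CONSTANT_Utf8_info entry, the ClassInfo type stands in for CONSTANT_Class_info, and all names are hypothetical.

```java
class ClassNameResolutionDemo {
    // Stand-in for a CONSTANT_Class_info entry: it only stores an index to a Utf8 entry.
    static final class ClassInfo {
        final int nameIndex;

        ClassInfo(int nameIndex) {
            this.nameIndex = nameIndex;
        }
    }

    // Builds a tiny pool. Slot 0 stays empty because valid indexes start at 1.
    static Object[] samplePool() {
        return new Object[] {
            null,
            new ClassInfo(2), "org/example/Demo",
            new ClassInfo(4), "java/lang/Object"
        };
    }

    // Follows class index -> ClassInfo -> Utf8 string. Index 0 means "no class".
    static String resolveClassName(Object[] pool, int classIndex) {
        if (classIndex == 0) {
            return null;
        }
        ClassInfo info = (ClassInfo) pool[classIndex];
        return (String) pool[info.nameIndex];
    }

    public static void main(String[] args) {
        Object[] pool = samplePool();
        System.out.println("this_class  -> " + resolveClassName(pool, 1));
        System.out.println("super_class -> " + resolveClassName(pool, 3));
    }
}
```

Returning null for index 0 mirrors the special case of java.lang.Object, whose superclass index refers to no constant pool entry.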
6. Field table collection
A field table (field_info) describes a variable declared in an interface or class. Fields include class-level and instance-level variables, but not local variables declared inside methods. (Note that each field is a table of its own.)
The information a field can carry includes its access scope, whether it is a class-level variable (static), whether it is immutable (final), its concurrent visibility (volatile), whether it is excluded from serialization (transient), its data type (primitive type, object, or array), and its name.
The field table structure is defined as follows:
access_flags holds the access flags of the field, which are defined as follows:
name_index is a reference into the constant pool and represents the simple name of the field. A simple name is the bare identifier, in contrast to a fully qualified name.
descriptor_index is also a reference into the constant pool and represents the descriptor of the field or method.
Field and method descriptors describe the data type of a field, and the parameter list (number, types, and order) and return value of a method. The description rules are as follows:
"Type rules":
* Primitive types and the void type are each represented by a single uppercase character;
* An object type is represented by the character L followed by the fully qualified name of the class and a terminating ";";
* For an array type, each dimension is described by a leading "[" character. For example, a two-dimensional array of type java.lang.String[][] is recorded as "[[Ljava/lang/String;", and an array of type int[] is recorded as "[I".
Note: The descriptor characters are defined as follows:
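The type rules above can be turned around into a small decoder that converts a field descriptor back into Java source notation. This is a sketch with hypothetical names; the character-to-type mapping follows the Java Virtual Machine Specification, and the decoder does not validate malformed input beyond rejecting unknown leading characters.

```java
class FieldDescriptorDemo {
    // Converts a descriptor such as "[[Ljava/lang/String;" into "java.lang.String[][]".
    static String toJavaType(String descriptor) {
        // Each leading '[' adds one array dimension.
        int dims = 0;
        while (descriptor.charAt(dims) == '[') {
            dims++;
        }
        String element = descriptor.substring(dims);
        String name;
        switch (element.charAt(0)) {
            case 'B': name = "byte"; break;
            case 'C': name = "char"; break;
            case 'D': name = "double"; break;
            case 'F': name = "float"; break;
            case 'I': name = "int"; break;
            case 'J': name = "long"; break;
            case 'S': name = "short"; break;
            case 'Z': name = "boolean"; break;
            case 'V': name = "void"; break;
            case 'L':
                // Drop the leading 'L' and the trailing ';', then use dots instead of slashes.
                name = element.substring(1, element.length() - 1).replace('/', '.');
                break;
            default:
                throw new IllegalArgumentException("Unknown descriptor: " + descriptor);
        }
        StringBuilder result = new StringBuilder(name);
        for (int i = 0; i < dims; i++) {
            result.append("[]");
        }
        return result.toString();
    }

    public static void main(String[] args) {
        System.out.println(toJavaType("[[Ljava/lang/String;"));
        System.out.println(toJavaType("[I"));
        System.out.println(toJavaType("J"));
    }
}
```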
attributes_count and attribute_info represent the attribute table collection of the field, which stores additional information about it.
The first item of the field table collection is a u2 capacity counter, fields_count, followed by that many field tables.
Fields inherited from a superclass or superinterface are not listed in the field table collection. The collection may, however, contain fields that do not exist in the original Java code: for example, the compiler automatically adds to an inner class a field that refers to the enclosing class instance, so that the inner class keeps access to the outer class.
7. Method table collection
The structure of the method table is the same as that of the field table, and each item has a similar meaning. Only the access flags and the attributes available in the attribute table collection differ.
The method access flags have the following meanings:
Representation of method descriptors
"Method description rules":
* The parameter list is described first, followed by the return value;
* The parameter list is placed inside a pair of parentheses "()", with the parameters in their exact declared order. For example, the descriptor of the method void inc() is "()V", and the descriptor of the method java.lang.String toString() is "()Ljava/lang/String;" (note: the ";" marks the end of the fully qualified name). The method int indexOf(char[] source, int offset) is described as "([CI)I".
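The three descriptors in the list above can be reproduced with the standard java.lang.invoke.MethodType class, whose toMethodDescriptorString method emits the same notation. The wrapper class and the descriptorOf helper are hypothetical names used only for this illustration.

```java
import java.lang.invoke.MethodType;

class MethodDescriptorDemo {
    // Builds the descriptor string for a method with the given return and parameter types.
    static String descriptorOf(Class<?> returnType, Class<?>... parameterTypes) {
        return MethodType.methodType(returnType, parameterTypes).toMethodDescriptorString();
    }

    public static void main(String[] args) {
        // void inc()
        System.out.println(descriptorOf(void.class));
        // java.lang.String toString()
        System.out.println(descriptorOf(String.class));
        // int indexOf(char[] source, int offset)
        System.out.println(descriptorOf(int.class, char[].class, int.class));
    }
}
```

Note that the method name and the parameter names do not appear in the descriptor; only the types and their order do.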
8. Attribute table collection
The attribute table (attribute_info) describes information that is specific to certain scenarios. The class file itself, the field table, and the method table can each carry their own attribute table collection.
Attribute tables are not required to appear in any strict order, and any compiler may write attributes of its own definition into an attribute table. The Java virtual machine ignores attributes that it does not recognize.
The virtual machine specification predefines the following attributes:
For each attribute, the name is a reference to a CONSTANT_Utf8_info constant in the constant pool.
The structure of an attribute table is defined as follows:
<1> "Code attribute"
After the compiler processes the code in a Java method body, the code becomes bytecode instructions that are stored in the Code attribute. The Code attribute appears in the attribute collection of a method table, but not every method table has one: abstract methods, such as those declared in an interface or an abstract class, have no Code attribute.
The Code attribute is structured as follows:
attribute_name_index is an index that points to a CONSTANT_Utf8_info constant whose value is fixed to "Code"; it represents the name of the attribute.
attribute_length gives the length of the attribute value. Because attribute_name_index (a u2) and attribute_length (a u4) together occupy 6 bytes, the length of the attribute value is always the length of the whole attribute table minus 6 bytes.
max_stack: the maximum depth of the operand stack. Note: The virtual machine uses this value at run time to allocate the operand stack depth in the stack frame.
max_locals: the storage space required for the local variable table, measured in slots.
Note: A slot is the smallest unit the virtual machine uses when allocating memory for local variables. Each local variable whose data type is no longer than 32 bits occupies 1 slot, while the two 64-bit data types, double and long, require 2 slots.
Note: Method parameters (including the implicit this of an instance method), the parameters of explicit exception handlers, and the local variables defined in the method body are all stored in the local variable table.
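The slot rules can be illustrated by counting how many slots the parameters of a method occupy. The sketch below uses the standard MethodType class to parse a method descriptor; the class name and the parameterSlots helper are hypothetical. It only counts parameter slots, so the result is a lower bound for max_locals, not the value a compiler would actually write.

```java
import java.lang.invoke.MethodType;

class LocalSlotsDemo {
    // Counts the local variable slots taken by the implicit this and the declared parameters.
    static int parameterSlots(String methodDescriptor, boolean isStatic) {
        // A null class loader makes MethodType resolve types via the system class loader.
        MethodType type = MethodType.fromMethodDescriptorString(methodDescriptor, null);
        // Slot 0 holds the implicit this of an instance method.
        int slots = isStatic ? 0 : 1;
        for (Class<?> parameter : type.parameterList()) {
            // long and double are 64-bit types and need two slots each.
            slots += (parameter == long.class || parameter == double.class) ? 2 : 1;
        }
        return slots;
    }

    public static void main(String[] args) {
        System.out.println(parameterSlots("([CI)I", false)); // this + char[] + int
        System.out.println(parameterSlots("(JD)V", true));   // long + double, no this
    }
}
```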
code_length and code store the bytecode instructions. code_length gives the length of the bytecode in bytes, and code is the byte stream that holds the bytecode instructions.
Note: Each instruction opcode is a single u1 byte, which may be followed by operand bytes.
Note: The virtual machine specification limits the bytecode of a method to at most 65,535 bytes. If the limit is exceeded, the javac compiler refuses to compile the method.
exception_table_length and exception_table form the explicit exception handling table. The Code attribute is not required to contain one.
The explicit exception-handling table structure is defined as follows:
The fields have the following meaning: if an exception of type catch_type or one of its subtypes occurs while the bytecode between line start_pc and line end_pc (excluding end_pc itself) is executing, execution jumps to line handler_pc to continue. Here "line" means the bytecode offset relative to the start of the method body. When catch_type is 0, an exception of any type is directed to handler_pc.
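The lookup rule described above can be sketched as a simple search over the table entries. This is a simplified model with hypothetical names: catch_type is represented by a Class object instead of a constant pool index, null stands in for the value 0, and the first matching entry wins, which is assumed here to mirror the order in which a virtual machine searches the table.

```java
class ExceptionTableDemo {
    // One row of the exception table.
    static final class Entry {
        final int startPc;
        final int endPc;
        final int handlerPc;
        final Class<? extends Throwable> catchType; // null models catch_type == 0

        Entry(int startPc, int endPc, int handlerPc, Class<? extends Throwable> catchType) {
            this.startPc = startPc;
            this.endPc = endPc;
            this.handlerPc = handlerPc;
            this.catchType = catchType;
        }
    }

    // Returns handler_pc of the first entry that covers pc and matches the exception, or -1.
    static int findHandler(Entry[] table, int pc, Throwable thrown) {
        for (Entry entry : table) {
            // The covered range includes start_pc but excludes end_pc.
            boolean inRange = pc >= entry.startPc && pc < entry.endPc;
            // isInstance also accepts subtypes of the catch type.
            boolean typeMatches = entry.catchType == null || entry.catchType.isInstance(thrown);
            if (inRange && typeMatches) {
                return entry.handlerPc;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        Entry[] table = {
            new Entry(0, 10, 20, java.io.IOException.class),
            new Entry(0, 10, 30, null)
        };
        System.out.println(findHandler(table, 5, new java.io.FileNotFoundException()));
        System.out.println(findHandler(table, 5, new RuntimeException()));
        System.out.println(findHandler(table, 10, new RuntimeException()));
    }
}
```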
<2> "Exceptions attribute"
The Exceptions attribute lists the checked exceptions that a method may throw, that is, the exceptions listed after the throws keyword in the method declaration.
The structure of the Exceptions attribute is as follows:
The number_of_exceptions item gives the number of checked exceptions that the method may throw;
each exception_index_table item is an index to a CONSTANT_Class_info constant in the constant pool and represents the type of one checked exception.
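The declared exceptions of a method can be observed at run time through reflection, which gives a rough view of what the throws clause contributes to the class file. The class and method names below are hypothetical, and the statement that reflection reports the content of the Exceptions attribute is an assumption about typical JDK behaviour rather than something this article establishes.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.concurrent.TimeoutException;

class ExceptionsAttributeDemo {
    // Hypothetical method whose throws clause lists two checked exceptions.
    void fetch() throws IOException, TimeoutException {
    }

    // Returns the exception types declared by the named no-argument method of this class.
    static Class<?>[] declaredExceptions(String methodName) {
        try {
            return ExceptionsAttributeDemo.class.getDeclaredMethod(methodName).getExceptionTypes();
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException("No such method: " + methodName, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(declaredExceptions("fetch")));
    }
}
```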
<3> "LineNumberTable attribute"
The LineNumberTable attribute describes the correspondence between Java source line numbers and bytecode line numbers (bytecode offsets). It is not required at run time. The javac options -g:none and -g:lines turn its generation off or on. If it is not generated, the most visible effects on a running program are that the stack trace of a thrown exception does not show the line number of the error, and that a debugger cannot set breakpoints by source line.
The structure of the LineNumberTable attribute is as follows:
line_number_table is a collection of line_number_table_length entries of type line_number_info. Each line_number_info table contains two u2 items, start_pc and line_number: the former is the bytecode line number and the latter is the Java source line number.
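One way to use such a table is to map a bytecode offset back to a source line, as a stack trace does. The sketch below models each line_number_info entry as a two-element int array and picks the entry with the greatest start_pc that does not exceed the given offset. The names are hypothetical, and the selection rule is an assumption about how such tables are commonly interpreted.

```java
class LineNumberTableDemo {
    // Each row is {start_pc, line_number}. Returns the source line for pc, or -1 if unknown.
    static int lineFor(int[][] table, int pc) {
        int bestStartPc = -1;
        int line = -1;
        for (int[] row : table) {
            // Keep the entry that starts closest to pc without starting after it.
            if (row[0] <= pc && row[0] > bestStartPc) {
                bestStartPc = row[0];
                line = row[1];
            }
        }
        return line;
    }

    public static void main(String[] args) {
        int[][] table = {{0, 3}, {4, 4}, {9, 6}};
        System.out.println("pc 5 -> line " + lineFor(table, 5));
        System.out.println("pc 9 -> line " + lineFor(table, 9));
    }
}
```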
<4> "LocalVariableTable attribute"
The LocalVariableTable attribute describes the relationship between the variables in the local variable table of a stack frame and the variables defined in the Java source code. It is not required at run time and is not generated in the class file by default; the javac options -g:none and -g:vars turn its generation off or on.
The structure of the LocalVariableTable attribute is defined as follows:
A local_variable_info item represents the association between one stack frame slot and one local variable in the source code. Its structure is defined as follows:
start_pc and length give the bytecode offset at which the lifetime of the local variable begins and the length of the range it covers; together they define the scope of the local variable within the bytecode;
name_index and descriptor_index are both indexes to CONSTANT_Utf8_info constants in the constant pool, representing the name of the local variable and its descriptor, respectively;
index is the position of the local variable's slot in the local variable table of the stack frame. When the variable is of a 64-bit type, it occupies two slots, index and index+1.
Note: When Java introduced generics, the LocalVariableTypeTable attribute was added. Its structure is very similar to that of LocalVariableTable; the only difference is that descriptor_index is replaced by the signature of the field.
<5> "SourceFile attribute"
The SourceFile attribute records the name of the source file from which the class file was generated. It is also optional and can be turned off or on with the javac options -g:none and -g:source.
The structure of the attribute is defined as follows:
The sourcefile_index item is an index to a CONSTANT_Utf8_info constant in the constant pool, whose value is the file name of the source file.
<6> "ConstantValue attribute"
The ConstantValue attribute tells the virtual machine to assign a value to a static variable automatically.
Instance variables (non-static variables) are assigned in the instance constructor method <init>. For class variables there are two options: assign the value in the class constructor method <clinit>, or use the ConstantValue attribute. Sun's javac compiler chooses as follows: if a variable is declared both final and static and its type is a primitive type or String, javac generates a ConstantValue attribute to initialize it; all other static variables are initialized in the <clinit> method.
The structure of the ConstantValue attribute is defined as follows:
The constantvalue_index item is a reference to a literal constant in the constant pool. Depending on the field type, the literal is one of CONSTANT_Long_info, CONSTANT_Float_info, CONSTANT_Double_info, CONSTANT_Integer_info, or CONSTANT_String_info.
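The javac rule described above can be made concrete with a small class whose fields fall into the different categories. The class and field names are hypothetical, and the comments restate the rule from this section rather than the output of any particular compiler version.

```java
class ConstantValueDemo {
    // static + final + primitive type with a literal value:
    // a candidate for initialization through the ConstantValue attribute.
    static final int MAX_RETRIES = 3;

    // static + final + String with a literal value: also a ConstantValue candidate.
    static final String GREETING = "hello";

    // static but not final: initialized in the class constructor method <clinit>.
    static int counter = 10;

    // static and final, but neither primitive nor String: initialized in <clinit>.
    static final int[] SIZES = {1, 2, 3};

    // Instance variable: initialized in the instance constructor method <init>.
    int instanceValue = 7;

    public static void main(String[] args) {
        System.out.println(MAX_RETRIES + " " + GREETING + " " + counter
                + " " + SIZES.length + " " + new ConstantValueDemo().instanceValue);
    }
}
```

After compiling this class, running javap -v on it is one way to check which fields actually carry a ConstantValue attribute.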
<7> "InnerClasses attribute"
The InnerClasses attribute records the association between inner classes and their host class. If a class defines an inner class, the compiler generates an InnerClasses attribute both for that class and for the inner classes it contains.
The structure of the InnerClasses attribute is as follows:
The number_of_classes item gives the number of inner class records, and each inner class is described by an inner_classes_info table.
The structure of the inner_classes_info table is as follows:
inner_class_info_index and outer_class_info_index are both indexes to CONSTANT_Class_info constants in the constant pool, representing the symbolic references of the inner class and the host class, respectively.
inner_name_index is an index to a CONSTANT_Utf8_info constant in the constant pool and represents the name of the inner class; it is 0 for an anonymous inner class.
inner_class_access_flags holds the access flags of the inner class, similar to the access_flags of a class. Its possible values are as follows:
<8> "Deprecated and Synthetic attributes"
Deprecated and Synthetic are both Boolean flag attributes: the only distinction is whether the attribute is present or absent, and there is no notion of an attribute value.
The Deprecated attribute indicates that a class, field, or method has been marked by the program's author as no longer recommended for use (it is set in code with the @deprecated Javadoc tag).
The Synthetic attribute indicates that a field or method was not produced directly from Java source code but was added by the compiler.
The structure of the Deprecated and Synthetic attributes is defined as follows:
The value of the attribute_length item must be 0x00000000.