Original: http://blog.csdn.net/dc_726/article/details/7944154
1. Class File Basics
(1) File format
Unlike descriptive languages such as XML, the class file format leaves no slack in its structure. Because it contains no delimiters, every data item is strictly fixed in both order and size: which byte means what, how long each item is, and the order in which items appear cannot change.
(2) Data types
A closer look at the class file format shows that it is stored as a pseudo-structure resembling a C struct, and this pseudo-structure has only two kinds of data: unsigned numbers and tables. Unsigned numbers are written u1, u2, u4, and u8, standing for 1, 2, 4, and 8 bytes respectively. A table is a composite data type made up of several unsigned numbers or other tables, and its name conventionally ends with "_info". Because a table usually describes a variable number of items, it is normally preceded by a count field that says how many items follow.
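As a minimal sketch of the two kinds of data, the following Java reads unsigned numbers and a count-prefixed table from a byte array. The class name UnsignedReaderSketch and its helper names are made up for illustration; it relies on java.nio.ByteBuffer being big-endian by default, which matches the byte order of class files.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

final class UnsignedReaderSketch {
    // u1: one byte, widened so that 0x80..0xFF stay positive
    static int u1(ByteBuffer in) { return in.get() & 0xFF; }

    // u2: two bytes, big-endian
    static int u2(ByteBuffer in) { return in.getShort() & 0xFFFF; }

    // u4: four bytes, big-endian; a long keeps the value unsigned
    static long u4(ByteBuffer in) { return in.getInt() & 0xFFFFFFFFL; }

    // A simple "table": a u2 count followed by that many u2 items
    static int[] u2Table(ByteBuffer in) {
        int count = u2(in);
        int[] items = new int[count];
        for (int i = 0; i < count; i++) {
            items[i] = u2(in);
        }
        return items;
    }

    public static void main(String[] args) {
        byte[] data = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE,
                       0x00, 0x02, 0x00, 0x05, 0x00, 0x06};
        ByteBuffer in = ByteBuffer.wrap(data);
        System.out.printf("u4 = 0x%X%n", u4(in));
        System.out.println(Arrays.toString(u2Table(in)));
    }
}
```

The count-then-items pattern in u2Table is the same one the constant pool, field table, and method table use, only with richer item types.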
The following figure lists the data items of the class file format in order, together with their types:
(3) Compatibility
A newer JDK is backward compatible with class files produced by earlier versions, but an older JDK cannot run class files produced by a later version, even if the file format itself has not changed. For example, the JRE in JDK 1.7 can run a class file compiled by JDK 1.5, but a class file compiled by JDK 1.7 cannot be run on JDK 1.5. This is where the -target option is useful: when compiling with JDK 1.7 you can specify -target 1.5 (javac also expects a matching -source 1.5 in that case) so that the older runtime can load the result.
2. A simple example
package com.cdai.jvm.bytecode;

public class ByteCodeSample {
    private String msg = "Hello world";

    public void say() {
        System.out.println(msg);
    }
}
After compilation, the class file looks like this:
3. Byte-by-byte analysis
(1) Magic number and version number
The first four bytes (u4), CA FE BA BE, are the magic number of the class file. Bytes 5 and 6 (u2) are the minor version number, and bytes 7 and 8 (u2) are the major version number. Here they are hexadecimal 00 00 and 00 32, i.e. version 50.0, which corresponds to JDK 1.6. The -target option described earlier changes the values of these four version bytes, which is how a class file is made loadable by a different JDK version.
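The header check can be sketched in a few lines of Java. The class name ClassHeaderSketch and its methods are hypothetical, and the "major minus 44" mapping is simply the observed numbering of releases (45 for 1.1, 50 for 6, 51 for 7), not something read from the file.

```java
import java.nio.ByteBuffer;

final class ClassHeaderSketch {
    // Returns {major, minor}; rejects input that does not start with 0xCAFEBABE
    static int[] readVersion(byte[] classBytes) {
        ByteBuffer in = ByteBuffer.wrap(classBytes); // big-endian by default
        if (in.getInt() != 0xCAFEBABE) {
            throw new IllegalArgumentException("not a class file: bad magic number");
        }
        int minor = in.getShort() & 0xFFFF; // bytes 5-6
        int major = in.getShort() & 0xFFFF; // bytes 7-8
        return new int[] {major, minor};
    }

    // Major version 50 corresponds to Java 6, 51 to Java 7, and so on
    static int javaRelease(int major) {
        return major - 44;
    }

    public static void main(String[] args) {
        byte[] header = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE,
                         0x00, 0x00, 0x00, 0x32};
        int[] v = readVersion(header);
        System.out.println(v[0] + "." + v[1] + " -> Java " + javaRelease(v[0]));
    }
}
```

For a real file, the same method could be fed the bytes returned by java.nio.file.Files.readAllBytes.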
(2) Constant pool
The constant pool is a table structure and, as described earlier, the table's contents are preceded by a u2 counter giving the size of the constant pool. Here it is hexadecimal 00 23, decimal 35, which means the constant pool holds entries numbered 1 to 34. Numbering starts at 1 rather than 0 because index 0 is reserved to mean "refers to no constant pool entry". The first byte of every entry is a u1 tag whose value (1 to 12 in this class file version) identifies the entry's data type. The specific meanings are as follows:
Take the first entry, 07 00 02, as an example: 07 marks it as a CONSTANT_Class_info, followed by a u2 index that points to the 2nd constant. Now look at the second entry, 01 00 24 63 6f 6d 2f ...: tag 01 marks a CONSTANT_Utf8_info string with a length of 36 (hexadecimal 00 24), followed by the UTF-8 encoded string "com/cdai/jvm/bytecode/ByteCodeSample". It is easy to read once you know the layout. The constant pool mainly serves the field table and method table that come after it.
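The walk through the first two entries can be sketched as a small parser. This is an illustration under stated assumptions, not a complete implementation: the class name ConstantPoolSketch is made up, it handles only the tags 1 and 3 to 12 used by class files of this version, it decodes Utf8 entries as plain UTF-8 (the format is really "modified UTF-8", which only differs for NUL and supplementary characters), and the capitalization ByteCodeSample is assumed for the sample class name.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class ConstantPoolSketch {
    // Expects the buffer to be positioned at constant_pool_count
    static String[] parse(ByteBuffer in) {
        int count = in.getShort() & 0xFFFF;
        String[] pool = new String[count]; // slot 0 stays null: "no entry"
        for (int i = 1; i < count; i++) {
            int tag = in.get() & 0xFF;
            switch (tag) {
                case 1: { // CONSTANT_Utf8: u2 length, then the bytes
                    byte[] bytes = new byte[in.getShort() & 0xFFFF];
                    in.get(bytes);
                    pool[i] = "Utf8 " + new String(bytes, StandardCharsets.UTF_8);
                    break;
                }
                case 3: pool[i] = "Integer " + in.getInt(); break;
                case 4: pool[i] = "Float " + in.getFloat(); break;
                case 5: pool[i] = "Long " + in.getLong(); i++; break;     // occupies two slots
                case 6: pool[i] = "Double " + in.getDouble(); i++; break; // occupies two slots
                case 7: pool[i] = "Class #" + (in.getShort() & 0xFFFF); break;
                case 8: pool[i] = "String #" + (in.getShort() & 0xFFFF); break;
                case 9: case 10: case 11: case 12: { // two u2 indexes each
                    int a = in.getShort() & 0xFFFF;
                    int b = in.getShort() & 0xFFFF;
                    String kind = tag == 9 ? "Field" : tag == 10 ? "Method"
                                : tag == 11 ? "InterfaceMethod" : "NameAndType";
                    pool[i] = kind + " #" + a + (tag == 12 ? ":#" : ".#") + b;
                    break;
                }
                default:
                    throw new IllegalArgumentException("unsupported tag " + tag);
            }
        }
        return pool;
    }

    // Builds a two-entry pool: 07 00 02, then 01 00 24 followed by the class name
    static ByteBuffer demo() {
        byte[] name = "com/cdai/jvm/bytecode/ByteCodeSample".getBytes(StandardCharsets.UTF_8);
        ByteBuffer b = ByteBuffer.allocate(2 + 3 + 3 + name.length);
        b.putShort((short) 3); // count is number of entries + 1
        b.put((byte) 7).putShort((short) 2);
        b.put((byte) 1).putShort((short) name.length).put(name);
        b.flip();
        return b;
    }

    public static void main(String[] args) {
        String[] pool = parse(demo());
        for (int i = 1; i < pool.length; i++) {
            System.out.println("#" + i + " = " + pool[i]);
        }
    }
}
```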
The following is the complete constant pool as parsed by javap (run javap -c -l -s -verbose ByteCodeSample):
Constant pool:
const #1 = class #2; //  com/cdai/jvm/bytecode/ByteCodeSample
const #2 = Asciz com/cdai/jvm/bytecode/ByteCodeSample;
const #3 = class #4; //  java/lang/Object
const #4 = Asciz java/lang/Object;
const #5 = Asciz msg;
const #6 = Asciz Ljava/lang/String;;
const #7 = Asciz <init>;
const #8 = Asciz ()V;
const #9 = Asciz Code;
const #10 = Method #3.#11; //  java/lang/Object."<init>":()V
const #11 = NameAndType #7:#8; //  "<init>":()V
const #12 = String #13; //  Hello world
const #13 = Asciz Hello world;
const #14 = Field #1.#15; //  com/cdai/jvm/bytecode/ByteCodeSample.msg:Ljava/lang/String;
const #15 = NameAndType #5:#6; //  msg:Ljava/lang/String;
const #16 = Asciz LineNumberTable;
const #17 = Asciz LocalVariableTable;
const #18 = Asciz this;
const #19 = Asciz Lcom/cdai/jvm/bytecode/ByteCodeSample;;
const #20 = Asciz say;
const #21 = Field #22.#24; //  java/lang/System.out:Ljava/io/PrintStream;
const #22 = class #23; //  java/lang/System
const #23 = Asciz java/lang/System;
const #24 = NameAndType #25:#26; //  out:Ljava/io/PrintStream;
const #25 = Asciz out;
const #26 = Asciz Ljava/io/PrintStream;;
const #27 = Method #28.#30; //  java/io/PrintStream.println:(Ljava/lang/String;)V
const #28 = class #29; //  java/io/PrintStream
const #29 = Asciz java/io/PrintStream;
const #30 = NameAndType #31:#32; //  println:(Ljava/lang/String;)V
const #31 = Asciz println;
const #32 = Asciz (Ljava/lang/String;)V;
const #33 = Asciz SourceFile;
const #34 = Asciz ByteCodeSample.java;
(3) Access flags
00 21 is ACC_PUBLIC (0x0001) combined with ACC_SUPER (0x0020), so this is an ordinary public class.
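To see why 00 21 reads as a public class, the u2 value can be tested bit by bit. The sketch below uses the class-level flag values from the JVM specification; the class name ClassAccessFlagsSketch is made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

final class ClassAccessFlagsSketch {
    // Returns the names of all class-level flags set in the given u2 value
    static List<String> decode(int flags) {
        List<String> names = new ArrayList<>();
        if ((flags & 0x0001) != 0) names.add("ACC_PUBLIC");
        if ((flags & 0x0010) != 0) names.add("ACC_FINAL");
        if ((flags & 0x0020) != 0) names.add("ACC_SUPER");
        if ((flags & 0x0200) != 0) names.add("ACC_INTERFACE");
        if ((flags & 0x0400) != 0) names.add("ACC_ABSTRACT");
        if ((flags & 0x1000) != 0) names.add("ACC_SYNTHETIC");
        if ((flags & 0x2000) != 0) names.add("ACC_ANNOTATION");
        if ((flags & 0x4000) != 0) names.add("ACC_ENUM");
        return names;
    }

    public static void main(String[] args) {
        System.out.println(decode(0x0021)); // prints [ACC_PUBLIC, ACC_SUPER]
    }
}
```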
(4) Class, parent class, interfaces
These three u2 values are the class index (1), the parent class index (3), and the interface count (0). Looking back at the constant pool, item 1 resolves to "com/cdai/jvm/bytecode/ByteCodeSample" and item 3 to "java/lang/Object". The third value is a counter rather than a constant pool index: 0 means this class implements no interfaces, so no interface indexes follow it. Constant pool index 0 in the "refers to nothing" sense shows up elsewhere, for example as the parent class index of java/lang/Object itself.
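A short sketch of reading this part of the file, assuming the layout in the JVM specification (this_class, super_class, interfaces_count, then that many u2 interface indexes). The class name ClassIndexesSketch is hypothetical.

```java
import java.nio.ByteBuffer;

final class ClassIndexesSketch {
    final int thisClass;    // constant pool index of this class
    final int superClass;   // constant pool index of the parent class
    final int[] interfaces; // constant pool indexes of implemented interfaces

    ClassIndexesSketch(ByteBuffer in) {
        thisClass = in.getShort() & 0xFFFF;
        superClass = in.getShort() & 0xFFFF;
        int count = in.getShort() & 0xFFFF; // interfaces_count
        interfaces = new int[count];
        for (int i = 0; i < count; i++) {
            interfaces[i] = in.getShort() & 0xFFFF;
        }
    }

    public static void main(String[] args) {
        // The six bytes from the sample: 00 01 00 03 00 00
        ClassIndexesSketch c = new ClassIndexesSketch(
                ByteBuffer.wrap(new byte[] {0, 1, 0, 3, 0, 0}));
        System.out.println("this=#" + c.thisClass + " super=#" + c.superClass
                + " interfaces=" + c.interfaces.length);
    }
}
```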
(5) Field table
00 01 indicates that there is 1 field. 00 02 is the field's access flags, meaning private (ACC_PRIVATE). 00 05 is the field's name index, pointing to item 5 of the constant pool, "msg". 00 06 is the field's descriptor index, pointing to item 6 of the constant pool, "Ljava/lang/String;". The final 00 00 means the field has no attributes in its attribute table.
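The same walk can be written as a loop over the field table. This is a sketch under the JVM specification's field_info layout (access_flags, name_index, descriptor_index, attributes_count, attributes); the class name FieldTableSketch and the string format of the result are made up, and attribute payloads are skipped rather than interpreted.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

final class FieldTableSketch {
    // Expects the buffer to be positioned at fields_count
    static List<String> readFields(ByteBuffer in) {
        int count = in.getShort() & 0xFFFF;
        List<String> fields = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            int flags = in.getShort() & 0xFFFF;
            int nameIndex = in.getShort() & 0xFFFF;
            int descriptorIndex = in.getShort() & 0xFFFF;
            int attributes = in.getShort() & 0xFFFF;
            for (int a = 0; a < attributes; a++) {
                in.getShort();            // attribute_name_index, ignored here
                int length = in.getInt(); // attribute_length
                in.position(in.position() + length); // skip the payload
            }
            fields.add(String.format("flags=0x%04X name=#%d descriptor=#%d attributes=%d",
                    flags, nameIndex, descriptorIndex, attributes));
        }
        return fields;
    }

    public static void main(String[] args) {
        // The sample bytes: 00 01 | 00 02 00 05 00 06 00 00
        byte[] data = {0, 1, 0, 2, 0, 5, 0, 6, 0, 0};
        System.out.println(readFields(ByteBuffer.wrap(data)));
    }
}
```

Entries in the method table start with the same four u2 values, so the loop body would carry over to methods largely unchanged.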
A descriptor describes the data type of a field, or the parameter list and return value of a method. The attribute table is a table structure that carries extra information for entries in the field and method tables. For example, if the field were declared as a constant, static final String msg = "AAA", the field would be followed by an attribute table containing an attribute named ConstantValue, which points to the constant pool entry whose value is "AAA".
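To make the descriptor strings concrete, here is a minimal decoder that turns descriptors such as "Ljava/lang/String;" or "(Ljava/lang/String;)V" into Java-like type names. It follows the descriptor grammar in the JVM specification; the class name DescriptorSketch and the output format are made up for illustration, and malformed input is not handled gracefully.

```java
import java.util.ArrayList;
import java.util.List;

final class DescriptorSketch {
    private final String text;
    private int pos = 0;

    private DescriptorSketch(String text) { this.text = text; }

    // Consumes one type starting at pos and returns its Java-like name
    private String nextType() {
        char c = text.charAt(pos++);
        switch (c) {
            case 'B': return "byte";
            case 'C': return "char";
            case 'D': return "double";
            case 'F': return "float";
            case 'I': return "int";
            case 'J': return "long";
            case 'S': return "short";
            case 'Z': return "boolean";
            case 'V': return "void";
            case '[': return nextType() + "[]"; // one array dimension
            case 'L': {                         // object type: L<internal name>;
                int end = text.indexOf(';', pos);
                String name = text.substring(pos, end).replace('/', '.');
                pos = end + 1;
                return name;
            }
            default:
                throw new IllegalArgumentException("bad descriptor: " + text);
        }
    }

    static String fieldType(String descriptor) {
        return new DescriptorSketch(descriptor).nextType();
    }

    // "(params)ret" becomes "ret (param, param)"
    static String methodSignature(String descriptor) {
        DescriptorSketch p = new DescriptorSketch(descriptor);
        if (p.text.charAt(p.pos++) != '(') {
            throw new IllegalArgumentException("not a method descriptor: " + descriptor);
        }
        List<String> params = new ArrayList<>();
        while (p.text.charAt(p.pos) != ')') {
            params.add(p.nextType());
        }
        p.pos++; // skip ')'
        return p.nextType() + " (" + String.join(", ", params) + ")";
    }

    public static void main(String[] args) {
        System.out.println(fieldType("Ljava/lang/String;"));
        System.out.println(methodSignature("()V"));
        System.out.println(methodSignature("(Ljava/lang/String;)V"));
    }
}
```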
The attribute table is not as strict about order, length, and content as the other data items in the class file. Any compiler implementation may write attributes of its own into the attribute table, and the JVM ignores attributes it does not recognize. The Code attribute is also stored in an attribute table: it appears in the method table below and holds the method's bytecode.
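Ignoring unknown attributes is possible because every attribute starts with the same header (attribute_info in the JVM specification): a u2 name index and a u4 length. A reader that does not know the name can jump over exactly that many bytes. The sketch below assumes a hypothetical pre-parsed array of Utf8 strings indexed by constant pool position, treats the u4 length as a Java int, and uses a made-up attribute name MyVendorAttr; the class name AttributeSkipSketch is also illustrative.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

final class AttributeSkipSketch {
    // utf8[i] is the Utf8 string at constant pool index i (hypothetical helper data).
    // Expects the buffer to be positioned at attributes_count.
    static List<String> readAttributes(ByteBuffer in, String[] utf8) {
        int count = in.getShort() & 0xFFFF;
        List<String> seen = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            String name = utf8[in.getShort() & 0xFFFF]; // attribute_name_index
            int length = in.getInt();                   // attribute_length
            if ("ConstantValue".equals(name)) {
                // Known attribute: its payload is one u2 constant pool index
                seen.add("ConstantValue -> #" + (in.getShort() & 0xFFFF));
            } else {
                // Unknown attribute: skip the payload without interpreting it
                in.position(in.position() + length);
                seen.add("skipped " + name + " (" + length + " bytes)");
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        String[] utf8 = {null, "ConstantValue", "MyVendorAttr"};
        byte[] data = {
            0, 2,                      // two attributes
            0, 1, 0, 0, 0, 2, 0, 7,    // ConstantValue, length 2, index #7
            0, 2, 0, 0, 0, 3, 1, 2, 3  // MyVendorAttr, length 3, opaque payload
        };
        System.out.println(readAttributes(ByteBuffer.wrap(data), utf8));
    }
}
```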
(6) Method table
00 02 indicates that there are two methods. 00 01 is the first method's access flags, meaning public. 00 07 and 00 08 are its name index and descriptor index, just as in the field table, and resolve to "<init>" and "()V" respectively. The following 00 01 means the method has one attribute, whose name index is 00 09: the Code attribute mentioned earlier.