View the Java bytecode file structure from HelloWorld

Source: Internet
Author: User
Tags: field table

Most of the time we learn to program at the source-code level and seldom look at what lies behind each line of Java code. Today, let's look at the structure of a Java Class file, starting from the simplest Hello World.

Before getting started, let's write a simple Hello World program.

public class Demo {
    public static void main(String args[]) {
        System.out.println("Hello World");
    }
}

Then run the javac Demo.java command to compile this class. A Demo.class file is generated.

Then we open the generated Demo.class file in an editor that can display it in hexadecimal.

cafe babe 0000 0034 001d 0a00 0600 0f09
0010 0011 0800 120a 0013 0014 0700 1507
0016 0100 063c 696e 6974 3e01 0003 2829
5601 0004 436f 6465 0100 0f4c 696e 654e
756d 6265 7254 6162 6c65 0100 046d 6169
6e01 0016 285b 4c6a 6176 612f 6c61 6e67
2f53 7472 696e 673b 2956 0100 0a53 6f75
7263 6546 696c 6501 0009 4465 6d6f 2e6a
6176 610c 0007 0008 0700 170c 0018 0019
0100 0b48 656c 6c6f 2057 6f72 6c64 0700
1a0c 001b 001c 0100 0444 656d 6f01 0010
6a61 7661 2f6c 616e 672f 4f62 6a65 6374
0100 106a 6176 612f 6c61 6e67 2f53 7973
7465 6d01 0003 6f75 7401 0015 4c6a 6176
612f 696f 2f50 7269 6e74 5374 7265 616d
3b01 0013 6a61 7661 2f69 6f2f 5072 696e
7453 7472 6561 6d01 0007 7072 696e 746c
6e01 0015 284c 6a61 7661 2f6c 616e 672f
5374 7269 6e67 3b29 5600 2100 0500 0600
0000 0000 0200 0100 0700 0800 0100 0900
0000 1d00 0100 0100 0000 052a b700 01b1
0000 0001 000a 0000 0006 0001 0000 0001
0009 000b 000c 0001 0009 0000 0025 0002
0001 0000 0009 b200 0212 03b6 0004 b100
0000 0100 0a00 0000 0a00 0200 0000 0300
0800 0400 0100 0d00 0000 0200 0e
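
A dump in the same layout as the one above (two bytes per group, eight groups per line) can be produced with a few lines of Java. This is only a minimal sketch: the class name HexDump and the default path Demo.class are assumptions made for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

class HexDump {
    // Formats bytes as lowercase hex: two bytes per group, eight groups (16 bytes) per line.
    static String format(byte[] data) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < data.length; i++) {
            if (i > 0) {
                if (i % 16 == 0) {
                    sb.append('\n');      // start a new line every 16 bytes
                } else if (i % 2 == 0) {
                    sb.append(' ');       // separate two-byte groups
                }
            }
            sb.append(String.format("%02x", data[i] & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Assumed default: a Demo.class file in the current directory.
        String path = args.length > 0 ? args[0] : "Demo.class";
        System.out.println(format(Files.readAllBytes(Paths.get(path))));
    }
}
```

After compiling, it can be run as java HexDump Demo.class.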

We can see that the five simple lines of code have been compiled into the long string of hexadecimal digits above. When we run the class, the console correctly prints "Hello World", so we can conclude that this string of bytes must follow certain rules. Those rules are defined by the Java Virtual Machine Specification.

Java virtual machine specification

The Java Virtual Machine Specification defines the structure of the Java Virtual Machine, the Class file format, the bytecode instruction set, and more. For software developers, the Class file structure is well worth understanding.

A Class file is a binary stream of 8-bit bytes. Every data item is laid out in a strictly defined order with no separators in between, so almost everything stored in the Class file is data the program needs, with no gaps. Multi-byte values are stored in big-endian order, most significant byte first.

Java Virtual Machine

Having introduced the Java Virtual Machine Specification, we also need to understand what a Java virtual machine is.

In essence, a Java virtual machine is a virtual computer. Like a real computer, it has its own complete architecture: a processor, a stack, registers, and a corresponding instruction set. The only difference is that the processor and memory of the virtual machine are simulated in software, while those of a real computer are physical.

Despite its name, the Java Virtual Machine is not directly tied to the Java language. It simply reads Class files as the Java Virtual Machine Specification prescribes, then parses and executes the bytecode instructions they contain.

If you are skilled enough, you can write a compiler that turns C code into a bytecode file that complies with the Java Virtual Machine Specification, and the Java Virtual Machine will be able to execute it as well.

In other words, the Java Virtual Machine is bound to the bytecode file (Class file) format, not to any particular source language.

Class file structure

The Java Virtual Machine Specification covers many topics, one of which is the structure of bytecode files. It uses two kinds of data to describe the Class file format: unsigned numbers and tables.

Unsigned numbers are the basic data type. u1, u2, u4, and u8 represent unsigned numbers of 1, 2, 4, and 8 bytes respectively. Unsigned numbers can describe numbers, index references, quantity values, or string values encoded in UTF-8. For example, the u4 in the first row of the following table covers the first four bytes of the Class file and holds the file's magic number; the u2 in the second row covers bytes 5-6 of the Class file and holds the minor version number.
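
Reading unsigned numbers in Java takes a little care because Java's byte, short, and int are signed. The sketch below is one possible approach, not the only one: it wraps the bytes in a java.nio.ByteBuffer (big-endian by default, which matches the Class file format) and masks each value. The class name UnsignedReader and its method names are made up for this example.

```java
import java.nio.ByteBuffer;

class UnsignedReader {
    // ByteBuffer reads in big-endian order by default, as the Class file format requires.
    private final ByteBuffer buf;

    UnsignedReader(byte[] data) {
        this.buf = ByteBuffer.wrap(data);
    }

    // u1: one byte, widened to int and masked so that 0xCA reads as 202, not -54.
    int u1() {
        return buf.get() & 0xff;
    }

    // u2: two bytes, masked to the range 0..65535.
    int u2() {
        return buf.getShort() & 0xffff;
    }

    // u4: four bytes, widened to long so the full unsigned range fits.
    long u4() {
        return buf.getInt() & 0xffffffffL;
    }

    public static void main(String[] args) {
        // The first 11 bytes of the Demo.class dump.
        byte[] head = {(byte) 0xca, (byte) 0xfe, (byte) 0xba, (byte) 0xbe,
                       0x00, 0x00, 0x00, 0x34, 0x00, 0x1d, 0x0a};
        UnsignedReader r = new UnsignedReader(head);
        System.out.printf("magic=0x%X minor=%d major=%d count=%d tag=%d%n",
                r.u4(), r.u2(), r.u2(), r.u2(), r.u1());
    }
}
```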

Tables are composite data types made up of multiple unsigned numbers or other tables as data items. By convention, all table names end with "_info". A table describes data with a hierarchical, composite structure. For example, the fifth row in the following table is a table of type cp_info (the constant pool), where all constants of the class are stored.
The entire Class file is essentially one table, which consists of the data items shown in the table below.

The above table can actually be divided into the following seven parts, which constitute a complete Class bytecode file:

  • Magic number and Class file version
  • Constant pool
  • Access flag
  • Class index, parent class index, and interface index
  • Field table set
  • Method table set
  • Attribute table set

Magic number and Class file version

Bytes 1-4 of the Class file are the magic number. Its only purpose is to identify whether the file is a Class file that the virtual machine can accept. Its fixed value is 0xCAFEBABE ("cafe babe"). If the magic number of a Class file is not 0xCAFEBABE, the virtual machine refuses to run it.

Bytes 5-6 of the Class file are the minor version number (minor_version) of the Class file format.

Bytes 7-8 of the Class file are the major version number (major_version), which identifies the JDK major version the Class file was compiled for.

A newer JDK is backward compatible with Class files of older versions, but an older JDK cannot run Class files of a newer version. For example, a Class file compiled with JDK 1.5 runs on a JDK 1.7 virtual machine but not on a JDK 1.4 virtual machine. The following table lists the hexadecimal version numbers of each JDK version:

Let's look at the Class file of the Demo class. Its first eight bytes are cafe babe 0000 0034. The major version 0x0034 is 52 in decimal, so this Class file was compiled by JDK 1.8.
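
The check on the first eight bytes can be written down as a short sketch. It relies on the fact that major versions count up by one per release, starting at 45 for JDK 1.1, so 52 corresponds to JDK 1.8. The class name ClassHeader and the wording of the returned strings are assumptions for illustration only.

```java
import java.nio.ByteBuffer;

class ClassHeader {
    static final int MAGIC = 0xCAFEBABE;

    // Maps a major version to a JDK release name: 45 -> JDK 1.1, ..., 52 -> JDK 1.8, 53 -> JDK 9.
    static String jdkFor(int major) {
        if (major < 45) {
            throw new IllegalArgumentException("unexpected major version " + major);
        }
        int release = major - 44;
        return release <= 8 ? "JDK 1." + release : "JDK " + release;
    }

    // Expects at least the first eight bytes of a Class file.
    static String describe(byte[] firstEightBytes) {
        ByteBuffer buf = ByteBuffer.wrap(firstEightBytes);
        if (buf.getInt() != MAGIC) {
            return "not a Class file";
        }
        int minor = buf.getShort() & 0xffff;
        int major = buf.getShort() & 0xffff;
        return "version " + major + "." + minor + " -> " + jdkFor(major);
    }

    public static void main(String[] args) {
        byte[] header = {(byte) 0xca, (byte) 0xfe, (byte) 0xba, (byte) 0xbe, 0x00, 0x00, 0x00, 0x34};
        System.out.println(describe(header));
    }
}
```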

Constant pool

Bytes 9-10 of the Class file give the constant pool count (constant_pool_count). Constant pool indexes start at 1, so the pool actually holds constant_pool_count - 1 constants. In our Class file, bytes 9-10 are 001d (29 in decimal), which means there are 28 constants.

Each constant in the pool is represented by a cp_info table. There are 14 kinds of cp_info, distinguished by a one-byte tag at the start of the entry:
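
As a quick reference while reading the entries that follow, the tag byte can be mapped to the structure name used in the Java Virtual Machine Specification. The sketch below covers the 14 tags defined up to Class file version 52 (Java 8); later versions add more. The class name ConstantTag is hypothetical.

```java
class ConstantTag {
    // Tag values defined for Class file versions up to 52 (Java 8).
    static String name(int tag) {
        switch (tag) {
            case 1:  return "CONSTANT_Utf8_info";
            case 3:  return "CONSTANT_Integer_info";
            case 4:  return "CONSTANT_Float_info";
            case 5:  return "CONSTANT_Long_info";
            case 6:  return "CONSTANT_Double_info";
            case 7:  return "CONSTANT_Class_info";
            case 8:  return "CONSTANT_String_info";
            case 9:  return "CONSTANT_Fieldref_info";
            case 10: return "CONSTANT_Methodref_info";
            case 11: return "CONSTANT_InterfaceMethodref_info";
            case 12: return "CONSTANT_NameAndType_info";
            case 15: return "CONSTANT_MethodHandle_info";
            case 16: return "CONSTANT_MethodType_info";
            case 18: return "CONSTANT_InvokeDynamic_info";
            default: throw new IllegalArgumentException("unknown constant pool tag " + tag);
        }
    }

    public static void main(String[] args) {
        // The tags that occur in Demo.class.
        int[] tags = {0x0a, 0x09, 0x08, 0x07, 0x01, 0x0c};
        for (int tag : tags) {
            System.out.printf("0x%02x = %s%n", tag, name(tag));
        }
    }
}
```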

1st constant. The byte after 001d is 0A, which indicates that this constant is a method reference (CONSTANT_Methodref_info). As shown in the preceding table, bytes 2-3 of this constant hold the index of the class information. Here, 0006 points to the 6th constant in the constant pool. Bytes 4-5 hold the index of the name-and-type descriptor. Here, 000f points to the 15th constant in the constant pool.
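
The five bytes of this first entry, 0a 00 06 00 0f, can also be decoded mechanically, which is a handy way to double-check hexadecimal indexes such as 000f (15 in decimal). The following is a minimal sketch; the class RefEntry and its output format are assumptions, loosely modeled on how javap prints such entries.

```java
import java.nio.ByteBuffer;

class RefEntry {
    final int tag;
    final int classIndex;
    final int nameAndTypeIndex;

    RefEntry(int tag, int classIndex, int nameAndTypeIndex) {
        this.tag = tag;
        this.classIndex = classIndex;
        this.nameAndTypeIndex = nameAndTypeIndex;
    }

    // Parses a 5-byte Fieldref (tag 9), Methodref (tag 10) or InterfaceMethodref (tag 11) entry:
    // u1 tag, u2 class_index, u2 name_and_type_index.
    static RefEntry parse(byte[] entry) {
        ByteBuffer buf = ByteBuffer.wrap(entry);
        int tag = buf.get() & 0xff;
        if (tag < 9 || tag > 11) {
            throw new IllegalArgumentException("not a reference entry, tag=" + tag);
        }
        int classIndex = buf.getShort() & 0xffff;
        int nameAndTypeIndex = buf.getShort() & 0xffff;
        return new RefEntry(tag, classIndex, nameAndTypeIndex);
    }

    @Override
    public String toString() {
        String kind = tag == 9 ? "Fieldref" : tag == 10 ? "Methodref" : "InterfaceMethodref";
        return kind + " #" + classIndex + ".#" + nameAndTypeIndex;
    }

    public static void main(String[] args) {
        System.out.println(parse(new byte[]{0x0a, 0x00, 0x06, 0x00, 0x0f}));
    }
}
```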

2nd constant. The byte after 000f is 09, which indicates that this constant is a field reference (CONSTANT_Fieldref_info). As shown in the preceding table, bytes 2-3 of this constant hold the index of the class information. Here, 0010 points to the 16th constant in the constant pool. Bytes 4-5 hold the index of the name-and-type descriptor. Here, 0011 points to the 17th constant in the constant pool.

3rd constant. The byte after 0011 is 08, which indicates that this constant is a string reference (CONSTANT_String_info). As shown in the preceding table, bytes 2-3 of this constant hold the index of the string literal. Here, 0012 points to the 18th constant in the constant pool.

4th constant. The byte after 0012 is 0A, which indicates that this constant is a method reference (CONSTANT_Methodref_info). As shown in the preceding table, bytes 2-3 of this constant hold the index of the class information. Here, 0013 points to the 19th constant in the constant pool. Bytes 4-5 hold the index of the name-and-type descriptor. Here, 0014 points to the 20th constant in the constant pool.

5th constant. This is a class information constant (CONSTANT_Class_info) that points to the 21st constant in the constant pool.

6th constant. This is a class information constant (CONSTANT_Class_info) that points to the 22nd constant in the constant pool.

7th constant. Here, the tag value is 01, indicating a UTF-8 string constant (CONSTANT_Utf8_info). As shown in the preceding table, bytes 2-3 of this constant hold the length of the string. Here, 0006 means the string is 6 bytes long. The six bytes that follow are 3C 69 6E 69 74 3E. Strings in the Class file are stored in a modified UTF-8 encoding, which is identical to ASCII for these characters. Converted to text, the value is: <init>.
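
The layout of a CONSTANT_Utf8_info entry after its tag (a two-byte length followed by modified UTF-8 bytes) happens to be exactly what java.io.DataInputStream.readUTF expects, so the conversion can be sketched in a few lines. The class name Utf8Entry is made up for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class Utf8Entry {
    // Expects a whole entry starting at its tag byte: 01, u2 length, then the string bytes.
    static String decode(byte[] entry) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(entry))) {
            int tag = in.readUnsignedByte();
            if (tag != 1) {
                throw new IllegalArgumentException("not a CONSTANT_Utf8_info entry, tag=" + tag);
            }
            // readUTF reads a two-byte length and then that many bytes of modified UTF-8.
            return in.readUTF();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // The 7th constant of Demo.class: 01 0006 3c 69 6e 69 74 3e
        byte[] seventh = {0x01, 0x00, 0x06, 0x3c, 0x69, 0x6e, 0x69, 0x74, 0x3e};
        System.out.println(decode(seventh));
    }
}
```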

8th constant. This is a UTF-8 string constant. After conversion, it is: ()V.

9th constant. This is a UTF-8 string constant. After conversion, it is: Code.

10th constant. This is a UTF-8 string constant. After conversion, it is: LineNumberTable.

11th constant. This is a UTF-8 string constant. After conversion, it is: main.

12th constant. This is a UTF-8 string constant. After conversion, it is: ([Ljava/lang/String;)V.

13th constant. This is a UTF-8 string constant. After conversion, it is: SourceFile.

14th constant. This is a UTF-8 string constant. After conversion, it is: Demo.java.

15th constant. Here, the tag value is 0C, indicating a name-and-type constant (CONSTANT_NameAndType_info). As shown in the preceding table, bytes 2-3 of this constant hold the index of the field or method name. Here, 0007 points to the 7th constant in the constant pool. Bytes 4-5 hold the index of the field or method descriptor. Here, 0008 points to the 8th constant in the constant pool. From our earlier analysis, the 15th constant therefore represents: "<init>":()V.

16th constant. Here, the tag value is 07, indicating a class information constant (CONSTANT_Class_info). As shown in the preceding table, bytes 2-3 of this constant hold the index of the fully qualified class name. Here, 0017 points to the 23rd constant in the constant pool.

17th constant. Here, the tag value is 0C, indicating a name-and-type constant (CONSTANT_NameAndType_info). As shown in the preceding table, bytes 2-3 of this constant hold the index of the field or method name. Here, 0018 points to the 24th constant in the constant pool. Bytes 4-5 hold the index of the field or method descriptor. Here, 0019 points to the 25th constant in the constant pool. From our earlier analysis, the 17th constant therefore represents: out:Ljava/io/PrintStream;.

18th constant. This is a UTF-8 string constant. After conversion, it is: Hello World.

19th constant. Here, the tag value is 07, indicating a class information constant (CONSTANT_Class_info). As shown in the preceding table, bytes 2-3 of this constant hold the index of the fully qualified class name. Here, 001A points to the 26th constant in the constant pool.

20th constant. Here, the tag value is 0C, indicating a name-and-type constant (CONSTANT_NameAndType_info). As shown in the preceding table, bytes 2-3 of this constant hold the index of the field or method name. Here, 001B points to the 27th constant in the constant pool. Bytes 4-5 hold the index of the field or method descriptor. Here, 001C points to the 28th constant in the constant pool.

21st constant. This is a UTF-8 string constant. After conversion, it is: Demo.

22nd constant. This is a UTF-8 string constant. After conversion, it is: java/lang/Object.

23rd constant. This is a UTF-8 string constant. After conversion, it is: java/lang/System.

24th constant. This is a UTF-8 string constant. After conversion, it is: out.

25th constant. This is a UTF-8 string constant. After conversion, it is: Ljava/io/PrintStream;.

26th constant. This is a UTF-8 string constant. After conversion, it is: java/io/PrintStream.

27th constant. This is a UTF-8 string constant. After conversion, it is: println.

28th constant. This is a UTF-8 string constant. After conversion, it is: (Ljava/lang/String;)V.

At this point, all 28 constants in the constant pool have been parsed. We worked out the composition of the constant pool by hand, but in practice we can use the javap command that ships with the JDK to view the constant pool of a Class file directly.

When we run javap -verbose Demo.class, the console prints the structure of the Class file, including the constant pool.

Compare the output of javap with the manual analysis and you will find that the results are consistent.
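
The manual walk through the constant pool can also be automated. The sketch below is a simplified stand-in for the constant pool part of javap, not a reimplementation of it: it handles the 14 tags defined up to Class file version 52, prints unresolved indexes only, and uses an output format of its own. The class name ConstantPoolDump is hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

class ConstantPoolDump {
    // Returns one line per constant pool entry, e.g. "#1 = Methodref #6.#15".
    static List<String> dump(byte[] classBytes) {
        List<String> out = new ArrayList<>();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(classBytes))) {
            in.skipBytes(8);                       // magic, minor_version, major_version
            int count = in.readUnsignedShort();    // constant_pool_count
            for (int i = 1; i < count; i++) {      // valid indexes are 1 .. count - 1
                int tag = in.readUnsignedByte();
                String body;
                switch (tag) {
                    case 1:  body = "Utf8 " + in.readUTF(); break;
                    case 3:  body = "Integer " + in.readInt(); break;
                    case 4:  body = "Float " + in.readFloat(); break;
                    case 5:  body = "Long " + in.readLong(); break;
                    case 6:  body = "Double " + in.readDouble(); break;
                    case 7:  body = "Class #" + in.readUnsignedShort(); break;
                    case 8:  body = "String #" + in.readUnsignedShort(); break;
                    case 9:  body = "Fieldref " + pair(in, "."); break;
                    case 10: body = "Methodref " + pair(in, "."); break;
                    case 11: body = "InterfaceMethodref " + pair(in, "."); break;
                    case 12: body = "NameAndType " + pair(in, ":"); break;
                    case 15: body = "MethodHandle " + in.readUnsignedByte() + ":#" + in.readUnsignedShort(); break;
                    case 16: body = "MethodType #" + in.readUnsignedShort(); break;
                    case 18: body = "InvokeDynamic " + pair(in, ":"); break;
                    default: throw new IllegalStateException("unknown tag " + tag + " at #" + i);
                }
                out.add("#" + i + " = " + body);
                if (tag == 5 || tag == 6) {
                    i++;                           // long and double take up two pool slots
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out;
    }

    // Reads two consecutive u2 indexes and joins them with the given separator.
    private static String pair(DataInputStream in, String sep) throws IOException {
        int first = in.readUnsignedShort();
        int second = in.readUnsignedShort();
        return "#" + first + sep + "#" + second;
    }

    public static void main(String[] args) throws IOException {
        // Assumed default: a Demo.class file in the current directory.
        String path = args.length > 0 ? args[0] : "Demo.class";
        for (String line : dump(Files.readAllBytes(Paths.get(path)))) {
            System.out.println(line);
        }
    }
}
```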

Access flag

Right after the constant pool, the next two bytes are the access flags (access_flags), which identify access information at the class or interface level: whether this is a class or an interface, whether it is declared public, whether it is declared abstract, and so on. The specific flag bits and their meanings are listed in the table below.

Here, the two bytes are 00 21. No single flag in the table has the value 00 21. This is because several flags can be set at once, so the value stored in the bytecode file is the bitwise OR of the individual flag values.

Reading the table above, we can see that 00 21 is 00 01 OR 00 20. That is, the class is public (ACC_PUBLIC, 0x0001) and uses the newer semantics of the invokespecial bytecode instruction (ACC_SUPER, 0x0020).
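
Splitting a flag value back into its individual bits is a simple bitwise AND against each known flag. The sketch below lists the class-level flags defined up to Class file version 52 (Java 8); the class name ClassAccessFlags is an assumption made for this example.

```java
import java.util.ArrayList;
import java.util.List;

class ClassAccessFlags {
    // Class-level flag bits defined up to Class file version 52 (Java 8).
    private static final int[] BITS = {
        0x0001, 0x0010, 0x0020, 0x0200, 0x0400, 0x1000, 0x2000, 0x4000
    };
    private static final String[] NAMES = {
        "ACC_PUBLIC", "ACC_FINAL", "ACC_SUPER", "ACC_INTERFACE",
        "ACC_ABSTRACT", "ACC_SYNTHETIC", "ACC_ANNOTATION", "ACC_ENUM"
    };

    // Returns the names of all flags whose bit is set in the given access_flags value.
    static List<String> decode(int flags) {
        List<String> set = new ArrayList<>();
        for (int i = 0; i < BITS.length; i++) {
            if ((flags & BITS[i]) != 0) {
                set.add(NAMES[i]);
            }
        }
        return set;
    }

    public static void main(String[] args) {
        // 0x0021 is the access_flags value found in Demo.class.
        System.out.println(decode(0x0021));
    }
}
```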

Class index, parent class index, and interface index

The class index and the parent class index are each a single u2 value, and the interface index set is a group of u2 values. Together, these three data items determine the inheritance relationship of the class.

Class index. The class index determines the fully qualified name of this class and is a single u2 value. Here, the class index is 00 05, which points to the 5th constant in the constant pool. From our earlier analysis, the 5th constant ultimately resolves to the Demo class.

Parent class index. The parent class index determines the fully qualified name of the parent class of this class and is also a single u2 value. Here, the parent class index is 00 06, which points to the 6th constant in the constant pool. From our earlier analysis, the 6th constant ultimately resolves to the Object class. Because Demo does not explicitly extend any class, its parent class is the default, Object.

Interface index. The interface index set describes which interfaces the class implements. The implemented interfaces are stored in the order in which they appear after the implements keyword (or after the extends keyword, if the class itself is an interface), from left to right. The first item of the interface index set is a u2 interface counter (interfaces_count), which gives the number of entries in the index table. The interface indexes follow the counter. If the class implements no interfaces, the counter is 0 and the index table occupies no bytes at all.

In the Demo Class file, no interface is implemented, so the two bytes after the parent class index are 0x0000 and the interface index table is empty.
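
The three items can be read in one pass: two u2 indexes, then a u2 counter followed by that many u2 interface indexes. The sketch below assumes the buffer is already positioned just after access_flags; the class name ClassHierarchy is hypothetical.

```java
import java.nio.ByteBuffer;

class ClassHierarchy {
    final int thisClass;     // constant pool index of this class
    final int superClass;    // constant pool index of the parent class
    final int[] interfaces;  // constant pool indexes of the implemented interfaces

    private ClassHierarchy(int thisClass, int superClass, int[] interfaces) {
        this.thisClass = thisClass;
        this.superClass = superClass;
        this.interfaces = interfaces;
    }

    // Reads this_class, super_class, interfaces_count and the interface index table.
    // The buffer must be positioned on the first byte after access_flags.
    static ClassHierarchy read(ByteBuffer buf) {
        int thisClass = buf.getShort() & 0xffff;
        int superClass = buf.getShort() & 0xffff;
        int count = buf.getShort() & 0xffff;
        int[] interfaces = new int[count];
        for (int i = 0; i < count; i++) {
            interfaces[i] = buf.getShort() & 0xffff;
        }
        return new ClassHierarchy(thisClass, superClass, interfaces);
    }

    public static void main(String[] args) {
        // The six bytes that follow access_flags in Demo.class: 0005 0006 0000
        ClassHierarchy h = read(ByteBuffer.wrap(new byte[]{0x00, 0x05, 0x00, 0x06, 0x00, 0x00}));
        System.out.println("this_class=#" + h.thisClass + " super_class=#" + h.superClass
                + " interfaces=" + h.interfaces.length);
    }
}
```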

Field table set

The field table set is used to describe variables declared in interfaces or classes. The fields mentioned here include class-level variables and instance-level variables, but not local variables declared within the method.

The two bytes after the interface index set are the field counter (fields_count), which gives the total number of fields. The field data follows the counter. Each field in the field table is represented by a table named field_info, whose data structure is as follows:

Because we did not declare any instance variables or class variables, the field counter in the Demo bytecode file is 00 00, meaning there are no fields.

Method table set

The two bytes after the field table set are the method counter (methods_count), which gives the total number of methods in the class. The method data follows the method counter. Each method in the method table is represented by a method_info table, whose data structure is as follows:

In the bytecode file of the Demo class, the method counter is 00 02, so there are two methods in total: the compiler-generated constructor <init> and main.

1st method. The two bytes after the method counter are the method's access flags. 00 01 is ACC_PUBLIC, so the method is public. The next two bytes are the index of the method name. Here, 00 07 points to the 7th constant in the constant pool: <init>. The following two bytes are the index of the method descriptor. Here, 00 08 points to the 8th constant in the constant pool: ()V. The following two bytes are the attribute counter. 00 01 means this method has one attribute. What follows is the content of the attribute table.
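
The four u2 values at the start of a method entry can be read the same way as the earlier items. Because every attribute starts with a u2 name index and a u4 length, a reader can also skip attribute bodies it does not want to interpret. The sketch below does exactly that; the class name MethodHeader is made up, and the attribute in main is a placeholder with an empty body, not the real Code attribute of the method.

```java
import java.nio.ByteBuffer;

class MethodHeader {
    final int accessFlags;
    final int nameIndex;
    final int descriptorIndex;
    final int attributesCount;

    private MethodHeader(int accessFlags, int nameIndex, int descriptorIndex, int attributesCount) {
        this.accessFlags = accessFlags;
        this.nameIndex = nameIndex;
        this.descriptorIndex = descriptorIndex;
        this.attributesCount = attributesCount;
    }

    // Reads one method_info entry and skips over its attribute bodies,
    // leaving the buffer positioned at the start of the next entry.
    static MethodHeader read(ByteBuffer buf) {
        int access = buf.getShort() & 0xffff;
        int name = buf.getShort() & 0xffff;
        int descriptor = buf.getShort() & 0xffff;
        int count = buf.getShort() & 0xffff;
        for (int i = 0; i < count; i++) {
            buf.getShort();                          // attribute_name_index
            int length = buf.getInt();               // attribute_length (u4)
            buf.position(buf.position() + length);   // skip the attribute body
        }
        return new MethodHeader(access, name, descriptor, count);
    }

    public static void main(String[] args) {
        // Header of the first method (0001 0007 0008 0001) followed by a placeholder
        // attribute: name index 0009 and length 0, i.e. an empty body (illustrative only).
        byte[] bytes = {0x00, 0x01, 0x00, 0x07, 0x00, 0x08, 0x00, 0x01,
                        0x00, 0x09, 0x00, 0x00, 0x00, 0x00};
        MethodHeader m = read(ByteBuffer.wrap(bytes));
        System.out.println("access=0x" + Integer.toHexString(m.accessFlags)
                + " name=#" + m.nameIndex + " descriptor=#" + m.descriptorIndex
                + " attributes=" + m.attributesCount);
    }
}
```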

At this point, by parsing Hello World, we have gained an overall understanding of the Java Class file structure, along with a brief look at the Java Virtual Machine and the Java Virtual Machine Specification. I hope this article leaves you with a deeper understanding of the Class file structure.

Permanent link to this article: https://www.bkjia.com/Linux/2018-03/151358.htm
