Deep understanding of the JVM: class file structure, constant pool (1)

Source: Internet
Author: User

To quote a sentence: "Changing the result of compilation from native machine code to bytecode is a small step in the development of storage formats, but a big step in the development of programming languages." This sentence captures the essence of Java.

In this article I try to be concise and only sketch the essentials; the purpose is to record what I have learned and to share it. If anything is wrong, please correct me.

We know that C++ code compiles to binary machine code that can run directly on the local machine. Java compiles to class files that must be executed by the JVM. To understand how the JVM actually loads and executes class files, we first need to figure out how a class file is laid out.

First, a class file is a binary stream whose basic unit is the 8-bit byte. All the data items it contains are packed tightly in order, without any delimiters.

Because it is a tightly packed binary stream without delimiters, the meaning of each byte in the class file must be strictly defined. These "defining rules" are what we want to explain.

To make this concrete, first look at the Classstructure.class file compiled from a very simple piece of code. The code is as follows:

public class Classstructure {
  private String string = "Hello";

  public String hello() {
    // assumed: the original listing is cut off after "return"
    return string;
  }
}

Opening the Classstructure.class file compiled from this code with the Notepad++ Hex-Editor plugin shows all the information contained in the compiled bytecode, as in the following figure:


Of course, any file can be opened in hexadecimal like this. A class file, however, has a distinctive feature. Look at its first four bytes: CA FE BA BE, which read together spell "cafe babe". These first four bytes are the "magic number" that identifies the file; the JVM uses it to decide whether the file is a class file the virtual machine can accept.

With this header in mind, let's look at the specific storage rules of the class file (note that all the bytes are packed tightly, without delimiters):
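The magic number check can be sketched in a few lines of Java. This is only an illustration of the idea, not how any particular JVM implements it; the class name MagicCheck and the method hasClassMagic are made up for this example.

```java
final class MagicCheck {
    // The four magic bytes that every class file starts with.
    private static final byte[] MAGIC = {
        (byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE
    };

    /** Returns true if data starts with CA FE BA BE. */
    static boolean hasClassMagic(byte[] data) {
        if (data.length < MAGIC.length) {
            return false;
        }
        for (int i = 0; i < MAGIC.length; i++) {
            if (data[i] != MAGIC[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] classHead = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE};
        byte[] otherHead = {0x50, 0x4B, 0x03, 0x04}; // some non-class file
        System.out.println(hasClassMagic(classHead)); // true
        System.out.println(hasClassMagic(otherHead)); // false
    }
}
```

To try it on a real file, the byte array could come from java.nio.file.Files.readAllBytes on the compiled class file.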


For now we will look only at the constant pool section and leave what is stored after it for later. Discussing the constant pool first also makes it easier to understand the "name index" items in the later parts, which point into the constant pool.

One thing to emphasize here: a constant entry in the constant pool is not a single const value of the kind we intuitively know from programming languages, but a data structure similar to a "struct" in C. We call this structure a "table". Almost all the data in a class file is stored in the form of tables, and they will come up many times below.

Let's take the first 10 bytes of the bytecode, excerpted below:


Each byte is written as two hexadecimal digits. The first 4 of these 10 bytes are CA FE BA BE, the magic number already discussed. The next 2 bytes (red) are the minor version number; the 2 bytes after that (green) are the major version number (here 0x0033, decimal 51, which corresponds to JDK 1.7; a newer JVM can still run class files of older versions). The last 2 bytes (blue) are the constant pool count, where 0x0017 is 1 * 16 + 7 = 23. The index numbers therefore run from 0 to 22, just like array indexes in a programming language. One point needs special mention, however: index 0 is left empty and reserved for a special purpose, so the real constant entries are the 22 entries numbered 1 to 22.

Now let's look at these 22 constant entries. As mentioned earlier, the class file is a tightly packed byte stream, and the exact meaning of every byte must be strictly defined so that the JVM can interpret it correctly. Therefore the size, composition, and layout of each constant in the constant pool must be clearly specified.

In fact, the layout of each kind of constant table in the constant pool is fixed in advance; there were 11 kinds before JDK 1.7. We take two kinds of constant as examples, first looking at how they are stored and then at what they mean:
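As a rough sketch, the 10 header bytes can be decoded like this in Java. The class ClassHeader and its field names are invented for illustration. The major version (0x0033) and the constant pool count (0x0017) come from the text above; the two minor version bytes are assumed to be 00 00, since their value is not given here.

```java
import java.nio.ByteBuffer;

final class ClassHeader {
    final int magic;
    final int minorVersion;
    final int majorVersion;
    final int constantPoolCount;

    private ClassHeader(int magic, int minorVersion, int majorVersion, int constantPoolCount) {
        this.magic = magic;
        this.minorVersion = minorVersion;
        this.majorVersion = majorVersion;
        this.constantPoolCount = constantPoolCount;
    }

    /** Decodes magic (4 bytes), minor (2), major (2) and pool count (2). */
    static ClassHeader parse(byte[] first10) {
        // ByteBuffer is big-endian by default, which matches the class file format.
        ByteBuffer buf = ByteBuffer.wrap(first10);
        int magic = buf.getInt();
        int minor = buf.getShort() & 0xFFFF; // mask to read as unsigned
        int major = buf.getShort() & 0xFFFF;
        int count = buf.getShort() & 0xFFFF;
        return new ClassHeader(magic, minor, major, count);
    }

    /** Index 0 is reserved, so the real entries are 1 .. count - 1. */
    int usableEntries() {
        return constantPoolCount - 1;
    }

    public static void main(String[] args) {
        byte[] first10 = {
            (byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, // magic
            0x00, 0x00,                                         // minor (assumed value)
            0x00, 0x33,                                         // major = 51
            0x00, 0x17                                          // constant pool count = 23
        };
        ClassHeader h = parse(first10);
        System.out.printf("magic=0x%08X minor=%d major=%d count=%d entries=%d%n",
                h.magic, h.minorVersion, h.majorVersion, h.constantPoolCount, h.usableEntries());
        // prints: magic=0xCAFEBABE minor=0 major=51 count=23 entries=22
    }
}
```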

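1. The String type constant table, which as the name suggests describes a string constant, is stored as a 1-byte tag followed by a 2-byte index that points to a Utf8 type constant.

2. The Methodref type constant table, which as the name suggests describes a reference (ref) to a method, is stored as a 1-byte tag, a 2-byte index that points to a Class type constant, and a 2-byte index that points to a NameAndType type constant.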

These are two of the constant types, each of which is uniquely identified by the value of its tag. Every constant table must begin with a 1-byte tag that gives its constant type, and the tag value of each type is fixed. For example, the tag value of the Methodref type is 10 and the tag value of the String type is 8, which makes it possible to determine unambiguously which kind of constant table is stored at a given position. The storage layout of each kind of constant is likewise fixed in advance. The Methodref type described above always occupies 5 (1 + 2 + 2) bytes and the String type always occupies 3 (1 + 2) bytes, with each byte having the meaning given above. In this way the 22 constants can be laid out one after another, and the meaning of every byte is unambiguous.

One more word on what the "index" in the Methodref constant table means. Take the index that points to the Class type constant. The value of the index, that is, the value stored in those two bytes, is one of the index numbers 0 ~ 22 mentioned above. In other words, it points to another constant in the constant pool, in this case one of Class type (another of the constant types).

Note that the Methodref constant table contains only these 2 index values and no concrete description of the method. The concrete description of the method is given by the other constants in the pool that the indexes point to, as we will see later.
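The "tag decides the layout" rule can be written down as a small lookup. This is a sketch rather than a complete implementation: only the Methodref (tag 10, 5 bytes) and String (tag 8, 3 bytes) sizes come from the text above, and the remaining cases follow the JVM specification as far as I know it. The class name ConstantSizes and the method fixedSize are made up for the example.

```java
final class ConstantSizes {
    /**
     * Returns the total size in bytes (tag included) of a fixed-size
     * constant table, or -1 if the size is not fixed (Utf8, tag 1)
     * or the tag is not covered by this sketch.
     */
    static int fixedSize(int tag) {
        switch (tag) {
            case 7:  // Class: tag + index of the class name
            case 8:  // String: tag + index of a Utf8 constant
                return 1 + 2;
            case 9:  // Fieldref: tag + class index + NameAndType index
            case 10: // Methodref: tag + class index + NameAndType index
            case 11: // InterfaceMethodref: same shape as Methodref
            case 12: // NameAndType: tag + name index + descriptor index
                return 1 + 2 + 2;
            case 3:  // Integer: tag + 4 bytes of data
            case 4:  // Float: tag + 4 bytes of data
                return 1 + 4;
            case 5:  // Long: tag + 8 bytes of data
            case 6:  // Double: tag + 8 bytes of data
                return 1 + 8;
            default:
                return -1;
        }
    }

    public static void main(String[] args) {
        System.out.println("Methodref: " + fixedSize(10)); // Methodref: 5
        System.out.println("String: " + fixedSize(8));     // String: 3
    }
}
```

Because every table announces its type in the first byte, a reader can always tell how many bytes to consume before the next tag begins, which is what makes delimiters unnecessary.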

As mentioned earlier, the constant pool in this article holds 22 constant tables, so the 11th byte, right after the 10 header bytes (underlined in the figure below), must be the tag of the first constant table. Its value is 0x0A, decimal 10, and as said above a tag of 10 means the entry is of Methodref type. According to the Methodref table layout, the next two bytes are the index of the Class type constant, 0x0005 as underlined in the figure. The two bytes after that are 0x0013, that is #19 (16 + 3), the index of the NameAndType type constant that describes the method itself.


This concludes the 5 bytes of the first constant table. What follows is the tag of the next constant: the 0x08 at the start of the next underlined run, which marks a String type constant. The next two bytes, 0x000F, are the index of a Utf8 type constant. That completes the 3 bytes of this constant, and next comes the tag of the constant after it ...

Translating by hand, byte by byte, is a good way to learn the class file structure, but it is tedious. In fact, the javap tool in the JDK does the "translation" work we did above for us. We open a command window with cmd and use javap (for example, javap -verbose Classstructure) to translate the Classstructure.class file just generated, with the following result:
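The hand translation of these 8 bytes (0A 00 05 00 13, then 08 00 0F) can be reproduced with a short Java sketch. It handles only the two tags discussed in this article and rejects everything else; the class name PoolWalk and the method decode are made up for the example.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

final class PoolWalk {
    /** Decodes consecutive constant tables, numbering them from firstIndex. */
    static List<String> decode(byte[] bytes, int firstIndex) {
        ByteBuffer buf = ByteBuffer.wrap(bytes); // big-endian, like the class file
        List<String> out = new ArrayList<>();
        int index = firstIndex;
        while (buf.hasRemaining()) {
            int tag = buf.get() & 0xFF; // every table starts with a 1-byte tag
            switch (tag) {
                case 10: { // Methodref: class index + NameAndType index
                    int classIndex = buf.getShort() & 0xFFFF;
                    int nameAndTypeIndex = buf.getShort() & 0xFFFF;
                    out.add("#" + index + " = Methodref #" + classIndex + ".#" + nameAndTypeIndex);
                    break;
                }
                case 8: { // String: index of a Utf8 constant
                    int utf8Index = buf.getShort() & 0xFFFF;
                    out.add("#" + index + " = String #" + utf8Index);
                    break;
                }
                default:
                    throw new IllegalArgumentException("tag not handled in this sketch: " + tag);
            }
            index++;
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] bytes = {0x0A, 0x00, 0x05, 0x00, 0x13, 0x08, 0x00, 0x0F};
        for (String line : decode(bytes, 1)) {
            System.out.println(line);
        }
        // prints:
        // #1 = Methodref #5.#19
        // #2 = String #15
    }
}
```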


As shown above, the first constant

#1 = Methodref      #5.#19

is of type Methodref and, just as we translated by hand, its two indexes are #5 and #19. Following the first index:
#5 = Class          #22

The Class type constant table also stores an index, so we follow it again:
#22 = Utf8          java/lang/Object

Here we find that the class declaring the method is java/lang/Object. This also confirms that "the concrete description of the method is given by the other constants in the pool that the indexes point to". In fact, this method is the constructor of Object, which is called by the default constructor that the compiler adds for us. As for the meaning of #8 and #9, which #19 points to, we will cover that later.
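Following the chain #1 -> #5 -> #22 is just repeated lookup by index. Here is a minimal Java model of that lookup, using only the three entries quoted from the javap output. The Entry class, samplePool and classNameOfMethodref are hypothetical names for this illustration, not part of any JDK API.

```java
import java.util.HashMap;
import java.util.Map;

final class PoolResolve {
    /** A simplified constant table: a kind, up to two indexes, optional text. */
    static final class Entry {
        final String kind;
        final int first;   // first index, 0 if unused
        final int second;  // second index, 0 if unused
        final String text; // only set for Utf8 entries

        Entry(String kind, int first, int second, String text) {
            this.kind = kind;
            this.first = first;
            this.second = second;
            this.text = text;
        }
    }

    /** The three entries from the javap output, keyed by pool index. */
    static Map<Integer, Entry> samplePool() {
        Map<Integer, Entry> pool = new HashMap<>();
        pool.put(1, new Entry("Methodref", 5, 19, null));
        pool.put(5, new Entry("Class", 22, 0, null));
        pool.put(22, new Entry("Utf8", 0, 0, "java/lang/Object"));
        return pool;
    }

    /** Methodref -> Class -> Utf8: returns the name of the declaring class. */
    static String classNameOfMethodref(Map<Integer, Entry> pool, int methodrefIndex) {
        Entry methodref = expect(pool, methodrefIndex, "Methodref");
        Entry clazz = expect(pool, methodref.first, "Class");
        Entry name = expect(pool, clazz.first, "Utf8");
        return name.text;
    }

    private static Entry expect(Map<Integer, Entry> pool, int index, String kind) {
        Entry e = pool.get(index);
        if (e == null || !e.kind.equals(kind)) {
            throw new IllegalStateException("#" + index + " is not a " + kind + " entry");
        }
        return e;
    }

    public static void main(String[] args) {
        System.out.println(classNameOfMethodref(samplePool(), 1)); // java/lang/Object
    }
}
```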
