[Java] When JVM is used, you can touch the fish in shortest.

Last Update:2018-12-05 Source: Internet

Author: User


This article does not describe the JVM in general or its cross-platform features, nor is it about JVM security features or JVM instructions and data operations. It focuses on the lifecycle of a type.

The type lifecycle involves class loading, the JVM architecture, and the garbage collection mechanism.

Why talk about the JVM architecture? Because class loading and garbage collection are both closely tied to it.

So what is the JVM architecture?

When the JVM runs, it requests a region of memory from the operating system (JVM implementations differ, and some may use virtual memory). It sets aside part of this memory to store many things, for example the objects a program creates, the arguments passed to methods, return values, and local variables. This part is called the runtime data area. The runtime data area is divided into the method area, the heap, the Java stacks, the pc registers, and the native method stacks. As the architecture figure in the original article illustrates, this gives a rough idea of what the JVM looks like, but you may not yet know what each part of the runtime data area, such as the method area, is for.

Method area: When the virtual machine loads a class file, it parses the type information from the binary data in the class file and stores that information in the method area. Because the method area is shared by all threads, access to its data must be thread-safe. If two threads both try to find a class named Lava that has not yet been loaded, only one thread should load it while the other waits.

PC register: each new thread gets its own pc register and its own Java stack.

Heap: stores all objects created while the program runs. The heap is a memory area shared by all threads, so concurrency must be considered when writing multithreaded programs.

Java stack: a Java stack consists of many stack frames. When a thread invokes a Java method, the virtual machine pushes a new frame onto that thread's Java stack. When the method returns, the frame is popped from the stack and discarded.
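The push and pop of stack frames can be observed from inside a running program. The following is a minimal sketch, not part of the original article: the class and method names (StackDepth, outer, inner, depth) are made up for illustration, and the stack trace of a freshly created Throwable is used as a stand-in for the frames on the current thread's Java stack.

```java
public class StackDepth {
    // Counts the frames reported for the current thread at the point of the call.
    static int depth() {
        return new Throwable().getStackTrace().length;
    }

    // Called from outer(), so one extra frame (inner's own) is on the stack.
    public static int inner() {
        return depth();
    }

    // Returns { depth measured in outer, depth measured in inner }.
    public static int[] outer() {
        int here = depth();
        int callee = inner();
        return new int[] { here, callee };
    }

    public static void main(String[] args) {
        int[] d = outer();
        System.out.println("outer depth = " + d[0] + ", inner depth = " + d[1]);
    }
}
```

The absolute numbers depend on how the program is launched, but the depth measured in inner should be exactly one more than the depth measured in outer, because calling inner pushes one additional frame that is popped again when it returns.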

By now you can picture roughly how a JVM works. Ready to continue with the detailed working principles? Not so fast: first we need to understand the class loader system.

Before looking at the class loader system, get to know the class loaders in the JVM: BootstrapLoader, ExtClassLoader, and AppClassLoader. ExtClassLoader (responsible for loading the JARs in the JRE's lib/ext extension directory) and AppClassLoader (responsible for loading the classes and packages on the classpath) are both subclasses of the abstract class ClassLoader.

BootstrapLoader (responsible for loading the JRE core class library, such as rt.jar and charsets.jar) is the root loader. It is written in C/C++ and cannot be seen from Java code.

The three class loaders have a parent-child relationship: the root loader is the parent of ExtClassLoader, and ExtClassLoader is the parent of AppClassLoader.
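The parent chain can be inspected at runtime. The sketch below is an illustration rather than anything from the original article; LoaderChain and chainOf are hypothetical names. It walks ClassLoader.getParent() until it reaches null, which is how the root (bootstrap) loader appears from Java code.

```java
import java.util.ArrayList;
import java.util.List;

public class LoaderChain {
    // Lists the class names of the loaders from the given type's own loader up to the root.
    public static List<String> chainOf(Class<?> type) {
        List<String> names = new ArrayList<>();
        ClassLoader loader = type.getClassLoader();
        while (loader != null) {
            names.add(loader.getClass().getName());
            loader = loader.getParent();
        }
        // The root loader is native code and is represented by null in Java.
        names.add("bootstrap");
        return names;
    }

    public static void main(String[] args) {
        System.out.println("application class: " + chainOf(LoaderChain.class));
        System.out.println("java.lang.String:  " + chainOf(String.class));
    }
}
```

The exact loader names depend on the JDK version. On JDK 8 and earlier the chain for an application class typically shows AppClassLoader followed by ExtClassLoader, while newer JDKs report a platform class loader in place of ExtClassLoader. For java.lang.String only the bootstrap entry appears, since String.class.getClassLoader() returns null.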

Class loading in the JVM is also the first line of defense in the security sandbox model. Java loads classes using the parent delegation model combined with a "full responsibility" rule, meaning that the loader that loads a class is also responsible for loading the classes it references. Now let's take a look at the loading process.

When a class is to be loaded by a given class loader, that loader first delegates the request to its parent, which delegates to its own parent, and so on up to the root loader. If the class is java.lang.String, it belongs to the core class library and has already been loaded, so its Class object is returned directly. What if the root loader cannot find the class? The request is then handed back down to the next loader in the chain (the child of the loader that failed). If the class file still has not been found, the class loader specified by the user finally loads it. (The loading of superclasses is not described here, but do not overlook it.)

What if someone maliciously writes their own java.lang.String class? Will the VM be affected? No. Because of delegation, the request ends up at the root loader, and the root loader loads classes only from the JRE core class library. The Class object finally returned is the system's built-in String, not the one written by the user, so the user-written String is never loaded.
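The delegation described above can be checked with a small experiment. This is a sketch under stated assumptions, not the article's own code: RecordingLoader is a hypothetical user-defined loader that defines no classes itself and simply follows the default parent-first behavior of ClassLoader.loadClass.

```java
import java.util.ArrayList;
import java.util.List;

public class DelegationDemo {
    // Hypothetical user-defined loader: records each request, then delegates to the parent chain.
    static class RecordingLoader extends ClassLoader {
        final List<String> requested = new ArrayList<>();

        RecordingLoader(ClassLoader parent) {
            super(parent);
        }

        @Override
        protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
            requested.add(name);
            return super.loadClass(name, resolve); // default parent-first delegation
        }
    }

    // True if asking the user-defined loader for java.lang.String yields the built-in class.
    public static boolean sameStringClass() {
        RecordingLoader loader = new RecordingLoader(DelegationDemo.class.getClassLoader());
        try {
            Class<?> loaded = loader.loadClass("java.lang.String");
            // Same Class object as the built-in String, defined by the root loader (null).
            return loaded == String.class && loaded.getClassLoader() == null;
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("built-in String returned: " + sameStringClass());
    }
}
```

Even though the request starts at the user-defined loader, the Class object that comes back is the one defined by the root loader, which is why a substitute String on the classpath never takes effect.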

After learning how the class loader works, we also need to understand the class file format:

ClassFile {
    u4 magic; // magic number
    u2 minor_version; // class minor version number
    u2 major_version; // class major version number
    u2 constant_pool_count; // constant pool count
    cp_info constant_pool[constant_pool_count-1]; // constant pool
    u2 access_flags; // modifiers
    u2 this_class; // constant pool index
    u2 super_class; // constant pool index
    u2 interfaces_count;
    u2 interfaces[interfaces_count];
    u2 fields_count;
    field_info fields[fields_count];
    u2 methods_count;
    method_info methods[methods_count];
    u2 attributes_count;
    attribute_info attributes[attributes_count];
}
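The first few fields of this structure are easy to read by hand. The following sketch is an illustration only; ClassHeader and parse are hypothetical names. It decodes the first 10 bytes of a class file (magic, minor_version, major_version, constant_pool_count), and its main method tries this on the class file of java.lang.Object, assuming the runtime exposes that file as a resource.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;

public class ClassHeader {
    // Parses magic (u4), minor_version (u2), major_version (u2), constant_pool_count (u2).
    // ByteBuffer reads big-endian by default, which matches the class file byte order.
    public static long[] parse(byte[] head) {
        ByteBuffer buf = ByteBuffer.wrap(head);
        long magic = buf.getInt() & 0xFFFFFFFFL;          // treat u4 as unsigned
        int minor = buf.getShort() & 0xFFFF;              // treat u2 as unsigned
        int major = buf.getShort() & 0xFFFF;
        int constantPoolCount = buf.getShort() & 0xFFFF;
        return new long[] { magic, minor, major, constantPoolCount };
    }

    public static void main(String[] args) throws IOException {
        try (InputStream in = Object.class.getResourceAsStream("Object.class")) {
            if (in == null) {
                System.out.println("Object.class is not available as a resource here");
                return;
            }
            byte[] head = new byte[10];
            new DataInputStream(in).readFully(head);
            long[] h = parse(head);
            System.out.printf("magic=0x%X minor=%d major=%d constant_pool_count=%d%n",
                    h[0], h[1], h[2], h[3]);
        }
    }
}
```

Every valid class file starts with the magic number 0xCAFEBABE. The version numbers depend on the JDK that compiled the class; a major_version of 52, for example, corresponds to Java 8.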

There is a lot to know here, but the part we cannot skip is cp_info constant_pool, the constant pool.

A constant pool contains many kinds of tables:

CONSTANT_Utf8: a UTF-8-encoded Unicode string

CONSTANT_Integer: an int literal value

CONSTANT_Float: a float literal value

CONSTANT_Long: a long literal value

CONSTANT_Double: a double literal value

CONSTANT_Class: a symbolic reference to a class or interface

CONSTANT_String: a reference to a String literal value

CONSTANT_Fieldref: a symbolic reference to a field

CONSTANT_Methodref: a symbolic reference to a method in a class

CONSTANT_InterfaceMethodref: a symbolic reference to a method in an interface

CONSTANT_NameAndType: part of a symbolic reference to a field or method

I will not explain the structure of these tables; it does not matter if you do not know the class file in that much detail. Now that we understand the JVM architecture and the workflow of the class loaders, let's take a closer look at how the runtime data area changes and how the method area is structured.

During class loading, each class loader maintains a table in the method area that records the fully qualified names of the classes that loader has loaded. Each such table forms a namespace inside the JVM. The constant pool and other information of each loaded class are also kept in the method area.

Even so, the process described so far is still vague and leaves out many details. Now let's look at a detailed loading process.

When an ordinary class is loaded, the loadClass method of the class loader is called. If the class to be loaded is not yet in the loader's namespace, the JVM passes the fully qualified name of the type to the class loader to try to load the referenced type. The name comes from a CONSTANT_Class_info entry in the constant pool, which holds the name of the referenced class or interface. If the referencing type was defined by the JVM's built-in loader, the referenced type is loaded by that loader; otherwise it is loaded by the user-defined loader. Once the referenced type is loaded, the JVM carefully checks its binary data. If the type is a class and is not java.lang.Object, the JVM then loads its superclass by fully qualified name, applying the same procedure recursively. Superinterfaces are loaded recursively in the same way.

The loading part is now almost covered. The complete process is: loading, linking, initialization.

Linking and initialization are touched on only briefly here; the focus after that is garbage collection.

Linking consists mainly of verification (confirming that the type conforms to the semantics of the Java language and does not compromise the integrity of the virtual machine), preparation (the Java virtual machine allocates memory for class variables and sets their default initial values), and resolution (finding the symbolic references to classes, interfaces, fields, and methods in the type's constant pool and replacing them with direct references).

During initialization, if a class has a direct superclass that has not yet been initialized, the superclass is initialized first. Initializing an interface does not require initializing its superinterfaces.
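The superclass-first rule is easy to observe with static initializers. The sketch below is illustrative only; InitOrder, Parent, Child, and touchChild are hypothetical names. Reading a non-constant static field of Child is what triggers its initialization.

```java
import java.util.ArrayList;
import java.util.List;

public class InitOrder {
    // Records the order in which the static initializers run.
    static final List<String> LOG = new ArrayList<>();

    static class Parent {
        static { LOG.add("Parent initialized"); }
    }

    static class Child extends Parent {
        // Not a compile-time constant, so reading it forces Child to be initialized.
        static int counter = 0;
        static { LOG.add("Child initialized"); }
    }

    // Triggers initialization of Child (and therefore Parent) and returns the recorded order.
    public static List<String> touchChild() {
        int ignored = Child.counter;
        return new ArrayList<>(LOG);
    }

    public static void main(String[] args) {
        System.out.println(touchChild()); // prints [Parent initialized, Child initialized]
    }
}
```

Calling touchChild a second time returns the same two entries, because a class is initialized only once.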

Supplement:

When the JVM runs a method, it first pushes a frame for that method onto the Java stack; the frame holds information such as local variables. Where do objects go? The stack holds only references to objects, that is, variables. All objects are stored in the heap.

Why put objects in the heap and data such as variables in the stack? Put bluntly, objects are too big to store conveniently in the stack. (Of course, that is not the textbook answer. It is just my way of explaining it.)
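The split between references on the stack and objects in the heap shows up in ordinary code. The following is a minimal sketch with made-up names (StackVsHeap, mutateThenReassign, demo): the parameter is a local variable in the callee's frame that holds a copy of the caller's reference, while the StringBuilder itself is a single object in the heap.

```java
public class StackVsHeap {
    // 'ref' lives in this method's stack frame and holds a copy of the caller's reference.
    static void mutateThenReassign(StringBuilder ref) {
        ref.append("!");                  // changes the one heap object both frames refer to
        ref = new StringBuilder("other"); // only rebinds this frame's local variable
    }

    public static String demo() {
        StringBuilder sb = new StringBuilder("lava");
        mutateThenReassign(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints lava!
    }
}
```

The append is visible to the caller because both variables referred to the same heap object. The reassignment is not, because it only changed a local variable that was discarded when the callee's frame was popped.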

After learning about this process, we must understand the garbage collection mechanism.

Basic collection algorithms

1. Reference counting: A relatively old collection algorithm. Each new reference to an object adds one to its count, and removing a reference subtracts one. During garbage collection, only objects whose count is zero are collected. The most serious problem with this algorithm is that it cannot handle circular references.

2. Mark-sweep: This algorithm runs in two phases. In the first phase, all referenced objects are marked, starting from the reference root nodes. In the second phase, the whole heap is traversed and unmarked objects are cleared. This algorithm has to pause the entire application, and it produces memory fragmentation.

3. Copying: This algorithm divides the memory space into two equal regions and uses only one of them at a time. During garbage collection, it traverses the region in use and copies the live objects to the other region. Because the algorithm processes only live objects, the copying cost is relatively small. The objects can also be laid out contiguously as they are copied, so there is no fragmentation problem. The disadvantage is equally obvious: it needs twice the memory space.

4. Mark-compact: This algorithm combines the advantages of the mark-sweep and copying algorithms. It also runs in two phases. In the first phase, all referenced objects are marked, starting from the root nodes. In the second phase, the whole heap is traversed to clear unmarked objects and compact the surviving objects toward one end of the heap, laid out in order. This avoids both the fragmentation of mark-sweep and the space overhead of copying.

5. Incremental collection: A garbage collection approach in which collection is performed while the application is running.

6. Generational collection: A garbage collection approach derived from analyzing object lifetimes. Objects are divided into young, old, and permanent generations, and objects with different lifetimes are reclaimed with different algorithms (chosen from those above). Garbage collectors since J2SE 1.2 use this approach.
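The difference between reference counting and tracing from roots can be seen on a toy object graph. This is a simplified model for illustration, not how any real JVM collector is implemented; ToyGc, Obj, and the method names are hypothetical. Object a is referenced from a root, while b and c reference only each other.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ToyGc {
    static class Obj {
        final String name;
        final List<Obj> refs = new ArrayList<>();
        int refCount; // maintained only for the reference-counting view
        Obj(String name) { this.name = name; }
    }

    static void addRef(Obj from, Obj to) {
        from.refs.add(to);
        to.refCount++;
    }

    // Reference counting: only objects whose count is zero are considered garbage.
    static List<String> garbageByRefCount(List<Obj> heap) {
        List<String> dead = new ArrayList<>();
        for (Obj o : heap) {
            if (o.refCount == 0) dead.add(o.name);
        }
        return dead;
    }

    // Mark phase: mark everything reachable from the given object.
    static void mark(Obj o, Set<Obj> marked) {
        if (marked.add(o)) {
            for (Obj r : o.refs) mark(r, marked);
        }
    }

    // Sweep phase: walk the whole heap and report every unmarked object.
    static List<String> garbageByMarkSweep(List<Obj> heap, List<Obj> roots) {
        Set<Obj> marked = new HashSet<>();
        for (Obj root : roots) mark(root, marked);
        List<String> dead = new ArrayList<>();
        for (Obj o : heap) {
            if (!marked.contains(o)) dead.add(o.name);
        }
        return dead;
    }

    // Returns { garbage found by reference counting, garbage found by mark-sweep }.
    public static List<List<String>> demo() {
        Obj a = new Obj("a"), b = new Obj("b"), c = new Obj("c");
        List<Obj> heap = Arrays.asList(a, b, c);
        a.refCount = 1; // a is referenced from a root, such as a local variable
        addRef(b, c);   // b and c form a cycle that no root can reach
        addRef(c, b);
        List<List<String>> result = new ArrayList<>();
        result.add(garbageByRefCount(heap));
        result.add(garbageByMarkSweep(heap, Arrays.asList(a)));
        return result;
    }

    public static void main(String[] args) {
        List<List<String>> r = demo();
        System.out.println("reference counting finds: " + r.get(0)); // prints []
        System.out.println("mark-sweep finds:         " + r.get(1)); // prints [b, c]
    }
}
```

Reference counting reports nothing because b and c each keep the other's count at one, which is the circular-reference problem from item 1. Mark-sweep finds both because neither is reachable from a root. The copying and mark-compact algorithms use the same marking idea and differ in what they do with the survivors.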

This article is an English version of an article originally written in Chinese on aliyun.com and is provided for information purposes only.
