In-Depth Understanding of the Java Virtual Machine: Notes

Last Update:2017-12-25 Source: Internet

Author: User



I recently started reading the book "In-Depth Understanding of the Java Virtual Machine" and came across this article, so I am sharing it with everyone. It writes out some of the book's important points in an orderly way, which makes the overall structure much clearer to the reader.

In C, if we want to execute a piece of our own machine instructions, the code looks roughly like this:

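typedef void (*FUNC)(int);
char *str = "your code";   /* the bytes of the machine instructions */
FUNC f = (FUNC)str;        /* treat those bytes as a function */
(*f)(0);                   /* jump into them */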

In other words, we can build a tool that reads instructions from a file and then runs them. The machine instructions in the code above must, of course, be ones that can run directly on the CPU. If I also implement a translator that maps instructions in a format of my own definition to CPU instructions, then I can execute code written in that custom format. So isn't the code above equivalent to the simplest possible virtual machine? The overall structure of the JVM is as follows:
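The "translator for a custom instruction format" idea can be sketched in a few lines of Java. The opcodes below (PUSH, ADD, MUL, HALT) are invented for this illustration and are not real JVM bytecode; the sketch only shows the fetch, decode, execute loop that such an interpreter is built around.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** A toy stack-based interpreter for a made-up instruction format. */
class MiniVm {
    // Hypothetical opcodes, chosen only for this example.
    static final int HALT = 0; // stop and return the top of the stack
    static final int PUSH = 1; // push the next value in the program
    static final int ADD  = 2; // pop two values, push their sum
    static final int MUL  = 3; // pop two values, push their product

    static int run(int[] program) {
        Deque<Integer> stack = new ArrayDeque<>();
        int pc = 0; // program counter: index of the next instruction
        while (true) {
            int op = program[pc++];
            switch (op) {
                case PUSH: stack.push(program[pc++]); break;
                case ADD:  stack.push(stack.pop() + stack.pop()); break;
                case MUL:  stack.push(stack.pop() * stack.pop()); break;
                case HALT: return stack.pop();
                default:   throw new IllegalStateException("unknown opcode " + op);
            }
        }
    }

    public static void main(String[] args) {
        // Encodes (2 + 3) * 4 in the custom format.
        int[] program = {PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL, HALT};
        System.out.println(run(program)); // prints 20
    }
}
```

A real JVM differs in almost every detail (typed instructions, frames, a verifier, a JIT), but the operand-stack style of evaluation is the same.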

The role of the ClassLoader is to load instructions that the JVM can recognize (not only from files on disk; the source can also be memory, for example), so let's look at the format of those instructions first:

The magic number and the version need no explanation (practically every file format has them). They are followed by the constant pool, which holds just two kinds of things:

Literals (such as integers, longs, and strings);
Symbolic references (where is the method, and what is its type?);
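As a small illustration of the layout just described, the start of a class file can be read with a DataInputStream: a 4-byte magic number, two 2-byte version fields, and then the 2-byte constant pool count. The class and method names below are made up for this sketch, and it stops before parsing the constant pool entries themselves.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

class ClassHeader {
    /** Returns {minor_version, major_version, constant_pool_count}. */
    static int[] read(InputStream in) {
        try {
            DataInputStream data = new DataInputStream(in);
            if (data.readInt() != 0xCAFEBABE) {
                throw new IllegalArgumentException("not a class file: bad magic number");
            }
            int minor = data.readUnsignedShort();
            int major = data.readUnsignedShort();
            // The count is the number of constant pool entries plus one.
            int constantPoolCount = data.readUnsignedShort();
            return new int[] {minor, major, constantPoolCount};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        // Try to inspect this class's own class file, if it is reachable as a resource.
        try (InputStream in = ClassHeader.class.getResourceAsStream("ClassHeader.class")) {
            if (in == null) {
                System.out.println("class file not found as a resource");
                return;
            }
            int[] header = read(in);
            System.out.println("version " + header[1] + "." + header[0]
                    + ", constant_pool_count " + header[2]);
        }
    }
}
```

The printed version depends on the compiler that produced the class file (major version 52 corresponds to Java 8, for example).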

We know that the JVM looks classes up by their fully qualified names, so the description of a method should work the same way. The relationships among these constants are as follows:

Next comes access_flags, which indicates whether the class is public, final, abstract, and so on. this_class, super_class, and interfaces literally mean "this class", "the class it inherits from", and "the interfaces it implements". In fact, each simply stores the index (u2) of the CONSTANT_Class_info entry that represents this information.

name_index and descriptor_index together look a lot like a CONSTANT_NameAndType_info entry, so why not just store a single NameAndType index? The biggest difference between method_info and field_info lies in their attributes. For example, the attribute table of a field_info holds the variable's initial value, while the attribute table of a method_info holds the bytecode. So let's look at these attributes in turn, starting with Code:


A few points here are interesting:

The class file tells you the maximum depth of the operand stack during execution;
For non-static methods, the compiler passes this to the method as an implicit parameter;
The ranges recorded in the exception table are offsets of bytecode instructions (not source line numbers);
The exception table here refers to try-catch, while the Exceptions attribute, which sits at the same level as the Code attribute, refers to the exceptions declared with throws;

The Exceptions attribute is very simple:

LineNumberTable records the mapping between bytecode offsets and source line numbers, with the following structure:

LocalVariableTable describes the relationship between the variables in a stack frame's local variable table and the variables defined in the source code, as follows:

SourceFile records the name of the Java source file from which the class file was generated (for example, when many classes are declared in one Java file, many class files are generated), with the following structure:

The Deprecated and Synthetic attributes are boolean; the only distinction is between "present" and "absent":

Deprecated: the program's author no longer recommends using the item; it is set through the @Deprecated annotation;
Synthetic: indicates that a field or method does not appear in the source code and was generated automatically by the compiler, such as a bridge method (<init> and <clinit> are exceptions and are not marked);
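Whether a member is marked as compiler-generated can be observed through reflection. The sketch below assumes a javac-style compiler: for a class implementing Comparable<Version>, the compiler emits a bridge method compareTo(Object) that is flagged synthetic, while the constructor written in the source is not. The class names are made up for this illustration.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Method;

class SyntheticDemo {
    /** The generic interface forces the compiler to generate a bridge method. */
    static class Version implements Comparable<Version> {
        final int number;

        Version(int number) {
            this.number = number;
        }

        @Override
        public int compareTo(Version other) {
            return Integer.compare(number, other.number);
        }
    }

    static int countSyntheticMethods(Class<?> type) {
        int count = 0;
        for (Method m : type.getDeclaredMethods()) {
            if (m.isSynthetic()) {
                count++;
            }
        }
        return count;
    }

    static boolean hasSyntheticConstructor(Class<?> type) {
        for (Constructor<?> c : type.getDeclaredConstructors()) {
            if (c.isSynthetic()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        for (Method m : Version.class.getDeclaredMethods()) {
            System.out.println(m + " synthetic=" + m.isSynthetic());
        }
        System.out.println("synthetic constructor: " + hasSyntheticConstructor(Version.class));
    }
}
```

Running javap -v on the compiled class shows the same information as flags on the bridge method.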

So why does the Code attribute itself have an attribute table after it? (Attributes such as LineNumberTable and LocalVariableTable are nested inside Code.)

When a class gets loaded is simple: it is loaded when it is used (obviously!). Take a look at the class loading process:

The process above is carried out by the ClassLoader, which is very important: the JVM decides whether two classes are the same by the ClassLoader together with the class itself. In other words, if different ClassLoaders load the same class file, the JVM treats the resulting classes as different. Sometimes the byte stream is not loaded from a class file (Java applets, for example, are loaded from the network), so the logic of such a class loader necessarily differs from the ordinary implementation; using different ClassLoaders solves this problem.
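The point that the same bytes loaded by two loaders yield two distinct classes can be checked directly. To stay self-contained, the sketch below builds the bytes of a minimal, empty class by hand (following the class file layout discussed earlier) instead of reading a file; the class name HandMadeDemo and the BytesLoader helper are made up for this illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class LoaderIdentity {
    /** A loader that defines a class straight from a byte array. */
    static class BytesLoader extends ClassLoader {
        Class<?> define(String name, byte[] bytes) {
            return defineClass(name, bytes, 0, bytes.length);
        }
    }

    /** Builds a class file for an empty public class extending java.lang.Object. */
    static byte[] emptyClassBytes(String internalName) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buffer);
            out.writeInt(0xCAFEBABE);                           // magic
            out.writeShort(0);                                  // minor_version
            out.writeShort(52);                                 // major_version (Java 8)
            out.writeShort(5);                                  // constant_pool_count (entries + 1)
            out.writeByte(7); out.writeShort(2);                // #1 CONSTANT_Class -> #2
            out.writeByte(1); out.writeUTF(internalName);       // #2 CONSTANT_Utf8
            out.writeByte(7); out.writeShort(4);                // #3 CONSTANT_Class -> #4
            out.writeByte(1); out.writeUTF("java/lang/Object"); // #4 CONSTANT_Utf8
            out.writeShort(0x0021);                             // access_flags: ACC_PUBLIC | ACC_SUPER
            out.writeShort(1);                                  // this_class
            out.writeShort(3);                                  // super_class
            out.writeShort(0);                                  // interfaces_count
            out.writeShort(0);                                  // fields_count
            out.writeShort(0);                                  // methods_count
            out.writeShort(0);                                  // attributes_count
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Loads the same bytes with two loaders and reports whether the classes are identical. */
    static boolean sameClassFromTwoLoaders() {
        byte[] bytes = emptyClassBytes("HandMadeDemo");
        Class<?> first = new BytesLoader().define("HandMadeDemo", bytes);
        Class<?> second = new BytesLoader().define("HandMadeDemo", bytes);
        return first == second;
    }

    public static void main(String[] args) {
        System.out.println(sameClassFromTwoLoaders()); // prints false
    }
}
```

Both Class objects report the same name, yet they are different types as far as the JVM is concerned, so an instance of one cannot be cast to the other.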

But allowing different ClassLoaders also raises a new question: what if I declare my own java.lang.Integer, and the code inside it is dangerous? This leads to the parent delegation model:

Apart from the top-level bootstrap ClassLoader, every ClassLoader should have a parent class loader (the relationship is implemented by composition rather than inheritance). When a loader receives a request to load a class, it first delegates the request to its parent ClassLoader, and tries to load the class itself only if the parent cannot.

This way, java.lang.Integer is always loaded by the system's class loader first, so a version written by the user is never loaded. From a Java programmer's point of view, there are three system-provided ClassLoaders:

Bootstrap ClassLoader: loads the class libraries in the <JAVA_HOME>\lib directory; it cannot be referenced directly by Java programs;
Extension ClassLoader: loads <JAVA_HOME>\lib\ext; developers can use it directly;
Application ClassLoader: loads the class libraries specified on the CLASSPATH; it is the default loader when the application has not defined its own;
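The loader hierarchy can be inspected at run time by following getParent(). This is only a sketch: the concrete loader class names it prints depend on the JDK (on JDK 9 and later, for instance, the extension loader was replaced by a platform loader), and the bootstrap loader is represented as null rather than as an object.

```java
import java.util.ArrayList;
import java.util.List;

class LoaderChain {
    /** Walks from the given loader up through its parents; bootstrap appears as null. */
    static List<String> parents(ClassLoader loader) {
        List<String> names = new ArrayList<>();
        for (ClassLoader current = loader; current != null; current = current.getParent()) {
            names.add(current.getClass().getName());
        }
        names.add("bootstrap (represented as null)");
        return names;
    }

    public static void main(String[] args) {
        // The chain that application classes are normally loaded through.
        for (String name : parents(ClassLoader.getSystemClassLoader())) {
            System.out.println(name);
        }
        // Core classes such as Integer come from the bootstrap loader.
        System.out.println(Integer.class.getClassLoader()); // prints null
    }
}
```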

By default, a class is loaded by the Application ClassLoader, and if it turns out to use a new type, that type is in turn loaded recursively by the same loader (as mentioned in the loading process above). Consequently, a custom class loader is used only where your own program uses it explicitly, and the classes it loads are not used by anyone else.

The parent delegation model is not a mandatory constraint, but rather a class loader implementation approach that Java's designers recommend to developers. The model has been "broken" three times:

To stay compatible with JDK 1.0-era code, which overrode loadClass(), users are advised to override findClass() instead;
Code in the base classes sometimes needs to call back into user classes (as with JNDI); this is solved with the thread context class loader;
Users' demand for dynamism, such as HotSwap and OSGi;

After loading completes, the next step is to see how the program runs. Stack frames are the data structure that supports method invocation and execution in the virtual machine: a frame is the unit that is pushed onto the stack whenever a method is called. Its structure is as follows:

Once the source has been compiled into a class file, the number of local variable slots needed at run time is already fixed (as seen in the class file earlier). Note that this can affect GC behavior (not detailed here). On the stack, a lower frame always calls the frame above it (the two are necessarily adjacent), and parameters and return values are usually passed by pushing them onto the operand stack. To make this more efficient, some virtual machines let adjacent stack frames overlap:

Next we look at how a method is executed, and the first question is: which method gets executed? This is not an issue in procedural programming, but in Java or C++ it is a real headache. Code is not usually written in a way that raises the question, but you still have to understand it. The JVM has two ways of determining the target method:

Static dispatch: determines which method to invoke from the method name and the static types of the arguments. A missing exact match does not necessarily produce an error: given func(int a), the call func('a') invokes it (provided there is no func(char a), of course), so the lookup feels a bit like a chain of fallbacks. However complex this gets, it is all decided at compile time, since it is just a lookup.
Dynamic dispatch: the most common case is Interface a = new Implementation(); which class's method a call on a should run cannot be determined at compile time. Dynamic dispatch is actually simple to implement: get the actual type of the object when the method is called.
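Both kinds of dispatch can be seen side by side in a short program. The names below (func, describe, Animal, Dog) are made up for this sketch: overloads are chosen from the static types of the arguments at compile time, while the overriding method is chosen from the object's actual type at run time.

```java
class DispatchDemo {
    // Two overloads; there is deliberately no func(char).
    static String func(int a) { return "func(int)"; }
    static String func(Object a) { return "func(Object)"; }

    static class Animal {
        String sound() { return "..."; }
    }

    static class Dog extends Animal {
        @Override
        String sound() { return "woof"; }
    }

    // Overloads that differ only in the declared parameter type.
    static String describe(Animal a) { return "describe(Animal)"; }
    static String describe(Dog d) { return "describe(Dog)"; }

    public static void main(String[] args) {
        // Static dispatch: char widens to int before boxing is even considered.
        System.out.println(func('a'));      // prints func(int)

        Animal pet = new Dog();
        // Static dispatch: the declared type Animal selects the overload.
        System.out.println(describe(pet));  // prints describe(Animal)
        // Dynamic dispatch: the actual type Dog selects the override.
        System.out.println(pet.sound());    // prints woof
    }
}
```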

In fact, "static" and "dynamic" still feel rather vague. Static dispatch feels like looking a method up from the declared types of the arguments, while dynamic dispatch feels like looking it up from the actual type of the instance. A common virtual machine optimization for dynamic dispatch is to create a virtual method table for each class in the method area:

The virtual method table holds the actual entry address of each method. If a method is not overridden in the subclass, the entry in the subclass's virtual method table is identical to the parent's entry for the same method, and both point to the parent's implementation. If the subclass overrides the method, the entry in the subclass's table is replaced with the entry address of the subclass's implementation. Put simply, it is a form of precomputation.

Executing a single method is simple: write a small program, run javap -c on it, and combine the output with the meaning of each instruction to see how the program executes and returns (it is largely stack-based). We will not go into detail here.

In general, getting from a Java file to a running program takes two stages: compiling the Java source to a class file, and executing the class file. The first stage is compilation, and its most interesting part is syntactic sugar (lexical analysis, syntax analysis, and the rest are skipped here). Java's syntactic sugar includes the for-each loop, autoboxing, generics, and so on (String + String is in fact also syntactic sugar; compilers have traditionally turned it into StringBuffer or StringBuilder append calls). The most interesting of these is generics:

Generics in Java work differently from templates in C++: in C++, list<A> and list<B> are two distinct types, whereas in Java List<A> and List<B> are both just List<Object>. Because Object is the parent of all classes in Java, an Object reference can point to any object, so a List<Object> can hold objects of any type (the implementation feels a bit crude).

One consequence is type erasure, as in the following code:

static void func(List<Integer> a) {
    return;
}

When you use javap to view the generated class, you will find:

static void func(java.util.List);
  Signature: (Ljava/util/List;)V
  Code:
   0:   return

There is no trace of Integer at all. But if you add a return value, that is:

static Integer func(List<Integer> a) {
    return null;
}

When you look at it again, it becomes:

static java.lang.Integer func(java.util.List);
  Signature: (Ljava/util/List;)Ljava/lang/Integer;
  Code:
   0:   aconst_null
   1:   areturn

Understanding how generics are implemented explains many problems met in practice, such as an inexplicable ClassCastException when using a List.
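The effect of erasure, and the kind of cast error it leads to, can be reproduced in a few lines. This is a sketch with made-up method names: the raw-type assignment is the usual way such a list gets "polluted", and the failure only appears later, at the cast the compiler inserts when an element is read back as an Integer.

```java
import java.util.ArrayList;
import java.util.List;

class ErasureDemo {
    /** After erasure both lists have the same runtime class. */
    static boolean sameRuntimeClass() {
        List<Integer> ints = new ArrayList<>();
        List<String> strings = new ArrayList<>();
        return ints.getClass() == strings.getClass();
    }

    /** Returns true if reading a polluted list fails with ClassCastException. */
    @SuppressWarnings({"unchecked", "rawtypes"})
    static boolean castFailsOnRead() {
        List<Integer> ints = new ArrayList<>();
        List raw = ints;            // raw type: element types are no longer checked
        raw.add("not a number");    // succeeds, the list really stores Objects
        try {
            Integer first = ints.get(0); // the compiler-inserted cast fails here
            return first == null;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(sameRuntimeClass()); // prints true
        System.out.println(castFailsOnRead());  // prints true
    }
}
```

Note that the add call is where the mistake is made, but the exception is thrown at the later get call, which is what makes such errors look inexplicable.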

Now for the second stage, the actual execution of the class file. Two concepts often mentioned in C++ are debug and release, while two often mentioned in Java are server and client (although the two distinctions are drawn on completely different grounds). The client and server modes correspond to two compilers:

Client mode uses the C1 compiler: it compiles the bytecode with quick, reliable optimizations, adding performance monitoring where necessary.
Server mode uses the C2 compiler: it compiles the bytecode with optimizations that take longer, and may also apply some aggressive, unreliable optimizations based on the performance monitoring results.

When the monitoring finds hot code (a method that is called many times, or a loop body that is executed many times), it submits a compilation request for that method to the just-in-time (JIT) compiler. The next time the method is called, the JVM checks whether a JIT-compiled version exists and, if so, runs the compiled native code in preference. By default, compilation to native code runs in the background, in parallel with the old code (that is, the interpreted bytecode). You can disable background compilation with -XX:-BackgroundCompilation, in which case the executing thread waits until compilation finishes and then runs the generated native code.

Compilation includes one particularly fun optimization, escape analysis (an object "escapes" when it can be referenced from outside the method). Objects that do not escape can be optimized as follows:

Allocating the object on the stack reduces pressure on the GC (stack allocation);
There is no need to synchronize threads on an object that does not escape (synchronization elimination);
If an object cannot escape, the method need not create the object at all and can instead create only the "parts" (fields) it uses (scalar replacement);
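The difference between an object that escapes and one that does not is easy to show in source form. This is only a sketch with made-up names: whether a particular JIT actually applies any of the optimizations above to noEscape is up to the virtual machine and cannot be observed from the return value.

```java
class EscapeDemo {
    static final class Point {
        final int x;
        final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }
    }

    static Point lastPoint; // a place through which objects can escape

    /** The Point never leaves this method, so it is a candidate for the optimizations above. */
    static int noEscape(int x, int y) {
        Point p = new Point(x, y);
        return p.x + p.y;
    }

    /** Storing the Point in a static field lets other code reach it: it escapes. */
    static int escapes(int x, int y) {
        Point p = new Point(x, y);
        lastPoint = p;
        return p.x + p.y;
    }

    public static void main(String[] args) {
        System.out.println(noEscape(1, 2)); // prints 3
        System.out.println(escapes(1, 2));  // prints 3
    }
}
```

In noEscape, a compiler that proves p does not escape could in principle keep just the two int values and never allocate a Point on the heap.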

As for the question of Java versus C++ efficiency, arguing about it is pointless: every language must ultimately produce machine instructions, and differences in language mechanisms lead to differences in how those instructions get generated. But that generation process has nothing to do with us programmers (more precisely, we do not know how it actually proceeds), so do not argue about which is more efficient (or even which is better) before figuring that out.

Program concurrency is mainly about the problems that can arise when different threads operate on the same piece of memory (things like file locks are left aside here). First, understand the relationship between threads and memory:

Main memory here is like the RAM stick, and working memory is like registers plus cache. The Java memory model defines 8 operations (lock, unlock, read, load, use, assign, store, and write), which are executed as follows:

The most lightweight synchronization mechanism in the Java virtual machine is volatile, which has the following properties:

When the variable changes, the new value is immediately visible to other threads;
Instruction reordering optimizations are prohibited;
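The visibility property is what makes the common "stop flag" pattern work. In this sketch (the class and method names are made up) a worker spins on a volatile field and is guaranteed to observe the write made by another thread; without volatile, the loop would not be guaranteed to ever see the change.

```java
class VolatileFlag {
    private static volatile boolean running;

    /** Starts a spinning worker, clears the flag, and reports whether the worker stopped. */
    static boolean workerStopsAfterFlagCleared() {
        running = true;
        Thread worker = new Thread(() -> {
            while (running) {
                // busy loop: every iteration re-reads the volatile field
            }
        });
        worker.setDaemon(true); // do not keep the JVM alive if something goes wrong
        worker.start();
        try {
            Thread.sleep(50);   // let the worker spin for a moment
            running = false;    // volatile write, visible to the worker
            worker.join(5000);  // wait (up to 5 s) for the worker to finish
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !worker.isAlive();
    }

    public static void main(String[] args) {
        System.out.println(workerStopsAfterFlagCleared()); // prints true
    }
}
```

Note that volatile gives visibility and ordering only; a compound action such as count++ on a volatile field is still not atomic.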

From the viewpoint of the Java memory model operations, the implementation of volatile is quite simple: a load must come immediately before each use, and a store must come immediately after each assign. This guarantees that every read comes from main memory and that every assignment is synchronized back to main memory (which sounds like stating the obvious). Thread synchronization is mainly considered from three aspects:

Atomicity: long and double require special consideration;
Visibility: besides volatile, synchronized and final also provide it;
Ordering: instruction reordering, which can of course be prohibited;

Having to think about synchronization at every moment would make coding exhausting. The following are the built-in happens-before relationships of the Java memory model:

Within a thread, operations happen in control-flow order, consistent with the order of the code;
An unlock happens-before a later lock operation on the same lock;
A write to a volatile variable happens-before a later read of that variable;
A thread's start() method happens-before every action of that thread;
All actions of a thread happen-before the detection of that thread's termination;
A call to a thread's interrupt() method happens-before the interrupted thread's code detects the interrupt;
Completion of an object's initialization happens-before the start of its finalize() method;
Transitivity;
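Two of these rules, the start() rule and the termination-detection rule (here via join()), are enough to pass data between threads safely without any volatile or lock. The sketch below uses made-up names and a deliberately plain, non-volatile field.

```java
class HappensBeforeDemo {
    static int shared; // deliberately neither volatile nor guarded by a lock

    /** Returns what the child saw (tens digit) and what the parent saw afterwards (units digit). */
    static int viaStartAndJoin() {
        shared = 1;                    // written before start(): visible to the child
        int[] seenByChild = new int[1];
        Thread child = new Thread(() -> {
            seenByChild[0] = shared;   // guaranteed to read 1
            shared = 2;                // written before the thread terminates
        });
        child.start();
        try {
            child.join();              // termination detection: the child's writes are now visible
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seenByChild[0] * 10 + shared;
    }

    public static void main(String[] args) {
        System.out.println(viaStartAndJoin()); // prints 12
    }
}
```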

These eight rules are quite interesting: what would happen if one of them did not hold? And are Java threads user-level threads after all, or something else (the same question nags in C)? There are several main ways to implement threads:

Kernel threads, used through a lightweight process as a proxy;
Entirely in user space, invisible to the kernel;
A hybrid of user-level and kernel-level implementation, each doing what it is good at;

We will not dig deeper here (admittedly, this introduction says next to nothing). It is easy to see that different virtual machines on different operating systems are unlikely to implement threads the same way; if you want to dig deeper, pthread is more interesting. Other thread topics, such as state transitions, are not discussed here.

Thread safety: an object is thread-safe if, when multiple threads access it, calling it produces the correct result without the caller having to consider how those threads are scheduled and interleaved by the runtime, add extra synchronization, or perform any other coordination.

Data shared among Java threads can be divided into the following five categories:

Immutable: needs no explanation (and does not necessarily have to be declared final);
Absolutely thread-safe: fully satisfies the definition of thread safety above;
Relatively thread-safe: put simply, each individual operation on the object is safe on its own;
Thread-compatible: the object itself is not thread-safe, but callers can use it safely by adding synchronization;
Thread-hostile: cannot be used in a multithreaded environment no matter what the caller does;
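ArrayList is a typical thread-compatible class: it does no locking of its own, but it becomes safe to share once every caller synchronizes on a common lock. The sketch below (with made-up names) uses the list itself as that lock.

```java
import java.util.ArrayList;
import java.util.List;

class CallerSync {
    /** Fills one ArrayList from several threads, with synchronization supplied by the callers. */
    static int fill(int threads, int perThread) {
        List<Integer> list = new ArrayList<>(); // thread-compatible: no internal locking
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    synchronized (list) {       // the caller provides the synchronization
                        list.add(j);
                    }
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return list.size();
    }

    public static void main(String[] args) {
        System.out.println(fill(4, 1000)); // prints 4000
    }
}
```

Without the synchronized block, the same code could lose elements or throw an exception, because concurrent add calls would corrupt the list's internal state.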

There are several ways to implement synchronization:

Mutual-exclusion (blocking) synchronization: needless to say, the waiting thread keeps waiting;
Non-blocking synchronization: the optimistic approach (conflicts are rarer than we think);
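Non-blocking synchronization is usually built on compare-and-set (CAS): read the current value, compute the new one, and write it only if nobody else changed the value in the meantime, retrying on conflict. The counter below is a sketch of that retry loop using AtomicInteger from the standard library; the class name is made up, and in practice AtomicInteger.incrementAndGet() already does this.

```java
import java.util.concurrent.atomic.AtomicInteger;

class CasCounter {
    private final AtomicInteger value = new AtomicInteger();

    /** Optimistic increment: retry until the compare-and-set succeeds. */
    int increment() {
        while (true) {
            int current = value.get();
            int next = current + 1;
            if (value.compareAndSet(current, next)) {
                return next;
            }
            // Another thread updated the value first: loop and try again.
        }
    }

    int get() {
        return value.get();
    }

    /** Increments one counter from several threads and returns the final value. */
    static int countWithThreads(int threads, int perThread) {
        CasCounter counter = new CasCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    counter.increment();
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(countWithThreads(4, 10000)); // prints 40000
    }
}
```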

Spin locks are a good choice when threads would otherwise be switched very often, because spinning avoids the system-call overhead of a thread switch. If a task completes quickly, it may be better to lock the whole process once (rather than locking each sub-step). Other lock optimizations include lightweight locks and biased locks.
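A spin lock can be sketched at the application level with an AtomicBoolean: a thread that fails to acquire the lock keeps retrying instead of being suspended. This only illustrates the idea with made-up names; it is not how the JVM implements spinning for synchronized internally, and the Thread.yield() call is a politeness hint added here so the sketch behaves reasonably on machines with few cores.

```java
import java.util.concurrent.atomic.AtomicBoolean;

class SpinLockDemo {
    static final class SpinLock {
        private final AtomicBoolean held = new AtomicBoolean(false);

        void lock() {
            // Spin until this thread flips the flag from false to true.
            while (!held.compareAndSet(false, true)) {
                Thread.yield();
            }
        }

        void unlock() {
            held.set(false);
        }
    }

    /** Increments a plain int from several threads, protected only by the spin lock. */
    static int count(int threads, int perThread) {
        SpinLock lock = new SpinLock();
        int[] counter = new int[1];
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    lock.lock();
                    try {
                        counter[0]++; // very short critical section: a good fit for spinning
                    } finally {
                        lock.unlock();
                    }
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter[0];
    }

    public static void main(String[] args) {
        System.out.println(count(4, 10000)); // prints 40000
    }
}
```

Spinning pays off only when the critical section is short; if the lock is held for a long time, the waiting threads burn CPU for nothing and a blocking lock is the better choice.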


This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
