A First Look at the JVM
Introduction to the JVM
Understanding the JVM is not a requirement for developing or running Java programs, but knowing how it works helps you avoid many performance problems.
The Java Virtual Machine (JVM) is the runtime environment for Java applications. In the strict sense, the JVM is defined by a specification for a virtual computer designed to interpret and execute the bytecode compiled from Java source code. More commonly, "the JVM" refers to a concrete implementation of that specification, built on a strict instruction set and a comprehensive memory model. The JVM is also often described as a software implementation of a runtime environment. In practice, when people say "the JVM" they usually mean HotSpot.
The JVM specification guarantees that every implementation interprets and executes bytecode in the same way. Implementations can take many forms, including a process, a stand-alone Java operating system, or a processor chip that executes bytecode directly. The JVMs most of us know are implemented as software and run on popular operating systems (including Windows, Linux, and Solaris).
The structure of the JVM allows fine-grained control over what a Java application may do. Applications run in a sandbox, which ensures that the local file system, processor, and network connections cannot be accessed without the proper permissions. When code is executed remotely, it must also be authenticated with a certificate.
In addition to interpreting Java bytecode, most JVM implementations include a JIT (just-in-time) compiler that generates machine code for frequently used methods. Machine code is the native language of the CPU and runs faster than interpreted bytecode.
JVM Physical Structure
The physical structure of the JVM is made up of the following areas. Let's take a look at each:
Method area
Also known as the "permanent generation" or "Non-Heap", the method area stores the class information loaded by the virtual machine, constants, and static variables, and is shared by all threads. On JVMs that still have a permanent generation, the default minimum size is 16 MB and the maximum is 64 MB, and the size can be limited with the -XX:PermSize and -XX:MaxPermSize parameters.
Runtime constant pool: part of the method area. Besides the descriptions of the class version, fields, methods, and interfaces, a class file also contains a constant pool that holds the various symbolic references generated by the compiler. This content is placed in the runtime constant pool of the method area after the class is loaded.
Java stack
The Java stack describes the memory model of Java method execution: each time a method is invoked, a "stack frame" is created to store information such as the local variable table (including parameters), the operand stack, and the method exit. The span from a method being called to its completion corresponds to a stack frame being pushed onto and then popped off the virtual machine stack. The stack's life cycle is the same as its thread's, and it is thread-private.
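To make the "one frame per call" idea a bit more concrete, here is a minimal sketch. The class and method names (StackFrameDemo, measureMaxDepth) are made up for illustration. It simply recurses until the thread's stack is exhausted and reports how many frames were pushed; the exact number depends on the JVM, the platform, and the configured thread stack size, so no particular value should be expected.

```java
public class StackFrameDemo {
    // Number of nested calls made so far on the current thread.
    private static int depth = 0;

    private static void recurse() {
        depth++;     // a new stack frame for this call is now on the stack
        recurse();   // every nested call pushes one more frame
    }

    /** Recurses until the stack overflows and returns the depth that was reached. */
    public static int measureMaxDepth() {
        depth = 0;
        try {
            recurse();
        } catch (StackOverflowError e) {
            // While the error propagates, all frames pushed by recurse() are popped again.
        }
        return depth;
    }

    public static void main(String[] args) {
        System.out.println("Frames pushed before overflow: " + measureMaxDepth());
    }
}
```

Because each thread has its own private stack, running the same measurement on another thread would exhaust only that thread's stack.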
The local variable table holds the primitive data types and object references (reference pointers, not the objects themselves) that are known at compile time. The 64-bit long and double types occupy 2 local variable slots; all other types occupy 1. The memory required by the local variable table is allocated at compile time: when a method is entered, the amount of local variable space it needs in the stack frame is fully determined, and the size of the local variable table does not change while the method runs.
Native method stack
Similar to the virtual machine stack. The difference is that the virtual machine stack serves the execution of Java methods, while the native method stack serves native methods.
Java heap
Also called the GC heap, the Java heap is the largest block of memory managed by the Java virtual machine. It is shared by all threads and is created when the JVM starts. It holds object instances and arrays (everything created with new). Its size is set with the -Xms (minimum) and -Xmx (maximum) parameters. -Xms is the minimum memory the JVM requests at startup; it defaults to 1/64 of the operating system's physical memory and is never more than the maximum. -Xmx is the maximum amount of memory the JVM can request, which defaults to 1/4 of physical memory but less than 1 GB. By default, when free heap memory falls below 40%, the JVM grows the heap, up to the size given by -Xmx; this ratio can be set with -XX:MinHeapFreeRatio=. When free heap memory exceeds 70%, the JVM shrinks the heap, down to the size given by -Xms; this ratio can be set with -XX:MaxHeapFreeRatio=. For a running system, -Xms is usually set to the same value as -Xmx to avoid frequent heap resizing at run time.
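The effect of these settings can be observed from inside a program. The following is a small sketch using the standard Runtime methods maxMemory(), totalMemory(), and freeMemory(); the class name HeapSizeDemo and the helper heapSnapshot are hypothetical. The printed numbers depend entirely on the machine and on the options the JVM was started with.

```java
public class HeapSizeDemo {
    /** Returns {max, total, free} heap sizes in bytes as reported by the running JVM. */
    public static long[] heapSnapshot() {
        Runtime rt = Runtime.getRuntime();
        return new long[] { rt.maxMemory(), rt.totalMemory(), rt.freeMemory() };
    }

    public static void main(String[] args) {
        long[] s = heapSnapshot();
        long mb = 1024L * 1024L;
        // maxMemory corresponds to the upper limit of the heap (compare with -Xmx).
        System.out.println("max   : " + s[0] / mb + " MB");
        // totalMemory is what the JVM has currently reserved for the heap.
        System.out.println("total : " + s[1] / mb + " MB");
        // freeMemory is the unused part of the currently reserved heap.
        System.out.println("free  : " + s[2] / mb + " MB");
    }
}
```

Running it once with default settings and once with, for example, equal -Xms and -Xmx values should show "total" starting out at the configured size instead of growing over time.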
Because collectors now use generational collection algorithms, the heap is divided into a young generation and an old generation. The young generation mainly holds newly created objects and objects that have not yet been promoted to the old generation. The old generation holds objects that are still alive after multiple young-generation collections (minor GCs).
Memory Model
The Java memory model is built on the concept of automatic memory management. When an object is no longer referenced by the application, the garbage collector reclaims it and frees the corresponding memory. This is very different from many other languages, in which the programmer has to free memory manually.
The JVM allocates memory from the underlying operating system and divides it into the following areas:
Heap space: a shared memory area for storing objects, which can be reclaimed by the garbage collector.
Method area: formerly known as the "permanent generation" (Permanent Generation), this area was used to store loaded classes. It has been removed from recent JVMs (since Java 8); loaded classes are now stored as metadata in the native memory of the underlying operating system.
Native area: this area is used to store references and variables of primitive types.
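To see how a particular JVM actually names and splits these areas, the standard java.lang.management API can list its memory pools. This is only an illustrative sketch (the class name MemoryPoolDemo and the helper poolNames are made up), and the pool names it prints differ between JVM versions and garbage collectors.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.util.ArrayList;
import java.util.List;

public class MemoryPoolDemo {
    /** Returns the names of all memory pools of the given type (HEAP or NON_HEAP). */
    public static List<String> poolNames(MemoryType type) {
        List<String> names = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == type) {
                names.add(pool.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // Heap pools are the ones managed by the garbage collector.
        System.out.println("Heap pools:     " + poolNames(MemoryType.HEAP));
        // Non-heap pools hold class metadata, compiled code, and similar data.
        System.out.println("Non-heap pools: " + poolNames(MemoryType.NON_HEAP));
    }
}
```

On a JVM without a permanent generation, the class metadata pool typically shows up among the non-heap pools rather than in the heap.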
An effective way to manage memory is to divide the heap into generations so that the garbage collector does not have to scan the entire heap every time. Most objects are short-lived, while long-lived objects often stay alive until the application exits.
When a Java application creates an object, the object is stored in the Eden pool. Once the Eden pool is full, a minor GC (a small-scale garbage collection) is triggered in the young generation. First, the garbage collector marks the "dead objects" (objects that are no longer referenced by the application) and increments the age of every retained object (the age is a number that records how many garbage collections the object has survived). The garbage collector then reclaims the dead objects and moves the remaining live objects to the survivor pool, which empties the Eden pool.
When an object has survived a certain number of collections, it is moved to the old generation of the heap, the tenured pool. Finally, when the tenured pool fills up, a full GC or major GC (a complete garbage collection) is triggered to clean it.
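The life cycle described above (allocate in Eden, age on every minor GC, promote once old enough) can be mimicked with a toy model. Everything in this sketch is hypothetical and heavily simplified: the class GenerationalToy, the tenuringThreshold field, and the single survivor list are illustrations only, and a real collector works on raw memory rather than on Java lists.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

public class GenerationalToy {
    static final class Obj {
        final String name;
        int age = 0;                 // number of minor GCs this object has survived
        Obj(String name) { this.name = name; }
    }

    final List<Obj> eden = new ArrayList<>();
    final List<Obj> survivor = new ArrayList<>();
    final List<Obj> tenured = new ArrayList<>();
    final int tenuringThreshold;     // hypothetical: age at which an object is promoted

    GenerationalToy(int tenuringThreshold) {
        this.tenuringThreshold = tenuringThreshold;
    }

    /** New objects always start in Eden. */
    Obj allocate(String name) {
        Obj o = new Obj(name);
        eden.add(o);
        return o;
    }

    /** Simulates one minor GC; isLive decides which objects are still referenced. */
    void minorGc(Predicate<Obj> isLive) {
        List<Obj> young = new ArrayList<>(eden);
        young.addAll(survivor);
        eden.clear();
        survivor.clear();
        for (Obj o : young) {
            if (!isLive.test(o)) {
                continue;            // dead object: simply dropped
            }
            o.age++;
            if (o.age >= tenuringThreshold) {
                tenured.add(o);      // old enough: promoted to the tenured pool
            } else {
                survivor.add(o);     // still young: kept in the survivor pool
            }
        }
    }

    public static void main(String[] args) {
        GenerationalToy heap = new GenerationalToy(2);
        Obj cache = heap.allocate("cache");
        heap.allocate("temp");
        heap.minorGc(o -> o == cache);   // "temp" dies, "cache" moves to survivor
        heap.minorGc(o -> o == cache);   // "cache" reaches age 2 and is promoted
        System.out.println("tenured objects: " + heap.tenured.size());
    }
}
```

With a threshold of 2, the long-lived object ends up in the tenured list after the second minor GC, while the short-lived one never leaves the young generation.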
In general, the Eden pool and the survivor pools together make up the young generation, which consists of one Eden space and two survivor spaces of the same size (often called S0 and S1, or From and To). The size of the young generation can be specified with the -Xmn parameter, and the ratio of Eden space to survivor space can be adjusted with -XX:SurvivorRatio.
The tenured pool is the old generation. It stores objects that are still alive after several young-generation collections, such as cached objects. A new object may also be allocated directly in the old generation, mainly in two situations:
① Large objects. The startup parameter -XX:PretenureSizeThreshold=1024 (in bytes; the default is 0) means that objects larger than this size are not allocated in the young generation but directly in the old generation.
② Large arrays whose elements do not reference any external objects.
Correspondingly, a GC that occurs in the young generation is called a minor GC, and a GC that occurs in the old generation is called a full GC. Hopefully this gives you a better understanding of the terminology you see elsewhere.
When a garbage collection runs, all application threads are stopped and the system pauses. Minor GCs are very frequent, so they are optimized to reclaim dead objects quickly, and they are the main way young-generation memory is recovered. A major GC runs much more slowly because it has to scan a large number of live objects. The garbage collector itself has a variety of implementations, and some collectors can perform a major GC more quickly under certain circumstances.
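How often collections run, and how much time they take, can be checked with the standard GarbageCollectorMXBean interface. The sketch below is only an illustration: the class name GcStatsDemo and the helper totalCollections are made up, and which collectors are listed (and whether they map neatly to "minor" and "major") depends on the JVM and the collector in use.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcStatsDemo {
    /** Sums the collection counts of all collectors known to this JVM. */
    public static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();   // -1 means "undefined" for this collector
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // Create some short-lived garbage so that young-generation collections are likely.
        for (int i = 0; i < 200_000; i++) {
            byte[] junk = new byte[1024];
        }
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName()
                    + ": collections=" + gc.getCollectionCount()
                    + ", time=" + gc.getCollectionTime() + " ms");
        }
        System.out.println("total collections: " + totalCollections());
    }
}
```

On a generational collector, the young-generation collector usually reports many more (and much shorter) collections than the old-generation one.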
The size of the heap is dynamic: memory is allocated from the operating system only when the heap needs to grow. When the heap fills up, the JVM allocates more memory to it until the upper limit of the heap size is reached, and this reallocation also causes the application to pause briefly.
The Basics of GC Algorithms
Introduction to GC
Garbage collection (GC), as the name implies, frees the space occupied by garbage. In Java, programmers do not need to manage dynamic memory allocation and garbage collection themselves; all of this is left to the JVM. So what kind of object is considered "garbage" in Java? Once objects have been identified as garbage, what strategies are used to reclaim them (free the space)? And what are the typical garbage collectors in today's commercial virtual machines?
Before delving into the implementation details of GC algorithms, it is best to understand the relevant terminology and the underlying principles. The implementation details vary from one collector to another, but in general all collectors focus on the following two tasks:
Find all the surviving objects.
Clean out all other objects, that is, the objects considered obsolete or useless.
Marking Reachable Objects
All modern GC algorithms used in the JVM identify all surviving objects before recycling.
First, some special objects are defined as GC root objects by the garbage collector. The so-called GC root objects include:
Local variables and input parameters of the currently executing methods
Active threads
Static fields of the loaded classes
JNI references
There are several key points to note about the marking phase:
The application threads need to be paused before marking starts, because the object graph cannot really be traversed while it keeps changing. The moment at which the application threads are paused so that the JVM can do its housekeeping is called a safepoint, and it results in a stop-the-world (STW) pause. Safepoints can be triggered for many reasons, but garbage collection is by far the most common one.
The length of the pause depends neither on the number of objects in the heap nor on the size of the heap, but on the number of surviving objects. As a result, increasing the size of the heap does not directly affect the duration of the marking phase.
When the marking phase is complete, the GC begins the next phase and removes the unreachable objects.
How to determine whether an object is useless (that is, "garbage")
In Java, objects are accessed through references: if you want to operate on an object, you must do so through a reference to it. An obvious and simple approach is therefore to decide whether an object can be reclaimed by counting its references. If no reference is associated with an object, the object cannot be used anywhere else and becomes reclaimable. This approach is called reference counting.
This approach is characterized by simplicity and efficiency, but it does not solve the problem of circular references, so this approach is not used in Java (Python uses reference counting). Look at the following code:
public class Main {
    public static void main(String[] args) {
        MyObject object1 = new MyObject();
        MyObject object2 = new MyObject();
        // Make the two objects reference each other.
        object1.object = object2;
        object2.object = object1;
        // Drop the only external references to both objects.
        object1 = null;
        object2 = null;
    }
}

class MyObject {
    public Object object = null;
}
The last two statements assign null to object1 and object2, which means the two objects they pointed to can no longer be accessed. Under reference counting, however, they would never be reclaimed: they still reference each other, so neither reference count ever drops to 0.
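The counting itself can be spelled out in a small simulation. Note that this is not how the JVM works (as the text says, Java does not use reference counting); the class RefCountToy, its Node type, and the manual refCount bookkeeping are purely hypothetical and only mirror what a reference-counting runtime would do for the code above.

```java
public class RefCountToy {
    /** A toy object that tracks how many references currently point at it. */
    static final class Node {
        int refCount = 0;
        Node field;   // the single outgoing reference, like MyObject.object

        void setField(Node target) {
            if (field != null) {
                field.refCount--;      // the old target loses one reference
            }
            field = target;
            if (target != null) {
                target.refCount++;     // the new target gains one reference
            }
        }
    }

    /** Replays the example and returns the two counts left after the locals are dropped. */
    public static int[] countsAfterDroppingLocals() {
        Node object1 = new Node();
        object1.refCount++;            // referenced by the local variable object1
        Node object2 = new Node();
        object2.refCount++;            // referenced by the local variable object2

        object1.setField(object2);     // count of object2 becomes 2
        object2.setField(object1);     // count of object1 becomes 2

        // Simulates "object1 = null; object2 = null;": the local references go away.
        object1.refCount--;
        object2.refCount--;

        return new int[] { object1.refCount, object2.refCount };
    }

    public static void main(String[] args) {
        int[] counts = countsAfterDroppingLocals();
        System.out.println("object1 count = " + counts[0] + ", object2 count = " + counts[1]);
    }
}
```

Both counts end at 1 rather than 0, which is exactly why a pure reference-counting collector would keep the two unreachable objects alive.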
To solve this problem, Java uses reachability analysis. The basic idea is to search through the object graph, starting from a set of "GC Roots" objects. If there is no path from the GC Roots to an object, the object is said to be unreachable. Note, however, that an object judged to be unreachable does not necessarily become reclaimable right away. An unreachable object must go through at least two marking passes, and only if it fails to escape during both passes does it actually become reclaimable.
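The search from the GC Roots is essentially a graph traversal. The following sketch shows the idea on a tiny object graph represented as a map from object name to the names it references. The class ReachabilityToy, the method reachable, and the string-based graph are hypothetical simplifications, and the two-pass marking mentioned above is not modeled.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class ReachabilityToy {
    /** Returns every node that can be reached from the roots by following references. */
    public static Set<String> reachable(Map<String, List<String>> edges, Collection<String> roots) {
        Set<String> marked = new HashSet<>();
        Deque<String> work = new ArrayDeque<>(roots);
        while (!work.isEmpty()) {
            String node = work.pop();
            if (!marked.add(node)) {
                continue;   // already visited, so cycles cannot loop forever
            }
            for (String next : edges.getOrDefault(node, Collections.<String>emptyList())) {
                work.push(next);
            }
        }
        return marked;
    }

    public static void main(String[] args) {
        Map<String, List<String>> edges = new HashMap<>();
        edges.put("root", Arrays.asList("a"));
        edges.put("a", Arrays.asList("b"));
        // The circular pair from the reference-counting example: nothing points to it.
        edges.put("object1", Arrays.asList("object2"));
        edges.put("object2", Arrays.asList("object1"));

        Set<String> live = reachable(edges, Arrays.asList("root"));
        System.out.println("object1 reachable: " + live.contains("object1"));
        System.out.println("b reachable: " + live.contains("b"));
    }
}
```

The circular pair is never visited because no path leads to it from a root, so it is treated as garbage even though the two objects still reference each other.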
Look at one more example:
String str = new String("Hello");
SoftReference<String> sr = new SoftReference<String>(new String("Java"));
WeakReference<String> wr = new WeakReference<String>(new String("World"));
Which of these three statements makes its String object reclaimable? The 2nd and the 3rd. With the 2nd statement, the String object is treated as reclaimable only when memory is insufficient; with the 3rd statement, the String object is treated as reclaimable regardless of memory conditions.
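The behavior of the weak reference can be tried out, with the caveat that System.gc() is only a request and the JVM decides when it actually collects. In this sketch (the class WeakReferenceDemo and its helper are made up), only the first check is guaranteed: an object that is still strongly referenced is never cleared. The second result is typical but not guaranteed at any particular moment.

```java
import java.lang.ref.WeakReference;

public class WeakReferenceDemo {
    /** Returns true if a weakly referenced object is still there while a strong reference exists. */
    public static boolean survivesWhileStronglyHeld() {
        // new String(...) is deliberate: it creates a distinct heap object, unlike a literal.
        String strong = new String("World");
        WeakReference<String> wr = new WeakReference<>(strong);
        System.gc();                  // only a request; the JVM may ignore it
        // 'strong' is still used here, so the object stays strongly reachable.
        return wr.get() == strong;
    }

    public static void main(String[] args) {
        System.out.println("kept while strongly held: " + survivesWhileStronglyHeld());

        WeakReference<String> weakOnly = new WeakReference<>(new String("World"));
        System.gc();
        // Often null after a collection, because only the weak reference points to the object.
        System.out.println("weak-only referent after GC request: " + weakOnly.get());
    }
}
```

A SoftReference could be substituted in the second half; its referent is normally kept until the JVM runs low on memory, which is hard to show in a few lines.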
Finally, here is a summary of the common situations in which an object becomes reclaimable:
1. A reference is explicitly set to null, or a reference that already points to one object is reassigned to a new object, as in the following code:
Object obj = new Object();
obj = null;
Object obj1 = new Object();
Object obj2 = new Object();
obj1 = obj2;
2. An object that is referenced only by a local variable, as in the following code:
void fun() {
    ...
    for (int i = 0; i < 10; i++) {
        Object obj = new Object();
        System.out.println(obj.getClass());
    }
}
After each iteration of the loop, the Object created in that iteration becomes reclaimable.
3. An object that has only weak references associated with it, such as:
WeakReference<String> wr = new WeakReference<String>(new String("World"));
Algorithms for cleaning up garbage
Once it has determined which garbage can be reclaimed, the garbage collector starts collecting it. Next, we introduce the core ideas of several common garbage collection algorithms.
Mark-Sweep algorithm
This is the most basic garbage collection algorithm: it is the easiest to implement and its idea is the simplest. The mark-sweep algorithm has two phases, a mark phase and a sweep phase. The mark phase marks all objects that need to be reclaimed, and the sweep phase reclaims the space occupied by the marked objects. The process is shown in the following illustration:
[Figure: heap memory before marking and after marking]
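The two phases can be sketched on a toy heap made of fixed slots. This is only an illustration under simplifying assumptions: the class MarkSweepToy is made up, every object occupies exactly one slot, and reachability is passed in as a ready-made boolean array instead of being computed from GC Roots. The sketch flags the survivors and frees everything else, which leaves the same result as marking the garbage directly.

```java
import java.util.Arrays;

public class MarkSweepToy {
    /**
     * heap[i] is an object name or null (free slot); reachable[i] says whether
     * the object in slot i is still reachable. Returns the heap after collection.
     */
    public static String[] markSweep(String[] heap, boolean[] reachable) {
        // Mark phase: flag every slot that holds a reachable object.
        boolean[] marked = new boolean[heap.length];
        for (int i = 0; i < heap.length; i++) {
            marked[i] = heap[i] != null && reachable[i];
        }
        // Sweep phase: free every unmarked slot in place; survivors do not move.
        String[] after = heap.clone();
        for (int i = 0; i < after.length; i++) {
            if (!marked[i]) {
                after[i] = null;
            }
        }
        return after;
    }

    /** Length of the longest run of contiguous free slots. */
    public static int largestFreeRun(String[] heap) {
        int best = 0;
        int run = 0;
        for (String slot : heap) {
            run = (slot == null) ? run + 1 : 0;
            best = Math.max(best, run);
        }
        return best;
    }

    public static void main(String[] args) {
        String[] heap = { "A", "B", "C", "D", "E", "F" };
        boolean[] reachable = { true, false, true, false, true, false };
        String[] after = markSweep(heap, reachable);
        System.out.println(Arrays.toString(after));
        System.out.println("largest contiguous free run: " + largestFreeRun(after));
    }
}
```

In this run three slots are freed, but the largest contiguous free run is only one slot, so an object needing two adjacent slots would still not fit.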
As the figure shows, the mark-sweep algorithm is easy to implement, but it has a serious problem: it tends to produce memory fragmentation. With too much fragmentation, a later allocation of a large object may not find enough contiguous space, which triggers another garbage collection early.
Copying algorithm
The copying algorithm was proposed to address the shortcomings of the mark-sweep algorithm. It divides the available memory into two blocks of equal capacity and uses only one of them at a time. When that block is used up, the surviving objects are copied to the other block, and the used block is then cleaned up in one go, so memory fragmentation is unlikely. The process is shown in the following illustration:
[Figure: memory before recycling and after recycling]
This algorithm is simple, efficient, and unlikely to produce memory fragmentation, but it pays a high price in memory: the usable memory is reduced to half of the original.
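A minimal sketch of the copying step, using the same kind of toy slot heap: the class CopyingToy and the from/to arrays are hypothetical, each object occupies one slot, and reachability is again supplied by the caller rather than computed.

```java
import java.util.Arrays;

public class CopyingToy {
    /**
     * Copies every reachable object from the 'from' space into a fresh 'to' space
     * of the same size, then wipes the 'from' space. Returns the 'to' space.
     */
    public static String[] copyLive(String[] from, boolean[] reachable) {
        String[] to = new String[from.length];
        int next = 0;                       // next free slot in the to-space
        for (int i = 0; i < from.length; i++) {
            if (from[i] != null && reachable[i]) {
                to[next++] = from[i];       // survivors end up packed together
            }
        }
        Arrays.fill(from, null);            // the whole from-space is reclaimed at once
        return to;
    }

    public static void main(String[] args) {
        String[] from = { "A", "B", "C", "D" };
        boolean[] reachable = { true, false, true, false };
        String[] to = copyLive(from, reachable);
        System.out.println("to-space:   " + Arrays.toString(to));
        System.out.println("from-space: " + Arrays.toString(from));
    }
}
```

The survivors sit next to each other in the new space, so the free memory is one contiguous block, and the work done is proportional to the number of survivors rather than to the size of the space.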
Clearly, the efficiency of the copying algorithm depends heavily on the number of surviving objects: if many objects survive, its efficiency drops considerably.
Mark-Compact algorithm
The mark-compact algorithm was proposed to address the shortcomings of the copying algorithm and to make full use of memory. Its mark phase is the same as in mark-sweep, but after marking it does not reclaim the garbage objects directly. Instead, it moves all surviving objects toward one end and then clears the memory beyond the end boundary. The process is shown in the following illustration:
[Figure: memory before recycling and after recycling]
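The compaction step can be sketched in the same toy style. The class MarkCompactToy is hypothetical, objects are one slot each, and the mark phase is assumed to have already produced the reachable array. Unlike the copying sketch, no second space is needed because the survivors slide down in place.

```java
import java.util.Arrays;

public class MarkCompactToy {
    /**
     * Slides every reachable object toward the start of the heap (in place) and
     * clears everything beyond the boundary. Returns the boundary index, which
     * equals the number of surviving objects.
     */
    public static int compact(String[] heap, boolean[] reachable) {
        int boundary = 0;                        // next slot for a survivor
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] != null && reachable[i]) {
                heap[boundary++] = heap[i];      // boundary <= i, so nothing unread is overwritten
            }
        }
        // Everything past the boundary is now free memory.
        Arrays.fill(heap, boundary, heap.length, null);
        return boundary;
    }

    public static void main(String[] args) {
        String[] heap = { "A", "B", "C", "D", "E", "F" };
        boolean[] reachable = { true, false, true, false, true, false };
        int boundary = compact(heap, reachable);
        System.out.println(Arrays.toString(heap) + ", boundary = " + boundary);
    }
}
```

After compaction the three survivors occupy the first three slots and the remaining three slots form one contiguous free area, without giving up half of the memory.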
Generational Collection algorithm
The generational collection algorithm is used by most JVM garbage collectors today. Its core idea is to divide memory into several regions according to how long objects live. In general, the heap is divided into the old generation (Tenured Generation) and the young generation (Young Generation). In the old generation only a small number of objects need to be reclaimed in each garbage collection, whereas in the young generation a large number of objects need to be reclaimed each time, so the most suitable collection algorithm can be chosen for each generation.
At present, most garbage collectors use the copying algorithm for the young generation, because most objects there are reclaimed in each collection, which means that few objects have to be copied. In practice, however, the young generation is not split 1:1. Instead, it is generally divided into one larger Eden space and two smaller survivor spaces. Eden and one of the survivor spaces are in use at any time; during a collection, the surviving objects in Eden and that survivor space are copied to the other survivor space, and then Eden and the survivor space that was just used are cleared.
Because only a small number of objects in the old generation are reclaimed in each collection, the old generation generally uses the mark-compact algorithm.
Note that outside the heap there is one more generation, the permanent generation (Permanent Generation), which stores classes, constants, method descriptions, and so on. Collection in the permanent generation mainly reclaims two things: discarded constants and unused classes.
Closing
Writing this article marks my first year of study, so in closing I want to thank the "teacher" who guided me onto this path. He is my senior: very skilled, in the same major, and from the same hometown. I actually met him on my first day at school, when he introduced a few of us to the Android development he was learning. At that time I did not know his name, and because I had not joined the lab, I did not feel like studying. It was not until the winter holiday, around this time last year, that I happened to come across him in our hometown group chat. We talked a lot, and from that moment I formally began to work with the technology I am learning now. I have always been very grateful to him; without him, I would not be who I am today. During this year of study he has also taught me a great deal. He often comes to the lab, and every time he chats with me I benefit a lot. Thank you, senior, for making me who I am now and for helping me enjoy this kind of study and life.
Senior's Blog
Yinhuan_