In the previous post I took a closer look at how the JVM organizes its memory. Over the past few days, outside of work, I have had time to keep exploring some of the virtual machine's implementation principles. Compared with C++, the most distinctive features of the Java language are arguably that it is cross-platform and that memory does not have to be reclaimed manually (the virtual machine takes care of GC). This post summarizes what I have learned about GC in the Java virtual machine over these days. If you find a mistake, please leave a comment on my blog; I will check it and correct the post so that other readers are not misled.
The Java language provides a garbage collection mechanism that keeps track of the objects in use and finds and reclaims objects that are no longer used (referenced). This mechanism guards against two dangers of dynamic memory allocation: memory exhaustion caused by too much accumulated garbage, and invalid memory references caused by releasing memory improperly. The core idea of a garbage collection algorithm is to examine the objects in the memory the virtual machine manages, that is, in heap space: an object that is still referenced is a live object; an object that is no longer referenced is garbage, and the space it occupies can be reclaimed and reallocated. The choice of garbage collection algorithm and the tuning of the garbage collector's parameters directly affect system performance, so developers need a reasonably deep understanding of both. For a Java programmer, knowing the virtual machine's GC also makes debugging and tuning easier. To understand the GC of a virtual machine, you first need the following basic concepts:
References
Java references fall into the following 4 types, with these roles and characteristics:
Strong reference: the most commonly used kind of reference;
Soft reference (java.lang.ref.SoftReference): the virtual machine reclaims the heap memory it points to when heap memory is running low;
Weak reference (java.lang.ref.WeakReference): the virtual machine reclaims the heap memory it points to at the next GC;
Phantom reference (java.lang.ref.PhantomReference): the weakest kind; the object cannot be obtained through it, and it only lets you find out whether the object has been collected;
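To make the four kinds of reference concrete, here is a minimal sketch using the standard `java.lang.ref` classes. The class name `ReferenceKinds` and the method `probe` are my own names for illustration. The sketch only checks what each reference reports while a strong reference still exists; what happens after a collection depends on the JVM and is not guaranteed.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;

class ReferenceKinds {
    // Reports whether each reference type can hand back the referent
    // while a strong reference to it is still held.
    static boolean[] probe() {
        Object payload = new Object();  // strong reference
        SoftReference<Object> soft = new SoftReference<>(payload);
        WeakReference<Object> weak = new WeakReference<>(payload);
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        PhantomReference<Object> phantom = new PhantomReference<>(payload, queue);

        boolean softSees = soft.get() == payload;
        boolean weakSees = weak.get() == payload;
        boolean phantomSees = phantom.get() != null;  // get() on a phantom reference always returns null
        return new boolean[] { softSees, weakSees, phantomSees };
    }

    public static void main(String[] args) {
        boolean[] seen = probe();
        System.out.println("soft=" + seen[0] + " weak=" + seen[1] + " phantom=" + seen[2]);

        // With no strong reference left, a weak referent is usually cleared by the
        // next GC, but System.gc() is only a request, so the result may vary.
        WeakReference<Object> weak = new WeakReference<>(new Object());
        System.gc();
        System.out.println("weak referent after GC request: " + weak.get());
    }
}
```

The first line printed is `soft=true weak=true phantom=false`; the second line depends on whether the JVM actually ran a collection.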
Generations of heap memory
The Java heap is the memory area where objects are stored. Logically, the heap is subdivided into the young generation (Young Generation), the old generation (Old Generation), and the permanent generation (Permanent Generation).
1. Young generation: subdivided into three logical regions, Eden, From Survivor, and To Survivor; objects are allocated in the Eden region first.
2. Old generation: objects in the young generation that are still alive after several garbage collections are moved to the old generation, and large objects can be stored directly in the old generation, bypassing the young generation.
3. Permanent generation: the method area uses the permanent generation as its storage area. Logically, the permanent generation is part of the Java heap, but it is often called "non-heap" (Non-Heap) memory to tell it apart. The method area typically holds class information (fields, method signatures, and so on of the classes loaded by a ClassLoader), the runtime constant pool (such as the string constant pool), static variables, and so on. Note that this describes HotSpot up to JDK 7; JDK 8 removed the permanent generation and replaced it with Metaspace, which lives in native memory.
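The generations described above can be observed on a running JVM through the standard `java.lang.management` API. This is only a sketch: the class name `HeapPools` is mine, and the pool names that come back depend on the JVM version and the collector in use, so no particular name should be assumed. On a JDK 8 or later HotSpot you would expect to see Metaspace rather than a permanent generation.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.util.ArrayList;
import java.util.List;

class HeapPools {
    // Lists every memory pool the JVM exposes, tagged as heap or non-heap.
    static List<String> describePools() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            String kind = pool.getType() == MemoryType.HEAP ? "heap" : "non-heap";
            lines.add(kind + ": " + pool.getName());
        }
        return lines;
    }

    public static void main(String[] args) {
        describePools().forEach(System.out::println);
    }
}
```

Running it with different collectors (for example with `-XX:+UseSerialGC`) is a quick way to see how the same logical generations appear under different pool names.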
GC algorithms
Copying algorithm (Copying): divides the heap memory into two blocks. When the block in use runs short of space, the surviving (still referenced) objects are copied to the other, free block, and the current block is then emptied. The copying algorithm is commonly used as the garbage collection strategy for the young generation.
Mark-sweep algorithm (Mark-Sweep): the objects that can be reclaimed are marked first and then swept in a single pass. Disadvantages: low efficiency, and it leaves a large number of discontinuous memory fragments.
Mark-compact algorithm (Mark-Compact): the reclaimable objects are marked, all surviving objects are moved to one end, and the memory beyond that end is then cleared directly.
Generational collection algorithm (Generational Collection): the Java heap is divided into the young generation and the old generation. Most objects in the young generation are reclaimable, while most objects in the old generation are not. The young generation uses the copying algorithm: because most objects are reclaimable, only a few survivors have to be copied, so collection is efficient. In the old generation only a few objects are reclaimable, so copying all the survivors would be wasteful; it is collected with a combination of mark-sweep (objects are not moved) and mark-compact (survivors are moved to one end).
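The practical difference between mark-sweep and mark-compact is easiest to see on a toy heap. The sketch below is not how a real collector is implemented: the heap is just an array of slots, the `marked` array stands in for the result of a mark phase (here `true` means "survives"), and the class and method names are made up for illustration.

```java
import java.util.Arrays;

class SweepVsCompact {
    // Mark-sweep: dead slots are freed in place, survivors keep their positions.
    static String[] sweep(String[] heap, boolean[] marked) {
        String[] out = heap.clone();
        for (int i = 0; i < out.length; i++) {
            if (!marked[i]) out[i] = null;
        }
        return out;
    }

    // Mark-compact: survivors slide to the low end, leaving one contiguous free area.
    // A new array is built here for clarity; a real collector moves objects in place.
    static String[] compact(String[] heap, boolean[] marked) {
        String[] out = new String[heap.length];
        int next = 0;
        for (int i = 0; i < heap.length; i++) {
            if (marked[i]) out[next++] = heap[i];
        }
        return out;
    }

    // Length of the longest run of consecutive free (null) slots.
    static int largestFreeRun(String[] heap) {
        int best = 0;
        int run = 0;
        for (String slot : heap) {
            run = (slot == null) ? run + 1 : 0;
            best = Math.max(best, run);
        }
        return best;
    }

    public static void main(String[] args) {
        String[] heap = {"A", "B", "C", "D", "E", "F"};
        boolean[] marked = {true, false, true, false, true, false};
        String[] swept = sweep(heap, marked);
        String[] compacted = compact(heap, marked);
        System.out.println(Arrays.toString(swept) + " largest free run: " + largestFreeRun(swept));
        System.out.println(Arrays.toString(compacted) + " largest free run: " + largestFreeRun(compacted));
    }
}
```

Both results free three slots, but after the sweep the largest contiguous free run is 1 slot, while after compaction it is 3. That is the fragmentation problem mentioned for mark-sweep.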
Memory allocation policies
1. A newly created object is allocated in the Eden region of the young generation; at any time the young generation uses Eden plus one of the two Survivor regions (From Survivor).
2. When heap memory is tight and the young generation is collected, the surviving objects in Eden and From Survivor are copied to To Survivor. If To Survivor does not have enough room, some of the surviving objects are copied directly to the old generation. Eden and From Survivor are then emptied. At the next young-generation collection, From Survivor and To Survivor swap roles.
3. Large objects (typically very long strings or arrays) are placed directly in the old generation. This avoids copying large objects repeatedly within the young generation and reduces the performance overhead.
4. Long-lived objects in the young generation are moved to the old generation. Each time an object is copied within the Survivor regions, its age increases by 1. When the age reaches a certain value (the default is 15), the object is moved to the old generation, so that it is not copied over and over within the young generation.
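The four rules above can be sketched as a small simulation. Everything in it is hypothetical: `ToyYoungGen`, its fields, and the `live` flag (which stands in for "still referenced") are illustration names, the survivor capacity is counted in objects rather than bytes, and rule 3 (large objects) is left out. On HotSpot the real age threshold is controlled by `-XX:MaxTenuringThreshold`; the demo uses 2 instead of 15 to keep it short.

```java
import java.util.ArrayList;
import java.util.List;

class ToyYoungGen {
    static final class Obj {
        final String name;
        int age;
        boolean live = true;  // stands in for "still reachable"
        Obj(String name) { this.name = name; }
    }

    final int tenuringThreshold;  // age at which an object is promoted
    final int survivorCapacity;   // how many objects To Survivor can hold
    List<Obj> eden = new ArrayList<>();
    List<Obj> from = new ArrayList<>();
    List<Obj> old = new ArrayList<>();

    ToyYoungGen(int tenuringThreshold, int survivorCapacity) {
        this.tenuringThreshold = tenuringThreshold;
        this.survivorCapacity = survivorCapacity;
    }

    // Rule 1: new objects start in Eden.
    Obj allocate(String name) {
        Obj o = new Obj(name);
        eden.add(o);
        return o;
    }

    // Rules 2 and 4: copy survivors to To Survivor, promote old or overflowing ones.
    void minorGc() {
        List<Obj> to = new ArrayList<>();
        List<Obj> candidates = new ArrayList<>(eden);
        candidates.addAll(from);
        for (Obj o : candidates) {
            if (!o.live) continue;  // garbage is simply not copied
            o.age++;
            if (o.age >= tenuringThreshold || to.size() >= survivorCapacity) {
                old.add(o);  // promoted to the old generation
            } else {
                to.add(o);
            }
        }
        eden.clear();
        from = to;  // From and To swap roles for the next collection
    }

    public static void main(String[] args) {
        ToyYoungGen gen = new ToyYoungGen(2, 1);
        gen.allocate("a");
        gen.allocate("b");
        gen.allocate("c").live = false;
        gen.minorGc();
        System.out.println("after GC 1: survivors=" + gen.from.size() + " old=" + gen.old.size());
        gen.minorGc();
        System.out.println("after GC 2: survivors=" + gen.from.size() + " old=" + gen.old.size());
    }
}
```

In the first collection `a` fits into the survivor space, `b` overflows into the old generation, and `c` is dropped. In the second collection `a` reaches age 2 and is promoted as well.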
GC roots and the finalize method of Object
The Java virtual machine uses a strategy called the root search algorithm (GC Roots Tracing) to decide whether an object is still alive; if it is not, its memory is reclaimed. The basic idea is to take a series of objects called "GC roots" as starting points and search downward from these nodes. The path the search follows is called a reference chain (Reference Chain). If no reference chain connects an object A to any GC root (in graph terms, A is unreachable from the GC root nodes), the object is no longer referenced and will be garbage collected by the virtual machine.
In the Java language, the objects that can serve as GC roots include the following:
1. Objects referenced from the virtual machine stack (the local variable table in a stack frame);
2. Objects referenced by static class fields in the method area;
3. Objects referenced by constants in the method area;
4. Objects referenced from native methods (JNI);
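The root search described above is an ordinary graph traversal. The following sketch models the heap as a map from an object name to the names it references; the class `Reachability`, the object names, and the single root `stackLocal` are all made up for illustration, and `List.of` / `Map.of` assume Java 9 or later.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class Reachability {
    // Breadth-first search from the GC roots along reference chains.
    // Anything not in the returned set is unreachable, i.e. garbage.
    static Set<String> reachable(Map<String, List<String>> refs, List<String> roots) {
        Set<String> seen = new HashSet<>(roots);
        Deque<String> work = new ArrayDeque<>(roots);
        while (!work.isEmpty()) {
            String current = work.poll();
            for (String target : refs.getOrDefault(current, List.of())) {
                if (seen.add(target)) {  // add() returns false if already visited
                    work.add(target);
                }
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        Map<String, List<String>> refs = Map.of(
            "stackLocal", List.of("A"),
            "A", List.of("B"),
            "C", List.of("D"),   // C and D reference each other...
            "D", List.of("C"));  // ...but nothing reachable from a root points to them
        System.out.println(reachable(refs, List.of("stackLocal")));
    }
}
```

Note that `C` and `D` are collected even though they reference each other, because neither can be reached from a root.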
If no reference chain connects an object in the heap to the GC roots, the GC can reclaim it. Before the object is reclaimed, its finalize() method is called. By overriding this method and re-attaching the current object to a reference chain from the GC roots, the object can prevent the GC from reclaiming it. Note that an object's finalize() method is executed at most once: the next time the GC is about to reclaim the object, finalize() does not run again, so the collection can no longer be prevented.
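The "rescue once, then never again" behavior can be tried with a small experiment. This is a sketch only: `FinalizeRescue`, `saved`, and `dropAndCheck` are my own names, `System.gc()` is merely a request, and the 300 ms wait for the finalizer thread is an arbitrary assumption, so the first attempt is not guaranteed to succeed on every JVM. Also keep in mind that `finalize()` has been deprecated since Java 9, which is why the warnings are suppressed.

```java
class FinalizeRescue {
    // A static field is reachable from a GC root, so storing `this` here revives the object.
    static volatile FinalizeRescue saved;

    @Override
    @SuppressWarnings({"deprecation", "removal"})
    protected void finalize() throws Throwable {
        super.finalize();
        saved = this;  // re-attach the object to a reference chain
    }

    // Drops the only reference, requests a collection, waits briefly for the
    // finalizer thread, and reports whether finalize() brought the object back.
    static boolean dropAndCheck() {
        saved = null;
        System.gc();
        try {
            Thread.sleep(300);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return saved != null;
    }

    public static void main(String[] args) {
        saved = new FinalizeRescue();
        System.out.println("rescued on first attempt:  " + dropAndCheck());
        System.out.println("rescued on second attempt: " + dropAndCheck());
    }
}
```

On a typical HotSpot run the first attempt reports a rescue and the second does not. Whatever the timing, the two attempts can never both succeed, because finalize() runs at most once per object.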
Types of GC
Young-generation GC (Minor GC): garbage collection in the young generation is very frequent (it frees space as quickly as possible) and highly efficient (it uses the copying algorithm, most objects can be reclaimed, and only a few surviving objects are copied).
Old-generation GC (Major GC / Full GC): garbage collection in the old generation, which uses the mark-sweep and mark-compact algorithms, is usually more than 10 times slower than a Minor GC of the young generation. A Full GC is typically accompanied by at least one Minor GC. A Full GC is usually triggered when heap memory is tight or when System.gc() is called explicitly.
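How often each kind of collection has run, and how long it took, can be read from the standard `GarbageCollectorMXBean` interface. This is a minimal sketch with a made-up class name, `CollectorStats`. The bean names differ between collectors, so which bean corresponds to Minor GC and which to Major/Full GC has to be looked up for the collector you run.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

class CollectorStats {
    // One line per registered collector: name, number of collections, total time.
    static List<String> snapshot() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            lines.add(gc.getName()
                + ": count=" + gc.getCollectionCount()
                + ", timeMs=" + gc.getCollectionTime());
        }
        return lines;
    }

    public static void main(String[] args) {
        snapshot().forEach(System.out::println);
    }
}
```

Comparing the count and time of the young-generation bean with those of the old-generation bean on a long-running process gives a rough feel for the cost difference described above.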
GC trigger conditions and performance overhead
The virtual machine runs Minor GCs very frequently, but because each one takes very little time, they have little effect on the system. The trigger conditions of the Major GC deserve more attention, because a Major GC has a noticeable effect on the system. Broadly, two conditions trigger a Major GC:
① GC is invoked when the application is idle, that is, when no application thread is running. Because GC runs in the lowest-priority thread, the GC thread is not scheduled while the application is busy, with the exception of the following condition.
② GC is invoked when the Java heap is low on memory. When an application thread creates a new object while running and there is not enough memory, the virtual machine forcibly invokes the GC thread to reclaim memory for the new allocation. If one GC still cannot satisfy the allocation, the virtual machine performs two more GCs as further attempts. If the request still cannot be met, the virtual machine reports an "out of memory" error (java.lang.OutOfMemoryError) and the Java application stops.
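To watch how close an application is to the "heap is low on memory" condition, the standard `Runtime` methods are enough for a first look. The sketch below uses a made-up class name, `HeapHeadroom`, and reports approximate values only, since the heap can change between the individual calls.

```java
class HeapHeadroom {
    // Returns { max heap, currently committed heap, approximate used heap } in bytes.
    static long[] sample() {
        Runtime rt = Runtime.getRuntime();
        long max = rt.maxMemory();      // upper bound the JVM will try to use
        long total = rt.totalMemory();  // memory currently committed to the heap
        long free = rt.freeMemory();    // unused part of the committed memory
        return new long[] { max, total, total - free };
    }

    public static void main(String[] args) {
        long[] s = sample();
        long mb = 1024 * 1024;
        System.out.println("max=" + s[0] / mb + "MB committed=" + s[1] / mb + "MB used~" + s[2] / mb + "MB");
    }
}
```

When the used value stays close to the maximum even after collections, the application is approaching the situation in which an allocation fails with OutOfMemoryError.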
Because the virtual machine decides when to run a Major GC based on the system environment, and that environment changes constantly, a Major GC is non-deterministic: you cannot predict when it will happen. What is certain is that a long-running application will go through Major GCs repeatedly. Given this mechanism, the behavior of the program directly changes the system environment and therefore influences when GC is triggered. Code that is designed and written without regard to how GC works suffers a series of negative effects, such as objects staying resident in memory. To avoid them, the basic principle is to produce as little garbage as possible and to keep the cost of the GC process as low as possible. Also try not to call System.gc() explicitly. The call only suggests that the virtual machine run a Major GC and does not guarantee one, but in many cases it does trigger one. That raises the frequency of Major GCs and the number of intermittent pauses, which is a significant performance cost for the whole application and can even hurt the application's efficiency and the user experience.
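As one small example of "produce as little garbage as possible", compare two ways of building the same string in a loop. The class and method names are made up for illustration; the point is only the allocation pattern, and the actual savings depend on the JVM and the workload.

```java
class LessGarbage {
    // Each += creates a new String, and the previous one immediately becomes garbage.
    static String joinWithConcat(int n) {
        String s = "";
        for (int i = 0; i < n; i++) {
            s += i + ",";
        }
        return s;
    }

    // One StringBuilder is reused for the whole loop; only the final String is created.
    static String joinWithBuilder(int n) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) {
            sb.append(i).append(',');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(joinWithConcat(5));
        System.out.println(joinWithBuilder(5));
    }
}
```

Both methods return the same text, but the first leaves a trail of short-lived intermediate strings for the Minor GC to clean up. As for explicit System.gc() calls in third-party code, HotSpot offers the `-XX:+DisableExplicitGC` flag to turn them into no-ops.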
Exploration notes 2: GC in the Java virtual machine