Fellow programmers, I believe you often run into garbage collection questions in daily work or in interviews. In the dead of night, can you still recall the details of the JVM garbage collection mechanism? It doesn't matter if you can't, because we will sort them out one by one below.
- I. Technical background you need to know.
- II. Which memory needs to be reclaimed?
- 2.1 Reference counting algorithm
- 2.1.1 Algorithm analysis
- 2.1.2 Advantages and disadvantages
- 2.1.3 Getting bored? Some code to calm the nerves
- 2.2 Reachability analysis algorithm
- 2.3 What do you know about references in Java?
- 2.4 An object's last struggle before death (reclamation)
- 2.5 How the method area determines if recycling is required
- III. Common garbage collection algorithms
- 3.1 Mark-Sweep algorithm
- 3.2 Copying algorithm
- 3.3 Mark-Compact algorithm
- 3.4 Generational collection algorithm
- 3.4.1 Young generation collection algorithm
- 3.4.2 Old generation collection algorithm
- 3.4.3 Permanent generation collection algorithm
- IV. Common garbage collectors
- V. When is GC triggered? (one of the most common interview questions)
- 5.1 Scavenge GC
- 5.2 Full GC
- Conclusion
I. Technical background you need to know.
As usual, let's start by showing off a little and talk about the history of JVM garbage collection. When garbage collection (GC) comes up, most people think of it as a by-product of the Java language. In fact, GC is much older than Java: the Lisp language of 1960 already used dynamic memory allocation and garbage collection. The experts who design and optimize the C++ language should take note.
II. Which memory needs to be reclaimed?
Fellow programmers know that the JVM memory structure consists of five areas: the program counter, the virtual machine stack, the native method stack, the heap, and the method area. The program counter, the virtual machine stack, and the native method stack are created with a thread and destroyed with it, so memory allocation and reclamation in these three areas are deterministic. There is no need to think much about reclaiming them, because the memory is naturally released when the method or the thread ends. The Java heap and the method area are different: allocation and reclamation there are dynamic, and this is the part the garbage collector has to care about.
Before the garbage collector reclaims memory in the heap and the method area, it must first determine which objects in these areas can be reclaimed and which cannot. That requires an algorithm for judging whether an object is still alive. (An interviewer has surely asked you about this.)
2.1 Reference counting algorithm
2.1.1 Algorithm analysis
Reference counting is an early strategy used by garbage collectors. In this approach, every object instance in the heap has a reference count. When an object is created and assigned to a variable, its count is set to 1. Whenever another variable is assigned a reference to the object, the count is incremented by 1 (for a = b, the counter of the object instance referenced by b is incremented by 1). When a reference to the object goes out of scope or is set to a new value, the object's counter is decremented by 1. Any object instance whose counter is 0 can be garbage collected. When an object instance is garbage collected, the counter of every object instance it references is decremented by 1.
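The counting rules above can be sketched as a toy model. This is only an illustration: the class `RefCounted` and its `retain`, `release`, and `setField` methods are hypothetical names invented here, and a real collector would maintain the counts inside the runtime rather than in application code.

```java
/** Toy model of reference counting; a real collector tracks counts inside the runtime. */
final class RefCounted {
    private int count = 0;            // number of references currently pointing at this object
    private boolean reclaimed = false;
    private RefCounted field;         // the single outgoing reference this object may hold

    /** A variable or field now refers to target: increment its count. */
    static RefCounted retain(RefCounted target) {
        if (target != null) {
            target.count++;
        }
        return target;
    }

    /** A reference to target went out of scope or was overwritten: decrement its count. */
    static void release(RefCounted target) {
        if (target == null) {
            return;
        }
        target.count--;
        if (target.count == 0) {
            target.reclaimed = true;          // count reached 0: the object can be collected
            RefCounted child = target.field;
            target.field = null;
            release(child);                   // reclaiming it also drops the reference it held
        }
    }

    /** Models "this.field = newTarget": retain the new target, release the old one. */
    void setField(RefCounted newTarget) {
        retain(newTarget);
        RefCounted old = field;
        field = newTarget;
        release(old);
    }

    int count() { return count; }

    boolean isReclaimed() { return reclaimed; }

    public static void main(String[] args) {
        RefCounted a = retain(new RefCounted());  // variable a -> A, count(A) = 1
        RefCounted b = retain(new RefCounted());  // variable b -> B, count(B) = 1
        a.setField(b);                            // A.field -> B, count(B) = 2
        release(b);                               // variable b cleared, count(B) = 1
        System.out.println(b.isReclaimed());      // false: A still references B
        release(a);                               // variable a cleared, count(A) = 0
        System.out.println(a.isReclaimed() + " " + b.isReclaimed()); // true true
    }
}
```

Releasing A drops its count to 0, which in turn releases the reference A held to B, so both become collectable in one cascade.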
2.1.2 Advantages and Disadvantages
Advantage: a reference counting collector runs very quickly and is interleaved with program execution. This benefits real-time environments in which the program must not be interrupted for long.
Disadvantage: it cannot detect circular references. If a parent object holds a reference to a child object and the child in turn references the parent, their reference counts can never reach 0.
2.1.3 Getting bored? Some code to calm the nerves
public class ReferenceFindTest {
    public static void main(String[] args) {
        MyObject object1 = new MyObject();
        MyObject object2 = new MyObject();

        // Make the two objects reference each other
        object1.object = object2;
        object2.object = object1;

        // Drop the only external references
        object1 = null;
        object2 = null;
    }
}

// Minimal holder class, assumed here because the original listing does not show it
class MyObject {
    Object object;
}

This code shows that the reference counting algorithm cannot detect a circular reference. The last two statements assign null to object1 and object2, so the objects they pointed to can no longer be accessed. But because the two objects reference each other, their reference counters never reach 0, and a collector based purely on reference counting would never reclaim them.
2.2 Reachability analysis algorithm
The reachability analysis algorithm comes from graph theory in discrete mathematics. The program treats all reference relationships as a graph. Starting from a GC Root node, it looks for the nodes that node references, then continues with the nodes those reference, and so on. Once all reachable nodes have been visited, the remaining nodes are considered unreferenced, that is, useless, and useless nodes are judged to be reclaimable objects.
In the Java language, the objects that can be used as GC roots include the following:
a) Objects referenced in the virtual machine stack (the local variable table of a stack frame);
b) Objects referenced by static class fields in the method area;
c) Objects referenced by constants in the method area;
d) Objects referenced by JNI (native methods) in the native method stack.
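The traversal described above can be sketched as a plain graph search. This is a minimal sketch under stated assumptions: `Node` and `Reachability.reachableFrom` are hypothetical names for illustration, and a real JVM walks actual object headers and root sets rather than an explicit list.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

final class Reachability {
    /** Returns every node reachable from the given roots by following references. */
    static Set<Node> reachableFrom(List<Node> gcRoots) {
        // Identity-based set: objects are compared by reference, not by equals()
        Set<Node> marked = Collections.newSetFromMap(new IdentityHashMap<Node, Boolean>());
        Deque<Node> pending = new ArrayDeque<>(gcRoots);
        while (!pending.isEmpty()) {
            Node current = pending.pop();
            if (marked.add(current)) {          // first visit: follow its outgoing references
                pending.addAll(current.refs);
            }
        }
        return marked;
    }

    public static void main(String[] args) {
        Node root = new Node("root");
        Node a = new Node("a");
        Node x = new Node("x");
        Node y = new Node("y");
        root.refs.add(a);
        x.refs.add(y);   // x and y reference each other...
        y.refs.add(x);   // ...but nothing reachable from the root points at them

        Set<Node> live = reachableFrom(Collections.singletonList(root));
        System.out.println(live.contains(a));  // true
        System.out.println(live.contains(x));  // false: the cycle is still garbage
    }
}

/** Hypothetical heap object: a name plus its outgoing references. */
final class Node {
    final String name;
    final List<Node> refs = new ArrayList<>();

    Node(String name) {
        this.name = name;
    }
}
```

Unlike reference counting, the mutually referencing pair x and y is correctly treated as garbage, because neither is reachable from a root.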
2.3 What do you know about references in Java?
Whether we judge an object by its reference count or by whether a reference chain to it is reachable, deciding if the object is alive always depends on "references". In the Java language, references are divided into four kinds: strong references, soft references, weak references, and phantom references. Their strength decreases in that order.
Strong reference: the kind of reference that is everywhere in program code, such as Object obj = new Object(). As long as a strong reference exists, the garbage collector will never reclaim the referenced object.
Soft reference: describes objects that are useful but not required. Before the system is about to throw an out-of-memory error, the objects associated with soft references are included in the collection scope for a second collection. If there is still not enough memory after that collection, the out-of-memory error is thrown.
Weak reference: also describes non-required objects, but it is weaker than a soft reference. An object associated only with weak references survives only until the next garbage collection. When the garbage collector runs, such objects are reclaimed regardless of whether the current memory is sufficient.
Phantom reference: also translated as virtual or ghost reference (a rather magical name), this is the weakest reference relationship. Whether an object has a phantom reference does not affect its lifetime at all, and an object instance cannot be obtained through a phantom reference. Its only purpose is to let the program receive a system notification when the object is reclaimed by the collector.
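The four kinds map onto classes in the standard `java.lang.ref` package. The sketch below only demonstrates the behavior that is deterministic: while a strong reference is still held, soft and weak references return the object, and `PhantomReference.get()` always returns null. What happens after a collection depends on the collector and on memory pressure, so it is deliberately not asserted here. The class name `ReferenceKinds` and its method names are made up for illustration.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;

final class ReferenceKinds {
    /** A soft reference still yields the object while it is strongly reachable. */
    static boolean softSeesLiveObject() {
        Object strong = new Object();                       // strong reference
        SoftReference<Object> soft = new SoftReference<>(strong);
        return soft.get() == strong;
    }

    /** A weak reference also yields the object while a strong reference exists. */
    static boolean weakSeesLiveObject() {
        Object strong = new Object();
        WeakReference<Object> weak = new WeakReference<>(strong);
        return weak.get() == strong;
    }

    /** A phantom reference never hands out its referent; it only reports via the queue. */
    static boolean phantomGetIsNull() {
        Object strong = new Object();
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        PhantomReference<Object> phantom = new PhantomReference<>(strong, queue);
        return phantom.get() == null;
    }

    public static void main(String[] args) {
        System.out.println("soft sees live object: " + softSeesLiveObject());
        System.out.println("weak sees live object: " + weakSeesLiveObject());
        System.out.println("phantom get() is null: " + phantomGetIsNull());
    }
}
```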
Don't be intimidated by these concepts, and don't worry: we have not wandered off topic, and going any deeper is beyond the scope of this article. The purpose of introducing these four concepts is to point out that both the reference counting algorithm and the reachability analysis algorithm work on strong references.
2.4 An object's last struggle before death (reclamation)
Even objects found unreachable by reachability analysis are not necessarily "dead". At that point they are only on "probation"; to truly declare an object dead, it has to go through at least two marking passes.
First mark: if reachability analysis finds that the object has no reference chain connecting it to the GC Roots, it is marked for the first time.
Second mark: after the first mark comes a filtering step, and the criterion is whether it is necessary to execute the object's finalize() method. If the object does not re-associate itself with a reference chain in its finalize() method, it is marked for the second time.
An object that is successfully marked the second time will really be reclaimed. If the object re-associates itself with a reference chain in its finalize() method, it escapes this collection and continues to survive. Still following along, fellow programmers?
2.5 How the method area determines if recycling is required
Fellow programmers, the method area stores different content, so the way to judge whether something there needs to be reclaimed is different as well. The method area mainly reclaims two things: obsolete constants and useless classes. Obsolete constants can also be judged by reference reachability, but a class is useless only if it satisfies all of the following 3 conditions:
- All instances of the class have been reclaimed, i.e. no instance of the class exists in the Java heap;
- The ClassLoader that loaded the class has been reclaimed;
- The java.lang.Class object of the class is not referenced anywhere, so the methods of the class cannot be accessed anywhere through reflection.
After all that talk, the protagonist finally takes the stage.
III. Common garbage collection algorithms
3.1 Mark-Sweep algorithm
The mark-sweep algorithm scans from the root set (GC Roots) and marks the surviving objects, then scans the whole space and reclaims the objects that were not marked. Mark-sweep does not need to move objects and only deals with the objects that are not alive, so it is extremely efficient when many objects survive. However, because it reclaims dead objects in place, it causes memory fragmentation.
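A toy model can make the fragmentation visible. In the sketch below the heap is just an array of slots and the marking phase is assumed to have already produced the set of live objects; `MarkSweepSketch`, `sweep`, and `largestFreeRun` are hypothetical names, not part of any JVM.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

final class MarkSweepSketch {
    /** Returns a copy of the heap in which every unmarked object's slot is freed (null). */
    static String[] sweep(String[] heap, Set<String> marked) {
        String[] after = heap.clone();
        for (int i = 0; i < after.length; i++) {
            if (after[i] != null && !marked.contains(after[i])) {
                after[i] = null;   // reclaim in place; survivors are not moved
            }
        }
        return after;
    }

    /** Length of the longest run of adjacent free slots, i.e. the largest allocatable block. */
    static int largestFreeRun(String[] heap) {
        int best = 0;
        int current = 0;
        for (String slot : heap) {
            current = (slot == null) ? current + 1 : 0;
            best = Math.max(best, current);
        }
        return best;
    }

    public static void main(String[] args) {
        String[] heap = {"A", "B", "C", "D", "E", "F"};
        Set<String> marked = new HashSet<>(Arrays.asList("A", "C", "E"));
        String[] after = sweep(heap, marked);
        System.out.println(Arrays.toString(after));   // [A, null, C, null, E, null]
        System.out.println(largestFreeRun(after));    // 1
    }
}
```

Three slots are free after the sweep, yet no two of them are adjacent, so an object needing two slots could not be allocated without further work.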
3.2 Copying algorithm
The copying algorithm was proposed to overcome the overhead of handles and to solve the memory fragmentation problem. It starts by dividing the heap into one object space and one or more free spaces. The program allocates objects from the object space. When the object space is full, a copying collection scans the live objects from the root set (GC Roots) and copies each live object to a free space, so that no free holes remain between the memory occupied by live objects. The free space then becomes the object space, the original object space becomes the free space, and the program allocates memory in the new object space.
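The copy step can be sketched with two arrays standing in for the object space and the free space. As before this is a simplified illustration: `CopyingSketch.copyLive` is a hypothetical name, the live set is assumed to come from a scan of the GC Roots, and a real collector would also update every reference to the moved objects.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

final class CopyingSketch {
    /** Copies live objects from the object space into a fresh free space, packed together. */
    static String[] copyLive(String[] fromSpace, Set<String> live) {
        String[] toSpace = new String[fromSpace.length];
        int next = 0;                          // bump pointer into the free space
        for (String obj : fromSpace) {
            if (obj != null && live.contains(obj)) {
                toSpace[next++] = obj;         // survivors end up contiguous, no holes
            }
        }
        return toSpace;                        // the caller now treats this as the object space
    }

    public static void main(String[] args) {
        String[] fromSpace = {"A", "B", "C", "D", "E", "F"};
        Set<String> live = new HashSet<>(Arrays.asList("A", "C", "E"));
        System.out.println(Arrays.toString(copyLive(fromSpace, live)));
        // [A, C, E, null, null, null]
    }
}
```

The price of the hole-free result is that only part of the memory holds objects at any time, since the free space must be kept in reserve.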
3.3 Mark-Compact algorithm
The mark-compact algorithm marks objects in the same way as mark-sweep, but the cleanup differs: after reclaiming the space occupied by dead objects, it moves all surviving objects toward the free space at one end and updates the corresponding pointers. Because mark-compact adds object movement on top of mark-sweep, it costs more, but it solves the memory fragmentation problem.
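The sliding step can be sketched on the same toy heap model. This is only an assumption-laden illustration: `MarkCompactSketch.compact` is a hypothetical name, the marked set is assumed to come from a prior marking pass, and the pointer updates that a real collector must perform are left out.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

final class MarkCompactSketch {
    /**
     * Slides marked objects toward the start of the heap, in place, and frees the rest.
     * Returns the index of the first free slot after compaction.
     */
    static int compact(String[] heap, Set<String> marked) {
        int free = 0;                              // next slot a survivor will be moved into
        for (int i = 0; i < heap.length; i++) {
            String obj = heap[i];
            if (obj != null && marked.contains(obj)) {
                heap[free++] = obj;                // free <= i, so no survivor is overwritten
            }
        }
        Arrays.fill(heap, free, heap.length, null); // everything past the survivors is free
        return free;
    }

    public static void main(String[] args) {
        String[] heap = {"A", "B", "C", "D", "E", "F"};
        Set<String> marked = new HashSet<>(Arrays.asList("A", "C", "E"));
        int firstFree = compact(heap, marked);
        System.out.println(Arrays.toString(heap));  // [A, C, E, null, null, null]
        System.out.println(firstFree);              // 3
    }
}
```

Compared with the copying approach, no second space is needed, but every survivor that moves is an extra write plus a pointer fix-up.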
3.4 Generational collection algorithm
The generational collection algorithm is what most JVM garbage collectors use today. Its core idea is to divide memory into several regions according to how long objects live. In general, the heap is divided into the old generation (Tenured Generation) and the young generation (Young Generation), and outside the heap there is one more generation, the permanent generation (Permanent Generation). In the old generation only a small number of objects need to be reclaimed in each collection, while in the young generation a large number of objects need to be reclaimed each time, so the most suitable collection algorithm can be chosen for each generation according to its characteristics.
3.4.1 Young generation collection algorithm
a) All newly created objects are first placed in the young generation. The goal of the young generation is to collect short-lived objects as quickly as possible.
b) The young generation's memory is divided into one Eden space and two Survivor spaces (Survivor0 and Survivor1), in general in an 8:1:1 ratio. Most objects are created in Eden. During a collection, the survivors in Eden are copied to Survivor0 and Eden is emptied. When Survivor0 is full, the survivors of Eden and Survivor0 are copied to Survivor1, and Eden and Survivor0 are emptied. Survivor0 and Survivor1 then swap roles so that one Survivor space is always kept empty, and the cycle repeats.
c) When Survivor1 is not large enough to hold the surviving objects of Eden and Survivor0, the surviving objects are placed directly in the old generation. If the old generation is also full, a Full GC is triggered, which collects both the young generation and the old generation.
d) A GC of the young generation is also called a Minor GC. Minor GC happens frequently (it is not necessarily triggered only when Eden is full).
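The 8:1:1 split above is simple arithmetic, sketched below. `YoungGenLayout.split` is a hypothetical helper; its `survivorRatio` parameter is modeled on HotSpot's `-XX:SurvivorRatio` option (the ratio of Eden to one Survivor space), and a real JVM additionally rounds the sizes to its own alignment, which this sketch ignores.

```java
final class YoungGenLayout {
    /**
     * Splits a young generation of the given size into {eden, survivor0, survivor1}.
     * survivorRatio is the size of Eden relative to ONE survivor space (8 gives 8:1:1).
     */
    static long[] split(long youngSize, int survivorRatio) {
        long survivor = youngSize / (survivorRatio + 2);  // ratio parts of Eden + 2 survivors
        long eden = youngSize - 2 * survivor;             // Eden takes whatever remains
        return new long[] {eden, survivor, survivor};
    }

    public static void main(String[] args) {
        long[] parts = split(100, 8);   // e.g. a 100 MB young generation
        System.out.println("eden=" + parts[0] + " s0=" + parts[1] + " s1=" + parts[2]);
        // eden=80 s0=10 s1=10
    }
}
```

Because one Survivor space is always empty, only 90% of the young generation is usable for objects at any moment in this layout.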
3.4.2 Old generation collection algorithm
a) Objects that are still alive in the young generation after N garbage collections are moved to the old generation. The old generation can therefore be regarded as holding objects with longer life cycles.
b) Its memory is considerably larger than the young generation's (the young-to-old ratio is roughly 1:2). When the old generation is full, a Major GC, i.e. a Full GC, is triggered. Full GC happens relatively infrequently, and objects in the old generation live long and have a high survival rate.
3.4.3 Permanent generation collection algorithm
The permanent generation stores static data such as Java classes and methods. It has no significant impact on garbage collection, but some applications, such as those using Hibernate, dynamically generate or load classes at run time, and then a fairly large permanent generation is needed to hold the classes added while running. The permanent generation is also called the method area (this applies to HotSpot before JDK 8, where it was replaced by Metaspace); for how it is reclaimed, see section 2.5 above.
Keep going, fellow programmers, the offer is not far away!!!
IV. Common garbage collectors
The following figure lists all the collectors included in the HotSpot virtual machine (the figure is borrowed):
- Serial collector (copying algorithm)
A single-threaded young-generation collector: both marking and sweeping run on a single thread. Its advantage is that it is simple and efficient. It is the default GC mode at the client level and can be forced with -XX:+UseSerialGC.
- Serial Old collector (mark-compact algorithm)
A single-threaded old-generation collector; the old-generation version of the Serial collector.
- ParNew collector (stop-copy algorithm)
A young-generation collector that can be regarded as the multithreaded version of the Serial collector. It performs better than Serial in multi-core CPU environments.
- Parallel Scavenge collector (stop-copy algorithm)
A parallel collector that pursues high throughput and uses the CPU efficiently. Throughput is typically around 99%, where throughput = user thread time / (user thread time + GC thread time). It suits scenarios such as background applications that do not have high requirements for interactive response. It is the default GC mode at the server level; it can be forced with -XX:+UseParallelGC, and the number of threads can be specified with, for example, -XX:ParallelGCThreads=4.
- Parallel Old collector (mark-compact algorithm)
The old-generation version of the Parallel Scavenge collector; a parallel collector that puts throughput first.
- CMS (Concurrent Mark Sweep) collector (mark-sweep algorithm)
High concurrency and low pauses: it pursues the shortest GC pause time at the cost of relatively high CPU usage. Response is fast and pauses are short, which makes it the choice for multi-core CPUs when fast response time is the goal.
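The throughput formula quoted for the Parallel Scavenge collector can be checked with a one-line calculation. The sketch below assumes the two times are measured in the same unit; `Throughput.of` is a hypothetical helper name.

```java
final class Throughput {
    /** throughput = user thread time / (user thread time + GC thread time). */
    static double of(double userTime, double gcTime) {
        return userTime / (userTime + gcTime);
    }

    public static void main(String[] args) {
        // 99 minutes of application work and 1 minute of GC out of 100 minutes in total
        System.out.println(of(99, 1));   // 0.99
    }
}
```

In other words, a throughput of 99% means that out of every 100 minutes of run time, about 1 minute is spent in garbage collection.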
V. When is GC triggered? (one of the most common interview questions)
Because objects are handled by generation, the areas and the timing of garbage collection differ as well. There are two types of GC: Scavenge GC and Full GC.
5.1 Scavenge GC
In general, a Scavenge GC is triggered when a new object is created and the request for space in Eden fails. It collects the Eden space, clears the objects that are not alive, and moves the surviving objects to a Survivor space, and then tidies up the two Survivor spaces. This kind of GC works on the Eden space of the young generation and does not affect the old generation. Because most objects start out in Eden, and Eden is not allocated much space, GC in Eden is frequent. It therefore generally needs a fast and efficient algorithm so that Eden becomes free again as soon as possible.
5.2 Full GC
A Full GC cleans up the entire heap, including Young, Tenured, and Perm. Because it has to collect the whole heap, a Full GC is slower than a Scavenge GC, so the number of Full GCs should be kept as low as possible. A large part of JVM tuning work is tuning Full GC. A Full GC may be caused by the following:
a) The old generation (Tenured) is full;
b) The permanent generation (Perm) is full;
c) System.gc() is called explicitly;
d) The allocation strategy for the heap's regions changed dynamically after the last GC.
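One way to watch these triggers from inside a program is the standard `java.lang.management` API, which exposes one `GarbageCollectorMXBean` per collector with its collection count and accumulated time. The sketch below is an observation aid only: the class name `GcObserver` is made up, the bean names and counts depend on which collector the JVM was started with, and `System.gc()` is merely a request that the JVM is free to ignore.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

final class GcObserver {
    /** One line per collector known to this JVM: its name, collection count, and time. */
    static List<String> describeCollectors() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            lines.add(bean.getName()
                    + ": count=" + bean.getCollectionCount()
                    + ", timeMs=" + bean.getCollectionTime());
        }
        return lines;
    }

    public static void main(String[] args) {
        describeCollectors().forEach(System.out::println);
        System.gc();   // explicit request, cause c) above; the JVM may or may not honor it
        describeCollectors().forEach(System.out::println);
    }
}
```

Comparing the two printouts typically shows whether the explicit request led to a collection on the JVM in use.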