JVM learning notes 1 (garbage collection algorithms)
I. Reasons for the garbage collection mechanism
In Java, when no reference points to the memory originally allocated to an object, that memory becomes garbage, and a system-level JVM thread releases it automatically. Garbage collection means that objects the program no longer needs are treated as "useless information" and discarded: once an object is no longer referenced, the space it occupied is reclaimed so that new objects can use it. Besides releasing useless objects, garbage collection can also remove memory fragmentation. As objects are created and the collector frees the space of discarded ones, fragments appear; fragments are free memory holes between the blocks allocated to objects. Defragmentation moves the occupied heap memory to one end of the heap, and the JVM then allocates the consolidated free memory to new objects.
Advantages of garbage collection: garbage collection releases memory automatically and reduces the programming burden. First, it improves programming efficiency. Without a garbage collection mechanism, a programmer may spend a lot of time tracking down an obscure memory problem; when programming in Java, garbage collection greatly reduces that time. Second, it protects program integrity; garbage collection is an important part of Java's security policy.
Disadvantages of garbage collection: first, its overhead affects program performance. The Java virtual machine must track the useful objects in a running program and eventually release the useless ones, and this takes processing time. Second, garbage collection algorithms are imperfect: some earlier algorithms could not guarantee that 100% of the discarded memory was collected. With continued improvements in garbage collection algorithms and in the efficiency of software and hardware, these problems are gradually being solved.
II. Garbage collection algorithm analysis
1. Reference counting algorithm:
Reference counting is the only method described here that does not use a root set for garbage collection. It uses a reference counter to distinguish live objects from objects that are no longer in use.
The reference counting algorithm gives each object a counter. Each time the object is referenced somewhere, the counter is incremented by 1; each time a reference to it becomes invalid, the counter is decremented by 1. When the counter reaches 0, the JVM considers the object no longer in use, and it is "garbage".
Reference counting is simple and efficient, but it cannot handle circular references (object A references object B and object B references object A, but neither A nor B is referenced by any other object). In addition, incrementing and decrementing the counters adds considerable overhead. For these reasons, this algorithm has not been used since JDK 1.1.
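The circular reference problem can be illustrated with a toy model. The sketch below is only an illustration, not how any JVM manages memory; `Node`, `addRef`, and `release` are hypothetical names made up for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of reference counting (hypothetical; not JVM internals).
class RefCountDemo {
    // A simulated heap object that carries its own reference counter.
    static final class Node {
        final String name;
        int refCount = 0;
        final List<Node> fields = new ArrayList<>();
        Node(String name) { this.name = name; }
    }

    // Storing a reference to target increments its counter.
    // holder == null models a reference held by a root such as a local variable.
    static void addRef(Node holder, Node target) {
        if (holder != null) {
            holder.fields.add(target);
        }
        target.refCount++;
    }

    // Dropping a root reference decrements the counter.
    static void release(Node target) {
        target.refCount--;
    }

    // Builds A <-> B, drops both root references, and returns the counters.
    static int[] cycleCounts() {
        Node a = new Node("A");
        Node b = new Node("B");
        addRef(null, a);  // root -> A
        addRef(null, b);  // root -> B
        addRef(a, b);     // A -> B
        addRef(b, a);     // B -> A
        release(a);       // root reference to A goes away
        release(b);       // root reference to B goes away
        return new int[] { a.refCount, b.refCount };
    }

    public static void main(String[] args) {
        int[] counts = cycleCounts();
        // Neither counter reaches 0, so a pure reference counter never frees A or B.
        System.out.println("A=" + counts[0] + ", B=" + counts[1]);
    }
}
```

Both counters stay at 1 even though the program can no longer reach either object, which is exactly the case a pure counter cannot reclaim.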
2. Root search method:
Most garbage collection algorithms use the concept of a root set. The root set is the set of reference variables that the running Java program can access directly (including local variables, parameters, and class variables); the program uses these reference variables to access object fields and call object methods. Garbage collection first determines which objects are reachable from the roots and which are not. Every object reachable from the root set, directly or indirectly, is a live object and cannot be reclaimed as garbage. Objects that cannot be reached from the root set by any path meet the conditions for garbage collection and should be reclaimed. Several common algorithms are described below.
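Reachability from a root set is essentially a graph traversal. The following is a minimal sketch under simplifying assumptions: `Obj` and `reachableFrom` are hypothetical names, and a real collector walks thread stacks, static fields, and other roots rather than an explicit list.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Toy reachability analysis over a simulated object graph (hypothetical model).
class ReachabilityDemo {
    static final class Obj {
        final String name;
        final List<Obj> refs = new ArrayList<>();
        Obj(String name) { this.name = name; }
    }

    // Returns every object reachable, directly or indirectly, from the roots.
    static Set<Obj> reachableFrom(List<Obj> roots) {
        Set<Obj> seen = Collections.newSetFromMap(new IdentityHashMap<Obj, Boolean>());
        Deque<Obj> work = new ArrayDeque<>(roots);
        while (!work.isEmpty()) {
            Obj current = work.pop();
            if (seen.add(current)) {       // first visit: follow its references
                work.addAll(current.refs);
            }
        }
        return seen;
    }

    // Returns liveness of A, B, C, D where only A is a root.
    static boolean[] demo() {
        Obj a = new Obj("A");
        Obj b = new Obj("B");
        Obj c = new Obj("C");
        Obj d = new Obj("D");
        a.refs.add(b);  // root -> A -> B
        c.refs.add(d);  // C and D reference each other
        d.refs.add(c);  // but nothing reachable from the root points at them
        Set<Obj> live = reachableFrom(Collections.singletonList(a));
        return new boolean[] {
            live.contains(a), live.contains(b), live.contains(c), live.contains(d)
        };
    }

    public static void main(String[] args) {
        boolean[] live = demo();
        System.out.println("A=" + live[0] + " B=" + live[1] + " C=" + live[2] + " D=" + live[3]);
    }
}
```

Unlike reference counting, the C and D cycle is correctly found to be garbage because no path from the root set leads to it.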
2.1 Mark-sweep algorithm (Mark-Sweep)
Principle: the mark-sweep algorithm has two phases, "mark" and "sweep". The mark phase identifies all objects to be reclaimed and marks them (implementations typically do this by marking every object reachable from the root set, which leaves the rest as garbage). The sweep phase follows the mark phase and reclaims the unreachable objects.
Disadvantages: mark-sweep is the most basic collection algorithm. Neither the mark phase nor the sweep phase is very efficient, and sweeping leaves behind a large number of discontiguous free fragments. As a result, when the program later needs to allocate a large object, there may not be enough contiguous space for it.
[Figure: heap before and after mark-sweep garbage collection. Green: live objects; red: reclaimable objects; white: unused space.]
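The two phases and the resulting fragmentation can be sketched on a simulated heap. This is a toy model under stated assumptions: the heap is an array of slots, `Obj`, `mark`, `sweep`, and `layout` are hypothetical names, and marking flags the reachable objects so that the sweep frees whatever is left unmarked.

```java
import java.util.ArrayList;
import java.util.List;

// Toy mark-sweep over an array of heap slots (hypothetical model).
class MarkSweepDemo {
    static final class Obj {
        boolean marked;
        final List<Obj> refs = new ArrayList<>();
    }

    // Mark phase: flag everything reachable from the given object.
    static void mark(Obj obj) {
        if (obj == null || obj.marked) {
            return;
        }
        obj.marked = true;
        for (Obj ref : obj.refs) {
            mark(ref);
        }
    }

    // Sweep phase: free unmarked slots in place and reset marks on survivors.
    static int sweep(Obj[] heap) {
        int freed = 0;
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] == null) {
                continue;
            }
            if (heap[i].marked) {
                heap[i].marked = false;
            } else {
                heap[i] = null;
                freed++;
            }
        }
        return freed;
    }

    // 'L' = live object, '.' = free slot.
    static String layout(Obj[] heap) {
        StringBuilder sb = new StringBuilder();
        for (Obj o : heap) {
            sb.append(o == null ? '.' : 'L');
        }
        return sb.toString();
    }

    static String demo() {
        Obj[] heap = new Obj[6];
        for (int i = 0; i < heap.length; i++) {
            heap[i] = new Obj();
        }
        heap[0].refs.add(heap[2]);  // slot 0 -> slot 2 -> slot 4
        heap[2].refs.add(heap[4]);
        mark(heap[0]);              // slot 0 is the only root
        sweep(heap);
        return layout(heap);
    }

    public static void main(String[] args) {
        System.out.println(demo());  // prints L.L.L.
    }
}
```

After the sweep, three slots are free but no two of them are adjacent, so an object needing two contiguous slots could not be allocated. That is the fragmentation problem described above.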
2.2 Copying algorithm (Copying)
Principle: the copying algorithm divides memory into two equal halves and uses only one of them at a time. When garbage collection runs, the surviving objects are copied to the other half, and then the half that was in use is cleared in one step.
Disadvantages: the copying algorithm is easy to implement and runs efficiently, but because only half of the memory can be used at any time, memory utilization is low. Current JVMs use the copying method to collect the young generation. Most objects in the young generation (98%) die shortly after they are created, so the two memory blocks are not split equally (about ).
[Figure: heap before and after copying garbage collection. Green: live objects; red: reclaimable objects; white: unused space.]
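A minimal sketch of the copying step, assuming liveness has already been determined. The two arrays stand in for the two halves of memory; `collect` and the object names are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Toy semispace copying collector (hypothetical model).
class CopyingDemo {
    // Copies live objects contiguously into toSpace, then wipes fromSpace.
    // Returns the number of survivors, which is also the next free index in toSpace.
    static int collect(String[] fromSpace, String[] toSpace, Set<String> live) {
        int next = 0;
        for (String obj : fromSpace) {
            if (obj != null && live.contains(obj)) {
                toSpace[next++] = obj;
            }
        }
        Arrays.fill(fromSpace, null);  // the old half is cleared in one step
        return next;
    }

    static String[] demo() {
        String[] from = { "A", "x", "B", "y", "C", null };
        String[] to = new String[from.length];
        Set<String> live = new HashSet<>(Arrays.asList("A", "B", "C"));
        collect(from, to, live);
        return to;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(demo()));  // prints [A, B, C, null, null, null]
    }
}
```

The survivors end up packed at the start of the new half, so there is no fragmentation and allocation can simply continue from the next free index. The cost is that one half is always idle.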
2.3 Mark-compact algorithm (Mark-Compact)
Principle: the mark phase of the mark-compact algorithm is the same as in the mark-sweep algorithm. Afterwards, instead of copying the surviving objects to another memory area, the algorithm moves them toward one end of the memory and then directly reclaims everything beyond the boundary.
The mark-compact algorithm improves memory utilization and is suitable for the old generation, where objects stay alive for a long time.
[Figure: heap before and after mark-compact garbage collection. Green: live objects; red: reclaimable objects; white: unused space.]
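The compaction step can be sketched as sliding marked objects toward the start of a simulated heap. This is a toy model: `Obj` and `compact` are hypothetical names, marking is assumed to have happened already, and a real collector would also have to update every reference to a moved object.

```java
// Toy compaction phase of mark-compact (hypothetical model).
class MarkCompactDemo {
    static final class Obj {
        final String name;
        boolean marked;
        Obj(String name) { this.name = name; }
    }

    // Slides marked objects toward index 0 and frees everything else.
    // Returns the boundary: the first free slot after compaction.
    static int compact(Obj[] heap) {
        int free = 0;
        for (int i = 0; i < heap.length; i++) {
            Obj current = heap[i];
            heap[i] = null;             // vacate the slot
            if (current != null && current.marked) {
                current.marked = false; // reset the mark for the next cycle
                heap[free++] = current; // free <= i always holds here
            }
        }
        return free;
    }

    static String demo() {
        String[] names = { "A", "x", "B", "y", "C" };
        Obj[] heap = new Obj[6];
        for (int i = 0; i < names.length; i++) {
            heap[i] = new Obj(names[i]);
        }
        // Assume the mark phase found A, B, and C to be live.
        heap[0].marked = true;
        heap[2].marked = true;
        heap[4].marked = true;
        int boundary = compact(heap);
        StringBuilder sb = new StringBuilder();
        for (Obj o : heap) {
            sb.append(o == null ? "." : o.name);
        }
        return sb + " boundary=" + boundary;
    }

    public static void main(String[] args) {
        System.out.println(demo());  // prints ABC... boundary=3
    }
}
```

Compared with the copying sketch, no second memory area is needed; compared with mark-sweep, the free space after the boundary is one contiguous block.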
The four algorithms above are the most basic garbage collection algorithms, and other algorithms are built on them. I personally think of those others more as memory management strategies, or even as combinations of the basic algorithms, much like the relationship between the four arithmetic operations and equations, so they are described separately.
3. Incremental algorithms
These algorithms are generally composed of the basic algorithms. Their core idea is to divide memory into regions and apply different memory management policies to different regions at different times, which prevents the program from being suspended for a long time by a collection over the whole heap. They generally include the following types:
3.1 Train algorithm
This algorithm divides the mature object space into fixed-length memory blocks, and each collection runs within a single block. Each block belongs to a set. The detailed steps of this algorithm are complex and it has no specific application scenario at present; study it on your own if you are interested.
3.2 Generational collection algorithm
This algorithm is used in the Sun/Oracle HotSpot virtual machine, and it is the one Java programmers need to focus on.
This algorithm is derived from an analysis of object lifecycles. It divides the heap into three parts: the young generation, the old generation, and the permanent generation (equivalent to the method area). Objects with different lifetimes are stored in different regions and collected with different algorithms. The execution details of this algorithm will be covered in later study notes.
For more information about generation collection, see http://www.importnew.com/19255.html
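One way to see the generational layout on a running JVM is to list its memory pools through the standard `java.lang.management` API. This is only a sketch: the pool names depend on the JVM version and the collector in use, so no particular output is assumed, and on HotSpot from JDK 8 onward the permanent generation no longer appears because it was replaced by Metaspace. The class name `MemoryPoolsDemo` is made up for this example.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.util.ArrayList;
import java.util.List;

// Lists the memory pools the running JVM exposes (names vary by JVM and collector).
class MemoryPoolsDemo {
    // Names of the pools that belong to the Java heap.
    static List<String> heapPoolNames() {
        List<String> names = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == MemoryType.HEAP) {
                names.add(pool.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            System.out.println(pool.getType() + ": " + pool.getName());
        }
    }
}
```

With a generational collector, the heap pools typically correspond to the young generation areas and the old generation, while non-heap pools cover areas such as class metadata.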
The following material on multi-threaded heap garbage collection comes from online sources and is excerpted below:
4. Multi-threaded heap garbage collection technology
4.1 Serial collection
Serial collection uses a single thread for all garbage collection work. Because no interaction between threads is needed, it is easy to implement and efficient. Its limitation is equally obvious: it cannot take advantage of multiple processors, so it suits single-processor machines. It can also be used on a multi-processor machine when the amount of data is small (about MB).
Applicability: applications with a small data volume (about MB), running on a single processor, with no response time requirements.
Disadvantage: it can only be used for small applications.
4.2 Parallel collection
Parallel collection uses multiple threads to perform garbage collection, so it is fast and efficient. In theory, the more CPUs there are, the greater the advantage of the parallel collector.
Applicability: medium and large applications that require high throughput, run on multiple CPUs, and have no requirements for application response time. Examples: background processing and scientific computing.
Disadvantage: application response time may be long.
4.3 Concurrent collection
Serial collection and parallel collection must both suspend the entire runtime environment while garbage collection is running, so the application pauses noticeably, and the larger the heap, the longer the pause. Concurrent collection instead does most of its work while the application threads keep running, which keeps pauses short.
Applicability: medium and large applications that run on multiple CPUs and have high requirements for response time. Examples: web servers/application servers, telecom switching, and integrated development environments.
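To check which collectors a given JVM is actually running, the standard `java.lang.management` API can be queried. This is a sketch, not a tuning recipe: the collector names reported depend on the JVM and its options. On HotSpot, the three styles above have historically been selected with flags such as `-XX:+UseSerialGC`, `-XX:+UseParallelGC`, and `-XX:+UseConcMarkSweepGC` (the last one was removed in JDK 14). The class name `CollectorsDemo` is made up for this example.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Reports the garbage collectors active in the running JVM.
class CollectorsDemo {
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Count and accumulated time (in milliseconds) of collections so far.
            System.out.println(gc.getName()
                    + ": collections=" + gc.getCollectionCount()
                    + ", timeMs=" + gc.getCollectionTime());
        }
    }
}
```

Running the same program under different collector flags and comparing the reported names and times is a simple way to observe the trade-off between throughput and pause time described in this section.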
References:
http://blog.csdn.net/zsuguangh/article/details/6429592
http://blog.csdn.net/ol_beta/article/details/6791229
https://my.oschina.net/GameKing/blog/198347