Java Virtual Machine Learning (IV) -- Garbage Collection Algorithms

Source: Internet
Author: User

1. Garbage collection algorithms

  Implementing a garbage collection algorithm involves many low-level details, and each platform's virtual machine manages memory differently, so this section only introduces the ideas behind several garbage collection algorithms and how they developed.

1.1. Mark-Sweep algorithm

The algorithm has two phases, "mark" and "sweep": it first marks all objects that need to be reclaimed and, once marking is complete, reclaims all marked objects in one pass. It is the most basic collection algorithm; the later algorithms build on this idea and improve on its shortcomings. It has two main shortcomings:

    1. Efficiency: neither the marking phase nor the sweeping phase is efficient.
    2. Space: sweeping leaves behind a large number of discontiguous memory fragments. With too much fragmentation, a later attempt to allocate a large object may fail to find enough contiguous memory, which forces another garbage collection to be triggered early.
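
The two phases can be sketched on a toy object graph. This is only an illustration, not how any real virtual machine lays out memory: `ToyObject`, `MarkSweepSketch`, and the list-based "heap" are hypothetical names made up for the example. Note that the sketch marks the reachable objects and reclaims the unmarked rest, which is the usual way tracing collectors implement the mark phase.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical toy object: a name, outgoing references, and a mark bit.
class ToyObject {
    final String name;
    final List<ToyObject> refs = new ArrayList<>();
    boolean marked; // set during the mark phase

    ToyObject(String name) { this.name = name; }
}

class MarkSweepSketch {
    // Mark phase: flag every object reachable from the roots.
    static void mark(List<ToyObject> roots) {
        Deque<ToyObject> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            ToyObject o = pending.pop();
            if (o.marked) continue;   // already visited (handles cycles)
            o.marked = true;
            pending.addAll(o.refs);
        }
    }

    // Sweep phase: drop unmarked objects and clear the marks on survivors.
    // Returns the number of objects reclaimed.
    static int sweep(List<ToyObject> heap) {
        int before = heap.size();
        heap.removeIf(o -> !o.marked);
        heap.forEach(o -> o.marked = false);
        return before - heap.size();
    }

    static int collect(List<ToyObject> heap, List<ToyObject> roots) {
        mark(roots);
        return sweep(heap);
    }
}
```

Because survivors stay where they are, a real heap swept this way ends up with free gaps scattered between live objects, which is the fragmentation problem described in item 2.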

[Figure: execution of the mark-sweep algorithm]

  

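1.2. Copying algorithm

  The copying algorithm divides the available memory into two blocks of equal size and uses only one of them at a time. When that block is exhausted, the surviving objects are copied to the other block and the used block is then reclaimed in a single pass. Allocation never has to deal with fragmentation, so the algorithm is easy to implement and efficient to run; the cost is that only half of the memory is usable at any time.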
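
As a rough sketch of the idea, the example below models the two blocks as two lists and copies only the reachable objects from one to the other. `Cell`, `CopyingSketch`, and the forwarding map are hypothetical names chosen for illustration; a real collector copies raw memory and stores forwarding pointers in object headers rather than in a map.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical toy object with outgoing references.
class Cell {
    final String name;
    final List<Cell> refs = new ArrayList<>();
    Cell(String name) { this.name = name; }
}

class CopyingSketch {
    List<Cell> fromSpace = new ArrayList<>(); // the block currently in use
    List<Cell> toSpace = new ArrayList<>();   // the idle block

    Cell allocate(String name) {
        Cell c = new Cell(name);
        fromSpace.add(c);
        return c;
    }

    // Copies everything reachable from the roots into toSpace, reclaims
    // fromSpace in one step, and swaps the roles of the two blocks.
    // Returns the relocated roots.
    List<Cell> collect(List<Cell> roots) {
        Map<Cell, Cell> forwarded = new IdentityHashMap<>();
        List<Cell> newRoots = new ArrayList<>();
        for (Cell r : roots) newRoots.add(copy(r, forwarded));
        // Scan the copies; toSpace keeps growing while we walk it.
        for (int i = 0; i < toSpace.size(); i++) {
            Cell c = toSpace.get(i);
            for (int j = 0; j < c.refs.size(); j++) {
                c.refs.set(j, copy(c.refs.get(j), forwarded));
            }
        }
        fromSpace.clear(); // dead objects vanish with the old block
        List<Cell> tmp = fromSpace;
        fromSpace = toSpace;
        toSpace = tmp;
        return newRoots;
    }

    private Cell copy(Cell old, Map<Cell, Cell> forwarded) {
        Cell moved = forwarded.get(old);
        if (moved == null) {
            moved = new Cell(old.name);
            moved.refs.addAll(old.refs); // still old addresses; fixed by the scan
            forwarded.put(old, moved);
            toSpace.add(moved);
        }
        return moved;
    }
}
```

The work done is proportional to the number of survivors, not to the size of the block, which is why the approach pays off when few objects survive.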

[Figure: execution of the copying algorithm]

  

1.3. Mark-Compact algorithm

The marking phase is the same as in mark-sweep. Instead of sweeping the reclaimable objects in place, however, all surviving objects are moved toward one end of the memory, and everything beyond the end boundary is then reclaimed directly.
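
The compaction step can be sketched on an array of slots. This is a simplified illustration under stated assumptions: `CompactSketch` is a hypothetical name, the `live` array stands in for the result of an earlier mark phase, and the sketch ignores the reference updating that a real collector must perform when objects move.

```java
import java.util.Arrays;

class CompactSketch {
    // heap[i] == null means the slot is free; live[i] is the mark result.
    // Slides surviving objects toward index 0 and clears everything past
    // the boundary. Returns the index of the first free slot.
    static int compact(Object[] heap, boolean[] live) {
        int next = 0; // next destination slot for a survivor
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] != null && live[i]) {
                heap[next++] = heap[i];
            }
        }
        Arrays.fill(heap, next, heap.length, null); // reclaim beyond the boundary
        return next;
    }
}
```

After compaction all free memory is one contiguous range starting at the returned index, so a later large allocation does not fail because of fragmentation.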

[Figure: execution of the mark-compact algorithm]

1.4. Generational collection algorithm

Garbage collection in current commercial virtual machines uses the "generational collection" (Generational Collection) algorithm, which divides memory into several blocks according to object lifetime. The Java heap is generally divided into a young generation and an old generation so that each generation can use the collection algorithm that best fits its characteristics. In the young generation, every collection finds that most objects have died and only a few survive, so the copying algorithm is used: the collection completes at the cost of copying only a small number of surviving objects. In the old generation, objects have a high survival rate and there is no extra space to act as an allocation guarantee, so the "mark-sweep" or "mark-compact" algorithm must be used.
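
One way to see this division on a running JVM is to list its heap memory pools and the collectors that manage them through the standard `java.lang.management` API. The class name `HeapPoolsSketch` is made up for this example, and the pool and collector names that are printed depend on the JVM and on the collector selected, so no particular output is assumed here.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class HeapPoolsSketch {
    // Names of the memory pools that belong to the Java heap.
    static List<String> heapPoolNames() {
        List<String> names = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == MemoryType.HEAP) {
                names.add(pool.getName());
            }
        }
        return names;
    }

    // One line per collector: its name and the pools it is responsible for.
    static List<String> collectorSummaries() {
        List<String> out = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            out.add(gc.getName() + " manages " + Arrays.toString(gc.getMemoryPoolNames()));
        }
        return out;
    }

    public static void main(String[] args) {
        heapPoolNames().forEach(System.out::println);
        collectorSummaries().forEach(System.out::println);
    }
}
```

With a generational collector, the output typically shows separate pools for the young and old generations, each attached to a different collector.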

[Figure: execution of the generational collection algorithm]

  

2. HotSpot's implementation of liveness analysis and garbage collection

2.1. Enumerating root nodes

The nodes that can serve as GC Roots are found mainly in global references (such as constants or class static fields) and in execution contexts (such as the local variable tables in stack frames). These areas can be large, so checking the references in them one by one consumes a lot of time. In addition, an accurate reachability analysis requires the executing threads to be suspended so that reference relationships do not change while the analysis runs.

Current mainstream Java virtual machines use exact GC (exact memory management: the virtual machine knows precisely what type of data is stored at a given memory location). So when execution is paused, the virtual machine does not have to check every execution context and global reference location exhaustively; it has a way of knowing directly where object references are stored. HotSpot uses a set of data structures called OopMap for this purpose. When class loading completes, HotSpot computes what type of data sits at which offset within an object, and during JIT compilation it also records, at specific locations, which stack slots and registers hold references. The GC can read this information directly when it scans.
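
The benefit of such a map can be illustrated with a toy model. This is not HotSpot's OopMap format; `ToyOopMap` is a hypothetical class that records, for one stack frame, which slot indexes hold references, so that root enumeration reads only those slots instead of inspecting every one.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Toy stand-in for an OopMap: one bit per frame slot, set when the slot
// holds an object reference rather than a primitive value.
class ToyOopMap {
    private final BitSet referenceSlots = new BitSet();

    ToyOopMap markReference(int slot) {
        referenceSlots.set(slot);
        return this;
    }

    // Root enumeration visits only the recorded slots.
    List<Object> enumerateRoots(Object[] frameSlots) {
        List<Object> roots = new ArrayList<>();
        for (int i = referenceSlots.nextSetBit(0);
             i >= 0 && i < frameSlots.length;
             i = referenceSlots.nextSetBit(i + 1)) {
            if (frameSlots[i] != null) {
                roots.add(frameSlots[i]);
            }
        }
        return roots;
    }
}
```

For a frame whose slots 1 and 3 were recorded as references, `enumerateRoots` returns exactly the contents of those two slots and never looks at the others.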

2.2. Safepoints

Generating an OopMap for every instruction would consume a great deal of space and make the GC expensive. HotSpot does not do this; it records the information only at specific locations, which are called "safepoints". In other words, a running program cannot stop for a GC at any arbitrary place, only when it reaches a safepoint. Safepoints are chosen essentially by asking whether an instruction can let the program run for a long time. The most obvious sign of this is instruction sequence reuse, such as method calls and loop jumps.

The next question is how to get all threads (excluding threads executing JNI calls) to "run" to the nearest safepoint and pause there when a GC occurs. There are two approaches: preemptive suspension and voluntary suspension.

      • Preemptive suspension: the thread's executing code does not need to cooperate. When a GC occurs, all threads are interrupted first; if a thread is found to have been interrupted somewhere other than a safepoint, it is resumed and allowed to "run" to a safepoint. Almost no virtual machine implementation uses this approach today.

      • Voluntary suspension: when the GC needs to suspend threads, it does not operate on them directly. It simply sets a flag, and each thread polls this flag as it executes; when a thread finds the flag set, it suspends itself.

The polling locations coincide with the safepoints, plus the places where memory has to be allocated for a new object.
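
Voluntary suspension can be imitated at the application level with a polled flag. The sketch below is only an analogy for the mechanism, not HotSpot's implementation: `SafepointSketch` and its methods are hypothetical, worker threads are expected to call `poll()` at their loop back-edges, and a single requester thread is assumed.

```java
class SafepointSketch {
    private final Object lock = new Object();
    private volatile boolean pollFlag; // the flag that threads poll
    private int running;               // registered, unparked threads; guarded by lock

    void register()   { synchronized (lock) { running++; } }
    void deregister() { synchronized (lock) { running--; lock.notifyAll(); } }

    // Waits on the lock and reports whether the wait was interrupted.
    private boolean waitOnLock() {
        try { lock.wait(); return false; }
        catch (InterruptedException e) { return true; }
    }

    // Called by worker threads at their "safepoints".
    void poll() {
        if (!pollFlag) return; // fast path: nothing requested
        boolean interrupted = false;
        synchronized (lock) {
            running--;
            lock.notifyAll(); // tell the requester one more thread has parked
            while (pollFlag) interrupted |= waitOnLock();
            running++;
        }
        if (interrupted) Thread.currentThread().interrupt();
    }

    // Called by the requester: set the flag, wait until every registered
    // thread has parked itself, run the action, then release the threads.
    void atSafepoint(Runnable action) {
        boolean interrupted = false;
        synchronized (lock) {
            pollFlag = true;
            while (running > 0) interrupted |= waitOnLock();
            try {
                action.run();
            } finally {
                pollFlag = false;
                lock.notifyAll();
            }
        }
        if (interrupted) Thread.currentThread().interrupt();
    }
}
```

A worker would call `register()` once, then `poll()` on every loop iteration, and `deregister()` when it finishes. The requester never touches the workers directly; it only sets the flag and waits, which mirrors the description above.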

2.3. Safe regions

Safepoints solve the problem of how to enter a GC, but one problem remains: if a thread is sleeping or blocked, it cannot respond to the JVM's suspension request and cannot "walk" to a safepoint to suspend itself. The JVM also cannot reasonably wait for such a thread to be given CPU time again. A safe region (Safe Region) is needed to handle this case.

Safe region: a section of code in which reference relationships do not change. It is safe to start a GC anywhere in the region, so a safe region can be seen as an extended safepoint. When a thread reaches code in a safe region, it first marks itself as having entered the region; if the JVM initiates a GC during that time, it does not need to deal with threads that are marked as being in a safe region. When a thread is about to leave the safe region, it checks whether the system has finished root node enumeration (or the whole GC). If it has, the thread simply continues; otherwise it must wait until it receives a signal that it is safe to leave.
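
The enter and leave protocol can be sketched as follows. As with any application-level model, this is an analogy rather than the JVM's mechanism: `SafeRegionSketch`, `beginGc`, and `endGc` are hypothetical names, and the "GC" is just a flag that the leaving thread has to wait on.

```java
class SafeRegionSketch {
    private final Object lock = new Object();
    private boolean gcInProgress; // guarded by lock
    private int inSafeRegion;     // threads currently marked as in a safe region

    // A thread marks itself before it sleeps or blocks.
    void enterSafeRegion() {
        synchronized (lock) { inSafeRegion++; }
    }

    // On the way out, the thread waits until the GC has finished.
    void leaveSafeRegion() {
        boolean interrupted = false;
        synchronized (lock) {
            while (gcInProgress) {
                try { lock.wait(); }
                catch (InterruptedException e) { interrupted = true; }
            }
            inSafeRegion--;
        }
        if (interrupted) Thread.currentThread().interrupt();
    }

    // The collector does not wait for threads inside a safe region.
    void beginGc() {
        synchronized (lock) { gcInProgress = true; }
    }

    // Signals waiting threads that it is safe to leave.
    void endGc() {
        synchronized (lock) { gcInProgress = false; lock.notifyAll(); }
    }

    int threadsInSafeRegion() {
        synchronized (lock) { return inSafeRegion; }
    }
}
```

A thread that tries to leave while the flag is set stays blocked inside `leaveSafeRegion()` until `endGc()` is called, which corresponds to waiting for the signal described above.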
