Garbage collection mechanism and memory allocation policy

Source: Internet
Author: User

We often say that garbage collection (Garbage Collection, GC) needs to determine the following three things:

Which memory needs to be recycled.
When to recycle.
How to recycle.

Before looking at these three things, let's take a look at why we do garbage collection at all.
A program keeps creating instance objects, variables, and other data. Without garbage collection, memory fills up as the program runs, which inevitably degrades performance and causes stuttering, slowness, or even system failures.

Which memory needs to be recycled.

As discussed earlier, the program counter, the virtual machine stack, and the native method stack are thread-private data areas: they are created with a thread and destroyed with it. Because memory allocation and reclamation in these areas are deterministic, there is little need to think about collecting them. The Java heap and the method area are different: allocation and reclamation there are dynamic. This is the memory that garbage collection is concerned with.

When to recycle.

Before we recycle, we have to know which objects are alive and which are dead.
Put simply, an object is dead when no reference to it exists any longer; it can then be recycled.

How to judge whether an object is alive or dead.
Reference counting algorithm

A reference counter is added to each object. Whenever something references the object, the counter is incremented by 1;
when a reference becomes invalid, the counter is decremented by 1. When the counter reaches 0, the object can never
be used again.

Reference counting is simple and efficient; Microsoft's COM, the Python language, and Squirrel in the game scripting field all use reference counting for memory management.
But Java does not use reference counting to manage memory, mainly because it struggles with circular references. When two objects are no longer used they should be recycled, but if they reference each other, neither counter drops to 0, so their memory cannot be reclaimed.
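The circular reference problem can be sketched with a toy counter. This is only an illustration: RefCounted, addField, and RefCountCycleDemo are hypothetical names, and the JVM does not manage memory this way.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy reference-counted object; a hypothetical model, not JVM behavior. */
final class RefCounted {
    final String name;
    int refCount = 0;                                   // number of incoming references
    final List<RefCounted> fields = new ArrayList<>();  // outgoing references

    RefCounted(String name) { this.name = name; }

    /** Stores a reference to target in this object and bumps target's counter. */
    void addField(RefCounted target) {
        fields.add(target);
        target.refCount++;
    }
}

final class RefCountCycleDemo {
    /** Builds a <-> b, drops the external references, and returns the leftover counts. */
    static int[] countsAfterDroppingExternalRefs() {
        RefCounted a = new RefCounted("a");
        RefCounted b = new RefCounted("b");
        a.refCount++;      // external reference held by a local variable
        b.refCount++;
        a.addField(b);     // a -> b
        b.addField(a);     // b -> a
        a.refCount--;      // the local variables are no longer used
        b.refCount--;
        return new int[] { a.refCount, b.refCount };
    }

    public static void main(String[] args) {
        int[] counts = countsAfterDroppingExternalRefs();
        // Both counters stay at 1, so a pure reference counter never frees the pair.
        System.out.println("a=" + counts[0] + ", b=" + counts[1]);   // prints a=1, b=1
    }
}
```

Neither object is used by the program any more, yet each counter is still 1 because of the other object's field.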

Java uses another algorithm to determine whether an object is alive: reachability analysis.
This algorithm takes a set of objects called GC Roots as starting points and searches downward from these nodes. When no path connects the GC Roots to an object, the object is unreachable and is judged to be recyclable. (Such an object is not necessarily reclaimed; as we will see, it still has a chance to save itself.)

As the figure shows: although Object5, Object6, and Object7 reference one another, they are not reachable from the GC Roots, so they are judged to be recyclable objects.
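The search can be sketched as a plain graph traversal. The object graph below is a hypothetical stand-in for the figure (only the fact that object5 to object7 hang off no root comes from the text), and ReachabilityDemo is an illustrative name. The sketch needs Java 9 or later for List.of and Map.of.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class ReachabilityDemo {
    /** Returns every node reachable from the roots by following references. */
    static Set<String> reachable(Map<String, List<String>> refs, List<String> roots) {
        Set<String> marked = new HashSet<>(roots);
        Deque<String> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            String current = pending.pop();
            for (String target : refs.getOrDefault(current, List.of())) {
                if (marked.add(target)) {   // true only on the first visit
                    pending.push(target);
                }
            }
        }
        return marked;
    }

    /** Hypothetical graph: object5..object7 reference each other but no root reaches them. */
    static Set<String> unreachableInExample() {
        Map<String, List<String>> refs = Map.of(
            "root", List.of("object1"),
            "object1", List.of("object2", "object3"),
            "object2", List.of("object4"),
            "object5", List.of("object6", "object7"),
            "object6", List.of("object5"));
        Set<String> all = new TreeSet<>(List.of(
            "object1", "object2", "object3", "object4", "object5", "object6", "object7"));
        all.removeAll(reachable(refs, List.of("root")));
        return all;
    }

    public static void main(String[] args) {
        System.out.println("Recyclable: " + unreachableInExample());
        // prints Recyclable: [object5, object6, object7]
    }
}
```

Unlike the reference counter, the traversal never visits the cycle between object5 and object6, so the cycle does not keep them alive.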

So what exactly are GC Roots? They include the following:
1. Objects referenced in the virtual machine stack (the local variable table in a stack frame)
2. Objects referenced by static class properties in the method area
3. Objects referenced by constants in the method area
4. Objects referenced by native methods in the native method stack

All of the above are references. Traditionally, a reference means that the value stored in a reference-type variable is the starting address of another block of memory.
But that definition is too narrow. Since JDK 1.2, Java has extended the concept and divides references into strong, soft, weak, and phantom references.
Strong reference: an ordinary reference, such as one created with new. As long as the strong reference exists, the referenced object will never be reclaimed.
Soft reference: describes objects that are useful but not required, such as entries in a cache. Such an object can be recycled when memory runs short, while keeping it in memory improves the program's performance.
Weak reference: also describes non-essential objects, but is weaker than a soft reference. An object that is only weakly referenced is recycled at the next garbage collection, regardless of whether memory is sufficient.
Phantom reference: the weakest reference relationship. A phantom reference has no effect on an object's lifetime, and an object instance cannot be obtained through it. Its only purpose is to receive a system notification when the object is reclaimed by the garbage collector.
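The four kinds map onto classes in java.lang.ref. The sketch below checks only the parts that are deterministic; what happens after System.gc() depends on the JVM, so that line is printed without an expected value. ReferenceKindsDemo and its method names are illustrative.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;

final class ReferenceKindsDemo {
    /** Soft and weak references still resolve while a strong reference exists. */
    static boolean resolvesWhileStronglyHeld() {
        Object payload = new Object();                          // strong reference
        SoftReference<Object> soft = new SoftReference<>(payload);
        WeakReference<Object> weak = new WeakReference<>(payload);
        return soft.get() == payload && weak.get() == payload;
    }

    /** A phantom reference never hands the object back. */
    static boolean phantomGetIsAlwaysNull() {
        Object payload = new Object();
        ReferenceQueue<Object> queue = new ReferenceQueue<>();  // where the notification arrives
        PhantomReference<Object> phantom = new PhantomReference<>(payload, queue);
        return phantom.get() == null;
    }

    public static void main(String[] args) {
        System.out.println(resolvesWhileStronglyHeld());   // true
        System.out.println(phantomGetIsAlwaysNull());      // true

        WeakReference<Object> weak = new WeakReference<>(new Object());
        System.gc();   // only a request; the JVM decides whether and when to collect
        System.out.println("weakly referenced object cleared: " + (weak.get() == null));
    }
}
```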

Objects saving themselves.
As we've just said, an object judged recyclable is not reclaimed immediately.
If the object does not override the finalize() method, or if finalize() has already been called by the virtual machine, the object is reclaimed directly.
If the object overrides finalize() and the virtual machine has not yet called it, finalize() is the object's last chance to survive: if it re-associates itself with any object on a reference chain (a path reachable from the GC Roots), it is removed from the set of objects about to be reclaimed when the collector marks for the second time (the reachability analysis from the GC Roots being the first mark).

The following code demonstrates this:

/**
 * An object saving itself.
 * If reachability analysis finds no reference chain from the GC Roots to the object, it is marked
 * for the first time and filtered on whether finalize() needs to run.
 * finalize() is executed at most once per object, so the rescue works only once.
 */
public class FinalizeEscapeGC {
    public static FinalizeEscapeGC SAVE_HOOK = null;

    public void isAlive() {
        System.out.println("I am alive");
    }

    @Override
    protected void finalize() throws Throwable {
        super.finalize();
        System.out.println("finalize method executed");
        FinalizeEscapeGC.SAVE_HOOK = this; // self-rescue: become reachable again
    }

    public static void main(String[] args) throws InterruptedException {
        SAVE_HOOK = new FinalizeEscapeGC();
        // Same steps twice: the first attempt can be rescued, the second cannot.
        for (int attempt = 1; attempt <= 2; attempt++) {
            SAVE_HOOK = null;
            System.gc();
            Thread.sleep(500); // 0.5 s, to give the finalizer thread time to run
            if (SAVE_HOOK != null) {
                SAVE_HOOK.isAlive();
            } else {
                System.out.println("I am dead");
            }
        }
    }
}

Recycling the method area, which is the permanent generation in HotSpot.
Garbage collection in the permanent generation mainly reclaims two things: discarded constants and useless classes.
Reclaiming discarded constants is much like reclaiming objects in the Java heap.
A class must meet all three of the following conditions to be considered useless:

1. All instances of the class have been reclaimed, i.e. no instance of the class exists in the Java heap.
2. The ClassLoader that loaded the class has been reclaimed.
3. The java.lang.Class object of the class is not referenced anywhere, so the methods of the class cannot
   be accessed anywhere through reflection.

How to recycle.

Memory is reclaimed by the garbage collector, and collectors are built on several algorithms, so let's look at the GC algorithms first.

1. Mark-sweep algorithm (mark-sweep):
This is the most basic collection algorithm. It has two phases, marking and sweeping: first all objects that need to be recycled are marked (which objects those are is decided by the reachability analysis described above), and after marking completes all marked objects are reclaimed together.
The algorithm has two drawbacks. First, neither marking nor sweeping is very efficient. Second, sweeping leaves a large number of non-contiguous memory fragments, so when a large object has to be allocated later and no sufficiently large contiguous block can be found, another garbage collection is triggered ahead of time.

2. Copying algorithm (copying):
Memory is divided into two blocks of equal size, and only one is used at a time. When that block is used up, the surviving objects are copied to the other block, and the used block is then cleaned in one pass. The cost of this algorithm is that usable memory is cut in half. Because most objects in the Java young generation die soon after they are created, this algorithm is commonly used to collect the young generation.
In HotSpot the young generation consists of Eden and two Survivor spaces. During a collection, the surviving objects in Eden and in the Survivor space in use are copied to the other Survivor space, and then Eden and the just-used Survivor space are cleaned. When the Survivor space is too small to hold the survivors, the old generation provides an allocation guarantee.
  
3. Mark-compact algorithm (mark-compact):
Objects in the old generation have survived several collections. If the old generation used the copying algorithm, nearly all objects might be alive at collection time, and copying would be very inefficient.
For the old generation there is therefore the mark-compact algorithm: it marks the recyclable objects, then moves all surviving objects toward one end and directly clears the memory beyond the end boundary.
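The difference between the three algorithms shows up on a toy heap. The sketch below is a simplified model, not collector code: the heap is an int array in which 0 means a free slot, the marking phase is assumed to be done already (the set of live object ids is passed in), and ToyCollectors is an illustrative name. It needs Java 9 or later for Set.of.

```java
import java.util.Arrays;
import java.util.Set;

final class ToyCollectors {
    static final int FREE = 0;   // assumption of this model: id 0 marks a free slot

    /** Mark-sweep: free every slot whose object is not live; survivors stay in place. */
    static int[] markSweep(int[] heap, Set<Integer> live) {
        int[] result = heap.clone();
        for (int i = 0; i < result.length; i++) {
            if (!live.contains(result[i])) result[i] = FREE;
        }
        return result;
    }

    /** Copying: move survivors into a second, empty space, packed from the start. */
    static int[] copy(int[] from, Set<Integer> live) {
        int[] to = new int[from.length];   // all slots FREE initially
        int next = 0;
        for (int slot : from) {
            if (live.contains(slot)) to[next++] = slot;
        }
        return to;                         // the whole from space can now be reused
    }

    /** Mark-compact: slide survivors to one end of the same space, then clear the rest. */
    static int[] markCompact(int[] heap, Set<Integer> live) {
        int[] result = heap.clone();
        int next = 0;
        for (int i = 0; i < result.length; i++) {
            if (live.contains(result[i])) result[next++] = result[i];
        }
        Arrays.fill(result, next, result.length, FREE);
        return result;
    }

    /** Longest run of free slots, i.e. the largest object that could still be allocated. */
    static int largestFreeRun(int[] heap) {
        int best = 0, run = 0;
        for (int slot : heap) {
            run = (slot == FREE) ? run + 1 : 0;
            best = Math.max(best, run);
        }
        return best;
    }

    public static void main(String[] args) {
        int[] heap = {1, 2, 3, 4, 5, 6};
        Set<Integer> live = Set.of(1, 3, 5);   // objects 2, 4 and 6 are garbage
        System.out.println(Arrays.toString(markSweep(heap, live)));     // [1, 0, 3, 0, 5, 0]
        System.out.println(largestFreeRun(markSweep(heap, live)));      // 1
        System.out.println(Arrays.toString(markCompact(heap, live)));   // [1, 3, 5, 0, 0, 0]
        System.out.println(largestFreeRun(markCompact(heap, live)));    // 3
        System.out.println(Arrays.toString(copy(heap, live)));          // [1, 3, 5, 0, 0, 0]
    }
}
```

All three free the same three slots, but after mark-sweep the largest allocatable block is 1 slot, while after mark-compact it is 3. Copying gives the same packed layout at the price of a second space.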
  

Garbage collectors:

If garbage collection algorithms are the methodology of memory reclamation, garbage collectors are its concrete implementation.
Because the Java Virtual Machine specification does not prescribe how a garbage collector must be implemented, the collectors offered by different vendors and different versions of virtual machines can differ greatly.
The following figure shows the garbage collectors of the HotSpot virtual machine (a line between two collectors indicates that they can be used together):

Serial collector (serial)

The Serial collector is the most basic and oldest collector, and it is single-threaded.
Single-threaded means not only that it uses one thread to collect garbage but, more importantly, that while it collects, all other worker threads must be suspended until the collection finishes. This is known as "Stop the World".
Nevertheless, Serial is the default young generation collector when the virtual machine runs in client mode. In client mode, pauses of a few dozen milliseconds are acceptable as long as they are not frequent.

ParNew collector (parallel)
The ParNew collector can be understood as the multithreaded version of the Serial collector. It is often the preferred young generation collector for virtual machines running in server mode, because it can work together with CMS.

Parallel Scavenge collector (parallel scavenging)
The Parallel Scavenge collector is also a multithreaded young generation collector, similar to ParNew.
Its focus, however, differs from that of other collectors. Collectors such as CMS aim to shorten the pauses of user threads during garbage collection, whereas Parallel Scavenge aims to reach a controllable throughput, where throughput = time running user code / (time running user code + garbage collection time). It is therefore also called the throughput-first collector.
In addition, Parallel Scavenge has an adaptive tuning strategy: when -XX:+UseAdaptiveSizePolicy is switched on, the virtual machine collects performance monitoring information about the running system and adjusts itself accordingly.
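As a quick worked example of the throughput formula, the sketch below evaluates it for a hypothetical run. The timings are made up for illustration, and ThroughputDemo is not a JVM API.

```java
final class ThroughputDemo {
    /** Throughput = user code time / (user code time + garbage collection time). */
    static double throughput(double userMillis, double gcMillis) {
        return userMillis / (userMillis + gcMillis);
    }

    public static void main(String[] args) {
        // Hypothetical run: 99 minutes of user code and 1 minute of garbage collection.
        double value = throughput(99 * 60_000.0, 60_000.0);
        System.out.printf("throughput = %.2f%n", value);   // throughput = 0.99
    }
}
```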

Serial Old collector
Serial Old is the old generation version of the Serial collector. It is single-threaded and uses the mark-compact algorithm, whereas the three young generation collectors above all use the copying algorithm.

Parallel Old collector
Parallel Old is the old generation version of the Parallel Scavenge collector; it uses multiple threads and the mark-compact algorithm. Where throughput matters and CPU resources are scarce, the combination Parallel Scavenge + Parallel Old is a good choice.

CMS collector (concurrent)
The CMS (Concurrent Mark Sweep) collector aims for the shortest possible collection pause.
CMS is based on the mark-sweep algorithm, and its work is divided into four phases:

Initial mark: marks the objects that the GC Roots reference directly
Concurrent mark: performs reachability analysis from the GC Roots to determine which objects are alive
Remark: corrects the marks of objects that changed during concurrent marking because the user program kept running; its pause is longer than that of the initial mark
Concurrent sweep: reclaims the objects found to be dead

The most time-consuming phases are concurrent mark and concurrent sweep, but both run alongside the user threads.

CMS was close to a milestone among garbage collectors, but it has three drawbacks:

1. It is very sensitive to CPU resources. The concurrent phases do not pause user threads, but because they occupy some threads, the application slows down and total throughput drops. By default CMS starts (number of CPUs + 3) / 4 collector threads, so the fewer CPUs a machine has, the larger the share of them the collector occupies.
2. It cannot handle floating garbage. User threads keep running during the concurrent sweep phase and keep producing garbage, which CMS cannot process in the current collection and has to leave for the next GC. This is called floating garbage. For the same reason CMS cannot wait until the old generation is almost full before collecting; it must reserve part of the space for the program to use while the concurrent collection runs.
3. Because CMS is based on the mark-sweep algorithm, it produces a lot of memory fragmentation. When no contiguous space large enough can be found, a Full GC is triggered. CMS offers the switch -XX:+UseCMSCompactAtFullCollection, which turns on compaction of the fragmented memory when CMS has to perform a Full GC. Compaction cannot run concurrently, so the pause inevitably becomes longer.
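To see how the thread formula from the first drawback plays out, the sketch below evaluates (CPU + 3) / 4 for a few machine sizes. It assumes integer division, and CmsThreadsDemo is an illustrative name rather than anything in the JVM.

```java
final class CmsThreadsDemo {
    /** Default CMS collector thread count per the formula above (integer division assumed). */
    static int cmsThreads(int cpus) {
        return (cpus + 3) / 4;
    }

    public static void main(String[] args) {
        for (int cpus : new int[] {1, 2, 4, 8, 16}) {
            int threads = cmsThreads(cpus);
            System.out.printf("%2d CPUs -> %d GC thread(s), %.0f%% of the CPUs%n",
                    cpus, threads, 100.0 * threads / cpus);
        }
    }
}
```

Under that assumption the collector takes 100% of a single CPU and 50% of two CPUs, and the share settles at 25% for 4, 8, and 16 CPUs, which is why small machines feel CMS the most.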

G1 collector
G1 can use multiple CPUs to shorten stop-the-world pauses. Unlike CMS, G1 is based on the mark-compact algorithm when viewed as a whole and on the copying algorithm locally, so it produces no memory fragmentation and does not trigger a GC early for lack of contiguous space. Besides shortening pauses, G1 offers a predictable pause time model: the user can explicitly specify that, within a time slice of M milliseconds, no more than N milliseconds are spent on garbage collection.
The heap layout under G1 also changes considerably. The Java heap is divided into independent regions of equal size (Region); the young and old generations are no longer physically separated, and each is a set of Regions that need not be contiguous.
The operation of the G1 collector is divided into four phases:

Initial mark, concurrent mark, final mark, and filtering and reclamation

Concurrency and parallelism in garbage collectors:

Parallel: multiple garbage collection threads work in parallel while the user threads wait in a stopped state.
Concurrent: the user threads and the garbage collection thread run at the same time; the user program keeps running while the garbage collector runs on another CPU.

Memory allocation and reclamation policies:

Broadly speaking, objects are allocated on the heap, mainly in the Eden area of the young generation. If the thread-local allocation buffer (TLAB) is enabled, allocation is attempted first in the thread's TLAB.

1. Objects are allocated in Eden first
2. Large objects go directly into the old generation
Objects larger than the value of the -XX:PretenureSizeThreshold parameter are allocated directly in the old generation, which avoids copying large amounts of memory between Eden and the two Survivor areas.
3. Long-lived objects enter the old generation: the virtual machine gives each object an age counter. If an object born in Eden survives its first Minor GC and fits into Survivor, it is moved there and its age is set to 1. Each further Minor GC that the object survives in Survivor increases its age by 1; once the age reaches a certain threshold, the object is promoted to the old generation. The maximum age is set with the -XX:MaxTenuringThreshold parameter.
4. Dynamic age determination
If the total size of all objects of the same age in the Survivor space exceeds half of the Survivor space, objects of that age or older go directly into the old generation.

5. Space allocation guarantee
Before a Minor GC, the virtual machine first checks whether the largest contiguous free space in the old generation is greater than the total size of all objects in the young generation. If it is, the Minor GC is safe. If not, the virtual machine checks whether the HandlePromotionFailure setting allows the guarantee to fail. If it does, the virtual machine goes on to check whether the largest contiguous free space in the old generation is greater than the average size of the objects promoted in previous collections; if so, it attempts a Minor GC, even though this is risky. If the space is smaller, or HandlePromotionFailure does not allow the risk, a Full GC is performed instead.
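The age rules in items 3 and 4 can be sketched as a small function. This follows the rule exactly as worded above; HotSpot's real computation may differ in detail. TenuringDemo and promotionAge are hypothetical names, and the sizes and the maximum age of 15 are made-up inputs.

```java
final class TenuringDemo {
    /**
     * Returns the age at which objects are promoted to the old generation.
     * bytesByAge[a] is the total size of the Survivor objects of age a (index 0 is unused).
     */
    static int promotionAge(long[] bytesByAge, long survivorBytes, int maxTenuringThreshold) {
        for (int age = 1; age < bytesByAge.length && age < maxTenuringThreshold; age++) {
            if (bytesByAge[age] > survivorBytes / 2) {
                return age;              // dynamic rule: this age group exceeds half of Survivor
            }
        }
        return maxTenuringThreshold;     // fixed rule: promote at the configured maximum age
    }

    public static void main(String[] args) {
        long survivorBytes = 1_000;
        // Objects of age 2 take 600 bytes, more than half of the Survivor space.
        System.out.println(promotionAge(new long[] {0, 100, 600, 50, 20}, survivorBytes, 15)); // 2
        // No single age group exceeds 500 bytes, so the maximum age applies.
        System.out.println(promotionAge(new long[] {0, 100, 200, 50, 20}, survivorBytes, 15)); // 15
    }
}
```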

After JDK 6 Update 24 the rule became:

If the largest contiguous free space in the old generation is greater than the total size of the young generation objects or the average size of previous promotions, a Minor GC is performed; otherwise a Full GC is performed.
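The simplified rule amounts to one boolean expression. The sketch below is only a model of the decision as stated here, with hypothetical names (PromotionGuaranteeDemo, GcKind, decide) and made-up sizes.

```java
final class PromotionGuaranteeDemo {
    enum GcKind { MINOR_GC, FULL_GC }

    /**
     * Decision per the rule above: a Minor GC is allowed if the old generation's largest
     * contiguous free space exceeds either the total size of the young generation objects
     * or the average size of previous promotions.
     */
    static GcKind decide(long oldContiguousFree, long youngObjectsTotal, long avgPromoted) {
        boolean enoughRoom = oldContiguousFree > youngObjectsTotal
                || oldContiguousFree > avgPromoted;
        return enoughRoom ? GcKind.MINOR_GC : GcKind.FULL_GC;
    }

    public static void main(String[] args) {
        System.out.println(decide(300, 200, 500));   // MINOR_GC: room for every young object
        System.out.println(decide(100, 200, 50));    // MINOR_GC: room for an average promotion
        System.out.println(decide(40, 200, 50));     // FULL_GC: neither condition holds
    }
}
```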

Why are two Survivor spaces needed?
An answer found online:
In the original copying collection algorithm, the space is divided into two halves called semispaces. Allocation happens in one half, the from space; when from is full or some other condition is met, the reachable objects are copied to the other half, the to space, and from and to are then swapped logically. Plain copying collection handles long-lived objects poorly: such objects are still alive at every collection, so they are copied again and again. As a trade-off, the HotSpot GCs other than G1 are all implemented in a similar way, and the young generation uses a copying collection algorithm. It is, however, a variant of the original algorithm: a larger area is set aside as Eden, and semispaces are placed between Eden and the old generation as a buffer, so that objects with a medium lifetime are collected as far as possible before they enter the old generation. These are HotSpot's Survivor spaces. HotSpot's Parallel Scavenge is a little different from the other GC implementations: the others share a generational GC framework introduced with JDK 1.4, whereas Parallel Scavenge implements similar functionality without that framework.

In short, the two parallel from and to areas exist to filter out the objects that really belong in the old generation (that is, objects that need to be referenced for a long time) before they are placed there.
