The Java heap is a run-time data area from which memory for class instances (objects) is allocated. The heap of the Java Virtual Machine (JVM) stores every object created by the running application. Objects are created by instructions such as new, newarray, anewarray, and multianewarray, and program code never has to release them explicitly. In general, the heap is managed by a garbage collector. The JVM specification does not require any particular garbage collection technique, or even garbage collection at all, but because memory is limited, JVM implementations manage the heap with a garbage collector in practice. Garbage collection is a dynamic storage management technique that automatically frees objects the program no longer references, reclaiming resources according to a specific garbage collection algorithm.
The significance of garbage collection
In C, the memory occupied by an object stays allocated until the program explicitly releases it, and it cannot be given to other objects before then. In Java, the memory allocated to an object becomes garbage once no reference points to that object any longer, and a system-level thread in the JVM frees the memory block automatically. Garbage collection means that objects the program no longer needs are treated as useless information and discarded. When an object is no longer referenced, the collector reclaims the space it occupies so that new objects can use it later. Besides releasing useless objects, garbage collection can also remove memory fragmentation. Fragmentation arises as objects are created and the garbage collector frees the space of discarded objects: fragments are the free holes left between the memory blocks allocated to objects. Defragmentation moves the occupied heap memory to one end of the heap, and the JVM then allocates the consolidated free memory to new objects.
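The idea that an object becomes garbage once no reference points to it can be observed, roughly, with a weak reference. The sketch below is only an illustration: the class name EligibilityDemo and its method are made up for this example, and System.gc() is merely a request, so the second printed value is not guaranteed.

```java
import java.lang.ref.WeakReference;

// Hypothetical demo class; not part of any library.
class EligibilityDemo {
    // While a strong reference is still in use, the object must not be reclaimed.
    static boolean aliveWhileReferenced() {
        Object strong = new Object();
        WeakReference<Object> weak = new WeakReference<>(strong);
        System.gc(); // a request only; the JVM may ignore it
        return weak.get() == strong; // 'strong' is still used here, so the object is reachable
    }

    public static void main(String[] args) {
        System.out.println("alive while referenced: " + aliveWhileReferenced());

        Object obj = new Object();
        WeakReference<Object> weak = new WeakReference<>(obj);
        obj = null;  // no strong reference remains: the object is now garbage
        System.gc(); // the collector may or may not run at this point
        System.out.println("reclaimed after dropping the reference: " + (weak.get() == null));
    }
}
```

The program never frees the object itself; it only drops the reference and leaves the rest to the collector.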
Garbage collection frees memory automatically and reduces the burden on the programmer. This gives the Java virtual machine some advantages. First, it makes programming more productive. Without a garbage collection mechanism, a great deal of time can go into tracking down a hard memory problem; with Java's garbage collection mechanism, that time is greatly reduced. Second, it protects the integrity of the program: garbage collection is an important part of the Java language's security strategy.
A potential drawback of garbage collection is that its overhead affects program performance. The Java virtual machine must track which objects the running program still uses and eventually release the ones it does not, and this work takes processor time. A second drawback is that garbage collection algorithms can be incomplete: some earlier algorithms could not guarantee that 100% of discarded memory would be reclaimed. With the continuing improvement of garbage collection algorithms and the increasing efficiency of software and hardware, these problems can be addressed.
Algorithm analysis of garbage collection
The Java Language Specification does not say which garbage collection algorithm a JVM must use, but any garbage collection algorithm generally has to do two basic things:
(1) Find the objects that are no longer useful;
(2) Reclaim the memory occupied by those objects so that the program can reuse it.
Most garbage collection algorithms use the concept of a root set: the set of reference variables (including local variables, parameters, and class variables) that the executing Java program can access. The program uses these reference variables to read an object's fields and invoke its methods. The garbage collector first has to determine what is reachable from the root set and what is not. Every object reachable from the root set is a live object and cannot be reclaimed as garbage; this includes objects that are only indirectly reachable from the root set. Objects that cannot be reached from the root set by any path are eligible for garbage collection and should be reclaimed. Here are a few common algorithms.
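As a rough illustration of reachability from a root set, the sketch below models heap objects as nodes that hold outgoing references and walks the graph from the roots. The class RootSetDemo, the Node type, and the reachable method are hypothetical names for this example; a real JVM works on its own internal structures.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Hypothetical model: a heap object is a name plus its outgoing references.
class RootSetDemo {
    static final class Node {
        final String name;
        final List<Node> refs = new ArrayList<>();
        Node(String name) { this.name = name; }
    }

    // Returns every object reachable, directly or indirectly, from the roots.
    static Set<Node> reachable(List<Node> roots) {
        Set<Node> live = Collections.newSetFromMap(new IdentityHashMap<Node, Boolean>());
        Deque<Node> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Node n = pending.pop();
            if (live.add(n)) {          // first visit: follow its references too
                pending.addAll(n.refs);
            }
        }
        return live;
    }

    public static void main(String[] args) {
        Node a = new Node("a"), b = new Node("b"), c = new Node("c");
        a.refs.add(b);                  // b is only indirectly reachable, through a
        Set<Node> live = reachable(Arrays.asList(a)); // the root set holds only a
        System.out.println("b live: " + live.contains(b));
        System.out.println("c live: " + live.contains(c)); // c is referenced by nothing
    }
}
```

Anything missing from the returned set corresponds to an object that is eligible for collection.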
1, reference counting method (reference counting collector)
Reference counting is the only garbage collection technique here that does not use the root set. It uses a reference counter to distinguish live objects from objects that are no longer in use. In general, each object in the heap has its own reference counter. When an object is first created and assigned to a variable, its counter is set to 1. Each time the object is assigned to another variable, the counter is incremented by 1. When a variable that refers to the object goes out of scope (the object is no longer used through it), the counter is decremented by 1, and once the counter reaches 0 the object is eligible for garbage collection.
A garbage collector based on reference counting runs quickly and does not interrupt program execution for long periods, which suits programs that must run in real time. However, reference counting adds overhead to program execution: every time an object is assigned to a new variable its counter is incremented by 1, and every time a variable referring to it goes out of scope the counter is decremented by 1. Reference counting also cannot reclaim objects that refer to one another in a cycle, because their counters never drop to 0.
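The counter bookkeeping described above can be sketched by hand. This is only a model: the names RefCountDemo, Counted, retain, and release are invented for the example, and a real collector would update the counters inside the runtime rather than through explicit calls.

```java
// Hypothetical sketch: the counter is maintained by explicit calls.
class RefCountDemo {
    static final class Counted {
        final String name;
        int refCount = 0;          // number of variables currently referring to this object
        boolean reclaimed = false; // set once the counter drops to 0
        Counted(String name) { this.name = name; }
    }

    // Called whenever a variable starts referring to the object.
    static void retain(Counted o) {
        o.refCount++;
    }

    // Called whenever a variable stops referring to the object.
    static void release(Counted o) {
        o.refCount--;
        if (o.refCount == 0) {
            o.reclaimed = true;    // eligible for collection immediately
        }
    }

    public static void main(String[] args) {
        Counted o = new Counted("o");
        retain(o);   // first variable assigned: counter is 1
        retain(o);   // second variable assigned: counter is 2
        release(o);  // one variable goes out of scope: counter is 1
        System.out.println("count=" + o.refCount + " reclaimed=" + o.reclaimed);
        release(o);  // last variable goes out of scope: counter is 0
        System.out.println("count=" + o.refCount + " reclaimed=" + o.reclaimed);
    }
}
```

Every assignment and every scope exit pays for one counter update, which is the overhead the text refers to.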
2, tracing algorithm (tracing collector)
The tracing algorithm was proposed to solve the problems of reference counting, and it uses the concept of the root set. A garbage collector based on the tracing algorithm scans from the root set, identifies which objects are reachable and which are not, and marks the reachable objects in some way, for example by setting one or more bits for each reachable object. Objects left unmarked after the scan are garbage and can be swept away. Because of this scan-and-identify process, a garbage collector based on the tracing algorithm is also called a mark-and-sweep collector.
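A minimal mark-and-sweep sketch follows, assuming a toy heap modeled as a list of objects with a single mark bit each. The names MarkSweepDemo, Obj, mark, sweep, and collect are hypothetical and chosen only for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical toy heap: a list of objects, each carrying one mark bit.
class MarkSweepDemo {
    static final class Obj {
        final String name;
        final List<Obj> refs = new ArrayList<>();
        boolean marked;
        Obj(String name) { this.name = name; }
    }

    // Mark phase: set the mark bit on everything reachable from obj.
    static void mark(Obj obj) {
        if (obj.marked) return;        // already visited; also stops on cycles
        obj.marked = true;
        for (Obj r : obj.refs) mark(r);
    }

    // Sweep phase: remove unmarked objects, then clear the marks of the survivors.
    static int sweep(List<Obj> heap) {
        int before = heap.size();
        heap.removeIf(o -> !o.marked);
        for (Obj o : heap) o.marked = false;
        return before - heap.size();   // number of objects reclaimed
    }

    static int collect(List<Obj> heap, List<Obj> roots) {
        for (Obj r : roots) mark(r);
        return sweep(heap);
    }

    public static void main(String[] args) {
        Obj a = new Obj("a"), b = new Obj("b"), x = new Obj("x"), y = new Obj("y");
        a.refs.add(b);
        x.refs.add(y);
        y.refs.add(x);                 // x and y form a cycle that no root can reach
        List<Obj> heap = new ArrayList<>(Arrays.asList(a, b, x, y));
        int reclaimed = collect(heap, Arrays.asList(a));
        System.out.println("reclaimed " + reclaimed + ", remaining " + heap.size());
    }
}
```

In this model the cycle between x and y is reclaimed because neither object is reachable from the root set.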
3, compacting algorithm (compacting collector)
To solve the heap fragmentation problem, tracing-based garbage collection borrows the idea of the compacting algorithm. During collection, the algorithm moves all live objects to one end of the heap, so that the other end becomes one contiguous free area, and the collector updates every reference to each object it moves so that the reference still identifies the same object at its new location. Implementations of collectors based on the compacting algorithm generally add handles and a handle table.
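The sliding step and the role of a handle table can be sketched on a toy heap. The encoding below is an assumption made for the example: the heap is an int array in which 0 means a free slot, and handles[h] holds the current heap index of the object behind handle h. The names CompactDemo and compact are hypothetical.

```java
import java.util.Arrays;

// Hypothetical toy heap: heap[i] is an object id, or 0 for a free slot.
// handles[h] is the current heap index of the object behind handle h.
class CompactDemo {
    // Slides every live object toward index 0 and fixes the handle table.
    // Returns the index where the contiguous free area starts.
    static int compact(int[] heap, int[] handles) {
        int free = 0;                       // next slot at the compacted end
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] == 0) continue;     // hole: nothing to move
            if (i != free) {
                heap[free] = heap[i];       // move the object down
                heap[i] = 0;
                for (int h = 0; h < handles.length; h++) {
                    if (handles[h] == i) handles[h] = free; // only the handle changes
                }
            }
            free++;
        }
        return free;
    }

    public static void main(String[] args) {
        int[] heap = {7, 0, 8, 0, 0, 9};    // three objects separated by holes
        int[] handles = {0, 2, 5};          // where each object currently lives
        int freeStart = compact(heap, handles);
        System.out.println(Arrays.toString(heap));
        System.out.println(Arrays.toString(handles));
        System.out.println("free area starts at " + freeStart);
    }
}
```

Because the program reaches objects through handles in this model, only the handle table has to be updated when an object moves, at the cost of one extra indirection on every access.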
4, Copying algorithm (copying collector)
The copying algorithm was proposed to avoid the overhead of handles and to deal with heap fragmentation during garbage collection. It starts by dividing the heap into one object area and several free areas. The program allocates space for objects from the object area. When the object area is full, the collector scans for live objects starting from the root set and copies each live object into a free area, leaving no holes between the live objects' memory. The free area then becomes the object area, the original object area becomes a free area, and the program allocates memory in the new object area.
A typical copying collector is the stop-and-copy algorithm, which divides the heap into an object area and a free area. The program suspends execution while the object area and the free area are switched.
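A stop-and-copy pass can be sketched as follows, under the assumption that each half of the heap is modeled as a plain list and that a forwarding field records where an object has already been copied. The names CopyDemo, Obj, forward, copy, and collect are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model: each half of the heap is a list of objects.
class CopyDemo {
    static final class Obj {
        final String name;
        final List<Obj> refs = new ArrayList<>();
        Obj forward;                       // the copy, once this object has been moved
        Obj(String name) { this.name = name; }
    }

    // Copies obj and everything it references into toSpace; returns the copy.
    static Obj copy(Obj obj, List<Obj> toSpace) {
        if (obj.forward != null) return obj.forward; // already copied
        Obj clone = new Obj(obj.name);
        obj.forward = clone;               // record the new location first (handles cycles)
        toSpace.add(clone);
        for (Obj r : obj.refs) clone.refs.add(copy(r, toSpace));
        return clone;
    }

    // Copies all live objects and redirects the roots; the old half can then be discarded.
    static List<Obj> collect(List<Obj> roots) {
        List<Obj> toSpace = new ArrayList<>();
        for (int i = 0; i < roots.size(); i++) {
            roots.set(i, copy(roots.get(i), toSpace));
        }
        return toSpace;                    // becomes the half the program allocates from
    }

    public static void main(String[] args) {
        Obj a = new Obj("a"), b = new Obj("b"), c = new Obj("c");
        a.refs.add(b);
        b.refs.add(a);                     // a and b reference each other
        List<Obj> roots = new ArrayList<>(Arrays.asList(a)); // c is garbage
        List<Obj> toSpace = collect(roots);
        System.out.println("live objects copied: " + toSpace.size());
    }
}
```

Only live objects are touched in this sketch; garbage such as c is never visited and simply disappears when the old half is dropped.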
5, Generational algorithm (generational collector)
One drawback of the stop-and-copy collector is that it must copy every live object, which increases the program's pause time; this is why the copying algorithm is inefficient. Programs tend to follow a pattern: most objects live only a short time, and a few live a long time. The generational algorithm therefore divides the heap into two or more sub-heaps, each holding one generation of objects. Because most objects are short-lived, the garbage collector reclaims them from the youngest sub-heap as the program discards them. After the generational collector runs, the objects that survived are moved to the sub-heap of the next older generation. Because the sub-heaps of older generations are collected less often, time is saved.
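The promotion step can be sketched with two generations. This is a deliberately simplified model: the reachable flag stands in for a real reachability scan, survivors are promoted after a single collection, and the names GenerationalDemo, Obj, allocate, and minorCollect are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical two-generation heap.
class GenerationalDemo {
    static final class Obj {
        final String name;
        boolean reachable = true;      // stand-in for a real reachability scan
        Obj(String name) { this.name = name; }
    }

    final List<Obj> young = new ArrayList<>(); // youngest sub-heap: all new objects start here
    final List<Obj> old = new ArrayList<>();   // older sub-heap: collected less often

    Obj allocate(String name) {
        Obj o = new Obj(name);
        young.add(o);
        return o;
    }

    // Collects only the young generation: survivors are promoted, the rest reclaimed.
    // Returns the number of objects reclaimed.
    int minorCollect() {
        int reclaimed = 0;
        for (Obj o : young) {
            if (o.reachable) {
                old.add(o);            // survived a collection: move to the older generation
            } else {
                reclaimed++;
            }
        }
        young.clear();
        return reclaimed;
    }

    public static void main(String[] args) {
        GenerationalDemo heap = new GenerationalDemo();
        heap.allocate("longLived");
        heap.allocate("temp1").reachable = false; // discarded by the program
        heap.allocate("temp2").reachable = false;
        int reclaimed = heap.minorCollect();
        System.out.println("reclaimed " + reclaimed + ", promoted " + heap.old.size());
    }
}
```

The old list is not scanned at all during minorCollect, which is where the time saving in the text comes from.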
6, Adaptive Algorithm (adaptive Collector)
In particular situations, some garbage collection algorithms outperform others. A garbage collector based on the adaptive algorithm monitors how the heap is currently being used and selects the garbage collection algorithm that fits.