SOURCE Link: Java garbage collection mechanism
1. The meaning of garbage collection
In C++, the memory occupied by a heap-allocated object stays occupied until it is explicitly released (or the program ends) and cannot be given to other objects in the meantime. In Java, memory becomes garbage once no reference remains to the object that occupies it, and a system-level thread of the JVM frees that memory automatically. Garbage collection thus means that objects the program no longer needs are treated as "useless information" and discarded: when an object is no longer referenced, the collector reclaims the space it occupies so that new objects can use it later.
Besides releasing useless objects, garbage collection can also remove memory fragmentation. Fragmentation arises because objects are created and discarded objects are freed at different times, which leaves free holes between the blocks of memory allocated to live objects. Defragmentation (compaction) moves the occupied heap memory to one end of the heap, and the JVM then allocates new objects from the compacted free space.
Garbage collection frees memory automatically and so reduces the burden on the programmer. This gives the Java virtual machine two advantages. First, it makes programming more productive: without garbage collection, tracking down a difficult memory problem can take a great deal of time, and programming in Java largely avoids that cost. Second, it protects the integrity of the program; garbage collection is an important part of the Java language's security strategy.
One potential drawback of garbage collection is that its overhead affects program performance. The JVM must trace which objects are still in use in the running program and eventually release the ones that are not, and this takes processor time. A second drawback is that collection can be incomplete: some early garbage collection algorithms could not guarantee that 100% of the discarded memory was reclaimed. As collection algorithms and the hardware and software they run on continue to improve, these problems are becoming less significant.
2. Algorithm analysis of garbage collection
The Java Language Specification does not prescribe which garbage collection algorithm a JVM uses, but any garbage collection algorithm typically does two basic things: (1) it discovers useless objects, and (2) it reclaims the memory they occupy so that the program can reuse it.
Most garbage collection algorithms use the concept of a root set: the collection of reference variables (including local variables, parameters, and class variables) that the executing Java program can access directly. The program uses these reference variables to read an object's fields and invoke its methods. Garbage collection first determines which objects are reachable from the root set and which are not. Objects reachable from the root set, including those reachable only indirectly, are live and cannot be reclaimed as garbage. Objects that cannot be reached from the root set by any path are eligible for garbage collection and should be reclaimed. A few common algorithms are described below.
2.1. Reference counting method (Reference counting Collector)
Reference counting is the only garbage collection method described here that does not use the root set; it uses reference counters to distinguish live objects from objects that are no longer in use. In general, each object in the heap has a reference counter. When an object is created and assigned to a variable, its counter is set to 1. Each time the object is assigned to another variable, the counter is incremented by 1; each time a reference to the object goes out of scope or is overwritten, the counter is decremented by 1. Once the counter reaches 0, the object is eligible for garbage collection.
A collector based on reference counting runs quickly and does not interrupt program execution for long periods, which suits programs that must run in real time. However, reference counting adds cost to normal execution, because the counter must be incremented every time the object is assigned to a new variable and decremented every time a reference goes out of scope.
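The counting rules above can be sketched as a toy model. This is only an illustration, not how any real JVM manages memory; the names RcObject, retain, and release are hypothetical. The last part of main shows a known limitation of pure reference counting: two objects that refer to each other keep each other's count above 0.

```java
// Toy model of reference counting; the class and method names are illustrative assumptions.
class RcObject {
    final String name;
    int refCount = 0;        // number of references currently pointing at this object
    RcObject field;          // a single outgoing reference, to keep the sketch small

    RcObject(String name) { this.name = name; }
}

class RefCountDemo {
    // Record a new reference to obj (for example, an assignment to a variable).
    static RcObject retain(RcObject obj) {
        if (obj != null) obj.refCount++;
        return obj;
    }

    // Drop one reference; returns true if the object became collectable.
    static boolean release(RcObject obj) {
        if (obj == null) return false;
        obj.refCount--;
        if (obj.refCount == 0) {
            release(obj.field);   // the object's own outgoing reference goes away with it
            obj.field = null;
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        RcObject a = retain(new RcObject("a"));   // count(a) = 1
        RcObject alias = retain(a);               // count(a) = 2
        System.out.println(release(alias));       // false: one reference is left
        System.out.println(release(a));           // true: the count reached 0

        // Two objects that point at each other.
        RcObject x = retain(new RcObject("x"));
        RcObject y = retain(new RcObject("y"));
        x.field = retain(y);                      // count(y) = 2
        y.field = retain(x);                      // count(x) = 2
        release(x);                               // count(x) = 1
        release(y);                               // count(y) = 1
        // Neither count is 0, although the program no longer holds x or y.
        System.out.println(x.refCount + " " + y.refCount);   // 1 1
    }
}
```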
2.2. Tracing algorithm (tracing Collector)
The tracing algorithm was proposed to solve the problems of reference counting (notably, objects that refer to each other in a cycle never reach a count of 0), and it uses the concept of the root set. A tracing collector scans from the root set to determine which objects are reachable and which are not, and marks the reachable ones in some way, for example by setting one or more bits for each object reached. Because this marking phase is followed by reclaiming everything left unmarked, a collector based on the tracing algorithm is also known as a mark-and-sweep collector.
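A minimal sketch of the mark and sweep phases follows, assuming a toy heap modelled as a list of Node objects. Node, MarkSweepDemo, and collect are hypothetical names chosen for illustration; a real collector works on raw memory, not on Java collections.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy heap object: a name, outgoing references, and one mark bit.
class Node {
    final String name;
    final List<Node> refs = new ArrayList<>();
    boolean marked = false;
    Node(String name) { this.name = name; }
}

class MarkSweepDemo {
    // Mark phase: flag everything reachable from the given object.
    static void mark(Node node) {
        if (node == null || node.marked) return;
        node.marked = true;
        for (Node child : node.refs) mark(child);
    }

    // Sweep phase: remove unmarked objects from the heap and clear the marks on the rest.
    // Returns the names of the reclaimed objects.
    static List<String> collect(List<Node> heap, List<Node> rootSet) {
        for (Node root : rootSet) mark(root);
        List<String> reclaimed = new ArrayList<>();
        heap.removeIf(n -> {
            if (n.marked) { n.marked = false; return false; }
            reclaimed.add(n.name);
            return true;
        });
        return reclaimed;
    }

    public static void main(String[] args) {
        Node a = new Node("a"), b = new Node("b"), c = new Node("c"), d = new Node("d");
        a.refs.add(b);          // a -> b
        c.refs.add(d);          // c -> d
        d.refs.add(c);          // d -> c: a cycle that no root reaches
        List<Node> heap = new ArrayList<>(Arrays.asList(a, b, c, d));
        System.out.println(collect(heap, Arrays.asList(a)));   // [c, d]
    }
}
```

Unlike the counting approach, the unreachable cycle between c and d is reclaimed, because reachability from the root set is the only criterion.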
2.3. Compacting algorithm (compacting Collector)
To solve the problem of heap fragmentation, tracing collectors adopt the idea of compaction: during collection the algorithm moves all live objects to one end of the heap, so that the other end becomes one contiguous free area, and the collector updates every reference to each object it moves so that the reference points to the object at its new location. Implementations of compacting collectors generally add handles and a handle table, so that only the handle table entry needs updating when an object moves.
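The sliding step and the role of the handle table can be sketched as follows. This assumes a toy heap modelled as an array in which null marks a hole left by a freed object; CompactDemo and its array layout are illustrative assumptions, not a real JVM structure.

```java
import java.util.Arrays;

class CompactDemo {
    // heap[i] holds an object's payload, or null where a freed object left a hole.
    // handles[h] is the heap index of the object that handle h refers to (-1 if freed).
    static void compact(String[] heap, int[] handles) {
        int[] newIndex = new int[heap.length];   // forwarding table: old slot -> new slot
        int next = 0;                            // next free slot at the low end of the heap
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] != null) {
                newIndex[i] = next;
                heap[next] = heap[i];
                if (next != i) heap[i] = null;
                next++;
            }
        }
        // Only the handle table is updated; the program keeps using the same handle numbers.
        for (int h = 0; h < handles.length; h++) {
            if (handles[h] >= 0) handles[h] = newIndex[handles[h]];
        }
    }

    public static void main(String[] args) {
        String[] heap = { "A", null, "B", null, null, "C" };
        int[] handles = { 0, 2, 5 };             // handle 0 -> A, 1 -> B, 2 -> C
        compact(heap, handles);
        System.out.println(Arrays.toString(heap));      // [A, B, C, null, null, null]
        System.out.println(Arrays.toString(handles));   // [0, 1, 2]
    }
}
```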
2.4. Copying algorithm (copying Collector)
This algorithm was proposed to avoid the overhead of handles while still dealing with heap fragmentation. It divides the heap into an object area and one or more idle areas. The program allocates objects from the object area; when the object area is full, a copying collector scans the live objects starting from the root set and copies each one to an idle area, packing them so that no gaps remain between them. The idle area then becomes the object area, the old object area becomes the idle area, and the program continues allocating in the new object area.
A typical copying collector is the stop-and-copy algorithm, which divides the heap into an object area and an idle area; the program suspends execution while the collector copies live objects and switches the two areas.
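The copy step can be sketched with a toy object graph. Because a Java program cannot literally move its own objects, the sketch below creates copies in a new list that stands in for the idle area; Cell, CopyingDemo, and the forwarding map are illustrative assumptions.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

class Cell {
    final String name;
    final List<Cell> refs = new ArrayList<>();
    Cell(String name) { this.name = name; }
}

class CopyingDemo {
    // Copy obj and everything reachable from it into toSpace.
    // The forwarding map remembers old object -> new copy, so shared objects
    // and cycles are copied exactly once.
    static Cell copy(Cell obj, List<Cell> toSpace, Map<Cell, Cell> forwarding) {
        Cell already = forwarding.get(obj);
        if (already != null) return already;
        Cell moved = new Cell(obj.name);
        forwarding.put(obj, moved);
        toSpace.add(moved);
        for (Cell child : obj.refs) moved.refs.add(copy(child, toSpace, forwarding));
        return moved;
    }

    // Returns the new object area; the old one would then be reused as the idle area.
    static List<Cell> collect(List<Cell> roots) {
        List<Cell> toSpace = new ArrayList<>();
        Map<Cell, Cell> forwarding = new IdentityHashMap<>();
        for (int i = 0; i < roots.size(); i++) {
            roots.set(i, copy(roots.get(i), toSpace, forwarding));   // roots now point at the copies
        }
        return toSpace;
    }

    public static void main(String[] args) {
        Cell a = new Cell("a"), b = new Cell("b"), garbage = new Cell("garbage");
        a.refs.add(b);
        b.refs.add(a);                  // a cycle between two live objects
        garbage.refs.add(a);            // garbage may point at live objects; it is still not copied
        List<Cell> roots = new ArrayList<>();
        roots.add(a);
        List<Cell> newArea = collect(roots);
        for (Cell c : newArea) System.out.print(c.name + " ");   // a b
        System.out.println();
    }
}
```

Only live objects are ever touched, so the cost is proportional to the amount of live data rather than to the size of the heap.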
2.5. Generational algorithm (generational Collector)
One drawback of the stop-and-copy collector is that it must copy all live objects on every collection, which lengthens the time the program waits; this is why the copying algorithm is inefficient. Programs tend to follow a pattern, however: most objects live only a short time, and a few live a long time. The generational algorithm therefore divides the heap into two or more sub-heaps, each holding one generation of objects. Because most objects are short-lived, the collector collects the youngest sub-heap most often, reclaiming objects as the program discards them. Objects that survive collections of a young generation are moved to the next older generation's sub-heap. Time is saved because the older sub-heaps are collected much less often.
2.6. Adaptive algorithm (Adaptive Collector)
In particular situations, some garbage collection algorithms perform better than others. A collector based on the adaptive algorithm monitors how the heap is currently being used and selects the appropriate collection algorithm accordingly.
3. System.gc() method
Observing garbage collector runs with command-line arguments
Calling System.gc() requests a garbage collection regardless of which collection algorithm the JVM uses. There is also a command-line option, -verbosegc (documented on current JVMs as -verbose:gc), that reports the heap memory Java is using; it is passed on the java command line ahead of the class name.
You can look at an example:
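The original listing did not survive in this copy. The sketch below is a stand-in under stated assumptions: only the class name TestGC (taken from the command shown afterwards) and the idea of creating an object that is never used come from the text. The usedHeapBytes helper is a hypothetical addition, so that something is visible even without the verbose option.

```java
class TestGC {
    // Heap bytes currently in use, as reported by the JVM (hypothetical helper).
    static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        new TestGC();                    // created but never stored, so unreachable at once
        long before = usedHeapBytes();
        System.gc();                     // a request; the JVM may or may not collect now
        long after = usedHeapBytes();
        System.out.println("used before: " + before / 1024 + "K, after: " + after / 1024 + "K");
    }
}
```

The printed figures vary from run to run and from JVM to JVM, so no particular output should be expected.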
In this example a new object is created but never used, so it becomes unreachable almost immediately. After compiling the program, run the command java -verbosegc TestGC, which produces output such as:
[Full GC 168k->97k (1984K), 0.0253873 secs]
The machine environment was Windows with JDK 1.3.1. The figures 168K and 97K before and after the arrow are the amounts of heap memory in use before and after the collection, so 168K - 97K = 71K of object memory was reclaimed. The 1984K in parentheses is the total capacity of the heap, and the collection took 0.0253873 seconds (this time varies from run to run).
It is important to note that calling System.gc() is only a request (a suggestion). When the JVM receives it, it does not necessarily collect immediately; the request merely adds weight to the collector's own scheduling decisions, making a collection more likely to occur, occur sooner, or reclaim more.
4. finalize() method
Before the JVM's garbage collector reclaims an object, the program is generally expected to have called an appropriate method to release the resources the object holds. For cases where resources were not released explicitly, Java provides a default mechanism for releasing them when the object is finalized: the finalize() method. Its prototype is:
protected void finalize() throws Throwable
After finalize() returns, the object can be reclaimed by garbage collection. The throws Throwable in the prototype indicates that the method may throw any type of exception.
The reason finalize() exists is that there are special cases the garbage collector cannot handle. Suppose your object obtains a "special" area of memory by some means other than new. Because the garbage collector only knows about memory allocated by new, it does not know how to release this "special" memory. For such cases, Java allows a class to define a finalize() method.
Such special cases include: (1) Memory allocated in a C-like way rather than with Java's usual new. This happens mainly in native methods, for example a native method that calls the C/C++ malloc() family of functions to allocate storage. That memory is never freed unless free() is called, which can cause a memory leak. Because free() is a C/C++ function, finalize() can call a native method that invokes it to release the "special" memory. (2) Open file resources, which are also outside the scope of what the garbage collector reclaims.
In other words, the main use of finalize() is to release memory obtained by other means and to do some cleanup work. Because Java has no "destructor" or similar concept, cleanup requires writing an ordinary method that performs it, which here means overriding the finalize() method of Object in your class. For example, suppose an object draws itself on the screen when it is created; if it is not explicitly erased from the screen, it may never be cleaned up. If the erase logic is added to finalize(), then when the GC reclaims the object, finalize() is called and the image is erased. If the GC never runs, the image remains on the screen.
Once the garbage collector is ready to release the storage occupied by an object, it first calls the object's finalize() method to perform any necessary cleanup. The memory occupied by the object is actually released only during the next garbage collection.
With ordinary cleanup, the user of an object must call a cleanup method at the point where the object should be cleaned up. This is somewhat at odds with the C++ concept of a destructor. In C++, all objects are destroyed, or rather all objects "should" be destroyed. If a C++ object is created as a local object, that is, on the stack (which is not possible in Java, where objects live on the heap), it is destroyed at the closing curly brace that ends the scope in which it was created. If the object is created with new (as in Java), its destructor is called when the programmer applies the C++ delete operator (which Java does not have). If the programmer forgets, the destructor is never called, and the result is a memory leak in which the other parts of the object are never cleaned up either.
Java, by contrast, does not let us create local objects; we must always use new. Nor does Java have a delete operator for disposing of objects, because the garbage collector frees the storage for us automatically. From a simplified point of view, then, Java has no destructor because it has garbage collection. The existence of the garbage collector does not completely remove the need for what destructors do, however (the next paragraph explains why). The finalize() method is called when the garbage collector is about to release an object's storage and must never be called directly, so it is best to avoid relying on it. If you need some cleanup other than the release of storage, you must still call an ordinary Java method explicitly. Such a method plays the role of a C++ destructor but is less convenient.
All objects in C++ are bound to be destroyed (heap objects by using delete), whereas objects in Java are not always reclaimed by the garbage collector. In other words: (1) an object may never be garbage collected; (2) garbage collection is not the same as destruction; (3) garbage collection is only concerned with memory. Does this mean that when an object is no longer used, its finalize() must release the other objects it contains? No. Regardless of how those objects were created, the garbage collector is responsible for freeing the memory they occupy.
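The explicit cleanup method recommended above can be sketched with the screen-drawing example from this section. The sketch uses AutoCloseable and try-with-resources (available since Java 7), which is a substitution for the hand-written cleanup call the text describes, and the class names ScreenShape and CleanupDemo are hypothetical. Note also that finalize() has been deprecated since Java 9, which is one more reason to prefer an explicit method.

```java
class ScreenShape implements AutoCloseable {
    private boolean drawn;

    ScreenShape() {
        drawn = true;                  // stands in for drawing on the screen
        System.out.println("drawn");
    }

    boolean isDrawn() { return drawn; }

    // Explicit cleanup method: the caller decides when it runs, not the GC.
    @Override
    public void close() {
        if (drawn) {
            drawn = false;             // stands in for erasing from the screen
            System.out.println("erased");
        }
    }
}

class CleanupDemo {
    static ScreenShape useShape() {
        ScreenShape seen;
        try (ScreenShape shape = new ScreenShape()) {
            seen = shape;
            // ... work with the shape ...
        }                              // close() runs here, even if an exception is thrown
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(useShape().isDrawn());   // false
    }
}
```

The erase step happens at a known point in the program, whereas a finalize()-based erase would run only if and when the collector reclaims the object.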
5. Conditions that trigger the main GC (garbage collection)
The JVM performs secondary GCs (usually called minor GCs) very frequently, but because each takes very little time, they have little effect on the system. The trigger conditions of the main GC (usually called a major or full GC) deserve more attention, because it has a noticeable effect on the system. Broadly, two conditions trigger the main GC:
1) When the application is idle, that is, when no application thread is running, the GC is invoked. Because the GC runs in the lowest-priority thread, the GC thread is not invoked while the application is busy, except under the following condition.
2) When the Java heap runs low on memory, the GC is invoked. If an application thread creates a new object while running and there is not enough memory, the JVM forcibly invokes the GC thread to reclaim memory for the new allocation. If one collection is not enough to satisfy the allocation, the JVM makes further GC attempts (two more, in the behavior described here); if the request still cannot be met, the JVM reports an "out of memory" error (java.lang.OutOfMemoryError) and the Java application stops.
Because the JVM decides when to run the main GC based on the system environment, and that environment changes constantly, the main GC runs at unpredictable times and one cannot say in advance when it will occur. What is certain is that in a long-running application the main GC will occur repeatedly.
6. Measures to reduce GC overhead
Given the GC mechanism described above, the way a program runs directly changes the system environment and therefore affects when GC is triggered. A program that is not designed and coded with GC behavior in mind suffers a series of negative effects such as memory residency (objects staying in memory longer than needed). To avoid them, the basic principle is to produce as little garbage as possible and to reduce the cost of the GC process as far as possible. Specific measures include the following:
(1) Do not explicitly call System.gc()
This call suggests that the JVM perform a main GC. Although it is only a suggestion and not a guarantee, in many cases it does trigger the main GC, which raises the frequency of main GCs and therefore the number of intermittent pauses.
(2) Minimize the use of temporary objects
Temporary objects become garbage once the function call that created them returns, so using fewer temporary objects means producing less garbage. This postpones the second trigger condition described above and reduces the chance of a main GC.
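(3) Set references to null explicitly when an object is no longer used
In general, an object whose references have all been set to null is treated as garbage, so explicitly setting a reference to an unused object to null helps the collector identify garbage, which increases GC efficiency. This matters most for long-lived references such as fields and array slots; a local variable stops being part of the root set anyway when its method returns.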
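One place where explicit nulling makes a real difference is a container that manages its own array. The sketch below is an illustration under that assumption; SimpleStack and its slot accessor are hypothetical names, and the accessor exists only so the effect can be observed.

```java
import java.util.Arrays;

class SimpleStack {
    private Object[] elements = new Object[4];
    private int size = 0;

    void push(Object o) {
        if (size == elements.length) elements = Arrays.copyOf(elements, size * 2);
        elements[size++] = o;
    }

    Object pop() {
        if (size == 0) throw new IllegalStateException("stack is empty");
        Object top = elements[--size];
        elements[size] = null;     // drop the stale reference so the object can be collected
        return top;
    }

    // Exposed only so the effect can be observed in this sketch.
    Object slot(int index) { return elements[index]; }

    public static void main(String[] args) {
        SimpleStack stack = new SimpleStack();
        stack.push("payload");
        stack.pop();
        System.out.println(stack.slot(0) == null);   // true
    }
}
```

Without the assignment to null, the array would keep the popped object reachable for as long as the stack itself lives.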
(4) Use StringBuffer instead of String to accumulate strings
Because String is an immutable string object, accumulating onto a String does not grow the existing object but creates a new String each time. An accumulation such as str5 = str1 + str2 + str3 + str4 can therefore produce garbage objects during execution, and the effect is most pronounced when a string is built up step by step across statements or loop iterations, since every "+" step must create a new String. These transitional objects are meaningless to the system and only add garbage. To avoid this, accumulate the string in a StringBuffer: it is variable-length and grows in place, so no intermediate String objects are produced.
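A small sketch of the two styles of accumulation follows; ConcatDemo and its method names are hypothetical. StringBuffer is used to match the text; on Java 5 and later, StringBuilder offers the same methods without synchronization and is usually preferred in single-threaded code.

```java
class ConcatDemo {
    // Each += creates a new String, and the previous value becomes garbage at once.
    static String joinWithPlus(String[] parts) {
        String result = "";
        for (String p : parts) result += p;
        return result;
    }

    // One growable buffer is reused; only the final toString() creates a String.
    static String joinWithBuffer(String[] parts) {
        StringBuffer sb = new StringBuffer();
        for (String p : parts) sb.append(p);
        return sb.toString();
    }

    public static void main(String[] args) {
        String[] parts = { "str1", "str2", "str3", "str4" };
        System.out.println(joinWithPlus(parts));     // str1str2str3str4
        System.out.println(joinWithBuffer(parts));   // str1str2str3str4
    }
}
```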
(5) Use primitive types such as int and long where possible instead of Integer and Long objects
A primitive variable consumes far less memory than the corresponding wrapper object, so prefer primitives unless an object is really required.
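The difference can be illustrated with two versions of the same loop; BoxingDemo is a hypothetical name. Both return the same sum, but the boxed version may allocate a wrapper object on each iteration, all of which become garbage.

```java
class BoxingDemo {
    // The boxed accumulator is unboxed, added to, and re-boxed on every pass,
    // which can allocate a new Long object per iteration.
    static long sumBoxed(int n) {
        Long sum = 0L;
        for (long i = 0; i < n; i++) sum += i;
        return sum;
    }

    // The primitive accumulator is a plain local variable; nothing is allocated.
    static long sumPrimitive(int n) {
        long sum = 0L;
        for (long i = 0; i < n; i++) sum += i;
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sumBoxed(1000));       // 499500
        System.out.println(sumPrimitive(1000));   // 499500
    }
}
```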
(6) Minimize the use of static object variables
Static variables are effectively global: the objects they reference are not reclaimed by the GC for as long as the class stays loaded, so they occupy memory the whole time.
(7) Spread out the times at which objects are created or deleted
Creating many new objects within a short time, especially large ones, produces a sudden demand for a lot of memory. Faced with this, the JVM can only perform a main GC to reclaim memory or consolidate memory fragments, which raises the frequency of main GCs. The same applies to deleting objects all at once: a large number of garbage objects appear suddenly, free space inevitably drops, and the chance of a forced main GC the next time a new object is created rises sharply.
The following example shows the process of garbage collection and summarizes the points made above.
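The listing itself did not survive in this copy. What follows is a reconstruction from the description given right after it, not the original code. The names Chair, gcrun, f, created, finalized, i, and main() come from the text; the class name Garbage, the MAX_CHAIRS cap (added so the loop always terminates), the volatile modifiers, and the printed messages are assumptions. finalize() is deprecated on current JDKs, so the compiler may print a warning.

```java
class Chair {
    static volatile boolean gcrun = false;   // set once the collector has started finalizing
    static volatile boolean f = false;       // tells main() to stop creating Chairs
    static volatile int created = 0;         // number of Chairs created so far
    static volatile int finalized = 0;       // number of Chairs finalized so far
    int i;                                   // this Chair's sequence number

    Chair() {
        i = ++created;
        if (created == 47) System.out.println("Created 47");
    }

    @Override
    protected void finalize() {
        if (!gcrun) {
            gcrun = true;
            System.out.println("Beginning to finalize after " + created + " Chairs have been created");
        }
        if (i == 47) {
            System.out.println("Finalizing Chair #47, setting flag to stop Chair creation");
            f = true;
        }
        finalized++;
    }
}

class Garbage {
    static final int MAX_CHAIRS = 1_000_000;   // assumed safety cap, not described in the text

    public static void main(String[] args) {
        while (!Chair.f && Chair.created < MAX_CHAIRS) {
            new Chair();                        // immediately unreachable
        }
        System.out.println("created = " + Chair.created + ", finalized = " + Chair.finalized);
    }
}
```

The numbers printed differ on every run, and on a JVM with a large heap the cap may be reached before any collection happens, in which case no Chair is finalized at all.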
The program above creates many Chair objects and stops creating them at some point after the garbage collector starts running. Because the garbage collector may run at any time, we cannot know exactly when it starts, so the program uses a flag called gcrun to indicate whether the garbage collector has started running. Through a second flag, f, Chair can tell main() to stop generating objects. Both flags are set inside finalize(), which is called during garbage collection. Two other static variables, created and finalized, track the number of objects created and the number of objects the garbage collector has finalized. Finally, each Chair has its own (non-static) int i, which records its sequence number. When Chair number 47 is finalized, the flag f is set to true, ending the creation of Chair objects.
7. Additional notes on garbage collection
From the description above, garbage collection can be seen to have the following characteristics:
(1) Unpredictability of garbage collection: because different collection algorithms and mechanisms are implemented, a collection may occur on a timer, when the system's CPU is idle, or, as in the earliest collectors, only when memory consumption reaches its limit. Which of these happens depends on the collector chosen and its specific settings.
(2) Accuracy of garbage collection, which has two main aspects: (a) the collector can accurately mark live objects; (b) the collector can accurately locate the reference relationships between objects. The former is the precondition for completely reclaiming all discarded objects; without it, memory leaks may result. The latter is required to implement algorithms such as compaction and copying. With both, all unreachable objects can be reliably reclaimed and all objects can be relocated, so that copying objects and compacting object memory effectively prevent memory fragmentation.
(3) There are many different kinds of garbage collectors, each with its own algorithm and performance characteristics: some stop the application while collection runs, some allow application threads to keep running during collection, and some perform the collection itself on multiple threads at once.
(4) The implementation of garbage collection is closely tied to the specific JVM and its memory model. Different JVMs may use different collection schemes, and a JVM's memory model determines which kinds of garbage collection it can adopt. The memory system of the HotSpot family of JVMs is designed around an advanced object-oriented framework, which allows those JVMs to adopt state-of-the-art garbage collection.
(5) As the technology has developed, modern JVMs offer a choice of garbage collectors, each of which can also be configured with different parameters, which makes it possible to obtain the best performance for a given application environment.
In view of these characteristics, keep the following in mind:
(1) Do not assume that garbage collection happens at any particular time; the timing is unknown. For example, a temporary object in a method becomes useless when the method call completes, and its memory can be freed from that point on, but nothing guarantees when it actually will be.
(2) Java provides a number of classes that deal with garbage collection, as well as a way to request a collection, namely calling System.gc(), but this too is non-deterministic. Java does not guarantee that each call starts a collection; the call only sends a request to the JVM, and whether a collection actually runs is unknown.
(3) Pick a garbage collector that suits your application. In general, if your system has no special or demanding performance requirements, the JVM's default options are fine. Otherwise, consider a targeted collector: an incremental collector, for example, is better suited to systems with stricter real-time requirements, and if the system is well provisioned with plenty of idle resources, a parallel mark-and-sweep collector is worth considering.
(4) The key problem, and one that is hard to get right, is memory leaks. Good programming habits and a rigorous attitude are always what matter most; do not let a small mistake of your own cause a large memory leak.
(5) Release references to useless objects as soon as possible. Most programmers rely on temporary variables, whose references are discarded automatically when execution leaves their scope, which implicitly makes the objects available to the garbage collector. Beyond that, check whether the object is registered as a listener anywhere; if so, remove the listener first and then set the reference to null.
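The listener advice in point (5) can be sketched as follows. Button, Dialog, and ClickListener are hypothetical classes written for this illustration, not part of any real GUI toolkit.

```java
import java.util.ArrayList;
import java.util.List;

interface ClickListener { void onClick(); }

// A long-lived object that keeps references to its listeners.
class Button {
    private final List<ClickListener> listeners = new ArrayList<>();
    void addListener(ClickListener l) { listeners.add(l); }
    void removeListener(ClickListener l) { listeners.remove(l); }
    int listenerCount() { return listeners.size(); }
}

// A short-lived object that registers itself with the button.
class Dialog implements ClickListener {
    @Override
    public void onClick() { System.out.println("clicked"); }
}

class ListenerDemo {
    public static void main(String[] args) {
        Button button = new Button();
        Dialog dialog = new Dialog();
        button.addListener(dialog);

        // Setting dialog to null at this point would not be enough:
        // the button's listener list would still reference the Dialog.
        button.removeListener(dialog);     // 1. remove the listener
        dialog = null;                     // 2. then drop our own reference
        System.out.println(button.listenerCount());   // 0
    }
}
```

A forgotten listener registration is a common source of the memory leaks mentioned in point (4), because the long-lived object keeps the short-lived one reachable.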