Java Garbage Collection Algorithms

Source: Internet
Author: User

 

Introduction

The Java heap is a runtime data area from which memory for class instances (objects) and arrays is allocated. The Java Virtual Machine (JVM) heap stores all objects created by a running application. These objects are created by bytecode instructions such as new, newarray, anewarray, and multianewarray, but program code never has to release them explicitly. Instead, the heap is the area managed by the garbage collector. The JVM specification does not mandate any particular garbage collection technique, or even require garbage collection at all, but because memory is finite, practical JVM implementations manage the heap with a garbage collector. Garbage collection is a dynamic storage management technique that automatically releases objects no longer referenced by the program, reclaiming memory according to a specific garbage collection algorithm.

Significance of garbage collection 

In C++, the memory occupied by an object stays occupied until it is explicitly released (or the program ends) and cannot be allocated to other objects before then. In Java, when nothing references the memory originally allocated to an object any more, that memory becomes garbage, and a system-level JVM thread releases it automatically. In other words, an object the program no longer needs is "useless information" to be discarded; once an object is no longer referenced, the space it occupied is reclaimed so that new objects can use it. Besides releasing useless objects, garbage collection can also eliminate memory fragmentation. As objects are created and the garbage collector frees the space of discarded objects, fragments appear: idle holes between the memory blocks allocated to objects. Defragmentation moves the occupied heap memory to one end of the heap, and the JVM then allocates the consolidated free memory to new objects.

Garbage collection releases memory automatically and reduces the programmer's burden, which gives the Java virtual machine several advantages. First, it improves programming productivity. Without a garbage collection mechanism, tracking down an obscure memory problem can take a great deal of time; with Java's garbage collector, much of that time is saved. Second, it protects program integrity: garbage collection is an important part of Java's security strategy.

One potential drawback of garbage collection is that its overhead affects program performance. The Java virtual machine must track the live objects in the running program and eventually release the useless ones, and this takes processor time. In addition, because garbage collection algorithms are imperfect, some early algorithms could not guarantee that 100% of discarded memory was collected. These problems have been easing as garbage collection algorithms improve and as software and hardware become more efficient.

  Garbage collection algorithm analysis 

The Java language specification does not say which garbage collection algorithm a JVM must use, but any garbage collection algorithm has to do two basic things: (1) find the useless objects; (2) reclaim the memory occupied by those objects so that the program can use it again.

Most garbage collection algorithms use the concept of a root set. The root set is the collection of reference variables (including local variables, parameters, and class variables) that the executing Java program can access directly; through these reference variables the program reads object fields and calls object methods. The first step of garbage collection is to determine which objects are reachable from the root set and which are not. Objects reachable from the root set, including those reachable only indirectly, are live and must not be collected. Objects that cannot be reached from the root set by any path qualify for garbage collection and should be reclaimed. Several common algorithms are described below.

1. Reference Counting (Reference Counting Collector)

Reference counting is the only method described here that does not use the root set. The algorithm uses a reference counter to distinguish live objects from objects that are no longer in use. Typically, every object in the heap has its own reference counter. When an object is created and assigned to a variable, its counter is set to 1. Each time the object is assigned to another variable, the counter is incremented by 1. When a variable that refers to the object goes out of scope or is assigned a different value, the counter is decremented by 1. Once the counter reaches 0, the object qualifies for garbage collection.

A garbage collector based on reference counting runs quickly and does not interrupt program execution for long periods, so it suits programs that must run in real time. However, reference counting adds overhead to normal execution, because the counter is incremented every time an object is assigned to a new variable and decremented every time a reference to it goes out of scope.
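
The counting rules above can be sketched in a few lines of Java. This is only a toy model: RefCountedObject, retain(), release(), and addField() are hypothetical names invented for illustration, and a real JVM exposes nothing like them. The second case also shows a well-known weakness of pure reference counting: two objects that reference each other never reach a count of 0.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy model of a reference-counted object (not a JVM API).
class RefCountedObject {
    final String name;
    int refCount = 0;
    boolean reclaimed = false;
    final List<RefCountedObject> fields = new ArrayList<>();

    RefCountedObject(String name) { this.name = name; }

    // Called whenever a new reference to this object is stored somewhere.
    void retain() { refCount++; }

    // Called whenever a reference to this object is dropped.
    void release() {
        if (--refCount == 0) {
            reclaimed = true;
            // Reclaiming an object also drops the references it holds.
            for (RefCountedObject child : fields) child.release();
            fields.clear();
        }
    }

    // Stores a reference to target in one of this object's fields.
    void addField(RefCountedObject target) {
        target.retain();
        fields.add(target);
    }
}

class RefCountDemo {
    // One object, one variable: reclaimed as soon as the variable goes away.
    static boolean simpleCase() {
        RefCountedObject a = new RefCountedObject("a");
        a.retain();   // a variable now refers to the object, count = 1
        a.release();  // the variable goes out of scope, count = 0
        return a.reclaimed;
    }

    // Two objects that reference each other: counts never drop to 0.
    static boolean cycleCase() {
        RefCountedObject a = new RefCountedObject("a");
        RefCountedObject b = new RefCountedObject("b");
        a.retain();
        b.retain();       // two local variables, counts = 1 and 1
        a.addField(b);
        b.addField(a);    // a <-> b, counts = 2 and 2
        a.release();
        b.release();      // both variables go out of scope, counts = 1 and 1
        return a.reclaimed || b.reclaimed;
    }

    public static void main(String[] args) {
        System.out.println("simple case reclaimed: " + simpleCase()); // true
        System.out.println("cycle case reclaimed:  " + cycleCase());  // false
    }
}
```

In the cycle case both objects are unreachable from the program, yet each still holds a count of 1 because of the other.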

2. Tracing Algorithm (Tracing Collector)

The tracing algorithm was proposed to solve the problems of reference counting, notably its inability to reclaim objects that reference each other in a cycle. It uses the concept of the root set. A garbage collector based on the tracing algorithm starts scanning from the root set, identifies which objects are reachable and which are not, and marks the reachable objects in some way, for example by setting one or more bits for each reachable object. Because objects are marked during the scan and unmarked objects are then swept away, a tracing collector is also called a mark-and-sweep garbage collector.
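
A minimal sketch of the mark and sweep phases, assuming a toy heap modeled as a list of Node objects. Node, MarkSweepDemo, and their members are hypothetical names for illustration; a real collector works on raw memory and mark bits rather than Java collections.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Hypothetical heap object: a name, a mark bit, and outgoing references.
class Node {
    final String name;
    boolean marked = false;
    final List<Node> refs = new ArrayList<>();
    Node(String name) { this.name = name; }
}

class MarkSweepDemo {
    // Mark phase: flag every object reachable from the root set.
    static void mark(List<Node> roots) {
        Deque<Node> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Node n = pending.pop();
            if (n.marked) continue;   // already visited
            n.marked = true;
            pending.addAll(n.refs);
        }
    }

    // Sweep phase: remove unmarked objects, reset marks on survivors.
    static List<String> sweep(List<Node> heap) {
        List<String> reclaimed = new ArrayList<>();
        heap.removeIf(n -> {
            if (n.marked) {
                n.marked = false;     // ready for the next collection
                return false;
            }
            reclaimed.add(n.name);
            return true;
        });
        return reclaimed;
    }

    static List<String> run() {
        Node a = new Node("a"), b = new Node("b"), c = new Node("c"), d = new Node("d");
        a.refs.add(b);                // a -> b, reachable from the root
        c.refs.add(d);
        d.refs.add(c);                // c <-> d, an unreachable cycle
        List<Node> heap = new ArrayList<>(Arrays.asList(a, b, c, d));
        mark(Arrays.asList(a));       // the root set contains only a
        return sweep(heap);
    }

    public static void main(String[] args) {
        System.out.println("reclaimed: " + run()); // reclaimed: [c, d]
    }
}
```

Unlike reference counting, the unreachable cycle c <-> d is reclaimed, because reachability is decided from the root set rather than from per-object counters.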

3. Compacting Algorithm (Compacting Collector)

To solve the heap fragmentation problem, tracing collectors borrow the idea of the compacting algorithm. During the sweep, the algorithm moves all live objects to one end of the heap, so that the other end becomes one contiguous free region. The collector then updates every reference to each object it moved, so that the references point to the object at its new location. Implementations of compacting collectors usually add handles and a handle table, so that moving an object only requires updating its entry in the handle table.
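
The following sketch illustrates compaction with a handle table, under simplifying assumptions: the heap is an int array in which each slot holds an object id (0 means a free hole), and handles[id] records the slot where object id currently lives. CompactDemo and its layout are hypothetical and chosen only to keep the example small.

```java
import java.util.Arrays;

class CompactDemo {
    // Slides every live object toward the start of the heap and
    // updates its handle so that references through the handle stay valid.
    static void compact(int[] heap, int[] handles) {
        int free = 0;                      // next slot to fill
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] != 0) {
                int id = heap[i];
                heap[i] = 0;               // vacate the old slot
                heap[free] = id;           // move the object down
                handles[id] = free;        // only the handle is updated
                free++;
            }
        }
    }

    static String run() {
        int[] heap = {1, 0, 2, 0, 0, 3};   // three live objects, three holes
        int[] handles = new int[4];        // index 0 is unused
        for (int slot = 0; slot < heap.length; slot++) {
            if (heap[slot] != 0) handles[heap[slot]] = slot;
        }
        compact(heap, handles);
        return Arrays.toString(heap) + " " + Arrays.toString(handles);
    }

    public static void main(String[] args) {
        System.out.println(run()); // [1, 2, 3, 0, 0, 0] [0, 0, 1, 2]
    }
}
```

After compaction the free slots form one contiguous region at the end of the array, and program code that reaches objects through their handles needs no further fix-up.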

4. Copying Algorithm (Copying Collector)

This algorithm was proposed to avoid the overhead of handles while still solving heap fragmentation. Initially it divides the heap into one object space and one or more free spaces. The program allocates objects from the object space. When the object space is full, the copying collector scans the live objects starting from the root set and copies each one into a free space, leaving no holes between the live objects. The free space then becomes the object space, the old object space becomes a free space, and the program allocates memory in the new object space.

A typical copying collector is the stop-and-copy algorithm, which divides the heap into an object space and a free space. The program is suspended while the two spaces are switched.
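
A rough sketch of the copy step, assuming the free space is modeled as a Java list and a forwarding map records where each object was copied to. Cell, CopyDemo, and the forwarding map are illustrative assumptions; real collectors copy raw memory and store forwarding pointers in the old objects.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical heap object with outgoing references.
class Cell {
    final String name;
    final List<Cell> refs = new ArrayList<>();
    Cell(String name) { this.name = name; }
}

class CopyDemo {
    // Copies everything reachable from roots into toSpace; returns the new roots.
    static List<Cell> collect(List<Cell> roots, List<Cell> toSpace) {
        Map<Cell, Cell> forwarding = new IdentityHashMap<>();
        List<Cell> newRoots = new ArrayList<>();
        for (Cell r : roots) newRoots.add(copy(r, forwarding, toSpace));
        // Scan the copies in order and redirect their references to copies.
        for (int scan = 0; scan < toSpace.size(); scan++) {
            Cell copied = toSpace.get(scan);
            for (int i = 0; i < copied.refs.size(); i++) {
                copied.refs.set(i, copy(copied.refs.get(i), forwarding, toSpace));
            }
        }
        return newRoots;
    }

    // Copies one object unless it has already been copied.
    static Cell copy(Cell old, Map<Cell, Cell> forwarding, List<Cell> toSpace) {
        Cell existing = forwarding.get(old);
        if (existing != null) return existing;
        Cell fresh = new Cell(old.name);
        fresh.refs.addAll(old.refs);   // still old references; fixed by the scan
        forwarding.put(old, fresh);
        toSpace.add(fresh);            // appended contiguously, no holes
        return fresh;
    }

    static List<String> run() {
        Cell a = new Cell("a"), b = new Cell("b"), c = new Cell("c");
        a.refs.add(b);
        b.refs.add(a);                 // a <-> b, reachable from the root a
        List<Cell> toSpace = new ArrayList<>();
        collect(Arrays.asList(a), toSpace);   // c is never copied
        List<String> names = new ArrayList<>();
        for (Cell cell : toSpace) names.add(cell.name);
        return names;
    }

    public static void main(String[] args) {
        System.out.println("survivors in new space: " + run()); // [a, b]
    }
}
```

Only live objects are ever touched: the garbage object c is simply left behind in the old space, which can then be reused as a whole.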

5. Generational Algorithm (Generational Collector)

One defect of the stop-and-copy collector is that it must copy every live object on each collection, which lengthens the program's pause and is the main source of the copying algorithm's inefficiency. In practice, programs follow a pattern: most objects live only briefly, and a few live for a long time. The generational algorithm therefore divides the heap into two or more sub-heaps, each holding one generation of objects. Because most objects are short-lived, the collector reclaims most garbage from the youngest sub-heap as the program discards objects. After a generational collection runs, objects that survived are moved to the sub-heap of the next older generation. Because the older sub-heaps are collected less often, time is saved.
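
The promotion idea can be sketched as follows. This is a deliberately simplified model: GenerationalHeap is a hypothetical class, objects are represented by their names, liveness is passed in as a set instead of being computed from a root set, and survivors are promoted after a single collection.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical two-generation heap; objects are represented by their names.
class GenerationalHeap {
    final List<String> young = new ArrayList<>();
    final List<String> old = new ArrayList<>();

    // New objects always start in the young generation.
    void allocate(String name) { young.add(name); }

    // Minor collection: only the young generation is examined.
    // Returns the number of objects reclaimed.
    int minorCollect(Set<String> live) {
        int reclaimed = 0;
        for (String name : young) {
            if (live.contains(name)) {
                old.add(name);         // survivor is promoted
            } else {
                reclaimed++;           // short-lived object is discarded
            }
        }
        young.clear();
        return reclaimed;
    }
}

class GenerationalDemo {
    static int run(GenerationalHeap heap) {
        heap.allocate("config");       // long-lived
        heap.allocate("tmp1");
        heap.allocate("tmp2");
        heap.allocate("tmp3");         // short-lived temporaries
        Set<String> live = new HashSet<>(Arrays.asList("config"));
        return heap.minorCollect(live);
    }

    public static void main(String[] args) {
        GenerationalHeap heap = new GenerationalHeap();
        System.out.println("reclaimed: " + run(heap)); // reclaimed: 3
        System.out.println("old generation: " + heap.old); // [config]
    }
}
```

Later minor collections never look at "config" again, which is where the time saving described above comes from.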

6. Adaptive Algorithm (Adaptive Collector)

In particular situations, some garbage collection algorithms outperform others. A garbage collector based on the adaptive algorithm monitors how the heap is currently being used and selects the most suitable collection algorithm.

Observing Java garbage collection
1. Using command-line parameters to observe the garbage collector
Calling System.gc() requests a garbage collection regardless of which algorithm the JVM uses. The command-line option -verbose:gc reports the heap memory in use at each collection. Its format is as follows:

java -verbose:gc classfile

Let's look at an example:
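class TestGC
{
 public static void main(String[] args)
 {
  new TestGC();             // create an object and drop the only reference to it
  System.gc();              // request (not force) a garbage collection
  System.runFinalization(); // ask the JVM to run any pending finalizers
 }
}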

In this example, a new object is created and, because nothing uses it, immediately becomes unreachable. After compiling the program, run the command java -verbose:gc TestGC; the output is:

[Full GC 168K->97K(1984K), 0.0253873 secs]

The test machine ran Windows 2000 with JDK 1.3.1. The figures before and after the arrow, 168K and 97K, are the memory occupied by objects before and after the collection, so 168K - 97K = 71K of object memory was reclaimed. The figure in parentheses, 1984K, is the total heap capacity, and the collection took 0.0253873 seconds (this time varies from run to run).
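
A similar before/after comparison can be approximated from inside a program with the standard Runtime methods totalMemory() and freeMemory(). The sketch below is illustrative only: HeapUsageDemo and usedBytes() are names made up here, and the printed numbers differ on every JVM and every run, so no output is shown.

```java
class HeapUsageDemo {
    // Heap currently in use = heap currently reserved minus the free part.
    static long usedBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        // Allocate about 1 MB of short-lived data.
        byte[][] junk = new byte[64][];
        for (int i = 0; i < junk.length; i++) {
            junk[i] = new byte[16 * 1024];
        }
        long before = usedBytes();

        junk = null;     // drop the only reference to the data
        System.gc();     // request a collection; the JVM may ignore it

        long after = usedBytes();
        System.out.println("used before: " + before / 1024 + "K");
        System.out.println("used after:  " + after / 1024 + "K");
        System.out.println("heap total:  " + Runtime.getRuntime().totalMemory() / 1024 + "K");
    }
}
```

Because System.gc() is only a request, the "after" figure is not guaranteed to be lower than the "before" figure.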
2. Using the finalize method to observe the garbage collector
Before the garbage collector reclaims an object, the program is normally expected to have called an appropriate method to release the object's resources. For cases where resources were not released explicitly, Java provides a default mechanism for finalizing the object and releasing its resources: the finalize() method. Its prototype is:
protected void finalize() throws Throwable 

The garbage collector calls finalize() on an object once it has determined that the object is unreachable; the object's memory is reclaimed only by a later collection, after finalize() has returned. The throws Throwable clause in the prototype means that the method may throw any type of exception. Note that finalize() has been deprecated since Java 9.
The reason finalize() exists is that sometimes memory has to be allocated in a C-like way rather than the normal Java way. This happens mainly through native methods, which are a way of calling non-Java code from Java. C and C++ are the only languages currently supported by native methods, but since they can call subprograms written in other languages, native methods can in effect call anything. Inside the non-Java code, C's malloc() family of functions might be called to allocate storage, and unless free() is called that storage is never released, causing a memory leak. Because free() is a C and C++ function, it has to be called from a native method inside finalize(). All of this means that finalize() should not be used much; it is not the right place for ordinary cleanup.
For ordinary cleanup, the user of an object must call a cleanup method at the point where the object should be cleaned up. This conflicts slightly with the C++ concept of the destructor. In C++, all objects are destroyed, or rather, all objects "should" be destroyed. If a C++ object is created as a local object, that is, on the stack (which is not possible in Java), it is destroyed at the closing curly brace of the scope in which it was created. If the object was created with new (as in Java), its destructor is called when the programmer calls the C++ delete operator (which Java does not have). If the programmer forgets to call delete, the destructor is never called, the result is a memory leak, and the other parts of the object are never cleaned up either.
Java, by contrast, does not allow local objects to be created; new must always be used. But Java has no delete operator to release an object, because the garbage collector releases the storage for us. From a simplified point of view, then, Java has no destructor because it has garbage collection. However, the presence of a garbage collector does not completely remove the need for destructors, or for what destructors do (and finalize() should never be called directly, so it is not a substitute). If some cleanup other than releasing storage is needed, an appropriate method must still be called explicitly in Java. That method is the equivalent of a C++ destructor, without the convenience of being called automatically.
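
One way to express such an explicit cleanup method in Java 7 and later is the standard AutoCloseable interface together with a try-with-resources statement, which calls close() at the end of the block much as a C++ destructor runs at the closing brace. The sketch below is an illustration rather than part of the original example: NativeBuffer, write(), and openCount are hypothetical names, and the class only simulates holding a native resource.

```java
// Hypothetical resource holder; it only simulates owning native memory.
class NativeBuffer implements AutoCloseable {
    static int openCount = 0;          // number of buffers not yet released
    private boolean released = false;
    private int bytesWritten = 0;

    NativeBuffer() { openCount++; }    // stands in for a malloc() call

    void write(int bytes) {
        if (released) throw new IllegalStateException("buffer already released");
        bytesWritten += bytes;
    }

    // Explicit cleanup method, the counterpart of a C++ destructor.
    @Override
    public void close() {
        if (!released) {               // safe to call more than once
            released = true;
            openCount--;               // stands in for a free() call
        }
    }
}

class CleanupDemo {
    static int run() {
        try (NativeBuffer buf = new NativeBuffer()) {
            buf.write(128);
        }                              // buf.close() is called here automatically
        return NativeBuffer.openCount;
    }

    public static void main(String[] args) {
        System.out.println("buffers still open: " + run()); // buffers still open: 0
    }
}
```

The release happens at a known point in the program, independent of when, or whether, the garbage collector runs.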
The following example shows the garbage collection process and summarizes the previous statements.
class Chair {
 // Both flags are written by the finalizer thread and read by main(),
 // so they are declared volatile to make the updates visible.
 static volatile boolean gcrun = false;
 static volatile boolean f = false;
 static int created = 0;
 static int finalized = 0;
 int i; // sequence number of this Chair
 Chair() {
  i = ++created;
  if(created == 47)
   System.out.println("Created 47");
 }
 // Called by the garbage collector before this Chair is reclaimed.
 @Override
 protected void finalize() {
  if(!gcrun) {
   gcrun = true;
   System.out.println("Beginning to finalize after " + created + " Chairs have been created");
  }
  if(i == 47) {
   System.out.println("Finalizing Chair #47, " + "Setting flag to stop Chair creation");
   f = true;
  }
  finalized++;
  if(finalized >= created)
   System.out.println("All " + finalized + " finalized");
 }
}

public class Garbage {
 public static void main(String[] args) {
  if(args.length == 0) {
   System.err.println("Usage: \n" + "java Garbage before\n or:\n" + "java Garbage after");
   return;
  }
  // Keep allocating until Chair #47 has been finalized.
  while(!Chair.f) {
   new Chair();
   new String("To take up space");
  }
  System.out.println("After all Chairs have been created:\n" + "total created = " + Chair.created +
   ", total finalized = " + Chair.finalized);
  if(args[0].equals("before")) {
   System.out.println("gc():");
   System.gc();
   System.out.println("runFinalization():");
   System.runFinalization();
  }
  System.out.println("bye!");
  // System.runFinalizersOnExit was removed in Java 11,
  // so this call compiles only on older JDKs.
  if(args[0].equals("after"))
   System.runFinalizersOnExit(true);
 }
}

The program above creates many Chair objects and stops creating them at some point after the garbage collector starts running. Since the garbage collector may run at any time, we cannot know exactly when it will start, so the program uses a flag named gcrun to record whether the garbage collector has started running. Through a second flag, f, Chair can tell main() to stop creating objects. Both flags are set inside finalize(), which is called during garbage collection. The other two static variables, created and finalized, track the number of objects created and the number of objects the garbage collector has finalized. Finally, each Chair has its own (non-static) int i, which records its sequence number. Once Chair number 47 has been finalized, the flag f is set to true and the creation of Chair objects ends.
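
On current JDKs, a collection can also be observed without finalize() by using the standard java.lang.ref.WeakReference class, whose get() method returns null once the referent has been collected. The following is a minimal sketch; WeakRefDemo and track() are names chosen for illustration, and because the collector's timing is not guaranteed, the second line of output may differ from run to run.

```java
import java.lang.ref.WeakReference;

class WeakRefDemo {
    // Wraps the target in a weak reference, which does not keep it alive.
    static WeakReference<Object> track(Object target) {
        return new WeakReference<>(target);
    }

    public static void main(String[] args) {
        Object payload = new Object();
        WeakReference<Object> ref = track(payload);
        System.out.println("before: " + (ref.get() != null ? "alive" : "collected"));

        payload = null;   // drop the only strong reference
        System.gc();      // request a collection; the JVM may ignore it

        System.out.println("after:  " + (ref.get() != null ? "still alive" : "collected"));
    }
}
```

As with the Chair example, the program can only observe that a collection has happened; it cannot dictate when.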
Additional notes on garbage collection

According to the above descriptions, it can be found that garbage collection has the following characteristics:

(1) The timing of garbage collection is unpredictable. Because different garbage collection algorithms and collection mechanisms are implemented, a collection may occur at regular intervals, when the system has idle CPU resources, or, as in the earliest collectors, when memory consumption reaches a limit. This depends on the choice and configuration of the garbage collector.

(2) The accuracy of garbage collection has two aspects: (a) the collector can accurately identify live objects; (b) the collector can accurately locate the reference relationships between objects. The former is the precondition for reclaiming all discarded objects completely; without it, memory leaks may result. The latter is a necessary condition for algorithms such as compaction and copying. With both, all unreachable objects can be reliably reclaimed and all live objects can be relocated, which permits object copying and heap compaction and effectively prevents memory fragmentation.

(3) There are many types of garbage collectors, each with its own algorithm and its own performance characteristics. Some stop the application while a collection runs; others let application threads keep running during the collection; still others run the collection itself on multiple threads at the same time.

(4) The implementation of garbage collection is closely tied to the specific JVM and its memory model. Different JVMs may use different garbage collectors, and a JVM's memory model determines which kinds of garbage collection it can use. The memory system of the HotSpot family of JVMs uses an object-oriented framework design, which lets that family adopt state-of-the-art garbage collectors.

(5) As the technology has developed, modern JVMs offer a choice of garbage collectors, each with its own tunable parameters, which makes it possible to tune for the best performance in a given application environment.

Given these characteristics, keep the following in mind:

(1) Do not make assumptions about when garbage collection happens; this is entirely unknown. For example, a temporary object in a method becomes garbage once the method returns and its memory can then be released, but when it actually is released is not defined.

(2) Java provides some classes that interact with garbage collection, along with a way to request a collection: calling System.gc(). This too is nondeterministic. Java does not guarantee that each call starts a garbage collection; the call merely submits a request to the JVM, and whether a collection actually runs is unknown.

(3) Choose a suitable garbage collector. In general, if the system has no special or demanding performance requirements, the JVM's default options are fine. Otherwise, consider a collector targeted at the workload: for example, an incremental collector suits systems with strict real-time requirements, and a parallel mark-and-sweep collector is worth considering on a well-provisioned system with plenty of idle resources.

(4) The key problem to watch for is memory leaks. Good programming habits and a rigorous attitude always matter most; do not let a small mistake turn into a large memory leak.

(5) Release references to useless objects as early as possible. With temporary variables, most programmers rely on the reference disappearing when the variable goes out of scope, which implicitly tells the garbage collector that the object can be collected. You must also check whether the referenced object is registered as a listener anywhere; if so, remove the listener first and then set the reference to null.
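
The listener case in point (5) can be illustrated with a small sketch. Listener, EventSource, and ListenerLeakDemo are hypothetical types written for this example; the point is only that setting a local variable to null is not enough while another object still holds a reference.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical listener interface and event source.
interface Listener {
    void onEvent(String event);
}

class EventSource {
    private final List<Listener> listeners = new ArrayList<>();

    void addListener(Listener l) { listeners.add(l); }
    void removeListener(Listener l) { listeners.remove(l); }
    int listenerCount() { return listeners.size(); }
}

class ListenerLeakDemo {
    // Returns how many listeners the source still references afterwards.
    static int run(boolean unregister) {
        EventSource source = new EventSource();
        Listener view = event -> System.out.println("got " + event);
        source.addListener(view);

        if (unregister) {
            source.removeListener(view);   // remove the listener first
        }
        view = null;                       // then drop our own reference

        return source.listenerCount();
    }

    public static void main(String[] args) {
        System.out.println("without removeListener: " + run(false)); // 1
        System.out.println("with removeListener:    " + run(true));  // 0
    }
}
```

In the first call the listener stays reachable through the source's internal list, so it cannot be collected for as long as the source itself is alive.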

Conclusion 

In general, Java developers can ignore heap allocation and garbage collection in the JVM. Understanding this aspect of Java nonetheless lets us use resources more effectively. Note also that finalize() is Java's default cleanup mechanism; when an object holds resources that must be released, you can override finalize() as a safety net to make sure they are released.
