[Post] Garbage collector and Java programming

Ouyang Chen (yeekee@sina.com)
Zhou Xin (zhouxin@sei.pku.edu.cn)

The garbage collector (GC) is largely transparent to Java programmers. Even so, a good Java programmer should understand how the GC works, how to tune its performance, and how to interact with it in the limited ways Java allows. Some applications have demanding performance requirements: embedded and real-time systems, for example, can improve overall application performance only by improving the efficiency of memory management across the board. This article first briefly introduces how the GC works, then discusses several key GC issues in depth, and finally offers some Java programming suggestions for improving program performance from the GC's point of view.

Basic Principles of GC

Java memory management is actually object management, including object allocation and release.

For the programmer, objects are allocated with the new keyword. To release an object, the programmer only needs to assign null to every reference to it so that the program can no longer reach it; we call such an object "unreachable". The GC reclaims the memory of all unreachable objects.

For the GC, as soon as the programmer creates an object, the GC starts tracking that object's address, size, and usage. Typically, the GC records and manages all objects in the heap as a directed graph (see reference 1) and uses this graph to determine which objects are "reachable" and which are "unreachable". Once the GC determines that an object is unreachable, it is responsible for reclaiming that object's memory. However, so that the GC can be implemented on different platforms, the Java specification deliberately leaves much GC behavior loosely defined. For example, it does not say which collection algorithm to use or when a collection must take place. As a result, different JVM implementers often choose different algorithms, which introduces a good deal of uncertainty for Java developers. This article examines several GC-related issues in an effort to reduce the negative impact of this uncertainty on Java programs.
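
The reachability idea can be illustrated with a toy model. The sketch below is not how any particular JVM is implemented; it only treats objects as nodes in a directed graph and marks everything that can be reached from a set of roots. The names Node, ReachabilityDemo, and reachableFrom are invented for this illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Toy model only: a "heap object" is a node that holds references to other nodes.
final class Node {
    final String name;
    final List<Node> refs = new ArrayList<>();

    Node(String name) {
        this.name = name;
    }
}

final class ReachabilityDemo {
    // Returns every node that can be reached by following references from the roots.
    static Set<Node> reachableFrom(List<Node> roots) {
        Set<Node> marked = Collections.newSetFromMap(new IdentityHashMap<Node, Boolean>());
        Deque<Node> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Node current = pending.pop();
            if (marked.add(current)) {        // first visit: mark it and scan its references
                pending.addAll(current.refs);
            }
        }
        return marked;
    }

    public static void main(String[] args) {
        Node a = new Node("a");
        Node b = new Node("b");
        Node c = new Node("c");
        a.refs.add(b); // a -> b
        c.refs.add(c); // c refers only to itself, and nothing else refers to c

        Set<Node> live = reachableFrom(Collections.singletonList(a));
        System.out.println(live.contains(b)); // true: reachable through a
        System.out.println(live.contains(c)); // false: a self-reference does not keep c alive
    }
}
```

In this model, everything outside the returned set is what the article calls unreachable and would be eligible for reclamation.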

Incremental GC

The GC is usually implemented by one thread or a group of threads inside the JVM. Like the user program, it occupies heap space and consumes CPU time while it runs. The application stops while the GC is running, so when a collection takes a long time the user can perceive a pause in the Java program. On the other hand, if the GC runs only briefly, the object reclamation rate may be too low: many objects that should be reclaimed are not, and they continue to occupy a large amount of memory. A GC design must therefore trade off pause time against reclamation rate. A good GC implementation lets users choose the settings they need. For example, devices with limited memory are very sensitive to memory usage and want the GC to reclaim memory precisely, even if the program slows down. Real-time online games, by contrast, cannot tolerate long interruptions. Incremental GC uses a suitable collection algorithm to split one long interruption into many short ones, which reduces the impact of the GC on the user program. Although the overall throughput of incremental GC may be lower than that of an ordinary GC, it reduces the program's maximum pause time.
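
One way to get a rough feel for how much time a program spends in collection is to read the JVM's management beans. The sketch below assumes a JVM that exposes the java.lang.management API, which was added in Java 5 and is therefore newer than the JVMs this article describes. The class GcTimeProbe and the allocation loop are invented for illustration.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

final class GcTimeProbe {
    // Sums the approximate accumulated collection time (in ms) over all collectors.
    // A collector reports -1 when the value is undefined, so non-positive values are skipped.
    static long totalCollectionTimeMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long time = gc.getCollectionTime();
            if (time > 0) {
                total += time;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long before = totalCollectionTimeMillis();
        // Hypothetical workload: allocate many short-lived arrays to give the GC some work.
        for (int i = 0; i < 200_000; i++) {
            byte[] garbage = new byte[1024];
        }
        long after = totalCollectionTimeMillis();
        System.out.println("GC time during workload (ms): " + (after - before));
    }
}
```

The reported figure is cumulative collection time, not the length of any single pause, so it is only a coarse indicator of the trade-off discussed above.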

[Figure: comparison of incremental GC and ordinary GC. The gray areas indicate the CPU time used by each thread.]

The HotSpot JVM shipped with the Sun JDK supports incremental GC, but does not use it by default. To enable it, add the -Xincgc parameter when running the Java program. HotSpot's incremental GC uses the train GC algorithm. Its basic idea is to group (stratify) all objects in the heap according to when they were created and how they are used, placing frequently used, interrelated objects in the same queue, and to keep adjusting the groups as the program runs. When the GC runs, it always reclaims the oldest (least recently accessed) objects first, and if an entire group is reclaimable, it reclaims the whole group at once. In this way, each GC run reclaims only a certain proportion of the unreachable objects, which keeps the program running smoothly. The train GC algorithm is a very good algorithm; for details, see reference 4.

The finalize Function

finalize is a method of the Object class, and its access modifier is protected. Because every class is a subclass of Object, user classes can easily access this method. The finalize function is not chained automatically, so we must chain the calls by hand: the last statement of a finalize function is usually super.finalize(). This way finalize is invoked from the bottom of the class hierarchy upward, releasing our own resources first and then those of the parent class.

According to the Java language specification, the JVM guarantees that an object is unreachable before its finalize function is called, but it does not guarantee that the function will ever be called. The specification also guarantees that finalize runs at most once for any given object.

Many Java beginners think this method is similar to a destructor in C++ and put the release of many objects and resources in it. This is not a good approach, for three reasons. First, to support finalize, the GC has to do a lot of extra work for objects that override it. Second, after finalize has run, the object may have become reachable again, so the GC has to check its reachability a second time; using finalize therefore slows the GC down. Third, the time at which the GC calls finalize is not determined, so releasing resources this way is not deterministic either.

In general, finalize is used to release resources that are very important and hard to control, such as I/O handles and data connections, whose release is critical to the whole application. Even then, the program itself should be the primary manager of these resources (including their release), with finalize serving only as a supplement. Together the two form a "double insurance" mechanism; a program should never rely on finalize alone to release resources.
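
A minimal sketch of this double-insurance pattern follows, under stated assumptions: ManagedResource is a hypothetical wrapper, and its open flag stands in for a real file handle or connection. Note that finalize() is deprecated in recent Java versions, so this illustrates the article's advice rather than current best practice.

```java
// Hypothetical resource wrapper; the "open" flag stands in for a real handle or connection.
class ManagedResource implements AutoCloseable {
    private volatile boolean open = true;

    boolean isOpen() {
        return open;
    }

    // Primary path: the program releases the resource explicitly.
    @Override
    public void close() {
        if (open) {
            open = false;
            // release the underlying resource here
        }
    }

    // Backup path only: it may run late or never, so it must not be the sole release point.
    @Override
    @SuppressWarnings({"deprecation", "removal"})
    protected void finalize() throws Throwable {
        try {
            close();
        } finally {
            super.finalize(); // chain to the parent class last
        }
    }
}

class ManagedResourceDemo {
    public static void main(String[] args) {
        ManagedResource resource = new ManagedResource();
        try {
            // use the resource
        } finally {
            resource.close(); // deterministic release by the program itself
        }
        System.out.println(resource.isOpen()); // false
    }
}
```

Because close() is idempotent, it does no harm if the finalizer later calls it again on an already closed resource.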

The following example shows that an object may still be reachable after its finalize function has been called, and that an object's finalize function runs at most once.


class MyObject {
    Test main; // the Test object, used to restore reachability in finalize

    public MyObject(Test t) {
        main = t; // save the Test object
    }

    @Override
    protected void finalize() {
        main.ref = this; // resurrect this object by making it reachable again
        System.out.println("This is finalize"); // shows that finalize runs only once
    }
}

class Test {
    MyObject ref;

    public static void main(String[] args) {
        Test test = new Test();
        test.ref = new MyObject(test);
        test.ref = null; // the MyObject is now unreachable, so its finalize may be called
        System.gc();
        System.runFinalization(); // ask the JVM to run pending finalizers before the check
        if (test.ref != null) System.out.println("MyObject is still alive");
    }
}

Running result (typical; the JVM does not guarantee that System.gc() triggers a collection):
This is finalize
MyObject is still alive

Note that in this example, although the MyObject object becomes reachable again inside finalize, finalize will not be called when the object is next reclaimed, because finalize is called at most once per object.

How to Interact with GC

Java 2 enhanced memory management by adding the java.lang.ref package, which defines three reference classes: SoftReference, WeakReference, and PhantomReference. By using these classes, programmers can interact with the GC to a limited degree in order to improve its efficiency. In strength, these references fall between ordinary (strongly) reachable objects and unreachable objects. From strongest to weakest, the order is: strong reference, soft reference, weak reference, phantom reference.

Creating a reference object is straightforward. For example, to create a softly referenced object, first create the object and hold it through an ordinary reference (making it a reachable object), then create a SoftReference that refers to it, and finally set the ordinary reference to null. The object is now referred to only by the soft reference, and we call it a softly referenced object.

The main characteristic of a soft reference is that it is a comparatively strong reference. Softly referenced memory is reclaimed only when memory runs short, so while memory is plentiful these objects are usually not reclaimed. In addition, soft references are guaranteed to be cleared (set to null) before Java throws an OutOfMemoryError. Soft references can be used to cache frequently used images, making maximal use of memory without causing an OutOfMemoryError. The following pseudo code shows how this reference type is used:


// Create an image object
Image image = new Image();
...
// Use the image
...
// When done with the image, keep only a soft reference and release the strong reference
SoftReference<Image> sr = new SoftReference<>(image);
image = null;
...
// The next time the image is needed
image = sr.get(); // null if the GC has cleared the soft reference
if (image == null) {
    // The GC released the image because memory was low, so it must be reloaded
    image = new Image();
    sr = new SoftReference<>(image);
}
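
The pseudo code above can be turned into a small runnable sketch. This is one possible shape under stated assumptions, not a definitive cache design: SoftCache, its loader, and loadCount are hypothetical names, and a byte array stands in for decoded image data.

```java
import java.lang.ref.SoftReference;
import java.util.function.Supplier;

// Minimal single-slot cache; the loader is whatever recreates the value on demand.
final class SoftCache<T> {
    private final Supplier<T> loader;
    private SoftReference<T> ref = new SoftReference<>(null);
    private int loads = 0;

    SoftCache(Supplier<T> loader) {
        this.loader = loader;
    }

    T get() {
        T value = ref.get();      // take a strong reference before testing for null
        if (value == null) {      // never loaded, or cleared by the GC under memory pressure
            value = loader.get();
            ref = new SoftReference<>(value);
            loads++;
        }
        return value;
    }

    int loadCount() {
        return loads;
    }
}

final class SoftCacheDemo {
    public static void main(String[] args) {
        // byte[] stands in for decoded image data.
        SoftCache<byte[]> cache = new SoftCache<>(() -> new byte[1024 * 1024]);
        byte[] first = cache.get();
        byte[] second = cache.get();
        // While "first" is strongly held, the cached value cannot be cleared.
        System.out.println(first == second);   // true
        System.out.println(cache.loadCount()); // 1
    }
}
```

The key point is that the result of get() is stored in a local variable before the null test, so the object cannot be cleared between the check and its use.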

The biggest difference between weakly referenced and softly referenced objects is that, during a collection, the GC uses an algorithm to decide whether to reclaim a softly referenced object, whereas it always reclaims a weakly referenced object. Weakly referenced objects are therefore reclaimed more easily and more quickly. Although the GC is certain to reclaim weakly referenced objects when it runs, a group of weak objects with complex interrelationships often needs several GC runs to be reclaimed completely. Weak references are often used in Map structures to refer to objects that hold large amounts of data: once the strong reference to such an object is set to null, the GC can quickly reclaim its space. For an example, see reference 4.
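
The Map usage described above might look like the following sketch, which relies on the standard java.util.WeakHashMap class. WeakRegistry and its method names are made up for this example, and the outcome after System.gc() is typical behavior rather than something the specification guarantees.

```java
import java.util.Map;
import java.util.WeakHashMap;

// Hypothetical registry that attaches notes to objects without keeping them alive.
final class WeakRegistry {
    private final Map<Object, String> notes = new WeakHashMap<>();

    void attach(Object owner, String note) {
        notes.put(owner, note);
    }

    String noteFor(Object owner) {
        return notes.get(owner);
    }

    int size() {
        return notes.size();
    }
}

final class WeakRegistryDemo {
    public static void main(String[] args) throws InterruptedException {
        WeakRegistry registry = new WeakRegistry();
        Object owner = new Object();
        registry.attach(owner, "large data would go here");

        System.gc();
        // The key is still strongly referenced by "owner", so the entry must survive.
        System.out.println(registry.noteFor(owner) != null); // true

        owner = null;      // drop the only strong reference to the key
        System.gc();       // only a request; the JVM may ignore it
        Thread.sleep(100); // give the collector a moment
        // Usually 0 at this point, but this is not guaranteed.
        System.out.println(registry.size());
    }
}
```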

Phantom references are rarely used; their main purpose is to assist the finalize function. A phantom-reachable object is one whose finalize function has already run and which is no longer reachable, but which has not yet been reclaimed by the GC. Such objects can help finalize carry out some follow-up reclamation work. We can make the resource-reclamation mechanism more flexible by overriding the clear() method of Reference.
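
The sketch below shows the usual way a phantom reference is observed, through a ReferenceQueue. This is a different technique from overriding clear(), chosen here because it needs only the standard java.lang.ref API. PhantomDemo and track are hypothetical names, and whether the reference is enqueued within the timeout depends on the JVM.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;

final class PhantomDemo {
    // Registers a phantom reference that is enqueued after the referent has been collected.
    static PhantomReference<Object> track(Object referent, ReferenceQueue<Object> queue) {
        return new PhantomReference<>(referent, queue);
    }

    public static void main(String[] args) throws InterruptedException {
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        Object owner = new Object();
        PhantomReference<Object> phantom = track(owner, queue);

        // get() on a phantom reference always returns null, so the object cannot be resurrected.
        System.out.println(phantom.get() == null); // true

        owner = null; // drop the only strong reference
        System.gc();  // a request only; collection is not guaranteed
        Reference<?> enqueued = queue.remove(1000); // wait up to one second
        if (enqueued == phantom) {
            System.out.println("referent collected; follow-up cleanup could run here");
        } else {
            System.out.println("not collected within the timeout");
        }
    }
}
```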

Some Suggestions on Java Coding

Based on the working principle of GC, we can use some techniques and methods to make GC run more efficiently and better meet the requirements of applications. The following are some suggestions for program design.

  1. The most basic suggestion is to release references to useless objects as early as possible. With temporary variables, most programmers simply let the reference variable leave its scope, at which point the reference is dropped automatically. When relying on this, pay special attention to complex object graphs such as arrays, queues, trees, and graphs, whose members refer to one another in more complicated ways. The GC is generally less efficient at reclaiming such objects. If the program permits, assign null to references that are no longer needed as early as possible; this speeds up the GC's work.
  2. Use the finalize function as little as possible. finalize is an opportunity Java gives programmers to release objects or resources, but it increases the GC's workload, so avoid using it to reclaim resources wherever you can.
  3. If you need to keep frequently used images around, use the soft reference type. It keeps the images in memory for as long as possible so the program can use them, without causing an OutOfMemoryError.
  4. Pay attention to collection data types, including arrays, trees, graphs, and linked lists; these data structures are more complex for the GC to reclaim. Also pay attention to global variables and static variables, which tend to cause dangling references and waste memory.
  5. When the program has some idle waiting time, the programmer can call System.gc() manually to ask the GC to run, although the Java language specification does not guarantee that the GC will actually execute. Using incremental GC can shorten the pause time of Java programs.
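
Suggestions 1 and 4 can be illustrated with a small array-backed stack. This is a sketch under stated assumptions: ArrayStack and rawSlot are invented for the example, and rawSlot exists only so the effect can be observed.

```java
import java.util.Arrays;
import java.util.EmptyStackException;

// Array-backed stack that clears slots it no longer uses.
final class ArrayStack {
    private Object[] elements = new Object[16];
    private int size = 0;

    void push(Object element) {
        if (size == elements.length) {
            elements = Arrays.copyOf(elements, 2 * size); // grow when full
        }
        elements[size++] = element;
    }

    Object pop() {
        if (size == 0) {
            throw new EmptyStackException();
        }
        Object result = elements[--size];
        elements[size] = null; // drop the stale reference so the GC can reclaim the object
        return result;
    }

    // Exposed only for the demonstration below.
    Object rawSlot(int index) {
        return elements[index];
    }
}

final class ArrayStackDemo {
    public static void main(String[] args) {
        ArrayStack stack = new ArrayStack();
        stack.push("payload");
        stack.pop();
        System.out.println(stack.rawSlot(0) == null); // true: no hidden reference remains
    }
}
```

Without the assignment to null in pop(), the array would keep referring to popped objects, and the GC could not reclaim them even though the program no longer uses them.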

References

Article

  1. Ouyang Chen, Zhou Xin, "Java and Memory Leakage", http://www-900.ibm.com/developerWorks/cn/java/l-JavaMemoryLeak/index.shtml
  2. Y. Srinivas Ramakrishna, "Automatic Memory Management in the Java HotSpot Virtual Machine", http://java.sun.com/javaone
  3. Monica Pawlan, "Reference Objects and Garbage Collection", a JDC article available at http://developer.java.sun.com/
  4. Bill Venners, "Inside the Java 2 Virtual Machine", Chapter 9, http://www.artima.com/insidejvm/ed2/ch09GarbageCollectionPrint.html
  5. Sun Microsystems, "Java Language Specification, Second Edition"


About the author
Ouyang Chen graduated from Peking University with a master's degree in computer science. He began working in Java-based software development and testing and has taken part in the development and testing of many Java-based applications and web service projects. Contact: yeekee@sina.com
Zhou Xin is a PhD student in computer science at Peking University. Main research interests: program comprehension, reverse engineering, and software metrics. Contact: zhouxin@sei.pku.edu.cn
