A good Java programmer needs to understand how the garbage collector (GC) works, how to optimize GC performance, and how to interact with the GC in the limited ways Java allows. Applications with strict performance requirements, such as embedded and real-time systems, can improve overall performance only by making memory management more efficient across the board. This article explains the principles of GC, discusses several key GC issues, and closes with Java programming suggestions for improving program performance from the GC's point of view.
Fundamentals of GC
What is a GC? Why do we have a GC?
GC stands for garbage collection. Memory management is an area where programmers easily make mistakes: forgetting to reclaim memory, or reclaiming it incorrectly, can make a program or system unstable or even crash it. Java's GC automatically tracks whether an object can still be used and reclaims its memory once it cannot. The Java language provides no explicit operation for releasing allocated memory.
Therefore, Java's memory management is actually the management of objects, including the allocation and release of objects.
For the programmer, an object is allocated with the new keyword. To release an object, set every reference to it to null so that the program can no longer access it; such an object is called "unreachable". The GC is responsible for reclaiming the memory of all unreachable objects.
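The allocate-then-release cycle described above can be sketched as follows. This is a minimal illustration, not a definitive test of GC behavior: the class ReachabilityDemo and its helper isAlive are hypothetical names, and a WeakReference is used only as a probe that does not keep the object alive. Because System.gc() is merely a request, the second line of output may differ between JVMs.

```java
import java.lang.ref.WeakReference;

class ReachabilityDemo {
    /** Reports whether the object behind the probe is still in memory. */
    static boolean isAlive(WeakReference<?> probe) {
        return probe.get() != null;
    }

    public static void main(String[] args) {
        Object data = new Object();                              // allocated with new; data is a strong reference
        WeakReference<Object> probe = new WeakReference<>(data); // the probe does not keep the object alive

        System.out.println("alive while referenced: " + isAlive(probe));

        data = null;   // drop the only strong reference: the object is now unreachable
        System.gc();   // only a request; the JVM may ignore it

        System.out.println("alive after gc request: " + isAlive(probe));
    }
}
```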
For the GC, once a programmer creates an object, the GC begins tracking the object's address, size, and usage. Typically, a GC uses a graph to record and manage all objects in the heap, and from that graph it determines which objects are "reachable" and which are "unreachable". When the GC determines that some objects are unreachable, it is responsible for reclaiming their memory. However, so that the GC can be implemented on different platforms, the Java specification leaves much of its behavior unspecified. For example, there are no firm rules about which collection algorithm to use or when to collect. Different JVM implementations therefore often use different algorithms, which introduces a good deal of uncertainty into Java development. This article examines several issues related to how the GC works and tries to reduce the negative impact of that uncertainty on Java programs.
Incremental GC
A GC is typically implemented by one thread or a group of threads inside the JVM. Like the user program, it occupies heap space and consumes CPU time while it runs, and while the GC is running the application is paused. If the GC runs for a long time, the user notices the Java program pausing; if it runs for too short a time, too few objects are reclaimed, which means many objects that should have been collected still occupy memory. A GC design therefore has to trade pause time against the reclamation rate. A good GC implementation lets users choose the settings they need. A device with limited memory, for example, is very sensitive to memory usage and wants the GC to reclaim memory precisely, without caring much that the program slows down. A real-time online game, by contrast, cannot tolerate long interruptions. Incremental GC uses a suitable collection algorithm to split one long pause into many short ones, reducing the GC's impact on the user program. Although an incremental GC may be less efficient overall than an ordinary GC, it reduces the program's maximum pause time.
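To see how much time the collectors of a particular JVM actually consume, the standard java.lang.management API can be queried. The sketch below is one possible way to do this; the class GcStats and the method totalCollectionMillis are hypothetical names. Note that getCollectionTime() reports approximate accumulated collection time, which is not necessarily identical to the pause time the user perceives, and that the collector names printed depend on the JVM and its settings.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

class GcStats {
    /** Sums the approximate collection time (ms) reported by all collectors. */
    static long totalCollectionMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime();   // -1 if the collector does not report it
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // Create short-lived garbage so the collectors have some work to do.
        for (int i = 0; i < 200_000; i++) {
            byte[] junk = new byte[1024];
        }
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName()
                    + ": collections=" + gc.getCollectionCount()
                    + ", time(ms)=" + gc.getCollectionTime());
        }
        System.out.println("total GC time (ms): " + totalCollectionMillis());
    }
}
```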
The HotSpot JVM shipped with the Sun JDK supports incremental GC. It is not enabled by default; to turn it on, add the -Xincgc option when launching the Java program. HotSpot's incremental GC is implemented with the train algorithm. The basic idea is to group (layer) all objects in the heap according to when they were created and how they are used, placing frequently used and closely related objects in the same group, and to keep adjusting the groups as the program runs. When the GC runs, it always collects the oldest (least recently accessed) objects first, and if an entire group is garbage, it reclaims the whole group at once. In this way, each GC run reclaims only a portion of the unreachable objects, which keeps the program running smoothly.
The finalize method in detail
finalize is a method defined in the Object class with the access modifier protected. Because every class is a subclass of Object, user classes can easily access and override it. finalize calls are not chained automatically, so the chaining must be done by hand: the last statement of a finalize method is usually super.finalize(). This way finalize runs from the bottom of the class hierarchy upward, releasing the class's own resources first and then the parent class's resources.
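The manual chaining described above might look like the following sketch. All class names are hypothetical, and finalize is invoked directly here purely to make the call order visible; in real code only the GC calls it. Placing super.finalize() in a finally block is a common variation on "last statement" that still chains if the subclass cleanup throws. Note also that finalize has been deprecated since Java 9, so recent compilers emit a warning for this code.

```java
import java.lang.ref.Reference;
import java.util.ArrayList;
import java.util.List;

class ParentResource {
    protected final List<String> log;   // records the release order for the demo

    ParentResource(List<String> log) { this.log = log; }

    @Override
    protected void finalize() throws Throwable {
        log.add("parent released");
        super.finalize();               // Object.finalize() does nothing, but the chain stays complete
    }
}

class ChildResource extends ParentResource {
    ChildResource(List<String> log) { super(log); }

    @Override
    protected void finalize() throws Throwable {
        try {
            log.add("child released");  // release this class's own resources first
        } finally {
            super.finalize();           // then chain to the parent, even if the code above throws
        }
    }
}

class FinalizeChainDemo {
    /** Calls finalize directly, for demonstration only, and returns the release order. */
    static List<String> releaseOrder() {
        List<String> log = new ArrayList<>();
        ChildResource r = new ChildResource(log);
        try {
            r.finalize();
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
        List<String> order = new ArrayList<>(log);
        Reference.reachabilityFence(r); // keep r reachable so the GC cannot finalize it before the copy
        return order;
    }

    public static void main(String[] args) {
        System.out.println(releaseOrder()); // prints [child released, parent released]
    }
}
```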
According to the Java Language Specification, the JVM guarantees that an object is unreachable before its finalize method is called, but it does not guarantee that finalize will ever be called. The specification also guarantees that finalize runs at most once per object.
Many Java beginners assume that finalize is similar to a destructor in C++ and put the release of objects and resources in it. This is not a good approach, for three reasons. First, to support finalize, the GC has to do a lot of extra work for any object that overrides it. Second, after finalize runs, the object may have become reachable again, so the GC must check its reachability a second time; using finalize therefore slows the GC down. Third, because the time at which the GC calls finalize is not deterministic, releasing resources this way is not deterministic either.
In general, finalize is used to release very important resources that the GC cannot manage, such as those held by I/O operations and data connections. Releasing these resources is critical to the whole application. The program should therefore manage them itself, including releasing them explicitly, and use finalize only as a supplement, forming a "double insurance" mechanism. It should never rely on finalize alone to release resources.
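One way to picture the "double insurance" mechanism is sketched below. The class ManagedConnection is hypothetical and holds no real resource; the releaseCount field exists only so the demo can show that the resource is released exactly once. The explicit close() call is the primary path, and finalize is only a backup for callers that forget it.

```java
class ManagedConnection implements AutoCloseable {
    private boolean closed = false;
    private int releaseCount = 0;        // demo-only counter

    /** Primary path: explicit, deterministic release. Safe to call more than once. */
    @Override
    public synchronized void close() {
        if (closed) {
            return;
        }
        closed = true;
        releaseCount++;                  // stands in for releasing the real resource
    }

    synchronized boolean isClosed() { return closed; }

    synchronized int releaseCount() { return releaseCount; }

    /** Backup path: runs only if the GC finalizes an object that was never closed. */
    @Override
    protected void finalize() throws Throwable {
        try {
            close();
        } finally {
            super.finalize();
        }
    }
}

class DoubleInsuranceDemo {
    public static void main(String[] args) {
        ManagedConnection conn = new ManagedConnection();
        try {
            // ... use the connection ...
        } finally {
            conn.close();                // the program releases the resource itself
        }
        System.out.println("closed=" + conn.isClosed() + ", releases=" + conn.releaseCount());
        // prints closed=true, releases=1
    }
}
```

Because close() is idempotent, a later finalize call on an already closed object does nothing.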
The example below shows that an object may become reachable again after its finalize method has been called, and that an object's finalize method runs only once.
class MyObject {
    Test main; // the Test object used to restore reachability in finalize

    public MyObject(Test t) {
        main = t; // save the Test object
    }

    @Override
    protected void finalize() {
        main.ref = this; // resurrect this object by making it reachable again
        System.out.println("This is finalize"); // shows that finalize runs only once
    }
}

class Test {
    MyObject ref;

    public static void main(String[] args) {
        Test test = new Test();
        test.ref = new MyObject(test);
        test.ref = null; // the MyObject object is now unreachable, so finalize will be called
        System.gc();
        System.runFinalization(); // finalizers run on another thread; ask the JVM to run pending ones
        if (test.ref != null) System.out.println("My object is still alive");
    }
}
Output:
This is finalize
My object is still alive
Note that although the MyObject object becomes reachable again in finalize, finalize will not be called again the next time the object is collected, because finalize is called only once per object.
How the program interacts with the GC
Java 2 enhanced memory management by adding the java.lang.ref package, which defines three reference classes: SoftReference, WeakReference, and PhantomReference. With these classes, programmers can interact with the GC to a limited degree and help it work more efficiently. The strength of these references lies between that of a reachable object and that of an unreachable object.
Creating a reference object is easy. For example, to create a soft reference object, first create an object and hold it through an ordinary reference (which makes it reachable), then create a SoftReference that refers to the object, and finally set the ordinary reference to null. The object is now referred to only by the soft reference, and we call it a softly referenced object.
The main characteristic of a soft reference is that it is a relatively strong kind of reference. A softly referenced object is reclaimed only when memory runs short, so while memory is plentiful it is usually kept. In addition, soft references are guaranteed to be cleared (set to null) before the JVM throws an OutOfMemoryError. Soft references are well suited to caches, for example a cache of frequently used images: they make the most of available memory without causing an OutOfMemoryError. The following pseudo-code shows how this kind of reference is used:
// Request an image object
Image image = new Image(); // create the image object
...
// Use the image
...
// After using the image, wrap it in a soft reference and release the strong reference
SoftReference<Image> sr = new SoftReference<>(image);
image = null;
...
// Next time the image is needed
image = sr.get(); // returns null if the GC has cleared the soft reference
if (image == null) {
    // The GC reclaimed the image because memory was low, so it must be reloaded
    image = new Image();
    sr = new SoftReference<>(image);
}
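The same pattern can be packaged as a small runnable helper. This is only a sketch under stated assumptions: SoftCache is a hypothetical class, and a byte array stands in for the image so that the example needs no imaging library. The loader is called again whenever the GC has cleared the soft reference.

```java
import java.lang.ref.SoftReference;
import java.util.function.Supplier;

class SoftCache<T> {
    private final Supplier<T> loader;                          // knows how to (re)load the value
    private SoftReference<T> ref = new SoftReference<>(null);  // starts out empty

    SoftCache(Supplier<T> loader) {
        this.loader = loader;
    }

    /** Returns the cached value, reloading it if the GC has cleared the soft reference. */
    synchronized T get() {
        T value = ref.get();
        if (value == null) {
            value = loader.get();
            ref = new SoftReference<>(value);
        }
        return value;
    }
}

class SoftCacheDemo {
    public static void main(String[] args) {
        // The byte array is a stand-in for decoded image data.
        SoftCache<byte[]> imageCache = new SoftCache<>(() -> new byte[1024 * 1024]);
        byte[] pixels = imageCache.get();   // first call loads the data
        System.out.println("same instance while strongly held: " + (pixels == imageCache.get()));
    }
}
```

As long as the caller holds a strong reference to the value, the soft reference cannot be cleared, so repeated calls return the same instance without reloading.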
The biggest difference between weak references and soft references is that when the GC runs, it uses an algorithm to decide whether to reclaim a softly referenced object, whereas it always reclaims a weakly referenced object. Weakly referenced objects are therefore reclaimed more easily and sooner. Although the GC is required to reclaim weakly referenced objects when it runs, groups of such objects with complex relationships often take several GC runs to reclaim completely. Weak references are often used in map structures to refer to objects holding large amounts of data: once the strong reference to such an object is set to null, the GC can quickly reclaim its space.
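A map whose values are held through weak references could be sketched like this. WeakValueMap is a hypothetical helper written for illustration, not a JDK class (the JDK's WeakHashMap holds its keys weakly, not its values). The final line of output depends on whether the JVM honors the System.gc() request.

```java
import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.Map;

class WeakValueMap<K, V> {
    private final Map<K, WeakReference<V>> map = new HashMap<>();

    void put(K key, V value) {
        map.put(key, new WeakReference<>(value));   // the map itself does not keep the value alive
    }

    /** Returns the value, or null if the key is absent or the GC has reclaimed the value. */
    V get(K key) {
        WeakReference<V> ref = map.get(key);
        if (ref == null) {
            return null;
        }
        V value = ref.get();
        if (value == null) {
            map.remove(key);                        // drop the stale entry
        }
        return value;
    }
}

class WeakValueMapDemo {
    public static void main(String[] args) {
        WeakValueMap<String, byte[]> blobs = new WeakValueMap<>();
        byte[] big = new byte[4 * 1024 * 1024];     // a large object
        blobs.put("report", big);
        System.out.println("present while strongly held: " + (blobs.get("report") == big));

        big = null;     // release the strong reference
        System.gc();    // only a request
        System.out.println("present after gc request: " + (blobs.get("report") != null));
    }
}
```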
Phantom references are used less often; they mainly assist finalize. A phantom-referenced object is one whose finalize method has already run and which is unreachable but has not yet been reclaimed by the GC. Such a reference can help with cleanup work after finalize, and overriding the reference's clear() method makes the resource-reclamation mechanism more flexible.
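A hedged sketch of this idea follows. CleanupReference is a hypothetical subclass that overrides clear() to perform its cleanup. One point worth stating explicitly: the garbage collector clears references directly, without invoking the clear() method, so the program itself must call clear() after taking the reference from its ReferenceQueue. Whether the reference is enqueued within the one-second wait depends on the JVM honoring the System.gc() request.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;

class CleanupReference extends PhantomReference<Object> {
    private boolean cleaned = false;

    CleanupReference(Object referent, ReferenceQueue<Object> queue) {
        super(referent, queue);
    }

    boolean isCleaned() { return cleaned; }

    @Override
    public void clear() {
        cleaned = true;     // stands in for releasing a resource tied to the referent
        super.clear();
    }
}

class PhantomDemo {
    public static void main(String[] args) throws InterruptedException {
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        Object resourceOwner = new Object();
        CleanupReference ref = new CleanupReference(resourceOwner, queue);

        resourceOwner = null;   // make the referent unreachable
        System.gc();            // only a request

        Reference<?> dequeued = queue.remove(1000);   // wait up to one second for the GC to enqueue it
        if (dequeued != null) {
            dequeued.clear();   // application code calls clear(); the GC never does
        }
        System.out.println("cleanup ran: " + ref.isCleaned());
    }
}
```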
Some suggestions for Java programming
Based on how the GC works, a number of techniques can help the GC run more efficiently and better meet the application's requirements. Here are some programming suggestions:
1. The most basic suggestion is to release references to objects that are no longer needed as early as possible. Most programmers rely on temporary variables, whose references are dropped automatically when execution leaves their scope. With this approach, pay special attention to complex object graphs such as arrays, queues, trees, and graphs, whose elements refer to one another in complicated ways; the GC generally reclaims such objects less efficiently. If the program allows it, set references that are no longer used to null as early as possible. This speeds up the GC's work.
2. Use finalize sparingly. The finalize method gives programmers an opportunity to release objects or resources, but it adds to the GC's workload, so reclaim resources through finalize as little as possible.
3. If a picture is used frequently, consider holding it through a soft reference. This keeps the picture in memory for as long as possible so the program can use it, without causing an OutOfMemoryError.
4. Watch out for collection data types, including arrays, trees, graphs, and linked lists; these structures are more complex for the GC to reclaim. Also be careful with global variables and static variables, which tend to leave dangling references (references to objects that are no longer needed) and so waste memory.
5. When the program has some idle time, the programmer can call System.gc() manually to ask the GC to run, but the Java Language Specification does not guarantee that the GC will actually run. Using incremental GC can shorten the pauses in a Java program.
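As an illustration of suggestions 1 and 4, the sketch below shows an array-backed stack that clears each slot when an element is popped. SimpleStack and its slot accessor are hypothetical names used only for this example. Without the assignment to null, the backing array would keep a reference to every popped object, and the GC could not reclaim it even though the program no longer uses it.

```java
import java.util.Arrays;

class SimpleStack {
    private Object[] elements = new Object[4];
    private int size = 0;

    void push(Object item) {
        if (size == elements.length) {
            elements = Arrays.copyOf(elements, size * 2);   // grow the backing array
        }
        elements[size++] = item;
    }

    Object pop() {
        if (size == 0) {
            throw new IllegalStateException("stack is empty");
        }
        Object item = elements[--size];
        elements[size] = null;   // release the obsolete reference so the GC can reclaim the object
        return item;
    }

    /** Demo-only peek at the backing array. */
    Object slot(int index) {
        return elements[index];
    }
}

class SimpleStackDemo {
    public static void main(String[] args) {
        SimpleStack stack = new SimpleStack();
        stack.push("payload");
        stack.pop();
        System.out.println("slot cleared after pop: " + (stack.slot(0) == null)); // prints true
    }
}
```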