Working principle of GC

Last Update:2018-07-26 Source: Internet

Author: User

Tags: memory usage, java, throws


A good Java programmer must understand how the GC works, how to optimize its performance, and how to interact with it in a limited way. Some applications, such as embedded systems and real-time systems, have high performance requirements, and the performance of the whole application can be improved only by improving the efficiency of memory management. This article first briefly introduces the working principle of the GC, then discusses several key GC issues, and finally offers some Java programming suggestions for improving program performance from the GC's point of view.

The basic principle of GC

Java's memory management is actually the management of objects, including the allocation and release of objects.

For the programmer, an object is allocated with the new keyword. To release an object, the programmer simply sets all references to it to null so that the program can no longer access it; we call such an object "unreachable." The GC is responsible for reclaiming the memory space of all "unreachable" objects.

For the GC, when a programmer creates an object, the GC begins to monitor the object's address, size, and usage. In general, the GC uses a directed graph to record and manage all objects in the heap (see Resources 1). In this way, it determines which objects are "reachable" and which are "unreachable." When the GC determines that some objects are "unreachable", it is responsible for reclaiming their memory space. However, to ensure that the GC can be implemented on different platforms, the Java specification does not strictly regulate many aspects of GC behavior. For example, there are no clear rules on which collection algorithm to use or when to collect. Different JVM implementations therefore often use different algorithms, which brings a lot of uncertainty to Java development. This article examines several GC-related issues in order to reduce the negative impact of this uncertainty on Java programs.
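The reachability test described above can be pictured as a graph traversal starting from a set of roots. The following is only a minimal sketch: HeapNode, ReachabilitySketch, and reachableFrom are hypothetical names made up for illustration, and a real JVM does not represent objects this way.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a heap object: a name plus its outgoing references.
class HeapNode {
    final String name;
    final List<HeapNode> refs = new ArrayList<>();
    HeapNode(String name) { this.name = name; }
}

class ReachabilitySketch {
    // Returns every node that can be reached from the roots by following references.
    static Set<HeapNode> reachableFrom(List<HeapNode> roots) {
        Set<HeapNode> marked = Collections.newSetFromMap(new IdentityHashMap<HeapNode, Boolean>());
        Deque<HeapNode> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            HeapNode n = pending.pop();
            if (marked.add(n)) {           // first visit: mark it and follow its edges
                pending.addAll(n.refs);
            }
        }
        return marked;
    }

    public static void main(String[] args) {
        HeapNode a = new HeapNode("a"), b = new HeapNode("b");
        HeapNode c = new HeapNode("c"), d = new HeapNode("d");
        a.refs.add(b);                     // root -> a -> b
        c.refs.add(d);                     // c and d refer to each other,
        d.refs.add(c);                     // but nothing reachable refers to them
        Set<HeapNode> live = reachableFrom(Arrays.asList(a));
        for (HeapNode n : Arrays.asList(a, b, c, d)) {
            System.out.println(n.name + (live.contains(n) ? " reachable" : " unreachable"));
        }
    }
}
```

In this sketch c and d are reported as unreachable even though they refer to each other, because no path leads to them from the root.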

Incremental GC

The GC is typically implemented by one or a group of threads inside the JVM. Like the user program, it occupies heap space and consumes CPU time when it runs. While the GC is running, the application stops. Therefore, when the GC runs for a long time, the user can feel the Java program pause; on the other hand, if the GC runs too briefly, the recovery rate may be too low, meaning that many objects that should have been reclaimed are not and still occupy a lot of memory. When designing a GC, pause time must therefore be weighed against the recovery rate. A good GC implementation lets users choose the settings they need. For example, devices with limited memory are very sensitive to memory usage: they want the GC to reclaim memory precisely and do not mind the program slowing down. Real-time online games, in contrast, cannot tolerate long interruptions. An incremental GC uses a collection algorithm that divides one long pause into many short ones, thereby reducing the GC's impact on the user program. Although an incremental GC may be less efficient than an ordinary GC in overall performance, it reduces the program's maximum pause time.

The HotSpot JVM provided by the Sun JDK supports incremental GC. By default the HotSpot JVM does not use incremental GC; to enable it, we must add the -Xincgc parameter when running the Java program. The HotSpot JVM's incremental GC is implemented with the train GC algorithm. The basic idea is to group (layer) all objects in the heap according to their creation and usage, placing objects that are used frequently and are closely related to one another in the same group, and to keep adjusting the groups as the program runs. When the GC runs, it always reclaims the oldest (least recently accessed) objects first, and if an entire group is garbage, the GC reclaims the whole group. In this way, each GC run reclaims only a certain proportion of the unreachable objects, which keeps the program running smoothly.
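The idea of dividing one long pause into many short ones can be illustrated with a toy marker that does a bounded amount of work per call. This is a sketch under strong assumptions: it is not the train algorithm, HeapCell and IncrementalMarker are hypothetical names, and it ignores the fact that a real application changes references between slices, which real incremental collectors must track.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Hypothetical heap object for the sketch.
class HeapCell {
    final List<HeapCell> refs = new ArrayList<>();
    boolean marked;
}

// Toy marker that marks a bounded number of objects per call.
class IncrementalMarker {
    private final Deque<HeapCell> pending = new ArrayDeque<>();

    IncrementalMarker(List<HeapCell> roots) { pending.addAll(roots); }

    // Marks at most `budget` objects, then returns so the application can resume.
    // Returns true once there is nothing left to mark.
    boolean step(int budget) {
        int done = 0;
        while (done < budget && !pending.isEmpty()) {
            HeapCell c = pending.pop();
            if (!c.marked) {
                c.marked = true;
                pending.addAll(c.refs);
                done++;
            }
        }
        return pending.isEmpty();
    }
}

class IncrementalSketch {
    public static void main(String[] args) {
        // Build a chain of 10 objects reachable from a single root.
        HeapCell root = new HeapCell();
        HeapCell tail = root;
        for (int i = 1; i < 10; i++) {
            HeapCell next = new HeapCell();
            tail.refs.add(next);
            tail = next;
        }

        IncrementalMarker marker = new IncrementalMarker(Arrays.asList(root));
        int slices = 0;
        boolean finished;
        do {
            finished = marker.step(3);   // one short pause
            slices++;                    // application code would run between slices
        } while (!finished);
        System.out.println("marking finished after " + slices + " short slices");
    }
}
```

With a budget of 3 objects per slice, the 10 objects are marked in 4 slices instead of one long pass; the total work is the same, but no single pause handles more than 3 objects.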

The finalize function in detail

finalize is a method of the Object class with the access modifier protected. Because all classes are subclasses of Object, a user class can easily access this method. Since finalize calls are not chained automatically, we must chain them manually, so the last statement of a finalize function is usually super.finalize(). In this way, finalize is called from the bottom up: a class releases its own resources first and then the resources of its parent class.
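A minimal sketch of this manual chaining follows. BaseResource and DerivedResource are hypothetical classes, the log field exists only to make the call order visible, and finalize is called directly here purely for demonstration (normally only the GC calls it). Note also that finalize has been deprecated since Java 9, so recent compilers will print a warning.

```java
// Hypothetical base class that owns some resource.
class BaseResource {
    final StringBuilder log = new StringBuilder();   // records the call order for the demo

    @Override
    protected void finalize() throws Throwable {
        log.append("base;");             // release the parent's resources
        super.finalize();
    }
}

// Hypothetical subclass that owns an additional resource.
class DerivedResource extends BaseResource {
    @Override
    protected void finalize() throws Throwable {
        try {
            log.append("derived;");      // release this class's own resources first
        } finally {
            super.finalize();            // then chain to the parent, even if the cleanup above throws
        }
    }
}

class FinalizeChainDemo {
    public static void main(String[] args) throws Throwable {
        DerivedResource r = new DerivedResource();
        r.finalize();                    // direct call, only to show the order
        System.out.println(r.log);       // prints derived;base;
    }
}
```

Putting super.finalize() in a finally block is a common variation on "last statement": it keeps the chain intact even when the subclass cleanup fails.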

According to the Java Language Specification, the JVM guarantees that an object is unreachable before its finalize function is invoked, but it does not guarantee that the function will ever be invoked. In addition, the specification guarantees that finalize runs at most once for an object.

Many Java beginners think this method is similar to a destructor in C++ and put the release of many objects and resources in it. In fact, this is not a good approach, for three reasons. First, to support finalize, the GC has to do a lot of extra work for objects that override it. Second, after finalize has run, the object may have become reachable again, so the GC must check the object's reachability once more. Using finalize therefore reduces GC performance. Third, because the time at which the GC calls finalize is indeterminate, the time at which resources are released this way is also indeterminate.

Typically, finalize is used to release resources that are hard to control and very important, such as some I/O operations and data connections. Releasing these resources is critical to the whole application. In such cases the programmer should manage these resources (including releasing them) primarily in the program itself and use finalize only as a supplement, forming a double-insurance mechanism rather than relying on finalize alone.
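One way to picture this double-insurance arrangement is a class whose primary release path is an explicit close() and whose finalize only acts as a backup. This is a hedged sketch: ManagedConnection is a hypothetical class, and the try-with-resources syntax used in the demo is an assumption about the Java version (it requires Java 7 or later).

```java
// Hypothetical wrapper around a scarce resource such as a data connection.
class ManagedConnection implements AutoCloseable {
    private boolean closed;

    boolean isClosed() { return closed; }

    // Primary path: the program releases the resource explicitly.
    @Override
    public void close() {
        if (!closed) {
            closed = true;
            // ... release the underlying resource here ...
        }
    }

    // Backup path: helps only if the program forgot close() and the GC finalizes the object.
    @Override
    protected void finalize() throws Throwable {
        try {
            close();
        } finally {
            super.finalize();
        }
    }
}

class ManagedConnectionDemo {
    public static void main(String[] args) {
        ManagedConnection conn = new ManagedConnection();
        try (ManagedConnection c = conn) {
            // ... use the connection through c ...
        }
        System.out.println("closed after try block: " + conn.isClosed());   // true
    }
}
```

Because close() is idempotent, it does no harm if the backup path runs after the program has already released the resource.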

The following example shows that an object may still be reachable after its finalize function has been invoked, and that an object's finalize runs only once.

class MyObject {
    Test main; // records the Test object, used in finalize to restore reachability

    public MyObject(Test t) {
        main = t; // save the Test object
    }

    @Override
    protected void finalize() {
        main.ref = this; // resurrect this object by making it reachable again
        System.out.println("This is finalize"); // shows that finalize runs only once
    }
}

class Test {
    MyObject ref;

    public static void main(String[] args) throws InterruptedException {
        Test test = new Test();
        test.ref = new MyObject(test);
        test.ref = null; // the MyObject instance is now unreachable, so finalize will be called
        System.gc();
        Thread.sleep(1000); // finalize runs on a separate thread; give it time to finish
        if (test.ref != null) System.out.println("My object is alive");
    }
}

Run result:

This is finalize

My object is alive

In this example, note that although the MyObject object becomes reachable again in finalize, finalize will not be invoked again the next time the object is collected, because finalize is invoked at most once.

How a program interacts with the GC

Java 2 enhanced memory management by adding the java.lang.ref package, which defines three reference classes: SoftReference, WeakReference, and PhantomReference. By using these classes, programmers can interact with the GC to some extent to improve its efficiency. The strength of these references lies between that of a reachable object and that of an unreachable object.

Creating a reference object is also easy. For example, to create a soft reference object, first create an object and refer to it with an ordinary (strong) reference; then create a SoftReference that refers to the object; finally set the ordinary reference to null. The object is then referred to only by a soft reference, and we call it a soft reference object.

The main feature of a soft reference is that it is a relatively strong reference. Softly referenced objects are reclaimed only when memory runs short, so they are usually not reclaimed while memory is sufficient. In addition, soft references are guaranteed to be cleared (set to null) before Java throws an OutOfMemoryError. They can be used to cache commonly used images, making the fullest use of memory without causing an OutOfMemoryError. The following sketch shows how this reference type is used:

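import java.lang.ref.SoftReference;

// Image is a placeholder class for this example; it stands for any large, reloadable object
class Image { }

class SoftReferenceExample {
    public static void main(String[] args) {
        // Create an Image object
        Image image = new Image();
        // ... use image ...

        // Done with the image for now: wrap it in a soft reference and release the strong reference
        SoftReference<Image> sr = new SoftReference<>(image);
        image = null;
        // ...

        // Next time the image is needed
        image = sr.get();
        if (image == null) {
            // The GC has released the image because memory was low, so it must be reloaded
            image = new Image();
            sr = new SoftReference<>(image);
        }
    }
}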
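The same pattern can be wrapped in a small cache class. The sketch below is only illustrative: SoftCache and its loader callback are hypothetical names, the byte array stands in for image data, and the class is not thread-safe.

```java
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Minimal cache whose values are held only through soft references.
class SoftCache<K, V> {
    private final Map<K, SoftReference<V>> entries = new HashMap<>();
    private final Function<K, V> loader;   // hypothetical callback that (re)loads a value

    SoftCache(Function<K, V> loader) { this.loader = loader; }

    V get(K key) {
        SoftReference<V> ref = entries.get(key);
        V value = (ref == null) ? null : ref.get();
        if (value == null) {               // never cached, or cleared by the GC under memory pressure
            value = loader.apply(key);
            entries.put(key, new SoftReference<>(value));
        }
        return value;
    }
}

class SoftCacheDemo {
    public static void main(String[] args) {
        SoftCache<String, byte[]> images = new SoftCache<>(name -> new byte[1024]);
        byte[] first = images.get("logo.png");
        byte[] second = images.get("logo.png");
        // While `first` is strongly held, the cached value cannot be cleared.
        System.out.println("same instance while strongly held: " + (first == second));   // true
    }
}
```

Callers never see a cleared reference; they only pay the reload cost when the GC has actually reclaimed the value.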

The biggest difference between a weak reference object and a soft reference object is that when the GC runs, it uses an algorithm to decide whether to reclaim a soft reference object, whereas it always reclaims a weak reference object. Weak reference objects are therefore reclaimed more easily and more quickly. Although the GC always reclaims weak objects when it runs, a group of weak objects with complex relationships often needs several GC runs to be reclaimed completely. Weak references are often used in Map structures to refer to objects holding large amounts of data; once the strong reference to such an object is set to null, the GC can quickly reclaim its space.
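A map that holds its values through weak references might look like the following sketch. WeakIndex is a hypothetical helper class; whether the object has been collected after System.gc() depends on the JVM, so the demo prints whichever outcome it observes.

```java
import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.Map;

// Hypothetical map wrapper that does not keep its values alive.
class WeakIndex<V> {
    private final Map<String, WeakReference<V>> entries = new HashMap<>();

    void put(String key, V value) {
        entries.put(key, new WeakReference<>(value));
    }

    // Returns the value, or null if it was never stored or has already been collected.
    V get(String key) {
        WeakReference<V> ref = entries.get(key);
        if (ref == null) return null;
        V value = ref.get();
        if (value == null) entries.remove(key);   // drop the stale entry
        return value;
    }
}

class WeakIndexDemo {
    public static void main(String[] args) {
        WeakIndex<byte[]> index = new WeakIndex<>();
        byte[] data = new byte[1 << 20];          // strong reference to a large object
        index.put("report", data);

        // While `data` is strongly reachable, the weak reference cannot be cleared.
        System.out.println("present while strongly held: " + (index.get("report") == data));

        data = null;                              // drop the only strong reference
        System.gc();                              // only a request; collection is not guaranteed

        System.out.println(index.get("report") == null
                ? "collected"
                : "still present, not collected yet");
    }
}
```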

Phantom references are used less often, mainly to assist the finalize function. A phantom reference object is one whose finalize function has run and which is unreachable but has not yet been reclaimed by the GC. Such an object can assist finalize with some later cleanup work, and we can make the resource recycling mechanism more flexible by overriding the Reference clear() method.
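A hedged sketch of this idea: a PhantomReference subclass that overrides clear() to perform late cleanup, combined with a ReferenceQueue to learn when the referent is gone. CleanupReference and resourceName are hypothetical names. Note that the GC clears references directly without calling clear(), so the override runs only when program code calls it, and whether the reference is enqueued within the timeout depends on the JVM.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;

// Phantom reference that remembers what to clean up once its referent is gone.
class CleanupReference extends PhantomReference<Object> {
    final String resourceName;   // hypothetical label for an external resource
    boolean cleaned;

    CleanupReference(Object referent, ReferenceQueue<Object> queue, String resourceName) {
        super(referent, queue);
        this.resourceName = resourceName;
    }

    // Called by program code only; the GC itself never invokes this method.
    @Override
    public void clear() {
        cleaned = true;          // ... release the external resource here ...
        super.clear();
    }
}

class PhantomDemo {
    public static void main(String[] args) throws InterruptedException {
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        Object owner = new Object();
        CleanupReference ref = new CleanupReference(owner, queue, "temp-file");

        // get() on a phantom reference always returns null.
        System.out.println("get() returns: " + ref.get());

        owner = null;                                  // make the referent unreachable
        System.gc();                                   // only a request
        Reference<?> enqueued = queue.remove(1000);    // wait up to one second
        if (enqueued != null) {
            enqueued.clear();                          // runs the cleanup above
            System.out.println("cleaned up " + ((CleanupReference) enqueued).resourceName);
        } else {
            System.out.println("not enqueued yet; the GC has not processed the object");
        }
    }
}
```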

Some suggestions for Java coding

Based on how the GC works, we can use some techniques and methods to make the GC run more efficiently and meet the requirements of the application. Here are a few suggestions for programming.

1. The most basic advice is to release references to objects that are no longer needed as soon as possible. Most programmers rely on temporary variables, whose references are dropped automatically when they leave their scope. When doing so, pay special attention to complex object graphs such as arrays, queues, trees, and graphs, whose elements refer to one another in complicated ways. The GC is generally less efficient at reclaiming such objects. If the program allows, set references that are no longer used to null as soon as possible. This speeds up the GC's work.

2. Use the finalize function as little as possible. finalize is an opportunity that Java gives programmers to release objects or resources, but it increases the GC's workload, so use finalize to reclaim resources as rarely as possible.

3. If you need frequently used images, use soft references. They keep the images in memory for the program to use as long as possible without causing an OutOfMemoryError.

4. Pay attention to collection data types, including arrays, trees, graphs, linked lists, and other data structures, which are more complex for the GC to reclaim. Also pay attention to global variables and static variables. These variables easily cause dangling references, which waste memory.

5. When the program has some idle waiting time, the programmer can call System.gc() manually to ask the GC to run, although the Java Language Specification does not guarantee that the GC will actually run. Using an incremental GC can shorten the pauses of a Java program.
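The first and fourth suggestions can be illustrated with an array-backed stack that clears a slot as soon as its element is popped. This is a minimal sketch; SimpleStack and holdsReferenceAt are hypothetical names, and the latter exists only so the demo can inspect the array.

```java
import java.util.Arrays;

// Simple array-backed stack; the growth policy is only for illustration.
class SimpleStack<T> {
    private Object[] items = new Object[4];
    private int size;

    void push(T item) {
        if (size == items.length) {
            items = Arrays.copyOf(items, size * 2);
        }
        items[size++] = item;
    }

    @SuppressWarnings("unchecked")
    T pop() {
        if (size == 0) throw new IllegalStateException("stack is empty");
        T item = (T) items[--size];
        items[size] = null;     // drop the stale reference so the GC can reclaim the object
        return item;
    }

    // For the demo only: is the slot at `index` still holding a reference?
    boolean holdsReferenceAt(int index) { return items[index] != null; }
}

class SimpleStackDemo {
    public static void main(String[] args) {
        SimpleStack<String> stack = new SimpleStack<>();
        stack.push("a");
        stack.push("b");
        stack.pop();
        System.out.println("slot 1 still referenced: " + stack.holdsReferenceAt(1));   // false
    }
}
```

Without the assignment to null in pop(), the array would keep the popped object reachable until the slot was overwritten, even though the program could no longer use it.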

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
