Java Memory Leaks: Causes and Detection Tools


Original article: http://dev2dev.bea.com/pub/a/2005/06/memory_leaks.html
Eliminating Memory Leaks

Author: Staffan Larsen

Summary

Although the Java Virtual Machine (JVM) and its garbage collector (GC) handle most memory management tasks, memory leaks can still occur in Java programs. In fact, they are a common problem in large projects. The first step toward avoiding memory leaks is to understand how they happen. This article describes some common memory leak traps in Java code and some best practices for writing leak-free code. Once a memory leak has occurred, pinpointing the code that causes it is very difficult, so this article also introduces a new tool for diagnosing leaks and identifying their root cause. The tool has very low overhead, so it can be used to find memory leaks in production systems.

Role of the Garbage Collector

Although the garbage collector handles most memory management issues and makes life much easier for programmers, programmers can still make mistakes that lead to memory problems. In short, the GC repeatedly traces all references starting from "root" objects (objects on the stack, static objects, objects pointed to by JNI handles, and so on) and marks every object it can reach as live. The program can manipulate only these objects; all other objects are deleted. Because the GC makes it impossible for the program to reach a deleted object, doing so is safe.

Even though memory management is largely automatic, it does not free programmers from thinking about memory. For example, allocating (and releasing) memory always carries a cost, even though that cost is invisible to the programmer. A program that creates too many objects will be slower, all else being equal, than a program that achieves the same functionality with fewer objects.

More relevant to this article, a memory leak occurs when you forget to "release" previously allocated memory. If the program keeps references to objects that will never be used again, those objects occupy and consume memory, because the automatic garbage collector cannot prove that they will not be used. As noted above, if an object is referenced it is considered live and therefore cannot be deleted. To make sure an object's memory can be reclaimed, the programmer must ensure the object can no longer be reached, usually by setting the relevant field to null or by removing the object from a collection. Note, however, that it is not necessary to explicitly set local variables to null when they are no longer used: those references are cleared automatically when the method exits.
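As a minimal illustration (not from the original article; the class and field names are hypothetical), the following sketch shows how a reference retained in a long-lived collection keeps objects alive, and how removing the entry makes them reclaimable again:

import java.util.ArrayList;
import java.util.List;

// Sketch of a retained-reference leak: the static list keeps every request
// it has ever seen alive, so the garbage collector can never reclaim them.
public class RequestLog {
    private static final List<byte[]> history = new ArrayList<>();

    public static void handle(byte[] request) {
        history.add(request);       // reference retained indefinitely -> leak
        // ... process the request ...
    }

    // To make the memory reclaimable, the program must drop the reference,
    // for example by removing the entry once it is no longer needed:
    public static void forget(byte[] request) {
        history.remove(request);    // object becomes unreachable and collectible
    }
}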

In short, this is the main cause of memory leaks in a memory-managed language: object references that are retained but never used again.

Typical Leaks

Now that we know memory leaks really are possible in Java, let us look at some typical leaks and their causes.

Global Collections

Large applications commonly have some kind of global data repository, such as a JNDI tree or a session table. In these cases, attention must be paid to managing the size of the repository: there must be some mechanism for removing data that is no longer needed.

This can be done in several ways, but the most common is a cleanup task that runs periodically, verifying the data in the repository and removing anything that is no longer needed.
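As a hedged sketch (the class name, data structure, and expiry policy are illustrative, not taken from the article), such a periodic cleanup task might look like this:

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Sketch of a global session table with a periodic cleanup task. Entries that
// have not been touched within MAX_AGE_MS are removed, so the table cannot
// grow without bound.
public class SessionRegistry {
    private static final long MAX_AGE_MS = 30 * 60 * 1000;   // 30 minutes
    private static final Map<String, Long> lastAccess = new ConcurrentHashMap<>();
    private static final ScheduledExecutorService cleaner =
            Executors.newSingleThreadScheduledExecutor();

    static {
        // Run the cleanup pass once a minute.
        cleaner.scheduleAtFixedRate(SessionRegistry::removeStale, 1, 1, TimeUnit.MINUTES);
    }

    public static void touch(String sessionId) {
        lastAccess.put(sessionId, System.currentTimeMillis());
    }

    private static void removeStale() {
        long cutoff = System.currentTimeMillis() - MAX_AGE_MS;
        lastAccess.values().removeIf(timestamp -> timestamp < cutoff);
    }
}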

Another way to manage the repository is reference counting. The collection keeps track of the number of referrers for each entry, which requires referrers to tell the collection when they are done with an entry. When the number of referrers drops to zero, the element can be removed from the collection.

Cache

A cache is a data structure used to quickly look up the results of operations that have already been performed. If an operation is slow to execute, you can cache its result and use the cached value the next time the operation is called.

Caches are usually implemented dynamically, with new results added as they are computed. A typical algorithm is:

Check whether the result is in the cache; if so, return it.
If the result is not in the cache, compute it.
Add the computed result to the cache so that future calls to the operation can use it.
The problem with this algorithm, and the potential memory leak, lies in the last step. If the operation is called with a large number of different inputs, a large number of results will be stored in the cache. Obviously, this is not the right approach.

To prevent this potentially harmful design, the program must ensure that there is an upper limit on the amount of memory the cache may use. A better algorithm is therefore:

Check whether the result is in the cache; if so, return it.
If the result is not in the cache, compute it.
If the cache has grown too large, remove the entry that has been cached the longest.
Add the computed result to the cache so that future calls to the operation can use it.
By removing the entry that has been in the cache the longest, we are effectively assuming that recently added data is more likely to be needed again than older cached data. That is usually a good assumption.

The new algorithm ensures that the cache stays within a predefined memory bound. The exact bound can be hard to calculate, because the objects in the cache keep changing and may themselves reference other objects. Sizing a cache correctly is a complex task: you must balance the memory used against the speed of data retrieval.
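A minimal sketch of such a size-bounded cache, assuming the standard LinkedHashMap eviction hook is acceptable (the class name is illustrative, not from the article):

import java.util.LinkedHashMap;
import java.util.Map;

// Sketch of a size-bounded cache: a LinkedHashMap in access order evicts the
// least recently used entry once the cache exceeds maxEntries.
public class BoundedCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    public BoundedCache(int maxEntries) {
        super(16, 0.75f, true);       // accessOrder = true gives LRU behaviour
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;   // evict the entry unused for the longest time
    }
}

A lookup then follows the algorithm above: check the cache, compute the result on a miss, and put it back; the map itself enforces the upper bound.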

Another way to address the problem is to use the java.lang.ref.SoftReference class to hold the objects in the cache. This guarantees that the references can be cleared by the virtual machine if it runs out of memory and needs more heap.
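As a minimal sketch of this approach (again, the class name is illustrative), cached values can be wrapped in SoftReferences so the garbage collector may discard them under memory pressure:

import java.lang.ref.SoftReference;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of a cache whose values are held through SoftReferences, so the
// garbage collector is free to clear them when the heap runs low.
public class SoftCache<K, V> {
    private final Map<K, SoftReference<V>> map = new ConcurrentHashMap<>();

    public void put(K key, V value) {
        map.put(key, new SoftReference<>(value));
    }

    public V get(K key) {
        SoftReference<V> ref = map.get(key);
        if (ref == null) {
            return null;              // never cached
        }
        V value = ref.get();
        if (value == null) {
            map.remove(key);          // value was collected; drop the stale entry
        }
        return value;
    }
}

Note that the keys and the SoftReference wrappers themselves still accumulate; pairing the references with a ReferenceQueue is the usual refinement for cleaning those up.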

ClassLoaders

The Java ClassLoader architecture offers many opportunities for memory leaks, and it is precisely because of its complexity that ClassLoaders cause so many of them. What makes ClassLoaders special is that they involve not only "regular" object references but also references to metadata objects such as fields, methods, and classes. This means that as long as there is a reference to a field, a method, a class, or the ClassLoader object itself, the ClassLoader stays resident in the JVM. Because a ClassLoader can be associated with many classes and their static fields, a great deal of memory can leak.
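As a hedged illustration (the jar path, class name, and field are hypothetical), the sketch below shows the typical pattern: a single long-lived reference to an object created by a child class loader keeps that loader, every class it loaded, and all of their static fields reachable:

import java.net.URL;
import java.net.URLClassLoader;

// Sketch of a ClassLoader leak: a long-lived static field holds an object
// whose class was defined by a child URLClassLoader. That one reference keeps
// the child loader and everything it loaded from ever being unloaded.
public class LoaderLeakDemo {
    private static Object retained;   // long-lived reference held by the parent loader

    public static void main(String[] args) throws Exception {
        URLClassLoader plugin =
                new URLClassLoader(new URL[] { new URL("file:plugin.jar") });
        Class<?> cls = plugin.loadClass("com.example.PluginImpl");
        retained = cls.getDeclaredConstructor().newInstance();
        plugin = null;                // the loader variable is cleared, but 'retained'
                                      // still keeps the loader and its classes alive
    }
}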

Detecting a Memory Leak

The first sign of a memory leak is usually an OutOfMemoryError in the application. This tends to happen in the production environment, where you least want it and where debugging is nearly impossible. It may be that the test environment exercises the application differently from the production system, so that the leak only shows up in production. In that case you need low-overhead tools to monitor and find the leak, and you need to be able to attach them to a running system without restarting it or modifying the code. Perhaps most importantly, you need to be able to detach the tools once the analysis is done and leave the system undisturbed.

Although an OutOfMemoryError is often a sign of a memory leak, it is also possible that the application genuinely needs that much memory; in that case you must either increase the amount of heap available to the JVM or change the application so that it uses less memory. In many cases, however, an OutOfMemoryError really does signal a leak. One way to find out is to continuously monitor GC activity and check whether memory usage grows over time. If it does, a memory leak is likely.
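Not from the article, but as a minimal sketch of one way to watch for such growth from inside the process, the standard java.lang.management API can be used to log heap usage periodically:

import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

// Sketch of lightweight in-process monitoring: print heap usage once a minute
// and watch whether the numbers keep climbing over time.
public class HeapWatcher {
    public static void main(String[] args) throws InterruptedException {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        while (true) {
            MemoryUsage heap = memory.getHeapMemoryUsage();
            System.out.printf("heap used: %d KB of %d KB committed%n",
                    heap.getUsed() / 1024, heap.getCommitted() / 1024);
            Thread.sleep(60000);      // sample once a minute
        }
    }
}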

Verbose Output

There are many ways to monitor garbage collector activity. Probably the most widely used is to start the JVM with the -Xverbose:gc option and observe the output.

[Memory] 10.109-10.235: GC 65536 K-> 16788 K (65536 K), 126.000 MS
The value after the arrow (16788 K in this example) is the amount of heap in use after the garbage collection.

Console

Looking through a stream of verbose GC statistics is tedious. Fortunately there are tools for this. The JRockit Management Console can display a graph of heap usage, and from the graph it is easy to see whether heap usage grows over time.

 

Figure 1. The JRockit Management Console

You can even configure the Management Console to send you an email if heap usage becomes too large (or on other events), which makes it even easier to catch memory leaks.

Memory Leak Detection Tools

There are also dedicated memory leak detection tools. The JRockit Memory Leak Detector can be used to observe leaks and track down their root cause. This powerful tool is tightly integrated into the JRockit JVM, giving it very low overhead and easy access to the virtual machine's heap.

Advantages of professional tools

Once you know a memory leak is occurring, you need more specialized tools to find out why. The JVM will not tell you. Such tools generally get information about the memory system from the JVM in one of two ways: JVMTI or bytecode instrumentation. The Java Virtual Machine Tool Interface (JVMTI) and its predecessor, the Java Virtual Machine Profiler Interface (JVMPI), are standardized interfaces that external tools use to communicate with the JVM and collect information from it. Bytecode instrumentation refers to processing the bytecode with probes in order to obtain the information the tool needs.

For memory leak detection, both techniques have two drawbacks that make them unsuitable for production environments. First, their overhead in memory usage and reduced performance is not negligible. Information about heap usage has to be exported from the JVM and collected by the tool for processing, which means allocating memory for the tool, and exporting the information also hurts JVM performance; for example, garbage collections run more slowly while information is being gathered. The other drawback is that the tool must stay connected to the JVM the whole time: it is not possible to attach the tool to an already running JVM, do the analysis, disconnect the tool, and leave the JVM running.

Because the JRockit Memory Leak Detector is integrated into the JVM, it has neither of these shortcomings. First, much of the processing and analysis is done inside the JVM, so no data needs to be converted or recreated. The processing can also piggyback on the garbage collector itself, which makes it faster. Second, as long as the JVM was started with the -Xmanagement option (which allows the JVM to be monitored and managed through remote JMX interfaces), the Memory Leak Detector can be attached to and detached from a running JVM. When the tool is disconnected, nothing is left behind in the JVM, which runs the code at full speed, just as it did before the tool was connected.

Trend Analysis

Let us look more closely at the tool and how it is used to track down a memory leak. Once a leak is suspected, the first step is to figure out what data is leaking: objects of which class are responsible? The JRockit Memory Leak Detector does this by counting, at each garbage collection, the number of live objects of each class. If the number of objects of a particular class keeps growing over time (a positive "growth rate"), a memory leak is likely.

 
Figure 2. The Trend Analysis view of the Memory Leak Detector

Because a leak can be as small as a trickle, the trend analysis must run for a long time; over a short period some classes may grow temporarily and then shrink again. But the overhead of trend analysis is very small (the largest cost is sending a packet of data from JRockit to the Memory Leak Detector at each garbage collection). The overhead should not be a problem for any system, even one running at full capacity in production.

At first the numbers will jump around, but after a while they stabilize and show which classes are growing.

Find the root cause

Sometimes knowing which class of objects is leaking is enough to reveal the problem; the class may only be used in a very limited part of the code, and a quick inspection shows the bug. Unfortunately, this information alone is often not enough. For example, it is common to find that java.lang.String objects are leaking, but that in itself does not help much, because strings are used throughout the program.

What we really want to know is which other objects are associated with the leaking objects, in this case the Strings. Why do the leaking objects still exist? Which objects hold references to them? Listing every object that holds a reference to a String would produce far too much data to be useful. To limit the amount of data, we can group it by class and look at which other classes of objects are associated with the leaking objects (the Strings). For example, Strings are commonly used in Hashtables, so we might see Hashtable entry objects associated with the Strings. By following the Hashtable entries further, we eventually find the Hashtable objects that relate to those entries and Strings (Figure 3).

 
Figure 3. Sample view of the type chart displayed in the tool

Finding the Leaking Instance

So far we have been looking at objects aggregated by class rather than at individual objects, so we do not yet know which particular Hashtable is leaking. If we can determine how large all the Hashtables in the system are, we can assume that the largest one is the one that is leaking (since it accumulates the leak over time and grows quite large). A list of all Hashtable objects together with the amount of data each one references would therefore help us pinpoint the exact Hashtable responsible for the leak.

 
Figure 4. List of Hashtable objects and the amount of data each references

Computing how much data an object references is very expensive (it requires traversing the reference graph with that object as the root), and doing it for many objects takes a long time. If you know how Hashtable is implemented internally, there is a shortcut: a Hashtable holds an array of Hashtable entries, and that array grows with the number of objects in the Hashtable. So to find the largest Hashtable, we only need to find the largest array of Hashtable entries, which is much faster.

Figure 5. List of the largest Hashtable entry arrays and their sizes

Further steps

Once the leaking Hashtable instance has been found, we can look at which other instances reference it and walk backward through the reference chain to see how it is being kept alive.

 
Figure 6. The instance diagram in the tool

For example, the Hashtable might be referenced from a field named activeSessions in an object of type MyServer. That information is usually enough to go to the source code and locate the problem.

 
Figure 7. Inspecting an object and its references to other objects

Finding the Allocation Site

When tracking down a memory leak, it can also be useful to see where the leaking objects are allocated. Knowing how they relate to other objects (that is, which objects reference them) may not be enough; information about where they are created helps as well. Of course, you do not want to instrument the application so that it prints a stack trace for every allocation, nor do you want to keep a profiler attached to the running production application just in case you need to track a memory leak.

With the JRockit Memory Leak Detector, the application code can be instrumented dynamically so that a stack trace is created at each allocation site. These stack traces can be accumulated and analyzed in the tool. As long as the feature is not enabled, it costs nothing, so allocation tracing can be switched on at any time. When tracing is requested, the JRockit compiler dynamically inserts code to record the allocations, but only for the specific class requested. Better still, all the inserted code is removed when the analysis is finished, leaving no changes behind that could degrade the application's performance.

 
Figure 8. Allocation stack traces for String objects recorded while running the sample program

Conclusion

Memory leaks are hard to detect. This article has covered several best practices for avoiding them, including keeping track of what you put into your data structures and monitoring memory usage closely to spot sudden growth.

We have also seen how the JRockit Memory Leak Detector can be used on a production system to track down memory leaks. The tool uses a three-step approach: first, trend analysis identifies which classes of objects are leaking; next, we look at which other classes are associated with the leaking objects; finally, we drill down to individual objects to see how they are interrelated. It is also possible to dynamically trace where objects are allocated in a running system. These features, together with the tool's tight integration with the JVM, make it possible to track down and fix memory leaks in a safe and efficient way.

References

Download the JRockit tools
BEA JRockit 5.0 documentation
New features and tools in JRockit 5.0
BEA JRockit Dev Center
