Java memory leak debugging and resolution

Source: Internet
Author: User
The definition of a Java memory leak is actually not that clear-cut. First, if the JVM has no bugs, then in theory there is no heap space that can never be reclaimed, which means memory leaks in the C/C++ sense do not exist in Java. Second, if a Java program keeps holding a reference to an object that, according to the program's logic, will never be used again, we can consider that object leaked. If there are many such objects, a large amount of memory is clearly leaked ("wasted" is the more accurate word).



For the past few days I have been wrestling with a Java "memory leak". The memory consumed by a Java application kept rising steadily until it exceeded the monitoring threshold, so it was time to play Sherlock Holmes.

General Steps for Analyzing Memory Leaks

If the memory consumed by a Java application appears to be leaking, we typically analyze it with the following steps:

1. Dump the heap used by the Java application.
2. Analyze the heap dump with a Java heap analysis tool to find objects whose memory footprint exceeds expectations (usually because there are too many of them). These are the suspects.
3. If necessary, analyze the reference relationships between the suspect objects and other objects.
4. Review the program's source code to find out why there are so many suspect objects.

Dump Heap

If a Java application has a memory leak, do not rush to kill it; preserve the scene first. If it is an Internet-facing application, you can switch its traffic to another server. The point of preserving the scene is to be able to dump the heap of the running JVM.

The JDK ships with the jmap tool, which can do this. Run it as follows:

    jmap -dump:format=b,file=heap.bin <pid>

format=b means the dump file is written in binary format.

file=heap.bin means the dump file is named heap.bin.

<pid> is the process number of the JVM.
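
jmap works from outside the process. As a possible alternative, a HotSpot-based JVM can also be asked to dump its own heap through the HotSpotDiagnosticMXBean management interface. The sketch below is not part of the original investigation and assumes a HotSpot JVM with the com.sun.management API available. The HeapDumper class and the file name are hypothetical, and the .hprof extension is used because some newer JDKs reject other extensions.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.management.ManagementFactory;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper (assumption): asks a HotSpot JVM to dump its own heap.
class HeapDumper {

    // Writes a binary heap dump of the current JVM to 'target'.
    // liveOnly = true dumps only objects that are still reachable.
    // The target file must not exist yet, otherwise the dump fails.
    static Path dumpHeap(Path target, boolean liveOnly) {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        try {
            bean.dumpHeap(target.toString(), liveOnly);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return target;
    }

    public static void main(String[] args) {
        // A fresh file name in the temp directory, so the file does not exist yet.
        Path target = Paths.get(System.getProperty("java.io.tmpdir"),
                "heap-" + System.currentTimeMillis() + ".hprof");
        System.out.println("Heap dump written to " + dumpHeap(target, true));
    }
}
```

The resulting file can be opened in the same analysis tools as a jmap dump. For a process that is already misbehaving in production, jmap remains the simpler choice because it needs no code change.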

(On Linux) first run ps aux | grep java to find the JVM's PID, then run jmap -dump:format=b,file=heap.bin <pid> to get the heap dump file.

Analyze Heap

Turning the binary heap dump file into human-readable information naturally requires a specialized tool. The one recommended here is Memory Analyzer.

Memory Analyzer, MAT for short, is an open-source project of the Eclipse Foundation, donated by SAP and IBM. Software from the big companies is quite good: MAT can analyze heaps containing hundreds of millions of objects, quickly compute the memory size of each object and the reference relationships between objects, and automatically detect memory leak suspects. It is powerful and user-friendly.

The MAT interface is built on Eclipse and is released in two forms: an Eclipse plug-in and an Eclipse RCP application. MAT presents its analysis results as charts and reports that can be understood at a glance. In short, I like this tool very much. The original post included two official screenshots here.


Back to the problem at hand. I opened heap.bin with MAT, and it was easy to see that the number of char[] instances was far beyond expectations, occupying more than 90% of the memory. In general, char[] does take up a lot of memory in a JVM and is very numerous, because String objects use char[] as their internal storage. But this time the char[] instances were too greedy. On closer inspection, there were tens of thousands of char[] instances, each occupying several hundred KB of memory. This indicates that the Java program was holding tens of thousands of large String objects. Given the program's logic, that should not happen, so something had to be wrong somewhere.

Following the Trail

I picked one of the suspicious char[] instances at random and used MAT's Path to GC Roots function to find its reference path. It turned out that the String object was referenced by a HashMap. That was to be expected: most Java memory leaks happen because objects are left in a global HashMap and can never be released. However, this HashMap is used as a cache with a threshold on the number of entries, and entries are evicted automatically once the threshold is reached. By that logic there should be no memory leak. Although the cache already held tens of thousands of String objects, it had not yet reached the preset threshold (the threshold was set high because the String objects were estimated to be small).
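
The article does not show the cache code itself. As a minimal sketch of a cache bounded by entry count, assuming an LRU policy built on LinkedHashMap (the BoundedCache class and its eviction policy are hypothetical, not the application's real implementation):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical cache (assumption): bounded by the NUMBER of entries only.
class BoundedCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    BoundedCache(int maxEntries) {
        // accessOrder = true: the least recently used entry is the eldest one.
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    // Called by LinkedHashMap after each insertion; returning true evicts the eldest entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }

    public static void main(String[] args) {
        BoundedCache<String, String> cache = new BoundedCache<>(2);
        cache.put("a", "1");
        cache.put("b", "2");
        cache.put("c", "3"); // exceeds the limit of 2, so "a" is evicted
        System.out.println(cache.keySet()); // [b, c]
    }
}
```

A threshold like this limits how many entries the cache holds, not how many bytes each entry keeps alive, which is why an unexpectedly large value size can still exhaust memory well before the threshold is reached.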

However, another question caught my attention: why were the cached String objects so large? The length of the internal char[] was up to several hundred K. Although the number of String objects in the cache had not reached the threshold, each one was far larger than we expected, so a great deal of memory was consumed, which looked like a memory leak (excessive memory consumption, to be precise).

To investigate further, I looked at how the String objects were put into the HashMap. Reading the program's source code, I found that large String objects did exist, but the program did not put them into the HashMap. Instead, it split each large String (by calling String.split) and put the resulting small String objects into the HashMap.

This is strange: what goes into the HashMap is clearly the small String objects produced by the split, so how can they take up so much space? Is there a problem with the split method of the String class?

Reading the Code

With that question in mind, I looked up the String class in the Sun JDK 6 source, mainly the implementation of the split method:

    public String[] split(String regex, int limit) {
        return Pattern.compile(regex).split(this, limit);
    }

As you can see, String.split calls Pattern.split. Continue with the code of Pattern.split:

    public String[] split(CharSequence input, int limit) {
        int index = 0;
        boolean matchLimited = limit > 0;
        ArrayList<String> matchList = new ArrayList<String>();
        Matcher m = matcher(input);
        // Add segments before each match found
        while (m.find()) {
            if (!matchLimited || matchList.size() < limit - 1) {
                String match = input.subSequence(index, m.start()).toString();
                matchList.add(match);
                index = m.end();
            } else if (matchList.size() == limit - 1) { // last one
                String match = input.subSequence(index,
                                                 input.length()).toString();
                matchList.add(match);
                index = m.end();
            }
        }
        // If no match was found, return this
        if (index == 0)
            return new String[] {input.toString()};
        // Add remaining segment
        if (!matchLimited || matchList.size() < limit)
            matchList.add(input.subSequence(index, input.length()).toString());
        // Construct result
        int resultSize = matchList.size();
        if (limit == 0)
            while (resultSize > 0 && matchList.get(resultSize-1).equals(""))
                resultSize--;
        String[] result = new String[resultSize];
        return matchList.subList(0, resultSize).toArray(result);
    }

Note line 9, the first statement inside the loop: String match = input.subSequence(index, m.start()).toString();

The match here is one of the String objects produced by the split, and it is actually the result of calling subSequence on the original String. Next, look at the code of String.subSequence:

    public CharSequence subSequence(int beginIndex, int endIndex) {
        return this.substring(beginIndex, endIndex);
    }

String.subSequence calls String.substring, so let us keep looking:

    public String substring(int beginIndex, int endIndex) {
        if (beginIndex < 0) {
            throw new StringIndexOutOfBoundsException(beginIndex);
        }
        if (endIndex > count) {
            throw new StringIndexOutOfBoundsException(endIndex);
        }
        if (beginIndex > endIndex) {
            throw new StringIndexOutOfBoundsException(endIndex - beginIndex);
        }
        return ((beginIndex == 0) && (endIndex == count)) ? this :
            new String(offset + beginIndex, endIndex - beginIndex, value);
    }

Look at lines 11 and 12 and the picture finally becomes clear: if the requested substring is the entire original string, the original String object is returned; otherwise a new String object is created, but this new object appears to use the original String object's char[]. The String constructor confirms this:

    // Package private constructor which shares value array for speed.
    String(int offset, int count, char value[]) {
        this.value = value;
        this.offset = offset;
        this.count = count;
    }
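
The effect of this constructor can be reproduced with a small model. The sketch below is a simplified stand-in for the JDK 6 String layout (value, offset, count) and not the real java.lang.String; the SharedString class and its retainedChars method are hypothetical names introduced only for illustration.

```java
import java.util.Arrays;

// Simplified model (assumption) of the JDK 6 String fields: value, offset, count.
final class SharedString {
    final char[] value;  // backing array, possibly shared with other instances
    final int offset;    // index of the first character used in 'value'
    final int count;     // number of characters used

    SharedString(String s) {
        this(0, s.length(), s.toCharArray());
    }

    // Mirrors the package-private JDK 6 constructor: stores the array without copying it.
    private SharedString(int offset, int count, char[] value) {
        this.value = value;
        this.offset = offset;
        this.count = count;
    }

    // Mirrors JDK 6 substring: a new object that points into the same array.
    SharedString substring(int beginIndex, int endIndex) {
        if (beginIndex < 0 || endIndex > count || beginIndex > endIndex) {
            throw new StringIndexOutOfBoundsException();
        }
        return (beginIndex == 0 && endIndex == count)
                ? this
                : new SharedString(offset + beginIndex, endIndex - beginIndex, value);
    }

    // Number of chars kept reachable by this instance, regardless of its own length.
    int retainedChars() {
        return value.length;
    }

    @Override
    public String toString() {
        return new String(value, offset, count);
    }

    public static void main(String[] args) {
        char[] big = new char[300_000];
        Arrays.fill(big, 'x');
        SharedString large = new SharedString(new String(big));
        SharedString small = large.substring(0, 5);
        // The 5-character substring still keeps the 300,000-char array alive.
        System.out.println(small + " retains " + small.retainedChars() + " chars");
    }
}
```

In this model, as long as the short substring is reachable (for example from a cache), the garbage collector cannot reclaim the large array it points into.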

To avoid a memory copy and gain speed, Sun JDK 6 directly reuses the original String object's char[] and uses offset and count to identify the different string contents. In other words, a String produced by substring still points to the char[] of the original large String, and the same is true of split. This explains why the char[] of the String objects in the HashMap is so large.

Explanation of Reason

The previous section has in fact already analyzed the cause. This section summarizes it:

1. The program obtains a large String object from each request; the internal char[] of this object is several hundred K long.
2. The program splits the large String and puts the resulting String objects into a HashMap that serves as a cache.
3. Sun JDK 6 optimizes String.split so that each String produced by the split directly reuses the char[] of the original String. Every String in the HashMap therefore actually points to a huge char[].
4. The HashMap's upper limit is in the tens of thousands of entries, so the total size of the cached String objects = tens of thousands × hundreds of KB = gigabytes.
5. With gigabytes of memory held by the cache, a large amount of memory is wasted, which shows up as a memory leak.

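
The order of magnitude above can be checked with a quick calculation. The figures in this sketch (10,000 cached entries, 100 KB per backing array) are illustrative assumptions chosen from the low end of "tens of thousands" and "hundreds of K", and the RetainedMemoryEstimate class is hypothetical. It computes a worst case in which every cached String pins a different backing array; substrings of the same original share one array, so the real figure depends on how many distinct originals are pinned.

```java
import java.util.Locale;

// Back-of-the-envelope estimate; all input figures are illustrative assumptions.
class RetainedMemoryEstimate {

    // Worst case: each cached entry keeps its own large backing array alive.
    static long worstCaseBytes(long cachedEntries, long bytesPerBackingArray) {
        return cachedEntries * bytesPerBackingArray;
    }

    public static void main(String[] args) {
        long bytes = worstCaseBytes(10_000, 100L * 1024); // 10,000 entries x 100 KB
        double gib = bytes / (1024.0 * 1024 * 1024);
        System.out.println(String.format(Locale.ROOT, "%d bytes = %.2f GiB", bytes, gib));
    }
}
```

Even at the low end of the ranges described in the article, the cache approaches a gigabyte, which is consistent with char[] occupying more than 90% of the heap.
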
Solution

Now that the cause has been found, the solution follows. split still has to be used, but instead of putting the String objects it returns directly into the HashMap, we first call String's copy constructor String(String original). This constructor is safe, as the code shows:

    /**
     * Initializes a newly created {@code String} object so that it represents
     * the same sequence of characters as the argument; in other words, the
     * newly created string is a copy of the argument string. Unless an
     * explicit copy of {@code original} is needed, use of this constructor is
     * unnecessary since Strings are immutable.
     *
     * @param  original
     *         A {@code String}
     */
    public String(String original) {
        int size = original.count;
        char[] originalValue = original.value;
        char[] v;
        if (originalValue.length > size) {
            // The array representing the String is bigger than the new
            // String itself.  Perhaps this constructor is being called
            // in order to trim the baggage, so make a copy of the array.
            int off = original.offset;
            v = Arrays.copyOfRange(originalValue, off, off+size);
        } else {
            // The array representing the String is the same
            // size as the String, so no point in making a copy.
            v = originalValue;
        }
        this.offset = 0;
        this.count = size;
        this.value = v;
    }
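
Applied to the caching code, the fix might look like the following minimal sketch. It is not the application's actual code: the SplitCacheExample class, the CACHE map and the key scheme are hypothetical. On JDK 7u6 and later, substring already copies the characters, so the explicit copy mainly matters on older JDKs such as the Sun JDK 6 discussed here.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical cache-filling code (assumption), showing the explicit copy before caching.
class SplitCacheExample {
    static final Map<String, String> CACHE = new HashMap<>();

    // Splits 'large' and caches an independent copy of every piece.
    static void cachePieces(String keyPrefix, String large, String regex) {
        String[] pieces = large.split(regex);
        for (int i = 0; i < pieces.length; i++) {
            // new String(...) detaches the piece from the original's backing array
            // on JDKs where substring shares it.
            CACHE.put(keyPrefix + ":" + i, new String(pieces[i]));
        }
    }

    public static void main(String[] args) {
        cachePieces("request-1", "alpha,beta,gamma", ",");
        System.out.println(CACHE.get("request-1:1")); // beta
    }
}
```

Only the pieces that are stored long-term need the copy; short-lived results of split can be used as they are.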

It is just that code like new String(string) looks odd. Perhaps substring and split should offer an option that lets programmers control whether the original String object's char[] is reused.

Is It a Bug?
