Debugging and Fixing a Java Memory Leak

Last updated: 2018-07-27 Source: Internet

Uploaded by: User


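The term "memory leak" is not sharply defined in Java. First, if the JVM has no bugs, then in theory there is no such thing as "heap space that cannot be reclaimed"; in other words, the kind of memory leak seen in C/C++ does not exist in Java. Second, if a Java program keeps holding a reference to an object that, by the program's logic, will never be used again, we can consider that object leaked. When there are many such objects, a large amount of memory is clearly leaked ("wasted" would be more accurate).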
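The second kind of "leak" can be illustrated with a minimal sketch. The class and method names here (LeakSketch, handle, retainedCount) are hypothetical and not from the original program; the only point is that a long-lived collection keeps objects reachable after the program has stopped needing them.

```java
import java.util.ArrayList;
import java.util.List;

class LeakSketch {
    // Hypothetical registry: a static collection keeps every element
    // strongly reachable for as long as the class stays loaded.
    private static final List<byte[]> HANDLED = new ArrayList<>();

    // Handles a request buffer and (mistakenly) remembers it forever.
    static void handle(byte[] request) {
        HANDLED.add(request); // never removed and never read again
    }

    static int retainedCount() {
        return HANDLED.size();
    }

    public static void main(String[] args) {
        for (int i = 0; i < 1000; i++) {
            handle(new byte[1024]);
        }
        // Every buffer is still referenced from HANDLED, so the garbage
        // collector is not allowed to reclaim any of them.
        System.out.println("retained buffers: " + retainedCount());
    }
}
```

Nothing here is a JVM defect: the collector behaves correctly, and the memory is wasted only because the program still holds references it no longer uses.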

Contents: General steps for analyzing a memory leak; dump heap; analyze heap; Explanation of the cause; Solution; Is it a bug?; Some additions

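For the past few days I have been wrestling with a Java "memory leak". The memory used by a Java application kept rising steadily and regularly until it finally exceeded the monitoring threshold. It was time for Sherlock Holmes to step in.

General steps for analyzing a memory leak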

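If a Java application shows signs of a memory leak, we generally analyze it with the following steps:

1. Dump the heap used by the Java application.
2. Use a Java heap analysis tool to find suspect objects whose memory usage exceeds expectations (usually because there are too many of them).
3. If necessary, analyze the reference relationships between the suspect objects and other objects.
4. Read the program's source code to find out why there are so many suspect objects.

dump heap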

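If a Java application is leaking memory, do not rush to kill it; preserve the scene first. For an Internet-facing application, traffic can be switched to other servers. The purpose of preserving the scene is to dump the heap of the running JVM.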

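The jmap tool that ships with the JDK can do this. It is invoked as follows:

    jmap -dump:format=b,file=heap.bin <pid>

format=b means that the dump file is written in binary format.

file=heap.bin means that the dump file is named heap.bin.

<pid> is the process ID of the JVM.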
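When you are able to run code inside the target JVM (for example while reproducing the problem in a test environment), the process can report its own ID and print the matching command. This is only a sketch: it assumes Java 9 or later for ProcessHandle, and the class name JmapCommand is hypothetical.

```java
class JmapCommand {
    // Builds the jmap invocation quoted in the article for a given process ID.
    static String forPid(long pid) {
        return "jmap -dump:format=b,file=heap.bin " + pid;
    }

    public static void main(String[] args) {
        // ProcessHandle is available from Java 9 onward.
        long pid = ProcessHandle.current().pid();
        System.out.println(forPid(pid));
    }
}
```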

(On Linux) first run ps aux | grep java to find the JVM's pid, then run jmap -dump:format=b,file=heap.bin <pid> to obtain the heap dump file.

analyze heap

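Turning a binary heap dump into human-readable information naturally requires a specialized tool. The one recommended here is Memory Analyzer.

Memory Analyzer, MAT for short, is an open-source project of the Eclipse Foundation, contributed by SAP and IBM. Software from the big vendors does prove useful: MAT can analyze heaps containing hundreds of millions of objects, quickly compute the memory used by each object, show the reference relationships between objects, and automatically detect leak suspects. It is powerful, and its interface is friendly and easy to use.

MAT's interface is built on Eclipse and is released in two forms: as an Eclipse plug-in and as an Eclipse RCP application. MAT presents its results as charts and reports that are clear at a glance. In short, I like this tool a lot.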

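Back to the problem. I opened heap.bin in MAT and it was immediately obvious that there were unexpectedly many char[] instances, occupying more than 90% of the memory. In general char[] does take up a lot of memory in a JVM and is very numerous, because String uses a char[] as its internal storage. But these char[] instances were too greedy: on closer inspection there were tens of thousands of them, each occupying several hundred KB. This means the program was holding tens of thousands of large String objects. Given the program's logic this should not happen, so something had to be wrong somewhere.

Following the trail

I picked one of the suspicious char[] instances at random and used the Path To GC Root feature to find its reference path. The String turned out to be referenced from a HashMap. This was no surprise either: Java memory leaks are mostly caused by objects left behind in a global HashMap and never released. However, this HashMap was used as a cache with a threshold on the number of entries, and entries are evicted automatically once the threshold is reached. By that logic there should be no leak. Although the cache already held tens of thousands of String objects, it had still not reached the preset threshold (the threshold had been set fairly high, because the Strings were expected to be small).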
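The article does not show how the cache was implemented. As an illustration only, the sketch below assumes an LRU policy built on LinkedHashMap.removeEldestEntry; the class name BoundedCache and the choice of LinkedHashMap are assumptions, not the original code. It shows what a limit on entry count does and does not guarantee.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical cache that is bounded by the number of entries only.
class BoundedCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    BoundedCache(int maxEntries) {
        super(16, 0.75f, true); // accessOrder = true gives LRU ordering
        this.maxEntries = maxEntries;
    }

    // Called by LinkedHashMap after each insertion; returning true
    // removes the least recently used entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }

    public static void main(String[] args) {
        BoundedCache<String, String> cache = new BoundedCache<>(2);
        cache.put("a", "1");
        cache.put("b", "2");
        cache.put("c", "3"); // exceeds the limit, so "a" is evicted
        System.out.println(cache.keySet()); // prints [b, c]
    }
}
```

A cap like this bounds how many entries exist, not how many bytes each entry keeps alive, which is why the size of the individual Strings matters so much in what follows.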

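But another question caught my attention: why were the cached String objects so huge, with internal char[] arrays several hundred K long? Although the number of cached Strings had not reached the threshold, their size far exceeded our expectations. This is what consumed so much memory and produced the symptoms of a leak (strictly speaking, excessive memory consumption).

Following this lead, I looked at how the large Strings got into the HashMap. The source code showed that large Strings did exist, but they were not put into the HashMap themselves. Instead, each large String was split (by calling String.split), and the small Strings produced by the split were put into the HashMap.

This was strange. What went into the HashMap was clearly the small Strings produced by split, so how could they take up so much space? Could there be a problem with String's split method?

Reading the code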

With this question in mind, I consulted the code of the String class in Sun JDK 6, mainly the implementation of split:

    public String[] split(String regex, int limit) {
        return Pattern.compile(regex).split(this, limit);
    }

As we can see, String.split calls Pattern.split. Here is the code of Pattern.split:

    public String[] split(CharSequence input, int limit) {
        int index = 0;
        boolean matchLimited = limit > 0;
        ArrayList<String> matchList = new ArrayList<String>();
        Matcher m = matcher(input);
        // Add segments before each match found
        while(m.find()) {
            if (!matchLimited || matchList.size() < limit - 1) {
                String match = input.subSequence(index, m.start()).toString();
                matchList.add(match);
                index = m.end();
            } else if (matchList.size() == limit - 1) { // last one
                String match = input.subSequence(index, input.length()).toString();
                matchList.add(match);
                index = m.end();
            }
        }
        // If no match was found, return this
        if (index == 0)
            return new String[] {input.toString()};
        // Add remaining segment
        if (!matchLimited || matchList.size() < limit)
            matchList.add(input.subSequence(index, input.length()).toString());
        // Construct result
        int resultSize = matchList.size();
        if (limit == 0)
            while (resultSize > 0 && matchList.get(resultSize-1).equals(""))
                resultSize--;
        String[] result = new String[resultSize];
        return matchList.subList(0, resultSize).toArray(result);
    }

Note the first assignment inside the loop: String match = input.subSequence(index, m.start()).toString();

Here match is the small String produced by split; it is in fact the result of calling subSequence on the large String. Next, the code of String.subSequence:

    public CharSequence subSequence(int beginIndex, int endIndex) {
        return this.substring(beginIndex, endIndex);
    }

String.subSequence in turn calls String.substring, so let us continue:

    public String substring(int beginIndex, int endIndex) {
        if (beginIndex < 0) {
            throw new StringIndexOutOfBoundsException(beginIndex);
        }
        if (endIndex > count) {
            throw new StringIndexOutOfBoundsException(endIndex);
        }
        if (beginIndex > endIndex) {
            throw new StringIndexOutOfBoundsException(endIndex - beginIndex);
        }
        return ((beginIndex == 0) && (endIndex == count)) ? this :
            new String(offset + beginIndex, endIndex - beginIndex, value);
    }

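The final return statement gives us the first real clue. If the requested substring is the entire original string, the original String object is returned; otherwise a new String object is created, but this new object appears to use the char[] of the original String. The String constructor confirms this:

    // Package private constructor which shares value array for speed.
    String(int offset, int count, char value[]) {
        this.value = value;
        this.offset = offset;
        this.count = count;
    }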
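The constructor above only exists inside the JDK 6 String class, so the sharing cannot be reproduced directly with public API. The sketch below models it with a hypothetical SharedString class that carries the same three fields; it is not JDK code, just a small runnable imitation of the mechanism.

```java
import java.util.Arrays;

// Hypothetical model of the JDK 6 String layout (value, offset, count).
class SharedString {
    final char[] value; // backing array, possibly shared with other instances
    final int offset;   // index of the first character inside value
    final int count;    // number of characters in this string

    SharedString(String s) {
        this.value = s.toCharArray();
        this.offset = 0;
        this.count = value.length;
    }

    // Mirrors the package-private constructor: the array is shared, not copied.
    private SharedString(int offset, int count, char[] value) {
        this.value = value;
        this.offset = offset;
        this.count = count;
    }

    SharedString substring(int beginIndex, int endIndex) {
        return new SharedString(offset + beginIndex, endIndex - beginIndex, value);
    }

    @Override
    public String toString() {
        return new String(value, offset, count);
    }

    public static void main(String[] args) {
        char[] big = new char[200_000];
        Arrays.fill(big, 'x');
        SharedString large = new SharedString(new String(big));
        SharedString small = large.substring(0, 3);
        // The three-character piece still pins the whole 200,000-char array.
        System.out.println(small + " has " + small.count
                + " chars but retains " + small.value.length);
    }
}
```

As long as the small instance is reachable, so is the full backing array, even after the large instance itself has been dropped.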

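To avoid copying memory and to gain speed, Sun JDK reuses the char[] of the original String directly and uses an offset and a length to identify the different string contents. In other words, a small String produced by substring still points to the char[] of the original large String, and the same is true of split. This explains why the char[] behind every String in the HashMap is so large.

Explanation of the cause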

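The previous section has in fact already identified the cause; this section summarizes it:

1. The program obtains a large String from each request; its internal char[] is several hundred K long.
2. The program calls split on the large String and puts the resulting small Strings into a HashMap used as a cache.
3. Sun JDK 6 optimizes String.split: the Strings it produces directly use the char[] of the original String.
4. Each String in the HashMap therefore actually points to a huge char[].
5. The HashMap's upper limit is in the tens of thousands, so the total size of the cached Strings = tens of thousands × hundreds of K = gigabytes.
6. Gigabytes of memory are occupied by the cache and largely wasted, producing the symptoms of a memory leak.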
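The estimate in the summary can be worked through with concrete numbers. The figures below (10,000 entries, 200 KB per backing array) are illustrative assumptions chosen to match "tens of thousands" and "hundreds of K"; they are not measurements from the original heap dump, and the calculation assumes each cached String pins a different array.

```java
class CacheFootprint {
    // Total bytes pinned if every cached entry keeps a distinct backing array alive.
    static long retainedBytes(long entries, long bytesPerArray) {
        return entries * bytesPerArray;
    }

    public static void main(String[] args) {
        long entries = 10_000;           // assumed: "tens of thousands" of cached Strings
        long bytesPerArray = 200 * 1024; // assumed: "hundreds of K" per char[]
        long total = retainedBytes(entries, bytesPerArray);
        System.out.printf("%d bytes, about %.2f GiB%n",
                total, total / (1024.0 * 1024 * 1024));
    }
}
```

With these assumed figures the cache pins roughly 1.9 GiB, which agrees with the gigabyte-level estimate above.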
Solution

With the cause found, the solution follows. We still need split, but instead of putting the Strings it returns directly into the HashMap, we first pass each one through String's copy constructor String(String original). This constructor is safe, as its code shows:

    /**
     * Initializes a newly created {@code String} object so that it represents
     * the same sequence of characters as the argument; in other words, the
     * newly created string is a copy of the argument string. Unless an
     * explicit copy of {@code original} is needed, use of this constructor is
     * unnecessary since Strings are immutable.
     *
     * @param original
     *        A {@code String}
     */
    public String(String original) {
        int size = original.count;
        char[] originalValue = original.value;
        char[] v;
        if (originalValue.length > size) {
            // The array representing the String is bigger than the new
            // String itself. Perhaps this constructor is being called
            // in order to trim the baggage, so make a copy of the array.
            int off = original.offset;
            v = Arrays.copyOfRange(originalValue, off, off+size);
        } else {
            // The array representing the String is the same
            // size as the String, so no point in making a copy.
            v = originalValue;
        }
        this.offset = 0;
        this.count = size;
        this.value = v;
    }
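Applied to the caching code, the fix might look like the sketch below. The class name SplitAndCache, the method cachePieces and the use of each piece as both key and value are assumptions made for illustration; the article does not show the original caching code.

```java
import java.util.HashMap;
import java.util.Map;

class SplitAndCache {
    // Splits a large input and caches each piece as an independent copy,
    // so that no cached String keeps the large input's char[] alive.
    static Map<String, String> cachePieces(String large, String regex) {
        Map<String, String> cache = new HashMap<>();
        for (String piece : large.split(regex)) {
            String copy = new String(piece); // trims the shared array on JDK 6
            cache.put(copy, copy);
        }
        return cache;
    }

    public static void main(String[] args) {
        Map<String, String> cache = cachePieces("alpha,beta,gamma", ",");
        System.out.println(cache.keySet());
    }
}
```

The explicit copy matters on JDK 6 as analyzed above. Later JDK releases changed substring to copy its characters, so on those versions the extra copy should be redundant, though harmless.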

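That said, code like new String(string) looks rather odd. Perhaps substring and split should offer an option that lets the programmer control whether the String's char[] is reused.

Is it a bug?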

