JVM OOM, Java Finalizer-induced OOM, and Thread.stop

Source: Internet
Author: User
Tags: throwable groovy script
Background

This article is purely practical content.
One day we found that OOM errors kept occurring in a customer environment, with memory growing in a step-like pattern. Rather frustrating.
Abstract

This article covers the following:
1. How should an OOM be analyzed in Java?
2. Why can the Java finalizer cause an OOM?
3. Why should Thread.stop not be used?

How to analyze a Java OOM

Most of the time Java behaves well enough, but an OutOfMemoryError (OOM) can still occur. So how do we analyze an OOM error?

Getting a memory dump

Method 1: Automatic dump

When an OOM occurs, the JVM can automatically try (best effort) to generate a heap dump. You just need to add the following option to the startup parameters:

-XX:+HeapDumpOnOutOfMemoryError

Files generated this way are written to the working directory of the Java process and are named java_pid<pid>.hprof.

Method 2: Manual dump

Sometimes you can already see that a Java process is using a lot of memory. That is the time to export a heap dump manually.
Use jmap. jmap is a tool that ships with the JDK, under jdk/bin:

jmap -dump:format=b,file=heap.bin <pid>
jmap -F -dump:format=b,file=heap.bin <pid>    # forced export
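
As a third option that the original text does not cover, a HotSpot JVM can also write a heap dump from inside the process through the HotSpotDiagnosticMXBean. The following is a minimal sketch under that assumption; the class name HeapDumper and the default file name are hypothetical, and the call is expected to fail on JVMs that do not provide this bean.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.File;
import java.io.IOException;
import java.lang.management.ManagementFactory;

final class HeapDumper {

    /**
     * Writes a heap dump to outputFile. The file must not exist yet, and
     * recent JDKs require the name to end in ".hprof".
     * Returns true if the dump was written, false otherwise.
     */
    static boolean dumpHeap(String outputFile, boolean liveObjectsOnly) {
        try {
            HotSpotDiagnosticMXBean bean =
                    ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            bean.dumpHeap(outputFile, liveObjectsOnly);
            return true;
        } catch (IOException | RuntimeException e) {
            // Covers an existing file, a rejected file name, or a JVM without this bean.
            System.err.println("Heap dump failed: " + e);
            return false;
        }
    }

    public static void main(String[] args) {
        String file = args.length > 0
                ? args[0]
                : "heap-" + System.currentTimeMillis() + ".hprof";
        if (dumpHeap(file, true)) {
            System.out.println("Wrote " + new File(file).getAbsolutePath());
        } else {
            System.out.println("No dump written");
        }
    }
}
```

The resulting .hprof file can be opened with the same analysis tools as a dump produced by jmap.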

For compatibility reasons, it is important to use the jmap from the same JDK version as the JRE that runs the process.

Analyzing with tools

Common memory analysis tools are Eclipse Memory Analyzer and jvisualvm.

Eclipse Memory Analyzer (MAT)

This is a visual memory analysis tool from Eclipse.

Several of its views are easy to use:
1. Histogram lists how many instances each class has.
2. Leak Suspects lists the places where the tool suspects a memory leak.

For example, on the Leak Suspects page for this dump, the finalizer thread accounts for 860 MB of memory, about 93% of the space.

If these standard pages do not show what you care about, you can use the OQL console. It is the fourth icon in the upper-left corner of the MAT window.
With OQL we can do a lot, such as listing the objects that Finalizer objects refer to.
For example, list the URLs of all URLConnection objects:

SELECT f.referent.url.toString() FROM java.lang.ref.Finalizer f WHERE f.referent.toString().startsWith("sun.net.www.protocol.https.HttpsURLConnectionImpl")

MAT's OQL reference: http://help.eclipse.org/neon/index.jsp?topic=/org.eclipse.mat.ui.help/welcome.html

The more powerful jvisualvm

The jvisualvm tool bundled with the JDK can also analyze heap dumps.

Its query capabilities are more powerful than MAT's.
MAT cannot do SQL-like operations such as group by, filter, and top, or other more complex queries. In jvisualvm, however, we can write very complex queries.
For example, the following query shows the top 10 classes by instance count among the objects referenced by Finalizer:

var counts = {};
var alreadyReturned = {};

top(
    filter(
        sort(
            map(heap.objects("java.lang.ref.Finalizer"),
                function (fobject) {
                    var className = classof(fobject.referent);
                    if (!counts[className]) {
                        counts[className] = 1;
                    } else {
                        counts[className] = counts[className] + 1;
                    }
                    return { string: className, count: counts[className] };
                }),
            'rhs.count - lhs.count'),
        function (countObject) {
            if (!alreadyReturned[countObject.string]) {
                alreadyReturned[countObject.string] = true;
                return true;
            } else {
                return false;
            }
        }),
    "rhs.count > lhs.count");

A brief explanation:
heap.objects("java.lang.ref.Finalizer") returns all instances of this class on the heap;
in map, each record is mapped to an object with two properties, {string: className, count: counts[className]}: string is the class name, and count is the number of its instances seen so far;
in the sort expression, rhs is the element on the right and lhs is the element on the left.
Sample output (not from the heap dump discussed above; the screenshot is omitted)

jvisualvm OQL reference: https://visualvm.github.io/documentation.html (see the OQL part)

Why does the Java finalizer cause OOM

As the heap dump above shows, a large number of instances are referenced by the Finalizer thread.

What is the finalizer

Textbooks tell us that it is best not to override the finalize method. After GC, every object that overrides finalize is placed in a ReferenceQueue, and a high-priority background thread then executes its finalize method.
Note the following points:
1. The object is put into the queue after GC has identified it as garbage; it is not the case that it becomes garbage only after being queued and finalized. While it sits in the queue it still occupies memory. IMPORTANT
2. The background Finalizer thread is a high-priority thread, not a low-priority one. (See the source code.)
3. The Finalizer thread catches every Throwable that may be thrown in finalize.
4. The finalize method of an object is executed at most once.
5. An object can also rescue itself in finalize so that it is not reclaimed. See section 3.2.4, "Survival or Death", of In-Depth Understanding of the Java Virtual Machine, Second Edition.

For reference, the JDK source code (java.lang.ref.Finalizer) starts the Finalizer thread with priority Thread.MAX_PRIORITY - 2, that is, 8.
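
The priority can also be checked on a running JVM. The following is a minimal sketch, assuming the JVM names the thread "Finalizer" as HotSpot does; the class name FinalizerThreadInfo is hypothetical, and on a JVM without such a thread the lookup simply returns an empty result.

```java
import java.util.Optional;

final class FinalizerThreadInfo {

    /** Looks up the live thread named "Finalizer", if the JVM has one. */
    static Optional<Thread> findFinalizerThread() {
        return Thread.getAllStackTraces().keySet().stream()
                .filter(t -> "Finalizer".equals(t.getName()))
                .findFirst();
    }

    public static void main(String[] args) {
        Optional<Thread> finalizer = findFinalizerThread();
        if (finalizer.isPresent()) {
            Thread t = finalizer.get();
            // Thread.NORM_PRIORITY is 5 and Thread.MAX_PRIORITY is 10.
            System.out.println(t.getName()
                    + " priority=" + t.getPriority()
                    + " daemon=" + t.isDaemon());
        } else {
            System.out.println("No thread named Finalizer found");
        }
    }
}
```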

Will the finalizer lead to OOM?

Since the Finalizer thread catches every Throwable thrown by finalize, the only remaining explanation is that finalize methods run slowly, so the finalization queue keeps growing, occupies memory, and eventually causes an OOM.
JMX actually exposes the number of objects currently in the queue:

import java.lang.management.ManagementFactory;
ManagementFactory.getMemoryMXBean().getObjectPendingFinalizationCount();
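
The JMX call above can be wrapped into a small poller. The following is a minimal sketch; the class name FinalizerQueueMonitor, the threshold parameter, and the one-second sampling interval are assumptions made for illustration, not part of the original monitoring setup.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;

final class FinalizerQueueMonitor {

    /** Approximate number of objects whose finalize method has not run yet. */
    static int pendingFinalizationCount() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        return memory.getObjectPendingFinalizationCount();
    }

    /** True if the backlog is above the given (hypothetical) alert threshold. */
    static boolean isBacklogged(int threshold) {
        return pendingFinalizationCount() > threshold;
    }

    public static void main(String[] args) {
        // Sample a few times; a value that only ever grows suggests a stuck or slow finalizer.
        for (int i = 0; i < 5; i++) {
            System.out.println("pending finalization: " + pendingFinalizationCount());
            try {
                Thread.sleep(1000);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
        }
    }
}
```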

This is our monitoring chart (number of objects waiting for their finalize method to run):

So there are two directions to investigate:
1. Some objects override finalize, and the implementation is slow, so the backlog keeps growing.
2. Some object's finalize contains blocking or sleep-like logic that blocks the whole thread.
We investigated 1 first, and it did not seem right: the objects were JDK implementations of low-level connections, which should not be very slow. We also tried caching URLConnection objects, which helped considerably, but an OOM still occurred eventually.
That leaves 2. Without further ado, run jstack and see what this thread is doing:

"Finalizer" #3 daemon prio=8 os_prio=1 tid=0x00000000510ae000 nid=0x1e78 in Object.wait() [0x0000000000b2e000]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    at java.lang.Object.wait(Object.java:502)
    at sun.security.ssl.SSLSocketImpl.closeInternal(SSLSocketImpl.java:1697)
    - locked <0x00000000d2449060> (a sun.security.ssl.SSLSocketImpl)
    at sun.security.ssl.SSLSocketImpl.close(SSLSocketImpl.java:1602)
    at com.microsoft.sqlserver.jdbc.TDSChannel.disableSSL(Unknown Source)
    at com.microsoft.sqlserver.jdbc.TDSChannel.close(Unknown Source)
    at com.microsoft.sqlserver.jdbc.SQLServerConnection.close(Unknown Source)
    at com.microsoft.sqlserver.jdbc.SQLServerConnection.terminate(Unknown Source)
    at com.microsoft.sqlserver.jdbc.SQLServerConnection.throwInvalidTDS(Unknown Source)
    at com.microsoft.sqlserver.jdbc.TDSReader.throwInvalidTDS(Unknown Source)
    at com.microsoft.sqlserver.jdbc.TDSReader.readPacket(Unknown Source)
    at com.microsoft.sqlserver.jdbc.TDSChannel$SSLHandshakeInputStream.ensureSSLPayload(Unknown Source)
    at com.microsoft.sqlserver.jdbc.TDSChannel$SSLHandshakeInputStream.read(Unknown Source)
    at com.microsoft.sqlserver.jdbc.TDSChannel$ProxyInputStream.read(Unknown Source)
    at sun.security.ssl.InputRecord.readFully(InputRecord.java:465)
    at sun.security.ssl.InputRecord.read(InputRecord.java:503)
    at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:973)
    - locked <0x00000000d244d4d8> (a java.lang.Object)
    at sun.security.ssl.SSLSocketImpl.waitForClose(SSLSocketImpl.java:1769)
    at sun.security.ssl.SSLSocketImpl.closeSocket(SSLSocketImpl.java:1579)
    at sun.security.ssl.SSLSocketImpl.closeInternal(SSLSocketImpl.java:1713)
    at sun.security.ssl.SSLSocketImpl.close(SSLSocketImpl.java:1602)
    at sun.security.ssl.BaseSSLSocketImpl.finalize(BaseSSLSocketImpl.java:269)
    at java.lang.System$2.invokeFinalize(System.java:1270)
    at java.lang.ref.Finalizer.runFinalizer(Finalizer.java:98)
    at java.lang.ref.Finalizer.access$100(Finalizer.java:34)
    at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:210)
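
Similar information can be sampled from inside the JVM with the standard ThreadMXBean, which may help when jstack cannot be run against the process. The following is a minimal sketch, assuming the thread is named "Finalizer" as in the dump above; the class name FinalizerThreadSampler and the output format are hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

final class FinalizerThreadSampler {

    /**
     * Returns the state and stack of the thread named "Finalizer",
     * or an empty string if no such thread is found.
     */
    static String describeFinalizer() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        for (ThreadInfo info : threads.dumpAllThreads(false, false)) {
            if (!"Finalizer".equals(info.getThreadName())) {
                continue;
            }
            StringBuilder sb = new StringBuilder();
            sb.append(info.getThreadName()).append(" state=").append(info.getThreadState());
            if (info.getLockName() != null) {
                sb.append(" waiting on ").append(info.getLockName());
            }
            sb.append(System.lineSeparator());
            for (StackTraceElement frame : info.getStackTrace()) {
                sb.append("    at ").append(frame).append(System.lineSeparator());
            }
            return sb.toString();
        }
        return "";
    }

    public static void main(String[] args) {
        // Take several samples; identical finalize frames in every sample point to a stuck finalizer.
        for (int i = 0; i < 3; i++) {
            System.out.println(describeFinalizer());
            try {
                Thread.sleep(1000);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
        }
    }
}
```

On a healthy JVM the thread is usually idle in its queue wait; a finalize frame that stays at the top of the stack across samples is the pattern described in this article.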

Comparing several thread dumps, we found that the thread was always executing the wait method. So it is waiting for a notification that never comes, which blocks the thread. That explains why the queue keeps growing: nothing behind it can be processed.

Thread.stop

Continuing from above: we now know that the thread is stuck in sun.security.ssl.SSLSocketImpl.closeInternal(SSLSocketImpl.java:1697), waiting for a lock it never gets!
That is very strange. The official JDK should not ship with such a bug; or is this a JDK bug after all?
We then checked our own business logs and found, at the corresponding point in time, a strange entry reporting that a Groovy script had timed out and been forcibly terminated. From that moment on, the number of objects waiting for finalize to run kept increasing. So we checked the code:

To ensure that a customer's Groovy script can always be terminated, each task we submit starts a timeout timer that forces it to end.
// roughly:
_targetThread.stop();

It is this stop that causes object locks to be released incorrectly.
See the official documentation on Thread.stop: https://docs.oracle.com/javase/7/docs/technotes/guides/concurrency/threadPrimitiveDeprecation.html

Because it is inherently unsafe. Stopping a thread causes it to unlock all the monitors that it has locked. (The monitors are unlocked as the ThreadDeath exception propagates up the stack.) If any of the objects previously protected by these monitors were in an inconsistent state, other threads may now view these objects in an inconsistent state. Such objects are said to be damaged. When threads operate on damaged objects, arbitrary behavior can result. This behavior may be subtle and difficult to detect, or it may be pronounced. Unlike other unchecked exceptions, ThreadDeath kills threads silently; thus, the user has no warning that his program may be corrupted. The corruption can manifest itself at any time after the actual damage occurs, even hours or days in the future.

In short, stopping a thread directly makes it release all of its monitors, which can leave the variables protected by those monitors in an inconsistent state. In the end, code that depends on them can no longer execute correctly.
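
The safer alternative is cooperative cancellation with interrupt. The following is a minimal sketch of a timeout that interrupts the worker instead of stopping it; the class name InterruptTimeout, the thread name, and the one-second grace period are hypothetical and are not taken from the original code base.

```java
final class InterruptTimeout {

    /**
     * Runs the task on a worker thread and interrupts it if it is still
     * running after timeoutMillis (must be > 0, because join(0) waits forever).
     * Returns true if the task finished on its own, false if it was interrupted.
     */
    static boolean runWithTimeout(Runnable task, long timeoutMillis) {
        Thread worker = new Thread(task, "script-worker");
        worker.setDaemon(true);
        worker.start();
        try {
            worker.join(timeoutMillis);
            if (!worker.isAlive()) {
                return true;
            }
            // Ask the task to stop; the task decides when it is safe to exit,
            // so monitors are released through normal control flow.
            worker.interrupt();
            worker.join(1000); // grace period for cleanup
            return false;
        } catch (InterruptedException e) {
            worker.interrupt();
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        Runnable slowScript = () -> {
            try {
                Thread.sleep(60_000); // stands in for a long-running script
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // exit cleanly on interrupt
            }
        };
        // prints "finished in time: false"
        System.out.println("finished in time: " + runWithTimeout(slowScript, 200));
    }
}
```

The trade-off is that a task which never checks its interrupt status and never blocks in an interruptible call will keep running; interrupt is a request, not a forced termination.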
So this problem was caused by stop. After changing back to the interrupt method, the problem no longer occurred.

Summary

1. Do not use Thread.stop.
2. Do not override the finalize method.
3. OQL is very useful for memory analysis and is worth learning.
