Zprofiler kick solving high CPU usage issuesJiu JuExplore 171 2015-04-08 14:11:58 Posted in: JVM Performance and debug platformZprofiler in Friday encountered a problem with high CPU usage on the online machine. The problem itself is relatively simple, but the main function of multiple zprofiler is used in the positioning process, and the feeling is a good example of using Zprofiler to locate such a problem process.
Before you start using Zprofiler, use perf to confirm that the bottleneck point is in native. (The following operations require root privileges, PE assistance is required to operate)
If the perf is not installed on the online server, you can go to http://yum.corp.taobao.com/taobao/6/x86_64/test/aliperf/aliperf-0.3.9-9.el6.x86_64.rpm Download the RPM package and install it.
Use the perf top command to view the hotspot functions of the current system.
As shown, the hotspot is in Java code, because the Java code is JIT-executed and perf cannot see its symbol, so the default is in Perf-<pid>.map.
If the hotspot is in the libjvm.so function, you can contact our team to assist in further analysis. For example, if the hotspot is a JIT-related function, usually codecache or JIT-related parameters, if it is a GC-related function, you can use Zprofiler to analyze gclog, and then adjust GC-related parameters.
To exclude other possible, determine the problem of Java code, you can do a thread dump, analysis on the Zprofiler.
Using the run-state thread hotspot stack (load) feature in thread dump, you can see the maximum number of call stacks that appear in the running thread. As shown in the following:
In fact, this has seen the problem stack, but because thread dump is just a snapshot, then didn't dare to believe so quickly to find the problem, so still think with hot method profiling look.
Hot Method Profiling has a special article introduced, here is not much to say, look at the top of the circle can be.
The results of the analysis were as follows:
This result is very obvious, the first function accounted for 99% of the CPU utilization. And the expanded call stack is exactly the same as the call stack that was seen in the hot stack above. The basic thing is to be sure that the problem is here.
But the product's small partner said this place is the normal call, the SQL statement has not been modified for a long time, the database data volume is not small. To find out, decide to make a heap dump and see what kind of data is being processed?
After the heap dump is finished, it is copied to the Zprofiler system for analysis. Probably looked at the "Object Cluster View", there are no particularly large objects.
Then we looked at "Thread Overview" and filtered the thread name to the right "regular match".
then expand to see the local objects on each layer's call stack. Such as:
You can see the contents of the object by putting the mouse on it. Here you can see the SQL statement being queried, along with the associated parameters.
The root cause of this is that there is a third-party component that does not cause the bug to escalate.
But the whole process is more useful for reference, I hope it will be helpful to everyone.
Zprofiler kick solving high CPU usage issues