1. Fault phenomenon
A customer service colleague reported that the platform was running very slowly and web pages were lagging badly. The problem persisted after restarting the system several times. Checking the server with the top command showed that CPU usage was abnormally high.
2. Locating the high CPU consumption problem
2.1. Locating the problem process
Using the top command to check resource usage shows that the process with PID 14063 is consuming a large amount of CPU: its CPU usage is as high as 776.1% and its memory usage reaches 29.8%.
```
[ylp@ylp-web-01 ~]$ top
top - 14:51:10 up 233 days, 11:50, 7 users, load average: 6.85, 5.62, 3.97
Tasks: 192 total, 2 running, 190 sleeping, 0 stopped, 0 zombie
%Cpu(s): 97.3 us, 0.3 sy, 0.0 ni, 2.5 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 16268652 total,  5114392 free,  6907028 used,  4247232 buff/cache
KiB Swap:  4063228 total,  3989708 free,    73520 used.  8751512 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
14063 ylp       20   0 9260488 4.627g  11976 S 776.1 29.8 117:41.66 java
```
2.2. Locating the problem thread
Use the ps -mp pid -o THREAD,tid,time command to inspect the threads of the process; several of its threads show high CPU usage.
```
[ylp@ylp-web-01 ~]$ ps -mp 14063 -o THREAD,tid,time
USER  %CPU PRI SCNT WCHAN  USER SYSTEM   TID     TIME
ylp    361   -    - -         -      -     - 02:05:58
ylp    0.0  19    - futex_    -      - 14063 00:00:00
ylp    0.0  19    - poll_s    -      - 14064 00:00:00
ylp   44.5  19    - -         -      - 14065 00:15:30
ylp   44.5  19    - -         -      - 14066 00:15:30
ylp   44.4  19    - -         -      - 14067 00:15:29
ylp   44.5  19    - -         -      - 14068 00:15:30
ylp   44.5  19    - -         -      - 14069 00:15:30
ylp   44.5  19    - -         -      - 14070 00:15:30
ylp   44.5  19    - -         -      - 14071 00:15:30
ylp   44.6  19    - -         -      - 14072 00:15:32
ylp    2.2  19    - futex_    -      - 14073 00:00:46
ylp    0.0  19    - futex_    -      - 14074 00:00:00
ylp    0.0  19    - futex_    -      - 14075 00:00:00
ylp    0.0  19    - futex_    -      - 14076 00:00:00
ylp    0.7  19    - futex_    -      - 14077 00:00:15
```
As can be seen from the output, the threads with TIDs 14065~14072 all have very high CPU usage.
2.3. View the problem thread stack
Pick the thread with TID 14065 and examine its stack. First convert the thread ID to hexadecimal using the printf "%x\n" tid command:
```
[ylp@ylp-web-01 ~]$ printf "%x\n" 14065
36f1
```
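For reference, the same decimal-to-hexadecimal conversion can be done in Java; a minimal sketch (the TID value 14065 is taken from the ps output above, and the class name is illustrative):

```java
public class TidToNid {
    public static void main(String[] args) {
        int tid = 14065; // native thread ID from the ps output
        // jstack reports native thread IDs in hexadecimal in its "nid=0x..." field
        System.out.println("0x" + Integer.toHexString(tid)); // prints 0x36f1
    }
}
```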
Then use the jstack command to print the thread stack. Command format: jstack pid | grep tid -A 30
```
[ylp@ylp-web-01 ~]$ jstack 14063 | grep 36f1 -A 30
"GC task thread#0 (ParallelGC)" prio=10 tid=0x00007fa35001e800 nid=0x36f1 runnable
"GC task thread#1 (ParallelGC)" prio=10 tid=0x00007fa350020800 nid=0x36f2 runnable
"GC task thread#2 (ParallelGC)" prio=10 tid=0x00007fa350022800 nid=0x36f3 runnable
"GC task thread#3 (ParallelGC)" prio=10 tid=0x00007fa350024000 nid=0x36f4 runnable
"GC task thread#4 (ParallelGC)" prio=10 tid=0x00007fa350026000 nid=0x36f5 runnable
"GC task thread#5 (ParallelGC)" prio=10 tid=0x00007fa350028000 nid=0x36f6 runnable
"GC task thread#6 (ParallelGC)" prio=10 tid=0x00007fa350029800 nid=0x36f7 runnable
"GC task thread#7 (ParallelGC)" prio=10 tid=0x00007fa35002b800 nid=0x36f8 runnable

"VM Periodic Task Thread" prio=10 tid=0x00007fa3500a8800 nid=0x3700 waiting on condition

JNI global references: 392
```
As the output shows, this thread is one of the JVM's GC threads. At this point it can be reasonably concluded that insufficient memory or a memory leak is keeping the GC threads running continuously, which is what drives CPU consumption so high.
The next step is therefore to investigate the memory side.
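As a side note, the same kind of GC statistics that jstat reports externally can also be read from inside a running JVM through the standard management API; a minimal sketch (purely illustrative, not part of the original troubleshooting):

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcStats {
    public static void main(String[] args) {
        // One bean per collector, e.g. the young- and old-generation collectors
        // of the Parallel GC whose worker threads appeared in the jstack output.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.printf("%s: count=%d, time=%d ms%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
        }
    }
}
```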
3. Locating the memory problem
3.1. Use the jstat -gcutil command to view the memory usage of the process
```
[ylp@ylp-web-01 ~]$ jstat -gcutil 14063 2000 10
  S0     S1     E      O      P     YGC     YGCT    FGC    FGCT     GCT
  0.00   0.00 100.00  99.99  26.31     42   21.917   218 1484.830 1506.747
  0.00   0.00 100.00  99.99  26.31     42   21.917   218 1484.830 1506.747
  0.00   0.00 100.00  99.99  26.31     42   21.917   219 1496.567 1518.484
  0.00   0.00 100.00  99.99  26.31     42   21.917   219 1496.567 1518.484
  0.00   0.00 100.00  99.99  26.31     42   21.917   219 1496.567 1518.484
  0.00   0.00 100.00  99.99  26.31     42   21.917   219 1496.567 1518.484
  0.00   0.00 100.00  99.99  26.31     42   21.917   219 1496.567 1518.484
  0.00   0.00 100.00  99.99  26.31     42   21.917   220 1505.439 1527.355
  0.00   0.00 100.00  99.99  26.31     42   21.917   220 1505.439 1527.355
  0.00   0.00 100.00  99.99  26.31     42   21.917   220 1505.439 1527.355
```
The output shows that the Eden space is 100% full, the old generation is 99.99% full, and full GC has already run 220 times. Full GC is both frequent and extremely slow, averaging about 6.8 seconds per collection (1505.439 / 220). Based on this, it is reasonable to conclude that there is a problem in the application code, most likely objects being created in an unreasonable way.
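As a quick sanity check on the numbers above, the average full GC pause follows directly from the FGC and FGCT columns; a minimal sketch with the values copied from the jstat output:

```java
public class FullGcAverage {
    public static void main(String[] args) {
        int fullGcCount = 220;           // FGC column from jstat -gcutil
        double fullGcSeconds = 1505.439; // FGCT column: total full GC time in seconds
        double avg = fullGcSeconds / fullGcCount;
        System.out.printf("average full GC pause: %.1f s%n", avg); // about 6.8 s
    }
}
```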
3.2. Use the jstack command to dump the thread stacks of the process

```
[ylp@ylp-web-01 ~]$ jstack 14063 >> jstat.out
```
After copying the jstat.out file from the server to a local machine, open it in an editor and search for stack frames that contain the project's package path and belong to threads in the RUNNABLE state. It turns out that line 447 of ActivityUtil.java is calling the HashMap.put() method.
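A comparable dump can also be produced from inside the JVM when jstack is not at hand; a minimal sketch using the standard Thread API (not part of the original troubleshooting, and the RUNNABLE filter simply mirrors the manual search in the editor):

```java
import java.util.Map;

public class RunnableStackDump {
    public static void main(String[] args) {
        // Collect the stack traces of all live threads in this JVM
        for (Map.Entry<Thread, StackTraceElement[]> e : Thread.getAllStackTraces().entrySet()) {
            Thread t = e.getKey();
            if (t.getState() != Thread.State.RUNNABLE) {
                continue; // keep only RUNNABLE threads, as in the manual search
            }
            System.out.println("\"" + t.getName() + "\" state=" + t.getState());
            for (StackTraceElement frame : e.getValue()) {
                System.out.println("    at " + frame);
            }
        }
    }
}
```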
3.3. Code positioning
Open the project and go to line 447 of the ActivityUtil class. After checking with the responsible colleague, it turns out that this code reads a configuration value from the database and then loops according to the stored "remain" count, calling HashMap.put() on every iteration of the loop; a sketch of this pattern is shown below.
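The original snippet is not reproduced here, but a hypothetical sketch of the pattern just described (class name, field names and values are illustrative only, not the actual project code) looks roughly like this:

```java
import java.util.HashMap;
import java.util.Map;

public class ActivitySketch {

    // Hypothetical illustration of the problematic pattern: the loop bound
    // comes straight from a database configuration value ("remain"), and the
    // map is filled on every iteration.
    static Map<Long, String> buildActivityMap(long remain) {
        Map<Long, String> map = new HashMap<>();
        for (long i = 0; i < remain; i++) {
            // With a huge "remain" value this creates an enormous number of
            // entries, exhausts the heap and keeps the GC threads busy.
            map.put(i, "activity-" + i);
        }
        return map;
    }

    public static void main(String[] args) {
        // In production the "remain" value read from the database was huge,
        // which is what filled the heap; a small value is used here so the
        // sketch runs without exhausting memory.
        Map<Long, String> map = buildActivityMap(10);
        System.out.println("entries: " + map.size());
    }
}
```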
Querying the configuration in the database confirms that the "remain" value is indeed huge.
At this point the root cause of the problem has been located.