Troubleshooting High CPU Usage on Linux

Source: Internet
Author: User
Tags: CPU usage

Yesterday I handled an urgent issue: a customer complained that after a product upgrade the system became slow and CPU usage was very high. The customer has a short temper and threatened to return the product if the problem was not resolved quickly, which put a lot of pressure on R&D; the task of solving it was assigned to me. The customer is a South African company, so because of the time difference I had to work on the problem remotely from home. At 8:30 in the evening I connected to the customer's machine with GoToMeeting to look at our system, and after about four hours of poking around I finally reproduced the high CPU in the early hours of the morning. I quickly grabbed the logs, captured packets with Wireshark, and saved the related system output with gcore, gstack, strace, and top.
Analyzing these files the next day, I found the product bug: the author of the code had allocated a 10 KB buffer and assumed that 10 KB would always be enough. It is in 99% of cases, but in the remaining 1% the buffer fills up and the program falls into an infinite loop, driving CPU usage up. Once the problem was found, the fix was easy, the customer's temper eased, and the fix could be delivered quickly. Reflecting on the troubleshooting, I think the analysis process is reusable and worth summarizing.
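The product code cannot be shown, but the bug follows a familiar pattern. The sketch below is a hypothetical reconstruction, not the actual code; all names and details are assumptions. It shows a reader that polls a socket but never drains its fixed 10 KB buffer once it is full, so poll() keeps reporting the descriptor readable and the loop spins.

#include <poll.h>
#include <unistd.h>

#define BUF_SIZE (10 * 1024)           /* the fixed 10 KB buffer from the story */

/* Hypothetical sketch of the bug pattern; not the actual product code. */
static void read_loop(int fd)
{
    char buf[BUF_SIZE];
    size_t used = 0;                   /* bytes currently held in buf */
    struct pollfd pfd = { .fd = fd, .events = POLLIN };

    for (;;) {
        if (poll(&pfd, 1, 1000) <= 0)  /* wait up to 1 s for readable data */
            continue;

        size_t avail = BUF_SIZE - used;
        if (avail == 0)
            continue;                  /* BUG: buffer full, nothing is read or
                                          consumed, so poll() immediately reports
                                          the fd readable again and the loop spins */

        ssize_t n = read(pfd.fd, buf + used, avail);
        if (n > 0)
            used += (size_t)n;         /* the consumer that drains buf and
                                          resets used is omitted here */
    }
}

In a pattern like this the fix is to handle the buffer-full case explicitly (drain or grow the buffer, or stop polling for POLLIN until space is available) rather than silently looping. A loop like this is also consistent with the poll-dominated strace statistics shown in step 5 below.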
1. Use the top command to see which process is consuming the most CPU
The gateway process (PID 14094) consumes up to 891% CPU; this figure is the sum of the CPU consumed by all of the threads in the process.
  PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEM   TIME+ COMMAND
14094 root       0  315m  10m 7308 S  891  2.2 1:49.01 gateway
20642 root       0 17784 4148 2220 S  0.5  0.8 2:39.96 microdasys
 1679 root       0 10984 1856 1556 R  0.3  0.4 0:22.21 sshd
22563 root       0  2424 1060      R  0.3  0.2 0:00.03 top
    1 root       0  2156  492  460 S  0.0  0.1 0:01.59 init
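As a side note (a standard top option, not part of the original session): when the output needs to be captured to a file for later analysis, as was done here, top's batch mode can be used:

# top -b -n 1 > top.log

-b runs top non-interactively and -n 1 limits it to a single iteration.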
2. Use the top -H -p <pid> command to see the percentage of CPU consumed by each thread in the process
# top -H -p 14094
In top you can see 107 threads in this process, but only the following nine threads have high CPU usage, with thread 14086 being the main contributor. (A note on mapping these thread IDs back to the application's threads follows the listing.)
  PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+ COMMAND
14086 root  25   0  922m 914m 538m R  101 10.0 21:35.46 gateway
14087 root  25   0  922m 914m 538m R  101 10.0 10:50.22 gateway
14081 root  25   0  922m 914m 538m S   99 10.0  8:57.36 gateway
14082 root  25   0  922m 914m 538m R   99 10.0 11:51.92 gateway
14089 root  25   0  922m 914m 538m R   99 10.0 21:21.77 gateway
14092 root  25   0  922m 914m 538m R   99 10.0 19:55.47 gateway
14094 root  25   0  922m 914m 538m R   99 10.0 21:02.21 gateway
14083 root  25   0  922m 914m 538m R   97 10.0 21:32.39 gateway
14088 root  25   0  922m 914m 538m R   97 10.0 11:23.12 gateway
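The PID column that top -H prints for a thread is its kernel thread ID (LWP). To correlate these numbers with the application's own logs, each thread can log its LWP at startup. A minimal sketch follows; the syscall is standard Linux, while the function and its use are just an illustration, not part of the product:

#define _GNU_SOURCE
#include <stdio.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Print the calling thread's LWP, i.e. the PID column of "top -H". */
static void log_thread_id(const char *name)
{
    pid_t tid = (pid_t)syscall(SYS_gettid);   /* older glibc has no gettid() wrapper */
    fprintf(stderr, "thread %s has LWP %d\n", name, (int)tid);
}

With something like this in the logs, LWP 14086 from top can be matched directly to a named thread in the application.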
3. Use the gstack command to view the function call stacks of each thread in the process
# gstack 14094 > gstack.log
Look for the thread with LWP 14086 in gstack.log. Because the full stack would expose internal function details, only two frames are shown here; the gdb thread number that corresponds to LWP 14086 is 37.
Thread 0x4696ab90 (LWP 14086):
#0 0x40000410 in __kernel_vsyscall ()
#1 0x40241f33 in poll () from /lib/i686/nosegneg/libc.so.6
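gstack itself is a thin wrapper around gdb. If it is not installed on the target machine, an equivalent snapshot of every thread's stack can be taken with gdb in batch mode (shown here as a generic example, not a command from the original session):

# gdb -p 14094 -batch -ex "thread apply all bt" > gstack.log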
4. Use the gcore command to dump the process image and memory
# gcore 14094
This command generates a core file, core.14094.
5. Use the strace command to view system calls and the time spent in them
# strace -T -r -c -p 14094

The -c option prints the summary statistics shown below; without it, strace lists each system call individually with its return value and, because of -T, the time spent in it. In this summary, virtually all of the measured system-call time (22.68 seconds) was spent in 6702 poll calls.
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 99.99   22.683879        3385      6702           poll
  0.00    0.001132           0      6702           gettimeofday
  0.00    0.000127           1       208       208 accept
  0.00    0.000022          22         1           read
  0.00    0.000000           0         1           write
  0.00    0.000000           0         1           close
  0.00    0.000000           0        14           time
  0.00    0.000000           0         2           stat64
  0.00    0.000000           0         4           clock_gettime
  0.00    0.000000           0         7           send
  0.00    0.000000           0        10        10 recvfrom
------ ----------- ----------- --------- --------- ----------------
100.00   22.685160                 13652       218 total
6. Debug the core file with gdb and switch to thread 37
A core file produced by gcore is almost identical to one produced by an actual core dump; the main difference is that some dynamic debugging is not possible on it in gdb.
# gdb gateway core.14094
(gdb) thread 37
[Switching to Thread 0x4696ab90 (LWP 14086)]
#0 0x40000410 in __kernel_vsyscall ()
(gdb) where
#0 0x40000410 in __kernel_vsyscall ()
#1 0x40241f33 in poll () from /lib/i686/nosegneg/libc.so.6
From here gdb can walk the detailed function stack, print variable values, and, together with the source code, reveal why the poll loop is consuming so much CPU.
Because that code is the company's intellectual property, no detailed analysis is given here; what matters is the analysis process and the use of these commands.
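As a generic illustration of that gdb inspection (the frame number and variable names below are hypothetical and not from the product), a session on the core file might look like this:

(gdb) thread 37
(gdb) bt
(gdb) frame 2
(gdb) print buffer_used
(gdb) print buffer_size

Comparing values like these against the source code is how a buffer-full condition like the one described above would show up.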

The overall process is: process ID -> thread ID -> thread function call stack -> system call timing and call statistics -> source code analysis.
