How to locate the problem point of CPU high process under Linux __linux

Source: Internet
Author: User
Tags semaphore stack trace cpu usage
One, top+pstack+gdb of the combination of boxingGossip less, first directly on the operation of the example, and then do the principle of explanation.


1.1 Using top command to find the most CPU process
>top
PID USER PR NI virt RES SHR S%cpu%mem time+ COMMAND
22688 Root 0 1842m 136m 13m S 110.0 0.9 1568:44 Test-program


1.2 Using Pstack to track the process stack
This command displays the stack trace for each process.
The Pstack command must be run by the owner or root of the corresponding process. You can use Pstack to determine where the process hangs.
The only option that this command allows to use is the PID of the process to be checked.


This command is useful for troubleshooting process issues,
For example, we find that a service has been in a work state (such as suspended animation, like a dead loop),
Use this command to easily locate the problem;
You can perform several pstack in a period of time, and if you find that the code stack always stops in the same place,
That position needs to be focused, which is probably where the problem is;


>pstack 22688
Thread (thread 0x7fa97035f700 (LWP 22689)):
#0 0x00007fa96f386a00 in sem_wait () from/lib64/libpthread.so.0
#1 0x0000000000dfef12 in Uv_sem_wait ()
#2 0x0000000000d67832 in node::D ebugsignalthreadmain (void*) ()
#3 0x00007fa96f380aa1 in Start_thread () from/lib64/libpthread.so.0
#4 0x00007fa96f0cdaad in Clone () from/lib64/libc.so.6
Thread (thread 0x7fa96efe4700 (LWP 22690)):
#0 0x00007fa96f386a00 in sem_wait () from/lib64/libpthread.so.0
#1 0x0000000000e08a38 in V8::base::semaphore::wait () ()
#2 0x0000000000dddde9 in V8::p Latform::taskqueue::getnext () ()
#3 0x0000000000dddf3c in V8::p Latform::workerthread::run () ()
#4 0x0000000000e099c0 in V8::base::threadentry (void*) ()
#5 0x00007fa96f380aa1 in Start_thread () from/lib64/libpthread.so.0
#6 0x00007fa96f0cdaad in Clone () from/lib64/libc.so.6
Thread (thread 0x7fa96e5e3700 (LWP 22691)):
#0 0x00007fa96f386a00 in sem_wait () from/lib64/libpthread.so.0
#1 0x0000000000e08a38 in V8::base::semaphore::wait () ()
#2 0x0000000000dddde9 in V8::p Latform::taskqueue::getnext () ()
#3 0x0000000000dddf3c in V8::p Latform::workerthread::run () ()
#4 0x0000000000e099c0 in V8::base::threadentry (void*) ()
#5 0x00007fa96f380aa1 in Start_thread () from/lib64/libpthread.so.0
#6 0x00007fa96f0cdaad in Clone () from/lib64/libc.so.6
Thread (thread 0x7fa96dbe2700 (LWP 22692)):
#0 0x00007fa96f386a00 in sem_wait () from/lib64/libpthread.so.0
#1 0x0000000000e08a38 in V8::base::semaphore::wait () ()
#2 0x0000000000dddde9 in V8::p Latform::taskqueue::getnext () ()
#3 0x0000000000dddf3c in V8::p Latform::workerthread::run () ()
#4 0x0000000000e099c0 in V8::base::threadentry (void*) ()
#5 0x00007fa96f380aa1 in Start_thread () from/lib64/libpthread.so.0
#6 0x00007fa96f0cdaad in Clone () from/lib64/libc.so.6


Use the top command to view the most CPU-consuming threads of a specified process.
The thread number found below is 22970.
>top-h-P 22688
PID USER PR NI virt RES SHR S%cpu%mem time+ COMMAND
22970 Root 0 1842m 136m 13m R 100.2 0.9 1423:40 Test-program


Note:
The PID here is the unique thread number that the system assigns to each thread, not the process number, but the name is also PID.
The specific differences between the two are as follows:
"Pid,tid in Linux, and the relationship between real-world pid"
http://blog.csdn.net/u012398613/article/details/52183708


Use the thread number PID to reverse its corresponding thread number.
The thread 22970 corresponds to thread 10 is found as follows
>pstack 22688 | grep 22970
Thread 0x7fa92f5fe700 (LWP 22970):


Use Vim to view a snapshot of a process, navigate to a specific thread, and view its call stack;
>pstack 22688 | Vim-
Thread 0x7fa92f5fe700 (LWP 22970):
#0 0x00007fa96f02a04f in vfprintf () from/lib64/libc.so.6
#1 0x00007fa96f054712 in vsnprintf () from/lib64/libc.so.6
#2 0x00007fa967b3861c in Lv_write_log () From/opt/test-program
#3 0x00007fa967b26173 in Lvjbuf::p jmedia_jbuf_put_rtp_pkg (pjmedia_rtp_decoded_pkg const*, int*) () from/opt/ Test-program
#4 0x00007fa96782409f in Livesrv::lvaudio::on_rtp_stream (void*, unsigned int, unsigned int) () From/opt/test-program
#5 0x00007fa96781fc87 in Livesrv::lvmedia::recv_media (void*, unsigned int, unsigned char, unsigned int) () from/opt/test -program
#6 0x00007fa967818c7f in Livesrv::lvchannel::d o_recv_media_check_thread2 () () from/opt/test-program/node_modules/ Livesource/debug/linux/livesource.node
#7 0x00007fa967814699 in Recv_media_process2 (void*) () From/opt/test-program
#8 0x00007fa96f380aa1 in Start_thread () from/lib64/libpthread.so.0
#9 0x00007fa96f0cdaad in Clone () from/lib64/libc.so.6




The above operation basically locates the concrete thread and the approximate function,
If you want to see the specific reasons, such as the field functions in the variables such as the value, you will need to use the real-time debugging function of GDB.
1.3 Using GDB to debug a live process
>GDB Attach 22688
: Thread 10
: BT
: Frame X
:p xxx


second, top usage2.1 Top: Changes in dynamic observation procedures
[Root@linux ~]# Top [-d] | Top [-BNP]
Parameters:
-D: Can be followed by seconds, which is the number of seconds to update the entire program screen. The preset is 5 seconds;
-B: Top is executed in batches, and more parameters are available.
Typically, a data flow redirect is used to output the result of a batch to a file.
-N: With-B, the meaning is that several top outputs are required.
-P: Specify some PID to be observed and monitored.


Key commands you can use during top execution:
? : Display the key commands that can be entered in top;
P: Using the CPU to sort display the resources;
M: The use of Memory resources to sort the display;
N: Sort by PID.
T: The CPU Time accumulation (time+) used by the Process is sorted.
K: Give a certain PID a signal (signal)
R: Give a PID a nice value again.


2.2 Top is also a pretty good program observation tool.
Unlike PS, which is the result of static output, top this program can continuously monitor (monitor) the entire system's program working state,
For example, the above example shows. In a preset case, the time of the Update program resource is 5 seconds each time,
However, you can use-D to make modifications.
Top is divided into two screen, the above screen for the entire system of resource use status, basically a total of six lines, the contents of the display is:
• First line: Displays the time that the system has been started, the current number of people online, the overall system load (load).
The comparison needs to be noted is the system load, three data represents 1, 5, 10 minutes of average load.
In general, this load value should not be more than 1, unless your system is busy.
If it lasts above 5, then ... Take a closer look at the program that is affecting the overall system.
• Second line: Shows the current number of observed programs,
The comparison needs to be noted is the last zombie that value, if not 0,
Hey. Take a good look at the end of the process to become a corpse.
• Third line: Shows the overall load of the CPU, each item can be used? Inspection.
What needs to be observed is the value of the ID (idle), which, in general, should be close to 100%, which means that the system has very little resources to use. ^_^.
• Lines fourth and fifth: represents the current use of physical and virtual memory (MEM/SWAP).
• Line sixth: This is where the state is displayed when the command is entered in the top program. Example four is a simple example of use.


The picture underneath top is the resource used by each process. The comparison needs to be noted:
PID: ID of each process.
User: The user to whom the process belongs;
Pr:priority, the order of priority execution of the program, the smaller the sooner be executed;
The shorthand for Ni:nice is related to Priority, and the smaller the earlier it is executed;
The utilization rate of%cpu:cpu;
%mem: The usage rate of memory;
time+: The accumulation of CPU usage time;
In general, if Brother Bird want to find the most loss of CPU resources that program, most of the use of the top of this program.
Then force the CPU to use the resource to sort (press P in top) and you'll know it quickly. ^_^.




third, Pstack usageThis command displays the stack trace for each process.
The Pstack command must be run by the owner or root of the corresponding process. You can use Pstack to determine where the process hangs.
The only option that this command allows to use is the PID of the process to be checked. See proc (1) manual page.


This command is useful for troubleshooting process issues, such as when we find a service

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.