[Embedded Development] Linux Performance Analysis: Context Switching

I. Starting from a problem
Many people who use a phone or PC have probably run into this situation: the more software you install, the more sluggish the system gets, yet CPU utilization stays below 10% and memory looks fine. I hit a similar situation in recent development work, with the difference that the system at that point was running only a test program and a few sleeping background processes, which suggested the problem lay in the system itself, most likely the driver layer. From an operating-system perspective, the more likely causes are:
- A large number of interrupts
These may come from continuous disk reads and writes or network traffic, or from a misused module or faulty hardware that keeps making a peripheral raise interrupts to the CPU;
- High system load (note: not CPU utilization)
A high load means many processes are waiting to be scheduled to run, which in turn makes context switches frequent;
- Context switches that are too frequent
A context switch is the CPU switching from one process to another, and this takes a certain amount of time. If context switches happen too often, the CPU spends less of its time actually executing process code. Note that point 2 (high load) causes frequent context switching, but frequent context switching does not necessarily mean the load is high.
In my previous troubleshooting experience, performance degradation was mostly caused by point 1, whose impact on the system is fairly obvious, while points 2 and 3 are more covert: even when their values are already abnormal, as long as the application has no strict real-time requirements, the worst symptom is a slightly slower response, and nothing visibly looks wrong. As a result, low-level driver developers generally do not consider points 2 and 3 at all, let alone treat them as test indicators when evaluating system performance. However, the module I needed to test has strict real-time requirements, and because the system was context switching very frequently even when idle, the test results were naturally poor.
II. How to confirm that context switches are too frequent?
To solve the problem of frequent context switching at idle, we first need to know how frequent counts as abnormal. The /proc/stat file records CPU activity statistics, and context switching is one of them: in the command output below, the line beginning with ctxt gives the total number of context switches since the system booted.
```
~ # cat /proc/stat
cpu  635 0 2319 90669 0 2 0 0
cpu0 635 0 2319 90669 0 2 0 0
intr 267849 0 0 0 119 0 0 0 0 13 13 0 0 0 0 12 0 0 0 0 0 0 0 0 0 0 0 0 0 196280 0 0 0 0 61242 1614 0 0 4770 ... (long run of per-interrupt counters trimmed)
ctxt 444616
btime 1437983701
processes 514
procs_running 1
procs_blocked 0
softirq 135729 0 93679 2 0 0 1768 0 2 40278
```
For analyzing this problem, what we care about more is the number of context switches per second, which can be measured with the following commands.

```
$ cat /proc/stat | grep ctxt && sleep 30 && cat /proc/stat | grep ctxt
ctxt 211015
ctxt 218128
```
Context switches per second = the difference between the two values / 30. For this example, that is (218128 - 211015) / 30 ≈ 237 switches per second, which is quite abnormal for an idle system; a normal value is typically no more than 50 per second. This threshold is not fixed, however, and must be judged against the system environment. For example, my embedded work is mainly on POS products: compared with Linux running as a server, far fewer services are actually running, so context switching should naturally be lower as well. Furthermore, if stricter real-time behavior is required, the acceptable range should shrink: for the product I was analyzing, the requirement was no more than 30 switches per second at idle. The method above works on any Linux system. If your system ships with vmstat, that is a more convenient way to watch this value: run vmstat 1 and read the cs column, which already shows context switches per second. (The original post showed the vmstat output in a screenshot, highlighted with a red box, which is not reproduced here.)
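If you want this measurement automated, for example inside a board test script, the sampling and division above are easy to wrap in a small C program. The following is a minimal sketch of the same method, not code from the original post (the file name cswrate.c and the 30-second interval are my choices, matching the example above):

```c
/* cswrate.c - sample the ctxt counter in /proc/stat twice and print
 * the average number of context switches per second in between. */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define INTERVAL 30 /* seconds between the two samples */

/* Return the total context-switch count from the "ctxt" line. */
static unsigned long long read_ctxt(void)
{
    char line[512];
    unsigned long long ctxt = 0;
    FILE *fp = fopen("/proc/stat", "r");

    if (!fp) {
        perror("fopen /proc/stat");
        exit(EXIT_FAILURE);
    }
    while (fgets(line, sizeof(line), fp)) {
        if (sscanf(line, "ctxt %llu", &ctxt) == 1)
            break;
    }
    fclose(fp);
    return ctxt;
}

int main(void)
{
    unsigned long long before = read_ctxt();
    sleep(INTERVAL);
    unsigned long long after = read_ctxt();

    printf("context switches per second: %llu\n",
           (after - before) / INTERVAL);
    return 0;
}
```

Build it with the same cross toolchain as the rest of the system, e.g. arm-none-linux-gnueabi-gcc -o cswrate cswrate.c, and run it on the board while it is idle.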
III. How to troubleshoot frequent context switching?
The total number of context switches cannot tell us which process or which driver is at fault; for that we need pidstat (part of the sysstat package) to see each process's context switches per second. Below is the result of running $ pidstat -w 1 (shown as a screenshot in the original post, not reproduced here). For each process, pidstat -w reports two columns: cswch/s, voluntary context switches (the process gives up the CPU, e.g. to wait for I/O), and nvcswch/s, non-voluntary ones (the process is preempted, e.g. when its time slice runs out).
Before continuing with the analysis, two concepts are worth distinguishing: CPU-intensive processes and IO-intensive processes. A CPU-intensive process never has enough time slice, so its CPU usage is bound to rise and there is no need to look at context switches. An IO-intensive process is scheduled frequently but runs only briefly each time, so nothing looks abnormal in CPU utilization, and you have to look at context switches instead. In general, IO-intensive processes (kernel threads or user processes) spend their time reading and writing disks or handling data arriving from the network, but a process with no IO interaction at all can still behave like an IO-intensive one, for example:
- Create a kernel thread that goes to sleep as soon as it starts, plus a 10 ms timer that wakes the kernel thread on every expiry.
- Create a user process that enters the kernel through a system call and sleeps there, plus a 10 ms timer that wakes the process on every expiry.
Of course, nobody is foolish enough to do these things directly, but we may inadvertently do them indirectly, especially the first one, for example:
- Create a work queue plus a 10 ms timer, and call schedule_work on every expiry (a concrete sketch follows below).
A work queue is also serviced by a kernel thread, so the result is that this kernel thread's context switches become extremely frequent. For a process with frequent context switches, the focus is to confirm whether it is IO-intensive, and whether it should be behaving like an IO-intensive process in the system's current state. For example, if a process that handles network data shows heavy context switching even when no data is arriving, it is quite possible that the process, or a device driver it has opened, is poorly designed.
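To make the pattern concrete, here is a minimal sketch of such a problematic module. This is my own illustration, not the actual driver from this story, and it uses the classic timer API of kernels from this article's era (pre-4.15): a 10 ms timer that calls schedule_work on every expiry, so the events/kworker thread is woken about 100 times per second even though CPU utilization stays near zero.

```c
#include <linux/module.h>
#include <linux/timer.h>
#include <linux/workqueue.h>
#include <linux/jiffies.h>

static struct timer_list poll_timer;

/* The work itself is short-lived, e.g. polling a hardware register,
 * so CPU utilization barely moves while context switches soar. */
static void poll_work_func(struct work_struct *work)
{
    /* ... poll the device ... */
}
static DECLARE_WORK(poll_work, poll_work_func);

static void poll_timer_func(unsigned long data)
{
    schedule_work(&poll_work);                  /* wakes events/kworker */
    mod_timer(&poll_timer, jiffies + HZ / 100); /* re-arm: 10 ms */
}

static int __init poll_init(void)
{
    setup_timer(&poll_timer, poll_timer_func, 0);
    mod_timer(&poll_timer, jiffies + HZ / 100);
    return 0;
}

static void __exit poll_exit(void)
{
    del_timer_sync(&poll_timer);
    cancel_work_sync(&poll_work);
}

module_init(poll_init);
module_exit(poll_exit);
MODULE_LICENSE("GPL");
```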
Two kernel threads, kswapd and events (renamed kworker in newer kernels), can behave like IO-intensive processes under certain circumstances. kswapd manages virtual memory: when physical memory runs short and pages are swapped frequently, kswapd's context switches rise noticeably. events services work queues: when tasks are queued continuously and each task is short, the context switches of events rise noticeably, and in that case you need to work out which driver is responsible.
Back to the output of pidstat: the context switching of events/0 was very high. Since no user process could explain that result, the only possibility was that some driver module (a .ko file) was constantly queuing work. So far I have not found an efficient way to quickly locate which .ko file uses work queues; I could only list the loaded drivers with lsmod and compare them against the previous system version to see what had changed. That finally located a driver which did indeed create a 10 ms timer and call schedule_work on every expiry. After optimizing that driver and re-running pidstat -w 1, the context switches of all processes came down.
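The post does not say what the optimization actually was. One common fix for this pattern, sketched below purely as an assumption (hw_has_data is a hypothetical helper standing in for whatever condition the real driver checks), is to drop the fixed 10 ms timer and let the handler re-queue itself as a delayed work item, backing off when there is nothing to do, or better still switching to interrupt-driven notification where the hardware allows it.

```c
#include <linux/module.h>
#include <linux/workqueue.h>
#include <linux/jiffies.h>

static struct delayed_work poll_dwork;

/* Hypothetical helper: returns true when the device needs servicing. */
static bool hw_has_data(void)
{
    return false;
}

static void poll_dwork_func(struct work_struct *work)
{
    if (hw_has_data())
        /* busy: service the device, then poll again soon */
        schedule_delayed_work(&poll_dwork, msecs_to_jiffies(10));
    else
        /* idle: back off to 100 ms so the worker thread stays asleep */
        schedule_delayed_work(&poll_dwork, msecs_to_jiffies(100));
}

static int __init poll_init(void)
{
    INIT_DELAYED_WORK(&poll_dwork, poll_dwork_func);
    schedule_delayed_work(&poll_dwork, msecs_to_jiffies(100));
    return 0;
}

static void __exit poll_exit(void)
{
    cancel_delayed_work_sync(&poll_dwork);
}

module_init(poll_init);
module_exit(poll_exit);
MODULE_LICENSE("GPL");
```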
IV. Appendix
1. Cross-compiling sysstat
Most boards do not ship with pidstat, so you need to download the sysstat source package and build it yourself. Fortunately, sysstat can be compiled directly with the cross toolchain. Steps (run inside the source directory):
```
$ mkdir output
$ export sa_dir=`pwd`/output/var/log/sa
$ export conf_dir=`pwd`/output/etc/kksysconfig
$ ./configure --prefix=`pwd`/output --host=arm-none-linux-gnueabi --disable-man-group
$ make
$ make install
```
The bin and lib directories under output can then be copied to the board. Note that the toolchain name must be given to configure when cross-compiling: if your toolchain is arm-none-linux-gnueabi-gcc, pass --host=arm-none-linux-gnueabi; if it is arm-eabi-gcc, pass --host=arm-eabi, and adjust according to this rule.