The Linux kernel has long proven itself on performance, especially the 2.6/3.x kernels. Under heavy I/O, however, and network I/O in particular, interrupt handling can become a problem. We have seen this on high-performance systems with one or more saturated 1 Gbps NICs, and more recently on a virtual machine overloaded by many small concurrent packets (about 10,000 packets/second).
The reason is clear: in the simplest mode, the kernel handles each packet from the network card with a hardware interrupt. As the packet rate grows, the interrupts exceed what a single CPU can process. The single-CPU point is important, and system administrators are often unaware of it. On an ordinary 4-16 core system, overall CPU usage sits at around 6-25% (one saturated core out of 16 or out of 4) and the system looks normal, so an overloaded core is hard to spot. Yet the system runs very slowly and drops packets heavily, with no alarms, no dmesg log entries, and no other obvious indication.
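The arithmetic behind that 6-25% range can be sketched as follows. This is only an illustration, assuming exactly one core is fully consumed by interrupt handling while the others are idle; the function name is made up for the example.

```python
def overall_cpu_percent(total_cores, saturated_cores=1):
    """Overall CPU usage (%) when `saturated_cores` are 100% busy and the rest are idle.

    Assumption for illustration: all other cores are completely idle.
    """
    return 100.0 * saturated_cores / total_cores


for cores in (4, 8, 16):
    print(f"{cores} cores: {overall_cpu_percent(cores):.2f}% overall")
```

A monitoring graph that averages over all cores would show these low figures even though one core has no headroom left.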
But if you use top in multi-CPU mode (run top, then type 1) and look at the %si column (software interrupts), or at the %irq column in mpstat (mpstat -P ALL 1), on some busy systems you will find that interrupt load is significantly high. With further use of mpstat you can see which CPU or which device has the problem.
You need a newer version of mpstat that supports the -I option to list IRQ load. Run the following command:
mpstat -I SUM -P ALL 1
More than 5,000/second is a little busy; 10,000-20,000/second is pretty high.
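Where a suitable mpstat is not available, a similar per-CPU figure can be approximated from two snapshots of /proc/interrupts. The sketch below is not how mpstat itself works internally; the sample text, the function names, and the one-second interval are assumptions for illustration, and only the 5,000/second threshold comes from the text above.

```python
def per_cpu_totals(interrupts_text):
    """Sum every interrupt counter per CPU from /proc/interrupts-style text."""
    lines = interrupts_text.strip().splitlines()
    ncpu = len(lines[0].split())            # header row: CPU0 CPU1 ...
    totals = [0] * ncpu
    for line in lines[1:]:
        _, _, rest = line.partition(":")    # drop the IRQ label
        for cpu, field in enumerate(rest.split()[:ncpu]):
            if not field.isdigit():         # reached the description text
                break
            totals[cpu] += int(field)
    return totals


def busy_cpus(before, after, seconds=1.0, threshold=5000):
    """Return {cpu: interrupts/sec} for CPUs whose rate exceeds the threshold."""
    rates = [(t1 - t0) / seconds
             for t0, t1 in zip(per_cpu_totals(before), per_cpu_totals(after))]
    return {cpu: rate for cpu, rate in enumerate(rates) if rate > threshold}


# Hypothetical snapshots taken one second apart (not real measurements).
SAMPLE_T0 = """\
           CPU0       CPU1
 19:     100000        500   IO-APIC-fasteoi   eth0
 24:        300        200   PCI-MSI-edge      ahci
"""
SAMPLE_T1 = """\
           CPU0       CPU1
 19:     112000        510   IO-APIC-fasteoi   eth0
 24:        320        260   PCI-MSI-edge      ahci
"""

print(busy_cpus(SAMPLE_T0, SAMPLE_T1))  # {0: 12020.0}
```

On a live system the two snapshots would come from reading /proc/interrupts twice with a pause in between.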
Run the following command to see which device or item is causing the load:
mpstat -I CPU -P ALL 1
Follow the column with the high rate to find the interrupt number responsible, for example 15, 19, or 995. You can also limit the output to a single CPU:
mpstat -I CPU -P 3 1    # 3 is the CPU number (mpstat counts from 0, htop counts from 1)
Note the interrupt number, then look at the interrupt table with "cat /proc/interrupts" and find the number that mpstat reported; this tells you which device is using that interrupt. The file also shows how many times each interrupt has fired, which can tell you what caused the overload.
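The lookup can also be scripted. The following is a minimal sketch that parses /proc/interrupts-style text and returns the counts and description for a given interrupt number; the sample table, the device names in it, and the function name are hypothetical.

```python
def irq_table(interrupts_text):
    """Map IRQ label -> (per-CPU counts, description) from /proc/interrupts text."""
    lines = interrupts_text.strip().splitlines()
    ncpu = len(lines[0].split())            # header row: CPU0 CPU1 ...
    table = {}
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        if not sep:                         # skip anything without an "IRQ:" label
            continue
        fields = rest.split()
        counts = []
        for field in fields[:ncpu]:
            if not field.isdigit():         # reached the description text
                break
            counts.append(int(field))
        description = " ".join(fields[len(counts):])
        table[label.strip()] = (counts, description)
    return table


# Hypothetical sample; real output has more rows and columns.
SAMPLE = """\
           CPU0       CPU1
 15:         12          3   IO-APIC-edge      ata_piix
 19:    9876543        210   IO-APIC-fasteoi   eth0
"""

counts, description = irq_table(SAMPLE)["19"]
print(description, counts)  # IO-APIC-fasteoi eth0 [9876543, 210]
```

On a real machine the text would come from open("/proc/interrupts").read(), and the key would be the interrupt number seen in mpstat.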
What do we need to do?
First, confirm that you are running irqbalance, a nice daemon that automatically spreads interrupts across CPUs. Without it, all interrupts may be handled on a single CPU and the system is easily overloaded. irqbalance spreads these interrupts around to reduce the load. For maximum performance, you can balance the interrupts manually, distributing them across sockets and across cores shared by Hyper-Threading. Even with interrupts spread out, though, a single network card can still leave one CPU overloaded. It depends on your network card and driver, but there are usually two effective ways to prevent this.
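Before turning to those two methods, a note on the manual balancing mentioned above: on Linux it is typically done by writing a hexadecimal CPU bitmask to /proc/irq/<number>/smp_affinity, where bit n selects CPU n. The helper below is only a sketch for computing such a mask; its name is made up here, and it assumes fewer than 32 CPUs (larger systems use comma-separated groups, which this does not produce).

```python
def affinity_mask(cpus):
    """Hex bitmask with bit n set for each CPU n, in the form smp_affinity expects.

    Assumption: CPU numbers are below 32, so a single hex group is enough.
    """
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    return format(mask, "x")


print(affinity_mask([0]))     # 1
print(affinity_mask([2, 3]))  # c
```

Writing the result to the smp_affinity file requires root, and a running irqbalance may later change the setting again, so the two approaches are usually not mixed for the same interrupt.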
The first is multiple NIC queues, which some Intel NICs support. If the NIC has 4 queues, four CPU cores can handle different interrupts at the same time, spreading the load. The driver usually sets this up automatically, and you can confirm it with the mpstat command.
The second, and often more important, is a NIC driver option called 'IRQ coalescing' (interrupt coalescing). This powerful option lets the NIC buffer several packets before raising an interrupt, saving the system a significant amount of time and load. For example, if the NIC buffers ten packets per interrupt, the CPU load from interrupts drops by approximately 90%. The feature is usually controlled with the ethtool tool using the '-c/-C' options, but some drivers require the settings to be made when the driver is first loaded. Check your driver's documentation for how to set it. Some NICs, such as the Intel NICs we use, have an automatic mode that tunes itself according to load.
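The 90% figure follows from simple arithmetic, sketched below. The function names are illustrative, and the model assumes every interrupt carries exactly the same number of buffered packets, which real NICs only approximate.

```python
def interrupts_per_second(packets_per_second, packets_per_interrupt):
    """Interrupt rate when the NIC raises one interrupt per batch of packets."""
    return packets_per_second / packets_per_interrupt


def reduction_percent(packets_per_interrupt):
    """Percentage of interrupts avoided compared with one interrupt per packet."""
    return 100.0 * (packets_per_interrupt - 1) / packets_per_interrupt


# 10,000 packets/second, as in the virtual machine case described earlier.
print(interrupts_per_second(10000, 1))   # 10000.0
print(interrupts_per_second(10000, 10))  # 1000.0
print(reduction_percent(10))             # 90.0
```

The reduction applies to the number of interrupts; the per-packet processing work remains, so the actual CPU saving depends on the driver and workload.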
This article was published by Steve Mushero, co-founder and CEO, in April.
Author Profile:
Steve Mushero, Chinanetcloud's founder and chief technology officer
Steve Mushero has 20 years of senior management experience, including at Intermind, and has served as chief architect at Beyond Access Communications and AirReview. He is the author of a book and the inventor of a number of patents.
Chinanetcloud Network Technology (Shanghai) Co., Ltd. reserves the right of final interpretation.
Reprint: Linux network card interrupts cause a single CPU to overload