Binding NIC Interrupts to CPU Cores

Last Update: 2018-12-07 Source: Internet

Author: User

Tags: bitmask

In theory, a single process using asynchronous I/O should deliver optimal communication performance. In practice this model often falls short of expectations, and one likely cause is that the network adapter and the application compete for CPU resources. Frequent hardware interrupts consume CPU time, so steering the bulk of them to a specific CPU core can improve performance. Today's servers mostly have multiple CPUs, multiple cores, multiple NICs, and multiple hard disks. If interrupts are spread out and balanced by binding each device's interrupts to a specific CPU core (for example, giving NIC interrupts exclusive use of one core and disk I/O interrupts exclusive use of another), the burden on any single CPU drops considerably and overall processing efficiency improves.

1. What is an interrupt?

The definition of "interrupt" in Chinese textbooks is too rigid. Put simply, every hardware device (such as a hard disk or network card) needs some way to communicate with the CPU, so that the CPU learns promptly when something has happened and can set aside what it is doing to handle the urgent event. When a hardware device actively gets the CPU's attention in this way, it is called a hardware interrupt. It is like being disturbed by QQ while you work: each flash of a QQ avatar is an interruption.

Interrupts are a good way for the CPU and hardware to communicate. The alternative is polling, in which the CPU periodically queries the hardware's status and acts on it. That is like checking QQ every five minutes to see whether anyone is looking for you, which wastes time. Interrupts are initiated by the hardware and are more effective than polling, which is initiated by the CPU. This raises another question: every hardware device generates interrupts, so how are they told apart? When interrupts arrive at the same time, how does the system know which one came from the hard disk and which from the NIC? The answer is simple. Just as every QQ number is different, the system assigns an IRQ number to each hardware device, and this unique number identifies the source of the interrupt.

In a computer, an interrupt is an electrical signal generated by hardware and sent directly to the interrupt controller, which in turn sends a signal to the CPU. When the CPU detects the signal, it suspends its current work and services the interrupt. The processor then notifies the operating system that an interrupt has occurred, so that the operating system can handle it as appropriate. There are two common interrupt controllers: the 8259A programmable interrupt controller and the Advanced Programmable Interrupt Controller (APIC). The traditional 8259A is suitable only for a single CPU. Because today's machines are multi-CPU, multi-core SMP systems, Intel introduced the APIC, which can deliver interrupts to each CPU in the system and so makes better use of the SMP architecture and its parallelism.

Hardware support for the APIC is not enough on its own; the Linux kernel must also be able to take advantage of these hardware features. Only kernels from version 2.4 onward support assigning individual hardware interrupt requests (IRQs) to specific CPU cores. This binding technique is called SMP IRQ affinity. For more information, see the documentation shipped with the Linux kernel source code: linux-2.6.31.8/Documentation/IRQ-affinity.txt.

2. How is it used?

Let's start with two basic commands:

cat /proc/interrupts shows the system's interrupt counts. NIC interrupts are usually all assigned to CPU0.
cat /proc/cpuinfo shows CPU information, such as the number of CPUs and cores.

Running the first command shows how the system's interrupts are distributed across the CPUs. Clearly, CPU0 handles far more of them:

# cat /proc/interrupts

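(Output of /proc/interrupts; the original column alignment is not recoverable. The surviving fields are listed below.)
Columns: CPU0, CPU1
Rows: 12, 14, 50, 58, 90, 233, NMI, LOC, ERR, MIS
Counts, in their original order: 918926335, 8248017, 194, 31673, 1070374, 5077, 918809969, 2032, 918809894
Interrupt types: IO-APIC-edge, IO-APIC-level, PCI-MSI
Devices: timer, i8042, rtc, acpi, i8042, ide0, ohci_hcd:usb2, sata_nv, eth0, ehci_hcd:usb1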
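
To compare per-CPU totals without reading the listing by eye, the /proc/interrupts format can be parsed with a few lines of Python. This is only a minimal sketch: the function name parse_interrupts and the sample text are illustrative assumptions, not output from the system described here.

```python
def parse_interrupts(text):
    """Parse /proc/interrupts-style text.

    Returns (cpu_names, counts), where counts maps each row label
    (an IRQ number or a name such as NMI) to a list of per-CPU counts.
    """
    lines = [line for line in text.splitlines() if line.strip()]
    cpus = lines[0].split()  # header row: CPU0 CPU1 ...
    counts = {}
    for line in lines[1:]:
        # Split on the first colon only; device names may contain colons.
        label, _, rest = line.partition(":")
        per_cpu = []
        for field in rest.split()[:len(cpus)]:
            if not field.isdigit():
                break  # reached the interrupt type / device name
            per_cpu.append(int(field))
        counts[label.strip()] = per_cpu
    return cpus, counts


# Hypothetical sample in the same layout as /proc/interrupts.
SAMPLE = """\
           CPU0       CPU1
 14:       1200          0   IO-APIC-edge  ide0
 90:       5000         30   PCI-MSI       eth0
ERR:          0
"""

cpus, counts = parse_interrupts(SAMPLE)
totals = [
    sum(row[i] for row in counts.values() if i < len(row))
    for i in range(len(cpus))
]
print(cpus, counts["90"], totals)
```

On a real machine the same function could be fed the contents of /proc/interrupts, read with open("/proc/interrupts").read().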

To keep CPU0 from being overloaded, how can some interrupts be moved to CPU1? In particular, how can the eth0 NIC's interrupt be switched to CPU1? First, check the SMP affinity of IRQ 90 (the IRQ number of the eth0 NIC) to see how the interrupt is currently allocated across CPUs (ffffffff means it is allocated to all available CPUs):

# cat /proc/irq/90/smp_affinity

7fffffff,ffffffff
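
The mask shown above can be decoded into a list of CPU numbers. The sketch below assumes the usual layout, in which the kernel prints the mask as comma-separated groups of eight hexadecimal digits with the most significant group first; the function name cpus_in_mask is an illustrative choice.

```python
def cpus_in_mask(mask_text):
    """Return the CPU indices whose bits are set in an smp_affinity string."""
    # Assumption: every group after the first is exactly 8 hex digits,
    # so dropping the commas yields one large hexadecimal number.
    value = int(mask_text.strip().replace(",", ""), 16)
    return [cpu for cpu in range(value.bit_length()) if (value >> cpu) & 1]


print(cpus_in_mask("2"))                       # single CPU
print(cpus_in_mask("5"))                       # two CPUs
print(len(cpus_in_mask("7fffffff,ffffffff")))  # number of CPUs allowed
```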

Before going further, stop the irqbalance service, which adjusts IRQ assignments automatically; otherwise it will overwrite any manual bindings. Then, to bind IRQ 90 to the second CPU (CPU1), run the following commands:

# /etc/init.d/irqbalance stop

# echo "2" > /proc/irq/90/smp_affinity
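
The same write can be scripted. The following is a hedged sketch rather than a recommended tool: set_irq_affinity and its proc_root parameter are hypothetical names introduced here, and the demonstration writes into a temporary directory that stands in for /proc so it can be tried without root privileges. On a real system the write needs root, and irqbalance should already be stopped.

```python
import os
import tempfile


def set_irq_affinity(irq, mask_hex, proc_root="/proc"):
    """Write a hexadecimal CPU mask to <proc_root>/irq/<irq>/smp_affinity."""
    path = os.path.join(proc_root, "irq", str(irq), "smp_affinity")
    with open(path, "w") as f:
        f.write(mask_hex + "\n")
    return path


# Dry run: a temporary directory plays the role of /proc.
fake_proc = tempfile.mkdtemp()
os.makedirs(os.path.join(fake_proc, "irq", "90"))
written = set_irq_affinity(90, "2", proc_root=fake_proc)
with open(written) as f:
    stored = f.read().strip()
print(written, stored)
```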

Where does the "2" in echo "2" > /proc/irq/90/smp_affinity come from? The value is a hexadecimal bitmask in which each bit stands for one CPU; 2 is 00000010 in binary. If 00000001 represents CPU0, then 00000010 represents CPU1, so the command binds IRQ 90 to 00000010 (CPU1). Each CPU can therefore be expressed in binary or hexadecimal:

        Binary     Hex
CPU 0   00000001   1
CPU 1   00000010   2
CPU 2   00000100   4
CPU 3   00001000   8

To bind the IRQ to CPU2 (00000100 = 4), run the following command:

# echo "4" > /proc/irq/90/smp_affinity

To spread the IRQ across CPU0 and CPU2 at the same time, that is, 00000001 + 00000100 = 00000101 = 5, run the following command:

# echo "5" > /proc/irq/90/smp_affinity
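
The mask arithmetic can be checked with a short function that sets one bit per CPU and prints the result in hexadecimal. The name affinity_mask is an illustrative assumption; the values reproduce the 1, 2, 4, 8, and 5 used in the text.

```python
def affinity_mask(cpus):
    """Build the hexadecimal smp_affinity value for a collection of CPU indices."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu  # bit N stands for CPU N
    return format(mask, "x")


for cpu_set in ([0], [1], [2], [3], [0, 2]):
    mask = affinity_mask(cpu_set)
    print(cpu_set, format(int(mask, 16), "08b"), mask)
```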

There is one limitation: the IO-APIC has two operating modes, logical and physical. In logical mode, the IO-APIC can distribute the same interrupt to up to eight CPU cores (the limit comes from the bitmask register, which is only 8 bits wide). In physical mode, a given interrupt cannot be distributed to several CPU cores at once; the eth0 interrupt, for example, cannot be handled by both CPU0 and CPU1. In that case you can only assign eth0 to CPU0 and eth1 to CPU1. Unlike in logical mode, the eth0 interrupt cannot be processed by multiple CPU cores.

After a while, check /proc/interrupts again: the count for 90: eth0 on CPU1 has risen by 145. Printing /proc/interrupts repeatedly shows that the eth0 count on CPU0 stays the same while the count on CPU1 keeps growing, which is exactly what we want:

# cat /proc/interrupts

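(Output of /proc/interrupts; the original column alignment is not recoverable. The surviving fields are listed below.)
Columns: CPU0, CPU1
Rows: 12, 14, 50, 58, 90, 233, NMI, LOC, ERR, MIS
Counts, in their original order: 922506515, 8280147, 194, 31907, 1073399, 5093, 922389696, 145, 2043, 922389621
Interrupt types: IO-APIC-edge, IO-APIC-level, PCI-MSI
Devices: timer, i8042, rtc, acpi, i8042, ide0, ohci_hcd:usb2, sata_nv, eth0, ehci_hcd:usb1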
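
The check described above, where the count on CPU0 stays flat while the count on CPU1 grows, amounts to subtracting two snapshots of the same IRQ row. A minimal sketch follows; the function name interrupt_deltas and the CPU0 figure are illustrative assumptions, while the increase of 145 on CPU1 is the one reported in the text.

```python
def interrupt_deltas(before, after):
    """Per-CPU increase in interrupt counts between two snapshots of one IRQ."""
    return [later - earlier for earlier, later in zip(before, after)]


# eth0 (IRQ 90) counts as [CPU0, CPU1], taken after the binding is in place.
# The CPU0 value is illustrative; only the CPU1 increase comes from the text.
before = [1070374, 0]
after = [1070374, 145]

deltas = interrupt_deltas(before, after)
active_cpus = [cpu for cpu, delta in enumerate(deltas) if delta > 0]
print(deltas, active_cpus)
```

If the binding took effect, active_cpus should contain only the CPU named in the mask.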

3. What is it useful for?

When the network load is very heavy, as on file servers and high-traffic web servers, binding the IRQs of different NICs evenly across different CPU cores reduces the burden on any single CPU and improves the combined interrupt-handling capacity of the CPUs and cores. For applications such as database servers, binding the disk controller to one CPU core and the NIC to another improves database response time and overall performance. Balancing IRQs sensibly according to your production environment and application characteristics helps improve the overall throughput and performance of the system.
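
One simple way to bind the IRQs of several NICs evenly is round-robin assignment, one core per IRQ. This is a sketch of that idea only: spread_irqs is a hypothetical helper, IRQ numbers 91 and 92 are made up for illustration, and a real deployment would also weigh the traffic carried by each NIC.

```python
def spread_irqs(irqs, cpu_count):
    """Assign each IRQ to one CPU in round-robin order.

    Returns a dict mapping IRQ number to the hexadecimal smp_affinity mask.
    """
    return {
        irq: format(1 << (index % cpu_count), "x")
        for index, irq in enumerate(irqs)
    }


# IRQ 90 is eth0 in the text; 91 and 92 are hypothetical additional NICs.
plan = spread_irqs([90, 91, 92], cpu_count=2)
for irq, mask in plan.items():
    print(f"echo {mask} > /proc/irq/{irq}/smp_affinity")
```

The loop only prints the commands, so the plan can be reviewed before anything is written to /proc.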

This article is an English version of an article originally written in Chinese on aliyun.com.
