The Linux kernel 3.11 socket busy poll mechanism: avoiding sleep and context switches
The Linux network protocol stack is largely self-contained. Its upper and lower interfaces face user space and the devices respectively; you can think of them as the northbound and southbound interfaces. The northbound side is the socket interface, and the southbound side is the qdisc interface (think of it as a queue sitting above the netdev; on the receive side, the NAPI poll queue plays the same role). Both socket and qdisc are managed as queues, which means the three parts are independent of each other. A socket sees only its read and write queues, not the protocol stack itself. When a socket reads data, it simply takes the data from the queue. It does not know who put the data there, whether that was the protocol stack or something else, and it has no need to check.
Sockets isolate user processes from the protocol stack, and the RX/TX queues isolate the protocol stack from the device drivers.
This isolation keeps programming and design simple, but it works against performance.
The goal of the RPS design in Linux is to have the same CPU both receive a packet in the protocol stack (in the softirq kernel thread, or in softirq processing that runs on the back of any other context) and run the process that consumes that packet. (Strictly speaking, steering packets to the CPU where the consuming process runs is the job of RFS, which is built on top of RPS.) I have said before that this design cuts both ways. If the only aim is better cache utilization, the design is right. But consider what else happens: when a CPU, at the end of net RX softirq processing, pushes an skb onto a socket queue and tries to wake the waiting process, what should it do next? Ideally it should go back to the device and poll the next skb. RPS is not designed that way. It wants the CPU to carry on and run the user-space process, which costs a process switch and a kernel/user mode switch. Cache utilization for the application's data improves, but cache utilization for protocol stack processing drops. Moreover, whether the CPU cache is flushed on a process switch or a user/kernel mode switch depends on the architecture, so not every architecture will see a benefit.
Further tests are required.
I think the best arrangement is for the user process and the kernel's net RX softirq to run on different CPU cores that share an L2 or L3 cache.
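As a rough illustration of that arrangement, here is a minimal C sketch, not anything taken from the kernel. It assumes the usual sysfs layout: `shared_cpu_list` under `/sys/devices/system/cpu/` reports which CPUs share a given cache, and `rps_cpus` under `/sys/class/net/<dev>/queues/rx-<n>/` takes a hex CPU bitmap. The helper names are made up for this post, and the mask builder only handles CPUs 0 to 31.

```c
#include <stdio.h>
#include <string.h>

/* Read which CPUs share cache `index` with `cpu`, e.g. "0-3".
 * On many machines index2 is L2 and index3 is L3, but the `level` file
 * next to shared_cpu_list is the authoritative answer (assumption here).
 * Returns 0 on success, -1 if the file cannot be read. */
int read_shared_cpu_list(int cpu, int index, char *buf, size_t len)
{
    char path[128];
    int n = snprintf(path, sizeof(path),
                     "/sys/devices/system/cpu/cpu%d/cache/index%d/shared_cpu_list",
                     cpu, index);
    if (n < 0 || (size_t)n >= sizeof(path) || buf == NULL || len == 0)
        return -1;

    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;
    char *line = fgets(buf, (int)len, f);
    fclose(f);
    if (line == NULL)
        return -1;
    buf[strcspn(buf, "\n")] = '\0';   /* drop the trailing newline */
    return 0;
}

/* Build the hex bitmap for a single CPU in the range 0..31.
 * Returns 0 on success, -1 on bad arguments or a too-small buffer. */
int rps_mask_for_cpu(int cpu, char *buf, size_t len)
{
    if (cpu < 0 || cpu > 31 || buf == NULL)
        return -1;
    int n = snprintf(buf, len, "%x", 1u << cpu);
    return (n > 0 && (size_t)n < len) ? 0 : -1;
}

/* Write the mask to the RX queue's rps_cpus file (normally needs root).
 * Returns 0 on success, -1 on any error. */
int write_rps_cpus(const char *dev, int queue, const char *mask)
{
    char path[256];
    int n = snprintf(path, sizeof(path),
                     "/sys/class/net/%s/queues/rx-%d/rps_cpus", dev, queue);
    if (n < 0 || (size_t)n >= sizeof(path))
        return -1;

    FILE *f = fopen(path, "w");
    if (f == NULL)
        return -1;
    int ok = fprintf(f, "%s\n", mask) > 0;
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

The idea would be to look up the CPUs that share a cache with the one the application is pinned to, pick a different CPU from that list, and write its mask to `rps_cpus`. Whether this actually helps is exactly the kind of thing that has to be measured.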
...
The Linux kernel has come up with a better solution: break through the three independent parts described above and let the socket reach straight down into the device layer and poll for skbs itself. Note that this is only a poll operation; it does not mean the socket takes over protocol stack processing in general. Direct polling means that when a socket finds no packet in its receive queue, it does not go to sleep and wait for the net RX softirq to queue a packet and wake it up. Instead it asks the device directly: do you have any packets? If so, it takes them and carries them up through the protocol stack itself, without waiting for them to be delivered. This is a "pull" model, in contrast to the earlier "push" model. The difference is that with pull, the entity that fetches the packet is the same entity that receives it, and it acts on its own initiative; with push, the receiver is passive.
This solves the problem that RPS set out to solve but did not solve well. The mechanism is called busy poll.
With RPS, once the softirq has processed a packet the CPU switches to the user process; the softirq is paused, and after the process has handled the packet, execution switches back to the softirq. Busy poll does not work like this. It bypasses the softirq execution context altogether and lets the socket's own execution context actively pull packets and process them. This avoids the cost of a large number of task switches.
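To make the push/pull distinction concrete, here is a toy model in C. It is only a sketch: the `toy_` names and structures are invented for this post and have nothing to do with real kernel types, the "protocol stack" is reduced to moving an integer from one queue to another, and there is a single socket.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_CAP 8

/* A tiny FIFO used for both the device RX ring and the socket queue. */
struct toy_fifo {
    int slot[TOY_CAP];
    size_t head;   /* next slot to read  */
    size_t tail;   /* next slot to write */
};

struct toy_sock {
    struct toy_fifo rxq;   /* the only thing the reader normally sees */
    int sleeps;            /* how often the reader had to block       */
};

static bool fifo_put(struct toy_fifo *q, int v)
{
    if (q->tail - q->head == TOY_CAP)
        return false;                       /* full: drop the packet */
    q->slot[q->tail++ % TOY_CAP] = v;
    return true;
}

static bool fifo_get(struct toy_fifo *q, int *v)
{
    if (q->head == q->tail)
        return false;                       /* empty */
    *v = q->slot[q->head++ % TOY_CAP];
    return true;
}

/* The device receives a packet into its ring. */
bool toy_device_rx(struct toy_fifo *ring, int pkt)
{
    return fifo_put(ring, pkt);
}

/* Stand-in for protocol stack RX: drain the ring into the socket queue. */
static void toy_stack_poll(struct toy_fifo *ring, struct toy_sock *sk)
{
    int pkt;
    while (fifo_get(ring, &pkt))
        fifo_put(&sk->rxq, pkt);
}

/* Push model: on an empty queue the reader sleeps, and a separate
 * context (the softirq) fills the queue and wakes it up. */
bool toy_recv_push(struct toy_fifo *ring, struct toy_sock *sk, int *pkt)
{
    if (sk->rxq.head == sk->rxq.tail) {
        sk->sleeps++;               /* reader blocks here             */
        toy_stack_poll(ring, sk);   /* softirq runs, then wakes it up */
    }
    return fifo_get(&sk->rxq, pkt);
}

/* Pull model: on an empty queue the reader polls the device itself
 * and runs the packets up the stack in its own context. */
bool toy_recv_busy_poll(struct toy_fifo *ring, struct toy_sock *sk, int *pkt)
{
    if (sk->rxq.head == sk->rxq.tail)
        toy_stack_poll(ring, sk);   /* same work, no sleep and no wakeup */
    return fifo_get(&sk->rxq, pkt);
}
```

Both readers end up with the same packet; the only difference the model records is the `sleeps` counter, which stands in for the sleep, wakeup and task switch that the pull path avoids. The real mechanism only spins for a bounded time before falling back to sleeping, which the toy leaves out.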
I do not know whether busy poll can improve forwarding performance; that needs testing. What I described above is only the ideal case. In practice, a socket may pull a packet from the device that belongs to a different socket, or even a packet that is merely being forwarded and is not associated with any socket at all. A packet can be tied to a specific socket only after it has passed through standard routing and layer-4 processing, so trying to establish that association at the device driver layer is futile. In any case, control is in the user's hands. As a rule of thumb: if most of the packets on your device are forwarded packets, do not enable this feature; if your process handles a large volume of packets over a small number of sockets, enable it. This is purely a matter of usage and configuration, and deciding when to enable it and how large a polling budget to set takes some sampling and measurement.
I got up too early this morning and wrote two posts, so I did not go out for a walk. It is almost 7 o'clock now, she and her mother are still asleep, and I am off to work...
Copyright notice: This is an original article by the blogger and may not be reproduced without the blogger's permission.