Linux kernel protocol stack: caching the route on the socket during socket lookup

Source: Internet
Author: User

Which is faster: looking up the routing table, or looking up the socket hash table? That is not the real question. The real question is how to use the two together effectively, so that they cooperate rather than compete.
A packet destined for the local host goes through two lookups (ignoring conntrack for now): a route lookup at the IP layer and a socket lookup at the transport layer. How can the two be merged?
The Linux kernel protocol stack takes the following approach: it adds a dst field to the socket to cache the route. An incoming skb looks up its socket before it looks up a route, and the dst cached on the socket is attached to the skb. When the route lookup is reached, the skb already carries a dst, so the lookup is skipped.
When is the socket's dst field set? When the first skb belonging to that socket arrives. Even though that first skb finds the socket, the socket's dst is still NULL at that point, so the routing table is searched as usual. If a route is found, it is stored in the socket's dst field.
This feature is called ip_early_demux in Linux. The kernel documentation describes it as follows:
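The sequence described above can be sketched as a toy model. This is only an illustration under stated assumptions, not kernel code: the names RouteTable, Socket, rx_dst and receive are hypothetical (rx_dst is loosely modeled on the socket's cached dst), and the route table is a simple longest-prefix match over a list.

```python
import ipaddress


class RouteTable:
    """Toy longest-prefix-match table that counts how often it is searched."""

    def __init__(self, routes):
        # routes: list of (prefix string, result label) pairs
        self.routes = [(ipaddress.ip_network(p), label) for p, label in routes]
        self.lookups = 0

    def lookup(self, dst_ip):
        self.lookups += 1
        addr = ipaddress.ip_address(dst_ip)
        matches = [(net.prefixlen, label) for net, label in self.routes if addr in net]
        return max(matches)[1] if matches else None


class Socket:
    def __init__(self):
        self.rx_dst = None  # cached route; empty until the first packet arrives


def receive(four_tuple, sockets, table):
    """four_tuple: (saddr, sport, daddr, dport). Returns the route result."""
    sk = sockets.get(four_tuple)          # early demux: socket lookup comes first
    if sk is not None and sk.rx_dst is not None:
        return sk.rx_dst                  # cache hit: the route lookup is skipped
    dst = table.lookup(four_tuple[2])     # first packet, or no local socket
    if sk is not None:
        sk.rx_dst = dst                   # remember the route on the socket
    return dst


table = RouteTable([("0.0.0.0/0", "forward"), ("192.0.2.1/32", "local")])
conn = ("198.51.100.7", 40000, "192.0.2.1", 80)
sockets = {conn: Socket()}

for _ in range(3):
    receive(conn, sockets, table)
print(table.lookups)  # 1: only the first packet searched the route table
```

A packet that matches no socket still pays for the failed socket lookup and then searches the route table every time, which is the extra cost for forwarded traffic.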
ip_early_demux - BOOLEAN
    Optimize input packet processing down to one demux for
    certain kinds of local sockets. Currently we only do this
    for established TCP sockets.
    It may add an additional cost for pure routing workloads that
    reduces overall throughput, in such case you should disable it.
    Default: 1
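As a rough sketch, the setting can be inspected and changed through procfs. The path below is the usual procfs location of the net.ipv4.ip_early_demux sysctl; the helper names read_early_demux and write_early_demux are made up for this example, and writing the real file requires root privileges.

```python
from pathlib import Path

# Usual procfs location of the net.ipv4.ip_early_demux sysctl.
SYSCTL_PATH = "/proc/sys/net/ipv4/ip_early_demux"


def read_early_demux(path=SYSCTL_PATH):
    """Return True if enabled, False if disabled, None if the file is absent."""
    p = Path(path)
    if not p.exists():
        return None
    return p.read_text().strip() != "0"


def write_early_demux(enabled, path=SYSCTL_PATH):
    """Write 1 or 0 to the sysctl file; needs root on a real system."""
    Path(path).write_text("1\n" if enabled else "0\n")
```

Calling read_early_demux() with no arguments reads the live setting. The path parameter exists only so that the helpers can be tried against an ordinary file.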
For forwarded traffic, this feature inevitably costs performance, because every forwarded packet pays for a socket lookup that fails. I do not want to dwell on that obvious problem. I want to discuss two points:
1. Hierarchical cache logic
Route lookup is a "best match", many-to-one process: many skbs map to the same route entry, so an skb and a route entry do not correspond one-to-one. A socket therefore cannot be cached in a route entry. A route entry can be cached in a socket, however, because a socket matches the flow of its skbs exactly (I am not talking about TCP listening sockets). Similarly, a socket could be cached in a route cache entry, because a route cache entry is also an exact, one-to-one match for an skb.
The Linux kernel has since removed the route cache, but that does not matter. One rule is enough: an exact-match entry can cache the result of a looser match, but not the other way round. Turn to conntrack and the same rule applies. I once cached the route entry in the conntrack entry, which is sound by this logic. Likewise, the socket can be cached in conntrack, and iptables already has a match and a target related to this.
2. Automatic or manual
Since Linux exposes ip_early_demux as a configuration parameter, the question is when to enable it and when to disable it. This is especially hard to answer when you do not know how many packets are delivered locally and how many are forwarded. In that case, do you trust the administrator's 0-or-1 setting, or do you let the system adapt dynamically?
How the statistics are collected then becomes important. As a typical rule, if more than 60% of packets are delivered locally, enable the feature; otherwise disable it. Having ip_early_demux as a global parameter is not a bad thing, because without it another problem arises: how to decide, per packet, whether early demux is needed. For border devices that do not operate at layer 7, traffic divides into management traffic and data traffic. The former terminates locally, while the latter is only forwarded by the local machine. If packets could be classified into one plane or the other efficiently and in advance, two separate ip_early_demux settings would work better. For out-of-band management, Linux network namespaces can do this job well.
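The caching rule can be illustrated with a conntrack-like flow table. This is a hypothetical sketch, not how stock conntrack behaves: FlowEntry, classify and the two lookup callbacks are invented for the example. An entry keyed by the exact 5-tuple can hold both the route and the socket, because every packet of that flow would resolve to the same results.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FlowEntry:
    """Hypothetical conntrack-like entry, one per exact 5-tuple."""
    route: Optional[str] = None
    socket: Optional[str] = None


def classify(five_tuple, flows, lookup_route, lookup_socket):
    """Return the flow entry, running the looser lookups only on a miss."""
    entry = flows.get(five_tuple)
    if entry is None:
        # First packet of the flow: do both lookups once and cache the results
        # on the exact-match entry.
        entry = FlowEntry(route=lookup_route(five_tuple),
                          socket=lookup_socket(five_tuple))
        flows[five_tuple] = entry
    return entry
```

The reverse direction does not work: a single route entry such as a default route is shared by many flows, so it has no single socket to remember.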
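One possible way to gather such statistics is sketched below, assuming the Ip: counters InDelivers and ForwDatagrams from /proc/net/snmp are an acceptable proxy for locally delivered and forwarded packets. The function names are hypothetical, and the 60% threshold is the figure from the text, not a kernel default.

```python
def parse_ip_counters(snmp_text):
    """Return the 'Ip:' counters from /proc/net/snmp-style text as a dict."""
    rows = [line.split() for line in snmp_text.splitlines()
            if line.startswith("Ip:")]
    header, values = rows[0], rows[1]   # a names row followed by a values row
    return {name: int(value) for name, value in zip(header[1:], values[1:])}


def want_early_demux(before, after, threshold=0.6):
    """Decide from two counter snapshots whether early demux looks worthwhile.

    Returns True or False, or None if no packets were seen in the interval.
    """
    local = after["InDelivers"] - before["InDelivers"]
    forwarded = after["ForwDatagrams"] - before["ForwDatagrams"]
    total = local + forwarded
    if total == 0:
        return None                     # no traffic: leave the setting alone
    return local / total > threshold
```

Taking two snapshots some seconds apart and comparing them measures the recent traffic mix instead of the totals since boot. A real controller would probably also want some hysteresis so that the setting does not flap around the threshold.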
