Using multi-zone nf_conntrack to cache routes and sockets for high-performance packet processing

Source: Internet
Author: User

A few days ago I finished an extension to the underlying nf_conntrack infrastructure and then wrote test code that caches the following in a conntrack extension:
1. The routing results for both directions of a flow;
2. If the flow is destined for the local machine, the socket associated with the flow.

What does this mean?
It means that almost every lookup on the packet path can be folded into the single conntrack lookup:
1. Routes can be stored in conntrack;
2. The socket can be stored in conntrack;
3. The SA pointer for IPsec traffic can be stored in conntrack;
4. A description of the flow can be stored in conntrack;
...
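A minimal user-space sketch of the idea, in Python rather than kernel C. Every name here (ToyStack, ConnEntry, the fib and sockets dictionaries) is hypothetical and is not the module described in this article. The assumption is that a flow is hashed under both its original and reply tuples, and that the entry carries one cached route per direction plus an optional cached socket.

```python
from dataclasses import dataclass, field

ORIG, REPLY = 0, 1  # direction of a packet relative to the first packet of its flow


def reverse(tup):
    """Return the reply-direction tuple for (src, dst, proto, sport, dport)."""
    src, dst, proto, sport, dport = tup
    return (dst, src, proto, dport, sport)


@dataclass
class ConnEntry:
    """Toy conntrack entry; the two fields model data kept in an extension."""
    routes: list = field(default_factory=lambda: [None, None])  # one per direction
    sock: object = None  # cached local socket, if the flow terminates locally


class ToyStack:
    def __init__(self, fib, sockets):
        self.fib = fib          # dst address -> next hop (stand-in for the routing table)
        self.sockets = sockets  # (dst, proto, dport) -> socket name (stand-in for socket lookup)
        self.table = {}         # tuple -> (entry, direction); both directions are hashed
        self.route_lookups = 0
        self.socket_lookups = 0

    def _conntrack(self, tup):
        hit = self.table.get(tup)
        if hit is None:  # first packet of a new flow: create the entry
            entry = ConnEntry()
            self.table[tup] = (entry, ORIG)
            self.table[reverse(tup)] = (entry, REPLY)
            hit = (entry, ORIG)
        return hit

    def receive(self, tup):
        """Process one packet and return (next_hop, socket)."""
        entry, direction = self._conntrack(tup)  # the only lookup on the fast path
        if entry.routes[direction] is None:      # slow path, taken once per direction
            self.route_lookups += 1
            entry.routes[direction] = self.fib.get(tup[1], "default")
        is_local = entry.routes[direction] == "local"
        if is_local and entry.sock is None:      # slow path, taken once per local flow
            self.socket_lookups += 1
            entry.sock = self.sockets.get((tup[1], tup[2], tup[4]))
        return entry.routes[direction], (entry.sock if is_local else None)


stack = ToyStack(fib={"10.0.0.1": "local", "192.168.1.9": "eth1"},
                 sockets={("10.0.0.1", "tcp", 80): "httpd"})
pkt = ("192.168.1.9", "10.0.0.1", "tcp", 40000, 80)
for _ in range(3):
    inbound = stack.receive(pkt)
outbound = stack.receive(reverse(pkt))
print(inbound, outbound, stack.route_lookups, stack.socket_lookups)
```

In this toy run, four packets cost two route lookups (one per direction) and one socket lookup; every later packet of the flow is served from the entry alone.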

This is entirely practical. Although the technique can achieve all of the above and more, this article does not discuss software engineering, nor does it dwell on so-called modularity or reusability; it is about just one more option. An extra option may turn out better or worse in practice, but having it is at least better than having no choice at all.
The burden now seems to fall entirely on the conntrack lookup. After all, I am not claiming to have implemented cached fast forwarding in Linux with a high-speed ASIC board in hand; conntrack is, after all, software. But it is precisely because it is software that it is more fun!

Here I want to remind people who are obsessed with forwarding optimization that a Linux box, or any intermediate device at layer 3 or above, must serve one or more purposes besides forwarding. Otherwise it would be better to do without the router and connect things directly with a cable. (Of course, a router is more flexible than a hard-wired connection, but there are also layer-3 switches... and in terms of standalone efficiency, lower-layer devices always beat higher-layer ones.) If you believe the answer is to cache the route entry so that a packet can be forwarded at PREROUTING, or even before it enters netif_receive_skb, I can only say: well done, very skilled, proficient in the Linux protocol stack... but not necessarily in networking. Are all the other Netfilter hooks just for show? How does auditing work? How does intrusion detection sample traffic? How does traffic control work? How does packet classification work? ... Why not simply switch the machine off and let the packets pass straight through?
After a frantic spell spent on the protocol stack itself, I concluded that optimizing the conntrack lookup is the interesting and correct approach. I did not try to bypass conntrack, because many checks depend on it. Having said all this, I can report that I have already done it, and the results are good. My approach is to split the conntrack table into several tables, for example one per L4 protocol or, more generally, to assign a conntrack table to each skb in the raw table. To that end I modified the kernel: I added a connmarkidx field to the skb and built in 16 conntrack tables, with the skb's connmarkidx indexing the specific table to use.
However, as so often happens, I found that the work was unnecessary: the Linux kernel already supports conntrack zones, so there is no need to modify the kernel to add a field to the skb; all it takes is a conntrack extension. I went back over my thinking. I had in fact thought of adding a conntrack extension to store some data, and when that did not meet my needs I even thought of designing a general-purpose extension to the conntrack extend mechanism for storing arbitrary data, and I built exactly that. So why did it not occur to me to use the extend mechanism to turn one conntrack table into multiple conntrack tables? Reflecting on it, the problem lay in my own thinking. The irony is that I already knew what to do and still failed to do it. I trusted my own assumptions too much and conntrack's own mechanism too little: I assumed that multiple conntrack tables were intrinsic to conntrack and could not be built with the extend mechanism, which I took to be suitable only for storing things unrelated to conntrack itself...
See? That is all there was to it. How ironic.
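A toy illustration of what a zone buys you, assuming only that the zone ID becomes part of the lookup key. The ZonedTable class and its methods are hypothetical, and how the kernel actually lays out its hash table is not modelled here.

```python
from collections import Counter


class ZonedTable:
    """Toy conntrack table keyed by (zone, tuple) instead of the tuple alone."""

    def __init__(self):
        self.entries = {}

    def find_or_create(self, zone, tup):
        key = (zone, tup)
        entry = self.entries.get(key)
        if entry is None:
            entry = {"zone": zone, "tuple": tup, "packets": 0}
            self.entries[key] = entry
        entry["packets"] += 1
        return entry

    def per_zone(self):
        """Number of tracked flows in each zone."""
        return Counter(zone for zone, _ in self.entries)


table = ZonedTable()
flow = ("192.168.1.9", "10.0.0.1", "tcp", 40000, 80)
a = table.find_or_create(1, flow)
b = table.find_or_create(2, flow)  # same tuple, different zone: a separate entry
a_again = table.find_or_create(1, flow)
print(a is a_again, a is b, dict(table.per_zone()))
```

The same 5-tuple seen in two zones yields two independent entries, which is the effect I was after with the 16 built-in tables and the connmarkidx field.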
Now we can build a fast forwarding mechanism from multi-zone conntrack and my nf_conntrack extension. This mechanism keeps all the other routine logic of the protocol stack, such as checksum verification, TTL decrement, NAT, and intrusion detection. The method is as follows:
1. Use the iptables CT target in the raw table to bind a conntrack zone ID to each skb that meets certain matches;
2. Load the module built from the last example in "Cache private data in Linux Connection Tracking (nf_conntrack) to save every search" (it may need some improvement);
3. Debug the panics.
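Step 1 can be pictured as a small classifier that runs before connection tracking. The sketch below is a Python model, not iptables syntax. The rule list, the zone numbers and the first-match-wins behaviour are all assumptions made for illustration, following the one-zone-per-L4-protocol example mentioned earlier.

```python
DEFAULT_ZONE = 0  # assumed zone for packets that match no rule

# Hypothetical rule set: (match predicate, zone ID), evaluated in order.
RULES = [
    (lambda pkt: pkt["proto"] == "tcp", 1),
    (lambda pkt: pkt["proto"] == "udp", 2),
]


def assign_zone(pkt, rules=RULES):
    """Return the zone ID of the first rule matching pkt, or DEFAULT_ZONE."""
    for match, zone in rules:
        if match(pkt):
            return zone
    return DEFAULT_ZONE


packets = [
    {"proto": "tcp", "dport": 80},
    {"proto": "udp", "dport": 53},
    {"proto": "icmp"},
]
print([assign_zone(p) for p in packets])
```

In a real setup the rules live in the raw table and the CT target carries the zone ID; the connection-tracking lookup that follows then only has to consider entries in that zone.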

Since its birth, nf_conntrack has been criticized by many people for being inefficient and for dropping packets once the quota of connections allocated to it is exhausted... Neither of these is a real problem. First, consider efficiency. Since the first generation, conntrack's efficiency has been optimized continuously, right up to the latest kernel versions: the single spin lock was replaced by RCU, the mutual-exclusion code changed, the lock granularity changed, and, most significantly, the hardware changed. Thanks to Moore's Law, improved efficiency comes almost for free. It is unwise to dismiss a mechanism on grounds of efficiency alone; what matters is what it can bring you.

As for the limit on the number of connections, look at it from another angle. Is it not a defence against DDoS or SYN floods? A side effect can turn out to be beneficial. You cannot one-sidedly regard conntrack's connection quota as a restriction on your system: the default value is calculated from your system's memory (calling the calculation "elaborate" may be flattering, but I trust it does the job!). In fact, the number of connections is a hard limit that exists whether you enforce it or not. Even without the conntrack module loaded, can your system handle an infinite number of connections? When your conntrack quota is calculated as 65535 by default, it is really telling you that this hard limit exists. Even if there were no limit, or you raised it to 6553500, once the number of connections exceeds 65535, packets forced into the protocol stack may still be discarded by other logic because they cannot be processed.
This does not mean it will necessarily happen the moment you try, but remember that protocol stack optimization is an extremely complex engineering effort that cannot be perfected by tuning a single parameter.
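The quota argument can be made concrete with a toy model. The LimitedTable class is hypothetical; timeouts and any eviction of old entries are deliberately not modelled, so this shows only the basic behaviour under discussion: once the table is full, packets that would create a new flow are dropped while established flows keep passing.

```python
class LimitedTable:
    """Toy connection table with a hard cap on the number of tracked flows."""

    def __init__(self, max_entries):
        self.max_entries = max_entries
        self.flows = set()
        self.dropped = 0

    def accept(self, flow):
        """Return True if the packet passes, False if it is dropped."""
        if flow in self.flows:  # established flow: always passes
            return True
        if len(self.flows) >= self.max_entries:  # quota exhausted: drop the new flow
            self.dropped += 1
            return False
        self.flows.add(flow)
        return True


table = LimitedTable(max_entries=2)
results = [table.accept(f) for f in ("flow-a", "flow-b", "flow-c", "flow-a")]
print(results, table.dropped)
```

Seen this way, the quota mostly bounds how much state a flood of new connections can force the box to hold, which is the anti-flood side effect described above.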
