Linux TC (Traffic Control) Framework: Principle Analysis

Source: Internet
Author: User

My recent work has touched on Linux traffic control. I first learned about TC a few years ago and more or less understood its principles, but I never touched it afterwards, because I dislike the TC command line: it is too cumbersome. The iptables command line is cumbersome too, but it is more intuitive, whereas the TC command line is simply too technical.

Perhaps I simply do not understand the TC framework as well as I understand the Netfilter framework. Yes, maybe. What iptables/netfilter is on one side, tc/TC is on the other.
The Linux kernel has a built-in traffic control framework that can implement rate limiting, traffic shaping, and policy application (drop, NAT, and so on). Can you think of anything else this framework could do? Perhaps not yet, so I will start with a brief overview. The Netfilter framework looks similar to the TC framework, but the two are very different.
Once you have mastered Netfilter, the TC framework is much easier to appreciate, especially if you have run into Netfilter's limitations. Study the TC design with those limitations in mind and you may find that TC compensates for some of Netfilter's shortcomings. Before going into specifics, let me describe the similarities and differences in design that follow from their different goals.
Netfilter first. This framework is designed to filter packets on the kernel's protocol stack path, like checkpoints on a road. Netfilter sets up checkpoints (hook points) at 5 locations on the path along which the stack processes packets, and each packet is inspected as it passes through them. The outcome is one of several verdicts: accept, drop, queue, divert to another path, and so on. The framework only needs to produce a single verdict for each packet; what service a checkpoint provides internally is not something the framework prescribes.
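The "single verdict per packet" idea can be sketched in a few lines of Python. This is only a toy model under my own assumptions: `run_hook`, the rule tuples, and the packet dictionaries are hypothetical names for illustration and are not the kernel's Netfilter API. The hook names are the familiar iptables chain names for the five locations.

```python
# Toy model only: run_hook and the rule format are hypothetical,
# not the kernel's Netfilter interface.
ACCEPT, DROP = "accept", "drop"

# The five checkpoint locations, using the iptables chain names.
HOOKS = ["PREROUTING", "INPUT", "FORWARD", "OUTPUT", "POSTROUTING"]


def run_hook(rules, packet, default=ACCEPT):
    """Return the verdict of the first matching rule, else the default."""
    for matches, verdict in rules:
        if matches(packet):
            return verdict
    return default


# One (initially empty) rule list per hook point.
rules_by_hook = {name: [] for name in HOOKS}
rules_by_hook["INPUT"] = [
    (lambda p: p["proto"] == "tcp" and p["dport"] == 23, DROP),  # block telnet
    (lambda p: p["proto"] == "tcp", ACCEPT),
]

verdict_telnet = run_hook(rules_by_hook["INPUT"], {"proto": "tcp", "dport": 23})
verdict_dns = run_hook(rules_by_hook["INPUT"], {"proto": "udp", "dport": 53})
print(verdict_telnet, verdict_dns)
```

Whatever the rules do internally, the caller only ever sees one verdict per packet, which is the property the comparison with TC relies on.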
Now TC. It is designed to provide a service to packets or data flows, such as rate limiting or shaping. Such a service cannot be expressed as a single Netfilter-style verdict; providing it requires running a series of actions. So how to "plan and organize the execution of these actions" is the key to the design of the TC framework!

In other words, the TC framework focuses on how the actions are run rather than on obtaining a single verdict. Put differently, the Netfilter framework is concerned with what to do, while the TC framework is concerned with how to do it. (I have written a lot of code and articles about Netfilter, so I will not repeat that here ...)
There is a great deal of theory on rate limiting and traffic shaping, with the token bucket among the most commonly used techniques. This article, however, is about the implementation of the Linux TC framework, not the token bucket algorithm, and a short article cannot cover everything from flow control theory to the history of its implementation across operating system versions. What we do know is that queues are the practical choice in most implementations, so the question becomes how the Linux TC framework organizes its queues. Before discussing queue organization in depth, let me compare Netfilter and TC one last time.
If you know the difference between a UNIX character device and a block device, the difference between the Netfilter framework and the TC framework is easier to understand. A Netfilter hook point resembles a pipe-like character device, and the skbs are a one-way stream through it: they flow in at one end and out at the other in the order they entered, each carrying a verdict such as accept or drop. The TC framework is more like a block device, whose contents are stored and accessed randomly; that is, the order in which skbs enter is not necessarily the order in which they leave. That is exactly what traffic shaping needs. In other words, the TC framework must implement a randomly accessible packet buffer in which flow control is performed. As we already know, this is done with queues.
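To make the pipe versus buffer contrast concrete, here is a small Python sketch. It assumes a made-up two-level priority per packet (the names and priorities are hypothetical) and uses a heap as a stand-in for "a buffer that need not release packets in arrival order"; it does not model any real TC queueing discipline.

```python
import heapq
from collections import deque

# Hypothetical arrivals: (name, priority), lower number = more urgent.
arrivals = [("bulk-1", 1), ("voice-1", 0), ("bulk-2", 1), ("voice-2", 0)]

# Pipe-like hook point: packets leave in exactly the order they entered.
pipe = deque()
for name, _prio in arrivals:
    pipe.append(name)
pipe_order = [pipe.popleft() for _ in range(len(arrivals))]

# Buffer-like queue: departure order is decided by the buffer, not by arrival.
# The arrival sequence number keeps ordering stable within one priority.
buffer = []
for seq, (name, prio) in enumerate(arrivals):
    heapq.heappush(buffer, (prio, seq, name))
buffer_order = [heapq.heappop(buffer)[2] for _ in range(len(arrivals))]

print(pipe_order)    # arrival order
print(buffer_order)  # urgent packets first
```

The same four packets go in; only the second structure is free to reorder them, which is what shaping and prioritization depend on.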
Of course, nothing is absolute. A Netfilter hook point can also hold a storage buffer or run a series of actions; typical examples are fragment reassembly and NAT in conntrack. With fragment reassembly at the PREROUTING hook point, a fragment that enters the hook is held there temporarily until all fragments have arrived and reassembly completes. With NAT, the outcome of Netfilter is clearly "running a series of actions" rather than a bare accept. I have also written a number of modules that implement flow control with Netfilter, and conversely the TC framework can implement Netfilter's functions. In short, once you understand the design principles and the nature of these frameworks, using and extending them becomes effortless.
In my view, the TC framework is a superset of a single Netfilter hook point: more flexible to implement and, of course, more complex.

Netfilter's own charm, which TC lacks, lies in how the locations of its hook points are defined.
All right, let me now formally introduce the design of the TC framework.
Much of the material on the Internet that introduces TC says, without exception, that TC is composed of three parts: "queueing discipline, class, and filter". I dare say most of these vague descriptions trace back to one document or one book.

Very few people try to understand the design of the TC framework from another angle. Doing so is more challenging, and I personally enjoy that kind of thing. Before describing TC's queue organization, let me introduce what is called recursive control. Recursive control means that control is applied hierarchically, yet the control method is the same at every level. Anyone familiar with CFS scheduling knows that group scheduling and task scheduling use the same scheduling method, although groups and tasks clearly sit at different levels. I have drawn the following diagram to sketch the situation:




This applies beyond the organization of control logic: Linux's implementation of the UNIX process model also uses tree-shaped recursive logic, in which each level is a two-layer tree. The following shows this model:




As you can see, recursive control is fractal; a three-dimensional diagram would show it better. Every node other than a leaf is itself a small tree, and whether you look at the big tree or a small one, the nature of the control logic or organizational logic is the same.
Recursive control makes it easy to stack logic arbitrarily. We have seen this in protocol stack design as "X over Y", abbreviated XoY: for example PPPoE, IP over UDP (OpenVPN in tun mode), TCP over IP (the native TCP/IP stack) ... For TC, consider the following requirements:
1. Divide the entire bandwidth between TCP and UDP in a 2:3 ratio;
2. Within TCP traffic, assign different priorities according to the source IP address range;
3. Within the same priority queue, divide the bandwidth between HTTP applications and everything else in a 2:8 ratio;
4. ....
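A short Python sketch may help show why requirements 1 and 3 are the same operation applied at different levels. The link speed and the `split` helper are assumptions made purely for illustration, and the sketch further assumes that a single TCP priority queue currently has TCP's whole share to itself.

```python
def split(total_kbit, weights):
    """Divide total_kbit among the named children in proportion to weights."""
    weight_sum = sum(weights.values())
    return {name: total_kbit * w // weight_sum for name, w in weights.items()}


LINK_KBIT = 100_000  # hypothetical 100 Mbit/s link, not from the article

# Requirement 1: TCP and UDP share the whole link 2:3.
by_protocol = split(LINK_KBIT, {"tcp": 2, "udp": 3})

# Requirement 3: inside one TCP priority queue, HTTP and the rest share 2:8.
# Assumption: that priority queue currently receives all of TCP's share.
by_application = split(by_protocol["tcp"], {"http": 2, "other": 8})

print(by_protocol)
print(by_application)
```

The same `split` is called twice; only its input changes. That is the sense in which the control method is identical at every level.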

As the requirements above show, this calls for recursive control. Requirements 1 and 3 both allocate bandwidth by ratio, but clearly at different levels. The whole architecture should look something like this:




But things are far from as simple as they seem. The diagram above gives you a glimpse of the TC framework, but it does not help you implement it.

A few typical questions remain: how do you steer packets into different queues? What data structure should represent the non-leaf nodes in the diagram? How do you express them if they are not really queues yet behave like queues? ...
In implementing TC, Linux abstracts the "queue". Essentially, an abstract queue maintains two callback function pointers: enqueue, the operation for entering the queue, and dequeue, the operation for leaving it. Neither enqueue nor dequeue necessarily queues a packet; each simply "runs a series of operations".

This "run a series of operations" can be:
1. For leaf nodes, actually putting a packet into a real queue or pulling a packet out of a real queue;
2. Recursively calling the enqueue/dequeue of other abstract queues.
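The two cases above can be sketched as two tiny Python classes that share the same enqueue/dequeue interface. The class names are hypothetical and this is not kernel code; in particular, `CountingNode` is an invented example of a node that "behaves like a queue" without storing anything itself.

```python
from collections import deque


class LeafFifo:
    """Case 1: a leaf whose enqueue/dequeue touch a real queue."""

    def __init__(self):
        self.packets = deque()

    def enqueue(self, packet):
        self.packets.append(packet)

    def dequeue(self):
        return self.packets.popleft() if self.packets else None


class CountingNode:
    """Case 2: stores nothing itself; it runs an operation (counting)
    and then recursively calls the child's enqueue/dequeue."""

    def __init__(self, child):
        self.child = child
        self.seen = 0

    def enqueue(self, packet):
        self.seen += 1
        self.child.enqueue(packet)

    def dequeue(self):
        return self.child.dequeue()


# Nodes stack arbitrarily because every level exposes the same two calls.
root = CountingNode(CountingNode(LeafFifo()))
root.enqueue("p1")
root.enqueue("p2")
first = root.dequeue()
print(first, root.seen)
```

The caller only ever talks to `root`; whether a real queue sits one level or ten levels below is invisible to it.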


Note item 2 above, which mentions "other abstract queues". How is such an abstract queue located? That requires a choice, that is, a selector, which places a packet into an abstract queue according to the packet's characteristics. At this point the TC design can be depicted as follows:


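In code form, a selector-driven node might look like the following Python sketch. Everything here is a hypothetical illustration: the class names, the packet fields, and especially the dequeue policy (children are simply tried in a fixed order), which is a placeholder and not how any particular TC queueing discipline schedules its children.

```python
from collections import deque


class Leaf:
    """A real FIFO at the bottom of the tree."""

    def __init__(self):
        self.packets = deque()

    def enqueue(self, packet):
        self.packets.append(packet)

    def dequeue(self):
        return self.packets.popleft() if self.packets else None


class SelectorNode:
    """An abstract queue that picks a child abstract queue per packet."""

    def __init__(self, selector, children):
        self.selector = selector  # packet -> key of the child to use
        self.children = children  # key -> abstract queue (Leaf or SelectorNode)

    def enqueue(self, packet):
        self.children[self.selector(packet)].enqueue(packet)

    def dequeue(self):
        # Placeholder policy: try children in insertion order.
        for child in self.children.values():
            packet = child.dequeue()
            if packet is not None:
                return packet
        return None


def by_protocol(packet):
    return packet["proto"]  # "tcp" or "udp"


def by_application(packet):
    return "http" if packet["dport"] == 80 else "other"


tcp_node = SelectorNode(by_application, {"http": Leaf(), "other": Leaf()})
root = SelectorNode(by_protocol, {"tcp": tcp_node, "udp": Leaf()})

for pkt in (
    {"proto": "udp", "dport": 53},
    {"proto": "tcp", "dport": 22},
    {"proto": "tcp", "dport": 80},
):
    root.enqueue(pkt)

drained = []
pkt = root.dequeue()
while pkt is not None:
    drained.append(pkt["dport"])
    pkt = root.dequeue()
print(drained)
```

Note that `SelectorNode` can hold either leaves or further `SelectorNode` instances, so the tree can be nested to any depth with the same two operations.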


As you can see, I did not use the classic "queueing discipline, class, filter" triple to define the TC framework; I explained it in terms of recursive control instead. If the classic triple is mapped onto this picture, it looks like the following. Note that I removed the unnecessary text to keep the picture from getting too cluttered; refer to the previous picture for the text:




As you can see, however the description changes, the essence stays the same; the two views arrive at the same place.
Now for an aside, again related to Netfilter. It is not another comparison with TC, of course, just a personal idea of mine.

For a long time I greatly admired Cisco's ACLs, which are applied to a network interface. Netfilter, by contrast, intercepts packets on the processing path rather than on the device. To Netfilter the device is just one more match with no special status: whether or not a rule has anything to do with a given device, every packet has to pass through the Netfilter hook point, and at the very least it must be tested against -i ethX ... So I wanted to hang a filter_list on the net_device, and I wrote some code to do it. It worked rather well and was ready to use. But I am someone who often reinvents the wheel: when I later looked at the implementation of TC, I found that the TC framework was exactly what I had been looking for. Hence my claim: whatever can be done with Netfilter can also be done with TC. Moreover, TC is built on the queueing discipline (that is how the data structure is named: Qdisc, queueing discipline, independent of the classic triple); the abstract enqueue/dequeue does not dictate how it is implemented; and a queueing discipline is bound to a network card (more precisely to a queue of the NIC, if the card supports multiple queues) rather than intercepting on the processing path. So I have two options:
1. Implement a new Qdisc with a simple built-in FIFO queue, whose enqueue operation runs matches/targets ported from Netfilter; every accepted packet goes into the FIFO;
2. Work on the classifier: whether a packet is assigned to a class depends not only on the packet's characteristics but also on an additional action callback function, and only when that function returns 0 does the classification succeed. Since it is a callback, you can perform whatever action you like in it (drop, NAT, and so on). What happens behind that closed door is entirely up to you.


Of options 1 and 2 above, the second already exists, and the first is very easy to implement: you only need to implement one queueing discipline, or add an action to each queueing discipline, as shown here:


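Option 1 can be sketched in Python as a queueing discipline whose enqueue first runs firewall-style rules and only then stores the packet. This is an illustration under my own assumptions, not the module described in the text: `FilteringFifo`, the rule format, and the packet fields are all hypothetical.

```python
from collections import deque

ACCEPT, DROP = "accept", "drop"


class FilteringFifo:
    """A FIFO whose enqueue runs match/target rules before queuing."""

    def __init__(self, rules, default=ACCEPT):
        self.rules = rules      # list of (match function, target) pairs
        self.default = default  # verdict when no rule matches
        self.fifo = deque()

    def enqueue(self, packet):
        """Queue the packet if the rules accept it; report whether it was queued."""
        verdict = self.default
        for match, target in self.rules:
            if match(packet):
                verdict = target
                break
        if verdict == ACCEPT:
            self.fifo.append(packet)
            return True
        return False

    def dequeue(self):
        return self.fifo.popleft() if self.fifo else None


qdisc = FilteringFifo(rules=[(lambda p: p["dport"] == 23, DROP)])
queued_web = qdisc.enqueue({"dport": 80})
queued_telnet = qdisc.enqueue({"dport": 23})
print(queued_web, queued_telnet, len(qdisc.fifo))
```

Because the filter lives inside the queue attached to a device, packets for other devices never touch these rules, which is the per-interface behavior the aside is after.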


The second option is relatively simple. In essence it works inside the diamond; enlarged, the diamond looks like this:


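Option 2 can be sketched the same way: a classifier that, after matching on packet characteristics, runs an action callback and treats a return value of 0 as success. The `Classifier` class, the `nat_or_reject` callback, the class identifier, and the addresses (taken from documentation ranges) are all hypothetical; this follows the "0 means success" convention described in the text rather than any real kernel return codes.

```python
PUBLIC_ADDR = "203.0.113.1"  # documentation-range address, for illustration


class Classifier:
    """Assigns a packet to a class only if it matches AND the action returns 0."""

    def __init__(self, match, action, class_id):
        self.match = match
        self.action = action
        self.class_id = class_id

    def classify(self, packet):
        if self.match(packet) and self.action(packet) == 0:
            return self.class_id
        return None  # not classified; the caller decides what happens next


def nat_or_reject(packet):
    """Action callback: refuse telnet, otherwise rewrite the source address."""
    if packet["dport"] == 23:
        return 1  # non-zero: classification fails
    packet["src"] = PUBLIC_ADDR  # a crude stand-in for source NAT
    return 0


classifier = Classifier(
    match=lambda p: p["src"].startswith("10."),
    action=nat_or_reject,
    class_id="1:10",  # hypothetical class identifier
)

web = {"src": "10.0.0.5", "dport": 80}
telnet = {"src": "10.0.0.5", "dport": 23}
web_class = classifier.classify(web)
telnet_class = classifier.classify(telnet)
print(web_class, web["src"], telnet_class)
```

The callback is free to modify or reject the packet, so firewall and NAT behavior both fit inside the classification step.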


This is how the TC framework can implement firewall and NAT functions, which is what I had always wanted. In fact I have known this for a long time; I just do not like the TC command. It is too technical to configure and extremely hard to maintain, even harder than iptables rules. And maintenance is extremely important, even more important than how you come to write the rule, because writing is instantaneous: with enough accumulated experience it takes only a moment, and when you hit a problem, the flash of inspiration, over a drink for example, is instantaneous too. Maintenance, on the other hand, lasts a long time, and the maintainer is not necessarily you, so you have to think of others. After all, a technical society is an altruistic one.
All right, I believe I have now said everything I wanted to say. All of it has been at the framework level, with no internal details. Although I am not fond of the TC command line, I still want to close with a picture showing how each TC command relates to the kernel data structures. Again there is no detail, and the commands are incomplete: the match is omitted, because I consider it unimportant:




In my articles you will rarely find anything that can be copied, pasted, and used directly: the code is omitted and the commands are omitted. Even I, on rereading something I wrote years ago, sometimes wish for something I could run right away, but there is nothing of the kind.

But I believe the idea matters more than the implementation. Once you understand the essence behind an implementation, everything else comes easily.
