Packet Classification and Scheduling: Another Explanation of Linux TC

Source: Internet
Author: User

If you look at the Linux TC framework from the perspective of layered recursion, it is natural to divide queues into classful and classless queues, and from that perspective the two kinds have equal status. In fact, there is a hierarchical relationship between them, and the split into classful and classless queues is purely an implementation matter. The TC implementation in Linux is very compact, built on the recursive "queuing discipline, class, filter" triple. Setting the implementation aside, however, we need a more natural way to understand packet scheduling thoroughly.
1. Packet scheduling
Packet scheduling is a layer that separates the NIC driver's send and receive module from the protocol stack. Packets are not handed directly from the protocol stack to the driver's transmit module; they are handed to the packet scheduling layer, and the driver then pulls packets from that layer to send them. The same holds for receiving. You can regard this layer as a buffer between the protocol stack and the network adapter. If you are familiar with how UNIX does I/O on block device files, think of it as the block device's buffer: the block device relies on buffer metadata to deal with random I/O and to decide which data actually reaches the medium first, and the NIC relies on the packet scheduling layer to solve the traffic control problem.
With such an intermediate layer, the NIC behaves more like a block device than a simple FIFO character-stream device. In this scheduling layer, packets can be reordered, dropped, held back, and so on, which is how rate limiting and shaping of packets or flows are achieved. Whether for a NIC or a block device, this isolation layer can be called the policy module, and policy is exactly the role it plays. The upper-layer protocol stack does not care how packets are sent; it only has to finish encapsulation at the protocol layer and hand the packet to the policy layer, at which point its job is done. The NIC, in turn, does not care about the protocol stack. Whether packets were written by the protocol stack or by some other means, it only knows how to pull the "most worth sending" packet from the policy layer. How that packet is chosen is the embodiment of policy and the key to implementing the policy layer, that is, the packet scheduling layer.
If you are familiar with process scheduling, you already know what scheduling means. The English word "schedule" carries the idea of a timetable: deciding what to do now or at some future time. For process scheduling, it means finding the process most worth running in the process table and running it; exactly how it is found is the scheduling policy. Modern Linux has scheduling classes such as RT and FAIR: the former schedules with FIFO, the latter with the CFS algorithm. Each process can place itself in a scheduling class through an API, at creation or afterwards, and once it belongs to a class, that class's algorithm is used whenever it is that class's turn to pick a process.
Understanding process scheduling helps a great deal in understanding packet scheduling. The only real difference between the two is the scheduling entity: packet scheduling means "finding the packet most worth sending" (ingress scheduling with IMQ or ifb is a separate matter). The basic scheduling methods are as follows.
1.1 Scheduling according to FIFO rules
The simplest scheduling method is to maintain a single queue, first in, first out, as shown in the figure below.
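The FIFO rule can be sketched in a few lines of Python. This is only an illustration: the class name, the `limit` parameter, and the tail-drop behavior when the queue is full are assumptions made here, not details taken from the kernel.

```python
from collections import deque


class FifoQueue:
    """Simplest scheduler: packets leave in the order they arrived."""

    def __init__(self, limit=1000):
        self.packets = deque()
        self.limit = limit  # hypothetical cap on the number of queued packets

    def enqueue(self, packet):
        if len(self.packets) >= self.limit:
            return False  # queue full: drop the arriving packet (assumption)
        self.packets.append(packet)
        return True

    def dequeue(self):
        # The "most worth sending" packet is simply the oldest one.
        return self.packets.popleft() if self.packets else None


q = FifoQueue(limit=2)
for p in ("a", "b", "c"):
    q.enqueue(p)  # "c" is dropped because the queue already holds two packets
sent = [q.dequeue(), q.dequeue(), q.dequeue()]
print(sent)
```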




1.2 Scheduling by priority
This is slightly more complex. Maintain multiple queues, each with its own priority. The scheduling algorithm first selects the highest-priority queue that holds packets and then runs FIFO within that queue. This is a two-level scheduling algorithm, as shown in the figure below.
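A minimal Python sketch of this two-level scheme follows. The names (`PrioScheduler`, `bands`) and the convention that band 0 is the highest priority are assumptions chosen for the example.

```python
from collections import deque


class PrioScheduler:
    """Two-level scheduling: pick a priority band, then FIFO inside it."""

    def __init__(self, bands=3):
        # One FIFO queue per priority; index 0 is the highest priority.
        self.bands = [deque() for _ in range(bands)]

    def enqueue(self, packet, band):
        self.bands[band].append(packet)

    def dequeue(self):
        # Level 1: the first non-empty band wins. Level 2: FIFO within it.
        for band in self.bands:
            if band:
                return band.popleft()
        return None


s = PrioScheduler()
s.enqueue("bulk-1", 2)
s.enqueue("voice-1", 0)
s.enqueue("bulk-2", 2)
s.enqueue("voice-2", 0)
order = [s.dequeue() for _ in range(4)]
print(order)
```

Note that a lower band is served only while every higher band is empty, so sustained high-priority traffic can starve the rest.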




1.3 Random fair scheduling
This kind of scheduling is the opposite of priority scheduling: it ensures that all packets have the same chance of being sent. It also maintains multiple queues, but each queue is a hash bucket, and every packet is hashed into a bucket based on specified key fields. The scheduler visits the hash queues from left to right, and each queue runs FIFO internally. The hash algorithm is changed at regular intervals, as shown in the figure below.
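The idea can be sketched in Python as follows. This is a simplified illustration, not the kernel's SFQ implementation: the class and method names are invented here, `zlib.crc32` stands in for whatever hash is really used, and changing the `salt` models the periodic change of hash algorithm.

```python
import zlib
from collections import deque


class HashFairScheduler:
    """Hash flows into buckets, then serve the buckets round-robin."""

    def __init__(self, buckets=4, salt=0):
        self.queues = [deque() for _ in range(buckets)]
        self.salt = salt          # mixed into the hash; changed periodically
        self.next_bucket = 0      # where the next round-robin scan starts

    def _bucket(self, flow_key):
        data = f"{self.salt}:{flow_key}".encode()
        return zlib.crc32(data) % len(self.queues)

    def enqueue(self, flow_key, packet):
        self.queues[self._bucket(flow_key)].append(packet)

    def perturb(self, new_salt):
        # New packets are hashed with the new salt; queued ones stay put.
        self.salt = new_salt

    def dequeue(self):
        n = len(self.queues)
        for offset in range(n):
            i = (self.next_bucket + offset) % n
            if self.queues[i]:
                self.next_bucket = (i + 1) % n
                return self.queues[i].popleft()  # FIFO inside the bucket
        return None


sched = HashFairScheduler(buckets=8, salt=1)
for i in range(3):
    sched.enqueue("10.0.0.1->10.0.0.9", f"A{i}")
sched.enqueue("10.0.0.2->10.0.0.9", "B0")
sent = [sched.dequeue() for _ in range(4)]
print(sent)
```

If the two flows land in different buckets, the single packet of the second flow is sent within the first two dequeues instead of waiting behind the first flow's backlog. Two flows that collide in one bucket share it until the salt changes, which is why the hash is changed periodically.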




1.4 Other scheduling methods
Any scheduling method you can think of that the Linux kernel has not yet implemented, as well as improvements to the algorithms it already implements.
In addition to packet scheduling, there is another basic concept: traffic shaping. The basic shaping mechanism is the token bucket. Note that the purpose of shaping is not to find the packet most worth sending but to decide whether a packet that has just arrived may be sent, so it should not be filed under packet scheduling. The difference is this: scheduling assumes that many packets are already queued and picks the one most worth sending, whereas shaping assumes that no packets are queued, so the arriving packet could be sent directly, and the shaping logic decides whether it may go out immediately. What the two have in common is that both are invoked when a packet is being selected for the NIC to send (or receive).
1.5 The token abstraction
We all know that the token bucket is a powerful tool for traffic shaping, used by almost all network devices, so it must have something special about it. Talking about the token bucket raises the question of how to rate-limit a flow. A flow here can be defined however you like: the set of packets in a FIFO queue, the set of packets matching the classic five-tuple, or the set of packets whose hash over certain fields is the same. The most obvious way to limit the rate is to keep statistics on the flow and limit according to them: record when the flow last sent and how much it sent, then weight the historical volumes and timestamps so that older samples count less. When a packet arrives and the system has to decide whether it may be sent, it takes the difference between the current time and the time of the last calculation and divides the data volume by that difference to obtain a rate, which serves as the benchmark for judging whether the flow is over its limit. The Linux TC CBQ queue in fact works this way. Dissatisfaction with it, however, is what led to HTB. But that is a later story.
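A minimal Python sketch of the history-based approach just described, for illustration only. It keeps an exponentially weighted average of past rate samples, so older samples count less. The class name, the `weight` parameter, and the exact update rule are assumptions made here; this is not the algorithm CBQ actually uses in the kernel.

```python
class HistoryRateLimiter:
    """Rate limiting from per-flow history (illustrative, not CBQ itself)."""

    def __init__(self, limit_bps, weight=0.5):
        self.limit_bps = limit_bps
        self.weight = weight      # share given to the newest sample (assumption)
        self.avg_bps = 0.0        # weighted history of observed rates
        self.last_time = None     # time of the last accepted packet

    def allow(self, now, size_bytes):
        if self.last_time is None:
            self.last_time = now  # first packet: no history to judge by
            return True
        elapsed = max(now - self.last_time, 1e-9)
        sample = size_bytes * 8 / elapsed            # data volume / time difference
        estimate = (1 - self.weight) * self.avg_bps + self.weight * sample
        if estimate > self.limit_bps:
            return False          # over the limit: caller delays or drops
        self.avg_bps = estimate
        self.last_time = now
        return True


limiter = HistoryRateLimiter(limit_bps=8000, weight=0.5)
results = [
    limiter.allow(0.0, 1000),   # first packet
    limiter.allow(0.1, 1000),   # 1000 bytes only 0.1 s later: too fast
    limiter.allow(2.0, 1000),   # after a 2 s gap the estimate is low again
]
print(results)
```

The decision depends entirely on timestamps, so any jitter in when the code runs feeds straight into the estimate.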
However, this calculation is extremely inaccurate and is heavily affected by the operating system's protocol stack implementation, the NIC driver implementation, and the timer implementation. Another abstraction is therefore needed as a reference, and this is the token. Tokens enter the token bucket at a constant rate, measured in bytes or bits. When a packet arrives, you only need to check whether the bucket holds enough tokens; if it does, the packet may be sent. Because the bucket is a container in which tokens can accumulate, it easily accommodates bursts.
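The token bucket can be sketched in Python as below. The names and the choice of bytes as the token unit are assumptions for the example, and time is passed in explicitly so that the behavior is easy to follow.

```python
class TokenBucket:
    """Tokens accumulate at a constant rate, up to the bucket size."""

    def __init__(self, rate_bytes_per_s, burst_bytes, now=0.0):
        self.rate = rate_bytes_per_s
        self.burst = burst_bytes       # bucket capacity: largest allowed burst
        self.tokens = burst_bytes      # start with a full bucket (assumption)
        self.last = now

    def _refill(self, now):
        # Add the tokens earned since the last check, capped at the bucket size.
        earned = (now - self.last) * self.rate
        self.tokens = min(self.burst, self.tokens + earned)
        self.last = now

    def try_send(self, now, size_bytes):
        self._refill(now)
        if self.tokens >= size_bytes:
            self.tokens -= size_bytes  # spend one token per byte
            return True
        return False                   # not enough tokens: wait or drop


tb = TokenBucket(rate_bytes_per_s=1000, burst_bytes=1500)
results = [
    tb.try_send(0.0, 1500),  # burst absorbed by the full bucket
    tb.try_send(0.5, 1000),  # only 500 tokens earned so far
    tb.try_send(1.0, 1000),  # 1000 tokens available now
]
print(results)
```

The long-term rate is bounded by `rate_bytes_per_s`, while `burst_bytes` decides how much may go out at once after an idle period.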

Note: so far we have only described how packets are scheduled, that is, how to find the packet most worth sending. This rests on a basic premise: the packets are already there. In the examples I used the queue as the basic data structure, and that really is what holds them. One question is still open, namely how packets get into a queue. In process scheduling, when a process is created or while it runs, it can call the sched_setscheduler system call to place itself on a scheduling list, queue, or red-black tree for the scheduler to pick from. What about packets? There must be a similar mechanism, which can be collectively called queuing rules. So far, the overall picture of packet scheduling is as follows:




It can be seen that separating packet classification from packet scheduling is advantageous: the packet scheduling system can concentrate on carrying out the scheduling details according to its own algorithm without having to identify flows. Identifying and classifying flows is done by the layer above the scheduling system.
2. The upper layer of packet scheduling
In section 1 we discussed the meaning of "scheduling" and compared it in detail with process scheduling, but so far only details of scheduling algorithms have been described; the counterpart of the scheduling class in the process scheduling system has not been mentioned. In process scheduling, a scheduling class as implemented in the kernel means that "all processes belonging to one scheduling class are scheduled by the same algorithm". Packet scheduling has a corresponding notion of class: all packets belonging to one packet category are scheduled by the same algorithm. Just as the scheduling class sits above process scheduling, packet classification sits above packet scheduling.
Scheduling classes in process scheduling are not equal. Before an actual process is scheduled, scheduling first takes place among the scheduling classes to select one, and then among the processes that belong to that class. Likewise, packets are sorted by their characteristics into categories that are not equal, and scheduling must first take place among the categories. The algorithms used there are similar to those used for packets themselves. The difference is one of level: just as scheduling among scheduling classes sits above process scheduling, scheduling among packet categories sits above packet scheduling.
2.1 Packet classification: the enqueue process
Precise scheduling and shaping can be built on "packet scheduling". The question is how a packet gets queued onto the scheduling queues described in section 1. I refer to the layer above packet scheduling collectively as the queuing rule. Note that this queuing rule is completely different from the queuing discipline in the TC documentation. Here it means everything that happens from the moment a packet enters the scheduling system until it is finally placed on one queue. I think this is easier to understand than the recursive "queuing discipline, class, filter" triple: apart from the final queue, the intermediate steps only decide which branch the packet takes next and are not real queues.
The upper layer of packet scheduling is therefore a series of decisions that together trace a unique path to the final scheduling queue. From the standpoint of both graph theory and implementation efficiency, the best structure for determining a unique path is a tree, in which there is exactly one path from the root to any leaf. The series of decisions is thus the process by which a packet reaches a leaf. The only question at each decision point is which branch the packet takes when it arrives at a fork. The tree can be N-ary, and its branches need not all have the same height; all that matters is that the packet ends up at the scheduling queue represented by some leaf node.
The choice of branch is made inside each intermediate node, and the decision algorithms of the intermediate nodes can be the same or different. This forms a layered recursive tree, as shown in the figure below.
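The enqueue walk down such a tree can be sketched in Python. Everything named here (`Inner`, `Leaf`, the `choose` callback, the example tree and packet fields) is hypothetical and only models the idea; it is not how the kernel represents qdiscs, classes, and filters.

```python
from collections import deque


class Leaf:
    """Leaf node: the only place where packets are really queued."""

    def __init__(self, name):
        self.name = name
        self.queue = deque()


class Inner:
    """Intermediate node: holds no packets, only decides the next branch."""

    def __init__(self, name, choose, children):
        self.name = name
        self.choose = choose        # decision function: packet -> child key
        self.children = children    # child key -> Inner or Leaf


def enqueue(root, packet):
    """Follow the unique root-to-leaf path and queue the packet at the leaf."""
    node, path = root, [root.name]
    while isinstance(node, Inner):
        node = node.children[node.choose(packet)]
        path.append(node.name)
    node.queue.append(packet)
    return path


# Each intermediate node uses its own decision rule.
root = Inner("root", lambda p: "tcp" if p["proto"] == "tcp" else "other", {
    "tcp": Inner("tcp", lambda p: "web" if p["dport"] == 80 else "bulk", {
        "web": Leaf("web"),
        "bulk": Leaf("bulk"),
    }),
    "other": Leaf("other"),
})

path_web = enqueue(root, {"proto": "tcp", "dport": 80})
path_dns = enqueue(root, {"proto": "udp", "dport": 53})
print(path_web, path_dns)
```

The two branches have different heights, and only the leaves hold packets.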




Understood this way, the enqueue logic of the Linux TC framework becomes much simpler, and it still maps onto the classic triple. We can configure a filter on each intermediate node; if a node uses a different selection algorithm than its parent's filter, we can define a new Qdisc there, although that name is easily misread when talking about the enqueue process (note that it is accurate for the dequeue process, because dequeuing is recursive, which is where it differs from enqueuing). In terms of the classic "queuing discipline, class, filter" triple, a class represents a child node in the tree, and a filter selects the child node at the next level based on the packet's characteristics.
Linux's HTB is therefore the most classic discipline and does the most to avoid misunderstanding, because it tries to keep the branch selection algorithm the same everywhere (it can itself be divided into multiple levels). It effectively places a token bucket on every intermediate node, which lets it control the rate of packets entering any branch. Note that these token buckets are not used during enqueue; they are used during dequeue.
2.2 Scheduling: the dequeue process
If enqueuing a packet is the process of tracing a unique path from the root node to a leaf node, dequeuing is a recursive scheduling process at each level of the tree. This is why the Linux kernel's TC documentation describes the framework with the "queuing discipline, class, filter" triple. Do not take those terms literally, however. The final scheduling of a packet has nothing to do with classification: classification is only what a filter does based on the packet's characteristics. Likewise, the final scheduling of a packet has nothing to do with the queuing rule: queuing is an enqueue action, while scheduling is a dequeue action.
If enqueuing means that each node picks a branch for the packet according to its configured filter policy until the packet lands in the real queue of a leaf node, then dequeuing means that, starting from the root, each node picks a branch according to its scheduling algorithm until a packet is taken from a leaf node. Whether enqueuing or dequeuing, every operation runs from the root toward the leaves, and nodes closer to the root take part in classification and scheduling first.
The dequeue process shows that packets are scheduled at each level of each subtree according to the scheduling algorithm specified by that subtree's root. As with enqueuing, this is a layered recursive tree, as shown in the figure below.
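The recursive dequeue can be sketched in Python as follows. The node classes and the example tree are hypothetical: the root uses strict priority among its children, while one subtree uses round-robin, to show that each subtree root applies its own algorithm.

```python
from collections import deque


class LeafQueue:
    """Leaf scheduling entity: a plain FIFO queue."""

    def __init__(self, packets=()):
        self.queue = deque(packets)

    def dequeue(self):
        return self.queue.popleft() if self.queue else None


class PrioNode:
    """Subtree root that always tries its children in priority order."""

    def __init__(self, children):
        self.children = children

    def dequeue(self):
        for child in self.children:
            packet = child.dequeue()   # recurse into the child subtree
            if packet is not None:
                return packet
        return None


class RoundRobinNode:
    """Subtree root that serves its children in turn."""

    def __init__(self, children):
        self.children = children
        self.start = 0

    def dequeue(self):
        n = len(self.children)
        for offset in range(n):
            i = (self.start + offset) % n
            packet = self.children[i].dequeue()
            if packet is not None:
                self.start = (i + 1) % n   # next call starts after this child
                return packet
        return None


root = PrioNode([
    LeafQueue(["voice-1"]),
    RoundRobinNode([LeafQueue(["web-1", "web-2"]), LeafQueue(["mail-1"])]),
])
order = [root.dequeue() for _ in range(5)]
print(order)
```

Every call starts at the root, and each level only chooses which child to ask next; the packet itself always comes from a leaf.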




Understood this way, the dequeue logic of the Linux TC framework can be described by a new triple: "scheduling rule, scheduling entity, scheduling algorithm". A scheduling entity is any node in the tree, leaf nodes included. A leaf's scheduling entity can be any data structure and need not be a tree node (it has no subtrees). The scheduling algorithm selects among the scheduling entities at the next level down.
3. The TC scheduling system
For process scheduling, the kernel refers to the scheduling algorithms and scheduling classes together as the process scheduling system. Likewise, packet classification and packet scheduling can together be called packet scheduling. The whole packet scheduling module consists of the scheduling module proper and, above it, the enqueue and dequeue module, which in turn splits into the enqueue process and the dequeue process. For the dequeue process, the new triple "scheduling rule, scheduling entity, scheduling algorithm" is introduced as the counterpart of the classic "queuing discipline, class, filter" triple used in the enqueue process, which helps in understanding the Linux TC framework. The structure of the Linux TC framework is as follows:




I have not discussed ingress traffic control, because in the Linux TC implementation the ingress node cannot hold a queue, which means ingress traffic cannot be queued and shaped there. This does not mean, however, that Linux cannot do traffic control on ingress.
Linux tc command

For the details, go straight to a tutorial.
Getting started with TC does take some effort, though. Write scripts yourself and read scripts written by others until you understand what they mean.
A class is a rule branch. To limit rates with TC, you first write some rules, such as priorities and traffic limits.

Once the rules are written, filter the packets so that the system knows which packets match which rule and sends them down different branches (rules, that is, classes) according to your filter conditions.

The basic steps are as above. For more information, see a tutorial.

Rate limiting with Linux tc

tc is run directly from the terminal. It is typically combined with iptables or ebtables: these mark layer-3 and layer-2 flows respectively, and tc then steers packets into an HTB class or another qdisc based on the mark, so that you can control the traffic carrying a specific mark.
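As a rough illustration of the mark-then-classify steps, the following Python sketch only builds the command strings and executes nothing. The device name, mark value, port, rates, and class IDs are all hypothetical values chosen for the example, and the commands would have to be adapted and run as root on a real system.

```python
def htb_mark_commands(dev="eth0", mark=1, rate="1mbit", port=80):
    """Return iptables/tc command strings; nothing is executed here."""
    return [
        # Mark the traffic of interest in the mangle table.
        f"iptables -t mangle -A POSTROUTING -o {dev} -p tcp --dport {port} "
        f"-j MARK --set-mark {mark}",
        # Root HTB qdisc; unclassified traffic falls into class 1:20.
        f"tc qdisc add dev {dev} root handle 1: htb default 20",
        # Class (rule branch) that limits the marked traffic.
        f"tc class add dev {dev} parent 1: classid 1:10 htb rate {rate}",
        # Default class for everything else (rate is a placeholder).
        f"tc class add dev {dev} parent 1: classid 1:20 htb rate 100mbit",
        # fw filter: steer packets carrying the mark into class 1:10.
        f"tc filter add dev {dev} parent 1: protocol ip prio 1 "
        f"handle {mark} fw flowid 1:10",
    ]


for cmd in htb_mark_commands():
    print(cmd)
```

The order mirrors the steps described above: write the rules (qdisc and classes) first, then add the filter that decides which packets follow which branch.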
