Using Linux TC to control network QoS (2)

First, let's take a look at how traffic control is implemented in the kernel. When the kernel sends data, it eventually calls dev_queue_xmit:

struct Qdisc *q;

if (q->enqueue) {
        rc = __dev_xmit_skb(skb, q, dev, txq);
        goto out;
}
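
For context, q here is the qdisc attached to the TX queue that dev_queue_xmit picked for the skb. In a 2.6.32-era kernel (an assumption; the helpers changed in later versions) it is obtained roughly like this:

        txq = dev_pick_tx(dev, skb);
        q = rcu_dereference(txq->qdisc);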

If q->enqueue is not NULL, the traffic control path is taken and __dev_xmit_skb is called:

static inline int __dev_xmit_skb(struct sk_buff *skb, struct Qdisc *q,
                                 struct net_device *dev,
                                 struct netdev_queue *txq)

The function checks qdisc->state. If __QDISC_STATE_DEACTIVATED is set, the skb is freed and NET_XMIT_DROP is returned. If __QDISC_STATE_RUNNING is not set and the qdisc carries the TCQ_F_CAN_BYPASS flag, the packet is transmitted directly, bypassing the queue. Otherwise, qdisc_enqueue_root is called to put the skb on the root qdisc, followed by qdisc_run.
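Condensed, __dev_xmit_skb looks roughly like this (paraphrased from a 2.6.32-era kernel; statistics updates are omitted and details vary across versions):

static inline int __dev_xmit_skb(struct sk_buff *skb, struct Qdisc *q,
                                 struct net_device *dev,
                                 struct netdev_queue *txq)
{
        spinlock_t *root_lock = qdisc_lock(q);
        int rc;

        spin_lock(root_lock);
        if (unlikely(test_bit(__QDISC_STATE_DEACTIVATED, &q->state))) {
                /* qdisc is being torn down: drop the packet */
                kfree_skb(skb);
                rc = NET_XMIT_DROP;
        } else if ((q->flags & TCQ_F_CAN_BYPASS) && !qdisc_qlen(q) &&
                   !test_and_set_bit(__QDISC_STATE_RUNNING, &q->state)) {
                /* empty work-conserving qdisc that nobody else is
                 * running: transmit the skb directly without queuing */
                if (sch_direct_xmit(skb, q, dev, txq, root_lock))
                        __qdisc_run(q);
                else
                        clear_bit(__QDISC_STATE_RUNNING, &q->state);
                rc = NET_XMIT_SUCCESS;
        } else {
                /* normal path: enqueue, then try to drain the qdisc */
                rc = qdisc_enqueue_root(skb, q);
                qdisc_run(q);
        }
        spin_unlock(root_lock);

        return rc;
}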

qdisc_run atomically sets the __QDISC_STATE_RUNNING bit in qdisc->state; only if the bit was not already set (i.e. no one else is currently draining this qdisc) does it call __qdisc_run.
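As a minimal sketch (paraphrased from the same kernel vintage as the snippets above), the wrapper is just:

static inline void qdisc_run(struct Qdisc *q)
{
        /* only one CPU at a time may drain a given qdisc */
        if (!test_and_set_bit(__QDISC_STATE_RUNNING, &q->state))
                __qdisc_run(q);
}

__qdisc_run itself is: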

void __qdisc_run(struct Qdisc *q)
{
        unsigned long start_time = jiffies;

        while (qdisc_restart(q)) {
                /*
                 * Postpone processing if
                 * 1. another process needs the CPU;
                 * 2. we've been doing it for too long.
                 */
                if (need_resched() || jiffies != start_time) {
                        __netif_schedule(q);
                        break;
                }
        }

        clear_bit(__QDISC_STATE_RUNNING, &q->state);
}

__qdisc_run calls qdisc_restart in a loop until either a jiffy has been consumed or the CPU needs to be yielded to another process (need_resched). In that case __netif_schedule is called, which (via __netif_reschedule) links the qdisc into the current CPU's softnet_data->output_queue and raises a NET_TX_SOFTIRQ soft interrupt, so transmission resumes later in net_tx_action.
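For reference, __netif_reschedule is roughly the following (paraphrased; the exact output_queue bookkeeping changed slightly across kernel versions, so treat this as a sketch):

static inline void __netif_reschedule(struct Qdisc *q)
{
        struct softnet_data *sd;
        unsigned long flags;

        local_irq_save(flags);
        sd = &__get_cpu_var(softnet_data);
        /* chain this qdisc onto the per-CPU output queue */
        q->next_sched = sd->output_queue;
        sd->output_queue = q;
        /* net_tx_action will pick it up and run the qdisc again */
        raise_softirq_irqoff(NET_TX_SOFTIRQ);
        local_irq_restore(flags);
}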

The qdisc_restart function is as follows:

static inline int qdisc_restart(struct Qdisc *q)
{
        struct netdev_queue *txq;
        struct net_device *dev;
        spinlock_t *root_lock;
        struct sk_buff *skb;

        /* Dequeue packet */
        skb = dequeue_skb(q);
        if (unlikely(!skb))
                return 0;

        root_lock = qdisc_lock(q);
        dev = qdisc_dev(q);
        txq = netdev_get_tx_queue(dev, skb_get_queue_mapping(skb));

        return sch_direct_xmit(skb, q, dev, txq, root_lock);
}

qdisc_restart dequeues an skb from the head of the qdisc, obtains the qdisc root lock via qdisc_lock, calls netdev_get_tx_queue to look up the netdev_queue matching the skb's queue mapping, and finally calls sch_direct_xmit to transmit the skb directly.

sch_direct_xmit first releases the qdisc root lock, sends the skb through the driver via dev_hard_start_xmit, and then examines the return value. If it is NETDEV_TX_OK, the qdisc queue length qdisc_qlen is returned. If NETDEV_TX_LOCKED is returned (the driver's TX lock was contended), the CPU-collision path is taken; I will not go into details here. If NETDEV_TX_BUSY is returned, dev_requeue_skb is called to put the skb back on the queue.
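A trimmed sketch of sch_direct_xmit (paraphrased from a 2.6.32-era kernel; the final queue-stopped re-check and ratelimited warnings are omitted):

int sch_direct_xmit(struct sk_buff *skb, struct Qdisc *q,
                    struct net_device *dev, struct netdev_queue *txq,
                    spinlock_t *root_lock)
{
        int ret = NETDEV_TX_BUSY;

        /* release the qdisc root lock while the driver transmits */
        spin_unlock(root_lock);

        HARD_TX_LOCK(dev, txq, smp_processor_id());
        if (!netif_tx_queue_stopped(txq) && !netif_tx_queue_frozen(txq))
                ret = dev_hard_start_xmit(skb, dev, txq);
        HARD_TX_UNLOCK(dev, txq);

        spin_lock(root_lock);

        switch (ret) {
        case NETDEV_TX_OK:
                /* driver sent the skb: report remaining queue length */
                ret = qdisc_qlen(q);
                break;
        case NETDEV_TX_LOCKED:
                /* driver TX lock was held by another CPU */
                ret = handle_dev_cpu_collision(skb, txq, q);
                break;
        default:
                /* NETDEV_TX_BUSY: put the skb back on the qdisc */
                ret = dev_requeue_skb(skb, q);
                break;
        }

        return ret;
}

A non-zero return value tells __qdisc_run's loop that the queue may still hold packets, so qdisc_restart is invoked again.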

Traffic control can also act on inbound packets. The kernel calls netif_receive_skb for inbound packets, and that function calls handle_ing. handle_ing first checks whether skb->dev->rx_queue.qdisc is noop_qdisc; if it is, no QoS control is applied; otherwise ing_filter is called.
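A simplified sketch of handle_ing (paraphrased; the packet_type delivery bookkeeping that uses pt_prev, ret and orig_dev is omitted):

static inline struct sk_buff *handle_ing(struct sk_buff *skb,
                                         struct packet_type **pt_prev,
                                         int *ret,
                                         struct net_device *orig_dev)
{
        if (skb->dev->rx_queue.qdisc == &noop_qdisc)
                goto out;               /* no ingress qdisc configured */

        switch (ing_filter(skb)) {
        case TC_ACT_SHOT:               /* policed: drop the packet */
        case TC_ACT_STOLEN:
                kfree_skb(skb);
                return NULL;
        }
out:
        skb->tc_verd = 0;
        return skb;
}

The ing_filter it invokes is: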

static int ing_filter(struct sk_buff *skb)
{
        struct net_device *dev = skb->dev;
        u32 ttl = G_TC_RTTL(skb->tc_verd);
        struct netdev_queue *rxq;
        int result = TC_ACT_OK;
        struct Qdisc *q;

        if (MAX_RED_LOOP < ttl++) {
                printk(KERN_WARNING
                       "Redir loop detected Dropping packet (%d->%d)\n",
                       skb->iif, dev->ifindex);
                return TC_ACT_SHOT;
        }

        skb->tc_verd = SET_TC_RTTL(skb->tc_verd, ttl);
        skb->tc_verd = SET_TC_AT(skb->tc_verd, AT_INGRESS);

        rxq = &dev->rx_queue;

        q = rxq->qdisc;
        if (q != &noop_qdisc) {
                spin_lock(qdisc_lock(q));
                if (likely(!test_bit(__QDISC_STATE_DEACTIVATED, &q->state)))
                        result = qdisc_enqueue_root(skb, q);
                spin_unlock(qdisc_lock(q));
        }

        return result;
}

ing_filter looks up skb->dev->rx_queue.qdisc; if it is not noop_qdisc and __QDISC_STATE_DEACTIVATED is not set, it calls qdisc_enqueue_root to enqueue the skb.

It follows that for a network device to support traffic control, its transmit path must go through the qdisc enqueue/dequeue machinery. Judging from both test results and the code, xen netback supports this, while the bridge device does not.

For xen netback, there are two ways to limit the QoS of packets a virtual machine sends out. First, recall the path: netfront passes packets to netback, which triggers netback's net_tx_action tasklet; that calls net_tx_submit, which finally calls netif_rx. You may remember this function: it is what the kernel protocol stack uses to receive packets from non-NAPI drivers. netif_rx leads to netif_receive_skb, and inside it handle_ing applies QoS to ingress packets. So the first method of applying TC to netback is to configure qdisc ingress rules on the netback device.

The second method of limiting outgoing packets was also mentioned earlier: mark the packets on the bridge (with ebtables/iptables marks or similar) and apply traffic control for those marks on the physical port. Since these are egress rules, the full range of TC classful qdiscs is available.

What about limiting packets going to the virtual machine? An inbound packet first arrives at the physical port, enters the bridge, and then leaves the bridge through netback. When the bridge forwards the packet to a port device, br_forward_finish is called, which calls br_dev_queue_push_xmit to "send" the packet via dev_queue_xmit. Therefore, to apply QoS to a virtual machine's inbound packets, you only need to set TC egress rules on netback: netback is a fully fledged net_device, so it goes through dev_queue_xmit and supports qdisc rules in the egress direction.
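A trimmed sketch of br_dev_queue_push_xmit (paraphrased; the MTU/GSO check and netfilter header handling are omitted) shows where the bridge re-enters the normal egress path:

int br_dev_queue_push_xmit(struct sk_buff *skb)
{
        skb_push(skb, ETH_HLEN);        /* restore the Ethernet header */
        dev_queue_xmit(skb);            /* enters the port device's qdisc */
        return 0;
}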

Finally, let's look at the ingress QoS settings. The kernel's built-in ingress qdisc is very simple: it supports only basic policing and drops all traffic in excess of the configured rate. A better current practice is to redirect ingress traffic through a virtual network device (IFB) and then attach normal traffic control rules to that virtual device.

modprobe ifb

ip link set ifb0 up

ip link set ifb1 up

The IFB device exists in the kernel precisely for this traffic-control redirect. After the driver is loaded, you can configure the same kinds of rules on it as on any egress device.

tc qdisc add dev ifb0 root handle 1: htb default 100

tc class add dev ifb0 parent 1: classid 1:100 htb rate 100mbit ceil 100mbit

tc filter add dev ifb0 parent 1: protocol ip prio 1 u32 match ip src 0.0.0.0/0 flowid 1:100    # all traffic matches classid 1:100

Finally, we need a rule to redirect the inbound traffic of peth0 to the ifb0 device.

tc qdisc add dev peth0 ingress

tc filter add dev peth0 parent ffff: protocol ip u32 match u32 0 0 action mirred egress redirect dev ifb0
