| 9.4. softnet_data Structure

We will see in Chapter 10 that each CPU has its own queue for incoming frames. Because each CPU has its own data structure to manage ingress and egress traffic, there is no need for any locking among different CPUs. The data structure for this queue, softnet_data, is defined in include/linux/netdevice.h as follows:

    struct softnet_data
    {
        int throttle;
        int cng_level;
        int avg_blog;
        struct sk_buff_head input_pkt_queue;
        struct list_head poll_list;
        struct net_device *output_queue;
        struct sk_buff *completion_queue;
        struct net_device backlog_dev;
    };

The structure includes both fields used for reception and fields used for transmission. In other words, both the NET_RX_SOFTIRQ and NET_TX_SOFTIRQ softirqs refer to the structure. Ingress frames are queued to input_pkt_queue.[*] Egress frames, by contrast, are placed into the specialized queues handled by Traffic Control (the QoS layer) rather than being handled by softirqs and the softnet_data structure. Softirqs are still used to clean up transmitted buffers afterward, to keep that task from slowing transmission.
[*] You will see in Chapter 10 that this is no longer true for drivers using NAPI.
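The per-CPU arrangement described above can be illustrated with a small user-space sketch. It is not kernel code: the names sketch_softnet, sketch_enqueue, sketch_dequeue, and SKETCH_NR_CPUS are hypothetical, and the CPU number is passed explicitly because a user-space program has no notion of "the current CPU." The point it tries to show is only that each CPU index owns a separate queue, so code running on behalf of one CPU never touches another CPU's entry.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_NR_CPUS 4            /* hypothetical CPU count for the sketch */

/* Stand-in for a received frame (the kernel uses struct sk_buff). */
struct sketch_frame {
    struct sketch_frame *next;
    int id;
};

/* Stand-in for the ingress part of softnet_data: one instance per CPU. */
struct sketch_softnet {
    struct sketch_frame *head;
    struct sketch_frame *tail;
    unsigned int qlen;
};

/* One entry per CPU; each CPU is expected to use only its own entry. */
static struct sketch_softnet sketch_softnet_data[SKETCH_NR_CPUS];

static struct sketch_softnet *sketch_cpu_queue(int cpu)
{
    assert(cpu >= 0 && cpu < SKETCH_NR_CPUS);
    return &sketch_softnet_data[cpu];
}

/* Append a frame to the tail of the given CPU's queue. */
static void sketch_enqueue(int cpu, struct sketch_frame *f)
{
    struct sketch_softnet *sd = sketch_cpu_queue(cpu);

    f->next = NULL;
    if (sd->tail)
        sd->tail->next = f;
    else
        sd->head = f;
    sd->tail = f;
    sd->qlen++;
}

/* Remove and return the oldest frame of the given CPU's queue, or NULL. */
static struct sketch_frame *sketch_dequeue(int cpu)
{
    struct sketch_softnet *sd = sketch_cpu_queue(cpu);
    struct sketch_frame *f = sd->head;

    if (!f)
        return NULL;
    sd->head = f->next;
    if (!sd->head)
        sd->tail = NULL;
    sd->qlen--;
    return f;
}
```

Frames enqueued for CPU 0 and CPU 1 end up in unrelated lists, which is why no lock shared among CPUs is needed in this arrangement.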
9.4.1. Fields of softnet_data

The following is a brief field-by-field description of this data structure; details will be given in later chapters. Some drivers use the NAPI interface, whereas others have not yet been updated to NAPI. Both types of driver use this structure, but some fields are reserved for the non-NAPI drivers.
- throttle
- avg_blog
- cng_level

  These three parameters are used by the congestion management algorithm and are further described following this list, as well as in the "Congestion Management" section in Chapter 10. All three, by default, are updated with the reception of every frame.
- input_pkt_queue

  This queue, initialized in net_dev_init, is where incoming frames are stored before being processed by the driver. It is used by non-NAPI drivers; those that have been upgraded to NAPI use their own private queues.
- backlog_dev

  This is an entire embedded data structure (not just a pointer to one) of type net_device, which represents a device that has scheduled net_rx_action for execution on the associated CPU. This field is used by non-NAPI drivers. The name stands for "backlog device." You will see how it is used in the section "Old Interface Between Device Drivers and Kernel: First Part of netif_rx" in Chapter 10.
- poll_list

  This is a bidirectional list of devices with input frames waiting to be processed. More details can be found in the section "Processing the NET_RX_SOFTIRQ: net_rx_action" in Chapter 10.
- output_queue
- completion_queue

  output_queue is the list of devices that have something to transmit, and completion_queue is the list of buffers that have been successfully transmitted and therefore can be released. More details are given in the section "Processing the NET_TX_SOFTIRQ: net_tx_action" in Chapter 11.
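As an illustration of how a completion_queue style list can defer buffer release, here is a minimal user-space sketch. The names sketch_buf, sketch_tx_state, sketch_tx_done, and sketch_release_completed are hypothetical, and malloc/free stand in for the kernel's buffer allocator. The transmit path only links the finished buffer onto the list, which is cheap; a later pass, which in the kernel would run from the transmit softirq, detaches the whole list and frees it.

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-in for a transmitted buffer (the kernel uses struct sk_buff). */
struct sketch_buf {
    struct sketch_buf *next;
};

/* Stand-in for the egress cleanup part of softnet_data. */
struct sketch_tx_state {
    struct sketch_buf *completion_queue;
};

/* Called when a buffer has been transmitted: just push it on the list. */
static void sketch_tx_done(struct sketch_tx_state *sd, struct sketch_buf *buf)
{
    assert(sd != NULL && buf != NULL);
    buf->next = sd->completion_queue;
    sd->completion_queue = buf;
}

/* Deferred pass: detach the whole list, free each buffer, return the count. */
static unsigned int sketch_release_completed(struct sketch_tx_state *sd)
{
    struct sketch_buf *list;
    unsigned int freed = 0;

    assert(sd != NULL);
    list = sd->completion_queue;
    sd->completion_queue = NULL;

    while (list) {
        struct sketch_buf *next = list->next;

        free(list);
        list = next;
        freed++;
    }
    return freed;
}
```

Detaching the list before walking it keeps the time during which the shared pointer is manipulated short; the frees themselves happen on a private copy of the list.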
throttle is treated as a Boolean variable whose value is true when the CPU is overloaded and false otherwise. Its value depends on the number of frames in input_pkt_queue. When the throttle flag is set, all input frames received by this CPU are dropped, regardless of the number of frames in the queue.[*]
[*] Drivers using NAPI might not drop incoming traffic under these conditions.
avg_blog represents the weighted average value of the input_pkt_queue queue length; it can range from 0 to the maximum length represented by netdev_max_backlog. avg_blog is used to compute cng_level. cng_level, which represents the congestion level, can take any of the values shown in Figure 9-4. As avg_blog hits one of the thresholds shown in the figure, cng_level changes value. The definitions of the NET_RX_XXX enum values are in include/linux/netdevice.h, and the definitions of the congestion levels mod_cong, lo_cong, and no_cong are in net/core/dev.c.[†] The strings within brackets (/DROP and /HIGH) are explained in the section "Congestion Management" in Chapter 10. By default, avg_blog and cng_level are recalculated with each frame, but recalculation can be postponed and tied to a timer to avoid adding too much overhead.
[†] NET_RX_XXX values are also used outside this context, and there are other NET_RX_XXX values not used here. The value no_cong_thresh is not used; it used to be used by process_backlog (described in Chapter 10) to remove a queue from the throttle state under some conditions, when the kernel still had support for that feature (which has since been dropped).
Figure 9-4. Congestion level (NET_RX_XXX) based on the average backlog avg_blog

avg_blog and cng_level are associated with the CPU and therefore apply to non-NAPI devices, which share the queue input_pkt_queue that is used by each CPU.

9.4.2. Initialization of softnet_data

Each CPU's softnet_data structure is initialized by net_dev_init, which runs at boot time and is described in Chapter 5. The initialization code is:

    for (i = 0; i < NR_CPUS; i++) {
        struct softnet_data *queue;

        queue = &per_cpu(softnet_data, i);
        skb_queue_head_init(&queue->input_pkt_queue);
        queue->throttle = 0;
        queue->cng_level = 0;
        queue->avg_blog = 10; /* arbitrary non-zero */
        queue->completion_queue = NULL;
        INIT_LIST_HEAD(&queue->poll_list);
        /* Set up the embedded backlog device used by non-NAPI drivers. */
        set_bit(__LINK_STATE_START, &queue->backlog_dev.state);
        queue->backlog_dev.weight = weight_p;
        queue->backlog_dev.poll = process_backlog;
        atomic_set(&queue->backlog_dev.refcnt, 1);
    }
|