This article covers the registration and initialization of a PCI NIC driver, and how a received packet enters the TCP/IP protocol stack after the soft interrupt (softirq) fires.
0x01 Motivation
This study follows the outline below:
Debug the three-way handshake, data transmission, and the four-way teardown with a simple server/client program (a minimal sketch of such a pair follows the outline).
Environment: the Linux kernel debugging environment set up in an earlier stage.
From the physical layer up:
At the physical layer: how does the driver handle interrupts?
At the link layer: what does the driver do after the NIC receives a frame? Does it verify the MAC address, and under what conditions is the frame handed up to the protocol stack? What is the difference between promiscuous and normal mode, and how does the NIC driver work?
At the network layer: what is the key path once an IP packet comes up, and how does IP-layer route selection work?
At the transport layer: the three-way handshake, data transmission, and the four-way teardown.
At the application layer: how is data placed into the receive buffer queue, and how does the socket learn that data is available?
Down from the application layer:
At the application layer: what happens when the application calls send()? Is the data buffered first?
At the transport layer: the three-way handshake, data transmission, and the four-way teardown.
At the network layer: the key path as the IP packet goes down, and how IP-layer route selection works.
At the link layer: how does the driver hand the frame to the NIC for transmission?
At the physical layer: how does the driver handle transmit interrupts?
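For reference, the server/client pair used for this kind of debugging can be as small as the following sketch, assuming loopback and port 8888 (illustrative only, error handling omitted; not the exact program used here):

/*
 * Minimal TCP echo pair: run "./a.out server" in one shell and
 * "./a.out client" in another, then observe the kernel paths with a
 * debugger or tcpdump.
 */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

#define PORT 8888

int main(int argc, char **argv)
{
	struct sockaddr_in addr = { .sin_family = AF_INET, .sin_port = htons(PORT) };
	char buf[64];
	ssize_t n;
	int fd = socket(AF_INET, SOCK_STREAM, 0);

	if (argc > 1 && !strcmp(argv[1], "server")) {
		int conn;
		addr.sin_addr.s_addr = htonl(INADDR_ANY);
		bind(fd, (struct sockaddr *)&addr, sizeof(addr));
		listen(fd, 5);                 /* passive open: wait for SYN */
		conn = accept(fd, NULL, NULL); /* returns once the three-way handshake completes */
		n = read(conn, buf, sizeof(buf));  /* data transfer */
		if (n > 0)
			write(conn, buf, n);       /* echo it back */
		close(conn);                   /* starts the four-way teardown */
	} else {
		addr.sin_addr.s_addr = inet_addr("127.0.0.1");
		connect(fd, (struct sockaddr *)&addr, sizeof(addr)); /* active open: sends SYN */
		write(fd, "hello\n", 6);
		n = read(fd, buf, sizeof(buf));
	}
	close(fd);
	return 0;
}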
0x02 Physical Layer
The physical layer mainly concerns the interaction between the network device driver and the hardware. I will not go into depth here; after all, I do not do driver development, and I will dig deeper when I study DPDK drivers later.
1. Normal Mode
In normal (non-promiscuous) mode, the NIC accepts only frames whose destination MAC address matches its own (plus broadcast and any multicast addresses it is configured for); frames addressed to other hosts are discarded by the hardware.
2. Promiscuous Mode
In promiscuous mode, a host accepts every frame that passes by, regardless of whether the destination address matches its own. On a shared-medium (hub-based) LAN, this lets the host capture all traffic on the segment.
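As an aside, promiscuous mode can be toggled from user space through the interface flags. Below is a minimal sketch, assuming interface eth0 (the interface name is an assumption); inside the kernel, the flag change ends up in dev_change_flags() and dev_set_promiscuity(), which the driver then pushes down to the NIC.

#include <net/if.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void)
{
	struct ifreq ifr;
	int fd = socket(AF_INET, SOCK_DGRAM, 0);   /* any socket works for these ioctls */

	memset(&ifr, 0, sizeof(ifr));
	strncpy(ifr.ifr_name, "eth0", IFNAMSIZ - 1);
	if (ioctl(fd, SIOCGIFFLAGS, &ifr) < 0) {   /* read the current flags */
		perror("SIOCGIFFLAGS");
		return 1;
	}
	ifr.ifr_flags |= IFF_PROMISC;              /* set the promiscuous bit */
	if (ioctl(fd, SIOCSIFFLAGS, &ifr) < 0) {   /* write the flags back */
		perror("SIOCSIFFLAGS");
		return 1;
	}
	close(fd);
	return 0;
}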
0x03 Link Layer
1. Packet Reception Principles
There are roughly three schemes by which a network driver receives packets:
No NAPI: every time the MAC receives an Ethernet frame, it raises a receive interrupt to the CPU; reception is entirely interrupt-driven.
Disadvantage: under heavy traffic, the CPU spends most of its time servicing MAC interrupts.
Netpoll: used when the network and I/O subsystems are not fully available (e.g., netconsole or kernel debugging over the network); reception on the given device bypasses interrupts, and packets are collected purely by polling.
Disadvantage: poor real-time behavior.
NAPI: interrupt + polling. When the MAC receives a packet it raises a receive interrupt, which is then immediately disabled; the driver polls until enough packets have been processed (netdev_max_backlog, 300 by default) or all pending packets on the MAC have been drained, and only then re-enables the receive interrupt. (A sketch of this pattern follows this list.)
Use sysctl to modify net.core.netdev_max_backlog,
or write to /proc/sys/net/core/netdev_max_backlog.
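The NAPI pattern is easiest to see in code. Below is a minimal sketch assuming a fictional mynic driver: mynic_priv, mynic_disable_rx_irq(), mynic_enable_rx_irq(), and mynic_get_packet() are placeholders, while netif_napi_add(), napi_schedule(), and napi_complete() are the real kernel API of this era.

#include <linux/interrupt.h>
#include <linux/netdevice.h>

struct mynic_priv {
	struct napi_struct napi;
	/* ... device registers, DMA rings ... */
};

/* hypothetical device helpers -- stand-ins for real register accesses */
extern void mynic_disable_rx_irq(struct mynic_priv *priv);
extern void mynic_enable_rx_irq(struct mynic_priv *priv);
extern struct sk_buff *mynic_get_packet(struct mynic_priv *priv);

static irqreturn_t mynic_interrupt(int irq, void *dev_id)
{
	struct mynic_priv *priv = dev_id;

	mynic_disable_rx_irq(priv);        /* mask further RX interrupts */
	napi_schedule(&priv->napi);        /* raise NET_RX_SOFTIRQ to run the poll function */
	return IRQ_HANDLED;
}

static int mynic_poll(struct napi_struct *napi, int budget)
{
	struct mynic_priv *priv = container_of(napi, struct mynic_priv, napi);
	struct sk_buff *skb;
	int work = 0;

	while (work < budget && (skb = mynic_get_packet(priv))) {
		netif_receive_skb(skb);    /* hand the packet to the protocol stack */
		work++;
	}
	if (work < budget) {               /* ring drained: back to interrupt mode */
		napi_complete(napi);
		mynic_enable_rx_irq(priv); /* unmask RX interrupts again */
	}
	return work;
}

/* at probe time: netif_napi_add(dev, &priv->napi, mynic_poll, 64); */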
2. Kernel Startup Preparations
2.1 Initialize the global network data structures and hook up the softirq handlers
start_kernel() --> rest_init() --> do_basic_setup() --> do_initcalls() --> net_dev_init()
An annotated walk through the net_dev_init() function:
/*
 * Initialize the global data structures that store device-related
 * information.  The devices themselves are not initialized yet.
 * This function is called by a single thread during boot, so there is
 * no need to take the rtnl semaphore.
 */
static int __init net_dev_init(void)
{
	int i, rc = -ENOMEM;

	/* ...... */

	for_each_possible_cpu(i) {
		/* Each CPU has its own queue. */
		struct softnet_data *queue;

		/*
		 * There is a global per-CPU variable, per_cpu__softnet_data,
		 * holding each CPU's softirq data queues (both receive and
		 * transmit).  It is defined via
		 * DEFINE_PER_CPU(struct softnet_data, softnet_data) = { NULL };
		 */
		queue = &per_cpu(softnet_data, i);
		skb_queue_head_init(&queue->input_pkt_queue); /* initialize the input packet queue */
		queue->completion_queue = NULL;
		INIT_LIST_HEAD(&queue->poll_list);

		/*
		 * The queue carries a pseudo-device called backlog whose poll
		 * function pointer is process_backlog.  The receive softirq
		 * (NET_RX_SOFTIRQ) registered below services this queue via
		 * the backlog device.
		 */
		queue->backlog.poll = process_backlog;
		queue->backlog.weight = weight_p;
		queue->backlog.gro_list = NULL;
		queue->backlog.gro_count = 0;
	}

	/* ...... */

	open_softirq(NET_TX_SOFTIRQ, net_tx_action); /* transmit softirq handler */
	open_softirq(NET_RX_SOFTIRQ, net_rx_action); /* receive softirq handler */

	hotcpu_notifier(dev_cpu_callback, 0);
	/*
	 * dst_init() registers the dst_dev_notifier notification chain.  It
	 * is of little significance during initialization, but it matters
	 * when a device interface is removed, since it only responds to
	 * unregister events.
	 */
	dst_init();
	dev_mcast_init();
	rc = 0;
out:
	return rc;
}
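To see how the per-CPU queue and the backlog device are actually used, here is a simplified sketch of netif_rx(), the non-NAPI entry point, condensed from the 2.6-era source (error handling and congestion logic omitted):

int netif_rx(struct sk_buff *skb)
{
	struct softnet_data *queue;
	unsigned long flags;

	local_irq_save(flags);
	queue = &__get_cpu_var(softnet_data);   /* this CPU's softnet queue */
	if (queue->input_pkt_queue.qlen <= netdev_max_backlog) {
		if (!queue->input_pkt_queue.qlen)        /* queue was empty, so the */
			napi_schedule(&queue->backlog);  /* backlog device is not yet
							  * scheduled: raise NET_RX_SOFTIRQ */
		__skb_queue_tail(&queue->input_pkt_queue, skb);
		local_irq_restore(flags);
		return NET_RX_SUCCESS;
	}
	local_irq_restore(flags);
	kfree_skb(skb);                          /* backlog full: drop the packet */
	return NET_RX_DROP;
}

process_backlog(), installed above as the backlog device's poll function, later drains input_pkt_queue from NET_RX_SOFTIRQ context and feeds each skb into netif_receive_skb().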
2.2 Loading the Network Driver
The NIC driver analyzed here is the PCI NIC driver for "Ethernet controller: Advanced Micro Devices, Inc. [AMD] 79c970 [PCnet32 LANCE] (rev 10)". By studying how this PCI NIC is driven, we can infer how drivers are built on other bus technologies.
The PCI driver structure:
static struct pci_driver pcnet32_driver = {
	.name     = DRV_NAME,              /* device name */
	.probe    = pcnet32_probe_pci,     /* configures the device and binds the driver; called by the PCI core */
	.remove   = __devexit_p(pcnet32_remove_one),
	.id_table = pcnet32_pci_tbl,       /* array of pci_device_id{} entries; must match the device's hardware IDs */
	.suspend  = pcnet32_pm_suspend,
	.resume   = pcnet32_pm_resume,
};
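For completeness, here is a sketch of how the driver hands this structure to the PCI core, modeled on pcnet32's module init/exit (the real functions in drivers/net/pcnet32.c also handle module options and VLB-bus probing):

static int __init pcnet32_init_module(void)
{
	/*
	 * Hand the pci_driver to the PCI core.  For every device whose IDs
	 * match an entry in pcnet32_pci_tbl, the core calls
	 * .probe = pcnet32_probe_pci to configure it and bind the driver.
	 */
	return pci_register_driver(&pcnet32_driver);
}

static void __exit pcnet32_cleanup_module(void)
{
	pci_unregister_driver(&pcnet32_driver);
}

module_init(pcnet32_init_module);
module_exit(pcnet32_cleanup_module);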
I will not repeat the driver's other operations in detail, even though there are many knowledge points there; to speed things up, I jump straight to the path I care about, the receive routine below.
static void pcnet32_rx_entry(struct net_device *dev,
			     struct pcnet32_private *lp,
			     struct pcnet32_rx_head *rxp,
			     int entry)
{
	int status = (short)le16_to_cpu(rxp->status) >> 8;
	int rx_in_place = 0;
	struct sk_buff *skb;
	short pkt_len;

	/* ... middle omitted ... */

	dev->stats.rx_bytes += skb->len;
	skb->protocol = eth_type_trans(skb, dev); /* determine the L2 protocol type */
	netif_receive_skb(skb);                   /* enter the protocol stack */
	dev->stats.rx_packets++;
	return;
}
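The omitted middle is where the skb gets built. A sketch of the typical copy path, in the spirit of the pcnet32 source (the real driver can also hand over the receive-ring buffer in place for large packets; take the exact field names as assumptions):

	pkt_len = (le32_to_cpu(rxp->msg_length) & 0xfff) - 4; /* drop the 4-byte FCS */
	skb = dev_alloc_skb(pkt_len + 2);
	if (skb == NULL) {
		dev->stats.rx_dropped++;  /* no memory: count the drop and bail out */
		return;
	}
	skb_reserve(skb, 2);   /* 2-byte pad so the IP header lands 16-byte aligned */
	skb_put(skb, pkt_len); /* account for pkt_len bytes of packet data in the skb */
	skb_copy_to_linear_data(skb, (void *)lp->rx_skbuff[entry]->data, pkt_len);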
2.3 The netif_receive_skb() Function
int netif_receive_skb(struct sk_buff *skb)
{
	/* ... variable declarations and some code omitted ... */
	rcu_read_lock();

	/*
	 * Step 1: run every packet_type->func() on the ptype_all list.
	 * Every packet is delivered to these hooks, which can hurt
	 * performance badly!  By default the kernel installs none.
	 */
	list_for_each_entry_rcu(ptype, &ptype_all, list) {      /* walk the ptype_all list */
		if (!ptype->dev || ptype->dev == skb->dev) {    /* these packet_type.type entries are ETH_P_ALL */
			if (pt_prev)                            /* call packet_type.func() for every packet */
				ret = deliver_skb(skb, pt_prev, orig_dev); /* this ends up calling packet_type.func() */
			pt_prev = ptype;
		}
	}

	/*
	 * Step 2: if BRIDGE was selected at kernel build time, the bridge
	 * module runs here via the br_handle_frame_hook(skb) function
	 * pointer.  In linux_2_6_24/net/bridge/br.c:
	 *     br_handle_frame_hook = br_handle_frame;
	 * so the real function is br_handle_frame().
	 * Note: the bridge module sets skb->pkt_type to PACKET_HOST or
	 * PACKET_OTHERHOST.
	 */
	skb = handle_bridge(skb, &pt_prev, &ret, orig_dev);
	if (!skb)
		goto out;

	/*
	 * Step 3: if MAC_VLAN was selected at kernel build time, the
	 * macvlan module runs here via macvlan_handle_frame_hook(skb).
	 * In linux_2_6_24/drivers/net/macvlan.c:
	 *     macvlan_handle_frame_hook = macvlan_handle_frame;
	 * so the real function is macvlan_handle_frame().
	 * Note: it sets skb->pkt_type to PACKET_BROADCAST,
	 * PACKET_MULTICAST, or PACKET_HOST.
	 */
	skb = handle_macvlan(skb, &pt_prev, &ret, orig_dev);
	if (!skb)
		goto out;

	/*
	 * Step 4: run every packet_type->func() hanging off
	 * ptype_base[ntohs(type) & 15], dispatching to a hook according to
	 * the layer-2 protocol field.  The important ones are ip_rcv()
	 * and arp_rcv().
	 */
	type = skb->protocol;
	list_for_each_entry_rcu(ptype, &ptype_base[ntohs(type) & 15], list) {
		if (ptype->type == type &&                      /* walk the list for this packet type */
		    (!ptype->dev || ptype->dev == skb->dev)) {  /* call every matching packet_type.func() */
			if (pt_prev)
				ret = deliver_skb(skb, pt_prev, orig_dev); /* this is it! ARP packets reach arp_rcv(), */
			pt_prev = ptype;                                   /* IP packets reach ip_rcv() */
		}
	}

	if (pt_prev) {
		ret = pt_prev->func(skb, skb->dev, pt_prev, orig_dev);
	} else {
		/* no protocol handler took the packet: free it here */
		kfree_skb(skb);
		/*
		 * Note: if the skb never entered a socket's receive queue it
		 * is released here; if it did, it is released when the
		 * application reads it, and here only its reference count is
		 * decremented.
		 */
		ret = NET_RX_DROP;
	}
out:
	rcu_read_unlock();
	return ret;
}

static inline int deliver_skb(struct sk_buff *skb,
			      struct packet_type *pt_prev,
			      struct net_device *orig_dev)
{
	atomic_inc(&skb->users);
	return pt_prev->func(skb, skb->dev, pt_prev, orig_dev); /* calls ip_rcv(), arp_rcv(), etc. */
}
Important data structures:
The kernel's layer-2 packet processing revolves around the following two list_head variables (in linux_2_6_24/net/core/dev.c); many packet_type structures are chained onto these lists.
static struct list_head ptype_base[16] __read_mostly; /* 16-way hashed list */
static struct list_head ptype_all __read_mostly;      /* taps */

struct packet_type {
	__be16			type;  /* the L2 protocol type, e.g. ETH_P_IP or ETH_P_ARP */
	struct net_device	*dev;  /* NULL is wildcarded here */
	int			(*func)(struct sk_buff *,
					struct net_device *,
					struct packet_type *,
					struct net_device *); /* the hook function: ip_rcv(), arp_rcv(), and so on */
	struct sk_buff		*(*gso_segment)(struct sk_buff *skb, int features);
	int			(*gso_send_check)(struct sk_buff *skb);
	void			*af_packet_priv;
	struct list_head	list;
};
The API for operating on these lists:
void dev_add_pack(struct packet_type *pt);    /* called by each protocol layer at init time */
void dev_remove_pack(struct packet_type *pt);
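A minimal sketch of this API in use: a tap module that registers for ETH_P_ALL, so its handler runs for every packet in step 1 of netif_receive_skb() above. All mytap_* names are made up.

#include <linux/if_ether.h>
#include <linux/module.h>
#include <linux/netdevice.h>
#include <linux/skbuff.h>

static int mytap_rcv(struct sk_buff *skb, struct net_device *dev,
		     struct packet_type *pt, struct net_device *orig_dev)
{
	printk(KERN_DEBUG "mytap: %s received a packet, protocol 0x%04x\n",
	       dev->name, ntohs(skb->protocol));
	kfree_skb(skb);   /* deliver_skb() took a reference on our behalf */
	return 0;
}

static struct packet_type mytap_packet_type = {
	.type = __constant_htons(ETH_P_ALL),  /* lands on ptype_all */
	.func = mytap_rcv,
};

static int __init mytap_init(void)
{
	dev_add_pack(&mytap_packet_type);
	return 0;
}

static void __exit mytap_exit(void)
{
	dev_remove_pack(&mytap_packet_type);
}

module_init(mytap_init);
module_exit(mytap_exit);
MODULE_LICENSE("GPL");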
The arp_rcv() registration process: (diagram)
The ip_rcv() registration process: (diagram)
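Those two flows boil down to how ip_rcv() and arp_rcv() get chained onto ptype_base. From the 2.6-era sources (net/ipv4/af_inet.c and net/ipv4/arp.c, lightly trimmed):

/* net/ipv4/af_inet.c */
static struct packet_type ip_packet_type = {
	.type = __constant_htons(ETH_P_IP),
	.func = ip_rcv,                 /* entry point of the IP layer */
	.gso_send_check = inet_gso_send_check,
	.gso_segment = inet_gso_segment,
};
/* inet_init() calls dev_add_pack(&ip_packet_type); */

/* net/ipv4/arp.c */
static struct packet_type arp_packet_type = {
	.type = __constant_htons(ETH_P_ARP),
	.func = arp_rcv,                /* entry point of ARP */
};
/* arp_init() calls dev_add_pack(&arp_packet_type); */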
0x04 Summary
This section covered how a PCI NIC driver is registered and initialized, and how a received packet reaches the protocol stack after the soft interrupt runs. Along the way we examined several key data structures.