Level: Introductory
M. Tim Jones (mtj@mtjones.com), Consultant Engineer, Emulex
July 16, 2007
One of the greatest features of Linux is its networking stack. It was originally derived from the BSD network stack and is well organized, with a clean set of interfaces. Its interfaces range from protocol-independent layers (such as the general sockets layer or the device layer) to the specific layers of the individual networking protocols. This article explores the structure of the Linux networking stack from the perspective of its layers and introduces some of its major structures.
Protocol introduction

Although formal introductions to networking usually refer to the OSI (Open Systems Interconnection) model, this article describes the basic network stack in Linux using the four-layer Internet model (see Figure 1).

Figure 1. Internet model of the network stack
At the bottom of the stack is the link layer. The link layer refers to the device drivers that provide access to the physical layer, which can be any of several media, such as serial links or Ethernet devices. Above the link layer is the network layer, which is responsible for directing packets to their destinations. The next layer, the transport layer, is responsible for end-to-end communication (for example, between endpoints within a host). While the network layer manages communication between hosts, the transport layer manages communication between the endpoints within those hosts. Finally, there is the application layer, which is commonly the semantic layer that understands the data being moved. For example, the Hypertext Transfer Protocol (HTTP) moves requests and responses for web content between a server and a client.

In practice, the layers of the network stack go by more recognizable names. At the link layer you find Ethernet, the most common high-speed medium. Older link layer protocols include serial protocols such as SLIP (Serial Line Internet Protocol), CSLIP (Compressed SLIP), and PPP (Point-to-Point Protocol). The most common network layer protocol is IP (Internet Protocol), but other protocols at this layer address other needs, such as ICMP (Internet Control Message Protocol) and ARP (Address Resolution Protocol). At the transport layer are TCP (Transmission Control Protocol) and UDP (User Datagram Protocol). Finally, the application layer includes many familiar protocols, including HTTP, the standard web protocol, and SMTP (Simple Mail Transfer Protocol), the email transfer protocol.
Core network architecture

Now on to the architecture of the Linux network stack and how it implements this Internet model. Figure 2 provides a high-level view of the Linux network stack. At the top is the user space layer, or application layer, which defines the users of the network stack. At the bottom are the physical devices that provide connectivity to the network (serial ports or high-speed networks such as Ethernet). In the middle, in kernel space, is the networking subsystem that is the focus of this article. Flowing through the interior of the stack are socket buffers (sk_buffs), which move packet data between sources and sinks. The sk_buff structure is described shortly.

Figure 2. High-level architecture of the Linux network stack
First, here is a quick overview of the core elements of the Linux networking subsystem, which are described in more detail in the sections that follow. At the top (see Figure 2) is the system call interface. It gives user space applications a way to access the kernel's networking subsystem. Next is a protocol-independent layer that provides a common way to work with the underlying transport layer protocols. Below that are the actual protocols, which in Linux include the built-in protocols TCP, UDP, and IP. Next is another protocol-independent layer that provides a common interface to and from the individual device drivers. The device drivers themselves are at the bottom.
System call interface

The system call interface can be described from two perspectives. First, when a user makes a networking call, it is multiplexed through the system call interface into the kernel. The call ends up in sys_socketcall in ./net/socket.c, which then demultiplexes it to its intended target. The second perspective is the use of normal file operations for network I/O. For example, typical read and write operations can be performed on a network socket (a socket is represented by a file descriptor, just like a normal file). Therefore, although many operations are specific to networking (creating a socket with the socket call, connecting it to a destination with the connect call, and so on), a number of standard file operations apply to networking objects just as they do to regular files. In short, the system call interface provides the means to transfer control between the user space application and the kernel.
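The point that ordinary file operations work on sockets can be illustrated with a small user space sketch. It is only an illustration under stated assumptions: it uses socketpair() with AF_UNIX (not mentioned in the article) so that no remote peer is needed, and the helper name echo_through_socketpair is made up for this example.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: sends msg into one end of a connected socket pair
 * with write() and receives it from the other end with read().
 * Returns the number of bytes received, or -1 on error. */
ssize_t echo_through_socketpair(const char *msg, char *out, size_t out_len)
{
    int fds[2];
    ssize_t received = -1;
    size_t len = strlen(msg);

    /* The message plus its terminating NUL must fit in the output buffer. */
    assert(len < out_len);

    /* Network-specific call: creates two connected socket descriptors. */
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) < 0)
        return -1;

    /* From here on, only ordinary file operations are used. */
    if (write(fds[0], msg, len) == (ssize_t)len) {
        received = read(fds[1], out, out_len - 1);
        if (received >= 0)
            out[received] = '\0';
    }

    close(fds[0]);
    close(fds[1]);
    return received;
}
```

Only the creation step is socket-specific here; the data transfer and cleanup use the same write, read, and close calls that would be used on a regular file.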
Protocol-independent interface

The sockets layer is a protocol-independent interface that provides a set of common functions to support a variety of protocols. The sockets layer supports not only the typical TCP and UDP protocols, but also IP, raw Ethernet, and other transport protocols such as SCTP (Stream Control Transmission Protocol).

Communication through the network stack takes place through a socket. The socket structure in Linux is struct sock, which is defined in linux/include/net/sock.h. This large structure contains all of the state required by a particular socket, including the protocol the socket uses and the operations that can be performed on it.

The networking subsystem learns about the available protocols through a special structure that defines each protocol's capabilities. Each protocol maintains a structure called proto (found in linux/include/net/sock.h). This structure defines the particular socket operations that can be performed from the sockets layer down to the transport layer (for example, how to create a socket, how to establish a connection with a socket, and how to close a socket).
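The idea of a per-protocol operations structure can be sketched in user space C as a table of function pointers. This is a deliberately reduced model, not the kernel's definition: mini_proto and mini_sock are hypothetical stand-ins for struct proto and struct sock, and the two "protocols" only flip flags.

```c
#include <assert.h>
#include <stddef.h>

struct mini_sock;

/* Hypothetical, much-reduced stand-in for the kernel's struct proto. */
struct mini_proto {
    const char *name;
    int  (*connect)(struct mini_sock *sk);
    void (*close)(struct mini_sock *sk);
};

/* Hypothetical stand-in for struct sock: state plus the protocol in use. */
struct mini_sock {
    const struct mini_proto *prot;
    int has_peer;        /* a destination has been recorded */
    int handshake_done;  /* only set by connection-oriented protocols */
};

static int stream_connect(struct mini_sock *sk)
{
    sk->has_peer = 1;
    sk->handshake_done = 1;   /* models a completed connection setup */
    return 0;
}

static int dgram_connect(struct mini_sock *sk)
{
    sk->has_peer = 1;         /* no handshake for a datagram protocol */
    return 0;
}

static void common_close(struct mini_sock *sk)
{
    sk->has_peer = 0;
    sk->handshake_done = 0;
}

const struct mini_proto mini_stream_proto = { "stream", stream_connect, common_close };
const struct mini_proto mini_dgram_proto  = { "dgram",  dgram_connect,  common_close };

/* Socket-layer entry points: protocol-independent, they only dispatch. */
int mini_connect(struct mini_sock *sk)
{
    assert(sk != NULL && sk->prot != NULL);
    return sk->prot->connect(sk);
}

void mini_close(struct mini_sock *sk)
{
    assert(sk != NULL && sk->prot != NULL);
    sk->prot->close(sk);
}
```

The caller of mini_connect never needs to know which protocol is behind the socket, which is the property the sockets layer provides.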
Network protocols

The network protocols section defines the particular networking protocols that are available (such as TCP and UDP). They are initialized in a function called inet_init in linux/net/ipv4/af_inet.c (because TCP and UDP are both part of the inet family of protocols). The inet_init function registers each of the built-in protocols using the proto_register function. This function is defined in linux/net/core/sock.c; in addition to adding the protocol to the list of active protocols, it can allocate one or more slab caches if required.

You can see how the individual protocols identify themselves through the proto structure in the files tcp_ipv4.c, udp.c, and raw.c in linux/net/ipv4/. Each of these protocol structures is mapped by type and protocol into inetsw_array, which maps the built-in protocols to their operations. The structure of inetsw_array and its relationships are shown in Figure 3. Each of the protocols in this array is initialized into inetsw through a call to inet_register_protosw from inet_init. The inet_init function also initializes the various inet modules, such as the ARP, ICMP, and IP modules and the TCP and UDP modules.

Figure 3. Structure of the Internet protocol array
Socket-protocol correlation

Recall that when a socket is created, the caller specifies the type and protocol, for example my_sock = socket(AF_INET, SOCK_STREAM, 0). AF_INET indicates the Internet address family, and SOCK_STREAM requests a stream socket (as shown in inetsw_array).
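The mapping from socket type and protocol to a registered entry can be sketched as a small table lookup. This is a simplified user space model, not the kernel code: mini_protosw, mini_inetsw, and mini_inet_lookup are hypothetical names, and the matching rule (a protocol of 0 means "use the default for this type") is an assumption that covers only the common cases.

```c
#include <netinet/in.h>   /* IPPROTO_IP, IPPROTO_TCP, IPPROTO_UDP */
#include <stddef.h>
#include <sys/socket.h>   /* SOCK_STREAM, SOCK_DGRAM, SOCK_RAW */

/* Hypothetical, simplified stand-in for one entry of inetsw_array. */
struct mini_protosw {
    int type;                /* socket type, e.g. SOCK_STREAM */
    int protocol;            /* IPPROTO_IP (0) acts as a wildcard */
    const char *proto_name;  /* which protocol handles the socket */
};

static const struct mini_protosw mini_inetsw[] = {
    { SOCK_STREAM, IPPROTO_TCP, "tcp" },
    { SOCK_DGRAM,  IPPROTO_UDP, "udp" },
    { SOCK_RAW,    IPPROTO_IP,  "raw" },
};

/* Finds the entry for socket(AF_INET, type, protocol), or NULL if none. */
const struct mini_protosw *mini_inet_lookup(int type, int protocol)
{
    size_t n = sizeof mini_inetsw / sizeof mini_inetsw[0];

    for (size_t i = 0; i < n; i++) {
        const struct mini_protosw *e = &mini_inetsw[i];

        if (e->type != type)
            continue;
        /* Match on an exact protocol, on a caller asking for the
         * default (0), or on a wildcard entry such as raw sockets. */
        if (protocol == e->protocol ||
            protocol == IPPROTO_IP ||
            e->protocol == IPPROTO_IP)
            return e;
    }
    return NULL;
}
```

With this model, the call socket(AF_INET, SOCK_STREAM, 0) from the sidebar corresponds to mini_inet_lookup(SOCK_STREAM, 0), which resolves to the TCP entry.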
Note that in Figure 3, the proto structure defines the transport-specific methods, while the proto_ops structure defines the general socket methods. Additional protocols can be added to the inetsw protocol switch by calling inet_register_protosw. For example, SCTP adds itself this way from sctp_init. For more information about SCTP, see the references.

Data movement for sockets takes place using a core structure called the socket buffer (sk_buff). An sk_buff contains packet data as well as state data that spans multiple layers of the protocol stack. Each packet sent or received is represented by an sk_buff. The sk_buff structure is defined in linux/include/linux/skbuff.h and is shown in Figure 4.

Figure 4. Socket buffer and its relationship to other structures
As Figure 4 shows, multiple sk_buffs can be chained together for a given connection. Each sk_buff identifies the device structure (net_device) to which the packet is being sent or from which it was received. Because each packet is represented by an sk_buff, the packet headers are conveniently located through a set of pointers (th, iph, and mac for the Media Access Control, or MAC, header). Because the sk_buff is central to socket data management, a number of support functions have been created to manage it. There are functions to create and destroy sk_buff structures, as well as to clone and queue them.

Socket buffers are designed to be linked together for a given socket and carry a large amount of information, including links to the protocol headers, a timestamp (when the packet was sent or received), and the device associated with the packet.
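How a single buffer can carry a payload plus per-layer header pointers can be sketched with a small user space model. The structure below is a hypothetical simplification (mini_skb is not the kernel's sk_buff); its helpers loosely mirror the roles of the kernel's skb_reserve, skb_put, and skb_push, and the header sizes used in any example are illustrative only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MINI_SKB_SIZE 256

/* Hypothetical, much-reduced stand-in for struct sk_buff. */
struct mini_skb {
    unsigned char buf[MINI_SKB_SIZE];
    unsigned char *data;  /* start of the valid bytes */
    unsigned char *tail;  /* end of the valid bytes */
    unsigned char *mac;   /* link-layer header, once added */
    unsigned char *iph;   /* network-layer header, once added */
    unsigned char *th;    /* transport-layer header, once added */
};

/* Empties the buffer and reserves headroom for headers added later. */
void mini_skb_init(struct mini_skb *skb, size_t headroom)
{
    assert(headroom <= MINI_SKB_SIZE);
    memset(skb, 0, sizeof *skb);
    skb->data = skb->tail = skb->buf + headroom;
}

/* Appends payload bytes at the tail of the valid region. */
void mini_skb_put(struct mini_skb *skb, const void *src, size_t len)
{
    assert((size_t)(skb->buf + MINI_SKB_SIZE - skb->tail) >= len);
    memcpy(skb->tail, src, len);
    skb->tail += len;
}

/* Claims len bytes of headroom for a header and returns its start. */
unsigned char *mini_skb_push(struct mini_skb *skb, size_t len)
{
    assert((size_t)(skb->data - skb->buf) >= len);
    skb->data -= len;
    return skb->data;
}

/* Number of valid bytes currently held (headers plus payload). */
size_t mini_skb_len(const struct mini_skb *skb)
{
    return (size_t)(skb->tail - skb->data);
}
```

A sender would typically reserve headroom, put the payload, and then push the transport, network, and link headers in that order while recording th, iph, and mac, so each layer prepends its header without copying the payload.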
Device-independent interface

Below the protocols layer is another independent interface layer, which connects the protocols to a variety of hardware device drivers with varying capabilities. This layer provides a common set of functions that lower-level network device drivers use to operate with the higher-level protocol stack.

First, device drivers register or unregister themselves with the kernel by calling register_netdevice or unregister_netdevice. The caller first fills out the net_device structure and then passes it in for registration. The kernel calls its init function (if one is defined), performs a number of sanity checks, creates a sysfs entry, and then adds the new device to the device list (a linked list of the devices active in the kernel). You can find the net_device structure in linux/include/linux/netdevice.h. The functions themselves are implemented in linux/net/core/dev.c.

To send an sk_buff from the protocol layer to a device, the dev_queue_xmit function is used. This function queues the sk_buff for eventual transmission by the underlying device driver (the network device is the one referenced by the net_device, that is, sk_buff->dev, in the sk_buff). The dev structure contains a method called hard_start_xmit, which holds the driver function that initiates transmission of an sk_buff.

Packet reception is conventionally performed with netif_rx. When a lower-level device driver receives a packet (contained within an allocated sk_buff), it passes the sk_buff up to the network layer by calling netif_rx. This function then queues the sk_buff to an upper-layer protocol's queue for later processing through netif_rx_schedule. You can find the dev_queue_xmit and netif_rx functions in linux/net/core/dev.c.

Recently, a new application program interface (NAPI) was introduced into the kernel to allow drivers to interface with the device-independent layer (dev). Some drivers use NAPI, but the large majority still use the older frame reception interface (by a rough ratio of six to one). NAPI can yield better performance under high load because it avoids taking an interrupt for each incoming frame.
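The registration and transmit dispatch described in this section can be sketched as a small user space model. Everything here is hypothetical and simplified: the mini_ names stand in for net_device, register_netdevice, and dev_queue_xmit, the sanity check is reduced to a single test, and demo_xmit is an invented driver routine that only counts packets.

```c
#include <assert.h>
#include <stddef.h>

struct mini_netdev;

/* Hypothetical packet stand-in; dev plays the role of sk_buff->dev. */
struct mini_pkt {
    struct mini_netdev *dev;
    size_t len;
};

/* Hypothetical, much-reduced stand-in for struct net_device. */
struct mini_netdev {
    const char *name;
    int (*init)(struct mini_netdev *dev);  /* optional */
    int (*hard_start_xmit)(struct mini_pkt *pkt, struct mini_netdev *dev);
    unsigned long tx_packets;
    struct mini_netdev *next;              /* link in the device list */
};

static struct mini_netdev *mini_dev_list;  /* registered devices */

/* Runs the optional init hook, checks the device, and adds it to the list. */
int mini_register_netdevice(struct mini_netdev *dev)
{
    if (dev == NULL || dev->hard_start_xmit == NULL)
        return -1;                         /* sanity check failed */
    if (dev->init != NULL && dev->init(dev) != 0)
        return -1;
    dev->next = mini_dev_list;
    mini_dev_list = dev;
    return 0;
}

/* Hands the packet to whichever driver owns the packet's device. */
int mini_dev_queue_xmit(struct mini_pkt *pkt)
{
    assert(pkt != NULL);
    if (pkt->dev == NULL)
        return -1;
    return pkt->dev->hard_start_xmit(pkt, pkt->dev);
}

/* Invented driver transmit routine: it only counts the packet. */
int demo_xmit(struct mini_pkt *pkt, struct mini_netdev *dev)
{
    (void)pkt;
    dev->tx_packets++;
    return 0;
}
```

The protocol side only ever calls mini_dev_queue_xmit; which driver routine runs is decided by the device recorded in the packet, which is what keeps this layer device-independent.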
Device drivers

At the bottom of the network stack are the device drivers that manage the physical network devices. Examples of devices at this layer include the SLIP driver over a serial interface and an Ethernet driver over an Ethernet device.

At initialization time, a device driver allocates a net_device structure and then initializes it with its necessary routines. One of these routines, dev->hard_start_xmit, defines how the upper layer should enqueue an sk_buff for transmission. This routine takes an sk_buff. Its operation depends on the underlying hardware, but commonly the packet described by the sk_buff is moved to a hardware ring or queue. Frame reception, as described in the device-independent layer, uses the netif_rx interface, or netif_receive_skb for a NAPI-compliant network driver. A NAPI driver places constraints on the capabilities of the underlying hardware. For more information, see the references.

After a device driver configures its interfaces in the dev structure, a call to register_netdevice makes the device available for use. You can find the drivers specific to network devices in linux/drivers/net.
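The "hardware ring or queue" mentioned above can be pictured as a fixed-size circular buffer that the driver fills and the hardware drains. The sketch below is a generic user space ring, not code from any real driver; tx_ring, its depth of four slots, and both function names are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define TX_RING_SIZE 4  /* hypothetical ring depth */

/* Hypothetical transmit ring shared by a driver and its device. */
struct tx_ring {
    const void *slots[TX_RING_SIZE];
    size_t head;   /* next slot the driver fills */
    size_t tail;   /* next slot the hardware completes */
    size_t count;  /* slots currently in flight */
};

/* Driver side: queues a packet, or returns -1 if the ring is full so
 * the caller can hold the packet back and retry later. */
int tx_ring_enqueue(struct tx_ring *r, const void *pkt)
{
    assert(r != NULL);
    if (r->count == TX_RING_SIZE)
        return -1;
    r->slots[r->head] = pkt;
    r->head = (r->head + 1) % TX_RING_SIZE;
    r->count++;
    return 0;
}

/* Completion side: returns the oldest queued packet, or NULL if none. */
const void *tx_ring_complete(struct tx_ring *r)
{
    assert(r != NULL);
    if (r->count == 0)
        return NULL;
    const void *pkt = r->slots[r->tail];
    r->tail = (r->tail + 1) % TX_RING_SIZE;
    r->count--;
    return pkt;
}
```

A zero-initialized struct tx_ring is an empty ring. A transmit routine in the style of hard_start_xmit would call something like tx_ring_enqueue and report a busy condition to the upper layer when it returns -1.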
Going further

The Linux source code is a great way to learn about the design of device drivers for many types of devices, including network device drivers. There you will find variations in design and in the use of the available kernel APIs, but each one is useful either as instruction or as a starting point for a new driver. The rest of the code in the network stack is common and is used as-is unless you require a new protocol. Even then, the implementations of TCP (for a streaming protocol) or UDP (for a message-based protocol) serve as useful models for starting a new development.