Linux Network Stack Analysis

From Socket to Device Driver

Level: elementary

M. Tim Jones (mtj@mtjones.com), Consultant Engineer, Emulex

July 16, 2007

One of the greatest features of Linux is its network stack. It was originally derived from the BSD network stack and is well organized, with a clean set of interfaces. These interfaces range from protocol-independent layers (such as the general socket layer interface or the device layer) to the specific interfaces of individual network protocols. This article explores the structure of the Linux network stack from the perspective of its layers and introduces some of its main structures.

Protocol Introduction

Although formal introductions to networking generally refer to the OSI (Open Systems Interconnection) model, this article describes the basic network stack in Linux using the four-layer Internet model (see Figure 1).

Figure 1. Internet model of the network stack

At the bottom of this stack is the link layer. The link layer refers to the device drivers that provide access to the physical layer, which can be any of various media, such as serial links or Ethernet devices. Above the link layer is the network layer, which is responsible for directing packets to their destinations. The next layer, called the transport layer, is responsible for end-to-end communication (for example, within a host). While the network layer manages communication between hosts, the transport layer manages communication between endpoints within those hosts. The last layer is the application layer, which is usually the semantic layer that understands the data being transmitted. For example, Hypertext Transfer Protocol (HTTP) carries requests and responses for web content between a server and a client.

In practice, each layer of the network stack goes by better-known names. On the link layer you find Ethernet, the most commonly used high-speed medium. Older link layer protocols include serial protocols such as SLIP (Serial Line Internet Protocol), CSLIP (Compressed SLIP), and PPP (Point-to-Point Protocol). The most common network layer protocol is IP (Internet Protocol), but other protocols at the network layer address other needs, such as ICMP (Internet Control Message Protocol) and ARP (Address Resolution Protocol). On the transport layer are TCP (Transmission Control Protocol) and UDP (User Datagram Protocol). Finally, the application layer contains many familiar protocols, including the standard web protocol HTTP and the e-mail protocol SMTP (Simple Mail Transfer Protocol).
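
The layering above has a direct arithmetic consequence: every layer adds its own header, so the payload available to an application shrinks as a packet moves down the stack. The following is a small sketch of that arithmetic. It assumes the minimum IPv4 and TCP header sizes (no options) and the standard Ethernet MTU of 1500 bytes; the TOY_ constants and function names are made up for this illustration and are not kernel identifiers.

```c
/* Header sizes in bytes. IPv4 and TCP values are the minimum sizes,
 * that is, headers without options. All names here are illustrative. */
enum {
    TOY_ETH_HEADER_LEN  = 14,
    TOY_IPV4_HEADER_LEN = 20,
    TOY_TCP_HEADER_LEN  = 20,
    TOY_UDP_HEADER_LEN  = 8,
    TOY_ETH_MTU         = 1500   /* largest IP packet in one Ethernet frame */
};

/* Largest TCP payload that fits in one IP packet of size mtu. */
int toy_max_tcp_payload(int mtu)
{
    return mtu - TOY_IPV4_HEADER_LEN - TOY_TCP_HEADER_LEN;
}

/* Largest UDP payload that fits in one IP packet of size mtu. */
int toy_max_udp_payload(int mtu)
{
    return mtu - TOY_IPV4_HEADER_LEN - TOY_UDP_HEADER_LEN;
}

/* Bytes occupied by the Ethernet header plus the IP packet it carries. */
int toy_frame_len(int ip_packet_len)
{
    return TOY_ETH_HEADER_LEN + ip_packet_len;
}
```

With the standard MTU, toy_max_tcp_payload(TOY_ETH_MTU) gives 1460 bytes and toy_max_udp_payload(TOY_ETH_MTU) gives 1472 bytes.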



Core Network Architecture

Now let's turn to the architecture of the Linux network stack and how it implements the Internet model. Figure 2 provides a high-level view of the Linux network stack. At the top is the user space layer, or application layer, which defines the users of the network stack. At the bottom are the physical devices, which provide connectivity to the networks (serial ports or high-speed networks such as Ethernet). In the middle, in kernel space, is the network subsystem, which is the focus of this article. Flowing through the interior of the network stack are socket buffers (sk_buffs), which carry packet data between sources and sinks. You will see the sk_buff structure shortly.

Figure 2. High-level architecture of the Linux network stack

First, let's take a quick look at the core elements of the Linux network subsystem, which are described in more detail in the sections that follow. At the top (see Figure 2) is the system call interface. It gives user space applications a way to access the kernel's network subsystem. Next is a protocol-independent layer that provides a common way to work with the underlying transport layer protocols. Next are the actual protocols, which in Linux include the built-in protocols TCP, UDP, and IP. Then comes another protocol-independent layer that provides a common interface to and from the individual device drivers. The device drivers themselves are at the bottom.



System Call Interface

The system call interface can be described from two perspectives. When a user makes a network call, it is multiplexed through the system call interface into the kernel. It ends up as a call to sys_socketcall in ./net/socket.c, which then demultiplexes the call to its intended target. The other perspective is the use of ordinary file operations for network I/O. For example, typical read and write operations can be performed on a network socket (a socket is represented by a file descriptor, just like an ordinary file). Therefore, although many operations are specific to networking (creating a socket with the socket call, connecting it to a destination with the connect call, and so on), a number of standard file operations apply to network objects just as they do to ordinary files. Finally, the system call interface provides a means of transferring control between the user space application and the kernel.
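
The second perspective can be illustrated from user space. The following is a minimal sketch, assuming a POSIX system. It creates a connected pair of sockets with socketpair (chosen here only so that the example needs no network and no peer) and then moves data with the ordinary write and read calls. The function name echo_through_socket is made up for this illustration.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Writes msg into one end of a connected socket pair with write() and
 * reads it back from the other end with read(). Returns the number of
 * bytes read into out, or -1 on error. Illustrative helper only. */
ssize_t echo_through_socket(const char *msg, char *out, size_t outlen)
{
    int fds[2];
    ssize_t n = -1;
    size_t len = strlen(msg);

    /* Network-specific call: creates two connected sockets. */
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) < 0)
        return -1;

    /* Generic file operations: the descriptors behave like any other fd. */
    if (write(fds[0], msg, len) == (ssize_t)len)
        n = read(fds[1], out, outlen);

    close(fds[0]);
    close(fds[1]);
    return n;
}
```

Only the socketpair call is specific to networking. The write, read, and close calls are the same ones used for ordinary files.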



Protocol-Independent Interface

The socket layer is a protocol-independent interface that provides a set of common functions to support a variety of protocols. The socket layer supports not only the typical TCP and UDP protocols, but also IP, raw Ethernet, and other transport protocols, such as SCTP (Stream Control Transmission Protocol).

Communication through the network stack takes place through a socket. In Linux, the socket structure is struct sock, which is defined in linux/include/net/sock.h. This large structure contains all of the state required by a particular socket, including the specific protocol used by the socket and the operations that can be performed on it.

The network subsystem learns about the available protocols through a special structure that defines their capabilities. Each protocol maintains a structure called proto (found in linux/include/net/sock.h). This structure defines the specific socket operations that can be performed from the socket layer to the transport layer (for example, how to create a socket, how to establish a connection with a socket, and how to close a socket).
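
The idea of a per-protocol operations structure can be sketched in user space as a table of function pointers. The code below is not the kernel's definition of struct sock or struct proto. Every toy_ name is hypothetical and the structures are reduced to a few fields, purely to show how a protocol-independent entry point can dispatch to protocol-specific code.

```c
#include <stddef.h>

enum { TOY_CLOSED = 0, TOY_CONNECTED = 1 };

struct toy_sock;

/* Hypothetical, heavily simplified stand-in for struct proto: a name plus
 * the operations the socket layer may invoke on this protocol. */
struct toy_proto {
    const char *name;
    int  (*connect)(struct toy_sock *sk);
    void (*close)(struct toy_sock *sk);
};

/* Hypothetical stand-in for struct sock: per-socket state and a pointer
 * to the protocol that implements the socket's operations. */
struct toy_sock {
    int state;
    int handshake_segments;        /* segments exchanged by connect */
    const struct toy_proto *prot;
};

/* A stream protocol such as TCP performs a three-segment handshake. */
static int toy_tcp_connect(struct toy_sock *sk)
{
    sk->handshake_segments = 3;
    sk->state = TOY_CONNECTED;
    return 0;
}

/* A datagram protocol such as UDP only records the peer; nothing is sent. */
static int toy_udp_connect(struct toy_sock *sk)
{
    sk->handshake_segments = 0;
    sk->state = TOY_CONNECTED;
    return 0;
}

static void toy_generic_close(struct toy_sock *sk)
{
    sk->state = TOY_CLOSED;
}

const struct toy_proto toy_tcp = { "TCP", toy_tcp_connect, toy_generic_close };
const struct toy_proto toy_udp = { "UDP", toy_udp_connect, toy_generic_close };

/* Protocol-independent entry points: the caller never names the protocol. */
int toy_sock_connect(struct toy_sock *sk) { return sk->prot->connect(sk); }
void toy_sock_close(struct toy_sock *sk)  { sk->prot->close(sk); }
```

A caller that holds a struct toy_sock uses toy_sock_connect and toy_sock_close in the same way whichever protocol backs the socket, which is the role the socket layer plays for the real proto structure.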



Network Protocols

The network protocols section defines the specific network protocols that are available (such as TCP and UDP). They are initialized at startup in a function named inet_init in linux/net/ipv4/af_inet.c (because both TCP and UDP are part of the inet family of protocols). The inet_init function registers each of the built-in protocols using the proto_register function. This function is defined in linux/net/core/sock.c. In addition to adding the protocol to the list of active protocols, it can also allocate one or more slab caches if needed.

You can see how each protocol identifies itself through its proto structure in the tcp_ipv4.c, udp.c, and raw.c files in the linux/net/ipv4/ directory. Each of these protocol structures is mapped, by type and protocol, into inetsw_array, which maps the built-in protocols to their operations. The structure of inetsw_array and its relationships are shown in Figure 3. Each protocol in this array is initialized into inetsw at startup by a call to inet_register_protosw from inet_init. The inet_init function also initializes the various inet modules, such as the ARP, ICMP, and IP modules, and the TCP and UDP modules.
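
The mapping from type and protocol to a registered protocol can be sketched as a small table with a lookup function. This is a user-space model under simplifying assumptions, not the kernel code: the toy_ names are hypothetical, the table is a fixed array rather than the kernel's lists, and a protocol value of 0 is treated simply as "the first entry of this type".

```c
#include <stddef.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical, reduced version of one protocol switch entry. */
struct toy_protosw {
    int type;              /* for example SOCK_STREAM */
    int protocol;          /* for example IPPROTO_TCP */
    const char *name;
};

#define TOY_MAX_PROTOS 8
static struct toy_protosw toy_inetsw[TOY_MAX_PROTOS];
static size_t toy_inetsw_len;

/* Models inet_register_protosw: adds one entry to the switch table. */
int toy_register_protosw(int type, int protocol, const char *name)
{
    if (toy_inetsw_len == TOY_MAX_PROTOS)
        return -1;
    toy_inetsw[toy_inetsw_len++] = (struct toy_protosw){ type, protocol, name };
    return 0;
}

/* Models the startup registration of the built-in inet protocols. */
void toy_inet_init(void)
{
    toy_inetsw_len = 0;
    toy_register_protosw(SOCK_STREAM, IPPROTO_TCP, "TCP");
    toy_register_protosw(SOCK_DGRAM,  IPPROTO_UDP, "UDP");
    toy_register_protosw(SOCK_RAW,    IPPROTO_IP,  "RAW");
}

/* Finds the entry for (type, protocol). A protocol of 0 selects the first
 * entry registered for that type. Returns NULL if nothing matches. */
const struct toy_protosw *toy_lookup(int type, int protocol)
{
    for (size_t i = 0; i < toy_inetsw_len; i++) {
        if (toy_inetsw[i].type != type)
            continue;
        if (protocol == 0 || toy_inetsw[i].protocol == protocol)
            return &toy_inetsw[i];
    }
    return NULL;
}
```

After toy_inet_init runs, toy_lookup(SOCK_STREAM, 0) finds the TCP entry, while an inconsistent request such as toy_lookup(SOCK_STREAM, IPPROTO_UDP) finds nothing.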

Figure 3. Internet Protocol Array Structure

Relationship between socket protocols
Recall that when creating a socket, you specify the type and protocol, for example my_sock = socket( AF_INET, SOCK_STREAM, 0 ). AF_INET indicates the Internet address family, and SOCK_STREAM selects a stream socket (as shown here in inetsw_array).

Note that in Figure 3, the proto structure defines the transport-specific methods, while the proto_ops structure defines the general socket methods. Other protocols can be added to the inetsw protocol switch by calling inet_register_protosw. For example, SCTP adds itself through a call from sctp_init. For more information about SCTP, see the references.

Data movement for sockets uses a core structure called the socket buffer (sk_buff). An sk_buff contains packet data as well as state data that spans multiple layers of the protocol stack. Each packet sent or received is represented by an sk_buff. The sk_buff structure is defined in linux/include/linux/skbuff.h and is shown in Figure 4.

Figure 4. Socket buffer and its relationship with other structures

As shown in Figure 4, multiple sk_buffs can be chained together for a given connection. Each sk_buff identifies the device structure (net_device) to which the packet is being sent or from which it was received. Because each packet is represented by an sk_buff, the packet headers can be located conveniently through a set of pointers (th, iph, and mac, for the media access control, or MAC, header). Because sk_buffs are central to socket data management, a number of support functions have been created to manage them. These include functions to create and destroy sk_buff structures, to clone them, and to manage them in queues.
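
How one buffer can serve several protocol layers can be sketched with a small user-space model. The idea is to reserve headroom in front of the payload so that each lower layer can prepend its header without copying the data. The structure and function names below (all prefixed toy_) are hypothetical and far simpler than the real sk_buff; only the pointer manipulation is meant to be illustrative.

```c
#include <stddef.h>
#include <string.h>

#define TOY_SKB_SIZE 128

/* Hypothetical, much reduced stand-in for an sk_buff. */
struct toy_skb {
    unsigned char buf[TOY_SKB_SIZE];
    unsigned char *data;     /* start of the valid bytes */
    unsigned char *tail;     /* one past the end of the valid bytes */
    struct toy_skb *next;    /* lets buffers be chained per connection */
};

/* Empties the buffer and reserves headroom bytes for headers to come. */
int toy_skb_init(struct toy_skb *skb, size_t headroom)
{
    if (headroom > TOY_SKB_SIZE)
        return -1;
    skb->data = skb->tail = skb->buf + headroom;
    skb->next = NULL;
    return 0;
}

/* Appends len bytes of payload at the tail. Returns NULL if it won't fit. */
unsigned char *toy_skb_put(struct toy_skb *skb, const void *src, size_t len)
{
    unsigned char *old_tail = skb->tail;

    if (len > (size_t)(skb->buf + TOY_SKB_SIZE - skb->tail))
        return NULL;
    memcpy(old_tail, src, len);
    skb->tail += len;
    return old_tail;
}

/* Prepends a header of len bytes in the headroom. The returned pointer
 * is where that layer's header starts, like a header pointer. */
unsigned char *toy_skb_push(struct toy_skb *skb, const void *hdr, size_t len)
{
    if (len > (size_t)(skb->data - skb->buf))
        return NULL;
    skb->data -= len;
    memcpy(skb->data, hdr, len);
    return skb->data;
}

/* Number of valid bytes currently in the buffer. */
size_t toy_skb_len(const struct toy_skb *skb)
{
    return (size_t)(skb->tail - skb->data);
}
```

A sender would call toy_skb_init with enough headroom, add the payload with toy_skb_put, and then let each layer call toy_skb_push on the way down. The payload bytes never move; only the data pointer does.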

Socket buffers are designed to be linked together for a given socket and carry a wealth of information, including links to the protocol headers, a timestamp (when the packet was sent or received), and the device associated with the packet.



Device-Independent Interface

Below the protocol layer is another independent interface layer, which connects the protocols to a variety of hardware device drivers with differing capabilities. This layer provides a common set of functions that lower-level network device drivers use to operate with the higher-level protocol stack.

First, a device driver can register or unregister itself with the kernel by calling register_netdevice or unregister_netdevice. The caller first fills in the net_device structure and then passes it in for registration. The kernel calls the device's init function (if one is defined), performs a number of sanity checks, creates a sysfs entry, and then adds the new device to the device list (a linked list of the devices active in the kernel). You can find the net_device structure in linux/include/linux/netdevice.h. These functions are implemented in linux/net/core/dev.c.
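
The registration sequence described above (optional init call, sanity checks, insertion into a linked list) can be modeled in a few lines of user-space C. This is only a sketch: the toy_ names are hypothetical, the single sanity check shown (a non-empty, unique name) is a stand-in for the kernel's checks, and the sysfs step is omitted.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, reduced stand-in for net_device. */
struct toy_netdev {
    char name[16];
    int (*init)(struct toy_netdev *dev);   /* optional, may be NULL */
    int mtu;
    struct toy_netdev *next;
};

static struct toy_netdev *toy_dev_base;    /* head of the device list */

/* Example init routine a driver might supply; sets the Ethernet MTU. */
int toy_eth_init(struct toy_netdev *dev)
{
    dev->mtu = 1500;
    return 0;
}

struct toy_netdev *toy_dev_get_by_name(const char *name)
{
    for (struct toy_netdev *d = toy_dev_base; d; d = d->next)
        if (strcmp(d->name, name) == 0)
            return d;
    return NULL;
}

/* Calls init if defined, checks the name, then links the device in. */
int toy_register_netdevice(struct toy_netdev *dev)
{
    if (dev->init && dev->init(dev) != 0)
        return -1;
    if (dev->name[0] == '\0' || toy_dev_get_by_name(dev->name))
        return -1;
    dev->next = toy_dev_base;
    toy_dev_base = dev;
    return 0;
}

/* Unlinks the device from the list if it is present. */
void toy_unregister_netdevice(struct toy_netdev *dev)
{
    for (struct toy_netdev **pp = &toy_dev_base; *pp; pp = &(*pp)->next) {
        if (*pp == dev) {
            *pp = dev->next;
            dev->next = NULL;
            return;
        }
    }
}
```

Registering a second device with a name that is already in the list fails in this model, which mirrors the purpose of the sanity checks: the device list must stay consistent before a device becomes visible to the protocol layers.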

To send an sk_buff from the protocol layer to a device, use the dev_queue_xmit function. This function enqueues an sk_buff for eventual transmission by the underlying device driver (the network device is the one identified by the net_device, or sk_buff->dev, reference in the sk_buff). The dev structure contains a method called hard_start_xmit, which holds the driver function that initiates transmission of an sk_buff.
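
The relationship between the generic transmit entry point and the driver's routine can be sketched as a queue plus a function pointer. The model below is hypothetical (all toy_ names are invented, and real queueing disciplines, locking, and error handling are left out). It shows only that the packet names its device and that the device supplies the function that actually transmits.

```c
#include <stddef.h>

struct toy_dev;

/* Hypothetical packet: carries a reference to its outgoing device,
 * in the spirit of sk_buff->dev. */
struct toy_pkt {
    int id;
    struct toy_dev *dev;
    struct toy_pkt *next;
};

/* Hypothetical device: a transmit routine plus a FIFO of waiting packets. */
struct toy_dev {
    int (*hard_start_xmit)(struct toy_pkt *pkt, struct toy_dev *dev);
    struct toy_pkt *q_head, *q_tail;
    int last_sent_id;
    int tx_packets;
};

/* Example driver routine: records the packet as sent. Returns 0 on success. */
int toy_driver_xmit(struct toy_pkt *pkt, struct toy_dev *dev)
{
    dev->last_sent_id = pkt->id;
    dev->tx_packets++;
    return 0;
}

/* Models dev_queue_xmit: enqueue on the packet's device, then hand queued
 * packets to the driver until the queue is empty or the driver is busy. */
int toy_dev_queue_xmit(struct toy_pkt *pkt)
{
    struct toy_dev *dev = pkt->dev;

    pkt->next = NULL;
    if (dev->q_tail)
        dev->q_tail->next = pkt;
    else
        dev->q_head = pkt;
    dev->q_tail = pkt;

    while (dev->q_head) {
        struct toy_pkt *p = dev->q_head;

        if (dev->hard_start_xmit(p, dev) != 0)
            break;                      /* driver busy: leave it queued */
        dev->q_head = p->next;
        if (!dev->q_head)
            dev->q_tail = NULL;
    }
    return 0;
}
```

The protocol layer in this model calls only toy_dev_queue_xmit. Which driver function runs is decided entirely by the device the packet points to.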

Packet reception is conventionally performed with netif_rx. When a lower-level device driver receives a packet (contained within an allocated sk_buff), it passes the sk_buff up to the network layer by calling netif_rx. This function then uses netif_rx_schedule to queue the sk_buff on an upper-layer protocol's queue for later processing. You can find the dev_queue_xmit and netif_rx functions in linux/net/core/dev.c.

Recently, a new application programming interface (NAPI) was introduced into the kernel to let drivers interface with the device-independent layer (dev). Some drivers use NAPI, but most drivers still use the older frame reception interface (by a ratio of about six to one). NAPI can yield better performance under high load because it avoids taking an interrupt for each incoming frame.
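
The benefit under load can be illustrated with a toy model of interrupt mitigation. It is a sketch of the general technique (take one interrupt, mask further interrupts, then poll with a budget), not the kernel's NAPI implementation. The toy_ names and the rule for re-enabling interrupts are simplifying assumptions.

```c
/* Hypothetical NIC state for a polling receive model. */
struct toy_nic {
    int rx_pending;    /* frames waiting in the hardware ring */
    int irq_enabled;   /* 1 if the device may raise an interrupt */
    int interrupts;    /* interrupts taken so far */
    int delivered;     /* frames passed up the stack */
};

/* A frame arrives. Only the first frame of a burst raises an interrupt;
 * the handler masks further interrupts and leaves the work to the poller. */
void toy_frame_arrives(struct toy_nic *nic)
{
    nic->rx_pending++;
    if (nic->irq_enabled) {
        nic->interrupts++;
        nic->irq_enabled = 0;
    }
}

/* Processes up to budget frames. Interrupts are re-enabled only once the
 * ring has been drained. Returns the number of frames processed. */
int toy_poll(struct toy_nic *nic, int budget)
{
    int done = 0;

    while (done < budget && nic->rx_pending > 0) {
        nic->rx_pending--;
        nic->delivered++;
        done++;
    }
    if (nic->rx_pending == 0)
        nic->irq_enabled = 1;
    return done;
}
```

In this model a burst of ten frames costs one interrupt followed by three polls with a budget of four, whereas an interface that interrupts on every frame would take ten.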



Device Driver

At the bottom of the network stack are the device drivers that manage the physical network devices. Examples of devices at this layer include the SLIP driver over a serial interface and the Ethernet driver over an Ethernet device.

At initialization, a device driver allocates a net_device structure and then initializes it with the routines it requires. One of these routines, dev->hard_start_xmit, defines how the upper layer should enqueue an sk_buff for transmission. This routine takes an sk_buff as its parameter. Its operation depends on the underlying hardware, but usually the packet described by the sk_buff is moved to a hardware ring or queue. As described for the device-independent layer, frame reception uses the netif_rx interface, or netif_receive_skb for a NAPI-compliant network driver. A NAPI driver places constraints on the capabilities of the underlying hardware. For more information, see the references.
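
Moving a packet to a hardware ring can be sketched with a fixed-size circular buffer and two indices. The model is hypothetical: the toy_ names, the ring size, and the return codes are assumptions, and real drivers deal with DMA descriptors and interrupts rather than plain pointers.

```c
#include <stddef.h>

#define TOY_RING_SIZE 4   /* must be a power of two for the index math */

/* Hypothetical transmit ring shared by driver and (simulated) hardware. */
struct toy_tx_ring {
    const void *slots[TOY_RING_SIZE];
    unsigned int head;    /* next slot the driver fills */
    unsigned int tail;    /* next slot the hardware completes */
};

/* Driver transmit routine: places the packet in the ring.
 * Returns 0 if queued, 1 if the ring is full and the caller must retry. */
int toy_hard_start_xmit(struct toy_tx_ring *ring, const void *pkt)
{
    if (ring->head - ring->tail == TOY_RING_SIZE)
        return 1;
    ring->slots[ring->head % TOY_RING_SIZE] = pkt;
    ring->head++;
    return 0;
}

/* Simulated completion: the hardware has sent the oldest packet.
 * Returns that packet, or NULL if the ring is empty. */
const void *toy_tx_complete(struct toy_tx_ring *ring)
{
    const void *pkt;

    if (ring->head == ring->tail)
        return NULL;
    pkt = ring->slots[ring->tail % TOY_RING_SIZE];
    ring->tail++;
    return pkt;
}
```

The "ring full" result is the reason a transmit routine needs a way to tell the upper layer to hold its packets: slots become free again only when the hardware completes earlier transmissions.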

After a device driver has configured its interfaces in the dev structure, a call to register_netdevice makes the device available for use. You can find the drivers specific to network devices in linux/drivers/net.



Outlook

The Linux source code is an excellent way to learn about the design of device drivers for many device types, including network device drivers. There you will find variations in design and in the use of the available kernel APIs, and each one is useful either as instruction or as a starting point for a new device driver. The rest of the code in the network stack is common and is used as is unless a new protocol is required. Even then, the implementations of TCP (for a stream protocol) and UDP (for a message-based protocol) serve as useful models for starting a new development.

 
