This article describes in detail how network packets are encapsulated and decapsulated with sk_buff.
The sk_buff struct is the core of the Linux network protocol stack, and almost every operation revolves around it. Its importance is comparable to that of BSD's mbuf (I have read "TCP/IP Illustrated, Volume 2"). What is sk_buff?
sk_buff is the network packet itself plus the metadata used to operate on it.
The simplest way to understand sk_buff is to encapsulate a data frame down to the Ethernet layer yourself, based on your understanding of the network protocol stack, and send it successfully. I personally think this is much better than reading code, reading documents, or searching for material online. Of course, there are already many articles on this topic, but I find many of them too complicated. They drill down into every pointer field of the sk_buff struct, with illustrations, yet they generally do not go beyond the book "Understanding Linux Network Internals". What happens if a field is added or renamed in a later kernel version? Will these articles, including the classic "ULN", still be helpful?
Therefore, this article does not go deep into the details of sk_buff. I believe this simple method will let me, even years from now when I have forgotten what the Linux protocol stack looks like, instantly understand how Linux encapsulates packets through sk_buff. We start with the layered network model.
The layered network model is the essence of everything. The network is designed as a set of layered components, so network operations can be described as a "stack", which is where the name "network protocol stack" comes from. In practice, forming a packet is a layer-by-layer encapsulation process that builds one contiguous piece of data, which we can call a layer-by-layer push operation. Similarly, packet decapsulation can be regarded as a layer-by-layer pop operation.
To form a final packet, that is, an Ethernet frame (ignoring other link layers), sk_buff goes through the following operations:
1. Allocate an skb struct
2. Allocate the data area for the packet
3. Locate the start position of the application layer in the skb data area
4. Copy data to the application layer (assuming that the application layer protocol is not encapsulated on the socket interface)
5. Locate the start position of the transport layer in the skb data area
6. Set the transport layer header fields
7. Locate the start position of the IP layer in the skb data area
8. Set the IP layer header fields
9. Locate the start position of the Ethernet header in the skb data area
10. Set the Ethernet header fields
As you can see, the basic pattern is "locate, then set". The one difference is that application layer operations are generally completed on the socket interface, but since this article describes skb operations in general, we will not make that distinction.
The list above gives the core skb operations. At the interface level, they map to the following calls.
1. skb allocation (alloc_skb). Several interfaces perform this task and form an interface family, but alloc_skb is the most basic one.
The alloc_skb interface does two things: it allocates the skb struct and the skb packet buffer, and it sets their initial values. The size parameter is the size of the skb packet buffer, which must cover the sum of all layers. When the function returns successfully, you have an empty packet buffer of that size plus the skb metadata that operates on it. At this point the head, data, and tail pointers all point to the start of the buffer, and the end pointer marks the end of the buffer.
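The state after allocation can be sketched in userspace C. This is a toy model, not kernel code: `struct toy_skb`, `toy_alloc`, `toy_free`, `toy_headroom`, and `toy_tailroom` are hypothetical names that only mirror the four buffer pointers described above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace model of the skb buffer pointers (not the kernel struct). */
struct toy_skb {
    unsigned char *head;  /* start of the allocated buffer */
    unsigned char *data;  /* start of the packet data      */
    unsigned char *tail;  /* end of the packet data        */
    unsigned char *end;   /* end of the allocated buffer   */
    size_t len;           /* bytes between data and tail   */
};

/* Allocate the metadata and the packet buffer, then set the initial values. */
struct toy_skb *toy_alloc(size_t size)
{
    struct toy_skb *skb = malloc(sizeof(*skb));
    if (!skb)
        return NULL;
    skb->head = malloc(size);
    if (!skb->head) {
        free(skb);
        return NULL;
    }
    skb->data = skb->head;        /* empty packet: data == tail == head */
    skb->tail = skb->head;
    skb->end = skb->head + size;
    skb->len = 0;
    return skb;
}

void toy_free(struct toy_skb *skb)
{
    free(skb->head);
    free(skb);
}

size_t toy_headroom(const struct toy_skb *skb)
{
    return (size_t)(skb->data - skb->head);
}

size_t toy_tailroom(const struct toy_skb *skb)
{
    return (size_t)(skb->end - skb->tail);
}
```

Right after allocation the model has no headroom at all, and the whole buffer counts as tailroom. That is why the next step has to move the pointers before any header can be pushed.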
2. Initial positioning (skb_reserve). The key to layer-by-layer encapsulation in an skb is positioning the write pointer, that is, deciding where each layer starts to write. Given the picture of protocol encapsulation as pushing onto a stack, this positioning has to follow a regular order. The initial positioning is the important one; every later positioning is routine. The initial position is, of course, the end of the application layer data. From there, the protocol headers are pushed layer by layer into the skb packet buffer. After the initial positioning, the data and tail pointers both sit at that position, and everything in front of it is headroom.
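A minimal userspace sketch of this initial positioning follows. The names `toy_skb`, `toy_init`, and `toy_reserve` are hypothetical, and the asserts are a choice of this model rather than something the kernel does.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model of the skb buffer pointers. */
struct toy_skb {
    unsigned char *head, *data, *tail, *end;
    size_t len;
};

/* Wrap a caller-provided buffer as an empty packet. */
void toy_init(struct toy_skb *skb, unsigned char *buf, size_t size)
{
    skb->head = skb->data = skb->tail = buf;
    skb->end = buf + size;
    skb->len = 0;
}

/* Model of skb_reserve: move data and tail together on an empty buffer,
 * which turns the first n bytes into headroom. */
void toy_reserve(struct toy_skb *skb, size_t n)
{
    assert(skb->len == 0);                        /* only valid while empty */
    assert(n <= (size_t)(skb->end - skb->tail));  /* must fit in the buffer */
    skb->data += n;
    skb->tail += n;
}
```

For a UDP packet with 100 bytes of payload, the reserve length would be 2 + 14 + 20 + 8 + 100 = 144 bytes: two bytes of alignment, then the Ethernet, IPv4 (without options), and UDP headers, then the payload.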
3. Copy the application layer data (skb_push/copy). After the skb is allocated, you need to locate the "lowest position" of the protocol "stack" in the packet. This is the initial position, from which the data or protocol header of each layer can be pushed onto the stack, and it is set by skb_reserve. After skb_reserve, the write pointer sits at the end of the application layer area, so you need skb_push to move it to the beginning of that area, which amounts to pushing the application layer stack frame.
The skb_push interface pushes one protocol stack frame onto the protocol stack. It returns a position, the write pointer into the skb packet buffer, which tells the caller: start encapsulating here according to your own logic. How many bytes may be written? That is given by the parameter n of skb_push. After the push for the application layer, the data pointer has moved n bytes toward the head of the buffer.
Once the application layer stack frame has been pushed, you can write n bytes of application layer data contiguously, starting at the write pointer. Generally, the data comes from the socket.
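The push-then-write pattern can be sketched in userspace as below. `toy_push` and `toy_push_payload` are hypothetical helpers, and the model asserts where the kernel would panic on a headroom underrun.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace model of the skb buffer pointers. */
struct toy_skb {
    unsigned char *head, *data, *tail, *end;
    size_t len;
};

/* Wrap a buffer and reserve `reserve` bytes of headroom in one step. */
void toy_init(struct toy_skb *skb, unsigned char *buf, size_t size, size_t reserve)
{
    assert(reserve <= size);
    skb->head = buf;
    skb->data = skb->tail = buf + reserve;
    skb->end = buf + size;
    skb->len = 0;
}

/* Model of skb_push: move data n bytes toward head and return the new
 * write position. The caller then fills in exactly n bytes. */
unsigned char *toy_push(struct toy_skb *skb, size_t n)
{
    assert(n <= (size_t)(skb->data - skb->head));  /* needs enough headroom */
    skb->data -= n;
    skb->len += n;
    return skb->data;
}

/* Push an application-layer frame and copy the payload into it. */
unsigned char *toy_push_payload(struct toy_skb *skb, const void *src, size_t n)
{
    unsigned char *p = toy_push(skb, n);
    memcpy(p, src, n);
    return p;
}
```

Note that the push itself writes nothing. It only hands back a position, and the copy is a separate step performed by the caller.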
4. Set the transport layer header. As with the application layer, this time you push the transport layer stack frame onto the protocol stack, so the data pointer moves toward the head by the size of the transport header.
Next, you can happily fill in the transport layer header, UDP or TCP, at the position returned by skb_push. How to fill it in depends on your understanding of the transport layer. Setting the header simply means writing data at the position returned by skb_push, and the number of bytes written is the skb_push parameter n.
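As an illustration of "writing the header at the returned position", here is a small userspace sketch that fills in a UDP header. `struct toy_udphdr` and `toy_fill_udp` are hypothetical, although the four field names follow the kernel's `struct udphdr`. The checksum is left at zero, which for UDP over IPv4 means "no checksum computed".

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the 8-byte UDP header. */
struct toy_udphdr {
    uint16_t source;  /* source port, network byte order      */
    uint16_t dest;    /* destination port, network byte order */
    uint16_t len;     /* header + payload length              */
    uint16_t check;   /* checksum, 0 = not computed           */
};

/* Write a UDP header at p, the position a push would have returned. */
void toy_fill_udp(unsigned char *p, uint16_t sport, uint16_t dport,
                  uint16_t payload_len)
{
    struct toy_udphdr h;

    assert(payload_len <= 65535 - sizeof(h));  /* length field is 16 bits */
    h.source = htons(sport);
    h.dest = htons(dport);
    h.len = htons((uint16_t)(sizeof(h) + payload_len));
    h.check = 0;
    memcpy(p, &h, sizeof(h));  /* memcpy avoids alignment assumptions on p */
}
```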
5. Set the IP layer header. This is similar to the application layer and transport layer operations. This time the IP layer stack frame is pushed onto the protocol stack, in front of the transport header.
Next, we can happily fill in the IP layer header at the position returned by skb_push. How to fill it in depends on your understanding of the IP layer. Because this only demonstrates how an skb is encapsulated, it leaves out IP routing, which is very important to the IP layer.
6. Set the Ethernet frame header. This is similar to the steps above: the Ethernet header is pushed in front of the IP header.
So far, I have encapsulated a complete Ethernet frame that can be sent directly through dev_queue_xmit. Along the way, you will notice that the skb packet buffer is filled gradually by "pushing". Each layer pushes a stack frame through the skb_push interface and gets the write pointer back, then writes data of that stack frame's length, starting at the write pointer, according to the protocol logic of that layer.
At the moment skb_push returns, a stack frame has been pushed onto the protocol stack, but nothing has been written into it yet. In other words, the encapsulation is not complete; the actual encapsulation is done by the caller.
skb_push moves the write pointer toward the head of the skb packet buffer, and several values change. First, the packet length increases by n bytes. Second, the headroom shrinks by n bytes. Then, by calling skb_reset_XXX_header, the skb records the position of that layer's protocol header within the packet. This is particularly important! For example, with TSO/UFO the NIC driver needs the protocol header positions to compute checksums. So although a packet can be encapsulated without the skb recording the header positions, that is incorrect for a complete protocol stack implementation. After all, NIC checksum offload has become a de facto standard, even though it violates the strict layering principle.
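The bookkeeping described above (length grows, headroom shrinks, the header position is recorded) can be checked with a small userspace model. All `toy_` names are hypothetical, and the model stores the transport header position as an offset from head.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model of the skb buffer pointers. */
struct toy_skb {
    unsigned char *head, *data, *tail, *end;
    size_t len;
    size_t transport_off;  /* transport header position, as offset from head */
};

void toy_init(struct toy_skb *skb, unsigned char *buf, size_t size, size_t reserve)
{
    assert(reserve <= size);
    skb->head = buf;
    skb->data = skb->tail = buf + reserve;
    skb->end = buf + size;
    skb->len = 0;
    skb->transport_off = 0;
}

size_t toy_headroom(const struct toy_skb *skb)
{
    return (size_t)(skb->data - skb->head);
}

/* Each push grows len by n and shrinks the headroom by n. */
unsigned char *toy_push(struct toy_skb *skb, size_t n)
{
    assert(n <= toy_headroom(skb));
    skb->data -= n;
    skb->len += n;
    return skb->data;
}

/* Model of skb_reset_transport_header: remember where data points now. */
void toy_reset_transport_header(struct toy_skb *skb)
{
    skb->transport_off = toy_headroom(skb);
}
```

The recorded position stays valid after later pushes, because it refers to a fixed spot in the buffer rather than to the moving data pointer.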
7. Append padding after the application data. In the final state, two areas of the skb packet buffer remain unused: the headroom and the tailroom. What are they for? As an exercise, suppose that for some alignment reason I need to append padding at the end of the packet after encapsulation is complete, or I need to add a preamble at the beginning, or, most commonly, I need to add an error correction code at the end of the packet. What should I do?
This is where the headroom and tailroom come in. For example, to append data to the packet, use skb_put, which extends the data area into the tailroom.
In fact, skb_put simply appends data to the end of the packet. As for using the headroom, I will not go into it; it is just skb_push. What is the headroom used for? A leading header, such as the outer header in an X-over-Y encapsulation.
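A userspace sketch of appending into the tailroom, including zero padding up to an alignment boundary, might look like this. `toy_put` models the behavior described for skb_put, and `toy_pad` is a hypothetical helper added only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace model of the skb buffer pointers. */
struct toy_skb {
    unsigned char *head, *data, *tail, *end;
    size_t len;
};

void toy_init(struct toy_skb *skb, unsigned char *buf, size_t size, size_t reserve)
{
    assert(reserve <= size);
    skb->head = buf;
    skb->data = skb->tail = buf + reserve;
    skb->end = buf + size;
    skb->len = 0;
}

size_t toy_tailroom(const struct toy_skb *skb)
{
    return (size_t)(skb->end - skb->tail);
}

/* Model of skb_put: extend the data area by n bytes at the tail and
 * return the old tail, which is where the caller writes the new bytes. */
unsigned char *toy_put(struct toy_skb *skb, size_t n)
{
    unsigned char *old_tail = skb->tail;

    assert(n <= toy_tailroom(skb));
    skb->tail += n;
    skb->len += n;
    return old_tail;
}

/* Append zero bytes until len is a multiple of align; returns bytes added. */
size_t toy_pad(struct toy_skb *skb, size_t align)
{
    size_t rem, pad;

    assert(align != 0);
    rem = skb->len % align;
    pad = rem ? align - rem : 0;
    if (pad)
        memset(toy_put(skb, pad), 0, pad);
    return pad;
}
```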
The actual example is as follows: encapsulate an Ethernet frame and send it out:
    skb = alloc_skb(1500, GFP_ATOMIC);
    if (!skb)
        return -ENOMEM;    /* allocation can fail */
    skb->dev = dev;
    // routinely populate the skb metadata

    /* reserve the skb area: every layer pushed below must be counted here */
    skb_reserve(skb, 2 + sizeof(struct ethhdr) + sizeof(struct iphdr) +
                     sizeof(struct udphdr) + sizeof(app_data));

    /* construct the data area */
    p = skb_push(skb, sizeof(app_data));
    memcpy(p, &app_data[0], sizeof(app_data));

    /* construct the UDP header */
    p = skb_push(skb, sizeof(struct udphdr));
    udphdr = (struct udphdr *)p;
    // fill in the udphdr fields (omitted)
    skb_reset_transport_header(skb);

    /* construct the IP header */
    p = skb_push(skb, sizeof(struct iphdr));
    iphdr = (struct iphdr *)p;
    // fill in the iphdr fields (omitted)
    skb_reset_network_header(skb);

    /* construct the Ethernet header */
    p = skb_push(skb, sizeof(struct ethhdr));
    ethhdr = (struct ethhdr *)p;
    // fill in the ethhdr fields (omitted)
    skb_reset_mac_header(skb);

    /* send */
    dev_queue_xmit(skb);
Decapsulation is the opposite of encapsulation: it pops the protocol stack frames layer by layer. However, the Linux protocol stack does not use stack terminology for the interface names, so the opposite of push is called pull. skb_pull is the core interface and is the strict inverse of skb_push. I will not draw the pictures one by one.
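Since the pictures are skipped, here is a userspace sketch of the pop direction. `toy_from_frame` and `toy_pull` are hypothetical names, and the sizes 14, 20, and 8 assume an Ethernet header, an IPv4 header without options, and a UDP header.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace model of the skb buffer pointers. */
struct toy_skb {
    unsigned char *head, *data, *tail, *end;
    size_t len;
};

/* Wrap a received frame: the whole buffer is packet data. */
void toy_from_frame(struct toy_skb *skb, unsigned char *frame, size_t n)
{
    skb->head = skb->data = frame;
    skb->tail = skb->end = frame + n;
    skb->len = n;
}

/* Model of skb_pull: strip n bytes from the front of the packet.
 * Returns the new data pointer, or NULL if the packet is shorter than n. */
unsigned char *toy_pull(struct toy_skb *skb, size_t n)
{
    if (n > skb->len)
        return NULL;
    skb->data += n;
    skb->len -= n;
    return skb->data;
}
```

Pulling 14, then 20, then 8 bytes from a received UDP frame leaves the data pointer at the application payload. Each pull also turns the stripped header back into headroom.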
Programming to an interface rather than to an implementation seems to be one of the well-known guidelines from the object-oriented C++ world, and it applies just as well to skb operations. A typical case is "how to make the skb remember the locations of the IP layer header, the transport layer header, and the MAC header". The interfaces are:
    skb_reset_mac_header
    skb_reset_network_header
    skb_reset_transport_header
They should be called right after skb_push returns. Once upon a time, I set the protocol header location in the following way:
    /* construct the IP header */
    p = skb_push(skb, sizeof(struct iphdr));
    iphdr = (struct iphdr *)p;
    // fill in the iphdr fields (omitted)
    // skb_reset_network_header(skb);
    skb->network_header = p;
Is it wrong? At first glance it looks correct, but an error was reported:
Protocol 0008 is buggy, dev eth2
What is going on? The protocol header position recorded in the skb is wrong! Is there something inherently improper about setting the network_header field of the skb as above? Not at all! This is simply the consequence of not programming to the interface.
The reason is that the system has two ways of representing the network_header field of the skb, selected by the macro NET_SKBUFF_DATA_USES_OFFSET. The position of a protocol header can be stored either as an offset relative to the head pointer of the skb or as an absolute address. Which one is used depends on whether NET_SKBUFF_DATA_USES_OFFSET is defined. The assignment skb->network_header = p above clearly stores an absolute address, so once the system defines NET_SKBUFF_DATA_USES_OFFSET it is definitely wrong. Since the macro is fixed at compile time, the interface resolves to exactly one implementation at compile time, and the programmer does not have to care whether NET_SKBUFF_DATA_USES_OFFSET is defined. This is the benefit of programming to an interface. If you program against the skb implementation instead, you have to write a separate version for every case. The faulty code above is just one of those versions, and it was used in the wrong case! How painful!
The NET_SKBUFF_DATA_USES_OFFSET macro is an implementation detail. If you program to the interface, you do not need to pay attention to it. Otherwise, you have to understand why the system is designed this way, even though it is none of your concern. So why is it designed this way?
Because pointer sizes differ between 32-bit and 64-bit systems, the size of the pointer metadata in the skb differs too: on a 64-bit system it is double that of a 32-bit system. To smooth out this difference and keep the metadata size consistent, the pointer types on a 64-bit system would have to shrink to 4 bytes, which is impossible. Therefore, on a 64-bit system an offset is used for positioning, and the offset type is a fixed unsigned int, that is, four bytes. To support this, the skb adds a level of indirection by defining a new data type, sk_buff_data_t, which is determined at compile time:
    #if BITS_PER_LONG > 32
    #define NET_SKBUFF_DATA_USES_OFFSET 1
    #endif

    #ifdef NET_SKBUFF_DATA_USES_OFFSET
    typedef unsigned int sk_buff_data_t;
    #else
    typedef unsigned char *sk_buff_data_t;
    #endif
Besides saving space, this also makes the interface implementation of size-related operations more uniform. These are details, and they are not what people working on the network protocol stack should have to care about, are they? They belong entirely to the system implementation level and have nothing to do with the business logic.
Why not cover everything? In fact, sk_buff has many more details, but I cannot describe them one by one, because that would go against the original intention of this article, which is to expose the essence in the simplest way. If I described them all, this article would become a reference document rather than a personal insight, and years from now I doubt I would keep reading it.
There is a lot more to sk_buff. The meaning of the many fields in the struct alone is enough to study for a long time, and the way it works with the implementation of each Linux protocol layer makes the topic richer still. However, the most basic thing is covered in this article: how data is placed into an skb and encapsulated into a packet that the NIC can actually send. Okay, that's all. Finally, here is a summary of the interfaces mentioned in this article:
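The same idea can be reproduced in a few lines of userspace C. This is only a sketch: `TOY_DATA_USES_OFFSET`, `toy_data_t`, and the accessor names are hypothetical, and the pointer-width test uses `UINTPTR_MAX` as a stand-in for the kernel's `BITS_PER_LONG` check.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption of this sketch: pick the representation from the pointer width. */
#if UINTPTR_MAX > 0xffffffffu
#define TOY_DATA_USES_OFFSET 1
#endif

#ifdef TOY_DATA_USES_OFFSET
typedef unsigned int toy_data_t;    /* offset from head, always 4 bytes */
#else
typedef unsigned char *toy_data_t;  /* absolute address */
#endif

/* Hypothetical, reduced model of the skb: only what the accessors need. */
struct toy_skb {
    unsigned char *head;
    unsigned char *data;
    toy_data_t network_header;
};

/* The interface: callers never touch network_header directly. */
void toy_reset_network_header(struct toy_skb *skb)
{
    assert(skb->data >= skb->head);
#ifdef TOY_DATA_USES_OFFSET
    skb->network_header = (toy_data_t)(skb->data - skb->head);
#else
    skb->network_header = skb->data;
#endif
}

unsigned char *toy_network_header(const struct toy_skb *skb)
{
#ifdef TOY_DATA_USES_OFFSET
    return skb->head + skb->network_header;
#else
    return skb->network_header;
#endif
}
```

Code that goes through the two accessors behaves the same under either representation. Code that assigns a pointer to the field directly is correct only in the pointer variant.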
alloc_skb: Allocates an skb;
skb_reserve: Moves the write pointer toward the end of the buffer to a position p, which is treated as the end of the packet. From there, the write pointer moves back toward the head as each layer of the packet is encapsulated;
skb_push: Moves the write pointer n bytes toward the head and increases the packet length by n. From the returned position you can write n bytes of data, that is, encapsulate n bytes of protocol;
skb_put: Advances the tail pointer by n bytes, increases the packet length by n, and returns the old tail position, from which you can write n bytes of data at the end of the packet;
...