Linux network protocol stack implementation analysis (1): Implementation of skbuff

This article is the first in a series that analyzes the implementation of the Linux network protocol stack. It covers the implementation of skbuff. The analysis is based on linux2.2.x and also describes the changes made to the same objects in linux2.4.x. The code versions referenced in this article are linux2.2.25 and linux2.4.20.
1 Overview
Anyone familiar with network protocol stacks knows that a protocol stack is a layered software structure in which the layers pass network packets to each other through predefined interfaces. A network packet carries the information needed by every layer of the stack. Because the length of a packet is not fixed, the choice of data structure used to store packets is very important. The BSD implementation uses the mbuf, which stores a fixed amount of data. If a packet needs more than one mbuf, the mbufs are chained into a linked list, so the data of one packet may be discontiguous in memory. In Linux, the data of one network packet is stored contiguously in memory, and every network packet has a control structure called sk_buff. Strictly, this holds only for linux2.2.x; sk_buff changed slightly in linux2.4.x, as discussed below.
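To make the contrast concrete, here is a minimal user-space sketch of chained, fixed-size storage. It is not BSD code: struct chunk, CHUNK_SIZE, and chain_gather are hypothetical names invented for this illustration. It only shows that a packet spread over a chain has to be walked piece by piece, whereas contiguous storage can be read through a single pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CHUNK_SIZE 8            /* hypothetical fixed capacity of one chunk */

/* Hypothetical fixed-size buffer, chained like the mbufs described above. */
struct chunk {
    struct chunk *next;         /* next piece of the same packet, or NULL */
    size_t used;                /* bytes of this chunk that hold packet data */
    unsigned char bytes[CHUNK_SIZE];
};

/* Walk the chain and copy the packet into one contiguous buffer.
 * Returns the number of bytes copied; stops early if out is too small. */
static size_t chain_gather(const struct chunk *c, unsigned char *out, size_t cap)
{
    size_t n = 0;

    for (; c != NULL; c = c->next) {
        if (n + c->used > cap)
            break;
        memcpy(out + n, c->bytes, c->used);
        n += c->used;
    }
    return n;
}
```

A 12-byte packet needs two such chunks, and reading it as a whole means visiting both.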
2 skbuff in linux2.2.x
2.1 sk_buff Definition
As mentioned above, sk_buff is a control structure through which the data of a network packet is accessed. When the storage space for a network packet is allocated, its control structure sk_buff is allocated as well. The control structure contains pointers into the network packet and variables that describe it. The definition of sk_buff, with annotations, is as follows:
struct sk_buff {
    struct sk_buff *next;
    struct sk_buff *prev;
    struct sk_buff_head *list;
    /* The three members above link sk_buff into a circular doubly linked list.
       The structure of the list is described later. */
    struct sock *sk;
    /* The sock structure that owns the packet. It is valid for packets sent by
       the local machine and NULL for packets received from a network device. */
    struct timeval stamp;   // time when the packet was received
    struct device *dev;     // network device that received the packet
    union
    {
        struct tcphdr *th;
        struct udphdr *uh;
        struct icmphdr *icmph;
        struct igmphdr *igmph;
        struct iphdr *ipiph;
        struct spxhdr *spxh;
        unsigned char *raw;
    } h;
    union
    {
        struct iphdr *iph;
        struct ipv6hdr *ipv6h;
        struct arphdr *arph;
        struct ipxhdr *ipxh;
        unsigned char *raw;
    } nh;
    union
    {
        struct ethhdr *ethernet;
        unsigned char *raw;
    } mac;
    /* The three unions above are pointers to the transport layer, network layer,
       and link layer headers. Each pointer is assigned when the packet enters
       the corresponding layer. raw is an untyped character pointer, used for
       protocol extensions. */
    struct dst_entry *dst;  // route of the packet; assigned once the route is determined
    char cb[48];            // passes parameters between protocol layers; the meaning of
                            // the content is determined by the function that uses it
    unsigned int len;
    /* Length of the packet, including headers and data, as seen by the current
       protocol layer. The value differs from layer to layer. */
    unsigned char is_clone,
                  cloned,
    /* The two flags above describe whether the control structure is a clone. One
       network packet can have several control structures, only one of which is
       the original; the others are clones. Because several control structures
       may exist, releasing a packet means making sure that all of its control
       structures are released. */
                  pkt_type;
    /* Type of the packet. Common types are PACKET_HOST, a packet addressed to the
       local machine, and PACKET_OUTGOING, a packet sent by the local machine. */
    unsigned short protocol;  // link layer protocol
    unsigned int truesize;    // size of the packet storage area; it is 16-byte aligned
                              // and generally larger than the packet length
    unsigned char *head;
    unsigned char *data;
    unsigned char *tail;
    unsigned char *end;
    /* The four pointers above point into the packet storage area. Their exact
       meanings are explained below. */
    __u32 fwmark;             // firewall mark of the packet
};
The storage area for a network packet is allocated when a network device receives a packet or when an application sends data. The allocated size is aligned to 16 bytes. After a successful allocation, the packet is filled into this storage area. A gap is reserved at the head of the storage area, and the packet is placed in the remaining space. The packet may not fill the whole storage area, so a gap may also remain at its end. Therefore the head pointer in sk_buff points to the start address of the storage area, the end pointer points to its end address, the data pointer points to the start address of the packet, and the tail pointer points to the end address of the packet. Within the storage area the packet is laid out in this order: link layer header, network layer header, transport layer header, transport layer data. In each layer of the protocol stack, the data pointer of sk_buff points to the header of that layer, and sk_buff also contains data structures that locate the headers of the different layers. The relationship between sk_buff and the network packet is shown below:
[Figure 2.1 relationship between sk_buff and network packets]
(Note: the control structure sk_buff and the storage area of the network packet are allocated from two different caches, so they are not stored contiguously in memory. The references also contain a diagram of the relationship between sk_buff and network packets; do not misread it as meaning that the two are stored contiguously in memory.)
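The pointer layout described in this section can be modeled in a few lines of user-space C. This is only a sketch: struct buf_view and the model_* helpers are hypothetical names introduced for illustration, not kernel definitions. It shows how the head gap, the packet length, and the tail gap fall out of the four pointers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space model of the four storage pointers in sk_buff. */
struct buf_view {
    unsigned char *head;    /* start address of the storage area */
    unsigned char *data;    /* start address of the packet */
    unsigned char *tail;    /* end address of the packet */
    unsigned char *end;     /* end address of the storage area */
};

/* Gap reserved in front of the packet. */
static size_t model_headroom(const struct buf_view *b)
{
    return (size_t)(b->data - b->head);
}

/* Length of the packet itself. */
static size_t model_len(const struct buf_view *b)
{
    return (size_t)(b->tail - b->data);
}

/* Gap left behind the packet. */
static size_t model_tailroom(const struct buf_view *b)
{
    return (size_t)(b->end - b->tail);
}

/* Place a packet of pkt_len bytes after a gap of gap bytes inside storage. */
static struct buf_view model_layout(unsigned char *storage, size_t size,
                                    size_t gap, size_t pkt_len)
{
    struct buf_view b;

    assert(gap + pkt_len <= size);
    b.head = storage;
    b.data = storage + gap;
    b.tail = b.data + pkt_len;
    b.end = storage + size;
    return b;
}
```

For a 128-byte storage area with a 16-byte head gap and a 60-byte packet, the three helpers return 16, 60, and 52, and the invariant head <= data <= tail <= end holds.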
2.2 functions related to sk_buff
The functions related to sk_buff cover the allocation, copying, and release of the packet storage area and the control structure, the manipulation of the pointers in the control structure, and the checking of the various flags. The important functions are as follows:
struct sk_buff *alloc_skb(unsigned int size, int gfp_mask)
Allocates a storage area of size bytes for a network packet, together with its control structure. The value of size is rounded up to a 16-byte boundary, and gfp_mask is the memory allocation priority. The common priorities are GFP_ATOMIC, which means the allocation must not sleep and is generally used for allocations in interrupt context, and GFP_KERNEL, which means the allocation may sleep while the request waits to be satisfied. After a successful allocation, the data and tail pointers of sk_buff both point to the start address of the storage area, len is 0, and the is_clone and cloned flags are both 0.
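The state right after allocation can be sketched in user space as follows. This is an illustrative model under stated assumptions, not the kernel implementation: struct model_skb, align16, model_alloc, and model_free are hypothetical names, malloc stands in for the kernel allocators, and gfp_mask is left out because it has no user-space equivalent.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, reduced stand-in for the control structure. */
struct model_skb {
    unsigned char *head, *data, *tail, *end;
    unsigned int len;
    unsigned int truesize;
    unsigned char is_clone, cloned;
};

/* Round size up to a multiple of 16. */
static unsigned int align16(unsigned int size)
{
    return (size + 15u) & ~15u;
}

static struct model_skb *model_alloc(unsigned int size)
{
    /* The control structure and the storage area are separate allocations. */
    struct model_skb *skb = malloc(sizeof *skb);

    if (skb == NULL)
        return NULL;
    size = align16(size);
    skb->head = malloc(size);
    if (skb->head == NULL) {
        free(skb);
        return NULL;
    }
    skb->data = skb->head;          /* no packet yet: data == tail == head */
    skb->tail = skb->head;
    skb->end = skb->head + size;
    skb->len = 0;
    skb->truesize = size;
    skb->is_clone = 0;
    skb->cloned = 0;
    return skb;
}

static void model_free(struct model_skb *skb)
{
    if (skb != NULL) {
        free(skb->head);
        free(skb);
    }
}
```

A request for 100 bytes yields a 112-byte storage area, because 100 rounded up to the next multiple of 16 is 112.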
struct sk_buff *skb_clone(struct sk_buff *skb, int gfp_mask)
Clones a new control structure from the control structure skb; both point to the same network packet. After a successful clone, the is_clone and cloned flags of the new and the original control structures are set to record the sharing. The reference count of the network packet is also incremented. The reference count is stored in memory at the end address of the storage area and is accessed through the function atomic_t *skb_datarefp(struct sk_buff *skb); it records how many control structures point to the storage area. Because several control structures may point to the same storage area, before modifying its content you must first make sure that its reference count is 1, or use the copy function below to make a new copy of the storage area and modify that instead.
struct sk_buff *skb_copy(struct sk_buff *skb, int gfp_mask)
Copies the control structure skb and the content of the storage area it refers to. After a successful copy, the new control structure and storage area are independent of the original control structure and storage area. The is_clone and cloned flags of the new control structure are therefore both 0, and the reference count of the new storage area is 1.
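The difference between cloning and copying can be tried out with a small user-space model. This is a sketch under stated assumptions: struct model_skb and the model_* functions are hypothetical, a plain int replaces the atomic counter, and the flag assignments in model_clone (cloned on both structures, is_clone on the new one) are a simplification chosen for the model rather than a statement about the kernel source.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct model_skb {
    unsigned char *head, *data, *tail, *end;
    unsigned int len;
    unsigned char is_clone, cloned;
};

/* The reference count lives at the end address of the storage area. */
static int *model_dataref(struct model_skb *skb)
{
    return (int *)skb->end;
}

static struct model_skb *model_alloc(unsigned int size)
{
    struct model_skb *skb = malloc(sizeof *skb);

    if (skb == NULL)
        return NULL;
    size = (size + 15u) & ~15u;             /* also keeps the counter aligned */
    skb->head = malloc(size + sizeof(int)); /* extra room for the counter */
    if (skb->head == NULL) {
        free(skb);
        return NULL;
    }
    skb->data = skb->tail = skb->head;
    skb->end = skb->head + size;
    skb->len = 0;
    skb->is_clone = skb->cloned = 0;
    *model_dataref(skb) = 1;
    return skb;
}

/* Clone: a new control structure that shares the same storage area. */
static struct model_skb *model_clone(struct model_skb *skb)
{
    struct model_skb *n = malloc(sizeof *n);

    if (n == NULL)
        return NULL;
    *n = *skb;
    ++*model_dataref(skb);
    skb->cloned = 1;                        /* simplified flag handling */
    n->cloned = 1;
    n->is_clone = 1;
    return n;
}

/* Copy: a new control structure with its own copy of the storage area. */
static struct model_skb *model_copy(struct model_skb *skb)
{
    unsigned int size = (unsigned int)(skb->end - skb->head);
    struct model_skb *n = model_alloc(size);

    if (n == NULL)
        return NULL;
    memcpy(n->head, skb->head, size);
    n->data = n->head + (skb->data - skb->head);
    n->tail = n->head + (skb->tail - skb->head);
    n->len = skb->len;
    return n;
}
```

Writing through a clone is visible through the original, because both share one storage area; writing through a copy is not. That is why the text requires a reference count of 1, or a copy, before modifying packet content.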
void kfree_skb(struct sk_buff *skb)
Releases the control structure skb and the storage area it refers to. Because one storage area may have several control structures, the storage area is released only when its reference count is 1; otherwise only the control structure skb is released.
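The release rule can be sketched with the same kind of user-space model. Everything here is hypothetical and reduced to the minimum: struct model_skb keeps only the two pointers that release needs, a plain int replaces the atomic counter, and model_free returns a value purely so that the outcome can be observed. The model decrements the count on every release, which is an assumption of the sketch.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, stripped-down control structure: only what release needs. */
struct model_skb {
    unsigned char *head;    /* start of the shared storage area */
    unsigned char *end;     /* end of the storage area; the count sits here */
};

static int *model_dataref(struct model_skb *skb)
{
    return (int *)skb->end;
}

static struct model_skb *model_alloc(unsigned int size)
{
    struct model_skb *skb = malloc(sizeof *skb);

    if (skb == NULL)
        return NULL;
    size = (size + 15u) & ~15u;             /* keeps the count aligned */
    skb->head = malloc(size + sizeof(int));
    if (skb->head == NULL) {
        free(skb);
        return NULL;
    }
    skb->end = skb->head + size;
    *model_dataref(skb) = 1;
    return skb;
}

static struct model_skb *model_clone(struct model_skb *skb)
{
    struct model_skb *n = malloc(sizeof *n);

    if (n == NULL)
        return NULL;
    *n = *skb;
    ++*model_dataref(skb);
    return n;
}

/* Returns 1 if the storage area was released as well,
 * 0 if only the control structure was released. */
static int model_free(struct model_skb *skb)
{
    int released = 0;

    if (--*model_dataref(skb) == 0) {       /* the count was 1: last user */
        free(skb->head);
        released = 1;
    }
    free(skb);
    return released;
}
```

With one original and one clone, the first release frees only a control structure and the second release frees the storage area too.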
unsigned char *skb_put(struct sk_buff *skb, unsigned int len)
Moves the tail pointer down (toward end) and increases the len value of skb. The space between data and tail is the space that holds the network packet. This operation enlarges that space at the tail of the storage area, but tail must not become greater than end, and the len value of skb must not become greater than truesize.
unsigned char *skb_push(struct sk_buff *skb, unsigned int len)
Moves the data pointer up (toward head) and increases the len value of skb. This operation adds space for packet data at the head of the storage area, whereas the previous operation adds it at the tail. The data pointer must not become smaller than head, and the len value of skb must not become greater than truesize.
unsigned char *skb_pull(struct sk_buff *skb, unsigned int len)
Moves the data pointer down and decreases the len value of skb. This operation makes the data pointer point to the header of the next protocol layer.
void skb_reserve(struct sk_buff *skb, unsigned int len)
Moves the data pointer and the tail pointer down together. This operation reserves a gap of len bytes at the head of the storage area.
void skb_trim(struct sk_buff *skb, unsigned int len)
Reduces the length of the network packet to len. This operation discards padding at the end of the network packet.
int skb_cloned(struct sk_buff *skb)
Determines whether skb is a cloned control structure. If it is, its cloned flag is 1 and the reference count of the storage area it points to is greater than 1.
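The pointer operations above are easiest to follow on a concrete buffer. The following user-space sketch mirrors their described behavior; struct model_skb and the model_* functions are hypothetical stand-ins, and the bounds are enforced with assert purely for the sake of the illustration.

```c
#include <assert.h>
#include <string.h>

struct model_skb {
    unsigned char *head, *data, *tail, *end;
    unsigned int len;
};

static void model_init(struct model_skb *skb, unsigned char *storage,
                       unsigned int size)
{
    skb->head = skb->data = skb->tail = storage;
    skb->end = storage + size;
    skb->len = 0;
}

/* Reserve a gap at the head: data and tail move down together. */
static void model_reserve(struct model_skb *skb, unsigned int len)
{
    assert((unsigned int)(skb->end - skb->tail) >= len);
    skb->data += len;
    skb->tail += len;
}

/* Grow the packet at the tail; returns the start of the new space. */
static unsigned char *model_put(struct model_skb *skb, unsigned int len)
{
    unsigned char *old_tail = skb->tail;

    assert((unsigned int)(skb->end - skb->tail) >= len);
    skb->tail += len;
    skb->len += len;
    return old_tail;
}

/* Grow the packet at the head; returns the new data pointer. */
static unsigned char *model_push(struct model_skb *skb, unsigned int len)
{
    assert((unsigned int)(skb->data - skb->head) >= len);
    skb->data -= len;
    skb->len += len;
    return skb->data;
}

/* Drop len bytes from the head, for example a header already processed. */
static unsigned char *model_pull(struct model_skb *skb, unsigned int len)
{
    assert(len <= skb->len);
    skb->data += len;
    skb->len -= len;
    return skb->data;
}

/* Cut the packet down to len bytes, discarding what follows. */
static void model_trim(struct model_skb *skb, unsigned int len)
{
    if (len < skb->len) {
        skb->len = len;
        skb->tail = skb->data + len;
    }
}
```

A typical send path in this model reserves room for the headers first, puts the payload, then pushes one header per layer on the way down. The receive path pulls the same headers off on the way up, which leaves data pointing at the payload again.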
2.3 definition of sk_buff_head
In the implementation of the network protocol stack, many network packets need to be placed on a queue for asynchronous processing. Linux defines the data structure sk_buff_head for this purpose. It is the head of a doubly linked list that chains sk_buff structures together:
[Figure 2.2 Relationship Between sk_buff_head and sk_buff]
2.4 functions related to sk_buff_head
The functions related to sk_buff_head are linked-list functions that do nothing more than add nodes to the list or remove them from it. The important functions are as follows:
void skb_queue_head(struct sk_buff_head *list, struct sk_buff *newsk)
Adds newsk at the head of the list.
void skb_queue_tail(struct sk_buff_head *list, struct sk_buff *newsk)
Adds newsk at the tail of the list.
struct sk_buff *skb_dequeue(struct sk_buff_head *list)
Removes the first sk_buff from the head of the list and returns it.
struct sk_buff *skb_dequeue_tail(struct sk_buff_head *list)
Removes the last sk_buff from the tail of the list and returns it.
void skb_insert(struct sk_buff *old, struct sk_buff *newsk)
Adds newsk to the list that old is on, in front of old.
void skb_append(struct sk_buff *old, struct sk_buff *newsk)
Adds newsk to the list that old is on, behind old.
void skb_unlink(struct sk_buff *skb)
Removes skb from the list it is on.
All of the list operations above disable interrupts first. This is unnecessary when the caller already runs in interrupt context, so there is a second set of functions with the same names prefixed with "__" for callers running in interrupt context.
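The queue operations can be modeled in user space with a circular doubly linked list. This sketch is illustrative only: struct model_head, struct model_skb, the qlen counter, and the id field are hypothetical, the list head embeds a sentinel node instead of reproducing the kernel layout, and no interrupt disabling or locking is modeled.

```c
#include <assert.h>
#include <stddef.h>

struct model_head;

struct model_skb {
    struct model_skb *next, *prev;
    struct model_head *list;    /* list this node is on, NULL when unlinked */
    int id;                     /* hypothetical tag standing in for a packet */
};

struct model_head {
    struct model_skb anchor;    /* sentinel: anchor.next is the first node */
    unsigned int qlen;          /* number of queued nodes */
};

static void model_head_init(struct model_head *h)
{
    h->anchor.next = h->anchor.prev = &h->anchor;
    h->anchor.list = h;
    h->qlen = 0;
}

/* Link newsk between two neighbouring nodes of list h. */
static void model_link(struct model_skb *newsk, struct model_skb *prev,
                       struct model_skb *next, struct model_head *h)
{
    newsk->next = next;
    newsk->prev = prev;
    prev->next = newsk;
    next->prev = newsk;
    newsk->list = h;
    h->qlen++;
}

static void model_queue_head(struct model_head *h, struct model_skb *newsk)
{
    model_link(newsk, &h->anchor, h->anchor.next, h);
}

static void model_queue_tail(struct model_head *h, struct model_skb *newsk)
{
    model_link(newsk, h->anchor.prev, &h->anchor, h);
}

static void model_unlink(struct model_skb *skb)
{
    if (skb->list == NULL)
        return;
    skb->prev->next = skb->next;
    skb->next->prev = skb->prev;
    skb->list->qlen--;
    skb->next = skb->prev = NULL;
    skb->list = NULL;
}

static struct model_skb *model_dequeue(struct model_head *h)
{
    struct model_skb *first = h->anchor.next;

    if (first == &h->anchor)
        return NULL;            /* empty list */
    model_unlink(first);
    return first;
}

static struct model_skb *model_dequeue_tail(struct model_head *h)
{
    struct model_skb *last = h->anchor.prev;

    if (last == &h->anchor)
        return NULL;
    model_unlink(last);
    return last;
}
```

Because the list is circular, adding at the head and adding at the tail are the same linking step applied on either side of the sentinel, and an empty list is simply a sentinel that points to itself.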
3 skbuff in linux2.4.x
In linux2.4.x a network packet may not be stored contiguously in memory, which differs from linux2.2.x. (Do not confuse this with IP fragmentation. IP fragmentation splits one network packet into several network packets, whereas here a single network packet is divided into several parts that are stored in different memory areas.) A rough illustration is as follows:
[Figure 3.1 Relationship between sk_buff and network packets in linux2.4.x]
In the figure, frags is an array and frag_list is a singly linked list. The storage areas they refer to are one page (4 KB) in size. These extra storage areas are not used at first; they are used only when the storage area that data points to is insufficient. Dividing the storage by pages makes it easier to share memory data with user-space programs.
To record the length of the network packet, a new variable data_len is added to sk_buff. It records the length of the packet data stored in frags and frag_list. The original variable len records the total length of the network packet. truesize is the size of the storage area that head points to.
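The relation between len and data_len can be sketched as simple bookkeeping. This is a user-space illustration with hypothetical names (struct model_skb24, struct model_frag, MODEL_MAX_FRAGS, model_add_frag, model_linear_len); it tracks lengths only and holds no real packet data.

```c
#include <assert.h>

#define MODEL_MAX_FRAGS 4       /* hypothetical limit for this sketch */

struct model_frag {
    unsigned int size;          /* bytes held by this extra storage area */
};

struct model_skb24 {
    unsigned int len;           /* total length of the packet */
    unsigned int data_len;      /* part of len held outside the head area */
    unsigned int nr_frags;
    struct model_frag frags[MODEL_MAX_FRAGS];
};

/* Bytes stored in the area that the data pointer refers to. */
static unsigned int model_linear_len(const struct model_skb24 *skb)
{
    return skb->len - skb->data_len;
}

/* Attach one more extra storage area holding size bytes of the packet.
 * Returns 0 on success, -1 if the array is full. */
static int model_add_frag(struct model_skb24 *skb, unsigned int size)
{
    if (skb->nr_frags == MODEL_MAX_FRAGS)
        return -1;
    skb->frags[skb->nr_frags++].size = size;
    skb->len += size;
    skb->data_len += size;
    return 0;
}
```

Attaching extra storage raises len and data_len by the same amount, so the difference between them, the contiguous part, stays unchanged.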
The functions that allocate, copy, and release sk_buff and its storage area have the same meaning in linux2.4.x as in linux2.2.x. In addition, they handle the allocation, copying, and release of frags and frag_list, and when necessary they merge a packet stored in scattered pieces into one contiguously stored packet. For details, refer to the source code.
The operations on sk_buff_head in linux2.4.x are basically the same as in linux2.2.x, except that a spinlock has been added so that queues can be shared better on SMP machines. For details, refer to the source code.
4 Summary
The storage structure of network packets is the foundation of a network protocol stack implementation. Packets are passed between the layers of the protocol stack, so quickly locating the data that each layer cares about and avoiding copies of the packet during processing both matter for the performance of the stack. This article analyzed the storage structure of network packets in linux2.2.x and linux2.4.x and the operations on that structure. In the Linux protocol stack, normally only one storage area is allocated for a network packet. Different layers or different processing functions share the packet through the control structure sk_buff, and a copy is made only when the packet needs to be modified. This saves storage space and makes the data easy to locate, which helps the Linux network protocol stack perform well in practice.
5 References
1: TCP/IP Illustrated, Volume 2: The Implementation, Gary R. Wright, W. Richard Stevens, China Machine Press
2: Kernel Korner: Network Buffers and Memory Management, www.linuxjournal.com
3: Linux IP Networking, Glenn Herrin
4: Building Into The Linux Network Layer, Phrack 55