How the Linux kernel sends and constructs data packets

Link: http://blog.chinaunix.net/uid-10167808-id-25974.html

You are welcome to reprint this article, but please indicate the source and ensure the integrity of this article.

Author: godbach

Date: 2009/09/01

1. Constructing data packets

This section does not cover packet construction in the kernel in detail; later sections look at specific packets where needed. What follows is a brief overview of how packets can be constructed in kernel space on top of the netfilter framework.

Packets can be constructed in the kernel in two ways.

The first is to allocate a new sk_buff directly with alloc_skb and fill in its members as the application requires, or to call skb_copy_expand() to allocate a new nskb based on the skb of the current packet, copying the contents of that skb into it.

The second, which I use more often, is to modify the skb of the received packet in place: mainly the source and destination IP addresses and, for TCP or UDP, the source and destination port numbers. In short, any field of the packet can be adjusted as needed.
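
As a rough user-space illustration of the in-place approach, the sketch below swaps the source and destination addresses and ports inside a raw IPv4 packet buffer. It is not kernel code: the function name swap_endpoints and the helper swap_bytes are hypothetical, and the buffer is assumed to start at the IPv4 header and to carry TCP or UDP.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Swap two non-overlapping byte ranges of n bytes each. */
static void swap_bytes(uint8_t *a, uint8_t *b, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        uint8_t t = a[i];
        a[i] = b[i];
        b[i] = t;
    }
}

/*
 * Hypothetical helper: swap source/destination IP addresses and ports
 * in place. pkt points at the IPv4 header, len is the number of valid
 * bytes. Returns 0 on success, -1 if the buffer is too short or does
 * not hold an IPv4 TCP/UDP packet.
 */
int swap_endpoints(uint8_t *pkt, size_t len)
{
    assert(pkt != NULL);
    if (len < 20 || (pkt[0] >> 4) != 4)
        return -1;
    size_t ihl = (size_t)(pkt[0] & 0x0f) * 4;   /* header length in bytes */
    if (ihl < 20 || len < ihl + 4)
        return -1;
    if (pkt[9] != 6 && pkt[9] != 17)            /* 6 = TCP, 17 = UDP */
        return -1;
    swap_bytes(pkt + 12, pkt + 16, 4);          /* saddr <-> daddr */
    swap_bytes(pkt + ihl, pkt + ihl + 2, 2);    /* source <-> dest port */
    return 0;
}
```

In the kernel the same fields would be reached through the header pointers of the skb rather than through byte offsets.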

Either way, the checksums of the affected parts (the IP header and the TCP or UDP header) generally have to be recalculated; this step is mandatory.
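
To make the checksum step concrete, here is a minimal user-space sketch of the one's-complement Internet checksum applied to an IPv4 header. The kernel has its own optimized routines for this; the names inet_checksum and ipv4_fix_checksum below are hypothetical and only illustrate the arithmetic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One's-complement Internet checksum over len bytes, read as
 * big-endian 16-bit words. Returns the checksum value to store. */
uint16_t inet_checksum(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    assert(data != NULL || len == 0);
    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)data[i] << 8) | data[i + 1];
    if (i < len)                        /* odd trailing byte, zero padded */
        sum += (uint32_t)data[i] << 8;
    while (sum >> 16)                   /* fold carries back in */
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)~sum;
}

/* Hypothetical helper: zero the IPv4 header checksum field, recompute
 * it over the header, and store it. Returns 0 on success, -1 on a
 * malformed or truncated header. */
int ipv4_fix_checksum(uint8_t *hdr, size_t len)
{
    if (hdr == NULL || len < 20)
        return -1;
    size_t ihl = (size_t)(hdr[0] & 0x0f) * 4;
    if (ihl < 20 || ihl > len)
        return -1;
    hdr[10] = 0;
    hdr[11] = 0;
    uint16_t c = inet_checksum(hdr, ihl);
    hdr[10] = (uint8_t)(c >> 8);
    hdr[11] = (uint8_t)(c & 0xff);
    return 0;
}
```

A header whose checksum field is correct sums to zero when passed through inet_checksum again, which gives a quick way to verify the result.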

2. Sending the constructed data packets

With the steps above completed, the packet has been constructed. The next question is how to send it. I summarize two methods here.

Method 1: let the packet travel through the normal netfilter path. Because part of the packet has changed, in particular the source and destination IP addresses, you must make sure that a route can be found for it.

Within the netfilter framework, routing takes place after PREROUTING and after LOCAL_OUT. In addition, the packet has to be sent from the local host, so a natural choice is to inject the modified packet at the LOCAL_OUT hook.

The kernel source contains a typical example of this method. The kernel version referred to in this article is 2.6.18.3, and the source file is ipt_REJECT.c. Its send_reset function sends an RST packet back to the source IP address of a received packet. The function covers both constructing and sending a packet, so it is analyzed briefly here.

/* Send RST reply */
static void send_reset(struct sk_buff *oldskb, int hook)
{
    struct sk_buff *nskb;
    struct iphdr *iph = oldskb->nh.iph;
    struct tcphdr _otcph, *oth, *tcph;
    struct rtable *rt;
    u_int16_t tmp_port;
    u_int32_t tmp_addr;
    int needs_ack;
    int hh_len;

    /* Ignore IP fragments with a non-zero fragment offset */
    if (oldskb->nh.iph->frag_off & htons(IP_OFFSET))
        return;

    /* Get a pointer to the TCP header */
    oth = skb_header_pointer(oldskb, oldskb->nh.iph->ihl * 4,
                             sizeof(_otcph), &_otcph);
    if (oth == NULL)
        return;

    /* The received packet is itself an RST, so do not answer it with another RST */
    if (oth->rst)
        return;

    /* Check that the checksum of the received packet is correct */
    if (nf_ip_checksum(oldskb, hook, iph->ihl * 4, IPPROTO_TCP))
        return;

    /* This step is critical: it obtains the route for the reply.
       route_reverse uses the source IP address of the current packet as the
       destination of the lookup, and also takes the packet's destination
       address into account, to find a route back to the source */
    if ((rt = route_reverse(oldskb, oth, hook)) == NULL)
        return;

    hh_len = LL_RESERVED_SPACE(rt->u.dst.dev);

    /* Copy oldskb, both the sk_buff struct and the data. This is the first
       construction method described above */
    nskb = skb_copy_expand(oldskb, hh_len, skb_tailroom(oldskb),
                           GFP_ATOMIC);
    if (!nskb) {
        dst_release(&rt->u.dst);
        return;
    }

    /* nskb inherited oldskb's route reference through the copy; it is not
       needed, so release it */
    dst_release(nskb->dst);
    /* Point the new packet at the route entry returned by route_reverse */
    nskb->dst = &rt->u.dst;

    /* Clear the connection-tracking state that nskb copied from oldskb */
    nf_reset(nskb);
    nskb->nfmark = 0;
    skb_init_secmark(nskb);

    /* From here on the packet contents are rewritten. Had we worked on
       oldskb's buffer directly instead of allocating a new buffer for nskb,
       this would be the second construction method described above */
    /* Get the TCP header of nskb */
    tcph = (struct tcphdr *)((u_int32_t *)nskb->nh.iph + nskb->nh.iph->ihl);

    /* Swap source and destination IP addresses */
    tmp_addr = nskb->nh.iph->saddr;
    nskb->nh.iph->saddr = nskb->nh.iph->daddr;
    nskb->nh.iph->daddr = tmp_addr;

    /* Swap source and destination ports */
    tmp_port = tcph->source;
    tcph->source = tcph->dest;
    tcph->dest = tmp_port;

    /* Reset the TCP header length and update the total length recorded in
       the IP header. An RST needs only the TCP header, no TCP payload */
    tcph->doff = sizeof(struct tcphdr) / 4;
    skb_trim(nskb, nskb->nh.iph->ihl * 4 + sizeof(struct tcphdr));
    nskb->nh.iph->tot_len = htons(nskb->len);

    /* Set seq and ack_seq; there are two cases (see the TCP specification) */
    if (tcph->ack) { /* the ACK flag is set in the original packet */
        needs_ack = 0;
        tcph->seq = oth->ack_seq; /* ack_seq of the original packet becomes seq of nskb */
        tcph->ack_seq = 0;
    } else { /* the ACK flag is not set in the original packet, e.g. an initial SYN */
        needs_ack = 1;
        /* SYN and FIN each consume one sequence number, so for a bare SYN
           or FIN ack_seq is the old seq + 1. The payload length is added as
           well because the segment may also carry data */
        tcph->ack_seq = htonl(ntohl(oth->seq) + oth->syn + oth->fin
                              + oldskb->len - oldskb->nh.iph->ihl * 4
                              - (oth->doff << 2));
        tcph->seq = 0;
    }

    /* Clear all flags, then set RST (and ACK if required) */
    ((u_int8_t *)tcph)[13] = 0;
    tcph->rst = 1;
    tcph->ack = needs_ack;

    tcph->window = 0;
    tcph->urg_ptr = 0;

    /* Recalculate the TCP checksum */
    tcph->check = 0;
    tcph->check = tcp_v4_check(tcph, sizeof(struct tcphdr),
                               nskb->nh.iph->saddr,
                               nskb->nh.iph->daddr,
                               csum_partial((char *)tcph,
                                            sizeof(struct tcphdr), 0));

    /* Set the IP TTL and the don't-fragment bit */
    nskb->nh.iph->ttl = dst_metric(nskb->dst, RTAX_HOPLIMIT);
    /* Set DF, id = 0 */
    nskb->nh.iph->frag_off = htons(IP_DF);
    nskb->nh.iph->id = 0;

    /* Recalculate the IP header checksum */
    nskb->nh.iph->check = 0;
    nskb->nh.iph->check = ip_fast_csum((unsigned char *)nskb->nh.iph,
                                       nskb->nh.iph->ihl);

    /* "Never happens" */
    if (nskb->len > dst_mtu(nskb->dst))
        goto free_nskb;

    /* Associate the connection-tracking entry of oldskb with nskb */
    nf_ct_attach(nskb, oldskb);

    /* Finally, send the packet. The new packet is passed through the
       LOCAL_OUT hook, then routed, and finally leaves through the
       POST_ROUTING hook.
       I still have one question here: why can't the route simply be looked
       up directly; why must the packet go through LOCAL_OUT first? */
    NF_HOOK(PF_INET, NF_IP_LOCAL_OUT, nskb, NULL, nskb->dst->dev,
            dst_output);
    return;

free_nskb:
    kfree_skb(nskb);
}

A reader offered the following explanation of the question raised in the code comment above; it is quoted here:

--------------------------------------------------------------------------------------------------

In fact, the packet is not handed up to the upper layer; this is the same idea as the ip_queue_xmit() send path.
After the packet has been re-routed and its header has been built, it is placed in front of NF_IP_LOCAL_IN.

In fact, if you change the IP address along the way, you have to redo the route lookup yourself.
That involves a fairly complicated search of the route cache and, on a miss, of the routing tables. The route structure then has to be associated with a neighbour structure, which brings in the neighbour subsystem and an ARP cache lookup; on a miss there, the ARP procedure has to run to find the MAC address that corresponds to the IP address.

--------------------------------------------------------------------------------------------------

The analysis of send_reset above should make clear how a constructed packet can be sent through the netfilter framework.

Method 2: call dev_queue_xmit to hand the constructed packet directly to the NIC driver. In netfilter terms, this function is called after the POST_ROUTING hook. Put differently, a packet built at layer 3 is sent by calling the layer-2 transmit function directly. The function ends up calling skb->dev->hard_start_xmit, the transmit routine of the corresponding NIC driver, which sends the packet out.

Since this function works at layer 2, sending a packet through it requires no route lookup. (Strictly speaking, the unit at layer 2 is a frame and the unit at layer 3 is a packet; this article calls both packets.)

Layer-2 transmission, however, depends on the destination MAC address. The packet built under Method 1 only had its IP addresses swapped; its MAC addresses were left untouched, so passing it straight to dev_queue_xmit causes problems. Moreover, the data handed to this function must run from the layer-2 header to the end of the packet. To send a packet directly with this function you therefore need to set the source and destination MAC addresses, point skb->data at the MAC header, and increase skb->len by the length of that header. Sample code for reference:

unsigned char mac_temp[ETH_ALEN] = {0};
struct ethhdr *mach = NULL;
......
/* Code that constructs the IP header, the upper-layer protocol header and the data */
......
/* Swap source and destination MAC addresses */
mach = (struct ethhdr *)skb->mac.raw;
memcpy(mac_temp, (unsigned char *)mach->h_dest, ETH_ALEN);
memcpy(mach->h_dest, (unsigned char *)mach->h_source, ETH_ALEN);
memcpy(mach->h_source, mac_temp, ETH_ALEN);

/* Move skb->data back so that it points at the MAC header, and increase skb->len accordingly */
skb_push(skb, ETH_HLEN);
/* Call this function directly to send the packet out through the NIC */
ret = dev_queue_xmit(skb);
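
The two buffer manipulations in the sample can be tried out in user space. The sketch below models a packet buffer with a head pointer, a data pointer and a length, which is only a loose stand-in for sk_buff; toy_buf, toy_push and swap_mac are hypothetical names, and the constants mirror the values of the kernel's ETH_ALEN and ETH_HLEN.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAC_LEN      6    /* same value as ETH_ALEN */
#define ETH_HDR_LEN  14   /* same value as ETH_HLEN */

/* Toy stand-in for the data/len bookkeeping of an sk_buff. */
struct toy_buf {
    uint8_t *head;   /* start of the allocated buffer */
    uint8_t *data;   /* start of the currently visible data */
    size_t   len;    /* number of visible bytes from data onward */
};

/* Like skb_push in spirit: expose n more bytes in front of data.
 * Returns the new data pointer, or NULL if there is no headroom. */
uint8_t *toy_push(struct toy_buf *b, size_t n)
{
    assert(b != NULL);
    if ((size_t)(b->data - b->head) < n)
        return NULL;
    b->data -= n;
    b->len += n;
    return b->data;
}

/* Swap destination and source MAC in an Ethernet header laid out as
 * dst(6) src(6) ethertype(2). */
void swap_mac(uint8_t *eth)
{
    uint8_t tmp[MAC_LEN];

    memcpy(tmp, eth, MAC_LEN);
    memcpy(eth, eth + MAC_LEN, MAC_LEN);
    memcpy(eth + MAC_LEN, tmp, MAC_LEN);
}
```

After the push, data points at the Ethernet header and len covers the whole frame, which is the state dev_queue_xmit expects according to the text above.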

 

Return value of the hook function after the constructed packet has been sent
(1) First sending method. In the send_reset implementation, memory for nskb is allocated separately and a new packet is built in it. The new packet goes through the netfilter path on its own, and the original skb is disposed of by the module returning NF_DROP.
(2) Second sending method. If the new packet was built in place on top of an existing packet, the contents of the original packet no longer exist: the same buffer was filled with the new packet and sent out by dev_queue_xmit. Since no original packet is left, the hook must return NF_STOLEN to tell the protocol stack to leave the original packet alone. If, on the other hand, memory for the new packet was allocated separately, the hook should still return NF_DROP for the original packet.
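
The rule in (1) and (2) boils down to a single decision, sketched below as plain C. The enum is a local stand-in whose numeric values follow the kernel's NF_DROP, NF_ACCEPT and NF_STOLEN; the function name verdict_for_original is hypothetical.

```c
#include <assert.h>

/* Local stand-ins for the netfilter verdicts. */
enum toy_verdict {
    TOY_NF_DROP   = 0,
    TOY_NF_ACCEPT = 1,
    TOY_NF_STOLEN = 2
};

/*
 * Verdict the hook returns for the ORIGINAL packet once a reply has
 * been sent.
 *   built_in_place != 0: the reply reused the original buffer and that
 *                        buffer was handed to the transmit path, so the
 *                        stack must not touch it again.
 *   built_in_place == 0: the reply lives in separately allocated
 *                        memory, so the original can simply be dropped.
 */
enum toy_verdict verdict_for_original(int built_in_place)
{
    return built_in_place ? TOY_NF_STOLEN : TOY_NF_DROP;
}
```

Returning the drop verdict for a buffer that has already been handed to the transmit path would let the stack free memory it no longer owns, which is why the in-place case needs the stolen verdict.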

3. Summary

The two methods above are my own summary of how to send packets from within the kernel. In practice I have sent constructed packets by calling dev_queue_xmit, and I have tested sending RST packets by calling send_reset. I have not, however, tried sending other kinds of packets through NF_HOOK the way send_reset does. If you have practical experience with this, please share it.

While analyzing the send_reset code I consulted an article by muddoghole that I found through Baidu. It is only available as a Baidu snapshot and the link is too long, so it is not listed here; my thanks to the author of that article.

My understanding of some parts of the kernel is limited, so this article certainly contains mistakes. Thank you for reading.
