IP Fragmentation

Source: Internet
Author: User
When the IP layer receives an IP datagram to send, it determines which local interface the datagram will go out on (routing) and queries that interface for its MTU. IP compares the MTU with the datagram length and fragments the datagram if necessary. Fragmentation can take place on the original sending host or at an intermediate router.
Once an IP datagram has been fragmented, it is not reassembled until it reaches its final destination. (This differs from some other network protocols, which require reassembly at the next hop rather than at the final destination.) Reassembly is performed by the IP layer at the destination. The goal is to make fragmentation and reassembly transparent to the transport layer (TCP and UDP), except for a possible loss of performance. A fragment of a datagram may itself be fragmented again, possibly more than once. The information carried in the IP header is sufficient for both fragmentation and reassembly.
Recall the IP header (Figure 3-1). The following fields are used in fragmentation. The identification field contains a value that is unique for each IP datagram the sender transmits. This value is copied into every fragment of the datagram (we can now see the purpose of this field). The flags field uses one bit as the "more fragments" bit. This bit is set to 1 in every fragment except the last one. The fragment offset field gives the position of the fragment relative to the beginning of the original datagram. In addition, when a datagram is fragmented, the total length field of each fragment is changed to the length of that fragment.
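As an illustration of the fields just described, here is a minimal Python sketch that pulls them out of a 20-byte IPv4 header. The function name `parse_fragment_fields` and the sample header (addresses, TTL, zero checksum) are made up for this example; only the field positions and bit masks come from the IPv4 header layout.

```python
import struct

def parse_fragment_fields(header: bytes) -> dict:
    """Extract the fragmentation-related fields from an IPv4 header."""
    # Bytes 2-3: total length, bytes 4-5: identification,
    # bytes 6-7: 3 flag bits followed by a 13-bit fragment offset.
    total_length, identification, flags_offset = struct.unpack("!HHH", header[2:8])
    return {
        "total_length": total_length,
        "identification": identification,
        "dont_fragment": bool(flags_offset & 0x4000),
        "more_fragments": bool(flags_offset & 0x2000),
        # On the wire the offset counts 8-byte units; convert it to bytes.
        "offset_bytes": (flags_offset & 0x1FFF) * 8,
    }

# Hypothetical header of a first fragment: identification 26304,
# "more fragments" set, offset 0, protocol 17 (UDP), checksum left as 0.
sample = struct.pack(
    "!BBHHHBBH4s4s",
    0x45, 0, 1500, 26304, 0x2000, 64, 17, 0,
    bytes([10, 0, 0, 1]), bytes([10, 0, 0, 2]),
)
fields = parse_fragment_fields(sample)
print(fields)
```

Storing the offset in 8-byte units is what lets a 13-bit field cover a datagram of up to 65535 bytes.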
Finally, one bit in the flags field is called the "don't fragment" bit. If this bit is set to 1, IP will not fragment the datagram. Instead, the datagram is discarded and an ICMP error ("fragmentation needed but don't fragment bit set", see Figure 6-3) is sent to the originator. We will see an example of this error in the next section.
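The decision described here can be sketched in a few lines of Python. This is a simplified model, not the behavior of any particular IP implementation; the function name `handle_outgoing` and the returned strings are invented for the example. The ICMP error in question is destination unreachable (type 3) with code 4.

```python
def handle_outgoing(datagram_len: int, mtu: int, dont_fragment: bool) -> str:
    """Decide what to do with a datagram on an interface with the given MTU."""
    if datagram_len <= mtu:
        return "send"
    if dont_fragment:
        # Too big and fragmentation is forbidden: discard the datagram and
        # report ICMP destination unreachable (type 3), code 4.
        return "drop and send ICMP type 3 code 4"
    return "fragment"

print(handle_outgoing(1500, 1500, dont_fragment=True))
print(handle_outgoing(1501, 1500, dont_fragment=True))
print(handle_outgoing(1501, 1500, dont_fragment=False))
```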
When an IP datagram is fragmented, each fragment becomes a packet of its own, with its own IP header, and is routed independently of the other fragments. As a result, the fragments of a datagram may arrive at the destination out of order, but the IP header contains enough information for the receiver to reassemble them correctly.
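To show why the offset and the "more fragments" bit are enough to cope with out-of-order arrival, here is a minimal reassembly sketch in Python. It assumes all fragments passed in already share one identification value, and it ignores timeouts, duplicates, and overlapping fragments, which a real IP layer must handle. The name `reassemble` and the tuple layout are assumptions made for this example.

```python
def reassemble(fragments):
    """Reassemble one datagram's data from its fragments.

    fragments: iterable of (offset_bytes, more_fragments, data) tuples.
    Returns the reassembled bytes, or None if a fragment is still missing.
    """
    expected = 0
    payload = bytearray()
    saw_last = False
    # Sorting by offset undoes any reordering that happened in transit.
    for offset, more, data in sorted(fragments, key=lambda f: f[0]):
        if offset != expected:
            return None  # a gap: some fragment has not arrived
        payload += data
        expected += len(data)
        saw_last = not more
    # Complete only if the highest-offset fragment had "more fragments" clear.
    return bytes(payload) if saw_last else None

# Fragments arriving out of order: the 1-byte tail first, then the head.
arrived = [(1480, False, b"z"), (0, True, bytes(1480))]
result = reassemble(arrived)
print(len(result))
```

If the fragment with "more fragments" cleared were the only one to arrive, the function would return None, because the gap at offset 0 shows that data is still missing.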
Although IP fragmentation looks transparent, it has one undesirable property: if a single fragment is lost, the entire datagram must be retransmitted. Why? Because the IP layer itself has no timeout and retransmission mechanism. That is the responsibility of the higher layers (TCP performs timeout and retransmission, UDP does not, although some UDP applications perform timeout and retransmission themselves). When a fragment of a TCP segment is lost, TCP times out and retransmits the entire TCP segment, which corresponds to one IP datagram. There is no way to retransmit only one fragment of a datagram. Indeed, if the datagram was fragmented by an intermediate router rather than by the originating system, the originating system has no way of knowing how the datagram was fragmented. For this reason alone, fragmentation is often avoided. [Kent and Mogul 1987] present arguments for avoiding fragmentation.
With UDP it is easy to cause IP fragmentation. (As we will see later, TCP tries to avoid fragmentation, and it is almost impossible for an application to force TCP to send a segment long enough to require fragmentation.) We can use the sock program and increase the length of the datagram until fragmentation occurs. On an Ethernet, the maximum amount of data in a frame is 1500 bytes (see Figure 2-1), which leaves 1472 bytes for user data, assuming 20 bytes for the IP header and 8 bytes for the UDP header. We run the sock program with data lengths of 1471, 1472, 1473, and 1474 bytes. The last two should cause fragmentation:
bsdi % sock -u -i -n1 -w1471 svr4 discard
bsdi % sock -u -i -n1 -w1472 svr4 discard
bsdi % sock -u -i -n1 -w1473 svr4 discard
bsdi % sock -u -i -n1 -w1474 svr4 discard
The corresponding tcpdump output is shown in Figure 11-7.

The first two UDP datagrams (lines 1 and 2) fit into an Ethernet frame and are not fragmented. However, the IP datagram produced by writing 1473 bytes is 1501 bytes long, so it must be fragmented (lines 3 and 4). Similarly, the datagram produced by writing 1474 bytes is 1502 bytes long, and it is also fragmented (lines 5 and 6).
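The arithmetic behind these four cases can be checked with a short Python sketch. It assumes a 20-byte IP header with no options, an 8-byte UDP header, and the 1500-byte Ethernet MTU; the constant and function names are chosen for this example only.

```python
ETHERNET_MTU = 1500  # maximum data in an Ethernet frame
IP_HEADER = 20       # assumes no IP options
UDP_HEADER = 8

def ip_datagram_length(user_data: int) -> int:
    """Length of the IP datagram carrying a UDP write of user_data bytes."""
    return IP_HEADER + UDP_HEADER + user_data

for size in (1471, 1472, 1473, 1474):
    total = ip_datagram_length(size)
    verdict = "must be fragmented" if total > ETHERNET_MTU else "fits in one frame"
    print(f"write {size} bytes -> {total}-byte datagram, {verdict}")
```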
When an IP datagram is fragmented, tcpdump prints additional information. First, frag 26304 (lines 3 and 4) and frag 26313 (lines 5 and 6) give the value of the identification field in the IP header.
The next number in the fragmentation information, the 1480 between the colon and the @ sign on line 3, is the length of the fragment excluding the IP header. The first fragment of both datagrams contains 1480 bytes: 8 bytes of UDP header and 1472 bytes of user data (with the 20-byte IP header, the packet is exactly 1500 bytes). The second fragment of the first datagram (line 4) contains only 1 byte of data, the remaining user data. The second fragment of the second datagram (line 6) contains the remaining 2 bytes of user data.
Fragmentation requires that the data portion (everything except the IP header) of every fragment other than the last be a multiple of 8 bytes. In this example, 1480 is a multiple of 8.
The number after the @ sign is the offset of the fragment's data from the beginning of the datagram. The first fragment of both datagrams has an offset of 0 (lines 3 and 5), and the second fragment of both has an offset of 1480 (lines 4 and 6). The plus sign following the offset corresponds to the "more fragments" bit in the 3-bit flags field of the IP header. This bit lets the receiver know when it has received all the fragments needed to reassemble the datagram.
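The lengths and offsets discussed above can be reproduced with a small Python sketch. It models only the size arithmetic (20-byte IP header without options, 8-byte UDP header, 1500-byte MTU) and formats each fragment in the length@offset notation used in the tcpdump output, with a trailing plus sign when more fragments follow. The names `fragment_layout` and `tcpdump_style` are invented for this example.

```python
def fragment_layout(user_data, mtu=1500, ip_header=20, udp_header=8):
    """Return a list of (length, offset, more_fragments) tuples.

    length and offset are in bytes and exclude the IP header.
    """
    remaining = udp_header + user_data  # everything after the IP header
    # Non-final fragments must carry a multiple of 8 bytes.
    max_data = (mtu - ip_header) // 8 * 8
    layout, offset = [], 0
    while remaining > 0:
        length = min(remaining, max_data)
        remaining -= length
        layout.append((length, offset, remaining > 0))
        offset += length
    return layout

def tcpdump_style(layout):
    """Format each fragment as length@offset, with '+' if more follow."""
    return [f"{length}@{offset}{'+' if more else ''}" for length, offset, more in layout]

print(tcpdump_style(fragment_layout(1473)))  # ['1480@0+', '1@1480']
print(tcpdump_style(fragment_layout(1474)))  # ['1480@0+', '2@1480']
```

For a write of 1472 bytes or less the function returns a single entry with the flag clear, which corresponds to a datagram that is not fragmented at all.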
Finally, notice that lines 4 and 6 (the fragments other than the first) omit the protocol name (UDP) and the source and destination port numbers. The protocol name could be printed, because it is in the IP header, which is copied into every fragment. The port numbers, however, are in the UDP header, which appears only in the first fragment.
Figure 11-8 shows what happens to the third datagram sent (1473 bytes of user data). It reiterates that any transport layer header appears only in the first fragment.
A note on terminology: an IP datagram is the unit of end-to-end transmission at the IP layer (before fragmentation and after reassembly), and a packet is the unit of data passed between the IP layer and the link layer. A packet can be a complete IP datagram or a fragment of an IP datagram.
 
