Original address: http://blog.chinaunix.net/uid-26833883-id-3627644.html
In the early days of network interconnection, hosts were interconnected using the NCP protocol. This protocol had many shortcomings: it could not interconnect different kinds of hosts, it could not interconnect different operating systems, and it had no error correction. To overcome these shortcomings, experts developed the TCP/IP protocol. Almost all operating systems now implement the TCP/IP protocol stack.
The TCP/IP protocol stack is divided into four layers: application layer, transport layer, network layer, and data link layer. Each layer has its corresponding protocols.
A protocol is an agreed format in which two parties exchange data. Many protocols are used across the network and, fortunately, each of them is documented in an RFC. This article analyzes only the IP, TCP, and UDP protocol headers.
First, look at the format of an Ethernet frame on the network:
On Linux, when we want to send data, we only need to prepare the data in the upper layer and hand it to the kernel protocol stack; the kernel stack automatically adds the corresponding protocol headers. Let's look at what the header added by each layer contains.
I. TCP protocol
TCP is a connection-oriented, highly reliable transport layer protocol (no data loss, no reordering, no corruption, no duplication).
1. TCP header analysis
The format of the TCP header and the meaning of each field are as follows:
(1) Port numbers [16 bit each]
The network provides communication between processes on different hosts. An operating system runs many processes, so when data arrives, which process should it be delivered to? This is what the port number is for. The TCP header contains a source port number and a destination port number. The source port identifies the process on the sending host, and the destination port identifies the process on the receiving host.
(2) Sequence numbers [32 bit each]
There are two of them: the sequence number and the acknowledgment number.
Sequence number: identifies the byte stream sent from the TCP sender to the TCP receiver; it is the sequence number of the first data byte in this segment. If the byte stream is viewed as a one-way flow between two applications, TCP counts every byte with a sequence number. The sequence number is a 32-bit unsigned number; after reaching 2^32 - 1 it wraps around to 0.
When a new connection is established, the SYN flag is set to 1, and the sequence number field contains the initial sequence number (ISN) that this host has chosen for the connection.
Acknowledgment number: contains the next sequence number that the sender of the acknowledgment expects to receive. It is therefore the sequence number of the last data byte successfully received, plus 1. The acknowledgment number field is valid only when the ACK flag is 1. TCP provides a full-duplex service to the application layer, meaning data can flow independently in both directions, so each end of the connection must keep track of the sequence numbers for each direction.
(3) Offset [4 bit]
The offset is really the length of the TCP header, expressed as the number of 32-bit words in the header; it tells where the user data of the segment begins. The field occupies 4 bits. For example, if its value is 0101, the header length is 5 * 4 = 20 bytes, and the maximum header length is 15 * 4 = 60 bytes. Without options, the normal length is 20 bytes.
(4) Reserved [6 bit]
Currently unused; its value is 0.
(5) Flags [6 bit]
There are 6 flag bits in the TCP header, and several of them can be set to 1 at the same time.
URG  the urgent pointer is valid
ACK  the acknowledgment number is valid
PSH  the receiver should pass this segment to the application layer promptly instead of waiting for the buffer to fill
RST  resets (aborts) the connection
SYN  synchronizes sequence numbers to initiate a connection
FIN  the sender has finished sending (that is, it is closing the connection)
(6) Window size [16 bit]
The size of the window: the maximum number of bytes that the sender of this segment is able to accept.
(7) Checksum [16 bit]
The checksum covers the entire TCP segment: the TCP header and the TCP data.
This is a mandatory field: it must be computed and stored by the sender and verified by the receiver.
(8) Urgent pointer [16 bit]
The urgent pointer is valid only when the URG flag is set to 1. It is a positive offset that, added to the value in the sequence number field, gives the sequence number of the last byte of urgent data. TCP's urgent mode is a way for the sender to deliver urgent data to the other end.
(9) TCP options
Options are optional; we will look at them when we capture packets.
2. Key details
(1) Three-way handshake to establish a connection
a. The requester (usually called the client) sends a SYN segment that specifies the server port the client wants to connect to, together with the client's initial sequence number (ISN, 1415531521 in this example). This SYN segment is segment 1.
b. The server replies with its own SYN segment (segment 2) containing the server's initial sequence number. At the same time it sets the acknowledgment number to the client's ISN plus 1 to acknowledge the client's SYN. A SYN consumes one sequence number.
c. The client must set the acknowledgment number to the server's ISN plus 1 to acknowledge the server's SYN segment (segment 3).
These three segments complete the connection establishment. The process is known as the three-way handshake. It settles the sequence numbers for both directions, the maximum amount of data each side can accept (the window), and the MSS (Maximum Segment Size).
MSS = MTU - IP header - TCP header. MTU is the maximum transmission unit, discussed in the IP header analysis; it is typically 1500 bytes. The IP header and the TCP header are each 20 bytes when they carry no options, so in that case MSS = 1500 - 20 - 20 = 1460.
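The sequence and acknowledgment numbers exchanged during the handshake, and the MSS formula, can be traced with a short sketch. The client ISN is the one from the example; the server ISN is a made-up value chosen to show the wraparound at 2^32, and the function names are illustrative only.

```python
SEQ_MOD = 2 ** 32  # sequence numbers are 32-bit and wrap around to 0


def next_seq(seq, consumed):
    """Advance a sequence number by `consumed` units, wrapping at 2**32."""
    return (seq + consumed) % SEQ_MOD


def three_way_handshake(client_isn, server_isn):
    """Return (flags, seq, ack) for the three handshake segments."""
    # Segment 1: client -> server, carries the client's ISN.
    seg1 = ("SYN", client_isn, None)
    # Segment 2: server -> client, acknowledges client ISN + 1.
    seg2 = ("SYN+ACK", server_isn, next_seq(client_isn, 1))
    # Segment 3: client -> server, acknowledges server ISN + 1.
    # The client's own SYN consumed one sequence number.
    seg3 = ("ACK", next_seq(client_isn, 1), next_seq(server_isn, 1))
    return [seg1, seg2, seg3]


def mss(mtu, ip_header=20, tcp_header=20):
    """MSS = MTU - IP header - TCP header (headers without options)."""
    return mtu - ip_header - tcp_header


# Client ISN from the text; server ISN is a hypothetical value (2**32 - 1).
segments = three_way_handshake(1415531521, 4294967295)
for flags, seq, ack in segments:
    print(flags, seq, ack)
print("MSS:", mss(1500))
```

With the hypothetical server ISN at the top of the range, the client's acknowledgment number wraps around to 0.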
The MSS limits how much data one TCP segment can carry. When the application layer hands data to the transport layer for transmission over TCP and the data is larger than the MSS, it must be split into multiple segments that are sent one after another.
In the packet capture, the first three packets are the three-way handshake and the last three carry the data: the data is 4,096 bytes, so it takes three transfers (1448 + 1448 + 1200). The attentive reader will ask why the maximum data size per transfer is not 1460 bytes. The reason is that TCP carries options here: TCP header length = 20 + 12 (option size) = 32 bytes, so the maximum data that can be carried is 1500 - 20 - 32 = 1448 bytes.
(2) Four-way wave to close a connection
a. Network communication is built on sockets. When the client closes its socket, the kernel stack automatically sends a packet with FIN set to the server to request disconnection. The side that first sends the disconnect request is called the active closer.
b. After the server receives the FIN from the client, its kernel stack immediately sends an ACK packet in reply, indicating that the client's request has been received.
c. After running for some time, the server closes its own socket. At that point its kernel stack sends a packet with FIN set to the client to request disconnection.
d. After the client receives the FIN from the server, it sends an ACK in reply, indicating that the server's request has been received.
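To tie the header fields to the 1448-byte figure, here is a minimal sketch that unpacks the fixed part of a TCP header and then splits 4,096 bytes into segments. The sample header is built by hand: the ports, window, and acknowledgment number are made-up values, and the 12 option bytes are zero placeholders rather than real options.

```python
import struct

FLAG_NAMES = ["FIN", "SYN", "RST", "PSH", "ACK", "URG"]  # bit 0 .. bit 5


def parse_tcp_header(segment):
    """Unpack the fixed 20-byte part of a TCP header."""
    (src_port, dst_port, seq, ack,
     offset_flags, window, checksum, urgent) = struct.unpack(
        "!HHIIHHHH", segment[:20])
    header_len = (offset_flags >> 12) * 4  # offset counts 32-bit words
    flag_bits = offset_flags & 0x3F        # low 6 bits hold the flags
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "seq": seq,
        "ack": ack,
        "header_len": header_len,
        "flags": [n for i, n in enumerate(FLAG_NAMES) if flag_bits & (1 << i)],
        "window": window,
        "checksum": checksum,
        "urgent": urgent,
    }


def split_into_segments(data_len, mtu=1500, ip_header=20, tcp_header=20):
    """Return the payload size of each segment needed to carry data_len bytes."""
    max_payload = mtu - ip_header - tcp_header
    sizes = []
    while data_len > 0:
        size = min(max_payload, data_len)
        sizes.append(size)
        data_len -= size
    return sizes


# Hypothetical header: offset = 8 words (32 bytes), flags = PSH + ACK.
sample = struct.pack("!HHIIHHHH", 40000, 80, 1415531522, 1,
                     (8 << 12) | 0x18, 5840, 0, 0) + bytes(12)
hdr = parse_tcp_header(sample)
sizes = split_into_segments(4096, tcp_header=hdr["header_len"])
print(hdr["header_len"], hdr["flags"], sizes)
```

Because the parsed header length is 32 bytes rather than 20, each segment carries at most 1448 bytes, which reproduces the 1448 + 1448 + 1200 split.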
(3) The guarantee of TCP reliability
TCP uses a technique called positive acknowledgment with retransmission as the basis for its reliable data transfer service. This technique requires the receiver to send an acknowledgment (ACK) back to the source after receiving data. The sender keeps a record of each packet it sends and waits for the acknowledgment before sending the next packet. The sender also starts a timer when it sends a packet; if the timer expires before the acknowledgment arrives, it retransmits the packet. Figure 3-5 shows data transfer with a positive acknowledgment protocol with retransmission, and figure 3-6 shows the timeout and retransmission caused by packet loss. To cope with late and duplicate acknowledgments caused by network delay, the protocol carries a sequence number in each acknowledgment so that the acknowledgment can be matched correctly to the packet it confirms.
As figure 3-5 shows, although the network is capable of communicating in both directions simultaneously, the simple positive acknowledgment protocol wastes a great deal of valuable network bandwidth, because the next packet cannot be sent until the acknowledgment of the previous packet has been received. For this reason, TCP uses a sliding window mechanism to improve network throughput while also handling end-to-end flow control.
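The send, wait, time out, retransmit cycle can be illustrated with a toy simulation. This is only a sketch of the idea, not how a real TCP stack is written: there is no real timer or network, and losses are specified up front as a hypothetical set of (sequence number, attempt) pairs.

```python
def stop_and_wait(packets, lost_attempts):
    """Simulate positive acknowledgment with retransmission.

    `lost_attempts` holds (sequence number, attempt number) pairs for which
    the packet or its acknowledgment is lost, so the sender's timer expires.
    Returns the list of events in order.
    """
    log = []
    for seq in range(len(packets)):
        attempt = 1
        while True:
            log.append(f"send seq={seq} attempt={attempt}")
            if (seq, attempt) in lost_attempts:
                # No acknowledgment arrives before the timer expires.
                log.append(f"timeout seq={seq}, retransmit")
                attempt += 1
                continue
            # Acknowledgment received: only now may the next packet go out.
            log.append(f"ack seq={seq}")
            break
    return log


# Hypothetical scenario: the first transmission of packet 1 is lost.
log = stop_and_wait(["a", "b", "c"], {(1, 1)})
for event in log:
    print(event)
```

Each packet waits for its own acknowledgment before the next one is sent, which is exactly the bandwidth waste that the sliding window addresses.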
(4) Sliding window technology
Sliding window technology is a more sophisticated variant of simple positive acknowledgment with retransmission: it allows the sender to transmit multiple packets before waiting for an acknowledgment. As shown in figure 3-7, the sender has a sequence of packets to send. The sliding window protocol places a fixed-length window over the sequence and sends all packets inside the window. When the sender receives the acknowledgment for the first packet in the window, the window slides forward and the next packet is sent; as further acknowledgments arrive, the window keeps sliding forward.
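The window movement can be sketched as follows. The simulation assumes a loss-free link on which acknowledgments arrive one at a time and in order, which is a simplification; the function and variable names are illustrative.

```python
from collections import deque


def sliding_window_send(num_packets, window_size):
    """Send packets through a fixed-size sliding window (loss-free link).

    Returns the ordered list of ("send", n) and ("ack", n) events.
    """
    log = []
    base = 0              # oldest unacknowledged packet (left window edge)
    next_seq = 0          # next packet that has not been sent yet
    in_flight = deque()   # packets sent but not yet acknowledged
    while base < num_packets:
        # Send every packet that fits in the window without waiting.
        while next_seq < num_packets and next_seq < base + window_size:
            log.append(("send", next_seq))
            in_flight.append(next_seq)
            next_seq += 1
        # The acknowledgment for the first packet in the window arrives,
        # so the window slides forward by one packet.
        acked = in_flight.popleft()
        log.append(("ack", acked))
        base = acked + 1
    return log


events = sliding_window_send(num_packets=5, window_size=3)
print(events)
```

With a window of 3, packets 0 to 2 go out back to back, and packet 3 is sent as soon as packet 0 is acknowledged.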
II. UDP protocol
UDP is also a transport layer protocol; it is connectionless and does not guarantee reliable delivery. Its header is relatively simple, with four 16-bit fields: source port, destination port, length, and checksum.
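As a hedged illustration of that layout, the sketch below packs and unpacks the 8-byte UDP header (source port, destination port, length, checksum, 16 bits each). The port numbers and payload are made-up values, and the checksum is left at 0, which in UDP over IPv4 means that no checksum was computed.

```python
import struct

UDP_HEADER_LEN = 8  # four 16-bit fields


def build_udp_header(src_port, dst_port, payload, checksum=0):
    """Pack a UDP header; the length field covers header plus data."""
    length = UDP_HEADER_LEN + len(payload)
    return struct.pack("!HHHH", src_port, dst_port, length, checksum)


def parse_udp_header(datagram):
    """Unpack the UDP header and return its fields together with the data."""
    src_port, dst_port, length, checksum = struct.unpack(
        "!HHHH", datagram[:UDP_HEADER_LEN])
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "length": length,
        "checksum": checksum,
        "data": datagram[UDP_HEADER_LEN:length],
    }


# Hypothetical datagram: ports and payload are illustrative only.
payload = b"hello"
datagram = build_udp_header(5000, 53, payload) + payload
info = parse_udp_header(datagram)
print(info)
```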
The port numbers need no further explanation; they have the same meaning as TCP port numbers.
Length: occupies 2 bytes and gives the length of the UDP datagram in bytes, header plus data.
Checksum: covers the UDP header and the data.
III. IP protocol
IP is the core protocol of the TCP/IP protocol family. All TCP, UDP, ICMP, and IGMP data is transmitted in IP datagrams. Its characteristics are as follows:
Unreliable: IP does not guarantee that a datagram will reach its destination. It provides only a best-effort delivery service. If something goes wrong, for example a router temporarily runs out of buffers, IP has a simple error-handling algorithm: discard the datagram, then send an ICMP message to the source. Any required reliability must be provided by the upper layers (e.g. TCP).
Connectionless: IP does not maintain any state information about successive datagrams; each datagram is handled independently of the others. This also means that IP datagrams may be received out of the order in which they were sent. If a source sends two consecutive datagrams (first A, then B) to the same destination, each is routed independently and may take a different route, so B may arrive before A.
1. IP header format
(1) Version occupies 4 bits and gives the version of the IP protocol. Both sides of the communication must use the same IP version. The widely used version number is 4 (that is, IPv4); version 6 (IPv6) is its successor.
(2) Header length occupies 4 bits, so the largest decimal value it can represent is 15. Note that the unit of this field is a 32-bit word (4 bytes), so when the header length field is 1111 (decimal 15) the header is 60 bytes long. When the header of an IP packet is not an integer multiple of 4 bytes, it must be padded using the padding field at the end. The data part therefore always starts on a 4-byte boundary, which makes IP more convenient to implement. The drawback of limiting the header to 60 bytes is that it is sometimes not enough; the limit is intended to encourage users to keep overhead to a minimum. The most common header length is 20 bytes (header length field 0101), in which case no options are used.
(3) Differentiated services occupies 8 bits and is used to obtain better service. In the old standard this field was called Type of Service, but it was never really used in practice. In 1998 the IETF renamed it the Differentiated Services (DS) field. The field has an effect only when differentiated services are in use.
(4) Total length is the combined length of the header and the data, in bytes. The total length field is 16 bits, so the maximum length of a datagram is 2^16 - 1 = 65535 bytes.
Every data link layer below the IP layer has its own frame format, which includes a maximum length for the data field of the frame. This is called the Maximum Transmission Unit (MTU). When a datagram is encapsulated in a link-layer frame, its total length (header plus data) must not exceed the MTU of the underlying data link layer.
(5) Identification occupies 16 bits. The IP software maintains a counter in memory; each time a datagram is generated, the counter is incremented by 1 and its value is assigned to the identification field. This identification is not a sequence number, however: because IP is a connectionless service, there is no question of receiving datagrams in order. When a datagram must be fragmented because it is longer than the MTU of the network, the value of the identification field is copied into the identification field of every fragment. The shared identification value allows the fragments to be correctly reassembled into the original datagram.
(6) Flags occupies 3 bits, of which only 2 are currently meaningful.
The lowest bit of the flags field is MF (More Fragments). MF=1 means that more fragments follow. MF=0 means that this is the last fragment of the datagram.
The middle bit of the flags field is DF (Don't Fragment), meaning "must not be fragmented". Fragmentation is allowed only when DF=0.
(7) Fragment offset occupies 13 bits. After a long datagram has been fragmented, the fragment offset gives the position of a fragment within the original datagram, that is, where the fragment starts relative to the beginning of the user data field. The offset is counted in units of 8 bytes. This means that every fragment except the last must carry a multiple of 8 bytes (64 bits) of data.
(8) Time to live occupies 8 bits. The field is commonly abbreviated TTL and indicates the lifetime of the datagram in the network. It is set by the source that sends the datagram. Its purpose is to prevent undeliverable datagrams from circulating in the Internet indefinitely and consuming network resources for nothing. In the original design the unit of TTL was the second: each time a datagram passes through a router, the TTL is reduced by the time the datagram spent in that router, and if that time is less than 1 second, the TTL is reduced by 1. When the TTL reaches 0, the datagram is discarded.
(9) Protocol occupies 8 bits. The field indicates which protocol the data carried by this datagram belongs to, so that the IP layer of the destination host knows which handler to pass the data portion to.
(10) Header checksum occupies 16 bits. This field checks only the header of the datagram, not the data. The reason is that every time a datagram passes through a router, the router must recompute the header checksum (some fields, such as time to live, flags, and fragment offset, may change). Leaving the data out of the check reduces the amount of computation.
(11) The source IP address occupies 32 bits.
(12) The destination IP address occupies 32 bits.
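The twelve fields above map directly onto the first 20 bytes of a datagram. The following sketch unpacks them and verifies the header checksum with the usual 16-bit one's-complement sum. The sample header is built by hand; its addresses, identification, and TTL are made-up values, and options are not parsed.

```python
import socket
import struct

IPV4_FORMAT = "!BBHHHBBH4s4s"  # 20 bytes, network byte order


def internet_checksum(data):
    """16-bit one's-complement sum of the data, complemented."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def parse_ipv4_header(packet):
    """Unpack the fixed 20-byte IPv4 header."""
    (ver_ihl, ds, total_len, ident, flags_frag,
     ttl, protocol, checksum, src, dst) = struct.unpack(
        IPV4_FORMAT, packet[:20])
    header_len = (ver_ihl & 0x0F) * 4          # unit is a 32-bit word
    return {
        "version": ver_ihl >> 4,
        "header_len": header_len,
        "ds": ds,
        "total_len": total_len,
        "id": ident,
        "df": bool(flags_frag & 0x4000),        # Don't Fragment
        "mf": bool(flags_frag & 0x2000),        # More Fragments
        "frag_offset": (flags_frag & 0x1FFF) * 8,  # unit is 8 bytes
        "ttl": ttl,
        "protocol": protocol,
        # A valid header sums to zero when its checksum field is included.
        "checksum_ok": internet_checksum(packet[:header_len]) == 0,
        "src": socket.inet_ntoa(src),
        "dst": socket.inet_ntoa(dst),
    }


def build_sample_header():
    """Build a hypothetical 20-byte header (DF set, TTL 64, protocol 6 = TCP)."""
    fields = [0x45, 0, 40, 1, 0x4000, 64, 6, 0,
              socket.inet_aton("192.168.0.1"), socket.inet_aton("192.168.0.2")]
    fields[7] = internet_checksum(struct.pack(IPV4_FORMAT, *fields))
    return struct.pack(IPV4_FORMAT, *fields)


ip = parse_ipv4_header(build_sample_header())
print(ip)
```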
2. Fragmentation
Fragmentation means that when the data to be transmitted is larger than the Maximum Transmission Unit (MTU), it must be split into multiple packets that are then sent to the peer. Many people cannot tell this apart from the MSS discussed under TCP. The diagram below should make the difference clear.
In my view, data handed to the IP layer by TCP does not need to be fragmented, because TCP has already split it according to the MSS. Fragmentation is needed only when large amounts of data are sent over UDP. Example: transmitting 10,240 bytes of data over UDP.
As can be seen, when the data is handed to the network layer it is fragmented, because it exceeds the maximum transmission unit. It is split into multiple packets that are sent to the peer over IP. The maximum number of data bytes per packet is MTU - IP header = 1500 - 20 = 1480.
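The fragmentation of this example can be worked through in a few lines. The sketch assumes a 20-byte IP header without options and counts the 8-byte UDP header as part of the IP payload being fragmented; the function name is illustrative.

```python
UDP_HEADER_LEN = 8


def fragment(payload_len, mtu=1500, ip_header=20):
    """Split an IP payload into fragments.

    Returns (offset_field, data_len, more_fragments) tuples, where
    offset_field is expressed in units of 8 bytes as in the IP header.
    """
    # Data per fragment must be a multiple of 8 bytes (except the last one).
    max_data = (mtu - ip_header) // 8 * 8
    fragments = []
    offset = 0
    while offset < payload_len:
        size = min(max_data, payload_len - offset)
        more = offset + size < payload_len   # MF flag
        fragments.append((offset // 8, size, more))
        offset += size
    return fragments


# 10,240 bytes of user data plus the UDP header form the IP payload.
frags = fragment(10240 + UDP_HEADER_LEN)
for offset_field, size, mf in frags:
    print(f"offset={offset_field:4d}  data={size:4d}  MF={int(mf)}")
```

Under these assumptions the datagram is carried in seven fragments: six with 1480 bytes of data and MF=1, and a final one with 1368 bytes and MF=0.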
IV. Ethernet header
It has three components: destination MAC address | source MAC address | protocol type. On Ethernet, packets therefore come in several formats:
ARP (Address Resolution Protocol) obtains the MAC address that corresponds to an IP address.
RARP (Reverse Address Resolution Protocol) obtains the IP address that corresponds to a MAC address.
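A minimal sketch of reading the 14-byte Ethernet header and using the type field to tell IP, ARP, and RARP frames apart. The type values are the standard EtherType numbers; the sample frame and its MAC addresses are made up for illustration.

```python
import struct

# EtherType values for the protocols mentioned in the text.
ETHERTYPES = {0x0800: "IP", 0x0806: "ARP", 0x8035: "RARP"}


def format_mac(raw):
    """Render 6 raw bytes as a colon-separated MAC address."""
    return ":".join(f"{byte:02x}" for byte in raw)


def parse_ethernet_header(frame):
    """Unpack destination MAC, source MAC, and the 16-bit type field."""
    dst, src, ethertype = struct.unpack("!6s6sH", frame[:14])
    return {
        "dst": format_mac(dst),
        "src": format_mac(src),
        "type": ETHERTYPES.get(ethertype, hex(ethertype)),
        "payload": frame[14:],
    }


# Hypothetical broadcast ARP frame with an empty payload.
sample_frame = (bytes.fromhex("ffffffffffff") +
                bytes.fromhex("001122334455") +
                struct.pack("!H", 0x0806))
eth = parse_ethernet_header(sample_frame)
print(eth)
```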