This article briefly introduces the connection-oriented theory behind TCP, describes the meaning of each field of the TCP segment in detail, and then analyzes the segments of a TCP connection establishment captured with Wireshark.
I. Overview
TCP is a reliable, connection-oriented transport protocol. Two processes must establish a connection before they can send data to each other. The connection here is nothing more than some buffers and state variables allocated in the end systems; the packet switches in between do not maintain any connection state. The connection is established as follows (the three-way handshake):
First, the client sends a special TCP segment;
Second, the server responds with another special TCP segment;
Finally, the client replies with a third special segment.
Figure 1 three-way handshake protocol [1]
II. TCP segment format
2.1 Overview
To provide reliable data transmission, the TCP header contains many fields. The format of a TCP segment is as follows:
Figure 2 TCP segment format
Source and destination ports
These are used to multiplex and demultiplex data from and to the upper-layer applications. A port can be understood as the identifier of a particular process on a computer.
Sequence number and acknowledgment number
These two fields are the key to TCP's reliable transmission service. The sequence number is the byte-stream number of the first byte in the segment (TCP treats data as an ordered byte stream and implicitly numbers every byte of it). A more intuitive way to see it: when a message is split into several segments, the sequence number is the offset of the segment's first byte within the whole message. The acknowledgment number specifies the next byte expected. TCP is full-duplex: if host A is receiving data from host B, the acknowledgment number that host A puts in its segments is the number of the next byte that host A expects from host B. To see how the two numbers relate, consider the three-way handshake:
Figure 3 TCP connection establishment process under normal circumstances
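The relationship between the two numbers during the handshake can be sketched in a few lines of Python. This is only a minimal model, not real TCP: the function name `three_way_handshake` and the parameters `client_isn` and `server_isn` (the initial sequence numbers chosen by each side) are hypothetical names introduced for illustration. The sketch relies on the fact that a SYN consumes one sequence number.

```python
def three_way_handshake(client_isn, server_isn):
    """Return the (sender, flags, seq, ack) tuples of a normal handshake.

    Illustrative model only: a SYN consumes one sequence number, so each
    side acknowledges the peer's initial sequence number plus one.
    """
    mod = 2 ** 32  # sequence numbers are 32 bits wide and wrap around
    syn = ("client", "SYN", client_isn, None)  # no acknowledgment yet
    syn_ack = ("server", "SYN+ACK", server_isn, (client_isn + 1) % mod)
    ack = ("client", "ACK", (client_isn + 1) % mod, (server_isn + 1) % mod)
    return [syn, syn_ack, ack]


if __name__ == "__main__":
    # Initial sequence numbers of 0 mimic Wireshark's relative numbering.
    for sender, flags, seq, ack in three_way_handshake(0, 0):
        print(f"{sender:6} {flags:7} seq={seq} ack={ack}")
```

With both initial sequence numbers set to 0, the second segment carries acknowledgment number 1 and the third carries sequence number 1 and acknowledgment number 1.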
Header length (4 bits)
Because the options field has variable length, the length of the whole header must be stated explicitly. The unit is the 32-bit word, so the value is 5 plus the number of 32-bit words of options. Since the field is 4 bits wide, the maximum header length is 15 * 4 = 60 bytes, which means the options can take at most 40 bytes (ten 32-bit words).
Flags
URG
Indicates that the sender's upper-layer entity has marked data in this segment as "urgent". When URG = 1, the urgent pointer field described below gives the position of the urgent data within the segment (a byte offset relative to the current sequence number), and the receiving TCP must notify its upper-layer entity.
ACK
When ACK = 0, the segment contains no acknowledgment information. When ACK = 1, the segment includes an acknowledgment of data that was received successfully.
PSH
When PSH = 1, the receiver delivers the data to the upper layer as soon as it arrives, instead of waiting until the whole buffer is full.
RST
Used to reset a connection that has become confused (for example after a host crash), or to reject an invalid segment or a connection request. In general, if you receive a segment with the RST bit set, something has gone wrong on your end.
SYN
Used to establish a connection. In a connection request, SYN = 1 and ACK = 0, which indicates that the segment carries no piggybacked acknowledgment. The connection response does carry an acknowledgment, that is, SYN = 1 and ACK = 1.
Note: piggybacking means that the acknowledgment of data sent from the client to the server is carried in a data segment traveling from the server to the client.
FIN
Releases a connection, indicating that the sender has no more data to transmit. The side that sent the FIN can still continue to receive data. Both SYN and FIN segments carry sequence numbers, which ensures that they are processed in the correct order.
Window Size
Used for flow control (making sure that neither side sends segments so fast that it overwhelms the other). The window size specifies how many bytes may be sent, counting from the byte that has been acknowledged.
Checksum
The checksum covers the header and the data. While it is being computed, the checksum field itself is set to 0, and if the data field has an odd number of bytes, one extra zero byte is appended. The algorithm adds up all 16-bit words using one's complement arithmetic and then takes the one's complement of the sum. As a result, when the receiver performs the same calculation over the segment including the checksum field, the result should be 0.
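The algorithm described above can be sketched as follows. This is a minimal illustration rather than a complete TCP implementation: the function names are made up for this example, and the input is assumed to be the full byte string to be summed (in real TCP that also includes a pseudo-header built from the IP addresses, which is not constructed here).

```python
def ones_complement_sum(data: bytes) -> int:
    """Add all 16-bit big-endian words of data in one's complement arithmetic."""
    if len(data) % 2:
        data += b"\x00"  # odd length: pad with one zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        total = (total & 0xFFFF) + (total >> 16)  # fold the carry back in
    return total


def internet_checksum(data: bytes) -> int:
    """Return the one's complement of the one's complement sum."""
    return ~ones_complement_sum(data) & 0xFFFF


if __name__ == "__main__":
    payload = bytes.fromhex("0001f203f4f5f6f7")  # arbitrary sample bytes
    checksum = internet_checksum(payload)
    # The receiver sums the data together with the checksum and expects 0.
    print(hex(checksum), internet_checksum(payload + checksum.to_bytes(2, "big")))
```

Folding the carry after every addition keeps the running total within 16 bits, which is what "adding in one's complement" means in practice.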
Urgent pointer
Refer to the URG bit of the flag field.
Options
The options exist to cope with complex network environments and to better serve the application layer. The options field is at most 40 bytes long. For details, see 2.2.
Data
TCP segments without any data are also valid and are generally used for acknowledgments and control messages.
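To tie the fields above together, here is a minimal sketch that decodes the fixed 20-byte part of a TCP header with Python's `struct` module. The function name `parse_tcp_header`, the dictionary keys, and the sample values in the demo (source port 50000, window 8192) are assumptions made for illustration; they do not come from the capture discussed later.

```python
import struct

FLAG_NAMES = ["FIN", "SYN", "RST", "PSH", "ACK", "URG"]  # bit 0 .. bit 5


def parse_tcp_header(raw: bytes) -> dict:
    """Decode the fixed 20-byte part of a TCP header (network byte order)."""
    if len(raw) < 20:
        raise ValueError("a TCP header is at least 20 bytes long")
    (src, dst, seq, ack, off_flags,
     window, checksum, urgent) = struct.unpack("!HHIIHHHH", raw[:20])
    header_len = (off_flags >> 12) * 4  # 4-bit field, unit is the 32-bit word
    flags = [name for bit, name in enumerate(FLAG_NAMES)
             if off_flags & (1 << bit)]
    return {
        "src_port": src, "dst_port": dst, "seq": seq, "ack": ack,
        "header_len": header_len, "flags": flags, "window": window,
        "checksum": checksum, "urgent_pointer": urgent,
        "options": raw[20:header_len],  # empty when header_len == 20
    }


if __name__ == "__main__":
    # Hypothetical SYN segment: header length 8 words (32 bytes), SYN bit set,
    # followed by 12 bytes of options.
    fixed = struct.pack("!HHIIHHHH", 50000, 80, 0, 0, 0x8002, 8192, 0, 0)
    options = bytes.fromhex("020405b40103030201010402")
    print(parse_tcp_header(fixed + options))
```

The header length and the flags share one 16-bit word, which is why the sketch unpacks them together and then separates them with a shift and bit masks.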
2.2 Options field [2]
TCP options appear mainly during the connection establishment stage, that is, the three-way handshake, rather than in an established session. The TCP options in practical use are as follows:
(1) Maximum segment size (MSS)
Used by the sender and the receiver to negotiate the maximum segment length (payload only, excluding the TCP header). During the three-way handshake, each side announces the MSS it is willing to accept (the MSS option appears only in SYN segments). If one side does not receive an MSS value from the other, it falls back to the default of 536 bytes of payload; in other words, every host must be able to accept TCP segments of 20 + 536 bytes.
(2) Window scaling
The window size field of the TCP header is 16 bits wide, so its maximum value is 65535. As latency and bandwidth grow (for example in satellite communication), a larger window is needed to meet performance and throughput requirements. This is the purpose of the window scale option. For examples, see reference [2].
The window scale option occupies three bytes; the last byte is the shift count, that is, the number of bits by which the 16-bit window field in the header is shifted to the left. With a shift count of 14, the maximum window grows to 65535 * (2 ^ 14).
The window scale option is negotiated at the start of TCP connection establishment. A side that supports the option but does not need a larger window can send a shift count of 0, which keeps the original maximum window of 65535.
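The effect of the shift count can be checked with a small calculation. The helper name `effective_window` is hypothetical; the upper limit of 14 for the shift count is the one defined by the TCP window scale specification.

```python
def effective_window(window_field: int, shift_count: int) -> int:
    """Scale the 16-bit window field by the negotiated shift count."""
    if not 0 <= window_field <= 0xFFFF:
        raise ValueError("the window field is a 16-bit value")
    if not 0 <= shift_count <= 14:
        raise ValueError("the shift count must be between 0 and 14")
    return window_field << shift_count  # same as window_field * 2 ** shift_count


if __name__ == "__main__":
    print(effective_window(65535, 0))   # no scaling
    print(effective_window(65535, 14))  # largest possible window
```

With a shift count of 14 the largest window is 65535 * 16384 = 1073725440 bytes, just under 1 GiB.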
(3) Selective acknowledgment option (SACK, selective acknowledgments)
Consider this scenario: host A sends segments 1, 2, 3, 4, and 5, and host B receives segments 1, 3, and 5 without errors. SACK ensures that only the missing segments are retransmitted, instead of all of them.
The SACK option needs two descriptor bytes: one identifies the option as SACK and the other gives the number of bytes in the option. (Whether SACK may be used at all is announced during the handshake with the separate SACK-permitted option.)
How is the lost segment 2 described? By giving the blocks on either side of it: segments 1 and 3 are its left and right neighbors. Blocks of TCP data are delimited by boundaries, and the boundaries are expressed as sequence numbers.
How many blocks of boundary information can be specified at most? The answer is four. The options field is at most 40 bytes, of which two are the descriptor bytes. A sequence number is 32 bits, that is, 4 bytes, and each block needs a left and a right boundary, so (40 - 2) / 8 = 4 after rounding down.
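The scenario and the block limit can be illustrated with a short sketch. Everything here is an assumption made for the example: the function names, the numbering of segments from 1, and the fixed `segment_size` of 100 bytes. A real receiver reports the most recently received block first; this sketch simply lists the blocks in sequence order.

```python
MAX_OPTION_BYTES = 40  # 60-byte maximum header minus the 20-byte fixed part
SACK_OVERHEAD = 2      # kind byte plus length byte
BLOCK_BYTES = 2 * 4    # left edge and right edge, 4 bytes each


def max_sack_blocks(other_option_bytes: int = 0) -> int:
    """Number of SACK blocks that fit beside other options in one header."""
    free = MAX_OPTION_BYTES - other_option_bytes - SACK_OVERHEAD
    return max(free // BLOCK_BYTES, 0)


def sack_blocks(received, segment_size):
    """Return (cumulative_ack, blocks) for a set of received segment numbers.

    Segment n (1-based) is assumed to carry the bytes
    [(n - 1) * segment_size, n * segment_size).
    """
    ack = 0
    while (ack // segment_size + 1) in received:
        ack += segment_size  # advance over segments received in order
    blocks = []
    for n in sorted(received):
        left, right = (n - 1) * segment_size, n * segment_size
        if right <= ack:
            continue  # already covered by the cumulative acknowledgment
        if blocks and blocks[-1][1] == left:
            blocks[-1] = (blocks[-1][0], right)  # merge adjacent segments
        else:
            blocks.append((left, right))
    return ack, blocks


if __name__ == "__main__":
    print(max_sack_blocks())
    print(sack_blocks({1, 3, 5}, 100))
```

For the scenario above the cumulative acknowledgment stops at the end of segment 1, and the two reported blocks cover segments 3 and 5, so the sender can tell that only segments 2 and 4 are missing.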
(4) Timestamps
The timestamp option is used to calculate the round-trip time (RTT). When sending a segment, the sender puts its current clock value in the timestamp field. The receiver copies this value into the acknowledgment. When the sender receives the acknowledgment, it compares the echoed timestamp (which equals the timestamp of the segment it originally sent) with its current clock to obtain the RTT.
The timestamp option can also be used to protect against wrapped sequence numbers (PAWS). The sequence number is only 32 bits wide, so it wraps around every 2 ^ 32 bytes (think of a circular queue). With the timestamp option it is easy to tell apart segments that carry the same sequence number.
(5) NOP (no-operation)
The TCP header length must be a multiple of 4 bytes, but most options are not. When the header falls short, it is padded with NOPs. NOP is also used to separate different options, for example between the window scale option and the SACK option (as shown in the capture example in section III).
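The options described in this section all share a kind/length/value layout (NOP being a single byte), so they can be walked with a simple loop. The following is a minimal sketch, not a complete parser: the function name `parse_tcp_options` and the labels in the result are invented for this example, and the sample bytes in the demo are made up to resemble a typical SYN segment. The option kind numbers (1 NOP, 2 MSS, 3 window scale, 4 SACK permitted, 5 SACK, 8 timestamps) are the standard ones.

```python
import struct


def parse_tcp_options(options: bytes) -> list:
    """Walk the kind/length/value layout of the TCP options field."""
    parsed, i = [], 0
    while i < len(options):
        kind = options[i]
        if kind == 0:  # end of option list
            break
        if kind == 1:  # NOP: a single padding byte without a length
            parsed.append(("NOP", None))
            i += 1
            continue
        if i + 1 >= len(options):
            raise ValueError("truncated option")
        length = options[i + 1]  # counts the kind and length bytes too
        if length < 2 or i + length > len(options):
            raise ValueError("bad option length")
        value = options[i + 2:i + length]
        if kind == 2 and length == 4:
            parsed.append(("MSS", struct.unpack("!H", value)[0]))
        elif kind == 3 and length == 3:
            parsed.append(("WindowScale", value[0]))
        elif kind == 4 and length == 2:
            parsed.append(("SACKPermitted", None))
        elif kind == 5 and (length - 2) % 8 == 0:
            edges = struct.unpack(f"!{(length - 2) // 4}I", value)
            parsed.append(("SACK", list(zip(edges[0::2], edges[1::2]))))
        elif kind == 8 and length == 10:
            parsed.append(("Timestamps", struct.unpack("!II", value)))
        else:
            parsed.append((f"kind {kind}", value))  # unknown option, kept raw
        i += length
    return parsed


if __name__ == "__main__":
    # Made-up sample: MSS, NOP, window scale, NOP, NOP, SACK permitted.
    print(parse_tcp_options(bytes.fromhex("020405b40103030201010402")))
```

In the sample, the single NOP pads the 3-byte window scale option to a 4-byte boundary and the two NOPs do the same for the 2-byte SACK-permitted option, giving 12 bytes of options in total.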
III. Example analysis
3.1 Overview
Take a visit to the Baidu homepage as an example. First, DNS resolves the host name in the URL to an IP address; then a TCP connection is established between the client and the server. Capturing the packets with Wireshark gives:
Figure 4 TCP connection establishment packets captured by Wireshark
This may look a bit strange. In theory there should be three packets, so why are there six? First, lay out the sending and receiving of the six packets (using their timestamps and meanings), as shown below:
Figure 5 TCP connection establishment instance
The figure shows that at the very beginning the client sends two segments in a row, possibly to establish the connection faster (if one request segment is lost, there is no need to wait for a timeout and retransmit). Next, the TCP connection establishment is analyzed using packets 19, 21, and 22 (marked with red lines in the figure).
3.2 First handshake: packet 19
The segment captured by Wireshark for the first handshake of the TCP connection is as follows:
Figure 6 TCP connection first handshake instance
Here we pick out a few fields for analysis:
Flags field: SYN = 1 and ACK = 0, which indicates that this segment carries no piggybacked acknowledgment.
Why is the maximum segment size (MSS) 1460? On Ethernet, the physical characteristics of the link layer limit the payload of a data frame to 1500 bytes (the MTU, maximum transmission unit), and 1460 = 1500 - 20 (IP header length) - 20 (TCP header length). Do not be confused by the 32-byte header length of this particular segment; that applies only to the connection setup segments, which carry options. For the relationship between MSS and MTU, see [2]:
Figure 7 Relationship between MSS and MTU
The NOP option can pad the header to a multiple of 4 bytes, or act as a separator between options. Three NOPs appear in this segment. For their specific roles, see:
Figure 8 NOP field of TCP Packets
3.3 Second handshake: packet 21
The server responds to the client's TCP segment, and the acknowledgment number is 1. SYN = 1 and ACK = 1 indicate that the connection response carries an acknowledgment. The packet captured by Wireshark is as follows:
Figure 9 TCP connection second handshake instance
Why is the MSS 1452 instead of 1460? The reason is PPPoE (Point-to-Point Protocol over Ethernet), which lets Ethernet hosts connect to a remote access concentrator through a simple bridging device [3], for example for dial-up Internet access. PPPoE adds 8 bytes of header, so the MTU over PPPoE is 1492 and the MSS is 1492 - 40 = 1452.
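Both MSS values seen in the capture follow from the same subtraction. The sketch below uses hypothetical names (`mss_for_mtu` and the constants) and assumes IPv4 and TCP headers without options, 20 bytes each.

```python
IP_HEADER = 20       # IPv4 header without options
TCP_HEADER = 20      # TCP header without options
ETHERNET_MTU = 1500
PPPOE_OVERHEAD = 8   # bytes that PPPoE takes out of the Ethernet payload


def mss_for_mtu(mtu: int) -> int:
    """Largest TCP payload that fits in one IP packet of the given MTU."""
    return mtu - IP_HEADER - TCP_HEADER


if __name__ == "__main__":
    print(mss_for_mtu(ETHERNET_MTU))                   # plain Ethernet
    print(mss_for_mtu(ETHERNET_MTU - PPPOE_OVERHEAD))  # Ethernet with PPPoE
```

The first call gives the 1460 announced by the client and the second gives the 1452 announced by the server.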
What is the MSS for data transmission after a TCP connection is established? 1460 or 1452 or 536? My understanding is that the default value is 536. Is that correct? Please advise!
3.4 Third handshake: packet 22
The client responds to the server once more. This time the sequence number and the acknowledgment number are both 1, and there is no options field. The packet captured by Wireshark is as follows:
Figure 10 TCP connection third handshake instance
It is worth noting that the window scaling negotiation failed, so the window is not scaled and the maximum window size remains 65535.
At this point, the TCP connection is established :-)
Original article from: yusi Zhiyuan's column
[Repost] Use Wireshark to analyze the format of TCP headers in the TCP/IP protocol