Some Understanding and Practice of the TCP Protocol

Source: Internet
Author: User


I. Introduction
This introduction follows the one in "TCP/IP Illustrated, Volume 1". TCP and UDP use the same network layer (IP), but TCP provides the application layer with a completely different service from UDP: a connection-oriented, reliable byte stream service. Connection-oriented means that the two sides must establish a connection before they exchange data. Because the connection is point-to-point, TCP does not support multicast or broadcast. Reliable means that TCP has a set of mechanisms to ensure that data is delivered correctly.
Byte stream means that the receiving TCP does not know how much data the sender wrote to the connection in each call. TCP inserts no record markers; it only limits how many bytes may be outstanding on the connection.

II. Protocol format
Figure: TCP segment structure
Description of the TCP header fields:
16-bit source port number: the port of the end that opens the connection (or sends the data).
16-bit destination port number: the port of the other end of the connection (the end that receives the data).
32-bit sequence number: the number of the first data byte carried in this segment. On a newly established connection the first segment carries the initial sequence number chosen by the sender (packet analyzers commonly display it as a relative 0); after that, it equals the acknowledgment number carried in the last segment received from the peer. The sequence number and the acknowledgment number in one segment describe opposite directions of the stream, so they are generally different.
32-bit acknowledgment number: equal to the sequence number (seq) of the received segment plus the length (len) of its data. It tells the peer the number of the first byte expected in the next segment.
4-bit header length: the length of the TCP header, in 32-bit words.
URG: the urgent pointer is valid. Urgent mode lets one end tell the other end that "urgent data" of some kind has been placed in the normal data stream. The other end is notified that the urgent data is in the stream, and it is up to the receiver to decide what to do with it. When the URG bit is set to 1, the 16-bit urgent pointer holds a positive offset that must be added to the sequence number field in the TCP header to obtain the sequence number of the last byte of the urgent data.
Note: TCP's urgent mode is often called out-of-band data, but it is not true out-of-band data, because the urgent bytes travel in the normal data stream.
What is urgent mode used for? Two common users are Telnet and Rlogin, when the interactive user types the interrupt key; another is FTP, when the interactive user aborts a file transfer.
What happens if the sender enters urgent mode several times before the receiver has processed the first urgent pointer? The urgent pointer simply moves forward in the data stream, and its previous position is lost at the receiver. The receiver has only one urgent pointer, which is overwritten whenever a new value arrives from the other end. This means that if the bytes the sender writes when it enters urgent mode are important to the receiver, the sender must mark them in some other way. Telnet, for example, marks all its commands by prefixing them in the data stream with a byte whose value is 255.
ACK: the acknowledgment number is valid.
PSH: the receiver should pass this segment to the application layer as soon as possible. The sender uses the PUSH flag to tell the receiver to hand all the data it has received to the receiving process. This includes the data carried with the PUSH flag and any other data the receiving TCP has already collected for the receiving process (data still in the TCP buffer).
RST: reset the connection.
SYN: synchronize sequence numbers; used to initiate a connection.
FIN: the sender has finished sending data and is closing its side of the connection.
16-bit window size: TCP's flow control is provided by each end of the connection advertising a window size. The window size is a number of bytes, counted from the byte specified by the acknowledgment number field, which is the next byte the receiver expects. Because the field is 16 bits wide, the window is limited to 65535 bytes.
16-bit checksum: covers the entire TCP segment, that is, the TCP header and the TCP data.
16-bit urgent pointer: valid only when the URG flag is set to 1. The urgent pointer is a positive offset; adding it to the value of the sequence number field gives the sequence number of the last byte of the urgent data. Urgent mode is the way a sender transmits urgent data to the other end.
Options: optional fields. Each option starts with a 1-byte kind field that gives the option type. Options with a kind of 0 or 1 occupy a single byte. Every other option has a len byte after the kind byte; len is the total length of the option, including the kind and len bytes.
Figure: common TCP options

III. TCP states
1. State transitions
A. Active open
Figure: active open
B. Passive open
Figure: passive open
C. Active close
Figure: active close
D. Passive close
Figure: passive close
2. Establishing a connection
Figure: TCP connection establishment
3. Closing a connection
Figure: TCP connection termination
4. Half-close
A TCP connection is a full-duplex channel, so it supports sending and receiving at the same time. You can think of the channel as two halves, like an expressway that is divided down the middle with traffic flowing in opposite directions on each side. A half-close means that after one end has finished sending its data, it closes its own sending direction only. The peer can still send data, and the local end can still read it.
Take the expressway example again. Suppose the connection runs between Beijing and Tianjin. If one day no more vehicles travel from Beijing to Tianjin, that direction can be closed: the Beijing entrance is shut, no vehicles will arrive at the Tianjin exit, and nobody needs to be on duty there. But many vehicles still travel from Tianjin to Beijing, so the Tianjin entrance stays open and the Beijing exit must keep raising its toll barrier.
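A half-close can be tried out with the sockets API. The following is a minimal sketch, assuming a loopback TCP connection and Python's standard `socket` module; the helper names `serve_once` and `half_close_demo` are made up for illustration. The client closes only its sending direction with `shutdown(SHUT_WR)`, then keeps reading the server's reply.

```python
import socket
import threading


def serve_once(listener):
    """Accept one connection, read until the peer half-closes, then reply."""
    conn, _ = listener.accept()
    with conn:
        received = bytearray()
        while True:
            chunk = conn.recv(1024)
            if not chunk:
                # Empty read: the client's FIN arrived, so it will send no more.
                break
            received.extend(chunk)
        # The server-to-client direction is still open, so a reply can be sent.
        conn.sendall(b"got %d bytes" % len(received))


def half_close_demo():
    """Send data, close only the sending direction, then read the reply."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    server = threading.Thread(target=serve_once, args=(listener,))
    server.start()

    client = socket.create_connection(listener.getsockname())
    client.sendall(b"hello")
    client.shutdown(socket.SHUT_WR)  # half-close: sends FIN, reading stays open

    reply = bytearray()
    while True:
        chunk = client.recv(1024)
        if not chunk:
            break
        reply.extend(chunk)

    client.close()
    server.join()
    listener.close()
    return bytes(reply)
```

In terms of the expressway picture, `shutdown(SHUT_WR)` closes the Beijing entrance while the Tianjin-to-Beijing lanes stay open.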
5. Reset segments
In general, TCP sends a reset segment whenever a segment arrives that does not appear to be correct for the referenced connection. (The "referenced connection" is the connection identified by the destination IP address and destination port number together with the source IP address and source port number.) A common case is a connection request that arrives when no process is listening on the destination port. With UDP, a datagram that arrives for a port that is not in use generates an ICMP port unreachable message. TCP uses a reset instead.
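The "no listener" case is easy to observe from an application. The sketch below assumes a loopback interface and uses a hypothetical helper, `probe_closed_port`: it reserves a port, closes it so that nothing is listening, and then tries to connect. On common systems the RST that answers the SYN is reported by Python as `ConnectionRefusedError`.

```python
import socket


def probe_closed_port():
    """Connect to a local port with no listener and report the outcome."""
    # Bind to port 0 to learn a free port number, then release it again.
    placeholder = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    placeholder.bind(("127.0.0.1", 0))
    port = placeholder.getsockname()[1]
    placeholder.close()

    try:
        with socket.create_connection(("127.0.0.1", port), timeout=2):
            # Only reached if some other process grabbed the port meanwhile.
            return "connected"
    except ConnectionRefusedError:
        # The peer TCP answered the SYN with a reset segment.
        return "refused"
```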
Releasing a connection by sending a reset segment instead of a FIN is called an abortive release. Aborting a connection gives the application two things: (1) any data still queued for sending is discarded and the reset segment is sent immediately; (2) the receiver of the RST can tell that the other end aborted the connection rather than closing it normally. The API used by the application must provide a way to generate an abort instead of a normal close. The sockets API provides this capability through the "linger on close" option (SO_LINGER). The book's example adds the -L option with a linger time of 0, which causes a reset to be sent when the connection is closed rather than a normal FIN.

IV. TCP data transmission
1. Normal transmission
Figure: normal data transmission
2. Fast sender, slow receiver
Figure: data transmission with a fast sender and a slow receiver
When the slow receiver cannot read data from the TCP buffer into the application layer quickly enough, it returns an ACK that advertises a window of 0 to the sender.

V. Timeout and retransmission mechanisms
1. Round-trip time (RTT) and retransmission timeout (RTO)
RTT estimator:
$R \leftarrow \alpha R + (1 - \alpha) M$
Here $\alpha$ is a smoothing factor with a recommended value of 0.9, and M is the new measurement. The smoothed RTT is updated each time a new measurement is made: 90% of each new estimate comes from the previous estimate and 10% from the new measurement.
RTO formula:
$Err = M - A$
$A \leftarrow A + g \cdot Err$
$D \leftarrow D + h (|Err| - D)$
$RTO = A + 4D$
Here A is the smoothed RTT (an estimator of the mean) and D is the smoothed mean deviation. Err is the difference between the new measurement and the current RTT estimator. Both A and D are used to calculate the next retransmission timeout (RTO). The gain g is applied to the average and has the value 1/8 (0.125). The gain for the deviation is h, with the value 0.25. The larger gain for the deviation makes the RTO rise faster when the RTT changes.
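The RTO update can be sketched in a few lines. This is an illustrative sketch, not a description of any particular TCP stack: the class name `RtoEstimator` is hypothetical, and the initial values (A set to the first measurement, D set to half of it) are an assumption, since the text does not say how A and D are initialised.

```python
class RtoEstimator:
    """Smoothed RTT (A) and mean deviation (D), giving RTO = A + 4D."""

    G = 0.125  # gain for the average, 1/8
    H = 0.25   # gain for the deviation

    def __init__(self, first_measurement):
        # Assumption: seed A with the first sample and D with half of it.
        self.a = first_measurement
        self.d = first_measurement / 2

    @property
    def rto(self):
        return self.a + 4 * self.d

    def update(self, m):
        """Fold a new RTT measurement m into A and D and return the new RTO."""
        err = m - self.a                        # Err = M - A
        self.a += self.G * err                  # A <- A + g * Err
        self.d += self.H * (abs(err) - self.d)  # D <- D + h * (|Err| - D)
        return self.rto
```

For example, after a first sample of 1.0 s the RTO is 3.0 s; a second sample of 2.0 s moves A to 1.125, D to 0.625, and the RTO to 3.625 s.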
A TCP connection has only one RTT timer. If the timer is already running when a segment is sent, that segment is not timed.
Figure: RTT timing. The 4th segment is sent without starting the timer, because the timer is already running for the 3rd segment.
2. Congestion avoidance algorithm
There are two indications of packet loss: a timeout and the receipt of duplicate ACKs. Congestion avoidance and slow start are two different, independent algorithms. However, when congestion occurs we want to reduce the rate at which packets are sent into the network, and slow start is invoked to do this. The two algorithms maintain two variables for each connection: a congestion window, cwnd, and a slow start threshold, ssthresh. The combined algorithm works as follows:
1) For a given connection, cwnd is initialised to 1 segment and ssthresh to 65535 bytes.
2) The TCP output routine never sends more than the minimum of cwnd and the receiver's advertised window. Congestion avoidance is flow control imposed by the sender, while the advertised window is flow control imposed by the receiver. The former is based on the sender's estimate of network congestion; the latter is related to the buffer space available at the receiver for this connection.
3) When congestion occurs (a timeout, or the receipt of duplicate ACKs), ssthresh is set to half of the current window size (the minimum of cwnd and the receiver's advertised window, but at least two segments). In addition, if the congestion was indicated by a timeout, cwnd is set to 1 segment (that is, slow start).
4) When new data is acknowledged by the other end, cwnd is increased, but how it is increased depends on whether we are in slow start or congestion avoidance. If cwnd is less than or equal to ssthresh, we are in slow start; otherwise we are in congestion avoidance. Slow start continues until we are halfway to where we were when congestion occurred (because we recorded half of the window size that got us into trouble in step 3), and then congestion avoidance takes over.
Slow start sets cwnd to one segment at the beginning and adds one segment each time an ACK is received. Congestion avoidance requires that cwnd be increased by 1/cwnd each time an ACK is received. This is an additive increase, compared with the exponential increase of slow start. We want cwnd to grow by at most one segment per round-trip time (no matter how many ACKs are received in that RTT), whereas slow start increases cwnd by the number of ACKs received in a round-trip time.
Figure: visualisation of slow start and congestion avoidance
3. Fast retransmit and fast recovery algorithms
When TCP receives an out-of-order segment, it must generate an ACK (a duplicate ACK) immediately. This ACK should not be delayed. Its purpose is to let the other end know that a segment was received out of order, and to tell it which sequence number is expected next.
Two things can cause duplicate ACKs: a lost segment, or segments arriving out of order. If the segments are merely out of order, only one or two duplicate ACKs are sent before the out-of-order segments are all received and put back in order in the TCP buffer, and a new ACK (carrying the sequence number of the next expected segment) is then returned.
If three or more duplicate ACKs are received in a row, the segment they ask for is assumed to be lost, and it is retransmitted immediately without waiting for the retransmission timer to expire. This is the fast retransmit algorithm.
Figure: TCP timeout retransmission
The figure shows that after receiving three consecutive duplicate ACKs, the sender does not perform slow start but runs the congestion avoidance algorithm instead. This is the fast recovery algorithm. Slow start is not performed because consecutive duplicate ACKs tell the sender more than that a segment has been lost: the receiver can only keep returning duplicate ACKs if it has received the later segments, so the network is still delivering data and there is no need to hurt throughput by falling back to slow start. This is why the three segments 67, 69, and 71 were sent before the ACK of new data was received.
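The three-duplicate-ACK rule can be expressed as a small counter. The sketch below is illustrative only: the function name `fast_retransmit_points` and the idea of feeding it a plain list of acknowledgment numbers are assumptions made for the example, not part of any TCP implementation.

```python
def fast_retransmit_points(acks, threshold=3):
    """Return the ACK numbers at which fast retransmit would be triggered.

    acks is the sequence of acknowledgment numbers seen by the sender.
    An ACK that repeats the previous number counts as a duplicate; the
    retransmission is triggered when the duplicate count reaches threshold.
    """
    triggers = []
    last_ack = None
    dup_count = 0
    for ack in acks:
        if ack == last_ack:
            dup_count += 1
            if dup_count == threshold:
                # Third duplicate: assume the segment starting at ack is lost.
                triggers.append(ack)
        else:
            # New data was acknowledged, so start counting afresh.
            last_ack = ack
            dup_count = 0
    return triggers
```

With the ACK stream 100, 200, 200, 200, 200, 200, 300 the function reports a single trigger at 200. A stream such as 100, 200, 200, 300, where reordering produced only one duplicate, triggers nothing.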
This algorithm is usually implemented as follows:
1) When the 3rd duplicate ACK is received, set ssthresh to half of the current congestion window, cwnd. Retransmit the lost segment. Set cwnd to ssthresh plus 3 times the segment size.
2) Each time another duplicate ACK arrives, increase cwnd by one segment size and transmit a packet (if the new cwnd allows it).
3) When the next ACK arrives that acknowledges new data, set cwnd to ssthresh (the value set in step 1). This ACK should be the acknowledgment of the retransmission from step 1, arriving one round-trip time after the retransmission. It should also acknowledge all the intermediate segments sent between the lost packet and the receipt of the first duplicate ACK. This step is congestion avoidance, because we slow down to half the rate we were sending at when the packet was lost.
4. Congestion examples
1) Timeout and retransmission of the initial SYN
Figure: SYN timeout retransmission, a congestion avoidance example
When the SYN times out, ssthresh is set to its minimum value (512 bytes, which is 2 segments). To enter slow start, cwnd is set to one segment (256 bytes). When the SYN and ACK are received, nothing happens to these two variables, because no new data has been acknowledged. When ACK 257 arrives we are still in slow start, because cwnd is less than or equal to ssthresh, so cwnd is increased by 256 bytes. The same happens when ACK 513 is received. When ACK 769 arrives we are no longer in slow start but in congestion avoidance.
The new cwnd value is calculated as follows:
$cwnd \leftarrow cwnd + \frac{segsize \times segsize}{cwnd} + \frac{segsize}{8}$
This is the 1/cwnd increase described earlier, adjusted for the fact that cwnd is maintained in bytes rather than segments. With the current cwnd of 768 bytes and a segment size of 256 bytes, integer arithmetic gives
$cwnd \leftarrow 768 + \frac{256 \times 256}{768} + \frac{256}{8} = 885$
bytes. When the next ACK, 1025, arrives, cwnd becomes 991 bytes:
$cwnd \leftarrow 885 + \frac{256 \times 256}{885} + \frac{256}{8} = 991$
2) Retransmission of a lost segment
Figure: lost segment retransmission, a congestion avoidance example
When the 3rd duplicate ACK arrives, ssthresh is set to half of cwnd (rounded to a multiple of the segment size), cwnd is set to ssthresh plus the number of duplicate ACKs received times the segment size (that is, 1024 plus 3 times 256), and the retransmission is sent. Five more duplicate ACKs arrive (segments 64 to 66, 68, and 70), and each one increases cwnd by one segment size. When the ACK of new data finally arrives (segment 72), cwnd is set to ssthresh (1024) and normal congestion avoidance takes over. Because cwnd is less than or equal to ssthresh (they are now equal), one segment size is added to cwnd, giving 1280. When the next ACK arrives (not shown in the figure), cwnd is greater than ssthresh, so the new value is 1363:
$cwnd \leftarrow 1280 + \frac{256 \times 256}{1280} + \frac{256}{8} = 1363$
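The numbers quoted in both examples can be checked with a short calculation. The sketch below assumes the byte-based increment cwnd + segsize*segsize/cwnd + segsize/8 with integer arithmetic, which is the update that reproduces the values 885, 991, and 1363 given in the text; the function names `on_new_ack` and `cwnd_trace` are hypothetical.

```python
def on_new_ack(cwnd, ssthresh, segsize):
    """Return the new cwnd (in bytes) after an ACK of new data."""
    if cwnd <= ssthresh:
        # Slow start: grow by one full segment per ACK.
        return cwnd + segsize
    # Congestion avoidance: roughly one segment per round trip.
    # Assumption: the extra segsize // 8 term is included, because it
    # is needed to reproduce the figures quoted in the examples.
    return cwnd + (segsize * segsize) // cwnd + segsize // 8


def cwnd_trace(cwnd, ssthresh, segsize, ack_count):
    """Apply on_new_ack ack_count times and collect each resulting cwnd."""
    values = []
    for _ in range(ack_count):
        cwnd = on_new_ack(cwnd, ssthresh, segsize)
        values.append(cwnd)
    return values
```

Starting from cwnd = 256 and ssthresh = 512 (Example 1), four ACKs give 512, 768, 885, 991. Starting from cwnd = ssthresh = 1024 (the end of Example 2), two ACKs give 1280 and 1363.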
During the fast retransmit and fast recovery phase, new data is sent after the duplicate ACKs in segments 66, 68, and 70 are received, but not after the duplicate ACKs in segments 64 and 65. The reason lies in the value of cwnd compared with the amount of unacknowledged data. When segment 64 arrives, cwnd is 2048, but there are 2304 bytes of unacknowledged data (9 segments: 46, 48, 50, 52, 54, 55, 57, 59 and 63), so nothing can be sent. When segment 65 arrives, cwnd becomes 2304, and we still cannot send. But when segment 66 arrives, cwnd is 2560, so one new data segment can be sent. Similarly, when segment 68 arrives, cwnd is 2816, which is greater than the 2560 bytes of unacknowledged data, so another new data segment can be sent. The same happens when segment 70 arrives.
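The send decisions in this phase can be replayed step by step. The following sketch starts from cwnd = 1792 bytes (1024 plus 3 times 256, the value set at the 3rd duplicate ACK in the example) and 2304 bytes of unacknowledged data. The function name `fast_recovery_trace` and the rule "send when there is room for one full segment" are simplifying assumptions for illustration.

```python
SEGSIZE = 256  # segment size used in the example, in bytes


def fast_recovery_trace(cwnd, unacked, dup_ack_segments, segsize=SEGSIZE):
    """Replay duplicate ACKs during fast recovery.

    Returns a list of (segment, cwnd_after, sent_new_data) tuples.
    """
    events = []
    for segment in dup_ack_segments:
        cwnd += segsize  # each further duplicate ACK inflates cwnd
        # New data may go out only if cwnd leaves room for a full segment.
        can_send = cwnd - unacked >= segsize
        if can_send:
            unacked += segsize
        events.append((segment, cwnd, can_send))
    return events


trace = fast_recovery_trace(1792, 2304, [64, 65, 66, 68, 70])
for segment, window, sent in trace:
    print(segment, window, "send" if sent else "wait")
```

The trace shows cwnd reaching 2048 and 2304 with nothing sent for segments 64 and 65, then one new segment sent at each of 2560, 2816, and 3072.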
5. Repacketization
When TCP times out and retransmits, it does not have to retransmit the identical segment. Instead, TCP is allowed to repacketize and send one larger segment, which can help performance (naturally, this larger segment cannot exceed the MSS announced by the receiver). The protocol allows this because TCP identifies the data being sent and acknowledged by byte sequence number, not by segment number.
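Because data is identified by byte number, several small unacknowledged segments can be merged into one retransmission. The sketch below illustrates the idea only; the function name `repacketize` and the representation of the retransmission queue as a list of (sequence number, bytes) pairs are assumptions made for the example.

```python
def repacketize(unacked, mss):
    """Merge contiguous unacknowledged segments into one retransmission.

    unacked is a list of (seq, data) pairs ordered by sequence number.
    Returns (seq, payload) with at most mss bytes, or None if the queue
    is empty. Merging stops at the first gap in the byte numbering.
    """
    if not unacked:
        return None
    first_seq = unacked[0][0]
    payload = bytearray()
    next_seq = first_seq
    for seq, data in unacked:
        if seq != next_seq:
            break  # not contiguous with what has been collected so far
        room = mss - len(payload)
        if room <= 0:
            break  # the segment is already as large as the MSS allows
        payload += data[:room]
        next_seq = seq + len(data)
    return first_seq, bytes(payload)
```

For instance, three queued segments starting at bytes 1, 3, and 5 with payloads of 2, 2, and 4 bytes and an MSS of 6 would be retransmitted as a single 6-byte segment starting at byte 1.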
Source http://www.cnblogs.com/geekma/archive/2012/10/23/2735944.html
