Source: EMC Chinese support forum
TCP uses a sliding window mechanism to detect when packet loss is likely and to adjust the data transmission rate to prevent it. The mechanism relies on the receive window of the data receiver to control the flow of data.
The receive window value is set by the data receiver and carried in the TCP header as a number of bytes. It tells the sending device how much data the receiver is willing to hold in its TCP buffer, the place where data sits temporarily until it is passed up to the application layer protocol for processing. As a result, the sender can only send as much data as the window size field allows at a time. For the sender to continue transmitting, the receiver must acknowledge that the previous data was received, and it must also process the data occupying the buffer to free up space. The following example shows how the receive window works:
The client sends data to a server whose receive window is 5000 bytes. The client sends 2500 bytes, leaving 2500 bytes of buffer space on the server, and then sends another 2000 bytes, leaving only 500 bytes. The server sends an acknowledgment, processes the buffered data, and empties the buffer. The process then repeats: the client sends 3000 bytes and then another 1000 bytes, reducing the server's free buffer space to 1000 bytes. The server then acknowledges the data and processes the contents of the buffer.
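The byte counts in this walk-through can be replayed with a toy model. This is only a sketch: the `ReceiveBuffer` class and its method names are invented for illustration and do not correspond to any real TCP stack API.

```python
class ReceiveBuffer:
    """Toy model of a receiver's TCP buffer (hypothetical, not a real stack)."""

    def __init__(self, capacity):
        self.capacity = capacity  # receive window advertised when the buffer is empty
        self.used = 0             # bytes waiting for the application

    @property
    def window(self):
        # The free space is what the receiver can still accept.
        return self.capacity - self.used

    def receive(self, nbytes):
        if nbytes > self.window:
            raise ValueError("sender exceeded the advertised window")
        self.used += nbytes
        return self.window

    def process(self):
        # The application reads everything, which releases the buffer space.
        self.used = 0
        return self.window


server = ReceiveBuffer(5000)
trace = [
    server.receive(2500),  # 2500 bytes left
    server.receive(2000),  # 500 bytes left
    server.process(),      # ACK sent, buffer emptied
    server.receive(3000),
    server.receive(1000),
    server.process(),
]
print(trace)  # [2500, 500, 5000, 2000, 1000, 5000]
```

Each entry in `trace` is the free buffer space after the corresponding step, which matches the figures quoted in the text.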
More information
Adjusting the window size:
When the TCP stack receives data, it generates an acknowledgment and sends it in reply. However, the data sitting in the receiver's buffer is not always processed immediately. A server that is busy handling packets from multiple clients may be slow to clear its buffer and unable to free up space for new data. Without flow control, this could lead to packet loss and data corruption. Fortunately, when the server cannot keep up with the rate implied by its receive window, it can adjust the window size by lowering the window value in the TCP header of the ACK it returns to the sender. For example:
The server starts with a window size of 5000 bytes. The client sends 2000 bytes, followed by another 2000 bytes, leaving only 1000 bytes of buffer space available. The server realizes that its buffer is filling up quickly and knows that packets will soon be lost if data keeps arriving at this rate. To prevent that, it sends the client an acknowledgment with an updated window size of 1000 bytes. The client then sends less data, and the server can process its buffer contents at an acceptable rate, so the data flows at a steady rate.
The window size can be adjusted in both directions. When the server can process packets more quickly again, it sends an ACK packet with a larger window size.
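A minimal sketch of this adjustment, assuming the advertised window is simply the free buffer space and that the sender never sends more than the window allows. The function names and the 6000 bytes of pending data are made up for illustration.

```python
def advertise_window(capacity, buffered):
    """Window the receiver puts in its next ACK: the free buffer space."""
    return max(capacity - buffered, 0)


def next_burst(pending, advertised_window):
    """Bytes the sender may transmit before it has to wait for another ACK."""
    return min(pending, advertised_window)


capacity = 5000

# Two 2000-byte segments have arrived and none has been processed yet.
small_window = advertise_window(capacity, buffered=2000 + 2000)
small_burst = next_burst(pending=6000, advertised_window=small_window)

# Later the server has drained its buffer and can advertise a larger window.
large_window = advertise_window(capacity, buffered=0)
large_burst = next_burst(pending=5000, advertised_window=large_window)

print(small_window, small_burst)  # 1000 1000
print(large_window, large_burst)  # 5000 5000
```

Real stacks apply further rules on top of this (window scaling, avoiding tiny window updates), which the sketch leaves out.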
Halting data flow with a zero window:
In some cases, the server can no longer process the data sent by the client, whether because of insufficient memory, insufficient processing capacity, or some other problem. This could cause packets to be dropped and the communication to stall, but the receive window can help limit the negative impact.
When this happens, the server sends a packet that advertises a window size of 0. When the client receives this packet, it halts all data transmission but keeps the connection to the server open by sending probe (keep-alive) packets. The client sends these probes at intervals to check the state of the server's receive window. Once the server can process data again, it responds with a nonzero window size and the transfer resumes. The following example shows the zero-window notification process:
The server's initial receive window is 5000 bytes. After receiving 4000 bytes of data from the client, the server comes under very heavy load and can no longer process any data. The server sends a packet with the window size set to 0. The client halts data transmission and sends a probe (keep-alive) packet. In response to the probe, the server tells the client that it can receive data again and that its window size is now 1000 bytes. The client resumes data transmission.
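The client's side of this exchange can be sketched as a small loop over the window values the server advertises. The function name, the log strings, and the 3000 bytes of pending data are assumptions made for illustration.

```python
def client_actions(advertised_windows, pending):
    """Return what a simplified client does for each window it is told about."""
    log = []
    for window in advertised_windows:
        if pending == 0:
            break  # nothing left to send
        if window == 0:
            # Zero window: stop sending data, but probe to keep the connection alive.
            log.append("probe")
        else:
            chunk = min(window, pending)
            pending -= chunk
            log.append(f"send {chunk}")
    return log


# The server advertises 0 under load, then 1000 once it has recovered.
actions = client_actions([0, 1000], pending=3000)
print(actions)  # ['probe', 'send 1000']
```

A longer stall shows up as a run of probes before the first send, for example `client_actions([0, 0, 0, 1000], pending=3000)`.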
TCP sliding window in practice:
This example begins with packets sent from 192.168.0.20 to 192.168.0.30. The field to watch is the window size, which appears both in the Info column of the Packet List pane and in the TCP header in the Packet Details pane. Over the first three packets, the value drops right away:
The window size goes from 8760 bytes in the first packet to 5840 bytes in the second and 2920 bytes in the third. A shrinking window size like this is a classic sign of increased latency on the host. The Time column shows that this happens very quickly. When the window size falls this fast, it commonly drops all the way to zero, which is exactly what happens in the fourth packet:
The fourth packet is also sent from 192.168.0.20 to 192.168.0.30, this time to tell 192.168.0.30 that it can no longer accept any data. The value 0 appears in the TCP header, and Wireshark also flags the packet as a zero-window packet in the Info column of the Packet List pane and in the SEQ/ACK analysis section of the TCP header.
Once the zero-window packet has been sent, the device at 192.168.0.30 sends no more data until it receives a window update from 192.168.0.20 announcing that the window size has increased. In this example the zero-window condition is only temporary, so the window update arrives in the very next packet.
In this example, the window size increases to a very healthy 64,240 bytes. Wireshark's SEQ/ACK analysis again tells us that this is a window update.
Once the update packet is received, the host at 192.168.0.30 starts sending data again, as seen in packets 6 and 7. The whole pause is over very quickly. Had it lasted even a little longer, it could have caused a hiccup on the network, slowing the data transfer down or making it fail.
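The window values advertised by 192.168.0.20 in the first five packets can be labeled with a rough rule of thumb. This is a deliberately simplified sketch and not Wireshark's actual analysis logic, which also takes sequence and acknowledgment numbers into account. The function name and label strings are invented for illustration.

```python
def label_windows(windows):
    """Label each advertised window relative to the one before it."""
    labels = []
    previous = None
    for window in windows:
        if window == 0:
            labels.append("zero window")
        elif previous is not None and window > previous:
            labels.append("window update")
        elif previous is not None and window < previous:
            labels.append("shrinking")
        else:
            labels.append("steady")
        previous = window
    return labels


# Window sizes from packets 1 to 5 of the capture described above.
labels = label_windows([8760, 5840, 2920, 0, 64240])
print(labels)
# ['steady', 'shrinking', 'shrinking', 'zero window', 'window update']
```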
In the next sliding window example, the first packet is normal HTTP traffic sent from 195.81.202.68 to 172.31.136.85. It is immediately followed by a zero-window packet sent back from 172.31.136.85:
This looks very similar to the zero-window packet in the previous example, but the outcome is quite different. Instead of 172.31.136.85 sending a window update and communication resuming, the next packet is a probe (keep-alive) packet:
Wireshark marks this packet as a probe (keep-alive) packet. The Time column tells us that it was sent 3.4 seconds after the last received packet. The pattern then repeats several times, with one end sending a zero-window packet and the other sending a probe packet:
The gaps between the probe packets are 3.4, 6.8, and 13.5 seconds. This can go on for quite a long time, depending on the operating systems of the communicating devices. In this case, adding up the values in the Time column shows that communication is stalled for about 25 seconds.
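The three gaps quoted here roughly double each time, which is what a backoff-style probe timer would produce. A quick check, using only the values from the text:

```python
gaps = [3.4, 6.8, 13.5]  # seconds between probes, as quoted in the text

# Ratio of each gap to the one before it.
ratios = [later / earlier for earlier, later in zip(gaps, gaps[1:])]

# Time accounted for by these three gaps alone.
total = sum(gaps)

print([round(r, 2) for r in ratios])  # [2.0, 1.99]
print(round(total, 1))                # 23.7
```

The listed gaps alone add up to about 23.7 seconds, a little under the total stall read off the Time column.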
TCP Error Control and Flow Control troubleshooting summary:
Retransmission packets
Retransmissions occur because the client has detected that the server is not receiving the data it sends. Consequently, depending on which end of the communication you are analyzing, you may never see the retransmissions. If you capture on the server and it really is not receiving the packets the client sends and retransmits, you may be left in the dark, because the retransmitted packets never show up in your capture. If you suspect packet loss on the server side, try capturing on the client as well to check whether retransmissions are occurring.
Duplicate ACKs
A duplicate ACK can be thought of as the rough opposite of a retransmission, because it is sent when the server detects that a packet from the client was lost in transit. In most cases, duplicate ACKs are visible when capturing traffic at either end of the communication. Remember that duplicate ACKs are triggered when packets arrive out of order. For example, if the server receives only the first and third of three packets, it sends a duplicate ACK to prompt the client to quickly retransmit the second packet. Because the first and third packets did arrive, whatever caused the second packet to be dropped was probably temporary, so in most cases the duplicate ACK is sent and received successfully. This is not always the case, however, so if you suspect packet loss on the server side and do not see any duplicate ACKs, capture packets on the client side of the communication.
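The first-and-third-packet case can be sketched with cumulative acknowledgments. For simplicity the sketch numbers whole segments 1, 2, 3 instead of using byte sequence numbers, and the function name is made up for illustration.

```python
def acks_for(arrivals):
    """Return the ACK (next expected segment number) sent after each arrival."""
    received = set()
    expected = 1
    acks = []
    for segment in arrivals:
        received.add(segment)
        # Advance past every segment that has arrived in order.
        while expected in received:
            expected += 1
        acks.append(expected)
    return acks


# Segment 2 is lost at first: 1 and 3 arrive, then the retransmitted 2.
acks = acks_for([1, 3, 2])
print(acks)  # [2, 2, 4]
```

The repeated `2` is the duplicate ACK: the receiver is still asking for segment 2 after segment 3 arrives. Once the retransmitted segment 2 arrives, the ACK jumps to 4.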
Zero-window and probe packets
The sliding window relates directly to the server's inability to receive and process data. Any decrease in the window size, and any zero-window state, is a direct result of some problem on the server, so if you see either on the wire, investigate there in depth. Window update packets are generally visible from both ends of the network communication.
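When working through this checklist in Wireshark, it can help to filter on the analysis flags for each symptom. The sketch below builds a combined display filter string. The field names are given as I recall them from Wireshark's display filter reference and should be checked against your version. The dictionary and helper function are hypothetical.

```python
# Wireshark TCP analysis display-filter fields for the symptoms in this summary
# (field names assumed from the display filter reference; verify locally).
ANALYSIS_FILTERS = {
    "retransmission": "tcp.analysis.retransmission",
    "duplicate ACK": "tcp.analysis.duplicate_ack",
    "zero window": "tcp.analysis.zero_window",
    "zero-window probe": "tcp.analysis.zero_window_probe",
    "keep-alive": "tcp.analysis.keep_alive",
    "window update": "tcp.analysis.window_update",
}


def combined_filter(symptoms):
    """Join the filters for the given symptoms with a logical OR."""
    return " || ".join(ANALYSIS_FILTERS[name] for name in symptoms)


flow_control_filter = combined_filter(
    ["zero window", "zero-window probe", "window update"]
)
print(flow_control_filter)
# tcp.analysis.zero_window || tcp.analysis.zero_window_probe || tcp.analysis.window_update
```

The resulting string can be pasted into Wireshark's display filter bar.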