Silly Window Syndrome and the Nagle Algorithm

Source: Internet
Author: User

Note: The TCP/IP Illustrated series is, after all, not a textbook, and many topics are not explained in detail. For example, it never says what SWS is before it starts introducing ways to avoid it, and it also ties SWS to the Nagle algorithm. Intuition told me the two must be closely connected, so I searched around, and the search was well worth it. I am posting what I found here to share with you.

 

Part 1: SWS

What is silly window syndrome?

When the sending application produces data slowly, or the receiving application consumes data from the receive buffer slowly, or both, the segments exchanged between the two applications become very small, with a tiny payload in particular. In the extreme case the payload is only 1 byte while the transmission overhead is 40 bytes (a 20-byte IP header plus a 20-byte TCP header). This is called silly window syndrome (SWS).
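
The overhead figures above can be checked with a little arithmetic. The following is a minimal sketch; the function name `header_overhead` is made up for illustration, and it assumes 20-byte IP and TCP headers with no options, as in the text.

```python
def header_overhead(payload_bytes, ip_header=20, tcp_header=20):
    """Return (total packet size, fraction of the packet taken by headers)."""
    headers = ip_header + tcp_header
    total = payload_bytes + headers
    return total, headers / total


# Compare a 1-byte payload with larger payloads.
for payload in (1, 100, 1460):
    total, share = header_overhead(payload)
    print(f"payload={payload:5d}  packet={total:5d}  header share={share:.1%}")
```

For a 1-byte payload the packet is 41 bytes and the headers make up about 97.6% of it, which is why a stream of such packets wastes so much bandwidth.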

SWS caused by the sender

If the sending TCP serves an application that produces data slowly (Telnet is the typical case), for example one byte at a time, the application writes one byte at a time into the sender's TCP buffer. Unless the sending TCP is given specific instructions, it generates segments that carry only one byte of data each. As a result, many 41-byte IP datagrams travel across the Internet. The solution is to keep the sending TCP from transmitting data byte by byte: it must be forced to wait and collect data so that it can send larger blocks. How long should the sending TCP wait? If it waits too long, it delays the whole process; if it does not wait long enough, it may still send small segments. Nagle found a good answer, now known as the Nagle algorithm: the wait lasts one RTT, that is, until the next ACK arrives.
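
To get a feel for how much waiting for the ACK helps, here is a toy simulation. It is only a sketch under strong assumptions: no packet loss, no delayed ACKs, and the buffered data never reaches the MSS. The function `count_segments` and its parameters are invented for this example and do not come from any real TCP stack.

```python
def count_segments(num_writes, write_interval_ms, rtt_ms, coalesce):
    """Count the segments sent for a stream of 1-byte application writes.

    With coalesce=True, data written while a segment is unacknowledged is
    buffered and sent as one segment when the ACK arrives (one RTT later).
    """
    if not coalesce:
        return num_writes  # every 1-byte write becomes its own segment

    segments = 0
    buffered = 0     # bytes written by the application but not yet sent
    ack_due = None   # time (ms) at which the outstanding segment is ACKed

    for i in range(num_writes):
        now = i * write_interval_ms
        # Handle every ACK that has arrived up to and including this moment.
        while ack_due is not None and ack_due <= now:
            if buffered:
                segments += 1       # flush everything collected so far
                buffered = 0
                ack_due += rtt_ms   # the new segment is ACKed one RTT later
            else:
                ack_due = None      # nothing is in flight any more
        buffered += 1
        if ack_due is None:         # idle connection: send immediately
            segments += 1
            buffered = 0
            ack_due = now + rtt_ms
    if buffered:
        segments += 1               # leftover bytes go out on the last ACK
    return segments


print(count_segments(100, 10, 100, coalesce=False))
print(count_segments(100, 10, 100, coalesce=True))
```

In this model, 100 one-byte writes spaced 10 ms apart over a 100 ms RTT produce 100 segments without coalescing and 11 with it. When the writes are spaced further apart than the RTT, coalescing changes nothing, because every write finds the connection idle.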

SWS caused by the receiver

The receiving TCP can produce silly window syndrome if it serves an application that consumes data slowly, for example one byte at a time. Suppose the sending application generates data in 1000-byte blocks, but the receiving application takes in only 1 byte at a time, and suppose the receiving TCP's input buffer is 4000 bytes. The sender sends the first 4000 bytes and the receiver stores them in its buffer. The buffer is now full, so the receiver advertises a window size of zero, which tells the sender to stop sending. The receiving application then reads the first byte from the input buffer, which frees 1 byte of space. The receiving TCP advertises a window size of 1 byte. The sending TCP, which has been waiting eagerly to send, takes this advertisement as good news and sends a segment carrying only one byte of data. The process repeats: one byte of data is consumed, and then a segment carrying only one byte of data is sent.
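
The scenario above can be replayed with a small model. This is a hedged sketch, not real TCP: the function `naive_receiver` is hypothetical, the receiver advertises whatever space is free, and the sender immediately fills any non-zero window with at most one MSS (assumed to be 1000 bytes here, matching the 1000-byte blocks in the text).

```python
def naive_receiver(total_bytes, buffer_size=4000, read_size=1, mss=1000):
    """Return the payload size of each segment the sender emits."""
    segments = []
    buffered = 0              # bytes sitting in the receive buffer
    remaining = total_bytes   # bytes the sender still has to send
    while remaining > 0:
        window = buffer_size - buffered   # receiver advertises all free space
        if window == 0:
            # Buffer full: the sender is blocked until the application reads.
            buffered -= min(read_size, buffered)
            continue
        size = min(window, mss, remaining)
        segments.append(size)
        buffered += size
        remaining -= size
    return segments


print(naive_receiver(4010))
```

For 4010 bytes the model yields four 1000-byte segments followed by ten 1-byte segments, which is exactly the degenerate pattern described above.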

For this kind of silly window syndrome, where the application consumes data more slowly than it arrives, two solutions have been proposed:
1) Clark's solution: send an acknowledgment as soon as data arrives, but advertise a window size of zero until either the buffer has room for a maximum-size segment or half of the buffer is empty.
2) Delayed acknowledgment: wait for a while before sending the acknowledgment. That is, a segment is not acknowledged immediately when it arrives; the receiver waits until there is a reasonable amount of free space in its buffer. Delaying the acknowledgment keeps the sending TCP from sliding its window: once the sender has sent the data in its window, it stops, which prevents the syndrome. Delayed acknowledgment has another advantage: it reduces traffic, because the receiver does not need to acknowledge every segment. It also has a disadvantage: a delayed acknowledgment may force the sender to retransmit segments that have not yet been acknowledged. The protocol balances the advantage against the disadvantage, for example by requiring that an acknowledgment not be delayed by more than 500 milliseconds.
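
Clark's rule can be written as a one-line window policy. The sketch below is illustrative only: `clark_window` and `simulate_clark` are made-up names, the MSS is assumed to be 1000 bytes, and the sender is assumed to fill any non-zero window right away.

```python
def clark_window(free_space, buffer_size, mss):
    """Advertise zero until a full segment fits or half the buffer is free."""
    threshold = min(mss, buffer_size // 2)
    return free_space if free_space >= threshold else 0


def simulate_clark(total_bytes, buffer_size=4000, read_size=1, mss=1000):
    """Return the payload size of each segment sent under Clark's rule."""
    segments = []
    buffered = 0              # bytes sitting in the receive buffer
    remaining = total_bytes   # bytes the sender still has to send
    while remaining > 0:
        window = clark_window(buffer_size - buffered, buffer_size, mss)
        if window == 0:
            # Window closed: the application keeps reading, nothing is sent.
            buffered -= min(read_size, buffered)
            continue
        size = min(window, mss, remaining)
        segments.append(size)
        buffered += size
        remaining -= size
    return segments


print(simulate_clark(5000))
```

With a 4000-byte buffer and an application that reads one byte at a time, 5000 bytes arrive as five 1000-byte segments: the window stays closed until 1000 bytes have been read, so no 1-byte segments appear.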

 

Part 2: Nagle Algorithm

In TCP/IP, no matter how little data is sent, a protocol header is always prepended to the data, and the peer also has to send an ACK to confirm receipt. To make full use of network bandwidth, TCP tries to send data in blocks that are as large as possible. (Each connection has an MSS parameter, so TCP/IP prefers to send data in MSS-sized blocks each time.) The Nagle algorithm aims to send data in blocks that are as large as possible, so that the network is not flooded with many small blocks.
The basic definition of the Nagle algorithm is: at any time, there can be at most one unacknowledged small segment. A "small segment" is a data block smaller than the MSS. "Unacknowledged" means the data block has been sent but the peer's ACK confirming its receipt has not yet arrived.
Rules of the Nagle algorithm (see the comment on the tcp_nagle_check function in tcp_output.c):

(1) If the segment length reaches the MSS, sending is allowed;

(2) If the segment carries a FIN, sending is allowed;

(3) If the TCP_NODELAY option is set, sending is allowed;

(4) If the TCP_CORK option is not set and all previously sent small packets (packets shorter than the MSS) have been acknowledged, sending is allowed;

(5) If none of the above conditions is met but a timeout occurs (typically 200 ms), the data is sent immediately.
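
The five rules can be read as a single decision function. The following is a paraphrase of the list above for illustration, not the kernel's actual code; the function name `nagle_allows_send` and all of its parameters are hypothetical.

```python
def nagle_allows_send(segment_len, mss, has_fin=False, nodelay=False,
                      cork=False, unacked_small_segments=0,
                      timer_expired=False):
    """Return True if the pending data may be sent now."""
    if segment_len >= mss:                        # rule 1: full-size segment
        return True
    if has_fin:                                   # rule 2: segment carries FIN
        return True
    if nodelay:                                   # rule 3: TCP_NODELAY is set
        return True
    if not cork and unacked_small_segments == 0:  # rule 4: nothing small in flight
        return True
    if timer_expired:                             # rule 5: timeout (about 200 ms)
        return True
    return False


# A small write on an idle connection goes out at once...
print(nagle_allows_send(10, 1460))
# ...but a second small write waits while the first is unacknowledged.
print(nagle_allows_send(10, 1460, unacked_small_segments=1))
```

The first call returns True and the second returns False, which matches the "at most one unacknowledged small segment" definition.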

The Nagle algorithm allows at most one unacknowledged small packet (smaller than the MSS) in the network at a time. It is therefore effectively an extended stop-and-wait protocol, except that it stops and waits per packet rather than per byte. The algorithm is driven entirely by TCP's ACK mechanism, which can cause problems. For example, if the peer's ACKs come back quickly, Nagle does not coalesce many packets; network congestion is avoided, but overall network utilization stays low. The algorithm is also self-adapting, and readers can experiment with it using the rules above.

The Nagle algorithm is one half of the silly window syndrome (SWS) avoidance algorithm. SWS avoidance prevents small amounts of data from being sent, and the Nagle algorithm is the half implemented by the sender. The receiver's half is to refrain from advertising small increases in its buffer space to the sender, unless the buffer space has grown significantly. Here, a significant increase is defined as a full-size segment (MSS) or half of the maximum window.

Note: BSD implementations allow the last small segment of a large write to be sent on an idle connection. That is, when a write is larger than one MSS, the kernel first sends the n full MSS-sized packets in order and then sends the small trailing packet without any delay (assuming the network is not congested and the receive window is large enough).

TCP_NODELAY option

By default, data is sent using the Nagle algorithm. This improves network throughput but hurts real-time responsiveness, which is unacceptable for some highly interactive applications. Setting the TCP_NODELAY option disables the Nagle algorithm.

In that case, every packet the application hands to the kernel is sent immediately. Note that even with the Nagle algorithm disabled, transmission is still affected by TCP's delayed-ACK mechanism.
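
As an illustration, here is how the option can be set from Python's standard `socket` module. This is a minimal sketch: the helper name `make_nodelay_socket` is invented for the example, and the socket is never connected to anything.

```python
import socket


def make_nodelay_socket():
    """Create a TCP socket with the Nagle algorithm disabled."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # A non-zero value turns TCP_NODELAY on for this socket.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock


sock = make_nodelay_socket()
# Reading the option back returns a non-zero value when it is enabled.
print(sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY))
sock.close()
```

The option is set per socket, so each connection of an interactive application has to opt out of the Nagle algorithm on its own.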

TCP_CORK option

A cork is a stopper. Figuratively, you cork the connection so that data is held back, and the data goes out once the cork is pulled. With this option set, the kernel does its best to coalesce small packets into one large packet (one MTU) before sending. Of course, if after a certain time (typically 200 ms; this value has yet to be confirmed) the kernel still has not assembled a full MTU, it must send the data it has (the data cannot be kept waiting forever).
However, the implementation of TCP_CORK may not be as perfect as you imagine, and the cork does not plug the connection completely. The kernel does not know when the application will send a second batch of data to be combined with the first batch to reach the MTU size, so it imposes a time limit: if no large packet (as close to the MTU as possible) has been assembled within that time, the kernel sends the data unconditionally. In other words, if the application sends its small packets at intervals that are not short enough, TCP_CORK brings no benefit and only costs real-time responsiveness (each small packet is delayed for a while before it is sent).
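
A typical usage pattern is to cork the socket, write several small pieces, and then pull the cork. The sketch below uses Python's `socket` module; the helper names `set_cork` and `send_corked` are hypothetical, and because `socket.TCP_CORK` is only defined on Linux, the code falls back to plain sends elsewhere.

```python
import socket


def set_cork(sock, enabled):
    """Set or clear TCP_CORK; return False where the option is unavailable."""
    if not hasattr(socket, "TCP_CORK"):
        return False  # TCP_CORK is Linux-specific
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_CORK, 1 if enabled else 0)
    return True


def send_corked(sock, chunks):
    """Send several small chunks on a connected socket as one corked burst."""
    corked = set_cork(sock, True)
    try:
        for chunk in chunks:
            sock.sendall(chunk)
    finally:
        if corked:
            set_cork(sock, False)  # pulling the cork flushes pending data
```

A caller would use it as `send_corked(conn, [header, body])` on a connected TCP socket. Clearing the option right after the last write, rather than waiting for the kernel's time limit, avoids the extra delay described above.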

Differences between the Nagle algorithm and the cork algorithm

The Nagle algorithm and the cork algorithm are very similar, but they start from different concerns. The Nagle algorithm mainly tries to keep the network from becoming congested with too many small packets (in which the protocol headers take up a large share), while the cork algorithm aims to improve network utilization by making the headers' share of the total as small as possible. Seen this way, the two agree in avoiding the transmission of small packets. At the user-control level, the Nagle algorithm cannot be tuned through the socket: you can only disable it by setting TCP_NODELAY. The cork algorithm is likewise enabled or disabled by setting or clearing TCP_CORK. However, the Nagle algorithm cares about network congestion: as soon as all ACKs have come back, it sends. The cork algorithm cares about content: as long as the interval between consecutive packets is very short (this is important, otherwise the kernel sends the scattered packets for you), then even if you send several small packets separately, the cork algorithm can splice them into one packet. With the Nagle algorithm, this may not be possible.

 

reference: http://www.cnblogs.com/ggjucheng/archive/2012/02/03/2337046.html
