Window-based end-to-end TCP congestion control mechanism


In 1988, Van Jacobson pointed out TCP's shortcomings in controlling network congestion and proposed the "slow start" and "congestion avoidance" algorithms. The TCP Reno version that appeared in 1990 added the "fast retransmit" and "fast recovery" algorithms, which avoid invoking full slow start when congestion is not severe and so prevent an excessive reduction of the sending window. These four algorithms form the core of TCP congestion control. In recent years, improved TCP versions such as New-Reno and SACK have also emerged.

Main Parameters

TCP congestion control works by adjusting a few key parameters. The TCP parameters used for congestion control include:

(1) Congestion window (cwnd): the key congestion-control parameter. It is the maximum amount of data the sender may have outstanding at one time under congestion control.

(2) Advertised window (awnd): the window size the receiver advertises to the sender for flow control. The sender's window can never exceed it.

(3) Sending window (win): the amount of data the sender actually transmits at a time; win = min(cwnd, awnd).

(4) Slow-start threshold (ssthresh): the boundary between the slow-start and congestion-avoidance phases of congestion control. Its initial value is usually set to 65535 bytes.

(5) Round-trip time (RTT): the interval from when the sender transmits a TCP segment until it receives the receiver's acknowledgment for it.

(6) Retransmission timeout (RTO): the interval after which an unacknowledged segment is retransmitted. It is the key parameter for deciding whether a packet has been lost and whether the network is congested, and is usually on the order of 2 to 5 times the RTT.

(7) Fast retransmit threshold (tcprexmtthresh): the number of duplicate ACKs that triggers fast retransmit. When the count reaches tcprexmtthresh, the sender enters the fast retransmit phase. The default value is 3.
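As an illustration, the parameters above can be collected into one structure. This is a minimal sketch; the field names, defaults, and units are ours for illustration, not those of any real TCP stack:

```python
from dataclasses import dataclass

@dataclass
class CongestionState:
    """Illustrative container for the congestion-control parameters above."""
    cwnd: int = 1             # congestion window, in segments
    awnd: int = 45            # receiver's advertised window (in segments here)
    ssthresh: int = 65535     # slow-start threshold, in bytes
    rtt: float = 0.1          # last round-trip time sample, in seconds
    rto: float = 0.5          # retransmission timeout, in seconds
    dupack_thresh: int = 3    # tcprexmtthresh: duplicate ACKs before fast retransmit

    def sending_window(self) -> int:
        # The sender may transmit no more than both windows allow.
        return min(self.cwnd, self.awnd)

state = CongestionState()
print(state.sending_window())
```

Note how the sending window is derived rather than stored: it is always the minimum of what congestion control (cwnd) and the receiver (awnd) each permit.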

Four Phases

1. Slow start phase

Early TCP implementations injected many packets into the network as soon as a connection started. Routers along the path had to queue those packets, could exhaust their buffer space, and throughput dropped sharply. The algorithm that avoids this is slow start. When a new TCP connection is established, the congestion window is initialized to one segment (with a default segment size of 536 or 512 bytes). The sender transmits according to cwnd, and each received ACK increases cwnd by one segment. As a result, cwnd doubles every RTT, growing exponentially: 1, 2, 4, 8, ... The amount of data the source injects into the network therefore ramps up quickly.
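The doubling-per-RTT behavior can be sketched as follows. This is an illustrative model, not real stack code; it assumes every segment in flight is acknowledged within one RTT:

```python
def slow_start_growth(initial_cwnd=1, ssthresh=64, rtts=6):
    """Model of exponential cwnd growth during slow start: each ACK adds
    one segment, so cwnd doubles once per round-trip time, capped at
    ssthresh (both expressed in segments here)."""
    cwnd = initial_cwnd
    history = [cwnd]
    for _ in range(rtts):
        # One RTT elapses: every outstanding segment is ACKed, and each
        # ACK grows cwnd by one segment, so cwnd doubles.
        cwnd = min(cwnd * 2, ssthresh)
        history.append(cwnd)
    return history

print(slow_start_growth())  # cwnd per RTT: 1, 2, 4, 8, ...
```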

2. Congestion avoidance phase

When a timeout occurs or three duplicate ACKs are received, TCP Reno assumes the network is congested (the assumption being that less than 1% of losses are caused by packet corruption in transit). At that point the connection enters the congestion avoidance phase: ssthresh is set to half of the current cwnd, and on a timeout cwnd is additionally reset to 1. If cwnd is less than or equal to ssthresh, TCP re-enters slow start; if cwnd is greater than ssthresh, TCP runs the congestion avoidance algorithm, increasing cwnd by only 1/cwnd segments for each ACK received (assuming a segment size of 1 unit), i.e. roughly one segment per RTT.
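The two Reno reactions described above can be sketched as two small transition functions (an illustrative model with cwnd measured in segments; function names are ours):

```python
def on_ack(cwnd, ssthresh):
    """Reno-style window growth when a new ACK arrives."""
    if cwnd < ssthresh:
        return cwnd + 1.0        # slow start: exponential growth per RTT
    return cwnd + 1.0 / cwnd     # congestion avoidance: ~1 segment per RTT

def on_timeout(cwnd):
    """Timeout reaction: ssthresh becomes half the window, cwnd collapses
    back to one segment and slow start restarts.
    Returns (new_cwnd, new_ssthresh)."""
    return 1.0, max(cwnd / 2.0, 2.0)
```

Calling `on_ack` once per ACK reproduces the additive-increase behavior: below ssthresh the window grows by a full segment per ACK, above it by only 1/cwnd.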

3. Fast retransmit and fast recovery

When a packet times out, cwnd is reset to 1 and slow start begins again, which drastically shrinks the sending window and the TCP connection's throughput. Therefore, when the sender receives three or more duplicate ACKs, it concludes that a segment has been lost and retransmits it immediately, setting ssthresh to half of the current cwnd, instead of waiting for the RTO to expire. Figure 2 and Figure 3 show how the congestion window changes over time across the four phases.
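The duplicate-ACK trigger can be sketched as follows (a simplified model of classic Reno fast retransmit; the window-inflation detail is the textbook formulation, and the function name is ours):

```python
def fast_retransmit(cwnd, ssthresh, dup_acks, thresh=3):
    """After `thresh` duplicate ACKs, retransmit the presumed-lost segment
    and halve the window, instead of collapsing cwnd to 1 and waiting for
    the RTO. Returns (new_cwnd, new_ssthresh, retransmit_now)."""
    if dup_acks < thresh:
        return cwnd, ssthresh, False          # not enough evidence of loss yet
    ssthresh = max(cwnd // 2, 2)              # halve on loss
    cwnd = ssthresh + thresh                  # classic Reno window inflation:
                                              # each dup ACK means one packet left
    return cwnd, ssthresh, True               # True = retransmit immediately
```

Compare this with a timeout, where cwnd would have fallen all the way back to one segment: here the window only halves, so throughput recovers far faster.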

Efficiency and fairness

Besides the self-similarity of TCP traffic, the efficiency and fairness of TCP congestion control have also attracted wide attention.

1. Efficiency

Efficiency here means how closely the total resources demanded by the sources match the resources the network can provide. If total demand is close or equal to the network's capacity, the algorithm is efficient; both overload and underload are inefficient. Note that efficiency concerns only aggregate utilization, not how resources are divided among the individual sources.

2. Fairness

Fairness means that when congestion occurs, each source (or each TCP connection or UDP flow established from the same source) shares network resources such as bandwidth and buffer space equitably. Fairness becomes an issue because congestion inevitably causes packet loss, packet loss forces data streams to compete for scarce network resources, and streams that compete weakly suffer more. Without congestion, there is no fairness problem.

The fairness of the TCP layer is manifested in two aspects:

(1) Connection-oriented TCP and connectionless UDP respond differently to congestion indications, which leads to unfair use of network resources. When congestion occurs, a TCP stream with a congestion-control mechanism follows its congestion-control steps into the congestion avoidance phase and actively reduces the amount of data it sends into the network. UDP, however, has no end-to-end congestion control, so even when the network signals congestion (through packet loss or duplicate ACKs), UDP does not cut its sending rate the way TCP does. As a result, congestion-compliant TCP streams obtain less and less of the network's resources while uncontrolled UDP streams obtain more and more, producing a seriously unfair allocation of network resources across sources.

Unfair resource allocation in turn aggravates congestion and can even cause congestion collapse. How to judge whether each data stream strictly follows TCP congestion control when congestion occurs, and how to "punish" streams that do not, has therefore become a hot topic in congestion-control research. The fundamental remedy for fairness at the transport layer is for every flow to use end-to-end congestion control. Currently, the following methods are used to decide whether a data stream is congestion-compliant:

● A stream that follows TCP congestion control should, as its response to congestion, first halve its congestion window cwnd and then grow cwnd at a constant rate within each RTT. Given a packet loss rate p, a maximum packet size of B bytes, and a minimum RTT of R seconds, the maximum rate of a conforming TCP connection is roughly T = 1.22·B/(R·√p) bytes/s. When a stream's rate exceeds T, it can be concluded that the stream does not implement congestion control. The formula applies mainly when losses are not bursty. In practice, a rate above 1.45·B/(R·√p) is taken as evidence that the stream is unresponsive, and 1.22·B/(R·√p) is used as the threshold for "punishing" it.
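The bound can be evaluated directly. The example numbers below are purely illustrative:

```python
import math

def tcp_friendly_rate(b, r, p, c=1.22):
    """Upper bound on a conforming TCP flow's sending rate:
    T = c * B / (R * sqrt(p)), with B the packet size in bytes, R the
    minimum RTT in seconds, and p the loss rate. c = 1.22 for the
    conforming-flow bound; the looser test constant 1.45 can be passed
    instead to flag clearly unresponsive flows."""
    return c * b / (r * math.sqrt(p))

# Illustrative numbers: 1460-byte packets, 100 ms RTT, 1% loss.
print(tcp_friendly_rate(1460, 0.1, 0.01))  # bytes/second
```

A flow measured to be sending faster than this value, under steady (non-bursty) loss, is presumed not to be running TCP-style congestion control.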

● Check whether a high-bandwidth stream in the network responds to congestion indications, i.e. whether its transmission rate falls as the network's packet loss rate rises.

If the packet loss rate grows, the sending rate should drop markedly: for example, if the loss rate quadruples, the rate should halve. By observing how a stream's rate responds to the loss rate, one can roughly judge whether it implements congestion control. For multicast sessions whose data sources change frequently, and for receivers with on/off behavior, the transmission rate varies so much that this method works poorly in both cases.
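The "quadruple the loss, halve the rate" rule is just the 1/√p dependence from the throughput bound above; constants cancel when comparing two loss rates. A quick check:

```python
import math

def relative_rate(p):
    """Rate of a conforming flow scales as 1/sqrt(p); only the ratio
    between two loss rates matters here, so constants are dropped."""
    return 1.0 / math.sqrt(p)

# Quadrupling the loss rate (1% -> 4%) should halve the sending rate:
ratio = relative_rate(0.04) / relative_rate(0.01)
print(round(ratio, 6))
```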

(2) Fairness problems also exist among TCP connections themselves. They arise because some connections use a larger window before congestion, have a smaller RTT, or send larger packets than others, and therefore capture more bandwidth. In short, the fundamental way to solve the fairness problem of TCP congestion control is a new algorithm that combines end-to-end congestion control with congestion control at the IP layer.

Improvement

In fact, TCP Tahoe predates TCP Reno. The main difference between the two is that Tahoe implements only the first three congestion-control algorithms and lacks fast recovery, so TCP Reno can be regarded as an enhanced version of TCP Tahoe. Even so, the Reno algorithm has shortcomings.

First, after the sender detects congestion, it retransmits every packet sent between the lost packet and the point where the loss was detected (the go-back-N approach), even though some of the intermediate packets were delivered correctly and did not need to be retransmitted.

Second, in most TCP implementations the RTO is computed as a function of the mean and variance of the RTT estimate, and estimating RTO and RTT accurately is not easy. In theory, measuring RTT is simple: it is just the time from sending a packet until its ACK returns to the sender. In practice, however, TCP uses cumulative ACKs that acknowledge all data received so far, which complicates the RTT estimate.
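The mean-and-variance computation is the classic Jacobson estimator that most TCP stacks use: exponentially weighted moving averages of the RTT and its variation, with the RTO set to SRTT + 4·RTTVAR. A minimal sketch:

```python
def rtt_update(srtt, rttvar, sample, alpha=0.125, beta=0.25):
    """One Jacobson-style update step from a new RTT sample.
    alpha = 1/8 and beta = 1/4 are the standard gains.
    Returns (new_srtt, new_rttvar, new_rto)."""
    rttvar = (1 - beta) * rttvar + beta * abs(srtt - sample)  # mean deviation
    srtt = (1 - alpha) * srtt + alpha * sample                # smoothed RTT
    return srtt, rttvar, srtt + 4 * rttvar

# A sample equal to the current estimate shrinks the variance term:
srtt, rttvar, rto = rtt_update(0.1, 0.05, 0.1)
print(rto)
```

The cumulative-ACK ambiguity the text mentions is why real stacks take RTT samples only from segments that were not retransmitted (Karn's rule); that detail is omitted from this sketch.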

To address these shortcomings, several improved algorithms have been proposed in recent years; New-Reno and SACK are the most mature. SACK extends Reno with selective acknowledgment and selective retransmission of packets. The sender thus knows exactly which packets reached the receiver, avoids unnecessary retransmissions, reduces latency, and improves network throughput. New-Reno does not use SACK; instead it tries to avoid Reno's frequent retransmission timeouts during fast recovery by using partial ACKs, which acknowledge part of the sending window, to trigger immediate retransmission of the remaining missing packets. Notably, New-Reno only requires changes to the sender-side code.

In summary, even when the sender recovers a packet lost within a window of data without waiting for a timeout, Reno and New-Reno can retransmit at most one dropped packet per RTT. SACK uses a pipe variable to track the number of packets outstanding on the sending path, and tcprexmtthresh to decide whether congestion has occurred. Reno outperforms Tahoe, and New-Reno and SACK outperform both. Because SACK retransmits selectively rather than resending a whole run of packets at once as New-Reno may, it performs better than New-Reno when many packets are lost within one window; its biggest drawback is that it requires modifying the TCP protocol.
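The difference between go-back-N repair and SACK-style hole filling can be sketched with a simplified scoreboard (sequence numbers count whole segments here, and the function name is ours):

```python
def sack_retransmit_queue(next_unacked, highest_sent, sacked):
    """Simplified SACK loss repair: only the holes below the highest
    SACKed segment are queued for retransmission, unlike go-back-N,
    which resends everything from `next_unacked` onward."""
    if not sacked:
        # No selective information: fall back to go-back-N behavior.
        return list(range(next_unacked, highest_sent + 1))
    top = max(sacked)
    # Retransmit only the gaps the receiver reported below `top`.
    return [seq for seq in range(next_unacked, top) if seq not in sacked]

# Segments 10..20 sent; 12 and 14..20 were SACKed, so 10, 11, 13 are holes:
print(sack_retransmit_queue(10, 20, {12} | set(range(14, 21))))
```

With no SACK blocks the whole tail is resent (eleven segments in this example); with them, only the three genuine holes are.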

Because the RTT closely tracks network conditions, the Vegas algorithm, which uses RTT to control congestion, has emerged in recent years. Vegas adjusts the congestion window cwnd by observing changes in the connection's RTT. If the RTT grows, Vegas takes the network to be getting congested and reduces cwnd; if the RTT shrinks, Vegas takes the congestion to have cleared and increases cwnd again. Ideally, cwnd then stabilizes at an appropriate value. The key advantage is that congestion is detected from RTT changes rather than from packet loss, so Vegas can predict the network's available bandwidth more accurately; it also adapts well to small buffers and has good fairness and efficiency. Nevertheless, Vegas is still not widely deployed on the Internet. This is not a flaw of the algorithm itself, but a consequence of Vegas flows competing unfairly for bandwidth against non-Vegas flows.
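The Vegas decision rule can be sketched as comparing expected throughput (at the base, uncongested RTT) with actual throughput (at the observed RTT); the difference estimates how many segments the flow has queued in the network. This is an illustrative simplification using the usual alpha/beta thresholds:

```python
def vegas_adjust(cwnd, base_rtt, observed_rtt, alpha=1.0, beta=3.0):
    """One Vegas-style window decision (cwnd in segments, RTTs in seconds).
    alpha/beta are thresholds on the estimated queued backlog, in segments."""
    expected = cwnd / base_rtt          # throughput if nothing were queued
    actual = cwnd / observed_rtt        # throughput actually achieved
    backlog = (expected - actual) * base_rtt  # segments sitting in queues
    if backlog < alpha:
        return cwnd + 1                 # path underused: grow the window
    if backlog > beta:
        return cwnd - 1                 # RTT inflating: queue building, back off
    return cwnd                         # in the sweet spot: hold steady

# RTT at baseline -> grow; RTT doubled -> shrink:
print(vegas_adjust(10, 0.1, 0.1), vegas_adjust(10, 0.1, 0.2))
```

Unlike Reno, nothing here depends on packet loss, which is exactly why a Vegas flow backs off earlier than, and loses bandwidth to, a competing loss-based flow.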
