Analysis of the principles of Linux TCP congestion control algorithms

Source: Internet
Author: User
Tags ack rfc

This is only a brief walk-through of the control principles behind each TCP version. For the basic variable definitions, refer to the following links:

TCP Basic Congestion Control

RTO calculation in TCP

TCP congestion control terminology:

1. awnd (advertised window): the advertised window is sent by the receiving TCP to the sending TCP and tells the sender how much space the receiver currently has available for new packets.

2. cwnd (congestion window): the congestion window is a variable introduced purely for congestion control. If awnd were used alone, the sender would transmit a full receiver window every time, which could cause momentary congestion in the network. cwnd exists to avoid the sharp drop in network utilization that would follow.

3. ssthresh (slow start threshold): the slow start threshold decides whether the slow start algorithm or the congestion avoidance algorithm is used. When the current window is less than ssthresh, slow start grows the window exponentially. When the current window equals ssthresh, either slow start or congestion avoidance may be used to grow the send window. When the current send window is greater than ssthresh, congestion avoidance grows the send window linearly.

4. Send window: W = min(cwnd, awnd)

5. Duplicate ACK: a TCP receiver should return a duplicate ACK to the sender immediately upon receiving an out-of-order packet. Suppose the receiver has received every packet up to and including packet 1000 in order and has returned ACK 1001 to the sender. If it then receives packet 1002 instead of the expected packet 1001, it immediately returns ACK 1001 again. Relative to the first ACK 1001, this ACK is a duplicate ACK.
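Definitions 3 to 5 above can be sketched in a few lines. This is only an illustration, not kernel code: the names send_window, growth_phase and Receiver are hypothetical, windows are counted in packets, and ACK numbers name the next expected packet, as in the 1000/1001/1002 example.

```python
def send_window(cwnd, awnd):
    """Definition 4: the sender is limited by both windows."""
    return min(cwnd, awnd)


def growth_phase(cwnd, ssthresh):
    """Definition 3: pick the window growth algorithm."""
    if cwnd < ssthresh:
        return "slow start"
    if cwnd > ssthresh:
        return "congestion avoidance"
    return "either"  # at cwnd == ssthresh both are allowed


class Receiver:
    """Definition 5: cumulative ACKs, one sequence number per packet (assumption)."""

    def __init__(self, next_expected):
        self.next_expected = next_expected
        self.buffered = set()  # out-of-order packets held for later

    def on_packet(self, seq):
        """Return the ACK number sent in response to packet `seq`."""
        if seq == self.next_expected:
            self.next_expected += 1
            # Packets buffered earlier may now be in order as well.
            while self.next_expected in self.buffered:
                self.buffered.remove(self.next_expected)
                self.next_expected += 1
        elif seq > self.next_expected:
            self.buffered.add(seq)
        # Out-of-order arrivals repeat the same number: a duplicate ACK.
        return self.next_expected


rx = Receiver(next_expected=1000)
acks = [rx.on_packet(seq) for seq in (1000, 1002, 1003, 1001)]
print(acks)  # [1001, 1001, 1001, 1004]
```

The second and third ACKs repeat 1001, so they are duplicates of the first. Once packet 1001 finally arrives, the ACK jumps to 1004 because 1002 and 1003 were already buffered.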

The original version of the TCP protocol: TCP Tahoe

1. Slow start

The initial value is cwnd=1 (10 since around Linux 3.0). The initial ssthresh can be set to any size; setting it to awnd or larger makes TCP always start with the slow start algorithm rather than the congestion avoidance algorithm. Linux 3.2.12, for example, uses the maximum int value 0x7fffffff.

In this phase, the send window W doubles every RTT. Plotted against time, the window follows an exponential curve.
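The doubling comes from growing the window by one packet for every ACK received. A minimal sketch, assuming windows counted in packets, one ACK per packet (no delayed ACKs), no loss, and awnd never being the limit; slow_start_round is a hypothetical helper:

```python
def slow_start_round(cwnd):
    """Simulate one RTT in which every packet of the window is ACKed once."""
    acks = cwnd              # one ACK comes back per packet sent this round
    for _ in range(acks):
        cwnd += 1            # slow start: grow by one packet per ACK
    return cwnd


cwnd = 1
history = [cwnd]
for _ in range(4):
    cwnd = slow_start_round(cwnd)
    history.append(cwnd)
print(history)  # [1, 2, 4, 8, 16]
```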

2. Congestion avoidance

When a packet is lost, TCP cannot determine the cause of the loss, so it assumes that network congestion has occurred, retransmits the packet, and enters the congestion avoidance algorithm. The actions are:

ssthresh = max(FlightSize/2, 2*SMSS). FlightSize is the amount of data in the current send window that has been sent but not yet acknowledged, so FlightSize <= W. SMSS is the sender's maximum segment size.

cwnd = 1 (or another initial value, such as 10). The send window then grows again, starting from the slow start algorithm.

3. Packet loss and a new slow start

Using the new values set in the previous step, go back to step 1 and run slow start again.

For example:

1. At TCP startup, cwnd=1 and ssthresh=16. Timeline 0-4 is the slow start phase. The send window then reaches ssthresh and TCP enters the congestion avoidance phase, timeline 4-12.

2. When a packet is lost, cwnd is reset to 1 and ssthresh is set to half of the current congestion window (24), that is, 12. This is timeline 12-13.

3. With the new cwnd and ssthresh from the previous step, go back to step 1 and run the slow start algorithm again.
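The timeline in this example can be reproduced with a small per-RTT simulation. This is a sketch under simplifying assumptions, not how a real stack works: the window is counted in packets, it changes once per RTT, cwnd stands in for FlightSize, and slow start is capped at ssthresh. The function tahoe_trace and its parameters are hypothetical names.

```python
def tahoe_trace(rounds, loss_rounds, cwnd=1, ssthresh=16):
    """Return one (time, cwnd, ssthresh) tuple per RTT for a Tahoe-style sender."""
    trace = []
    for t in range(rounds + 1):
        trace.append((t, cwnd, ssthresh))
        if t in loss_rounds:
            # Loss: halve the threshold (at least 2 packets), restart from 1.
            ssthresh = max(cwnd // 2, 2)
            cwnd = 1
        elif cwnd < ssthresh:
            cwnd = min(cwnd * 2, ssthresh)   # slow start: double per RTT
        else:
            cwnd += 1                        # congestion avoidance: +1 per RTT
    return trace


trace = tahoe_trace(rounds=13, loss_rounds={12})
for t, cwnd, ssthresh in trace:
    print(t, cwnd, ssthresh)
```

The window reaches 16 at time 4 and 24 at time 12. After the loss at time 12, the entry for time 13 shows cwnd=1 and ssthresh=12, matching the values in the example.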

Later Tahoe versions also introduced the fast retransmit algorithm. Originally, a lost packet could only be retransmitted after the RTO expired; with fast retransmit, the packet is retransmitted immediately once 3 consecutive duplicate ACKs have been received.

TCP with fast recovery: TCP Reno (adds the fast retransmit/fast recovery algorithm):

The original Tahoe detects loss only through RTO expiry: retransmission starts only when no ACK has arrived within the RTO. Reno relies on the fast retransmit mechanism instead: when the sender receives 3 consecutive duplicate ACKs, it no longer waits for the RTO, assumes a packet has been dropped by a congested network, and retransmits immediately. Moreover, because 3 duplicate ACKs arrived within the RTO, the network is still in reasonable condition and the loss was probably caused by momentary congestion, so the send window does not need a drastic adjustment.

Fast retransmit mechanism:

When 3 consecutive duplicate ACKs are received, the packet is retransmitted immediately without waiting for the RTO to expire.

Note: fast retransmit does not have to be used together with the fast recovery algorithm. Later Tahoe versions also have fast retransmit, but they still reset cwnd to 1 rather than using cwnd/2 as the fast recovery algorithm does.
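The trigger itself is just a counter on the sender side. A minimal sketch, assuming ACK numbers name the next packet the receiver expects; DupAckDetector is a hypothetical class and only covers detection, not the window adjustment that follows:

```python
DUP_ACK_THRESHOLD = 3


class DupAckDetector:
    """Counts consecutive duplicate ACKs and signals a fast retransmit."""

    def __init__(self):
        self.last_ack = None
        self.dup_count = 0

    def on_ack(self, ack):
        """Return the packet number to retransmit now, or None."""
        if ack == self.last_ack:
            self.dup_count += 1
            if self.dup_count == DUP_ACK_THRESHOLD:
                return ack   # the repeated ACK number names the missing packet
        else:
            # A new ACK number resets the duplicate count.
            self.last_ack = ack
            self.dup_count = 0
        return None


detector = DupAckDetector()
results = [detector.on_ack(a) for a in (1001, 1001, 1001, 1001, 1001)]
print(results)  # [None, None, None, 1001, None]
```

The first ACK 1001 is not a duplicate, so the retransmission fires on the fourth ACK 1001, which is the third duplicate.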

Window example:

Fast retransmit/fast recovery algorithm:

1. On receiving 3 duplicate ACKs, set ssthresh = cwnd/2.

2. Retransmit the lost packet and set cwnd = ssthresh + 3. The 3 comes from the "conservation of packets" principle: under the ACK mechanism, receiving 3 duplicate ACKs indicates that 3 packets have left the network, so cwnd is increased by 3.

3. For each additional duplicate ACK received (packet 5), set cwnd = cwnd + 1. This simply reflects the fact that one more packet has left the network.

4. Send packets.

5. When an ACK for new data is received (packet 7 or packet 9), set cwnd = ssthresh. This is called "deflating the window". TCP then exits the fast recovery phase and enters the congestion avoidance phase.

This corresponds to timeline 12-13.
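The five steps can be sketched as a small state machine. This is an illustration under assumptions, not a real implementation: windows are counted in packets, cwnd/2 uses integer division, actual sending (step 4) and normal window growth are left out, and RenoSender with its method names is hypothetical.

```python
class RenoSender:
    """Window bookkeeping for Reno fast retransmit/fast recovery."""

    def __init__(self, cwnd, ssthresh):
        self.cwnd = cwnd
        self.ssthresh = ssthresh
        self.dup_acks = 0
        self.in_fast_recovery = False
        self.retransmitted = []   # packets retransmitted so far

    def on_dup_ack(self, ack):
        self.dup_acks += 1
        if self.in_fast_recovery:
            self.cwnd += 1                        # step 3: one more packet left
        elif self.dup_acks == 3:
            self.ssthresh = self.cwnd // 2        # step 1
            self.retransmitted.append(ack)        # step 2: resend lost packet
            self.cwnd = self.ssthresh + 3
            self.in_fast_recovery = True

    def on_new_ack(self, ack):
        self.dup_acks = 0
        if self.in_fast_recovery:
            self.cwnd = self.ssthresh             # step 5: deflate the window
            self.in_fast_recovery = False


sender = RenoSender(cwnd=24, ssthresh=16)
for _ in range(4):
    sender.on_dup_ack(5)          # three duplicates, then one more
window_during_recovery = sender.cwnd
sender.on_new_ack(7)
print(window_during_recovery, sender.cwnd, sender.ssthresh)  # 16 12 12
```

Starting from the cwnd of 24 used in the Tahoe example, the window ends at 12 instead of 1, which is the point of fast recovery.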


TCP Congestion Control algorithm RFC document link

Newer version of TCP: TCP NewReno (a minor but important modification to Reno that handles partial acknowledgments):

Shortcomings of Reno:

When only one packet in the send window is lost, the first ACK received one RTT after the fast retransmit acknowledges every packet that was sent before the fast retransmit began. However, if several packets in the send window are lost, the first ACK received after the fast retransmit acknowledges only part of the data in the send window. Such an ACK is called a "partial acknowledgment". NewReno handles a partial acknowledgment by treating the next unacknowledged packet as lost and retransmitting it.
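The distinction between the two cases can be sketched as a classification of the incoming ACK. The scenario is assumed for illustration: packets up to 10 were sent before the fast retransmit, the last cumulative ACK was 5, and ACK numbers name the next expected packet. classify_ack and its parameters are hypothetical names.

```python
def classify_ack(ack, last_ack, highest_sent):
    """Classify an ACK that arrives after a fast retransmit."""
    if ack <= last_ack:
        return "duplicate"   # nothing new was acknowledged
    if ack > highest_sent:
        return "full"        # everything sent before the retransmit is covered
    return "partial"         # some new data, but another packet is missing


# Packet 5 was retransmitted; packets up to 10 were already in flight.
print(classify_ack(5, last_ack=5, highest_sent=10))    # duplicate
print(classify_ack(11, last_ack=5, highest_sent=10))   # full: only 5 was lost
print(classify_ack(7, last_ack=5, highest_sent=10))    # partial: 7 is lost too
```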

NewReno's fast retransmit/fast recovery algorithm:

1. When 3 duplicate ACKs are received and the sender is not already in fast recovery:

Set ssthresh = cwnd/2

Record the highest packet sequence number transmitted before fast retransmit began (packet 10 or higher) in the variable recover.

2. Retransmit the lost packet (packet 5) and set cwnd = ssthresh + 3.

3. For each further duplicate ACK received (packet 5), set cwnd = cwnd + 1.

4. Send packets (made possible by the cwnd expansion in the previous step).

5. When an ACK for new data is received (packet 7 or packet 9), it may acknowledge the retransmission from step 2 or a later retransmission.

If the ACK acknowledges all packets from the lost packet up to and including "recover":

Set cwnd = ssthresh or min(ssthresh, FlightSize + MSS) and exit the fast recovery phase.

Otherwise (a partial acknowledgment):

Retransmit the next unacknowledged packet and set cwnd = cwnd - (number of newly acknowledged packets) + 1, then send a new packet if the new cwnd allows it. The purpose of this partial deflation of the window is that when fast recovery finally ends, approximately ssthresh worth of packets is outstanding in the network. Fast recovery does not end here; if further duplicate ACKs arrive, repeat steps 3 and 4.
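Step 5 is where NewReno differs from Reno, so the whole procedure is worth sketching once. This is a sketch under assumptions, not the Linux implementation: windows and sequence numbers are counted in packets, ACK numbers name the next expected packet, cwnd/2 uses integer division, the full-ACK branch uses the simpler cwnd = ssthresh option, and actual sending is left out. NewRenoSender and its fields are hypothetical names.

```python
class NewRenoSender:
    """Window bookkeeping for NewReno fast retransmit/fast recovery."""

    def __init__(self, cwnd, ssthresh, last_ack, highest_sent):
        self.cwnd = cwnd
        self.ssthresh = ssthresh
        self.last_ack = last_ack          # highest cumulative ACK so far
        self.highest_sent = highest_sent  # highest packet number transmitted
        self.recover = None
        self.dup_acks = 0
        self.in_fast_recovery = False
        self.retransmitted = []

    def on_ack(self, ack):
        if ack == self.last_ack:
            self._on_dup_ack(ack)
        elif ack > self.last_ack:
            self._on_new_ack(ack)

    def _on_dup_ack(self, ack):
        self.dup_acks += 1
        if self.in_fast_recovery:
            self.cwnd += 1                          # step 3
        elif self.dup_acks == 3:
            self.ssthresh = self.cwnd // 2          # step 1
            self.recover = self.highest_sent
            self.retransmitted.append(ack)          # step 2
            self.cwnd = self.ssthresh + 3
            self.in_fast_recovery = True

    def _on_new_ack(self, ack):
        newly_acked = ack - self.last_ack
        self.last_ack = ack
        self.dup_acks = 0
        if not self.in_fast_recovery:
            return
        if ack > self.recover:                      # step 5, full ACK
            self.cwnd = self.ssthresh
            self.in_fast_recovery = False
        else:                                       # step 5, partial ACK
            self.retransmitted.append(ack)          # next missing packet
            self.cwnd = self.cwnd - newly_acked + 1


sender = NewRenoSender(cwnd=10, ssthresh=16, last_ack=5, highest_sent=10)
for ack in (5, 5, 5, 5, 7):     # packets 5 and 7 were both lost
    sender.on_ack(ack)
print(sender.cwnd, sender.in_fast_recovery, sender.retransmitted)  # 8 True [5, 7]
sender.on_ack(11)               # acknowledges everything up to recover
print(sender.cwnd, sender.in_fast_recovery)  # 5 False
```

The partial ACK 7 triggers a second retransmission without leaving fast recovery, so ssthresh is halved only once even though two packets were lost.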


TCP NewReno RFC documentation links

If I have misunderstood anything, corrections are welcome.
