Analysis of TCP fast retransmission and fast recovery principles

Source: Internet
Author: User

Timeout retransmission is an important mechanism by which TCP ensures reliable delivery. The sender starts a timer after sending data; if no ACK arrives within a certain period, the data is re-sent, and this repeats until it is delivered successfully. This is the repair mechanism for lost packets. Normally retransmission happens only after a timeout, but if the sender receives three or more duplicate ACKs, it should conclude that the data has been lost and retransmit it. Because this mechanism does not wait for the retransmission timer to expire, it is called fast retransmission. After a fast retransmission the sender does not go back to slow start but continues in congestion avoidance, which is why the follow-up algorithm is called fast recovery.
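
The duplicate-ACK trigger described above can be sketched as follows. This is a minimal illustration, not a real TCP stack: the function name `fast_retransmit_points` and the list of cumulative ACK numbers are assumptions made for the example.

```python
DUP_ACK_THRESHOLD = 3  # three duplicate ACKs trigger fast retransmit


def fast_retransmit_points(acks):
    """Return the ACK numbers at which fast retransmit would fire.

    `acks` is the sequence of cumulative ACK numbers seen by the sender
    (hypothetical input for this sketch).
    """
    triggers = []
    last_ack = None
    dup_count = 0
    for ack in acks:
        if ack == last_ack:
            dup_count += 1
            if dup_count == DUP_ACK_THRESHOLD:
                # The segment starting at `ack` is presumed lost: resend it
                # now instead of waiting for the retransmission timer.
                triggers.append(ack)
        else:
            last_ack = ack
            dup_count = 0
    return triggers


print(fast_retransmit_points([100, 200, 300, 300, 300, 300, 400]))  # [300]
```

The first ACK for 300 is the original; the next three are duplicates, and the third duplicate fires the retransmission.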

Fast retransmission and fast recovery are designed to recover lost packets quickly.
Without them, TCP has to rely on the timer, which forces transmission to pause; no new packets are sent during the pause.
The fast retransmission and fast recovery algorithms were introduced in 4.3BSD and are described in RFC 2001 and RFC 2581.
Fast retransmission and fast recovery have gone through several stages.

 

1. Tahoe

Tahoe is an early version of TCP. Its core idea is to let cwnd approach the available channel capacity quickly through exponential growth, and then approach the equilibrium point slowly. Tahoe includes three basic congestion control algorithms: slow start, congestion avoidance, and fast retransmission.

Note that Tahoe has no fast recovery algorithm, and this is its weakness.
Tahoe sets cwnd to 1 when three duplicate ACKs are received or a timeout occurs, and then enters slow start. This drains the network path and greatly reduces network utilization.
In other words, as soon as a packet loss is detected, cwnd is knocked back to its starting value.

Because there is no fast recovery algorithm, no new packets can be sent while the lost packet is being recovered, so throughput during this period is 0. If this time were used to send an appropriate amount of new data, transmission efficiency under packet loss would improve greatly. This is where the name fast recovery comes from.
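
To get a feel for the cost of falling back to cwnd = 1, here is a rough per-RTT sketch. It assumes the window is counted in segments, that ssthresh is set to half the old window on loss (the usual description of Tahoe, not stated in this article), and that cwnd doubles per RTT in slow start and grows by 1 per RTT in congestion avoidance. The function names are hypothetical.

```python
def tahoe_after_loss(cwnd):
    """Tahoe's reaction to three duplicate ACKs or a timeout."""
    ssthresh = max(2, cwnd // 2)  # assumption: threshold is half the old window
    return 1, ssthresh            # cwnd goes back to 1, slow start restarts


def rtts_to_reach(target, cwnd, ssthresh):
    """Count RTTs until cwnd climbs back to `target` (idealized model)."""
    rtts = 0
    while cwnd < target:
        if cwnd < ssthresh:
            cwnd = min(cwnd * 2, ssthresh)  # slow start: double per RTT
        else:
            cwnd += 1                       # congestion avoidance: +1 per RTT
        rtts += 1
    return rtts


cwnd, ssthresh = tahoe_after_loss(16)
print(cwnd, ssthresh)                     # 1 8
print(rtts_to_reach(16, cwnd, ssthresh))  # 11
```

Under these assumptions a sender that had a window of 16 segments needs 11 round trips to get back to it, which is the utilization loss that fast recovery tries to avoid.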

 

2. Reno

Compared with Tahoe, Reno adds a fast recovery phase: once fast retransmission is complete, the sender enters congestion avoidance instead of slow start.
Reno prevents the communication path from going empty after fast retransmit.
The Reno sender uses additional incoming duplicate ACKs to clock subsequent outgoing packets.
Algorithm description:

 

Step 1: if (dupacks >= 3) { ssthresh = max(2 * SMSS, cwnd / 2); cwnd = ssthresh + 3 * SMSS; }
Step 2: Retransmit the lost segment.
Step 3: For each further duplicate ACK received, cwnd += SMSS (cwnd++ when counting in segments).
Step 4: When an ACK for new data arrives, set cwnd = ssthresh. This ACK should acknowledge all segments sent between the lost segment and the arrival of the first duplicate ACK.

 

During fast recovery, cwnd is increased by 1 for each duplicate ACK received. When a non-duplicate ACK is received, cwnd is set to ssthresh and the sender moves to congestion avoidance. If a retransmission timeout occurs, ssthresh is set to half of the current cwnd and the sender enters slow start again.
Exit condition of Reno fast recovery: a non-duplicate ACK is received.
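
The four steps can be sketched as a small state machine. This is a toy model under stated assumptions: the window is counted in segments (SMSS = 1), normal slow-start and congestion-avoidance growth is omitted, and the class and method names (`RenoSender`, `on_dup_ack`, and so on) are hypothetical.

```python
class RenoSender:
    """Toy Reno window logic, counted in segments (SMSS = 1)."""

    def __init__(self, cwnd, ssthresh):
        self.cwnd = cwnd
        self.ssthresh = ssthresh
        self.dupacks = 0
        self.in_fast_recovery = False
        self.retransmits = 0

    def on_dup_ack(self):
        self.dupacks += 1
        if self.in_fast_recovery:
            self.cwnd += 1                           # Step 3: inflate the window
        elif self.dupacks == 3:
            self.ssthresh = max(2, self.cwnd // 2)   # Step 1
            self.cwnd = self.ssthresh + 3
            self.retransmits += 1                    # Step 2: resend lost segment
            self.in_fast_recovery = True

    def on_new_ack(self):
        self.dupacks = 0
        if self.in_fast_recovery:
            self.cwnd = self.ssthresh                # Step 4: deflate the window
            self.in_fast_recovery = False            # continue in congestion avoidance

    def on_timeout(self):
        self.ssthresh = max(2, self.cwnd // 2)
        self.cwnd = 1                                # back to slow start
        self.dupacks = 0
        self.in_fast_recovery = False


s = RenoSender(cwnd=16, ssthresh=64)
for _ in range(5):
    s.on_dup_ack()
print(s.cwnd, s.ssthresh)   # 13 8
s.on_new_ack()
print(s.cwnd, s.ssthresh)   # 8 8
```

With a window of 16, the third duplicate ACK sets ssthresh to 8 and cwnd to 11, two more duplicates inflate cwnd to 13, and the first new ACK deflates it to 8.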

 

3. NewReno

Reno cannot handle the loss of multiple segments from the same window of data effectively, which is why NewReno was introduced.
NewReno modifies Reno's fast recovery algorithm to handle "partial ACKs" when several segments in one window are lost at the same time. (A partial ACK is an ACK that arrives during fast recovery and acknowledges new data, but only part of the data sent before the fast retransmission.)
In this case Reno exits fast recovery and waits for the timer to expire or for more duplicate ACKs to arrive, whereas NewReno does not exit fast recovery but instead does the following:

Step 1: Retransmit the segment that follows the data acknowledged by the partial ACK, and shrink the congestion window by the amount of data the partial ACK acknowledged.
Step 2: For the newly acknowledged data, cwnd++.
Step 3: On the first partial ACK, or on every partial ACK, reset the retransmission timer, and each time reset cwnd to original cwnd / 2.

 

The NewReno algorithm keeps a variable called recover, whose value is the highest sequence number sent when the packet loss was detected. Only after all data up to recover has been acknowledged can the sender leave fast recovery and enter congestion avoidance.
When a timeout occurs, the highest sequence number sent is saved in recover and the fast recovery process ends.
NewReno does not use SACK.
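
A minimal sketch of how the recover variable separates partial ACKs from full ACKs. It assumes segments are numbered 1, 2, 3, ... and that each ACK is represented by the highest segment number it cumulatively acknowledges (inclusive); the function name `classify_ack` is hypothetical.

```python
def classify_ack(acked_upto, prev_acked_upto, recover):
    """Classify an ACK seen during NewReno fast recovery.

    acked_upto / prev_acked_upto: highest segment number cumulatively
    acknowledged by this ACK and by the previous one (inclusive).
    recover: highest segment number sent when the loss was detected.
    """
    if acked_upto <= prev_acked_upto:
        return "duplicate"   # no new data: stay in fast recovery
    if acked_upto < recover:
        return "partial"     # new data, but another segment is still missing
    return "full"            # everything up to recover arrived: exit


# Segments 1..10 were sent; 3 and 6 were lost, so recover = 10.
print(classify_ack(2, 2, 10))    # duplicate
print(classify_ack(5, 2, 10))    # partial: retransmit 6, stay in fast recovery
print(classify_ack(10, 5, 10))   # full: leave fast recovery
```

In the example, the retransmission of segment 3 produces an ACK covering segments up to 5. Reno would treat that as the end of fast recovery; NewReno sees that 5 is below recover and immediately retransmits segment 6.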

 

4. SACK

During fast recovery, SACK maintains a variable called pipe that represents the estimated number of packets outstanding in the path.
The sender only sends new or retransmitted data when the estimated number of packets in the path is less than the congestion window. The variable pipe is incremented by one when the sender either sends a new packet or retransmits an old packet. It is decremented by one when the sender receives a dup ACK packet with a SACK option reporting that new data has been received at the receiver.

Use of the pipe variable decouples the decision of when to send a packet from the decision of which packet to send.

The sender maintains a data structure, the scoreboard, that remembers acknowledgements from previous SACK options. When the sender is allowed to send a packet, it retransmits the next packet from the list of packets inferred to be missing at the receiver. If there are no such packets and the receiver's advertised window is sufficiently large, the sender sends a new packet.

When a retransmitted packet is itself dropped, the SACK implementation detects the drop with a retransmit timeout, retransmitting the dropped packet and then slow-starting.
The sender exits fast recovery when a recovery acknowledgement is received acknowledging all data that was outstanding when fast recovery was entered.

 

Step 1: Fast recovery is initiated: pipe = pipe - 1 (for the packet assumed to have been dropped), pipe = pipe + 1 (for the packet retransmitted), cwnd = cwnd / 2.
Step 2: If pipe < cwnd, the sender retransmits packets inferred to be missing; if there are no such packets, the sender sends new packets.
Step 3: When the sender receives a dup ACK, pipe = pipe - 1; when the sender sends a new packet or retransmits an old packet, pipe = pipe + 1.
Step 4: For partial ACKs, pipe = pipe - 2 (one for the original dropped packet, one for the retransmitted packet).
Step 5: When all packets outstanding before fast recovery have been acked, exit fast recovery. On exit, cwnd is likewise set to ssthresh and the sender enters congestion avoidance.

 

Differences from Reno:

(1) When to send a packet: determined by the pipe calculation, no longer by the changes in cwnd.
(2) Which packet to send: determined by the information carried in the SACK option, so the response is faster.
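
The pipe bookkeeping can be sketched like this. It is a toy model with several assumptions: windows are counted in packets, pipe starts at the number of packets outstanding when the loss is detected, the receiver's advertised window is ignored, and the scoreboard is reduced to a plain list of packet numbers inferred to be missing. The class name `SackRecovery` and its methods are hypothetical.

```python
class SackRecovery:
    """Toy SACK fast-recovery bookkeeping, counted in packets."""

    def __init__(self, cwnd, outstanding, missing):
        self.cwnd = max(1, cwnd // 2)   # window is halved on entry
        self.pipe = outstanding
        self.pipe -= 1                  # packet assumed to have been dropped
        self.pipe += 1                  # its retransmission
        self.missing = list(missing)    # other packets inferred to be missing
        self.sent = []                  # log of what this sketch "transmits"

    def on_dup_ack(self):
        self.pipe -= 1                  # one packet has left the path
        self._try_send()

    def on_partial_ack(self):
        self.pipe -= 2                  # dropped original + its retransmission
        self._try_send()

    def _try_send(self):
        # WHEN to send is decided by pipe; WHICH packet by the scoreboard.
        while self.pipe < self.cwnd:
            if self.missing:
                self.sent.append("rtx %d" % self.missing.pop(0))
            else:
                self.sent.append("new")
            self.pipe += 1


r = SackRecovery(cwnd=8, outstanding=8, missing=[5])
for _ in range(6):
    r.on_dup_ack()
print(r.sent)   # ['rtx 5', 'new']
```

In this run the first four duplicate ACKs only drain pipe from 8 down to 4; the fifth lets the sender retransmit packet 5, and the sixth lets it send new data.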

 

Question 1: Does it matter if multiple packets are lost in the same window?

Answer: Yes. If only one packet is lost, the non-duplicate ACK acknowledges every packet in the window, and the sender then enters congestion avoidance. This is the outcome Reno is designed for.
If multiple packets are lost, the non-duplicate ACK cannot acknowledge every packet in the window. Reno nevertheless exits fast recovery and enters congestion avoidance.

Two things can happen at this point:
(1) Fast retransmission and fast recovery run several times. Each remaining loss is detected in turn, and the sender enters fast retransmission and fast recovery again. Note that every round of fast retransmission and fast recovery halves ssthresh and cwnd, so multiple losses make ssthresh decrease exponentially.
Plotting cwnd(t) shows that throughput is very low during this period, and that congestion avoidance restarts from a very low point after recovery, so the resulting throughput is low as well.
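
The exponential decrease is easy to see with a few lines of arithmetic. This sketch assumes the window is counted in segments and that each round leaves cwnd = ssthresh = max(2, cwnd / 2), as in the Reno steps; the function name is hypothetical.

```python
def window_after_recoveries(cwnd, rounds):
    """cwnd after each successive fast retransmit / fast recovery round."""
    history = [cwnd]
    for _ in range(rounds):
        cwnd = max(2, cwnd // 2)   # ssthresh is halved, cwnd ends at ssthresh
        history.append(cwnd)
    return history


print(window_after_recoveries(32, 3))   # [32, 16, 8, 4]
```

Three losses in one window take a 32-segment window down to 4, which is the low starting point for congestion avoidance mentioned above.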

(2) After several rounds of fast retransmission and fast recovery, the transmission times out.
Under what conditions does a timeout occur?

1) When two packets are dropped from a window of data, the Reno sender is forced to wait for a retransmit timeout whenever the congestion window is less than 10 packets when fast recovery is initiated, and whenever the congestion window is within two packets of the receiver's advertised window when fast recovery is initiated.

2) When three packets are dropped from a window of data, the Reno sender is forced to wait for a retransmit timeout whenever the number of packets between the first and the second dropped packets is less than 2 + 3W/4, for W the congestion window just before the fast retransmit.

This situation is very likely: when three packets are lost in one window, a timeout occurs with high probability. Depending on the window size at the time of the loss, three losses can always end in a timeout after two rounds of fast retransmission and fast recovery (FR/FR), regardless of the order in which the packets were lost.
A timeout generally happens because the number of unacknowledged packets exceeds the usable cwnd, so no new data can be sent, and there are not enough duplicate ACKs in the network to trigger FR/FR.
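
The two quoted conditions translate directly into predicates. This is a sketch only: all sizes are in packets, the function names are hypothetical, and "within two packets of the receiver's advertised window" is read here as rwnd - cwnd <= 2, which is one interpretation of the wording.

```python
def reno_timeout_two_drops(cwnd, rwnd):
    """Condition 1): two packets dropped from one window.

    cwnd: congestion window when fast recovery is initiated.
    rwnd: receiver's advertised window.
    """
    return cwnd < 10 or (rwnd - cwnd) <= 2   # "<= 2" is an assumed reading


def reno_timeout_three_drops(gap, w):
    """Condition 2): three packets dropped from one window.

    gap: number of packets between the first and second dropped packets.
    w:   congestion window just before the fast retransmit.
    """
    return gap < 2 + 3 * w / 4


print(reno_timeout_two_drops(cwnd=8, rwnd=64))    # True: window below 10
print(reno_timeout_two_drops(cwnd=20, rwnd=64))   # False
print(reno_timeout_three_drops(gap=5, w=16))      # True: 5 < 14
```

For W = 16 the threshold in condition 2) is 14, so the second drop would have to be almost a full window after the first to avoid a timeout.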

 

Question 2: Why does cwnd increase when congestion occurs?

Answer: When the packet loss is detected, the window is cwnd, so the network holds at most cwnd packets (in_flight <= cwnd). Every duplicate ACK received indicates that one packet has left the network and reached the receiver, so the network can accommodate one more packet. Because the sliding window on the sender cannot advance, the way to keep in_flight steady is cwnd++.
This raises throughput during recovery. Even so, compared with the cwnd at the moment of the loss, the amount of data the sender can transmit has already been greatly reduced.
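
A small calculation shows how the inflation works out. It assumes the window is counted in segments, that a full window W was outstanding when the loss was detected, and that the inflated window follows the Reno steps (ssthresh + 3, then +1 per further duplicate ACK); the function name is hypothetical.

```python
def new_packets_allowed(w, extra_dupacks):
    """New packets Reno may send after `extra_dupacks` duplicate ACKs
    beyond the third, for a window of `w` segments before the loss."""
    ssthresh = max(2, w // 2)
    cwnd = ssthresh + 3 + extra_dupacks   # inflated window
    outstanding = w                       # sliding window has not moved
    return max(0, cwnd - outstanding)


print([new_packets_allowed(16, k) for k in (0, 5, 6, 12)])   # [0, 0, 1, 7]
```

With W = 16, nothing new can be sent until the inflated cwnd exceeds the 16 segments still counted as outstanding. If all 15 surviving packets produce duplicate ACKs (12 beyond the third), only 7 new packets go out in that round trip, roughly half of the original window.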

 

Performance Analysis

Tahoe has no fast recovery mechanism. After a packet loss it not only resends some data that was already delivered successfully, but its throughput during recovery is also low.
With the information carried in the SACK option, the sender knows in advance which packets have been lost. NewReno can recover only one lost packet per RTT, so if n packets are lost, fast recovery lasts n * RTT. When n is large, this is a long time. SACK has no such restriction: based on the SACK option it can recover several packets at once, so recovery is faster and smoother.
When multiple packets are lost in the same window, both SACK and NewReno recover quickly and smoothly, whereas Reno often runs into a timeout and then recovers through slow start. At that point Reno behaves like Tahoe and may retransmit data that has already been received. During Reno's recovery, problems such as low throughput, long recovery time, unnecessary retransmissions, and a low threshold after recovery seriously hurt performance.

 

Conclusion

After the above analysis, we can see that:

It is a fundamental consequence of the absence of SACK that the sender has to choose between the following strategies to recover from lost data:

(1) retransmitting at most one dropped packet per round-trip time

(2) retransmitting packets that might have already been successfully delivered.

 

Reno and NewReno use the first strategy, and Tahoe uses the second.
With SACK, a sender can avoid unnecessary delays and retransmissions, resulting in improved throughput.

 

SACK Deficiency

So much for the advantages of SACK. Now for its shortcomings.

For a large BDP network where a large number of packets are in flight, the processing overhead of SACK information at the end points can be quite overwhelming, because each SACK block invokes a search into the large packet buffers of the sender for the acked packets in the block, and every recovery of a lost packet causes the same search at the receiver.

In large-BDP networks this problem is especially pronounced, and the heavy CPU consumption can cause a series of further problems. To some extent it resembles a DoS attack, since every traversal consumes a lot of CPU. The time complexity is O(n^2), where n is the number of packets in flight.
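
The quadratic growth can be illustrated by counting buffer visits in a deliberately naive model. The assumptions are: the sender's packet buffer is a plain list searched linearly, each SACK block covers a single packet, and the blocks arrive in the worst order. Real TCP stacks use other data structures; the function name is hypothetical.

```python
def sack_scan_steps(n):
    """Buffer entries visited when each of n SACK blocks triggers a
    linear search of an n-packet send buffer."""
    buffer = list(range(n))
    steps = 0
    for target in reversed(buffer):   # one SACK block per packet
        for seq in buffer:            # linear search for the acked packet
            steps += 1
            if seq == target:
                break
    return steps


print(sack_scan_steps(100))    # 5050
print(sack_scan_steps(1000))   # 500500
```

Ten times as many packets in flight cost about a hundred times as many buffer visits, which is the O(n^2) behaviour described above.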

The system overload can cause serious problems: it can cause multiple timeouts (as even packet retransmission and receptions are delayed) and a long period of zero throughput.

Of course, this problem arises only in medium-BDP (100 ~ 1000) or large-BDP networks. Elsewhere there will be no major problems.

 

The author of this article is zhangskd. If it is reprinted, please indicate the source, THX.
