The pros and cons of ABC (Appropriate Byte Counting) in TCP congestion control

Source: Internet
Author: User
Tags rfc

Textbooks say that in the slow start phase TCP's congestion window grows exponentially per RTT, and that in the congestion avoidance phase it grows linearly, by 1 segment per RTT. That is the textbook version. Do not take it too literally, because the real situation is more complicated.
First, let us look at how most references say TCP achieves this per-RTT window growth. It is all theory and has little to do with reality.
In the slow start phase, the window increases by 1 for each ACK received (the time from sending a packet to receiving its ACK is one RTT). In the congestion avoidance phase, the window increases by 1 once a whole previous window of data has been ACKed, that is, by 1/cwnd for each ACK received. This is the ideal case. But what if ACKs are lost? What if delayed ACK is enabled on the receiving side?
According to the standard, a receiver may delay its ACK for at most 2*MSS of data; an ACK that covers more than that is a stretch ACK. So if the receiver has delayed ACK enabled and sends one ACK for every two packets received, the sender sees only half as many ACKs, and its window may grow at only about half the expected rate per RTT. Does that match the logic found in most references?

The standard does not stipulate that the window must grow by exactly 1 per RTT in the congestion avoidance phase. It only gives an approximate algorithm:
Traditionally, TCPs have approximated this increase by increasing cwnd by 1/cwnd for each arriving ACK. This algorithm opens cwnd by roughly 1 segment per RTT if the receiver ACKs each incoming segment and no ACK loss occurs. However, if the receiver implements delayed ACKs [Bra89], the receiver returns roughly half as many ACKs, which causes the sender to open cwnd more conservatively (by approximately 1 segment every second RTT).
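
The effect described in the quote can be checked with a small simulation. The following is a minimal sketch, not kernel code: the function name cwnd_after_rtts and its parameters are made up for illustration, the window is counted in whole segments, and the 1/cwnd increase is modelled with an integer ACK counter instead of fractions.

```c
#include <assert.h>

/*
 * Simulate congestion avoidance with per-ACK counting.
 * cwnd is in segments. segs_per_ack is 1 when the receiver ACKs every
 * segment and 2 when it uses delayed ACKs. Hypothetical helper.
 */
unsigned cwnd_after_rtts(unsigned cwnd, unsigned rtts, unsigned segs_per_ack)
{
    unsigned cnt = 0; /* ACKs counted toward the next +1 */

    assert(segs_per_ack > 0);

    for (unsigned r = 0; r < rtts; r++) {
        unsigned acks = cwnd / segs_per_ack; /* ACKs arriving in this RTT */

        for (unsigned a = 0; a < acks; a++) {
            cnt++;
            if (cnt >= cwnd) { /* a full window's worth of ACKs was seen */
                cwnd++;
                cnt = 0;
            }
        }
    }
    return cwnd;
}
```

Starting from a window of 10 segments, cwnd_after_rtts(10, 4, 1) gives 14 (one segment per RTT), while cwnd_after_rtts(10, 2, 2) gives only 11: with delayed ACKs it takes two RTTs to gain one segment.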

If TCP really did return one ACK per packet, then in most cases slow start would indeed grow the window exponentially and congestion avoidance would grow it linearly. But as noted above, once ACK loss or delayed ACKs at the receiver are taken into account, the textbook logic is inevitably distorted.
The TCP ACK mechanism is itself a form of feedback. In theory, the inference that "the more data is ACKed, the more can be sent" is always reasonable, so RFC 3465 proposed the ABC algorithm, which decides how much to grow the window from the number of bytes acknowledged by each received ACK.

1. Safe area and dangerous area
TCP is an end-to-end protocol with no bandwidth feedback, so its congestion control is based on probing, that is, on constantly testing the limit of the available bandwidth. No matter how good a congestion control algorithm is, in essence it is nothing more than this probing. From the point of view of congestion control, bandwidth probing ultimately comes down to probing the congestion window.
Today's TCP implementations basically continue the NewReno core. While probing the congestion window, TCP passes through two regions, a safe area and a dangerous area, separated by ssthresh. In the safe area it performs slow start with exponential window growth, and in the dangerous area it performs congestion avoidance with linear window growth. ssthresh is in effect the upper bound of the safe window, and can also be understood as a conservative estimate of the window that fills the bandwidth. Since the region below it is safe, the window can grow as fast as possible there: in theory every ACK (which means the network path is clear) can increase the window by 1, without waiting for a whole window of data to be ACKed. After crossing ssthresh, TCP assumes it may exceed the capacity of the network at any moment, so it increases the window only after a whole window of data has been ACKed.
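
The two regions can be sketched as a per-ACK update rule. This is an illustrative model only, assuming one ACK per segment and a window counted in segments; the names reno_state, reno_on_ack and reno_one_rtt are invented for this example and are not kernel identifiers.

```c
#include <assert.h>

struct reno_state {
    unsigned cwnd;     /* congestion window, in segments */
    unsigned ssthresh; /* boundary between safe and dangerous area */
    unsigned cnt;      /* ACKs counted while in the dangerous area */
};

/* Process one ACK that acknowledges exactly one segment. */
void reno_on_ack(struct reno_state *s)
{
    assert(s->cwnd > 0);

    if (s->cwnd < s->ssthresh) {
        s->cwnd++;     /* safe area: +1 for every ACK */
    } else if (++s->cnt >= s->cwnd) {
        s->cwnd++;     /* dangerous area: +1 per full window of ACKs */
        s->cnt = 0;
    }
}

/* Deliver one RTT worth of ACKs: one for each segment in the window. */
void reno_one_rtt(struct reno_state *s)
{
    unsigned acks = s->cwnd;

    for (unsigned i = 0; i < acks; i++)
        reno_on_ack(s);
}
```

With cwnd = 4 and ssthresh = 16, three calls to reno_one_rtt give windows of 8, 16 and 17: doubling per RTT below ssthresh, then one segment per RTT above it.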

2. How to use the feedback of the ACK
In the description of the safe and dangerous areas above, all window growth is driven by ACK feedback. RFC 2581 recommends using the number of ACKs as the signal for growing the window. To cope with ACK loss and delayed ACKs, RFC 3465 gives the ABC algorithm, which uses the number of bytes acknowledged, rather than the number of ACKs, as the feedback signal. With ABC the window grows as follows:
1. Slow start phase: whenever the number of ACKed data bytes reaches one MSS, the window grows by one MSS;
2. Congestion avoidance phase: whenever the number of ACKed data bytes reaches the size of one window, the window grows by one MSS.

In addition, ABC can make TCP more aggressive in the slow start phase: when a single ACK acknowledges several MSS of data, the window may grow by up to N MSS for that ACK instead of 1 (RFC 3465 calls this limit L). Since this happens in the safe area, being aggressive does little harm.
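
The two rules and the limit N can be written down in a few lines. The following is a minimal sketch under stated assumptions: the window is kept in bytes, the names abc_state and abc_on_ack are hypothetical, and details such as the behaviour after a retransmission timeout are left out.

```c
#include <assert.h>
#include <stdint.h>

struct abc_state {
    uint32_t cwnd;        /* congestion window, in bytes */
    uint32_t ssthresh;    /* slow start threshold, in bytes */
    uint32_t bytes_acked; /* bytes ACKed since the last increase (congestion avoidance) */
    uint32_t mss;         /* sender maximum segment size, in bytes */
};

/*
 * Apply one ACK that newly acknowledges `acked` bytes.
 * `limit` is N in the text above (L in RFC 3465), in segments.
 */
void abc_on_ack(struct abc_state *s, uint32_t acked, uint32_t limit)
{
    assert(s->mss > 0);

    if (s->cwnd < s->ssthresh) {
        /* Slow start: grow by the bytes ACKed, capped at limit * MSS. */
        uint32_t cap = limit * s->mss;

        s->cwnd += acked < cap ? acked : cap;
    } else {
        /* Congestion avoidance: one MSS per full window of ACKed bytes. */
        s->bytes_acked += acked;
        if (s->bytes_acked >= s->cwnd) {
            s->bytes_acked -= s->cwnd;
            s->cwnd += s->mss;
        }
    }
}
```

For example, with an MSS of 1000 bytes, a window of 4000 and ssthresh of 8000, an ACK covering 5000 bytes in slow start adds only 2000 bytes when the limit is 2.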

3. ABC and bursts
ABC seems to solve the slow window growth caused by delayed ACKs and ACK loss, but it introduces a burst problem. Whether this burstiness is an advantage or a drawback is not clear-cut. Essentially, ABC causes bursts because it "remembers" the late or missing acknowledgements and accumulates them for later use. This is very similar to the token bucket used for burst control in network traffic shaping, where tokens can be saved up. In TCP, every acknowledgement of one MSS of data, whether explicit or implicit, is equivalent to a token that can be accumulated.
1. Bursts caused by anomalies
Suppose ACKs are lost on a large scale. For the sender, ACKs arrive less often, so the window stays frozen for a long time because no ACKs are received. Once an ACK does arrive, the sender finds that it acknowledges a large (even huge) amount of data, and the window is suddenly compensated and may grow a lot. This kind of burst can be more serious in the slow start phase, where the window grows by up to N MSS (depending on configuration) per ACK. In the congestion avoidance phase the effect is much milder.
When congestion is one-way, the burst caused by this anomaly is not a big problem. When congestion is two-way, the sender's aggressive window growth causes more drops, and because the sender cannot distinguish ACK loss from delayed ACKs, it will make many misjudgments. Fortunately, the TCP specification defines the stretch ACK, which can give the sender a hint: for example, on receiving an ACK that acknowledges more than 2 MSS of contiguous data, it can conclude that ACKs were lost and grow the window conservatively.
To make sure the problem above does not have serious consequences, RFC 3465 caps N at 2 in the slow start phase: however much data a single ACK acknowledges, the window grows by at most 2 MSS for that ACK.
2. Normal bursts
Compared with the abnormal burst, the normal burst looks more like a continuously variable transmission driven by positive feedback. Simply put, if we set aside delayed ACKs, ACK loss and bugs in the receiver implementation, ABC runs quite smoothly: the more data an ACK covers, the faster the window grows. Without considering congestion, this means that the more data you send, the faster the window increases.
RFC 3465 gives a good example. Suppose you log in over SSH for interactive work. Most of the time only small pieces of data are exchanged, and the window grows slowly (ABC decides whether and how much to grow the window from the number of bytes ACKed). If you then display a large file in the terminal, the amount of data jumps almost instantly. If the window grows according to RFC 2581, its growth follows exactly the number of ACKs, so it has a maximum speed. With ABC, as large chunks of data are sent and acknowledged, the window growth rate increases. In this case the rate at which ACKs arrive is constant, but the number of bytes covered by each ACK increases, so the window grows faster.
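
The difference between the two signals can be made concrete by feeding the same number of ACKs to both schemes. This is a rough sketch for the slow start phase only; the function names and the byte values are invented for illustration, and the window growth is returned in bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* ACK counting: every ACK adds one MSS, whatever amount of data it covers. */
uint32_t growth_ack_counting(const uint32_t *acked, size_t n, uint32_t mss)
{
    (void)acked; /* the bytes covered are ignored on purpose */
    return (uint32_t)n * mss;
}

/* Byte counting: every ACK adds the bytes it covers, capped at limit * MSS. */
uint32_t growth_byte_counting(const uint32_t *acked, size_t n,
                              uint32_t mss, uint32_t limit)
{
    uint32_t total = 0;
    uint32_t cap = limit * mss;

    assert(mss > 0);

    for (size_t i = 0; i < n; i++)
        total += acked[i] < cap ? acked[i] : cap;
    return total;
}
```

With an MSS of 1460, four ACKs of 100 bytes each (keystrokes) and four ACKs of 2920 bytes each (bulk output) both yield 5840 bytes of growth under ACK counting. Under byte counting with a limit of 2 they yield 400 and 11680 bytes respectively: the ACK rate is the same, but the growth follows the bytes covered. A single stretch ACK covering 14600 bytes still adds only 2920 bytes because of the limit.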

3. Crossing ssthresh
In the slow start phase the window grows exponentially. If ssthresh is large, the window may already be very large when it approaches ssthresh, and the last slow start increase will then very likely push the window far beyond ssthresh, past the capacity of the network, and cause packet loss. In that case ssthresh does not act as a threshold at all, and when N is 2 in the ABC algorithm the problem is even more serious.
This problem occurs because TCP has no finer control over window growth near ssthresh. But it is no longer an issue: Linux 4.x+ has fixed it cleanly (I do not know which version introduced the fix, but I am sure 3.10 does not have it and 4.3 does).
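
A small numeric sketch shows the overshoot and one way to avoid it. The helper names are hypothetical and the window is counted in segments; the clamped variant follows the general idea of stopping at ssthresh and carrying the remainder over to congestion avoidance, and is not a copy of any kernel function.

```c
#include <assert.h>
#include <stdint.h>

/* Unclamped slow start: add all ACKed segments and ignore ssthresh. */
uint32_t slow_start_unclamped(uint32_t cwnd, uint32_t acked)
{
    return cwnd + acked;
}

/*
 * Clamped slow start: stop at ssthresh and return the leftover credit,
 * which the caller can hand to congestion avoidance.
 * Assumes *cwnd <= ssthresh on entry.
 */
uint32_t slow_start_clamped(uint32_t *cwnd, uint32_t ssthresh, uint32_t acked)
{
    assert(*cwnd <= ssthresh);

    uint32_t target = *cwnd + acked;
    uint32_t next = target < ssthresh ? target : ssthresh;
    uint32_t leftover = acked - (next - *cwnd);

    *cwnd = next;
    return leftover;
}
```

With a window of 14, ssthresh of 16 and 6 newly ACKed segments, the unclamped version jumps to 20, while the clamped version stops at 16 and returns 4 segments of leftover credit.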

4. ABC and the Linux TCP implementation
The Linux implementation of the ABC algorithm has gone through three phases.
Phase one: ABC as a sysctl option
Taking the 2.6.32 kernel as an example, Linux has a sysctl_tcp_abc option that selects whether the ABC algorithm is used. If ABC is enabled, the number of bytes acknowledged by an ACK packet is saved in the bytes_acked field, and TCP uses bytes_acked in the congestion avoidance phase:
    if (tp->bytes_acked >= tp->snd_cwnd*tp->mss_cache) {
        tp->bytes_acked -= tp->snd_cwnd*tp->mss_cache;
        if (tp->snd_cwnd < tp->snd_cwnd_clamp)
            tp->snd_cwnd++;
    }
bytes_acked is updated when an ACK is received:
    if (icsk->icsk_ca_state < TCP_CA_CWR)
        tp->bytes_acked += ack - prior_snd_una;
    else if (icsk->icsk_ca_state == TCP_CA_Loss)
        /* we assume just one segment left network */
        tp->bytes_acked += min(ack - prior_snd_una,
                               tp->mss_cache);

However, if ABC is not used, the following logic is executed for each ACK received:
    if (tp->snd_cwnd_cnt >= tp->snd_cwnd) {
        if (tp->snd_cwnd < tp->snd_cwnd_clamp)
            tp->snd_cwnd++;
        tp->snd_cwnd_cnt = 0;
    } else {
        // as you can see, without ABC the count is the number of ACKs
        tp->snd_cwnd_cnt++;
    }
In the slow start phase:
    if (sysctl_tcp_abc > 1 && tp->bytes_acked >= 2*tp->mss_cache)
        cnt <<= 1;    // the window may grow twice as fast
    // in slow start the increase depends only on how many MSS one ACK covers,
    // so bytes_acked has to be cleared
    tp->bytes_acked = 0;
    tp->snd_cwnd_cnt += cnt;
    // note: the loop below can overshoot ssthresh!
    while (tp->snd_cwnd_cnt >= tp->snd_cwnd) {
        tp->snd_cwnd_cnt -= tp->snd_cwnd;
        if (tp->snd_cwnd < tp->snd_cwnd_clamp)
            tp->snd_cwnd++;
    }

Phase two: ABC left to the congestion module
Taking 3.10 as an example, you will find that the tcp_abc option is gone. The code no longer keeps a bytes_acked count, and in the congestion avoidance phase there is only the tcp_cong_avoid_ai(tp, tp->snd_cwnd) logic, which counts the number of ACKs.
So what if we want ABC? We have to implement it ourselves in our own congestion control module. The amount of data acknowledged by each ACK (that is, bytes_acked) can be obtained through the pkts_acked callback, which is called after the acknowledged packets have been cleaned out of the transmission queue.
Phase three: ABC built in and optimized
In the Linux 4.4 kernel there is still no tcp_abc option, but if you look at the code you find that the ABC algorithm has basically been implemented in full:
    void tcp_reno_cong_avoid(struct sock *sk, u32 ack, u32 acked)
    {
        struct tcp_sock *tp = tcp_sk(sk);

        if (!tcp_is_cwnd_limited(sk))
            return;

        /* In "safe" area, increase. */
        if (tcp_in_slow_start(tp)) {
            // tcp_slow_start has a return value: how much of the ACKed data is
            // left over when slow start ends, to be counted toward the
            // congestion avoidance increase
            acked = tcp_slow_start(tp, acked);
            // a return value of 0 means the window has not yet crossed ssthresh;
            // the new version handles the ssthresh crossing problem neatly
            if (!acked)
                return;
        }
        /* In dangerous area, increase slowly. */
        // the acked parameter is passed in and counted toward the increase condition
        tcp_cong_avoid_ai(tp, tp->snd_cwnd, acked);
    }
If we look at tcp_slow_start, we find it unusually concise:
    u32 tcp_slow_start(struct tcp_sock *tp, u32 acked)
    {
        // in the slow start phase the window grows at most up to ssthresh
        // acked is added to the window; this is not the standard ABC algorithm
        // (the N limit is not implemented), but the idea is basically the same
        u32 cwnd = min(tp->snd_cwnd + acked, tp->snd_ssthresh);

        // whatever would cross ssthresh is left for congestion avoidance to handle
        acked -= cwnd - tp->snd_cwnd;
        tp->snd_cwnd = min(cwnd, tp->snd_cwnd_clamp);

        return acked;
    }
The comment on tcp_cong_avoid_ai also has something to say:
    /* In theory this is tp->snd_cwnd += 1 / tp->snd_cwnd (or alternative w),
     * for every packet ACKed.    // this second sentence is new
     */
    void tcp_cong_avoid_ai(struct tcp_sock *tp, u32 w, u32 acked)
    {
        /* If credits accumulated at a higher w, apply them gently now. */
        if (tp->snd_cwnd_cnt >= w) {
            tp->snd_cwnd_cnt = 0;
            tp->snd_cwnd++;
        }
        // accumulate the amount of acknowledged data instead of simply
        // adding 1 for every ACK received
        tp->snd_cwnd_cnt += acked;
        ...
    }

TCP is a very complex protocol, and its Linux implementation has changed enormously. From 2.6.8 to 4.4 you will find that many details, and even some basic ideas, have changed. If you do not understand the ideas explained in the RFCs, it is hard to understand why the code is the way it is. The RFCs are the real foundation.
