Libjingle source code analysis (6) - [pseudo TCP] TCP over UDP (4): timeout and retransmission

Source: Internet
Author: User

Timeout and retransmission

TCP is a reliable, connection-oriented transport protocol. When data is lost, TCP must retransmit it, and it detects the loss by setting a timer.

For each connection, TCP maintains four different timers:

1) Retransmission timer: used while waiting for an acknowledgment from the other end that has not yet arrived.

2) Persist timer: keeps window size information flowing.

3) Keepalive timer: detects when the other end of an idle connection has crashed or rebooted.

4) 2MSL timer: measures the time a connection has spent in the TIME_WAIT state.

ptcp itself does not run a timer. Instead, its GetNextClock method lets the caller find out when the next timeout is due, and the caller must call the NotifyClock method when that time arrives.
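The calling pattern described above can be sketched as follows. This is only an illustration: FakePtcp is a hypothetical stand-in rather than the libjingle class, and the method signatures, the single-timer behavior, and the DriveOnce helper are all assumptions made for the example.

```cpp
#include <cstdint>

// Hypothetical stand-in for a ptcp-like object. The signatures below are
// assumed for illustration and are not copied from libjingle.
struct FakePtcp {
    uint32_t next_deadline_ms;  // absolute time of the pending timeout, 0 = none
    int fired;                  // how many timeouts have been handled

    FakePtcp(uint32_t deadline_ms) : next_deadline_ms(deadline_ms), fired(0) {}

    // Tells the caller how long to wait before calling NotifyClock.
    // Returns false when no timer is pending.
    bool GetNextClock(uint32_t now_ms, long& timeout_ms) const {
        if (next_deadline_ms == 0) return false;
        timeout_ms = next_deadline_ms > now_ms
                         ? static_cast<long>(next_deadline_ms - now_ms)
                         : 0;
        return true;
    }

    // Called by the owner once the wait has elapsed.
    void NotifyClock(uint32_t now_ms) {
        if (next_deadline_ms != 0 && now_ms >= next_deadline_ms) {
            ++fired;
            next_deadline_ms = 0;  // this sketch arms a single timer only
        }
    }
};

// One step of the caller's loop: query the deadline, wait (simulated here by
// advancing the clock), then notify. Returns the updated clock value.
uint32_t DriveOnce(FakePtcp& tcp, uint32_t now_ms) {
    long timeout_ms = 0;
    if (!tcp.GetNextClock(now_ms, timeout_ms)) return now_ms;
    now_ms += static_cast<uint32_t>(timeout_ms);  // stand-in for a real wait
    tcp.NotifyClock(now_ms);
    return now_ms;
}
```

In a real program the simulated wait would be replaced by whatever timer facility the application already has, which is the point of leaving the clock to the caller.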

Timeout settings

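The timeout TCP uses while waiting for an ACK doubles with each successive retransmission, for example 1.5 s, 3 s, 6 s, 12 s, 24 s, 48 s, and then stays at an upper limit of 64 s. If the retransmissions keep timing out for more than 9 minutes, TCP gives up and resets the connection (RST). This doubling is called "exponential backoff".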
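As a small illustration of the doubling described above, the sketch below generates such a schedule. The function name BackoffSchedule and its parameters are made up for this example; the 1.5 s start and 64 s cap are the values quoted in the text.

```cpp
#include <algorithm>
#include <vector>

// Returns `count` successive retransmission timeouts in seconds: start at
// `initial`, double after every timeout, and never exceed `cap`.
std::vector<double> BackoffSchedule(double initial, double cap, int count) {
    std::vector<double> schedule;
    double rto = initial;
    for (int i = 0; i < count; ++i) {
        schedule.push_back(std::min(rto, cap));
        rto = std::min(rto * 2.0, cap);  // exponential backoff with a ceiling
    }
    return schedule;
}
```

Calling BackoffSchedule(1.5, 64.0, 8) yields 1.5, 3, 6, 12, 24, 48, 64, 64.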

How is the timeout calculated?

If the RTT can be estimated well, then a packet whose acknowledgment has not come back within roughly one RTT can be considered lost.

The original RTT estimation method for TCP is

$$R \leftarrow \alpha R + (1 - \alpha) M$$

where the smoothing factor $\alpha$ is 0.9 (90%) and $M$ is the RTT of the current measurement, that is, the interval between sending a packet and receiving its ACK.

The smoothing factor keeps a single new measurement M from swinging the value of R too far. For the same reason, however, the estimate cannot keep up on connections whose RTT fluctuates widely. In addition, when the network is already saturated, the resulting unnecessary retransmissions only add fuel to the fire. Jacobson designed a new algorithm to address this:

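$$\mathit{Err} = M - A$$

$$A \leftarrow A + g \cdot \mathit{Err}$$

$$D \leftarrow D + h \, (|\mathit{Err}| - D)$$

$$\mathit{RTO} = A + 4D$$

Here $A$ is the smoothed RTT estimate carried over from the previous step, $D$ is the smoothed mean deviation, and $\mathit{Err}$ is the difference between the new measurement $M$ and the current estimate $A$. The gain $g$ is 0.125 (1/8), and $h$ is 0.25.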
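The four update steps above can be sketched in a few lines. The struct name RttEstimator and the sample numbers are assumptions made for this illustration; the gains 1/8 and 1/4 are the ones given in the text.

```cpp
#include <cmath>

// Mean-and-deviation RTT estimator following the four formulas above.
struct RttEstimator {
    double a;  // smoothed RTT estimate (A)
    double d;  // smoothed mean deviation (D)

    // Feeds one measurement m and returns the new RTO = A + 4D.
    double Update(double m, double g = 0.125, double h = 0.25) {
        double err = m - a;             // Err = M - A
        a += g * err;                   // A = A + g * Err
        d += h * (std::fabs(err) - d);  // D = D + h * (|Err| - D)
        return a + 4.0 * d;
    }
};
```

Starting from A = 100 ms and D = 10 ms, a single 180 ms sample gives Err = 80, A = 110, D = 27.5, and an RTO of 220 ms. The deviation term lets the RTO react to the jump much faster than the smoothed RTT alone would.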

When the RTT changes sharply, Err grows as well, which makes D larger and in turn raises the RTO quickly.

[Figure: estimated value versus measured RTT on a connection]

The ptcp implementation is as follows:

ptcp caps the timeout at 60 s. When an ACK arrives, the RTT is computed from the timestamp difference carried in the ptcp header, so each measurement is unambiguous and the Karn algorithm is not needed here. The RTO algorithm is consistent with the one described above:

1) err = RTT - m_rx_srtt

2) m_rx_rttvar = m_rx_rttvar + 0.25 * (abs(err) - m_rx_rttvar)

3) m_rx_srtt = m_rx_srtt + err / 8

4) RTO = m_rx_srtt + 4 * m_rx_rttvar

The code below is written differently, but careful analysis shows that it is equivalent to the algorithm above.

bool PseudoTcp::process(Segment& seg) {
  ......
  // Check if this is a valuable ack
  if ((seg.ack > m_snd_una) && (seg.ack <= m_snd_nxt)) {
    // Calculate round-trip time
    if (seg.tsecr) {
      long rtt = talk_base::TimeDiff(now, seg.tsecr);  // calculate RTT
      if (rtt >= 0) {
        if (m_rx_srtt == 0) {
          // First measurement: initialize both estimates
          m_rx_srtt = rtt;
          m_rx_rttvar = rtt / 2;
        } else {
          // rttvar = rttvar + (|err| - rttvar) / 4
          m_rx_rttvar = (3 * m_rx_rttvar + abs(long(rtt - m_rx_srtt))) / 4;
          // srtt = srtt + err / 8
          m_rx_srtt = (7 * m_rx_srtt + rtt) / 8;
        }
        m_rx_rto = bound(MIN_RTO,
                         m_rx_srtt + talk_base::_max<uint32>(1, 4 * m_rx_rttvar),
                         MAX_RTO);
      } else {
        ASSERT(false);
      }
    }
    ......
}
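To see why the integer form matches the formulas, note that (7 * srtt + rtt) / 8 equals srtt + (rtt - srtt) / 8, and (3 * rttvar + |err|) / 4 equals rttvar + (|err| - rttvar) / 4. The standalone sketch below mirrors that update outside of ptcp so it can be tried in isolation. The names RttState, UpdateRto, kMinRto and kMaxRto are hypothetical; the 60 s maximum comes from the text, while the 250 ms minimum is an assumption made for this example.

```cpp
#include <algorithm>
#include <cstdint>

// Bounds in milliseconds. The 60 s cap is taken from the text; the 250 ms
// floor is an assumed value for this sketch.
const uint32_t kMinRto = 250;
const uint32_t kMaxRto = 60000;

struct RttState {
    uint32_t srtt = 0;       // smoothed RTT in ms, 0 means "no sample yet"
    uint32_t rttvar = 0;     // smoothed deviation in ms
    uint32_t rto = kMinRto;  // current retransmission timeout in ms
};

// Integer form of the update, using the same weights as the code above.
void UpdateRto(RttState& s, uint32_t rtt_ms) {
    if (s.srtt == 0) {
        s.srtt = rtt_ms;  // first sample initializes both estimates
        s.rttvar = rtt_ms / 2;
    } else {
        uint32_t abs_err = rtt_ms > s.srtt ? rtt_ms - s.srtt : s.srtt - rtt_ms;
        s.rttvar = (3 * s.rttvar + abs_err) / 4;  // rttvar + (|err| - rttvar) / 4
        s.srtt = (7 * s.srtt + rtt_ms) / 8;       // srtt + err / 8
    }
    uint32_t candidate = s.srtt + std::max<uint32_t>(1, 4 * s.rttvar);
    s.rto = std::min(std::max(candidate, kMinRto), kMaxRto);
}
```

With samples of 100 ms and then 200 ms, the state goes to srtt = 100, rttvar = 50, rto = 300, and then to srtt = 112, rttvar = 62, rto = 360 (integer division truncates).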

When a retransmitted segment times out as well, ptcp also applies the exponential backoff algorithm.

Congestion Avoidance Algorithm

The congestion avoidance algorithm is usually used together with the slow start algorithm, mainly to limit the sender's traffic. The purpose of slow start is to keep the sender from injecting data so quickly that an intermediate router's buffers fill up, while congestion avoidance is the way the sender handles lost packets once the network is found to be congested.

Congestion avoidance and slow start together maintain two variables for each connection, cwnd and ssthresh.

1) For a given connection, cwnd is initialized to 1 segment.

2) When congestion occurs (a timeout, or the third duplicate ACK), ssthresh is set to half of the current window. If the congestion was signaled by a timeout, cwnd is also set back to 1 segment.

3) When new data is acknowledged, slow start is performed if cwnd < ssthresh. Otherwise congestion avoidance is performed, and cwnd increases by 1/cwnd for each ACK.
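The three rules above can be sketched as a small state holder that counts in bytes. The struct name Congestion and its method names are invented for this example, and the lower bound of two segments on ssthresh is an added assumption that the rules above do not state.

```cpp
#include <algorithm>
#include <cstdint>

// Byte-based sketch of slow start and congestion avoidance.
struct Congestion {
    uint32_t mss;       // segment size in bytes
    uint32_t cwnd;      // congestion window in bytes
    uint32_t ssthresh;  // slow start threshold in bytes

    Congestion(uint32_t mss_bytes, uint32_t initial_ssthresh)
        : mss(mss_bytes), cwnd(mss_bytes), ssthresh(initial_ssthresh) {}

    // Rule 3: an ACK for new data arrived.
    void OnNewAck() {
        if (cwnd < ssthresh) {
            cwnd += mss;  // slow start: one segment per ACK
        } else {
            // congestion avoidance: about one segment per round trip
            cwnd += std::max<uint32_t>(1, mss * mss / cwnd);
        }
    }

    // Rule 2: congestion was detected by a timeout or by duplicate ACKs.
    void OnCongestion(bool timeout) {
        ssthresh = std::max(cwnd / 2, 2 * mss);  // floor of 2 MSS is assumed
        if (timeout) cwnd = mss;                 // restart from one segment
    }
};
```

With an MSS of 1000 bytes and ssthresh of 4000, three ACKs take cwnd from 1000 to 4000, the fourth adds only 250, and a timeout then sets ssthresh to 2125 and cwnd back to 1000.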

Fast retransmission and fast recovery Algorithms

Why are three or more duplicate ACKs treated as a sign of congestion?

When the receiver gets an out-of-order segment, it immediately sends an ACK for the next segment it expects, which the sender sees as a duplicate ACK. However, segments can arrive briefly out of order simply because they took different routes, and this usually produces only one or two duplicate ACKs. To avoid needless retransmissions in that case, the threshold for assuming loss is set at three.

When three or more duplicate ACKs are received, the sender considers the segment lost and retransmits it immediately instead of waiting for the retransmission timer to expire. This is the fast retransmission algorithm.

After the retransmission, the sender carries on with congestion avoidance instead of falling back to slow start. The reason is that each duplicate ACK is triggered by a segment reaching the receiver, so three or more duplicate ACKs mean the receiver has already received and buffered that many later segments, and data is still flowing between the two ends. This is the fast recovery algorithm, which works as follows:

1) When the third duplicate ACK is received, ssthresh is set to half of the current window, the missing segment is retransmitted, and cwnd is set to ssthresh plus 3 segments.

2) Each time another duplicate ACK arrives, cwnd grows by one segment, and a packet is sent if the new window allows it.

3) When the next ACK for new data arrives, cwnd is set to ssthresh, that is, the connection enters congestion avoidance at half the previous rate.

In ptcp, the first step sets ssthresh to half of the unacknowledged bytes when the third duplicate ACK arrives.

bool PseudoTcp::process(Segment& seg) {
  ......
  if ((seg.ack > m_snd_una) && (seg.ack <= m_snd_nxt)) {
    // A valid ACK for new data was received
    ......
    if (m_dup_acks >= 3) {
      // ACK arriving after a fast retransmission
      if (m_snd_una >= m_recover) {
        uint32 nInFlight = m_snd_nxt - m_snd_una;  // unacknowledged data
        // Leave recovery: cwnd drops back to at most ssthresh
        m_cwnd = talk_base::_min(m_ssthresh, nInFlight + m_mss);
        m_dup_acks = 0;  // reset the duplicate ACK counter
      } else {
        // Partial ACK: retransmit the next unacknowledged segment
        if (!transmit(m_slist.begin(), now)) {
          closedown(ECONNABORTED);
          return false;
        }
        m_cwnd += m_mss - talk_base::_min(nAcked, m_cwnd);
      }
    } else {
      m_dup_acks = 0;
      // Slow start, congestion avoidance
      if (m_cwnd < m_ssthresh) {
        m_cwnd += m_mss;  // slow start
      } else {
        // Congestion avoidance: grow by about 1/cwnd per ACK
        m_cwnd += talk_base::_max<uint32>(1, m_mss * m_mss / m_cwnd);
      }
    }
  } else if (seg.ack == m_snd_una) {
    // !?! Note, tcp says don't do this... but otherwise how does a closed window become open?
    m_snd_wnd = static_cast<uint32>(seg.wnd) << m_swnd_scale;

    // Check duplicate acks
    if (seg.len > 0) {
      // It's a dup ack, but with a data payload, so don't modify m_dup_acks
    } else if (m_snd_una != m_snd_nxt) {
      m_dup_acks += 1;
      if (m_dup_acks == 3) {  // fast retransmission
        if (!transmit(m_slist.begin(), now)) {
          closedown(ECONNABORTED);
          return false;
        }
        m_recover = m_snd_nxt;
        uint32 nInFlight = m_snd_nxt - m_snd_una;
        // ssthresh is half of the in-flight data, but at least two MSS
        m_ssthresh = talk_base::_max(nInFlight / 2, 2 * m_mss);
        m_cwnd = m_ssthresh + 3 * m_mss;  // cwnd is ssthresh plus 3 MSS
      } else if (m_dup_acks > 3) {
        // Each further duplicate ACK adds one MSS (fast recovery)
        m_cwnd += m_mss;
      }
    } else {
      m_dup_acks = 0;
    }
  }
  ......
}
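The window arithmetic of the three fast recovery steps can be tried in isolation with the simplified sketch below. It is not the ptcp code: the FastRecovery struct and its members are hypothetical, the retransmission is reduced to a counter, and partial ACK handling is deliberately left out.

```cpp
#include <algorithm>
#include <cstdint>

// Simplified model of fast retransmission and fast recovery (bytes).
struct FastRecovery {
    uint32_t mss;
    uint32_t cwnd;
    uint32_t ssthresh;
    uint32_t dup_acks;
    uint32_t retransmits;  // counts fast retransmissions instead of sending

    FastRecovery(uint32_t mss_bytes, uint32_t cwnd_bytes)
        : mss(mss_bytes), cwnd(cwnd_bytes), ssthresh(0xFFFFFFFFu),
          dup_acks(0), retransmits(0) {}

    // A duplicate ACK arrived while in_flight bytes are unacknowledged.
    void OnDupAck(uint32_t in_flight) {
        ++dup_acks;
        if (dup_acks == 3) {
            ++retransmits;  // step 1: resend the missing segment at once
            ssthresh = std::max(in_flight / 2, 2 * mss);
            cwnd = ssthresh + 3 * mss;
        } else if (dup_acks > 3) {
            cwnd += mss;    // step 2: inflate by one segment per extra dup ACK
        }
    }

    // An ACK for new data arrived.
    void OnNewAck() {
        if (dup_acks >= 3) cwnd = ssthresh;  // step 3: deflate to ssthresh
        dup_acks = 0;
    }
};
```

With an MSS of 1000 bytes and 10000 bytes in flight, the third duplicate ACK gives ssthresh = 5000 and cwnd = 8000, a fourth raises cwnd to 9000, and the next new ACK brings cwnd down to 5000.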

Repacketization

When TCP retransmits after a timeout, it does not have to resend the identical segment. It is allowed to send a larger segment, as long as it does not exceed the MSS, by merging in the data that follows. ptcp does the same.
