How to make UDP reliable

Source: Internet
Author: User
Tags: ack, quic

Recent conversations with many friends in the real-time audio and video field have touched on RUDP (Reliable UDP), a perennial topic. RUDP is used in many well-known projects, such as Google's QUIC and WebRTC. It adds a reliability layer on top of UDP; some friends think this is an inherently unreliable thing to do, while others see it as a killer tool that can solve most problems in the real-time domain. As an education company, we really do use RUDP technology in many real-time scenarios to solve our problems, and we adopt a different RUDP design for each scenario. Let's look at the scenarios where we use RUDP:

1) Global real-time 1v1 tutoring with 250 ms latency, using an RUDP + multipoint-relay intelligent routing scheme.

2) A 500 ms 1080P interactive video system, using an RUDP + proxy scheduling transmission scheme.

3) A 6-party real-time synchronized writing system, using RUDP + redo-log reliable transmission.

4) A 720P pad screen-mirroring system over weak WiFi, using RUDP + GCC real-time flow control.

5) A multipoint distribution system for large-scale live broadcasts, which saves more than 75% of the distribution bandwidth through RUDP + multipoint parallel relay.

Whenever real-time transmission is involved, we consider RUDP first; RUDP is applied in every part of our core transmission systems, but we design it differently for different system scenarios. Based on those heated discussions and our own experience, let me dissect RUDP here. In fact, there is a triangular balance in the field of real-time communication: the mutual constraints of cost, quality and latency (Figure 1).


Figure 1

That is, the cost invested, the quality obtained and the communication latency form a triangular constraint relationship, so the designer of a real-time communication system must find a balance point among these three constraints: TCP guarantees communication quality by paying more in latency and transmission cost, while plain UDP guarantees latency and cost by sacrificing quality. In some specific scenarios, RUDP can find a better balance. To see how RUDP finds that balance point, we first need to analyze RUDP's concept of reliability and its usage scenarios.

The concept of reliability

In real-time communication, different scenarios demand different kinds of reliability; broadly, we can summarize them into three categories:

• Best-effort reliability: the receiver wants the sender's data to arrive as completely as possible, but the business itself can tolerate some loss. Examples: audio and video data, idempotent state data.

• Unordered reliability: the receiver requires the sender's data to arrive completely, but it need not arrive in order. Examples: file transfer, whiteboard writing, real-time graphics plotting, log-style append data, etc.

• Ordered reliability: the receiver requires the sender's data to arrive completely and in order.

RUDP determines its communication model and mechanisms based on these three categories of requirements and the triangular constraints of Figure 1; that is, it finds its own balance point for communication.

Why make UDP reliable

At this point many people will say, "Why bother, just use TCP?" Indeed many people do, and TCP is a reliable communication protocol built on fairness; but under some harsh network conditions, TCP either cannot provide normal communication quality or does so at too high a cost. The reason to build reliability on top of UDP is to reduce cost while guaranteeing communication latency and quality. RUDP mainly solves the following problems:

• End-to-end connectivity: direct terminal-to-terminal communication generally involves NAT traversal, which is very difficult to implement with TCP but much simpler with UDP, so reliable end-to-end communication is generally solved with RUDP. Scenarios: end-to-end file transfer, audio and video transmission, interactive command transfer, and so on.

• Transmission over weak networks: on some WiFi or 3G/4G mobile networks, low-latency reliable communication is needed, and TCP's latency there can be very large, hurting the user experience. Examples: command traffic in real-time online games, voice dialogue, multi-party whiteboard writing; such scenarios can use a purpose-built RUDP to solve the problem.

• Bandwidth competition: sometimes a client's upload needs to break through TCP's fairness limits to achieve high speed, low latency and stability, that is, to use a special flow-control algorithm to squeeze out the client's upload bandwidth, for example when pushing a live video stream. In such scenarios, RUDP not only squeezes out bandwidth but also improves communication stability, avoiding the frequent disconnections seen with TCP.

• Transmission path optimization: some latency-sensitive scenarios use application-layer relays to optimize the transmission route, i.e. dynamic intelligent path selection; the two endpoints then transmit over RUDP, and the intermediate relays optimize the route for latency. There is also a class of throughput-oriented scenarios, such as data distribution and data backup between services, which generally use multipoint parallel relays to raise transmission speed and are likewise built on RUDP (both points are described in detail below).

• Resource optimization: some scenarios use RUDP to avoid TCP's three-way handshake and four-way teardown, optimizing resource usage and response time and improving the system's concurrency, for example QUIC.

Whatever the scenario, the goal is to guarantee reliability, that is, quality. So how is reliability achieved on top of UDP? The answer is retransmission.

Retransmission Modes

The IP protocol was not designed for reliable data delivery, so reliability over UDP depends on retransmission, which is RUDP behavior in the usual sense. Before describing RUDP retransmission, let's first understand the basic RUDP framework, as shown:

Figure 2

RUDP is divided into a sending end and a receiving end; each RUDP design makes its own choices and simplifications among the units in the diagram. RUDP retransmission means the sender retransmits data based on ACK feedback from the receiver, and it designs its retransmission mode according to the scenario. There are three modes: timed retransmission, request retransmission, and FEC selective retransmission.

• Timed retransmission

Timed retransmission is easy to understand: if the sender has not received the ACK for a packet within one RTO of sending it (at time T1), it retransmits the packet. This approach depends on the receiver's ACKs and on the RTO, and it is prone to spurious retransmissions, mainly in two cases:

- The peer received the packet, but the ACK was lost on the way.

- The ACK is still in transit, but the sender's wait has already exceeded one RTO.

So the focus of timeout retransmission is the calculation of the RTO. If your scenario is latency-sensitive but traffic cost is not a concern, you can make the RTO relatively small to keep latency as low as possible. Examples: command traffic in real-time online games and writing synchronization in education are typical cases of trading cost for latency and quality, suited to small-bandwidth, low-latency transmission. For large-bandwidth real-time transmission, timed retransmission consumes a great deal of bandwidth; in extreme cases the duplicate-retransmission rate reaches 20%, so in large-bandwidth mode the request-retransmission mode is generally used instead.
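As a rough sketch of timed retransmission, assuming a fixed RTO (class and method names are illustrative, not from any particular RUDP implementation):

```python
import time

class TimedRetransmitter:
    """Minimal timer-based retransmission: resend any packet whose ACK
    has not arrived within one RTO of its last send."""

    def __init__(self, rto):
        self.rto = rto                 # retransmission timeout, seconds
        self.inflight = {}             # seq -> timestamp of last send

    def on_send(self, seq, now=None):
        self.inflight[seq] = now if now is not None else time.time()

    def on_ack(self, seq):
        self.inflight.pop(seq, None)   # ACK arrived, stop tracking

    def due_for_retransmit(self, now=None):
        """Return packets whose RTO expired, restarting their timers."""
        now = now if now is not None else time.time()
        expired = [s for s, t in self.inflight.items() if now - t >= self.rto]
        for seq in expired:
            self.inflight[seq] = now   # resend restarts the timer
        return expired
```

A real implementation would drive `due_for_retransmit` from a timer wheel or event loop and feed the RTO from the RTT estimator described later.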

• Request retransmission

With request retransmission, the receiver piggybacks its own loss information on the ACKs it sends, and the sender retransmits based on the loss feedback carried in each ACK, as in the following figure:

Figure 3

The most critical step in this feedback loop is deciding when an ACK should carry loss information. Because UDP packets are reordered and jittered in transit, the receiver evaluates the network jitter time during communication, i.e. rtt_var (the RTT variance). When a gap is found, the moment T1 is recorded; once T1 + rtt_var < curr_t (the current time), the packet is considered lost, subsequent ACKs carry this loss information, and the loss time is updated to T2. The loss queue is then scanned continuously: if T2 + RTO < curr_t, the ACK carries the loss information again, and so on until the packet finally arrives.

This mode is loss-driven: if the network is bad, the receiver keeps initiating retransmission requests, the sender keeps retransmitting, a network storm arises, and communication quality falls. Therefore a congestion control module is designed on the sender to limit the flow; we analyze this in detail later. Besides the network-storm issue, the whole request-retransmission mechanism depends on two timing parameters, the jitter time (rtt_var) and the RTO; how they are evaluated and adjusted is closely tied to the transmission scenario. Request retransmission has higher latency than timed retransmission and generally suits large-bandwidth scenarios such as video, file transfer, and data synchronization.
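The receiver-side bookkeeping just described (T1 + rtt_var promotes a gap to a loss; T2 + RTO re-reports it) can be sketched as follows; class and method names are illustrative:

```python
class LossDetector:
    """Receiver-side loss bookkeeping for request retransmission: a gap
    becomes a reportable loss after rtt_var, and a reported loss is
    re-reported every RTO until the packet arrives."""

    def __init__(self, rtt_var, rto):
        self.rtt_var = rtt_var
        self.rto = rto
        self.gap_seen = {}     # seq -> time the gap was first noticed (T1)
        self.lost = {}         # seq -> time of the last loss report (T2)

    def note_gap(self, seq, now):
        self.gap_seen.setdefault(seq, now)

    def packet_arrived(self, seq):
        self.gap_seen.pop(seq, None)
        self.lost.pop(seq, None)

    def losses_to_report(self, now):
        """Sequence numbers the next ACK should carry as loss feedback."""
        report = []
        # promote gaps older than rtt_var to confirmed losses
        for seq, t1 in list(self.gap_seen.items()):
            if t1 + self.rtt_var < now:
                del self.gap_seen[seq]
                self.lost[seq] = now
                report.append(seq)
        # re-report losses whose last report is older than one RTO
        for seq, t2 in self.lost.items():
            if seq not in report and t2 + self.rto < now:
                self.lost[seq] = now
                report.append(seq)
        return sorted(report)
```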

• FEC selective retransmission

Besides timed retransmission and request retransmission, there is also selective retransmission with FEC grouping. FEC (Forward Error Correction) is a forward error-correction technique, usually implemented with an XOR-like algorithm; there are also multi-layer EC algorithms and Raptor fountain codes, which are essentially processes of solving a system of equations. The schematic as applied to RUDP is as follows:

Figure 4

When sending, the sender groups several packets into an FEC group according to the FEC scheme, derives a number of redundant packets via XOR, and sends them all to the receiver. If the receiver finds losses that the FEC group algorithm can recover, it does not send a retransmission request to the sender; if losses within the group are not FEC-recoverable, it requests the original packets from the sender. FEC grouping suits latency-sensitive scenarios with random packet loss. Where bandwidth is not abundant, however, the redundant packets FEC adds may make the network worse. FEC can be combined with the request-retransmission mode as well as with the timed-retransmission mode.
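A single-parity XOR group, the simplest form of the FEC described above, can be sketched like this (one redundant packet per group, so at most one loss per group is recoverable; function names are illustrative):

```python
def xor_parity(packets):
    """Build one XOR redundancy packet over an FEC group of
    equal-length packets."""
    parity = bytearray(len(packets[0]))
    for p in packets:
        for i, b in enumerate(p):
            parity[i] ^= b
    return bytes(parity)

def recover_missing(received, parity):
    """Recover the single missing packet of a group by XOR-ing the
    parity packet with every packet that did arrive."""
    missing = bytearray(parity)
    for p in received:
        for i, b in enumerate(p):
            missing[i] ^= b
    return bytes(missing)
```

With two or more losses in the same group, recovery fails and the receiver falls back to requesting the original packets, exactly as the text describes.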

• Calculating RTT and RTO

The retransmission modes above mentioned RTT and RTO several times. RTT (round-trip time) is the network round-trip delay, computed from sending a packet and receiving its ACK, as illustrated:

Figure 5

RTT = T2 - T1

This computes the RTT only at a single packet's moment, but the network fluctuates and noise is inevitable, so a weighted moving average is introduced (see RFC 793):

SRTT = α × SRTT + (1 - α) × RTT

This yields the new smoothed SRTT; in the formula, typically α = 0.8. Having determined SRTT, the next step is to compute the variance. We set RTT_var = |SRTT - RTT|,

then SRTT_var = α × SRTT_var + (1 - α) × RTT_var

This gives the smoothed RTT_var. But what we ultimately need is the RTO, since it governs packet retransmission: the RTO is a packet's retransmission period. From the communication flow it is easy to see that if, after sending a packet, no acknowledgement has arrived within RTT + RTT_var, we can retransmit, so we have:

RTO = SRTT + SRTT_var

However, a network with severe jitter would still show a high duplicate-retransmission rate, so:

RTO = β × (SRTT + SRTT_var)

where 1.2 < β < 2.0, and the value of β can be chosen for the transmission scenario.
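Putting the formulas above together, a minimal estimator looks like this (class and variable names are illustrative):

```python
ALPHA = 0.8   # smoothing factor from the text (per RFC 793)
BETA = 1.5    # 1.2 < beta < 2.0, chosen per transmission scenario

class RttEstimator:
    """Rolls the formulas above into one updater: smoothed RTT (SRTT),
    smoothed RTT variance (SRTT_var), and the derived RTO."""

    def __init__(self, first_rtt):
        self.srtt = first_rtt
        self.srtt_var = 0.0

    def update(self, rtt):
        rtt_var = abs(self.srtt - rtt)                          # |SRTT - RTT|
        self.srtt = ALPHA * self.srtt + (1 - ALPHA) * rtt       # new SRTT
        self.srtt_var = ALPHA * self.srtt_var + (1 - ALPHA) * rtt_var
        return self.rto()

    def rto(self):
        return BETA * (self.srtt + self.srtt_var)
```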

RUDP guarantees reliability through retransmission. In the triangular balance, retransmission is an act of trading cost and latency for quality, so it raises two problems: one is latency, the other is retransmission bandwidth. The latter in particular, if poorly controlled, triggers network storms, so a window and congestion mechanism is designed on the sender to avoid excessive concurrent bandwidth consumption.

Window and congestion control

• Window

RUDP needs a system of send and receive sliding windows, working with a matching congestion algorithm, to do flow control. Some RUDP designs require the send and receive windows to correspond strictly; others do not. If ordered-reliable RUDP is involved, the receiver sorts and buffers inside its window; in unordered-reliable or best-effort scenarios, the receiver generally does no window buffering and only slides its position. Let's look at the send and receive window diagram:

Figure 6

The figure above shows the sender sending 6 packets from its send window. On receiving 101, 102, 103 and 106, the receiver first checks the continuity of the packets, slides the window start to 103, and then ACKs each packet. On receiving the ACKs, the sender likewise confirms continuity and slides its window to 103, then checks the window for free space and fills it with new data to send; this is the whole window-sliding process. Worth mentioning is the handling when 106 arrives: in the ordered-reliable case, 106 is not delivered to the upper-layer business but waits for 104 and 105; in the best-effort and unordered-reliable cases, 106 is delivered to the upper layer first. How far the sender slides its window after an ACK is decided by its congestion mechanism; that is, the window's sliding speed is controlled by congestion control. Congestion control is implemented either on packet-loss rate or on the delay between the two ends; let's look at a few typical schemes.
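The receiver-side sliding described for Figure 6 can be sketched as follows for the ordered-reliable case (names are illustrative):

```python
class ReceiveWindow:
    """Receiver-side window sketch for the ordered-reliable case:
    out-of-order packets are buffered, and the window slides (and
    delivers to the upper layer) only over a contiguous run."""

    def __init__(self, base):
        self.base = base       # lowest sequence number not yet delivered
        self.buffer = {}       # seq -> payload, held until contiguous

    def on_packet(self, seq, payload):
        delivered = []
        if seq < self.base:
            return delivered   # duplicate of already-delivered data
        self.buffer[seq] = payload
        # slide over every contiguous packet, delivering as we go
        while self.base in self.buffer:
            delivered.append(self.buffer.pop(self.base))
            self.base += 1
        return delivered
```

With the packets of Figure 6, packet 106 sits in the buffer and is delivered only after 104 and 105 fill the gap; in the best-effort or unordered-reliable cases, `on_packet` would instead deliver immediately and only track `base`.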

• Classical congestion algorithm

TCP's classic congestion algorithm has four parts: slow start, congestion avoidance, congestion handling, and fast recovery. All four exist to determine the send window and send rate; in essence they probe the congestion state of the current network through packet loss and thereby determine a suitable transmission window. The classical algorithm is built on timed retransmission; if RUDP adopts it for congestion control, the typical scenario is ordered-reliable transmission that also respects the fairness of network transmission. Let's explain the parts one by one.

- Slow start

When a connection has just been established, cwnd cannot be set large at the outset, since that easily causes heavy retransmission. The classic algorithm therefore starts with cwnd = 1 and gradually enlarges cwnd during loss-free communication to adapt to the current network state, until the slow-start threshold (ssthresh) is reached. The steps are as follows:

1) Initialize cwnd = 1 and start transmitting data.

2) For each ACK received, increment cwnd by 1.

3) When an RTT passes on the sender with no packet retransmission detected, cwnd = cwnd × 2.

4) When cwnd >= ssthresh, or loss retransmission occurs, slow start ends and the congestion-avoidance state is entered.

- Congestion avoidance

When slow start ends, the transmission speed may still not match the network, and a further, slower adjustment process is needed to adapt: generally, after each RTT in which no loss is found, cwnd = cwnd + 1. Once packet loss and timeout retransmission are found, the congestion-handling state is entered.

- Congestion handling

Congestion handling in TCP is very blunt: on loss retransmission, cwnd = cwnd / 2 directly, and then the fast-recovery state is entered.

- Fast recovery

Fast recovery decides whether to act by confirming that the loss occurred at a single position in the window. As described for Figure 6, if only 104 is lost while 105 and 106 are received, the ACKs will keep carrying base = 103. If an ACK with base 103 is received 3 times in a row, fast recovery is performed: 104 is retransmitted immediately; then, once a new ACK with base > 103 is received, cwnd = cwnd + 1 and the congestion-avoidance state is entered.
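The duplicate-ACK trigger for fast recovery can be sketched like this (a simplification of the description above; class and field names are illustrative):

```python
class FastRecovery:
    """Sketch of the duplicate-ACK trigger: three ACKs in a row
    carrying the same base mean the packet after the base is
    retransmitted immediately; an ACK with a higher base ends the
    duplicate run."""

    DUP_THRESHOLD = 3

    def __init__(self, base):
        self.base = base
        self.dup_count = 0
        self.retransmitted = None

    def on_ack(self, ack_base):
        if ack_base == self.base:
            self.dup_count += 1
            if self.dup_count == self.DUP_THRESHOLD:
                # e.g. resend 104 when the base sticks at 103
                self.retransmitted = self.base + 1
        elif ack_base > self.base:
            self.base = ack_base
            self.dup_count = 0
        return self.retransmitted
```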

Classical congestion control is designed around loss detection and timed retransmission. It is a typical case of trading latency for quality in the triangular balance, but because its fairness-oriented design avoids excessive cost, this transmission mode has difficulty squeezing the network bandwidth and thus difficulty guaranteeing both high throughput and low latency.

• BBR congestion algorithm

To address the classical algorithm's problems with latency and bandwidth squeezing, Google designed BBR, a congestion control algorithm based on sender-side evaluation of delay and bandwidth. The algorithm is dedicated to solving two problems:

1. Fully utilizing the bandwidth of a network link that has a certain loss rate

2. Reducing buffering latency in network transmission

BBR's main strategy is to periodically evaluate the link's min_rtt and max_bandwidth from ACK and NACK feedback. The maximum amount in flight (cwnd) is then:

cwnd = max_bandwidth × min_rtt
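This cwnd is the bandwidth-delay product: dimensionally, an amount of in-flight data can only come from multiplying the bottleneck bandwidth by the minimum RTT. A minimal sketch, where the gain factor is an illustrative assumption rather than a value from the text:

```python
def bbr_cwnd(max_bandwidth_bps, min_rtt_s, gain=2.0):
    """Bandwidth-delay product window sizing: bottleneck bandwidth
    (bits/s) times minimum RTT (s) gives bits in flight; the gain
    factor (an assumed value here) lets the sender probe beyond the
    measured pipe size."""
    return max_bandwidth_bps * min_rtt_s * gain
```

For example, a 10 Mbit/s bottleneck with a 50 ms minimum RTT gives a 500 kbit pipe before the gain is applied.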

The transport model is as follows:

Figure 7

BBR's entire congestion control is a state machine that probes bandwidth and pacing rate. It has four states:

STARTUP: startup state (analogous to slow start), gain parameter max_gain = 2.85

DRAIN: drain state, emptying the queue that built up during startup

PROBE_BW: bandwidth-probing state, nudging the sending rate up (gain 1.25) or down (gain 0.75) with small gain parameters

PROBE_RTT: delay-probing state, maintaining a minimal send window (4 MSS) for RTT sampling

So how do these states switch between one another? The general steps in QUIC's BBR are as follows:

1) When the connection initializes, an initial cwnd = 8 is set and the state is set to STARTUP.

2) In STARTUP, data is sent and periodic sampling of ACKs determines whether bandwidth can still be increased; if so, cwnd = cwnd × max_gain. If the number of sampling periods exceeds the preset startup time, or packet loss occurs, the state switches to DRAIN.

3) In DRAIN, if flight_size (the amount of data sent but not yet acknowledged) > cwnd, stay in DRAIN; once flight_size < cwnd, enter PROBE_BW.

4) In the PROBE_BW state, if no packet loss occurs and
