High Performance Browser Networking, Chapter 3

Last Update:2014-11-17 Source: Internet

Author: User



Chapter 3: Building Blocks of UDP

User Datagram Protocol (UDP) was added to the core network protocol suite in August 1980 by Jon Postel, well after the original introduction of TCP/IP, but right at the time when the TCP and IP specifications were being split into two separate RFCs. This timing is important because, as we will see, the primary feature and appeal of UDP is not what it introduces, but rather all the features it chooses to omit. UDP is colloquially referred to as a null protocol, and RFC 768, which describes its operation, could indeed fit on a napkin.

Datagram: a self-contained, independent entity of data carrying sufficient information to be routed from the source to the destination node without reliance on earlier exchanges between the nodes and the transporting network.

The terms datagram and packet are often used interchangeably, but there are some nuances. The term packet applies to any formatted block of data, whereas datagram is often reserved for packets delivered via an unreliable service: no delivery guarantees and no failure notifications. Because of this, you will frequently find the more descriptive term "Unreliable" substituted for the official "User" in the UDP acronym, forming "Unreliable Datagram Protocol." That is also why UDP packets are generally, and more correctly, referred to as datagrams.

Perhaps the best-known use of UDP, and one that every browser and web application depends on, is the Domain Name System (DNS): given a hostname, we need to discover its IP address before any data exchange can occur. However, even though the browser itself depends on UDP, the protocol has historically never been a first-class transport for web pages and browser applications. That is, until WebRTC entered the picture.

The new Web Real-Time Communication (WebRTC) standards, jointly developed by the IETF and the W3C, enable real-time communication, such as voice and video calling and other forms of peer-to-peer (P2P) communication, natively within the browser via UDP. With WebRTC, UDP is the preferred transport protocol. We will discuss WebRTC in depth in Chapter 18, but before we get there, let's explore the inner workings of UDP to understand why WebRTC chose it.

Null Protocol Services

To understand UDP and why it is commonly referred to as a "null protocol," we first need to look at the Internet Protocol (IP), which sits one layer below both the TCP and UDP protocols.

The primary task of the IP layer is to deliver datagrams from the source host to the destination host based on their addresses. To do so, messages are encapsulated within an IP packet (Figure 3-1), which identifies the source and destination addresses, as well as a number of other routing parameters.

Once again, the word "datagram" is an important distinction: the IP layer provides no guarantees about message delivery and no notifications of failure, and hence directly exposes the unreliability of the underlying network to the layers above it. If a routing node along the way drops an IP packet due to congestion, high load, or for other reasons, then it is the responsibility of a protocol above IP to detect it, recover, and retransmit the data, assuming, of course, that this is the desired behavior!

Figure 3-1 IPv4 Header (20 bytes)

The UDP protocol encapsulates user messages into its own packet structure (Figure 3-2), which adds only four additional fields: source port, destination port, length of packet, and checksum. Thus, when IP delivers the packet to the destination host, the host is able to unwrap the UDP packet, identify the target application by the destination port, and deliver the message. Nothing more, nothing less.

Figure 3-2 UDP Header (8 bytes)
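The four header fields described above can be sketched in a few lines of Python. This is only an illustration of the 8-byte layout: the function names are made up for this example, and in real programs the operating system builds and parses the header for you.

```python
import struct

# Four unsigned 16-bit fields in network byte order:
# source port, destination port, length (header + payload), checksum.
UDP_HEADER = struct.Struct("!HHHH")

def build_udp_datagram(src_port, dst_port, payload, checksum=0):
    """Prepend the 8-byte UDP header to payload (hypothetical helper).

    A checksum of 0 means "not computed", which is only permitted over IPv4.
    """
    length = UDP_HEADER.size + len(payload)
    return UDP_HEADER.pack(src_port, dst_port, length, checksum) + payload

def parse_udp_datagram(datagram):
    """Split a raw UDP datagram into its header fields and payload."""
    src_port, dst_port, length, checksum = UDP_HEADER.unpack_from(datagram)
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "length": length,
        "checksum": checksum,
        "payload": datagram[UDP_HEADER.size:length],
    }

datagram = build_udp_datagram(5353, 53, b"hello")
print(len(datagram), parse_udp_datagram(datagram)["dst_port"])
```

The destination port is all the receiving host needs to pick the target application, which is the multiplexing role the text describes.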

In fact, both the source port and the checksum fields are optional in UDP datagrams. The IP packet contains its own header checksum, and the application can choose to omit the UDP checksum (when running over IPv4), which means that all error detection and error correction can be delegated to the applications above. At its core, UDP simply provides "application multiplexing" on top of IP by embedding the source and destination ports of the communicating applications. With that in mind, we can now summarize all the UDP non-services:

No guarantee of message delivery: no acknowledgments, retransmissions, or timeouts
No guarantee of order of delivery: no packet sequence numbers, no reordering, no head-of-line blocking
No connection state tracking: no connection establishment or teardown state machines
No congestion control: no built-in client or network feedback mechanisms

TCP is a byte-stream-oriented protocol capable of transmitting application messages spread across multiple packets without any explicit message boundaries within the packets themselves. To achieve this, connection state is allocated on both ends of the connection, and each packet is sequenced, retransmitted when lost, and delivered in order. UDP datagrams, on the other hand, have definitive boundaries: each datagram is carried in a single IP packet, and each application read yields the full message; a message cannot be split across datagrams.
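The message-boundary behavior can be observed with two UDP sockets on the loopback interface. The sketch below is a minimal demonstration, assuming a local machine where loopback traffic is not filtered; the function name is invented for this example. On a real network the datagrams could also be lost or arrive out of order.

```python
import socket

def demo_message_boundaries(messages):
    """Send each message as one datagram and read them back one per recvfrom()."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    receiver.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
    receiver.settimeout(2.0)          # do not block forever if a datagram is lost
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    received = []
    try:
        # No handshake: the first sendto() is already application data.
        for message in messages:
            sender.sendto(message, receiver.getsockname())
        # Each read returns exactly one datagram, never a partial or merged one.
        for _ in messages:
            data, _addr = receiver.recvfrom(65535)
            received.append(data)
    finally:
        sender.close()
        receiver.close()
    return received

print(demo_message_boundaries([b"one", b"two-two", b"3"]))
```

With a TCP socket, the same three writes could be returned by a single read as one merged byte string, which is why TCP-based protocols need their own framing.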

UDP is a simple, stateless protocol, suitable for bootstrapping other application protocols on top: virtually all of the protocol design decisions are left to the application above it. However, before you run away to implement your own protocol to replace TCP, you should think carefully about complications such as UDP interaction with the many layers of deployed middleboxes (NAT traversal), as well as general network protocol design best practices. Without careful engineering and planning, it is not uncommon to start with a bright idea for a new protocol but end up with a poorly implemented version of TCP. The algorithms and the state machine in TCP have been honed and improved over decades and incorporate dozens of mechanisms that ensure its performance.

UDP and Network Address Translators

Unfortunately, IPv4 addresses are only 32 bits long, which provides a maximum of 4.29 billion unique IP addresses. The IP Network Address Translator (NAT) specification was introduced in mid-1994 (RFC 1631) as an interim solution to the looming IPv4 address depletion problem: as the number of hosts on the Internet began to grow exponentially in the early 1990s, we could not expect to allocate a unique IP to every host.

The proposed IP reuse solution was to introduce NAT devices at the edge of the network, each of which would be responsible for maintaining a table mapping local IP and port tuples to one or more globally unique (public) IP and port tuples (Figure 3-3). The local IP address space behind the translator could then be reused among many different networks, thus solving the address depletion problem.

Figure 3-3 IP network address translation

Unfortunately, as often happens, there is nothing more permanent than a temporary solution. Not only did NAT devices solve the immediate address depletion problem, but they also quickly became a ubiquitous component of many corporate and home proxies and routers, security appliances, firewalls, and dozens of other hardware and software devices. NAT middleboxes are no longer a temporary solution; they have become an integral part of the Internet infrastructure.

Reserved Private Network Ranges

The Internet Assigned Numbers Authority (IANA), which oversees global IP address allocation, has reserved three well-known ranges for private networks, most often residing behind a NAT device:

Table 3-1. Reserved IP ranges

IP address range	Number of addresses
10.0.0.0-10.255.255.255	16,777,216
172.16.0.0-172.31.255.255	1,048,576
192.168.0.0-192.168.255.255	65,536

One or all of the preceding ranges should look familiar. Chances are, your local router has assigned your computer an IP address from one of those ranges. That is your private IP address on the internal network, which is then translated by the NAT device when communicating with an outside network.

To avoid routing errors and confusion, no public computer is allowed to be assigned an IP address from any of these reserved private network ranges.
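The address counts in Table 3-1 follow directly from the prefix lengths of the three ranges (/8, /12, and /16). As a quick check, the sketch below uses Python's standard ipaddress module; the helper name is_reserved_private is made up for this example.

```python
import ipaddress

# The three reserved ranges from Table 3-1, written in CIDR notation.
PRIVATE_RANGES = [
    ipaddress.ip_network(cidr)
    for cidr in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]

def is_reserved_private(address):
    """Return True if address falls inside one of the three reserved ranges."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in PRIVATE_RANGES)

for network in PRIVATE_RANGES:
    # First address, last address, and the number of addresses in the range.
    print(network[0], "-", network[-1], network.num_addresses)

print(is_reserved_private("192.168.1.10"), is_reserved_private("8.8.8.8"))
```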

Connection-State Timeouts

The issue with NAT translation, at least as far as UDP is concerned, is precisely the routing table that it must maintain to deliver the data. NAT middleboxes rely on connection state, whereas UDP has none. This is a fundamental mismatch and a source of many problems for delivering UDP datagrams. Further, it is now not uncommon for a client to sit behind many layers of NAT devices, which only complicates matters further.

Each TCP connection has a well-defined protocol state machine, which begins with a three-way handshake, followed by application data transfer, and a well-defined exchange to close the connection. Given this flow, each middlebox can observe the state of the connection and create and remove routing entries as needed. With UDP, there is no handshake or connection termination, and hence there is no connection state machine to monitor.

Delivering outbound UDP traffic does not require any extra work, but routing a reply requires an entry in the translation table that tells the NAT device the IP and port of the local destination host. Thus, translators have to keep state about each UDP flow, even though UDP itself is stateless.

Even worse, the translator is also tasked with figuring out when to drop the translation record, but since UDP has no connection termination sequence, either peer could just stop transmitting datagrams at any point without notice. To address this, UDP routing records are expired on a timer. How often? There is no definitive answer; instead, the timeout depends on the vendor, make, version, and configuration of the translator. Consequently, one of the de facto best practices for long-running sessions over UDP is to introduce bidirectional keepalive packets that periodically reset the timers for the translation records in all the NAT devices along the path.
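A keepalive sender can be as small as a background thread that emits a tiny datagram on a fixed interval. The sketch below is one possible shape, not a prescribed implementation: the function name, the one-byte payload, and the 15-second default interval are all assumptions for illustration, and the right interval depends on the NAT devices on the path. For the keepalive to be bidirectional, each peer would run its own sender.

```python
import socket
import threading

def start_keepalive(sock, peer, interval_s=15.0, payload=b"\x00"):
    """Send payload to peer every interval_s seconds until the returned event is set.

    Hypothetical helper: the payload format is up to the application protocol.
    """
    stop = threading.Event()

    def loop():
        # Event.wait() returns False on timeout and True once stop is set.
        while not stop.wait(interval_s):
            try:
                sock.sendto(payload, peer)
            except OSError:
                break  # the socket was closed underneath us
    threading.Thread(target=loop, daemon=True).start()
    return stop
```

Usage: call stop = start_keepalive(sock, (peer_ip, peer_port)) once the session is established, and stop.set() before closing the socket.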

TCP Timeouts and NAT

Technically, there is no need for additional NAT timeouts for TCP connections. The TCP protocol follows a well-defined handshake and termination sequence, which signals when the appropriate translation records can be added or removed.

Unfortunately, in practice, many NAT devices apply similar timeout logic to both TCP and UDP sessions. As a result, in some cases bidirectional keepalive packets are also required for TCP. If your TCP connections are getting dropped, then there is a good chance that an intermediate NAT timeout is to blame.

NAT Traversal

Unpredictable connection state handling is a serious issue created by NATs, but an even larger problem for many applications is the inability to establish a UDP connection at all. This is especially true for P2P applications, such as VoIP, games, and file sharing, which often need to act as both client and server to enable two-way direct communication.

The first issue is that in the presence of a NAT, the internal client is unaware of its public IP: it knows only its internal IP address, and the NAT device rewrites the source port and address in every UDP packet, as well as the originating IP address within the IP packet. However, if the client communicates its private IP address as part of its application data to a peer outside of its private network, then the connection will inevitably fail. Hence, the promise of "transparent" translation no longer holds, and the application must first discover its public IP address if it needs to share it with a peer outside its private network.

However, knowing the public IP is also not sufficient to successfully transmit with UDP. Any packet that arrives at the public IP of a NAT device must also have a destination port and an entry in the NAT table that can translate it to an internal destination host IP and port tuple. If this entry does not exist, then the packet is simply dropped (Figure 3-4). The NAT device acts as a simple packet filter, since it has no way to automatically determine the internal route unless explicitly configured by the user through port forwarding or a similar mechanism.

Figure 3-4 Inbound packet dropped due to missing mapping
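The interplay of mapping creation, inbound filtering, and expiry can be modeled with a toy translation table. This is a simplified simulation, not how any particular NAT device works: the class name, the 30-second idle timeout, the sequential port allocation, and the choice that only outbound traffic refreshes the timer are all assumptions made for the example. Time is passed in explicitly so the behavior is easy to follow.

```python
class NatTable:
    """Toy model of a NAT translation table with an idle timeout (illustrative only)."""

    def __init__(self, public_ip, idle_timeout_s=30.0, first_port=40000):
        self.public_ip = public_ip
        self.idle_timeout_s = idle_timeout_s
        self._next_port = first_port
        self._by_internal = {}  # (internal ip, internal port) -> public port
        self._by_public = {}    # public port -> ((internal ip, internal port), last outbound time)

    def outbound(self, internal_ip, internal_port, now):
        """Create or refresh a mapping; return the rewritten (public ip, public port)."""
        key = (internal_ip, internal_port)
        port = self._by_internal.get(key)
        if port is None:
            port = self._next_port
            self._next_port += 1
            self._by_internal[key] = port
        self._by_public[port] = (key, now)
        return (self.public_ip, port)

    def inbound(self, public_port, now):
        """Return the internal (ip, port) for an inbound packet, or None if it is dropped."""
        entry = self._by_public.get(public_port)
        if entry is None:
            return None  # no mapping: the packet is dropped
        key, last_seen = entry
        if now - last_seen > self.idle_timeout_s:
            # The record aged out: forget it and drop the packet.
            del self._by_public[public_port]
            del self._by_internal[key]
            return None
        return key

nat = NatTable("203.0.113.5")
print(nat.inbound(40000, now=0.0))                    # dropped: nobody asked for this
print(nat.outbound("192.168.0.2", 5000, now=0.0))     # the client speaks first
print(nat.inbound(40000, now=10.0))                   # the reply finds its way back
print(nat.inbound(40000, now=31.0))                   # too late: the record expired
```

In this model, an outbound keepalive sent before the timeout elapses is what keeps the inbound path open.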

It is important to note that the preceding behavior is not an issue for client applications, which begin their interaction from the internal network and in the process establish the necessary translation records along the path. However, handling inbound connections (acting as a server) from P2P applications such as VoIP, game consoles, and file sharing can be an immediate problem in the presence of a NAT.

To work around this mismatch between UDP and NATs, various traversal techniques (STUN, TURN, ICE) have to be used to establish end-to-end connectivity between the UDP peers on both sides.

STUN, TURN, and ICE

Session Traversal Utilities for NAT (STUN) is a protocol (RFC 5389) that allows the host application to discover the presence of a network address translator on the network and, when one is present, to obtain the allocated public IP and port tuple for the current connection (Figure 3-5). To do so, the protocol requires assistance from a third-party STUN server that must reside on the public network.

Figure 3-5 STUN query for public IP and port

Assuming the IP address of the STUN server is known (through DNS discovery, or through a manually specified address), the application first sends a binding request to the STUN server. In turn, the STUN server replies with a response that contains the public IP address and port of the client as seen from the public network. This simple workflow addresses several problems we encountered in our earlier discussion:

The application discovers its public IP and port tuple and is then able to use this information as part of its application data when communicating with its peers.
The outbound binding request to the STUN server establishes NAT routing entries along the path, such that inbound packets arriving at the public IP and port tuple can now find their way back to the host application on the internal network.
The STUN protocol defines a simple mechanism for keepalive pings to keep the NAT routing entries from timing out.

With this mechanism in place, whenever two peers want to talk to each other over UDP, they first send binding requests to their respective STUN servers, and following a successful response on both sides, they can then use the established public IP and port tuples to exchange data.
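To make the binding exchange more concrete, the sketch below builds a STUN binding request and extracts the public address from a binding success response, following the RFC 5389 message layout (20-byte header with the magic cookie 0x2112A442, and the XOR-MAPPED-ADDRESS attribute). It is a minimal sketch: the function names are invented for this example, only IPv4 is handled, and authentication, retransmission, and error responses are left out. No network access is needed, since the usage lines parse a hand-built response.

```python
import ipaddress
import os
import struct

MAGIC_COOKIE = 0x2112A442        # fixed value defined by RFC 5389
BINDING_REQUEST = 0x0001
BINDING_SUCCESS = 0x0101
XOR_MAPPED_ADDRESS = 0x0020

def build_binding_request(transaction_id=None):
    """Return (message, transaction_id) for a binding request with no attributes."""
    tid = transaction_id or os.urandom(12)
    # Header: message type, attribute length, magic cookie, 96-bit transaction ID.
    return struct.pack("!HHI12s", BINDING_REQUEST, 0, MAGIC_COOKIE, tid), tid

def parse_binding_response(data, transaction_id):
    """Return the (ip, port) that the STUN server saw for this client."""
    msg_type, length, cookie, tid = struct.unpack("!HHI12s", data[:20])
    if msg_type != BINDING_SUCCESS or cookie != MAGIC_COOKIE or tid != transaction_id:
        raise ValueError("not a matching binding success response")
    offset, end = 20, 20 + length
    while offset + 4 <= end:
        attr_type, attr_len = struct.unpack("!HH", data[offset:offset + 4])
        value = data[offset + 4:offset + 4 + attr_len]
        if attr_type == XOR_MAPPED_ADDRESS and len(value) >= 8 and value[1] == 0x01:
            # Family 0x01 is IPv4; port and address are XORed with the magic cookie.
            xport, xaddr = struct.unpack("!HI", value[2:8])
            port = xport ^ (MAGIC_COOKIE >> 16)
            addr = ipaddress.IPv4Address(xaddr ^ MAGIC_COOKIE)
            return str(addr), port
        offset += 4 + attr_len + (-attr_len % 4)  # attributes are padded to 4 bytes
    raise ValueError("no IPv4 XOR-MAPPED-ADDRESS attribute found")

# Hand-built response standing in for a real server reply (documentation address).
request, tid = build_binding_request()
attr = struct.pack("!HHBBHI", XOR_MAPPED_ADDRESS, 8, 0, 0x01,
                   54321 ^ (MAGIC_COOKIE >> 16),
                   int(ipaddress.IPv4Address("203.0.113.7")) ^ MAGIC_COOKIE)
response = struct.pack("!HHI12s", BINDING_SUCCESS, len(attr), MAGIC_COOKIE, tid) + attr
print(parse_binding_response(response, tid))
```

In a real client, the request bytes would be sent over UDP to the STUN server, and the response would come back from the socket rather than being constructed locally.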

However, in practice, STUN is not sufficient to deal with all NAT topologies and network configurations. Further, unfortunately, in some cases UDP may be blocked altogether by a firewall or some other network appliance, a scenario that is not uncommon in enterprise networks. To address this issue, whenever STUN fails, we can use the Traversal Using Relays around NAT (TURN) protocol (RFC 5766) as a fallback, which can run over UDP and switch to TCP if all else fails.

The key word in TURN is, of course, "relays." The protocol relies on the presence and availability of a public relay (Figure 3-6) to shuttle the data between the peers.

Figure 3-6 TURN relay server

Both clients begin their connections by sending an allocate request to the same TURN server, followed by permissions negotiation.
Once the negotiation is complete, both peers communicate by sending their data to the TURN server, which then relays it to the other peer.

Of course, the obvious downside of this exchange is that it is no longer peer-to-peer. TURN is the most reliable way to provide connectivity between any two peers, but the relay becomes a bottleneck and is very costly to operate: at the very least, the TURN server must have enough bandwidth to carry all the data flows. As a result, TURN is best used as a last-resort fallback, for cases where everything else fails.

STUN and TURN in Practice

Google's libjingle is an open-source C++ library for building peer-to-peer applications, which takes care of STUN, TURN, and ICE negotiations under the hood. The library is used to power Google Talk, which provides a valuable reference point for the performance of STUN versus TURN in the real world:

92% of the time, the connection can take place directly (STUN).
8% of the time, the connection requires a relay (TURN).

Unfortunately, even with STUN, some users are unable to establish a direct P2P tunnel. To provide a reliable service, we also need TURN relays, which act as a fallback for cases where a direct connection is not an option.

Building an effective NAT traversal solution is not a simple task. Thankfully, we can lean on the Interactive Connectivity Establishment (ICE) protocol (RFC 5245) to help with it. ICE is a protocol, and a set of methods, that seek to establish the most efficient tunnel between the participants (Figure 3-7): a direct connection where possible, STUN negotiation where needed, and finally a fallback to TURN if all else fails.

Figure 3-7 ICE attempts direct, STUN, and TURN connectivity options

In practice, if you are building a P2P application over UDP, then you most definitely want to leverage an existing platform API, or a third-party library that implements ICE, STUN, and TURN for you. And now that you are familiar with what each of these protocols does, you can navigate your way through the required setup and configuration!
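The fallback order that ICE aims for can be pictured as a simple "try the cheapest option first" loop. The sketch below captures only that ordering and is not an implementation of ICE: the real protocol gathers candidate addresses from both peers and runs prioritized connectivity checks on candidate pairs. The function name and the callables passed to it are hypothetical.

```python
def connect_with_fallback(strategies):
    """Try each (name, connect) pair in order; return (name, result) for the first success.

    Each connect callable is assumed to raise ConnectionError when it cannot connect.
    """
    failed = []
    for name, connect in strategies:
        try:
            return name, connect()
        except ConnectionError:
            failed.append(name)
    raise ConnectionError("all strategies failed: " + ", ".join(failed))

def blocked():
    # Stand-in for a connection attempt that cannot get through a NAT.
    raise ConnectionError("no route to peer")

# Direct attempt fails, the STUN-assisted attempt succeeds, TURN is never tried.
print(connect_with_fallback([
    ("direct", blocked),
    ("stun", lambda: "p2p tunnel"),
    ("turn", lambda: "relayed tunnel"),
]))
```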

Optimizing for UDP

UDP is a simple and commonly used protocol. In fact, the primary feature of UDP is all the features it omits: no connection state, handshakes, retransmissions, reassembly, reordering, congestion control, congestion avoidance, flow control, or even optional error checking. However, the flexibility that this minimal message-oriented transport layer affords is also a liability for the implementer. Your application will likely have to reimplement some, or many, of these features from scratch, and each must be designed to play well with other peers and protocols on the network.

Unlike TCP, which ships with built-in flow control, congestion control, and congestion avoidance, UDP applications must implement these mechanisms on their own. Congestion-insensitive UDP applications can easily overwhelm the network, which can lead to degraded network performance and, in severe cases, to network congestion collapse.

If you want to leverage UDP for your own application, make sure to research and read the current best practices and recommendations. One such document is RFC 5405, which specifically focuses on design guidelines for applications delivered via unicast UDP. Here is a short sample of the recommendations:
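As one small building block, a sender can pace its datagrams with a token bucket so that it never exceeds a configured rate. This is a sketch of rate limiting only, under assumed names and parameters: a fixed-rate limiter does not react to loss or delay, so on its own it is not congestion control, which would also need feedback from the receiver to adjust the rate.

```python
class TokenBucket:
    """Allow sending at rate_bytes_per_s on average, with bursts up to burst_bytes."""

    def __init__(self, rate_bytes_per_s, burst_bytes):
        self.rate = rate_bytes_per_s
        self.capacity = burst_bytes
        self.tokens = burst_bytes   # start with a full bucket
        self.last = 0.0             # time of the previous call, in seconds

    def try_send(self, size, now):
        """Return True and spend tokens if a datagram of size bytes may be sent at time now."""
        # Refill in proportion to the elapsed time, capped at the bucket capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if size <= self.tokens:
            self.tokens -= size
            return True
        return False  # caller should queue or drop the datagram and retry later

bucket = TokenBucket(rate_bytes_per_s=1000, burst_bytes=1500)
print(bucket.try_send(1200, now=0.0), bucket.try_send(1200, now=0.5), bucket.try_send(1200, now=1.0))
```

In a real sender, the now argument would come from a monotonic clock such as time.monotonic().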

Applications must tolerate a wide range of Internet path conditions.
Applications should control the rate of transmission.
Applications should perform congestion control over all traffic.
Applications should use bandwidth similar to TCP.
Applications should back off retransmission counters following loss.
Applications should not send datagrams that exceed the path MTU.
Applications should handle datagram loss, duplication, and reordering.
Applications should be robust to delivery delays of up to 2 minutes.
Applications should enable the IPv4 UDP checksum and must enable the IPv6 checksum.
Applications may use keepalives when needed (minimum interval of 15 seconds).
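To illustrate just one item from this list, handling duplication and reordering, an application can tag each datagram with a sequence number and buffer out-of-order arrivals on the receiving side. The sketch below is a minimal example under assumptions of our own: the class name is invented, sequence numbers never wrap around, and a lost datagram stalls delivery indefinitely, so a real protocol would also need a timeout or a retransmission request.

```python
class ReorderBuffer:
    """Deliver payloads in sequence order, ignoring duplicates (illustrative only)."""

    def __init__(self, first_seq=0):
        self.next_seq = first_seq   # next sequence number the application expects
        self.pending = {}           # out-of-order payloads keyed by sequence number

    def receive(self, seq, payload):
        """Accept one datagram; return the payloads that are now deliverable, in order."""
        if seq < self.next_seq or seq in self.pending:
            return []  # duplicate: already delivered or already buffered
        self.pending[seq] = payload
        ready = []
        # Drain every consecutive payload starting at the expected sequence number.
        while self.next_seq in self.pending:
            ready.append(self.pending.pop(self.next_seq))
            self.next_seq += 1
        return ready

buf = ReorderBuffer()
print(buf.receive(1, b"b"))   # arrives early, held back
print(buf.receive(0, b"a"))   # fills the gap, both are released in order
print(buf.receive(1, b"b"))   # duplicate, ignored
```

Note that buffering like this reintroduces head-of-line blocking, which is exactly the kind of trade-off an application takes on when it rebuilds TCP features on top of UDP.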

Designing a new transport protocol requires a lot of careful thought, planning, and research, so do your due diligence. Where possible, leverage an existing library or framework that already takes NAT traversal into account and is able to establish some degree of fairness with other sources of concurrent network traffic.

On that note, good news: WebRTC is just such a framework!


