Reposted from: http://www.cnblogs.com/pannengzhi/p/4800526.html
1. Introduction
Today the Internet is full of middleboxes, such as NATs and firewalls, which prevent two clients that are not on the same private network from communicating directly. These problems may persist even in the IPv6 era, because even when NAT is no longer needed, other middleboxes such as firewalls can still block connections from being established.
Most middleboxes deployed today are designed around the client/server (C/S) architecture, in which relatively anonymous clients initiate connections to well-known servers with static IP addresses and DNS names. Most middleboxes implement an asymmetric communication model: a host on the private network can initiate outgoing connections, but a host on the public network cannot initiate connections into the private network unless the middlebox administrator has configured it specially. With the common NAPT middlebox (the case this article also focuses on), a client on the private network has no public IP address of its own. Instead, NAPT translation lets it share one public IP address with the other hosts on the same private network. Being hidden behind a middlebox and unreachable from outside is not a problem for client software such as a browser, because it only needs to initiate outgoing connections, and in a way it is even good for privacy.
In peer-to-peer (P2P) applications, however, hosts on private networks need to establish connections directly with other peers, yet the initiator and the responder may sit behind different middleboxes, with neither having a public IP address. A connection attempt or packet arriving from outside at the NAT's public IP address and port is discarded, because no host on the private network requested it. This article discusses how hosts on private networks can communicate directly through NAT.
2. Terminology
Firewall:
A firewall restricts communication between a private network and the public network, usually by discarding packets it considers unauthorized. It inspects, but does not modify, the IP address and TCP/UDP port information of packets attempting to enter the private network.
Network Address Translator (NAT):
A NAT not only inspects the headers of packets passing through it but also modifies them, so that many hosts on a private network can share a small number of public IP addresses (usually one).
Basic NAT:
A basic NAT maps the IP address of a private host to a public IP address without changing its TCP/UDP port number. Basic NAT is generally useful only when the NAT has a pool of public IP addresses.
Network Address/Port Translator (NAPT):
By far the most common variety is NAPT, which inspects and modifies both the IP address and the port number of packets, allowing multiple private hosts to share a single public IP address at the same time.
Cone NAT:
Once a cone NAT has bound a (private IP, private port) pair to a (public IP, public port) pair, it reuses that binding for all later sessions the application starts from the same private IP address and port, as long as at least one session using the binding remains active.
For example, suppose client A opens two outgoing sessions from the same internal endpoint (10.0.0.1:1234) to two different external servers, S1 and S2. A cone NAT maps both sessions to a single public endpoint (155.99.25.11:62000), so the identity of the client's port is preserved across address translation. Because basic NATs and firewalls do not change the port numbers of packets, these middleboxes can be regarded as degenerate cases of cone NAT.
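The binding reuse in this example can be traced with a minimal Python sketch. It is a toy model rather than a real NAT: the class name `ConeNat`, the method `translate_outbound`, and the addresses used for S1 and S2 are assumptions made for illustration only.

```python
class ConeNat:
    """Toy model of cone NAT port binding; it only tracks the mapping table."""

    def __init__(self, public_ip, first_port):
        self.public_ip = public_ip
        self.next_port = first_port
        self.bindings = {}  # (private IP, private port) -> public port

    def translate_outbound(self, private_endpoint, destination):
        # The destination is deliberately ignored: a cone NAT keys the binding
        # on the private endpoint alone, so every session from it shares a port.
        if private_endpoint not in self.bindings:
            self.bindings[private_endpoint] = self.next_port
            self.next_port += 1
        return (self.public_ip, self.bindings[private_endpoint])


nat = ConeNat("155.99.25.11", 62000)
client_a = ("10.0.0.1", 1234)
s1 = ("192.0.2.1", 1235)     # placeholder address for S1 (assumption)
s2 = ("198.51.100.1", 1235)  # placeholder address for S2 (assumption)

seen_by_s1 = nat.translate_outbound(client_a, s1)
seen_by_s2 = nat.translate_outbound(client_a, s2)
print(seen_by_s1, seen_by_s2)
```

Both servers see client A as 155.99.25.11:62000, which is the property that later makes hole punching possible.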
Symmetric NAT:
A symmetric NAT, by contrast, does not keep a fixed port binding across sessions between the private and public networks. Instead, it allocates a new public port for each new session, as shown in the following figure:
Cone NATs can be further divided into three categories, according to how the NAT treats incoming packets addressed to an already established (public IP, public port) pair:
1. Full cone NAT:
Once a public/private port binding has been established for a new session, a full cone NAT accepts all subsequent traffic addressed to that public port, no matter which public host it comes from. A full cone NAT is sometimes called a "promiscuous" NAT.
2. Restricted cone NAT:
A restricted cone NAT forwards an incoming packet only if its external (source) IP address matches the IP address of a node to which the private host has previously sent one or more packets. By limiting incoming packets to "known" external IP addresses, a restricted cone NAT effectively refines the firewall rule.
3. Port-restricted cone NAT:
A port-restricted cone NAT goes one step further: it forwards an incoming packet only if both its external IP address and its port number match those of an endpoint to which the private host has previously sent packets. A port-restricted cone NAT gives internal nodes the same level of protection against unsolicited traffic as a symmetric NAT.
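The three filtering rules can be compared side by side with a small Python sketch. It is only a model of the decision logic, not of a real NAT: the names `Filtering`, `ConeNatFilter`, and `decisions`, and the external addresses, are hypothetical.

```python
from enum import Enum


class Filtering(Enum):
    FULL_CONE = "full cone"
    RESTRICTED = "restricted cone"
    PORT_RESTRICTED = "port-restricted cone"


class ConeNatFilter:
    """Inbound filter for one established public/private binding (toy model)."""

    def __init__(self, filtering):
        self.filtering = filtering
        self.contacted = set()  # external (ip, port) pairs the private host sent to

    def record_outbound(self, remote):
        self.contacted.add(remote)

    def accepts_inbound(self, source):
        if not self.contacted:
            return False  # no binding yet: nothing is forwarded
        if self.filtering is Filtering.FULL_CONE:
            return True
        if self.filtering is Filtering.RESTRICTED:
            return any(ip == source[0] for ip, _port in self.contacted)
        return source in self.contacted  # port-restricted: IP and port must match


def decisions(filtering):
    """Private host has sent to 192.0.2.10:1235; test three inbound sources."""
    nat = ConeNatFilter(filtering)
    nat.record_outbound(("192.0.2.10", 1235))
    return (
        nat.accepts_inbound(("192.0.2.10", 1235)),   # same IP, same port
        nat.accepts_inbound(("192.0.2.10", 9999)),   # same IP, other port
        nat.accepts_inbound(("203.0.113.5", 1235)),  # other IP
    )


for kind in Filtering:
    print(kind.value, decisions(kind))
```

Each tuple lists whether the NAT forwards a packet from the contacted endpoint, from another port on the same host, and from an unrelated host.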
3. Peer-to-peer Communication
The way peers communicate depends on the situation of the clients. This section introduces the existing techniques for peer-to-peer communication across middleboxes.
3.1 Relaying
This is the most reliable but least efficient way to implement peer-to-peer communication. The idea is to relay the traffic between two private clients through a server with a public IP address, as shown in the following figure:
Client A and client B do not communicate directly. Instead, each first establishes a connection with server S, and S then relays data between the two connections. The drawback of this method is obvious: as more clients connect, the load on the server grows significantly, and none of the advantages of peer-to-peer communication are realized.
3.2 Connection reversal
The second technique works when only one of the two endpoints is behind a middlebox. For example, client A is behind a NAT while client B has a globally routable IP address, as shown in the following figure:
Client A has the private address 10.0.0.1, and the application uses TCP port 1234. A establishes a connection with server S, whose IP address is 18.181.0.31 and which listens on port 1235. NAT A assigns TCP port 62000 on its own public IP address, 155.99.25.11, as client A's temporary public endpoint for this session, so S sees client A as 155.99.25.11:62000. Client B has a public address of its own, so S sees B as 138.76.29.7:1234.
When client B wants to initiate a peer-to-peer connection to client A, it can try either A's public address, 155.99.25.11:62000, or A's private address, 10.0.0.1:1234, but both attempts fail. The attempt to 10.0.0.1:1234 fails because a private address is not routable from the public network. The attempt to 155.99.25.11:62000 fails because the TCP SYN from B is rejected when it reaches NAT A, which only allows outgoing connections.
After the direct attempt fails, B can use S to relay a connection request to A, so that the peer-to-peer connection between A and B is established in the "reverse" direction, initiated by A.
Many current peer-to-peer systems implement this technique, but its limitation is just as obvious: it works only when one of the two parties has a public IP address. Increasingly often both parties are behind NATs, which calls for the third technique, described next.
3.3 UDP hole punching
The third peer-to-peer technique, and a widely used one, is known as "UDP hole punching". It relies on the behavior of common firewalls and cone NATs to let legitimate peer-to-peer applications punch holes through the middleboxes and establish direct connections with each other. Two common scenarios are considered below, along with how an application can be designed to handle both of them. In the first scenario, which is the most common, the two clients that want a direct connection are behind two different NATs. In the second, the two clients are behind the same NAT, although they do not necessarily know it.
3.3.1. Endpoints behind different NATs
Assume that clients A and B both have private addresses and are behind different NATs. The peer-to-peer application on A and B and the server S all use UDP port 1234, and A and B have each initiated a UDP session with the server, as shown in the figure:
Now suppose client A wants to establish a UDP session directly with client B. If A simply sends UDP data to B's public address, 138.76.29.7:31000, NAT B will most likely discard the incoming data (unless it is a full cone NAT), because the source address and port do not match those of S, the only endpoint with which B has established an outgoing session. The same happens if B sends a message directly to A.
Suppose, however, that A starts sending UDP data to B's public address and at the same time relays a request through server S asking B to start sending UDP data to A's public address. A's outgoing packets to B cause NAT A to open a new session between A's private address and B's public address, and B's packets to A do the same at NAT B. Once the new UDP sessions are open in both directions, clients A and B can communicate directly without further help from the bootstrap server S.
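The order of events can be simulated in a few lines of Python. This is a sketch under stated assumptions, not a network implementation: both NATs are modelled as port-restricted cone NATs, NAT A's public endpoint is taken to be 155.99.25.11:62000 as in the earlier examples, and the names `PortRestrictedConeNat` and `send` are hypothetical.

```python
class PortRestrictedConeNat:
    """Toy NAT with one fixed binding and port-restricted inbound filtering."""

    def __init__(self, public_endpoint):
        self.public = public_endpoint
        self.allowed = set()  # remote endpoints the private client has sent to

    def outbound(self, destination):
        # Sending out punches a hole for replies from exactly this destination.
        self.allowed.add(destination)
        return self.public  # source address as seen on the public network

    def inbound(self, source):
        return source in self.allowed


def send(sender_nat, receiver_nat):
    """Sender's client sends to the receiver's public endpoint; True if delivered."""
    translated_source = sender_nat.outbound(receiver_nat.public)
    return receiver_nat.inbound(translated_source)


server_s = ("18.181.0.31", 1234)
nat_a = PortRestrictedConeNat(("155.99.25.11", 62000))  # assumed endpoint for A
nat_b = PortRestrictedConeNat(("138.76.29.7", 31000))

# Both clients first register with S, which creates their public bindings.
nat_a.outbound(server_s)
nat_b.outbound(server_s)

log = [
    ("A -> B, first packet", send(nat_a, nat_b)),    # dropped by NAT B, opens NAT A
    ("B -> A, after relay via S", send(nat_b, nat_a)),  # passes through A's hole
    ("A -> B, retry", send(nat_a, nat_b)),           # passes through B's hole
]
for step, delivered in log:
    print(step, "delivered" if delivered else "dropped")
```

The first packet from A is lost, which is why a real client keeps sending until it hears from its peer. In practice each `outbound` call would be a UDP `sendto` from the same local socket that was used to contact S.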
UDP hole punching has several useful properties. Once a peer-to-peer connection is established, either side can in turn act as a "bootstrap server" to help other clients behind middleboxes punch holes, which greatly reduces the load on the original server. The application also does not need to know what kind of middlebox it is behind (if any), because the same procedure establishes a communication channel whether there is no middlebox or more than one.
3.3.2. Endpoints behind the same NAT
Now consider the scenario in which clients A and B happen to be behind the same NAT (possibly without knowing it) and are therefore in the same private address space. Client A establishes a UDP session with server S, for which the NAT allocates public port 62000. B likewise establishes a session with S, for which the NAT allocates port 62001, as shown in the following figure:
What happens if A and B use the UDP hole punching technique from the previous section to set up a peer-to-peer path? A and B first learn each other's public IP address and port number, and then start sending messages to those addresses. The two clients can communicate this way only if the NAT allows private hosts to open translated UDP sessions with other private hosts, not just with external ones. We call this "loopback translation", because packets arriving at the NAT from the private network are "looped back" into the private network instead of being forwarded to the public network. For example, when A sends a UDP packet to B's public address, the packet initially has source 10.0.0.1:1234 and destination 155.99.25.11:62001. The NAT receives the packet, translates it to source 155.99.25.11:62000 (A's public address) and destination 10.1.1.3:1234, and forwards it to B. Even if the NAT supports loopback translation, the translation and forwarding are unnecessary in this case, and they are likely to add latency to the conversation between A and B and put extra load on the NAT.
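The address rewriting in this example can be reproduced with a short Python sketch. The function name `hairpin_translate` and the dictionary layout are assumptions for illustration. Only the addresses come from the text.

```python
def hairpin_translate(packet, public_ip, bindings):
    """Rewrite a packet sent from inside the NAT to one of its own public ports.

    packet   -- (source endpoint, destination endpoint) as seen on the private side
    bindings -- private endpoint -> public port allocated by the NAT
    Returns the rewritten packet, or None if the destination is not hairpinned.
    The source is assumed to have a binding already (it has talked to S).
    """
    source, destination = packet
    by_public_port = {port: private for private, port in bindings.items()}
    if destination[0] != public_ip or destination[1] not in by_public_port:
        return None
    new_source = (public_ip, bindings[source])         # A's public endpoint
    new_destination = by_public_port[destination[1]]   # B's private endpoint
    return (new_source, new_destination)


bindings = {
    ("10.0.0.1", 1234): 62000,  # client A
    ("10.1.1.3", 1234): 62001,  # client B
}
packet = (("10.0.0.1", 1234), ("155.99.25.11", 62001))
translated = hairpin_translate(packet, "155.99.25.11", bindings)
print(translated)
```

The packet crosses the NAT twice, once in each translation direction, even though A and B could have reached each other directly on the private network.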
The solution to this problem is straightforward. When A and B first exchange address information through S, each should include its own IP address and port number as it sees them itself, as well as its address and port number as seen by the server. The clients then start sending data to each other at both of the addresses they know for the other side, and use whichever address succeeds first. If the two clients are behind the same NAT, the packets sent to the private address are likely to arrive first, establishing a direct channel that does not involve the NAT. If the two clients are behind different NATs, the packets sent to each other's private addresses will not reach their destination, but a channel can still be established through the public addresses. It is important that these packets be authenticated in some way, because when the clients are behind different NATs it is entirely possible for a packet that A sends to B's private address to reach an unrelated node on A's own private network.
3.3.3. Fixed port bindings
A major caveat of UDP hole punching is that it works only if both NATs are cone NATs (or non-NAT firewalls), which keep a fixed binding between a given (private IP, private UDP port) pair and a (public IP, public UDP port) pair for as long as the UDP port remains in use. Assigning a new public port to each session, as a symmetric NAT does, makes it impossible for a UDP application to reuse an existing translation when communicating with a different external endpoint. Because cone NAT is the most widespread kind today, UDP hole punching is widely adopted, even though a minority of NATs are symmetric and do not support it.
4. Concrete implementation
Once the ideas above are understood, the code is easy to write. The implementation here uses a C++ asynchronous I/O library to provide the basic functions of a bootstrap server and a peer-to-peer client. The goal is to open a communication path between two clients, so that clients on two different LANs can communicate directly.
4.1 Bootstrap server design
The bootstrap server runs on a machine with a public address and receives client commands on a specified port (port 2333 here).
A client can, and preferably should, establish a TCP connection with the server, but for simplicity only UDP is used here. The server listens for commands on port 2333 and performs the corresponding action. The commands currently supported are:
Login: the client logs in, so that it is recorded in the server's tracker and other peers can send connection requests to it.
Logout: the client logs out, so that it is hidden from other peers. This must be done explicitly because the server does not track the login status of clients.
List: the client views the currently logged-in users.
Punch <client>: punch a hole to the specified user (identified by serial number).
Help: show the available commands.
4.2 Peer-to-peer client design
In network programming, the client is generally harder to write than the server, because it has to handle user events as well as communication with the server. This is especially true for a peer-to-peer client, which acts not only as a client of the server but also as the server side of peer connections.
The general idea here is that the client sends each input command to the server, receives the server's response, and runs the corresponding code. For example, when A wants to establish a communication path with B, A first sends the punch command to the server and starts sending data to B. On receiving the command, the server sends B a punch_requst message along with A's endpoint information. B then sends data to A, which opens the path, and A and B can communicate peer to peer. In testing, A and B could still communicate normally over the established path even after the server was shut down.
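The command flow described above can be sketched as a small dispatcher. This is not the code from the linked repository. It is an illustrative Python model in which the class name `BootstrapServer`, the reply strings, and the `punch_request`/`punch_reply` message format are all assumptions, and the UDP socket layer is left out so that only the command logic is shown.

```python
class BootstrapServer:
    """Command handling only; a real server would read these from a UDP socket."""

    def __init__(self):
        # Public endpoints of logged-in clients; the list index is the serial number.
        self.peers = []

    def handle(self, sender, line):
        """Return the (destination, message) datagrams the server should send."""
        parts = line.split()
        command = parts[0].lower() if parts else ""
        if command == "login":
            if sender not in self.peers:
                self.peers.append(sender)
            return [(sender, "ok login")]
        if command == "logout":
            if sender in self.peers:
                self.peers.remove(sender)  # later serial numbers shift down
            return [(sender, "ok logout")]
        if command == "list":
            rows = [f"{i} {ip}:{port}" for i, (ip, port) in enumerate(self.peers)]
            return [(sender, "\n".join(rows))]
        if command == "punch":
            valid = len(parts) == 2 and parts[1].isdecimal()
            if not valid or int(parts[1]) >= len(self.peers):
                return [(sender, "error unknown client")]
            target = self.peers[int(parts[1])]
            # Tell the target who wants to talk, and tell the sender where to aim.
            return [
                (target, f"punch_request {sender[0]}:{sender[1]}"),
                (sender, f"punch_reply {target[0]}:{target[1]}"),
            ]
        if command == "help":
            return [(sender, "commands: login logout list punch <client> help")]
        return [(sender, "error unknown command")]


server = BootstrapServer()
a = ("155.99.25.11", 62000)  # A's public endpoint as seen by the server
b = ("138.76.29.7", 31000)   # B's public endpoint as seen by the server
server.handle(a, "login")
server.handle(b, "login")
listing = server.handle(a, "list")
punch = server.handle(a, "punch 1")
print(listing)
print(punch)
```

The server never touches the peers' traffic after the punch exchange. It only hands each side the other's public endpoint, which matches the observation that the peers keep communicating after the server is shut down.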
Code
https://github.com/pannzh/P2P-Over-MiddleBoxes-Demo/tree/master