P2P principles and common implementation methods

Source: Internet
Author: User
Transferred from: http://www.cppblog.com/peakflys/archive/2013/01/25/197562.html

For the IM application planned for a later stage of our project, I have recently been studying libjingle, and along the way I read and collected a lot of material. Much of what is on the Internet either gets too tangled up in the implementation details of protocols (such as STUN and ICE) or contains many flaws. So I finally wrote this summary, hoping it will help my colleagues in the future.
If you have any questions or corrections, leave a message or email peakflys@gmail.com.
Principles of P2P implementation

First, some basic concepts.

NAT (Network Address Translator): network address translation arose as IP addresses became increasingly scarce. Its main purpose is to allow addresses to be reused. Historically, NAT falls into two categories: basic NAT and NAPT (Network Address/Port Translator).

Basic NAT came first (peakflys note: at the beginning it was really just a functional module on the router). It rests on the following observation, made in the 1990s: in a private network (domain), only a few nodes need to connect to the Internet. Only those few nodes need a globally unique IP address, and the IP addresses of the other nodes can be reused. The function of basic NAT is therefore very simple. The subnet uses a reserved IP address range, and these addresses are invisible to the outside. Only a few addresses in the subnet can be mapped to a truly globally unique IP address. When such a node needs to access the external network, basic NAT converts the node's subnet IP address into a globally unique IP address and then sends the packet out. (Basic NAT changes the source IP address in the IP packet but does not change the port.) For more information about basic NAT, see RFC 1631.

As the name suggests, NAPT changes not only the IP address of a datagram passing through the NAT device but also its TCP/UDP port. Basic NAT devices are rarely seen today (they have basically been eliminated), so NAPT is what we really need to pay attention to. Consider the following example.

There is a private network 10.*.*.*, and client A (10.0.0.1) is one of the computers in it. The Internet IP address of the network gateway (a NAT device) is 155.99.25.11 (it also has an intranet IP address, such as 10.0.0.10). Suppose a process on client A creates a UDP socket bound to port 1234 and wants to access port 1235 of the Internet host 18.181.0.31. What happens when its packet passes through the NAT?

First, the NAT changes the source IP address of the packet to 155.99.25.11. Next, the NAT creates a session for this transmission, assigns a port to the session, for example 62000, and changes the source port of the packet to 62000. (A session is an abstract concept. For TCP, a session may start with a SYN packet and end with a FIN packet. For UDP, it starts with the first UDP packet sent from that IP address and port, and when it ends depends on the specific implementation: it may be after a few minutes or after several hours.) The original packet (10.0.0.1:1234 -> 18.181.0.31:1235) therefore appears on the Internet as (155.99.25.11:62000 -> 18.181.0.31:1235).

Once the session is created, the NAT remembers that port 62000 corresponds to port 1234 of 10.0.0.1, and data sent from 18.181.0.31 to port 62000 is automatically forwarded to 10.0.0.1. (Note: only data sent from 18.181.0.31 to port 62000 is forwarded. Data sent from other IP addresses to this port is discarded by the NAT.) In this way, client A establishes a connection with server S1 (18.181.0.31).

The above is basic background. The key part follows.
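The translation described above can be sketched as a small table lookup. This is only a toy model for illustration, not how any real NAT device is implemented; the class name `Napt`, its methods, and the "other host" address 203.0.113.9 are my own assumptions, while the remaining addresses and ports come from the example.

```python
class Napt:
    """Toy NAPT: rewrites the source address/port and remembers the session."""

    def __init__(self, public_ip, first_port=62000):
        self.public_ip = public_ip
        self.next_port = first_port
        self.sessions = {}  # public port -> (internal endpoint, remote endpoint)

    def outbound(self, src, dst):
        """Translate a packet leaving the private network; return its new source."""
        for port, (internal, remote) in self.sessions.items():
            if internal == src and remote == dst:
                return (self.public_ip, port)  # the session already exists
        port = self.next_port
        self.next_port += 1
        self.sessions[port] = (src, dst)
        return (self.public_ip, port)

    def inbound(self, src, dst_port):
        """Return the internal endpoint for an arriving packet, or None if dropped."""
        session = self.sessions.get(dst_port)
        if session is None:
            return None
        internal, remote = session
        # As in the text, only the remote IP of the session may use the port.
        return internal if src[0] == remote[0] else None


nat = Napt("155.99.25.11")
client_a = ("10.0.0.1", 1234)
server_s1 = ("18.181.0.31", 1235)

print(nat.outbound(client_a, server_s1))          # ('155.99.25.11', 62000)
print(nat.inbound(server_s1, 62000))              # ('10.0.0.1', 1234)
print(nat.inbound(("203.0.113.9", 1235), 62000))  # None (some other host: dropped)
```

Keying the table by public port keeps the inbound lookup trivial, which is the direction that matters for the rest of the discussion.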

Suppose client A's original socket (the UDP socket bound to port 1234) then sends a UDP packet to another server, S2. What happens when this UDP packet passes through the NAT?

There are two possibilities. In the first, the NAT creates a new session and assigns it a new port number (for example, 62001). In the second, the NAT creates a new session but does not allocate a new port number. Instead, it reuses the port already allocated, 62000. The first kind is called symmetric NAT, and the second is called cone NAT. If your NAT is of the first kind, many P2P applications may fail. (Fortunately, most NATs today are of the second kind, that is, cone NAT.)

peakflys note: cone NAT is divided into three types:

(1) Full cone: the NAT maps all requests from the same internal IP address and port to the same external IP address and port. Any external host can send IP packets to the internal host through this mapping.

(2) Restricted cone: the NAT maps all requests from the same internal IP address and port to the same external IP address and port. However, an external host with IP address X can send IP packets to the internal host only after the internal host has first sent an IP packet to X.

(3) Port restricted cone: similar to the restricted cone, but with an additional restriction on the port number. An external host can send an IP packet with source IP address X and source port P to the internal host only after the internal host has first sent an IP packet to IP address X and port P.

We can now see that it is easy for computers in the subnet to connect to external networks through NAT (NAT is transparent: neither the computers in the subnet nor those on the Internet need to know about it). However, it is difficult for external computers to reach computers inside the subnet, and this is exactly what P2P requires. The question, then, is how to send a datagram to an intranet computer.
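To make the difference between the NAT types above concrete, here is a minimal sketch that models both the port allocation rule (symmetric versus cone) and the inbound filtering rule (full, restricted, port restricted). It is a simplified model under my own assumptions: the class `Nat`, the helper `accepts`, and the addresses used for S2 and for the unrelated hosts are hypothetical and do not come from the original example.

```python
class Nat:
    """Toy NAT with a selectable mapping and filtering behaviour."""

    def __init__(self, public_ip, kind, first_port=62000):
        # kind: "full", "restricted", "port_restricted" or "symmetric"
        self.public_ip = public_ip
        self.kind = kind
        self.next_port = first_port
        self.port_of = {}    # mapping key -> public port
        self.owner = {}      # public port -> internal endpoint
        self.contacted = {}  # public port -> remote endpoints already sent to

    def send(self, src, dst):
        """Outbound packet: return the public source endpoint it leaves with."""
        # Symmetric NAT keys the mapping on the destination too; cone NAT does not.
        key = (src, dst) if self.kind == "symmetric" else src
        if key not in self.port_of:
            self.port_of[key] = self.next_port
            self.owner[self.next_port] = src
            self.contacted[self.next_port] = set()
            self.next_port += 1
        port = self.port_of[key]
        self.contacted[port].add(dst)
        return (self.public_ip, port)

    def receive(self, remote, public_port):
        """Inbound packet: return the internal endpoint, or None if filtered."""
        if public_port not in self.owner:
            return None
        peers = self.contacted[public_port]
        if self.kind == "full":
            allowed = True
        elif self.kind == "restricted":
            allowed = remote[0] in {ip for ip, _ in peers}
        else:  # port_restricted and symmetric: the exact IP and port must match
            allowed = remote in peers
        return self.owner[public_port] if allowed else None


a = ("10.0.0.1", 1234)
s1 = ("18.181.0.31", 1235)
s2 = ("198.51.100.7", 1235)           # hypothetical address for server S2
stranger = ("203.0.113.9", 4000)      # a host client A never contacted
same_ip_other_port = (s1[0], 9999)    # S1's IP address, but a different port


def accepts(kind, remote):
    """After A has sent one packet to S1, does the NAT let `remote` reach A?"""
    nat = Nat("155.99.25.11", kind)
    _, port = nat.send(a, s1)
    return nat.receive(remote, port) is not None


for kind in ("symmetric", "full"):
    nat = Nat("155.99.25.11", kind)
    print(kind, nat.send(a, s1)[1], nat.send(a, s2)[1])
# symmetric 62000 62001
# full 62000 62000

print(accepts("full", stranger))                       # True
print(accepts("restricted", stranger))                 # False
print(accepts("restricted", same_ip_other_port))       # True
print(accepts("port_restricted", same_ip_other_port))  # False
```

In this model the symmetric case is given the strictest filtering, which is an assumption of the sketch rather than something stated in the text.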
First, we must punch a "hole" in the intranet's NAT (that is, create a session on the NAT, as described earlier). This hole cannot be punched from the outside. It can only be punched by a host inside the intranet. The hole also has a direction. For example, when an internal host (such as 192.168.0.10) sends a UDP packet to an external IP address (such as 219.237.60.1), a hole in the direction of 219.237.60.1 is opened on the intranet's NAT device (this is the so-called UDP hole punching technique). From then on, 219.237.60.1 can reach 192.168.0.10 on the intranet through this hole. (Other IP addresses cannot use this hole.)

Common P2P implementations

1. Plain direct-connection P2P

With the theory above, only one step remains: communication between two intranet hosts. This is a chicken-and-egg problem: neither side can initiate the connection, because neither knows the other's Internet address. So how do we punch the hole? We need an intermediary that both intranet hosts can contact.

Let's walk through the process used by a P2P application. First, client A logs on to server S, and NAT A assigns port 60000 to this session, so the address of client A as seen by server S is 202.187.45.3:60000. This is client A's Internet address. Likewise, client B logs on to server S, and NAT B assigns port 40000 to this session, so the address of client B as seen by server S is 187.34.1.56:40000. At this point, both client A and client B can communicate with server S.

Suppose client A now wants to send a message directly to client B. It can obtain B's Internet address, 187.34.1.56:40000, from server S. Will client B receive a message that client A sends to this address? The answer is no: NAT B discards the message, because it is unsolicited, and for security reasons most NATs discard such packets.
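The directional hole and the login scenario above can be simulated in a few lines. The sketch below assumes both NATs are port restricted cone NATs. The class `ConeNat`, the private addresses of the two clients, and the address of server S are hypothetical; the public addresses and ports are the ones from the scenario.

```python
class ConeNat:
    """Toy port restricted cone NAT: one public port per internal endpoint."""

    def __init__(self, public_ip, first_port):
        self.public_ip = public_ip
        self.next_port = first_port
        self.port_of = {}  # internal endpoint -> public port
        self.owner = {}    # public port -> internal endpoint
        self.holes = {}    # public port -> remote endpoints this host has sent to

    def send(self, src, dst):
        """Outbound packet: punches a hole toward dst, returns the public source."""
        if src not in self.port_of:
            self.port_of[src] = self.next_port
            self.owner[self.next_port] = src
            self.holes[self.next_port] = set()
            self.next_port += 1
        port = self.port_of[src]
        self.holes[port].add(dst)
        return (self.public_ip, port)

    def receive(self, remote, public_port):
        """Inbound packet: delivered only through a hole opened toward remote."""
        if remote in self.holes.get(public_port, ()):
            return self.owner[public_port]
        return None


nat_a = ConeNat("202.187.45.3", 60000)
nat_b = ConeNat("187.34.1.56", 40000)
client_a = ("192.168.0.10", 5000)  # hypothetical private endpoints
client_b = ("192.168.1.20", 5000)
server_s = ("198.51.100.1", 9000)  # hypothetical address of server S

# Both clients log on; server S records the public endpoints it sees.
seen_a = nat_a.send(client_a, server_s)  # ('202.187.45.3', 60000)
seen_b = nat_b.send(client_b, server_s)  # ('187.34.1.56', 40000)

# A sends straight to B's public endpoint: NAT B has no hole toward A yet.
from_a = nat_a.send(client_a, seen_b)
first_try = nat_b.receive(from_a, seen_b[1])   # None: dropped as unsolicited

# B sends a packet toward A's public endpoint, which opens a hole on NAT B.
from_b = nat_b.send(client_b, seen_a)
second_try = nat_b.receive(from_a, seen_b[1])  # now delivered to client B

print(first_try, second_try)  # None ('192.168.1.20', 5000)
```

Because client A already sent a packet toward B's public endpoint, the packet B sends in the last step also passes NAT A, so after this exchange traffic can flow in both directions.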
We therefore need to punch a hole on NAT B in the direction of 202.187.45.3 (client A's Internet address), so that messages client A sends to 187.34.1.56:40000 can reach client B. Who issues this punching command? Naturally, server S.

To sum up the process: if client A wants to send a message to client B, client A sends a command to server S, asking it to tell client B to punch a hole toward client A. Client A can then communicate with client B through client B's Internet address.

Note: the process above applies only to cone NAT. With symmetric NAT, when client B punches a hole toward client A, a new port is assigned, and client B has no way of learning this port (and therefore neither does client A). (If the symmetric NAT allocates ports sequentially, we might be able to guess the port number, but too many factors can make the guess fail, so in this case P2P is generally abandoned. peakflys)

2. P2P implementation with STUN

STUN is a NAT traversal method specified in RFC 3489. It uses an auxiliary mechanism to discover the NAT's IP address and port. It undoubtedly played a huge role in early NAT traversal and will continue to have a place in it.

STUN requires a STUN server with a public IP address. The UAC behind the NAT exchanges several UDP packets with the server. These packets carry the information the UAC needs to know, such as the NAT's Internet IP address and port. The UAC determines its NAT type from whether it receives each UDP packet and from the data in the packets.

Assume the following setup: a UAC (B), a NAT (A), and a server (C). The UAC's IP address is IPB, the NAT's IP address is IPA, and the server's IP addresses are IPC1 and IPC2. Note that server C has two IP addresses. The reason will become clear later.

(1) NAT detection process

Step 1: B sends a UDP packet to port 1 of IPC1 on C.
After receiving the packet, C writes the source IP address and port of the received packet into a UDP packet and sends it back to B from IPC1 and port 1. This IP address and port are the NAT's Internet IP address and port. In other words, step 1 already gives you the NAT's Internet address.

Anyone familiar with how NAT works knows that B is bound to receive the UDP packet that C returns. If your application sends a packet to a STUN server and receives no response at all, there are only two possibilities: 1. the STUN server does not exist, or you have the wrong port; 2. your NAT device rejects all UDP packets coming in from outside. Firewall rules aside, a NAT device that behaves this way must be broken.

When B receives the response, it compares the IP address in the packet with its own IP address. If they are the same, B is not behind a NAT. The next step is then to detect the firewall type, which I will not go into here. If they are different, a NAT exists, and we move on to step 2.

Step 2: B sends a UDP packet to IPC1 on C, asking C to return a UDP packet to B from the other address, IPC2, and from a port different from the one used in step 1. (Now you know why C has two IP addresses: they are needed to detect the cone NAT type.)

Let's analyze. What does it mean if B receives this packet? It means the NAT accepts everything and filters no packets, which is full cone NAT in the STUN standard. Unfortunately, full cone NATs are rare, so you are unlikely to receive this packet. If B does not receive it, we move on to step 3.

Step 3: B sends a packet to port 2 of IPC2 on C. After receiving it, C writes the source IP address and port of the received packet into a UDP packet and returns it to B from IPC2 and port 2.

As in step 1, B is certain to receive this response. The port in this packet is the data we care about most.
Let's analyze it. If this port is the same as the port obtained in step 1, the NAT is certainly a cone NAT. Otherwise it is a symmetric NAT. The reasoning is simple: by the rule of symmetric NAT, whenever the destination IP address and port change, the NAT allocates a new port, and in step 3 we changed both the destination IP address and the port relative to step 1. For a symmetric NAT, the two ports must therefore be different.

If the ports differ at this step in your application, you can only give up on P2P, for the same reason as in the direct-connection implementation above. If they are the same, only restricted cone and port restricted cone remain, and step 4 distinguishes between them.

Step 4: B sends a request packet to a port PD of IPC2 on C, asking C to return a packet to B from IPC2 but from a port different from PD.

Analysis: if B receives the packet, the NAT lets UDP packets through as long as the IP address matches, even if the port differs. This is clearly restricted cone NAT. If B does not receive the packet, there is nothing more to say: it is port restricted cone NAT.

Protocol implementation algorithm

The flow of the algorithm is shown in the following diagram:

[Figure: flow diagram of the NAT type detection algorithm]
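The four detection steps can also be written down as a single decision function. This is a minimal sketch that only encodes the decisions described in the steps. It does not send any packets, and the function name `classify_nat`, its parameters, and the result strings are my own assumptions rather than anything defined by RFC 3489 or libjingle.

```python
from typing import Optional, Tuple

Endpoint = Tuple[str, int]


def classify_nat(local: Endpoint,
                 mapped_step1: Optional[Endpoint],
                 reply_from_other_ip_and_port: bool,
                 mapped_step3: Optional[Endpoint],
                 reply_from_other_port: bool) -> str:
    """Classify the NAT from the observations made in steps 1 to 4.

    mapped_step1 / mapped_step3: the address C reported in steps 1 and 3
    (None if no response arrived). The two booleans say whether B received
    the reply requested in step 2 and in step 4 respectively.
    """
    if mapped_step1 is None:
        return "UDP blocked"           # step 1: no response at all
    if mapped_step1 == local:
        return "no NAT"                # step 1: the addresses match
    if reply_from_other_ip_and_port:
        return "full cone"             # step 2: nothing is filtered
    if mapped_step3 != mapped_step1:
        return "symmetric"             # step 3: the mapping changed
    if reply_from_other_port:
        return "restricted cone"       # step 4: only the IP is checked
    return "port restricted cone"      # step 4: IP and port are checked


local = ("192.168.0.10", 5000)   # hypothetical private endpoint of B
mapped = ("202.187.45.3", 60000)  # hypothetical address reported by C

print(classify_nat(local, mapped, True, mapped, True))
# full cone
print(classify_nat(local, mapped, False, ("202.187.45.3", 60001), False))
# symmetric
print(classify_nat(local, mapped, False, mapped, True))
# restricted cone
print(classify_nat(local, mapped, False, mapped, False))
# port restricted cone
```

A real client would fill in these arguments by actually performing the four exchanges with the STUN server, with timeouts deciding the two booleans.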
In the diagram, once the path reaches a red node, UDP communication is impossible. (peakflys note: apart from the firewall-blocked case, it is still possible to establish P2P in the other cases, but the cost is too high, so it is generally abandoned.) Once the path reaches a yellow or green node, a connection is possible. Finally, with the STUN server we have obtained our own NAT type, public IP address, and port, and from there it is easy to establish a P2P connection.

peakflys note: libjingle establishes P2P connections through ICE and STUN. More about libjingle to be continued...

References:
1. Wikipedia: STUN
2. http://midcom-p2p.sourceforge.net/draft-ford-midcom-p2p-01.txt (shootingstars)
