Principles and implementation of UDP-based P2P NAT penetration, enhanced edition (with modified source code)



Keywords: P2P, UDP, NAT, principle, penetration, traversal, symmetric, cone
Author: hwycheng LEO (FlashBT@Hotmail.com)
Source code download: http://bbs.hwysoft.com/download/UDP-NAT-LEO.rar
Reference: http://midcom-p2p.sourceforge.net/draft-ford-midcom-p2p-01.txt
Principles and implementation of P2P UDP-based NAT penetration (shootingstars)

Description:

There is little Chinese-language documentation online about UDP-based NAT penetration. The only article of practical reference value is "Principles and implementation of P2P UDP-based NAT penetration" by shootingstars. I have worked on P2P development for the past two years; my representative work is my personal BitTorrent download client, FlashBT. Readers interested in P2P downloading or P2P development can visit the software's official homepage, http://www.hwysoft.com/chs/, and download it to take a look; you may find it useful. I wrote this article mainly to answer questions from a number of readers in one place, which saves everyone's time and makes the topic easier to read and understand for anyone interested in P2P UDP penetration. If you are interested or have relevant experience, you can email me or leave a message on my personal blog: http://hwycheng.blogchina.com. You may repost this article freely, but please keep this description.

Thanks again to the early contribution of shootingstars.

--------------------------------------------------------------------------------

What is NAT (Network Address Translator), and why does it matter?

For details on NAT, see RFC 1631 (http://www.faqs.org/rfcs/rfc1631.html), the most authoritative definition and explanation of NAT. Networking terminology is abstract and difficult; unless you are a specialist, it is hard to work out what NAT means from the name alone.

To understand the role of NAT, we first need to understand the two categories of IP addresses. Private IP addresses are called intranet IP addresses here; non-private IP addresses are called public IP addresses. For an introduction to the concept and role of IP addresses, see my other article: http://hwycheng.blogchina.com/2402121.html.

Intranet IP address: a private IP address from the reserved class A, B, and C ranges (10.0.0.0/8, 172.16.0.0/12, and 192.168.0.0/16). Such an address is not globally unique and cannot be reached directly by other hosts on the Internet.
Public IP address: a globally unique IP address that other hosts can reach directly.
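
As a small illustration of the two categories, the sketch below uses Python's standard `ipaddress` module to label the addresses that appear in this article's figures. The helper name `classify` is made up for this example. Note that `is_private` also reports a few other reserved ranges (loopback, link-local) as private, so this is only an approximation of the intranet/public split described above.

```python
import ipaddress

def classify(addr: str) -> str:
    """Label an address 'intranet' if it is in a private range, else 'public'."""
    return "intranet" if ipaddress.ip_address(addr).is_private else "public"

# Addresses taken from the figures in this article.
for addr in ("192.168.0.5", "192.168.0.6", "61.51.99.86", "61.51.76.102"):
    print(addr, "->", classify(addr))
```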

NAT was originally designed to let computers with intranet IP addresses reach the external network through a small number of computers with public IP addresses. For IP packets sent from intranet computers to the external network, NAT rewrites the source IP address to its own public IP address and leaves the destination IP address unchanged; the packet is then forwarded to the router and eventually reaches the external computer. For IP packets returned by the external computer, NAT rewrites the destination IP address to the intranet IP address and leaves the source IP address unchanged; the packet is finally delivered to the intranet computer.

+-------------+                   +-------------+
| 192.168.0.5 | intranet host     | 192.168.0.6 | intranet host
+-------------+                   +-------------+
       ^ port: 2809                      ^ port: 1827
       |                                 |
       v                                 v
+-------------+                   +-------------+
| 192.168.0.1 | NAT device        | 192.168.0.2 | NAT device
| 61.51.99.86 |                   | 61.51.77.66 |
+-------------+                   +-------------+
       ^                                 ^
       |                                 |
       v port: 80                        v port: 80
+--------------+                  +--------------+
| 61.51.202.88 | Internet host    | 61.51.76.102 | Internet host
+--------------+                  +--------------+

Figure 1: NAT lets computers with private IP addresses share a few public IP addresses to access the Internet.

As networks became widespread, the limits of IPv4 were exposed: public IP addresses became a scarce resource. This in turn exposed a limitation of NAT: at any given moment, one public IP address can be used by only one computer with a private IP address. NAPT (Network Address/Port Translator) was created in response. NAPT lets multiple computers with private IP addresses access the Internet through a single public IP address at the same time, which temporarily relieves the shortage of IPv4 addresses.

For TCP/UDP packets sent from intranet computers, NAPT rewrites the source IP address to its own public IP address and the source port to one of its own ports. The destination IP address and port remain unchanged, and the packet is forwarded to the router and on to the external computer. For packets returned by the external computer, NAPT rewrites the destination IP address to the intranet IP address and the destination port to the intranet computer's port. The source IP address and source port remain unchanged, and the packet is finally delivered to the intranet computer.


+-------------+                   +-------------+
| 192.168.0.5 | intranet host     | 192.168.0.6 | intranet host
+-------------+                   +-------------+
    port: 2809 ^                   ^ port: 1827
                \                 /
                 v               v
                +-------------+
                | 192.168.0.1 | NAT device
                | 61.51.99.86 |
                +-------------+
                 ^               ^
  map port 9882 /                 \ map port 9881
  to 192.168.0.5:2809              to 192.168.0.6:1827
               v                   v
      port: 80                     port: 80
+--------------+                  +--------------+
| 61.51.202.88 | Internet host    | 61.51.76.102 | Internet host
+--------------+                  +--------------+

Figure 2: NAPT lets computers with private IP addresses share a single public IP address to access the Internet.
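
The address and port rewriting described above can be sketched as a toy translation table. This is only an illustration of the idea, not how any real NAPT device is implemented; the class name `Napt`, its methods, and the sequential port allocation starting at 9881 are assumptions chosen so that the output matches the mappings in Figure 2.

```python
from typing import Dict, Optional, Tuple

Endpoint = Tuple[str, int]  # (IP address, port)

class Napt:
    """Toy NAPT: one public IP, one public port per intranet endpoint."""

    def __init__(self, public_ip: str, first_port: int = 9881) -> None:
        self.public_ip = public_ip
        self.next_port = first_port               # assumption: ports handed out in order
        self.out_map: Dict[Endpoint, int] = {}    # intranet endpoint -> public port
        self.in_map: Dict[int, Endpoint] = {}     # public port -> intranet endpoint

    def outbound(self, src: Endpoint, dst: Endpoint) -> Tuple[Endpoint, Endpoint]:
        """Rewrite the source of an outgoing packet; the destination is untouched."""
        if src not in self.out_map:
            self.out_map[src] = self.next_port
            self.in_map[self.next_port] = src
            self.next_port += 1
        return (self.public_ip, self.out_map[src]), dst

    def inbound(self, src: Endpoint, dst: Endpoint) -> Optional[Tuple[Endpoint, Endpoint]]:
        """Rewrite the destination of a returning packet; the source is untouched."""
        if dst[0] != self.public_ip or dst[1] not in self.in_map:
            return None  # no mapping for this port: the packet is dropped
        return src, self.in_map[dst[1]]

napt = Napt("61.51.99.86")
print(napt.outbound(("192.168.0.6", 1827), ("61.51.76.102", 80)))
print(napt.outbound(("192.168.0.5", 2809), ("61.51.202.88", 80)))
print(napt.inbound(("61.51.76.102", 80), ("61.51.99.86", 9881)))
```

The first two calls allocate public ports 9881 and 9882; the third shows a reply addressed to port 9881 being steered back to 192.168.0.6:1827.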
 
The role of NAPT can be seen everywhere in our work and daily life. Most company networks use one or more NAPT-capable routers to connect all of the company's computers to the Internet. As I write this article at home, I am using an IBM laptop that reaches the Internet through a desktop with a broadband connection. This article deals mainly with NAPT.

Why does NAPT (Network Address/Port Translator) hinder P2P software?

NAPT dictates that only computers inside it can initiate connections to hosts outside it; external hosts cannot connect directly to computers inside the NAPT. For IM (instant messaging) software, this means that a computer inside the NAPT and one outside it can only communicate by relaying data through a server. For P2P download programs, it means that computers inside the NAPT cannot accept connections from outside, which leads to few connections and slow downloads. P2P software must therefore solve, at least to some extent, the problem that computers inside a NAPT cannot be reached from outside.

What is the principle of UDP penetration through NAT (Network Address Translator)?

TCP/IP transport mainly uses two protocols, TCP and UDP. TCP is a reliable, connection-oriented protocol; UDP is an unreliable, connectionless protocol. Given how the two protocols work, NAPT penetration mainly refers to UDP. It is also possible with TCP, but it is much less feasible and more demanding, so it is not discussed here; if you are interested, search Google, where several articles cover the topic. Here is the principle of penetrating NAPT with UDP:

+-------------+                   +-------------+
| 192.168.0.5 | intranet host     | 192.168.0.6 | intranet host
+-------------+                   +-------------+
 UDP port: 2809 ^                  ^ UDP port: 1827
                 \                /
                  v              v
                +-------------+
                | 192.168.0.1 | NAT device
                | 61.51.99.86 |
                +-------------+
                  ^              ^
                 /                \
                v                  v
left path:  Session (192.168.0.6:1827 <-> 61.51.76.102:8098), map port 9882 to 192.168.0.5:2809
right path: Session (192.168.0.6:1827 <-> 61.51.76.102:8098), map port 9881 to 192.168.0.6:1827
 UDP port: 8098                    UDP port: 8098
+--------------+                  +--------------+
| 61.51.202.88 | Internet host    | 61.51.76.102 | Internet host
+--------------+                  +--------------+


Figure 3: How NAPT transparently relays UDP packets between hosts with private IP addresses and public hosts.

Notes on how NAPT transparently relays UDP packets:

NAPT assigns each session one of its own port numbers. Based on that port number, it decides which intranet computer a packet returned by the public host should be forwarded to. The session here is virtual, since UDP communication does not establish a connection, but NAPT still needs the concept of a session. A key issue in relaying UDP packets transparently is how to manage this virtual session. A TCP session begins with a SYN packet and ends with a FIN packet, so NAPT can easily track the lifecycle of a TCP session. UDP is more troublesome: NAPT does not know whether a forwarded UDP packet has reached the target host. Because UDP is unreliable, NAPT must also keep the session alive for a while so that it can wait for data coming back from outside and forward it to the intranet computer that sent the original request. How does NAPT time out a UDP session? Implementations differ between vendors: the timeout may be a few minutes or a few hours, and some NAPT implementations compute the timeout dynamically based on how busy the device is.
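
To make the idea of a "virtual" UDP session with an idle timeout concrete, here is a minimal sketch. The class `SessionTable`, its methods, and the 120-second timeout are all hypothetical; as noted above, real devices choose their own timeouts. The clock is passed in explicitly so the example runs instantly.

```python
from typing import Dict, List

class SessionTable:
    """Toy UDP session tracker: a session lives until it has been idle too long."""

    def __init__(self, idle_timeout: float) -> None:
        self.idle_timeout = idle_timeout
        self.last_seen: Dict[int, float] = {}  # public port -> time of last packet

    def touch(self, public_port: int, now: float) -> None:
        """Record traffic on a session, creating it or refreshing its timer."""
        self.last_seen[public_port] = now

    def is_alive(self, public_port: int, now: float) -> bool:
        seen = self.last_seen.get(public_port)
        return seen is not None and now - seen <= self.idle_timeout

    def expire(self, now: float) -> List[int]:
        """Remove and return the ports of all sessions idle longer than the timeout."""
        dead = [p for p, t in self.last_seen.items() if now - t > self.idle_timeout]
        for p in dead:
            del self.last_seen[p]
        return dead

table = SessionTable(idle_timeout=120.0)   # assumed value, for illustration only
table.touch(9881, now=0.0)                 # first outgoing packet creates the session
table.touch(9881, now=100.0)               # later traffic refreshes the timer
print(table.is_alive(9881, now=200.0))     # 100 s idle, still within the timeout
print(table.expire(now=500.0))             # 400 s idle, the session is removed
```

One practical consequence of this behavior is that a client that wants to stay reachable through its mapped port has to send traffic often enough to keep the session from expiring.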

[192.168.0.6:1827]
        |  UDP packet [src IP: 192.168.0.6  src port: 1827  dst IP: 61.51.76.102  dst port: 8098]
        v
[pub IP: 61.51.99.86] NAT [priv IP: 192.168.0.1]
        |  UDP packet [src IP: 61.51.99.86  src port: 9881  dst IP: 61.51.76.102  dst port: 8098]
        v
[61.51.76.102:8098]

Figure 4: NAPT rewrites the source address and source port of an outgoing UDP packet and forwards it to the public host.


[192.168.0.6:1827]
        ^
        |  UDP packet [src IP: 61.51.76.102  src port: 8098  dst IP: 192.168.0.6  dst port: 1827]
[pub IP: 61.51.99.86] NAT [priv IP: 192.168.0.1]
        ^
        |  UDP packet [src IP: 61.51.76.102  src port: 8098  dst IP: 61.51.99.86  dst port: 9881]
[61.51.76.102:8098]

Figure 5: NAPT rewrites the destination address and destination port of the UDP packet returned by the public host and forwards it to the intranet computer.

Now we understand how NAPT provides transparent communication between intranet computers and Internet hosts. Next comes the most important question: what policy does NAPT use to decide whether an outgoing UDP packet needs a new session? There are several cases:

A. The source addresses (intranet IP addresses) differ. Regardless of other factors, NAPT must use different sessions.
B. The source address (intranet IP address) is the same but the source ports differ. Regardless of other factors, NAPT must use different sessions.
C. The source address (intranet IP address), the source port, and the destination address (public IP address) are the same, but the destination ports differ. NAPT must use the same session.
D. The source address (intranet IP address) and the source port are the same, but the destination addresses (public IP addresses) differ. Ignoring the destination port, how does NAPT handle the session?

Case D is the one we care about and want to discuss. Based on how the destination address (public IP address) affects session creation, we divide NAPT devices into two categories:

Symmetric NAPT:
Connections to the same IP address share one session, whatever the destination port; connections to different IP addresses use different sessions, whatever the destination port.
We call this kind symmetric NAPT. In other words, packets sent from the same locally bound UDP port to different destination IP addresses create different sessions.

[202.223.98.78:9696]  [202.223.98.78:9696]  [202.223.98.78:9696]
         ^                     ^                     ^
         |                     |                     |
         v                     v                     v
        9883                  9882                  9881
          \                    |                    /
           +----------------[NAT]----------------+
                               ^
                               |
                               v
                      [192.168.0.6:1827]

Figure 6: Symmetric NAPT. Several ports correspond to several hosts, in parallel and symmetric.

Cone NAPT:
Connections to the same IP address share one session, whatever the destination port; connections to different IP addresses also share that same session, whatever the destination port.
We call this kind cone NAPT. In other words, as long as the locally bound UDP port is the same, outgoing packets use the same session regardless of the destination address.

[202.223.98.78:9696]  [202.223.98.78:9696]  [202.223.98.78:9696]
         ^                     ^                     ^
          \                    |                    /
           \                   |                   /
            v                  v                  v
                             9881
                             [NAT]
                               ^
                               |
                               v
                      [192.168.0.6:1827]

Figure 7: Cone NAPT. One port corresponds to several hosts, which looks like a cone.
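
The difference between the two categories comes down to which fields identify a session. The sketch below follows this article's definitions: a cone NAPT keys the session on (intranet IP, intranet port) only, while a symmetric NAPT also includes the destination IP address. The class `PortAllocator` and the three destination addresses (documentation addresses from 203.0.113.0/24) are made up for illustration.

```python
from typing import Dict, List

class PortAllocator:
    """Toy model of how a NAPT picks the public port for an outgoing UDP packet."""

    def __init__(self, kind: str, first_port: int = 9881) -> None:
        if kind not in ("cone", "symmetric"):
            raise ValueError(f"unknown NAPT kind: {kind}")
        self.kind = kind
        self.next_port = first_port
        self.sessions: Dict[tuple, int] = {}

    def public_port(self, src_ip: str, src_port: int, dst_ip: str, dst_port: int) -> int:
        # Cone: the destination plays no part in the session key.
        # Symmetric (as defined in this article): the destination IP is part
        # of the key, the destination port is not.
        if self.kind == "cone":
            key = (src_ip, src_port)
        else:
            key = (src_ip, src_port, dst_ip)
        if key not in self.sessions:
            self.sessions[key] = self.next_port
            self.next_port += 1
        return self.sessions[key]

def ports_for(kind: str) -> List[int]:
    """Send from one intranet endpoint to three hypothetical hosts; list the public ports used."""
    nat = PortAllocator(kind)
    hosts = ("203.0.113.1", "203.0.113.2", "203.0.113.3")
    return [nat.public_port("192.168.0.6", 1827, ip, 9696) for ip in hosts]

print("cone:     ", ports_for("cone"))
print("symmetric:", ports_for("symmetric"))
```

With the cone allocator all three destinations share port 9881, as in Figure 7; with the symmetric allocator each destination gets its own port, as in Figure 6.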

Currently the vast majority of NAPT devices belong to the latter category, cone NAT. During testing I did come across one symmetric NAT device. The NAPT built into Win9x/2K/XP/2003 is also cone NAT. This is fortunate, because the UDP penetration described here can only be done between cone NATs. If either side is not a cone NAT, UDP penetration is hopeless and traffic has to be relayed through a server. A detailed analysis follows later.

Next we analyze some of the data structures NAPT uses while working, which form the basis for UDP penetration of cone NAT. The data structures described here only illustrate the principle and have no value as an implementation reference. If you are really interested, read the NAT source code in Linux. A real NAT implementation does not use a database either.

The port mapping data structure of a working symmetric NAPT is as follows:

Intranet information table:

[NAPT allocated port] [Intranet IP address] [Intranet port] [Internet IP address] [Session start time]

Primary Key ([NAPT allocated port]) -> a primary key is created on [NAPT allocated port]. It must be unique, and it is indexed to speed up lookups.
Unique ([Intranet IP address], [Intranet port]) -> the combination of these two fields cannot be duplicated.
Unique ([Intranet IP address], [Intranet port], [Internet IP address]) -> the combination of these three fields cannot be duplicated.

Mapping table:

[NAPT allocated port] [Internet port]

Unique ([NAPT allocated port], [Internet port]) -> the combination of these two fields cannot be duplicated.

The port mapping data structure of a working cone NAPT is as follows:

Intranet information table:

[NAPT allocated port] [Intranet IP address] [Intranet port] [Session start time]

Primary Key ([NAPT allocated port]) -> a primary key is created on [NAPT allocated port]. It must be unique, and it is indexed to speed up lookups.
Unique ([Intranet IP address], [Intranet port]) -> the combination of these two fields cannot be duplicated.

Internet information table:

[WID primary key ID] [Internet IP address] [Internet port]

Primary Key ([WID primary key ID]) -> a primary key is created on [WID primary key ID]. It must be unique, and it is indexed to speed up lookups.
Unique ([Internet IP address], [Internet port]) -> the combination of these two fields cannot be duplicated.

Mapping table: one-to-many

[NAPT allocated port] [WID primary key ID]

Unique ([NAPT allocated port], [WID primary key ID]) -> the combination of these two fields cannot be duplicated.
Unique ([WID primary key ID]) -> this field cannot be duplicated.

Do the data structures above make things clearer, or just make you dizzy? It should become clear shortly. Going out through NAT is easy for intranet computers: NAPT handles it automatically, and our applications need not care how. But how can an external computer reach a computer inside the intranet? Consider the following process:

C is an intranet computer behind a NAPT, and S is a computer with a public IP address. C sends a connection request to S. The NAPT records the request in its own data structure according to the rules described above and creates a session. From then on, data can flow transparently in both directions between C and S, as shown below:

C [192.168.0.6:1827] <-> [priv IP: 192.168.0.1] NAPT [pub IP: 61.51.99.86:9881] <-> S [61.51.76.102:8098]

As this shows, for a computer with a public IP address to communicate with an intranet computer behind a NAPT, the intranet computer must first send a UDP packet to the public computer. The public computer learns the NAPT's public IP address and mapped port from the UDP packet it receives, and from then on it can communicate transparently with the intranet computer.
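
The server side of this step needs nothing more than the source address that `recvfrom` reports. The sketch below shows it with Python's standard `socket` module; the function name `serve_once` and the reply format are assumptions for illustration. It runs on the loopback interface, so no NAPT is involved and the server simply sees the client's local port. Behind a real NAPT, the same call would report the NAPT's public IP address and mapped port.

```python
import socket

def serve_once(server: socket.socket) -> tuple:
    """Receive one datagram and tell the sender which address it appeared to come from."""
    _data, observed = server.recvfrom(1024)       # observed = (source IP, source port)
    server.sendto(f"{observed[0]}:{observed[1]}".encode(), observed)
    return observed

server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
server.bind(("127.0.0.1", 0))                     # port 0: let the OS pick a free port
server.settimeout(2.0)

client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
client.settimeout(2.0)
client.sendto(b"hello", server.getsockname())     # C sends the first packet to S
client_port = client.getsockname()[1]             # the port the OS bound for the client

observed = serve_once(server)                     # S learns where the packet came from
reply, _ = client.recvfrom(1024)                  # S can now send data back to C
print("server saw the client at", reply.decode())

client.close()
server.close()
```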

Now for the question we care about most: how can two intranet computers, each behind its own NAPT, communicate directly? Neither can accept a connection request sent by the other, and neither knows the public IP address of the other's NAPT or the port mapped on it. We therefore need a server with a public IP address to help them establish a connection. When the two intranet computers each contact the server, the server learns from the received UDP packets the public IP address of each NAPT device and the mapped port of each session. The two intranet computers can then obtain the public IP address and mapped port of each other's NAPT device from the server.
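
The server's bookkeeping for this exchange can be sketched as a small registry. The class `Rendezvous`, the peer names, and the method names are hypothetical; the endpoints reuse addresses from this article's figures purely as sample data. No networking is done here, only the record keeping.

```python
from typing import Dict, Optional, Tuple

Endpoint = Tuple[str, int]  # (public IP of the NAPT, mapped port)

class Rendezvous:
    """Toy directory kept by the public server: peer name -> observed public endpoint."""

    def __init__(self) -> None:
        self.peers: Dict[str, Endpoint] = {}

    def register(self, name: str, observed: Endpoint) -> None:
        """Store the source address seen on a peer's UDP packet."""
        self.peers[name] = observed

    def lookup(self, name: str) -> Optional[Endpoint]:
        return self.peers.get(name)

    def introduce(self, a: str, b: str) -> Dict[str, Endpoint]:
        """Return, for each of the two peers, the endpoint it should send to."""
        return {a: self.peers[b], b: self.peers[a]}

server = Rendezvous()
server.register("A", ("61.51.99.86", 9881))   # as seen through A's NAPT
server.register("B", ("61.51.77.66", 9882))   # as seen through B's NAPT
print(server.introduce("A", "B"))
```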

Assume the two intranet computers are A and B, and their NAPT devices are AN and BN. Suppose A has obtained BN's IP address and the port mapped for B. What happens when A immediately sends a UDP packet to that IP address and port? From the principles and data structures above, AN creates a record in its own data structure to mark the new session. When BN receives the packet, it searches its own data structure, finds no matching record, and discards the packet. B is a little slower: only now does it send a UDP packet to AN's IP address and mapped port. What is the result? Exactly what we hoped for: AN finds a matching record in its data structure when the packet arrives, so it passes the packet on to A. When A sends to B again, everything flows freely. Job done! But wait: what about cone NAPT? Try analyzing it yourself...
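
The three-packet exchange above can be replayed in a small simulation. This is a sketch under stated assumptions: both NAPT devices are cone NAPTs, and an incoming packet is accepted only if the intranet host has already sent to that exact remote IP address and port, which mirrors the cone mapping table described earlier. The class `ConeNat`, B's intranet address, and the starting port 7001 for BN are made up for illustration.

```python
from typing import Dict, Optional, Set, Tuple

Endpoint = Tuple[str, int]

class ConeNat:
    """Toy cone NAPT: one public port per intranet endpoint, plus a record of contacted hosts."""

    def __init__(self, public_ip: str, first_port: int) -> None:
        self.public_ip = public_ip
        self.next_port = first_port
        self.port_of: Dict[Endpoint, int] = {}          # intranet endpoint -> public port
        self.host_of: Dict[int, Endpoint] = {}          # public port -> intranet endpoint
        self.contacted: Dict[int, Set[Endpoint]] = {}   # public port -> remote endpoints sent to

    def send(self, src: Endpoint, dst: Endpoint) -> Endpoint:
        """Outgoing packet: reuse the single mapping for src, remember dst, return the public source."""
        if src not in self.port_of:
            self.port_of[src] = self.next_port
            self.host_of[self.next_port] = src
            self.contacted[self.next_port] = set()
            self.next_port += 1
        port = self.port_of[src]
        self.contacted[port].add(dst)
        return (self.public_ip, port)

    def receive(self, src: Endpoint, dst_port: int) -> Optional[Endpoint]:
        """Incoming packet: deliver it only if the intranet host has already sent to src."""
        if src in self.contacted.get(dst_port, set()):
            return self.host_of[dst_port]
        return None  # no matching record: the packet is discarded

server = ("61.51.76.102", 8098)
a, b = ("192.168.0.6", 1827), ("192.168.1.7", 2000)
an, bn = ConeNat("61.51.99.86", 9881), ConeNat("61.51.77.66", 7001)

a_pub = an.send(a, server)    # the server learns A's public endpoint
b_pub = bn.send(b, server)    # the server learns B's public endpoint

first = bn.receive(an.send(a, b_pub), b_pub[1])    # A -> B: BN has no record yet
second = an.receive(bn.send(b, a_pub), a_pub[1])   # B -> A: AN finds A's earlier record
third = bn.receive(an.send(a, b_pub), b_pub[1])    # A -> B again: BN now has a record
print(first, second, third)
```

The first packet is dropped by BN but leaves a record on AN, which is what lets B's packet through; B's packet in turn leaves a record on BN for A's next attempt.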

A case-by-case analysis of UDP penetration through NAPT (Network Address/Port Translator)

First, based on the description above, we divide NAPT devices into two kinds: symmetric NAPT and cone NAPT. Cone NAPT is the kind we need. The NAPT built into Win9x/2K/XP/2003 is also cone NAPT.

In the first case, both parties are behind symmetric NAPT:

There is nothing to discuss here: UDP penetration is definitely not possible.

In the second case, both parties are behind cone NAPT:

This is the case we need. UDP penetration works.

In the third case, one party is behind symmetric NAPT and the other behind cone NAPT:

This case is more complicated, but it is easy to understand with the description and data structures above. The analysis is as follows.

Assume A is behind a symmetric NAT and B is behind a cone NAT.

1. A wants to connect to B. A obtains B's NAT address and mapped port from the server. A notifies the server, and the server tells B the NAT address and mapped port of A. B initiates a connection to A, which A certainly cannot receive. A then initiates a connection to B. A's NAT creates a new session and assigns a new mapped port. When B's NAT receives the UDP packet, it cannot find a matching entry in its mapping table, so the packet is discarded.

2. B wants to connect to A. B obtains A's NAT address and mapped port from the server. B notifies the server, and the server tells A the NAT address and mapped port of B. A initiates a connection to B; A's NAT creates a new session and assigns a new mapped port. B then initiates a connection to A. Because B cannot learn the mapped port of the new session A created, B still uses the mapped port it obtained from the server. When A's NAT receives the UDP packet, it cannot find a matching entry in its mapping table, so the packet is discarded.

From the analysis above, UDP intranet penetration is possible only when the NATs at both ends of the connection are cone NATs.
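
The three cases can be checked with a small simulation. This is a sketch under the same assumptions as the analysis above: a symmetric NAT allocates a new mapped port for each new destination IP address, and every NAT accepts an incoming packet only from a remote IP address and port that the intranet host has already sent to through that mapped port. Under a looser filtering rule the outcome could differ. The class `Nat`, the function `can_punch`, B's intranet address, and the starting ports are made up for illustration.

```python
from typing import Dict, Optional, Set, Tuple

Endpoint = Tuple[str, int]

class Nat:
    """Toy NAT that behaves as either 'cone' or 'symmetric' (as defined in this article)."""

    def __init__(self, kind: str, public_ip: str, first_port: int) -> None:
        self.kind = kind
        self.public_ip = public_ip
        self.next_port = first_port
        self.port_of: Dict[tuple, int] = {}             # session key -> public port
        self.host_of: Dict[int, Endpoint] = {}          # public port -> intranet endpoint
        self.contacted: Dict[int, Set[Endpoint]] = {}   # public port -> remote endpoints sent to

    def send(self, src: Endpoint, dst: Endpoint) -> Endpoint:
        # Cone: one session per intranet endpoint. Symmetric: one per destination IP.
        key = (src,) if self.kind == "cone" else (src, dst[0])
        if key not in self.port_of:
            self.port_of[key] = self.next_port
            self.host_of[self.next_port] = src
            self.contacted[self.next_port] = set()
            self.next_port += 1
        port = self.port_of[key]
        self.contacted[port].add(dst)
        return (self.public_ip, port)

    def receive(self, src: Endpoint, dst_port: int) -> Optional[Endpoint]:
        if src in self.contacted.get(dst_port, set()):
            return self.host_of[dst_port]
        return None  # no matching entry: discard

def can_punch(kind_a: str, kind_b: str) -> bool:
    """Replay the exchange: both register, A sends, B sends, A sends again."""
    server = ("61.51.76.102", 8098)
    a, b = ("192.168.0.6", 1827), ("192.168.1.7", 2000)
    an = Nat(kind_a, "61.51.99.86", 9881)
    bn = Nat(kind_b, "61.51.77.66", 7001)
    a_pub = an.send(a, server)   # endpoints as the server sees them
    b_pub = bn.send(b, server)
    bn.receive(an.send(a, b_pub), b_pub[1])               # A's first attempt
    b_to_a = an.receive(bn.send(b, a_pub), a_pub[1]) == a
    a_to_b = bn.receive(an.send(a, b_pub), b_pub[1]) == b
    return b_to_a and a_to_b

for pair in (("symmetric", "symmetric"), ("cone", "cone"), ("symmetric", "cone")):
    print(pair, "->", can_punch(*pair))
```

In this model only the cone/cone pairing succeeds: whenever a symmetric NAT is involved, the packets it sends to the peer leave from a newly allocated port that the peer never learned from the server.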

How to verify UDP penetration through NAPT (Network Address/Port Translator) in practice

The required network setup is as follows:

Intranet machines behind three NAT devices, plus two Internet servers. Two of the NAPT devices are cone NAPT and one is symmetric NAPT.

Verification method:

Compile the source code provided with this article, then run the server program and the client program. The modified source code adds a command for sending messages directly between clients by IP address and port; with it you can verify NAPT penetration manually. For convenience, I recommend using remote login software to operate all the machines involved from one computer, so that one person can do all the work. That is how I did it. If you are interested or have relevant experience, please write to me with criticism and corrections so that we can improve together.
