Principles and implementation of P2P UDP-based NAT penetration: enhanced edition (with source code)


Keywords: P2P, UDP, NAT, principle, penetration, traversal, Symmetric, Cone
Author: hwycheng LEO (FlashBT@Hotmail.com)

Download source code:
http://www.ppcn.net/upload/2005_08/05080112299104.rar
References:
http://midcom-p2p.sourceforge.net/draft-ford-midcom-p2p-01.txt
Principles and implementation of P2P UDP-based NAT penetration (shootingstars)

Description:

There are few Chinese-language documents on the Internet about UDP-based NAT penetration. Only "Principles and implementation of P2P UDP-based NAT penetration" by shootingstars has real reference value. I have been doing P2P development for the past two years; my best-known work is FlashBT (变态快车), a BitTorrent client I developed myself. Anyone interested in P2P downloading or P2P development is welcome to visit the software's official home page, http://www.hwysoft.com/chs/, and download it; you may get something out of it. I wrote this article mainly to answer, once and for all, questions that readers keep asking. That saves everyone's time and also makes the topic easier to read and understand for anyone interested in P2P UDP penetration. If you are interested or experienced in this area, you can send me an email or leave a message on my personal blog:
http://hwycheng.blogchina.com.
You may repost this article freely, but please keep this description.

Thanks again to shootingstars for the early contribution.

------------------------------------

What is NAT (the IP Network Address Translator), and why does it matter?

For the definition of NAT, see RFC 1631 (http://www.faqs.org/rfcs/rfc1631.html), the most authoritative description and explanation of NAT. Networking terminology is abstract and difficult; unless you are a professional, it is hard to understand what NAT does from the name alone.

To fully understand the role of NAT, we must first understand the two categories of IP addresses. One is the private IP address, referred to here as the intranet IP address. The other is the non-private IP address, referred to here as the public IP address. For an introduction to the concept and function of IP addresses, see my other article:
http://hwycheng.blogchina.com/2402121.html

Intranet IP address: an address in the private ranges of classes A, B, and C. Such an address is not globally unique and cannot be reached directly by other Internet hosts.
Public IP address: a globally unique IP address that can be reached directly by other Internet hosts.
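
As a quick illustration of the distinction, Python's standard `ipaddress` module can classify an address. This is only a sketch: the helper name `classify` is made up for this example, and the sample addresses are arbitrary.

```python
import ipaddress

def classify(addr: str) -> str:
    """Return 'intranet' if the address is private, else 'public'.

    Note: is_private is True for the RFC 1918 ranges (10/8, 172.16/12,
    192.168/16) and also for some other special-purpose ranges.
    """
    ip = ipaddress.ip_address(addr)
    return "intranet" if ip.is_private else "public"

for addr in ("192.168.0.6", "10.1.2.3", "172.16.0.1", "61.51.99.86"):
    print(addr, "->", classify(addr))
```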

NAT was originally designed to let computers with intranet IP addresses access the external network through a small number of computers that have public IP addresses. For an IP packet sent by a computer with an intranet IP address, NAT rewrites the source IP address to the NAT's own public IP address, leaves the destination IP address unchanged, and forwards the packet to the router, from which it eventually reaches the external computer. For an IP packet returned by the external computer, NAT rewrites the destination IP address to the intranet IP address, leaves the source IP address unchanged, and finally delivers the packet to the computer on the intranet.


Figure 1: NAT allows computers with private IP addresses to share several public IP addresses to access the Internet.

As networks became widespread, the limitations of IPv4 were exposed, and public IP addresses became a scarce resource. The limitations of NAT were exposed as well: at any given moment, one public IP address can be used by only one computer with a private IP address. So NAPT (the IP Network Address/Port Translator) came into being. With NAPT, many computers with private IP addresses can access the Internet through a single public IP address at the same time, which has largely relieved the shortage of IPv4 addresses.

For a TCP/UDP packet sent by a computer with an intranet IP address, NAPT rewrites the source IP address to the NAPT's own public IP address and the source port to one of the NAPT's own ports, leaves the destination IP address and port unchanged, and forwards the packet to the router, from which it eventually reaches the external computer. For a packet returned by the external computer, NAPT rewrites the destination IP address to the intranet IP address and the destination port to the port of the intranet computer, leaves the source IP address and source port unchanged, and finally delivers the packet to the computer on the intranet.
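
The two rewrites can be sketched as a toy model. This is an illustration under simplifying assumptions, not how any real NAPT is implemented: the class `ToyNapt`, its method names, the starting port, and the sample addresses are all invented here, and the model ignores timeouts and destination-dependent behaviour.

```python
from typing import Dict, Tuple

Endpoint = Tuple[str, int]  # (IP address, port)

class ToyNapt:
    def __init__(self, public_ip: str, first_port: int = 9881) -> None:
        self.public_ip = public_ip
        self.next_port = first_port
        self.out_map: Dict[Endpoint, int] = {}  # intranet endpoint -> NAPT port
        self.in_map: Dict[int, Endpoint] = {}   # NAPT port -> intranet endpoint

    def outbound(self, src: Endpoint, dst: Endpoint) -> Tuple[Endpoint, Endpoint]:
        """Rewrite the source address and port; leave the destination unchanged."""
        if src not in self.out_map:
            self.out_map[src] = self.next_port
            self.in_map[self.next_port] = src
            self.next_port += 1
        return (self.public_ip, self.out_map[src]), dst

    def inbound(self, src: Endpoint, dst: Endpoint) -> Tuple[Endpoint, Endpoint]:
        """Rewrite the destination address and port; leave the source unchanged.

        Raises KeyError if no mapping exists for the destination port.
        """
        return src, self.in_map[dst[1]]

napt = ToyNapt("61.51.99.86")
c = ("192.168.0.6", 1827)   # intranet computer
s = ("61.51.76.102", 8098)  # external computer
print(napt.outbound(c, s))                       # packet as seen on the Internet
print(napt.inbound(s, ("61.51.99.86", 9881)))    # reply as delivered to the intranet
```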

Figure 2: NAPT allows computers with private IP addresses to share one public IP address to access the Internet.
 
NAPT can be seen everywhere in our work and daily life. Most companies connect all of their computers to the Internet through one or more routers that support NAPT. Even as I write this article at home, my IBM laptop reaches the Internet through a desktop computer with a broadband connection. This article focuses on NAPT.

Why does NAPT (the IP Network Address/Port Translator) impede P2P software?

The way NAPT provides Internet access means that only computers inside the NAPT can initiate connections to hosts outside it; external hosts cannot directly establish connections to computers inside the NAPT. For IM (instant messaging) software, this means that computers inside the NAPT and computers outside it can only exchange data by relaying through a server. For P2P download programs, it means that computers inside the NAPT cannot accept connections from outside, so they have too few connections and poor download speeds. One of the problems P2P software must therefore solve is making computers inside a NAPT reachable from outside, at least to some extent.

What is the principle of UDP penetration through NAT (the IP Network Address Translator)?

TCP/IP data transmission mainly uses two protocols, TCP and UDP. TCP is a reliable, connection-oriented transport protocol; UDP is an unreliable, connectionless one. Given how the two protocols work, NAPT penetration mainly means penetration over UDP. Penetration over TCP is also possible, but it is far less feasible and has stricter requirements, so we will not discuss it here. If you are interested, search on Google; some articles explore this issue. Now let's look at the principle of penetrating NAPT with UDP:


Figure 3: How NAPT transparently relays UDP packets between a computer with a private IP address and a public network host.

Notes on transparent relaying of UDP packets through NAPT:

NAPT allocates one of its own port numbers to each session, and uses that port number to decide which intranet computer a packet returned by the public host should be forwarded to. The session here is virtual: UDP communication does not establish a connection, but NAPT still needs a notion of session. An important problem for NAPT in relaying UDP packets transparently is how to manage this virtual session. As we all know, a TCP session starts with a SYN packet and ends with a FIN packet, so NAPT can easily track the lifetime of a TCP session and handle it. UDP is more troublesome: NAPT has no way of knowing whether a forwarded UDP packet has reached the target host. Because UDP is unreliable, NAPT must keep the session alive for some time, waiting for data from outside to come back so that it can forward that data to the intranet computer that originally sent the request. So how does NAPT handle UDP session timeouts? Implementations differ between vendors: the timeout may be a few minutes or a few hours, and some NAPT implementations compute it dynamically based on how busy the device is.
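
A minimal sketch of such a virtual session with an idle timeout follows. Everything here is an assumption made for illustration: the class name `UdpSessionTable`, the 120-second timeout, and the choice that only outbound traffic (`touch`) refreshes the timer. Real devices differ on all of these points.

```python
from typing import Dict, Optional, Tuple

Endpoint = Tuple[str, int]

class UdpSessionTable:
    def __init__(self, idle_timeout: float) -> None:
        self.idle_timeout = idle_timeout
        self.last_seen: Dict[int, float] = {}  # NAPT port -> time of last outbound packet
        self.owner: Dict[int, Endpoint] = {}   # NAPT port -> intranet endpoint

    def touch(self, napt_port: int, intranet: Endpoint, now: float) -> None:
        """Create the session or refresh its idle timer."""
        self.owner[napt_port] = intranet
        self.last_seen[napt_port] = now

    def lookup(self, napt_port: int, now: float) -> Optional[Endpoint]:
        """Return the intranet endpoint, or None if the session is absent or expired."""
        seen = self.last_seen.get(napt_port)
        if seen is None:
            return None
        if now - seen > self.idle_timeout:
            # The session has been idle too long: forget it.
            del self.last_seen[napt_port]
            del self.owner[napt_port]
            return None
        return self.owner[napt_port]

table = UdpSessionTable(idle_timeout=120.0)  # 120 s is an arbitrary example value
table.touch(9881, ("192.168.0.6", 1827), now=0.0)
print(table.lookup(9881, now=60.0))   # still alive
print(table.lookup(9881, now=300.0))  # expired
```

Because the timeout is unknown to the application, a P2P client that wants to keep its mapping usually has to send a small packet periodically, at an interval shorter than the shortest timeout it expects to meet.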


Figure 4: NAPT translates the source address and source port of a UDP packet sent from the intranet to a public host.

Figure 5: NAPT translates the destination address and destination port of the UDP packet returned by the public host and delivers it to the intranet computer.

Now we understand how NAPT implements transparent communication between intranet computers and hosts on the Internet. Let's turn to the question that matters most to us: what policy does NAPT use to decide whether to create a new session for an outgoing UDP packet? There are several cases:

A. If the source addresses (intranet IP addresses) are different, then regardless of other factors, the packets must correspond to different sessions on the NAPT.
B. If the source addresses (intranet IP addresses) are the same but the source ports are different, then regardless of other factors, the packets must correspond to different sessions on the NAPT.
C. If the source address (intranet IP address), the source port, and the destination address (public IP address) are all the same but the destination ports are different, the packets must correspond to the same session on the NAPT.
D. If the source address (intranet IP address) and the source port are the same but the destination addresses (public IP addresses) are different, then regardless of the destination port, how does the NAPT handle the session?

Case D is the one we care about and want to discuss. Based on how the destination address (public IP address) affects session creation, we divide NAPT devices into two categories:

Symmetric NAPT:
Packets to the same destination IP address use the same session, whatever the destination port; packets to different destination IP addresses use different sessions, whatever the destination port.
We call this kind of NAPT a Symmetric NAPT. That is, even when the locally bound UDP port is the same, a different destination IP address creates a different session.


Figure 6: Symmetric means symmetrical. Multiple ports correspond to multiple hosts, in parallel and symmetrically.

Cone NAPT:
Packets to the same destination IP address use the same session, whatever the destination port; packets to different destination IP addresses also use the same session, whatever the destination port.
We call this kind of NAPT a Cone NAPT. That is, as long as the locally bound UDP port is the same, the same session is used no matter what the destination address is.


Figure 7: Cone means a cone. One port corresponds to multiple hosts. Doesn't it look like a cone?
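
The difference between the two categories comes down to which fields form the session key. The sketch below follows the definitions given in this article (a Symmetric NAPT keys on the destination IP address but not the destination port); the class `PortAllocator`, the starting port, and the sample addresses are invented for illustration.

```python
from typing import Dict, Tuple

Endpoint = Tuple[str, int]

class PortAllocator:
    def __init__(self, mode: str, first_port: int = 9881) -> None:
        if mode not in ("cone", "symmetric"):
            raise ValueError("mode must be 'cone' or 'symmetric'")
        self.mode = mode
        self.next_port = first_port
        self.sessions: Dict[tuple, int] = {}  # session key -> NAPT port

    def session_key(self, src: Endpoint, dst: Endpoint) -> tuple:
        # Cone: the destination plays no part in choosing the session.
        # Symmetric (as defined in this article): the destination IP does,
        # but the destination port does not.
        return (src,) if self.mode == "cone" else (src, dst[0])

    def mapped_port(self, src: Endpoint, dst: Endpoint) -> int:
        key = self.session_key(src, dst)
        if key not in self.sessions:
            self.sessions[key] = self.next_port
            self.next_port += 1
        return self.sessions[key]

src = ("192.168.0.6", 1827)
destinations = (("61.51.76.102", 8098), ("61.51.76.102", 9000), ("203.0.113.7", 53))
for mode in ("cone", "symmetric"):
    nat = PortAllocator(mode)
    print(mode, [nat.mapped_port(src, dst) for dst in destinations])
# cone [9881, 9881, 9881]
# symmetric [9881, 9881, 9882]
```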

Currently, the vast majority of NAPTs belong to the latter category, Cone NAT. During testing, I had to fall back on a Japanese-made Symmetric NAT (fortunately not one I bought myself). The NAPT built into Win9x/2K/XP/2003 is also Cone NAT. This is fortunate, because the UDP penetration we want can only be performed between Cone NATs; if either side is not a Cone NAT, there is no hope for UDP penetration, and the traffic has to be relayed through a server. A detailed analysis follows later.

Next we will analyze some of the data structures a NAPT works with; they show the real basis on which UDP can penetrate Cone NAT. The data structures described here only illustrate the principle and have no value as a practical reference. If you are really interested, read the NAT implementation in the Linux source code. A real NAT implementation does not use a database, either!

The port mapping data structure of a working Symmetric NAPT is as follows:

Intranet info table:

[NAPT allocated port] [Intranet IP address] [Intranet port] [Internet IP address] [Session start time]

Primary Key ([NAPT allocated port]) -> the primary key is built on [NAPT allocated port]; it must be unique and is indexed to speed up lookups.
Unique ([Intranet IP address], [Intranet port]) -> the combination of these two fields cannot be repeated.
Unique ([Intranet IP address], [Intranet port], [Internet IP address]) -> the combination of these three fields cannot be repeated.

Mapping table:

[NAPT allocated port] [Internet port]

Unique ([NAPT allocated port], [Internet port]) -> the combination of these two fields cannot be repeated.

The port mapping data structure of a working Cone NAPT is as follows:

Intranet info table:

[NAPT allocated port] [Intranet IP address] [Intranet port] [Session start time]

Primary Key ([NAPT allocated port]) -> the primary key is built on [NAPT allocated port]; it must be unique and is indexed to speed up lookups.
Unique ([Intranet IP address], [Intranet port]) -> the combination of these two fields cannot be repeated.

Internet info table:

[WID primary key ID] [Internet IP address] [Internet port]

Primary Key ([WID primary key ID]) -> the primary key is built on [WID primary key ID]; it must be unique and is indexed to speed up lookups.
Unique ([Internet IP address], [Internet port]) -> the combination of these two fields cannot be repeated.

Mapping table: one-to-many

[NAPT allocated port] [WID primary key ID]

Unique ([NAPT allocated port], [WID primary key ID]) -> the combination of these two fields cannot be repeated.
Unique ([WID primary key ID]) -> this field cannot be repeated.
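
Purely to make the Cone NAPT tables concrete, they can be written as an in-memory SQLite schema using Python's standard `sqlite3` module. As noted above, a real NAT does not use a database; the table and column names below are invented for this sketch, and the sample rows are arbitrary.

```python
import sqlite3

SCHEMA = """
CREATE TABLE intranet_info (
    napt_port     INTEGER PRIMARY KEY,
    intranet_ip   TEXT NOT NULL,
    intranet_port INTEGER NOT NULL,
    session_start REAL NOT NULL,
    UNIQUE (intranet_ip, intranet_port)
);
CREATE TABLE internet_info (
    wid           INTEGER PRIMARY KEY,
    internet_ip   TEXT NOT NULL,
    internet_port INTEGER NOT NULL,
    UNIQUE (internet_ip, internet_port)
);
CREATE TABLE mapping (
    napt_port INTEGER NOT NULL,
    wid       INTEGER NOT NULL,
    UNIQUE (napt_port, wid),
    UNIQUE (wid)
);
"""

def open_cone_tables() -> sqlite3.Connection:
    """Create the three Cone NAPT tables in a fresh in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn

conn = open_cone_tables()
conn.execute("INSERT INTO intranet_info VALUES (9881, '192.168.0.6', 1827, 0.0)")
conn.execute("INSERT INTO internet_info VALUES (1, '61.51.76.102', 8098)")
conn.execute("INSERT INTO internet_info VALUES (2, '203.0.113.7', 4000)")
# One NAPT port maps to many Internet endpoints (the "cone").
conn.executemany("INSERT INTO mapping VALUES (9881, ?)", [(1,), (2,)])

# A second NAPT port for the same intranet endpoint violates the UNIQUE rule.
try:
    conn.execute("INSERT INTO intranet_info VALUES (9882, '192.168.0.6', 1827, 1.0)")
except sqlite3.IntegrityError as exc:
    print("rejected:", exc)

rows = conn.execute(
    "SELECT i.internet_ip, i.internet_port FROM mapping m "
    "JOIN internet_info i ON i.wid = m.wid "
    "WHERE m.napt_port = 9881 ORDER BY i.wid"
).fetchall()
print(rows)
```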

After reading the data structures above, do you understand better, or are you dizzy? Haha! It will become clear shortly. Through NAT, it is easy for intranet computers to connect outward: the NAPT handles it automatically, and our applications do not need to care how. But how can external computers reach computers on the intranet? Let's look at the following process:

C is an intranet computer behind a NAPT, and S is a computer with an Internet IP address. C sends a connection request to S. The NAPT records the request in its own data structure according to the rules described above and creates a session. After that, C and S can exchange data transparently in both directions, as shown below:

C [192.168.0.6:1827] <-> [Priv IP: 192.168.0.1] NAPT [Pub IP: 61.51.99.86:9881] <-> S [61.51.76.102:8098]

As we can see, for a computer with an Internet IP address to communicate with an intranet computer behind a NAPT, the intranet computer must first send a UDP packet on its own initiative. From the received UDP packet, the Internet computer learns the NAPT's public IP address and mapped port, and can then communicate transparently with the intranet computer.

Now let's analyze the question we care about most: how can two intranet computers, each behind its own NAPT, communicate directly? Neither can simply send a connection request to the other, and neither knows the public IP address of the other's NAPT or the port number mapped on it. We therefore have to rely on a server with a public IP address to help the two establish a connection. When the two intranet computers each connect to the server, the server learns the public IP addresses of the two NAPT devices and the mapped ports of the sessions created by the two connections. Each intranet computer can then obtain the public IP address and mapped port of the other's NAPT device from the server.
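
The server's role can be sketched as follows. The key point is that the address returned by `recvfrom()` is the NAPT's public IP address and mapped port, not the client's intranet address. The text commands `REGISTER` and `LOOKUP`, the function names, and the reply formats are hypothetical and are not the wire protocol of the downloadable source code.

```python
import socket
from typing import Dict, Optional, Tuple

Endpoint = Tuple[str, int]

def handle(registry: Dict[str, Endpoint], message: str, observed: Endpoint) -> Optional[str]:
    """Process one datagram and return the reply text, or None to stay silent.

    `observed` is the source address as seen by the server, i.e. the
    sender's NAPT public IP address and mapped port.
    """
    parts = message.split()
    if len(parts) == 2 and parts[0] == "REGISTER":
        registry[parts[1]] = observed
        return "OK %s:%d" % observed
    if len(parts) == 2 and parts[0] == "LOOKUP":
        peer = registry.get(parts[1])
        return "PEER %s:%d" % peer if peer else "UNKNOWN"
    return None

def serve(bind: Endpoint) -> None:
    """Blocking UDP loop; recvfrom() supplies the observed endpoint."""
    registry: Dict[str, Endpoint] = {}
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.bind(bind)
        while True:
            data, addr = sock.recvfrom(1024)
            reply = handle(registry, data.decode("utf-8", "replace"), addr)
            if reply is not None:
                sock.sendto(reply.encode("utf-8"), addr)

# Exercise the logic without a network: two peers register, one looks up the other.
registry: Dict[str, Endpoint] = {}
print(handle(registry, "REGISTER A", ("61.51.99.86", 9881)))
print(handle(registry, "REGISTER B", ("203.0.113.7", 4000)))
print(handle(registry, "LOOKUP B", ("61.51.99.86", 9881)))
```

In a deployment, `serve(("0.0.0.0", 8098))` would run on the machine with the public IP address; the port number is only an example.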

Assume the two intranet computers are A and B, and their NAPTs are AN and BN respectively. Suppose that A, having obtained the IP address and mapped port of B's NAPT, BN, impatiently sends a UDP packet to that IP address and mapped port. What happens? From the principles and data structures above, we know that AN creates a record in its own data structure to identify a new session. BN, on receiving the packet, looks in its own data structure, finds no matching record, and discards the packet. B is slower off the mark, and only now sends a UDP packet to the IP address and mapped port of AN. What is the result? Exactly the one we hoped for: AN, on receiving the packet, finds the record in its own data structure, processes the packet, and forwards it to A. When A sends packets to B again, everything goes through. OK, job done! But wait: that was for Cone NAPT. What about Symmetric NAPT? Try analyzing it yourself...
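
The three-packet sequence just described can be traced with a small simulation. It assumes, as the description does, that a Cone NAPT reuses one mapped port per intranet endpoint and admits an inbound packet only if it has a record of having sent to that remote endpoint; some real devices are more permissive. The class `SimNat` and all addresses are invented for this sketch.

```python
from typing import Dict, Optional, Set, Tuple

Endpoint = Tuple[str, int]

class SimNat:
    """Cone NAPT that only admits packets from endpoints it has sent to."""

    def __init__(self, public_ip: str, first_port: int) -> None:
        self.public_ip = public_ip
        self.next_port = first_port
        self.port_of: Dict[Endpoint, int] = {}       # intranet endpoint -> NAPT port
        self.host_of: Dict[int, Endpoint] = {}       # NAPT port -> intranet endpoint
        self.allowed: Dict[int, Set[Endpoint]] = {}  # NAPT port -> remotes sent to

    def send(self, src: Endpoint, dst: Endpoint) -> Endpoint:
        """Outbound packet: create or reuse the mapping and record the remote."""
        if src not in self.port_of:
            self.port_of[src] = self.next_port
            self.host_of[self.next_port] = src
            self.allowed[self.next_port] = set()
            self.next_port += 1
        port = self.port_of[src]
        self.allowed[port].add(dst)
        return (self.public_ip, port)

    def receive(self, remote: Endpoint, napt_port: int) -> Optional[Endpoint]:
        """Inbound packet: deliver only if a matching record exists."""
        if remote in self.allowed.get(napt_port, set()):
            return self.host_of[napt_port]
        return None

server = ("61.51.76.102", 8098)
a, b = ("192.168.0.6", 1827), ("10.0.0.9", 3000)
an, bn = SimNat("61.51.99.86", 9881), SimNat("203.0.113.7", 4000)

# Both contact the server, which learns their public endpoints.
a_pub = an.send(a, server)
b_pub = bn.send(b, server)

# Step 1: A sends to BN first; AN records the session, BN drops the packet.
step1 = bn.receive(an.send(a, b_pub), b_pub[1])
# Step 2: B sends to AN; AN already has a record for BN, so A receives it.
step2 = an.receive(bn.send(b, a_pub), a_pub[1])
# Step 3: A sends again; BN now has a record for AN, so B receives it.
step3 = bn.receive(an.send(a, b_pub), b_pub[1])
print(step1, step2, step3)
# None ('192.168.0.6', 1827) ('10.0.0.9', 3000)
```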

Case-by-case analysis of UDP penetration through NAPT (the IP Network Address/Port Translator)

First, following the description above, we divide NAPT devices clearly into two kinds: Symmetric NAPT and Cone NAPT. Cone NAPT is the kind we need; the NAPT built into Win9x/2K/XP/2003 is also Cone NAPT.

In the first case, both parties are behind Symmetric NAPT:

This case needs no discussion: UDP penetration is certainly not supported.

In the second case, both parties are behind Cone NAPT:

This is what we need. UDP penetration works.

In the third case, one party is behind Symmetric NAPT and the other behind Cone NAPT:

This case is more complicated, but it is easy to understand by analyzing it against the description and data structures above. The analysis is as follows.

Assume: A -> Symmetric NAT, B -> Cone NAT

1. A wants to connect to B. A obtains B's NAT address and mapped port from the server. A notifies the server, and the server tells B the NAT address and mapped port of A. B sends to A, and A certainly cannot receive it. A then sends to B: A's NAT creates a new session and allocates a new mapped port. When B's NAT receives the UDP packet, it looks in its own mapping table, cannot find a matching entry, and discards the packet.

2. B wants to connect to A. B obtains A's NAT address and mapped port from the server. B notifies the server, and the server tells A the NAT address and mapped port of B. A sends to B: A's NAT creates a new session and allocates a new mapped port, and B certainly cannot receive the packet. B then sends to A. Because B has no way of learning the mapped port of the new session that A created, it still uses the mapped port obtained from the server. When A's NAT receives the UDP packet, it looks in its own mapping table, cannot find a matching entry, and discards the packet.

According to the above analysis, UDP intranet penetration can be achieved only when the NATs at both ends of the connection are Cone NAT.
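
The three cases can be checked with a compact simulation. It rests on the same assumptions as the analysis: each peer knows only the endpoint the server observed for the other, a NAT admits an inbound packet only from a remote endpoint it has already sent to, and nobody tries to guess the Symmetric NAT's next port. The names `Nat` and `can_punch` and all addresses are invented for this sketch.

```python
from typing import Dict, Optional, Set, Tuple

Endpoint = Tuple[str, int]

class Nat:
    def __init__(self, mode: str, ip: str, first_port: int) -> None:
        self.mode, self.ip, self.next_port = mode, ip, first_port
        self.ports: Dict[tuple, int] = {}            # session key -> NAPT port
        self.hosts: Dict[int, Endpoint] = {}         # NAPT port -> intranet endpoint
        self.remotes: Dict[int, Set[Endpoint]] = {}  # NAPT port -> remotes sent to

    def send(self, src: Endpoint, dst: Endpoint) -> Endpoint:
        # Symmetric: one session per destination IP; cone: one per source endpoint.
        key = (src, dst[0]) if self.mode == "symmetric" else (src,)
        if key not in self.ports:
            self.ports[key] = self.next_port
            self.hosts[self.next_port] = src
            self.remotes[self.next_port] = set()
            self.next_port += 1
        port = self.ports[key]
        self.remotes[port].add(dst)
        return (self.ip, port)

    def receive(self, remote: Endpoint, port: int) -> Optional[Endpoint]:
        return self.hosts[port] if remote in self.remotes.get(port, set()) else None

def can_punch(mode_a: str, mode_b: str, rounds: int = 3) -> bool:
    """True if each peer eventually receives a packet from the other.

    Each side only ever targets the endpoint the server observed for its peer.
    """
    server = ("61.51.76.102", 8098)
    a, b = ("192.168.0.6", 1827), ("10.0.0.9", 3000)
    an, bn = Nat(mode_a, "61.51.99.86", 9881), Nat(mode_b, "203.0.113.7", 4000)
    a_pub, b_pub = an.send(a, server), bn.send(b, server)
    a_got = b_got = False
    for _ in range(rounds):
        b_got |= bn.receive(an.send(a, b_pub), b_pub[1]) == b
        a_got |= an.receive(bn.send(b, a_pub), a_pub[1]) == a
    return a_got and b_got

for pair in (("symmetric", "symmetric"), ("cone", "cone"), ("symmetric", "cone")):
    print(pair, can_punch(*pair))
```

In this model only the Cone/Cone pairing succeeds, because a Symmetric NAT sends to the peer from a freshly allocated port that the peer never learns.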

How to verify and analyze UDP penetration through NAPT (the IP Network Address/Port Translator) in practice

The required network structure is as follows:

Three intranet machines, each behind its own NAT, and two Internet servers. Two of the NATs are Cone NAPT and one is Symmetric NAPT.

Verification method:

Compile the source code provided with this article, then run the server program and the client program. The modified source code adds a command for sending messages directly between clients by IP address and port, so you can verify NAPT penetration by hand. For convenience, I recommend using remote login software so that you can operate all the machines involved from a single computer; that way one person can do all the work, which is how I did it. If you are interested or experienced in this area, please write to me with criticism and corrections so that we can make progress together.
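
For readers who want to experiment without the original client, a direct send by IP address and port can be approximated in a few lines. This is an independent sketch, not the command from the provided source code; the function names are invented, and the demo runs on the loopback interface only, so no NAT is involved.

```python
import socket
from typing import Optional, Tuple

def send_direct(sock: socket.socket, ip: str, port: int, text: str) -> None:
    """Send one UDP datagram straight to the given address, bypassing the server."""
    sock.sendto(text.encode("utf-8"), (ip, port))

def receive_once(sock: socket.socket, timeout: float = 2.0) -> Optional[Tuple[str, Tuple[str, int]]]:
    """Wait for one datagram; return (text, sender address) or None on timeout."""
    sock.settimeout(timeout)
    try:
        data, addr = sock.recvfrom(1024)
    except socket.timeout:
        return None
    return data.decode("utf-8", "replace"), addr

def loopback_demo() -> Optional[str]:
    """Exercise both helpers on 127.0.0.1."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as rx, \
         socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as tx:
        rx.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        send_direct(tx, "127.0.0.1", rx.getsockname()[1], "hello")
        got = receive_once(rx)
        return got[0] if got else None

print(loopback_demo())
```

To test real penetration, the sending socket must be the same one (same local port) that registered with the server, since the NAPT mapping belongs to that port.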

Address: http://blog.donews.com/zwell/archive/2006/01/25/708473.aspx
