Abstract: The basic features and network structure of P2P technology are introduced, and the business model and security issues of P2P technology are analyzed and discussed.
1. Features of P2P Technology
P2P (peer-to-peer) is a distributed network in which participants share part of their hardware resources (such as processing power, storage, network connections, and printers). Other peer nodes can access these shared resources directly through the network, without passing through intermediate entities. Each participant is therefore both a provider (server) and a consumer (client) of resources, services, and content.
1.1 Characteristics of P2P Technology
(1) Decentralization. Resources and services are distributed across all nodes, and information transfer and service delivery take place directly between nodes, which avoids the bottleneck of a central server.
(2) Scalability. In a P2P network, each new user adds demand for services but also adds resources and service capacity to the system, so capacity grows in step with demand. In theory, a fully distributed system has no central bottleneck.
(3) Robustness. The P2P architecture is inherently resistant to attack and highly fault-tolerant. Because services are distributed across nodes, damage to some nodes or links has little impact on the rest. When some nodes fail, a P2P network can usually adjust its overall topology automatically to keep the remaining nodes connected. P2P networks are usually built in a self-organizing manner that allows nodes to join and leave freely, and they can also adapt to changes in network bandwidth, node count, and load.
(4) High performance-to-price ratio. Performance is an important reason for the wide adoption of P2P. A P2P architecture can make effective use of the large number of ordinary nodes on the Internet by distributing computing tasks or stored data across them. Using idle computing power and storage space in this way makes high-performance computing and massive storage available at lower cost.
(5) Privacy protection. In a P2P network, information is transmitted between scattered nodes without passing through a central point, which considerably reduces the likelihood of private information being intercepted or leaked.
(6) Load balancing. In a P2P network, each node is both a server and a client, which reduces the demands on computing and storage capacity that a traditional client/server (C/S) architecture places on its servers. Because resources are distributed across many nodes, load is also better balanced across the whole network.
1.2 P2P Application Types
P2P applications mainly include: networks for sharing files and other content; applications that exploit the computing and storage capacity of peers; P2P-based collaborative processing and service sharing platforms; instant messaging (including ICQ, QQ, and MSN); and secure P2P communication and information sharing (such as Skype).
2. P2P Network Structure
The P2P mode has undergone three phases: centralized, distributed, and hybrid.
2.1 Centralized P2P
In centralized P2P mode, a central server records which information is shared and answers queries about it. Each peer is responsible for the information it shares and the communication it carries out, and downloads information from other peers as needed. Although this form is centralized, it differs from the traditional client/server mode. In the traditional client/server mode, all data is held exclusively on the server, clients can only passively read from the server, and clients do not interact with one another. In centralized P2P mode, all shared data stays on the client that provides it and the server keeps only index information. In addition, peers interact both with the server and directly with one another.
The main advantages of centralized P2P are simple maintenance and efficient discovery. Because resource discovery relies on a centralized directory, discovery algorithms are flexible and efficient and can support complex queries. The biggest problem is the one it shares with the traditional client/server structure: the central server is a single point of failure, so reliability and security are low.
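The division of labour described above, with the server holding only an index and the data staying on the peers, can be illustrated with a minimal sketch. The class name `IndexServer`, its methods, and the peer addresses are all hypothetical names chosen for this example and do not come from any particular P2P system.

```python
from collections import defaultdict


class IndexServer:
    """Hypothetical central directory: it stores only which peer holds which file."""

    def __init__(self):
        self._index = defaultdict(set)  # file name -> set of peer addresses

    def register(self, peer, filenames):
        # A peer announces the files it is willing to share.
        for name in filenames:
            self._index[name].add(peer)

    def unregister(self, peer):
        # Called when a peer leaves; its entries disappear from the index.
        for holders in self._index.values():
            holders.discard(peer)

    def search(self, keyword):
        # A central index makes non-exact (substring) queries straightforward.
        return {name: sorted(peers)
                for name, peers in self._index.items()
                if keyword.lower() in name.lower() and peers}


server = IndexServer()
server.register("10.0.0.1:6000", ["song_a.mp3", "notes.txt"])
server.register("10.0.0.2:6000", ["song_a.mp3", "song_b.mp3"])
hits = server.search("song")  # the download itself would go peer to peer
```

The sketch also makes the weakness visible: every search depends on the single `server` object, which corresponds to the single point of failure discussed above.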
2.2 Distributed P2P
In a distributed peer network, a peer reaches the rest of the network through its connections to adjacent peers. All peers have similar functions and there is no dedicated server, so a peer must rely on the distributed network itself to find files and locate other peers.
Based on the topological relationship among nodes, distributed P2P networks can be classified as structured or unstructured. An unstructured P2P network uses a random graph topology and locates targets by diffusing queries to neighbors. It offers good availability, is easy to maintain, and supports complex searches, but it cannot guarantee that query results are complete. To guarantee results, some P2P networks maintain a central directory, which greatly limits scalability and is not feasible in many cases. In a structured P2P network, the node topology is strictly defined, and nodes maintain the topology of the overlay network through specific protocols. This deterministic topology provides efficient, deterministic queries: as long as the target exists in the network, it will be found. The price is that maintaining the topology consumes network resources.
Because hash tables offer very efficient lookups, the distributed hash table (DHT) is a natural choice for building a structured P2P overlay network. A DHT is one large hash table maintained collectively by the nodes of the overlay network: the table is split into small partitions, and each node maintains and manages one of them. Each DHT node is assigned a unique identifier (node ID) of fixed length. Each resource object is hashed to a unique resource identifier (object ID) of the same length, and the resource is stored on the node whose ID is equal or closest to the object ID. Each node also keeps location information about some other nodes in its part of the table. When a node receives a query or node-location request for a key it does not hold, it forwards the request to a node whose ID is closer to the key, and so on until the query converges.
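The lookup procedure can be sketched in a few lines. This is a deliberately simplified illustration, not the algorithm of any specific DHT: it assumes SHA-1 for identifiers, interprets "closest" as "first node clockwise on an ID ring" (a Chord-style rule), and lets each node know only its immediate successor, so lookups take a linear number of hops. Real systems keep larger routing tables to shorten the path. All names (`Node`, `build_ring`, `put`, `get`) are hypothetical.

```python
import hashlib

ID_SPACE = 1 << 160  # SHA-1 produces 160-bit identifiers


def make_id(text):
    """Hash a node address or a resource name into the shared ID space."""
    return int.from_bytes(hashlib.sha1(text.encode("utf-8")).digest(), "big")


def in_half_open(x, start, end):
    """True if x lies in the ring interval (start, end]."""
    if start < end:
        return start < x <= end
    return x > start or x <= end  # interval wraps past zero (or covers the ring)


class Node:
    def __init__(self, address):
        self.address = address
        self.node_id = make_id(address)
        self.successor = self  # next node clockwise on the ring
        self.table = {}        # this node's partition of the global hash table

    def find_owner(self, key_id, hops=0):
        # The owner is the first node at or after key_id, going clockwise.
        if in_half_open(key_id, self.node_id, self.successor.node_id):
            return self.successor, hops + 1
        return self.successor.find_owner(key_id, hops + 1)  # forward the request


def build_ring(addresses):
    nodes = sorted((Node(a) for a in addresses), key=lambda n: n.node_id)
    for i, node in enumerate(nodes):
        node.successor = nodes[(i + 1) % len(nodes)]
    return nodes


def put(entry_node, name, value):
    owner, _ = entry_node.find_owner(make_id(name))
    owner.table[name] = value
    return owner


def get(entry_node, name):
    owner, hops = entry_node.find_owner(make_id(name))
    return owner.table.get(name), hops


ring = build_ring([f"10.0.0.{i}:7000" for i in range(1, 9)])
owner = put(ring[0], "song_a.mp3", "held by 10.0.0.3:7000")
value, hops = get(ring[5], "song_a.mp3")  # any node can serve as the entry point
```

Because every node applies the same rule, a lookup started at any node ends at the same owner, which is the determinism the text attributes to structured networks.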
The distributed P2P model also has drawbacks, mainly the following:
(1) In an unstructured network, a search request must traverse the entire network, or at least a large part of it, before a result is obtained. This mode therefore consumes a lot of bandwidth and takes a long time to return results.
(2) As the network grows, locating peers and querying information by flooding causes a sharp increase in network traffic, resulting in congestion.
(3) The biggest problem with DHT-based structures is the complexity of the maintenance mechanism. In particular, the churn caused by nodes frequently joining and leaving greatly increases the cost of maintaining the DHT. In addition, a DHT supports only exact-match keyword queries and cannot support complex queries such as content-based or semantic search.
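Drawbacks (1) and (2) can be made concrete with a small simulation of flooding search. The sketch below assumes a time-to-live (TTL) hop limit, which is a common way to bound flooding but is not mentioned in the text. The six-node topology and the function name `flood_search` are invented for the example.

```python
from collections import deque


def flood_search(graph, start, target_file, files, ttl):
    """Breadth-first flooding: each peer forwards the query to all its neighbours
    until the time-to-live runs out. Returns (holders, messages_sent)."""
    seen = {start}
    queue = deque([(start, ttl)])
    holders, messages = [], 0
    while queue:
        peer, remaining = queue.popleft()
        if target_file in files.get(peer, ()):
            holders.append(peer)
        if remaining == 0:
            continue  # TTL exhausted: this peer does not forward further
        for neighbour in graph[peer]:
            messages += 1              # one query message per link traversal
            if neighbour not in seen:  # duplicate queries are dropped on arrival
                seen.add(neighbour)
                queue.append((neighbour, remaining - 1))
    return holders, messages


# Hypothetical six-peer overlay; only peer F holds the file.
graph = {
    "A": ["B", "C"],
    "B": ["A", "C", "D"],
    "C": ["A", "B", "E"],
    "D": ["B", "F"],
    "E": ["C", "F"],
    "F": ["D", "E"],
}
files = {"F": {"x.iso"}}

near = flood_search(graph, "A", "x.iso", files, ttl=2)  # ([], 8)
far = flood_search(graph, "A", "x.iso", files, ttl=3)   # (['F'], 12)
```

With a TTL of 2 the query costs 8 messages and still misses the file; reaching the holder takes 12 messages in a network of only six peers. This shows both the incomplete results and the traffic growth described in the list above.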
2.3 Hybrid P2P Network
Centralized P2P supports fast retrieval of network resources and can scale as long as the server is powerful enough, but its centralized model is an easy target for direct attacks. Distributed P2P resists attacks, but searching is slow and scalability is poor. The hybrid structure combines the advantages of the centralized structure and the fully distributed unstructured topology. Nodes with high performance (processing, storage, bandwidth, and so on) are selected as super nodes, and each super node stores information about the ordinary nodes it is responsible for. The discovery algorithm forwards queries only between super nodes, and a super node then forwards the query to the appropriate leaf node. The hybrid structure is therefore hierarchical: the super nodes form a high-speed forwarding layer, and each super node together with the ordinary nodes it is responsible for forms a lower layer.
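The two-layer query path can be sketched as follows. The example assumes that every super node is directly connected to every other super node, so a query needs only one hop in the upper layer. That is a simplification made for brevity, and the names `SuperNode`, `attach_leaf`, and `query` are hypothetical.

```python
class SuperNode:
    """Hypothetical super node: indexes the files of the leaves attached to it."""

    def __init__(self, name):
        self.name = name
        self.leaf_index = {}  # leaf name -> set of file names
        self.neighbours = []  # other super nodes (the forwarding layer)

    def attach_leaf(self, leaf, files):
        self.leaf_index[leaf] = set(files)

    def local_lookup(self, filename):
        return sorted(leaf for leaf, files in self.leaf_index.items()
                      if filename in files)

    def query(self, filename):
        """Check own leaves, then forward the query to other super nodes only."""
        results = {self.name: self.local_lookup(filename)}
        for peer in self.neighbours:
            results[peer.name] = peer.local_lookup(filename)
        # Keep only the super nodes that actually found a matching leaf.
        return {sn: leaves for sn, leaves in results.items() if leaves}


sn1, sn2 = SuperNode("SN1"), SuperNode("SN2")
sn1.neighbours, sn2.neighbours = [sn2], [sn1]
sn1.attach_leaf("leaf-a", ["a.txt"])
sn2.attach_leaf("leaf-b", ["a.txt", "b.txt"])
sn2.attach_leaf("leaf-c", ["c.txt"])

found = sn1.query("a.txt")  # {'SN1': ['leaf-a'], 'SN2': ['leaf-b']}
```

Leaf nodes never see queries for files they do not hold, which is how the hybrid design avoids the flooding cost of a fully unstructured network.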
3. Impact of P2P on Network Models
3.1 Impact of P2P on Network Models
P2P technology will bring about a "fully distributed" service network. Traffic becomes more arbitrary, and direct data exchange between users becomes more frequent. Because P2P builds its network at the application layer, it gives application operators more flexibility, but it also creates a shortage of resources in the underlying bearer network, where equipment runs at full load for long periods. The impact of P2P on network models is mainly reflected in the following aspects:
(1) As the symmetry of P2P traffic and its share of total traffic increase, the traffic model of the metropolitan area network (MAN) gradually shifts from asymmetric to symmetric, which clearly conflicts with the asymmetric xDSL technology used in the access network.
(2) P2P flows are disordered, which degrades network performance and causes congestion. Bandwidth is heavily consumed without generating any return for the operator. Most current P2P tools open a large number of connections to ensure transmission quality, and many of these connections carry no data yet still consume network resources.
(3) Cross-province traffic exceeds intra-province and intra-city traffic, which puts growing pressure on operators to expand provincial backbone egress capacity.
3.2 How operators guide P2P services
Blocking P2P alone cannot completely solve the problem. Operators must guide the rational use of P2P services. Judging by their approach to P2P, the major domestic operators are going through a painful transformation: from blocking, to guidance, to operating P2P services themselves.
(1) Operator involvement helps allocate bandwidth resources sensibly. A video service platform built by an operator must be operable, manageable, and billable. Such platforms can reverse the current chaos in the P2P market, in which the vast majority of P2P companies care only about high download speeds and user experience and disregard the bandwidth resources of China Telecom, which is why operators regard P2P as a menace.
(2) Encourage existing P2P providers to take the operators' interests into account, for example by keeping more data exchange at the BRAS (broadband remote access server) layer and thereby saving operator bandwidth.
(3) Encourage P2P services to focus on user experience and to rework the way they are used.
(4) Accelerate the differentiation of P2P providers. The best choice for a P2P company is to become a strategic partner of the operators, which requires technology suited to the characteristics of operator networks.
At present, most operators treat P2P as an important research topic: they acknowledge its advantages and try to contain its drawbacks. On the one hand, operators need to adjust the network architecture and deploy P2P monitoring and management devices at the network edge. These devices identify P2P applications by inspecting application-layer signatures in data packets and manage P2P traffic by total volume, by sub-protocol, and by individual user. This helps operators limit inter-network P2P traffic, relieve the pressure to expand network capacity, and reduce inter-network settlement fees. On the other hand, users should be guided to use P2P reasonably so that it does not consume network resources without limit. For example, operators can build differentiated quality of service, apply different billing schemes to different service levels, manage users in tiers, and control total volume and per-user volume by time period, so that P2P becomes operable and controllable.
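One way to picture management "by total volume and by individual user" is a pair of token buckets, one for aggregate P2P traffic and one per user. The text does not say which mechanism operators actually use, so the token bucket here is an assumption chosen because it is a standard rate-limiting technique. The rates, the class names, and the caller-supplied `now` timestamp (in seconds) are hypothetical.

```python
class TokenBucket:
    """Classic token bucket: tokens are bytes, refilled at a fixed rate."""

    def __init__(self, rate, burst):
        self.rate = rate       # bytes added per second
        self.capacity = burst  # maximum bytes that can accumulate
        self.tokens = burst
        self.last = 0.0        # time of the last refill, in seconds

    def refill(self, now):
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now


class P2PShaper:
    """Admits a packet only if both the aggregate and the per-user budget allow it."""

    def __init__(self, total_rate, per_user_rate, burst):
        self.total = TokenBucket(total_rate, burst)
        self.per_user_rate = per_user_rate
        self.burst = burst
        self.users = {}

    def admit(self, user, size, now):
        bucket = self.users.get(user)
        if bucket is None:
            bucket = self.users[user] = TokenBucket(self.per_user_rate, self.burst)
            bucket.last = now  # a new user starts with a full bucket
        self.total.refill(now)
        bucket.refill(now)
        if size > self.total.tokens or size > bucket.tokens:
            return False       # drop or queue; neither budget is charged
        self.total.tokens -= size
        bucket.tokens -= size
        return True


shaper = P2PShaper(total_rate=1000, per_user_rate=500, burst=1000)
decisions = [
    shaper.admit("u1", 800, now=0.0),  # fits both budgets
    shaper.admit("u2", 300, now=0.0),  # aggregate budget is nearly used up
    shaper.admit("u1", 600, now=1.0),  # budgets have refilled after one second
]
```

Tiered service levels or time-of-day policies, as suggested in the text, could be expressed by giving different users different `per_user_rate` values, although that extension is not shown here.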
As networks gradually improve, P2P will mature as well. At present the technology is not yet fully mature, and breakthroughs are expected in its future development.
4. Impact of P2P on Network Security
The distributed structure of P2P networks provides scalability and flexibility, but it also poses a major security challenge: without a central node, the network must still provide mechanisms such as identity authentication, authorization, secure data transmission, digital signatures, and encryption. Current P2P technology is still far from this goal, and some of its inherent security defects hinder wider adoption.
4.1 Malware
When using P2P, it is very difficult to verify that the source of a shared file is trustworthy, so attackers often use P2P applications as a carrier for malicious code. Shared content may contain spyware, viruses, Trojans, and worms. Nodes in a P2P network differ in their ability to defend against viruses. Once one node is infected, the virus can spread to neighboring nodes through the network's sharing and communication mechanisms, causing congestion or even paralysis within a short time, and in the worst case allowing an attacker to take control of the entire network.
As P2P technology develops, more network viruses targeting P2P systems are likely to emerge. Attackers can exploit system vulnerabilities to quickly damage, crash, or take control of a system. This threat places higher demands on the security and robustness of P2P systems and calls for authorization control over both P2P application sources and P2P users.
4.2 Sensitive Information Leakage
When using P2P, a user may unknowingly grant other users access to personal or sensitive information, which allows intruders to deliberately obtain personal documents, account details, and similar data. Users therefore need to consider how to protect such data through appropriate security policies.
4.3 Security Control
Many P2P applications require specific ports to be opened on the firewall in order to accept shared files. Many current P2P applications also support firewall traversal and carry P2P packets over HTTP on port 80. Firewalls therefore need deep packet inspection (DPI): DPI scanning identifies the application-layer protocol for flow classification and determines the specific P2P service type, after which layer-3 traffic shaping is used to control the traffic. Early P2P applications used fixed port numbers and were easy to detect and manage. Later applications moved to dynamic, random port numbers, which made some traditional detection methods ineffective. More recent P2P applications actively evade detection: they use encryption, disguise their traffic as HTTP, and transfer data in blocks to escape identification. Devising new detection methods for rapidly evolving P2P applications is a problem that requires in-depth research.
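The idea of classifying a flow by its payload rather than its port can be sketched with a simple prefix matcher. The two signatures used are the well-known opening bytes of the BitTorrent peer handshake and of the Gnutella connection handshake. The function name, the label strings, and the decision to inspect only the first payload of a flow are assumptions made for this example. As the text notes, encrypted or disguised traffic defeats this kind of matching.

```python
# Payload prefixes that identify a protocol regardless of the port in use.
SIGNATURES = [
    ("bittorrent", b"\x13BitTorrent protocol"),  # length byte 19 + protocol name
    ("gnutella", b"GNUTELLA CONNECT/"),
]
HTTP_METHODS = (b"GET ", b"POST ", b"HEAD ", b"PUT ")


def classify_payload(payload: bytes, dst_port: int) -> str:
    """Return a protocol label for the first payload bytes of a flow."""
    for name, prefix in SIGNATURES:
        if payload.startswith(prefix):
            return name
    if payload.startswith(HTTP_METHODS):
        return "http"
    if dst_port == 80:
        # Port 80 without a recognisable HTTP request deserves a closer look.
        return "unknown-on-port-80"
    return "unknown"


handshake = b"\x13BitTorrent protocol" + bytes(8)
labels = [
    classify_payload(handshake, 80),                       # P2P hiding on port 80
    classify_payload(b"GET /index.html HTTP/1.1\r\n", 80),
    classify_payload(b"\x8f\x02\xaa\x17", 80),
]
```

A port-based filter would treat all three flows as web traffic, whereas the payload check separates them.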
4.4 P2P Applications Flooding the Network
When P2P applications flood the network, carrier-grade service cannot be guaranteed. Network edge devices therefore need service identification and service awareness, together with strong QoS functions. At the IP layer, P2P flows can be identified from statistical traffic characteristics, which makes it possible to detect P2P flows that are encrypted or of a previously unknown type and to block unauthorized P2P flows.
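Statistical identification needs no payload at all. A minimal sketch, using the two traits this article attributes to P2P traffic (a large number of connections and roughly symmetric upload and download volumes), might look like this. The data class, the field names, and especially the thresholds are illustrative assumptions rather than measured values.

```python
from dataclasses import dataclass


@dataclass
class HostWindow:
    """Per-host counters for one observation window (hypothetical schema)."""
    distinct_peers: int  # number of different remote addresses contacted
    bytes_up: int
    bytes_down: int


def looks_like_p2p(w, min_peers=50, min_up_share=0.3):
    """Flag hosts that contact many peers AND upload a large share of their traffic.

    Both thresholds are illustrative guesses and would need tuning on real data.
    """
    total = w.bytes_up + w.bytes_down
    if total == 0:
        return False
    return w.distinct_peers >= min_peers and w.bytes_up / total >= min_up_share


p2p_host = HostWindow(distinct_peers=200, bytes_up=4_000_000_000, bytes_down=5_000_000_000)
web_host = HostWindow(distinct_peers=12, bytes_up=2_000_000, bytes_down=300_000_000)
flags = [looks_like_p2p(p2p_host), looks_like_p2p(web_host)]
```

Because the decision uses only counters, it still works when the payload is encrypted. The trade-off is a risk of false positives for other applications that also contact many peers.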
5. Conclusion
After years of development and evolution, P2P applications are widely used and are attracting more and more enterprises to research in this area. Blocking P2P alone cannot completely solve the problem. It is necessary to guide the rational use of P2P services, and P2P development issues such as copyright, standards, security, and management deserve further research and discussion.