The topology of a distributed system is the physical or logical interconnection between its computing units. The topology among nodes has long been an important basis for classifying systems. Centralized and hierarchical topologies are currently widely used in interconnected networks. The Internet is the world's largest decentralized interconnected network, yet some network application systems established in the 1990s are completely centralized, and many web applications run on centralized server systems. Centralized topologies currently face some difficult problems, such as excessive storage load and DoS attacks.
P2P systems generally need to construct a decentralized topology. During construction, they must solve problems such as how to name and organize a large number of nodes, how nodes join and leave the system, and how to recover from errors.
Based on topology, P2P research can be divided into four forms: centralized topology; decentralized unstructured topology; decentralized structured topology, also known as a DHT network; and partially decentralized topology.
Among them, the biggest advantages of the centralized topology are simple maintenance and efficient discovery. Because resource discovery relies on a centralized directory system, discovery algorithms are flexible and efficient and can support complex queries. Its biggest problem is that, like the traditional client/server structure, it is prone to a single point of failure (SPOF), access "hot spots", and legal issues. This is the model used by the first generation of P2P networks; the typical example is the famous MP3-sharing software Napster.
Napster was one of the first P2P systems to emerge, and it grew rapidly in a short time. Napster is not a pure P2P system: a central server stores the index and location information of all the music files shared by Napster users. When a user needs a music file, he or she first connects to the Napster server and searches there, and the server returns information about the users who hold the file; the requester then connects directly to the file owner to transfer the file.
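The lookup-then-transfer flow described above can be sketched roughly as follows. This is a minimal in-memory illustration, not Napster's actual protocol: the names `CentralIndex`, `Peer`, and `download`, as well as the address strings, are hypothetical, and real network connections are replaced by plain method calls.

```python
from collections import defaultdict


class CentralIndex:
    """Hypothetical stand-in for the central directory server.

    It stores only index data (file name -> owner addresses),
    never the file contents themselves.
    """

    def __init__(self):
        self._owners = defaultdict(set)

    def register(self, peer_addr, file_names):
        # A peer announces the files it is willing to share.
        for name in file_names:
            self._owners[name].add(peer_addr)

    def unregister(self, peer_addr):
        # A peer leaves; drop it from every entry.
        for owners in self._owners.values():
            owners.discard(peer_addr)

    def search(self, file_name):
        # Return the addresses of all peers that hold the file.
        return sorted(self._owners.get(file_name, ()))


class Peer:
    """Hypothetical peer that holds files and serves them directly."""

    def __init__(self, addr, files):
        self.addr = addr
        self.files = dict(files)  # file name -> content bytes

    def serve(self, file_name):
        return self.files[file_name]


def download(index, peers_by_addr, file_name):
    """Query the central index, then fetch from an owner directly."""
    owners = index.search(file_name)      # step 1: ask the server
    if not owners:
        return None
    owner = peers_by_addr[owners[0]]      # step 2: contact the owner
    return owner.serve(file_name)         # step 3: peer-to-peer transfer
```

Note that the file bytes never pass through `CentralIndex`; only the search does, which is the separation of query and transfer that the design relies on.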
Napster was the first to separate file query from file transmission, which effectively saves the central server's bandwidth and reduces the system's file transmission latency. The biggest risk of this approach lies in the central server: if the server fails, the entire system is paralyzed. When the number of users grows to 10^5 or more, Napster's performance degrades significantly. Another problem is security: Napster does not provide an effective security mechanism.
In the Napster model, a group of high-performance central servers stores the directory information of all active peers that share resources in the network. To query a file, a peer sends a file query request to a central server. The central server performs the search and returns a list of addresses of the peers that satisfy the query. After receiving the response, the query initiator chooses among the returned peers based on network traffic and latency, establishes a connection with a suitable peer, and starts the file transfer.
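The final selection step, in which the initiator picks one peer from the returned list, might look something like the sketch below. The text only says that the choice depends on network traffic and latency; the `PeerInfo` fields, the linear cost function, and the default weights are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PeerInfo:
    """Hypothetical entry in the address list returned by the server."""
    addr: str
    latency_ms: float  # measured round-trip time to the peer
    load: float        # assumed traffic indicator: 0.0 idle, 1.0 saturated


def choose_peer(candidates, latency_weight=1.0, load_weight=100.0):
    """Pick the candidate with the lowest combined cost.

    The weighted sum is an illustrative assumption, not a documented rule.
    Returns None when the server found no matching peers.
    """
    if not candidates:
        return None

    def cost(peer):
        return latency_weight * peer.latency_ms + load_weight * peer.load

    return min(candidates, key=cost)
```

With these default weights, a nearby but heavily loaded peer can lose to a slightly more distant idle one; setting `load_weight=0` reduces the rule to choosing the lowest latency.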
This peer-to-peer network model has many problems, including:
(1) A failure of the central server can easily bring down the entire network, so reliability and security are low.
(2) As the network grows, the cost of maintaining and updating the central index server rises sharply and becomes too high.
(3) The existence of the central server has caused copyright disputes over shared resources, and the model is therefore criticized as not being a pure P2P network model.
For small networks, the centralized directory model has some advantages in management and control. However, because of these defects, it is not suitable for large-scale network applications.