P2P popularity Series II: distributed architecture

Source: Internet
Author: User
Pastry is a scalable distributed object location and routing protocol proposed by Microsoft Research. It can be used to build large-scale P2P systems. In Pastry, each node is assigned a 128-bit node identifier (nodeId). All node identifiers form a circular nodeId space ranging from 0 to 2^128 - 1. When a node joins the system, its nodeId is obtained by hashing the node's IP address, so that nodeIds are spread uniformly at random over the 128-bit space.
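The nodeId assignment described above can be sketched in a few lines of Python. This is only an illustration: the text does not say which hash function is used, so truncating SHA-256 to 128 bits is an assumption here, and the names `node_id` and `ID_SPACE` are hypothetical.

```python
import hashlib

ID_BITS = 128
ID_SPACE = 2 ** ID_BITS  # identifiers range from 0 to 2^128 - 1


def node_id(ip_address: str) -> int:
    """Hash an IP address to a 128-bit identifier.

    Assumption: SHA-256 truncated to 16 bytes stands in for whatever
    hash a real deployment would use.
    """
    digest = hashlib.sha256(ip_address.encode("ascii")).digest()
    return int.from_bytes(digest[:16], "big")  # 16 bytes = 128 bits


if __name__ == "__main__":
    for ip in ("10.0.0.1", "10.0.0.2"):
        # Neighbouring IP addresses land far apart in the id space.
        print(ip, format(node_id(ip), "032x"))
```

Because the hash output looks random, nodes that are close in the physical network usually end up far apart on the identifier circle, which is what spreads load evenly.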

At MIT, several P2P-related research projects have been carried out: Chord, GRID, and RON. The goal of the Chord project is to provide a distributed resource discovery service suited to P2P environments. By using DHT technology, it makes it possible to locate a specified object while each node maintains a routing table of only O(log N) entries.
In DHT technology, each network node is assigned a unique node ID in some way, and each resource object is given a unique resource ID (object ID) by a hash operation. A resource is stored on the node whose ID is equal or closest to the resource ID. To find the resource, the same method is used to locate the node that stores it. The main contribution of Chord is therefore a distributed lookup protocol that maps a given key to the node responsible for it. From an algorithmic point of view, Chord is a variant of consistent hashing. The MIT GRID and RON projects propose system frameworks for discovering resources in distributed wide-area networks.
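The key-to-node mapping described above can be sketched as a consistent-hashing ring. This is a simplified illustration, not the Chord protocol itself: it assumes every node is known in one place, leaves out the O(log N) routing table, and uses a 32-bit identifier space to keep the numbers small. The names `Ring`, `chord_id`, and `lookup` are hypothetical.

```python
import bisect
import hashlib

M = 32  # identifier bits (assumption: small value chosen for readability)


def chord_id(name: str) -> int:
    """Hash a node name or a key onto the identifier circle."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** M)


class Ring:
    def __init__(self, node_names):
        # Nodes sorted by their position on the circle.
        self.nodes = sorted((chord_id(n), n) for n in node_names)
        self.ids = [node_id for node_id, _ in self.nodes]

    def successor(self, key_id: int) -> str:
        """Return the first node whose id is >= key_id, wrapping around."""
        idx = bisect.bisect_left(self.ids, key_id)
        if idx == len(self.ids):
            idx = 0  # past the largest id: wrap to the smallest
        return self.nodes[idx][1]

    def lookup(self, key: str) -> str:
        """Map a key to the node responsible for storing it."""
        return self.successor(chord_id(key))


if __name__ == "__main__":
    ring = Ring(["node-a", "node-b", "node-c"])
    print(ring.lookup("song.mp3"))
```

Storing and finding a resource both call the same `lookup`, which is why a searcher ends up at the node that holds the resource.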

The Content-Addressable Network (CAN) project at the AT&T ACIRI center is distinctive in that it uses multi-dimensional identifiers to implement a distributed hash algorithm. CAN maps all nodes into a d-dimensional Cartesian space and allocates each node a zone, as evenly as possible. CAN's hash function hashes the key of a (key, value) pair to a point in the Cartesian space, and the pair is stored on the node that owns the zone containing that point. The routing algorithm is direct and simple: once the coordinates of the target point are known, the request is forwarded to whichever of the current node's adjacent nodes (four in a two-dimensional space) has coordinates closest to the target point. CAN is a highly scalable system: given n nodes and a system dimension of d, the routing path length is O(n^(1/d)), and the amount of routing table information maintained by each node is independent of the network size.
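The greedy routing rule can be sketched on an idealized layout. The sketch below assumes that the coordinate space wraps around (a torus), that it is divided into equal unit zones with one node per zone, and that a zone is named by its integer corner; real CAN zones are split and merged dynamically as nodes join and leave. All function names are hypothetical.

```python
import hashlib


def key_to_point(key: str, dims: int, side: int):
    """Hash a key to a point in [0, side)^dims (SHA-256 is an assumption)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return tuple(
        int.from_bytes(digest[4 * i:4 * i + 4], "big") / 2 ** 32 * side
        for i in range(dims)  # 4 bytes per dimension, so dims <= 8
    )


def owner_zone(point):
    """The unit zone (integer corner) that contains a point."""
    return tuple(int(c) for c in point)


def distance(p, q, side):
    """Euclidean distance on a space that wraps around in every dimension."""
    total = 0
    for a, b in zip(p, q):
        delta = abs(a - b)
        total += min(delta, side - delta) ** 2
    return total ** 0.5


def neighbors(zone, side):
    """The 2*d zones one step away along exactly one dimension."""
    result = []
    for dim in range(len(zone)):
        for step in (-1, 1):
            moved = list(zone)
            moved[dim] = (moved[dim] + step) % side
            result.append(tuple(moved))
    return result


def route(start, target_point, side):
    """Forward hop by hop to the neighbor closest to the target zone."""
    target = owner_zone(target_point)
    path = [start]
    while path[-1] != target:
        nxt = min(neighbors(path[-1], side), key=lambda z: distance(z, target, side))
        path.append(nxt)
    return path


if __name__ == "__main__":
    point = key_to_point("song.mp3", dims=2, side=8)
    print(route((0, 0), point, side=8))
```

Each node only needs to know its 2*d neighbors, whatever the total number of zones, while the number of hops grows with the side length of the grid, that is, with n^(1/d).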

The biggest problem with DHT-based structures is that DHT maintenance is complicated. In particular, the network churn caused by nodes frequently joining and leaving greatly increases the maintenance cost. Another problem is that a DHT supports only exact-match keyword queries and cannot support complex queries based on content or semantics.

The semi-distributed structure (called a hybrid structure in some of the literature) draws on the advantages of both the centralized structure and the fully distributed unstructured topology. Nodes with higher performance (in processing, storage, bandwidth, and so on) are selected as super nodes (also called supernodes or hubs). Each super node stores information about other nodes in the system, and a super node forwards a query request to the appropriate leaf nodes. The semi-distributed structure is also hierarchical: the super nodes form a high-speed forwarding layer among themselves, and each super node forms a further layer with the ordinary nodes it is responsible for. The most typical example is KaZaA.
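The two-layer arrangement can be sketched as follows. This is a toy model under stated assumptions, not the protocol of any real client: leaves register their file lists with one super node, and a search consults the local index and then each directly connected super node once. The class and method names are hypothetical.

```python
class SuperNode:
    def __init__(self, name):
        self.name = name
        self.index = {}   # file name -> set of leaf names that hold it
        self.peers = []   # other super nodes in the forwarding layer

    def register(self, leaf_name, file_names):
        """A leaf node uploads its file list to its super node."""
        for file_name in file_names:
            self.index.setdefault(file_name, set()).add(leaf_name)

    def search(self, file_name):
        """Look in the local index, then ask each peer super node once."""
        hits = set(self.index.get(file_name, ()))
        for peer in self.peers:
            hits |= peer.index.get(file_name, set())
        return hits


if __name__ == "__main__":
    sn1, sn2 = SuperNode("sn1"), SuperNode("sn2")
    sn1.peers.append(sn2)
    sn2.peers.append(sn1)

    sn1.register("leaf-a", ["song.mp3", "notes.txt"])
    sn2.register("leaf-b", ["song.mp3"])

    # The query only travels between super nodes; leaves are never flooded.
    print(sorted(sn1.search("song.mp3")))
```

Keeping the index on a small number of capable machines means a query touches only the super node layer, while the leaves are contacted only to fetch the file.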

KaZaA is one of the world's most popular P2P applications. According to CA statistics, KaZaA has been downloaded more than 250 million times worldwide, and file transfers using KaZaA consume 40% of Internet bandwidth. It is so successful because it combines the advantages of Napster and Gnutella. Structurally, it follows Gnutella's fully distributed design, which scales better because file names do not need to be stored on a central index server. It automatically promotes high-performance machines to super nodes, each of which stores the file information of the leaf nodes closest to it. The super nodes are connected to form an overlay network, and their indexing function greatly improves search efficiency.

 
