The principle of DHT technology

I. Introduction to peer-to-peer and DHT technology

Peer-to-peer (P2P) can be seen as a very concentrated embodiment of the Internet's spirit: common participation, open transparency, and equal sharing (which reminds me of the "centralized" cloud-computing systems I studied before and that are now so wildly hyped). Many applications are built on P2P technology, including file sharing, instant messaging, collaborative processing, and streaming media. Through contact with, analysis of, and understanding of these applications, the essence of P2P emerges: a new network communication technology that breaks the traditional client/server (C/S) structure and moves step by step toward decentralization and flatness, which to some extent echoes the "the world is flat" trend. P2P file-sharing applications (BT, eMule, and the like) are the most concentrated use of P2P technology, so our study takes P2P file-sharing networks as its entry point. These networks have developed through several stages: networks with tracker servers, pure DHT networks without any servers, and hybrid P2P networks. The development of the DHT network is partly a "development" of this "thought/culture", and partly driven by commercial needs (copyright management).

DHT stands for Distributed Hash Table. It is a distributed storage method: each piece of information that can be uniquely identified by a key is stored on multiple nodes according to an agreed protocol, which effectively avoids the single point of failure where a "centralized" server, such as a tracker, can paralyze the whole network. Many techniques/algorithms implement a DHT; the common ones are Chord, Pastry, Kademlia, and so on. We study the Kademlia algorithm here, because BT and its derivatives (Mainline, BitSpirit, BitComet, uTorrent ...) and eMule and its various mods (VeryCD, easyMule, Xtreme ...) all build their DHT networks on this algorithm. BT uses a Python implementation of Kademlia called Khashmir (official website below); eMule uses a C++ implementation simply called Kad. There are some differences between the two, but the basis of both is Kademlia. Here we take the BT DHT as the example for our analysis; "DHT" below means the BT Kademlia DHT by default.

Official website: http://www.tribler.org/trac/wiki/Khashmir

II. How Kademlia works

Whatever the DHT implementation algorithm, Chord, Pastry, or Kademlia, the most immediate goal is to locate the desired node as fast as possible; in a P2P file-sharing application, that means finding, as fast as possible, the peers list of whoever is sharing a given file/torrent. Because nodes are distributed anywhere on earth, measuring the distance between two nodes by geographic distance would be extremely complex or even impossible, so essentially all DHT algorithms use some kind of logical distance instead. Kademlia uses a simple XOR computation to measure the distance between two nodes. It has nothing to do with geographic distance, but it has the essential properties of a metric (see the sketch after this list):

(1) The XOR distance between a node and itself is 0.

(2) The XOR distance is symmetric: the distance from A to B equals the distance from B to A.

(3) The XOR distance satisfies the triangle inequality: given three nodes A, B, and C, the XOR distance from A to C is less than or equal to the sum of the XOR distances A to B and B to C.

(4) For a given node A and a given distance, there is exactly one node B at that distance from A; lookups are therefore unidirectional and converge along the same path, which differs from geographic distance.
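
These properties are easy to check in code. Below is a minimal sketch, assuming 160-bit IDs generated with SHA-1 as described in the next paragraph; the names node_id and distance are illustrative, not taken from Khashmir:

    import hashlib

    def node_id(seed: bytes) -> int:
        # A node ID is a 160-bit SHA-1 digest; kept as an int so XOR is easy.
        return int.from_bytes(hashlib.sha1(seed).digest(), "big")

    def distance(a: int, b: int) -> int:
        return a ^ b

    a, b, c = node_id(b"A"), node_id(b"B"), node_id(b"C")
    assert distance(a, a) == 0                                # property (1)
    assert distance(a, b) == distance(b, a)                   # property (2)
    assert distance(a, c) <= distance(a, b) + distance(b, c)  # property (3)
    assert a ^ distance(a, b) == b                            # property (4)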

Kademlia gives every node a node ID, generated with the same algorithm as the info-hash in a torrent file: SHA-1 (Secure Hash Algorithm). So each node ID, like the info-hash of each shared file/torrent, is unique and is 20 bytes (160 bits) long. The distance between two nodes is the XOR of their node IDs, and the distance between a node and a key (a torrent) is the XOR of the node's ID and the torrent's info-hash. On top of this XOR metric, Kademlia organizes the whole DHT network topology as a binary prefix tree: every node (every running BT or BitSpirit-style application with the DHT function enabled) is a leaf of this tree. You can imagine that the binary tree can hold up to 2^160 leaves (nodes), enough to organize a network of any size. From the viewpoint of any single node, the tree divides into 160 subtrees according to distance; each subtree shares a common prefix with the node, and the shorter the common prefix, the farther away the subtree. As shown in the following illustration:


[Figure: a binary prefix tree divided into subtrees around one node]

(Note: the figure is only an example; the subtree leaves shown are not all on the same level.)

In the figure the red node uses the 4-bit ID 0011 as an example: it divides all other nodes into 4 different subtrees. The nearer a subtree is to the node, the longer their common prefix, and if node IDs are evenly distributed, the nearer subtrees contain fewer leaves (the nearest is the single sibling sharing a 159-bit common prefix). Because all nodes sit at the leaf level, the leaves look like points on a line; if you view that line as the space of 2^160 points, the division above (repeated binary splitting) is easier to picture. To be able to reach each of these 160 subtrees quickly, every node in the DHT network records the information (ip, port, id) of K nodes from each subtree; in BT, K is fixed at 8. The red node above, for example, might hold the information of 8 leaves of the leftmost subtree. Of course, a subtree close to the node may not have 8 leaves at all, in which case all its existing leaves are recorded. In the Kademlia algorithm this recorded information is called a K bucket, and the whole set is also called the "routing table". Of course this "routing table" means something slightly different from our IP routing: bucket i represents nodes whose distance from the node lies in the range [2^i, 2^(i+1)), and lookups can be narrowed further through the K nodes selected in that range. The following figure shows the structure of a "routing table":

[Figure: K-bucket "routing table" structure]



Note: this is only an example; a real "routing table" may not have 160 buckets, because the routing table is generated by repeated halving. Initially there is a single K bucket (covering the range [0, 2^160) and containing only the node itself). During insertion, once a K bucket holds more than K (8) nodes it is split into two halves, one containing the node's own ID and one not. The cycle continues, forming a "routing table" of dynamic size (1 <= len(table) <= 160).
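
Under the full 160-bucket view, finding a node's bucket is just locating the highest bit in which the two IDs differ. A minimal sketch (the function name is illustrative):

    def bucket_index(my_id: int, other_id: int) -> int:
        # Bucket i holds nodes at XOR distance d with 2**i <= d < 2**(i+1),
        # i.e. i = floor(log2(d)); for 160-bit IDs the result is in 0..159.
        d = my_id ^ other_id
        assert d != 0, "a node does not put itself in a bucket"
        return d.bit_length() - 1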

Each node newly joining the DHT network starts with an empty "routing table". There are several ways it can incrementally build up its own "routing table" (a bootstrap sketch follows this list):

(1) If this node has run before, read the routing table directly from the saved routing table file, then refresh it.

(2) If the node is starting for the first time (for example, a freshly installed BitSpirit), it may come with embedded "super nodes" and use them to indirectly generate its own "routing table" (one version of Khashmir has a file holding such "super node" contact information; BitSpirit, BitComet, and eMule each embed twenty-odd of them).

(3) If a node starting for the first time has no such "super nodes" (Mainline, for example), generating its routing table has to be deferred to the download process. It extracts the nodes field from the torrent file it obtains. This field is written when the torrent is made (for torrents supporting the DHT network); generally it is set to the IP and port of the original seeder, or to the K nodes closest to the torrent's info-hash at creation time. The nodes in this field are then used to indirectly generate the routing table.

(4) Dynamic building: after initialization, whenever the node receives any message from any other node, whether it is downloading, uploading, or idle, it checks the current "routing table" and tries to build/refresh it according to certain rules.
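
A minimal bootstrap sketch tying methods (1)-(3) together: send a find_node query for our own ID to any contact we have, whether from a saved table, an embedded "super node", or a torrent's nodes field. The address below is a hypothetical placeholder, and the message is the find_node format from section III:

    import socket

    BOOTSTRAP = ("dht.example.org", 6881)   # hypothetical "super node" address

    def bootstrap_query(my_id: bytes) -> bytes:
        assert len(my_id) == 20
        # Hand-rolled bencoding of a find_node query targeting our own ID;
        # the nodes field of the response seeds the new routing table.
        msg = (b"d1:ad2:id20:" + my_id + b"6:target20:" + my_id +
               b"e1:q9:find_node1:t4:tttt1:y1:qe")
        s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        s.settimeout(5)
        s.sendto(msg, BOOTSTRAP)
        return s.recvfrom(1500)[0]          # bencoded response (see section III)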

We know that the main goal of the DHT network is to replace the tracker (a pure P2P network, with no tracker) or to serve as a backup for the tracker (a hybrid P2P network, which essentially all mainstream file-sharing applications are today). The tracker's main job is to maintain a peers list for each shared file (torrent) and tell a querying client whom to download from. The DHT implements this by taking the peers-list information a tracker would maintain centrally, hashing it, and spreading it across all the nodes in the DHT network, then providing a way to search on top of that. The "routing table" exists to speed up that search.

The DHT implementation includes two kinds of lookup: finding nodes (find_node) and finding peers (get_peers). Finding nodes is mainly for building the local "routing table"; its ultimate purpose is the later lookup of peers. Finding a node works roughly like this: if node X wants to find node Y, X takes the K closest nodes from the local K bucket corresponding to XOR(X, Y), then asks each of them whether it has nodes even closer to Y. Each of those K nodes in turn returns K closer nodes from its own corresponding K bucket; X picks the K closest from the results and repeats, until no closer nodes are returned, at which point the K nodes found last are the closest nodes. Any close node returned along the way is also a candidate for insertion into X's own routing table. Looking up a peers list is similar, except the lookup takes an info-hash as its parameter, and if any node along the way returns an (info-hash, peers-list) pair, the lookup ends early.

Once a node has obtained a peers list this way, it tries to open a TCP connection to each peer and continue with the actual download (as defined by the peer wire protocol). It also sends its own peer information to the K nodes closest to the info-hash in its results, so that they store it in their peers-list information. That information is kept on those K nodes for 24 hours, and becomes invalid after 24 hours unless X sends an update. So an active node stores two kinds of information: its local "routing table", and a list of (info-hash, peers-list) entries (there can be several). The info-hash values of course also live in the [0, 2^160) space, but unlike a node ID, which is a leaf of the invisible binary prefix tree (invisible because no node actually stores the tree as a data structure), an info-hash only attaches to the node IDs closest to its value.
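
A minimal sketch of the (info-hash, peers-list) store with the 24-hour expiry just described; the class layout is illustrative, not Khashmir's actual code:

    import time

    EXPIRY = 24 * 3600   # an announcement goes stale after 24 hours

    class PeerStore:
        def __init__(self):
            self.peers = {}   # info_hash -> {(ip, port): last_announce_time}

        def announce(self, info_hash, ip, port):
            self.peers.setdefault(info_hash, {})[(ip, port)] = time.time()

        def get_peers(self, info_hash):
            now = time.time()
            # Drop peers that have not re-announced within 24 hours.
            live = {p: t for p, t in self.peers.get(info_hash, {}).items()
                    if now - t < EXPIRY}
            self.peers[info_hash] = live
            return list(live)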

III. Kademlia messages

To implement the functions above, building and refreshing the "routing table", getting a peers list, and storing a peers list, Kademlia defines four basic RPC operations:

(1) PING: probes a node to determine whether it is still online.

(2) STORE: tells a node to store a <key, value> pair for later queries.

(3) FIND_NODE: the recipient returns to the sender the information (IP address, UDP port, node ID) of K nodes from the corresponding K bucket of its "routing table".

(4) FIND_VALUE: takes an info-hash as its parameter; if the recipient happens to have stored peers for that info-hash, it returns the peers list, otherwise it returns from its own "routing table" the K nodes closest to the info-hash (the same process as FIND_NODE).

These are only the most basic operations; looking up a node or an info-hash requires the node to perform several of the find operations above, in a recursive lookup process. Using these operations, the procedure for node x to find the node whose ID value is t is as follows (a sketch of the loop follows the list):

1. Compute the distance to t: d(x, t) = x ⊕ t.

2. Take alpha nodes' information from x's ⌊log2 d⌋-th K bucket (the value of alpha differs per implementation: some use 3, some use the value of K) and send each one a FIND_NODE operation. If that K bucket holds fewer than alpha entries, select the alpha nodes closest to d from the neighboring buckets.

3. Each node receiving the query checks whether it is t itself; if so, it answers that it is the one closest to t. Otherwise it measures its own distance to t, selects alpha nodes' information from its corresponding K bucket, and returns them to x.

4. x performs the FIND_NODE operation again on each newly received node. This repeats until every branch has yielded the nodes closest to t, that is, until the nodes returned by FIND_NODE have all been queried already and no closer node can be found.

5. Through the lookups above, x obtains the information of the K nodes closest to t.

Note: the term "closest" is used because a node whose ID value is exactly t does not necessarily exist in the network; that is, t may not be assigned to any computer.
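
A minimal sketch of this lookup loop, with the network plumbing left to the caller: rpc(node, target) stands for one FIND_NODE round trip and returns the node objects the peer reported. The parameter names and the node objects (with an .id attribute) are illustrative; alpha = 3 follows the Kademlia paper:

    def iterative_find_node(target, seed_nodes, rpc, k=8, alpha=3):
        # Step 2: seed the shortlist from our own buckets.
        shortlist = sorted(seed_nodes, key=lambda n: n.id ^ target)[:k]
        queried = set()
        while True:
            pending = [n for n in shortlist if n.id not in queried][:alpha]
            if not pending:          # step 5: nothing closer is being returned
                return shortlist
            for node in pending:     # steps 3-4: query alpha close nodes
                queried.add(node.id)
                shortlist.extend(rpc(node, target))
            # Keep only the k candidates closest to the target, deduplicated.
            shortlist = sorted({n.id: n for n in shortlist}.values(),
                               key=lambda n: n.id ^ target)[:k]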

The process of finding a peers list is the same, but with the FIND_VALUE operation substituted; keeping in mind the differences mentioned above, it can be described in the same way.

On top of this, BT-DHT renames the four primitives and defines the four message types below. The protocol is called KRPC (K for Khashmir/Kademlia); messages are sent over UDP, and each request expects a response or an error.

(1) ping (same function as in Kademlia)

Bencoded (taking BitSpirit as an example):

Ping Request Format:

d1:ad2:id20:xxxxxxxxxxxxxxxxxxxxe1:q4:ping1:t4:tttt1:y1:qe

Meaning: this is a ping request; the parameter is the sender's ID xxxxxxxxxxxxxxxxxxxx.

Ping response format:

d1:rd2:id20:yyyyyyyyyyyyyyyyyyyye1:t4:tttt1:y1:re

The returned data contains only the responder's ID.
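
The wire format is plain bencoding. A minimal encoder sketch, just enough to reproduce the ping query above (a real client would use a complete bencode library):

    def bencode(x) -> bytes:
        if isinstance(x, int):
            return b"i%de" % x
        if isinstance(x, bytes):
            return b"%d:%s" % (len(x), x)
        if isinstance(x, list):
            return b"l" + b"".join(bencode(v) for v in x) + b"e"
        if isinstance(x, dict):   # keys are byte strings, emitted in sorted order
            return b"d" + b"".join(bencode(k) + bencode(v)
                                   for k, v in sorted(x.items())) + b"e"
        raise TypeError(type(x))

    ping = {b"t": b"tttt", b"y": b"q", b"q": b"ping", b"a": {b"id": b"x" * 20}}
    print(bencode(ping))
    # b'd1:ad2:id20:xxxxxxxxxxxxxxxxxxxxe1:q4:ping1:t4:tttt1:y1:qe'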

(2) find_node (same function as in Kademlia)

Bencoded (taking BitSpirit as an example):

Find_node Request Format:

d1:ad2:id20:xxxxxxxxxxxxxxxxxxxx6:target20:yyyyyyyyyyyyyyyyyyyye1:q9:find_node1:t4:tttt1:y1:qe

Meaning: this is a find_node request; the parameters are the sender's ID and the target node's ID.

Find_node response format:

d1:rd2:id20:xxxxxxxxxxxxxxxxxxxx5:nodes208:nnnnnnnnnnnnn5:token20:ooooooooooooooooooooe1:t4:tttt1:y1:re

Meaning: the 8 closest nodes were found; nodes208 stands for 8 node entries (ip, port, id), 208 bytes in total.
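
Each of those 8 entries is 26 bytes of compact node info: a 20-byte node ID, a 4-byte IPv4 address, and a 2-byte big-endian port (8 x 26 = 208 bytes). A minimal parsing sketch:

    import socket, struct

    def parse_nodes(data: bytes):
        assert len(data) % 26 == 0
        for off in range(0, len(data), 26):
            nid = data[off:off + 20]
            ip = socket.inet_ntoa(data[off + 20:off + 24])
            (port,) = struct.unpack("!H", data[off + 24:off + 26])
            yield nid, ip, port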

(3) get_peers (corresponds to the find_value message in Kademlia)

Bencoded (taking BitSpirit as an example):

Get_peers Request Format:

d1:ad2:id20:xxxxxxxxxxxxxxxxxxxx9:info_hash20:zzzzzzzzzzzzzzzzzzzze1:q9:get_peers1:t4:tttt1:y1:qe

Meaning: this is a get_peers request; the parameters are the sender's ID and the info-hash of the torrent being queried.

There are two kinds of get_peers response. One is returned when the queried node holds the peers list for the info-hash, in the following format:

d1:rd2:id20:xxxxxxxxxxxxxxxxxxxx5:token20:oooooooooooooooooooo6:valuesl6:(ip1,port1)6:(ip2,port2)...ee1:t4:tttt1:y1:re

(values is followed by the peers list; each entry is an ip and port)
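
Each entry in the values list is a 6-byte string: a 4-byte IPv4 address followed by a 2-byte big-endian port. A minimal parsing sketch:

    import socket, struct

    def parse_peers(values):
        for entry in values:   # each entry is 6 bytes: ip + port
            ip = socket.inet_ntoa(entry[:4])
            (port,) = struct.unpack("!H", entry[4:6])
            yield ip, port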

The other is returned when the peers list is not found, in the following format:

d1:rd2:id20:xxxxxxxxxxxxxxxxxxxx5:nodes208:nnnnnnnnnnnnn5:token20:ooooooooooooooooooooe1:t4:tttt1:y1:re

Meaning: no peers for the info-hash were found, but the 8 nodes closest to the info-hash were returned; nodes208 again stands for 8 node entries (ip, port, id), 208 bytes in total.

(4) announce_peer (corresponds to the STORE message in Kademlia)

Bencoded (taking BitSpirit as an example):

Announce_peer request format:

d1:ad2:id20:xxxxxxxxxxxxxxxxxxxx9:info_hash20:zzzzzzzzzzzzzzzzzzzz4:porti10756e5:token20:ooooooooooooooooooooe1:q13:announce_peer1:t4:tttt1:y1:qe

Meaning: this is an announce_peer request; it tells the other end that I am uploading/downloading the file with this info-hash and can become a member of its peers list, and that my port number is 10756.

Announce_peer response format:

d1:rd2:id20:xxxxxxxxxxxxxxxxxxxx2:ip4:ppppe1:t4:tttt1:v4:utb*1:y1:re

Attached (captured) are the packets of a simple download process and of a routing table being initialized for the first time, for comparative analysis.

IV. Several important processes in the BitTorrent DHT implementation

Torrent (seed) creation:

1) When the use_tracker parameter of /maketorrent-console is set to False, the announce (tracker) field is not generated.

2) Read the local "routing table" file and find the K nodes closest to the info-hash to use as the nodes field.

Startup process:

1) Load the K-bucket information from the routing_table file and initialize the in-memory "routing table".

2) Force a refresh of every K bucket in the "routing table"; the refresh generates a random ID in the bucket's range and performs a find_node lookup on it.

Refreshing the routing table:

1) A forced refresh at startup.

2) Every 15 minutes, if the information in a K bucket has not been updated, that K bucket is refreshed once (refreshtable).

3) A checkpoint operation every 5 minutes stores the current routing table in the routing_table file.

Format of routing_table file:

{'id': node.id, 'host': node.host, 'port': node.port, 'age': int(node.age)}

Some implementations use the SQLite database to implement this part of the function.
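
A minimal checkpoint sketch along those lines, storing one row per known node in SQLite; the schema, file name, and node attributes mirror the record format above but are otherwise illustrative:

    import sqlite3

    def checkpoint(nodes, path="routing_table.db"):
        con = sqlite3.connect(path)
        con.execute("CREATE TABLE IF NOT EXISTS nodes"
                    " (id BLOB, host TEXT, port INTEGER, age INTEGER)")
        con.execute("DELETE FROM nodes")   # rewrite the table on each checkpoint
        con.executemany("INSERT INTO nodes VALUES (?, ?, ?, ?)",
                        [(n.id, n.host, n.port, int(n.age)) for n in nodes])
        con.commit()
        con.close()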

Each K bucket of the "routing table" has a "last updated" attribute, which is updated (restarting a 15-minute timer) whenever a ping response is received from any node in the bucket, or any node is added or replaced. If the timer expires (no update operation on the K bucket within 15 minutes), a refresh is performed on that K bucket: a random ID is selected from the K bucket's range and a find_node is run on it. Nodes in the routing table need to stay live, that is, online; if 3 consecutive requests to a node in the routing table receive no response, the node is considered invalid.
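
Picking the random ID for such a refresh uses property (4) of the XOR metric: the unique ID at a chosen distance is our own ID XORed with that distance. A minimal sketch:

    import random

    def random_id_in_bucket(my_id: int, i: int) -> int:
        d = random.randrange(2 ** i, 2 ** (i + 1))  # random distance in bucket i
        return my_id ^ d                            # the unique ID at that distance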

refreshtable(force=1) process:

(1) If force=1, every current K bucket is refreshed.

(2) If a K bucket currently holds fewer than K (8) nodes, it is also refreshed.

(3) If a K bucket contains an invalid node, one whose last three messages received no response, it is refreshed.

(4) If none of the nodes in a K bucket has interacted for more than 15 minutes, it is refreshed as well.

When a node receives any KRPC message (ping/find_node/get_peers/announce_peer), it checks whether the sender is in its local routing table. If the sender already exists in the local routing table, it is moved to the end of its corresponding K bucket. If the sender is not in the routing table, the node tries to insert it into the appropriate K bucket of the local "routing table". This is also how the routing table is built dynamically; the process is as follows (a sketch follows this list):

(0) Find the K bucket corresponding to the sender.

(1) If the node was seen in a response message, update the node's lastseen = time().

(2) If the K bucket holds fewer than K (8) nodes, append the sender directly to the end of the bucket.

(3) If the K bucket is full, check whether it contains invalid nodes; if so, remove them and put the new node at the end of the bucket. (The seemingly invalid nodes are pinged once more to confirm; any that respond are kept in the K bucket.)

(4) If the K bucket is full and all its nodes are valid, check whether the client's own ID falls within this K bucket's range (that is, whether this is its own K bucket); if not, the new node is simply discarded.

(5) If it is its own K bucket, the bucket is split: it becomes two buckets of equal range, one containing the node's own ID and one not.

(6) Add the new node to the appropriate K bucket after the split.
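
A minimal sketch of rules (0)-(6); the table and bucket helpers (bucket_for, covers, split) and the failed_pings counter are hypothetical stand-ins for the real data structures:

    K = 8

    def insert(table, my_id, node):
        bucket = table.bucket_for(node.id)            # rule (0)
        if node in bucket.nodes:                      # rule (1): move to the end
            bucket.nodes.remove(node)
            bucket.nodes.append(node)
            return
        if len(bucket.nodes) < K:                     # rule (2)
            bucket.nodes.append(node)
            return
        stale = [n for n in bucket.nodes if n.failed_pings >= 3]
        if stale:                                     # rule (3): evict invalid nodes
            bucket.nodes.remove(stale[0])
            bucket.nodes.append(node)
            return
        if not bucket.covers(my_id):                  # rule (4): not our own bucket
            return                                    # discard the new node
        table.split(bucket)                           # rule (5)
        insert(table, my_id, node)                    # rule (6): retry after the split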

findnodes(id, invalid=True) process:

This is an internal process that supports the find_node lookups described above; its steps are as follows (a sketch follows this list):

(0) If the target node itself is found in its corresponding K bucket, return it directly and end the process.

(1) If invalid=True, currently invalid nodes are excluded.

(2) If the nodes selected from the corresponding K bucket number fewer than K (8), they are supplemented from other buckets, as follows:

(3) Add the nodes from the K buckets to the left and right, then sort all candidates by their distance to the given ID and select the nearest K (8).

(4) Return the nearest K nodes finally obtained.
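
A minimal sketch of this selection, widening outward from the target's bucket until K candidates are collected (the table layout and failed_pings counter are hypothetical):

    K = 8

    def find_nodes(table, target, invalid=True):
        i = table.index_for(target)                  # bucket of the target ID
        candidates = list(table.buckets[i].nodes)
        if invalid:                                  # step (1): drop dead nodes
            candidates = [n for n in candidates if n.failed_pings < 3]
        j = 1
        while len(candidates) < K and j < len(table.buckets):
            for k in (i - j, i + j):                 # step (3): neighboring buckets
                if 0 <= k < len(table.buckets):
                    candidates.extend(table.buckets[k].nodes)
            j += 1
        return sorted(candidates, key=lambda n: n.id ^ target)[:K]   # steps (3)-(4)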
