Two days ago I read online that Razorback 2, a world-famous eDonkey index server, had been shut down and four people detained. It made me keenly aware of how difficult things are for P2P file-exchange software such as eMule and BitTorrent. Replacing the centralized index server with a distributed hash table (DHT) is one of the few trends in P2P software development that can be clearly foreseen today. Typical DHT designs include CAN, Chord, Tapestry, Pastry, Kademlia, and Viceroy. Of these, Kademlia is among the most widely used and most practical, and both its principles and its implementation are among the most concise. Mainstream P2P clients such as eMule, BitComet, BitSpirit, and Azureus currently use it as their auxiliary lookup protocol. Given Kademlia's growing influence, I am writing this post today as a summary of the relevant background.
1. Brief description of Kademlia
Kademlia (Kad for short) is a typical structured P2P overlay network. The main problem it sets out to solve is how to store and retrieve information across the whole network in a distributed fashion at the application layer. In a Kademlia network, all information is stored as <key, value> hash table entries. These entries are scattered across the nodes, so that the network as a whole forms one huge distributed hash table. We can picture this hash table as a dictionary: as long as we know the key under which a piece of information is indexed, we can use the Kademlia protocol to look up the corresponding value, regardless of which node actually stores it. In P2P file-exchange systems such as eMule and BitTorrent, Kademlia serves mainly as the protocol for retrieving file information, but Kad networks are not limited to file exchange. The rest of this article uses the Kad network in eMule as the example for describing the design and implementation.
2. What information does eMule's Kad network store?
Any information that can be expressed as <key, value> dictionary entries can be stored in a Kad network, and a single Kad network can hold several distributed hash tables at once. In eMule, for example, the Kad network stores and maintains two distributed hash tables at all times. One can be called the keyword dictionary, the other the file index dictionary.
A. Keyword dictionary: used to look up file names and related file information for a given keyword. The key is the 160-bit SHA1 hash of the keyword string, and the value is a list of information about every file whose name contains that keyword. Each list entry can be represented as a triple: (file name, file length, file SHA1 checksum). For example, suppose there is a file named "warcraft_frozen_throne.iso". When we query Kad with the three keywords "warcraft", "frozen", and "throne", Kad may return three different file lists, but all three lists contain an entry for "warcraft_frozen_throne.iso". From that entry we obtain the name, length, and 160-bit SHA1 checksum of the ISO file.
B. File index dictionary: used to look up the owners of a file (the nodes that offer it for download) from the file's information. The key is the SHA1 checksum of the file to be downloaded, because, statistically speaking, a 160-bit SHA1 checksum uniquely identifies a file with specific content. The value is again a list, this time giving the network information of every node that owns the file. Each list entry can also be represented as a triple: (owner IP, download listening port, owner node ID). With this information eMule knows where to download the file with that SHA1 checksum.
3. What is the basic process for searching for and downloading files over the Kad network?
With the two dictionaries of eMule's Kad network in mind, the basic process of searching for and downloading a specific file becomes very clear. Take "warcraft_frozen_throne.iso" as the example again. First, we query the keyword dictionary with any of the keywords "warcraft", "frozen", or "throne" to obtain the SHA1 checksum of the ISO. Next, we query the file index dictionary with that checksum to obtain all the network nodes that offer "warcraft_frozen_throne.iso" for download. Finally, we download the whole ISO file from those nodes in segments, fetching different parts from different sources.
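The two-step lookup above can be sketched with two plain Python dictionaries standing in for the distributed tables. This is only an illustration under stated assumptions: the helper names (sha1_key, find_sources), the file length, the IP addresses, and the port are all made up for the example, and a real client would query remote nodes instead of local dicts.

```python
import hashlib

def sha1_key(text: str) -> int:
    """Return the 160-bit SHA-1 hash of a string as an integer key."""
    return int.from_bytes(hashlib.sha1(text.encode("utf-8")).digest(), "big")

# Stand-in for the real checksum of the file's contents (hypothetical).
FILE_HASH = sha1_key("contents of the ISO")

# Keyword dictionary: keyword hash -> list of (file name, file length, file SHA-1).
keyword_dict = {
    sha1_key(word): [("warcraft_frozen_throne.iso", 700_000_000, FILE_HASH)]
    for word in ("warcraft", "frozen", "throne")
}

# File index dictionary: file SHA-1 -> list of (owner IP, listening port, owner node ID).
file_index_dict = {
    FILE_HASH: [
        ("192.0.2.10", 4662, sha1_key("node-a")),
        ("192.0.2.11", 4662, sha1_key("node-b")),
    ],
}

def find_sources(keyword: str, wanted_name: str):
    """Step 1: keyword -> file checksum. Step 2: file checksum -> owners."""
    for name, _length, file_hash in keyword_dict.get(sha1_key(keyword.lower()), []):
        if name == wanted_name:
            return file_index_dict.get(file_hash, [])
    return []

sources = find_sources("Warcraft", "warcraft_frozen_throne.iso")
for ip, port, _node_id in sources:
    print(f"download a segment from {ip}:{port}")
```

Any of the three keywords leads to the same checksum, and the checksum alone decides which sources are returned.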
In the process above, the Kad network is really just serving the two dictionaries. What is worth noting is that Kad does not store and search these dictionaries on a centralized index server (such as Chinese P2P Source Power, Razorback 2, or DonkeyServer). All the <key, value> entries of the two dictionaries are distributed across the nodes that take part in the Kad network, so storing and exchanging file information and download locations requires no centralized index server at all. This not only improves query efficiency but also makes the whole P2P file-exchange system more reliable, and it gives the system considerable resistance to denial-of-service attacks. Even more interesting, it helps us hold off pursuit by the FBI, because, as the saying goes, the law cannot punish everyone... I trust everyone can now see the benefits of "distributed information retrieval". But how are these entries actually stored, and how do we find them through the Kad network? Don't worry, we will get there step by step.
4. What are node IDs, and what is the distance between nodes?
Each node in the Kad network has its own unique ID. The ID looks much like a SHA1 hash value: it is a 160-bit integer that the node generates randomly for itself. The chance of two nodes ending up with the same ID is so small that we can treat it as practically impossible. In a Kad network, the distance between two nodes is not measured by physical distance or by the number of router hops. Instead, Kad defines the distance d between any two nodes as the bitwise XOR of their IDs, interpreted as an integer. That is, if the two node IDs are a and b, then d = a XOR b. Every node can use this definition to judge how far away any other node is. A large d means the two nodes are far apart, and a small d means they are close together. "Far" and "near" here are purely logical measures. Distance in Kad has no direction: the distance from a to b always equals the distance from b to a, because a XOR b = b XOR a.
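As a quick illustration of the XOR metric, here is a minimal sketch. The function name xor_distance and the toy 4-bit IDs are my own choices for the example; real IDs are 160 bits wide, but Python integers handle those the same way.

```python
def xor_distance(a: int, b: int) -> int:
    """Kad distance between two IDs: their bitwise XOR, read as an integer."""
    return a ^ b

# Toy 4-bit IDs chosen for illustration only.
a, b, c = 0b1011, 0b0010, 0b1000

print(xor_distance(a, b))  # 9 (0b1001)
print(xor_distance(a, c))  # 3 (0b0011), so c is closer to a than b is
print(xor_distance(a, b) == xor_distance(b, a))  # True: distance has no direction
print(xor_distance(a, a))  # 0: a node is at distance zero from itself
```

Notice that IDs sharing a longer common prefix are closer, because the XOR of the shared leading bits is zero.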
5. How are <key, value> entries stored in the Kad network?
From the discussion so far we can spot a similarity between node IDs and the keys of <key, value> entries: the keys of both the keyword dictionary and the file index dictionary are 160 bits long, and so is the node ID. This is obviously deliberate. In fact, a node's ID determines which <key, value> entries it may store, because, in the simplest view, an entry is stored on the node whose ID is exactly equal to the entry's key. Let us call the node that satisfies ID = key the target node N. In this way, the problem of finding a <key, value> entry reduces to the problem of finding the node whose ID equals the key.
In a real Kad network, however, we cannot guarantee that the target node N exists or is online at any given moment. Kad therefore requires that every <key, value> entry be replicated, according to the value of its key, onto the K nodes whose IDs are currently closest to the key (that is, the K nodes currently closest to target node N). Keeping K copies of each entry is redundancy introduced for the sake of the stability of the whole Kad system. The choice of K also takes some care; it is a heuristic estimate. The criterion is: "for a Kad network of the current size, pick any K nodes, and the probability that all of them are offline at the same moment should be close to 0". The typical value of K today is 20. In other words, to make sure we can always find at least one copy of a <key, value> entry, we must first store at least 20 copies of it in the Kad network.
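To make the replication rule concrete, the sketch below picks the K nodes closest to a key out of a list of node IDs. It assumes, purely for illustration, that we can see every node ID in the network at once (a real node cannot and has to run a lookup instead); the function name k_closest and the randomly generated IDs are hypothetical.

```python
import heapq
import random

K = 20         # replication factor, the typical value mentioned above
ID_BITS = 160  # width of node IDs and keys

def k_closest(node_ids, key, k=K):
    """Return the k node IDs with the smallest XOR distance to key, nearest first."""
    return heapq.nsmallest(k, node_ids, key=lambda node_id: node_id ^ key)

# A simulated network of 1000 random node IDs (seeded so runs are repeatable).
rng = random.Random(0)
node_ids = [rng.getrandbits(ID_BITS) for _ in range(1000)]
key = rng.getrandbits(ID_BITS)

replicas = k_closest(node_ids, key)
print(f"entry with key {key:040x} is stored on {len(replicas)} nodes")
```

If some of these nodes go offline, the entry survives on the remaining replicas, which is exactly the redundancy K is meant to buy.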
As described above, for any <key, value> entry, the closer a region of node IDs is to the key, the more copies of the entry that region holds and the more concentrated the storage. In addition, to keep query latency short, any entry may be cached on any node during the query process. To prevent over-caching and to keep information fresh enough, entries must also have a limited lifetime: the closer the storing node is to target node N, the longer it keeps the entry, and the farther away it is, the shorter the timeout. An entry stored on the target node itself is kept for at most 24 hours, and if its original publisher republishes it within that period, the retention time is extended.
6. What state does a Kad node need to maintain?
In the Kad network, each node maintains 160 lists, each of which is called a k-bucket. List i records the network information (node ID, IP address, UDP port) of known peers whose distance from the current node lies between 2^i and 2^(i+1), and each list can hold the information of at most K peers. Note that this K has the same meaning as the replication factor K mentioned above. Within each list, peers are sorted by the time they were last seen: the least recently seen peer is at the head of the list and the most recently seen peer is at the tail.
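Since list i covers distances from 2^i up to (but not including) 2^(i+1), the list a peer belongs to is simply the position of the highest set bit of the XOR distance. A minimal sketch, with bucket_index being a name I made up for the example:

```python
def bucket_index(own_id: int, peer_id: int) -> int:
    """Return i such that 2**i <= (own_id XOR peer_id) < 2**(i + 1)."""
    distance = own_id ^ peer_id
    if distance == 0:
        raise ValueError("a node does not keep itself in a k-bucket")
    # bit_length() is the position of the highest set bit, counted from 1.
    return distance.bit_length() - 1

own = 0b0000
print(bucket_index(own, 0b0001))    # 0: distance 1 lies in [1, 2)
print(bucket_index(own, 0b0101))    # 2: distance 5 lies in [4, 8)
print(bucket_index(own, 1 << 159))  # 159: the farthest list for 160-bit IDs
```

Half of all possible IDs fall into list 159, a quarter into list 158, and so on, so a node knows its close neighborhood in far more detail than the distant parts of the ID space.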
Updating the peers in a k-bucket basically follows a least-recently-seen eviction policy. While the list is not full (it holds fewer than K peers), a newly seen peer that is not yet in the list is simply appended to the tail, and a peer that is already in the list is moved to the tail. When the k-bucket is full, adding a new peer is a special case. The node first checks whether the peer at the head of the list (the least recently seen one) still responds. If it does, that head peer is moved to the tail and the new peer's information is discarded. If it does not, the head peer is dropped and the newly seen peer is inserted at the tail. As we can see, the main features of the k-bucket update rule are reusing known peers as much as possible and ordering them by time. Heuristically this has a sound basis: a node that has already been online for several hours is more trustworthy, because it is more likely to stay online for the next hour than a node we have only just seen. Or, to give a more down-to-earth explanation: swapping MP3 files violates copyright law, and a node that has already been breaking the law for several hours will mind breaking it for one more hour much less than a newcomer would... -_-b
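The update rule can be sketched as a small class. This is a simplified model rather than eMule's actual code: the class name KBucket, the touch method, and the ping callback (a stand-in for a real network probe) are all assumptions made for the example.

```python
class KBucket:
    """One list of peers, least recently seen at the head, most recent at the tail."""

    def __init__(self, k=20, ping=lambda peer: True):
        self.k = k
        self.ping = ping  # hypothetical liveness check; returns True if the peer answers
        self.peers = []

    def touch(self, peer):
        """Record that we have just heard from peer."""
        if peer in self.peers:
            # Known peer: move it to the tail.
            self.peers.remove(peer)
            self.peers.append(peer)
        elif len(self.peers) < self.k:
            # Room left: append the new peer at the tail.
            self.peers.append(peer)
        else:
            # Bucket full: probe the least recently seen peer.
            oldest = self.peers.pop(0)
            if self.ping(oldest):
                self.peers.append(oldest)  # still alive: keep it, discard the newcomer
            else:
                self.peers.append(peer)    # dead: replace it with the newcomer

alive = {"b"}  # in this demo only peer "b" answers pings
bucket = KBucket(k=2, ping=lambda peer: peer in alive)
for peer in ["a", "b", "c", "d"]:
    bucket.touch(peer)
    print(peer, "->", bucket.peers)
```

In the demo, "c" replaces the unresponsive "a", while "d" is discarded because the long-lived "b" still answers.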
From the above we can see that the multi-k-bucket structure serves two main purposes: A. to keep the node information that has been updated most recently; B. to filter node information quickly. That is, given only the ID of a target node N, we can quickly pick out from the current node's k-buckets the known nodes that are closest to N.
7. How is a specific node found in the Kad network?
Given a target ID, finding the network information (node ID, IP address, UDP port) of the K nodes in the current Kad network that are closest to it is what Kad calls a node lookup. Note that Kad deliberately does not define a node lookup as simply finding one single target node. The main reason is that the Kad network makes no assumptions about how long nodes stay online, so in most cases we cannot be sure that the target node we are looking for is online, or that it exists at all.
The whole node lookup process is very straightforward and resembles an iterative DNS query:
A. The initiator picks from its own k-buckets the several nodes closest to the target ID and sends them asynchronous query requests in parallel;
B. On receiving a request, each queried node finds the several nodes closest to the target ID in its own k-buckets and returns them to the initiator;
C. On receiving the replies, the initiator selects several of the returned nodes that it has not yet queried and repeats step A with them;
D. These steps are repeated until the initiator can no longer obtain any active node that is closer to the target than the K closest nodes it already knows;
E. During the lookup, nodes that fail to respond in time are excluded immediately; the initiator must make sure that the final K closest nodes are all active.
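Steps A to E can be tried out on a small simulated network. The sketch below makes several simplifying assumptions that are not part of the description above: IDs are only 6 bits wide, every possible ID is a node, "parallel" requests are issued one after another, each node's k-buckets are flattened into one set of known peers, and ALPHA (the number of nodes queried per round) is set to 3. All function names are my own.

```python
import random

K = 4      # bucket size and result size, kept small for the toy network
ALPHA = 3  # nodes queried per round (an assumed concurrency parameter)

def closest(ids, target, count):
    """Return up to count IDs from ids, nearest to target first."""
    return sorted(ids, key=lambda n: n ^ target)[:count]

def build_network(bits, rng):
    """Give every node up to K randomly chosen peers in each of its k-buckets."""
    ids = range(2 ** bits)
    network = {}
    for n in ids:
        known = set()
        for i in range(bits):
            bucket = [p for p in ids if (n ^ p).bit_length() - 1 == i]
            known.update(rng.sample(bucket, min(K, len(bucket))))
        network[n] = known
    return network

def find_node_rpc(network, online, node, target):
    """Step B: the queried node returns its K closest known peers (None on timeout)."""
    if node not in online:
        return None
    return closest(network[node], target, K)

def lookup(network, online, start, target):
    shortlist = set(closest(network[start], target, K))  # step A
    queried, dead = set(), set()
    while True:
        best = closest(shortlist - dead, target, K)
        candidates = [n for n in best if n not in queried]  # step C
        if not candidates:  # step D: the K closest live nodes have all answered
            return best
        for node in candidates[:ALPHA]:
            queried.add(node)
            reply = find_node_rpc(network, online, node, target)
            if reply is None:
                dead.add(node)  # step E: drop unresponsive nodes
            else:
                shortlist.update(p for p in reply if p != start)

network = build_network(6, random.Random(1))
online = set(network) - {21, 22}  # pretend two nodes near the target are offline
print(lookup(network, online, start=0, target=20))
```

Because the loop only stops once every node among the K closest candidates has answered a query, every node in the result is known to be active, which is what step E demands.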
To sum up the process above in simple terms, it is very much like asking around for someone in everyday life. Suppose you are Agent Smith and you want to find Xiao Li (the key) to ask for his mobile phone number (the value), but you have never met him. First you go to someone you do know who works at the same company as Xiao Li, say Xiao Zhao. Xiao Zhao tells you to look for Xiao Liu, who is in the same department as Xiao Li, and Xiao Liu in turn tells you to look for Xiao Zhang, who is on the same project team as Xiao Li. At last you find Xiao Zhang, only to learn that Xiao Li is away on a business trip (the node is offline). Luckily Xiao Zhang happens to know Xiao Li's number (a cached copy), so you get the information you wanted after all. In a node lookup, "distance between nodes" plays the same role as "closeness of personal relationships" in this example.
Finally, a word on the limitations of this lookup process. The Kad network is not suited to fuzzy search, such as wildcards or partial matches, but for file sharing, exact keyword search is good enough. (It is worth noting that a slight extension of the search process would let it support Boolean queries over keyword matches, although this would still not be optimal.) At the application level in eMule, this limitation shows how much file naming matters when sharing a file: the more clearly the keywords are set out in the file name, the easier the file is to find, and the better it spreads through the P2P network. On the other hand, every shared file in eMule can carry its own comment, and the importance of comments is not yet widely appreciated. In fact, the keywords in a file's comment can stand in for keywords in the file name and so guide users' searches, especially when the file name itself does not contain the relevant keywords.
8. How is a specific <key, value> entry stored and found in the Kad network?
In essence, storing and finding a specific <key, value> entry is just the node lookup problem again. To store an entry in the Kad network, a node first uses the node lookup algorithm to find the K nodes closest to the key and then asks them to save the <key, value> entry. Finding an entry works much like a node lookup: the initiator iteratively queries nodes that are closer and closer to the key, and as soon as any node along the query path returns the value being sought, the whole search ends. To improve efficiency, after a successful search the initiator may store the entry on several nodes along the query path as a cache for later queries. The timeout of such a cached entry decreases exponentially with the distance between the caching node and the key.
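Here is a toy sketch of storing and then finding a value, with caching along the query path. It rests on several assumptions of my own: IDs are 4 bits wide, every node knows exactly one peer per k-bucket, the store function cheats by choosing the K closest nodes from a global list instead of running a node lookup, timeouts are left out, and the cache copy goes only to the closest queried node that did not have the value (one possible choice among the nodes on the query path).

```python
K = 3

class Node:
    def __init__(self, node_id):
        self.id = node_id
        self.peers = set()  # flattened routing table (a simplification)
        self.storage = {}   # the <key, value> entries held by this node

    def find_value(self, key):
        """Return ('value', v) if the entry is here, else ('nodes', closest peers)."""
        if key in self.storage:
            return "value", self.storage[key]
        return "nodes", sorted(self.peers, key=lambda p: p.id ^ key)[:K]

def make_network(bits=4):
    nodes = [Node(i) for i in range(2 ** bits)]
    for node in nodes:
        # One peer per k-bucket: flip each bit of the node's own ID in turn.
        node.peers = {nodes[node.id ^ (1 << b)] for b in range(bits)}
    return nodes

def store(nodes, key, value):
    """Save the entry on the K nodes closest to key (global view, for brevity)."""
    for node in sorted(nodes, key=lambda n: n.id ^ key)[:K]:
        node.storage[key] = value

def get(start, key):
    """Query ever closer nodes until one returns the value, then cache it."""
    pending, seen, misses = [start], {start}, []
    while pending:
        pending.sort(key=lambda n: n.id ^ key)
        node = pending.pop(0)
        kind, payload = node.find_value(key)
        if kind == "value":
            if misses:
                nearest_miss = min(misses, key=lambda n: n.id ^ key)
                nearest_miss.storage[key] = payload  # cache for later queries
            return payload
        misses.append(node)
        for peer in payload:
            if peer not in seen:
                seen.add(peer)
                pending.append(peer)
    return None

nodes = make_network()
store(nodes, 13, "phone number")
print(get(nodes[0], 13))
print(sorted(n.id for n in nodes if 13 in n.storage))
```

After the first search, a node on the query path holds a cached copy, so a repeat of the same query from node 0 ends one hop earlier.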
9. How does a new node join the Kad network for the first time?
When a new node tries to join the Kad network for the first time, it must do three things. First, by whatever means available, it obtains the information of one node that has already joined the Kad network (call it node I) and adds it to its own k-buckets. Second, it starts a lookup for its own node ID through node I, and so learns about a series of other nodes that are its neighbors. Finally, it refreshes all of its k-buckets to make sure that the node information it has gathered is up to date.