[Switch] distributed algorithms of NoSQL Databases
Repost a good NoSQL database distributed algorithm with the following content:
This article is published in the famous technical Blog Highly Scalable Blog. It gives a detailed explanation of distributed algorithms and ideas in NoSQL databases. The article is very long and translated and contributed by @ juliashine. Thank you for your sharing spirit!
Translator's introduction: Juliashine has been an engineer for many years, and now works on massive data processing and analysis, focusing on the Hadoop and NoSQL ecosystems.
Distributed Algorithms in NoSQL Databases
Original address: Distributed Algorithms in NoSQL Databases
System scalability is the main driver of the NoSQL movement, encompassing distributed system coordination, failover, resource management, and many other features. Framed this way, NoSQL sounds like a big basket into which anything can be thrown. Although the NoSQL movement has not brought fundamental technological change to distributed data processing, it has triggered an overwhelming wave of research and practice around various protocols and algorithms. It is through these efforts that some effective methods of database construction have gradually been distilled. In this article, I will systematically describe the distributed features of NoSQL databases.
Next, we will study a number of distributed strategies, such as replication and failure detection. These strategies, marked in italics below, are divided into three parts:
Data Consistency
As we all know, distributed systems frequently encounter network partitions or latency. When this happens, the isolated part is unavailable, so it is impossible to maintain high availability without sacrificing consistency. This fact is commonly referred to as the "CAP theorem". However, consistency is very expensive in distributed systems, so we often need to make concessions on it, not just for availability but for a variety of other trade-offs. To study these trade-offs, we note that the consistency problem in distributed systems is caused by data isolation and replication, so we will start by studying the characteristics of replication:
Now let's take a closer look at common replication techniques and classify them according to the characteristics just described. The first figure depicts the logical relationships between the different techniques and their trade-offs in terms of system consistency, scalability, availability, and latency. The second figure describes each technique in detail.
The replication factor is 4. The read/write coordinator can be an external client or an internal proxy node.
We will go through all of these techniques in order, from weak to strong consistency:
Some trade-offs in the above analysis need to be further emphasized:
Anti-Entropy Protocols and Gossip Algorithms
Let's start with the following scenario:
There are many nodes, and each piece of data has replicas on several of them. Each node can handle update requests independently, and each node regularly synchronizes its state with other nodes, so that after a while all replicas become consistent. How is this synchronization performed? When does it start? How is the synchronization partner chosen? How is the data exchanged? We assume that two nodes always overwrite older data with the newer version, or that both versions are kept and handed to the application layer for resolution.
This problem is common in scenarios such as data consistency maintenance and cluster state synchronization (for example, propagating cluster membership information). Introducing a coordinator that monitors the database and draws up a synchronization plan would solve it, but a decentralized database provides better fault tolerance. The main decentralized approach is to use a well-designed epidemic protocol [7], which is relatively simple but provides good convergence time and can tolerate the failure of any node as well as network partitions. Although there are many kinds of epidemic algorithms, we focus only on anti-entropy protocols, because these are what NoSQL databases use.
Anti-entropy protocols assume that synchronization is executed on a fixed schedule: each node selects another node, either at random or by some rule, and exchanges data with it to eliminate differences. There are three styles of anti-entropy protocol: push, pull, and hybrid. The principle of the push protocol is simply to select a random node and send it the local data state. In real applications it is foolish to push all the data out naively, so nodes generally work in the way shown in the figure.
As the synchronization initiator, node A prepares a data digest containing the fingerprints of its data. After receiving the digest, node B compares it with its local data and returns a digest of the differences to A. Finally, A sends the updates to B, and B applies them. The pull and hybrid protocols work similarly, as the figure shows.
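To make this exchange concrete, here is a minimal Python sketch of the push-style digest exchange just described. It assumes per-key version numbers that are totally ordered (real systems would use timestamps or vector clocks) and SHA-1 hashes as data fingerprints; the Node class and function names are illustrative, not from the original article.

    import hashlib

    class Node:
        """A toy replica holding key -> (version, value) pairs. We assume
        versions are totally ordered, so the higher version always wins."""

        def __init__(self, data=None):
            self.store = dict(data or {})

        def digest(self):
            # Cheap per-key fingerprints instead of full values.
            return {k: hashlib.sha1(repr(v).encode()).hexdigest()
                    for k, v in self.store.items()}

        def keys_that_differ(self, remote_digest):
            # Keys whose remote fingerprint is missing locally or differs.
            local = self.digest()
            return [k for k, fp in remote_digest.items() if local.get(k) != fp]

        def apply(self, updates):
            # Keep whichever version is newer, key by key.
            for k, (ver, val) in updates.items():
                if k not in self.store or self.store[k][0] < ver:
                    self.store[k] = (ver, val)

    def push_round(a, b):
        # A sends a digest; B replies with the keys that differ;
        # A then ships only those values, and B merges them.
        wanted = b.keys_that_differ(a.digest())
        b.apply({k: a.store[k] for k in wanted})

    # Example: B is missing one key and holds a stale version of another.
    a = Node({"x": (2, "new"), "y": (1, "only-on-a")})
    b = Node({"x": (1, "old")})
    push_round(a, b)
    print(b.store)  # {'x': (2, 'new'), 'y': (1, 'only-on-a')}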
Anti-entropy protocols provide reasonable convergence time and scalability. The figure shows simulation results for propagating an update through a cluster of 100 nodes. In each iteration, each node contacts only one randomly selected peer.
We can see that the pull style converges better than push, which can be proven theoretically [7]. Push also suffers from a "convergence tail" problem: after many iterations, although almost all nodes have been reached, a few remain unaffected. Compared with pure push or pull, the hybrid approach is more efficient, so it is what is usually used in practice. Anti-entropy is scalable because the average convergence time grows as a logarithmic function of the cluster size.
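Both the logarithmic convergence and the push-mode tail are easy to reproduce in a toy simulation. The following sketch is my own illustration; the round structure, where every node contacts one random peer per iteration, mirrors the simulation setup described above.

    import random

    def rounds_to_converge(n, mode, seed=0):
        """Rounds until one update reaches all n nodes. In 'push' mode,
        infected nodes send the update to a random peer; in 'pull' mode,
        every node asks a random peer and picks the update up if the
        peer already has it."""
        rng = random.Random(seed)
        infected = {0}  # the update starts on a single node
        rounds = 0
        while len(infected) < n:
            rounds += 1
            reached = set(infected)
            for node in range(n):
                peer = rng.randrange(n)
                if mode == "push" and node in infected:
                    reached.add(peer)
                elif mode == "pull" and peer in infected:
                    reached.add(node)
            infected = reached
        return rounds

    for n in (100, 1000, 10000):
        print(n, rounds_to_converge(n, "push"), rounds_to_converge(n, "pull"))
    # The round counts grow roughly logarithmically with n, and pull
    # finishes the tail faster once most nodes already hold the update.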
Although these techniques look simple, there is still plenty of research focused on the performance of anti-entropy protocols under different constraints. One line of work replaces random peer selection with a more efficient scheme that exploits the network topology [10]. Others adjust the transmission rate or use advanced rules to select which data to synchronize when network bandwidth is limited [9]. Digest computation also poses challenges, so databases maintain a log of recent updates to make digests easier to compute.
Eventually Consistent Data Types
In the previous section we assumed that two nodes can always merge their data versions. However, resolving update conflicts is not easy, and making all replicas converge to a semantically correct value is surprisingly difficult. A well-known example is that deleted entries can reappear in Amazon's Dynamo database [8].
Let's use an example to illustrate the problem: a database maintains a logically global counter, and each node can increment or decrement it. Although each node can maintain its own value locally, these local counts cannot be combined by simple addition or subtraction. Suppose there are three nodes A, B, and C, and each performs one increment. If A obtains a value from B and adds it to its local copy, then C obtains the value from B, and then C obtains the value from A, C's final value is 4, which is wrong (the true total is 3). One way to solve this is to use a data structure similar to a vector clock [19] and maintain a pair of counters for each node [1]:
    class Counter {
        int[] plus
        int[] minus
        int NODE_ID

        increment() {
            plus[NODE_ID]++
        }

        decrement() {
            minus[NODE_ID]++
        }

        get() {
            return sum(plus) - sum(minus)
        }

        merge(Counter other) {
            for i in 1..MAX_ID {
                plus[i] = max(plus[i], other.plus[i])
                minus[i] = max(minus[i], other.minus[i])
            }
        }
    }
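To check that this merge rule repairs the three-node scenario above, here is the counter transcribed into runnable Python. The class is a direct transliteration of the pseudocode; the node ids and merge order follow the A, B, C example.

    class Counter:
        MAX_ID = 3  # number of nodes in this example

        def __init__(self, node_id):
            self.node_id = node_id
            self.plus = [0] * self.MAX_ID   # increments seen, per node
            self.minus = [0] * self.MAX_ID  # decrements seen, per node

        def increment(self):
            self.plus[self.node_id] += 1

        def decrement(self):
            self.minus[self.node_id] += 1

        def get(self):
            return sum(self.plus) - sum(self.minus)

        def merge(self, other):
            # Element-wise max: applying the same update twice is harmless.
            for i in range(self.MAX_ID):
                self.plus[i] = max(self.plus[i], other.plus[i])
                self.minus[i] = max(self.minus[i], other.minus[i])

    # A, B, and C each increment once, then exchange state as above.
    a, b, c = Counter(0), Counter(1), Counter(2)
    for node in (a, b, c):
        node.increment()
    a.merge(b)  # A picks up B's increment
    c.merge(b)  # C picks up B's increment
    c.merge(a)  # C picks up A's state, which already includes B's
    print(c.get())  # 3, not the erroneous 4 produced by naive addition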
Cassandra counts in a similar way [11]. More complex, eventually consistent data structures can also be designed using state-based or operation-based replication theory. For example, [1] describes a whole family of such data structures.
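As a taste of the state-based flavor, below is a minimal sketch of one such structure, a two-phase set; the standard add-set/remove-set construction is assumed, and the code is an illustration rather than an excerpt from [1]. Both internal sets only grow and merge is a set union, so replicas converge regardless of exchange order, at the price that a removed element can never be re-added, one of the functional limits noted next.

    class TwoPhaseSet:
        def __init__(self):
            self.added = set()    # grows monotonically
            self.removed = set()  # grows monotonically ("tombstones")

        def add(self, x):
            self.added.add(x)

        def remove(self, x):
            # Removal only applies to elements already added.
            if x in self.added:
                self.removed.add(x)

        def contains(self, x):
            return x in self.added and x not in self.removed

        def merge(self, other):
            # Union is commutative, associative, and idempotent,
            # so any exchange order yields the same converged state.
            self.added |= other.added
            self.removed |= other.removed

    # Two replicas take different updates, then merge both ways.
    r1, r2 = TwoPhaseSet(), TwoPhaseSet()
    r1.add("a")
    r2.add("b"); r2.remove("b")
    r1.merge(r2); r2.merge(r1)
    print(r1.contains("a"), r1.contains("b"))  # True False, on both replicas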
Eventually consistent data types are usually limited in functionality, and they impose extra performance overhead.
Data Placement
This part focuses on the algorithms that control data placement in a distributed database: mapping data items to appropriate physical nodes, migrating data between nodes, and globally allocating resources such as memory.
Balancing Data