Thinking about the Distributed System (I.)

Source: Internet
Author: User
Tags: cassandra, mongodb, redis

"Abstract" This paper deals with the theory and ideas of some distributed systems, including caps, BASE, NWR and so on. and analyzes the pros and cons of some mainstream database distributed schemes, so that we can think, choose and design more deeply and comprehensively in the development. The following is the text:

Before discussing the common architectures, let's take a brief look at CAP theory.

CAP is an abbreviation for Consistency, Availability, and Partition tolerance. Consistency: every read operation is guaranteed to return the latest data. Availability: every non-failing node returns a normal result within a reasonable time. Partition tolerance: the system continues to provide service even when a network partition occurs between nodes.

CAP theory states that of the three properties, you can have at most two; all three cannot hold at once. This is easy to see. First, a single-node system can only guarantee CP. With two or more nodes, when a network partition occurs, the nodes in the cluster can no longer communicate. At that point, if we insist on consistency (C), one side must be marked unavailable, violating availability (A), so we can only guarantee CP. Conversely, if we insist on availability (A), both sides keep handling requests, but since the network cannot synchronize data, the data inevitably becomes inconsistent, so we can only guarantee AP.
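The two-node argument above can be made concrete with a toy model. The classes and helpers here are invented purely for illustration; they are not any real system's API:

```python
class Node:
    """A trivially simplified storage node."""
    def __init__(self, value=None):
        self.value = value
        self.available = True

def handle_partition_cp(writer, other, new_value):
    # CP choice: accept the write on one side and mark the other side
    # unavailable, so no client can ever read the stale value.
    writer.value = new_value
    other.available = False          # availability (A) is sacrificed

def handle_partition_ap(writer, other, new_value):
    # AP choice: both sides keep serving, but replication is impossible
    # during the partition, so the two copies diverge.
    writer.value = new_value         # consistency (C) is sacrificed

# Both nodes start out consistent, then the network partitions.
a, b = Node("v0"), Node("v0")
handle_partition_cp(a, b, "v1")      # b goes down, but no stale reads
c, d = Node("v0"), Node("v0")
handle_partition_ap(c, d, "v1")      # c and d now disagree
```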

I. Single Instance

A single-node system clearly guarantees only CP, sacrificing availability (A). Stand-alone MySQL, Redis, MongoDB, and similar databases all operate in this mode.

In practice, we need a highly available system that keeps providing service even after some machines go down.

II. Multiple Copies

Compared with a single instance, one more node is added to back up the data.

For reads, availability improves because either of the two nodes can serve the request.

For writes, there are three different update strategies:

Synchronous update: the write returns only after both nodes have been updated successfully. In this case, if a network partition occurs, writes become unavailable, sacrificing A.

Asynchronous update: the write returns immediately, without waiting for the other node; the backup node then updates its data asynchronously. This sacrifices C to guarantee A: the update is not guaranteed to succeed on every node, and a network failure may leave the data inconsistent.

Compromise: the write returns after a subset of the nodes has been updated successfully.
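The three strategies can be sketched as follows. This is a toy in-memory model (the `Replica` class and helper functions are invented for illustration), not any real database's replication protocol:

```python
class PartitionError(Exception):
    pass

class Replica:
    """A replica that can be cut off by a network partition."""
    def __init__(self, reachable=True):
        self.data = {}
        self.reachable = reachable

    def write(self, key, value):
        if not self.reachable:
            raise PartitionError("replica unreachable")
        self.data[key] = value

def write_sync(replicas, key, value):
    """Synchronous update: fail unless every replica acknowledges (CP)."""
    for r in replicas:
        r.write(key, value)  # any unreachable replica aborts the write
    return True

def write_async(replicas, key, value):
    """Asynchronous update: return once any replica accepts (AP)."""
    acked = 0
    for r in replicas:
        try:
            r.write(key, value)
            acked += 1
        except PartitionError:
            pass  # a lagging replica will be synced later
    return acked >= 1

def write_quorum(replicas, key, value, w):
    """Compromise: succeed once W replicas acknowledge."""
    acked = 0
    for r in replicas:
        try:
            r.write(key, value)
            acked += 1
        except PartitionError:
            pass
    return acked >= w
```

With one of three replicas partitioned away, `write_sync` fails, while `write_async` and `write_quorum(..., w=2)` still succeed.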

Here, let us first introduce NWR, which Dynamo-class systems use to control the consistency level of a distributed store. N: the number of copies kept of the same piece of data; W: the number of copies that must be written successfully for a write to succeed; R: the number of copies read by each read operation.

When W + R > N, the replica sets covered by a read and a write must intersect, so a read only needs to compare the modification times or version numbers of the copies it sees and pick the latest; the system is therefore strongly consistent.
Conversely, when W + R <= N, the system is only weakly consistent.

For example, (N, W, R) = (1, 1, 1) is a single-node system and is strongly consistent; (N, W, R) = (2, 1, 1) is the common master-slave mode and is weakly consistent.
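The rule is simple enough to state as code (a hypothetical helper, not any client library's API):

```python
def is_strongly_consistent(n, w, r):
    """W + R > N guarantees the read and write replica sets intersect,
    so every read sees at least one copy of the latest write."""
    return w + r > n

# (N, W, R) = (1, 1, 1): single-node system, strongly consistent.
assert is_strongly_consistent(1, 1, 1)
# (N, W, R) = (2, 1, 1): master-slave with async replication, weak.
assert not is_strongly_consistent(2, 1, 1)
# Majority quorum, W = R = N // 2 + 1, is strongly consistent for any N.
n = 5
assert is_strongly_consistent(n, n // 2 + 1, n // 2 + 1)
```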

Examples:

Take Cassandra's QUORUM level, a compromise scheme: a write returns once more than half of the replicas have been updated, and a read returns the value agreed on by the majority of replicas. Any replicas that are left inconsistent can then be fixed by read repair.

Read repair: when reading a piece of data, query all of its replicas and check whether each copy matches the latest data held by the majority; if a copy does not, repair it to the consistent value.
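A minimal sketch of read repair, assuming each replica stores `key -> (version, value)` pairs. The representation is invented for illustration; real systems compare per-record timestamps or version vectors:

```python
def read_with_repair(replicas, key):
    """Read all replicas, return the value with the highest version,
    and write that value back to any stale replica (read repair)."""
    # Each replica is a dict of key -> (version, value);
    # a missing key counts as version -1, i.e. maximally stale.
    copies = [(r, r.get(key, (-1, None))) for r in replicas]
    latest_version, latest_value = max(v for _, v in copies)
    for r, (version, _) in copies:
        if version < latest_version:
            r[key] = (latest_version, latest_value)  # repair stale copy
    return latest_value
```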

Here W + R > N, so the result is strongly consistent.

Another example is Redis's master-slave mode: the update returns once one node succeeds, and the other nodes back up the data asynchronously. This approach only guarantees eventual consistency.

Eventual consistency: in contrast to strong consistency, which keeps the data consistent at every moment, eventual consistency allows the data to be inconsistent for a period of time; as time passes, the data eventually converges to a consistent state.

Here W + R <= N, so only eventual consistency is guaranteed.

In addition, the larger N is, the better the data reliability. However, the larger W or R is, the higher the write or read overhead and the worse the performance, so consistency, availability, and read/write performance generally have to be balanced, typically by setting W = R = N/2 + 1.

In fact, the compromise and the asynchronous update are essentially the same approach: trading away some C for an improvement in A. Moreover, both raise the "split-brain" problem, where nodes on different sides of a partition accept conflicting writes, which must be dealt with when the partition recovers.

In general, databases offer two kinds of solutions for partition recovery. Prevent at the source: set a timeout on node communication, after which the "minority" side stops serving. This never exposes inconsistent data, but reduces availability. Resolve at recovery: compare and merge the data of the different nodes once communication is restored. This preserves availability, but until the merge completes the data is inconsistent and conflicts may occur.
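One possible "resolve at recovery" policy is last-write-wins, sketched below for two diverged key-value stores. This is a toy illustration of one conflict-resolution policy, not how any particular database merges; real systems may instead use vector clocks or application-level resolution:

```python
def merge_on_recovery(store_a, store_b):
    """Merge two diverged stores after a partition heals, keeping the
    copy with the higher timestamp for each key (last-write-wins).
    Values are (timestamp, payload) tuples."""
    merged = {}
    for key in store_a.keys() | store_b.keys():
        candidates = []
        if key in store_a:
            candidates.append(store_a[key])
        if key in store_b:
            candidates.append(store_b[key])
        merged[key] = max(candidates)  # highest timestamp wins
    return merged
```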

Replication alone is not enough: when the data volume is large, a single machine's resources cannot hold it all, so we would like to split the data across several machines.

III. Sharding

Compared with a single instance, nodes are added here to split the data across.

Because each piece of data exists in only one copy, consistency is guaranteed; and since the nodes do not need to communicate with each other, partition tolerance holds as well.

However, availability is not guaranteed: when any node goes down, a portion of the data becomes unavailable.

In summary, like the stand-alone scheme, this can only guarantee CP.

So what are the benefits? The failure of one node only affects part of the service, i.e., the service degrades rather than disappears; because the data is sharded, the load can be balanced; and as the data volume grows or shrinks, the cluster can be scaled out or in accordingly.

Most database services provide sharding, such as Redis's slots, Cassandra's partitions, and MongoDB's shards.
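A minimal sketch of slot-based sharding in the style of Redis Cluster, which maps each key to one of 16384 slots via CRC16. The standard library's CRC32 is used here as a stand-in, so the slot numbers will not match real Redis:

```python
import zlib

NUM_SLOTS = 16384  # Redis Cluster's slot count

def slot_for_key(key: str) -> int:
    # Redis uses CRC16(key) % 16384; zlib's CRC32 is a stand-in here.
    return zlib.crc32(key.encode()) % NUM_SLOTS

def node_for_key(key: str, slot_ranges):
    """slot_ranges: list of (start, end, node) with inclusive bounds."""
    slot = slot_for_key(key)
    for start, end, node in slot_ranges:
        if start <= slot <= end:
            return node
    raise KeyError("slot %d is not assigned to any node" % slot)

# Three nodes splitting the slot space evenly.
ranges = [(0, 5460, "node-a"),
          (5461, 10922, "node-b"),
          (10923, 16383, "node-c")]
```

Because the key-to-slot mapping is deterministic, the same key always lands on the same node, and rebalancing only requires moving slot ranges between nodes.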

Sharding solves the problem of large data volumes, but we still want the system to be highly available. How, then, can we trade a certain degree of consistency for availability?

IV. Cluster

As you can see, this approach combines the previous two. As analyzed above, different data-synchronization strategies give the system different CAP guarantees. In general, database systems make the strategy configurable, and we choose different policies for different scenarios to obtain different characteristics.

In fact, for most non-financial internet companies, the requirement is not strong consistency but availability together with eventual consistency. This is one of the main reasons NoSQL is popular in internet applications: rather than the ACID principles of strongly consistent systems, it leans toward BASE. Basically Available: partition failures are allowed and lead only to service degradation; Soft state: asynchronous, temporarily out-of-date state is allowed; Eventual consistency: the data is allowed to become consistent eventually, rather than being consistent at every moment.

V. Summary

Basically, the schemes discussed above cover most distributed storage systems. As we can see, none of them reaches 100% of CAP; each must sacrifice some part of it.

Which option to choose depends on which properties matter most in your particular scenario.
