From: orders
HAT-TRICK). Distributed data systems have a "hat" of their own, the CAP theorem, though it is a different kind of hat. The CAP theorem involves three properties: consistency, availability, and partition tolerance. It states that a system can provide at most two of the three at the same time; it cannot provide all three. Designing a distributed architecture therefore always involves a trade-off. For a distributed data system, partition tolerance is a basic requirement, because without it the system loses its value. Designing a distributed data system thus comes down to balancing consistency against availability. Most web applications do not need strong consistency, so most distributed database products trade consistency for high availability.
Sacrificing consistency does not mean ignoring it altogether. If the data ends up in disorder, the system has no value, however available and well distributed it is. It means giving up strong consistency and requiring only that the system reach eventual consistency. For the sake of user experience, the window during which the copies may disagree should be as transparent to users as possible; in other words, the system should guarantee "user-perceived consistency". High availability and eventual consistency are usually achieved through asynchronous replication of multiple copies of the data, and the length of the user-perceived consistency window is determined by how long it takes for the data to be replicated to a consistent state.
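
The relationship between asynchronous replication and the consistency window can be sketched with a toy model. This is only an illustration under stated assumptions, not how any particular product works: the class `AsyncReplicatedStore`, the fixed delay `lag_ticks`, and the logical clock are all hypothetical names introduced here.

```python
from collections import deque
from typing import Optional


class AsyncReplicatedStore:
    """Toy primary/replica store with delayed (asynchronous) replication."""

    def __init__(self, replica_count: int, lag_ticks: int) -> None:
        self.primary = {}
        self.replicas = [{} for _ in range(replica_count)]
        self.lag_ticks = lag_ticks   # assumed fixed replication delay
        self.clock = 0               # logical time, advanced by tick()
        self.pending = deque()       # queue of (due_tick, key, value)

    def write(self, key: str, value: str) -> None:
        # The write is acknowledged as soon as the primary has it,
        # which is what keeps the system available.
        self.primary[key] = value
        self.pending.append((self.clock + self.lag_ticks, key, value))

    def tick(self) -> None:
        # Advance logical time and apply every replication message now due.
        self.clock += 1
        while self.pending and self.pending[0][0] <= self.clock:
            _, key, value = self.pending.popleft()
            for replica in self.replicas:
                replica[key] = value

    def read_replica(self, index: int, key: str) -> Optional[str]:
        return self.replicas[index].get(key)

    def is_consistent(self) -> bool:
        return all(replica == self.primary for replica in self.replicas)


def consistency_window(store: AsyncReplicatedStore) -> int:
    """Count the ticks until every replica matches the primary."""
    ticks = 0
    while not store.is_consistent():
        store.tick()
        ticks += 1
    return ticks


store = AsyncReplicatedStore(replica_count=2, lag_ticks=3)
store.write("cart", "1 item")
stale = store.read_replica(0, "cart")   # None: the replica has not caught up yet
window = consistency_window(store)      # 3 ticks in this toy model
```

In this sketch a read from a replica during the window returns stale data, and the window is exactly the replication delay. Shortening that delay is what makes the inconsistency less visible to users.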
Eventual consistency
Consistency can be viewed from two perspectives: the client and the server. From the client's perspective, consistency is mainly about how updated data is observed under concurrent access. From the server's perspective, it is about how updates are replicated across the whole system so that the data eventually becomes consistent. Consistency is only an issue because reads and writes happen concurrently, so any discussion of it has to take concurrent read/write scenarios into account.
From the client's perspective, when multiple processes access data concurrently, the policy that governs how each process sees updated data determines the consistency level. Relational databases require that updated data be visible to all subsequent accesses; this is strong consistency. If it is acceptable for some or all subsequent accesses to miss the update, it is weak consistency. If the updated data must become visible after some period of time, it is eventual consistency. Depending on when and how each process sees the data after an update, eventual consistency can be further divided into the following variants.
Causal consistency. If process A notifies process B that it has updated a data item, subsequent accesses by process B return the updated value, and a write is guaranteed to supersede the earlier write. Accesses by a process C that has no causal relationship with process A follow the general eventual consistency rules.
Read-your-writes consistency. After process A updates a data item, it always sees the updated value and never an older one. This is a special case of the causal consistency model.
Session consistency. This is a practical version of the previous model: the process accesses the storage system in the context of a session. As long as the session exists, the system guarantees read-your-writes consistency. If the session is terminated by some failure, a new session has to be created, and the guarantee does not carry over to the new session.
Monotonic read consistency. If a process has seen a particular value of a data object, no subsequent access will return a value older than that one.
Monotonic write consistency. The system guarantees that write operations from the same process are executed in order. A system that cannot guarantee this level of consistency is very difficult to program against.
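
One way to picture the session-level guarantees above is a client that remembers, per key, the oldest version it is still willing to accept. The sketch below is a simplification under several assumptions: `Replica`, `Session`, `StaleRead`, and the integer version numbers are hypothetical, and versions are assigned by peeking at all replicas, which a real system could not do so cheaply.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """One copy of the data: key -> (version, value)."""
    data: dict = field(default_factory=dict)

    def get(self, key):
        return self.data.get(key, (0, None))

    def put(self, key, version, value):
        # Keep only the newest version so late replication never goes backwards.
        if version > self.get(key)[0]:
            self.data[key] = (version, value)


class StaleRead(Exception):
    """Raised when no replica is fresh enough for this session."""


@dataclass
class Session:
    replicas: list
    floor: dict = field(default_factory=dict)  # key -> minimum acceptable version

    def write(self, key, value, replica_index=0):
        # Toy version assignment (assumption): one more than any known version.
        version = max(r.get(key)[0] for r in self.replicas) + 1
        self.replicas[replica_index].put(key, version, value)
        # Read-your-writes: later reads must be at least as new as this write.
        self.floor[key] = max(self.floor.get(key, 0), version)

    def read(self, key, preferred=0):
        others = [i for i in range(len(self.replicas)) if i != preferred]
        for i in [preferred] + others:
            version, value = self.replicas[i].get(key)
            if version >= self.floor.get(key, 0):
                # Monotonic reads: never accept anything older from now on.
                self.floor[key] = version
                return value
        raise StaleRead(key)


r0, r1 = Replica(), Replica()
session = Session([r0, r1])
session.write("x", "new", replica_index=0)   # r1 has not received the update
own_read = session.read("x", preferred=1)    # "new": stale r1 is skipped
fresh_session = Session([r0, r1])
other_read = fresh_session.read("x", preferred=1)  # None: no guarantee carried over
```

The `floor` dictionary lives in the session, so a new session starts without it. That mirrors the point that the guarantee is not extended to a new session.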
These variants of eventual consistency can be combined. For example, monotonic read consistency can be combined with read-your-writes consistency. From a practical point of view, this combination means a process reads its own updates and, once it has read the latest version, never reads an older one, which saves a lot of trouble when developing programs on such an architecture.
From the server's perspective, distributing updated data to the whole system as quickly as possible, and so shrinking the window before eventual consistency is reached, is an important part of improving availability and user experience. For a distributed data system, define:
N: the number of copies of the data
W: the number of nodes that must complete a write when data is updated
R: the number of nodes that are read when data is read
If W + R > N, the set of nodes written and the set of nodes read overlap, which gives strong consistency. For example, a typical relational database with one master and one slave using synchronous replication has N = 2, W = 2, R = 1; whether a read goes to the master or to the slave, the data is consistent. If W + R <= N, consistency is weak. For example, a relational database with one master and one slave using asynchronous replication has N = 2, W = 1, R = 1; a read from the slave may not see data that has already been updated on the master, so consistency is weak.
To ensure high availability, distributed systems generally set N >= 3. Different combinations of N, W, and R strike different balances between availability and consistency to suit different application scenarios.
If W = N and R = 1, the failure of any one node causes writes to fail, so availability is reduced. However, because all N copies are written synchronously, strong consistency is guaranteed.
If W = 1 and R = N, a write only has to succeed on one node, so write performance and availability are high. Because W + R = N + 1 > N, a read that really consults all N nodes still overlaps the written node; a process that reads only from other nodes, however, may not obtain the updated data and gets only weak consistency. In this situation, if W < (N + 1)/2 and the sets of written nodes do not overlap, write conflicts can occur.
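
The N, W, R rules above can be checked mechanically. The following is a small sketch, not taken from any product: the function names are hypothetical, and the brute-force overlap check is only practical for small N. It confirms that W + R > N is exactly the condition under which every read set shares a node with every write set.

```python
from itertools import combinations


def classify(n: int, w: int, r: int) -> str:
    """Apply the W + R > N rule from the text."""
    if not (1 <= w <= n and 1 <= r <= n):
        raise ValueError("W and R must be between 1 and N")
    return "strong" if w + r > n else "weak"


def read_always_overlaps_write(n: int, w: int, r: int) -> bool:
    """Brute force: does every read set share a node with every write set?"""
    nodes = range(n)
    return all(
        set(write_set) & set(read_set)
        for write_set in combinations(nodes, w)
        for read_set in combinations(nodes, r)
    )


def write_conflict_possible(n: int, w: int) -> bool:
    # Two write sets of size W can be disjoint only when 2W <= N,
    # which for integers is the same as W < (N + 1) / 2.
    return w < (n + 1) / 2


sync_pair = classify(2, 2, 1)    # "strong": master/slave, synchronous replication
async_pair = classify(2, 1, 1)   # "weak": master/slave, asynchronous replication
conflict = write_conflict_possible(3, 1)      # True: two writers can miss each other
no_conflict = write_conflict_possible(3, 2)   # False: any two write sets overlap
```

With N = 3, choosing W = 2 and R = 2 is a common middle ground in this model: reads and writes each tolerate one failed node, the overlap rule still holds, and two concurrent writes always share at least one node.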