Reprint: CAP Theory for Distributed Systems

Last Update: 2016-02-21 Source: Internet

Author: User

Reproduced from the original article by Hollis: http://www.hollischuang.com/archives/666

In July 2000, Professor Eric Brewer of the University of California, Berkeley, presented the CAP conjecture at the ACM PODC conference. Two years later, Seth Gilbert and Nancy Lynch of MIT proved CAP formally. Since then, CAP has been an accepted theorem in the field of distributed computing.

Overview of CAP theory

A distributed system can satisfy at most two of the following three properties at the same time: consistency, availability, and partition tolerance.

Consistency

Consistency means that "all nodes see the same data at the same time": once an update operation has succeeded and the response has been returned to the client, every node holds exactly the same data at that moment. This is distributed consistency.

Consistency can be viewed from two perspectives: the client side and the server side. From the client side, consistency is mainly about what data concurrent accesses obtain after an update. From the server side, it is about how updates are replicated across the system so that the data eventually becomes consistent. Consistency problems arise from concurrent reads and writes, so when reasoning about consistency it is important to consider scenarios that combine concurrent reads and writes.

From the client's perspective, when multiple processes access the data concurrently, the policy that governs how each process obtains updated data determines the consistency level. Strong consistency, as in relational databases, requires that updated data be visible to all subsequent accesses. Weak consistency tolerates some or all subsequent accesses not seeing the update. Eventual consistency requires that the updated data become visible after some period of time.
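The difference between these levels can be illustrated with a small sketch. The class below is a hypothetical toy, not taken from the original article: `ReplicatedStore`, its `synchronous` flag, and the `replicate` method are names assumed for illustration. It models one primary and one replica, where a synchronous write behaves like strong consistency and an asynchronous write behaves like eventual consistency.

```python
class ReplicatedStore:
    """Toy store with one primary and one replica (illustrative only)."""

    def __init__(self):
        self.primary = {}
        self.replica = {}
        self.pending = []  # updates not yet applied to the replica

    def write(self, key, value, synchronous=False):
        self.primary[key] = value
        self.pending.append((key, value))
        if synchronous:
            # Strong consistency: do not acknowledge the write
            # until the replica has applied it as well.
            self.replicate()

    def replicate(self):
        # Apply queued updates in order; this stands in for
        # background replication catching up.
        for key, value in self.pending:
            self.replica[key] = value
        self.pending.clear()

    def read_from_replica(self, key):
        return self.replica.get(key)


store = ReplicatedStore()

# Synchronous write: the update is visible on the replica immediately.
store.write("balance", 100, synchronous=True)
strong_read = store.read_from_replica("balance")

# Asynchronous write: the replica serves stale data until it catches up.
store.write("balance", 80)
stale_read = store.read_from_replica("balance")
store.replicate()
converged_read = store.read_from_replica("balance")
```

Here `strong_read` is 100, `stale_read` is still 100 even though the primary already holds 80, and `converged_read` is 80 once replication has run. The window in which `stale_read` differs from the primary is what weak and eventual consistency tolerate.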

Availability

Availability means "reads and writes always succeed": the service is always available and responds within a normal response time.

For a distributed system to be available, every non-failing node must respond to every request it receives. That is, any algorithm used by the system must eventually terminate. When partition tolerance is also required, this is a strong definition: even under severe network failures, every request must terminate.

Good availability mainly means that the system serves users well, without bad experiences such as failed operations or access timeouts. Availability is usually closely tied to techniques such as distributed data redundancy and load balancing.

Partition Tolerance

Partition tolerance means "the system continues to operate despite arbitrary message loss or failure of part of the system": the distributed system can still provide a service that satisfies consistency and availability when it encounters a node failure or a network partition.

Partition tolerance is closely related to scalability. In a distributed application, the system may stop working properly for reasons that are specific to being distributed. Good partition tolerance requires that the application, although it is a distributed system, appears to function as a single whole. For example, if one or several machines in the system go down, the remaining machines can still run and meet the system's needs; or if a network fault splits the system into several parts, each part can still keep the distributed system operating. A system that behaves this way has good partition tolerance.

CAP Tradeoff

CAP theory tells us that consistency, availability, and partition tolerance cannot all be satisfied at the same time. So which one should be given up?

CA without P: If P is not required (partitions are not allowed), then C (strong consistency) and A (availability) can both be guaranteed. However, partitions are not something you can choose to avoid; they will always occur. In practice, a CA system therefore means that after a partition each resulting subsystem still remains CA.

CP without A: If A (availability) is not required, every request must be strongly consistent across servers, and P (a partition) can make synchronization take unboundedly long. Under those conditions CP can be guaranteed. Many traditional distributed database transactions follow this model.

AP without C: To be highly available and tolerate partitions, you have to give up consistency. Once a partition occurs, nodes may lose contact with each other, and to stay highly available each node can only serve requests from its local data, which can leave the global data inconsistent. Many NoSQL systems today fall into this category.
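The CP and AP choices above can be sketched as two policies for the same pair of nodes. This is a minimal illustration under stated assumptions, not a real replication protocol: `Node`, `Unavailable`, `make_pair`, `partition`, and `demo` are hypothetical names, and replication is modeled as a direct copy to the peer.

```python
class Unavailable(Exception):
    """Raised when a CP node refuses a request during a partition."""


class Node:
    def __init__(self, name, mode):
        self.name = name
        self.mode = mode          # "CP" or "AP" (assumed policy flag)
        self.data = {}
        self.peer = None
        self.partitioned = False  # True when the peer is unreachable

    def write(self, key, value):
        if self.partitioned:
            if self.mode == "CP":
                # CP: refuse the write rather than let replicas diverge.
                raise Unavailable(f"{self.name} cannot reach its peer")
            # AP: accept the write using local data only.
            self.data[key] = value
            return
        # No partition: update both copies before acknowledging.
        self.data[key] = value
        self.peer.data[key] = value

    def read(self, key):
        return self.data.get(key)


def make_pair(mode):
    a, b = Node("a", mode), Node("b", mode)
    a.peer, b.peer = b, a
    return a, b


def partition(a, b):
    a.partitioned = b.partitioned = True


def demo(mode):
    a, b = make_pair(mode)
    a.write("x", 1)       # replicated to both nodes
    partition(a, b)
    try:
        a.write("x", 2)   # attempted during the partition
        accepted = True
    except Unavailable:
        accepted = False
    return accepted, a.read("x"), b.read("x")


cp_result = demo("CP")
ap_result = demo("AP")
```

In this sketch `cp_result` is `(False, 1, 1)`: the write is rejected, so both nodes still agree. `ap_result` is `(True, 2, 1)`: the write succeeds, but the two nodes now hold different values until the partition heals and they reconcile.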

For most large-scale Internet applications, hosts are numerous, deployments are spread out, and cluster sizes keep growing, so node failures and network failures are the norm. To guarantee service availability of N nines, these systems guarantee P and A and give up C, falling back to eventual consistency. Although this affects the customer experience in some places, it is not severe enough to disrupt the user's workflow.
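To make "N nines" concrete, the arithmetic can be sketched as follows, assuming the usual reading that N nines means an availability of 1 - 10^(-N), for example 99.9% for three nines. The function name and the 365-day period are assumptions chosen for illustration.

```python
def allowed_downtime_minutes(nines, period_days=365):
    """Downtime budget in minutes for an availability of N nines.

    Assumes N nines means availability = 1 - 10**(-nines).
    """
    unavailability = 10 ** (-nines)
    return period_days * 24 * 60 * unavailability


# Downtime budget per 365-day year for two to five nines.
budgets = {n: allowed_downtime_minutes(n) for n in range(2, 6)}
for n, minutes in budgets.items():
    print(f"{n} nines: {minutes:.2f} minutes per year")
```

Under these assumptions, three nines allows roughly 525.6 minutes (under 9 hours) of downtime per year, while five nines allows only about 5.3 minutes, which is why such targets leave little room for blocking on cross-node synchronization.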

For scenarios involving money, where no compromise is acceptable, C must be guaranteed. When the network fails, such systems would rather stop serving, which guarantees CA and gives up P. There appear to have been no more than 10 such incidents in the domestic banking industry in recent years, but their impact was small, they were not widely reported, and the general public knows little about them. Another option is to guarantee CP and give up A: for example, during a network failure the system serves reads only and rejects writes.

There is no general answer as to which choice is better. It can only be decided according to the scenario, and the one that fits is the best.

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.