Reprint: CAP Theory for Distributed Systems

Source: Internet
Author: User

Reproduced from Hollis's original article: http://www.hollischuang.com/archives/666


In July 2000, Professor Eric Brewer of the University of California, Berkeley, presented the CAP conjecture at the ACM PODC conference. Two years later, Seth Gilbert and Nancy Lynch of MIT proved the conjecture formally. Since then, CAP has been an accepted theorem in the field of distributed computing.

Overview of CAP theory

A distributed system can satisfy at most two of the following three properties at the same time: consistency, availability, and partition tolerance.

Consistency

Consistency means "all nodes see the same data at the same time": once an update operation succeeds and returns to the client, the data on all nodes is fully consistent at that moment. This is consistency in the distributed sense.

Consistency can be viewed from two perspectives: the client side and the server side. From the client side, consistency is mainly about how updated data becomes visible under concurrent access. From the server side, it is about how updates are replicated across the system so that the data eventually becomes consistent. Consistency problems arise from concurrent reads and writes, so any analysis of consistency has to take concurrent read and write scenarios into account.

From the client's perspective, when multiple processes access the data concurrently, the policy for how each process sees an update determines the consistency level. Requiring that an update be visible to all subsequent accesses, as relational databases do, is strong consistency. Tolerating that some or all subsequent accesses may not see the update is weak consistency. Requiring that the update become visible after some period of time is eventual consistency.
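The difference between strong and eventual consistency can be illustrated with a small in-memory sketch. Everything here is hypothetical and chosen for illustration only: the `Cluster` and `Replica` classes, the `mode` parameter, and the explicit `sync()` call that stands in for background replication. It is not how any particular database implements replication.

```python
class Replica:
    """One node holding its own copy of the data."""
    def __init__(self):
        self.data = {}


class Cluster:
    """Hypothetical cluster with one primary (index 0) and n-1 followers.

    mode="strong":   a write is copied to every follower before it returns.
    mode="eventual": a write returns after updating the primary only;
                     followers catch up when sync() runs.
    """
    def __init__(self, n, mode):
        self.replicas = [Replica() for _ in range(n)]
        self.mode = mode
        self.pending = []  # updates not yet applied to followers

    def write(self, key, value):
        self.replicas[0].data[key] = value
        if self.mode == "strong":
            for follower in self.replicas[1:]:
                follower.data[key] = value
        else:
            self.pending.append((key, value))

    def sync(self):
        """Stand-in for asynchronous replication finishing."""
        for key, value in self.pending:
            for follower in self.replicas[1:]:
                follower.data[key] = value
        self.pending.clear()

    def read(self, key, replica_index):
        return self.replicas[replica_index].data.get(key)


strong = Cluster(3, "strong")
strong.write("x", 1)
strong_read = strong.read("x", 2)   # follower already has the update

eventual = Cluster(3, "eventual")
eventual.write("x", 1)
stale_read = eventual.read("x", 2)  # follower has not seen the update yet
eventual.sync()
fresh_read = eventual.read("x", 2)  # follower has caught up
```

In the strong mode, a read from any replica sees the write as soon as `write` returns. In the eventual mode, a follower can return stale data until replication completes, which is the window that eventual consistency permits.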

Availability

Availability means "reads and writes always succeed": the service is always available and responds within a normal time.

For a distributed system to be available, every non-failed node must respond to every request it receives. In other words, any algorithm the system uses must eventually terminate. When partition tolerance is also required, this is a strong requirement: even under severe network failures, every request must terminate.

Good availability mainly means that the system serves its users well, without bad experiences such as failed operations or access timeouts. Availability is usually closely tied to techniques such as distributed data redundancy and load balancing.

Partition tolerance

Partition tolerance means "the system continues to operate despite arbitrary message loss or failure of part of the system": the distributed system can still provide a service that satisfies consistency and availability when a node fails or the network partitions.

Partition tolerance is closely related to scalability. In a distributed application, the system may stop working properly for reasons that come from being distributed. Good partition tolerance requires that the application, although it is a distributed system, appears to function as a single whole. For example, if one or several machines go down, the remaining machines can still meet the system's needs; or, if a network anomaly splits the system into several isolated parts, those parts can still keep the system running. A system that behaves this way has good partition tolerance.

CAP trade-offs

CAP theory tells us that consistency, availability, and partition tolerance cannot all be satisfied at the same time. So which one should be given up?

CA without P: if P is not required (that is, partitions are assumed not to occur), then C (strong consistency) and A (availability) can both be guaranteed. But partitions are not something you get to choose; they will always occur eventually. In practice, a CA system is therefore one in which each subsystem remains CA on its own after a partition.

CP without A: if A (availability) is not required, every request must be strongly consistent across servers, and a partition (P) can make synchronization take arbitrarily long. C and P are guaranteed, at the cost of availability. Distributed transactions in many traditional databases follow this model.

AP without C: to be highly available and tolerate partitions, you have to give up consistency. Once a partition occurs, nodes may lose contact with each other, and to stay available each node can only serve requests from its local data, which leads to globally inconsistent data. Many NoSQL systems fall into this category.
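The CP and AP choices above can be contrasted with a toy two-node model. This is only a sketch under simplifying assumptions: the `Node` class, the `policy` flag, and the `Unavailable` exception are hypothetical names invented for this example, and a real system would detect partitions through timeouts rather than a boolean flag.

```python
class Unavailable(Exception):
    """Raised when a node refuses a request to protect consistency."""


class Node:
    """Hypothetical node in a two-node cluster with a fixed policy."""
    def __init__(self, policy):
        self.policy = policy        # "CP" or "AP"
        self.local = {}
        self.peer = None
        self.partitioned = False    # True when the peer is unreachable

    def write(self, key, value):
        if self.partitioned:
            if self.policy == "CP":
                # CP: refuse the write rather than let replicas diverge.
                raise Unavailable("peer unreachable; write rejected")
            # AP: accept the write locally; replicas may now disagree.
            self.local[key] = value
            return
        self.local[key] = value
        self.peer.local[key] = value

    def read(self, key):
        return self.local.get(key)


def make_pair(policy):
    a, b = Node(policy), Node(policy)
    a.peer, b.peer = b, a
    return a, b


def partition(a, b):
    a.partitioned = b.partitioned = True


# CP: the write during the partition fails, so both nodes still agree.
cp_a, cp_b = make_pair("CP")
cp_a.write("balance", 100)
partition(cp_a, cp_b)
try:
    cp_a.write("balance", 50)
    cp_write_ok = True
except Unavailable:
    cp_write_ok = False

# AP: the write during the partition succeeds, so the nodes diverge.
ap_a, ap_b = make_pair("AP")
ap_a.write("balance", 100)
partition(ap_a, ap_b)
ap_a.write("balance", 50)
diverged = ap_a.read("balance") != ap_b.read("balance")
```

In this model the CP pair stays consistent because neither side accepts writes while partitioned, which is the read-only behavior a CP system may fall back to. The AP pair keeps answering every request, and the two replicas have to be reconciled after the partition heals.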

For most large-scale Internet applications, hosts are numerous, deployments are scattered, and cluster sizes keep growing, so node failures and network failures are the norm. To guarantee service availability of N nines, these applications keep P and A and give up C, falling back to eventual consistency. This hurts the customer experience in places, but not severely enough to disrupt the user's workflow.
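To make "N nines" concrete: an availability of N nines means the service may be unavailable for at most a fraction 10^-N of the time. The short calculation below converts that into allowed downtime per year. The function name `allowed_downtime_minutes` and the choice of a 365-day year are assumptions made for this illustration.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumes a 365-day year


def allowed_downtime_minutes(nines, period_minutes=MINUTES_PER_YEAR):
    """Maximum downtime per period for an availability of N nines.

    For example, nines=3 corresponds to 99.9% availability,
    i.e. an unavailability fraction of 10**-3.
    """
    unavailability = 10 ** (-nines)
    return period_minutes * unavailability


for n in (2, 3, 4, 5):
    print(f"{n} nines: {allowed_downtime_minutes(n):.2f} minutes/year")
```

Three nines (99.9%) leaves roughly 526 minutes of downtime per year, under nine hours, while five nines (99.999%) leaves only a little over five minutes. With so little slack, a system cannot afford to stop serving requests every time a node or link fails.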

For scenarios involving money, where no compromise is acceptable, C must be guaranteed. One option is to stop the service when the network fails, which keeps CA and gives up P. The domestic banking industry appears to have had around ten such incidents in recent years, but their impact was small and they were rarely reported, so few people know about them. The other option is to keep CP and give up A, for example by allowing only reads and rejecting writes during a network failure.

There is no general answer as to which choice is better. It depends on the scenario, and the option that fits the scenario is the best one.
