The Paxos algorithm of the distributed lock service Chubby

Source: Internet
Author: User


In distributed system design, Paxos is the most important consensus algorithm. As a leading Google engineer put it:


All working protocols for asynchronous consensus we have so far encountered have Paxos at their core.


That statement shows the standing of the algorithm. Plenty of articles on the web discuss Paxos, but most of them are confusing, and even the Wikipedia description is ambiguous and contains errors. The core idea of the algorithm is actually fairly simple. Most write-ups, however, either analyze it detached from any practical application or bury the core under a mass of implementation details. This article first states the design goal of the Paxos algorithm and the flow of the algorithm, and then analyzes why it works.


The Paxos algorithm keeps data consistent across multiple nodes of a distributed system. It has the following characteristics:

1. It is based on message passing. Messages may be lost, duplicated, or reordered, but they must not be corrupted.

2. It keeps working as long as fewer than half of the nodes have failed. Nodes may fail at any time without affecting the correct execution of the algorithm.


What follows is the basic Paxos algorithm. Note that it can only select one value out of several conflicting requests; it cannot serialize multiple requests into an order.


The Paxos algorithm involves three roles: proposer, acceptor, and learner.

In a typical implementation there is a fixed number of servers, and every server plays all three roles. Multiple clients each send their request value value_i to an arbitrary server. The group of servers then negotiates a single agreed value, chosen_value, which every server must learn and which is returned to all the clients that initiated requests.


The specific algorithm flow is as follows. To avoid ambiguity, the key terms propose, proposal, accept, value, and choose are kept in their original English form throughout.


Phase 1a---Prepare (reserve a proposal number)


After a proposer receives a client request value_i, it cannot issue a proposal yet. At this stage it may only pick a proposal number n and send that number to all acceptors (that is, to all servers, including itself). Proposal numbers must be unique across the entire system, and each proposer's own numbers must be increasing. A common practice: if k servers run Paxos together, server_i (i = 0...k-1) starts with proposal number i and adds k each time it generates a new number, which guarantees that no two servers ever produce the same proposal number.
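
The numbering scheme can be sketched in a few lines of Python. This is only an illustration; the class name ProposalNumbers and its fields are hypothetical and not part of any Paxos library.

```python
from dataclasses import dataclass


@dataclass
class ProposalNumbers:
    """Hands out proposal numbers for one server (hypothetical helper)."""
    server_id: int      # i, in the range 0..k-1
    num_servers: int    # k, the fixed number of servers
    issued: int = 0     # how many numbers this server has generated so far

    def next(self) -> int:
        # Start at i, then step by k: i, i+k, i+2k, ...
        n = self.server_id + self.issued * self.num_servers
        self.issued += 1
        return n


gen0 = ProposalNumbers(server_id=0, num_servers=3)
gen1 = ProposalNumbers(server_id=1, num_servers=3)
first_from_0 = [gen0.next() for _ in range(3)]   # 0, 3, 6
first_from_1 = [gen1.next() for _ in range(3)]   # 1, 4, 7
```

Because every number produced by server_i is congruent to i modulo k, two different servers can never collide.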

Phase 1b---Respond with Promise

When an acceptor receives a proposal number, it first checks whether it has already responded to a higher proposal number. If not, it responds. The response carries the highest-numbered proposal it has accepted so far (or null if it has not accepted any proposal), and with it the acceptor promises not to accept any proposal numbered below the number it just received. Otherwise, the request is rejected.
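
A minimal sketch of the acceptor side of phase 1b, assuming the acceptor keeps just two pieces of state. The names AcceptorState and on_prepare and the shape of the reply tuples are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class AcceptorState:
    promised_n: int = -1                         # highest number responded to so far
    accepted: Optional[Tuple[int, str]] = None   # (number, value) of the highest accepted proposal


def on_prepare(state: AcceptorState, n: int):
    """Handle a phase 1a message carrying proposal number n."""
    if n < state.promised_n:
        # Already responded to a higher number: reject.
        return ("reject", state.promised_n)
    state.promised_n = n                   # promise: nothing numbered below n will be accepted
    return ("promise", state.accepted)     # None plays the role of null


acc = AcceptorState()
r1 = on_prepare(acc, 5)     # fresh acceptor: promise, nothing accepted yet
acc.accepted = (5, "v5")    # pretend proposal 5 was accepted in phase 2b
r2 = on_prepare(acc, 8)     # promise, reporting the accepted proposal
r3 = on_prepare(acc, 6)     # 6 is below 8: rejected
```

Treating an equal number as acceptable keeps the handler idempotent when the same prepare message is delivered twice.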


Phase 2a---Initiate proposal, request accept

If the proposer gets responses from more than half of the acceptors, it is entitled to send the proposal <n, value> to the acceptors. Here n is the number sent in phase 1a, and value is the value of the highest-numbered proposal among the responses received. If every response received is null, the proposer is free to pick the value, and can simply use a client request value_i.
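
The value selection rule of phase 2a might look like the following. The function name choose_value and the representation of a response as an optional (number, value) pair are assumptions for this sketch.

```python
from typing import List, Optional, Tuple

Accepted = Optional[Tuple[int, str]]   # (proposal number, value), or None for null


def choose_value(promises: List[Accepted], num_acceptors: int,
                 own_value: str) -> Optional[str]:
    """Return the value for proposal <n, value>, or None if no majority responded."""
    if len(promises) <= num_acceptors // 2:
        return None                    # not more than half: may not propose
    reported = [p for p in promises if p is not None]
    if not reported:
        return own_value               # all null: free to use a client value
    # Otherwise adopt the value of the highest-numbered reported proposal.
    return max(reported, key=lambda p: p[0])[1]


free_choice = choose_value([None, None, None], 5, "mine")
forced = choose_value([None, (3, "a"), (7, "b")], 5, "mine")
too_few = choose_value([None, (3, "a")], 5, "mine")
```

Here free_choice is the proposer's own value, forced is the value of proposal 7, and too_few is None because two responses out of five are not a majority.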


Phase 2b---Accept proposal

The acceptor checks whether the number of the received proposal violates its phase 1b promise. If it does not, the acceptor accepts the proposal.
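
As a rough sketch, the phase 2b check reduces to a single comparison against the promised number. AcceptorState and on_accept are hypothetical names, and the state layout is an assumption.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class AcceptorState:
    promised_n: int = -1                         # number promised in phase 1b
    accepted: Optional[Tuple[int, str]] = None   # highest accepted (number, value)


def on_accept(state: AcceptorState, n: int, value: str) -> bool:
    """Handle proposal <n, value>; return True if it was accepted."""
    if n < state.promised_n:
        return False                   # accepting would break the phase 1b promise
    state.promised_n = n
    state.accepted = (n, value)
    return True


acc = AcceptorState(promised_n=8)
stale = on_accept(acc, 5, "old")       # numbered below the promise: refused
fresh = on_accept(acc, 8, "new")       # matches the promise: accepted
```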


After accepting a proposal, each acceptor keeps notifying all learners, or the learners actively query the acceptors. Once a learner confirms that a proposal has been accepted by more than half of the acceptors, the value of that proposal is chosen. The learner learns this value, and from then on its own server no longer takes proposer requests.
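
One way a learner could tally notifications is shown below. The Learner class and its method names are made up for this sketch; it counts distinct acceptors per proposal number so that duplicated messages do no harm.

```python
from collections import defaultdict
from typing import Dict, Optional, Set


class Learner:
    def __init__(self, num_acceptors: int) -> None:
        self.num_acceptors = num_acceptors
        self.votes: Dict[int, Set[int]] = defaultdict(set)  # proposal number -> acceptor ids
        self.chosen: Optional[str] = None

    def on_accepted(self, acceptor_id: int, n: int, value: str) -> Optional[str]:
        """Record that acceptor_id accepted proposal <n, value>."""
        self.votes[n].add(acceptor_id)          # a set ignores duplicate notifications
        if self.chosen is None and len(self.votes[n]) > self.num_acceptors // 2:
            self.chosen = value                 # more than half accepted: chosen
        return self.chosen


learner = Learner(num_acceptors=5)
steps = [
    learner.on_accepted(0, 100, "X"),
    learner.on_accepted(0, 100, "X"),   # duplicate message, still one vote
    learner.on_accepted(1, 100, "X"),
    learner.on_accepted(2, 100, "X"),   # third distinct acceptor out of five
]
```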


What does this algorithm achieve? As long as more than half of the servers keep working and the network connecting the working servers is functioning (message loss, duplication, and reordering are allowed), the following is guaranteed:


P2A: once a proposal has been accepted by a majority of acceptors, every proposal accepted at any later time has the same value as that proposal.


This is the key to the entire algorithm. Once it is guaranteed, the rest of the process of learning the value is simple, and message loss or server crashes are no longer a worry. For example, take 5 servers numbered 0~4. Suppose server0, server1, and server2 have accepted proposal 100, and server0 and server1 then learn proposal 100 and both crash right after learning it. Server2, server3, and server4 have not learned the chosen value, so they keep issuing proposals. The algorithm nevertheless guarantees that any value server3 and server4 accept in the future is the value of the previously chosen proposal 100.
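
The five-server scenario can be replayed with a small in-memory simulation. Everything here is an assumption made for illustration: the Acceptor class, the run_round helper, the values "X" and "Y", and the choice of 103 as server3's next proposal number. Messages are modeled as direct method calls, so loss and reordering are not simulated.

```python
from typing import List, Optional, Tuple


class Acceptor:
    def __init__(self) -> None:
        self.promised_n = -1
        self.accepted: Optional[Tuple[int, str]] = None
        self.up = True

    def prepare(self, n: int):
        if not self.up or n < self.promised_n:
            return None                          # crashed or rejected: no promise
        self.promised_n = n
        return ("promise", self.accepted)

    def accept(self, n: int, value: str) -> bool:
        if not self.up or n < self.promised_n:
            return False
        self.promised_n, self.accepted = n, (n, value)
        return True


def run_round(acceptors: List[Acceptor], n: int, own_value: str) -> Optional[str]:
    """One proposer attempt; returns the value if a majority accepted it."""
    majority = len(acceptors) // 2 + 1
    replies = [r for r in (a.prepare(n) for a in acceptors) if r is not None]
    if len(replies) < majority:
        return None
    reported = [acc for _, acc in replies if acc is not None]
    value = max(reported, key=lambda p: p[0])[1] if reported else own_value
    accepted = sum(a.accept(n, value) for a in acceptors)
    return value if accepted >= majority else None


servers = [Acceptor() for _ in range(5)]
for s in servers[:3]:                    # server0..server2 accept proposal 100
    s.prepare(100)
    s.accept(100, "X")
servers[0].up = servers[1].up = False    # server0 and server1 crash
result = run_round(servers, 103, "Y")    # server3 proposes its own value "Y"
```

Even though server3 wanted "Y", the promise from server2 reports proposal 100, so result is "X" and both server3 and server4 end up accepting "X".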


At this point you can probably guess that the fact that server2 accepted proposal 100 earlier plays a key role in this process. Below we rigorously prove the key property of the algorithm stated above as P2A.


First, let us review the key points of the two-phase protocol:

1. Before issuing a proposal, the proposer first obtains the value of the highest-numbered proposal accepted within a majority of acceptors. If that value is null, it may use its own value.

2. In phase 1b, the acceptor promises not to accept any proposal numbered below the number it received.

3. A proposal accepted by a majority of acceptors is recognized as the chosen value and is learned by the learners.


Together these constraints achieve the effect required by P2A. How did Leslie Lamport, the author of the Paxos algorithm, construct them? It is actually quite simple.

First, P2A is strengthened to the following condition:


P2B: once a proposal has been accepted by a majority of acceptors, every proposal issued by any proposer afterwards has the same value as that proposal.


Clearly, P2B implies P2A. How, then, can P2B be satisfied? It is enough for the following condition to hold:


P2C: the value of an issued proposal is the value of the highest-numbered proposal accepted within some majority set of acceptors. If no acceptor in that set has accepted a proposal, the proposer uses its own value.


How does P2B follow from P2C? It is easy to prove by mathematical induction. Suppose that at some moment a majority set C of acceptors have all accepted proposal K. Because C and any majority set S of acceptors must share at least one member, after this moment the highest-numbered proposal accepted within any majority set S is either proposal K or a proposal with a number greater than K; call the latter proposal K2. By the same reasoning, the value of proposal K2 equals the value of proposal K or the value of some proposal K3 with K<K3<K2. Continuing downward in this way, we conclude that the value determined by P2C is necessarily the value of proposal K.
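
The proof leans on the fact that any two majority sets share a member. For small clusters this can be checked by brute force; the helper names below are hypothetical, and the check is an illustration rather than a proof.

```python
from itertools import combinations


def majorities(k: int):
    """All subsets of k servers that contain more than half of them."""
    ids = range(k)
    return [set(c)
            for size in range(k // 2 + 1, k + 1)
            for c in combinations(ids, size)]


def all_majorities_intersect(k: int) -> bool:
    qs = majorities(k)
    # An empty intersection is falsy, so all() fails if any pair is disjoint.
    return all(a & b for a in qs for b in qs)


checked = {k: all_majorities_intersect(k) for k in range(1, 8)}
```

The general argument is a counting one: two sets that each hold more than k/2 of the k servers cannot be disjoint, since together they would hold more than k members.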


As we can see, the P2C condition is essentially key point 1 of the two-phase protocol above. There is one problem, though: P2C requires that finding the "highest-numbered value" and issuing the proposal form a single atomic operation, which is hard to achieve in practice. The two-phase protocol sidesteps this problem neatly, and that is the role of the promise in key point 2. When an acceptor responds with its "highest-numbered value", it promises not to accept any proposal numbered below the one it received. As a result, no new lower-numbered accept can slip in between "finding the highest-numbered value" and "issuing the proposal", so the P2C condition cannot be broken.


This completes the analysis of the basic Paxos algorithm. However, when multiple proposers issue proposals concurrently, the algorithm as described suffers from a livelock problem, and a leader has to be introduced to address it. In addition, the algorithm only chooses one value out of several conflicting values; serializing multiple values to implement a state machine requires the Multi-Paxos algorithm.
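
The livelock can be illustrated with two proposers that keep preempting each other. This is a deliberately adversarial schedule written for the sketch; the Acceptor class, the duel function, and the even/odd numbering for proposers A and B are all assumptions.

```python
class Acceptor:
    def __init__(self) -> None:
        self.promised_n = -1
        self.accepted = None

    def prepare(self, n: int) -> bool:
        if n < self.promised_n:
            return False
        self.promised_n = n
        return True

    def accept(self, n: int, value: str) -> bool:
        if n < self.promised_n:
            return False
        self.promised_n, self.accepted = n, (n, value)
        return True


def duel(rounds: int, num_acceptors: int = 3):
    """Return (rejected phase 2a attempts, whether anything was ever accepted)."""
    acceptors = [Acceptor() for _ in range(num_acceptors)]
    n_a, n_b = 0, 1                  # A uses even numbers, B uses odd numbers
    rejected = 0
    for a in acceptors:
        a.prepare(n_a)               # A completes phase 1
    for _ in range(rounds):
        for a in acceptors:
            a.prepare(n_b)           # B's prepare arrives before A's proposal
        if not any([a.accept(n_a, "A") for a in acceptors]):
            rejected += 1            # A's proposal is now below the promise
        n_a += 2
        for a in acceptors:
            a.prepare(n_a)           # A retries with a higher number first
        if not any([a.accept(n_b, "B") for a in acceptors]):
            rejected += 1            # B's proposal is preempted in turn
        n_b += 2
    return rejected, any(a.accepted is not None for a in acceptors)


outcome = duel(10)
```

Under this schedule every proposal is rejected and nothing is ever accepted, no matter how many rounds run. Safety is never violated, only progress, which is why funneling proposals through a single leader helps.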

Copyright notice: this is an original article by the blogger and may not be reproduced without the blogger's permission.
