[Repost] Paxos algorithm 2: the algorithm procedure

Source: Internet
Author: User

-- Reposted from: {Old yards' column}

1. Handling proposal numbers

According to P2c, before issuing a proposal a proposer first asks the acceptors for the highest-numbered proposal (number and value) they have accepted, and only then decides which value to submit. So far we have emphasized higher-numbered proposals without saying what to do with lower-numbered ones.

|--------lower number (l<n)--------|--------current number (n)--------|--------higher number (h>n)--------|

The correctness of P2c is ensured by the current number n and any higher number h. A lower-numbered proposal l may also have satisfied P2c at the time it was issued, but because network communication is unreliable it can be delayed and arrive at the same time as h. l and h may then carry different values, which clearly violates P2c. The fix is that an acceptor does not accept any proposal that has expired. More precisely:

P1a: An acceptor can accept a proposal numbered n iff it has not responded to a prepare request having a number greater than n.

Obviously, the first proposal an acceptor receives satisfies this condition, so P1a subsumes P1.

For further discussion of numbering, see the section on unique numbers below.

2. Forming the Paxos algorithm

Combining P2c and P1a yields the Paxos algorithm, which has two phases:

Phase 1: prepare

    1. The proposer selects a proposal number n and sends a prepare request with it to a majority of acceptors.
    2. If an acceptor finds that n is larger than the number of any prepare request it has already replied to, it replies with the highest-numbered proposal it has accepted and the corresponding value (if any), together with a promise not to accept any proposal numbered less than n.

Phase 2: accept

    1. If the proposer receives responses from a majority, it sends an accept message (a proposal with number n and value v) to a majority of acceptors, which may be a different majority from the one used in prepare. The key is what the value v is: if the acceptor responses contain values, v is the value of the highest-numbered proposal among them; if no response contains a value, the proposer may choose v arbitrarily.
    2. On receiving the accept message, an acceptor checks whether it has responded to a prepare request numbered greater than n. If not, it accepts the value; otherwise it rejects the request or does not respond.
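
The two phases can be sketched in a few lines of Python. This is only an illustrative, in-memory sketch under simplifying assumptions: there is no network, requests go to all acceptors rather than to a chosen majority, and the names Acceptor, on_prepare, on_accept and propose are invented here for illustration.

```python
from dataclasses import dataclass
from typing import Any, List, Optional, Tuple


@dataclass
class Acceptor:
    promised: int = -1      # highest prepare number this acceptor has answered
    accepted_n: int = -1    # number of the highest-numbered accepted proposal
    accepted_v: Any = None  # value of that proposal

    def on_prepare(self, n: int) -> Optional[Tuple[int, Any]]:
        """Phase 1, step 2: promise, and report any accepted proposal."""
        if n <= self.promised:
            return None     # reject (a real acceptor may simply not respond)
        self.promised = n
        return (self.accepted_n, self.accepted_v)

    def on_accept(self, n: int, v: Any) -> bool:
        """Phase 2, step 2: accept unless a higher prepare was answered (P1a)."""
        if n < self.promised:
            return False
        self.accepted_n, self.accepted_v = n, v
        return True


def propose(acceptors: List[Acceptor], n: int, my_value: Any) -> Optional[Any]:
    """Run one round with number n; return the chosen value, or None on failure."""
    majority = len(acceptors) // 2 + 1
    replies = [r for r in (a.on_prepare(n) for a in acceptors) if r is not None]
    if len(replies) < majority:
        return None
    # Take the value of the highest-numbered accepted proposal, if there is one.
    best_n, best_v = max(replies, key=lambda r: r[0])
    v = best_v if best_n >= 0 else my_value
    acks = sum(a.on_accept(n, v) for a in acceptors)
    return v if acks >= majority else None
```

With three fresh acceptors, propose(acceptors, 1, "a") chooses "a"; a later propose(acceptors, 2, "b") also returns "a", because phase 1 reports the value that was already accepted.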

The algorithm looks surprisingly simple, but it is hard to see how it was derived. On closer consideration, it also raises more questions:

Numbering revisited: unique numbers

An important factor in the correct operation of Paxos is the proposal number: numbers must be comparable, so that any two can be ordered. With a single proposer this is easy, but how should numbers be assigned when several proposers submit at the same time? Lamport does not go into this and only requires that the numbers be totally ordered, but an implementation has to care. The problem looks simple but is actually a little tricky, because it is itself a distributed problem.

Google's Chubby paper describes one approach:

    • Assume there are n proposers, each assigned an index ir (0 <= ir < n). Any proposal number s that a proposer picks must be greater than the largest number it knows of and satisfy s % n = ir, that is, s = m*n + ir
    • The largest number a proposer knows of comes from two sources: its own number after incrementing, and the numbers in reject messages received from acceptors
    • Take 3 proposers P1, P2, P3 as an example: starting with m = 0, their numbers are 0, 1, 2 respectively
    • When P1 submits, it finds that P2 has already submitted and that P2's number 1 > P1's number 0, so P1 recalculates its number: new P1 = 1*3+0 = 3
    • P3 submits with number 2 and finds that it is smaller than P1's 3, so P3 renumbers: new P3 = 1*3+2 = 5
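
The numbering rule is easy to get wrong by one, so here is a small sketch of it. The function name next_proposal_number and its parameter names are assumptions made for this example; it returns the smallest s greater than the known maximum with s % n = ir.

```python
def next_proposal_number(known_max: int, n: int, ir: int) -> int:
    """Smallest s > known_max with s % n == ir, i.e. s = m*n + ir."""
    if not 0 <= ir < n:
        raise ValueError("ir must satisfy 0 <= ir < n")
    # Smallest m such that m*n + ir > known_max.
    m = (known_max - ir) // n + 1
    return m * n + ir
```

With n = 3, the proposer with index 0 that has seen number 1 gets 1*3+0 = 3, and the proposer with index 2 that has seen number 3 gets 1*3+2 = 5. Because each proposer only ever uses numbers from its own residue class, two proposers can never pick the same number.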

The whole Paxos algorithm essentially revolves around the proposal number: proposers are busy choosing a larger number to submit with, and acceptors check whether the number of a submitted proposal is the largest they have seen. Once the number is settled, the corresponding value is settled too. So nothing in Paxos is more important than the proposal number.

Livelock

When a proposer's proposal is rejected, the reason may be that the acceptors have promised a higher-numbered proposal, so the proposer increases its number and submits again. If two proposers each find their number too low and keep switching to higher-numbered proposals in turn, this can lead to an endless loop, known as livelock.
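
The following toy simulation illustrates the livelock. It is a sketch under strong assumptions: a single acceptor stands in for the majority, the interleaving is fixed so that the rival's prepare always arrives just before the pending accept, and the names Acceptor and duel are invented for this example.

```python
class Acceptor:
    def __init__(self) -> None:
        self.promised = -1  # highest prepare number answered so far

    def prepare(self, n: int) -> bool:
        if n > self.promised:
            self.promised = n
            return True
        return False

    def accept(self, n: int) -> bool:
        # Accept only if no higher-numbered prepare has been answered.
        return n >= self.promised


def duel(rounds: int) -> int:
    """Return how many accept requests succeed while two proposers interleave."""
    acc = Acceptor()
    n_a, n_b = 0, 1              # distinct starting numbers for proposers A and B
    accepted = 0
    acc.prepare(n_a)             # A prepares first
    for _ in range(rounds):
        acc.prepare(n_b)         # B's prepare arrives before A's accept
        if acc.accept(n_a):      # A's accept is now stale and is rejected
            accepted += 1
        n_a = n_b + 1            # A retries with a higher number
        acc.prepare(n_a)         # A's prepare arrives before B's accept
        if acc.accept(n_b):      # B's accept is rejected in turn
            accepted += 1
        n_b = n_a + 1            # B retries with a higher number
    return accepted
```

However many rounds are run, duel returns 0: the numbers keep growing but no proposal is ever accepted.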

Leader election

Livelock does exist in theory. Lamport's solution is to elect one proposer as leader and have all proposals submitted through the leader; when the leader goes down, another leader is elected immediately.

A leader solves the problem because it can control the pace of submission: for example, if an earlier proposal has no result yet, later proposals wait instead of rushing to resubmit with a higher number. This effectively turns a distributed problem into a single-point problem, and the robustness of that single point is ensured by the election mechanism.

The problem seems to get more complicated, since a leader election algorithm is now needed. In Fast Paxos, however, Lamport treats it as relatively simple, because a failed leader election does not harm the system, and so he does not discuss it further. He does add that the result of Fischer, Lynch, and Paterson implies that a reliable election algorithm must use either randomness or timeouts (leases).

Paxos is itself an election algorithm, so can Paxos be used to elect the leader? Electing a leader is just another proposal election, so does Paxos use Paxos to elect its own leader? A simplified version of Paxos, known as PaxosLease, can perform leader election, as in implementations such as Keyspace, Libpaxos, Zookeeper, and Google Chubby. We will discuss PaxosLease in detail later.

Although Lamport mentions randomness and timeout mechanisms, I personally think PaxosLease is the more robust and elegant approach.

The confusion brought by the leader

A leader solves the livelock problem, but raises a new question:

Now that there is a leader, we could simply put a queue on it and number all proposals globally. Apart from the fact that the leader can be re-elected, this looks very much like the single-point MQ mentioned in Paxos algorithm 1.

Does that mean Paxos is implemented as soon as we elect a master among multiple MQs? MQs already support master-master mode, so have we gone around in a circle, with Paxos being just dual-master mode?

Looking only at numbering, this is true: once a single master is elected to receive all proposals, the numbering problem is solved and there seems to be no need for the acceptor process. However, Paxos requires that in every election a value can be chosen and can be learned by the learners, no matter what errors occur. For example, the leader, acceptors, and learners can all go down and may later "wake up" again, and the algorithm must remain correct throughout.

If there were only a single master, the learners could not reliably learn the result of the election. In other words, leader election only helps the algorithm run; it does not replace it. False alarm: Paxos was never master-master.

Here we mention the learner role for the first time. After a value is chosen, the learner's job is to learn the final resolution. Learning is also part of the algorithm and must likewise be correct under all circumstances; much of the follow-up work revolves around learners.

Paxos and two-phase commit

People at Google have said that other distributed algorithms are simplified forms of Paxos. Consider the simple case in which the leader just submits one proposal to the acceptors:

    • Send prepare to a majority of acceptors
    • Receive responses from a majority
    • Send accept to a majority so that they approve the corresponding value

This is in fact a two-phase commit; the whole Paxos algorithm can be seen as multiple two-phase commits that interleave and interact.

How to choose multiple values

The process Paxos describes takes place within "one election". As mentioned earlier, in practice Paxos is executed round after round, and each round has its own name: instance. Each instance chooses exactly one value.

In each instance, a proposal may have to be submitted several times before the acceptors approve it. The usual practice is that if the acceptors do not accept it, the proposer increases the number and submits again. If the acceptors have not yet chosen a value (by majority approval), the proposer may submit any value; otherwise it must submit the value already chosen, as explained under P2c.

Another point worth noting: in Paxos the proposal number is submitted in the prepare phase, and only afterwards is it decided which value to submit. That is, the value is submitted separately from the number, which differs somewhat from what one might expect.

3. Learning resolutions

After a resolution has been chosen, the most important thing is to let the learners learn it; what is then done with the resolution is decided once it has been learned.

During learning, the first problem is how a learner knows that a resolution has been chosen. The simple approach is for every acceptor that accepts a proposal to tell every learner, but this generates a lot of traffic. A simple optimization is to tell only one learner and let that learner notify the others. This reduces traffic, but the drawback is equally obvious: it creates a single point of failure. A compromise is to tell a small set of learners; the cost is that the learners then face distributed problems of their own.

In any case one thing is certain: every acceptor must send its acceptance messages to some learner, otherwise the learners cannot know whether a value is the final resolution. The optimization question therefore reduces to whether to notify one learner or several.
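
How a learner can tell that a value has been chosen can be sketched as a simple tally. This is an illustrative sketch for a single instance; the class name Learner, the method on_accepted and the string acceptor ids are assumptions made for this example.

```python
from collections import defaultdict
from typing import Any, Dict, Optional, Set


class Learner:
    def __init__(self, num_acceptors: int) -> None:
        self.majority = num_acceptors // 2 + 1
        # proposal number -> ids of the acceptors that reported accepting it
        self.votes: Dict[int, Set[str]] = defaultdict(set)
        self.chosen: Optional[Any] = None

    def on_accepted(self, acceptor_id: str, n: int, value: Any) -> Optional[Any]:
        """Record one acceptance message; return the chosen value once known."""
        self.votes[n].add(acceptor_id)   # a set, so duplicate messages count once
        if self.chosen is None and len(self.votes[n]) >= self.majority:
            self.chosen = value
        return self.chosen
```

Votes are counted per proposal number, because a value is chosen only when a majority of acceptors has accepted the same proposal.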

Can we elect a leader among the learners, like the proposers' leader? Since each acceptor has persistent storage, this is possible, but the system becomes more and more complex; we will discuss this in detail later.

Another important issue is that learners must learn resolutions in order. The election algorithm above goes to great lengths to give every proposal a global number precisely so that resolutions can be used in order. However, resolutions may not arrive at a learner in order: it may receive resolution 10 before resolution 9 has arrived. In that case it must wait for resolution 9, or actively request it from the acceptors, and only then learn resolutions 9 and 10 in turn.
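
In-order learning can be sketched with a small buffer. This is a minimal sketch, assuming resolutions are identified by consecutive integer instance ids; the class name OrderedLearner and its method names are invented for this example.

```python
from typing import Any, Dict, List


class OrderedLearner:
    def __init__(self, first_instance: int = 1) -> None:
        self.next_instance = first_instance   # next resolution to apply
        self.pending: Dict[int, Any] = {}     # resolutions that arrived early
        self.applied: List[Any] = []          # resolutions applied, in order

    def on_resolution(self, instance: int, value: Any) -> List[int]:
        """Buffer a resolution, apply what is now contiguous, and return
        the instance ids that are still missing and should be requested."""
        if instance >= self.next_instance:
            self.pending[instance] = value
        while self.next_instance in self.pending:
            self.applied.append(self.pending.pop(self.next_instance))
            self.next_instance += 1
        if not self.pending:
            return []
        return [i for i in range(self.next_instance, max(self.pending))
                if i not in self.pending]
```

A learner waiting for resolution 9 that receives resolution 10 first applies nothing and reports that 9 is missing; once 9 arrives, it applies 9 and then 10.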

4. Exceptional conditions, persistent storage

Many abnormal situations can arise while the algorithm runs: a proposer goes down, an acceptor goes down after receiving a proposal, a proposer goes down after receiving a response, an acceptor goes down after accepting, a learner goes down, and so on. There can even be errors such as storage failures.

Whatever the error, the correctness of Paxos must be preserved. This requires proposers, acceptors, and learners to keep persistent state, so that a server that "wakes up" can still take part in Paxos correctly.

    • Proposer: stores the largest proposal number it has submitted and the resolution number (instance id)
    • Acceptor: stores the largest number it has promised, the largest number it has accepted together with its value, and the resolution number
    • Learner: stores the resolutions it has learned and their numbers
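
As an illustration of the acceptor's persistent state, the sketch below writes it to a JSON file and reads it back after a restart. The file format, the field names (instance, promised, accepted_n, accepted_v) and the functions save_state and load_state are all assumptions made for this example, not part of any particular implementation.

```python
import json
import os
import tempfile
from typing import Any, Dict


def save_state(path: str, state: Dict[str, Any]) -> None:
    """Write the state durably: temp file, fsync, then atomic rename."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
        f.flush()
        os.fsync(f.fileno())   # force the data to disk before renaming
    os.replace(tmp, path)      # readers see either the old or the new file


def load_state(path: str) -> Dict[str, Any]:
    """Return the saved state, or a fresh state if nothing was stored yet."""
    if not os.path.exists(path):
        return {"instance": 0, "promised": -1, "accepted_n": -1, "accepted_v": None}
    with open(path) as f:
        return json.load(f)
```

The important design point is ordering: an acceptor should call save_state before it sends its promise or acceptance, so that after a crash it never contradicts a response it has already sent.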

The above is a rough introduction to the Paxos algorithm. Its purpose is to give a general understanding of Paxos: what problem it solves, what it is for and how it comes about, how it executes, what its core is, and what fault tolerance it requires.

However, it is hard to turn the description above into an executable program, because a great many problems remain to be solved:

    • The leader election algorithm
    • When the leader is down and a new leader has not yet been elected, what is the impact on the system?
    • When several errors overlap, can the correctness of the algorithm still be guaranteed?
    • How exactly do learners learn resolutions?
    • Where are the instance number and the proposal number maintained?
    • Performance

The problems come as thick as snowflakes and can only be discussed one by one. And there is one most important question: the Paxos algorithm has been proven correct, but how can a program be proven correct?

For more information, refer to the following chapters.
