A Veteran Walks You Through Implementing the Paxos Algorithm in Go

Source: Internet
Author: User

In theoretical computer science, the CAP theorem, also known as Brewer's theorem, states that a distributed computing system cannot satisfy all three of the following properties at the same time:

    1. Consistency: every node sees the same, most recent copy of the data;
    2. Availability: every request receives a non-error response, though without any guarantee that it reflects the latest data;
    3. Partition tolerance: in practice, a partition amounts to a time bound on communication. If the system cannot reach data consistency within that bound, a partition is considered to have occurred, and a choice must be made between C and A for the current operation.

Get in

Today's topic is consistency. Nodes in a distributed system communicate under one of two models: shared memory and message passing.

In a distributed system based on message passing, the following faults are unavoidable: a process may run slowly, be killed, or restart, and messages may be delayed, lost, or duplicated. Basic Paxos does not consider the possibility of message tampering, that is, Byzantine faults. The problem Paxos solves is how to agree on a single value in a distributed system where the faults above can occur, so that whichever of them happens, the consistency of the decision is never compromised. A typical scenario is a distributed database: if every node starts in the same initial state and executes the same sequence of operations, all nodes end up in the same state. To make sure every node executes the same sequence of commands, a "consistency algorithm" (consensus algorithm) is run for each command so that all nodes see the same one. A general-purpose consensus algorithm applies to many scenarios and is an important problem in distributed computing, which is why research on it has continued since the 1980s.
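The replicated state machine idea in the paragraph above can be sketched as follows. This is a minimal illustration, not code from the original project: the names Op, KV, Apply, and Replay are hypothetical, and the "log" stands for whatever command sequence the consensus algorithm has agreed on.

```go
package main

import "fmt"

// Op is a hypothetical command: set Key to Value.
type Op struct {
	Key   string
	Value int
}

// KV is a tiny deterministic state machine (illustrative only).
type KV struct {
	data map[string]int
}

// NewKV returns an empty replica; every replica starts in this same state.
func NewKV() *KV { return &KV{data: make(map[string]int)} }

// Apply executes one command. It uses no randomness and no clock,
// so the same command sequence always produces the same state.
func (kv *KV) Apply(op Op) { kv.data[op.Key] = op.Value }

// Replay applies an agreed-upon log to a fresh replica.
func Replay(log []Op) *KV {
	kv := NewKV()
	for _, op := range log {
		kv.Apply(op)
	}
	return kv
}

func main() {
	log := []Op{{"x", 1}, {"y", 2}, {"x", 3}}
	a, b := Replay(log), Replay(log)
	// Both replicas saw the same log, so they hold the same values.
	fmt.Println(a.data["x"] == b.data["x"], a.data["y"] == b.data["y"])
}
```

The only job left for the consensus algorithm is then to make every replica see the same log.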

Departure (Paxos algorithm)

The Paxos algorithm reaches a decision in two phases:

    • Phase 1: determine which proposer holds the highest number; only the proposer with the highest number has the right to submit a proposal (a proposal carries a specific value);
    • Phase 2: the highest-numbered proposer submits its proposal. If no other node has issued a proposal with a higher number, the proposal is accepted; otherwise the whole process is repeated.
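The two phases above can be sketched from the proposer's point of view. The code below is a simplified, single-process illustration under stated assumptions: the acceptors are plain in-memory structs rather than RPC peers, and the names Acceptor, Propose, promised, accepted, and value are hypothetical and do not come from the article's code.

```go
package main

import "fmt"

// Acceptor holds the state one acceptor must remember for one instance.
type Acceptor struct {
	promised int         // highest proposal number promised in phase 1
	accepted int         // number of the last accepted proposal (0 = none)
	value    interface{} // value of the last accepted proposal
}

// Prepare is phase 1: promise to ignore proposals numbered below n.
func (a *Acceptor) Prepare(n int) (ok bool, accN int, accV interface{}) {
	if n <= a.promised {
		return false, 0, nil
	}
	a.promised = n
	return true, a.accepted, a.value
}

// Accept is phase 2: accept unless a higher number was promised meanwhile.
func (a *Acceptor) Accept(n int, v interface{}) bool {
	if n < a.promised {
		return false
	}
	a.promised, a.accepted, a.value = n, n, v
	return true
}

// Propose runs both phases once. It returns the value that was sent in
// phase 2 (v, or an earlier accepted value it had to adopt) and whether
// a majority accepted it.
func Propose(acceptors []*Acceptor, n int, v interface{}) (interface{}, bool) {
	majority := len(acceptors)/2 + 1

	// Phase 1: collect promises and remember the highest-numbered accepted value.
	promises, highest := 0, 0
	for _, a := range acceptors {
		if ok, accN, accV := a.Prepare(n); ok {
			promises++
			if accN > highest {
				highest, v = accN, accV
			}
		}
	}
	if promises < majority {
		return nil, false
	}

	// Phase 2: ask the acceptors to accept the (possibly adopted) value.
	accepts := 0
	for _, a := range acceptors {
		if a.Accept(n, v) {
			accepts++
		}
	}
	return v, accepts >= majority
}

func main() {
	peers := []*Acceptor{{}, {}, {}}
	v1, ok1 := Propose(peers, 1, "CMP")
	v2, ok2 := Propose(peers, 2, "JMP") // must adopt the already chosen "CMP"
	fmt.Println(v1, ok1, v2, ok2)
}
```

A real proposer would retry with a higher number after a failed round and would talk to the acceptors over the network, so failures and lost messages would have to be handled as well.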

That is the conclusion; the full derivation is not covered here. One thing to note, however, is that phase 1 can livelock: you pick a high number, I pick a higher one, and so on, so the algorithm may never terminate. A "leader" can solve this problem; the leader is not deliberately elected but emerges naturally. Again, this is not discussed further, since this article is mainly about the code.
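The "numbered higher than you" contest only works if two proposers can never pick the same number. The article does not show how its numbers are generated, so the following is just one common scheme, offered as an assumption: peers are indexed 0..numPeers-1 and peer i only uses numbers congruent to i modulo numPeers. The function name NextProposalNumber is hypothetical. Note that this gives unique, increasing numbers but does not by itself prevent the livelock described above.

```go
package main

import "fmt"

// NextProposalNumber returns a number that is greater than seen and that
// belongs to peer me (it is congruent to me modulo numPeers), so two
// different peers can never generate the same proposal number.
func NextProposalNumber(seen, me, numPeers int) int {
	round := seen/numPeers + 1
	return round*numPeers + me
}

func main() {
	// Three peers; peer 2 has seen nothing yet, peer 0 has seen number 5.
	fmt.Println(NextProposalNumber(0, 2, 3)) // 5
	fmt.Println(NextProposalNumber(5, 0, 3)) // 6
}
```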

Phase1

func (px *Paxos) Prepare(args *PrepareArgs, reply *PrepareReply) error {
    px.mu.Lock()
    defer px.mu.Unlock()
    round, exist := px.rounds[args.Seq]
    if !exist {
        // new seq of commit, so a new instance is needed
        px.rounds[args.Seq] = px.newInstance()
        round, _ = px.rounds[args.Seq]
        reply.Err = OK
    } else {
        if args.PNum > round.proposeNumber {
            reply.Err = OK
        } else {
            reply.Err = Reject
        }
    }
    if reply.Err == OK {
        reply.AcceptPnum = round.acceptorNumber
        reply.AcceptValue = round.acceptValue
        px.rounds[args.Seq].proposeNumber = args.PNum
    } else {
        // reject
    }
    return nil
}

In the Prepare phase, the proposer asks each machine via an RPC call whether the current proposal can pass. The condition is that the proposal's number is greater than the highest number that machine has already seen in a Prepare; in the code, this is the check if args.PNum > round.proposeNumber. The other case is that the machine has not seen any proposal for this sequence number yet, that is, the current request is the first Prepare to reach it, in which case it agrees directly. Code snippet:

round, exist := px.rounds[args.Seq]
if !exist {
    // new seq of commit, so a new instance is needed
    px.rounds[args.Seq] = px.newInstance()
    round, _ = px.rounds[args.Seq]
    reply.Err = OK
}

Once this check is done, if the proposal passes, the acceptor must return to the proposer the proposal number and value it has already accepted, if any. Code snippet:

if reply.Err == OK {
    reply.AcceptPnum = round.acceptorNumber
    reply.AcceptValue = round.acceptValue
    px.rounds[args.Seq].proposeNumber = args.PNum
}
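The Prepare handler above relies on several types the article never shows. The sketch below fills them in so the handler can be run on its own. Every definition here is an assumption made for illustration: the Err string type, the field layout of Round, PrepareArgs, and PrepareReply, and the -1 starting values in newInstance are guesses based on how the handler uses them, not the original declarations.

```go
package main

import (
	"fmt"
	"sync"
)

// Err and its values are assumed; the article only uses the names OK and Reject.
type Err string

const (
	OK     Err = "OK"
	Reject Err = "Reject"
)

// Round is an assumed layout for one instance's acceptor state.
type Round struct {
	proposeNumber  int         // highest proposal number seen in Prepare
	acceptorNumber int         // number of the last accepted proposal
	acceptValue    interface{} // value of the last accepted proposal
}

type PrepareArgs struct {
	Seq  int // instance (log slot) number
	PNum int // proposal number
}

type PrepareReply struct {
	Err         Err
	AcceptPnum  int
	AcceptValue interface{}
}

type Paxos struct {
	mu     sync.Mutex
	rounds map[int]*Round
}

// newInstance starts with -1 so that any non-negative number wins (assumption).
func (px *Paxos) newInstance() *Round {
	return &Round{proposeNumber: -1, acceptorNumber: -1}
}

// Prepare follows the same logic as the article's handler.
func (px *Paxos) Prepare(args *PrepareArgs, reply *PrepareReply) error {
	px.mu.Lock()
	defer px.mu.Unlock()
	round, exist := px.rounds[args.Seq]
	if !exist {
		round = px.newInstance()
		px.rounds[args.Seq] = round
		reply.Err = OK
	} else if args.PNum > round.proposeNumber {
		reply.Err = OK
	} else {
		reply.Err = Reject
	}
	if reply.Err == OK {
		reply.AcceptPnum = round.acceptorNumber
		reply.AcceptValue = round.acceptValue
		round.proposeNumber = args.PNum
	}
	return nil
}

func main() {
	px := &Paxos{rounds: make(map[int]*Round)}
	var r1, r2, r3 PrepareReply
	px.Prepare(&PrepareArgs{Seq: 0, PNum: 5}, &r1) // first Prepare for seq 0
	px.Prepare(&PrepareArgs{Seq: 0, PNum: 3}, &r2) // lower number
	px.Prepare(&PrepareArgs{Seq: 0, PNum: 7}, &r3) // higher number
	fmt.Println(r1.Err, r2.Err, r3.Err)            // OK Reject OK
}
```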

Phase2

// The receiver must be a pointer: a value receiver would copy the mutex.
func (px *Paxos) Accept(args *AcceptArgs, reply *AcceptReply) error {
    px.mu.Lock()
    defer px.mu.Unlock()
    round, exist := px.rounds[args.Seq]
    if !exist {
        px.rounds[args.Seq] = px.newInstance()
        reply.Err = OK
    } else {
        if args.PNum >= round.proposeNumber {
            reply.Err = OK
        } else {
            reply.Err = Reject
        }
    }
    if reply.Err == OK {
        px.rounds[args.Seq].acceptorNumber = args.PNum
        px.rounds[args.Seq].proposeNumber = args.PNum
        px.rounds[args.Seq].acceptValue = args.Value
    } else {
        // reject
    }
    return nil
}

The Accept phase is basically the same as the Prepare phase. First check whether an instance for the current sequence number exists; if it does not, create a new one and return OK directly.

round, exist := px.rounds[args.Seq]
if !exist {
    px.rounds[args.Seq] = px.newInstance()
    reply.Err = OK
}

Then check whether the proposal number is greater than or equal to the highest proposal number seen so far. If so, return OK as well; otherwise reject.

if args.PNum >= round.proposeNumber {
    reply.Err = OK
} else {
    reply.Err = Reject
}

The important point is that if the proposal passes, the accepted proposal number and the accepted value for this round must be recorded.

if reply.Err == OK {
    px.rounds[args.Seq].acceptorNumber = args.PNum
    px.rounds[args.Seq].proposeNumber = args.PNum
    px.rounds[args.Seq].acceptValue = args.Value
}

Throughout the implementation, a map and a slice store auxiliary information. The map holds the result of each round of voting: the key is the sequence number of the round, and the Round value stores what has been accepted. The completes slice records, per peer, the smallest sequence number known to be completed, as reported by that peer.

rounds    map[int]*Round // cache each round's paxos result; key is seq, value is value
completes []int          // maintain peer min seq of completed

func (px *Paxos) Decide(args *DecideArgs, reply *DecideReply) error {
    px.mu.Lock()
    defer px.mu.Unlock()
    _, exist := px.rounds[args.Seq]
    if !exist {
        px.rounds[args.Seq] = px.newInstance()
    }
    px.rounds[args.Seq].acceptorNumber = args.PNum
    px.rounds[args.Seq].acceptValue = args.Value
    px.rounds[args.Seq].proposeNumber = args.PNum
    px.rounds[args.Seq].state = Decided
    px.completes[args.Me] = args.Done
    return nil
}

The Decide method, in turn, is what the proposer uses to fix a value once it has been decided; this maps onto applying a state machine in a distributed system.
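The article does not show how the completes slice is consumed. One plausible use, sketched here as an assumption, is to free memory: once every peer has reported being done with all instances up to some sequence number, those instances can be dropped from the map. The helper names minDone and forget are hypothetical, and the map value type is simplified to string to keep the example short.

```go
package main

import "fmt"

// minDone returns the smallest entry of completes, that is, the highest
// sequence number that every peer has reported as done.
// It assumes completes is non-empty.
func minDone(completes []int) int {
	lowest := completes[0]
	for _, d := range completes[1:] {
		if d < lowest {
			lowest = d
		}
	}
	return lowest
}

// forget deletes every instance with seq <= upTo and returns how many
// entries were removed. Deleting during range is safe for Go maps.
func forget(rounds map[int]string, upTo int) int {
	removed := 0
	for seq := range rounds {
		if seq <= upTo {
			delete(rounds, seq)
			removed++
		}
	}
	return removed
}

func main() {
	completes := []int{4, 2, 7} // done values reported by peers 0, 1, 2
	rounds := map[int]string{0: "a", 1: "b", 2: "c", 3: "d", 4: "e"}
	removed := forget(rounds, minDone(completes))
	fmt.Println(minDone(completes), removed, len(rounds)) // 2 3 2
}
```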

Clients submit instructions to the servers. The servers use the Paxos algorithm to reach agreement across multiple machines so that all servers execute the same instructions in the same order, and the state machine on every machine ends up with the same result.
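Executing "in the same order" means a server may only apply the contiguous prefix of decided instructions; a gap must be filled before anything after it runs. The sketch below illustrates this under assumptions: applyDecided is a hypothetical helper, and the decided map stands in for the instances whose state is Decided.

```go
package main

import "fmt"

// applyDecided applies instructions in sequence order starting at next and
// stops at the first slot that has not been decided yet. It returns the
// sequence number of the next slot still waiting to be applied.
func applyDecided(decided map[int]string, next int, apply func(string)) int {
	for {
		cmd, ok := decided[next]
		if !ok {
			return next
		}
		apply(cmd)
		next++
	}
}

func main() {
	// Slot 3 is still undecided, so slot 4 must wait.
	decided := map[int]string{1: "shl", 2: "add", 4: "jmp"}
	var executed []string
	next := applyDecided(decided, 1, func(c string) { executed = append(executed, c) })
	fmt.Println(executed, next) // [shl add] 3
}
```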

Station

In distributed environments, network failure is normal. If a machine goes down and recovers after a while, how does it recover the instructions it missed while it was down? Suppose that after recovering it submits a JMP instruction. Indexes 1 and 2 already hold decided instructions, so it can start directly from index 3. When it proposes JMP, the replies from S1 and S3 return the value CMP, which they have already accepted. According to the Paxos rule that a later proposer must adopt a previously accepted value, in phase 2 it sends an Accept request with the value CMP, and index 3 ends up as CMP. If no accepted value is returned at this stage, the client's value is chosen instead, and agreement is reached in the end.

The code originated from MIT and was then used for self-study (annotated source address).
