Fundamental Algorithm Series: The Paxos Algorithm


The field of algorithms is vast. This series studies only the core algorithms encountered in practical applications; understanding these algorithms and their applications is essential for Java developers who want to advance.

While researching Paxos, one conclusion was confirmed: the best place to learn Paxos is the Wikipedia article "Paxos (computer science)".

Directory

1. Background

2. The Paxos algorithm

3. The Multi-Paxos algorithm

4. The application of Multi-Paxos in Google Chubby


One, Background

The Paxos protocol solves the problem of reaching agreement on a value (a proposal) among multiple nodes in a distributed system. However, the algorithm is obscure, and the original paper is difficult to understand. This article therefore aims to offer a few ideas to help you get started.

Two, the Paxos algorithm

2.1 Roles (three core roles)

Client: initiates a request and waits for the response.
Proposer: the proposal initiator; it handles client requests and sends proposals to the cluster to determine whether a value can be approved.
Acceptor: the proposal approver; it handles received proposals, and each response is a vote. It stores some state in order to decide whether to accept a value.
Learner: once a value has been accepted by a majority of acceptors and the learners are notified, the learner adopts that value.
Leader: a special proposer.

2.2 The Basic Paxos algorithm

The core of a Paxos instance consists of two phases: the preparation phase (prepare) and the proposal phase (accept). The wiki refines these into four sub-steps, described below.

Simply put, Basic Paxos resembles a classic two-phase commit (2PC).

Phase I:

    • 1a Prepare: the proposer sends a proposal, numbered n, to the acceptors; the proposal carries the value on which agreement is sought.
    • 1b Promise: an acceptor promises to honor only proposals (prepare or accept) numbered at least as high as any it has seen, rejecting those numbered below its current promise. It replies to the proposer with the highest-numbered proposal value it has already accepted, if any; if n is smaller than a number it has already promised, it replies with a rejection.

Phase II:

    • 2a Accept request: once the proposer has received promises from a majority of acceptors, it sets its value to that of the highest-numbered proposal reported in the replies; if no reply carried a value, it may choose any value. It then sends an accept request with the chosen value to the acceptors.
    • 2b Accepted: an acceptor accepts the proposal, provided it has not promised a proposal with a greater number in the meantime, and notifies the proposer and the learners.
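The acceptor's rules in the two phases above can be sketched as follows. This is a minimal, single-process illustration; the class and field names (`Acceptor`, `Promise`, `promisedN`, and so on) are my own, not from any real library.

```java
// Minimal sketch of a Basic Paxos acceptor's state machine.
// All names here are illustrative assumptions, not a real API.
class Promise {
    final boolean ok;
    final int acceptedN;          // number of the previously accepted proposal, or -1
    final String acceptedValue;   // its value, or null if nothing accepted yet
    Promise(boolean ok, int acceptedN, String acceptedValue) {
        this.ok = ok; this.acceptedN = acceptedN; this.acceptedValue = acceptedValue;
    }
}

class Acceptor {
    private int promisedN = -1;       // highest proposal number promised so far
    private int acceptedN = -1;       // number of the last accepted proposal
    private String acceptedValue;     // value of the last accepted proposal

    // Phase 1b: promise not to accept anything numbered below n,
    // and report any value already accepted.
    Promise prepare(int n) {
        if (n <= promisedN) return new Promise(false, acceptedN, acceptedValue);
        promisedN = n;
        return new Promise(true, acceptedN, acceptedValue);
    }

    // Phase 2b: accept unless a higher-numbered prepare arrived in between.
    boolean accept(int n, String value) {
        if (n < promisedN) return false;
        promisedN = n;
        acceptedN = n;
        acceptedValue = value;
        return true;
    }
}
```

A proposer that collects `ok` promises from a majority must adopt the `acceptedValue` with the highest `acceptedN` among the replies before issuing its accept requests; this is what lets a later proposal "inherit" an already-chosen value.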

The wiki article provides a flowchart of this message exchange.

The role of the prepare phase is illustrated by the following scenario (suppose the prepare phase were omitted):

1. S1 first initiates Accept(1, red) and reaches a majority on S1, S2, and S3; red is persisted on S1, S2, and S3.
2. S5 then initiates Accept(5, blue) and reaches a majority on S3, S4, and S5; blue is persisted on S3, S4, and S5.
3. The final result is that S1 and S2 hold red while S3, S4, and S5 hold blue, and no agreement is reached. The two phases are therefore essential: the prepare phase blocks old proposals and returns any acceptedProposal that has already been received.

Solution:

1. Order the proposals by assigning each a unique ID, stipulating that a larger ID means a newer proposal. Comparing (5, blue) with (1, red), 5 > 1, so blue is kept.

2. Use the two-phase approach to reject old proposals.
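A common way to generate the unique, totally ordered proposal IDs described above is to combine a per-server round counter with the server's ID. This scheme is an illustrative assumption, not mandated by the Paxos paper:

```java
// Illustrative proposal-ID scheme: a higher round wins; the server ID breaks ties,
// so two servers can never generate the same ID.
class ProposalId implements Comparable<ProposalId> {
    final int round;     // incremented each time this server proposes
    final int serverId;  // unique per server

    ProposalId(int round, int serverId) {
        this.round = round;
        this.serverId = serverId;
    }

    @Override
    public int compareTo(ProposalId other) {
        if (round != other.round) return Integer.compare(round, other.round);
        return Integer.compare(serverId, other.serverId);
    }
}
```

With this ordering, proposal (5, blue) from server 5 compares greater than (1, red) from server 1, so acceptors keep blue, matching the example above.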

2.3 The Multi-Paxos algorithm

In my understanding, the biggest differences between Multi-Paxos and Basic Paxos are:

1. It runs multiple instances.

2. There is a unique leader (a special proposer). The leader submits values to the acceptors for voting, and the prepare phase can then be skipped.

Many articles misunderstand Multi-Paxos as a one-phase commit. That is true only while the leader is stable: when a new leader has just been elected, the commit still takes two phases.

Once the leader is stable, the prepare and promise steps are no longer required.

The Multi-Paxos leader is used to avoid livelock (for example, with 4 proposers, 2 proposing value A and 2 proposing value B, no agreement is reached and a livelock results). But the presence of a leader brings other problems. First, how is a unique leader elected and maintained? (Having no leader, or more than one, does not affect consistency, but it does affect progress toward a resolution.) Second, the leader node bears more load, so how is load balanced across nodes? Mencius [1] proposes rotating the leader role among nodes to balance the load. A lease can help maintain a unique leader, but a leader failure can then make the service unavailable for a short period.
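The leader optimization can be sketched as simple bookkeeping: the leader pays for one prepare when elected, then commits each subsequent log slot with an accept request alone. The class below is my own simplification (message sending and majority voting are assumed away):

```java
import java.util.ArrayList;
import java.util.List;

// Simplified Multi-Paxos leader: one prepare on election,
// then accept-only for each subsequent log slot (instance).
class MultiPaxosLeader {
    private final int proposalN;          // number obtained when elected
    private boolean prepared = false;     // has phase 1 been run yet?
    final List<String> log = new ArrayList<>();  // chosen values, one per instance
    int messagesSent = 0;                 // rough message-round bookkeeping

    MultiPaxosLeader(int proposalN) { this.proposalN = proposalN; }

    void commit(String value) {
        if (!prepared) {
            messagesSent++;   // one broadcast prepare covering all future instances
            prepared = true;
        }
        messagesSent++;       // the accept request for this instance
        log.add(value);       // assume a majority of acceptors accepted
    }
}
```

Committing three values costs one prepare round plus three accept rounds; Basic Paxos would pay a prepare round per value, which is exactly the saving a stable leader provides.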

2.4 The application of Multi-Paxos in Google Chubby

Google Chubby is a highly available distributed lock service, designed as a lock service accessed through a centralized node. This article analyzes only the implementation of the Chubby server.

The basic architecture of the Chubby server is broadly divided into three tiers:

① At the bottom is a fault-tolerant log (fault-tolerant log); the Paxos algorithm guarantees that the logs on all machines in the cluster are exactly the same, which gives good fault tolerance.

② Above the log layer is a key-value fault-tolerant database (fault-tolerant DB), whose consistency and fault tolerance are guaranteed by the underlying log.

③ On top of the storage tier are the distributed lock service and the small-file storage service that Chubby provides externally.

The Paxos algorithm ensures that the logs of all replica nodes in the cluster stay consistent. Each value in Chubby's transaction log corresponds to one instance of the Paxos algorithm. Because Chubby must provide continuous service, the transaction log grows without bound, so over the course of Chubby's operation there are many Paxos instances. Chubby assigns each Paxos instance a globally unique instance number and writes the instances to the transaction log in order.

In standard Paxos, each instance must run one or more rounds of the complete two-phase prepare -> promise -> propose -> accept process to choose a value. To improve performance while preserving the correctness of the algorithm, multiple instances can share a single sequence-number allocation mechanism, and the prepare -> promise exchange can be merged into one phase across instances. The specific practice is as follows:

① When a replica node is elected master, it broadcasts a prepare message with a newly assigned number n; this one message is shared by all instances that have not yet reached agreement and by all instances not yet started.

② When an acceptor receives the prepare message, it must respond for multiple instances at once, which is usually done by packing the feedback into a single packet. Suppose at most K instances may be choosing a value at the same time:

- For the (at most K) instances that have not yet reached agreement, the acceptor packs their outstanding proposed values into one packet and returns it as the promise message.

- At the same time, it checks whether n is greater than its current highestPromisedNum (the largest proposal number it has promised so far). If so, it sets highestPromisedNum to n for these pending instances and for all future instances, so that none of them can accept any proposal numbered less than n.

③ The master then performs propose -> accept processing for all pending instances and all future instances. As long as the master runs stably, the prepare -> promise phase never needs to be run again. However, if the master receives a reject message from an acceptor, this indicates that another node in the cluster considers itself master and is sending prepare messages with a larger proposal number; the current master must then obtain a new proposal number and run the prepare -> promise phase again.
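The acceptor side of the merged prepare described in ② can be sketched like this. The names (`MultiInstanceAcceptor`, `highestPromisedNum`, `prepareAll`) are hypothetical; Chubby's real implementation is not public in this form:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of an acceptor that answers one prepare(n) for many instances at once.
// A single highestPromisedNum covers all pending AND all future instances.
class MultiInstanceAcceptor {
    private int highestPromisedNum = -1;
    private final Map<Integer, String> pending = new HashMap<>();  // instanceNo -> value not yet agreed

    void recordPending(int instanceNo, String value) {
        pending.put(instanceNo, value);
    }

    // One promise packet for every pending instance, or null if n is rejected
    // (meaning some other master has already promised a larger number).
    Map<Integer, String> prepareAll(int n) {
        if (n <= highestPromisedNum) return null;
        highestPromisedNum = n;        // applies to every future instance too
        return new HashMap<>(pending);
    }

    boolean accept(int instanceNo, int n, String value) {
        if (n < highestPromisedNum) return false;
        pending.remove(instanceNo);    // this instance is now settled
        return true;
    }
}
```

A rejection from `prepareAll` is exactly the signal described in ③: the master learns that a rival with a larger number exists and must re-run the prepare -> promise phase with a new number.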

Chubby is thus a typical application of the Multi-Paxos algorithm: while the master runs stably, each instance needs only the propose -> accept phase, performed in sequence with the same proposal number.

  

Three, Summary

There are many variants of the Paxos algorithm, such as Cheap Paxos and Fast Paxos; this article introduced the most widely used one, Multi-Paxos. I hope it offers a starting point for learning about distributed consensus algorithms.


References:

1. Wikipedia: Paxos (computer science)

2. CSDN blog: A step-by-step understanding of the Paxos algorithm

3. Book: From Paxos to ZooKeeper

4. Paper: "Time, Clocks, and the Ordering of Events in a Distributed System"
