Advanced Theory of Distributed Systems: Raft, Zab

Source: Internet
Author: User

Introduction

The previous article, "Advanced Theory of Distributed Systems: Paxos", introduced the Paxos consistency protocol. Today we look at two other common consistency protocols, Raft and Zab. By comparing them with Paxos, we can understand the core ideas of Raft and Zab and deepen our understanding of consistency protocols.

Raft

Paxos leans toward theory and says little about how to apply it in engineering practice. Because it is hard to understand, and because real-world conditions are harsher than the theory assumes, building a correct Paxos-based distributed system for a production environment is very difficult [1]:

There are significant gaps between the description of the Paxos algorithm and the needs of a real-world system. In order to build a real-world system, an expert needs to use numerous ideas scattered in the literature and make several relatively small protocol extensions. The cumulative effort would be substantial and the final system would be based on an unproven protocol.

Raft [2][3] was proposed in 2013. Although it has not been around for long, many systems are already built on it. Compared to Paxos, Raft's selling point is that it is easier to understand and easier to implement.

To make it easier to understand and implement, Raft decomposes the problem and makes it concrete: a single leader handles all change requests; the job of the consistency protocol is to keep the replicated operation log (log replication) consistent across nodes; the term serves as a logical clock to guarantee ordering; and every node runs the same state machine [4] to obtain consistent results. The Raft protocol works as follows:

    1. The client initiates a request, and each request contains an operation instruction.
    2. The request is handled by the leader. The leader appends the operation instruction (entry) to its operation log, then sends AppendEntries requests to the followers to replicate the log on them.
    3. If a majority (quorum) of the followers accept the AppendEntries request, the leader commits the operation and hands the instruction to the state machine for processing.
    4. Once the state machine has finished processing, the result is returned to the client.

Instruction order is guaranteed by the log index (instruction ID) and the term number. Under normal conditions, the leader and follower state machines execute the instructions in the same order, so they produce the same results and end up in a consistent state.
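The four steps above can be sketched in a few lines of Python. This is a minimal single-process sketch, not a real Raft implementation: the names `Entry`, `Node`, `handle_request`, and the `reachable` flag are hypothetical, there is no networking or persistence, and followers apply the entry immediately instead of learning the commit index from a later AppendEntries call.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    term: int       # term in which the leader received the instruction
    command: str    # the operation instruction sent by the client


@dataclass
class Node:
    log: list = field(default_factory=list)      # replicated operation log
    applied: list = field(default_factory=list)  # commands run by the toy state machine
    reachable: bool = True                       # hypothetical flag simulating a crash or partition

    def append_entries(self, entries):
        """Follower side: accept the leader's entries if the node is reachable."""
        if not self.reachable:
            return False
        self.log.extend(entries)
        return True

    def apply(self, entry):
        """Feed a committed entry to the toy state machine."""
        self.applied.append(entry.command)


def handle_request(leader, followers, term, command):
    """Leader side: append, replicate, commit on quorum, apply, reply."""
    entry = Entry(term, command)
    leader.log.append(entry)                         # step 2: append to the leader's log
    acked = [f for f in followers if f.append_entries([entry])]
    cluster_size = len(followers) + 1
    if len(acked) + 1 <= cluster_size // 2:          # the leader counts toward the quorum
        return None                                  # no majority, so nothing is committed
    leader.apply(entry)                              # step 3: commit and apply
    for f in acked:
        f.apply(entry)                               # simplification, see the note above
    return f"ok: {command}"                          # step 4: reply to the client
```

With one leader and two followers, the request still commits when one follower is unreachable, because the leader plus one follower form a majority of three.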

Crashes, network partitions, and similar failures can trigger a leader re-election (each election that produces a new leader also starts a new term) and can leave the leader and follower logs in inconsistent states. The Raft leader maintains a nextIndex value for each follower, the index of the next log entry to send to that follower, alongside the index of the next entry it will append to its own log. Writing these as L.nextIndex and F.nextIndex, the two values differ when the follower's operation log is inconsistent with the leader's. In that case the leader locates the first point at which the follower's log diverges and overwrites the follower's log from there with its own entries, until L.nextIndex and F.nextIndex are equal.
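The log repair described above can be sketched as follows. This is a simplified illustration under stated assumptions: logs are plain Python lists of `(term, command)` tuples with zero-based indexes, whole entries are compared instead of only the term of the preceding entry, and the function name `repair_follower` is hypothetical.

```python
def repair_follower(leader_log, follower_log):
    """Return the repaired follower log and the index where overwriting began."""
    # The leader starts optimistically, just past its own last entry.
    next_index = len(leader_log)
    # Back up while the entry just before next_index is missing on the
    # follower or does not match the leader's entry at that position.
    while next_index > 0 and (
        next_index > len(follower_log)
        or follower_log[next_index - 1] != leader_log[next_index - 1]
    ):
        next_index -= 1
    # Keep the matching prefix and overwrite everything after it
    # with the leader's entries.
    repaired = follower_log[:next_index] + leader_log[next_index:]
    return repaired, next_index


leader_log = [(1, "a"), (1, "b"), (3, "c")]
follower_log = [(1, "a"), (2, "x"), (2, "y"), (2, "z")]
repaired, start = repair_follower(leader_log, follower_log)
print(repaired == leader_log, start)
```

In this example the logs agree only on the first entry, so the follower's three conflicting entries are discarded and replaced by the leader's entries from index 1 onward.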

In Paxos, a leader exists only to make reaching a decision more efficient; whether there is a leader, and how many there are, does not affect the consistency of the decision. Raft requires a single leader and reduces the consistency problem to keeping the replicated log consistent, which is how it achieves its goal of being easier to understand and easier to implement than Paxos.

Zab

The full name of Zab [5][6] is the ZooKeeper Atomic Broadcast protocol, the consistency protocol used inside ZooKeeper. Compared to Paxos, Zab's most notable feature is that it guarantees strong consistency (also called linearizable consistency).

Like Raft, Zab requires a single leader to take part in decisions. Zab can be broken down into three phases: discovery, sync, and broadcast:

    • Discovery: An election produces a prospective leader (PL). The PL collects each follower's epoch (CEPOCH) and, based on the followers' feedback, generates a new epoch (NEWEPOCH). Each time a new leader is elected a new epoch is created, similar to a term in Raft.
    • Sync: The PL first fills in any state it is missing relative to the follower majority, then each follower fills in any state it is missing relative to the PL. Once the PL and the followers have synchronized their state, the PL becomes the established leader.
    • Broadcast: The leader handles client write operations and broadcasts each state change to the followers. After a majority of followers acknowledge it, the leader initiates applying the state change (deliver/commit).

The leader and the followers monitor each other's health through heartbeats. Under normal conditions Zab stays in the broadcast phase. When the leader crashes, the network is partitioned, or a similar failure occurs, Zab returns to the discovery phase.
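The phase transitions described above can be sketched as a small state machine. This is an illustrative sketch only: `Phase`, `next_phase`, and `new_epoch` are hypothetical names, and choosing the new epoch as one more than the largest epoch reported by the followers is a simplifying assumption that satisfies the requirement that the new epoch exceed every epoch the prospective leader received.

```python
from enum import Enum


class Phase(Enum):
    DISCOVERY = "discovery"
    SYNC = "sync"
    BROADCAST = "broadcast"


def next_phase(phase, leader_healthy):
    """Advance one step; a failed leader sends the node back to discovery."""
    if not leader_healthy:
        return Phase.DISCOVERY
    if phase is Phase.DISCOVERY:
        return Phase.SYNC
    # Sync moves on to broadcast, and broadcast stays in broadcast.
    return Phase.BROADCAST


def new_epoch(follower_epochs):
    """Pick an epoch larger than every epoch reported by the followers."""
    return max(follower_epochs) + 1


phase = Phase.DISCOVERY
for healthy in (True, True, True, False):
    phase = next_phase(phase, healthy)
    print(phase.value)
```

The loop walks through discovery, sync, and broadcast, stays in broadcast while the leader is healthy, and falls back to discovery when the heartbeat reports a failure.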

Now that we understand the basic principles of Zab, let's look at how it guarantees strong consistency. Zab achieves strong consistency by constraining the order of transactions: the transaction that is broadcast first is committed first (FIFO). Zab calls this primary order (hereinafter PO). The core of implementing PO is the zxid.

Each transaction in Zab corresponds to a zxid, which consists of two parts: <e, c>. e is the epoch, generated when the leader is elected; c is the sequence number of the transaction within that epoch, incremented in order. Suppose the zxids of two transactions are z and z'. When z.e < z'.e, or z.e = z'.e && z.c < z'.c, we define z as occurring before z' (z < z').
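The ordering rule above is a lexicographic comparison on the pair <e, c>, which the following sketch makes explicit. The names `Zxid`, `happens_before`, and `to_int` are hypothetical. The packing in `to_int` follows ZooKeeper's documented layout of a 64-bit zxid with the epoch in the high 32 bits and the counter in the low 32 bits; treat it as an illustration, not as ZooKeeper's actual code.

```python
from typing import NamedTuple


class Zxid(NamedTuple):
    e: int  # epoch, created when a leader is elected
    c: int  # counter of the transaction within that epoch

    def to_int(self):
        """Pack into one integer: epoch in the high 32 bits, counter in the low 32."""
        return (self.e << 32) | self.c


def happens_before(z, z2):
    """True when z < z2 under the rule: z.e < z2.e, or same epoch and z.c < z2.c."""
    return z.e < z2.e or (z.e == z2.e and z.c < z2.c)


print(happens_before(Zxid(1, 5), Zxid(2, 0)))   # earlier epoch wins regardless of counter
print(happens_before(Zxid(2, 1), Zxid(2, 2)))   # same epoch, smaller counter
print(happens_before(Zxid(2, 2), Zxid(2, 2)))   # a zxid does not precede itself
```

Because the epoch occupies the high bits, comparing the packed integers gives the same order as comparing the pairs, provided the counter fits in 32 bits.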

To implement PO, Zab places the following constraints on followers and leaders:

    1. Given transactions z and z', if the leader broadcasts z first, then followers must commit the transaction corresponding to z first.
    2. Given transactions z and z', where z is broadcast by leader p, z' is broadcast by leader q, and leader p precedes leader q, followers must commit the transaction corresponding to z first.
    3. Given transactions z and z', where z is broadcast by leader p, z' is broadcast by leader q, and leader p precedes leader q, if a follower has committed z, then q must have committed z before it may broadcast z'.

Constraints 1 and 2 guarantee FIFO ordering of transactions; constraint 3 guarantees that the leader holds all committed transactions.
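The three constraints can be expressed as two simple checks. This is a hedged sketch and not part of Zab itself: zxids are modeled as `(epoch, counter)` tuples, and the function names `respects_primary_order` and `may_broadcast` are hypothetical.

```python
def respects_primary_order(committed):
    """Constraints 1 and 2: a follower's commits must be in strictly increasing zxid order."""
    return all(a < b for a, b in zip(committed, committed[1:]))


def may_broadcast(leader_committed, follower_committed):
    """Constraint 3: a new leader may broadcast only if it has committed
    every transaction that any follower has already committed."""
    return set(follower_committed) <= set(leader_committed)


# A follower that commits across two epochs in zxid order.
print(respects_primary_order([(1, 1), (1, 2), (2, 1)]))
# A follower that commits an epoch-2 transaction before an epoch-1 transaction.
print(respects_primary_order([(1, 1), (2, 1), (1, 2)]))
# A new leader that is missing a transaction some follower already committed.
print(may_broadcast([(1, 1)], [(1, 1), (1, 2)]))
```

Python compares tuples lexicographically, so `(1, 2) < (2, 1)` matches the zxid ordering in which the epoch is compared first and the counter second.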

Compared to Paxos, Zab constrains the order of transactions, which makes it suitable for scenarios that require strong consistency.

Comparison of Paxos, Raft and Zab

Besides Paxos, Raft, and Zab, Viewstamped Replication (VR) [7][8] is another frequently discussed consistency protocol. These protocols share a lot of common ground (leader, quorum, state machine, and so on), so we cannot help asking: where do Paxos, Raft, Zab, and VR actually differ, or are they essentially the same thing? [9]

Paxos, Raft, Zab, and VR are all protocols for solving the consistency problem. Paxos leans toward theory, while Raft, Zab, and VR lean toward practice, and the protocols differ in the strength of the consistency they guarantee. The comparison in [10] helps clarify the similarities and differences among these protocols.

Compared to Raft, Zab, and VR, Paxos is purer and closer to the root of the consistency problem. Although Paxos leans toward theory, that does not mean it cannot be applied in engineering. Engineering practice based on Paxos needs to consider the specific requirements of the scenario (such as the degree of consistency to be achieved) and then wrap the original semantics of Paxos accordingly.

Summary

This article introduced the core ideas of the distributed consistency protocols Raft and Zab and analyzed the similarities and differences between Raft, Zab, and Paxos. When building a distributed system, no protocol among Raft, Zab, VR, and Paxos is absolutely better or worse; what matters is which one fits the specific requirements and scenario.

[1] Paxos Made Live - An Engineering Perspective, Tushar Chandra, Robert Griesemer and Joshua Redstone, 2007

[2] In Search of an Understandable Consensus Algorithm, Diego Ongaro and John Ousterhout, 2013

[3] In Search of an Understandable Consensus Algorithm (Extended Version), Diego Ongaro and John Ousterhout, 2013

[4] Implementing Fault-Tolerant Services Using the State Machine Approach: A Tutorial, Fred B. Schneider, 1990

[5] Zab: High-performance broadcast for primary-backup systems, Flavio P. Junqueira, Benjamin C. Reed and Marco Serafini, 2011

[6] ZooKeeper's atomic broadcast protocol: Theory and practice, André Medeiros, 2012

[7] Viewstamped Replication: A New Primary Copy Method to Support Highly-Available Distributed Systems, Brian M. Oki and Barbara H. Liskov, 1988

[8] Viewstamped Replication Revisited, Barbara Liskov and James Cowling, 2012

[9] Can't we all just agree? The Morning Paper, 2015

[10] Vive La Différence: Paxos vs. Viewstamped Replication vs. Zab, Robbert van Renesse, Nicolas Schiper and Fred B. Schneider, 2014
