[Reprint] The consistency problem and the Raft consistency algorithm

Last Update:2015-07-15 Source: Internet

Author: User

Original: http://daizuozhuo.github.io/consensus-algorithm/

The Raft protocol is indeed much easier to understand than the Paxos protocol.

The consistency problem

A consistency algorithm exists to solve the consistency problem, so what is that problem? In a distributed system, the consistency problem (consensus problem) is this: given a set of servers and a set of operations, we need a protocol that makes the servers agree on the final result. In more detail, when one server receives a sequence of commands from a client, it must communicate with the other servers so that every server receives the same commands in the same order. All servers then produce the same results, and the cluster behaves like a single machine.

A consistency algorithm used in production needs the following properties:

Safety: it never returns a wrong result, under any conditions.
Availability: it keeps working as long as a majority of the machines are healthy. For example, a cluster of five machines tolerates the failure of up to two.
No dependence on timing for consistency: the system is asynchronous, so faulty clocks or delayed messages must not lead to inconsistent results.
Performance set by the majority: in the common case, running time is determined by a majority of the machines, so a few slow machines do not drag down overall efficiency.
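
The availability property comes down to majority arithmetic. A minimal sketch in Go of that arithmetic follows; the function names `quorum` and `tolerated` are made up for illustration and do not come from the original article or from any Raft library.

```go
package main

import "fmt"

// quorum returns the smallest number of servers that forms a majority
// of a cluster with n members.
func quorum(n int) int {
	return n/2 + 1
}

// tolerated returns how many servers may fail while a majority
// of the cluster is still available.
func tolerated(n int) int {
	return n - quorum(n)
}

func main() {
	for _, n := range []int{3, 4, 5} {
		fmt.Printf("cluster=%d quorum=%d tolerated failures=%d\n",
			n, quorum(n), tolerated(n))
	}
}
```

Note that a four-machine cluster tolerates only one failure, the same as a three-machine cluster, which is why odd cluster sizes are the usual choice.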

Why solve the consistency problem?

We can say that the reliability of a distributed system reaches 99.99...%, but never that it reaches 100%. Why? Because the consistency problem cannot be solved completely. The following four problems in distributed systems are all related to it:

Reliable multicast
Membership protocol (failure detector): managing the members of a cluster
Leader election
Mutual exclusion: mutexes, such as exclusive access to and allocation of resources

Raft consistency algorithm

I previously introduced some textbook election algorithms. They are consistency algorithms too: in the end, all servers agree on who the leader is. In practice, the two mainstream consistency algorithms today are Paxos and Raft. ZooKeeper chose the Paxos family (its protocol, ZAB, is Paxos-like), and etcd uses Raft. As a Go enthusiast, I'll take a look at Raft.

Raft was proposed because Paxos is too hard to understand and too hard to implement. Its goal is to be as reliable as Paxos while being as easy to understand as possible. Even so, the Raft paper, In Search of an Understandable Consensus Algorithm, still runs to 18 pages, and I'll try to make this explanation easier to follow than that.

Raft divides the consistency problem into three smaller problems:

Leader election
Log replication
Safety

Basic concepts

Each server is in one of three states: leader, follower, or candidate.

Follower: sends no requests of its own and only replies to requests from the leader and from candidates.
Leader: handles requests from clients.
Candidate: a candidate for leader.

Raft divides time into terms. Each term starts with an election, and each term has at most one leader; a term may also end with no leader at all.
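
The three states and the term counter can be sketched as a small Go type. The names `State`, `Server`, `StartElection`, and `ObserveTerm` are assumptions made for this example. The step-down rule in `ObserveTerm` (a server that sees a higher term adopts it and becomes a follower) comes from the Raft paper and is not spelled out in this article.

```go
package main

import "fmt"

// State is the role a server currently plays.
type State int

const (
	Follower State = iota
	Candidate
	Leader
)

func (s State) String() string {
	switch s {
	case Follower:
		return "follower"
	case Candidate:
		return "candidate"
	case Leader:
		return "leader"
	default:
		return "unknown"
	}
}

// Server holds only the bookkeeping needed for this sketch.
type Server struct {
	State       State
	CurrentTerm int
}

// StartElection opens a new term and turns the server into a candidate.
func (s *Server) StartElection() {
	s.CurrentTerm++
	s.State = Candidate
}

// ObserveTerm adopts a higher term seen in a message and steps down to
// follower. It reports whether the server's term changed.
func (s *Server) ObserveTerm(term int) bool {
	if term > s.CurrentTerm {
		s.CurrentTerm = term
		s.State = Follower
		return true
	}
	return false
}

func main() {
	s := &Server{State: Follower, CurrentTerm: 1}
	s.StartElection()
	fmt.Println(s.State, s.CurrentTerm)
	s.ObserveTerm(4)
	fmt.Println(s.State, s.CurrentTerm)
}
```

Because terms only ever increase, a server can treat the term number as a logical clock and ignore any message that carries an older one.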

RPC implementation

The algorithm needs two types of RPC. The RequestVote RPC is initiated by candidates during an election. A server that receives it grants its vote only if the candidate's term and log are at least as up to date as its own. A candidate that receives votes from a majority of the servers becomes leader.
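
A hedged sketch of how a server might decide whether to grant a vote is shown below. The type and field names (`VoteRequest`, `Voter`, `HandleRequestVote`) are invented for this example. The "at least as up to date" comparison follows the rule in the Raft paper: compare the term of the last log entry first, then the log length. The one-vote-per-term check is also taken from the paper rather than from this article.

```go
package main

import "fmt"

// VoteRequest carries what a candidate sends in a RequestVote RPC.
type VoteRequest struct {
	Term         int
	CandidateID  int
	LastLogIndex int
	LastLogTerm  int
}

// Voter is the receiving server's view of its own state.
type Voter struct {
	CurrentTerm  int
	VotedFor     int // -1 means no vote cast in CurrentTerm
	LastLogIndex int
	LastLogTerm  int
}

// HandleRequestVote reports whether the vote is granted.
func (v *Voter) HandleRequestVote(req VoteRequest) bool {
	if req.Term < v.CurrentTerm {
		return false // stale candidate
	}
	if req.Term > v.CurrentTerm {
		// A newer term resets the vote.
		v.CurrentTerm = req.Term
		v.VotedFor = -1
	}
	if v.VotedFor != -1 && v.VotedFor != req.CandidateID {
		return false // already voted for someone else in this term
	}
	upToDate := req.LastLogTerm > v.LastLogTerm ||
		(req.LastLogTerm == v.LastLogTerm && req.LastLogIndex >= v.LastLogIndex)
	if !upToDate {
		return false // candidate's log is behind ours
	}
	v.VotedFor = req.CandidateID
	return true
}

func main() {
	v := &Voter{CurrentTerm: 2, VotedFor: -1, LastLogIndex: 5, LastLogTerm: 2}
	fmt.Println(v.HandleRequestVote(VoteRequest{Term: 3, CandidateID: 1, LastLogIndex: 4, LastLogTerm: 2}))
	fmt.Println(v.HandleRequestVote(VoteRequest{Term: 3, CandidateID: 2, LastLogIndex: 5, LastLogTerm: 2}))
}
```

In the usage above, the first candidate is refused because its log is shorter than the voter's, while the second candidate's log is equally long and so earns the vote.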

The AppendEntries RPC is initiated by the leader to distribute log entries, forcing each follower's log to match its own.

Leader election

If a follower receives nothing from the leader within the election timeout, it starts a new term, becomes a candidate, votes for itself, and sends RequestVote RPCs to the other servers. It stays in this state until one of the following three things happens:

It wins the election.
Another server wins the election.
A period of time passes with no election result.

Why can the third case occur? If every server times out at the same moment and votes for itself, no server can win a majority. The servers then move to the next term and hold another election. To make this unlikely, each server's election timeout is set randomly to a different value, so in the next round one server times out first and starts its election ahead of the others.
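
The randomized timeout can be sketched as follows. This is only an illustration: the helper `newTimeoutSource` and the two constants are assumptions for this example, and the 150 to 300 ms range is the example range mentioned in the Raft paper, not a value given in this article.

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

// Bounds for the election timeout (assumed values for this sketch).
const (
	minElectionTimeout = 150 * time.Millisecond
	maxElectionTimeout = 300 * time.Millisecond
)

// newTimeoutSource returns a function that yields a fresh random election
// timeout in [minElectionTimeout, maxElectionTimeout) on every call.
func newTimeoutSource(seed int64) func() time.Duration {
	r := rand.New(rand.NewSource(seed))
	span := int64(maxElectionTimeout - minElectionTimeout)
	return func() time.Duration {
		return minElectionTimeout + time.Duration(r.Int63n(span))
	}
}

func main() {
	// Each server seeds its own source, so timeouts rarely coincide.
	for id := int64(1); id <= 3; id++ {
		next := newTimeoutSource(time.Now().UnixNano() + id)
		fmt.Printf("server %d waits %v before starting an election\n", id, next())
	}
}
```

A server would draw a new value each time it resets its election timer, so even if two servers collide once, they are unlikely to collide again in the following round.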

Log replication

Once a leader has been chosen, it can start distributing log entries.

Each log entry has a log index and a term number. Once a majority of the servers have copied an entry, the entry is said to be committed and can be executed. The leader remembers the highest log index that has been committed and includes it in subsequent AppendEntries RPCs, so that followers learn which entries are committed. Its role is comparable to the sequence number of a TCP segment.
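
One way a leader could work out the highest committed index is to look at how far each server's log has been replicated and take the value that a majority has reached. The sketch below assumes a hypothetical slice `matchIndex` with one entry per server, the leader included. It is a simplification: the Raft paper additionally requires that the entry at that index belong to the leader's current term before the leader counts it as committed.

```go
package main

import (
	"fmt"
	"sort"
)

// commitIndex returns the highest log index that is stored on a majority
// of servers. matchIndex holds, for every server including the leader,
// the highest index known to be replicated on that server.
func commitIndex(matchIndex []int) int {
	if len(matchIndex) == 0 {
		return 0
	}
	sorted := append([]int(nil), matchIndex...) // copy; leave the input untouched
	sort.Sort(sort.Reverse(sort.IntSlice(sorted)))
	// In descending order, position len/2 is the (majority)-th largest value,
	// so at least a majority of servers have reached it.
	return sorted[len(sorted)/2]
}

func main() {
	// Five servers: three of them hold entries up to index 5 or beyond.
	fmt.Println(commitIndex([]int{7, 5, 7, 3, 4}))
}
```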

When a new leader is elected, its log and the followers' logs may be inconsistent, so it forces every follower's log to match its own. The leader first finds the highest index at which its log and the follower's log agree, then overwrites everything in the follower's log after that point.
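
The repair step can be sketched on two in-memory logs. The names `Entry`, `matchLen`, and `repair` are invented for this example. A real leader does this over several AppendEntries round trips, not with direct access to the follower's log. The backward scan relies on the Log Matching property from the Raft paper: if two logs hold an entry with the same index and term, they are identical up to that entry.

```go
package main

import "fmt"

// Entry is one log record: the term in which it was created and a command.
type Entry struct {
	Term    int
	Command string
}

// matchLen returns how many leading entries the follower shares with the
// leader. It scans backwards from the end of the shorter log until the
// terms at the same position agree.
func matchLen(leader, follower []Entry) int {
	n := len(follower)
	if len(leader) < n {
		n = len(leader)
	}
	for n > 0 && leader[n-1].Term != follower[n-1].Term {
		n--
	}
	return n
}

// repair returns the follower's log after the leader has overwritten
// everything past the last matching entry. The inputs are not modified.
func repair(leader, follower []Entry) []Entry {
	n := matchLen(leader, follower)
	fixed := append([]Entry(nil), follower[:n]...)
	return append(fixed, leader[n:]...)
}

func main() {
	leader := []Entry{{1, "a"}, {1, "b"}, {2, "c"}, {3, "d"}}
	follower := []Entry{{1, "a"}, {1, "b"}, {2, "c"}, {2, "x"}, {2, "y"}}
	fmt.Println("matching prefix length:", matchLen(leader, follower))
	fmt.Println("repaired follower log:", repair(leader, follower))
}
```

In the usage above, the follower's two uncommitted entries from term 2 are discarded and replaced by the leader's entry from term 3.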

Safety

So far, however, nothing guarantees safety. Suppose a follower goes offline while the leader is committing log entries, and that follower is later elected leader. It would overwrite entries that the other followers have already committed. Since those entries have already been executed, different machines would end up executing different commands. One extra restriction on elections prevents this, namely:

Leader completeness property: for any term, the leader must contain all log entries that were committed in previous terms.

This is the complete Raft algorithm.

Note: Images are from the paper In Search of an Understandable Consensus Algorithm.

If you find this useful, please give it a star.

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
