Consistency Algorithm Quest (Extended Version) 7

Source: Internet
Author: User

5.5 Follower and candidate crashes

Until this point we have focused on leader failures. Follower and candidate crashes are much simpler to handle than leader crashes, and both are handled in the same way. If a follower or candidate crashes, then future RequestVote and AppendEntries RPCs sent to it will fail. Raft handles these failures by retrying indefinitely; if the crashed server restarts, then the RPC will complete successfully. If a server crashes after completing an RPC but before responding, then it will receive the same RPC again after it restarts. Raft RPCs are idempotent, so this causes no harm. For example, if a follower receives an AppendEntries request that includes log entries already present in its log, it ignores those entries in the new request.
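The idempotence described above can be illustrated with a minimal sketch. The `Entry` and `Follower` names and the method signature are assumptions made for this example, not part of Raft's specification, and term checks on the request itself are omitted. The sketch only shows that delivering the same AppendEntries request twice leaves the log unchanged.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Entry:
    term: int
    command: str


@dataclass
class Follower:
    # log[0] holds the entry at Raft index 1 (Raft log indices are 1-based).
    log: list = field(default_factory=list)

    def append_entries(self, prev_index: int, prev_term: int, entries: list) -> bool:
        """Apply an AppendEntries request; repeating the same call is harmless."""
        # Consistency check: the entry just before the new ones must match.
        if prev_index > len(self.log):
            return False
        if prev_index > 0 and self.log[prev_index - 1].term != prev_term:
            return False
        for offset, entry in enumerate(entries):
            index = prev_index + 1 + offset  # 1-based Raft index of this entry
            if index <= len(self.log):
                if self.log[index - 1].term == entry.term:
                    continue  # already present: ignore the duplicate
                # Conflicting entry: drop it and everything after it.
                del self.log[index - 1:]
            self.log.append(entry)
        return True


follower = Follower()
request = [Entry(1, "x=1"), Entry(1, "y=2")]
follower.append_entries(0, 0, request)
before = list(follower.log)
follower.append_entries(0, 0, request)  # the same RPC, delivered again after a restart
unchanged = follower.log == before
```

Because matching entries are skipped rather than appended again, a retried request with fewer entries also leaves the longer log intact.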

5.5 Follower and candidate crashes

So far we have focused on leader crashes. Follower and candidate crashes are much simpler to deal with than leader crashes, and both are handled in the same way. If a follower or candidate crashes, the RequestVote and AppendEntries RPCs sent to it will fail. Raft handles these failures by retrying indefinitely; if the crashed server restarts, the RPC will complete successfully. If a server crashes after completing an RPC but before responding, it will receive the same RPC again after it restarts. Raft RPCs are idempotent (an idempotent operation has the same effect whether it is executed once or many times), so this causes no harm. For example, if a follower receives an AppendEntries request containing log entries that are already in its log, it ignores those entries in the new request.
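The retry behavior can be sketched as a loop that resends the same request until the target answers. `FlakyServer`, `ServerDown`, and `call_with_retry` are hypothetical names invented for this illustration; a real implementation would send over a network transport and pick a retry interval suited to the deployment.

```python
import time


class ServerDown(Exception):
    """Raised by the hypothetical transport when the target does not answer."""


class FlakyServer:
    """Stand-in for a follower that is crashed for its first `down_for` calls."""

    def __init__(self, down_for: int):
        self.down_for = down_for
        self.calls = 0

    def handle(self, request: str) -> str:
        self.calls += 1
        if self.calls <= self.down_for:
            raise ServerDown()
        return f"ack:{request}"


def call_with_retry(server, request, retry_interval=0.0):
    """Resend the same request until a response arrives; return (reply, attempts)."""
    attempts = 0
    while True:
        attempts += 1
        try:
            return server.handle(request), attempts
        except ServerDown:
            time.sleep(retry_interval)  # wait, then retry indefinitely


reply, attempts = call_with_retry(FlakyServer(down_for=3), "AppendEntries")
```

The loop has no attempt limit on purpose: the sender simply keeps trying, and correctness rests on the receiver treating repeated requests as harmless.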

5.6 Timing and Availability

One of our requirements for Raft is that safety must not depend on timing: the system must not produce incorrect results just because some event happens more quickly or slowly than expected. However, availability (the ability of the system to respond to clients in a timely manner) must inevitably depend on timing. For example, if message exchanges take longer than the typical time between server crashes, candidates will not stay up long enough to win an election; without a steady leader, Raft cannot make progress.

Leader election is the aspect of Raft where timing is most critical. Raft will be able to elect and maintain a steady leader as long as the system satisfies the following timing requirement:

broadcastTime ≪ electionTimeout ≪ MTBF

In this inequality broadcastTime is the average time it takes a server to send RPCs in parallel to every server in the cluster and receive their responses; electionTimeout is the election timeout described in Section 5.2; and MTBF is the average time between failures for a single server. The broadcast time should be an order of magnitude less than the election timeout so that leaders can reliably send the heartbeat messages required to keep followers from starting elections; given the randomized approach used for election timeouts, this inequality also makes split votes unlikely. The election timeout should be a few orders of magnitude less than MTBF so that the system makes steady progress. When the leader crashes, the system will be unavailable for roughly the election timeout; we would like this to represent only a small fraction of overall time.
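As a rough illustration, the two "much less than" relations can be turned into a simple check. The function name and the default factors are assumptions for this sketch: 10 stands in for "an order of magnitude" and 1000 for "a few orders of magnitude"; the text does not fix exact thresholds.

```python
def satisfies_timing(broadcast_ms, election_timeout_ms, mtbf_ms,
                     broadcast_factor=10.0, mtbf_factor=1000.0):
    """Return True if broadcastTime << electionTimeout << MTBF under the given factors."""
    heartbeats_fit = broadcast_ms * broadcast_factor <= election_timeout_ms
    steady_progress = election_timeout_ms * mtbf_factor <= mtbf_ms
    return heartbeats_fit and steady_progress


MS_PER_DAY = 24 * 60 * 60 * 1000
mtbf_ms = 30 * MS_PER_DAY  # assumed MTBF of about one month

ok = satisfies_timing(broadcast_ms=10, election_timeout_ms=150, mtbf_ms=mtbf_ms)
too_tight = satisfies_timing(broadcast_ms=10, election_timeout_ms=50, mtbf_ms=mtbf_ms)
```

With a 10 ms broadcast time, a 150 ms election timeout passes the check, while a 50 ms timeout leaves too little room for heartbeats to arrive before followers time out.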

The broadcast time and MTBF are properties of the underlying system, while the election timeout is something we must choose. Raft's RPCs typically require the recipient to persist information to stable storage, so the broadcast time may range from 0.5 ms to 20 ms, depending on storage technology. As a result, the election timeout is likely to be somewhere between 10 ms and 500 ms. Typical server MTBFs are several months or more, which easily satisfies the timing requirement.
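A back-of-the-envelope calculation shows why these numbers easily satisfy the requirement. The sketch assumes the leader crashes about once per MTBF and that each crash costs roughly one election timeout of unavailability. The 90-day MTBF is an assumed stand-in for "several months", and the helper name is hypothetical.

```python
def unavailable_fraction(election_timeout_s, mtbf_s):
    """Approximate share of time lost to elections: one timeout per leader failure."""
    return election_timeout_s / mtbf_s


SECONDS_PER_DAY = 24 * 60 * 60
mtbf_s = 90 * SECONDS_PER_DAY  # assumed: about three months between failures

# Worst case from the range above: a 500 ms election timeout.
fraction = unavailable_fraction(0.5, mtbf_s)
seconds_lost_per_year = fraction * 365 * SECONDS_PER_DAY
```

Under these assumptions, even the largest election timeout in the range costs only about two seconds of unavailability per year, a fraction on the order of 10^-8.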

5.6 Timing and availability

One of our requirements for Raft is that safety must not depend on timing: the system must not produce incorrect results just because some event happens faster or more slowly than expected. However, availability (the ability of the system to respond to clients in a timely manner) inevitably depends on timing. For example, if message exchanges take longer than the typical time between server crashes, candidates will not stay up long enough to win an election, and without a stable leader Raft cannot make progress.

Leader election is the aspect of Raft where timing matters most. Raft can elect and maintain a stable leader as long as the following timing requirement is met:

broadcastTime ≪ electionTimeout ≪ MTBF

In this inequality, broadcastTime is the average time it takes a server to send RPCs in parallel to every other server in the cluster and receive their responses, electionTimeout is the election timeout described in Section 5.2, and MTBF is the mean time between failures for a single server. The broadcast time should be an order of magnitude smaller than the election timeout so that the leader can reliably send the heartbeat messages that keep followers from starting elections; because election timeouts are randomized, this inequality also makes split votes unlikely. The election timeout should be a few orders of magnitude smaller than MTBF so that the system makes steady progress. When the leader crashes, the system is unavailable for roughly the election timeout, and we would like this to be only a small fraction of overall time.
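The randomized election timeouts mentioned above can be sketched as follows. The function name and the 150 ms to 300 ms range are illustrative assumptions; any range that keeps the timeout well above the broadcast time would serve the same purpose.

```python
import random


def election_timeout_ms(rng, low_ms=150.0, high_ms=300.0):
    """Draw one election timeout uniformly from [low_ms, high_ms]."""
    return rng.uniform(low_ms, high_ms)


rng = random.Random(7)  # seeded only to make the sketch repeatable
timeouts = [election_timeout_ms(rng) for _ in range(5)]

# The server with the smallest timeout becomes a candidate first.
first_to_time_out = min(range(5), key=lambda i: timeouts[i])
```

When the broadcast time is much smaller than the gaps between the drawn timeouts, the first server to time out can usually request and collect votes before any other server times out, which is why split votes become unlikely.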

Broadcast time and MTBF are properties of the underlying system, whereas the election timeout is something we must choose. Raft RPCs typically require the recipient to persist information to stable storage, so the broadcast time may range from 0.5 ms to 20 ms, depending on the storage technology. As a result, the election timeout is likely to fall between 10 ms and 500 ms. The MTBF of a typical server is several months or more, which easily satisfies the timing requirement.


