ZooKeeper Data Consistency

Source: Internet
Author: User
Tags ack commit zookeeper

ZooKeeper guarantees the consistency of the data it stores: an application obtains consistent data regardless of which server it reads from. ZooKeeper uses the ZooKeeper Atomic Broadcast protocol (ZAB) as the core of its consistent replication, and achieves data consistency by ordering requests on the server side.

Data consistency guarantees of ZooKeeper

ZooKeeper is high-performance and scalable, and provides the following data consistency guarantees to applications:
1) Sequential consistency
Updates from a client are applied in the order in which the client sent them.
2) Atomicity
An update either succeeds or fails; there are no partial results.
3) Single system image
A client sees the same view of the service regardless of which server it connects to.
4) Reliability
Once an update has been applied, it persists until it is overwritten by a later update.
5) Timeliness
Within a bounded period of time, any change to the system becomes visible to clients, or is delivered to clients that have set a watch.

ZAB: Atomic Broadcast Protocol

A ZooKeeper cluster contains leader, follower, and observer nodes. The leader wraps each write operation in a transaction and executes it; followers vote during the execution of a write; an observer receives only the final INFORM message and does not take part in voting on updates.
When a follower receives a write request, it forwards the request to the leader, which encapsulates it as a transaction and processes it. Each transaction carries a zxid, which consists of two parts: an epoch and a counter. The epoch identifies the current leader: each time a new leader is elected the epoch is incremented by 1, so different epochs correspond to different leaders. The counter counts transactions (see "Zookeeper Cluster Management"). The transaction contains all the information required to perform the update.
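As a minimal sketch of the zxid layout described above, the following code packs an epoch and a counter into one integer. It assumes the epoch occupies the high 32 bits and the counter the low 32 bits of a 64-bit value, and that the counter restarts at 0 in a new epoch; neither detail is stated in the text. All helper names are hypothetical.

```python
# Illustrative sketch: helper names are hypothetical, and the 32/32 bit split
# and the counter reset are assumptions not stated in the text above.
COUNTER_BITS = 32
COUNTER_MASK = (1 << COUNTER_BITS) - 1


def make_zxid(epoch: int, counter: int) -> int:
    """Pack an epoch and a counter into a single zxid."""
    return (epoch << COUNTER_BITS) | (counter & COUNTER_MASK)


def epoch_of(zxid: int) -> int:
    """Return the leader epoch stored in the high bits."""
    return zxid >> COUNTER_BITS


def counter_of(zxid: int) -> int:
    """Return the per-epoch transaction counter stored in the low bits."""
    return zxid & COUNTER_MASK


def next_zxid(zxid: int) -> int:
    """Next transaction under the same leader: only the counter advances."""
    return make_zxid(epoch_of(zxid), counter_of(zxid) + 1)


def first_zxid_of_new_epoch(zxid: int) -> int:
    """A newly elected leader increments the epoch (assumed counter reset)."""
    return make_zxid(epoch_of(zxid) + 1, 0)
```

With this layout, comparing two zxids as plain integers orders transactions first by epoch and then by counter, so any transaction from a newer leader compares greater than every transaction from an older one.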
The ZAB protocol exchanges messages as follows:
1) The leader sends a proposal message p to all followers.
2) Each follower persists the proposal and replies to the leader with an ACK, indicating that it has accepted the proposal.
3) When the leader has received ACKs from a majority (counting the leader itself), it sends a COMMIT message telling all followers to commit.
Before acknowledging a proposal, a follower must check that the proposal really comes from the leader it follows, by verifying that the epoch in the proposal's zxid matches its current epoch. It must also check that it acknowledges and commits proposals in the same order in which the leader broadcast them.
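The exchange above can be sketched as a single in-process round. This is a simplified illustration, not ZooKeeper's implementation: the `Follower` class and `broadcast` function are hypothetical, networking is replaced by direct method calls, and a list append stands in for the disk write.

```python
from dataclasses import dataclass, field


@dataclass
class Follower:
    """A follower that trusts one epoch and logs proposals before acking."""
    epoch: int
    log: list = field(default_factory=list)        # persisted proposals
    committed: list = field(default_factory=list)  # committed transactions

    def on_proposal(self, epoch: int, counter: int, payload: str) -> bool:
        # Reject proposals that do not come from the leader being followed.
        if epoch != self.epoch:
            return False
        # Reject proposals that arrive out of the leader's broadcast order.
        if self.log and counter != self.log[-1][1] + 1:
            return False
        self.log.append((epoch, counter, payload))  # stand-in for a disk write
        return True                                 # the ACK

    def on_commit(self, epoch: int, counter: int) -> None:
        # Commit the matching proposal if this follower has logged it.
        for entry in self.log:
            if entry[:2] == (epoch, counter):
                self.committed.append(entry)


def broadcast(epoch, counter, payload, followers):
    """Run one proposal/ACK/commit round; return True if it committed."""
    ensemble_size = len(followers) + 1  # followers plus the leader
    acks = 1                            # the leader acknowledges itself
    for f in followers:
        if f.on_proposal(epoch, counter, payload):
            acks += 1
    if acks * 2 <= ensemble_size:       # no strict majority: do not commit
        return False
    for f in followers:
        f.on_commit(epoch, counter)
    return True
```

For example, with four followers of which two still follow an older epoch, the leader collects three ACKs out of five servers and commits; with three stale followers it collects only two and the proposal stays uncommitted.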
ZAB guarantees:
1) If the leader broadcasts T and then T', every server must commit T before T'.
2) If any server commits transactions T and T' in that order, then all servers must commit T before T'.
The first guarantee ensures that the leader and the other servers process transactions in the same order; the second ensures that no server skips a transaction.
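One way to read these two guarantees is as prefix properties of commit logs: each server's committed sequence is a prefix of the leader's broadcast sequence, and any two servers' sequences agree on their common length. The check below is a hedged sketch of that reading; the function names are hypothetical and the lists of transaction labels are made up for illustration.

```python
def is_prefix(shorter: list, longer: list) -> bool:
    """True if `shorter` equals the first len(shorter) items of `longer`."""
    return len(shorter) <= len(longer) and longer[:len(shorter)] == shorter


def follows_broadcast_order(broadcast: list, committed: list) -> bool:
    """Guarantee 1: a server commits in broadcast order, without gaps."""
    return is_prefix(committed, broadcast)


def servers_agree(log_a: list, log_b: list) -> bool:
    """Guarantee 2: two servers never commit the same slot differently."""
    return is_prefix(log_a, log_b) or is_prefix(log_b, log_a)
```

A server that has committed only part of the sequence still passes, because lagging behind is allowed; reordering or skipping a transaction is not.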
Because the leader cannot be guaranteed to keep working, the protocol must handle leader failure. When the leader fails, a new leader election is triggered (see "Zookeeper Cluster Management"), and the newly elected leader must satisfy two conditions:
1) Before broadcasting any new message, the new leader must commit all transactions sent by the previous leader.
2) At any moment, at most one leader is supported by a majority of servers.

Observer

Because an observer does not vote, it does not receive proposals, and a COMMIT message contains only a zxid. Observers therefore use a separate message type, INFORM, which carries both the contents of the proposal and the notification to commit the transaction.

Server-side processing flow

ZooKeeper has three types of server: leader, follower, and observer. The following sections describe how each type processes messages.

Leader

The leader processes requests through the following chain:

1) PrepRequestProcessor: receives the request and encapsulates it as a transaction.
2) ProposalRequestProcessor: wraps the transaction in a proposal and broadcasts it to the followers. It forwards every request to CommitRequestProcessor, and additionally forwards write requests to SyncRequestProcessor.
3) SyncRequestProcessor: persists the transaction and passes it to AckRequestProcessor.
4) AckRequestProcessor: generates an acknowledgment and delivers it to the leader itself.
5) CommitRequestProcessor: waits until enough acknowledgments (more than half) have arrived, then commits the proposal.
6) ToBeAppliedRequestProcessor: takes the requests waiting to be applied and hands them to FinalRequestProcessor.
7) FinalRequestProcessor: executes the request, whether a write or a read.
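The chain above can be illustrated with a small in-process model. This is a sketch under several assumptions, not ZooKeeper's code: the class names are shortened, hypothetical stand-ins for the processors listed above, the ToBeAppliedRequestProcessor stage is omitted, requests are plain dictionaries, and follower acknowledgments are simulated by calling `commit.ack` directly.

```python
class Final:
    """Last stage: executes reads and writes against the in-memory data."""
    def __init__(self):
        self.data = {}

    def process(self, req):
        if req["op"] == "write":
            self.data[req["path"]] = req["value"]
        req["result"] = self.data.get(req["path"])


class Commit:
    """Passes reads through; holds each write until a quorum has acked it."""
    def __init__(self, nxt, quorum):
        self.nxt, self.quorum = nxt, quorum
        self.waiting = {}              # zxid -> [request, ack count]

    def process(self, req):
        if req["op"] == "read":
            self.nxt.process(req)
        else:
            self.waiting[req["zxid"]] = [req, 0]

    def ack(self, zxid):
        entry = self.waiting.get(zxid)
        if entry is None:              # already committed or unknown
            return
        entry[1] += 1
        if entry[1] >= self.quorum:
            del self.waiting[zxid]
            self.nxt.process(entry[0])


class Sync:
    """Persists a write, then hands it on so the leader can ack itself."""
    def __init__(self, nxt):
        self.nxt, self.log = nxt, []

    def process(self, req):
        self.log.append(req["zxid"])   # stand-in for a transaction-log write
        self.nxt.process(req)


class Ack:
    """Records the leader's own acknowledgment of the proposal."""
    def __init__(self, commit):
        self.commit = commit

    def process(self, req):
        self.commit.ack(req["zxid"])


class Proposal:
    """Sends every request to Commit and every write to Sync as well."""
    def __init__(self, commit, sync):
        self.commit, self.sync = commit, sync
        self.broadcast = []            # proposals "sent" to followers

    def process(self, req):
        self.commit.process(req)
        if req["op"] == "write":
            self.broadcast.append(req["zxid"])
            self.sync.process(req)


class Prep:
    """First stage: turns a write request into a transaction with a zxid."""
    def __init__(self, nxt):
        self.nxt, self.counter = nxt, 0

    def process(self, req):
        if req["op"] == "write":
            self.counter += 1
            req["zxid"] = self.counter
        self.nxt.process(req)


def build_leader(quorum):
    """Wire the stages together and return the entry point and key stages."""
    final = Final()
    commit = Commit(final, quorum)
    proposal = Proposal(commit, Sync(Ack(commit)))
    return Prep(proposal), commit, final
```

With `build_leader(quorum=2)`, a write passed to the first stage is logged and acknowledged by the leader itself but is not applied until one more acknowledgment arrives through `commit.ack`; a read passes straight through to the final stage.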

When persisting transactions, ZooKeeper uses two techniques to make persistence more efficient:
1) It writes multiple transactions in one batch, reducing disk I/O.
2) It pre-allocates disk blocks for the log file.

Follower

A follower receives three kinds of messages: client requests, proposals, and commits.

1) FollowerRequestProcessor: receives and processes client requests. It forwards every request to CommitRequestProcessor and additionally forwards write requests to the leader.
2) CommitRequestProcessor: passes read requests directly to FinalRequestProcessor. For a write request it waits for the leader's commit notification and, once the notification is received, writes the request to
3) SyncRequestProcessor: receives proposals from the leader, persists the transaction, and hands it to SendAckRequestProcessor.
4) SendAckRequestProcessor: sends the acknowledgment message to the leader.
5) FinalRequestProcessor: executes the request, whether a write or a read.

Observer

An observer handles requests in much the same way as a follower. Because an observer does not take part in acknowledging proposals, it lacks the steps of persisting transactions and sending acknowledgments to the leader that a follower performs.

Ordered processing of requests

A ZooKeeper server must process requests in order. Every request a server receives, read or write, is placed in a queue and then processed sequentially. A read request is processed by the current server directly and the result returned (see the server-side processing flow described above). For a write request, the current server waits until processing of the write has completed before moving on to the next request. In this way ZooKeeper guarantees that requests are processed in order.

Snapshot

Each ZooKeeper server periodically serializes its entire data set to a file to produce a snapshot, which records the zxid of the last processed transaction. When a server restarts, it loads the snapshot, obtains from the leader the transactions that follow that zxid, and applies them to reach the latest state.
The server generates a snapshot as follows:
1) It records the zxid of the most recently processed transaction.
2) It progressively serializes all of the server's data and persists it to the file.
The server does not stop processing messages while the snapshot is being written, so the recorded zxid may not correspond to the last transaction actually reflected in the snapshot. For example, suppose there are two znodes, /z and /z1, both initially holding the value 1, and the snapshot is generated in the following steps:
1) Record the last processed transaction, T0.
2) Serialize the data of /z (1) to the snapshot.
3) Set the data of /z to 3 (transaction T1).
4) Set the data of /z1 to 2 (transaction T2).
5) Serialize the data of /z1 (2) to the snapshot.
The resulting snapshot therefore records T0 as the last processed transaction, with /z holding 1 and /z1 holding 2.
This causes no problems. When the snapshot is loaded, transactions T1 and T2 are both re-executed. Because ZooKeeper requires all transactions to be idempotent, meaning that re-applying a batch of transactions in order yields the same result, the server still ends up with the correct data.
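The example above can be replayed in a few lines. This is a minimal sketch rather than ZooKeeper's recovery code: the `apply` and `restore` functions are hypothetical, the data tree is a plain dictionary, and the zxids of T0, T1, and T2 are assumed to be 0, 1, and 2.

```python
def apply(tree: dict, txn: tuple) -> None:
    """Apply a 'set data' transaction; setting an absolute value is idempotent."""
    _zxid, path, value = txn
    tree[path] = value


def restore(snapshot_tree: dict, snapshot_zxid: int, txn_log: list) -> dict:
    """Rebuild state from a snapshot plus all transactions after its zxid."""
    tree = dict(snapshot_tree)
    for txn in txn_log:
        if txn[0] > snapshot_zxid:
            apply(tree, txn)
    return tree


# Assumed zxids for the transactions in the example.
T0, T1, T2 = 0, 1, 2
txn_log = [(T1, "/z", 3), (T2, "/z1", 2)]

# The snapshot from the example: /z was serialized before T1, /z1 after T2.
fuzzy_tree = {"/z": 1, "/z1": 2}

recovered = restore(fuzzy_tree, T0, txn_log)
print(recovered)  # {'/z': 3, '/z1': 2}
```

T2 is applied a second time to /z1 during recovery, but because it sets an absolute value rather than, say, incrementing one, the repeated application leaves the data unchanged.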
