Zookeeper 3.4 Official Document Translation


Translator's note: my English is rather limited, so my understanding may be off; if any translation is inaccurate, corrections are welcome.

1. Introduction

A distributed system is like a zoo: each server is an animal, and Zookeeper is the zookeeper who coordinates and serves the animals.
Zookeeper is a high-performance coordination service for distributed applications.
Zookeeper exposes common services through a simple interface, for example: naming, configuration management, synchronization, and group services.
You can use these out-of-the-box services to implement consensus, group management, leader election, and presence protocols.
You can also build on Zookeeper for your own specific needs.
Zookeeper is designed to be simple and easy to program against.
The Zookeeper data model resembles the tree structure of a file system.
Zookeeper runs on the Java platform.
Coordination services are notoriously hard to get right: they are especially prone to errors such as race conditions and deadlock. Zookeeper's goal is to relieve distributed systems of implementing their own coordination services from scratch.

2. Design Objectives
2.1. Zookeeper is an easy-to-understand framework
Zookeeper allows distributed processes to coordinate with one another through a shared namespace, organized much like a standard file system directory structure.
The namespace consists of data registers, which in Zookeeper's official terminology are called znodes.
Znodes are similar to the files and directories of a file system.
Unlike a file system, which is designed to store data on disk, Zookeeper keeps its data in memory.
This is how Zookeeper achieves high throughput and low latency.
Zookeeper does very well on high performance, high availability, and strictly ordered access. High performance means Zookeeper can be used in large-scale distributed systems; high availability means Zookeeper avoids being a single point of failure; and strictly ordered access
allows clients to implement sophisticated synchronization primitives.
2.2. Zookeeper is replicated
Like the distributed processes it coordinates, Zookeeper itself is replicated over a set of hosts.

(Figure 2.2: Zookeeper Service) The servers that make up the Zookeeper service must all know about each other.
They maintain an in-memory image of the state, together with transaction logs and snapshots in persistent storage.
As long as a majority of the servers are available, the Zookeeper service is available.
A client connects to a single Zookeeper server and maintains a TCP connection, over which it sends requests, gets responses, receives watch events, and sends heartbeats.
If the TCP connection to the server breaks, the client connects to a different Zookeeper server.
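The majority rule above can be made concrete with a small sketch (plain Python, not part of Zookeeper): an ensemble of n servers stays available while more than n/2 of them are alive, so it tolerates the loss of a minority.

```python
def tolerated_failures(ensemble_size: int) -> int:
    """Number of server failures a Zookeeper ensemble survives:
    the largest minority, i.e. floor((n - 1) / 2)."""
    return (ensemble_size - 1) // 2

def is_available(ensemble_size: int, alive: int) -> bool:
    """The service is available iff the alive servers form a majority."""
    return alive > ensemble_size // 2

# A 5-server ensemble tolerates 2 failures; 3 of 5 is still a majority.
print(tolerated_failures(5))  # 2
print(is_available(5, 3))     # True
print(is_available(5, 2))     # False
```

This is also why ensembles are usually deployed with an odd number of servers: 6 servers tolerate no more failures than 5 do.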
2.3. Zookeeper is ordered

Zookeeper stamps each update with a number that reflects the order of all Zookeeper transactions.
Subsequent operations can use this ordering to implement higher-level abstractions, such as synchronization primitives.
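As a rough illustration of this stamping (a toy sketch with an invented class name, not Zookeeper code), a coordinator can hand out strictly increasing transaction numbers, so any two updates can be compared to tell which happened first:

```python
import itertools

class TransactionStamper:
    """Toy model of update stamping: every update gets a strictly
    increasing number (Zookeeper's stamp is called a zxid), so the
    stamps expose a total order over all updates."""
    def __init__(self):
        self._counter = itertools.count(1)
        self.entries = []  # (stamp, operation) in commit order

    def append(self, operation: str) -> int:
        stamp = next(self._counter)
        self.entries.append((stamp, operation))
        return stamp

log = TransactionStamper()
a = log.append("set /config/a")
b = log.append("set /config/b")
print(a < b)  # True: the earlier update has the smaller stamp
```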
2.4. Zookeeper is fast

Zookeeper is especially fast with read-dominant workloads.
Zookeeper applications run on thousands of machines, and they perform best where reads outnumber writes, at ratios of around 10:1.

2.5. Data model and hierarchical namespace

Zookeeper provides a namespace very similar to that of a standard file system.
A name is a sequence of path elements separated by a slash (/), much like a file system path.
Every node in Zookeeper's namespace is identified by its path.
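A quick sketch of how such slash-separated names decompose into path elements and parents (the helper names here are made up for illustration):

```python
def path_elements(path: str):
    """Split a znode name into its slash-separated path elements."""
    if not path.startswith("/"):
        raise ValueError("znode paths are absolute and start with '/'")
    if path == "/":
        return []
    return path[1:].split("/")

def parent(path: str) -> str:
    """Return the path identifying the parent node."""
    elems = path_elements(path)
    return "/" + "/".join(elems[:-1]) if len(elems) > 1 else "/"

print(path_elements("/app1/p_1"))  # ['app1', 'p_1']
print(parent("/app1/p_1"))         # /app1
print(parent("/app1"))             # /
```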

(Figure 2.5: Zookeeper Hierarchical Namespace)

2.6. Nodes and ephemeral nodes

Unlike a file system, each node in a Zookeeper namespace can have data associated with it as well as children.
It is as if a file system allowed a file to also be a directory.
Zookeeper was designed to store coordination data: status, configuration, location information, and so on. The data stored at each node is therefore small, typically in the byte-to-kilobyte range.
In Zookeeper's official terminology, its data nodes are called znodes.
Each znode maintains a stat structure, which allows cache validation and coordinated updates.
The stat structure includes version numbers for data changes and ACL changes, plus timestamps.
Each time the Znode data is updated, the corresponding version number is incremented.
When the client obtains the data, it also gets the version number of the data.
Reads and writes of a znode's data are atomic: a read returns all of the node's data, and a write replaces all of it.
Each node has an ACL that indicates who can do what.
Zookeeper also has ephemeral nodes. An ephemeral node shares the life cycle of the session that created it: when the session ends, the node is deleted.
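The stat versioning and ephemeral-node lifecycle described above can be sketched in a few lines of Python (a toy model, not Zookeeper's implementation; the field and session names are invented):

```python
import time
from dataclasses import dataclass, field

@dataclass
class Stat:
    version: int = 0    # bumped on every data change
    aversion: int = 0   # bumped on every ACL change
    mtime: float = 0.0  # last-modified timestamp

@dataclass
class Znode:
    data: bytes = b""
    stat: Stat = field(default_factory=Stat)
    ephemeral_owner: int = 0  # owning session id; 0 means persistent

    def set_data(self, data: bytes) -> None:
        self.data = data  # a write replaces all of the node's data
        self.stat.version += 1
        self.stat.mtime = time.time()

node = Znode()
node.set_data(b"state=ready")
node.set_data(b"state=done")
print(node.stat.version)  # 2: incremented once per update

# Ephemeral nodes vanish with the session that created them.
nodes = {"/workers/w1": Znode(ephemeral_owner=7), "/config": Znode()}

def close_session(session_id: int) -> None:
    for path in [p for p, n in nodes.items() if n.ephemeral_owner == session_id]:
        del nodes[path]

close_session(7)
print(sorted(nodes))  # ['/config']
```

A client that reads a znode also receives its version, so it can later attempt a conditional write ("set only if the version is still what I saw"), which is how optimistic updates are built.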
2.7. Conditional updates and watches

Zookeeper supports the concept of watches: a client can set a watch on a znode, and when the znode changes, the watch is triggered and removed.
When a watch triggers, the client receives a packet saying that the znode has changed.
If the connection between the client and the Zookeeper server breaks, the client receives a local notification instead.
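One-shot watch semantics, where a triggered watch is also removed, can be modelled like this (a simplified stand-in, not the real client API):

```python
class WatchedValue:
    """Sketch of one-shot watch semantics: a watch fires on the next
    change and is then removed, so a client that wants further
    notifications must re-register its watch."""
    def __init__(self, data: bytes = b""):
        self.data = data
        self._watchers = []

    def get(self, watcher=None) -> bytes:
        if watcher is not None:
            self._watchers.append(watcher)  # register watch on read
        return self.data

    def set(self, data: bytes) -> None:
        self.data = data
        fired, self._watchers = self._watchers, []  # trigger AND remove
        for watcher in fired:
            watcher("data changed")

events = []
v = WatchedValue(b"a")
v.get(watcher=events.append)
v.set(b"b")    # fires the watch
v.set(b"c")    # no watch registered any more
print(events)  # ['data changed'] -- exactly one notification
```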
2.8. Guarantees

Because Zookeeper's goal is to be a basis for building more complex services (such as synchronization), and because it is fast and simple to use, it needs to provide some guarantees.
Zookeeper offers the following guarantees:
1) Sequential consistency: updates from a client are applied in the order in which they were sent.
2) Atomicity: an update either succeeds or fails; there are no partial results.
3) Single system image: a client sees the same view of the service no matter which Zookeeper server it connects to.
4) Reliability: once an update has been applied, it persists until a client overwrites it.
5) Timeliness: the clients' view of the system is guaranteed to be up to date within a certain time bound.
2.9. Simple API

create: creates a node at a location in the tree;
delete: deletes a node;
exists: tests whether a node exists at a location;
get data: reads data from a node;
set data: writes data to a node;
get children: retrieves a list of a node's children;
sync: waits for data to be propagated.
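To make the shape of this API concrete, here is a toy in-memory tree exposing the same operations (the names mirror the list above, but this is an illustrative sketch, not the Zookeeper client; `sync` is omitted because there is no replication to wait for):

```python
class MiniTree:
    """Toy in-memory tree mirroring the shape of the simple API.
    Purely illustrative: no sessions, no versions, no server."""
    def __init__(self):
        self._nodes = {"/": b""}

    @staticmethod
    def _parent(path: str) -> str:
        head = path.rsplit("/", 1)[0]
        return head or "/"

    def create(self, path: str, data: bytes = b"") -> None:
        if self._parent(path) not in self._nodes:
            raise KeyError(f"parent of {path} does not exist")
        self._nodes[path] = data

    def delete(self, path: str) -> None:
        del self._nodes[path]

    def exists(self, path: str) -> bool:
        return path in self._nodes

    def get_data(self, path: str) -> bytes:
        return self._nodes[path]

    def set_data(self, path: str, data: bytes) -> None:
        self._nodes[path] = data  # a write replaces the node's data

    def get_children(self, path: str):
        return sorted(p.rsplit("/", 1)[1] for p in self._nodes
                      if p != "/" and self._parent(p) == path)

t = MiniTree()
t.create("/app1")
t.create("/app1/p_1", b"hello")
print(t.exists("/app1/p_1"))    # True
print(t.get_children("/app1"))  # ['p_1']
t.set_data("/app1/p_1", b"world")
print(t.get_data("/app1/p_1"))  # b'world'
t.delete("/app1/p_1")
print(t.exists("/app1/p_1"))    # False
```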
2.10. Implementation

(Figure 2.10: Zookeeper Components) The figure shows the high-level components of the Zookeeper service. With the exception of the request processor, each server that makes up the Zookeeper service replicates its own copy of each component.
The replicated database is an in-memory database containing the entire data tree. For recoverability, updates are logged to disk before they are applied to the in-memory database.
Every Zookeeper server can serve clients; a client connects to exactly one server and submits its requests there.
Read requests are served from each server's local replica of the database.
Write requests, which change the state of the service, are processed by an agreement protocol.
Under this protocol, all write requests from clients are forwarded to a single server called the leader; the remaining servers are called followers.
The followers receive message proposals from the leader and agree upon message delivery.
The messaging layer also takes care of replacing a failed leader: when the leader fails, it elects a new leader and then synchronizes the followers with it.
Zookeeper uses a custom atomic messaging protocol. Because the messaging layer is atomic, Zookeeper can guarantee that the local replicas stay consistent.
When the leader receives a write request, it calculates what the state of the system will be after the write is applied, and captures this as a transaction that produces the new state.
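A highly simplified model of this write path (pure Python; the agreement protocol is reduced to a trivial broadcast, whereas real Zookeeper uses quorum acknowledgements) shows why every replica answers reads with the same data after a write:

```python
class Server:
    """One replica: holds a full copy of the (toy) data tree."""
    def __init__(self, name: str):
        self.name = name
        self.state = {}

class Ensemble:
    """Toy write path: writes go through the leader, which broadcasts
    them so every replica applies the same update."""
    def __init__(self, names):
        self.servers = [Server(n) for n in names]
        self.leader = self.servers[0]  # leader election is out of scope here

    def write(self, received_by: int, key, value):
        # Any server may receive the request, but it forwards the
        # write to the leader, which broadcasts it to every replica.
        assert 0 <= received_by < len(self.servers)
        for server in self.servers:
            server.state[key] = value

    def read(self, served_by: int, key):
        # Reads are answered from the local replica, with no coordination.
        return self.servers[served_by].state[key]

e = Ensemble(["s1", "s2", "s3"])
e.write(received_by=2, key="/x", value=1)
print([e.read(i, "/x") for i in range(3)])  # [1, 1, 1]
```

The broadcast step is what makes writes expensive relative to reads, which is the root of the read-dominant performance profile described below.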
2.11. Uses

Although Zookeeper's programming interface is deliberately simple, you can implement higher-order operations on top of it, such as synchronization primitives, group membership, ownership, and so on.
2.12. Performance

Zookeeper is designed to be high performance, and it is, especially in applications where reads greatly outnumber writes; read-dominant workloads are typical of coordination services.
When writes outnumber reads, performance drops, because every write requires synchronizing the state of all the servers.
Official throughput benchmark (version 3.2):
(Figure 2.12-1: Zookeeper throughput as the read-write ratio varies)
Hardware environment:
CPU: dual 2GHz Xeon
Disks: two SATA 15K RPM drives
Server setup:
one drive used as a dedicated Zookeeper log device;
the other holds the OS and the Zookeeper snapshots;
the ensemble was configured so that leaders do not accept client connections.
Test goals:
measure how throughput changes as servers are added, with the read-write request ratio held constant;
measure how throughput changes as the read ratio increases, with the number of servers held constant.
For example, with 3 servers and a read-write ratio of 50%, the throughput of the Zookeeper service is roughly 40,000 requests per second.

Official reliability test: reliability in the presence of errors
(Figure 2.12-2: Reliability in the presence of errors) The marked events in the figure are:
1) a follower fails and recovers;
2) another follower fails and recovers;
3) the leader fails;
4) two followers fail and recover;
5) another leader fails.

2.13. Reliability

The reliability graph shows three things. First, if followers fail and recover quickly, Zookeeper sustains high throughput despite the failures.
Second, the leader election algorithm allows the system to recover quickly enough to avoid a substantial drop in throughput; Zookeeper elects a new leader in about 200ms.
Third, as followers recover, throughput rises again as soon as they start processing requests.




