Zookeeper typical usage scenarios at a glance


Zookeeper is a highly available, distributed data-management and system-coordination framework. Through its ZAB protocol (a Paxos-like consensus algorithm), the framework guarantees strong consistency of data in a distributed environment, and it is this property that makes ZooKeeper applicable to so many scenarios. Typical ZK usage scenarios fall into the categories below:

Each category below gives a typical scenario description (which ZK feature is used, and how), followed by specific uses in applications.
Data publishing and subscription: publish/subscribe here is so-called configuration management. As the name implies, the system publishes data to a ZK node for subscribers to obtain dynamically, achieving centralized management and dynamic updating of configuration information. Global configuration information, service address lists, and the like are ideal candidates. (Diamond and ConfigServer also operate in this area.) Specific uses:

1. Index information and the status of machine nodes in the cluster are stored on designated ZK nodes for every client to subscribe to and use.

    2. Storage of (processed) system logs; these logs are usually cleared after 2-3 days.

    3. Configuration information that applications need to obtain dynamically: the application fetches it once at startup and registers a watcher on the node, so every subsequent configuration update is pushed to the application in real time (see the sketch after this list).

    4. Global variables needed by some business logic. For example, message-queue middleware usually keeps a send offset; storing it on ZK lets every sender in the cluster know the current sending progress.

    5. Information that needs to be obtained dynamically and may also be modified manually. Previously this meant exposing an interface, such as a JMX interface; with ZK it is enough to store the information on a designated ZK node.
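A minimal Java sketch of the fetch-once-then-watch pattern from item 3, using the plain ZooKeeper client API. The /app/config path, the class name, and the connect string are hypothetical; note that ZK watches fire only once, so the watcher must be re-registered on every read:

```java
import java.nio.charset.StandardCharsets;
import java.util.concurrent.CountDownLatch;
import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooKeeper;

// Fetch the config once at startup, then re-arm a watch on every change.
public class ConfigSubscriber implements Watcher {
    private static final String CONFIG_PATH = "/app/config"; // hypothetical path
    private final ZooKeeper zk;

    public ConfigSubscriber(String connectString) throws Exception {
        CountDownLatch connected = new CountDownLatch(1);
        zk = new ZooKeeper(connectString, 15000, event -> {
            if (event.getState() == Event.KeeperState.SyncConnected) {
                connected.countDown();
            }
        });
        connected.await(); // block until the session is established
    }

    // Reads the current config and re-registers this watcher in one call.
    public String fetchConfig() throws Exception {
        byte[] data = zk.getData(CONFIG_PATH, this, null);
        return new String(data, StandardCharsets.UTF_8);
    }

    @Override
    public void process(WatchedEvent event) {
        if (event.getType() == Event.EventType.NodeDataChanged) {
            try {
                System.out.println("config updated: " + fetchConfig());
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }
}
```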

Name service: here ZK mainly serves as a distributed naming service. By calling ZK's create API, a client can easily create a globally unique path, and that path can be used as a name.
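To make the uniqueness concrete, here is a minimal sketch using a sequential node; the /names parent node and the helper name uniqueName are made up for illustration:

```java
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

// ZK appends a monotonically increasing, zero-padded counter to the name,
// so the call below may return e.g. "/names/id-0000000042".
static String uniqueName(ZooKeeper zk) throws KeeperException, InterruptedException {
    return zk.create("/names/id-", new byte[0],
            ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT_SEQUENTIAL);
}
```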
Distributed notification/coordination: the watcher registration and asynchronous notification mechanism in ZooKeeper enables notification and coordination between different systems in a distributed environment, allowing real-time handling of data changes.

The usual approach is for the different systems to register watchers on the same znode on ZK and monitor its changes (to the znode itself and to its child nodes); when one system updates the znode, the other systems receive the notification and react accordingly.

1. A heartbeat-detection mechanism: the detecting system and the detected system are not directly associated with each other; instead they are associated through a node on ZK, which greatly reduces coupling between the systems.

    2. A system-scheduling mode: a system consists of a console and a push system, where the console's role is to direct the push system's work. Some operations an administrator performs in the console actually modify the state of certain nodes on ZK; ZK notifies the clients that registered watchers, i.e. the push system, which then carries out the corresponding push tasks.

    3. A job-reporting mode: similar to a task-distribution system. After a subtask starts, it registers a temporary node on ZK and periodically reports its progress by writing it back to that temporary node, so the task manager can follow progress in real time (see the sketch below).

In summary, using ZooKeeper for distributed notification and coordination can greatly reduce coupling between systems.
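A sketch of the job-reporting mode from item 3, assuming a hypothetical /jobs parent node; the method names are illustrative only:

```java
import java.nio.charset.StandardCharsets;
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

// The subtask announces itself with an ephemeral sequential node; the node
// vanishes automatically if the subtask's session dies.
static String registerSubtask(ZooKeeper zk) throws Exception {
    return zk.create("/jobs/worker-", "0".getBytes(StandardCharsets.UTF_8),
            ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL_SEQUENTIAL);
}

// Progress is written back to the subtask's own node; version -1 means
// "update regardless of the node's current version".
static void reportProgress(ZooKeeper zk, String myNode, int percent) throws Exception {
    zk.setData(myNode, String.valueOf(percent).getBytes(StandardCharsets.UTF_8), -1);
}
```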

Distributed locks: these rely on ZooKeeper's strong consistency guarantee, namely that users can fully trust that, at every moment, the same znode holds identical data on every node (ZK server) in the ZK cluster.

Locks here fall, I feel, into two categories: one keeps exclusivity, the other controls ordering.

So-called exclusivity means that, of all the clients trying to acquire the lock, only one can ultimately succeed. The usual practice is to treat a znode on ZK as the lock, implemented via create: all clients try to create the /distribute_lock node, and the client whose create succeeds holds the lock.
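A minimal sketch of the exclusive lock; note that using EPHEMERAL here is an assumption on top of the article (it makes the lock auto-release if the holder's session dies):

```java
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

// Returns true if this client now holds the lock.
static boolean tryLock(ZooKeeper zk) throws InterruptedException, KeeperException {
    try {
        // EPHEMERAL is an assumption: the article only specifies create()
        zk.create("/distribute_lock", new byte[0],
                ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL);
        return true; // our create won; we hold the lock until we delete the node
    } catch (KeeperException.NodeExistsException e) {
        return false; // someone else holds it; watch the node and retry on deletion
    }
}
```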

Controlling ordering means that all clients trying to acquire the lock will eventually be scheduled to execute, but in a global order. The procedure is basically similar to the above, except that /distribute_lock already exists in advance and each client creates a temporary ordered node beneath it (controlled via the node property CreateMode.EPHEMERAL_SEQUENTIAL). The parent node (/distribute_lock) maintains a sequence counter that guarantees the creation order of the child nodes, and therefore the global ordering of the clients.
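A sketch of the ordering variant; /distribute_lock is pre-existing as described above, while the child-name prefix node- is an arbitrary choice:

```java
import java.util.Collections;
import java.util.List;
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

// Join the queue and check whether it is this client's turn.
static boolean myTurn(ZooKeeper zk) throws Exception {
    String mine = zk.create("/distribute_lock/node-", new byte[0],
            ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL_SEQUENTIAL);
    List<String> children = zk.getChildren("/distribute_lock", false);
    Collections.sort(children); // the zero-padded counters sort lexically
    // smallest sequence number goes first; otherwise watch the node ahead of us
    return mine.equals("/distribute_lock/" + children.get(0));
}
```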

Cluster management: 1. Cluster machine monitoring: commonly used in scenarios with high demands on knowing machine status and machine online rates, where changes in the cluster must be detected quickly. In such scenarios there is usually a monitoring system that detects in real time whether cluster machines are alive.

  The usual practice is for the monitoring system to probe each machine by some means (such as ping), or for each machine to periodically report "I'm alive" to the monitoring system. This works, but it has two obvious problems: 1. when the machines in the cluster change, many things have to be updated along with them; 2. there is detection delay.

ZooKeeper has two useful properties here: A. a client that registers a watcher on node x is notified when x's child nodes change; B. a node created with type EPHEMERAL disappears as soon as the session between its creating client and the server ends or expires.

For example, the monitoring system registers a watcher on the /clusterServers node, and every machine that joins dynamically creates an EPHEMERAL node under it: /clusterServers/{hostname}. The monitoring system thus learns about machine additions and removals in real time; what it does with that information is its own business.
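Sketched below, with the /clusterServers path pattern taken from the article and everything else (method names, the watcher body) assumed:

```java
import java.util.List;
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

// Server side: announce liveness; the node disappears when the session ends.
static void announce(ZooKeeper zk, String hostname) throws Exception {
    zk.create("/clusterServers/" + hostname, new byte[0],
            ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL);
}

// Monitor side: list the live machines and watch for changes. ZK watches are
// one-shot, so the callback should call this method again to re-arm the watch.
static List<String> liveServers(ZooKeeper zk) throws Exception {
    return zk.getChildren("/clusterServers",
            event -> System.out.println("cluster membership changed: " + event));
}
```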

2. Master election is the most classic ZooKeeper usage scenario.

In a distributed environment, the same application is usually deployed on multiple machines. For some business logic (e.g., time-consuming computation or network I/O), it is often enough for one machine in the cluster to do the work while the rest share its result, which greatly reduces duplicated effort. Electing that one machine is exactly the master-election problem.

ZooKeeper's strong consistency guarantees the uniqueness of node creation under high distributed concurrency: if multiple clients request to create the /currentMaster node, only one request can ultimately succeed.

With this feature, electing a master in a distributed environment becomes easy.

In addition, this scenario has an evolution: dynamic master election, which uses nodes of type EPHEMERAL_SEQUENTIAL.

As mentioned above, all clients issue create requests, and only one can succeed. With a slight change, all requests are allowed to succeed, but in a recorded creation order, so every request leaves a result on ZK. One possible layout is /currentMaster/{sessionId}-1, /currentMaster/{sessionId}-2, /currentMaster/{sessionId}-3, and so on. Each time, the machine with the smallest sequence number is selected as master; if that machine dies, the node it created disappears immediately (being ephemeral), and the machine with the now-smallest number becomes master.
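A sketch of the dynamic election, following the article's /currentMaster/{sessionId}-N naming; the parent node is assumed to exist already:

```java
import java.util.Comparator;
import java.util.List;
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

// Every candidate registers an ephemeral sequential node; the smallest
// sequence number wins. If the master dies, its node vanishes and the
// next-smallest candidate takes over.
static boolean electSelf(ZooKeeper zk) throws Exception {
    String mine = zk.create("/currentMaster/" + zk.getSessionId() + "-",
            new byte[0], ZooDefs.Ids.OPEN_ACL_UNSAFE,
            CreateMode.EPHEMERAL_SEQUENTIAL);
    List<String> all = zk.getChildren("/currentMaster", false);
    // session ids differ between candidates, so order by the zero-padded
    // counter that ZK appended after the last '-'
    all.sort(Comparator.comparing((String n) -> n.substring(n.lastIndexOf('-') + 1)));
    return mine.equals("/currentMaster/" + all.get(0));
}
```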

1. In a search system, if every machine in the cluster generates the full index, it is not only time-consuming but also cannot guarantee that the index data is consistent across machines. So the cluster's master generates the full index and then syncs it to the other machines.

 

     2. In addition, as disaster tolerance for master election, the master can be specified manually at any time: when an application cannot obtain master information from ZK, it can fall back to obtaining it somewhere else, for example over HTTP.

Distributed queues: there are, I currently feel, two kinds of queues. One is the normal FIFO queue; the other waits until all queue members have gathered and then executes them in order.

The first, the FIFO queue, works on the same principle as the ordering-control scenario in the distributed lock service above, so it is not repeated here.

The second type of queue is actually an enhancement of the FIFO queue. Usually a /queue/num node can be created in advance under the /queue znode and assigned the value n (or n can be assigned to /queue directly), indicating the size of the queue. Then, each time a member joins, it checks whether the queue size has been reached to decide whether execution can begin. A typical scenario: in a distributed environment, a big task A can only proceed once many subtasks have completed (or their conditions are ready). Whenever one of the subtasks completes (becomes ready), it creates its own temporary sequential node (CreateMode.EPHEMERAL_SEQUENTIAL) under /taskList; when /taskList finds that the number of its child nodes has reached the specified count, processing can move on to the next step in order.
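A sketch of the gather-then-execute queue; the /queue/num and /taskList paths follow the article, and the check is simplified (a real client would also watch /taskList instead of polling):

```java
import java.nio.charset.StandardCharsets;
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

// A subtask checks in and reports whether the whole group has gathered.
static boolean joinAndCheck(ZooKeeper zk) throws Exception {
    // the queue size n was stored in /queue/num in advance
    int required = Integer.parseInt(new String(
            zk.getData("/queue/num", false, null), StandardCharsets.UTF_8));
    zk.create("/taskList/task-", new byte[0],
            ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL_SEQUENTIAL);
    // true once every expected member has checked in
    return zk.getChildren("/taskList", false).size() >= required;
}
```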

This article refers to: Zookeeper typical usage scenarios at a glance.

