Distributed service framework ZooKeeper

Source: Internet
Author: User

Installation and configuration details

The ZooKeeper described in this article is the stable version 3.2.2. The latest version can be obtained from the official website, http://hadoop.apache.org/zookeeper/. Installing ZooKeeper is simple. The following describes installation and configuration in standalone mode and in cluster mode.

Standalone Mode

Installing on a single host is straightforward: obtain the ZooKeeper archive and extract it to a directory such as /home/zookeeper-3.2.2. The startup scripts are in the bin directory; on Linux the startup script is zkServer.sh. Version 3.2.2 does not ship a startup script for Windows, so to start ZooKeeper on Windows you must write one yourself, as shown in Listing 1:

Listing 1. ZooKeeper startup script for Windows

    setlocal
    set ZOOCFGDIR=%~dp0%..\conf
    set ZOO_LOG_DIR=%~dp0%..
    set ZOO_LOG4J_PROP=INFO,CONSOLE
    set CLASSPATH=%ZOOCFGDIR%
    set CLASSPATH=%~dp0..\*;%~dp0..\lib\*;%CLASSPATH%
    set CLASSPATH=%~dp0..\build\classes;%~dp0..\build\lib\*;%CLASSPATH%
    set ZOOCFG=%ZOOCFGDIR%\zoo.cfg
    set ZOOMAIN=org.apache.zookeeper.server.ZooKeeperServerMain
    java "-Dzookeeper.log.dir=%ZOO_LOG_DIR%" "-Dzookeeper.root.logger=%ZOO_LOG4J_PROP%" -cp "%CLASSPATH%" %ZOOMAIN% "%ZOOCFG%" %*
    endlocal

Before running the startup script, you need to set a few basic configuration items. The ZooKeeper configuration files are in the conf directory, which contains zoo_sample.cfg and log4j.properties. Rename zoo_sample.cfg to zoo.cfg, because ZooKeeper looks for that file as its default configuration file at startup. The meaning of each item in this configuration file is described below.

    tickTime=2000
    dataDir=D:/devtools/zookeeper-3.2.2/build
    clientPort=2181

  • tickTime: The heartbeat interval, in milliseconds, between ZooKeeper servers or between a client and a server. One heartbeat is sent every tickTime.
  • dataDir: As the name implies, the directory where ZooKeeper saves its data. By default, ZooKeeper also stores the log files of data writes in this directory.
  • clientPort: The port on which clients connect to the ZooKeeper server. ZooKeeper listens on this port and accepts client requests.

Once these items are configured, you can start ZooKeeper. To check whether ZooKeeper is serving after startup, run the netstat -ano command and see whether the clientPort you configured is in the listening state.

Cluster Mode

ZooKeeper can serve from a single machine, and it can also serve from a cluster of several hosts. It additionally supports a pseudo-cluster mode, in which multiple ZooKeeper instances run on one physical machine. The following describes how to install and configure cluster mode.

Installing and configuring cluster mode is not complicated; it only needs a few more configuration items. In addition to the three items above, cluster mode adds the following:

    initLimit=5
    syncLimit=2
    server.1=
    server.2=

  • initLimit: The maximum number of heartbeat intervals that ZooKeeper tolerates while a client initializes its connection. The client here is not a user client connecting to the ZooKeeper server, but a follower server connecting to the leader within the cluster. If no response has been received from the client after 5 heartbeat intervals (tickTime), the connection is considered to have failed. The total time is 5 * 2000 ms = 10 seconds.
  • syncLimit: The maximum time, in heartbeat intervals, allowed for a message, request, or response between the leader and a follower. Here the limit is 2 * 2000 ms = 4 seconds.
  • server.A=B:C:D: A is a number identifying the server. B is the IP address of the server. C is the port this server uses to exchange information with the leader of the cluster. D is the port the servers use to communicate with each other during an election, which is needed to choose a new leader if the current leader fails. In a pseudo-cluster configuration B is the same for every instance, so the instances cannot share communication ports and each must be assigned different port numbers.
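The arithmetic in the list above can be checked with a minimal sketch that loads zoo.cfg-style text with java.util.Properties and derives both timeouts and the parts of each server.A=B:C:D entry. The class name ZooCfgSketch is hypothetical, and any host and port values passed to it (for example 127.0.0.1:2888:3888 for a pseudo-cluster) are assumptions for illustration; the article leaves the server values blank.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;
import java.util.TreeMap;

// Hypothetical helper, not part of ZooKeeper: interprets zoo.cfg-style settings.
final class ZooCfgSketch {
    private final Properties props = new Properties();

    ZooCfgSketch(String cfgText) {
        try {
            props.load(new StringReader(cfgText));
        } catch (IOException e) {
            throw new IllegalArgumentException(e);
        }
    }

    private long number(String key) {
        return Long.parseLong(props.getProperty(key).trim());
    }

    // initLimit is counted in ticks, so the timeout is initLimit * tickTime.
    long initTimeoutMillis() {
        return number("initLimit") * number("tickTime");
    }

    // syncLimit is counted in ticks as well.
    long syncTimeoutMillis() {
        return number("syncLimit") * number("tickTime");
    }

    // Maps each server number A to its parts {B, C, D}.
    TreeMap<Integer, String[]> servers() {
        TreeMap<Integer, String[]> result = new TreeMap<>();
        for (String key : props.stringPropertyNames()) {
            if (key.startsWith("server.")) {
                int id = Integer.parseInt(key.substring("server.".length()));
                result.put(id, props.getProperty(key).trim().split(":"));
            }
        }
        return result;
    }
}
```

With the values used in this article (tickTime=2000, initLimit=5, syncLimit=2), the two timeout methods return 10000 and 4000 milliseconds.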

Besides editing zoo.cfg, cluster mode requires one more file, myid, in the dataDir directory. It contains a single value, the A of this server. ZooKeeper reads this file at startup and compares its value with the configuration in zoo.cfg to determine which server it is.

Data Model

ZooKeeper maintains a hierarchical data structure that is very similar to a standard file system, as shown in Figure 1:

Figure 1. ZooKeeper data structure

The Zookeeper data structure has the following features:

  1. Each sub-directory item, such as nameservice, is called a znode. A znode is uniquely identified by its path. For example, the znode server1 is identified as /nameservice/server1.
  2. A znode can have child nodes, and every znode can store data. Note that directory nodes of the EPHEMERAL type cannot have child nodes.
  3. Znodes are versioned. Each change to the data stored in a znode increases its version number, and clients can refer to that version when they update or delete the node.
  4. A znode can be a temporary node. Once the client that created it loses contact with the server, the znode is deleted automatically. A ZooKeeper client communicates with the server over a persistent connection that is kept alive by heartbeats, and this connection state is called a session. If a znode is temporary and its session becomes invalid, the znode is deleted.
  5. Znode names can be numbered automatically. For example, if app1 already exists, the next node created is automatically named app2.
  6. A znode can be monitored, both for changes to the data stored in it and for changes to its child nodes. When a change occurs, the clients that set the monitor are notified. This is the core feature of ZooKeeper, and many ZooKeeper functions are built on it. Examples appear in the typical application scenarios below.
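The automatic numbering in item 5 can be sketched in plain Java. The article shows the simplified names app1 and app2; the ZooKeeper documentation describes the suffix as a counter formatted with %010d, that is, ten digits padded with zeros, and this sketch assumes that format. SequentialNames is a hypothetical in-memory model; on a real server the counter is assigned by ZooKeeper, not by the client.

```java
// In-memory model only; a real ZooKeeper server assigns the counter itself.
final class SequentialNames {
    private int counter = 0;

    // Mimics the name returned by a create call in a SEQUENTIAL mode.
    String createSequential(String pathPrefix) {
        return String.format("%s%010d", pathPrefix, counter++);
    }

    // Extracts the numeric suffix so that names can be compared by number.
    static int sequenceOf(String name) {
        return Integer.parseInt(name.substring(name.length() - 10));
    }
}
```

Because the suffix is zero-padded to a fixed width, names that share a prefix also sort correctly as plain strings.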


How to Use

As a distributed service framework, ZooKeeper is mainly used to solve consistency problems that application systems face in distributed clusters. It offers data storage in a directory node tree similar to a file system, but it is not meant to be a data store. Its main purpose is to maintain and monitor state changes of the data you store in it. By monitoring those state changes, you can build data-driven cluster management. The following sections describe typical problems that ZooKeeper can solve, the ZooKeeper operation interface, and simple examples.

Common interface list

To connect to the ZooKeeper server, a client creates an instance of org.apache.zookeeper.ZooKeeper and then calls the methods of this class to interact with the server.

As mentioned above, ZooKeeper is mainly used to maintain and monitor the state of data stored in a directory node tree, so the operations available on ZooKeeper are much like operations on a directory tree: creating a directory node, setting the data of a directory node, listing all child nodes of a directory node, setting permissions on a directory node, and monitoring state changes of a directory node.

These interfaces are shown in the following table:

Table 1. Method list of org.apache.zookeeper.ZooKeeper

Method name and description:

  • String create(String path, byte[] data, List<ACL> acl, CreateMode createMode): Creates the directory node at the given path and sets its data. CreateMode identifies four kinds of directory node. PERSISTENT: a persistent directory node whose stored data is not lost. PERSISTENT_SEQUENTIAL: a persistent directory node that is numbered automatically; the number is increased by 1 based on the nodes that already exist, and the name of the node that was actually created is returned to the client. EPHEMERAL: a temporary directory node, deleted automatically once the session between the server and the client that created it times out. EPHEMERAL_SEQUENTIAL: a temporary directory node that is numbered automatically.
  • Stat exists(String path, boolean watch): Determines whether a path exists and sets whether to monitor that directory node. The watcher used here is the one specified when the ZooKeeper instance was created. The exists method also has an overload that takes a specific watcher.
  • Stat exists(String path, Watcher watcher): Overload that sets a specific watcher on a directory node. Watchers are a core function of ZooKeeper. A watcher can monitor data changes of a directory node and changes of its child nodes. When such a change occurs, the server notifies all watchers set on that directory node, so every client quickly learns that the state of the node it cares about has changed and can respond accordingly.
  • void delete(String path, int version): Deletes the directory node at path. A version of -1 matches any version, so all data of the directory node is deleted.
  • List<String> getChildren(String path, boolean watch): Returns all child nodes of the specified path. The getChildren method also has an overload that takes a specific watcher to monitor the state of the child nodes.
  • Stat setData(String path, byte[] data, int version): Sets the data of path. You can specify the version number the data must have; a version of -1 matches any version.
  • byte[] getData(String path, boolean watch, Stat stat): Returns the data stored in the directory node at path. The data version and other information are reported through stat. You can also set whether to monitor the data state of this directory node.
  • void addAuthInfo(String scheme, byte[] auth): The client submits its authorization information to the server, and the server verifies the client's access permissions against it.
  • Stat setACL(String path, List<ACL> acl, int version): Resets the access permissions of a directory node. Note that permissions in ZooKeeper are not inherited: the permissions of a parent directory node do not pass to its child nodes. A directory node ACL consists of perms and id. Perms is one of ALL, READ, WRITE, CREATE, DELETE, and ADMIN. The id identifies the list of identities allowed to access the directory node. Two are defined by default: ANYONE_ID_UNSAFE = new Id("world", "anyone"), meaning anyone can access, and AUTH_IDS = new Id("auth", ""), meaning the creator has access.
  • List<ACL> getACL(String path, Stat stat): Returns the access permission list of a directory node.

Besides the methods listed in the table above, there are further overloads, such as overloads that take a callback class and overloads that set a specific watcher. For details, refer to the API documentation of the org.apache.zookeeper.ZooKeeper class.
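The version parameter of setData and delete behaves like a compare-and-set. The following minimal sketch models that rule on a single in-memory node so that it runs without a server. VersionedNode is a hypothetical class, not the ZooKeeper client API, and it returns false where the real client would throw KeeperException.BadVersionException.

```java
// In-memory stand-in for a single znode; not the ZooKeeper client API.
final class VersionedNode {
    private byte[] data;
    private int version = 0;

    VersionedNode(byte[] initial) {
        this.data = initial.clone();
    }

    // Mirrors the version rule of setData(path, data, version):
    // -1 matches any version, otherwise the versions must be equal.
    synchronized boolean setData(byte[] newData, int expectedVersion) {
        if (expectedVersion != -1 && expectedVersion != version) {
            return false; // the real client reports a bad-version error here
        }
        data = newData.clone();
        version++; // every successful write increases the version
        return true;
    }

    synchronized int version() {
        return version;
    }

    synchronized byte[] data() {
        return data.clone();
    }
}
```

Passing the version read earlier protects against overwriting a change made by another client in between, while -1 writes unconditionally.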

Basic operations

The following sample code performs basic operations on ZooKeeper to give you an intuitive feel for it. The listing creates a connection to the ZooKeeper server and carries out the most basic data operations:

Listing 2. Basic ZooKeeper operations

    // Create a connection to the server
    ZooKeeper zk = new ZooKeeper("localhost:" + CLIENT_PORT,
            ClientBase.CONNECTION_TIMEOUT, new Watcher() {
                // Monitor all triggered events
                public void process(WatchedEvent event) {
                    System.out.println("Triggered " + event.getType() + " event!");
                }
            });
    // Create a directory node
    zk.create("/testrootpath", "testrootdata".getBytes(),
            Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
    // Create a child directory node
    zk.create("/testrootpath/testchildpathone", "testchilddataone".getBytes(),
            Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
    System.out.println(new String(zk.getData("/testrootpath", false, null)));
    // Retrieve the list of child directory nodes
    System.out.println(zk.getChildren("/testrootpath", true));
    // Modify the data of the child directory node
    zk.setData("/testrootpath/testchildpathone", "modifychilddataone".getBytes(), -1);
    System.out.println("Directory node status: [" + zk.exists("/testrootpath", true) + "]");
    // Create another child directory node
    zk.create("/testrootpath/testchildpathtwo", "testchilddatatwo".getBytes(),
            Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
    System.out.println(new String(zk.getData("/testrootpath/testchildpathtwo", true, null)));
    // Delete the child directory nodes
    zk.delete("/testrootpath/testchildpathtwo", -1);
    zk.delete("/testrootpath/testchildpathone", -1);
    // Delete the parent directory node
    zk.delete("/testrootpath", -1);
    // Close the connection
    zk.close();

The output result is as follows:

    Triggered None event!
    testrootdata
    [testchildpathone]
    Directory node status: [5, 5,]
    Triggered NodeChildrenChanged event!
    testchilddatatwo
    Triggered NodeDeleted event!
    Triggered NodeDeleted event!

When monitoring is enabled for a directory node, the process method of the Watcher object is called as soon as the state of that directory node changes.


Typical application scenarios of ZooKeeper

From a design-pattern point of view, ZooKeeper is a distributed service management framework built on the observer pattern. It stores and manages the data that everyone cares about and accepts registrations from observers. When the state of that data changes, ZooKeeper notifies the observers registered with it so that they can respond, which makes cluster management schemes similar to Master/Slave possible. For details of the ZooKeeper architecture, see the ZooKeeper source code.

The following sections describe these typical application scenarios in detail, that is, the problems ZooKeeper can help solve.

Name Service

Distributed applications usually need a complete set of naming rules, one that produces unique names that are also easy for people to recognize and remember. A tree-shaped name structure is usually an ideal choice: it is a hierarchical directory structure that is user-friendly and free of duplicates. This may remind you of JNDI, and indeed the ZooKeeper name service resembles JNDI in that both associate a hierarchical directory structure with certain resources. The ZooKeeper name service is associated more loosely, though. You may not need to bind the name to a specific resource at all; you may only need a name that never repeats, much like a database generating a unique numeric primary key.

Name service is a built-in function of ZooKeeper, so you only need to call the ZooKeeper API. For example, calling the create interface creates a directory node.

Configuration Management

Configuration management is very common in distributed application environments. For example, the same application system may run on several PC servers, with some configuration items identical on all of them. To change those shared items, you have to modify every PC server running the application at the same time, which is tedious and error-prone.

Configuration information like this can be handed to ZooKeeper. Save the configuration in a directory node of ZooKeeper, and let every application machine that uses it monitor the state of that node. When the configuration changes, each application machine receives a notification from ZooKeeper, fetches the new configuration from ZooKeeper, and applies it to the system.
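The notify-then-refetch cycle described above can be sketched without a server. In this in-memory model, ConfigNodeSketch stands in for the configuration znode and ConfigClientSketch for one application machine; both names are hypothetical. The model assumes ZooKeeper's one-time watch behavior, in which a watch fires once and the client sets a new one when it reads the data again.

```java
import java.util.ArrayList;
import java.util.List;

// In-memory stand-in for one configuration znode and the watches set on it.
final class ConfigNodeSketch {
    private String value;
    private final List<Runnable> watchers = new ArrayList<>();

    ConfigNodeSketch(String initial) {
        this.value = initial;
    }

    // Like reading the data with a watch: return the value, register the watch.
    String readAndWatch(Runnable watcher) {
        watchers.add(watcher);
        return value;
    }

    // Like writing the data: store it, then fire each registered watch once.
    void update(String newValue) {
        value = newValue;
        List<Runnable> toNotify = new ArrayList<>(watchers);
        watchers.clear(); // watches are one-time triggers in this model
        for (Runnable w : toNotify) {
            w.run();
        }
    }
}

// Stand-in for one application machine that keeps its configuration current.
final class ConfigClientSketch {
    private final ConfigNodeSketch node;
    String current;

    ConfigClientSketch(ConfigNodeSketch node) {
        this.node = node;
        refresh();
    }

    // On notification, fetch the new value and set the watch again.
    private void refresh() {
        current = node.readAndWatch(this::refresh);
    }
}
```

Every client that read the node sees each later update, because the refresh step both fetches the value and re-arms the watch.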

Figure 2. Configuration Management Structure

Cluster Management (Group Membership)

ZooKeeper makes cluster management easy to implement. If several servers form a service cluster, a "manager" must know the service status of every machine in the cluster. Once a machine can no longer provide service, the other machines in the cluster must learn of it and adjust how service is allocated. Likewise, when the capacity of the cluster is increased by adding one or more servers, the "manager" must be informed.

ZooKeeper not only helps you maintain the service status of the machines in the cluster, it can also help you choose the "manager" that manages the cluster. This is another ZooKeeper function, leader election.

Both are implemented by having each server create a directory node of the EPHEMERAL type on ZooKeeper and then call the getChildren(String path, boolean watch) method on the parent directory node with watch set to true. Because the node is ephemeral, it is deleted when the server that created it dies, so the children of the parent change and the watch set by getChildren fires. The other servers thereby learn that a server has died. The same principle applies when a server is added.

Leader election with ZooKeeper means choosing a master server. As before, each server creates an EPHEMERAL directory node. The difference is that the node is also SEQUENTIAL, so it is an EPHEMERAL_SEQUENTIAL directory node. This type is used because it gives every server a number, and the server with the smallest number can be chosen as the master. If the server with the smallest number dies, its node is deleted because it is ephemeral, and a different node then holds the smallest number in the current node list. That node becomes the current master. The master is thus chosen dynamically, which avoids the single point of failure that a single fixed master traditionally suffers from.
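The selection step described above, picking the child with the smallest number, can be sketched on a plain list of child names so that it runs without a server. LeaderPick is a hypothetical helper, the node names in the usage are made up, and the sketch assumes that each name ends in a ten-digit sequence suffix.

```java
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper: chooses the master from the children of an election node.
final class LeaderPick {
    // Assumes each child name ends in a ten-digit, zero-padded sequence number.
    static int sequenceOf(String child) {
        return Integer.parseInt(child.substring(child.length() - 10));
    }

    // The child with the smallest sequence number is the current master.
    static String leaderOf(List<String> children) {
        return Collections.min(children, Comparator.comparingInt(LeaderPick::sequenceOf));
    }

    static boolean isLeader(String myNode, List<String> children) {
        return myNode.equals(leaderOf(children));
    }
}
```

When the master's ephemeral node disappears, calling leaderOf again on the new child list yields the next master, which is the dynamic choice the text describes.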

Figure 3. Cluster Management Structure

The sample code for this part is as follows. For the complete code, see the attachment:

Listing 3. Key Leader Election Code

    void findLeader() throws InterruptedException {
        byte[] leader = null;
        try {
            // Read the current leader and leave a watch on its node
            leader = zk.getData(root + "/leader", true, null);
        } catch (Exception e) {
            logger.error(e);
        }
        if (leader != null) {
            following();
        } else {
            String newLeader = null;
            try {
                // No leader yet: try to claim the role with an ephemeral node
                byte[] localhost = InetAddress.getLocalHost().getAddress();
                newLeader = zk.create(root + "/leader", localhost,
                        ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL);
            } catch (Exception e) {
                logger.error(e);
            }
            if (newLeader != null) {
                leading();
            } else {
                // wait() must be called while holding the monitor of mutex
                synchronized (mutex) {
                    mutex.wait();
                }
            }
        }
    }


Shared Locks

Shared locks are easy to implement within one process, but hard to implement across processes or between different servers. ZooKeeper makes this straightforward. A server that needs the lock creates an EPHEMERAL_SEQUENTIAL directory node and then calls the getChildren method to check whether the smallest directory node in the current list is the one it created. If it is, the server holds the lock. If not, the server calls the exists(String path, boolean watch) method to monitor changes in the directory node list on ZooKeeper, and keeps doing so until the node it created has the smallest number in the list, at which point it holds the lock. Releasing the lock is simple: the server just deletes the directory node it created earlier.

Figure 4. Flowchart of implementing locks with ZooKeeper

The synchronization lock implementation code is as follows. For the complete code, see the attachment:

Listing 4. Key code of the synchronization lock

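    void getLock() throws KeeperException, InterruptedException {
        List<String> list = zk.getChildren(root, false);
        String[] nodes = list.toArray(new String[list.size()]);
        // Sort so that the lowest-numbered node comes first
        Arrays.sort(nodes);
        if (myZnode.equals(root + "/" + nodes[0])) {
            // Our node is the smallest, so we hold the lock
            doAction();
        } else {
            waitForLock(nodes[0]);
        }
    }

    void waitForLock(String lower) throws InterruptedException, KeeperException {
        // Watch the node of the current lock holder
        Stat stat = zk.exists(root + "/" + lower, true);
        if (stat != null) {
            // wait() must be called while holding the monitor of mutex
            synchronized (mutex) {
                mutex.wait();
            }
        } else {
            // The holder is already gone, so check again
            getLock();
        }
    }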
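Listing 4 has every waiting server watch the smallest node. A common variation, which is not the approach of this article, has each waiter watch only the node immediately ahead of its own, so that a release wakes a single waiter. The sketch below shows just that ordering decision on a plain list of names. LockOrderSketch is a hypothetical helper, and it assumes all children share one prefix and a zero-padded sequence suffix, so that sorting them as strings matches their numeric order.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper for the "watch your predecessor" variation.
final class LockOrderSketch {
    // Returns null when myNode is the smallest child (the lock is held),
    // otherwise the child immediately ahead of myNode, which should be watched.
    static String nodeToWatch(String myNode, List<String> children) {
        List<String> sorted = new ArrayList<>(children);
        Collections.sort(sorted); // same prefix and padded suffix assumed
        int index = sorted.indexOf(myNode);
        if (index < 0) {
            throw new IllegalArgumentException("own node missing: " + myNode);
        }
        return index == 0 ? null : sorted.get(index - 1);
    }
}
```

A caller would pass the returned name to exists with a watch, and call nodeToWatch again after the watch fires.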

Queue Management

ZooKeeper can handle two types of queue:

  1. A queue that becomes usable only when all of its members have gathered; until then it waits for every member to arrive. This is a synchronization queue.
  2. A queue whose elements are enqueued and dequeued in FIFO order, for example to implement the producer and consumer model.

The synchronization queue is implemented with ZooKeeper as follows:

Create a parent directory /synchronizing. Each member sets a watch on whether the directory /synchronizing/start exists, and then joins the queue by creating a temporary directory node /synchronizing/member_i. Each member then fetches all directory nodes under /synchronizing, that is, the member_i nodes, and checks whether their count i has reached the number of members. If it is smaller than the number of members, the member waits for /synchronizing/start to appear. If it is equal, the member creates /synchronizing/start.

The following flowchart is easier to understand:

Figure 5. Synchronization queue Flowchart

The key code of the synchronization queue is as follows. For the complete code, see the attachment:

Listing 5. Synchronization queue

    void addQueue() throws KeeperException, InterruptedException {
        // Set a watch on the start node before joining
        zk.exists(root + "/start", true);
        // Join the queue with an ephemeral, numbered member node
        zk.create(root + "/" + name, new byte[0], Ids.OPEN_ACL_UNSAFE,
                CreateMode.EPHEMERAL_SEQUENTIAL);
        synchronized (mutex) {
            List<String> list = zk.getChildren(root, false);
            if (list.size() < size) {
                // Not everyone has arrived yet
                mutex.wait();
            } else {
                // Last member to arrive: signal the start
                zk.create(root + "/start", new byte[0], Ids.OPEN_ACL_UNSAFE,
                        CreateMode.PERSISTENT);
            }
        }
    }

While the queue is not full, the member enters wait() and waits for the notification from the watch. The watch code is as follows:

    public void process(WatchedEvent event) {
        // Compare from the constant side, because the event path can be null
        if ((root + "/start").equals(event.getPath())
                && event.getType() == Event.EventType.NodeCreated) {
            System.out.println("Get notification");
            super.process(event);
            doAction();
        }
    }

The FIFO queue is implemented with ZooKeeper as follows:

The idea is simple: under a specific directory, create child nodes /queue_i of the SEQUENTIAL type, so that every element joins the queue with a number. To dequeue, call the getChildren() method to get all elements currently in the queue and consume the one with the smallest number, which guarantees FIFO order.

The following sample code implements a queue in the producer and consumer form. For the complete code, see the attachment:

Listing 6. Producer code

    boolean produce(int i) throws KeeperException, InterruptedException {
        // Encode the value as a 4-byte payload
        ByteBuffer b = ByteBuffer.allocate(4);
        byte[] value;
        b.putInt(i);
        value = b.array();
        // Sequential mode appends an increasing number to "element"
        zk.create(root + "/element", value, ZooDefs.Ids.OPEN_ACL_UNSAFE,
                CreateMode.PERSISTENT_SEQUENTIAL);
        return true;
    }

Listing 7. Consumer Code

    int consume() throws KeeperException, InterruptedException {
        int retvalue = -1;
        Stat stat = null;
        while (true) {
            synchronized (mutex) {
                List<String> list = zk.getChildren(root, true);
                if (list.size() == 0) {
                    // Queue is empty: wait for the children watch to fire
                    mutex.wait();
                } else {
                    // Find the child with the smallest sequence number.
                    // Keep the full node name: the suffix is zero-padded and
                    // cannot be rebuilt from the parsed integer.
                    String minNode = list.get(0);
                    int min = Integer.parseInt(minNode.substring(7));
                    for (String s : list) {
                        int tempValue = Integer.parseInt(s.substring(7));
                        if (tempValue < min) {
                            min = tempValue;
                            minNode = s;
                        }
                    }
                    byte[] b = zk.getData(root + "/" + minNode, false, stat);
                    zk.delete(root + "/" + minNode, 0);
                    ByteBuffer buffer = ByteBuffer.wrap(b);
                    retvalue = buffer.getInt();
                    return retvalue;
                }
            }
        }
    }
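The producer and consumer exchange each int as a 4-byte payload through java.nio.ByteBuffer. The round trip can be checked in isolation with a minimal sketch; IntPayload is a hypothetical helper name, while the ByteBuffer calls are the same ones used in Listings 6 and 7.

```java
import java.nio.ByteBuffer;

// Hypothetical helper isolating the payload encoding used by the queue example.
final class IntPayload {
    // Producer side: pack the int into 4 bytes (ByteBuffer defaults to big-endian).
    static byte[] encode(int value) {
        return ByteBuffer.allocate(4).putInt(value).array();
    }

    // Consumer side: read the int back from the znode data.
    static int decode(byte[] bytes) {
        return ByteBuffer.wrap(bytes).getInt();
    }
}
```

Both sides must use the same byte order, which holds here because neither changes the ByteBuffer default.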



Summary

ZooKeeper, a sub-project of the Hadoop project, is an essential module for Hadoop cluster management. It is mainly used to control the data in the cluster. For example, it manages the NameNode in a Hadoop cluster, and it handles master election and state synchronization between servers in HBase.

This article introduced the basics of ZooKeeper and several typical application scenarios, which are its basic functions. Most importantly, ZooKeeper provides a good mechanism for distributed cluster management: a hierarchical, directory-tree data structure whose nodes it manages effectively. On that basis you can design many kinds of distributed data management models, not only the common scenarios described above.
