Zookeeper Distributed Services Framework Example


By this definition, ZooKeeper is a coordination system, and the object of that coordination is distributed systems. Why do distributed systems need a coordination system? The reasons are as follows:

Developing distributed systems is difficult, mainly because of "partial failure." When a message is sent between two nodes and the network fails, the sender cannot tell whether the receiver got the message, and the causes of the failure are varied: the receiver may have received the message before the network error occurred, may never have received it, or the receiver's process may have died. The only way for the sender to learn what actually happened is to reconnect to the receiver and ask it. This is the "partial failure" problem in distributed system development.

ZooKeeper is a framework for dealing with "partial failure" in distributed systems. ZooKeeper does not let a distributed system avoid partial failures; rather, it lets the system handle them correctly when they occur, so that the distributed system keeps running normally.

ZooKeeper Typical Application Scenarios
From a design-pattern perspective, ZooKeeper is a distributed service management framework based on the observer pattern. It stores and manages the data everyone cares about and accepts registrations from observers; once the state of that data changes, ZooKeeper notifies the registered observers so they can react, which makes it possible to manage a cluster in a master/slave-like mode. For ZooKeeper's detailed architecture and other internals, read the ZooKeeper source code.

The following describes these typical application scenarios in detail; in other words, which problems can ZooKeeper help us solve? The answers are given below.

Unified Naming Service (Name Service)
In distributed applications, a complete set of naming rules is often needed that both produces unique names and makes them easy for people to recognize and remember. A tree-shaped name structure is usually the ideal choice: it is a hierarchical directory structure that is both user-friendly and guaranteed not to produce duplicates. You may have thought of JNDI here, and indeed ZooKeeper's name service is similar to what JNDI can do; both associate a hierarchical directory structure with some resource. However, ZooKeeper's name service is broader: you may not need to associate a name with a specific resource at all, only to obtain a name that does not repeat, much like a database generating a unique numeric primary key.

The name service is a built-in ZooKeeper feature; you implement it simply by calling the ZooKeeper API. Calling the create interface is enough to create a directory node.
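As a small, hedged illustration of the "unique number" half of this (plain Java, no live ZooKeeper needed): when a znode is created with a SEQUENTIAL create mode, the ZooKeeper server appends a 10-digit, zero-padded, monotonically increasing counter to the requested path. The helper below only mimics that naming format so the scheme is easy to see; the class and method names are invented for illustration.

```java
// Illustration only: mimics the name a SEQUENTIAL znode would receive.
// ZooKeeper itself generates the counter server-side; this sketch just
// shows the 10-digit zero-padded suffix format.
class SequentialNames {
    // e.g. nextName("/app/name-", 7) -> "/app/name-0000000007"
    static String nextName(String prefix, int counter) {
        return String.format("%s%010d", prefix, counter);
    }
}
```

Because the suffix is zero-padded to a fixed width, the names sort lexicographically in creation order, which is what makes the election and queue recipes later in this article work.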

Configuration Management (Configuration Management)
Configuration management is common in distributed application environments. For example, the same application may run on more than one PC server, and all of those instances share the same configuration items. If you want to modify one of these shared items, you must modify every PC server running the application at the same time, which is troublesome and error-prone.

Configuration information like this can be handed to ZooKeeper to manage: save the configuration in a ZooKeeper directory node, then have every application machine that uses it watch that node's status. Once the configuration changes, each machine receives a notification from ZooKeeper and fetches the new configuration from ZooKeeper to apply to its system.
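The watch-and-notify cycle described above can be sketched in plain Java with the observer pattern. This is a minimal in-memory simulation, not the ZooKeeper API; the `ConfigNode` class and its methods are invented for illustration. Note one detail the sketch deliberately omits: real ZooKeeper watches are one-shot and must be re-registered after each notification.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Minimal in-memory sketch of the pattern: machines register a callback on a
// "config node"; when the value changes, every registered watcher is notified
// with the new value. (Real ZooKeeper watches fire once and must be reset.)
class ConfigNode {
    private String value;
    private final List<Consumer<String>> watchers = new ArrayList<>();

    ConfigNode(String initial) { this.value = initial; }

    void watch(Consumer<String> watcher) { watchers.add(watcher); }

    void setValue(String newValue) {
        this.value = newValue;
        for (Consumer<String> w : watchers) w.accept(newValue); // notify all
    }

    String getValue() { return value; }
}
```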

Configuration management structure diagram
Cluster Management (Group Membership)
ZooKeeper makes cluster management easy. If multiple servers form a service cluster, a "supervisor" must know the service status of every machine in the current cluster; once a machine stops providing service, the other machines in the cluster must learn of it so they can adjust the service allocation policy. Likewise, when the cluster's capacity is increased by adding one or more servers, the supervisor must know about that too.

ZooKeeper not only helps you maintain the service status of the machines in the current cluster, but also helps you choose a "supervisor" to manage the cluster. This is another ZooKeeper feature: leader election.

This is implemented by creating EPHEMERAL directory nodes on ZooKeeper. Each server then calls getChildren(String path, boolean watch) on the parent directory of the node it created, with watch set to true. Because these are EPHEMERAL nodes, a node is deleted when the server that created it dies; the child list therefore changes, the watch set by getChildren fires, and the other servers learn that some server has died. The same principle applies when a server is added.
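The supervisor's side of this can be sketched without a live cluster: detecting a dead server is just a set difference between the previous child list and the current one. The `Membership` class below is a hypothetical helper for illustration; the server names are made up.

```java
import java.util.HashSet;
import java.util.Set;

// Sketch of how a "supervisor" detects departures: each live server owns an
// ephemeral child node, so comparing the previous child list with the current
// one reveals which servers have disappeared.
class Membership {
    static Set<String> departed(Set<String> before, Set<String> after) {
        Set<String> gone = new HashSet<>(before);
        gone.removeAll(after);   // children no longer present => dead servers
        return gone;
    }
}
```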

How does ZooKeeper implement leader election, that is, elect a master server? As above, each server creates an EPHEMERAL directory node, but this time a sequential one: an EPHEMERAL_SEQUENTIAL directory node. The reason for EPHEMERAL_SEQUENTIAL is that it gives each server a number, so we can choose the server with the smallest current number as the master. If that smallest-numbered server dies, its node is deleted (because the node is ephemeral), a new smallest-numbered node appears in the current child list, and we select that node's server as the new master. This implements dynamic master selection and avoids the single point of failure that a traditional fixed master is prone to.
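The election rule itself reduces to "smallest sequence number wins," and because sequential znode suffixes are zero-padded, lexicographic order equals numeric order. The sketch below simulates the rule over a plain list of child names (the names and the `SmallestNodeElection` class are invented for illustration; no ZooKeeper connection is involved).

```java
import java.util.Collections;
import java.util.List;

// Sketch of the election rule: every server creates an EPHEMERAL_SEQUENTIAL
// node, and the server whose node has the smallest sequence number is master.
// When that node vanishes, re-running the rule elects the next master.
class SmallestNodeElection {
    // e.g. ["n-0000000005", "n-0000000002"] -> "n-0000000002"
    static String elect(List<String> children) {
        // Zero-padded suffixes make lexicographic min == numeric min.
        return Collections.min(children);
    }
}
```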


Cluster Management structure Chart
Part of the sample code follows; see the attachment for the complete code:
Leader election key code

The code is as follows:

    void findLeader() throws InterruptedException {
        byte[] leader = null;
        try {
            leader = zk.getData(root + "/leader", true, null);
        } catch (Exception e) {
            logger.error(e);
        }
        if (leader != null) {
            following();
        } else {
            String newLeader = null;
            try {
                byte[] localhost = InetAddress.getLocalHost().getAddress();
                newLeader = zk.create(root + "/leader", localhost,
                        ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL);
            } catch (Exception e) {
                logger.error(e);
            }
            if (newLeader != null) {
                leading();
            } else {
                mutex.wait();
            }
        }
    }

Shared Locks (Locks)
Shared locks are easy to implement within a single process but hard to implement across processes or between different servers. ZooKeeper makes this easy: the server that needs the lock creates an EPHEMERAL_SEQUENTIAL directory node, then calls the getChildren method to check whether the smallest node in the current child list is the node it created itself. If it created the smallest node, it holds the lock. If not, it calls exists(String path, boolean watch) to watch the child list on ZooKeeper until the node it created is the smallest-numbered node in the list, at which point it holds the lock. Releasing the lock is simply deleting the node it created earlier.
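The lock test at the heart of this recipe is a one-liner once the child list is in hand. The sketch below simulates it over plain strings (the `LockCheck` class and node names are invented for illustration); the real recipe additionally requires watching a node and waiting, which needs a live ZooKeeper and is shown in the article's own code below.

```java
import java.util.Collections;
import java.util.List;

// Sketch of the lock test: a client holds the lock only if the
// EPHEMERAL_SEQUENTIAL node it created is the smallest child of the lock
// directory; otherwise it should watch and wait.
class LockCheck {
    static boolean holdsLock(String myNode, List<String> children) {
        return myNode.equals(Collections.min(children));
    }
}
```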


ZooKeeper shared lock flowchart
The implementation code for the shared lock follows; see the attachment for the complete code:
Key code for the shared lock

The code is as follows:

    void getLock() throws KeeperException, InterruptedException {
        List<String> list = zk.getChildren(root, false);
        String[] nodes = list.toArray(new String[list.size()]);
        Arrays.sort(nodes);
        if (myZnode.equals(root + "/" + nodes[0])) {
            doAction();
        } else {
            waitForLock(nodes[0]);
        }
    }

    void waitForLock(String lower) throws InterruptedException, KeeperException {
        Stat stat = zk.exists(root + "/" + lower, true);
        if (stat != null) {
            mutex.wait();
        } else {
            getLock();
        }
    }

Queue Management
ZooKeeper can handle two types of queues:
A queue that becomes usable only when all of its members have gathered; otherwise it waits for all members to arrive. This is the synchronization queue.
A queue whose enqueue and dequeue operations follow FIFO order, used for example to implement the producer/consumer model.

A synchronization queue is implemented with ZooKeeper as follows:

Create a parent directory /synchronizing, and have each member watch for the existence of the flag directory /synchronizing/start (set watch). Each member then joins the queue by creating a temporary /synchronizing/member_i directory node, after which it fetches all directory nodes under /synchronizing, that is, the member_i nodes, and checks whether the count has reached the expected number of members. If it is still smaller, the member waits for /synchronizing/start to appear; if it is already equal, the member creates /synchronizing/start.
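Stripped of the znode mechanics, these steps form a counting barrier: the member whose arrival brings the count up to the expected size "creates start," releasing everyone. The in-memory sketch below shows only that counting logic; the `SyncBarrier` class is invented for illustration and a real implementation must also handle the watch notification, as in the article's code below.

```java
// In-memory sketch of the barrier logic: each member "creates a child node"
// (here, just increments a counter); the member that brings the count up to
// the expected size "creates /synchronizing/start", which releases everyone.
class SyncBarrier {
    private final int size;
    private int members = 0;
    private boolean started = false;

    SyncBarrier(int size) { this.size = size; }

    // A member joins; returns true once the barrier has tripped
    // (i.e. "start" now exists).
    boolean join() {
        members++;
        if (members >= size) started = true; // last member creates "start"
        return started;
    }

    boolean isStarted() { return started; }
}
```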

It is easier to understand with the following flowchart:


Synchronization Queue Flowchart
The key code for the synchronization queue is as follows; see the attachment for the complete code:
Synchronization queue

The code is as follows:

    void addQueue() throws KeeperException, InterruptedException {
        zk.exists(root + "/start", true);
        zk.create(root + "/" + name, new byte[0], Ids.OPEN_ACL_UNSAFE,
                CreateMode.EPHEMERAL_SEQUENTIAL);
        synchronized (mutex) {
            List<String> list = zk.getChildren(root, false);
            if (list.size() < size) {
                mutex.wait();
            } else {
                zk.create(root + "/start", new byte[0], Ids.OPEN_ACL_UNSAFE,
                        CreateMode.PERSISTENT);
            }
        }
    }

When the queue is not full, the member enters wait() and waits for the watch's notification. The watch code is as follows:

The code is as follows:

    public void process(WatchedEvent event) {
        if (event.getPath().equals(root + "/start") &&
                event.getType() == Event.EventType.NodeCreated) {
            System.out.println("received notification");
            super.process(event);
            doAction();
        }
    }

The idea for implementing a FIFO queue with ZooKeeper is as follows:

The idea is very simple: under a specific directory, create SEQUENTIAL child nodes /queue_i, so that every member joining the queue is numbered. To dequeue, call the getChildren() method to get all current queue elements, then consume the one with the smallest number; this guarantees FIFO order.

The following is sample code for a producer and a consumer using this queue; see the attachments for the complete code:

Producer Code

The code is as follows:

    boolean produce(int i) throws KeeperException, InterruptedException {
        ByteBuffer b = ByteBuffer.allocate(4);
        byte[] value;
        b.putInt(i);
        value = b.array();
        zk.create(root + "/element", value, ZooDefs.Ids.OPEN_ACL_UNSAFE,
                CreateMode.PERSISTENT_SEQUENTIAL);
        return true;
    }

Consumer Code

The code is as follows:

    int consume() throws KeeperException, InterruptedException {
        int retValue = -1;
        Stat stat = null;
        while (true) {
            synchronized (mutex) {
                List<String> list = zk.getChildren(root, true);
                if (list.size() == 0) {
                    mutex.wait();
                } else {
                    // Find the child with the smallest sequence number; the
                    // zero-padded suffix makes lexicographic order equal to
                    // numeric order.
                    String minNode = list.get(0);
                    for (String s : list) {
                        if (s.compareTo(minNode) < 0) minNode = s;
                    }
                    byte[] b = zk.getData(root + "/" + minNode, false, stat);
                    zk.delete(root + "/" + minNode, 0);
                    ByteBuffer buffer = ByteBuffer.wrap(b);
                    retValue = buffer.getInt();
                    return retValue;
                }
            }
        }
    }

This illustrates ZooKeeper's characteristics:

ZooKeeper is a streamlined file system. It is a bit like Hadoop in this respect, but ZooKeeper is a file system that manages small files, whereas Hadoop manages very large files.
ZooKeeper provides a rich set of "widgets" that enable many operations for coordinating data structures and protocols, for example distributed queues, distributed locks, and a "leader election" algorithm among a set of sibling nodes.
ZooKeeper is highly available, and its own stability is quite good; a distributed cluster can rely on a ZooKeeper cluster for management, and using ZooKeeper avoids single points of failure in a distributed system.
ZooKeeper uses a loosely coupled interaction pattern. This is most evident in the distributed locks ZooKeeper offers: ZooKeeper can serve as a rendezvous mechanism, so that participating processes that are unaware of each other (or of the network) can discover and interact with one another. They do not even need to exist at the same time: as long as one process leaves a message in ZooKeeper before it ends, another process can read that message later, decoupling the relationships between nodes.
ZooKeeper provides a shared repository for the cluster, where cluster members can read and write shared information centrally. This avoids having to program shared operations on every node and eases the development of distributed systems.
ZooKeeper's design uses the observer design pattern. ZooKeeper is primarily responsible for storing and managing the data you care about; it accepts observers' registrations, and once the data's state changes, ZooKeeper notifies the observers registered on it so they can react accordingly, implementing cluster management similar to the master/slave model.

All of this shows that ZooKeeper is helpful for distributed system development: it can make distributed systems more robust and efficient.
