Introduction to Curator, an Open-Source ZooKeeper Client Framework

Source: Internet
Author: User
Tags: zookeeper, client

Curator is a ZooKeeper client framework open-sourced by Netflix. While using ZooKeeper, Netflix found that the client bundled with ZooKeeper is too low-level: applications have to handle too many details themselves when using it. Netflix therefore wrapped it and provided a friendlier client framework on top. Having run into the same problems while using ZooKeeper ourselves, we began studying Curator, starting from its source code on GitHub, its wiki documentation, and the Netflix tech blog.

After reading the official documentation, I found that Curator mainly solves three kinds of problems:

    • It encapsulates the connection handling between the ZooKeeper client and the ZooKeeper server;
    • It provides a fluent-style operation API;
    • It provides abstractions for common ZooKeeper application scenarios (recipes, such as a shared lock service and cluster leader election).



Several problems encountered when using ZooKeeper directly, as enumerated by Curator:
Connection initialization: during the handshake between client and server, before the handshake completes, all synchronous methods (such as create, getData, etc.) throw an exception
Automatic recovery (failover): when the client loses its connection to one server and tries to connect to another, the client falls back to the initial connection mode
Session expiry: in extreme cases the ZooKeeper session expires, and the client must watch for this state and recreate the ZooKeeper instance
Handling of recoverable exceptions: when the server successfully creates a sequential znode but crashes before the node name is returned to the client, the client receives a recoverable exception, and users must catch such exceptions and retry themselves
Usage scenarios: ZooKeeper supports some standard usage scenarios, but the documentation for them is sparse and they are easy to use incorrectly, and some corner cases are not documented at all. For example, in a shared lock service, the server may create the ephemeral sequential node successfully but crash before the client receives the node name; if this case is not handled well, it leads to deadlock.

Curator reduces the complexity of using ZK mainly in the following ways:
Retry mechanism: a pluggable retry mechanism; a configured retry policy covers all recoverable exceptions, and several standard retry policies (such as exponential backoff) are provided out of the box.
Connection state monitoring: after initialization, Curator continuously monitors the ZK connection and reacts to state changes.
ZK client instance management: Curator manages the ZK client's connections to the server cluster and, when necessary, rebuilds the ZK instance to keep the connection to the cluster reliable.
Support for common usage scenarios: Curator implements most of the usage scenarios ZK supports (including some ZK itself does not directly support), following ZK best practices and handling corner cases.

Through all of this, Curator lets users focus on their own business logic instead of spending energy on ZK itself.

Some highlights Curator claims:

Logging
Uses SLF4J internally to emit logs
Uses a driver mechanism that allows log and trace handling to be extended and customized
Provides a TracerDriver interface; by implementing addTrace() and addCount(), users can integrate their own tracing framework

Compared with Curator, the drawbacks of another ZooKeeper client, zkclient (https://github.com/sgroschupf/zkclient):
Little documentation
Weak exception handling (it simply throws RuntimeException)
The retry handling is too hard to use
No implementations of common usage scenarios are provided

"Complaints" about the raw ZooKeeper client (the ZooKeeper class):
It is just a low-level implementation
You have to write a lot of code yourself
It is easy to misuse
You have to handle connection loss, retries, and so on yourself

Curator's components

    • Client: a replacement for the ZooKeeper client, providing some low-level handling and related utility methods.
    • Framework: simplifies using ZooKeeper's advanced features and adds new ones, such as managing connections to the ZooKeeper cluster and retry handling
    • Recipes: implements the common ZooKeeper recipes, built on top of Framework
    • Utilities: various ZooKeeper utility classes
    • Errors: exception handling, connection errors, recovery, etc.
    • Extensions: recipe extensions



Client
This is a low-level API; applications can mostly ignore it, and it is best to start directly with the Curator Framework.
It mainly consists of:
Uninterrupted connection management
Connection retry handling

Retry loop (RetryLoop)
A typical usage:

Java code

    RetryLoop retryLoop = client.newRetryLoop();
    while (retryLoop.shouldContinue())
    {
        try
        {
            // perform your work
            ...
            // it's important to re-get the ZK instance as there could have been
            // an error and the instance was re-created
            ZooKeeper zk = client.getZooKeeper();
            retryLoop.markComplete();
        }
        catch (Exception e)
        {
            retryLoop.takeException(e);
        }
    }


If the operation fails with a retryable failure and the retry count is still within the allowed limit, Curator will keep retrying until the operation eventually completes.

Another retry approach, using the Callable interface:

Java code

    RetryLoop.callWithRetry(client, new Callable<Void>()
    {
        @Override
        public Void call() throws Exception
        {
            // do your work here - it will get retried if needed
            return null;
        }
    });



Retry Policy
The RetryPolicy interface has only one method (earlier versions had two):
public boolean allowRetry(int retryCount, long elapsedTimeMs);
allowRetry is called before a retry starts; its parameters are the current retry count and the time the operation has consumed so far. If it returns true the retry proceeds, otherwise an exception is thrown.

Curator ships several retry policies:

    • ExponentialBackoffRetry: retries a specified number of times, with the pause between retries growing progressively longer
    • RetryNTimes: retries with a specified maximum retry count
    • RetryOneTime: retries only once
    • RetryUntilElapsed: retries until a specified amount of time has elapsed
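The policy contract above can be illustrated with a plain-Java sketch. This is a simplified, deterministic stand-in for an exponential-backoff policy, not Curator's actual ExponentialBackoffRetry source (which also randomizes the sleep): retries are allowed up to a maximum count, and the pause before retry n doubles each time.

```java
// Simplified illustration of the allowRetry(retryCount, elapsedTimeMs)
// contract with exponential backoff. NOT Curator's real implementation.
public class BackoffPolicySketch {
    private final int baseSleepMs;
    private final int maxRetries;

    public BackoffPolicySketch(int baseSleepMs, int maxRetries) {
        this.baseSleepMs = baseSleepMs;
        this.maxRetries = maxRetries;
    }

    // Called before each retry; returns false once the budget is exhausted.
    public boolean allowRetry(int retryCount, long elapsedTimeMs) {
        return retryCount < maxRetries;
    }

    // Pause before retry n doubles each time: base, 2*base, 4*base, ...
    public long sleepMsForRetry(int retryCount) {
        return (long) baseSleepMs << retryCount;
    }

    public static void main(String[] args) {
        BackoffPolicySketch policy = new BackoffPolicySketch(1000, 3);
        System.out.println(policy.allowRetry(0, 0L));    // true
        System.out.println(policy.allowRetry(3, 100L));  // false
        System.out.println(policy.sleepMsForRetry(2));   // 4000
    }
}
```

The real policies differ in which of these knobs they use: RetryNTimes fixes only the count, RetryUntilElapsed checks elapsedTimeMs instead.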



Framework
A higher-level abstraction over the ZooKeeper client API
Automatic connection management: when an exception occurs inside the ZooKeeper client, it automatically reconnects or retries, and the process is almost completely transparent to callers
Cleaner API: simplifies the raw ZooKeeper methods and events, and provides a fluent interface

The CuratorFrameworkFactory class provides two ways to construct a client: a factory method, newClient(), and a builder. newClient() creates an instance with defaults, while the builder lets you customize the instance. Once the CuratorFramework instance is built, call start() immediately, and call close() when the application shuts down. CuratorFramework is thread-safe; one CuratorFramework per ZK cluster can be shared across an application.

The CuratorFramework API uses a fluent interface: every operation returns a builder, and when all the pieces are chained together the whole call reads like a complete sentence. For example:

Java code

    client.create().forPath("/head", new byte[0]);
    client.delete().inBackground().forPath("/head");
    client.create().withMode(CreateMode.EPHEMERAL_SEQUENTIAL).forPath("/head/child", new byte[0]);
    client.getData().watched().inBackground().forPath("/test");
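The fluent style can be mimicked in plain Java. The sketch below is a toy builder (the names create, withMode, forPath mirror Curator's API shape, but the class performs no real ZooKeeper operations) showing how chained calls accumulate options until the terminal forPath() call:

```java
// Toy fluent builder illustrating the chained-call style Curator uses.
// No real ZooKeeper operations are performed; forPath() just reports
// the options that were accumulated along the chain.
public class FluentSketch {
    static class CreateOp {
        private String mode = "PERSISTENT";
        private boolean background = false;

        CreateOp withMode(String mode) { this.mode = mode; return this; }
        CreateOp inBackground() { this.background = true; return this; }

        // Terminal call: in real Curator this would issue the ZK request.
        String forPath(String path) {
            return "create " + path + " mode=" + mode + " background=" + background;
        }
    }

    static CreateOp create() { return new CreateOp(); }

    public static void main(String[] args) {
        String op = create().withMode("EPHEMERAL_SEQUENTIAL").forPath("/head/child");
        System.out.println(op); // create /head/child mode=EPHEMERAL_SEQUENTIAL background=false
    }
}
```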



Method descriptions:

    • create(): begins a create operation. Can be combined with other methods (mode, background) and ends with forPath()
    • delete(): begins a delete operation. Can be combined with other methods (version, background) and ends with forPath()
    • checkExists(): begins an operation that checks whether a znode exists. Can be combined with other methods (watch, background) and ends with forPath()
    • getData(): begins an operation that reads a znode's data. Can be combined with other methods (watch, background, get stat) and ends with forPath()
    • setData(): begins an operation that sets a znode's data. Can be combined with other methods (version, background) and ends with forPath()
    • getChildren(): begins an operation that lists a znode's children. Can be combined with other methods (watch, background, get stat) and ends with forPath()
    • inTransaction(): begins a ZooKeeper transaction. create, setData, check, and/or delete operations can be combined and then committed with commit()

Notifications (Notification)
Curator's code has been updated: the listener interface was renamed from ClientListener to CuratorListener, and the clientClosedDueToError method was removed from it. It now has only one method:
eventReceived(): called when a background operation completes or the specified watch fires.

The UnhandledErrorListener interface is used to handle exceptions.

CuratorEvent (ClientEvent in earlier versions) is a POJO wrapping the event that triggered an operation; which fields are populated depends on the event type, as follows:

CREATE      getResultCode() and getPath()
DELETE      getResultCode() and getPath()
EXISTS      getResultCode(), getPath() and getStat()
GET_DATA    getResultCode(), getPath(), getStat() and getData()
SET_DATA    getResultCode(), getPath() and getStat()
CHILDREN    getResultCode(), getPath(), getStat() and getChildren()
WATCHED     getWatchedEvent()



Namespaces (Namespace)
Because a ZK cluster is shared by multiple applications, each CuratorFramework instance can be assigned an optional namespace to avoid path conflicts between applications. The namespace is then automatically prepended as the root of every znode path the client creates. For example:

Java code

    CuratorFramework client = CuratorFrameworkFactory.builder().namespace("myapp") ... .build();
    ...
    client.create().forPath("/test", data);
    // the node is actually written to "/myapp/test"
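The namespace behavior amounts to prefixing every client-supplied path. A minimal sketch of that fix-up (a hypothetical helper, not Curator's internal code):

```java
// Hypothetical helper showing what a namespace does to client paths:
// the namespace becomes an implicit root for every operation.
public class NamespaceSketch {
    static String fixForNamespace(String namespace, String path) {
        if (namespace == null || namespace.isEmpty()) {
            return path;            // no namespace configured: path untouched
        }
        return "/" + namespace + path;  // "/test" -> "/myapp/test"
    }

    public static void main(String[] args) {
        System.out.println(fixForNamespace("myapp", "/test")); // /myapp/test
    }
}
```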




Recipes

Curator implements all of the ZooKeeper recipes (except two-phase commit).
Elections
Cluster leader election (leader election)

Lock services
Shared lock: a globally synchronized distributed lock; at any given moment, only one of the contending machines can hold the same lock.
Shared read-write lock: for distributed read-write mutual exclusion. It produces two locks, a read lock and a write lock; the read lock can be held by multiple processes, while the write lock is exclusive, and while the write lock is not held, multiple read-lock holders can read concurrently.
Shared semaphore: each JVM in the distributed system uses the same ZK lock path, which is associated with a given number of leases; each application acquires a lease in request order, which is the fairest way to use the lock service.
Multi-shared lock: internally composes multiple shared locks (each associated with a znode path). acquire() calls acquire() on all of the shared locks; if any fails part-way, all locks already acquired are released. release() calls release() on each internal lock (failures are ignored).
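The lock recipes above all rest on the same ZooKeeper trick: each contender creates an ephemeral sequential znode under the lock path, and the contender whose node has the lowest sequence number holds the lock. A plain-Java sketch of just that ordering step (the child names are hypothetical examples; no real ZK is involved):

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Sketch of the ordering step behind the lock recipes: sort the lock
// node's children by their ZooKeeper sequence suffix. The lowest one
// owns the lock; everyone else watches the node just ahead of theirs.
public class LockOrderSketch {
    static int sequenceOf(String child) {
        // ZK appends a 10-digit sequence, e.g. "lock-0000000003"
        return Integer.parseInt(child.substring(child.lastIndexOf('-') + 1));
    }

    static String currentOwner(List<String> children) {
        return Collections.min(children,
                (a, b) -> Integer.compare(sequenceOf(a), sequenceOf(b)));
    }

    public static void main(String[] args) {
        List<String> children =
                Arrays.asList("lock-0000000007", "lock-0000000003", "lock-0000000012");
        System.out.println(currentOwner(children)); // lock-0000000003
    }
}
```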

Queues (queue)
Distributed queue: a FIFO queue implemented with persistent sequential ZK nodes. If there are multiple consumers, a LeaderSelector can be used to preserve consumption order.
Distributed priority queue: the distributed version of a priority queue
BlockingQueueConsumer: the distributed version of the JDK's blocking queue

Barriers (Barrier)
Distributed barrier: a group of clients work on a group of tasks; only when all clients have finished can everyone continue
Double distributed barrier: the clients start together and finish together

Counters (Counter)
Shared counter: all clients watch the same znode path and share an up-to-date integer value
DistributedAtomicLong (and DistributedAtomicInteger): the distributed version of AtomicXxx. It first tries an optimistic update; if that fails it falls back to a mutex update, and a retry policy can be configured for retries
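The optimistic-then-mutex strategy can be sketched with JDK primitives. This is an analogy only, not DistributedAtomicLong's code: an AtomicLong stands in for the ZK-stored value, and a ReentrantLock stands in for the distributed mutex.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.ReentrantLock;

// Analogy for DistributedAtomicLong's strategy: a bounded number of
// optimistic compare-and-set attempts first, then fall back to taking
// a mutex for a guaranteed update.
public class OptimisticThenLockedSketch {
    private final AtomicLong value = new AtomicLong();
    private final ReentrantLock mutex = new ReentrantLock();

    long increment(int optimisticTries) {
        for (int i = 0; i < optimisticTries; i++) {
            long current = value.get();
            if (value.compareAndSet(current, current + 1)) {
                return current + 1;            // optimistic path succeeded
            }
        }
        mutex.lock();                          // fall back to the mutex
        try {
            return value.addAndGet(1);
        } finally {
            mutex.unlock();
        }
    }

    public static void main(String[] args) {
        OptimisticThenLockedSketch counter = new OptimisticThenLockedSketch();
        counter.increment(3);
        counter.increment(3);
        System.out.println(counter.increment(3)); // 3
    }
}
```

In the real recipe, "optimistic" means comparing the znode's version on write, and the fallback mutex is an InterProcessMutex over a ZK path.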

Utility classes

Path Cache
The Path Cache watches for changes to a znode's children; whenever a child is added, updated, or removed, the Path Cache updates its state and exposes the data and state of all children.
Curator handles the Path Cache with the PathChildrenCache class, and state changes are observed via a PathChildrenCacheListener.
For usage, see the TestPathChildrenCache test class.

Note: when data on the ZK server changes, the ZK client can momentarily be inconsistent, so a version number is needed to identify such state changes.

Test Server
Simulates a local in-process ZooKeeper server in tests.

Test Cluster
Simulates a ZooKeeper server cluster in tests.

ZKPaths utility class
Provides utility methods for znode path handling:

    • getNodeFromPath: gets the node name from a given path, e.g. "/one/two/three" → "three"
    • mkdirs: recursively creates all nodes on a given path
    • getSortedChildren: returns the children of the given path sorted by sequence number
    • makePath: builds a full path from a given parent path and a child node name
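The first and last of these utilities are easy to sketch in plain Java. These are simplified stand-ins illustrating the semantics, not ZKPaths' actual source:

```java
// Simplified stand-ins for two ZKPaths utilities; not the real source.
public class ZkPathsSketch {
    // getNodeFromPath: the node name is everything after the last '/'
    static String getNodeFromPath(String path) {
        return path.substring(path.lastIndexOf('/') + 1);
    }

    // makePath: join a parent path and a child name with exactly one '/'
    static String makePath(String parent, String child) {
        if (parent.endsWith("/")) {
            return parent + child;
        }
        return parent + "/" + child;
    }

    public static void main(String[] args) {
        System.out.println(getNodeFromPath("/one/two/three")); // three
        System.out.println(makePath("/one/two", "three"));     // /one/two/three
    }
}
```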



EnsurePath utility class

Best shown by example: no matter how many times it is called, the node-creation operation is performed only once.

Java code

    EnsurePath ensurePath = new EnsurePath(aFullPathToEnsure);
    ...
    String nodePath = aFullPathToEnsure + "/foo";
    ensurePath.ensure(zk);   // first time syncs and creates if needed
    zk.create(nodePath, ...);
    ...
    ensurePath.ensure(zk);   // subsequent times are NOPs
    zk.create(nodePath, ...);


Notification and event handling
Curator wraps the ZooKeeper event Watcher and implements its own listening mechanism. Several listener interfaces are provided to handle changes in the ZooKeeper connection state.
When an exception occurs, the connection is monitored via the ConnectionStateListener interface and handled accordingly. The state changes are:

    • SUSPENDED: the connection is lost and all operations are paused until it is re-established; if the connection cannot be re-established within the allotted time, a LOST notification fires
    • RECONNECTED: fires when a lost connection has been re-established
    • LOST: fires when the connection has timed out



From the com.netflix.curator.framework.imps.CuratorFrameworkImpl.validateConnection(CuratorEvent) method, we can see that Curator maps ZooKeeper's Disconnected, Expired, and SyncConnected states to the three states above, respectively.

http://macrochen.iteye.com/blog/1366136

