The Principles of ZooKeeper and Its Application in Hadoop and HBase


Brief Introduction

ZooKeeper is an open-source distributed coordination service created by Yahoo; it is an open-source implementation of Google's Chubby. Distributed applications can build on ZooKeeper to implement data publishing/subscription, load balancing, naming services, distributed coordination/notification, cluster management, master election, distributed locks, and distributed queues.

Basic Concepts

This section describes several core ZooKeeper concepts. They recur throughout the more in-depth discussion that follows, so it helps to understand them first.

Cluster Roles

ZooKeeper has three roles: Leader, Follower, and Observer.

A ZooKeeper cluster has only one leader at any given time; all other servers are followers or observers.

ZooKeeper configuration is simple: every node uses the same configuration file (zoo.cfg), and only the myid file differs between nodes. The value in myid must match the {value} part of that node's server.{value} entry in zoo.cfg.

Sample zoo.cfg content:

    maxClientCnxns=0
    # The number of milliseconds of each tick
    tickTime=2000
    # The number of ticks that the initial
    # synchronization phase can take
    initLimit=10
    # The number of ticks that can pass between
    # sending a request and getting an acknowledgement
    syncLimit=5
    # The directory where the snapshot is stored.
    dataDir=/var/lib/zookeeper/data
    # The port at which the clients will connect
    clientPort=2181
    # The directory where the transaction logs are stored.
    dataLogDir=/var/lib/zookeeper/logs
    server.1=192.168.20.101:2888:3888
    server.2=192.168.20.102:2888:3888
    server.3=192.168.20.103:2888:3888
    server.4=192.168.20.104:2888:3888
    server.5=192.168.20.105:2888:3888
    minSessionTimeout=4000
    maxSessionTimeout=100000
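The relationship between zoo.cfg and myid can be checked mechanically. The following is a minimal sketch, assuming the configuration is available as text; `parse_zoo_cfg` and `check_myid` are hypothetical helper names for illustration, not part of ZooKeeper.

```python
# Sketch only: parse zoo.cfg-style text and validate a myid value.
# parse_zoo_cfg and check_myid are illustrative names, not ZooKeeper APIs.

SAMPLE_CFG = """\
# The number of milliseconds of each tick
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/var/lib/zookeeper/data
clientPort=2181
server.1=192.168.20.101:2888:3888
server.2=192.168.20.102:2888:3888
server.3=192.168.20.103:2888:3888
"""


def parse_zoo_cfg(text):
    """Return (settings, servers); servers maps the N of server.N to its address."""
    settings, servers = {}, {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blank lines, comment lines, and malformed lines
        key, value = (part.strip() for part in line.split("=", 1))
        if key.startswith("server."):
            servers[int(key.split(".", 1)[1])] = value
        else:
            settings[key] = value
    return settings, servers


def check_myid(myid_text, servers):
    """True if the content of a myid file names one of the server.N entries."""
    return int(myid_text.strip()) in servers


settings, servers = parse_zoo_cfg(SAMPLE_CFG)
# initLimit is expressed in ticks, so the time window is tickTime * initLimit.
init_window_ms = int(settings["tickTime"]) * int(settings["initLimit"])
```

With the sample above, a myid file containing 3 passes the check, while one containing 7 does not, because there is no server.7 entry.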

Running zookeeper-server status in a terminal on a machine where ZooKeeper is installed shows which role the current node holds (leader or follower).

    [root@node-20-103 ~]# zookeeper-server status
    JMX enabled by default
    Using config: /etc/zookeeper/conf/zoo.cfg
    Mode: follower

    [root@node-20-104 ~]# zookeeper-server status
    JMX enabled by default
    Using config: /etc/zookeeper/conf/zoo.cfg
    Mode: leader

As shown above, node-20-104 is the leader and node-20-103 is a follower.

By default ZooKeeper has only the leader and follower roles; there are no observers.

To use observer mode, add the following line to the configuration file of every node that should become an observer:

    peerType=observer

Then, in the configuration files of all servers, append :observer to the server line of each node configured as an observer, for example:

    server.1=localhost:2888:3888:observer
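Because observers do not vote, the majority needed for elections and write acknowledgements is counted over the remaining servers only. The sketch below illustrates that arithmetic on a list of server lines; the host names and the helpers `split_members` and `quorum_size` are assumptions made for this example.

```python
# Sketch only: separate voting members from observers and compute the majority.
# Host names and helper names are illustrative assumptions.

SERVER_LINES = [
    "server.1=host1:2888:3888",
    "server.2=host2:2888:3888",
    "server.3=host3:2888:3888",
    "server.4=host4:2888:3888:observer",
    "server.5=host5:2888:3888:observer",
]


def split_members(server_lines):
    """Return (voters, observers) as lists of server ids."""
    voters, observers = [], []
    for line in server_lines:
        key, value = line.split("=", 1)
        server_id = int(key.split(".", 1)[1])
        if value.endswith(":observer"):
            observers.append(server_id)
        else:
            voters.append(server_id)
    return voters, observers


def quorum_size(voters):
    """Smallest number of voters that forms a strict majority."""
    return len(voters) // 2 + 1


voters, observers = split_members(SERVER_LINES)
majority = quorum_size(voters)
```

In this five-server example only three servers vote, so a majority is two; adding more observers would not change that number.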

All machines in the ZooKeeper cluster select one machine, called the leader, through a leader election process. The leader server provides both read and write services to clients.

Followers and observers serve reads but do not process writes themselves; they forward write requests to the leader. The only difference between them is that an observer takes part neither in leader election nor in the "more than half must succeed" strategy for write operations, so observers can improve the cluster's read performance without affecting write performance.

Sessions (session)

A session here means a client session. Before explaining it, consider the client connection: in ZooKeeper, a client connection is a long-lived TCP connection between a client and a ZooKeeper server. ZooKeeper's default client-facing port is 2181. When a client starts, it first establishes a TCP connection with a server, and the client session's life cycle begins with that first connection. Through this connection the client keeps the session alive with heartbeats, sends requests to the ZooKeeper server and receives responses, and receives watch event notifications from the server. The session's sessionTimeout value sets the timeout for a client session: when the connection is broken because of server load, a network failure, or the client disconnecting, the previously created session remains valid as long as the client reconnects to any server in the cluster within sessionTimeout.

Data Node (znode)

In distributed systems, "node" usually refers to each machine that makes up a cluster. A data node in ZooKeeper instead refers to a data unit in the data model, called a znode. ZooKeeper keeps all of its data in memory. The data model is a tree (the znode tree), and each path segmented by slashes (/) is a znode; in /hbase/master, for example, both hbase and master are znodes. Each znode stores its own data content along with a set of attributes.

Note:
A znode can be thought of as both a file and a directory in Unix: each znode can hold data itself (like a file) and can also have child znodes beneath it (like a directory).
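The "file and directory at once" idea can be illustrated with a small in-memory tree. This is only a sketch of the data model, not how ZooKeeper is implemented; the class names `ZNode` and `ZTree` and the sample data values are invented for the example.

```python
# Sketch only: an in-memory tree where every node holds data AND children.
# ZNode and ZTree are illustrative names, not ZooKeeper classes.

class ZNode:
    def __init__(self, data=b""):
        self.data = data        # like a file's content
        self.children = {}      # like a directory's entries


class ZTree:
    def __init__(self):
        self.root = ZNode()

    def _walk(self, path):
        """Follow a slash-separated path from the root; KeyError if missing."""
        node = self.root
        for name in (part for part in path.split("/") if part):
            node = node.children[name]
        return node

    def create(self, path, data=b""):
        """Create a node; the parent must already exist."""
        parent_path, _, name = path.rpartition("/")
        parent = self._walk(parent_path)
        if name in parent.children:
            raise KeyError(f"node already exists: {path}")
        parent.children[name] = ZNode(data)

    def get_data(self, path):
        return self._walk(path).data

    def get_children(self, path):
        return sorted(self._walk(path).children)


tree = ZTree()
tree.create("/hbase", b"cluster-meta")          # holds data itself...
tree.create("/hbase/master", b"host-a:16000")   # ...and also has a child
```

Here /hbase behaves like a file (it returns data) and like a directory (it lists master as a child).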

In ZooKeeper, znodes fall into two categories: persistent nodes and temporary (ephemeral) nodes.

Persistent Node

Once a persistent znode is created, it stays in ZooKeeper until it is explicitly deleted.

Temporary Node

The life cycle of a temporary node is bound to the client session: once the session becomes invalid, all temporary nodes created by that client are removed.
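The difference between the two categories can be sketched with a toy store that remembers which session owns each temporary node. The class `MiniStore`, the paths, and the session ids are hypothetical and serve only to illustrate the rule above.

```python
# Sketch only: temporary nodes disappear when their owning session ends.
# MiniStore is an illustrative name; paths and session ids are made up.

class MiniStore:
    def __init__(self):
        # path -> owning session id (None means the node is persistent)
        self.nodes = {}

    def create(self, path, session_id=None):
        self.nodes[path] = session_id

    def close_session(self, session_id):
        """Remove every temporary node owned by the session; return their paths."""
        if session_id is None:
            return []
        expired = [p for p, owner in self.nodes.items() if owner == session_id]
        for path in expired:
            del self.nodes[path]
        return sorted(expired)


store = MiniStore()
store.create("/app")                                   # persistent
store.create("/app/worker-a", session_id=0x1001)       # temporary
store.create("/app/worker-b", session_id=0x1002)       # temporary
removed = store.close_session(0x1001)
```

After the first session ends, only its own temporary node is gone; the persistent node and the other client's node remain.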

In addition, ZooKeeper lets users give a node a special attribute: SEQUENTIAL. When a node with this attribute is created, ZooKeeper automatically appends an integer to its name; the integer is an auto-incrementing counter maintained by the parent node.

Version

ZooKeeper stores data on each znode, and for each znode it maintains a data structure called Stat. Stat records three versions for the znode: version (the data version of the znode), cversion (the version of the znode's children), and aversion (the version of the znode's ACL).

Status Information

Besides its data content, each znode stores some state information about itself. The get command returns both the content and the state information of a znode, as follows:

 
    [zk: localhost:2181(CONNECTED)] get /yarn-leader-election/appcluster-yarn/ActiveBreadCrumb
    appcluster-yarnrm1
    cZxid = 0x1b00133dc0                  // created zxid: the transaction ID of the operation that created the znode
    ctime = Tue ... 15:44:42 CST 2017     // created time: when the znode was created
    mZxid = 0x1d00000063                  // modified zxid: the transaction ID of the last update to the znode
    mtime = Fri ... 08:44:25 CST 2017     // modified time: when the znode was last updated
    pZxid = 0x1b00133dc0                  // the transaction ID of the last change to the znode's list of children;
                                          // only changes to the child list affect pZxid, changes to child content do not
    cversion = 0                          // child node version number
    dataVersion = 11                      // data version number
    aclVersion = 0                        // ACL version number
    ephemeralOwner = 0x0                  // session ID of the session that created the node; 0 for a persistent node
    dataLength = 22                       // length of the data content
    numChildren = 0                       // number of child nodes
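The dataVersion field in the output above is what a conditional update is checked against: a writer states the version it last read, and the update is rejected if the znode has changed since. The sketch below simulates that check in memory. `VersionedNode` and `BadVersionError` are illustrative names; the convention that -1 skips the check mirrors how ZooKeeper clients request an unconditional update.

```python
# Sketch only: a version-checked update (compare-and-set) on a single node.
# VersionedNode and BadVersionError are illustrative names.

class BadVersionError(Exception):
    pass


class VersionedNode:
    def __init__(self, data):
        self.data = data
        self.version = 0

    def set_data(self, data, expected_version):
        """Update only if the caller saw the current version (-1 skips the check)."""
        if expected_version != -1 and expected_version != self.version:
            raise BadVersionError(
                f"expected version {expected_version}, current is {self.version}"
            )
        self.data = data
        self.version += 1
        return self.version


node = VersionedNode("v0")
seen_by_a = node.version          # client A reads version 0
seen_by_b = node.version          # client B reads version 0
node.set_data("from-a", seen_by_a)        # A wins; version becomes 1
try:
    node.set_data("from-b", seen_by_b)    # B's view is stale
    b_succeeded = True
except BadVersionError:
    b_succeeded = False
```

Client B has to re-read the node and retry with the new version, which is the usual optimistic-locking loop.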

In ZooKeeper, the version attribute is used to implement the "write check" of the optimistic locking mechanism, which ensures that updates to distributed data are atomic.

Transaction Operations

In ZooKeeper, an operation that can change the state of a ZooKeeper server is called a transaction operation. Transaction operations typically include creating and deleting data nodes, updating data content, and creating and expiring client sessions. For each transaction request, ZooKeeper assigns a globally unique transaction ID, the zxid, which is usually a 64-bit number. Each zxid corresponds to one update operation, so the zxids indirectly identify the global order in which ZooKeeper processed these transaction requests.

Watcher

The watcher (event listener) is a very important feature of ZooKeeper. ZooKeeper lets users register watchers on specified nodes, and when certain events are triggered, the ZooKeeper server notifies the interested clients of the event. This mechanism is an important part of how ZooKeeper implements distributed coordination.

ACL

ZooKeeper uses ACLs (Access Control Lists) for permission control. It defines the following five permissions:

  • CREATE: permission to create child nodes.
  • READ: permission to read the node's data and list its children.
  • WRITE: permission to update the node's data.
  • DELETE: permission to delete child nodes.
  • ADMIN: permission to set the node's ACL.
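The five permissions combine as bit flags. The following sketch models them with Python's `IntFlag`; the numeric values follow the constants in ZooKeeper's `ZooDefs.Perms`, while the `Perms` class and the `allowed` helper are written here purely for illustration.

```python
# Sketch only: the five ACL permissions as combinable bit flags.
# Values follow ZooDefs.Perms; Perms and allowed() are illustrative.
from enum import IntFlag


class Perms(IntFlag):
    READ = 1
    WRITE = 2
    CREATE = 4
    DELETE = 8
    ADMIN = 16


ALL = Perms.READ | Perms.WRITE | Perms.CREATE | Perms.DELETE | Perms.ADMIN


def allowed(granted, needed):
    """True if every bit in `needed` is present in `granted`."""
    return (granted & needed) == needed


read_write = Perms.READ | Perms.WRITE
can_update = allowed(read_write, Perms.WRITE)
can_delete_child = allowed(read_write, Perms.DELETE)
```

A client holding only READ and WRITE can update the node's data but cannot create or delete children beneath it.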

Note: CREATE and DELETE both control operations on child nodes.

Typical ZooKeeper Application Scenarios

ZooKeeper is a highly available framework for distributed data management and coordination. Because it is built on the ZAB algorithm, it can reliably guarantee data consistency in a distributed environment, which makes ZooKeeper a powerful tool for solving distributed consistency problems.

Data Publishing and Subscription (Configuration Center)

Data publishing and subscription is the so-called configuration center: as the name suggests, publishers write data to ZooKeeper nodes and subscribers subscribe to it, so that data can be obtained dynamically and configuration information can be centrally managed and dynamically updated.

In everyday application development we often meet this requirement: the system needs some common configuration information, such as a machine list or database configuration. Such global configuration typically has three characteristics:

  • The volume of data is usually small.
  • The content changes dynamically at run time.
  • It is shared by all machines in the cluster, and the configuration is consistent across them.

Such global configuration information can be published to ZooKeeper, and the clients (the machines in the cluster) subscribe to it.

A publish/subscribe system generally follows one of two design modes, push or pull:

  • Push: the server proactively sends data updates to all subscribed clients.
  • Pull: the client initiates requests for the latest data, usually by polling at regular intervals.

ZooKeeper combines push and pull, as follows:

The client registers with the server the nodes it is interested in. Once a node's data changes, the server sends a watcher event notification to the corresponding clients; after receiving the notification, the client must actively fetch the latest data from the server (hence the push-pull combination).

Naming Services (Naming Service)

Naming services are another common scenario in distributed systems. With a naming service, a client application can look up information such as the address or provider of a resource or service based on a specified name. A named entity can be a machine in a cluster, a provided service, a remote object, and so on; all of these can collectively be called names. A common example is the service address list in distributed service frameworks such as RPC and RMI. By creating sequential nodes in ZooKeeper, it is easy to create a globally unique path that can be used as a name.
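How sequential nodes yield unique names can be sketched with a parent that keeps an incrementing counter. ZooKeeper appends the counter to the requested name as a zero-padded 10-digit number; the class `SequentialParent` and the /jobs path are assumptions made for this example.

```python
# Sketch only: unique names from a per-parent, auto-incrementing counter.
# SequentialParent and the /jobs path are illustrative.

class SequentialParent:
    def __init__(self, path):
        self.path = path
        self._counter = 0       # maintained by the parent node

    def create_sequential(self, prefix):
        """Return a new unique child path built from prefix + padded counter."""
        name = f"{prefix}{self._counter:010d}"
        self._counter += 1
        return f"{self.path}/{name}"


jobs = SequentialParent("/jobs")
first = jobs.create_sequential("job-")
second = jobs.create_sequential("job-")
```

Because the counter belongs to the parent, two clients that ask for the same prefix still receive different paths, and the paths sort in creation order.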

The ZooKeeper naming service can thus be used to generate globally unique IDs.

Distributed Coordination/Notification

ZooKeeper's distinctive watcher registration and asynchronous notification mechanism supports notification and coordination between different machines, and even different systems, in a distributed environment, enabling real-time handling of data changes. Typically, different clients register a watcher on the same znode and listen for changes to it (both its own content and its children). When the znode changes, all subscribed clients receive the corresponding watcher notification and react accordingly.

ZooKeeper's distributed coordination/notification is a common way for machines in a distributed system to communicate.
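The notification flow described in this section can be simulated without a server. In the sketch below, several clients watch the same node; a change pushes a data-free event to each of them, and each client then pulls the new value and registers its watcher again, since a standard watch fires only once. `WatchedNode` and `Client` are hypothetical names used only for this illustration.

```python
# Sketch only: one-shot watchers with "push the event, pull the data".
# WatchedNode and Client are illustrative names, not ZooKeeper classes.

class WatchedNode:
    def __init__(self, data):
        self.data = data
        self._watchers = []

    def get_data(self, watcher=None):
        """Read the data and optionally leave a one-shot watcher behind."""
        if watcher is not None:
            self._watchers.append(watcher)
        return self.data

    def set_data(self, data):
        self.data = data
        # Watches fire once: detach them before notifying.
        fired, self._watchers = self._watchers, []
        for callback in fired:
            callback("NodeDataChanged")   # the event carries no data


class Client:
    def __init__(self, name, node):
        self.name = name
        self.node = node
        self.seen = [node.get_data(self.on_event)]   # initial read + watch

    def on_event(self, event_type):
        # Pull the latest value and re-register the watcher.
        self.seen.append(self.node.get_data(self.on_event))


config = WatchedNode("v1")
client_a = Client("a", config)
client_b = Client("b", config)
config.set_data("v2")
config.set_data("v3")
```

Both clients end up having seen every value in order, because each one re-registers its watcher while handling the previous notification.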
