1 Overview
ZooKeeper is a distributed service framework that began as a subproject of Apache Hadoop. It is mainly used to solve data management problems that distributed applications often encounter, such as a unified naming service, state synchronization, cluster management, and management of distributed application configuration items. ZooKeeper can be installed in standalone mode, but its strength lies in running as a cluster (one leader and multiple followers). The cluster uses leader election among its servers to stay stable and available, which in turn makes the distributed applications built on it more reliable. ZooKeeper maintains a hierarchical data structure that is very similar to a standard file system, as shown in the following illustration.
This data structure has the following characteristics:
Each entry in the tree, such as NameService, is called a znode, and each znode is uniquely identified by its path. For example, the znode Server1 is identified as /NameService/Server1.
A znode can have child znodes, and every znode can store data. Note that ephemeral znodes cannot have children.
Znode data is versioned: every change to the data stored in a znode increments that znode's version number, so a client can tell which version of the data it has read.
A znode can be ephemeral (temporary): once the client that created it loses contact with the server, the znode is deleted automatically. ZooKeeper clients and servers communicate over a long-lived connection that is kept alive by heartbeats. This connection state is called a session, and when the session of the client that created an ephemeral znode expires, the znode is deleted as well.
Znode names can be numbered automatically: when a znode is created as sequential, ZooKeeper appends an increasing counter to the requested name, so that if App1 already exists, the next one created is automatically named App2.
Znodes can be watched: a client can register a watch for changes to the data stored in a znode, changes to its children, and so on, and is notified once such a change occurs. This is the core feature of ZooKeeper, and many of its functions are built on it. Typical application scenarios are described later.
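To make the automatic numbering concrete: a real ZooKeeper server appends a zero-padded 10-digit counter, kept by the parent znode, to the name requested for a sequential znode. The sketch below only imitates that naming rule in plain shell and contacts no server. The helper `sequential_name`, the prefix `app` and the counter values are hypothetical, chosen for illustration.

```shell
#!/bin/sh
# Illustration only: mimics how ZooKeeper names sequential znodes.
# No ZooKeeper server is contacted; the counter value is passed in by hand.

# sequential_name PREFIX COUNTER
# Prints PREFIX followed by COUNTER zero-padded to 10 digits.
sequential_name() {
    printf '%s%010d\n' "$1" "$2"
}

# Three "creations" under the parent /NameService
for counter in 0 1 2; do
    echo "/NameService/$(sequential_name app "$counter")"
done
```

Because the suffix has a fixed width, sorting the child names alphabetically also sorts them by creation order.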
2 Environment Deployment
This ZooKeeper cluster is deployed on the Hadoop cluster set up in the previous article. The cluster configuration is as follows:
zookeeper1    rango    192.168.56.1
zookeeper2    vm2      192.168.56.102
zookeeper3    vm3      192.168.56.103
zookeeper4    vm4      192.168.56.104
zookeeper5    vm1      192.168.56.101
3 Installation and Configuration
3.1 Download and Install ZooKeeper
Download a ZooKeeper release (3.4.5 is used here) from the Apache website, extract it, and move it to /usr/zookeeper:
tar zxvf zookeeper-3.4.5.tar.gz
mv zookeeper-3.4.5 /usr/zookeeper
Set the owner of the ZooKeeper directory to hadoop:hadoop:
chown -R hadoop:hadoop /usr/zookeeper
PS: You can install and configure ZooKeeper on the master machine first, and then copy it to the other nodes of the cluster with scp (replace NODE_IP with the address of each node):
scp -r /usr/zookeeper NODE_IP:/usr
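Copying to several nodes by hand is easy to get wrong, so the step can be scripted. The following is a minimal dry-run sketch: it only prints one scp command per node and executes nothing. The `NODES` list is an assumption taken from the cluster table above (every host except rango), and `print_copy_commands` is a hypothetical helper name.

```shell
#!/bin/sh
# Dry run: print the scp command for every other node instead of executing it.
# NODES is an assumption taken from the cluster table (all hosts except rango).
NODES="192.168.56.101 192.168.56.102 192.168.56.103 192.168.56.104"

# print_copy_commands SRC_DIR DEST_PARENT
# Prints one "scp -r" line per node in NODES.
print_copy_commands() {
    for ip in $NODES; do
        echo "scp -r $1 $ip:$2"
    done
}

print_copy_commands /usr/zookeeper /usr
```

Once the printed commands look right, they can be run by piping the output to sh.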
3.2 Configure ZooKeeper
3.2.1 Create a Data Directory
Execute on all cluster machines:
mkdir /var/lib/zookeeper
3.2.2 Configure Environment Variables
vim /etc/profile:
# Set ZooKeeper path
export ZOOKEEPER_HOME=/usr/zookeeper
export PATH=$PATH:$ZOOKEEPER_HOME/bin
3.2.3 Configure the ZooKeeper Cluster
cp /usr/zookeeper/conf/zoo_sample.cfg /usr/zookeeper/conf/zoo.cfg
vim /usr/zookeeper/conf/zoo.cfg:
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# example sakes.
dataDir=/var/lib/zookeeper
# the port at which the clients will connect
clientPort=2181
#
# Be sure to read the maintenance section of the
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1
server.1=192.168.56.1:2888:3888
server.2=192.168.56.102:2888:3888
server.3=192.168.56.103:2888:3888
server.4=192.168.56.104:2888:3888
server.5=192.168.56.101:2888:3888
Notes:
tickTime: the heartbeat interval, in milliseconds.
initLimit and syncLimit: both are measured in ticks, that is, in multiples of tickTime (with the values above, initLimit is 10*2000 ms = 20 s). initLimit sets the time allowed for the followers to connect to and synchronize with the leader. If more than half of the followers fail to complete synchronization within this time, the leader gives up its leadership and another leader election takes place. If the logs show that this happens frequently, the value is too small.
syncLimit sets the time allowed for a follower to synchronize with the leader (with the values above, 5*2000 ms = 10 s). If a follower fails to complete synchronization within this time, it restarts itself, and all clients connected to it reconnect to another follower.
dataDir: the directory where ZooKeeper keeps its persistent data. ZooKeeper holds two kinds of data, transient and persistent, and its logs are also saved here.
server.A=B:C:D: A is a number that identifies the server; B is the IP address of the server; C is the port the server uses to exchange information with the leader of the cluster; D is the port the servers use to communicate with each other during leader election, which takes place when the current leader fails and a new one must be chosen. In a pseudo-cluster configuration (several instances on one machine) B is the same for every instance, so each instance must be assigned different port numbers.
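The arithmetic and the server.A=B:C:D layout described in these notes can be checked with a small script. This is a sketch under stated assumptions: the helper names `cfg_value`, `limit_seconds` and `describe_servers` are hypothetical, and the script works on a temporary sample file rather than on the real zoo.cfg.

```shell
#!/bin/sh
# Sketch: read the settings discussed in the notes from a zoo.cfg-style file.
# A temporary sample file is used so nothing under /usr/zookeeper is touched.

# cfg_value FILE KEY -> value of the KEY=value line (last match wins)
cfg_value() {
    sed -n "s/^$2=//p" "$1" | tail -n 1
}

# limit_seconds FILE KEY -> KEY (in ticks) * tickTime (in ms), in whole seconds
limit_seconds() {
    ticks=$(cfg_value "$1" "$2")
    tick_ms=$(cfg_value "$1" tickTime)
    echo $(( ticks * tick_ms / 1000 ))
}

# describe_servers FILE -> one labelled line per server.A=B:C:D entry
describe_servers() {
    sed -n 's/^server\.\([0-9][0-9]*\)=\([^:]*\):\([0-9]*\):\([0-9]*\)$/id=\1 host=\2 leader_port=\3 election_port=\4/p' "$1"
}

sample=$(mktemp)
printf '%s\n' \
    'tickTime=2000' \
    'initLimit=10' \
    'syncLimit=5' \
    'server.1=192.168.56.1:2888:3888' \
    'server.2=192.168.56.102:2888:3888' > "$sample"

echo "initLimit: $(limit_seconds "$sample" initLimit) s"
echo "syncLimit: $(limit_seconds "$sample" syncLimit) s"
describe_servers "$sample"
rm -f "$sample"
```

Pointing the same functions at /usr/zookeeper/conf/zoo.cfg gives a quick sanity check of the file before it is copied to the other nodes.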
Create a myid file in the data directory on each server. Its content is that server's ID, that is, the A of its server.A entry (replace ID with that number):
echo ID > /var/lib/zookeeper/myid
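Since the ID must match the server.A entry for the machine's own address, it can be looked up in zoo.cfg instead of being typed by hand. The sketch below assumes the entries use IP addresses, as in the configuration above. `server_id` and `write_myid` are hypothetical helper names, and the example runs on a scratch directory rather than on /var/lib/zookeeper.

```shell
#!/bin/sh
# Sketch: look up a host's ID in zoo.cfg by its IP address and write myid.
# Paths are passed as arguments so the functions can be tried on scratch files.

# server_id CFG_FILE IP -> the A of the server.A=IP:... entry, empty if none
server_id() {
    awk -F= -v ip="$2" '
        /^server\.[0-9]+=/ {
            split($2, addr, ":")          # addr[1] is the host part B
            if (addr[1] == ip) {
                sub(/^server\./, "", $1)  # keep only the number A
                print $1
                exit
            }
        }' "$1"
}

# write_myid CFG_FILE IP DATA_DIR -> writes DATA_DIR/myid, fails if IP is unknown
write_myid() {
    id=$(server_id "$1" "$2")
    [ -n "$id" ] || { echo "no server entry for $2" >&2; return 1; }
    echo "$id" > "$3/myid"    # '>' overwrites, so the file holds exactly one ID
}

# Example on a scratch directory. On a real node the arguments would be
# /usr/zookeeper/conf/zoo.cfg, that node's IP and /var/lib/zookeeper.
scratch=$(mktemp -d)
printf 'server.1=192.168.56.1:2888:3888\nserver.2=192.168.56.102:2888:3888\n' > "$scratch/zoo.cfg"
write_myid "$scratch/zoo.cfg" 192.168.56.102 "$scratch"
cat "$scratch/myid"
rm -rf "$scratch"
```

The comparison is an exact string match, so 192.168.56.1 is not confused with 192.168.56.101.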
3.3 Start and Stop the ZooKeeper Service
Start ZooKeeper by running zkServer.sh start on every node of the cluster:
[root@rango ~]# zkServer.sh start
JMX enabled by default
Using config: /usr/zookeeper/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
Check the status with zkServer.sh status:
[root@rango ~]# zkServer.sh status
JMX enabled by default
Using config: /usr/zookeeper/bin/../conf/zoo.cfg
Mode: follower
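When checking several nodes, only the Mode line of this output matters. The sketch below extracts it from status text read on standard input. `status_mode` is a hypothetical helper, and the sample text is copied from the output shown above so that the function can be tried without a running server.

```shell
#!/bin/sh
# Sketch: pull the role out of the text printed by "zkServer.sh status".
# The sample below is copied from the article's output; no server is needed.

# status_mode: reads status output on stdin and prints the value after "Mode:"
status_mode() {
    sed -n 's/^Mode:[[:space:]]*//p'
}

sample_output='JMX enabled by default
Using config: /usr/zookeeper/bin/../conf/zoo.cfg
Mode: follower'

printf '%s\n' "$sample_output" | status_mode
```

On a cluster node the function would be fed with zkServer.sh status 2>&1 | status_mode. In a healthy cluster exactly one node reports leader and the others report follower.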
PS: Stop iptables before starting ZooKeeper (acceptable on an intranet), otherwise the servers may be unable to reach each other.
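Where switching the firewall off entirely is not wanted, a possible alternative is to accept traffic only on the three ports used in zoo.cfg. This is a sketch and not part of the original procedure: it only prints candidate iptables rules for review and applies nothing. `print_firewall_rules` is a hypothetical helper, and the port list assumes the values configured above.

```shell
#!/bin/sh
# Alternative to stopping iptables: print rules that accept TCP traffic
# on the three ports used in zoo.cfg. Nothing is applied; review first.
ZK_PORTS="2181 2888 3888"   # clientPort, leader port, election port

# print_firewall_rules -> one "iptables -I INPUT ..." line per port
print_firewall_rules() {
    for port in $ZK_PORTS; do
        echo "iptables -I INPUT -p tcp --dport $port -j ACCEPT"
    done
}

print_firewall_rules
```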