Zookeeper Concept
ZooKeeper is an open-source, distributed coordination service for distributed applications. It exposes a simple set of primitives that distributed applications can build on to implement higher-level services such as synchronization, configuration maintenance, and naming. ZooKeeper started as a sub-project of Hadoop and has since become a top-level Apache project. Distributed applications need a reliable, scalable, distributed, and configurable coordination mechanism to keep the state of the system consistent, because hand-rolled lock mechanisms are easy to get wrong and message-based coordination is a poor fit for some applications. That is the problem ZooKeeper is meant to solve.
1. Roles
There are three main categories of roles in a ZooKeeper ensemble, summarized in the following table:

| Role | Description |
| --- | --- |
| Leader | Handles write requests, drives voting, and broadcasts state updates to the ensemble. |
| Follower | Serves read requests, forwards write requests to the Leader, and votes in Leader elections. |
| Observer | Serves reads and forwards writes like a Follower, but does not vote; used to scale read throughput without enlarging the quorum. |
System model: (diagram not reproduced here)
2. Design Purpose
- 1. Eventual consistency: no matter which server a client connects to, it is presented with the same view of the system; this is ZooKeeper's most important guarantee.
- 2. Reliability: simple, robust, and with good performance; if a message M is accepted by one server, it will be accepted by all servers.
- 3. Timeliness: ZooKeeper guarantees that a client learns of server updates, or of server failure, within a bounded time interval. Because of network delays and similar factors, it cannot guarantee that two clients see a newly updated value at exactly the same moment; a client that needs the latest data should call the sync() interface before reading.
- 4. Wait-free: a slow or failed client must not hold up the requests of fast clients, so every client can make progress.
- 5. Atomicity: an update either succeeds or fails; there is no intermediate state.
- 6. Ordering: this includes total order and causal order. Total order: if message A is published before message B on one server, then A is published before B on every server. Causal order: if message B is published by the same sender after message A, then A precedes B.
Zookeeper Cluster Construction
The following briefly introduces how to build a ZooKeeper cluster; installing a single standalone ZooKeeper is simpler, so the cluster case is used as the example.
We will set up and deploy a ZooKeeper ensemble with three nodes, following the steps below to start the ZooKeeper server on each node.
1. Environmental preparedness
- An odd number (2n+1) of test servers
192.168.181.128 centos6.4
192.168.181.129 centos6.4
192.168.181.130 centos6.4
- Download Zookeeper installation package
http://mirrors.cnnic.cn/apache/zookeeper/zookeeper-3.4.8/
192.168.181.128 test1
192.168.181.129 test2
192.168.181.130 test3
- Install the JDK, unpack the installation archive, and check the firewall settings
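The requirement for an odd (2n+1) number of servers comes from majority quorum: a write commit or a Leader election needs floor(N/2)+1 votes, so adding a fourth server raises the quorum without letting the cluster survive any more failures. A minimal sketch of the arithmetic:

```shell
# Majority quorum size for an ensemble of N servers: floor(N/2) + 1.
quorum() { echo $(( $1 / 2 + 1 )); }

# Failures the ensemble can tolerate while still keeping a quorum.
tolerated() { echo $(( $1 - ($1 / 2 + 1) )); }

for n in 3 4 5; do
  echo "servers=$n quorum=$(quorum $n) tolerated=$(tolerated $n)"
done
```

Note that 3 and 4 servers both tolerate only one failure, which is why even ensemble sizes buy nothing.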
2. Design the installation directory
- Installation directory: /home/rtmap
- Data root: /data/zookeeper
- Snapshot storage (dataDir): /data/zookeeper/zkdata
- Transaction logs (dataLogDir): /data/zookeeper/zkdatalog
- Application logs: /data/zookeeper/logs
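The layout above can be created on each node with a few mkdir calls. In this sketch the base path is parameterized so it is safe to try anywhere; on the real servers BASE would be /data/zookeeper:

```shell
# Base path for ZooKeeper data; defaults to a scratch directory here so the
# sketch can run anywhere. On the real nodes: BASE=/data/zookeeper
BASE=${BASE:-$(mktemp -d)/zookeeper}

mkdir -p "$BASE/zkdata"     # snapshot storage (dataDir)
mkdir -p "$BASE/zkdatalog"  # transaction logs (dataLogDir)
mkdir -p "$BASE/logs"       # application logs

ls "$BASE"
```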
3. Modify the configuration file
```shell
# go to the conf directory
cd /home/rtmap/zookeeper-3.4.8/conf
# list the files
[root@192.168.181.128 conf]$ ll
-rw-rw-r--. 1  535 configuration.xsl
-rw-rw-r--. 1 2161 log4j.properties
-rw-rw-r--. 1  922 zoo_sample.cfg
```
zoo_sample.cfg is the official template configuration file shipped with ZooKeeper. Make a copy of it named zoo.cfg; zoo.cfg is the file name ZooKeeper officially expects.
```shell
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/data/zookeeper/zkdata
dataLogDir=/data/zookeeper/zkdatalog
# client service port (value elided in the source; 2181 is ZooKeeper's default)
clientPort=2181
server.1=192.168.181.128:2888:3888
server.2=192.168.181.129:2888:3888
server.3=192.168.181.130:2888:3888
```

In `server.1`, the `1` is the server ID (other numbers work too); it identifies this server within the ensemble and must be written into the `myid` file under the snapshot directory. `192.168.181.128` is the server's IP address in the cluster. The first port (default 2888) is used for communication between the Leader and the Followers; the second port (default 3888) is used for Leader election, both when the cluster first starts and when the Leader goes down.
Configuration file Explanation:
- tickTime: the heartbeat interval, in milliseconds, between ZooKeeper servers and between clients and servers; a heartbeat is sent every tickTime.
- initLimit: the maximum number of heartbeat intervals (tickTimes) that may elapse while a Follower initially connects and syncs with the Leader ("client" here means a Follower inside the ensemble, not an application client connecting to ZooKeeper). If the Leader has not received a response after initLimit heartbeats, the connection fails. With the values above this is 10 * 2000 ms = 20 seconds.
- syncLimit: the maximum number of tickTimes allowed for a request and its response between Leader and Follower. With the values above this is 5 * 2000 ms = 10 seconds.
- dataDir: the storage path for snapshots.
- dataLogDir: the storage path for transaction logs. If it is not configured, transaction logs are written under dataDir, which seriously hurts ZooKeeper's performance: at high throughput the transaction logs and snapshots grow large and contend for the same disk.
- clientPort: the port on which ZooKeeper listens for client connections; it can be changed to another port if the default conflicts with something else.
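Both limits are multiples of tickTime; a quick sketch of the arithmetic with the values used in this zoo.cfg:

```shell
# Values from the zoo.cfg above (tickTime is in milliseconds).
tickTime=2000
initLimit=10
syncLimit=5

# Follower -> Leader initial connect/sync window: initLimit * tickTime.
echo "init window: $(( initLimit * tickTime )) ms"
# Leader <-> Follower request/response window: syncLimit * tickTime.
echo "sync window: $(( syncLimit * tickTime )) ms"
```

This prints 20000 ms (20 s) for the initial sync window and 10000 ms (10 s) for the request/response window.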
Under the dataDir directory, /data/zookeeper/zkdata, write a myid file on each server with the following commands:
```shell
# server1
echo "1" > /data/zookeeper/zkdata/myid
# server2
echo "2" > /data/zookeeper/zkdata/myid
# server3
echo "3" > /data/zookeeper/zkdata/myid
```
Note: this ID identifies the ZooKeeper host. Each host's ID is different (the second server is 2, the third is 3), and the IDs correspond to server.1, server.2, and server.3 in the zoo.cfg above.
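The note above can be sketched as a tiny script; DATADIR is parameterized here for illustration (on the real nodes it is /data/zookeeper/zkdata), and each host writes only its own number:

```shell
# Each server writes its own ID into myid; the number must match that
# host's server.N entry in zoo.cfg.
DATADIR=${DATADIR:-$(mktemp -d)}   # /data/zookeeper/zkdata on the real nodes

# On test1 (server.1); test2 would write 2, test3 would write 3.
echo "1" > "$DATADIR/myid"

cat "$DATADIR/myid"
```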
4. Modify the log
If you change nothing, ZooKeeper's log output goes to the zookeeper.out file by default, so its path and size cannot be controlled because the log file is never rotated. You should therefore change the log output configuration. Here is how:
1. Modify the zkEnv.sh file in the $ZOOKEEPER_HOME/bin directory: ZOO_LOG_DIR specifies the directory logs are written to, and ZOO_LOG4J_PROP selects the INFO,ROLLINGFILE log appender.
2. Modify the $ZOOKEEPER_HOME/conf/log4j.properties file: the value of zookeeper.root.logger must be consistent with ZOO_LOG4J_PROP from the previous file. This configuration rotates the log when it reaches a given file size; to rotate by day instead, change the appender to DailyRollingFileAppender.
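As a sketch, and assuming the log4j configuration shipped with ZooKeeper 3.4.x, the two edits look roughly like this (the directory value is this document's log path):

```shell
# In $ZOOKEEPER_HOME/bin/zkEnv.sh:
ZOO_LOG_DIR=/data/zookeeper/logs
ZOO_LOG4J_PROP="INFO,ROLLINGFILE"

# In $ZOOKEEPER_HOME/conf/log4j.properties (properties syntax, shown as
# comments here; ROLLINGFILE rotates by size, DailyRollingFileAppender by day):
#   zookeeper.root.logger=INFO,ROLLINGFILE
#   log4j.appender.ROLLINGFILE=org.apache.log4j.DailyRollingFileAppender
```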
5. Start the service and view
```shell
# go to the ZooKeeper bin directory
cd /home/rtmap/zookeeper-3.4.8/bin
# start the service (run this on all 3 machines)
./zkServer.sh start
```
Note: if you check the log after starting only the first server, you may see errors; that is simply because the other two servers' 3888 ports are not reachable yet.
```shell
# check the server status
/home/rtmap/zookeeper-3.4.8/bin/zkServer.sh status
# Using config: ../conf/zoo.cfg
# Mode: follower        <- shows whether this node is the leader or a follower
```
Note: a ZooKeeper cluster generally has exactly one Leader and multiple Followers. The Leader handles clients' write requests, while the Followers serve reads and synchronize data from the Leader; when the Leader goes down, the Followers vote to elect a new Leader from among themselves.
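Besides zkServer.sh status, ZooKeeper also answers the four-letter `stat` command on the client port (e.g. `echo stat | nc 192.168.181.128 2181`), whose reply includes the node's Mode. A small sketch that extracts the role from such a reply, demonstrated on a sample string since no live cluster is assumed here:

```shell
# check_mode: pull the role out of a ZooKeeper "stat" reply.
# Against a live node: echo stat | nc 192.168.181.128 2181 | check_mode
check_mode() {
  grep '^Mode:' | awk '{print $2}'
}

# Demo on a sample reply (a live server prints similar text):
printf 'Zookeeper version: 3.4.8\nMode: follower\nNode count: 4\n' | check_mode
# -> follower
```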
Reference document: http://www.cnblogs.com/luotianshuai/p/5206662.html