I. Basic concepts of ZooKeeper
1. What is ZooKeeper?
ZooKeeper website: http://zookeeper.apache.org/
ZooKeeper official documentation: http://zookeeper.apache.org/doc/trunk/index.html
ZooKeeper began as a sub-project of Hadoop and is now a top-level Apache project. It is a reliable coordination system for large distributed systems, providing configuration maintenance, naming services, distributed synchronization, group services, and similar functions. Its goal is to encapsulate complex, error-prone key services and give users an easy-to-use interface backed by an efficient, functionally stable system.
One of the most common uses of ZooKeeper is as a registry for service producers and service consumers. Service producers register their services with ZooKeeper, and service consumers look services up in ZooKeeper when making a call. Once a consumer has the details of a service producer, it calls that producer for content and data. A simple example diagram is as follows:
2. ZooKeeper design objectives:
ZooKeeper allows distributed processes to coordinate with each other through a shared hierarchical namespace, similar to a standard file system. The namespace is made up of data registers, called znodes in ZooKeeper terms, which resemble files and directories. Unlike typical file systems designed for storage, ZooKeeper keeps its data in memory, which means it can achieve high throughput and low latency.
The ZooKeeper hierarchical namespace is as follows:
With this tree-structured data model, it is easy to locate a specific service.
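As a rough illustration of the hierarchical namespace, the sketch below models znodes as an in-memory tree addressed by slash-separated paths. It is a toy model in plain Python, not the ZooKeeper client API; the class name ZNodeTree and the example paths and address are hypothetical.

```python
class ZNodeTree:
    """Toy in-memory model of a hierarchical znode namespace (not the real ZooKeeper API)."""

    def __init__(self):
        # Maps a full path to the data stored at that node; the root always exists.
        self._data = {"/": b""}

    def create(self, path, data=b""):
        # As in a file system, a node can only be created under an existing parent.
        parent = path.rsplit("/", 1)[0] or "/"
        if parent not in self._data:
            raise KeyError(f"parent {parent} does not exist")
        if path in self._data:
            raise ValueError(f"{path} already exists")
        self._data[path] = data

    def get(self, path):
        return self._data[path]

    def children(self, path):
        # Direct children are the paths exactly one level below `path`.
        prefix = path.rstrip("/") + "/"
        return sorted(
            p[len(prefix):] for p in self._data
            if p != "/" and p.startswith(prefix) and "/" not in p[len(prefix):]
        )


tree = ZNodeTree()
tree.create("/services")
tree.create("/services/orders", b"10.0.0.5:8080")  # hypothetical service address
print(tree.children("/services"))
```

Looking up a service then amounts to walking a known path, which is what makes the tree model convenient for finding a specific service.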
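3. ZooKeeper main features:
1) Eventual consistency: clients are presented with the same view of the service. This is ZooKeeper's most important property.
2) Reliability: if a message is accepted by one server, it will be accepted by all servers.
3) Timeliness: ZooKeeper does not guarantee that two clients see freshly updated data at the same moment. A client that needs the latest data should call sync() before reading.
4) Wait-free independence: a slow or failed client does not interfere with the requests of fast clients.
5) Atomicity: an update either succeeds or fails; there is no intermediate state.
6) Ordering: for all servers, the same messages are published in the same order.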
II. Basic principles of ZooKeeper
1. ZooKeeper System Architecture
First, look at the ZooKeeper architecture diagram.
The key points to understand from the architecture diagram are:
(1) ZooKeeper is divided into a server side and a client side. A client can connect to any server of the ZooKeeper service (unless the leaderServes parameter is explicitly set so that the leader does not accept client connections).
(2) The client uses and maintains a single TCP connection, through which it sends requests, receives responses, receives watch events, and sends heartbeats. If this TCP connection is interrupted, the client automatically tries to connect to another ZooKeeper server. When the client connects to the ZooKeeper service for the first time, the server that accepts the connection establishes a session for the client. When the client connects to another server, the session is re-established by the new server.
(3) Each server represents a machine on which the ZooKeeper service is installed. Together the servers form the cluster (or pseudo-cluster) that provides the ZooKeeper service.
(4) The servers that make up the ZooKeeper service must all know about each other. They maintain an in-memory image of the state, along with transaction logs and snapshots in persistent storage. As long as a majority of the servers are available, the ZooKeeper service is available.
(5) When ZooKeeper starts, a leader is elected from the server instances. The leader handles data updates, and an update succeeds if and only if a majority of servers successfully modify the data in memory. Each server stores a copy of the data in memory.
(6) ZooKeeper is replicated across the cluster, and the Zab protocol (ZooKeeper Atomic Broadcast) is used to keep the data consistent.
(7) The Zab protocol consists of two stages: the leader election stage and the atomic broadcast stage.
- a) The cluster elects a leader, and the other machines are called followers. All writes are sent to the leader, and all updates are relayed to the followers by broadcast.
- b) When the leader crashes or loses contact with a majority of the followers, a new leader must be elected to restore all servers to a correct state.
- c) Once a leader has been elected and a majority of servers have synchronized with the leader's state, leader election ends and the atomic broadcast stage begins.
- d) Atomic broadcast synchronizes information between the leader and the followers so that they share the same system state.
2. ZooKeeper roles
After the ZooKeeper server cluster starts, the servers elect a leader before doing any work. Before the leader is elected, the servers have no distinct roles and all take part in voting on an equal footing (except observers, which do not vote).
After leader election completes, there are several roles:
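Questions to think about:
1. Why are multiple servers needed?
① ZooKeeper must guarantee high availability and strong consistency; ② to support more clients, more servers need to be added.
2. What role do observers play in ZooKeeper?
① Observers do not take part in voting; they only synchronize the leader's state. ② Observers accept client connections and forward write requests to the leader. ③ Adding more observer nodes improves scalability without hurting throughput.
3. Why is the number of servers in a ZooKeeper cluster generally odd?
A write is considered successful once more than half of the servers have written it. ① With 3 servers, at most 1 server may fail. ② With 4 servers, likewise at most 1 server may fail. Since 3 servers and 4 servers both tolerate only 1 failure, they are equally reliable, so an odd number of servers is enough; here 3 servers are chosen.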
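The majority arithmetic behind the odd-number rule can be checked with a few lines of Python. This is only a worked calculation under the "more than half" rule described above; the function names are made up for illustration.

```python
def quorum_size(n_servers):
    # A strict majority: more than half of the servers.
    return n_servers // 2 + 1


def tolerated_failures(n_servers):
    # How many servers may fail while a majority is still alive.
    return n_servers - quorum_size(n_servers)


for n in range(3, 8):
    print(f"servers={n} quorum={quorum_size(n)} tolerated_failures={tolerated_failures(n)}")
```

Going from 3 to 4 servers raises the quorum from 2 to 3 but still tolerates only 1 failure, so the extra server adds cost without adding fault tolerance; the next real gain comes at 5 servers, which tolerate 2 failures.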
3. ZooKeeper write data flow
The flowchart for writing data to ZooKeeper is shown below.
The write flow consists mainly of the following steps:
a) Suppose a client wants to write data through ZooKeeper Server1; it sends Server1 a write request.
b) If Server1 is not the leader, it forwards the request to the leader, since exactly one of the ZooKeeper servers is the leader. The leader broadcasts the write request to each server, such as Server1 and Server2, and each server notifies the leader once it has written the data successfully.
c) When the leader has heard from a majority of servers that the write succeeded, the data is considered written. With three nodes, for example, the write is considered successful as soon as two nodes have written the data. The leader then tells Server1 that the write succeeded.
d) Server1 in turn informs the client that the data has been written, and the whole write operation is considered successful.
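Steps a) to d) can be sketched as a small simulation. This is a deliberately simplified toy model, not the Zab implementation: it ignores proposal logging, ordering, and leader election, and the function name write and its parameters are hypothetical.

```python
def write(servers, leader, contact, key, value, down=frozenset()):
    """Toy model of the write path.

    servers: dict mapping a server name to its in-memory data dict.
    leader / contact: names of the leader and of the server the client talks to.
    down: names of unreachable servers (the contacted server is assumed to be up).
    Returns (committed, hops), where hops lists the servers the request passed through.
    """
    # Step b: a server that is not the leader forwards the request to the leader.
    hops = [contact] if contact == leader else [contact, leader]
    # The leader broadcasts the write; every reachable server acknowledges it.
    acks = [name for name in servers if name not in down]
    # Step c: the write counts as successful only with acks from more than half.
    committed = leader not in down and len(acks) > len(servers) // 2
    if committed:
        for name in acks:
            servers[name][key] = value
    # Step d: the contacted server reports the outcome back to the client.
    return committed, hops


servers = {"Server1": {}, "Server2": {}, "Server3": {}}
ok, hops = write(servers, leader="Server2", contact="Server1",
                 key="/config", value="v1", down={"Server3"})
print(ok, hops)
```

With three servers and one of them down, two acknowledgements still form a majority, so the write is reported as successful; with two servers down it would not be.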
4. ZooKeeper components
The ZooKeeper components diagram shows the high-level components of the ZooKeeper service. With the exception of the request processor, each server that makes up the ZooKeeper service replicates its own copy of each component.
The replicated database is an in-memory database containing the entire data tree. Updates are logged to disk for recoverability, and writes are serialized to disk before they are applied to the in-memory database.
Every ZooKeeper server serves clients. A client connects to one server to submit its requests. Read requests are served from each server's local replica of the database. Requests that change the state of the service (write requests) are processed by an agreement protocol.
As part of the agreement protocol, all write requests from clients are forwarded to a single server, called the leader. The remaining ZooKeeper servers, called followers, receive message proposals from the leader and agree on message delivery. The messaging layer takes care of replacing the leader on failure and synchronizing the followers with the leader.
III. Summary of ZooKeeper application scenarios
1. Unified naming service
The naming structure diagram for the unified naming service is as follows:
1) In a distributed environment, applications and services often need unified names so that different services are easy to identify.
a) This is similar to the relationship between domain names and IP addresses: an IP address is hard to remember, while a domain name is easy to remember.
b) Information such as the address and provider of a resource or service is obtained by name.
2) Service and application names are organized in a hierarchy.
a) The service name and address information can be written to ZooKeeper, and clients obtain the list of available services from ZooKeeper.
2. Configuration management
The configuration management structure diagram looks like this:
1) In a distributed environment, managing and synchronizing configuration files is a common problem.
a) Within a cluster, the configuration of all nodes must be consistent, as in a Hadoop cluster.
b) After a configuration file is modified, the change should reach every node quickly.
2) Configuration management can be implemented with ZooKeeper.
a) The configuration information is written to a znode on ZooKeeper.
b) Each node watches this znode.
c) Once the data in the znode is modified, ZooKeeper notifies each node.
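The write, watch, and notify cycle above can be sketched with a toy in-memory stand-in for a single znode. This is not the ZooKeeper client API; the names ConfigNode and make_client and the sample values are hypothetical. The model assumes one-time watches, so each client re-registers its watch when it re-reads the data.

```python
class ConfigNode:
    """Toy stand-in for one znode holding configuration data."""

    def __init__(self, data):
        self.data = data
        self._watchers = []

    def get(self, watcher=None):
        # Reading with a watcher registers a one-time notification.
        if watcher is not None:
            self._watchers.append(watcher)
        return self.data

    def set(self, data):
        self.data = data
        # Fire each registered watcher once, clearing the list first.
        watchers, self._watchers = self._watchers, []
        for notify in watchers:
            notify(self)


seen = {}  # latest configuration as observed by each cluster node


def make_client(name):
    def on_change(node):
        # Re-read the data and re-register the watch for the next change.
        seen[name] = node.get(on_change)
    return on_change


config = ConfigNode("timeout=30")
for name in ("node-a", "node-b"):
    seen[name] = config.get(make_client(name))

config.set("timeout=60")
config.set("timeout=90")
print(seen)
```

Because every notification triggers a fresh read that also re-arms the watch, both nodes end up with the latest value after each change.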
3. Cluster management
The cluster management structure diagram looks like this:
1) In a distributed environment, the state of each node needs to be known in real time.
a) Adjustments can then be made based on the real-time status of the nodes.
2) This can be delegated to ZooKeeper.
a) Node information is written to a znode on ZooKeeper.
b) Watching that znode yields its status changes in real time.
3) Typical application
a) Master status monitoring and election in HBase.
4. Distributed notification and coordination
1) In a distributed environment, a service often needs to know the state of the sub-services it manages.
a) The NameNode needs to know the status of each DataNode.
b) The JobTracker needs to know the status of each TaskTracker.
2) A heartbeat detection mechanism can be implemented with ZooKeeper.
3) Information push can be implemented with ZooKeeper, which then acts as a publish/subscribe system.
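5. Distributed lock
Different services on different nodes may need to access certain resources in sequence, which requires a distributed lock.
Distributed locks have the following characteristics:
1) ZooKeeper is strongly consistent. For example, if a ZooKeeper client runs on each node and they all try to create the same znode at the same time, only one client succeeds.
2) Exclusive locking. Only the client that created the znode holds the lock; the other clients must wait. When the holder is done with the lock, it deletes the znode, and the other clients try again to create the znode to acquire the distributed lock.
3) Lock ordering. Each client creates an ephemeral znode under a given znode, and its type must be CreateMode.EPHEMERAL_SEQUENTIAL, so that the parent znode can track the global access order.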
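The ordering idea in point 3 can be illustrated without a ZooKeeper server. The sketch below is a toy in-memory model, not the real client API: the class LockDir, the child-name format, and the client names are hypothetical. It mimics a parent znode whose children receive increasing sequence numbers, with the lowest-numbered child holding the lock.

```python
import itertools


class LockDir:
    """Toy model of a lock parent znode with sequentially numbered children."""

    def __init__(self):
        self._counter = itertools.count()
        self._children = {}  # child name -> client id

    def enqueue(self, client_id):
        # Mimics creating a child whose name ends in an increasing sequence number.
        name = f"lock-{next(self._counter):010d}"
        self._children[name] = client_id
        return name

    def holder(self):
        # The client that owns the lowest-numbered child holds the lock.
        if not self._children:
            return None
        return self._children[min(self._children)]

    def release(self, name):
        # Deleting the child frees the lock for the next client in order.
        del self._children[name]


lock = LockDir()
a = lock.enqueue("client-A")
b = lock.enqueue("client-B")
print(lock.holder())
lock.release(a)
print(lock.holder())
```

Zero-padding the sequence number keeps lexicographic order and numeric order identical, which is why taking the minimum name is enough to find the current lock holder.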
6. Distributed queue
Distributed queues come in two types:
1) A queue that becomes available only once all of its members have arrived; until then it keeps waiting for the remaining members. This is a synchronization queue.
a) A job consists of multiple tasks, and the job is complete only after all tasks have finished.
b) You can create a /job directory for the job and then, in that directory, create an ephemeral znode for each completed task. Once the number of ephemeral znodes reaches the total number of tasks, the job is complete.
2) A queue whose enqueue and dequeue operations follow FIFO order, for example to implement the producer-consumer model.
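Both queue types can be sketched with small in-memory models. These are toy illustrations rather than ZooKeeper recipes as implemented by any client library; the class names JobDir and FifoQueue, the item-name format, and the task names are hypothetical.

```python
class JobDir:
    """Toy model of a /job parent whose children mark finished tasks."""

    def __init__(self, total_tasks):
        self.total_tasks = total_tasks
        self.children = set()

    def task_done(self, task_id):
        # Mimics creating one child znode per completed task.
        self.children.add(f"/job/{task_id}")

    def complete(self):
        # The job is complete once the child count reaches the task count.
        return len(self.children) == self.total_tasks


class FifoQueue:
    """Toy FIFO queue ordered by an increasing sequence number in the item name."""

    def __init__(self):
        self._seq = 0
        self._items = {}  # item name -> payload

    def put(self, payload):
        name = f"item-{self._seq:010d}"
        self._seq += 1
        self._items[name] = payload
        return name

    def take(self):
        # The lowest-numbered item is the oldest; raises ValueError if the queue is empty.
        return self._items.pop(min(self._items))


job = JobDir(total_tasks=2)
job.task_done("task-1")
print(job.complete())
job.task_done("task-2")
print(job.complete())

queue = FifoQueue()
queue.put("first")
queue.put("second")
print(queue.take(), queue.take())
```

The synchronization queue only counts children, while the FIFO queue relies on the ordering of their names; in both cases the parent directory is the single shared point of coordination.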
IV. ZooKeeper installation and deployment
ZooKeeper has three installation modes:
- Stand-alone mode: a single server;
- Cluster mode: multiple servers on multiple machines, forming a cluster;
- Pseudo-cluster mode: multiple servers on a single machine, forming a pseudo-cluster;
Environment: CentOS 7.0
1. Stand-alone mode
(1) Create a directory as needed; for example, my directory is /home/xuliugen/Desktop/zookeeper-install
(2) Enter the directory and use wget to download ZooKeeper.
Download address: https://mirrors.tuna.tsinghua.edu.cn/apache/zookeeper/zookeeper-3.4.6/zookeeper-3.4.6.tar.gz
Other versions: https://mirrors.tuna.tsinghua.edu.cn/apache/zookeeper/
After the download completes, it looks like this:
(3) Use the command tar -xvf zookeeper-3.4.6.tar.gz to extract the archive;
(4) Create the ZooKeeper configuration file:
The conf directory under the ZooKeeper installation directory contains the following by default:
Use the command cp zoo_sample.cfg zoo.cfg to create a zoo.cfg file, because ZooKeeper uses the zoo.cfg configuration file by default when it starts.
(5) Modify the contents of the configuration file as required:
The default configuration file is generally sufficient to demonstrate startup; its contents are as follows:
# ZooKeeper service heartbeat interval, in milliseconds
tickTime=2000
# Initial time limit for electing a new leader
initLimit=10
# Maximum tolerated heartbeat delay between leader and follower; if a response takes longer than syncLimit * tickTime,
# the leader considers the follower dead and removes it from the server list
syncLimit=5
# The directory where the snapshot is stored.
# Do not use /tmp for storage; /tmp here is just
# for example's sake.
dataDir=/tmp/zookeeper
# The port at which the clients will connect
clientPort=2181
# The maximum number of client connections.
# Increase this if you need to handle more clients
#maxClientCnxns=60
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable the auto purge feature
#autopurge.purgeInterval=1
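To make the relationship between these settings concrete, the sketch below parses a zoo.cfg-style text and computes the follower timeout as syncLimit * tickTime, as described in the comments above. The parser is a minimal illustration and not how ZooKeeper itself reads the file; the sample values assume the stock zoo_sample.cfg (tickTime=2000, initLimit=10, syncLimit=5), and treating initLimit as a multiple of tickTime is likewise an assumption of this sketch.

```python
def parse_cfg(text):
    """Parse key=value lines, skipping blank lines and # comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


# Assumed sample values, matching the stock zoo_sample.cfg.
sample = """
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/tmp/zookeeper
clientPort=2181
#maxClientCnxns=60
"""

cfg = parse_cfg(sample)
tick_ms = int(cfg["tickTime"])
init_timeout_ms = int(cfg["initLimit"]) * tick_ms  # limit for the initial sync
sync_timeout_ms = int(cfg["syncLimit"]) * tick_ms  # leader/follower heartbeat tolerance
print(init_timeout_ms, sync_timeout_ms)
```

With these sample values, a follower that does not respond within 10 seconds (5 ticks of 2000 ms) would be considered dead by the leader.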
Tips:
A small performance-tuning tip from the official ZooKeeper documentation: a few other configuration parameters can greatly improve performance:
To get low latency on updates, it is important to have a dedicated transaction log directory. By default, transaction logs are put in the same directory as the data snapshots and the myid file. The dataLogDir parameter specifies a different directory to use for the transaction logs.
In other words, it is better to keep the data directory and the transaction log directory separate, which improves the efficiency of reading and updating data.
(6) Start ZooKeeper
In the bin directory under the ZooKeeper installation directory:
Use the command ./zkServer.sh start
to start the service:
Use the command ./zkCli.sh
to enter the command-line management interface:
This completes the stand-alone installation!
2. Cluster mode; 3. Pseudo-cluster mode
There is plenty of material online about configuring cluster mode and pseudo-cluster mode, so it is not demonstrated here; please see:
http://www.open-open.com/lib/view/open1454043410245.html
Appendix:
zoo.cfg configuration parameters explained:
Brief analysis of Zookeeper fundamentals and installation Deployment