Principles and Installation of ZooKeeper

Source: Internet
Author: User

ZooKeeper is a highly available, high-performance coordination service.

Problems Solved
Partial failures are common in distributed applications. When a message is sent between nodes and the network or the receiving process fails, the sender cannot tell whether the receiver got the message.

Partial failures are inherent in distributed systems, so ZooKeeper cannot prevent them, but it can help you handle them correctly.

To solve this problem, ZooKeeper has the following features:
1: Building blocks: ZooKeeper provides a rich set of building blocks from which many coordination data structures and protocols can be implemented.
2: Atomic data access: a client either reads all of a znode's data or the read fails.
3: High availability: ZooKeeper runs on a group of machines, which helps the system avoid a single point of failure (SPOF) and tolerate faulty servers.
4: Sequential consistency: updates from any client are applied in the order in which they were sent.
5: Single system image: a client sees the same view of the system regardless of which server it connects to. When a server fails and its client connects to another server, any server whose state lags behind that of the failed server will not accept the connection until it has caught up.
6: Timeliness: the lag in any client's view of the system is bounded and does not exceed tens of seconds. A sync operation is also provided to force the server that the client is connected to to catch up with the leader.
7: Sessions: when a client connects, it tries one server from its configured list. If that attempt fails, it automatically tries the next server, and so on, until it connects to a server and a session is created. The client can set the timeout for each session. Once a session expires, all of its ephemeral znodes are lost. Because the client sends heartbeats automatically, expiry rarely happens.
8: Rendezvous: the parties being coordinated do not need to know each other in advance, and do not even have to exist at the same time.
9: ACLs: ZooKeeper provides three authentication schemes: digest (by user name and password), host (by host name), and ip (by IP address). Each ACL pairs an identity, whose form depends on the authentication scheme, with a set of permissions. For example, to give clients in the demo.com domain read permission, we can create the following ACL in Java:
new ACL(Perms.READ, new Id("host", "demo.com"));
Ids.OPEN_ACL_UNSAFE grants all permissions, including ADMIN, to everyone.
In addition, ZooKeeper can be integrated with third-party authentication systems.

10: Recipes: an open-source shared library provides implementations of common coordination patterns.
11: High performance: according to official figures, the benchmarked throughput of a cluster of five well-provisioned machines exceeds 10,000 operations per second for write-dominated workloads.
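
The ACL item above pairs an identity with a set of permissions. As a rough illustration, the sketch below models the permission set as a bitmask, using the bit values of the constants in ZooKeeper's ZooDefs.Perms. The Acl tuple and the allows helper are hypothetical names made up for this example and are not part of any client library.

```python
from typing import NamedTuple

# Permission bits, matching the constants in ZooKeeper's ZooDefs.Perms.
READ, WRITE, CREATE, DELETE, ADMIN = 1, 2, 4, 8, 16


class Acl(NamedTuple):
    """Toy ACL entry: a permission bitmask granted to one identity."""
    perms: int
    scheme: str  # e.g. "digest", "host", "ip"
    ident: str   # e.g. "demo.com"


def allows(acls, scheme, ident, wanted):
    """Return True if an entry for this identity grants every bit in `wanted`.

    Simplification (assumption): identities are compared for exact equality,
    whereas a real server applies scheme-specific matching.
    """
    return any(
        a.scheme == scheme and a.ident == ident and (a.perms & wanted) == wanted
        for a in acls
    )


acls = [Acl(READ, "host", "demo.com")]
print(allows(acls, "host", "demo.com", READ))          # True
print(allows(acls, "host", "demo.com", READ | WRITE))  # False
```

Permissions combine with bitwise OR, so READ | WRITE asks for both bits at once, and the read-only entry does not satisfy it.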

Principle
ZooKeeper uses the Zab protocol, which resembles the Paxos algorithm but differs in how it operates. The protocol consists of two phases that repeat indefinitely.
Leader election: the machines in the cluster elect a leader, and the other machines become followers. This phase is complete once more than half of the followers have synchronized their state with the leader (about 200 milliseconds according to official figures).
Atomic broadcast: all machines forward write requests to the leader, and the leader broadcasts each update to the followers. Only after more than half of the machines have persisted the update does the leader commit it, and only then does the client receive the response confirming the update.
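
The commit rule of the atomic broadcast phase can be sketched with a toy model. This is not the Zab implementation: ToyLeader and its methods are hypothetical, transaction ids are plain integers rather than real zxids, and there is no networking or persistence. It only shows that an update is committed once a majority has acknowledged it, and that updates are committed in the order in which they were proposed.

```python
def has_quorum(ensemble_size, acks):
    """True once more than half of the ensemble has acknowledged."""
    return len(acks) > ensemble_size // 2


class ToyLeader:
    """Toy leader that tracks acknowledgements per proposal (illustrative only)."""

    def __init__(self, ensemble_size, leader_id=1):
        self.ensemble_size = ensemble_size
        self.leader_id = leader_id
        self.next_zxid = 1
        self.acks = {}       # zxid -> set of server ids that persisted it
        self.committed = []  # zxids in commit order

    def propose(self):
        zxid = self.next_zxid
        self.next_zxid += 1
        self.acks[zxid] = {self.leader_id}  # the leader persists its own proposal
        return zxid

    def ack(self, zxid, server_id):
        self.acks[zxid].add(server_id)
        # Commit strictly in proposal order: stop at the first proposal
        # that does not yet have a quorum.
        nxt = len(self.committed) + 1
        while nxt in self.acks and has_quorum(self.ensemble_size, self.acks[nxt]):
            self.committed.append(nxt)
            nxt += 1


leader = ToyLeader(ensemble_size=5)
first, second = leader.propose(), leader.propose()
leader.ack(second, 2)
leader.ack(second, 3)    # `second` has a quorum, but `first` does not yet
print(leader.committed)  # []
leader.ack(first, 2)
leader.ack(first, 3)     # now both commit, in order
print(leader.committed)  # [1, 2]
```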

At its core is a stripped-down file system that forms a tree-like data structure. Every node in the tree is a znode: a znode can have child nodes, store data, or both, and it has an associated ACL. Because ZooKeeper is designed for coordination, which typically involves small data files, a znode can store at most 1 MB of data.
ZooKeeper references a node by a file-system-like path: a Unicode string whose components are separated by slashes. Paths must be absolute and canonical, so relative components such as . and .. are not supported. The /zookeeper subtree is reserved for management information.
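
The path rules above can be sketched as a small check. This is a simplified illustration and not the validation code of any ZooKeeper client: is_valid_path is a hypothetical helper that only covers the rules mentioned here (absolute path, slash-separated components, no . or .. components) and ignores other restrictions such as disallowed control characters.

```python
def is_valid_path(path):
    """Simplified check of the znode path rules described in the text."""
    if path == "/":
        return True  # the root znode
    if not path.startswith("/") or path.endswith("/"):
        return False  # must be absolute and must not end with a slash
    components = path[1:].split("/")
    # Empty components (from "//") and the relative components "." and ".."
    # are rejected because paths must be canonical.
    return all(c not in ("", ".", "..") for c in components)


for candidate in ["/app/config", "/app/./config", "/app/../config", "app/config", "/app/"]:
    print(candidate, is_valid_path(candidate))
```

Only the first candidate passes. The others are relative, contain . or .., or end with a slash.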
The client communicates with the server over TCP, and the two keep the session alive through heartbeats. When the session expires, its ephemeral nodes are deleted.
Features such as cluster management, centralized configuration management, and distributed locks are implemented by watching znodes and their changes.
ZooKeeper achieves high availability through replication. As long as more than half of the machines in the cluster are available, ZooKeeper can provide service. A cluster therefore usually has an odd number of machines, because adding one machine to an odd-sized cluster does not increase the number of failures it can tolerate.
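
The arithmetic behind the majority rule is short enough to write down. In the sketch below, quorum_size and tolerated_failures are hypothetical helper names, not functions of any ZooKeeper library.

```python
def quorum_size(n):
    """Smallest number of machines that is more than half of n."""
    return n // 2 + 1


def tolerated_failures(n):
    """How many machines may fail while a quorum is still available."""
    return n - quorum_size(n)


for n in range(3, 7):
    print(f"{n} machines: quorum {quorum_size(n)}, tolerates {tolerated_failures(n)}")
```

Clusters of 3 and 4 machines both tolerate one failure, and clusters of 5 and 6 both tolerate two, so the extra machine in an even-sized cluster buys no additional fault tolerance.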

A ZooKeeper client instance has three lifecycle states: CONNECTING, CONNECTED, and CLOSED.
A newly created ZooKeeper instance is in the CONNECTING state and enters the CONNECTED state once a connection is established. If the instance is disconnected and then reconnects, it moves between CONNECTED and CONNECTING. When the close method is called or the session times out, the instance enters the CLOSED state, from which it cannot recover.
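
The lifecycle can be written as a small state machine. The sketch below is only a model of the transitions described above: the event names ("connected", "disconnected", "close", "session_expired") and the next_state function are assumptions made for this example, not identifiers from the ZooKeeper client.

```python
from enum import Enum


class State(Enum):
    CONNECTING = "CONNECTING"
    CONNECTED = "CONNECTED"
    CLOSED = "CLOSED"


# Reversible transitions between the two live states.
_TRANSITIONS = {
    (State.CONNECTING, "connected"): State.CONNECTED,
    (State.CONNECTED, "disconnected"): State.CONNECTING,
}


def next_state(state, event):
    """Return the state that follows `state` when `event` occurs."""
    if state is State.CLOSED:
        raise ValueError("a closed instance cannot be reused")
    if event in ("close", "session_expired"):
        return State.CLOSED  # terminal state
    # Events that do not apply to the current state leave it unchanged.
    return _TRANSITIONS.get((state, event), state)


state = State.CONNECTING  # a new instance starts here
for event in ["connected", "disconnected", "connected", "session_expired"]:
    state = next_state(state, event)
    print(event, "->", state.value)
```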

 

Znode features
There are two types of znode: ephemeral and persistent. The type is set at creation time and cannot be changed afterwards. An ephemeral node is deleted when the session of the client that created it ends.
An ephemeral node cannot have child nodes of any type.
If the sequential flag is set when a znode is created, ZooKeeper appends a sequence number to the znode's name, taken from a monotonically increasing counter maintained by the parent node. The sequence numbers can be used to impose a global ordering.
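
A toy model of sequential naming may help here. ToyParent below is a hypothetical in-memory stand-in for a parent znode, not a client API. It mimics the way ZooKeeper appends a zero-padded, ten-digit counter value to the requested name, so that plain string sorting reproduces the creation order.

```python
class ToyParent:
    """In-memory stand-in for a parent znode that numbers its sequential children."""

    def __init__(self):
        self._counter = 0  # monotonically increasing, owned by the parent
        self.children = []

    def create_sequential(self, prefix):
        name = f"{prefix}{self._counter:010d}"  # ten digits, zero-padded
        self._counter += 1
        self.children.append(name)
        return name


parent = ToyParent()
print(parent.create_sequential("lock-"))  # lock-0000000000
print(parent.create_sequential("lock-"))  # lock-0000000001
print(min(parent.children))               # lock-0000000000
```

Recipes such as distributed locks rely on this ordering: the client whose child node has the lowest sequence number is treated as the current holder.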
The watch mechanism lets a client be notified when a znode changes. A watch fires only once; to receive further notifications, the client must register the watch again.
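
The one-shot behaviour of watches is easy to get wrong, so here is a minimal in-memory sketch of it. ToyZnode is hypothetical, and as a simplification the callback receives the new data directly, whereas a real client receives an event and has to read the znode again.

```python
class ToyZnode:
    """In-memory znode whose watches fire at most once (illustrative only)."""

    def __init__(self, data=b""):
        self.data = data
        self._watchers = []

    def get(self, watcher=None):
        if watcher is not None:
            self._watchers.append(watcher)  # register a one-shot watch
        return self.data

    def set(self, data):
        self.data = data
        # Clear the registrations before firing, so each watch fires once
        # and a callback may register itself again.
        fired, self._watchers = self._watchers, []
        for watcher in fired:
            watcher(data)


node = ToyZnode(b"v0")

seen_once = []
node.get(watcher=seen_once.append)
node.set(b"v1")
node.set(b"v2")   # not reported: the watch was already consumed
print(seen_once)  # [b'v1']

seen_all = []


def rewatch(_data):
    # Read the znode again and register the watch again in the same call.
    seen_all.append(node.get(watcher=rewatch))


node.get(watcher=rewatch)
node.set(b"v3")
node.set(b"v4")
print(seen_all)   # [b'v3', b'v4']
```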

Installation and configuration (simulating a cluster on a single machine)
Download the latest ZooKeeper release.
Create three folders: server1, server2, and server3.
Extract the package into each of the three folders.
Configure zoo.cfg (the configuration path must not contain Chinese characters).
zoo.cfg is a Java properties file.
It can be placed in the conf subdirectory,
or in the /etc/zookeeper directory.
If the ZOOCFGDIR environment variable is set, the file can also be placed in the directory that the variable names.

# Basic time unit, in milliseconds
tickTime=2000
# Time allowed for followers to connect to and synchronize with the leader, in ticks.
# If more than half of the followers fail to synchronize within this time, the leader
# gives up its leadership and another leader election takes place.
initLimit=5
# Time allowed for a follower to synchronize with the leader, in ticks. A follower that fails to synchronize within this time restarts itself, and the clients attached to it connect to another follower.
syncLimit=2
# Local file system locations for persistent data and the transaction log
dataDir=xxxx/zookeeper/server1/Data
dataLogDir=xxx/zookeeper/server1/datalog
# Port on which the server listens for client connections
clientPort=2181
# server.N=host:port1:port2. Followers use the first port to connect to the leader; the second port is used for leader election.
server.1=127.0.0.1:2888:3888
server.2=127.0.0.1:2889:3889
server.3=127.0.0.1:2890:3890
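
Because initLimit and syncLimit are counted in ticks, the effective timeouts depend on tickTime. The sketch below parses a trimmed copy of the settings above and works out the values in milliseconds. parse_properties is a hypothetical helper that handles only simple key=value lines, and the address of the third server is assumed to be 127.0.0.1 for this example.

```python
CONFIG = """\
# trimmed copy of the example configuration
tickTime=2000
initLimit=5
syncLimit=2
clientPort=2181
server.1=127.0.0.1:2888:3888
server.2=127.0.0.1:2889:3889
server.3=127.0.0.1:2890:3890
"""


def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


props = parse_properties(CONFIG)
tick_ms = int(props["tickTime"])
init_timeout_ms = int(props["initLimit"]) * tick_ms  # 5 ticks of 2000 ms
sync_timeout_ms = int(props["syncLimit"]) * tick_ms  # 2 ticks of 2000 ms

# server.N=host:quorum_port:election_port
servers = {}
for key, value in props.items():
    if key.startswith("server."):
        host, quorum_port, election_port = value.split(":")
        servers[int(key.split(".", 1)[1])] = (host, int(quorum_port), int(election_port))

print(init_timeout_ms, sync_timeout_ms)  # 10000 4000
print(servers[2])                        # ('127.0.0.1', 2889, 3889)
```

With these values, followers get 10 seconds to connect to and synchronize with a new leader, and 4 seconds to stay in sync afterwards.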

Create a file named myid under dataDir that contains only the server's number. The number must match the N of that server's server.N entry in zoo.cfg.
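
The per-server setup can be scripted. The following is a sketch under stated assumptions rather than an official procedure: write_layout is a hypothetical helper, the data and conf subfolder names are chosen for this example, and each server is given its own clientPort (2181, 2182, 2183) because three servers on one machine cannot listen on the same port.

```python
import tempfile
from pathlib import Path


def write_layout(base, count=3):
    """Create server1..serverN under `base`, each with conf/zoo.cfg and data/myid."""
    base = Path(base)
    server_lines = [
        f"server.{i}=127.0.0.1:{2887 + i}:{3887 + i}" for i in range(1, count + 1)
    ]
    for i in range(1, count + 1):
        root = base / f"server{i}"
        data_dir = root / "data"
        conf_dir = root / "conf"
        data_dir.mkdir(parents=True, exist_ok=True)
        conf_dir.mkdir(parents=True, exist_ok=True)
        # myid holds only the number N from this server's server.N line.
        (data_dir / "myid").write_text(f"{i}\n")
        lines = [
            "tickTime=2000",
            "initLimit=5",
            "syncLimit=2",
            f"dataDir={data_dir.as_posix()}",
            f"clientPort={2180 + i}",  # one client port per local server
            *server_lines,
        ]
        (conf_dir / "zoo.cfg").write_text("\n".join(lines) + "\n")
    return base


with tempfile.TemporaryDirectory() as tmp:
    write_layout(tmp)
    print((Path(tmp) / "server2" / "data" / "myid").read_text().strip())  # 2
```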

Java sample code http://zookeeper.apache.org/doc/r3.4.2/javaExample.html
