What is the role of ZooKeeper, and what is its specific role in Hadoop and HBase

Source: Internet
Author: User
Tags apache solr

What is the role of ZooKeeper, and how does it work together with the NameNode and HMaster? Readers who have not worked with ZooKeeper may have these questions. Here is a summary.

First, what is ZooKeeper

ZooKeeper, as the name suggests, is the keeper of the zoo: it looks after the Elephant (Hadoop), the Bee (Hive), and the Piglet (Pig). Apache HBase, Apache Solr, and LinkedIn Sensei all use ZooKeeper as well. ZooKeeper is an open-source coordination service for distributed applications. It is built on ZAB (ZooKeeper Atomic Broadcast), a consensus protocol similar to Paxos, and provides synchronization, configuration maintenance, and naming services.



The explanation above is rather formal. From a programmer's point of view, ZooKeeper can be understood as the overall monitoring system of a Hadoop cluster: if the NameNode or HMaster goes down, ZooKeeper is used to elect a new leader. This is where it is used the most. The role of ZooKeeper is described in detail below.




Second, the role of ZooKeeper

1. ZooKeeper enhances cluster stability
ZooKeeper lets distributed processes coordinate with each other through a shared hierarchical namespace that resembles a file system. The namespace consists of data registers called znodes, which are similar to files and directories. Unlike a typical file system, whose files live on disk, ZooKeeper keeps its data in memory, which gives it high throughput and low latency.


ZooKeeper provides high performance, high reliability, and strictly ordered access. High performance means it can be used in large distributed systems. High reliability means the failure of a single node does not bring the service down. Ordered access means clients can build more complex synchronization operations on top of it.


2. ZooKeeper strengthens cluster continuity
[Figure: ZooKeeper Service]

The servers that make up the ZooKeeper service must all be able to communicate with each other. They keep the server state in memory, and they persist a log of operations and snapshots to disk. As long as a majority of the servers are available, the ZooKeeper service is available.
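
The majority rule above is easy to check with a little arithmetic. Here is a minimal sketch in plain Python; the function names are made up for this illustration and are not part of any ZooKeeper API.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest number of servers that forms a strict majority."""
    return ensemble_size // 2 + 1


def is_available(ensemble_size: int, alive: int) -> bool:
    """The service stays available while a strict majority is alive."""
    return alive >= quorum_size(ensemble_size)


def tolerated_failures(ensemble_size: int) -> int:
    """How many servers can fail before the majority is lost."""
    return ensemble_size - quorum_size(ensemble_size)


if __name__ == "__main__":
    for n in (3, 4, 5):
        print(n, quorum_size(n), tolerated_failures(n))
```

Note that under this rule 4 servers tolerate no more failures than 3, which is one reason odd-sized clusters are the usual choice.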


A client connects to a single ZooKeeper server and maintains a TCP connection to it. Over this connection it sends requests, receives replies and watch events, and sends heartbeats. If the TCP connection breaks, the client connects to another server.
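
The failover idea can be sketched as follows. This is only a toy: `try_connect` is a hypothetical callback standing in for a real TCP connection attempt, and the host names are invented. Real client libraries handle reconnection for you.

```python
from typing import Callable, Iterable


def connect_with_failover(servers: Iterable[str],
                          try_connect: Callable[[str], bool]) -> str:
    """Return the first server that accepts a connection, trying each in turn."""
    for server in servers:
        if try_connect(server):
            return server
    raise ConnectionError("no ZooKeeper server reachable")


if __name__ == "__main__":
    down = {"zk1:2181"}  # pretend this server is unreachable
    chosen = connect_with_failover(
        ["zk1:2181", "zk2:2181", "zk3:2181"],
        lambda server: server not in down,
    )
    print(chosen)
```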


ZooKeeper guarantees ordering in the cluster
ZooKeeper stamps each update with a number that reflects its position in the order of all updates. This keeps every interaction with ZooKeeper ordered. Higher-level abstractions, such as synchronization primitives, can then be built on top of this ordering.
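
In ZooKeeper this number is the zxid, a 64-bit value whose high 32 bits hold the leader epoch and whose low 32 bits hold a counter within that epoch. The sketch below only illustrates how such a layout sorts; the helper names are made up and nothing here talks to a real server.

```python
from typing import Tuple


def make_zxid(epoch: int, counter: int) -> int:
    """Pack an epoch and a per-epoch counter into one 64-bit number."""
    return (epoch << 32) | (counter & 0xFFFFFFFF)


def split_zxid(zxid: int) -> Tuple[int, int]:
    """Unpack a zxid back into (epoch, counter)."""
    return zxid >> 32, zxid & 0xFFFFFFFF


if __name__ == "__main__":
    updates = [make_zxid(2, 1), make_zxid(1, 7), make_zxid(1, 3)]
    # Sorting the plain integers orders by epoch first, then by counter.
    for zxid in sorted(updates):
        print(split_zxid(zxid))
```

Because the epoch sits in the high bits, every update made under a newer leader compares greater than any update from an older one.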


ZooKeeper keeps the cluster efficient
ZooKeeper is most efficient in read-dominated systems. It performs well in distributed systems running on thousands of machines when reads outnumber writes by roughly 10:1.


Data structures and hierarchical namespace
The ZooKeeper namespace is structured like a file system. A name is a sequence of path elements separated by slashes (/), just like a file path, and every node in ZooKeeper is uniquely identified by its path.
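
To make the path-based namespace concrete, here is a toy in-memory model. The class `ZNodeTree` and its methods are invented for this example and only mimic the idea; it is not the ZooKeeper client API.

```python
class ZNodeTree:
    """Toy in-memory model of a ZooKeeper-like namespace (not the real API)."""

    def __init__(self):
        self._data = {"/": b""}  # the root node always exists

    @staticmethod
    def _parent(path):
        parent = path.rsplit("/", 1)[0]
        return parent or "/"

    def create(self, path, data=b""):
        """Create a node; its parent must already exist."""
        if not path.startswith("/") or (path != "/" and path.endswith("/")):
            raise ValueError("invalid path: %r" % path)
        if path in self._data:
            raise KeyError("node already exists: %s" % path)
        if self._parent(path) not in self._data:
            raise KeyError("parent does not exist: %s" % path)
        self._data[path] = data

    def get(self, path):
        """Return the data stored at a node."""
        return self._data[path]

    def children(self, path):
        """Return the names of the direct children of a node."""
        prefix = "/" if path == "/" else path + "/"
        return sorted(
            p[len(prefix):]
            for p in self._data
            if p != path and p.startswith(prefix) and "/" not in p[len(prefix):]
        )


if __name__ == "__main__":
    tree = ZNodeTree()
    tree.create("/app")
    tree.create("/app/config", b"timeout=30")
    print(tree.children("/"), tree.children("/app"), tree.get("/app/config"))
```

Unlike a file system, every node here can hold data and have children at the same time, which matches how znodes are described above.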


Third, the specific role of ZooKeeper in Hadoop and HBase
Hadoop has the NameNode and HBase has the HMaster, so why is ZooKeeper needed? An example helps to explain.
Take a ZooKeeper cluster of 3 nodes: one leader and two followers. If the leader is stopped, the two followers elect a new leader between them, and the data stored in the cluster does not change. ZooKeeper can help Hadoop and HBase as follows:

Hadoop: ZooKeeper's event handling ensures that the whole cluster has only one active NameNode at a time; ZooKeeper also stores configuration information, and so on.
HBase: ZooKeeper's event handling ensures that the whole cluster has only one active HMaster at a time; ZooKeeper also detects HRegionServers coming online and going down, stores access control lists, and so on.
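
The "only one master" guarantee can be pictured with a toy model of ephemeral nodes: every candidate tries to create the same node, only one succeeds, and the node disappears when its owner's session ends. The class `ToyCoordinator`, the path, and the session names below are all invented for illustration; this is a sketch of the idea, not how Hadoop or HBase actually call ZooKeeper.

```python
class ToyCoordinator:
    """In-memory stand-in for ephemeral znodes (not the ZooKeeper API)."""

    def __init__(self):
        self._owner = {}  # path -> session that created the node

    def create_ephemeral(self, path, session):
        """Return True if this session created the node, False if it exists."""
        if path in self._owner:
            return False
        self._owner[path] = session
        return True

    def expire_session(self, session):
        """Remove every node owned by a session, as if its process died."""
        for path in [p for p, s in self._owner.items() if s == session]:
            del self._owner[path]

    def owner(self, path):
        return self._owner.get(path)


def elect(coord, path, candidates):
    """Every candidate tries to create the same node; exactly one wins."""
    for session in candidates:
        coord.create_ephemeral(path, session)
    return coord.owner(path)


if __name__ == "__main__":
    coord = ToyCoordinator()
    lock = "/example/active-master"  # hypothetical path
    print(elect(coord, lock, ["master-a", "master-b"]))
    coord.expire_session("master-a")  # the active master goes down
    print(elect(coord, lock, ["master-b"]))
```

In a real cluster the standby would learn about the deleted node through a watch event rather than by retrying in a loop.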
