The Role of ZooKeeper in an HBase Cluster

Source: Internet
Author: User

One: What is ZooKeeper?

ZooKeeper, as the name suggests, is the keeper of the zoo: the elephant (Hadoop), the bee (Hive), and the piglet (Pig). Apache HBase, Apache Solr, and LinkedIn's Sensei all use ZooKeeper as well. ZooKeeper is an open source coordination service for distributed applications. It is often described as being based on Fast Paxos; more precisely, it uses ZAB (ZooKeeper Atomic Broadcast), a Paxos-like protocol, and on top of it provides synchronization, configuration maintenance, and naming services.
That explanation is rather formal and does not say much on its own. From a programmer's point of view, ZooKeeper can be understood as the coordination and monitoring layer for the whole Hadoop ecosystem. If the active NameNode or HMaster goes down, ZooKeeper is what allows a standby to be elected as the new active node. This is where it does most of its work.

Two: The role of ZooKeeper

1. ZooKeeper enhances cluster stability
ZooKeeper lets distributed processes coordinate with each other through a shared hierarchical namespace that resembles a file system. The namespace consists of data registers called znodes, which are a bit like the files and directories of a file system. Unlike a typical file system, which is designed for storage, ZooKeeper keeps its data in memory, and that is what gives it high throughput and low latency.

ZooKeeper provides high performance, high reliability, and strictly ordered access. High performance means it can be used in large distributed systems. High reliability means that the failure of a single node does not bring the service down. Strictly ordered access means that clients can build more complex synchronization operations on top of it.

2. ZooKeeper strengthens cluster continuity
The ZooKeeper service

The servers that make up the ZooKeeper service must all be able to communicate with each other. They keep the service state in memory and write a transaction log and snapshots to persistent storage. As long as a majority of the servers are available, the ZooKeeper service is available.
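
The majority rule can be checked with simple arithmetic. The following is a minimal sketch, not ZooKeeper code; the function names `quorum_size`, `is_available`, and `tolerated_failures` are made up for illustration.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest number of servers that forms a strict majority."""
    return ensemble_size // 2 + 1


def is_available(ensemble_size: int, alive: int) -> bool:
    """The service is usable only while a majority of servers is alive."""
    return alive >= quorum_size(ensemble_size)


def tolerated_failures(ensemble_size: int) -> int:
    """How many servers can fail before the majority is lost."""
    return ensemble_size - quorum_size(ensemble_size)


if __name__ == "__main__":
    for n in (3, 4, 5):
        print(n, "servers: quorum", quorum_size(n),
              "tolerates", tolerated_failures(n), "failure(s)")
```

Under this rule a 4-server ensemble tolerates no more failures than a 3-server one, which is why ensembles usually have an odd number of servers.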

A client connects to a single ZooKeeper server and maintains a TCP connection to it. Over that connection it sends requests, receives responses, receives watch events, and sends heartbeats. If the TCP connection breaks, the client connects to a different server.

ZooKeeper guarantees ordering in the cluster
ZooKeeper stamps each update with a number (the zxid) that reflects the order of all ZooKeeper transactions. Subsequent operations can rely on this order to implement higher-level abstractions, such as synchronization primitives.
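
The number that marks each update is known in ZooKeeper as the zxid: a 64-bit value whose high 32 bits hold the leader epoch and whose low 32 bits hold a counter within that epoch. The sketch below only illustrates why comparing such numbers gives a total order; the helper names `make_zxid` and `split_zxid` are hypothetical and are not part of any ZooKeeper API.

```python
from typing import Tuple

COUNTER_BITS = 32
COUNTER_MASK = (1 << COUNTER_BITS) - 1


def make_zxid(epoch: int, counter: int) -> int:
    """Pack a leader epoch and a per-epoch counter into one number."""
    return (epoch << COUNTER_BITS) | (counter & COUNTER_MASK)


def split_zxid(zxid: int) -> Tuple[int, int]:
    """Recover (epoch, counter) from a packed zxid."""
    return zxid >> COUNTER_BITS, zxid & COUNTER_MASK


if __name__ == "__main__":
    updates = [make_zxid(2, 1), make_zxid(1, 7), make_zxid(1, 3)]
    # Sorting the plain integers yields the global order of the updates:
    # everything from epoch 1 comes before anything from epoch 2.
    for zxid in sorted(updates):
        print(hex(zxid), split_zxid(zxid))
```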

ZooKeeper keeps the cluster efficient
ZooKeeper is at its most efficient in read-dominated systems. It can run in distributed systems made up of thousands of servers, and it performs best when reads outnumber writes by about 10:1.

Data model and hierarchical namespace
The ZooKeeper namespace is structured like a file system. A name is a sequence of path elements separated by slashes (/), just like a file path, and every node in ZooKeeper is uniquely identified by its path.
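
To make the path-addressed namespace concrete, here is a toy in-memory model. It is a sketch under simplifying assumptions (no versions, no watches, no persistence), the class name `ZNodeTree` is invented for this example, and the paths and host name are sample data only.

```python
class ZNodeTree:
    """Toy in-memory namespace; every node is addressed by its full path."""

    def __init__(self):
        self._data = {"/": b""}  # path -> data stored at that znode

    def create(self, path: str, data: bytes = b"") -> None:
        if not path.startswith("/") or path.endswith("/"):
            raise ValueError(f"invalid path: {path!r}")
        if path in self._data:
            raise KeyError(f"node already exists: {path}")
        # A node can only be created under an existing parent.
        parent = path.rsplit("/", 1)[0] or "/"
        if parent not in self._data:
            raise KeyError(f"parent does not exist: {parent}")
        self._data[path] = data

    def get(self, path: str) -> bytes:
        return self._data[path]

    def children(self, path: str) -> list:
        """Names of the direct children of path, sorted."""
        prefix = path if path.endswith("/") else path + "/"
        return sorted(
            p[len(prefix):] for p in self._data
            if p.startswith(prefix) and p != prefix
            and "/" not in p[len(prefix):]
        )


if __name__ == "__main__":
    tree = ZNodeTree()
    tree.create("/hbase")
    tree.create("/hbase/master", b"host-a:16000")
    tree.create("/hbase/rs")
    print(tree.children("/hbase"))
    print(tree.get("/hbase/master"))
```

Unlike a file system, a znode can hold data and have children at the same time, which is why the model stores data for every path rather than distinguishing files from directories.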

Three: The specific role of ZooKeeper in Hadoop and HBase
1. Hadoop has the NameNode and HBase has the HMaster, so why is ZooKeeper still needed? The following examples explain.
Consider a ZooKeeper cluster of three nodes: one leader and two followers. If the leader is stopped, the two followers elect a new leader between them, and the data they serve does not change. In my view, this is what ZooKeeper does for Hadoop and HBase:
Hadoop: ZooKeeper's event handling ensures that the entire cluster has only one active NameNode; ZooKeeper also stores configuration information, and so on.
HBase: ZooKeeper's event handling ensures that the entire cluster has only one active HMaster; ZooKeeper also detects when an HRegionServer comes online or goes down, stores access control lists, and so on.
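
The "only one active master" guarantee is typically built from two ZooKeeper features: ephemeral nodes, which disappear when the session that created them ends, and watches, which notify a client when a node changes. The sketch below imitates both in memory so the pattern can be run without a server. `ToyCoordinator` and `Master` are invented names, the path is only an example, and real session expiry happens after a timeout rather than instantly.

```python
class ToyCoordinator:
    """In-memory stand-in for ZooKeeper: ephemeral nodes plus delete watches."""

    def __init__(self):
        self._owner = {}     # path -> session that created the ephemeral node
        self._watchers = {}  # path -> callbacks to run once when it is deleted

    def create_ephemeral(self, path, session):
        """Create the node if absent; return True only for the winner."""
        if path in self._owner:
            return False
        self._owner[path] = session
        return True

    def watch_delete(self, path, callback):
        self._watchers.setdefault(path, []).append(callback)

    def owner(self, path):
        return self._owner.get(path)

    def close_session(self, session):
        """A dead session loses all of its ephemeral nodes."""
        for path in [p for p, s in self._owner.items() if s == session]:
            del self._owner[path]
            for callback in self._watchers.pop(path, []):
                callback()


class Master:
    PATH = "/hbase/master"  # example path for the "active master" node

    def __init__(self, name, coordinator):
        self.name = name
        self.zk = coordinator
        self.active = False

    def run_for_active(self):
        if self.zk.create_ephemeral(self.PATH, self.name):
            self.active = True
        else:
            # Lost the race: stay standby and retry when the node disappears.
            self.zk.watch_delete(self.PATH, self.run_for_active)

    def crash(self):
        self.active = False
        self.zk.close_session(self.name)


if __name__ == "__main__":
    zk = ToyCoordinator()
    a, b = Master("master-a", zk), Master("master-b", zk)
    a.run_for_active()
    b.run_for_active()
    print("active:", zk.owner(Master.PATH))
    a.crash()
    print("active after crash:", zk.owner(Master.PATH))
```

Because only one `create_ephemeral` call can succeed for a given path, at most one master is active at any time, and the standby takes over without any master-to-master communication.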

2. Each HBase RegionServer registers with ZooKeeper, which makes its status (whether it is online) available to the rest of the cluster.
3. When the HMaster starts, it registers the location of the HBase system table -ROOT- in the ZooKeeper cluster. Through ZooKeeper, the RegionServer that holds the current system table .META. can then be found.

The main function of the HMaster is to maintain the system tables -ROOT- and .META., which record region changes on the RegionServers. It is also responsible for monitoring RegionServer status changes in the current HBase cluster. Each HBase RegionServer maintains one or more regions, and each region holds the data for one partition of an HBase table.
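
The lookup chain described here (ZooKeeper, then -ROOT-, then .META., then the RegionServer that owns the row) can be sketched as follows. This is a heavily simplified model of the catalog lookup used by HBase versions that still have -ROOT-: all host names, table names, and row keys are made-up sample data, the znode path is only illustrative, and plain dictionaries stand in for the real tables.

```python
import bisect

# All host names, table names and row keys below are made-up sample data.
ZOOKEEPER = {"/hbase/root-region-server": "rs1"}

# -ROOT- records which RegionServer hosts the .META. table (simplified).
ROOT_TABLE = {".META.": "rs2"}

# .META. records, per user table, the sorted region start keys and hosts.
META_TABLE = {
    "orders": [("", "rs3"), ("g", "rs4"), ("p", "rs5")],
}


def locate_region(table: str, row_key: str) -> list:
    """Return the servers visited while finding the region for row_key."""
    root_host = ZOOKEEPER["/hbase/root-region-server"]  # step 1: ask ZooKeeper
    meta_host = ROOT_TABLE[".META."]   # step 2: read -ROOT- on root_host
    regions = META_TABLE[table]        # step 3: read .META. on meta_host
    start_keys = [start for start, _ in regions]
    # The owning region is the last one whose start key is <= row_key.
    index = bisect.bisect_right(start_keys, row_key) - 1
    return [root_host, meta_host, regions[index][1]]


if __name__ == "__main__":
    for key in ("apple", "kiwi", "zebra"):
        print(key, "->", locate_region("orders", key))
```

ZooKeeper holds only the entry point of the chain; the region data itself stays on the RegionServers.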

Copyright notice: This is an original article by the blogger and may not be reproduced without the blogger's permission.
