HBase Multi-Master

Source: Internet
Author: User

Configuration of a single master

hbase.master is usually set to master:60000, which defines the address and port of the master. When multiple masters are needed, however, only the port has to be provided, because ZooKeeper decides which master is the real (active) one.

To configure multiple masters, set hbase.master.port to 60000 and copy this configuration to the other backup master servers.

Assume the current architecture is:

    A: master, ZooKeeper, HRegionServer
    B: backup master, ZooKeeper, HRegionServer

On A, run start-hbase.sh directly. On B, run hbase-daemon.sh start master. The master is now started on both A and B. Do not worry about starting two at the same time, because ZooKeeper switches to the master on B only when the master on A is down.

Ports on A, the active master:

    tcp 0 0 :::60000    master process port
    tcp 0 0 :::60010    master web UI port

Ports on B, the standby:

    tcp 0 0 :::60000    master process port

Although B has started the master, ZooKeeper has determined that a live master already exists on the network, so B is assigned the standby role.

Now let's look at what ZooKeeper does. ZooKeeper log:

    2012-09-07 14:56:53,073 WARN org.apache.zookeeper.server.NIOServerCnxn: caught end of stream exception
    EndOfStreamException: Unable to read additional data from client sessionid 0x1399f8281420000, likely client has closed socket
        at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:220)
        at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:224)
        at java.lang.Thread.run(Thread.java:662)
    2012-09-07 14:56:53,074 INFO org.apache.zookeeper.server.NIOServerCnxn: Closed socket connection for client /192.168.1.149:56188 which had sessionid 0x1399f8281420000
    2012-09-07 14:57:54,002 INFO org.apache.zookeeper.server.ZooKeeperServer: Expiring session 0x399f76c05b0003, timeout of 180000ms exceeded
    2012-09-07 14:57:54,002 INFO org.apache.zookeeper.server.ZooKeeperServer: Expiring session 0x399f76c05b0005, timeout of 180000ms exceeded
    2012-09-07 14:57:54,002 INFO org.apache.zookeeper.server.PrepRequestProcessor: Processed session termination for sessionid: 0x399f76c05b0003
    2012-09-07 14:57:54,002 INFO org.apache.zookeeper.server.PrepRequestProcessor: Processed session termination for sessionid: 0x399f76c05b0005
    2012-09-07 14:59:18,002 INFO org.apache.zookeeper.server.ZooKeeperServer: Expiring session 0x399f8287a20002, timeout of 180000ms exceeded
    2012-09-07 14:59:18,002 INFO org.apache.zookeeper.server.PrepRequestProcessor: Processed session termination for sessionid: 0x399f8287a20002
    2012-09-07 14:59:23,679 INFO org.apache.zookeeper.server.NIOServerCnxnFactory: Accepted socket connection from /192.168.1.253:34507
    2012-09-07 14:59:23,680 INFO org.apache.zookeeper.server.ZooKeeperServer: Client attempting to establish new session at /192.168.1.253:34507
    2012-09-07 14:59:23,690 INFO org.apache.zookeeper.server.ZooKeeperServer: Established session 0x1399f8281420004 with negotiated timeout 180000 for client /192.168.1.253:34507
    2012-09-07 14:59:24,002 INFO org.apache.zookeeper.server.ZooKeeperServer: Expiring session 0x1399f8281420000, timeout of 180000ms exceeded
    2012-09-07 14:59:24,002 INFO org.apache.zookeeper.server.PrepRequestProcessor: Processed session termination for sessionid: 0x1399f8281420000

When B switches to the active master role, it opens tcp 0 0 :::60010 (the master web UI port), which shows that B's master has taken over.
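The takeover behavior described above can be pictured with a toy in-memory model. This is only a sketch of "first master wins, a standby is promoted when the active session expires"; ToyCoordinator, try_become_active, and expire_session are hypothetical names invented for the example and are not ZooKeeper or HBase APIs. The real election logic is not shown in this article.

```python
class ToyCoordinator:
    """In-memory stand-in for the coordination service (not a ZooKeeper API)."""

    def __init__(self):
        self.active = None      # name of the master currently acting as active
        self.waiting = []       # standby masters, in arrival order

    def try_become_active(self, master):
        """Register a master; return True if it is (or became) the active one."""
        if self.active is None:
            self.active = master
            return True
        if master != self.active and master not in self.waiting:
            self.waiting.append(master)
        return master == self.active

    def expire_session(self, master):
        """Simulate a session timeout for `master`; promote a standby if needed."""
        if master in self.waiting:
            self.waiting.remove(master)
        if master == self.active:
            self.active = self.waiting.pop(0) if self.waiting else None
        return self.active


coordinator = ToyCoordinator()
a_is_active = coordinator.try_become_active("A")   # A starts first
b_is_active = coordinator.try_become_active("B")   # B starts while A is alive
after_failure = coordinator.expire_session("A")    # A's session times out
print(a_is_active, b_is_active, after_failure)
```

In this model B stays a standby for as long as A's session is alive, which mirrors why B only listens on the master process port until A goes down.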

A previous article about HBase described its distributed architecture. This article describes how HBase eliminates a single point of failure (SPOF) in a distributed environment. A small experiment demonstrates HBase high availability in a distributed setting, so that you can observe the behavior with your own eyes and think further about it.

Let's review the main components of HBase:
1. HBaseMaster
2. HRegionServer
3. HBase Client
4. HBase Thrift Server
5. HBase REST Server

HBaseMaster
The HMaster assigns regions to HRegionServers and balances the load across the HRegionServers in the cluster. It also monitors the running status of the HRegionServers. If an HRegionServer goes down, the HBaseMaster reassigns the HLogs and tables that the unavailable HRegionServer was serving to other HRegionServers. The HBaseMaster also manages data and tables and handles changes to table structure and table data, because all table-related information is stored in the META system table. The HMaster implements the ZooKeeper Watcher interface to interact with the ZooKeeper cluster.
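As a rough illustration of the reassignment described above, the sketch below moves the regions of a failed server to the surviving servers. The function name reassign_regions, the server and region names, and the "least-loaded server first" policy are all assumptions made for the example; the article does not say how the master picks a target.

```python
def reassign_regions(assignments, failed_server):
    """Return a new assignment map with the failed server's regions moved
    to the surviving servers, always picking the least-loaded one."""
    survivors = {s: list(r) for s, r in assignments.items() if s != failed_server}
    if not survivors:
        raise RuntimeError("no region server left to take over")
    for region in assignments.get(failed_server, []):
        # Ties are broken by server name so the result is deterministic.
        target = min(survivors, key=lambda s: (len(survivors[s]), s))
        survivors[target].append(region)
    return survivors


before = {
    "rs1": ["region-a", "region-b"],
    "rs2": ["region-c"],
    "rs3": ["region-d", "region-e"],
}
after = reassign_regions(before, "rs3")
print(after)
```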

HRegionServer
The HRegionServer handles user read and write operations. It communicates with the HBaseMaster to learn which data tables it has to serve and reports its running status to the HMaster. When a write request arrives, it is first written to a write-ahead log called the HLog. The write is then cached in memory, in a structure called the memcache. Each HStore can have only one memcache. When the memcache reaches the configured size, a MapFile is created and written to disk, which reduces the memory pressure on the HRegionServer. When a read request arrives, the HRegionServer first looks for the data in the memcache. If the data is not found there, it searches the MapFiles.
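The write and read order described above can be sketched with a small in-memory model. ToyStore and its attribute names are hypothetical, the "disk" is just a Python list, and the flush threshold counts entries rather than bytes; none of this is HBase code.

```python
class ToyStore:
    """Toy model of one store: a log, an in-memory cache, and flushed files."""

    def __init__(self, flush_threshold=2):
        self.hlog = []            # write-ahead log entries, appended first
        self.memcache = {}        # in-memory cache of recent writes
        self.mapfiles = []        # flushed snapshots, newest last
        self.flush_threshold = flush_threshold

    def put(self, key, value):
        self.hlog.append((key, value))      # 1. record the write in the log
        self.memcache[key] = value          # 2. cache it in memory
        if len(self.memcache) >= self.flush_threshold:
            self.mapfiles.append(dict(self.memcache))   # 3. flush to "disk"
            self.memcache.clear()

    def get(self, key):
        if key in self.memcache:            # look in memory first
            return self.memcache[key]
        for mapfile in reversed(self.mapfiles):   # then the newest file first
            if key in mapfile:
                return mapfile[key]
        return None


store = ToyStore(flush_threshold=2)
store.put("row1", "v1")
store.put("row2", "v2")     # reaches the threshold and triggers a flush
store.put("row1", "v3")     # the newer value stays in the memcache
print(store.get("row1"), store.get("row2"), len(store.mapfiles))
```

Reading "row1" returns the value from memory, while "row2" is only found in the flushed file, which is the lookup order the paragraph describes.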

HBase Client
The HBase client is responsible for finding the HRegionServer that serves the required data. To do so, the client first communicates with the HMaster and finds the root region. This is the only communication between the client and the master. Once the root region is found, the client scans it to find the corresponding meta region, which in turn locates the HRegionServer that actually serves the data. After locating that HRegionServer, the client uses it to retrieve the required data. The location information is cached by the client, so the next request does not have to repeat this process.
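A minimal sketch of that lookup chain with a client-side cache follows. ToyClient, its dictionaries, and the server name "rs-213" are hypothetical, and the lookup is simplified to one location per table; a real client resolves locations per row range.

```python
class ToyClient:
    """Toy client that resolves a table to a region server and caches it."""

    def __init__(self, root_region, meta_regions):
        self.root_region = root_region      # table name -> meta region name
        self.meta_regions = meta_regions    # meta region name -> {table: server}
        self.cache = {}                     # table name -> region server
        self.lookups = 0                    # number of full lookups performed

    def locate(self, table):
        if table in self.cache:             # cached: skip the lookup chain
            return self.cache[table]
        self.lookups += 1
        meta_name = self.root_region[table]             # scan the root region
        server = self.meta_regions[meta_name][table]    # then the meta region
        self.cache[table] = server
        return server


client = ToyClient(
    root_region={"users": "meta-1"},
    meta_regions={"meta-1": {"users": "rs-213"}},
)
first = client.locate("users")
second = client.locate("users")     # served from the cache
print(first, second, client.lookups)
```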

HBase Service Interface
The HBase Thrift server and the HBase REST server provide a way for non-Java programs to access HBase.
 

On to the main topic

First, let's look at the simulated HBase cluster environment. It consists of 4 machines running the ZooKeeper, HBaseMaster, HRegionServer, and HDFS services. To demonstrate failover, there are two HBaseMaster servers and two HRegionServer servers, with an HBaseMaster and an HRegionServer sharing the same machine.
Note that in an HBase cluster the HBaseMaster only provides failover, not load balancing, while the HRegionServer provides both failover and load balancing.

The server list is as follows:
1. ZooKeeper: 192.168.20.214
2. HBaseMaster: 192.168.255.213 / 192.168.255.215
3. HRegionServer: 192.168.255.213 / 192.168.255.215
4. HDFS: 192.168.255.212

Architecture of the entire simulated environment:

Note that this is only a simulated environment. Because its focus is HBase, both the ZooKeeper and the HDFS services run as single instances.

Although only one HMaster is in use in an HBase cluster at any time, multiple HMasters can be started in the cluster. Only one of them actually does the work. The other started HMasters stay idle until the ZooKeeper server determines that communication with the currently running HMaster has timed out. When the running HMaster goes down, ZooKeeper hands over to the next HMaster.

Put simply, if the HMaster server goes down, ZooKeeper selects the next HMaster from the list so that it can take over the tasks of the failed HMaster. In other words, a Java client's HBase operations go through ZooKeeper, so if all nodes in the ZooKeeper cluster are down, the HBase cluster is down as well. HBase itself does not store the real data. The data is stored on HDFS, so HBase's data remains consistent, but if the HDFS file system goes down, the HBase cluster goes down too.

After an HMaster fails, a client accessing the HBase cluster first detects through ZooKeeper that the HMaster is not running normally and, after confirming this several times, connects to the next HMaster. At that point the backup HMaster takes effect. The result as seen in the IDE:

The output contains some exceptions, followed by the result set "name:java http://www.javabloger.com" and "name:java http://www.javabloger.com1". This is because I ran the killall java command on the serv215 machine to stop both the HMaster and the HRegionServer, and then immediately accessed the HBase cluster with the Java client. Exceptions were thrown during access, but after a certain number of retries the query returned its results. HBase goes through ZooKeeper to reach the real data. That is, ZooKeeper brought the standby HMaster into service to take over the tasks of the failed HMaster, and the new HBaseMaster took over the job of assigning work to the HRegionServers. When an HRegionServer fails, ZooKeeper notifies the HMaster, which reassigns that HRegionServer's tasks. This demonstrates that HBase failover works.
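The "exceptions first, results after some retries" behavior can be sketched as a plain retry loop. call_with_retries and FlakyCluster are hypothetical helpers written for this illustration, and the number of failures is arbitrary; this is not the retry logic of the HBase Java client.

```python
import time


def call_with_retries(operation, attempts=5, delay_seconds=0.0):
    """Call `operation` until it succeeds or `attempts` is exhausted.

    Returns a (result, attempts_used) tuple."""
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return operation(), attempt
        except ConnectionError as error:
            last_error = error
            time.sleep(delay_seconds)
    raise last_error


class FlakyCluster:
    """Fails the first `failures` calls, then answers normally."""

    def __init__(self, failures):
        self.failures = failures

    def query(self):
        if self.failures > 0:
            self.failures -= 1
            raise ConnectionError("master not available yet")
        return ["name:java http://www.javabloger.com"]


cluster = FlakyCluster(failures=3)
rows, attempts_used = call_with_retries(cluster.query, attempts=5)
print(rows, attempts_used)
```

A real client would normally wait between attempts; delay_seconds is left at zero here only so the example runs instantly.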

Remarks:
1. HBase failover is relatively slow. Do not expect it to switch over and recover within 1-2 seconds. Perhaps I simply have not found the parameters that speed up failover and recovery. I will keep following this issue.
2. The official website has an article about data synchronization in HBase 0.89.20100924. I tried running it in a so-called HBase virtual cluster environment, but after switching to a distributed environment of multiple machines, failover was slow. It is just as slow with HBase 0.20.6. I checked whether the network was the problem, but have not found the answer so far. Principles of data synchronization in the new HBase 0.89.20100924: (more information)
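One way to put a number on the delay is to compare two timestamps from the ZooKeeper log earlier in this article: the socket for session 0x1399f8281420000 was closed at 14:56:53,074, and that session expired at 14:59:24,002, with a session timeout of 180000 ms. The sketch below only does the subtraction. It assumes both entries are from 2012-09-07 and that the session belonged to the failed master, which the log itself does not state.

```python
from datetime import datetime

# Timestamp layout used by the log lines: date, time, comma, milliseconds.
LOG_TIME_FORMAT = "%Y-%m-%d %H:%M:%S,%f"

socket_closed = datetime.strptime("2012-09-07 14:56:53,074", LOG_TIME_FORMAT)
session_expired = datetime.strptime("2012-09-07 14:59:24,002", LOG_TIME_FORMAT)

# Time between losing the connection and ZooKeeper expiring the session.
gap_seconds = (session_expired - socket_closed).total_seconds()
print(gap_seconds)
```

The gap is of the same order as the 180000 ms session timeout in the log, which suggests that in that setup the session timeout, rather than the network, accounts for most of the switch time.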


You can leave a message or send an email to me. My contact information is njthnet # gmail.com.

