Hadoop Series Three: HBase Distributed Installation


1 Overview

HBase is a distributed, column-oriented, scalable open-source database built on Hadoop. Use HBase when you need random, real-time read/write access to big data; it belongs to the NoSQL family. HBase uses Hadoop HDFS as its file storage system, Hadoop MapReduce to process the massive data it stores, and ZooKeeper for distributed coordination, distributed synchronization, and configuration management.

HBase Architecture:

LSM tree - solves the disk random-write problem (turns random writes into sequential ones);

HFile - solves the data-indexing problem (only with an index can reads be efficient);

WAL - solves data durability (persistence in the face of failures);

ZooKeeper - solves consistency of core data and cluster recovery;

Replication - introduces a MySQL-like data replication scheme to address availability;

In addition: automatic region splitting, automatic compaction (a technique associated with LSM), automatic load balancing, and automatic region migration.

An HBase cluster depends on a ZooKeeper ensemble: every node in the HBase cluster, as well as every client that accesses HBase, must be able to reach the ZooKeeper ensemble. HBase ships with a bundled ZooKeeper, but so that other applications can share it conveniently, a separately installed ZooKeeper ensemble is preferable. The ensemble is typically configured with an odd number of nodes. The Hadoop cluster, the ZooKeeper ensemble, and the HBase cluster are three separate clusters; they do not need to be deployed on the same physical nodes, and they communicate with one another over the network.

2 Installation and Configuration

2.1 Download and Install HBase

Download hbase-0.96.1.1-hadoop1-bin.tar.gz, unpack it under /usr, and rename the directory to hbase. The HBase version must match the Hadoop version; to check, compare the version number of hbase/lib/hadoop-core-*.jar with the version of Hadoop you are running. If they differ, you can copy Hadoop's own hadoop-core jar in its place, but there is no guarantee this will be problem-free.
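A quick way to perform that check (a sketch; the jar name pattern assumes the 0.96/Hadoop-1.x layout used in this article):

ls /usr/hbase/lib/hadoop-core-*.jar    # version is embedded in the jar name
hadoop version                         # should report the same release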

2.2 Setting environment variables

vim /etc/profile:

# set HBase path

export HBASE_HOME=/usr/hbase

export PATH=$PATH:$HBASE_HOME/bin
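To apply the variables to the current shell and confirm they resolve, something like the following works:

source /etc/profile
echo $HBASE_HOME    # expect /usr/hbase
which hbase         # expect /usr/hbase/bin/hbase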

2.3 Configure HBase

Edit the configuration file hbase-site.xml: vim /usr/hbase/conf/hbase-site.xml

Standalone:

<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>file:///tmp/hbase-${user.name}/hbase</value>
  </property>
</configuration>

Pseudo-distributed:

<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://localhost:9000/hbase</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>

Fully distributed:

1) Configure hbase-site.xml:

<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://192.168.56.1:9000/hbase</value>
    <description>HBase data storage directory</description>
  </property>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
    <description>The mode HBase runs in: false for standalone/pseudo-distributed, true for fully distributed</description>
  </property>
  <property>
    <name>hbase.master</name>
    <value>hdfs://192.168.56.1:60000</value>
    <description>Specifies the master location</description>
  </property>
  <property>
    <name>hbase.zookeeper.property.dataDir</name>
    <value>/var/lib/zookeeper</value>
  </property>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>192.168.56.1,192.168.56.101,192.168.56.102,192.168.56.103,192.168.56.104</value>
    <description>Specifies the ZooKeeper ensemble</description>
  </property>
  <property>
    <name>hbase.master.info.bindAddress</name>
    <value>192.168.56.1</value>
    <description>The bind address for the HBase Master web UI</description>
  </property>
</configuration>
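Note that hbase.rootdir must use the same host and port as fs.default.name in Hadoop's core-site.xml, or HBase will not find HDFS. A quick check, assuming the Hadoop configuration lives in /usr/hadoop/conf as set in hbase-env.sh below:

grep -A 1 fs.default.name /usr/hadoop/conf/core-site.xml
# the value printed must match the hdfs://192.168.56.1:9000 used above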

2) Edit the configuration file regionservers:

192.168.56.101

192.168.56.102

192.168.56.103

192.168.56.104

3) Set the environment variables in hbase-env.sh:

export JAVA_HOME=/usr/java/jdk1.7.0_45/

export HBASE_CLASSPATH=/usr/hadoop/conf

export HBASE_HEAPSIZE=2048

export HBASE_MANAGES_ZK=false

Note:

JAVA_HOME is the Java installation directory. HBASE_CLASSPATH points to the directory holding the Hadoop configuration files so that HBase can find the HDFS configuration; because this article deploys Hadoop and HBase on the same physical nodes, it points to the conf directory under the Hadoop installation path. HBASE_HEAPSIZE is in MB and can be set according to need and the memory actually available; the default is 1000. HBASE_MANAGES_ZK=false tells HBase to use an existing ZooKeeper ensemble rather than the bundled one.
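Because HBASE_MANAGES_ZK=false, the external ensemble must be running before HBase starts. ZooKeeper answers the four-letter command ruok with imok, so a simple reachability check looks like this (assuming nc is installed and the ensemble uses the default client port 2181):

echo ruok | nc 192.168.56.1 2181    # expect: imok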

2.4 Copy HBase to each node, then configure the environment variables on each node

scp -r /usr/hbase <node-ip>:/usr
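With the regionserver addresses from section 2.3, a small loop avoids repeating the command (a sketch; adjust the IP list to your cluster):

for ip in 192.168.56.101 192.168.56.102 192.168.56.103 192.168.56.104; do
  scp -r /usr/hbase $ip:/usr
done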

3 Start and Stop HBase

Start HBase: HDFS and ZooKeeper must be running first; the start order is HDFS -> ZooKeeper -> HBase.

On server1, start all nodes: start-hbase.sh

Stop HBase: stop-hbase.sh

Connect to HBase and create a table: hbase shell

HBase Shell; enter 'help' for a list of supported commands.
Type "exit" to leave the HBase Shell.
Version 0.96.1.1-hadoop1, rUnknown, Tue Dec 11:52:14 PST 2013

hbase(main):001:0>

View status:

hbase(main):001:0> status
4 servers, 0 dead, 2.2500 average load
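Another way to confirm the daemons came up is jps on each node; the processes noted below are what the roles above imply, not output captured from this cluster:

jps
# on the master:        HMaster (plus NameNode, etc.)
# on each regionserver: HRegionServer (plus DataNode, etc.)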

4 Testing and Web Viewing

4.1 Create a Test Table

Create a table named Sgt with a single column family named cf. You can list all tables to verify the creation, then insert some values.

hbase(main):003:0> create 'Sgt', 'cf'
0 row(s) in 1.2200 seconds

hbase(main):003:0> list
Sgt
1 row(s) in 0.0550 seconds

hbase(main):004:0> put 'Sgt', 'row1', 'cf:a', 'value1'
0 row(s) in 0.0560 seconds

hbase(main):005:0> put 'Sgt', 'row2', 'cf:b', 'value2'
0 row(s) in 0.0370 seconds

hbase(main):006:0> put 'Sgt', 'row3', 'cf:c', 'value3'
0 row(s) in 0.0450 seconds

Check the insert by scanning the table:

hbase(main):005:0> scan 'Sgt'

Get a single row as follows:

hbase(main):008:0> get 'Sgt', 'row1'

Disable and then drop the table to clean up what was just done:

hbase(main):012:0> disable 'Sgt'
0 row(s) in 1.0930 seconds

hbase(main):013:0> drop 'Sgt'
0 row(s) in 0.0770 seconds

Exporting and importing:

hbase org.apache.hadoop.hbase.mapreduce.Driver export Sgt Sgt

The exported table is placed under the current user's directory in the Hadoop file system, in the Sgt folder. For example, the directory structure after an export looks like this (the listing below shows a table named small):

hadoop dfs -ls
Found 1 items
drwxr-xr-x   - hadoop supergroup          0 2013-10-22 10:44 /user/hadoop/small

hadoop dfs -ls ./small
Found 3 items
-rw-r--r--   2 hadoop supergroup          0 2013-10-22 10:44 /user/hadoop/small/_SUCCESS
drwxr-xr-x   - hadoop supergroup          0 2013-10-22 10:44 /user/hadoop/small/_logs
-rw-r--r--   2 hadoop supergroup        285 2013-10-22 10:44 /user/hadoop/small/part-m-00000

To import this table into HBase on another cluster, put part-m-00000 into that cluster's Hadoop, assuming the destination path is also:

/user/hadoop/small/

In addition, the target HBase must already contain a table built with the same structure.
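That is, before running the import, create the table on the target cluster with the same column family, reusing the definition from section 4.1:

hbase shell
hbase(main):001:0> create 'Sgt', 'cf'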

Then import the data from Hadoop into HBase:

hbase org.apache.hadoop.hbase.mapreduce.Driver import Sgt part-m-00000

In this way, HBase data can be imported into another HBase database without surprises.
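As an alternative to downloading part-m-00000 and re-uploading it, the exported directory can be copied directly between the two HDFS instances with distcp (a sketch; <target-namenode> is a placeholder for the other cluster's NameNode address):

hadoop distcp hdfs://192.168.56.1:9000/user/hadoop/Sgt hdfs://<target-namenode>:9000/user/hadoop/Sgt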

4.2 Web View

The following web interfaces are used to access and monitor the running state of the Hadoop and HBase daemons:

Daemon                    Default Port   Configuration Parameter
NameNode (HDFS)           50070          dfs.http.address
DataNodes                 50075          dfs.datanode.http.address
SecondaryNameNode         50090          dfs.secondary.http.address
Backup/Checkpoint node*   50105          dfs.backup.http.address
JobTracker (MR)           50030          mapred.job.tracker.http.address
TaskTrackers              50060          mapred.task.tracker.http.address
HMaster (HBase)           60010          hbase.master.info.port
HRegionServer             60030          hbase.regionserver.info.port

http://192.168.56.1:60010/master-status
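The same page can be fetched from the command line on any machine that can reach the master:

curl -s http://192.168.56.1:60010/master-status | head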

5 Summary

This article covered HBase installation and configuration in three modes: standalone, pseudo-distributed, and fully distributed, with the focus on installing and configuring a distributed HBase cluster. Subsequent articles will cover Chukwa clusters, Pig, and more.
