Hadoop Series Three: HBase Distributed Installation


1 Overview

HBase is a distributed, column-oriented, scalable open-source database built on Hadoop. Use HBase when you need random, real-time read/write access to big data; it belongs to the NoSQL family. HBase uses Hadoop HDFS as its file storage system, Hadoop MapReduce to process the massive data it stores, and ZooKeeper for distributed coordination, distributed synchronization, and configuration management.

HBase Architecture:

LSM tree - solves the disk random-write problem (turns random writes into sequential ones);

HFile - solves the data-indexing problem (only with an index can reads be efficient);

WAL - solves data durability (persistence in the face of failures);

ZooKeeper - solves consistency of core data and cluster recovery;

Replication - introduces a MySQL-like data replication scheme to address availability;

In addition: automatic region splitting, automatic compaction (a technique associated with LSM), automatic load balancing, and automatic region migration.

An HBase cluster depends on a ZooKeeper ensemble: every node in the HBase cluster, as well as every client that accesses HBase, must be able to reach the ZooKeeper ensemble. HBase ships with a bundled ZooKeeper, but so that other applications can share it conveniently, a separately installed ZooKeeper ensemble is preferable. The ensemble is typically configured with an odd number of nodes. The Hadoop cluster, the ZooKeeper ensemble, and the HBase cluster are three separate clusters; they do not need to be deployed on the same physical nodes, and they communicate with one another over the network.

2 Installation and Configuration

2.1 Download and Install HBase

Download hbase-0.96.1.1-hadoop1-bin.tar.gz, unpack it under /usr, and rename the directory to hbase. The HBase version must match the Hadoop version; to check, compare the version number of hbase/lib/hadoop-core-*.jar with the version of Hadoop you are running. If they differ, you can copy Hadoop's own hadoop-core jar in its place, but there is no guarantee this will be problem-free.
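A quick way to perform that check (a sketch; the jar name pattern assumes the 0.96/Hadoop-1.x layout used in this article):

ls /usr/hbase/lib/hadoop-core-*.jar    # version is embedded in the jar name
hadoop version                         # should report the same release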

2.2 Setting environment variables

vim /etc/profile:

# set HBase path

export HBASE_HOME=/usr/hbase

export PATH=$PATH:$HBASE_HOME/bin
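To apply the variables to the current shell and confirm they resolve, something like the following works:

source /etc/profile
echo $HBASE_HOME    # expect /usr/hbase
which hbase         # expect /usr/hbase/bin/hbase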

2.3 Configure HBase

Edit the configuration file hbase-site.xml: vim /usr/hbase/conf/hbase-site.xml

Standalone:

<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>file:///tmp/hbase-${user.name}/hbase</value>
  </property>
</configuration>

Pseudo-distributed:

<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://localhost:9000/hbase</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>

Fully distributed:

1) Configure hbase-site.xml:

<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://192.168.56.1:9000/hbase</value>
    <description>HBase data storage directory</description>
  </property>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
    <description>The mode HBase runs in: false for standalone/pseudo-distributed, true for fully distributed</description>
  </property>
  <property>
    <name>hbase.master</name>
    <value>hdfs://192.168.56.1:60000</value>
    <description>Specifies the master location</description>
  </property>
  <property>
    <name>hbase.zookeeper.property.dataDir</name>
    <value>/var/lib/zookeeper</value>
  </property>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>192.168.56.1,192.168.56.101,192.168.56.102,192.168.56.103,192.168.56.104</value>
    <description>Specifies the ZooKeeper ensemble</description>
  </property>
  <property>
    <name>hbase.master.info.bindAddress</name>
    <value>192.168.56.1</value>
    <description>The bind address for the HBase Master web UI</description>
  </property>
</configuration>
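Note that hbase.rootdir must use the same host and port as fs.default.name in Hadoop's core-site.xml, or HBase will not find HDFS. A quick check, assuming the Hadoop configuration lives in /usr/hadoop/conf as set in hbase-env.sh below:

grep -A 1 fs.default.name /usr/hadoop/conf/core-site.xml
# the value printed must match the hdfs://192.168.56.1:9000 used above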

2) Edit the configuration file regionservers:

192.168.56.101

192.168.56.102

192.168.56.103

192.168.56.104

3) Set the environment variables in hbase-env.sh:

export JAVA_HOME=/usr/java/jdk1.7.0_45/

export HBASE_CLASSPATH=/usr/hadoop/conf

export HBASE_HEAPSIZE=2048

export HBASE_MANAGES_ZK=false

Note:

JAVA_HOME is the Java installation directory. HBASE_CLASSPATH points to the directory holding the Hadoop configuration files so that HBase can find the HDFS configuration; because this article deploys Hadoop and HBase on the same physical nodes, it points to the conf directory under the Hadoop installation path. HBASE_HEAPSIZE is in MB and can be set according to need and the memory actually available; the default is 1000. HBASE_MANAGES_ZK=false tells HBase to use an existing ZooKeeper ensemble rather than the bundled one.
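Because HBASE_MANAGES_ZK=false, the external ensemble must be running before HBase starts. ZooKeeper answers the four-letter command ruok with imok, so a simple reachability check looks like this (assuming nc is installed and the ensemble uses the default client port 2181):

echo ruok | nc 192.168.56.1 2181    # expect: imok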

2.4 Copy HBase to each node, then configure the environment variables on each node

scp -r /usr/hbase <node-ip>:/usr
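With the regionserver addresses from section 2.3, a small loop avoids repeating the command (a sketch; adjust the IP list to your cluster):

for ip in 192.168.56.101 192.168.56.102 192.168.56.103 192.168.56.104; do
  scp -r /usr/hbase $ip:/usr
done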

3 Start and Stop HBase

Start HBase: HDFS and ZooKeeper must be running first; the start order is HDFS -> ZooKeeper -> HBase.

On server1, start all nodes: start-hbase.sh

Stop HBase: stop-hbase.sh

Connect to HBase and create a table: hbase shell

HBase Shell; enter 'help' for a list of supported commands.
Type "exit" to leave the HBase Shell.
Version 0.96.1.1-hadoop1, rUnknown, Tue Dec 11:52:14 PST 2013

hbase(main):001:0>

View status:

hbase(main):001:0> status
4 servers, 0 dead, 2.2500 average load
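Another way to confirm the daemons came up is jps on each node; the processes noted below are what the roles above imply, not output captured from this cluster:

jps
# on the master:        HMaster (plus NameNode, etc.)
# on each regionserver: HRegionServer (plus DataNode, etc.)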

4 Testing and Web Viewing

4.1 Create a Test Table

Create a table named Sgt with a single column family named cf. You can list all tables to verify the creation, then insert some values.

hbase(main):003:0> create 'Sgt', 'cf'
0 row(s) in 1.2200 seconds

hbase(main):003:0> list
Sgt
1 row(s) in 0.0550 seconds

hbase(main):004:0> put 'Sgt', 'row1', 'cf:a', 'value1'
0 row(s) in 0.0560 seconds

hbase(main):005:0> put 'Sgt', 'row2', 'cf:b', 'value2'
0 row(s) in 0.0370 seconds

hbase(main):006:0> put 'Sgt', 'row3', 'cf:c', 'value3'
0 row(s) in 0.0450 seconds

Check the insert by scanning the table:

hbase(main):005:0> scan 'Sgt'

Get a single row as follows:

hbase(main):008:0> get 'Sgt', 'row1'

Disable and then drop the table to clean up what was just done:

hbase(main):012:0> disable 'Sgt'
0 row(s) in 1.0930 seconds

hbase(main):013:0> drop 'Sgt'
0 row(s) in 0.0770 seconds

Exporting and importing:

hbase org.apache.hadoop.hbase.mapreduce.Driver export Sgt Sgt

The exported table is placed under the current user's directory in the Hadoop file system, in the Sgt folder. For example, the directory structure after an export looks like this (the listing below shows a table named small):

hadoop dfs -ls
Found 1 items
drwxr-xr-x   - hadoop supergroup          0 2013-10-22 10:44 /user/hadoop/small

hadoop dfs -ls ./small
Found 3 items
-rw-r--r--   2 hadoop supergroup          0 2013-10-22 10:44 /user/hadoop/small/_SUCCESS
drwxr-xr-x   - hadoop supergroup          0 2013-10-22 10:44 /user/hadoop/small/_logs
-rw-r--r--   2 hadoop supergroup        285 2013-10-22 10:44 /user/hadoop/small/part-m-00000

To import this table into HBase on another cluster, put part-m-00000 into that cluster's Hadoop, assuming the destination path is also:

/user/hadoop/small/

In addition, the target HBase must already contain a table built with the same structure.
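That is, before running the import, create the table on the target cluster with the same column family, reusing the definition from section 4.1:

hbase shell
hbase(main):001:0> create 'Sgt', 'cf'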

Then import the data from Hadoop into HBase:

hbase org.apache.hadoop.hbase.mapreduce.Driver import Sgt part-m-00000

In this way, HBase data can be imported into another HBase database without surprises.
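As an alternative to downloading part-m-00000 and re-uploading it, the exported directory can be copied directly between the two HDFS instances with distcp (a sketch; <target-namenode> is a placeholder for the other cluster's NameNode address):

hadoop distcp hdfs://192.168.56.1:9000/user/hadoop/Sgt hdfs://<target-namenode>:9000/user/hadoop/Sgt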

4.2 Web View

The following web interfaces are used to access and monitor the running state of the Hadoop and HBase daemons:

Daemon                    Default Port   Configuration Parameter
NameNode (HDFS)           50070          dfs.http.address
DataNodes                 50075          dfs.datanode.http.address
SecondaryNameNode         50090          dfs.secondary.http.address
Backup/Checkpoint node*   50105          dfs.backup.http.address
JobTracker (MR)           50030          mapred.job.tracker.http.address
TaskTrackers              50060          mapred.task.tracker.http.address
HMaster (HBase)           60010          hbase.master.info.port
HRegionServer             60030          hbase.regionserver.info.port

http://192.168.56.1:60010/master-status
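The same page can be fetched from the command line on any machine that can reach the master:

curl -s http://192.168.56.1:60010/master-status | head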

5 Summary

This article covered HBase installation and configuration in three modes: standalone, pseudo-distributed, and fully distributed, with the focus on installing and configuring a distributed HBase cluster. Subsequent articles will cover Chukwa clusters, Pig, and more.
