Integration of Hadoop Hive and Hbase
I. Introduction
Hive is a Hadoop-based data warehouse tool that maps structured data files to database tables and provides a complete SQL-like query capability; the SQL statements are converted into MapReduce jobs for execution. Its main advantage is a low learning cost: simple MapReduce statistics can be implemented quickly with SQL-like statements, without developing dedicated MapReduce applications, which makes Hive well suited to the statistical analysis of data warehouses.
The Hive/HBase integration is implemented through the external API interfaces that the two systems themselves expose; the communication relies mainly on the hive-hbase-handler.jar tool class, which translates between Hive table operations and HBase reads and writes.
II. Installation steps
1. Hadoop and HBase have been installed successfully.
Hadoop cluster configuration: http://blog.csdn.net/hguisu/article/details/723739
Hbase installation configuration: http://blog.csdn.net/hguisu/article/details/7244413
2. Copy the hbase-0.90.4.jar and zookeeper-3.3.2.jar to hive/lib.
NOTE: If another version of either file already exists under hive/lib (for example a different zookeeper jar), we recommend deleting it and using the version shipped with HBase.
3. Modify the hive-site.xml file in hive/conf and add the following content at the bottom:
<!--
<property>
  <name>hive.exec.scratchdir</name>
  <value>/usr/local/hive/tmp</value>
</property>
-->
<property>
  <name>hive.querylog.location</name>
  <value>/usr/local/hive/logs</value>
</property>
<property>
  <name>hive.aux.jars.path</name>
  <value>file:///usr/local/hive/lib/hive-hbase-handler-0.8.0.jar,file:///usr/local/hive/lib/hbase-0.90.4.jar,file:///usr/local/hive/lib/zookeeper-3.3.2.jar</value>
</property>
Note: If hive-site.xml does not exist, create it yourself, or rename hive-default.xml.template to hive-site.xml and edit that.
4. Copy the hbase-0.90.4.jar to hadoop/lib on all Hadoop nodes (including the master).
5. Copy the hbase-site.xml file under hbase/conf to hadoop/conf on all Hadoop nodes (including the master).
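Assuming Hadoop and HBase are installed under /usr/local (adjust the paths to your own layout), steps 4 and 5 amount to running something like the following on every node, master included:

cp /usr/local/hbase/hbase-0.90.4.jar /usr/local/hadoop/lib/
cp /usr/local/hbase/conf/hbase-site.xml /usr/local/hadoop/conf/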
Note: If you skip either of the two copy steps above, the following error may occur while Hive is running:
org.apache.hadoop.hbase.ZooKeeperConnectionException: HBase is able to connect to ZooKeeper but the connection closes immediately. This could be a sign that the server has too many connections (30 is the default). Consider inspecting your ZK server logs for that error and then make sure you are reusing HBaseConfiguration as often as you can. See HTable's javadoc for more information.
    at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.
III. Start Hive
1. Start a single node:
#bin/hive -hiveconf hbase.master=master:60000
(60000 is the default HBase master port; the port must match the one configured in your hbase-site.xml.)
2. Start the cluster:
#bin/hive -hiveconf hbase.zookeeper.quorum=node1,node2,node3
If hive.aux.jars.path is not configured in the hive-site.xml file, you can start Hive as follows instead:
bin/hive --auxpath /usr/local/hive/lib/hive-hbase-handler-0.8.0.jar,/usr/local/hive/lib/hbase-0.90.4.jar,/usr/local/hive/lib/zookeeper-3.3.2.jar -hiveconf hbase.zookeeper.quorum=node1,node2,node3
IV. Test
1. Create an HBase-backed table in Hive:
CREATE TABLE hbase_table_1(key int, value string)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf1:val")
TBLPROPERTIES ("hbase.table.name" = "xyz");
hbase.table.name defines the table name in HBase.
hbase.columns.mapping defines how the Hive columns map to HBase column families and qualifiers: here :key is the row key and cf1:val is the qualifier val in the column family cf1.
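For illustration only (the table and column names here are hypothetical), a table spanning two column families lists one mapping entry per Hive column, in order:

CREATE TABLE hbase_table_multi(key int, a string, b string)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf1:a,cf2:b")
TBLPROPERTIES ("hbase.table.name" = "multi");

The first Hive column maps to the HBase row key via :key; each remaining column names its family:qualifier pair.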
2. Use SQL to import data
1) Create a Hive data table:
CREATE TABLE pokes (foo INT, bar STRING);
2) Batch insert data:
hive> LOAD DATA LOCAL INPATH './examples/files/kv1.txt' OVERWRITE INTO TABLE pokes;
3) use SQL to import hbase_table_1:
hive> INSERT OVERWRITE TABLE hbase_table_1 SELECT * FROM pokes WHERE foo = 86;
3. View data
hive> SELECT * FROM hbase_table_1;
Now you can log on to Hbase to view the data.
# bin/hbase shell
hbase(main):001:0> describe 'xyz'
hbase(main):002:0> scan 'xyz'
hbase(main):003:0> put 'xyz', '2013', 'cf1:val', 'www.360buy.com'
Back in Hive, we can now see the data that was inserted through HBase.
4. Access an existing HBase table through Hive
Use CREATE EXTERNAL TABLE:
CREATE EXTERNAL TABLE hbase_table_2(key int, value string)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = "cf1:val")
TBLPROPERTIES ("hbase.table.name" = "some_existing_table");
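An external table can be queried like any other Hive table, and because it is EXTERNAL, dropping it removes only the Hive metadata; the underlying HBase table is left untouched:

hive> SELECT * FROM hbase_table_2;
hive> DROP TABLE hbase_table_2;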
Content reference: http://wiki.apache.org/hadoop/Hive/HBaseIntegration