Data import (i): Hive on HBase

Source: Internet
Author: User
Tags: mapr

Integrating Hive with HBase makes effective use of HBase's storage features, such as row-level updates and column indexes. Keep the HBase jar versions consistent during integration. The integration works by having Hive and HBase communicate through their APIs, relying mainly on the hive-hbase-handler.jar utility class.
The process of integrating Hive with HBase is as follows:
1. Copy (overwriting any existing files) hbase-common-0.96.2-hadoop2.jar and zookeeper-3.4.5.jar from HBASE_HOME to the HIVE_HOME/lib folder.
2. Edit the hive-site.xml file under HIVE_HOME/conf and add the following (adjust to your actual environment):
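One way to check the jar consistency mentioned above is to compare the copies byte for byte. The following is a minimal sketch, not part of the original procedure: the function name check_jars_match and the directories you pass to it are hypothetical, and it only relies on the standard cmp utility.

```shell
# Compare each named jar in two directories and report whether the copies are identical.
# check_jars_match is a hypothetical helper; pass your own directories and jar names.
check_jars_match() {
  src_dir=$1
  dst_dir=$2
  shift 2
  rc=0
  for jar in "$@"; do
    # cmp -s is silent and succeeds only when both files exist and are identical
    if cmp -s "$src_dir/$jar" "$dst_dir/$jar" 2>/dev/null; then
      echo "OK       $jar"
    else
      echo "MISMATCH $jar"
      rc=1
    fi
  done
  return $rc
}
```

For example, check_jars_match "$HBASE_HOME/lib" "$HIVE_HOME/lib" hbase-common-0.96.2-hadoop2.jar returns a non-zero status if the jar in the Hive lib folder differs from the HBase one or is missing (the lib subdirectory of HBASE_HOME is an assumption here).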

<property>
  <name>hive.querylog.location</name>
  <value>$HIVE_HOME/logs</value>
</property>
<property>
  <name>hive.aux.jars.path</name>
  <value>file:///hive-0.7.1/lib/hive-hbase-handler-0.7.1.jar,file:///hive-0.7.1/lib/hbase-common-0.96.2-hadoop2.jar,file:///hive-0.7.1/lib/zookeeper-3.3.2.jar</value>
</property>
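Because the hive.aux.jars.path value is a long comma-separated list of file:// URIs, it is easy to mistype. A small sketch like the following could assemble it from a lib directory and a list of jar names. The function name build_aux_jars_path is hypothetical, and the directory is assumed to be an absolute path.

```shell
# Build a comma-separated list of file:// URIs for hive.aux.jars.path.
# build_aux_jars_path is a hypothetical helper; $1 must be an absolute directory.
build_aux_jars_path() {
  lib_dir=$1
  shift
  result=
  for jar in "$@"; do
    # Prepend a comma only when result already holds an entry
    result="${result:+$result,}file://$lib_dir/$jar"
  done
  printf '%s\n' "$result"
}
```

Calling build_aux_jars_path /hive-0.7.1/lib hive-hbase-handler-0.7.1.jar hbase-common-0.96.2-hadoop2.jar zookeeper-3.3.2.jar prints a string in the same form as the value shown in the configuration above, ready to paste into hive-site.xml.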

3. Copy hbase-common-0.96.2-hadoop2.jar to hadoop/lib on all Hadoop nodes (including the master).
4. Copy the hbase-site.xml file under hbase/conf to hadoop/conf on all Hadoop nodes (including the master).

Note: If steps 3 and 4 are skipped, the following error is likely to occur when you run Hive:
org.apache.hadoop.hbase.ZooKeeperConnectionException: HBase is able to connect to ZooKeeper but the connection closes immediately.
This could be a sign that the server has too many connections (30 is the default). Consider inspecting your ZK server logs for that error and
then make sure you are reusing HBaseConfiguration as often as you can. See HTable's javadoc for more information. at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.
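Steps 3 and 4 have to be repeated on every node, so they lend themselves to a small script. The sketch below only prints the scp commands so they can be reviewed before anything is copied. The function name print_copy_commands, the host names, and the assumption that the jar sits in the lib subdirectory of the HBase install are all illustrative and not taken from the original setup.

```shell
# Print the scp commands that copy the HBase jar and hbase-site.xml to each node.
# Usage: print_copy_commands <hbase_dir> <hadoop_dir> <node>...
# Assumption: the jar lives in <hbase_dir>/lib; adjust if your layout differs.
print_copy_commands() {
  hbase_dir=$1
  hadoop_dir=$2
  shift 2
  for node in "$@"; do
    printf 'scp %s/lib/hbase-common-0.96.2-hadoop2.jar %s:%s/lib/\n' "$hbase_dir" "$node" "$hadoop_dir"
    printf 'scp %s/conf/hbase-site.xml %s:%s/conf/\n' "$hbase_dir" "$node" "$hadoop_dir"
  done
}
```

After checking the output of, say, print_copy_commands /opt/hbase /opt/hadoop master node1 node2, the same output can be piped into sh to perform the copies.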

5. Start Hive
Single-node startup: bin/hive -hiveconf hbase.master=master:60000
If hive.aux.jars.path is not configured in hive-site.xml, Hive can be started as follows:
hive --auxpath /opt/mapr/hive/hive-0.7.1/lib/hive-hbase-handler-0.7.1.jar,/opt/mapr/hive/hive-0.7.1/lib/hbase-0.90.4.jar,/opt/mapr/hive/hive-0.7.1/lib/zookeeper-3.3.2.jar -hiveconf hbase.master=localhost:60000

Cluster startup: bin/hive -hiveconf hbase.zookeeper.quorum=node1,node2,node3 (all ZooKeeper nodes)
Testing shows that after adding the following to the Hive configuration file hive-site.xml, Hive can be started with HBase integration without any extra parameters:

<property>
  <name>hive.zookeeper.quorum</name>
  <value>node1,node2,node3</value>
  <description>The list of zookeeper servers to talk to. This is only needed for read/write locks.</description>
</property>

6. Test after startup
(1). Create the HBase table hbase_student

hbase> create 'hbase_student', 'info'

(2). Create the Hive external table hive_student and map it to the hbase_student table

Integrating Hive with HBase requires a mapping between the Hive table and the HBase table: the columns and column types of the Hive table are associated with the column families and column qualifiers of the HBase table.

In the mapping, :key refers to the HBase row key and is mapped to a field in Hive, while a column within a column family is written as cf:q (column family:qualifier).

CREATE EXTERNAL TABLE hive_student (rowkey string, name string, age int, phone string) STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,info:name,info:age,info:phone") TBLPROPERTIES ("hbase.table.name" = "hbase_student");
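The entries in hbase.columns.mapping are matched to the Hive columns by position, one entry per column and with no spaces between entries. As an illustration only, the following sketch prints that pairing for two comma-separated lists and reports a mismatch when their lengths differ. The function name show_mapping is hypothetical. It does not talk to Hive or HBase.

```shell
# Pair Hive column names with hbase.columns.mapping entries by position.
# show_mapping is a hypothetical helper that works on two comma-separated strings.
show_mapping() {
  hive_cols=$1
  hbase_map=$2
  while [ -n "$hive_cols" ] && [ -n "$hbase_map" ]; do
    # Print the first remaining entry of each list
    printf '%s -> %s\n' "${hive_cols%%,*}" "${hbase_map%%,*}"
    # Drop the entry just printed; clear the list when no comma is left
    case $hive_cols in *,*) hive_cols=${hive_cols#*,} ;; *) hive_cols= ;; esac
    case $hbase_map in *,*) hbase_map=${hbase_map#*,} ;; *) hbase_map= ;; esac
  done
  # Leftovers on either side mean the two lists have different lengths
  if [ -n "$hive_cols" ] || [ -n "$hbase_map" ]; then
    echo "column count mismatch" >&2
    return 1
  fi
}
```

For the table above, show_mapping "rowkey,name,age,phone" ":key,info:name,info:age,info:phone" pairs rowkey with :key and each remaining column with its info qualifier.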

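7. Data import and validation:
(1). Create the table data_student over the tab-separated input files (the column list is assumed to match hive_student)

CREATE TABLE data_student (rowkey string, name string, age int, phone string) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' LOCATION '/test/hbase/tsv/input/';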

(2). Import the data into the hbase_student table via hive_student

SET hive.hbase.bulk=true;
INSERT OVERWRITE TABLE hive_student SELECT * FROM data_student;

Note: If you encounter the exception java.lang.IllegalArgumentException: Property value must not be null, you need Hive 0.13.0 or later.
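For the validation part of step 7, one option is to look at the data from both sides: count the rows through the Hive table and scan a few rows directly in HBase. The sketch below assumes the hive and hbase commands are on the PATH. The function names count_in_hive and scan_in_hbase are hypothetical.

```shell
# Count rows as seen through a Hive table (here, the table mapped onto HBase).
count_in_hive() {
  hive -e "SELECT COUNT(*) FROM $1"
}

# Scan the first N rows of an HBase table by piping a command into the HBase shell.
scan_in_hbase() {
  echo "scan '$1', {LIMIT => $2}" | hbase shell
}
```

Running count_in_hive hive_student and scan_in_hbase hbase_student 5 after the import should show the imported rows on both sides. Comparing the count with the number of lines in the input files is a further check.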
