Integrating Hive with HBase lets Hive make use of HBase's storage characteristics, such as row-level updates and column indexes. Pay attention to keeping the HBase jar versions consistent during integration. The integration works by having Hive and HBase communicate through their APIs, relying mainly on the classes in hive-hbase-handler.jar.
The process of integrating Hive with HBase is as follows:
1. Copy hbase-common-0.96.2-hadoop2.jar and zookeeper-3.4.5.jar from $HBASE_HOME into the $HIVE_HOME/lib folder, overwriting any existing copies
2. Edit the hive-site.xml file under $HIVE_HOME/conf and add the following (adjust the paths and versions to your installation):
<property>
  <name>hive.querylog.location</name>
  <value>$HIVE_HOME/logs</value>
</property>
<property>
  <name>hive.aux.jars.path</name>
  <value>file:///hive-0.7.1/lib/hive-hbase-handler-0.7.1.jar,file:///hive-0.7.1/lib/hbase-common-0.96.2-hadoop2.jar,file:///hive-0.7.1/lib/zookeeper-3.3.2.jar</value>
</property>
3. Copy hbase-common-0.96.2-hadoop2.jar to hadoop/lib on all Hadoop nodes (including the master)
4. Copy the hbase-site.xml file under hbase/conf to hadoop/conf on all Hadoop nodes (including the master).
Note: If steps 3 and 4 are skipped, the following error is likely to occur when you run Hive:
org.apache.hadoop.hbase.ZooKeeperConnectionException: HBase is able to connect to ZooKeeper but the connection closes immediately.
This could be a sign that the server has too many connections (30 is the default). Consider inspecting your ZK server logs for that error and
then make sure you are reusing HBaseConfiguration as often as you can. See HTable's javadoc for more information. at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.
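The jar consistency that steps 1 and 3 depend on can be checked mechanically. The following is a minimal Python sketch that compares jar file names from two lib directories and reports artifacts whose versions differ. The helper names (split_jar_name, find_version_mismatches, list_jars) are hypothetical, and the sketch assumes jar names follow the pattern artifact, hyphen, version, where the version starts with a digit.

```python
import re
from pathlib import Path

# Assumption: jar names look like "<artifact>-<version>.jar" and the version
# starts with a digit, e.g. "hbase-common-0.96.2-hadoop2.jar".
_JAR_RE = re.compile(r"^(?P<artifact>.+?)-(?P<version>\d.*)\.jar$")


def split_jar_name(file_name):
    """Return (artifact, version) for a jar file name, or None if it does not match."""
    match = _JAR_RE.match(file_name.lower())
    if match is None:
        return None
    return match.group("artifact"), match.group("version")


def find_version_mismatches(reference_jars, target_jars, artifacts):
    """Compare versions of the named artifacts between two lists of jar file names.

    Returns {artifact: (reference_version, target_version)} for every artifact
    that appears in both lists with different versions.
    """
    def index(names):
        found = {}
        for name in names:
            parsed = split_jar_name(name)
            if parsed and parsed[0] in artifacts:
                found[parsed[0]] = parsed[1]
        return found

    ref, tgt = index(reference_jars), index(target_jars)
    return {a: (ref[a], tgt[a]) for a in ref if a in tgt and ref[a] != tgt[a]}


def list_jars(lib_dir):
    """List jar file names in a directory (hypothetical helper for a real install)."""
    return [p.name for p in Path(lib_dir).glob("*.jar")]
```

For example, calling find_version_mismatches(list_jars(hbase_lib), list_jars(hive_lib), {"hbase-common", "zookeeper"}) with the two lib directories of an installation returns an empty dict when the copied jars agree. Applied to the file names quoted in this article, it would flag zookeeper, which appears as 3.4.5 in step 1 and as 3.3.2 in the aux jars path.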
5. Start Hive
Single node: bin/hive -hiveconf hbase.master=master:60000
If hive.aux.jars.path is not configured in the hive-site.xml file, Hive can be started with the jars passed on the command line instead:
hive --auxpath /opt/mapr/hive/hive-0.7.1/lib/hive-hbase-handler-0.7.1.jar,/opt/mapr/hive/hive-0.7.1/lib/hbase-0.90.4.jar,/opt/mapr/hive/hive-0.7.1/lib/zookeeper-3.3.2.jar -hiveconf hbase.master=localhost:60000
Cluster: bin/hive -hiveconf hbase.zookeeper.quorum=node1,node2,node3 (list all ZooKeeper nodes)
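The start commands above are easy to get wrong by hand, since the --auxpath value is a single comma-separated list and a stray space splits it into separate arguments. As a hedged sketch, the following Python function assembles the argument list from a list of jar paths and a dict of -hiveconf settings. The function name build_hive_command and its parameters are hypothetical; only the --auxpath and -hiveconf flags come from the commands shown here.

```python
import shlex


def build_hive_command(aux_jars, hiveconf, hive_bin="bin/hive"):
    """Return the argument list for starting the Hive CLI.

    aux_jars: jar paths passed as one comma-separated --auxpath value.
    hiveconf: {property: value} pairs, each passed as -hiveconf key=value.
    """
    for jar in aux_jars:
        # A space or comma inside a path would corrupt the comma-separated list.
        if "," in jar or any(ch.isspace() for ch in jar):
            raise ValueError(f"invalid jar path: {jar!r}")
    args = [hive_bin]
    if aux_jars:
        args += ["--auxpath", ",".join(aux_jars)]
    for key, value in hiveconf.items():
        args += ["-hiveconf", f"{key}={value}"]
    return args


if __name__ == "__main__":
    command = build_hive_command(
        ["/opt/mapr/hive/hive-0.7.1/lib/hive-hbase-handler-0.7.1.jar"],
        {"hbase.zookeeper.quorum": "node1,node2,node3"},
    )
    # shlex.join (Python 3.8+) renders the list as a shell-safe command line.
    print(shlex.join(command))
```

The returned list can also be handed directly to subprocess.run, which avoids shell quoting altogether.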
Testing shows that once the following property is added to hive-site.xml, Hive can be started with HBase integration without any extra parameters:
<property>
  <name>hive.zookeeper.quorum</name>
  <value>node1,node2,node3</value>
  <description>The list of zookeeper servers to talk to. This is only needed for read/write locks.</description>
</property>
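When the same settings have to be written for several clusters, the property entries can be generated rather than typed. The sketch below uses Python's standard xml.etree.ElementTree to render entries in the name/value layout that hive-site.xml uses. The helper names (make_property, render_properties, aux_jars_value) are hypothetical, and the output is only the property fragment, not a complete configuration file.

```python
import xml.etree.ElementTree as ET


def make_property(name, value, description=None):
    """Build one <property> element with <name>, <value> and optional <description>."""
    prop = ET.Element("property")
    ET.SubElement(prop, "name").text = name
    ET.SubElement(prop, "value").text = value
    if description is not None:
        ET.SubElement(prop, "description").text = description
    return prop


def render_properties(settings):
    """Render {name: value} as <property> elements, one element per line."""
    return "\n".join(
        ET.tostring(make_property(name, value), encoding="unicode")
        for name, value in settings.items()
    )


def aux_jars_value(jar_paths):
    """Join absolute jar paths into a comma-separated list of file:// URIs."""
    return ",".join("file://" + path for path in jar_paths)


if __name__ == "__main__":
    print(render_properties({
        "hive.aux.jars.path": aux_jars_value(["/hive-0.7.1/lib/hive-hbase-handler-0.7.1.jar"]),
        "hive.zookeeper.quorum": "node1,node2,node3",
    }))
```

Generating the fragment this way keeps tag names lowercase and values free of stray spaces, which are the two mistakes most likely to make Hive ignore a property.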
6. Test after startup
(1). Create the HBase table hbase_student
hbase> create 'hbase_student', 'info'
(2). Create the Hive external table hive_student and map it to the hbase_student table
Integrating Hive with HBase requires a mapping between the Hive table and the HBase table: the Hive table's columns and column types are associated with the HBase table's column families and column qualifiers.
In the mapping, :key refers to the HBase row key, and a column inside a column family is written as cf:q (column family, colon, qualifier).
CREATE EXTERNAL TABLE hive_student (rowkey string, name string, age int, phone string)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,info:name,info:age,info:phone")
TBLPROPERTIES ("hbase.table.name" = "hbase_student");
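The hbase.columns.mapping string needs one entry per Hive column, in the same order as the column list, and as far as I know any whitespace inside the string is treated as part of a column name. The following Python sketch builds the mapping from the Hive column list and checks an existing mapping against those two rules. The function names are hypothetical, and the sketch assumes all non-key columns live in a single column family, as in hive_student.

```python
def build_columns_mapping(hive_columns, column_family, key_column):
    """Build an hbase.columns.mapping value for columns stored in one column family.

    hive_columns: Hive column names in table order.
    key_column: the Hive column holding the HBase row key (mapped to ":key").
    """
    if key_column not in hive_columns:
        raise ValueError(f"key column {key_column!r} is not in the column list")
    entries = [
        ":key" if column == key_column else f"{column_family}:{column}"
        for column in hive_columns
    ]
    return ",".join(entries)


def check_columns_mapping(hive_columns, mapping):
    """Raise ValueError unless the mapping has one whitespace-free entry per Hive column."""
    entries = mapping.split(",")
    if len(entries) != len(hive_columns):
        raise ValueError(
            f"expected {len(hive_columns)} entries, found {len(entries)}"
        )
    for entry in entries:
        if any(ch.isspace() for ch in entry):
            raise ValueError(f"whitespace in mapping entry: {entry!r}")


if __name__ == "__main__":
    columns = ["rowkey", "name", "age", "phone"]
    mapping = build_columns_mapping(columns, "info", "rowkey")
    check_columns_mapping(columns, mapping)
    print(mapping)
```

Building the string from the column list keeps the two in step when a column is added later, which is where mismatched entry counts usually creep in.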
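7. Data import and validation:
(1). Create the staging table data_student over the tab-separated input files (the column list is assumed to match hive_student)
CREATE EXTERNAL TABLE data_student (rowkey string, name string, age int, phone string) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' LOCATION '/test/hbase/tsv/input/';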
(2). Import the data into the hbase_student table through hive_student
SET hive.hbase.bulk=true;
INSERT OVERWRITE TABLE hive_student SELECT * FROM data_student;
Note: If you encounter a java.lang.IllegalArgumentException: Property value must not be null exception, Hive 0.13.0 or later is required.
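To try the import in step 7, the input directory needs at least one tab-separated file. The sketch below writes such a file with Python's csv module. The column order (rowkey, name, age, phone) is an assumption taken from the hive_student definition, the function name write_student_tsv is hypothetical, and the sample row is made-up placeholder data.

```python
import csv
import io


def write_student_tsv(rows, out):
    """Write (rowkey, name, age, phone) tuples as tab-separated lines to a text stream."""
    # QUOTE_NONE keeps fields unquoted, as a plain delimited-text table expects;
    # the writer raises csv.Error if a field itself contains a tab.
    writer = csv.writer(out, delimiter="\t", lineterminator="\n", quoting=csv.QUOTE_NONE)
    for rowkey, name, age, phone in rows:
        writer.writerow([rowkey, name, age, phone])


if __name__ == "__main__":
    buffer = io.StringIO()
    write_student_tsv([("row1", "alice", 20, "555-0100")], buffer)  # placeholder row
    print(buffer.getvalue(), end="")
```

To write a real file, open it with open(path, "w", newline="") and pass the file object as out, then upload the file to the input directory with your usual HDFS tooling.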
Data import (i): Hive on HBase