http://www.it165.net/admin/html/201406/3239.html
https://cwiki.apache.org/confluence/display/Hive/HBaseIntegration
Create the table hbase_hive_1:
CREATE TABLE hbase_hive_1(key int, value string) STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf1:val") TBLPROPERTIES ("hbase.table.name" = "xyz");
Create a partitioned table: hbase_hive_2
CREATE TABLE hbase_hive_2(key int, value string) PARTITIONED BY (day string) STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf1:val") TBLPROPERTIES ("hbase.table.name" = "xyz2");
Create a table pokes:
CREATE TABLE pokes (foo int, bar string) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
Contents of the data file hivedata:
101,zhanggsan
1001,lisi
102,wangwu
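One way to produce a file in this layout from a shell is sketched below. The DATA_FILE variable and its default of ./hivedata are assumptions made for illustration; the article itself loads the file from /out/hivedata, so point DATA_FILE there if that directory is writable.

```shell
# Destination is an assumption; the article loads the file from /out/hivedata.
DATA_FILE="${DATA_FILE:-./hivedata}"

# One record per line, two fields separated by a comma with no spaces,
# which is what "fields terminated by ','" in the pokes table expects.
# Note: the HBase scan later in the article shows a trailing tab (\x09)
# after "lisi", so the author's file probably had a stray tab on that
# line; this sketch leaves it out.
cat > "$DATA_FILE" <<'EOF'
101,zhanggsan
1001,lisi
102,wangwu
EOF

# Print the file so the layout can be checked before loading it into Hive.
cat "$DATA_FILE"
```

Keeping the fields free of surrounding whitespace matters because Hive stores everything between the delimiters as the column value.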
load data local inpath '/out/hivedata' overwrite into table pokes;
Use SQL to import the data into hbase_hive_1:
SET hive.hbase.bulk=true;
insert overwrite table hbase_hive_1 select * from pokes;
Import into the partitioned table:
insert overwrite table hbase_hive_2 partition (day='2014-07-29') select * from pokes;
Run:
insert overwrite table hbase_hive_1 select * from pokes;
The error is as follows:
Task with the most failures(4):
-----
Task ID: task_1406541025007_0008_m_000000
URL: http://master:8088/taskdetails.jsp?jobid=job_1406541025007_0008&tipid=task_1406541025007_0008_m_000000
-----
Diagnostic Messages for this Task:
Container [pid=9545,containerID=container_1406541025007_0008_01_000005] is running beyond virtual memory limits. Current usage: 263.0 MB of 1 GB physical memory used; 3.6 GB of 2.1 GB virtual memory used. Killing container.
Dump of the process-tree for container_1406541025007_0008_01_000005 :
|- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
|- 9545 2012 9545 9545 (java) 598 19 3909980160 67320 /home/hadoop/jdk1.7.0_51/bin/java -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN -Xmx3072m -Djava.io.tmpdir=/data/tmp/nm-local-dir/usercache/hadoop/appcache/application_1406541025007_0008/container_1406541025007_0008_01_000005/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.mapreduce.container.log.dir=/home/hadoop/hadoop-2.0.0-cdh4.5.0/logs/userlogs/application_1406541025007_0008/container_1406541025007_0008_01_000005 -Dyarn.app.mapreduce.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA org.apache.hadoop.mapred.YarnChild 192.168.1.153 52225 attempt_1406541025007_0008_m_000000_3 5
Solution:
Copy the hbase-site.xml file from hbase/conf to hadoop/conf on every Hadoop node.
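The copy step could be scripted roughly as follows. This is only a sketch: the push_hbase_site helper, the host names, and the directory paths are hypothetical and must be replaced with the real ones for the cluster, and it assumes passwordless scp between the nodes. By default it runs in dry-run mode and only prints the commands; set DRY_RUN=0 to actually copy.

```shell
# Hypothetical helper: copy one hbase-site.xml to the same Hadoop conf
# directory on each listed node. Usage:
#   push_hbase_site <path to hbase-site.xml> <hadoop conf dir> <node>...
push_hbase_site() {
  src="$1"
  dest_dir="$2"
  shift 2
  for node in "$@"; do
    if [ "${DRY_RUN:-1}" = "1" ]; then
      # Dry run: show what would be executed without touching any host.
      echo "scp $src $node:$dest_dir/"
    else
      scp "$src" "$node:$dest_dir/"
    fi
  done
}

# Example call with made-up paths and host names (prints three scp commands).
push_hbase_site /home/hadoop/hbase/conf/hbase-site.xml \
  /home/hadoop/hadoop/conf master slave1 slave2
```

Printing the commands first makes it easy to check the paths and host list before anything is overwritten on the nodes.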
After the fix, the insert succeeds:
hive> insert overwrite table hbase_hive_1 select * from pokes;
13:40:46,601 Stage-0 map = 0%, reduce = 0%
13:40:52,862 Stage-0 map = 100%, reduce = 0%, Cumulative CPU 1.92 sec
Job 0: Map: 1 Cumulative CPU: 1.92 sec HDFS Read: 244 HDFS Write: 0 SUCCESS
Total MapReduce CPU Time Spent: 1 seconds 920 msec
OK
Time taken: 12.742 seconds
hive> select * from hbase_hive_1;
OK
1001 lisi
101 zhanggsan
102 wangwu
Time taken: 0.135 seconds
Query HBase:
hbase(main):023:0> scan 'xyz'
ROW COLUMN+CELL
1001 column=cf1:val, timestamp=1406612452333, value=lisi\x09
101 column=cf1:val, timestamp=1406612452333, value=zhanggsan
102 column=cf1:val, timestamp=1406612452333, value=wangwu