Export data from Hive to MySQL



http://abloz.com

2012.7.20

Author: Zhou Haihan


The previous article, "Data interoperability between MySQL and HDFS using Sqoop", showed that Sqoop can move data between an RDBMS and HDFS, and that it supports importing from MySQL into HBase. Exporting from HBase back into MySQL, however, is not supported directly, only indirectly: either dump the HBase table to a flat file on HDFS and export that, or pull it into Hive and export from there. This article walks through exporting from Hive to MySQL.

First, create the MySQL table
mysql> CREATE TABLE award (rowkey varchar(255), productid int, matchid varchar(255), rank varchar(255), tourneyid varchar(255), userid bigint, gameid int, gold int, loginid varchar(255), nick varchar(255), plat varchar(255));
Query OK, 0 rows affected (0.01 sec)
Second, try a Hive external table mapped onto HBase, and export that to MySQL
hive> CREATE EXTERNAL TABLE hive_award (key string, productid int, matchid string, rank string, tourneyid string, userid bigint, gameid int, gold int, loginid string, nick string, plat string) STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,info:mpid,info:matchid,info:rank,info:tourneyid,info:userid,info:gameid,info:gold,info:loginid,info:nickname,info:platform") TBLPROPERTIES ("hbase.table.name" = "award");
hive> desc hive_award;
key        string    from deserializer
productid  int       from deserializer
matchid    string    from deserializer
rank       string    from deserializer
tourneyid  string    from deserializer
userid     bigint    from deserializer
gameid     int       from deserializer
gold       int       from deserializer
loginid    string    from deserializer
nick       string    from deserializer
plat       string    from deserializer

[zhouhh@hadoop46 ~]$ hadoop fs -ls /user/hive/warehouse/
Found 3 items
drwxr-xr-x   - zhouhh supergroup          0 2012-07-16 14:08 /user/hive/warehouse/hive_award
drwxr-xr-x   - zhouhh supergroup          0 2012-07-16 14:30 /user/hive/warehouse/nnnon
drwxr-xr-x   - zhouhh supergroup          0 2012-07-16 13:53 /user/hive/warehouse/test222
[zhouhh@hadoop46 ~]$ sqoop export --connect jdbc:mysql://hadoop48/toplists -m 1 --table award --export-dir /user/hive/warehouse/hive_award --input-fields-terminated-by '\0001'
12/07/19 16:13:06 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
12/07/19 16:13:06 INFO tool.CodeGenTool: Beginning code generation
12/07/19 16:13:06 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `award` AS t LIMIT 1
12/07/19 16:13:06 INFO orm.CompilationManager: HADOOP_HOME is /home/zhouhh/hadoop-1.0.0/libexec/..
Note: /tmp/sqoop-zhouhh/compile/4366149f0b6dd311c5b622594744fbb0/award.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
12/07/19 16:13:08 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-zhouhh/compile/4366149f0b6dd311c5b622594744fbb0/award.jar
12/07/19 16:13:08 INFO mapreduce.ExportJobBase: Beginning export of award
12/07/19 16:13:09 WARN mapreduce.ExportJobBase: Input path hdfs://hadoop46:9200/user/hive/warehouse/hive_award contains no files
12/07/19 16:13:11 INFO input.FileInputFormat: Total input paths to process : 0
12/07/19 16:13:11 INFO input.FileInputFormat: Total input paths to process : 0
12/07/19 16:13:13 INFO mapred.JobClient: Running job: job_201207191159_0059
12/07/19 16:13:14 INFO mapred.JobClient:  map 0% reduce 0%
12/07/19 16:13:26 INFO mapred.JobClient: Job complete: job_201207191159_0059
12/07/19 16:13:26 INFO mapred.JobClient: Counters: 4
12/07/19 16:13:26 INFO mapred.JobClient:   Job Counters
12/07/19 16:13:26 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=7993
12/07/19 16:13:26 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
12/07/19 16:13:26 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
12/07/19 16:13:26 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=0
12/07/19 16:13:26 INFO mapreduce.ExportJobBase: Transferred 0 bytes in 16.9678 seconds (0 bytes/sec)
12/07/19 16:13:26 INFO mapreduce.ExportJobBase: Exported 0 records.
Exporting straight from the external table did not work: "Input path hdfs://hadoop46:9200/user/hive/warehouse/hive_award contains no files". This makes sense, because an HBase-backed table keeps its rows inside HBase; nothing is ever written under the Hive warehouse directory, so Sqoop finds no files to read.
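A quick way to confirm this is to list the table's directory in the warehouse; for the HBase-backed table the directory exists but holds no data files (a minimal sanity check, not part of the original article):

[zhouhh@hadoop46 ~]$ hadoop fs -ls /user/hive/warehouse/hive_award
# nothing is listed under the directory: the rows live in HBase, so Sqoop has no files to read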
Third, create a managed Hive table linked to HBase, so that inserting in Hive changes the data in HBase:
hive> CREATE TABLE hive_award_data (key string, productid int, matchid string, rank string, tourneyid string, userid bigint, gameid int, gold int, loginid string, nick string, plat string) STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,info:mpid,info:matchid,info:rank,info:tourneyid,info:userid,info:gameid,info:gold,info:loginid,info:nickname,info:platform") TBLPROPERTIES ("hbase.table.name" = "award_test");

hive> INSERT OVERWRITE TABLE hive_award_data SELECT * FROM hive_award LIMIT 2;

hbase(main):014:0> scan 'award_test'
ROW                             COLUMN+CELL
 2012-04-27 06:55:00:402713629  column=info:mpid, timestamp=1342754799918, value=5947
 2012-04-27 06:55:00:402713629  column=info:matchid, timestamp=1342754799918, value=433203828
 2012-04-27 06:55:00:402713629  column=info:rank, timestamp=1342754799918, value=2
 2012-04-27 06:55:00:402713629  column=info:tourneyid, timestamp=1342754799918, value=4027102
 2012-04-27 06:55:00:402713629  column=info:userid, timestamp=1342754799918, value=402713629
 2012-04-27 06:55:00:402713629  column=info:gameid, timestamp=1342754799918, value=1001
 2012-04-27 06:55:00:402713629  column=info:loginid, timestamp=1342754799918, value=715878221
 2012-04-27 06:55:00:402713629  column=info:nickname, timestamp=1342754799918, value=xxx
 2012-04-27 06:55:00:402713629  column=info:platform, timestamp=1342754799918, value=ios
 2012-04-27 06:55:00:402713629  column=info:userid, timestamp=1342754445451, value=402713629
 2012-04-27 06:55:00:406788559  column=info:mpid, timestamp=1342754799918, value=778
 2012-04-27 06:55:00:406788559  column=info:matchid, timestamp=1342754799918, value=433203930
 2012-04-27 06:55:00:406788559  column=info:rank, timestamp=1342754799918, value=19
 2012-04-27 06:55:00:406788559  column=info:tourneyid, timestamp=1342754799918, value=4017780
 2012-04-27 06:55:00:406788559  column=info:userid, timestamp=1342754799918, value=406788559
 2012-04-27 06:55:00:406788559  column=info:gameid, timestamp=1342754799918, value=1001
 2012-04-27 06:55:00:406788559  column=info:gold, timestamp=1342754799918, value=1
 2012-04-27 06:55:00:406788559  column=info:loginid, timestamp=1342754799918, value=13835155880
 2012-04-27 06:55:00:406788559  column=info:nickname, timestamp=1342754799918, value=xxx
 2012-04-27 06:55:00:406788559  column=info:platform, timestamp=1342754799918, value=android
2 row(s) in 0.0280 seconds

[zhouhh@hadoop46 ~]$ sqoop export --connect jdbc:mysql://hadoop48/toplists -m 1 --table award --export-dir /user/hive/warehouse/hive_award_data --input-fields-terminated-by '\0001'
12/07/20 11:32:01 WARN mapreduce.ExportJobBase: Input path hdfs://hadoop46:9200/user/hive/warehouse/hive_award_data contains no files

So whether it is external or managed, a Hive table that merely connects to HBase cannot be exported this way.
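If it is unclear whether a given Hive table is HBase-backed, the storage handler shows up in its metadata. A minimal check (the exact layout of the output varies by Hive version):

hive> DESCRIBE EXTENDED hive_award;
-- the Detailed Table Information at the end of the output includes
-- storage_handler=org.apache.hadoop.hive.hbase.HBaseStorageHandler for an HBase-backed table;
-- a plain warehouse table carries no storage handler entry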

Fourth, create a native Hive table and load it with data from the HBase external table
hive> CREATE TABLE hive_myaward (key string, productid int, matchid string, rank string, tourneyid string, userid bigint, gameid int, gold int, loginid string, nick string, plat string);

hive> INSERT OVERWRITE TABLE hive_myaward SELECT * FROM hive_award LIMIT 2;
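Unlike the HBase-backed tables, this native table stores its rows as ordinary files under the warehouse directory, which is exactly what Sqoop needs. A check one could run before exporting (hypothetical; the 000000_0 file name is the one that appears in the cat command further below):

[zhouhh@hadoop46 ~]$ hadoop fs -ls /user/hive/warehouse/hive_myaward
# now lists a real data file such as /user/hive/warehouse/hive_myaward/000000_0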
hive> SELECT * FROM hive_myaward;
OK
2012-04-27 06:55:00:402713629  5947  433203828  2   4027102  402713629  1001  NULL  715878221    Kill Day a          ios
2012-04-27 06:55:00:406788559  778   433203930  19  4017780  406788559  1001  1     13835155880  Pro New New Dandan  android
Time taken: 2.257 seconds

[zhouhh@hadoop46 ~]$ sqoop export --connect jdbc:mysql://hadoop48/toplists -m 1 --table award --export-dir /user/hive/warehouse/hive_myaward --input-fields-terminated-by '\0001'
java.io.IOException: com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Access denied for user ''@'hadoop48' to database 'toplists'

This is a permission problem, so grant access on the MySQL side:

mysql> GRANT ALL PRIVILEGES ON *.* TO ''@'hadoop48';
Query OK, 0 rows affected (0.03 sec)

mysql> GRANT ALL PRIVILEGES ON *.* TO ''@'localhost';
Query OK, 0 rows affected (0.00 sec)
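The blanket grant to the anonymous user gets past the error, but it opens every database to every connection from those hosts. A narrower alternative, sketched with a hypothetical dedicated account (not from the original article):

mysql> GRANT ALL PRIVILEGES ON toplists.* TO 'sqoop'@'hadoop48' IDENTIFIED BY 'secret';
mysql> SHOW GRANTS FOR 'sqoop'@'hadoop48';

Sqoop would then connect with --username sqoop --password secret added to the export command.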
Fifth, troubleshoot the NULL (\N) values encountered with Hive:
[zhouhh@hadoop46 ~]$ sqoop export --connect jdbc:mysql://hadoop48/toplists -m 1 --table award --export-dir /user/hive/warehouse/hive_myaward --input-fields-terminated-by '\0001'
...
12/07/20 11:49:25 INFO mapred.JobClient:  map 0% reduce 0%
12/07/20 11:49:37 INFO mapred.JobClient: Task Id : attempt_201207191159_0227_m_000000_0, Status : FAILED
java.lang.NumberFormatException: For input string: "\N"
        at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)

Where does that "\N" come from? Look at the file Hive actually wrote:

[zhouhh@hadoop46 ~]$ hadoop fs -cat /user/hive/warehouse/hive_myaward/000000_0
2012-04-27 06:55:00:4027136295947433203828240271024027136291001\N715878221Kill Day aios
2012-04-27 06:55:00:4067885597784332039301940177804067885591001113835155880Pro New New Dandanandroid
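The fields look glued together only because the \01 delimiters are non-printing characters. Piping the file through cat -v renders them visibly as ^A (a quick inspection trick, not part of the original article):

[zhouhh@hadoop46 ~]$ hadoop fs -cat /user/hive/warehouse/hive_myaward/000000_0 | cat -v
# every field boundary now shows as ^A, and the empty gold column shows the literal marker \N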
hive> SELECT * FROM hive_myaward;
OK
2012-04-27 06:55:00:402713629  5947  433203828  2   4027102  402713629  1001  NULL  715878221    Kill Day a          ios
2012-04-27 06:55:00:406788559  778   433203930  19  4017780  406788559  1001  1     13835155880  Pro New New Dandan  android
Time taken: 2.257 seconds

Hive represents NULL as \N, separates fields with \01, and ends lines with \n, so the Sqoop command needs matching options. Note the backslash escaping:
See: https://issues.cloudera.org/browse/SQOOP-188
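The quadruple backslash looks odd but is needed because the value is unescaped twice, once by the shell and once by Sqoop's option parser. A minimal illustration of what each layer sees (assuming a bash shell):

# the shell collapses "\\\\N" to \\N before Sqoop ever sees it:
$ echo "\\\\N"
\\N
# Sqoop unescapes \\N once more, leaving \N, the exact marker Hive writes for NULL.
# Likewise "\\01" reaches Sqoop as \01, which it interprets as the byte 0x01.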

[zhouhh@hadoop46 ~]$ sqoop export --connect jdbc:mysql://hadoop48/toplists -m 1 --table award --export-dir /user/hive/warehouse/hive_myaward/000000_0 --input-null-string "\\\\N" --input-null-non-string "\\\\N" --input-fields-terminated-by "\\01" --input-lines-terminated-by "\\n"
12/07/20 12:53:56 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
12/07/20 12:53:56 INFO tool.CodeGenTool: Beginning code generation
12/07/20 12:53:56 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `award` AS t LIMIT 1
12/07/20 12:53:56 INFO orm.CompilationManager: HADOOP_HOME is /home/zhouhh/hadoop-1.0.0/libexec/..
Note: /tmp/sqoop-zhouhh/compile/4427d3db678bb145c995073e0924dc0b/award.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
12/07/20 12:53:57 ERROR orm.CompilationManager: Could not rename /tmp/sqoop-zhouhh/compile/4427d3db678bb145c995073e0924dc0b/award.java to /home/zhouhh/./award.java
12/07/20 12:53:57 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-zhouhh/compile/4427d3db678bb145c995073e0924dc0b/award.jar
12/07/20 12:53:57 INFO mapreduce.ExportJobBase: Beginning export of award
12/07/20 12:53:58 INFO input.FileInputFormat: Total input paths to process : 1
12/07/20 12:53:58 INFO input.FileInputFormat: Total input paths to process : 1
12/07/20 12:53:58 INFO mapred.JobClient: Running job: job_201207191159_0232
12/07/20 12:53:59 INFO mapred.JobClient:  map 0% reduce 0%
12/07/20 12:54:12 INFO mapred.JobClient:  map 100% reduce 0%
12/07/20 12:54:17 INFO mapred.JobClient: Job complete: job_201207191159_0232
12/07/20 12:54:17 INFO mapred.JobClient: Counters: 18
12/07/20 12:54:17 INFO mapred.JobClient:   Job Counters
12/07/20 12:54:17 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=12114
12/07/20 12:54:17 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
12/07/20 12:54:17 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
12/07/20 12:54:17 INFO mapred.JobClient:     Rack-local map tasks=1
12/07/20 12:54:17 INFO mapred.JobClient:     Launched map tasks=1
12/07/20 12:54:17 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=0
12/07/20 12:54:17 INFO mapred.JobClient:   File Output Format Counters
12/07/20 12:54:17 INFO mapred.JobClient:     Bytes Written=0
12/07/20 12:54:17 INFO mapred.JobClient:   FileSystemCounters
12/07/20 12:54:17 INFO mapred.JobClient:     HDFS_BYTES_READ=335
12/07/20 12:54:17 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=30172
12/07/20 12:54:17 INFO mapred.JobClient:   File Input Format Counters
12/07/20 12:54:17 INFO mapred.JobClient:     Bytes Read=0
12/07/20 12:54:17 INFO mapred.JobClient:   Map-Reduce Framework
12/07/20 12:54:17 INFO mapred.JobClient:     Map input records=2
12/07/20 12:54:17 INFO mapred.JobClient:     Physical memory (bytes) snapshot=78696448
12/07/20 12:54:17 INFO mapred.JobClient:     Spilled Records=0
12/07/20 12:54:17 INFO mapred.JobClient:     CPU time spent (ms)=390
12/07/20 12:54:17 INFO mapred.JobClient:     Total committed heap usage (bytes)=56623104
12/07/20 12:54:17 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=891781120
12/07/20 12:54:17 INFO mapred.JobClient:     Map output records=2
12/07/20 12:54:17 INFO mapred.JobClient:     SPLIT_RAW_BYTES=123
12/07/20 12:54:17 INFO mapreduce.ExportJobBase: Transferred 335 bytes in 19.6631 seconds (17.037 bytes/sec)
12/07/20 12:54:17 INFO mapreduce.ExportJobBase: Exported 2 records.

The export to MySQL succeeded:

mysql> use toplists;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
Database changed
mysql> SELECT * FROM award;
+-------------------------------+-----------+-----------+------+-----------+-----------+--------+------+-------------+-------+---------+
| rowkey                        | productid | matchid   | rank | tourneyid | userid    | gameid | gold | loginid     | nick  | plat    |
+-------------------------------+-----------+-----------+------+-----------+-----------+--------+------+-------------+-------+---------+
| 2012-04-27 06:55:00:402713629 | 5947      | 433203828 | 2    | 4027102   | 402713629 | 1001   | NULL | 715878221   | ??? a | ios     |
| 2012-04-27 06:55:00:406788559 | 778       | 433203930 | 19   | 4017780   | 406788559 | 1001   | 1    | 13835155880 | ????? | android |
+-------------------------------+-----------+-----------+------+-----------+-----------+--------+------+-------------+-------+---------+
2 rows in set (0.00 sec)

The data made it into MySQL, but the Chinese nick values were garbled on import.
In the "Hive export to MySQL in the Chinese garbled problem" This article continues to resolve.

Excerpt from: http://abloz.com/2012/07/20/export-data-to-mysql-from-the-hive.html
