Original link: http://blog.csdn.net/freewebsys/article/details/47722393. Please do not reprint without the author's permission.
1. About Sqoop
Sqoop is a tool for transferring data between Hadoop and relational databases. It can import data from a relational database such as MySQL, Oracle, or Postgres into Hadoop's HDFS, and can also export HDFS data back into a relational database.
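Under the hood, Sqoop parallelizes an import by taking the min and max of a numeric split column (by default the primary key) and dividing that range among the mappers, each of which runs its own range query. A rough sketch of the idea (illustrative only, not Sqoop's actual code):

```python
def split_ranges(min_id, max_id, num_mappers):
    """Divide [min_id, max_id] into contiguous ranges, one per mapper,
    roughly the way Sqoop splits a numeric --split-by column."""
    total = max_id - min_id + 1
    size = total / num_mappers  # each mapper gets about total/num_mappers rows
    bounds = [min_id + round(i * size) for i in range(num_mappers)] + [max_id + 1]
    return [(bounds[i], bounds[i + 1] - 1) for i in range(num_mappers)]

# Each mapper then runs: SELECT ... WHERE id >= lo AND id <= hi
print(split_ranges(1, 100, 4))  # [(1, 25), (26, 50), (51, 75), (76, 100)]
```

This is why an import with several mappers does not preserve row order: each mapper writes its own output file.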
Official website: http://sqoop.apache.org/
There are two release lines: 1.4.6, and 1.99 (a development version that is not yet complete and is not recommended for production).
Documentation:
http://sqoop.apache.org/docs/1.4.6/
2. Installation
As in the earlier articles, Hadoop and Hive are on 2.0, so the Sqoop build for Hadoop 2.0 (an alpha build) is used here as well. Installation is simply unpacking the archive.
Configure the environment variables:
export JAVA_HOME=/usr/java/default
export CLASS_PATH=$JAVA_HOME/lib
export PATH=$JAVA_HOME/bin:$PATH
export HADOOP_HOME=/data/hadoop
export PATH=$HADOOP_HOME/bin:$PATH
export HIVE_HOME=/data/apache-hive
export PATH=$HIVE_HOME/bin:$PATH
export SQOOP_HOME=/data/sqoop
export PATH=$SQOOP_HOME/bin:$PATH
On startup, Sqoop checks the HBase (and related) environment variables. If they are not needed, simply comment the checks out: edit /data/sqoop/bin/configure-sqoop, lines 128 to 147.
## Moved to be a runtime check in sqoop.
#if [ ! -d "${HBASE_HOME}" ]; then
#  echo "Warning: $HBASE_HOME does not exist! HBase imports will fail."
#  echo 'Please set $HBASE_HOME to the root of your HBase installation.'
#fi

## Moved to be a runtime check in sqoop.
#if [ ! -d "${HCAT_HOME}" ]; then
#  echo "Warning: $HCAT_HOME does not exist! HCatalog jobs will fail."
#  echo 'Please set $HCAT_HOME to the root of your HCatalog installation.'
#fi

#if [ ! -d "${ACCUMULO_HOME}" ]; then
#  echo "Warning: $ACCUMULO_HOME does not exist! Accumulo imports will fail."
#  echo 'Please set $ACCUMULO_HOME to the root of your Accumulo installation.'
#fi
#if [ ! -d "${ZOOKEEPER_HOME}" ]; then
#  echo "Warning: $ZOOKEEPER_HOME does not exist! Accumulo imports will fail."
#  echo 'Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation.'
#fi
Pre-written blogs for Hadoop and hive references:
http://blog.csdn.net/freewebsys/article/details/47617975
Sqoop commands fall mainly into two groups: importing data into Hadoop, and exporting data from Hadoop back to MySQL.
First create a MySQL database named blog and a sqoop user to operate on it; in the blog database, create an msg table and insert 6 records:
CREATE DATABASE blog DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci;
GRANT ALL PRIVILEGES ON blog.* TO sqoop@'%' IDENTIFIED BY 'sqoop';
FLUSH PRIVILEGES;

## Create the msg and msg_hive tables:
CREATE TABLE `msg` (
  `id` bigint(20) NOT NULL,
  `gid` bigint(20) DEFAULT NULL,
  `content` varchar(4000),
  `create_time` datetime DEFAULT NULL COMMENT 'create time',
  PRIMARY KEY (`id`,`gid`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 PARTITION BY KEY (`gid`);

CREATE TABLE `msg_hive` (
  `id` bigint(20) NOT NULL,
  `gid` bigint(20) DEFAULT NULL,
  `content` varchar(4000),
  `create_time` datetime DEFAULT NULL COMMENT 'create time',
  PRIMARY KEY (`id`,`gid`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 PARTITION BY KEY (`gid`);

## Insert test data:
INSERT INTO `msg` (id,gid,content,create_time) VALUES (1,1,'Zhang San 11',now());
INSERT INTO `msg` (id,gid,content,create_time) VALUES (1,2,'Zhang San 22',now());
INSERT INTO `msg` (id,gid,content,create_time) VALUES (1,3,'Zhang San 33',now());
INSERT INTO `msg` (id,gid,content,create_time) VALUES (2,1,'li si 11',now());
INSERT INTO `msg` (id,gid,content,create_time) VALUES (2,2,'li si 22',now());
INSERT INTO `msg` (id,gid,content,create_time) VALUES (2,3,'li si 33',now());
3. Using Sqoop: import and export
First test the database connection by executing a SELECT:
sqoop eval --connect jdbc:mysql://127.0.0.1:3306/blog --username sqoop --password sqoop \
  --query 'select now()'
## Result:
-----------------------
| now()                 |
-----------------------
| 2015-08-18 17:22:26.0 |
-----------------------
Importing MySQL data into Hive is really just importing it into HDFS; the output directory must be set to Hive's warehouse directory:
sqoop import --direct --connect jdbc:mysql://127.0.0.1:3306/blog --username sqoop --password sqoop \
  --table msg \
  --fields-terminated-by "\001" --lines-terminated-by "\n" \
  --delete-target-dir \
  --null-string '\\N' --null-non-string '\\N' \
  --target-dir /user/hive/warehouse/msg
There are quite a few parameters: the field and line delimiters, the NULL representation, and finally the Hive warehouse directory as the target.
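The delimiter and NULL settings matter because the files Sqoop writes into the warehouse directory are plain text that Hive reads back field by field. A minimal sketch of how such a record parses, assuming the \001 field separator and Hive's default \N null token (illustrative only, not part of Sqoop):

```python
FIELD_SEP = "\x01"   # matches --fields-terminated-by "\001"
NULL_TOKEN = r"\N"   # Hive's default text representation of NULL

def parse_record(line):
    """Split one exported line into fields, mapping \\N back to None."""
    return [None if f == NULL_TOKEN else f
            for f in line.rstrip("\n").split(FIELD_SEP)]

row = parse_record("1\x012\x01Zhang San 22\x01\\N\n")
print(row)  # ['1', '2', 'Zhang San 22', None]
```

If the delimiters or NULL tokens on the import and the Hive table definition disagree, Hive will read every row as a single garbage column.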
However, Hive cannot see the data yet; the table must first be created in Hive:
sqoop create-hive-table --hive-table msg \
  --connect jdbc:mysql://127.0.0.1:3306/blog --username sqoop --password sqoop \
  --table msg
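create-hive-table derives the Hive column types from the MySQL schema. A simplified sketch of that mapping (illustrative; Sqoop's real mapping covers many more JDBC types):

```python
# Simplified MySQL -> Hive type mapping (illustrative only).
MYSQL_TO_HIVE = {
    "bigint": "BIGINT",
    "int": "INT",
    "varchar": "STRING",
    "datetime": "STRING",  # mapped to STRING in Hive text tables of this era
}

def hive_column(name, mysql_type):
    """Build one Hive column definition from a MySQL column type."""
    base = mysql_type.split("(")[0].lower()  # strip length: bigint(20) -> bigint
    return f"{name} {MYSQL_TO_HIVE[base]}"

cols = [hive_column("id", "bigint(20)"), hive_column("content", "varchar(4000)")]
print(", ".join(cols))  # id BIGINT, content STRING
```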
Export the Hive data into MySQL:
sqoop export --direct --connect jdbc:mysql://127.0.0.1:3306/blog --username sqoop --password sqoop \
  --table msg_hive \
  --fields-terminated-by "\001" --lines-terminated-by "\n" \
  --export-dir /user/hive/warehouse/msg
Here --export-dir must likewise be specified, along with the target MySQL table msg_hive.
Check the results (query the data in Hive and in MySQL respectively):
hive> select * from msg;
OK
1	1	Zhang San 11	2015-08-17 12:11:32
1	2	Zhang San 22	2015-08-17 12:11:33
1	3	Zhang San 33	2015-08-17 12:11:33
2	1	li si 11	2015-08-17 12:11:33
2	2	li si 22	2015-08-17 12:11:33
2	3	li si 33	2015-08-17 12:11:33
Time taken: 0.105 seconds, Fetched: 6 row(s)

mysql> select * from msg_hive;
+----+-----+--------------+---------------------+
| id | gid | content      | create_time         |
+----+-----+--------------+---------------------+
|  2 |   1 | li si 11     | 2015-08-17 12:11:33 |
|  2 |   2 | li si 22     | 2015-08-17 12:11:33 |
|  2 |   3 | li si 33     | 2015-08-17 12:11:33 |
|  1 |   3 | Zhang San 33 | 2015-08-17 12:11:33 |
|  1 |   1 | Zhang San 11 | 2015-08-17 12:11:32 |
|  1 |   2 | Zhang San 22 | 2015-08-17 12:11:33 |
+----+-----+--------------+---------------------+
6 rows in set (0.00 sec)
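Note that MySQL returns the rows in a different order than Hive, because the export runs with several parallel mappers. When verifying a round trip, compare the result sets rather than the row order. A small sketch with hypothetical row tuples:

```python
# Hypothetical result rows from Hive and MySQL (same data, different order).
hive_rows = [(1, 1, "Zhang San 11"), (1, 2, "Zhang San 22"), (2, 1, "li si 11")]
mysql_rows = [(2, 1, "li si 11"), (1, 1, "Zhang San 11"), (1, 2, "Zhang San 22")]

# Compare as sets: rows are unique here thanks to the (id, gid) primary key.
assert set(hive_rows) == set(mysql_rows)
print("round trip OK:", len(hive_rows), "rows")  # round trip OK: 3 rows
```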
4. Summary
Sqoop's import and export features make it easy to migrate data between MySQL and Hive.
Business data can be moved into Hadoop for computation, and the results placed back into a MySQL database for reporting and display.
Data flows conveniently in both directions.
Reference:
http://segmentfault.com/a/1190000002532293
The article above documents the parameters in great detail.
Hadoop (2): Install & Use Sqoop