Hadoop (2): Install & Use Sqoop

Source: Internet
Author: User
Tags: sqoop, accumulo

Original link: http://blog.csdn.net/freewebsys/article/details/47722393. Please do not reprint without the author's permission.

1. About Sqoop

Sqoop is a tool for transferring data between Hadoop and relational databases: it can import data from a relational database such as MySQL, Oracle, or Postgres into Hadoop's HDFS, and it can also export HDFS data back into a relational database.

Official website: http://sqoop.apache.org/
There are two release lines: 1.4.6, and 1.99 (a development line that is not yet complete and is not recommended for production).

Document:
http://sqoop.apache.org/docs/1.4.6/
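Once installed (see the next section), you can confirm which release is on your PATH with Sqoop's built-in tools; a minimal check:

    sqoop version   # prints the Sqoop release, e.g. "Sqoop 1.4.6"
    sqoop help      # lists the available tools: import, export, eval, ...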

2, installation

The Hadoop and Hive installs from the previous article are the 2.x line, so the Sqoop used here is the 1.4.6 binary built against Hadoop 2.0 (the hadoop-2.0.4-alpha package). Just unpack the tarball.

Configure the environment variables:

    export JAVA_HOME=/usr/java/default
    export CLASS_PATH=$JAVA_HOME/lib
    export PATH=$JAVA_HOME/bin:$PATH
    export HADOOP_HOME=/data/hadoop
    export PATH=$HADOOP_HOME/bin:$PATH
    export HIVE_HOME=/data/apache-hive
    export PATH=$HIVE_HOME/bin:$PATH
    export SQOOP_HOME=/data/sqoop
    export PATH=$SQOOP_HOME/bin:$PATH
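A quick sanity check that the variables took effect (a trivial sketch, not in the original post; source whichever profile file holds the exports):

    source ~/.bashrc    # or wherever you placed the exports above
    echo $SQOOP_HOME    # should print /data/sqoop
    which sqoop         # should print /data/sqoop/bin/sqoop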

On startup, Sqoop checks for the HBase (and HCatalog, Accumulo, ZooKeeper) environment variables. If you do not need them, comment the checks out directly:
lines 128 to 147 of /data/sqoop/bin/configure-sqoop.

    ## Moved to be a runtime check in sqoop.
    #if [ ! -d "${HBASE_HOME}" ]; then
    #  echo "Warning: $HBASE_HOME does not exist! HBase imports will fail."
    #  echo 'Please set $HBASE_HOME to the root of your HBase installation.'
    #fi

    ## Moved to be a runtime check in sqoop.
    #if [ ! -d "${HCAT_HOME}" ]; then
    #  echo "Warning: $HCAT_HOME does not exist! HCatalog jobs will fail."
    #  echo 'Please set $HCAT_HOME to the root of your HCatalog installation.'
    #fi

    #if [ ! -d "${ACCUMULO_HOME}" ]; then
    #  echo "Warning: $ACCUMULO_HOME does not exist! Accumulo imports will fail."
    #  echo 'Please set $ACCUMULO_HOME to the root of your Accumulo installation.'
    #fi
    #if [ ! -d "${ZOOKEEPER_HOME}" ]; then
    #  echo "Warning: $ZOOKEEPER_HOME does not exist! Accumulo imports will fail."
    #  echo 'Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation.'
    #fi
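If you prefer not to edit the file by hand, a one-liner can comment out the block; a sketch assuming the checks really do span lines 128-147 in your copy (verify first, since the range can vary by release):

    # Prefix lines 128-147 with '#' in place (GNU sed).
    sed -i '128,147 s/^/#/' /data/sqoop/bin/configure-sqoop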

Hadoop and Hive setup was covered in an earlier post:
http://blog.csdn.net/freewebsys/article/details/47617975

Sqoop usage falls into two main parts: importing data into Hadoop, and exporting data from Hadoop back to MySQL.

First create a MySQL database named blog and a sqoop user with privileges on it, then create the msg and msg_hive tables in the blog database and insert 6 test records:

    CREATE DATABASE blog DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci;
    GRANT ALL PRIVILEGES ON blog.* TO 'sqoop'@'%' IDENTIFIED BY 'sqoop';
    FLUSH PRIVILEGES;

    -- Create the msg and msg_hive tables:
    CREATE TABLE `msg_hive` (
      `id` bigint(20) NOT NULL,
      `gid` bigint(20) DEFAULT NULL,
      `content` varchar(4000),
      `create_time` datetime DEFAULT NULL COMMENT 'create time',
      PRIMARY KEY (`id`, `gid`)
    ) ENGINE=MyISAM DEFAULT CHARSET=utf8
    PARTITION BY KEY (`gid`);

    CREATE TABLE `msg` (
      `id` bigint(20) NOT NULL,
      `gid` bigint(20) DEFAULT NULL,
      `content` varchar(4000),
      `create_time` datetime DEFAULT NULL COMMENT 'create time',
      PRIMARY KEY (`id`, `gid`)
    ) ENGINE=MyISAM DEFAULT CHARSET=utf8
    PARTITION BY KEY (`gid`);

    -- Insert test data.
    INSERT INTO `msg` (id, gid, content, create_time) VALUES (1, 1, 'Zhang San 11', now());
    INSERT INTO `msg` (id, gid, content, create_time) VALUES (1, 2, 'Zhang San 22', now());
    INSERT INTO `msg` (id, gid, content, create_time) VALUES (1, 3, 'Zhang San 33', now());
    INSERT INTO `msg` (id, gid, content, create_time) VALUES (2, 1, 'Li Si 11', now());
    INSERT INTO `msg` (id, gid, content, create_time) VALUES (2, 2, 'Li Si 22', now());
    INSERT INTO `msg` (id, gid, content, create_time) VALUES (2, 3, 'Li Si 33', now());
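To confirm the seed data and the sqoop account both work, a quick check from the shell (a sketch, assuming the MySQL client is installed locally):

    mysql -h 127.0.0.1 -u sqoop -psqoop blog -e 'SELECT COUNT(*) FROM msg;'
    # Expected: a count of 6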
3. Using Sqoop: Import and Export

First, test the database connection by executing a SELECT:

    sqoop eval --connect jdbc:mysql://127.0.0.1:3306/blog --username sqoop --password sqoop --query 'select now()'

    ## Result:
    -----------------------
    | now()                 |
    -----------------------
    | 2015-08-18 17:22:26.0 |
    -----------------------
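Another quick connectivity check (my addition, not in the original post) is to list the tables Sqoop can see:

    sqoop list-tables --connect jdbc:mysql://127.0.0.1:3306/blog --username sqoop --password sqoop
    # Expected output: msg, msg_hive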

Importing MySQL data into Hive is really just an import into Hadoop; you point the output directory at Hive's warehouse directory:

    sqoop import --direct --connect jdbc:mysql://127.0.0.1:3306/blog --username sqoop --password sqoop --table msg \
      --fields-terminated-by "\001" --lines-terminated-by "\n" --delete-target-dir \
      --null-string '\\N' --null-non-string '\\N' \
      --target-dir /user/hive/warehouse/msg

That is a fair pile of parameters: they set the field and line delimiters and the NULL representation, and finally specify the target directory, which is Hive's warehouse directory.
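You can verify the import landed where expected by listing the target directory (a sketch; the part-file names can vary by run and number of mappers):

    hadoop fs -ls /user/hive/warehouse/msg
    # Typically shows files like part-m-00000; peek at one:
    hadoop fs -cat /user/hive/warehouse/msg/part-m-00000 | head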

However, Hive does not yet know about this table; it must be created in Hive:

    sqoop create-hive-table --hive-table msg --connect jdbc:mysql://127.0.0.1:3306/blog --username sqoop --password sqoop --table msg
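As an aside, Sqoop can also do the import and the Hive table creation in one step with --hive-import; a hedged alternative to the two-step flow above (not what the original post used):

    sqoop import --connect jdbc:mysql://127.0.0.1:3306/blog --username sqoop --password sqoop \
      --table msg --hive-import --hive-table msg
    # --hive-import creates the Hive table if needed and loads the data into it.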

Next, export the Hive data into MySQL:

    sqoop export --direct --connect jdbc:mysql://127.0.0.1:3306/blog --username sqoop --password sqoop --table msg_hive \
      --fields-terminated-by "\001" --lines-terminated-by "\n" \
      --export-dir /user/hive/warehouse/msg

Likewise you must set --export-dir, and point --table at the MySQL table msg_hive.
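One caveat worth noting (my addition, not from the original post): re-running this export issues plain INSERTs and will fail with duplicate-key errors on the (id, gid) primary key. Sqoop's update mode is one way around that; a sketch using the plain JDBC path (--direct dropped):

    sqoop export --connect jdbc:mysql://127.0.0.1:3306/blog --username sqoop --password sqoop --table msg_hive \
      --fields-terminated-by "\001" --lines-terminated-by "\n" \
      --update-key id,gid \
      --export-dir /user/hive/warehouse/msg
    # --update-key makes Sqoop generate UPDATE statements keyed on (id, gid)
    # instead of INSERTs, so existing rows are overwritten rather than duplicated.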

Check the results (view the data in Hive and MySQL, respectively):

    hive> SELECT * FROM msg;
    OK
    1    1    Zhang San 11    2015-08-17 12:11:32
    1    2    Zhang San 22    2015-08-17 12:11:33
    1    3    Zhang San 33    2015-08-17 12:11:33
    2    1    Li Si 11        2015-08-17 12:11:33
    2    2    Li Si 22        2015-08-17 12:11:33
    2    3    Li Si 33        2015-08-17 12:11:33
    Time taken: 0.105 seconds, Fetched: 6 row(s)

    mysql> SELECT * FROM msg_hive;
    +----+-----+--------------+---------------------+
    | id | gid | content      | create_time         |
    +----+-----+--------------+---------------------+
    |  2 |   1 | Li Si 11     | 2015-08-17 12:11:33 |
    |  2 |   2 | Li Si 22     | 2015-08-17 12:11:33 |
    |  2 |   3 | Li Si 33     | 2015-08-17 12:11:33 |
    |  1 |   3 | Zhang San 33 | 2015-08-17 12:11:33 |
    |  1 |   1 | Zhang San 11 | 2015-08-17 12:11:32 |
    |  1 |   2 | Zhang San 22 | 2015-08-17 12:11:33 |
    +----+-----+--------------+---------------------+
    6 rows in set (0.00 sec)
4. Summary


Sqoop's import and export features make it easy to move data between MySQL and Hive. A typical pattern: business data is migrated into Hadoop for computation, and the results are written back to a MySQL database for statistics and display. Data flows easily in both directions.
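Putting the whole post together, a minimal round-trip sketch of that pattern (the msg_stats table and the aggregation query are hypothetical, added here purely for illustration):

    #!/bin/bash
    # 1. Import the business table from MySQL into Hive's warehouse directory.
    sqoop import --connect jdbc:mysql://127.0.0.1:3306/blog \
      --username sqoop --password sqoop --table msg \
      --fields-terminated-by "\001" --delete-target-dir \
      --target-dir /user/hive/warehouse/msg

    # 2. Compute in Hive (hypothetical aggregation into a msg_stats table).
    hive -e "INSERT OVERWRITE TABLE msg_stats SELECT gid, COUNT(*) FROM msg GROUP BY gid;"

    # 3. Export the results back to MySQL for statistical display.
    sqoop export --connect jdbc:mysql://127.0.0.1:3306/blog \
      --username sqoop --password sqoop --table msg_stats \
      --fields-terminated-by "\001" \
      --export-dir /user/hive/warehouse/msg_stats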

Reference:
http://segmentfault.com/a/1190000002532293 (a very detailed write-up of the parameters)

