1. Sqoop Installation
1.1 Integrate with Hadoop and Hive by modifying the /opt/cdh/sqoop-1.4.5-cdh5.3.6/conf/sqoop-env.sh file
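A minimal sqoop-env.sh sketch (the variable names come from the sqoop-env template; the CDH install paths below are assumptions for this layout):
export HADOOP_COMMON_HOME=/opt/cdh/hadoop-2.5.0-cdh5.3.6
export HADOOP_MAPRED_HOME=/opt/cdh/hadoop-2.5.0-cdh5.3.6
export HIVE_HOME=/opt/cdh/hive-0.13.1-cdh5.3.6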
1.2 Verify that the installation succeeded: bin/sqoop version shows the Sqoop version
2. Sqoop Basic Operation
2.1 View the available Sqoop commands: bin/sqoop help
2.2 View the usage of a specific command: bin/sqoop help <command>
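For example, to list every option of the import tool:
bin/sqoop help import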
2.3 View the list of available databases
bin/sqoop list-databases \
--connect jdbc:mysql://life-hadoop.life.com:3306/ \
--username root \
--password root
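Similarly, list-tables shows the tables of one database (a sketch reusing the sqoop database from the examples below):
bin/sqoop list-tables \
--connect jdbc:mysql://life-hadoop.life.com:3306/sqoop \
--username root \
--password root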
2.4 Importing data from a MySQL table into HDFS
bin/sqoop import \
--connect jdbc:mysql://life-hadoop.life.com:3306/sqoop \
--username root \
--password root \
--table my_user \
--target-dir /user/yanglin/sqoop/import/my_user \
--num-mappers 1
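To check the result, you can read the generated file back with the HDFS CLI (run from the Hadoop home; part-m-00000 is the usual output name of a single-mapper, map-only job):
bin/hdfs dfs -text /user/yanglin/sqoop/import/my_user/part-m-00000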
Note: if you do not specify an import directory with --target-dir, data is imported into the user's home directory by default: /user/yanglin/
The default import format is textfile, with fields separated by commas.
The import format can be specified with one of:
--as-avrodatafile
--as-parquetfile
--as-sequencefile
--as-textfile (default)
You can also import only the specified columns: --columns id,account
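For example, to import only the id and account columns (the target directory name is an assumption):
bin/sqoop import \
--connect jdbc:mysql://life-hadoop.life.com:3306/sqoop \
--username root \
--password root \
--table my_user \
--columns id,account \
--target-dir /user/yanglin/sqoop/import/my_user_columns \
--num-mappers 1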
The imported data can be filtered with a SQL statement, e.g. --query 'select id, account from my_user'; the where clause must contain $CONDITIONS
bin/sqoop import \
--connect jdbc:mysql://life-hadoop.life.com:3306/sqoop \
--username root \
--password root \
--query 'select id, account from my_user where $CONDITIONS' \
--target-dir /user/yanglin/sqoop/import/my_user_query \
--num-mappers 1 \
--as-parquetfile
You can specify the compression format with --compress and --compression-codec
bin/sqoop import \
--connect jdbc:mysql://life-hadoop.life.com:3306/sqoop \
--username root \
--password root \
--query 'select id, account from my_user where $CONDITIONS' \
--target-dir /user/yanglin/sqoop/import/my_user_compress \
--num-mappers 1 \
--as-parquetfile \
--compress \
--compression-codec org.apache.hadoop.io.compress.SnappyCodec
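Snappy relies on the Hadoop native libraries; you can confirm they are loaded with (run from the Hadoop home):
bin/hadoop checknative -a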
You can specify the delimiter between the fields of the imported data, e.g. --fields-terminated-by '\t'
bin/sqoop import \
--connect jdbc:mysql://life-hadoop.life.com:3306/sqoop \
--username root \
--password root \
--table my_user \
--target-dir /user/yanglin/sqoop/import/my_user_terminated \
--num-mappers 1 \
--fields-terminated-by '\t'
You can delete the output directory if it already exists by specifying --delete-target-dir
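For example, to make the first import rerunnable (a sketch of the same command plus the flag):
bin/sqoop import \
--connect jdbc:mysql://life-hadoop.life.com:3306/sqoop \
--username root \
--password root \
--table my_user \
--target-dir /user/yanglin/sqoop/import/my_user \
--delete-target-dir \
--num-mappers 1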
2.4.2 Incremental Import
Method One: use --query to select only the rows to import
bin/sqoop import \
--connect jdbc:mysql://life-hadoop.life.com:3306/sqoop \
--username root \
--password root \
--query 'select * from my_user where $CONDITIONS and id>3 and id<7' \
--target-dir /user/yanglin/sqoop/import/my_user_query_where \
--num-mappers 1 \
--fields-terminated-by '\t'
Method Two: use the Sqoop import options --check-column (the column the increment is based on), --incremental (append mode), and --last-value (the last value of that column already imported)
bin/sqoop import \
--connect jdbc:mysql://life-hadoop.life.com:3306/sqoop \
--username root \
--password root \
--table my_user \
--target-dir /user/yanglin/sqoop/import/my_user_table_where \
--num-mappers 1 \
--fields-terminated-by '\t' \
--check-column id \
--incremental append \
--last-value 3 \
--direct
2.4.3 By specifying --direct, data is transferred through MySQL's native dump utilities instead of plain JDBC, which is faster
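For repeated incremental imports, a saved Sqoop job can record --last-value between runs; a minimal sketch (the job name my_user_inc is an assumption):
bin/sqoop job --create my_user_inc \
-- import \
--connect jdbc:mysql://life-hadoop.life.com:3306/sqoop \
--username root \
--password root \
--table my_user \
--target-dir /user/yanglin/sqoop/import/my_user_table_where \
--check-column id \
--incremental append \
--last-value 3
bin/sqoop job --exec my_user_inc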
2.5 Exporting data to an RDBMS
bin/sqoop export \
--connect jdbc:mysql://life-hadoop.life.com:3306/sqoop \
--username root \
--password root \
--table my_user \
--export-dir /user/yanglin/sqoop/export/user \
--num-mappers 1 \
--fields-terminated-by '\t' \
--direct
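Note that sqoop export requires the target table to already exist in MySQL. To verify the export, you can count the rows (a sketch using the mysql CLI):
mysql -uroot -proot -e 'select count(*) from my_user' sqoop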
2.6 Importing data from an RDBMS into a Hive table
bin/sqoop import \
--connect jdbc:mysql://life-hadoop.life.com:3306/sqoop \
--table my_user \
--username root \
--password root \
--num-mappers 1 \
--fields-terminated-by '\t' \
--hive-import \
--hive-database chd_hive \
--hive-table user_import
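To verify, query the new table from the Hive CLI (a sketch, run from the Hive home):
bin/hive -e 'select * from chd_hive.user_import limit 5;'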
3. Using scripts for Sqoop import and export
3.1 Script file sqoop_import.sql
# Perform the sqoop import from an options file
import
# MySQL connection string
--connect
jdbc:mysql://life-hadoop.life.com:3306/sqoop
# MySQL username
--username
root
# MySQL password
--password
root
# MySQL table to import
--table
my_user
# Target path in HDFS
--target-dir
/user/yanglin/sqoop/import/my_user_options
# Field delimiter for the imported data
--fields-terminated-by
'\t'
3.2 Run the script with --options-file
bin/sqoop --options-file /opt/data/sqoop_import.sql --num-mappers 1