Alex's Hadoop cainiao Tutorial: 7th Sqoop2 import tutorial, hadoopsqoop2

Source: Internet
Author: User
Tags hdfs dfs sqoop

Alex's Hadoop cainiao Tutorial: 7th Sqoop2 import tutorial, hadoopsqoop2

For details about the installation and jdbc driver preparation, refer to section 6th. Now I will use an example to explain how to use sqoop2.

Data Preparation

There is a mysql table named worker, which contains three pieces of data. We want to import it to hadoop.

This is a table creation statement.

CREATE TABLE `workers` (    `id` int(11) NOT NULL AUTO_INCREMENT,    `name` varchar(20) NOT NULL,    PRIMARY KEY (`id`)  ) ENGINE=MyISAM  DEFAULT CHARSET=utf8; 

Insert three data items

insert into workers (name) values ('jack');insert into workers (name) values ('vicky');insert into workers (name) values ('martin');

Next, we use the sqoop client to import data.

Import data to establish database connection

$ sqoop2Sqoop home directory: /usr/lib/sqoop2Sqoop Shell: Type 'help' or '\h' for help.sqoop:000> create connection --cid 1

This sentence means to establish a connection with id 1, and then sqoop will let you enter some necessary parameters

Creating connection for connector with id 1Please fill following values to create new connection objectName: First connectionConfiguration configurationJDBC Driver Class: com.mysql.jdbc.DriverJDBC Connection String: jdbc:mysql://mysql.server/databaseUsername: sqoopPassword: *****JDBC Connection Properties:There are currently 0 values in the map:entry#Security related configuration optionsMax connections: 0New connection was successfully created with validation status FINE and persistent id 1

Remember to replace jdbc: mysql: // mysql. server/database with your actual database connection.

Create a job and create a job with id 1. The type is import.
sqoop:000> create job --xid 1 --type import

Next, sqoop will let you enter the required parameters, just enter the job name and Table name, and select 0 for several storage options. Press ENTER
Creating job for connection with id 1Please fill following values to create new job objectName: First jobDatabase configurationTable name: workersTable SQL statement:Table column names:Partition column name:Boundary query:Output configurationStorage type:  0 : HDFSChoose: 0Output format:   0 : TEXT_FILE  1 : SEQUENCE_FILEChoose: 0Compression format:   0 : NONE  1 : DEFAULT  2 : DEFLATE  3 : GZIP  4 : BZIP2  5 : LZO  6 : LZ4  7 : SNAPPYChoose: 0Output directory: /user/jarcec/usersNew job was successfully created with validation status FINE and persistent id 1

Run the start job command to execute the task. Use -- jid to input the task id.
sqoop:000> start job --jid 1Submission detailsJob ID: 1Server URL: http://localhost:12000/sqoop/Created by: rootCreation date: 2014-11-26 16:41:30 CSTLastly updated by: rootExternal ID: job_1406097234796_0006N/A2014-11-26 16:41:30 CST: BOOTING  - Progress is not available

Open an ssh terminal and run the hdfs command to view the result.
$ hdfs dfs -ls /user/jarcec/workers/Found 3 items-rw-r--r--   2 sqoop2 supergroup          0 2014-11-26 16:42 /user/jarcec/workers/_SUCCESS-rw-r--r--   2 sqoop2 supergroup          9 2014-11-26 16:41 /user/jarcec/workers/part-m-00000-rw-r--r--   2 sqoop2 supergroup         21 2014-11-26 16:42 /user/jarcec/workers/part-m-00001

We can see that three results files are generated, and then we cat to view the file content.
$ hdfs dfs -cat /user/jarcec/workers/part-m-000001,'jack'$ hdfs dfs -cat /user/jarcec/workers/part-m-000012,'vicky'3,'martin'

Write it here today. Next lesson: Export



Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.