Using Sqoop 1.4.4 to import data from a MySQL database table into HDFS

Questions guide:

1. What does the --connect parameter do?

2. Which parameter reads the database access password from the console?

3. What are the basic parameters and the command for importing a relational database table into HDFS with Sqoop?

4. Which HDFS path is the data imported into by default?

5. What does the --columns parameter do?

6. What does the --where parameter do?

I. Introduction to the key parameters

Parameter                              Description
--connect <jdbc-uri>                   Specify the JDBC connection string for the relational database
--connection-manager <class-name>      Specify the connection manager class to use
--driver <class-name>                  Manually specify the JDBC driver class to use
--hadoop-mapred-home <dir>             Override $HADOOP_MAPRED_HOME
--help                                 Print usage help
--password-file <file>                 Set the path to a file containing the authentication password
-P                                     Read the database access password from the console
--password <password>                  Set the database authentication password
--username <username>                  Set the database access user
--verbose                              Print more information while the program runs
--connection-param-file <filename>     Optional properties file that provides connection parameters
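
As the table notes, putting --password on the command line is insecure; the import log in section III even prints a warning about it. Below is a minimal sketch of the two safer alternatives, assuming a permissions-protected file /user/hadoopuser/.password that contains only the password (that file name and location are illustrative, not taken from the original run):

# Prompt for the password on the console instead of passing it on the command line
sqoop import --connect jdbc:mysql://secondmgt:3306/spice --username hive -P \
    --table users --target-dir /output/sqoop/

# Or read the password from a protected file (path is an assumed example)
sqoop import --connect jdbc:mysql://secondmgt:3306/spice --username hive \
    --password-file /user/hadoopuser/.password \
    --table users --target-dir /output/sqoop/
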
II. The MySQL database table to be exported

[hadoopuser@secondmgt ~]$ mysql -uhive -phive spice
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 419
Server version: 5.1.73 Source distribution

Copyright (c) Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> SELECT * FROM users;
+----+----------+----------+------+---------+------------+-------+------+
| id | username | password | sex  | content | datetime   | vm_id | isad |
+----+----------+----------+------+---------+------------+-------+------+
| 56 | Hua      | Hanyun   | Male | Open    | 2013-12-02 |     0 |    1 |
| 58 | Feng     | 123456   | Male | Open    | 2013-11-22 |     0 |    0 |
| 59 | Test     | 123456   | Male | Open    | 2014-03-05 |    58 |    0 |
| 60 | User1    | 123456   | Male | Open    | 2014-06-26 |    66 |    0 |
| 61 | User2    | 123      | Male | Open    | 2013-12-13 |    56 |    0 |
| 62 | User3    | 123456   | Male | Open    | 2013-12-14 |     0 |    0 |
| 64 | Kai.zhou | 123456   | ?    | ??      | 2014-03-05 |    65 |    0 |
+----+----------+----------+------+---------+------------+-------+------+
7 rows in set (0.00 sec)

III. Importing the data from the users table of the above database into HDFS

To execute the import command you must specify, at a minimum, the database connection string, the access user name, the access password, and the name of the table to import. By default the data is imported into the /user/hadoopuser/<table name>/ directory of the HDFS file system; you can also choose the import directory with the --target-dir parameter, as follows:

[hadoopuser@secondmgt ~]$ sqoop import --connect jdbc:mysql://secondmgt:3306/spice --username hive --password hive --table users --target-dir /output/sqoop/
Warning: /usr/lib/hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
15/01/17 20:28:16 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
15/01/17 20:28:16 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
15/01/17 20:28:16 INFO tool.CodeGenTool: Beginning code generation
15/01/17 20:28:16 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `users` AS t LIMIT 1
15/01/17 20:28:16 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `users` AS t LIMIT 1
15/01/17 20:28:16 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /home/hadoopuser/cloud/hadoop/programs/hadoop-2.2.0
Note: /tmp/sqoop-hadoopuser/compile/c010e7410ec7339ef9b4d9dc2ddaac80/users.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
15/01/17 20:28:18 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-hadoopuser/compile/c010e7410ec7339ef9b4d9dc2ddaac80/users.jar
15/01/17 20:28:18 WARN manager.MySQLManager: It looks like you are importing from mysql.
15/01/17 20:28:18 WARN manager.MySQLManager: This transfer can be faster! Use the --direct
15/01/17 20:28:18 WARN manager.MySQLManager: option to exercise a MySQL-specific fast path.
15/01/17 20:28:18 INFO manager.MySQLManager: Setting zero DATETIME behavior to convertToNull (mysql)
15/01/17 20:28:18 INFO mapreduce.ImportJobBase: Beginning import of users
15/01/17 20:28:18 INFO Configuration.deprecation: mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/hadoopuser/cloud/hadoop/programs/hadoop-2.2.0/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/hadoopuser/cloud/hbase/hbase-0.96.2-hadoop2/lib/slf4j-log4j12-1.6.4.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
15/01/17 20:28:18 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
15/01/17 20:28:19 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
15/01/17 20:28:19 INFO client.RMProxy: Connecting to ResourceManager at secondmgt/192.168.2.133:8032
15/01/17 20:28:20 INFO db.DataDrivenDBInputFormat: BoundingValsQuery: SELECT MIN(`id`), MAX(`id`) FROM `users`
15/01/17 20:28:20 INFO mapreduce.JobSubmitter: number of splits:4
15/01/17 20:28:20 INFO Configuration.deprecation: mapred.job.classpath.files is deprecated. Instead, use mapreduce.job.classpath.files
15/01/17 20:28:20 INFO Configuration.deprecation: user.name is deprecated. Instead, use mapreduce.job.user.name
15/01/17 20:28:20 INFO Configuration.deprecation: mapred.cache.files.filesizes is deprecated. Instead, use mapreduce.job.cache.files.filesizes
15/01/17 20:28:20 INFO Configuration.deprecation: mapred.cache.files is deprecated. Instead, use mapreduce.job.cache.files
15/01/17 20:28:20 INFO Configuration.deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
15/01/17 20:28:20 INFO Configuration.deprecation: mapred.output.value.class is deprecated. Instead, use mapreduce.job.output.value.class
15/01/17 20:28:20 INFO Configuration.deprecation: mapreduce.map.class is deprecated. Instead, use mapreduce.job.map.class
15/01/17 20:28:20 INFO Configuration.deprecation: mapred.job.name is deprecated. Instead, use mapreduce.job.name
15/01/17 20:28:20 INFO Configuration.deprecation: mapreduce.inputformat.class is deprecated. Instead, use mapreduce.job.inputformat.class
15/01/17 20:28:20 INFO Configuration.deprecation: mapred.output.dir is deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
15/01/17 20:28:20 INFO Configuration.deprecation: mapreduce.outputformat.class is deprecated. Instead, use mapreduce.job.outputformat.class
15/01/17 20:28:20 INFO Configuration.deprecation: mapred.cache.files.timestamps is deprecated. Instead, use mapreduce.job.cache.files.timestamps
15/01/17 20:28:20 INFO Configuration.deprecation: mapred.output.key.class is deprecated. Instead, use mapreduce.job.output.key.class
15/01/17 20:28:20 INFO Configuration.deprecation: mapred.working.dir is deprecated. Instead, use mapreduce.job.working.dir
15/01/17 20:28:21 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1421373857783_0002
15/01/17 20:28:21 INFO impl.YarnClientImpl: Submitted application application_1421373857783_0002 to ResourceManager at secondmgt/192.168.2.133:8032
15/01/17 20:28:21 INFO mapreduce.Job: The url to track the job: http://secondmgt:8088/proxy/application_1421373857783_0002/
15/01/17 20:28:21 INFO mapreduce.Job: Running job: job_1421373857783_0002
15/01/17 20:28:34 INFO mapreduce.Job: Job job_1421373857783_0002 running in uber mode : false
15/01/17 20:28:34 INFO mapreduce.Job:  map 0% reduce 0%
15/01/17 20:28:44 INFO mapreduce.Job:  map 25% reduce 0%
15/01/17 20:28:49 INFO mapreduce.Job:  map 75% reduce 0%
15/01/17 20:28:54 INFO mapreduce.Job:  map 100% reduce 0%
15/01/17 20:28:54 INFO mapreduce.Job: Job job_1421373857783_0002 completed successfully
15/01/17 20:28:54 INFO mapreduce.Job: Counters: 27
        File System Counters
                FILE: Number of bytes read=0
                FILE: Number of bytes written=368040
                FILE: Number of read operations=0
                FILE: Number of large read operations=0
                FILE: Number of write operations=0
                HDFS: Number of bytes read=401
                HDFS: Number of bytes written=288
                HDFS: Number of read operations=16
                HDFS: Number of large read operations=0
                HDFS: Number of write operations=8
        Job Counters
                Launched map tasks=4
                Other local map tasks=4
                Total time spent by all maps in occupied slots (ms)=174096
                Total time spent by all reduces in occupied slots (ms)=0
        Map-Reduce Framework
                Map input records=7
                Map output records=7
                Input split bytes=401
                Spilled Records=0
                Failed Shuffles=0
                Merged Map outputs=0
                GC time elapsed (ms)=205
                CPU time spent (ms)=10510
                Physical memory (bytes) snapshot=599060480
                Virtual memory (bytes) snapshot=3535720448
                Total committed heap usage (bytes)=335544320
        File Input Format Counters
                Bytes Read=0
        File Output Format Counters
                Bytes Written=288
15/01/17 20:28:54 INFO mapreduce.ImportJobBase: Transferred 288 bytes in 35.2792 seconds (8.1635 bytes/sec)
15/01/17 20:28:54 INFO mapreduce.ImportJobBase: Retrieved 7 records.
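
Two details of this run are worth noting. The BoundingValsQuery line shows that Sqoop computed MIN(id) and MAX(id) to split the work into 4 map tasks (the default), and the warnings point at the MySQL-specific fast path. A hedged sketch of how both could be adjusted; the value of -m below is only an illustrative choice, not part of the original run:

# Run the import with a single map task (no parallel splits), e.g. for small tables
sqoop import --connect jdbc:mysql://secondmgt:3306/spice --username hive --password hive \
    --table users -m 1 --target-dir /output/sqoop/

# Use the MySQL-specific fast path (--direct) that the warning in the log refers to
sqoop import --connect jdbc:mysql://secondmgt:3306/spice --username hive --password hive \
    --table users --direct --target-dir /output/sqoop/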
IV. Viewing the imported data on HDFS

[hadoopuser@secondmgt ~]$ hadoop fs -cat /output/sqoop/*
56,Hua,Hanyun,Male,Open,2013-12-02,0,1
58,Feng,123456,Male,Open,2013-11-22,0,0
59,Test,123456,Male,Open,2014-03-05,58,0
60,User1,123456,Male,Open,2014-06-26,66,0
61,User2,123,Male,Open,2013-12-13,56,0
62,User3,123456,Male,Open,2013-12-14,0,0
64,Kai.zhou,123456,?,??,2014-03-05,65,0

The output matches the records in the original database, so the import was successful.

V. Importing only part of the data

1. Specifying columns with the --columns parameter

By default Sqoop imports every column of every record in the table. Sometimes we only need some of the columns; in that case use the --columns parameter to list the names of the columns to import, separated by commas. The following imports the user name, gender, and time columns of the users table into HDFS:

[hadoopuser@secondmgt ~]$ sqoop import --connect jdbc:mysql://secondmgt:3306/spice --username hive --password hive \
> --table users --columns "username,sex,datetime" --target-dir /output/sqoop/
View results:

[hadoopuser@secondmgt ~]$ hadoop fs -cat /output/sqoop/*
Hua,Male,2013-12-02
Feng,Male,2013-11-22
Test,Male,2014-03-05
User1,Male,2014-06-26
User2,Male,2013-12-13
User3,Male,2013-12-14
Kai.zhou,?,2014-03-05
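
The fields in the exported files are comma-separated because a comma is Sqoop's default field delimiter for text output. If another delimiter is preferred, the output formatting arguments can change it; the tab character below is just an illustrative choice. Also note that the MapReduce job will normally fail if the --target-dir from a previous run still exists, so remove it (or point to a new directory) before re-importing:

# Remove the output of the previous import before reusing the same target directory
hadoop fs -rm -r /output/sqoop/

# Import the same columns, but write tab-separated fields (the delimiter is an assumed example)
sqoop import --connect jdbc:mysql://secondmgt:3306/spice --username hive --password hive \
    --table users --columns "username,sex,datetime" \
    --fields-terminated-by '\t' --target-dir /output/sqoop/
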
2. Filtering rows with the --where parameter

Another parameter, --where, lets you filter rows so that only the records matching a condition are imported, instead of every record in the table. The following imports the rows of the users table whose id value is greater than 60 into HDFS:

[hadoopuser@secondmgt conf]$ sqoop import --connect jdbc:mysql://secondmgt:3306/spice --username hive --password hive \
> --table users --where "id > 60" --target-dir /output/sqoop/
View results:

[hadoopuser@secondmgt conf]$ hadoop fs -cat /output/sqoop/*
61,User2,123,Male,Open,2013-12-13,56,0
62,User3,123456,Male,Open,2013-12-14,0,0
64,Kai.zhou,123456,?,??,2014-03-05,65,0
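
--columns and --where can also be combined in a single import so that only the selected columns of the matching rows are transferred. A minimal sketch that reuses the condition and column list from the two examples above (again assuming the target directory does not already exist):

# Import only username, sex and datetime for the rows whose id is greater than 60
sqoop import --connect jdbc:mysql://secondmgt:3306/spice --username hive --password hive \
    --table users \
    --columns "username,sex,datetime" \
    --where "id > 60" \
    --target-dir /output/sqoop/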

Recommended reading:

Previous article: Installing Sqoop 1.4.4 on a Hadoop 2.2.0 cluster

Next article: Using Sqoop with SQL statements to import data from MySQL database tables into HDFS
