Sqoop, a big data acquisition engine: capturing data from an Oracle database

Source: Internet
Author: User
Tags: sqoop

Welcome to the big data and AI technical articles published by the public account Qing Research Academy, where you can read the carefully organized notes of Night White (the author's pen name). Let us make a little progress every day, so that excellence becomes a habit!

First, an introduction to Sqoop:

Sqoop is a data acquisition engine (also called a data exchange engine) that captures data held in relational databases (RDBMS). It is used primarily to transfer data between an RDBMS and HDFS/Hive/HBase: the sqoop import command imports data from the RDBMS into HDFS/Hive/HBase, and the sqoop export command exports data from HDFS/Hive/HBase into the RDBMS. Its features: it acquires data in bulk, it relies on MapReduce programs underneath, and it works by connecting to the relational database through JDBC.

Second, the experimental environment for Sqoop:

Experimental environment: a machine with the Windows XP operating system and an Oracle database installed.

Why choose Oracle among the relational databases?

Reasons: 1. It is easier to install an Oracle database on a Windows system than on a Linux system. 2. The SH user in the Oracle database owns the sales order table SALES, which contains 920,000 records, and the SCOTT user owns the existing employee table EMP (emp.csv) and department table DEPT (dept.csv).

Third, the driver class name and URL format for each database:

Database   Driver class name                 URL format                               Port number
Oracle     oracle.jdbc.OracleDriver          jdbc:oracle:thin:@ip:1521:orcl           1521
MySQL      com.mysql.jdbc.Driver             jdbc:mysql://ip:3306/dbname?name=value   3306
Hive       org.apache.hive.jdbc.HiveDriver   jdbc:hive2://ip:10000/dbname             10000
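
As a small illustration of the URL formats listed above, the following sketch builds a JDBC URL from a database type, a host, and a database name (for Oracle, the SID). The function name jdbc_url and its argument order are made up for this example and are not part of Sqoop; the default port of each database is assumed.

```shell
# jdbc_url TYPE HOST DBNAME
# Prints a JDBC URL in the format listed for each database.
# Illustrative helper only: the name and argument order are assumptions.
jdbc_url() {
  db_type=$1
  host=$2
  db_name=$3
  case "$db_type" in
    oracle) printf 'jdbc:oracle:thin:@%s:1521:%s\n' "$host" "$db_name" ;;
    mysql)  printf 'jdbc:mysql://%s:3306/%s\n' "$host" "$db_name" ;;
    hive)   printf 'jdbc:hive2://%s:10000/%s\n' "$host" "$db_name" ;;
    *)      echo "unknown database type: $db_type" >&2; return 1 ;;
  esac
}

# URL for the Oracle instance used in this article
jdbc_url oracle 192.168.182.157 orcl
```

The resulting string is what would be passed to the --connect option of a Sqoop command.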

Fourth, installing and configuring Sqoop:

Note: You do not need to modify any configuration file.

1. Install Sqoop: tar -zxvf sqoop-1.4.5bin_hadoop-0.23.tar.gz -C ~/training

2. Configure the SQOOP_HOME environment variable:

export SQOOP_HOME=/root/training/sqoop-1.4.5bin_hadoop-0.23

export PATH=$SQOOP_HOME/bin:$PATH
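
To confirm that the environment variable took effect, a quick check along the following lines may help. It is only a sketch: the function name check_sqoop_home is invented for this example, and it assumes the launcher script sits at bin/sqoop under the installation directory, as the PATH setting above implies.

```shell
# check_sqoop_home: verify that SQOOP_HOME is set and that it contains an
# executable bin/sqoop launcher. Illustrative helper, not part of Sqoop.
check_sqoop_home() {
  if [ -z "${SQOOP_HOME:-}" ]; then
    echo "SQOOP_HOME is not set" >&2
    return 1
  fi
  if [ ! -x "$SQOOP_HOME/bin/sqoop" ]; then
    echo "no executable found at $SQOOP_HOME/bin/sqoop" >&2
    return 1
  fi
  echo "Sqoop launcher found: $SQOOP_HOME/bin/sqoop"
}
```

Calling check_sqoop_home after the two export lines should report the launcher path; running sqoop version afterwards is another way to see whether the command is found on the PATH.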

Fifth, use Sqoop commands to collect data from the RDBMS:

1. Import all data from the employee table EMP:

sqoop import --connect jdbc:oracle:thin:@192.168.182.157:1521:orcl --username SCOTT --password tiger --table EMP --target-dir /sqoop/import/emp1

2. Import the specified columns of the employee table EMP:

sqoop import --connect jdbc:oracle:thin:@192.168.182.157:1521:orcl --username SCOTT --password tiger --table EMP --columns ENAME,SAL --target-dir /sqoop/import/emp2

3. Import all data from the sales table SALES (using a single map task, -m 1):

sqoop import --connect jdbc:oracle:thin:@192.168.182.157:1521:orcl --username SH --password sh --table SALES --target-dir /sqoop/import/sales -m 1

4. Import all the tables owned by the SCOTT user into HDFS:

sqoop import-all-tables --connect jdbc:oracle:thin:@192.168.182.157:1521:orcl --username SCOTT --password tiger

5. Export data from HDFS into the RDBMS:

sqoop export --connect jdbc:oracle:thin:@192.168.182.157:1521:orcl --username SCOTT --password tiger --table STUDENTS --export-dir /students
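
The five commands above repeat the same connection arguments. One possible way to avoid the repetition is a small wrapper function, sketched below. The function name sqoop_scott and the variables ORACLE_URL, ORACLE_USER, and ORACLE_PASSWORD are hypothetical names chosen for this example; only the sqoop options themselves (--connect, --username, --password) come from the commands above.

```shell
# Connection settings taken from the commands in this section.
ORACLE_URL="jdbc:oracle:thin:@192.168.182.157:1521:orcl"
ORACLE_USER="SCOTT"

# sqoop_scott TOOL [ARGS...]
# Runs a Sqoop tool (import, export, import-all-tables, ...) with the shared
# connection arguments filled in. ORACLE_PASSWORD is a hypothetical variable
# that must hold the SCOTT password before the function is called.
sqoop_scott() {
  tool=$1
  shift
  sqoop "$tool" \
    --connect "$ORACLE_URL" \
    --username "$ORACLE_USER" \
    --password "$ORACLE_PASSWORD" \
    "$@"
}
```

With this in place, the second import could be written as sqoop_scott import --table EMP --columns ENAME,SAL --target-dir /sqoop/import/emp2, and the export as sqoop_scott export --table STUDENTS --export-dir /students.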

Sixth, the differences between the Oracle database and the MySQL database:

1. The Oracle database is case-sensitive: user names, table names, and column names must be written in uppercase. The MySQL database is not case-sensitive.

2. The Oracle database has only one database, ORCL, which is created automatically when Oracle is installed, whereas MySQL has many databases.

3. The Oracle database has many users, and a table belongs to a user. MySQL has many databases, and a table belongs to a database; the database sets different access rights for different users.
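
Because of the uppercase rule in point 1, it can be convenient to normalize names before handing them to Sqoop. The helper below is a minimal sketch of that idea; the name to_upper is invented for this example.

```shell
# to_upper NAME: print NAME in uppercase, for use with options such as
# --username, --table, or --columns when the source is Oracle.
to_upper() {
  printf '%s\n' "$1" | tr '[:lower:]' '[:upper:]'
}

to_upper emp          # prints EMP
to_upper ename,sal    # prints ENAME,SAL
```

For example, --table "$(to_upper emp)" passes EMP to Sqoop regardless of how the name was typed.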

Seventh, the similarities and differences between Sqoop and Flume:

Similarity: Sqoop and Flume are both data acquisition engines.

Difference: Sqoop performs batch data acquisition, whereas Flume performs real-time data acquisition and is mainly used in real-time acquisition systems.

Author: Li Jinze (AllenLi), master's student at Tsinghua University. Research direction: big data and artificial intelligence.
