SQOOP Load Data from Oracle to Hive Table

Source: Internet
Author: User
Tags sqoop

sqoop import -D oraoop.disabled=true --connect "jdbc:oracle:thin:@(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=hostname)(PORT=port))(CONNECT_DATA=(SERVICE_NAME=service_name)))" --username USERNAME --table TABLE_NAME --null-string '\\N' --null-non-string '\\N' --hive-import --hive-table hivedb.hivetablename --num-mappers 1 --verbose --password PWD --hive-drop-import-delims --hive-overwrite --fetch-size <fetch_size>

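-D is not a Sqoop option. It is a generic Hadoop option that sets a configuration property, so it must come right after the tool name (import) and before the Sqoop-specific arguments.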

oraoop.disabled=true

If this parameter is not set, the command reports an error: table or view does not exist.

OraOop is a special plugin for Sqoop that provides faster access to Oracle's RDBMS by using custom protocols that are not publicly available. Quest Software partnered with Oracle to get those protocols, implemented them, and created OraOop.

In our test environment, the import works fine without this parameter. In another environment we hit this issue. Before the error, one log message said that the connect string could not be recognized as a valid thin URL, so it may be a driver issue.

Another thing to take care of: write table_name (or the view name) and username in UPPER case. Otherwise you hit the same error: table or view does not exist.
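A minimal sketch of normalizing the names before they are passed to Sqoop, assuming a POSIX shell. The values order_view and etl_user are hypothetical, and the script only prints the arguments rather than running an import.

```shell
#!/bin/sh
# Hypothetical names; replace with your own table (or view) and Oracle user.
table_name="order_view"
username="etl_user"

# Upper-case a string with tr so the script works in any POSIX shell.
to_upper() {
    printf '%s' "$1" | tr '[:lower:]' '[:upper:]'
}

TABLE_UPPER=$(to_upper "$table_name")
USER_UPPER=$(to_upper "$username")

# Print the arguments that would be appended to the sqoop import command.
echo "--username $USER_UPPER --table $TABLE_UPPER"
```

Oracle stores unquoted identifiers in upper case, which is a likely reason the lower-case names are not found.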

--hive-drop-import-delims

This parameter addresses a known issue that occurs when fields in the RDBMS table contain line breaks (\r, \n) or special characters such as \001 in their content.

Such content breaks Hive's rules: by default, Hive uses \001 as the field separator and \n as the row terminator.

If you specify a different row terminator yourself, Hive reports an error, because Hive currently supports only \n as the row terminator. So you can either replace or drop the special characters and the \r\n in the fields.
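The difference between dropping and replacing can be illustrated on a sample value with plain shell tools. This is only a sketch of the effect on one string, not of how Sqoop implements it; the sample text is made up. Sqoop's own option for the replace variant is --hive-delims-replacement.

```shell
#!/bin/sh
# A made-up field value containing \r, \n, and \001.
field=$(printf 'line one\r\nline two\001end')

# Dropping: the effect of --hive-drop-import-delims on a string field.
dropped=$(printf '%s' "$field" | tr -d '\r\n\001')

# Replacing: the idea behind --hive-delims-replacement, here with a space
# for each removed character.
replaced=$(printf '%s' "$field" | tr '\r\n\001' '   ')

printf '%s\n' "$dropped"    # line oneline twoend
printf '%s\n' "$replaced"   # line one  line two end
```

Either way, the field no longer contains characters that Hive would read as a field or row boundary.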

--hive-overwrite

This overwrites the existing data in the Hive table.

--fetch-size

This parameter's default value is 1000.

Once, when we loaded a wide view with a large number of columns, the Sqoop command reported an error: out of memory.

At that point the Java file had not been generated yet. I don't know why, but the error occurred before the fetch size was set, so I set --fetch-size explicitly.

Finding the root cause may require more information from the source code.

--null-string '\\N' --null-non-string '\\N'

Without these parameters, a NULL value in the RDBMS ends up in Hive as the string 'null'. With them, it remains NULL in the Hive table.

The Sqoop command generates a Hadoop jar file in a temp path and then executes the MapReduce job.

First it loads the data into HDFS, then it creates the Hive table, and then it uses the LOAD command to move the data from HDFS into the Hive warehouse folder.

If the command executes successfully, it cleans up the staging files.

If it fails while creating the Hive table or loading the data into Hive, the HDFS folder and files remain in HDFS.

If you rerun the same command, it fails and reports that the output directory already exists. So either delete that directory or load the data into Hive yourself.
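A minimal sketch of removing a leftover output directory before a rerun. It assumes the hdfs client is on the PATH and that the directory is named after the table under your HDFS home directory, which is Sqoop's default for table imports; TABLE_NAME is a placeholder and cleanup_staging is a hypothetical helper name.

```shell
#!/bin/sh
# Placeholder path: adjust to the directory named in the error message.
staging_dir="TABLE_NAME"

cleanup_staging() {
    dir=$1
    # Skip quietly on machines without a Hadoop client.
    if ! command -v hdfs >/dev/null 2>&1; then
        echo "hdfs client not found; skipping cleanup of $dir"
        return 0
    fi
    # -test -d returns 0 only when the path exists and is a directory.
    if hdfs dfs -test -d "$dir"; then
        hdfs dfs -rm -r "$dir"
    else
        echo "nothing to clean: $dir does not exist"
    fi
}

cleanup_staging "$staging_dir"
```

Recent Sqoop 1.4.x releases also accept --delete-target-dir on the import command, which removes the target directory before the import starts.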

If you use --query (-e), Sqoop loads the data with a free-form query.

Demo: --query "select * from table where \$CONDITIONS". Inside double quotes you must escape the $ with a backslash. Inside single quotes the backslash is not needed.

You must also add the parameter --target-dir /hdfspath if you use --query.
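The quoting rule can be checked in the shell without running Sqoop. In this sketch TABLE_NAME is a placeholder and the script only prints the arguments.

```shell
#!/bin/sh
# Double quotes: the shell would expand $CONDITIONS, so the dollar sign
# has to be escaped with a backslash.
query_dq="select * from TABLE_NAME where \$CONDITIONS"

# Single quotes: nothing is expanded, so no backslash is needed.
query_sq='select * from TABLE_NAME where $CONDITIONS'

# Both forms hand Sqoop the same literal text.
if [ "$query_dq" = "$query_sq" ]; then
    echo "identical: $query_dq"
fi

# Arguments as they would be appended to the sqoop import command.
echo "--query \"$query_dq\" --target-dir /hdfspath --num-mappers 1"
```

Sqoop replaces the $CONDITIONS token itself at run time. With more than one mapper, a free-form query import also needs --split-by.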

When you load data from an RDBMS into Hive and let Sqoop create the table for you, you will find that integer types are converted to double.

So you need to handle this yourself. Take care.
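One way to handle it is Sqoop's --map-column-hive option, which overrides the default Hive type for the columns you list. The sketch below only builds and prints the argument. The column names ORDER_ID and QUANTITY and the choice of BIGINT are assumptions for illustration.

```shell
#!/bin/sh
# Hypothetical integer columns that would otherwise be created as DOUBLE.
int_columns="ORDER_ID QUANTITY"

# Build a comma-separated COLUMN=TYPE list.
mapping=""
for col in $int_columns; do
    mapping="${mapping:+$mapping,}$col=BIGINT"
done

# Argument as it would be appended to the sqoop import command.
echo "--map-column-hive $mapping"   # --map-column-hive ORDER_ID=BIGINT,QUANTITY=BIGINT
```

Another option is to create the Hive table yourself with the types you want and let Sqoop load into the existing table.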
