Sqoop 1.4.4: scheduled incremental import from Oracle to Hive

Source: Internet
Author: User
Tags: chmod, sqoop

Importing data from Oracle into Hive with Sqoop on a schedule, using incremental imports

Thanks to:

http://blog.sina.com.cn/s/blog_3fe961ae01019a4l.html
http://f.dataguru.cn/thread-94073-1-1.html (sqoop.metastore.client.record.password)
http://blog.csdn.net/ryantotti/article/details/14226635 (enabling the Sqoop metastore)

Step 1: Create the Sqoop job
A. Configure the Sqoop metastore service
Modify the sqoop/conf/sqoop-site.xml file.

Related properties:

sqoop.metastore.server.location
sqoop.metastore.server.port
sqoop.metastore.client.autoconnect.url

The three properties above configure a shared metastore. As the official documentation puts it: "By default, job descriptions are saved to a private repository stored in $HOME/.sqoop/. You can configure Sqoop to instead use a shared metastore, which makes saved jobs available to multiple users across a shared cluster. Starting the metastore is covered by the section on the sqoop-metastore tool." This lets you share the job and execute it from other machines in the cluster.

If you do not need to share the job, simply comment out the three properties above in the configuration file with <!-- -->.

sqoop.metastore.client.enable.autoconnect
sqoop.metastore.client.record.password — this property controls whether the database password is saved. By default, for security, the password is not stored in the metastore, which means you must re-enter the database password every time the job runs. Since we want the job to run unattended on a schedule, we change this property so that the password is saved.

Modify the following:

<property>
  <name>sqoop.metastore.server.location</name>
  <value>/tmp/sqoop-metastore/shared.db</value>
</property>
<property>
  <name>sqoop.metastore.server.port</name>
  <value>16000</value>
</property>
<property>
  <name>sqoop.metastore.client.autoconnect.url</name>
  <value>jdbc:hsqldb:hsql://118.228.197.115:16000/sqoop</value>
</property>
<property>
  <name>sqoop.metastore.client.record.password</name>
  <value>true</value>
</property>
<!-- comment out this property:
<property>
  <name>sqoop.metastore.client.enable.autoconnect</name>
  <value>false</value>
</property>
-->

B. Start the metastore: run the sqoop metastore command from the console (skip this step if the three properties above are not configured).
C. Create the Sqoop job

(To make execution easier, save the following command to a file, run chmod u+x filename to make it executable, then create the job by running ./filename.)

sqoop job --meta-connect jdbc:hsqldb:hsql://HOSTIP:16000/sqoop \
  --create JOBNAME -- import \
  --hive-import --incremental append \
  --connect jdbc:oracle:thin:@DATABASEIP:1521/INSTANCENAME \
  --username USERNAME --password PASSWD \
  --verbose -m 1 --bindir /opt/sqoop/lib \
  --table TABLENAME --check-column COLUMNNAME --last-value VALUE
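The "save to a file, chmod, then run" workflow described above can be sketched as a short shell session. The file name create_job.sh is a hypothetical choice for illustration, and the UPPERCASE tokens are placeholders that must be replaced with real values before the job is actually created:

```shell
# Sketch of the save-then-execute workflow; create_job.sh is a
# hypothetical file name and the UPPERCASE tokens are placeholders.
cat > create_job.sh <<'EOF'
#!/bin/sh
sqoop job --meta-connect jdbc:hsqldb:hsql://HOSTIP:16000/sqoop \
  --create JOBNAME -- import \
  --hive-import --incremental append \
  --connect jdbc:oracle:thin:@DATABASEIP:1521/INSTANCENAME \
  --username USERNAME --password PASSWD \
  --verbose -m 1 --bindir /opt/sqoop/lib \
  --table TABLENAME --check-column COLUMNNAME --last-value VALUE
EOF
chmod u+x create_job.sh   # give the owner execute permission
# ./create_job.sh         # running this would create the saved job
```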


Notes:

1) If you have not configured the shared metastore (that is, the sqoop.metastore.server.location, sqoop.metastore.server.port, and sqoop.metastore.client.autoconnect.url properties are commented out in the configuration file), remove --meta-connect jdbc:hsqldb:hsql://HOSTIP:16000/sqoop from the script above.

2) In --create JOBNAME -- import, the -- must be followed by a space before the import command, otherwise execution fails.
3) The --check-column column cannot be a CHAR/VARCHAR type; it can be a DATE or INT type.
Official reference: http://sqoop.apache.org/docs/1.4.4/SqoopUserGuide.html — search the page for check-column to jump straight to the relevant explanation.

Step 2: Run the Sqoop job to verify that it works

# list the jobs to check whether the job was created successfully
sqoop job --list
# run the job; if the test executes correctly, note that importing a large amount of data can take a long time
sqoop job --exec JOBNAME


Step 3: Once the Sqoop job executes correctly, write a script to run it periodically

Write the following script to a text file, e.g. execjob, then run the chmod u+x execjob command to add execute permission:

source /etc/profile
rm TABLENAME.java          # delete the table class file generated by the previous Sqoop run
sqoop job --exec JOBNAME

Step 4: Schedule the job with the crontab tool

Run the crontab -e command, add the following lines, then save and exit:

# run the data import job at 1:00 AM every day
0 1 * * * execjob 1>/root/execlogs 2>&1

Note: execjob is the script file created in Step 3 and should be referenced by its full path, e.g. /root/execjob. The clause 1>/root/execlogs 2>&1 redirects stdout and stderr to the specified file, where you can review the execution output.
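As a quick sanity check on the crontab entry above, its five schedule fields can be split out with a small shell sketch (the entry is the one from Step 4; the variable names are illustration only):

```shell
# Split a crontab entry into its five schedule fields plus the command.
set -f                      # disable globbing so the '*' fields stay literal
entry='0 1 * * * /root/execjob 1>/root/execlogs 2>&1'
set -- $entry               # word-split the entry on whitespace
minute=$1 hour=$2 dom=$3 month=$4 dow=$5
echo "minute=$minute hour=$hour day-of-month=$dom month=$month day-of-week=$dow"
# prints: minute=0 hour=1 day-of-month=* month=* day-of-week=*
```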

crontab usage references:

http://www.cnblogs.com/jiafan/articles/1153066.html

http://baike.baidu.com/view/1229061.htm
