Sqoop: timed incremental import from Oracle into Hive
References:
http://blog.sina.com.cn/s/blog_3fe961ae01019a4l.html
http://f.dataguru.cn/thread-94073-1-1.html (sqoop.metastore.client.record.password)
http://blog.csdn.net/ryantotti/article/details/14226635 (enabling the Sqoop metastore)
Step 1: Create the Sqoop job
A. Configuring the Sqoop Metastore Service
Edit the conf/sqoop-site.xml file under the Sqoop installation directory. The relevant properties are:
sqoop.metastore.server.location
sqoop.metastore.server.port
sqoop.metastore.client.autoconnect.url
These three properties configure a shared metastore. From the Sqoop user guide: "By default, job descriptions are saved to a private repository stored in $HOME/.sqoop/. You can configure Sqoop to instead use a shared metastore, which makes saved jobs available to multiple users across a shared cluster. Starting the metastore is covered by the section on the sqoop-metastore tool." A shared metastore lets you share the saved job and execute it from other nodes in the cluster.
If you do not need to share the job, simply comment out these three properties in the configuration file with <!-- -->.
sqoop.metastore.client.enable.autoconnect
sqoop.metastore.client.record.password: this property controls whether the database password is saved. By default, for security, the password is not stored in the metastore, which means it must be re-entered every time the job is executed. Since we want the job to run unattended on a schedule, set this property to true so the password is saved.
Modify them as follows:

<property>
  <name>sqoop.metastore.server.location</name>
  <value>/tmp/sqoop-metastore/shared.db</value>
</property>
<property>
  <name>sqoop.metastore.server.port</name>
  <value>16000</value>
</property>
<property>
  <name>sqoop.metastore.client.autoconnect.url</name>
  <value>jdbc:hsqldb:hsql://118.228.197.115:16000/sqoop</value>
</property>
<property>
  <name>sqoop.metastore.client.record.password</name>
  <value>true</value>
</property>
<!-- comment out this property:
<property>
  <name>sqoop.metastore.client.enable.autoconnect</name>
  <value>false</value>
</property>
-->
B. Start the metastore: run the sqoop metastore command on the console (skip this step if the first three properties were left commented out).
C. Create the Sqoop job
(For convenience, save the following command to a file, run chmod u+x filename to make it executable, then create the job by running ./filename.)
sqoop job --meta-connect jdbc:hsqldb:hsql://hostip:16000/sqoop --create JOBNAME -- import --hive-import --incremental append --connect jdbc:oracle:thin:@DatabaseIP:1521/instancename --username username --password PASSWD --verbose -m 1 --bindir /opt/sqoop/lib --table TABLENAME --check-column COLUMNNAME --last-value value
Notes:
1) If you did not configure the shared metastore (that is, the sqoop.metastore.server.location, sqoop.metastore.server.port, and sqoop.metastore.client.autoconnect.url properties are still commented out in the configuration file), remove "--meta-connect jdbc:hsqldb:hsql://hostip:16000/sqoop" from the script above.
2) In "--create JOBNAME -- import", the "--" must be followed by a space before the import command, otherwise execution fails.
3) The --check-column column cannot be a char/varchar type; it should be a date, int, or similar type.
Reference, official site: http://sqoop.apache.org/docs/1.4.4/SqoopUserGuide.html (search for check-column on the page to jump to the relevant explanation).
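The one-line job-creation command above is easier to review and maintain as a small script. A minimal sketch, where every value (hostip, JOBNAME, DatabaseIP, instancename, and so on) is a placeholder from the article; the command is built as a string and echoed rather than executed, so it can be inspected before running:

```shell
#!/bin/sh
# All values below are placeholders from the article; adjust them for your
# environment before running the command for real.
META=jdbc:hsqldb:hsql://hostip:16000/sqoop
JOB=JOBNAME
TABLE=TABLENAME
COL=COLUMNNAME
LAST=value

# Built as a string and echoed for review; run it with `eval "$CMD"` (or paste
# it into a shell) once the placeholders are filled in.
CMD="sqoop job --meta-connect $META --create $JOB -- import --hive-import \
--incremental append --connect jdbc:oracle:thin:@DatabaseIP:1521/instancename \
--username username --password PASSWD --verbose -m 1 --bindir /opt/sqoop/lib \
--table $TABLE --check-column $COL --last-value $LAST"
echo "$CMD"
```

If you did not configure the shared metastore, drop the --meta-connect portion, as in note 1) above.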
Step 2: Execute the Sqoop job and verify that it runs correctly
Check the job list to confirm the job was created successfully: sqoop job --list
Execute the job (if this test run completes correctly, the job is working; importing a large amount of data can take a long time): sqoop job --exec JOBNAME
Step 3: Once the Sqoop job executes correctly, write a script to run it on a schedule
Write the following into a text file, e.g. execjob, then run the chmod u+x execjob command to add execute permission:

source /etc/profile
rm TABLENAME.java
sqoop job --exec JOBNAME
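To test the wrapper on a machine where the sqoop binary is not available, the same script can be written with a dry-run switch. A sketch, where JOBNAME and TABLENAME remain the article's placeholders and the SQOOP override is an assumption of this example:

```shell
#!/bin/sh
# Sketch of the execjob wrapper. SQOOP defaults to `echo` here so the final
# command is printed instead of executed (a dry run); set SQOOP=sqoop on the
# real host. JOBNAME and TABLENAME are placeholders from the article.
SQOOP=${SQOOP:-echo}
# . /etc/profile              # uncomment under cron: cron starts with a bare environment
rm -f TABLENAME.java          # remove the generated ORM class left by the previous run
$SQOOP job --exec JOBNAME
```

Running it as-is prints the sqoop arguments; running it with SQOOP=sqoop performs the real import.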
Step 4: Schedule execution with the crontab tool
Run the crontab -e command, add the following lines of script, then save and exit:

# run the data import job at 1:00 every day
0 1 * * * /root/execjob 1>/root/execlogs 2>&1
Note: execjob is the script file created in Step 3 and must be given by its full path, e.g. /root/execjob. The phrase "1>/root/execlogs 2>&1" redirects stdout and stderr to the specified file, where you can review the output of each execution.
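crontab -e opens an interactive editor; for scripted setup the same entry can also be installed non-interactively. A sketch, assuming the /root/execjob path used in the article:

```shell
#!/bin/sh
# The entry that crontab should contain (full path, since cron's PATH is minimal).
CRON_LINE='0 1 * * * /root/execjob 1>/root/execlogs 2>&1'
echo "$CRON_LINE"
# To install it without opening an editor, uncomment on the real host:
# ( crontab -l 2>/dev/null; printf '%s\n' "$CRON_LINE" ) | crontab -
```

Afterwards, crontab -l should list the entry.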
For crontab command usage, see:
http://www.cnblogs.com/jiafan/articles/1153066.html
http://baike.baidu.com/view/1229061.htm