Installing and configuring the Oozie scheduling system on the Hadoop platform

Source: Internet
Author: User
Keywords: installation, JDBC, port

Oozie is an open-source scheduling tool for the Hadoop platform. We have used it in our project for nearly a year. Installing and configuring Oozie is fairly complex, and a good deal of configuration is needed before it is convenient to use. The steps below record one Oozie installation and configuration, as a reference for others who use Hadoop and Oozie and for my own later use.

1 Unpacking the installation package

tar -xzf oozie-3.3.2-distro.tar.gz

2 Modifying the addtowar.sh script

The Hadoop versions that Oozie declares support for do not include 1.0.4, although Oozie does in fact work with Hadoop 1.0.4. You therefore need to modify the addtowar.sh script under $OOZIE_HOME/bin, adding the following branch at the end of the version checks in the function getHadoopJars():

elif [ "${version}" = "1.0.4" ]; then
  #List is separated by ":"
  hadoopJars="hadoop-ant-1.0.4.jar:hadoop-client-1.0.4.jar:hadoop-core-1.0.4.jar:hadoop-minicluster-1.0.4.jar:hadoop-tools-1.0.4.jar:jackson-core-asl-1.8.8.jar:jackson-mapper-asl-1.8.8.jar:log4j-1.2.15.jar:commons-configuration-1.6.jar"
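The version check can be tried outside addtowar.sh before editing the real script. The following is only a sketch: `hadoop_jars_for` is a hypothetical stand-in for the relevant part of the function, and it prints the colon-separated list one jar per line so that it is easy to compare with the jars actually present in the Hadoop installation.

```shell
# Sketch only: hadoop_jars_for is a hypothetical name, not part of Oozie.
# It mirrors the branch added to addtowar.sh and prints one jar per line.
hadoop_jars_for() (
  version=$1
  if [ "${version}" = "1.0.4" ]; then
    # List is separated by ":"
    jars="hadoop-ant-1.0.4.jar:hadoop-client-1.0.4.jar:hadoop-core-1.0.4.jar:hadoop-minicluster-1.0.4.jar:hadoop-tools-1.0.4.jar:jackson-core-asl-1.8.8.jar:jackson-mapper-asl-1.8.8.jar:log4j-1.2.15.jar:commons-configuration-1.6.jar"
  else
    echo "unsupported Hadoop version: ${version}" >&2
    exit 1
  fi
  # Split the list on ":" so each jar name is on its own line
  printf '%s\n' "${jars}" | tr ':' '\n'
)
```

Calling `hadoop_jars_for 1.0.4` lists the nine jar names; any other version prints an error on stderr and returns a non-zero status.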

3 Copying the required dependencies

Oozie packages all of its dependencies into a WAR file that is deployed on its built-in Tomcat, so the required JAR packages and JavaScript files must be copied into the $OOZIE_HOME/libext directory.
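A minimal sketch of the copy step, assuming the jars and the ExtJS archive have already been collected in one staging directory. The function name `copy_to_libext`, the staging directory, and the `ext-*.zip` file name pattern are all assumptions made for illustration; adjust them to the files you actually have.

```shell
# Sketch only: copy_to_libext and both directory arguments are hypothetical.
# Copies every *.jar and ext-*.zip from a staging directory into libext.
copy_to_libext() (
  src=$1
  libext=$2
  mkdir -p "${libext}" || exit 1
  for f in "${src}"/*.jar "${src}"/ext-*.zip; do
    # A pattern that matched nothing stays literal; skip it
    [ -e "${f}" ] || continue
    cp "${f}" "${libext}/" || exit 1
    echo "copied $(basename "${f}")"
  done
)
```

For example, `copy_to_libext /home/hadoop/staging "$OOZIE_HOME/libext"` would copy the collected files, where `/home/hadoop/staging` is a made-up path.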

4 Environment variable settings

vi ~/.bash_profile

Add:

export OOZIE_URL=http://localhost:11000/oozie/
export OOZIE_HOME=/home/hadoop/oozie-3.3.2

Also append $OOZIE_HOME/bin to PATH.
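If the profile is edited by a script rather than by hand, it helps to make the change repeatable. The sketch below appends the variables only once; the function name `add_oozie_env` and the marker comment are hypothetical conventions, while the values are the ones used in this article.

```shell
# Sketch only: add_oozie_env and the marker comment are hypothetical.
# Appends the Oozie variables to the given profile file at most once.
add_oozie_env() (
  profile=$1
  marker="# oozie environment"
  # Do nothing if the block was already added
  if [ -f "${profile}" ] && grep -qF "${marker}" "${profile}"; then
    exit 0
  fi
  cat >> "${profile}" <<'EOF'
# oozie environment
export OOZIE_URL=http://localhost:11000/oozie/
export OOZIE_HOME=/home/hadoop/oozie-3.3.2
export PATH=$PATH:$OOZIE_HOME/bin
EOF
)
```

After `add_oozie_env ~/.bash_profile`, run `. ~/.bash_profile` or log in again so the current shell picks up the variables.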

5 Proxy settings

Without proxy settings, submitting a task fails with an error similar to:

hadoop is not allowed to impersonate hadoop

That is, the hadoop account does not have permission to submit tasks on behalf of the hadoop account.

This happens because Oozie itself does not execute any tasks and does not distribute tasks to the TaskTrackers. Oozie's only interaction with the Hadoop cluster is to submit tasks to the JobTracker and to learn how they ran, through callback URLs or polling. Since it submits those tasks on behalf of users, Hadoop must be configured to allow it.

Assume that the Hadoop cluster is installed under account A, and that Oozie is installed under account B on some node, where B belongs to user group C. The proxy setting then means: account A has the right to submit tasks on behalf of user group C from that node.

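Add to core-site.xml:

<property>
  <name>hadoop.proxyuser.hadoop.hosts</name>
  <value>IP</value>
</property>
<property>
  <name>hadoop.proxyuser.hadoop.groups</name>
  <value>hadoop</value>
</property>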

In these configuration items, the "hadoop" that follows "proxyuser" in hadoop.proxyuser.hadoop.hosts and hadoop.proxyuser.hadoop.groups is account A mentioned above. The value of hadoop.proxyuser.hadoop.hosts must be the IP of the node where Oozie is installed, and the value of hadoop.proxyuser.hadoop.groups must be user group C mentioned above.

Hadoop and Oozie are generally both installed under the hadoop account, which belongs to the hadoop user group. Hence this odd-looking configuration in which hadoop submits tasks on behalf of hadoop.

Note: because these settings are in core-site.xml, the cluster must be restarted after the change for them to take effect.
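When the account, node IP, and group differ from the all-hadoop case, the two property names and values are easy to mix up. The following sketch prints both properties from three arguments, following the A / node IP / C mapping described above; the function name `proxyuser_properties` is hypothetical, and the output still has to be pasted into core-site.xml by hand.

```shell
# Sketch only: proxyuser_properties is a hypothetical helper.
# $1 = account A (named in the property), $2 = IP of the Oozie node,
# $3 = user group C
proxyuser_properties() {
  printf '<property>\n  <name>hadoop.proxyuser.%s.hosts</name>\n  <value>%s</value>\n</property>\n' "$1" "$2"
  printf '<property>\n  <name>hadoop.proxyuser.%s.groups</name>\n  <value>%s</value>\n</property>\n' "$1" "$3"
}
```

For instance, `proxyuser_properties hadoop 192.0.2.10 hadoop` produces the two blocks for the common case, with 192.0.2.10 standing in as a placeholder address.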

6 Time zone settings

The time zone is set so that times in the configuration use the UTC+8 zone (GMT+0800).

Add to oozie-site.xml:

<property>
  <name>oozie.processing.timezone</name>
  <value>GMT+0800</value>
</property>
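Oozie documents only two forms for oozie.processing.timezone: UTC, or GMT followed by a signed four-digit offset, so the UTC+8 zone is written GMT+0800 and a name such as Asia/Shanghai is not accepted. A small check, sketched here with the hypothetical function name `is_valid_oozie_timezone`, can catch a wrong value before the server is started.

```shell
# Sketch only: is_valid_oozie_timezone is a hypothetical helper.
# Succeeds for UTC or GMT+hhmm / GMT-hhmm, fails for anything else.
is_valid_oozie_timezone() {
  printf '%s\n' "$1" | grep -Eq '^(UTC|GMT[+-][0-9]{4})$'
}
```

It is used as a condition, for example `is_valid_oozie_timezone GMT+0800 && echo ok`.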

7 Port number settings

If the default port conflicts with another service, change the port number in $OOZIE_HOME/conf/oozie-env.sh. The default port number is 11000.
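As a sketch of what the change looks like: oozie-env.sh ships with a commented-out OOZIE_HTTP_PORT line, and uncommenting it with a new value moves the server. The port 11001 below is only an example value. The client-side OOZIE_URL from the environment variable step has to be changed to match, otherwise the oozie command keeps talking to the old port.

```shell
# In $OOZIE_HOME/conf/oozie-env.sh (11001 is an example value; default is 11000)
export OOZIE_HTTP_PORT=11001

# In ~/.bash_profile: keep the client URL in step with the server port
export OOZIE_URL=http://localhost:11001/oozie/
```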

8 Database settings

Oozie stores its metadata in an RDBMS and by default uses Derby, Apache's embedded pure-Java database. Derby may cause problems in use, so MySQL is recommended, configured in oozie-site.xml as follows:

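<property>
  <name>oozie.db.schema.name</name>
  <value>oozie</value>
  <description>Oozie DataBase Name</description>
</property>
<property>
  <name>oozie.service.JPAService.create.db.schema</name>
  <value>true</value>
  <description>
    Creates Oozie DB.
    If set to true, it creates the DB schema if it does not exist. If the DB schema exists it is a NOP.
    If set to false, it does not create the DB schema. If the DB schema does not exist it fails start up.
  </description>
</property>
<property>
  <name>oozie.service.JPAService.jdbc.driver</name>
  <value>com.mysql.jdbc.Driver</value>
  <description>JDBC driver class.</description>
</property>
<property>
  <name>oozie.service.JPAService.jdbc.url</name>
  <value>jdbc:mysql://localhost:3306/oozie?createDatabaseIfNotExist=true</value>
  <description>The JDBC URL.</description>
</property>
<property>
  <name>oozie.service.JPAService.jdbc.username</name>
  <value>root</value>
  <description>DB user name.</description>
</property>
<property>
  <name>oozie.service.JPAService.jdbc.password</name>
  <value>abc</value>
  <description>
    DB user password.
    IMPORTANT: if password is empty leave a 1 space string, the service trims the value,
    if empty Configuration assumes it is NULL.
  </description>
</property>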
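A misspelled property name in oozie-site.xml is easy to miss, because the misspelled setting is simply never read. The sketch below reads a value back by its exact name so the database settings can be checked before the server is started. The function name `get_property` is hypothetical, and it assumes each name and value tag sits on its own line, with the value on a line after the name.

```shell
# Sketch only: get_property is a hypothetical helper, not an Oozie tool.
# Prints the <value> that follows the exact <name> in a Hadoop-style XML
# file. Assumes <name> and <value> are each on their own line.
get_property() (
  file=$1
  name=$2
  awk -v want="<name>${name}</name>" '
    index($0, want)     { found = 1; next }
    found && /<value>/  {
      sub(/.*<value>/, "")
      sub(/<\/value>.*/, "")
      print
      exit
    }
  ' "${file}"
)
```

For example, `get_property "$OOZIE_HOME/conf/oozie-site.xml" oozie.service.JPAService.jdbc.url` prints the JDBC URL, and prints nothing if the name is misspelled in the file.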

9 Running oozie-setup.sh

$OOZIE_HOME/bin/oozie-setup.sh -hadoop 1.0.4 /home/hadoop/hadoop -extjs /home/hadoop/oozie-3.3.2/libext/

10 Creating the database

$OOZIE_HOME/bin/ooziedb.sh create -sqlfile oozie.sql -run

11 Running oozie-start.sh

$OOZIE_HOME/bin/oozie-start.sh

12 Running oozie-run.sh

nohup /home/hadoop/oozie-3.3.2/bin/oozie-run.sh >/home/hadoop/oozie-3.3.2/logs/log 2>/home/hadoop/oozie-3.3.2/logs/errorlog &

13 Verifying that Oozie is running normally

oozie admin -oozie http://IP:11000/oozie -status
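The server can take a little while to come up, so a single status call right after starting may fail. The sketch below repeats the same status command until the output contains "System mode: NORMAL", which is what the Oozie CLI prints for a healthy server. The function name `wait_for_oozie`, the default of 10 attempts, and the 3-second pause are arbitrary choices for illustration.

```shell
# Sketch only: wait_for_oozie, the attempt count and the pause are
# hypothetical choices. Polls the status command until it reports NORMAL.
wait_for_oozie() (
  url=$1
  attempts=${2:-10}
  i=0
  while [ "${i}" -lt "${attempts}" ]; do
    if oozie admin -oozie "${url}" -status 2>/dev/null | grep -q 'System mode: NORMAL'; then
      echo "Oozie is up at ${url}"
      exit 0
    fi
    i=$((i + 1))
    sleep 3
  done
  echo "Oozie did not report NORMAL after ${attempts} attempts" >&2
  exit 1
)
```

A call such as `wait_for_oozie http://IP:11000/oozie` returns 0 once the server is ready and 1 if it never reports NORMAL, so it can gate later steps in a startup script.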
