Comparison of the Hadoop workflow engines Azkaban and Oozie (iv)

Source: Internet
Author: User

What is Azkaban? (i)

Functional characteristics of Azkaban (ii)

Architecture of Azkaban (iii)

Without further ado, straight to the substance!

http://www.cnblogs.com/zlslch/category/938837.html

Currently, the two most popular Hadoop workflow schedulers are Azkaban and Oozie.

For more detail on each, see the following series on my blog.

Azkaban Concept Learning Series http://www.cnblogs.com/zlslch/category/938837.html

Oozie Concept Learning Series http://www.cnblogs.com/zlslch/category/916607.html

The following table compares the key features of these two Hadoop workflow schedulers. The scenarios they can address are largely the same, but they differ in design concepts, target users, application scenarios, and so on.

| Characteristics | Oozie | Azkaban |
|---|---|---|
| Workflow description language | XML (XPDL based) | Text file with key/value pairs |
| Dependency mechanism | Explicit | Explicit |
| Requires a web container | Yes | Yes |
| Progress tracking | Web page | Web page |
| Hadoop job scheduling support | Yes | Yes |
| Operating mode | Daemon | Daemon |
| Pig support | Yes | Yes |
| Event notification | No | No |
| Requires installation | Yes | Yes |
| Supported versions of Hadoop | 0.20+ | Currently unknown |
| Retry support | Workflow node level | Yes |
| Run arbitrary commands | Yes | Yes |
| Amazon EMR support | No | Currently unknown |

Comparison of Azkaban and Oozie

The following compares the two most popular schedulers in more detail. Apache Oozie is the better known of the two, but configuring a workflow means writing a lot of XML, and its code is relatively complex, which makes it hard to customize or extend. Compared with Azkaban, Oozie is a heavyweight task scheduling system: full-featured, but more complex to configure and use. If you can live without certain features, the lightweight scheduler Azkaban is a good candidate.

Comparison by function

Both can schedule Linux commands, MapReduce, Spark, Pig, Hive, Java programs, and script workflow tasks

Both can run workflow tasks on a schedule

Comparison by workflow definition

1. Azkaban uses properties files to define a workflow

2. Oozie uses XML files to define a workflow
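To make the key/value style concrete, here is a minimal Python sketch that renders and parses Azkaban-style job files. The job names (extract, load) and their commands are made up for illustration, and the helpers render_job and parse_job are hypothetical, not part of Azkaban. The keys type, command, and dependencies follow the usual Azkaban job file layout.

```python
def render_job(props):
    """Render a dict as the key=value text of an Azkaban-style .job file."""
    return "\n".join(f"{key}={value}" for key, value in props.items()) + "\n"


def parse_job(text):
    """Parse key=value lines back into a dict, skipping blanks and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


# Two hypothetical jobs: "load" explicitly depends on "extract".
jobs = {
    "extract": {"type": "command", "command": "echo extracting"},
    "load": {
        "type": "command",
        "command": "echo loading",
        "dependencies": "extract",
    },
}

# One text file per job, named <job>.job.
job_files = {f"{name}.job": render_job(props) for name, props in jobs.items()}

for filename, content in job_files.items():
    print(f"--- {filename}")
    print(content, end="")
```

The explicit dependency shows up as a single dependencies line in the downstream job, which is the whole workflow description on the Azkaban side. The equivalent Oozie definition would be an XML document instead.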

Comparison by workflow parameter passing

1. Azkaban supports passing parameters directly, such as ${input}

2. Oozie supports parameters and EL expressions, such as ${fs:dirSize(myInputDir)}
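As a rough illustration of direct parameter passing, the following Python sketch replaces ${name} placeholders with values from a dictionary. It is not the resolver of either tool; the function substitute and the sample command are assumptions made for this example. Anything that is not a plain name, such as an EL function call, is deliberately left untouched, since evaluating it would need an expression engine.

```python
import re

# Matches ${name} where name is a plain identifier (dots allowed).
_PARAM = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_.]*)\}")


def substitute(template, params):
    """Replace each ${name} with params[name]; unknown names stay as they are."""

    def replace(match):
        name = match.group(1)
        if name in params:
            return str(params[name])
        return match.group(0)

    return _PARAM.sub(replace, template)


# Hypothetical command line with one direct parameter.
command = substitute("hadoop fs -ls ${input}", {"input": "/data/raw"})
print(command)
```

Leaving unknown placeholders in place rather than raising an error is a design choice of this sketch. A real scheduler may treat an unresolved parameter as a failure.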

Comparison by timed execution

1. Azkaban triggers scheduled tasks based on time

2. Oozie triggers scheduled tasks based on time and on input data
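The difference between the two trigger styles can be sketched in a few lines of Python. This is only a conceptual model under stated assumptions: the function names, the input paths, and the idea of passing in a set of available inputs are all hypothetical and do not mirror either scheduler's internals.

```python
from datetime import datetime


def time_trigger_ready(now, scheduled_at):
    """Time-based trigger: run as soon as the scheduled time has passed."""
    return now >= scheduled_at


def time_and_data_trigger_ready(now, scheduled_at, required_inputs, available_inputs):
    """Time-and-data trigger: also wait until every required input exists."""
    return now >= scheduled_at and set(required_inputs) <= set(available_inputs)


scheduled_at = datetime(2017, 1, 1, 2, 0)
now = datetime(2017, 1, 1, 2, 5)

# Hypothetical input directories the workflow needs before it can start.
required = ["/data/2017-01-01/logs", "/data/2017-01-01/users"]
available = ["/data/2017-01-01/logs"]

print(time_trigger_ready(now, scheduled_at))
print(time_and_data_trigger_ready(now, scheduled_at, required, available))
```

With the sample values, the time-only trigger fires because the scheduled time has passed, while the time-and-data trigger keeps waiting because one input is still missing.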

Comparison by resource management

1. Azkaban has stricter permission control, for example over which users may read, write, or execute a workflow

2. Oozie currently has no strict permission control

Comparison by workflow execution

1. Azkaban has three modes of operation:

1.1. Solo server mode: the simplest mode. It uses the built-in H2 database, and the management server and the execution server run in a single process. Projects with a small task volume can use this mode.

1.2. Two server mode: the database is MySQL, and the management server and the execution server run in separate processes, so the two do not affect each other.

1.3. Multiple executor mode: the execution server and the management server run on different hosts, and there can be multiple execution servers.

This time I use the second mode: the management server and the execution server run as separate processes, but on the same host.

2. Oozie runs as a workflow server, supporting multiple users and multiple workflows

Comparison by workflow management

1. Azkaban supports operating workflows through the browser and AJAX calls

2. Oozie supports operating workflows through the command line, HTTP REST, the Java API, and the browser

Another take on the differences:

The two are roughly the same in functionality. The difference is that Oozie submits Hadoop and Spark jobs through interfaces wrapped by org.apache.hadoop, whereas Azkaban can run shell statements directly. Oozie may be the better choice in terms of security.

  Workflow definition : Oozie workflows are defined in XML, Azkaban workflows in properties files.

  Deployment process : Oozie is considerably harder to deploy. It also pulls task logs from YARN.

In Azkaban, a task is marked as successful as long as its process runs to completion, even if the task itself failed. This is a bug. Oozie, by contrast, reliably detects whether a task succeeded or failed.

  Operating workflows : Azkaban is operated through the web UI. Oozie supports web, REST API, and Java API operation.

  Permission control : Oozie has essentially no permission control. Azkaban has fairly complete permission control, covering which users may read, write, and execute a workflow.

Oozie's actions run primarily in Hadoop, while Azkaban's actions run on Azkaban's servers.

  Recording workflow state : Azkaban keeps workflow state in memory, while Oozie stores it in MySQL.

  When a failure occurs : Azkaban loses all workflows, but Oozie can continue running the failed workflow.
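Why the storage location matters after a failure can be shown with a small Python sketch. It is a generic illustration, not either scheduler's code: the two store classes are hypothetical, and sqlite3 stands in for an external database such as MySQL. The shared connection plays the role of a database server that outlives the scheduler process.

```python
import sqlite3


class InMemoryStateStore:
    """Keeps workflow state in a dict that dies with the scheduler process."""

    def __init__(self):
        self._states = {}

    def save(self, workflow_id, state):
        self._states[workflow_id] = state

    def load(self, workflow_id):
        return self._states.get(workflow_id)


class DatabaseStateStore:
    """Keeps workflow state in a database that outlives the scheduler process."""

    def __init__(self, connection):
        self._conn = connection
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS workflow_state "
            "(id TEXT PRIMARY KEY, state TEXT)"
        )

    def save(self, workflow_id, state):
        self._conn.execute(
            "INSERT OR REPLACE INTO workflow_state (id, state) VALUES (?, ?)",
            (workflow_id, state),
        )
        self._conn.commit()

    def load(self, workflow_id):
        row = self._conn.execute(
            "SELECT state FROM workflow_state WHERE id = ?", (workflow_id,)
        ).fetchone()
        return row[0] if row else None


# Stand-in for the external database server.
database = sqlite3.connect(":memory:")

memory_store = InMemoryStateStore()
db_store = DatabaseStateStore(database)
memory_store.save("wf-1", "RUNNING")
db_store.save("wf-1", "RUNNING")

# Simulate a scheduler restart: new store objects, same external database.
memory_store = InMemoryStateStore()
db_store = DatabaseStateStore(database)

print(memory_store.load("wf-1"))
print(db_store.load("wf-1"))
```

After the simulated restart, the in-memory store no longer knows the workflow, while the database-backed store still reports its last recorded state and so has something to resume from.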
