For more than half a year I have been doing Hive-related development, using Oozie as the workflow engine that manages our Hadoop jobs. An Oozie job is described by two kinds of definitions: the coordinator and the workflow. The workflow describes the order in which tasks are executed, and the coordinator defines when Oozie runs them on a schedule.

A workflow defines two kinds of nodes:
Control flow nodes: mainly start, end, fork, and join. fork and join appear in pairs: execution branches out at the fork and converges again at the join.
Action nodes: Hadoop tasks, SSH, HTTP, email, Oozie sub-workflows, and more.

workflow.xml configures the actions of a workflow. When a job has many steps the file is hard to read, and when it contains concurrent branches it is harder still. I have recently been optimizing old Hadoop jobs. Many jobs are involved, most of them written by colleagues, and working out each process by reading workflow.xml is tedious. So in my spare time I wrote a workflow.xml parsing tool: enter the name of a job and it displays the job's flowchart.

Two main problems need to be solved:
Reading and parsing the job's workflow.xml file.
Drawing the node view.

XML parsing is easily handled with the dom4j package, and the workflow files can be read directly from the SVN trunk. Because I am familiar with Java EE web development, I decided to display the node view in a web page. On the web I found Raphael, a JavaScript drawing library for rendering vector graphics in a page.

With the design settled, I started coding. The whole process is as follows:
1. Download the job code from SVN. New code is merged from branches into the SVN trunk, so a timer is needed to run a full sync of the SVN code every day.
2. Parse workflow.xml with dom4j and abstract it into node objects. This is the view data preparation stage.
3. Use the FreeMarker template engine to render the view data to the client. The client draws the nodes from the node data.

Job Search
To simplify things for the user, the page provides a job search built on jQuery's autocomplete: when the user types a job keyword, the tool runs a fuzzy search over the jobs in SVN and lists the matching jobs, so the user does not have to type the full name.
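The server side of the autocomplete box can be very small. The following is a minimal sketch, assuming the job names have already been collected from the synced SVN checkout into a list. The class JobSearchSketch, the method suggest, and the sample job names are hypothetical and not part of the original tool. It treats "fuzzy" as a case-insensitive substring match, which is only one simple interpretation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

/** Hypothetical helper behind the autocomplete box (not from the original tool). */
class JobSearchSketch {

    /** Returns up to {@code limit} job names that contain the keyword, ignoring case. */
    static List<String> suggest(List<String> jobNames, String keyword, int limit) {
        List<String> matches = new ArrayList<>();
        if (keyword == null || keyword.trim().isEmpty()) {
            return matches; // nothing typed yet, so nothing to suggest
        }
        String needle = keyword.trim().toLowerCase(Locale.ROOT);
        for (String name : jobNames) {
            if (matches.size() >= limit) {
                break; // keep the suggestion list short
            }
            if (name.toLowerCase(Locale.ROOT).contains(needle)) {
                matches.add(name);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        // Sample job names are made up for illustration.
        List<String> jobs = Arrays.asList("daily_user_report", "hourly_click_etl", "daily_order_etl");
        System.out.println(suggest(jobs, "ETL", 10));
    }
}
```

The web layer would return this list as JSON to the autocomplete widget. Capping the result with limit keeps the response small when a short keyword matches many jobs.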
Workflow Data Parsing
The core coding work is the second step, parsing the XML file. With dom4j it is easy to parse workflow.xml into a linked list of nodes, where each node holds the list of nodes to be executed next. The difficulty is converting this linked-list data into display coordinates for the view and sending them to the client. Take 50px as one coordinate unit. Each node is drawn as a border only, with the node name shown inside the box, and one unit of space is left between a node and the nodes before and after it. My coordinate conversion algorithm works as follows: the height of a node is one unit, and its width depends on the length of the node name (5 characters per unit). A recursive algorithm traverses the node list from the start node to the end node. For a given node, call the node executed in the previous step its front node, and the node executed next its back node.
x of a node = max(x of the front node + width of the front node + 1), taken over all of its front nodes
y of a node = max(y of the front node), taken over all of its front nodes
Because the algorithm is recursive, adding a new node to the view can change the number of front or back nodes of nodes already placed, so their x and y coordinates then have to be readjusted.
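The horizontal rule can be tried out in isolation. The following is a minimal sketch, not the tool's actual code: the class LayoutSketch, its methods, and the sample node names are hypothetical, the graph is assumed to be acyclic, and rounding the width up to whole units is an assumption (the text only says 5 characters per unit). It covers only the x coordinate. A node is pushed to the right whenever a front node demands a larger x, and the change is then propagated to its back nodes.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of the x-coordinate rule (not from the original tool). */
class LayoutSketch {

    static final int CHARS_PER_UNIT = 5;

    /** Width in grid units: 5 characters per unit, rounded up (rounding is an assumption). */
    static int widthUnits(String name) {
        return Math.max(1, (name.length() + CHARS_PER_UNIT - 1) / CHARS_PER_UNIT);
    }

    /**
     * Assigns an x coordinate to every node reachable from {@code start}.
     * {@code edges} maps a node name to the names of its back nodes.
     * The graph must be acyclic, otherwise the recursion does not terminate.
     */
    static Map<String, Integer> assignX(Map<String, List<String>> edges, String start) {
        Map<String, Integer> x = new LinkedHashMap<>();
        place(start, 0, edges, x);
        return x;
    }

    private static void place(String node, int candidateX,
                              Map<String, List<String>> edges, Map<String, Integer> x) {
        Integer current = x.get(node);
        if (current != null && current >= candidateX) {
            return; // another front node already pushed this node at least as far right
        }
        x.put(node, candidateX);
        // x of a back node = x of this node + width of this node + 1 unit of spacing
        int nextX = candidateX + widthUnits(node) + 1;
        for (String next : edges.getOrDefault(node, Collections.<String>emptyList())) {
            place(next, nextX, edges, x);
        }
    }
}
```

For a workflow where start forks into import-data and clean, both of which join into merge before end, the sketch places start at 0, both branches at 2, merge at 6 (the wider import-data branch wins the max), and end at 8.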
Code Implementation
The core code is as follows:
OozieHelper.java
package com.lxr.oozie.workflow;

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

import org.dom4j.Document;
import org.dom4j.Element;

// Converts a parsed workflow.xml into a list of WorkflowNode objects with
// grid coordinates. The WorkflowNode class is not shown in this post.
public class OozieHelper {

    private Document document;
    private Element root = null;
    private List<WorkflowNode> workflowNodes = new ArrayList<WorkflowNode>();
    private Map<String, WorkflowNode> workflowNodeMap = new HashMap<String, WorkflowNode>();

    public OozieHelper(Document document) {
        this.document = document;
    }

    // First child of the root with the given tag name.
    private Element getNode(String tag) {
        return root.element(tag);
    }

    // First child of the root whose attribute attr equals value.
    private Element getNode(String attr, String value) {
        List<Element> nodes = root.elements();
        for (Element el : nodes) {
            if (value.equals(el.attributeValue(attr))) {
                return el;
            }
        }
        return null;
    }

    private Element getNodeByName(String name) {
        return getNode("name", name);
    }

    // An action continues at <ok to="...">; other nodes carry "to" themselves.
    private String getNextNodeName(Element node) {
        String tagName = node.getName();
        if ("action".equals(tagName)) {
            Element element = node.element("ok");
            return null == element ? null : element.attributeValue("to");
        } else {
            return node.attributeValue("to");
        }
    }

    // Resolves the nodes executed after the given node. A fork is replaced by
    // the start node of each of its paths, a join by the node it leads to.
    private List<Element> getNextNodes(Element node) {
        List<Element> nextNodes = new ArrayList<Element>();
        if (null == node) {
            return nextNodes;
        }
        String nextNodeName = getNextNodeName(node);
        if (null != nextNodeName) {
            Element nextNode = getNodeByName(nextNodeName);
            if (null == nextNode) {
                return nextNodes; // transition points at an unknown node
            }
            String tagName = nextNode.getName();
            if ("fork".equalsIgnoreCase(tagName)) {
                List<Element> elements = nextNode.elements();
                for (Element el : elements) {
                    nextNodeName = el.attributeValue("start");
                    if (null != nextNodeName) {
                        nextNode = getNodeByName(nextNodeName);
                        if (null != nextNode) {
                            nextNodes.add(nextNode);
                        }
                    }
                }
            } else if ("join".equals(tagName)) {
                nextNode = getNodeByName(nextNode.attributeValue("to"));
                if (null != nextNode) {
                    nextNodes.add(nextNode);
                }
            } else {
                nextNodes.add(nextNode);
            }
        }
        return nextNodes;
    }

    // Adds the node to the result list unless it is already there.
    private void adjustToAddNode(WorkflowNode workflowNode) {
        for (WorkflowNode node : workflowNodes) {
            if (node.equals(workflowNode)) {
                return;
            }
        }
        workflowNodes.add(workflowNode);
    }

    // Recursively creates and positions the nodes that follow parent.
    private void genNextWorkflowNodes(WorkflowNode parent) {
        List<Element> nextNodes = getNextNodes(parent.getElement());
        for (int i = 0, len = nextNodes.size(); i < len; i++) {
            Element el = nextNodes.get(i);
            String nodeName = el.attributeValue("name");
            WorkflowNode subWorkflowNode = workflowNodeMap.get(nodeName);
            int subX = parent.getX() + parent.getLength() + 1;
            int subY = parent.getY() + 2;
            if (null == subWorkflowNode) {
                // First visit: create the node and lay out everything after it.
                subWorkflowNode = new WorkflowNode(el);
                subWorkflowNode.setName(nodeName);
                subWorkflowNode.setX(subX);
                subWorkflowNode.setY(subY);
                genNextWorkflowNodes(subWorkflowNode);
                adjustToAddNode(subWorkflowNode);
            } else {
                // Reached again from another front node: keep the larger
                // coordinates and shift the following nodes down if needed.
                subWorkflowNode.setX(Math.max(subX, subWorkflowNode.getX()));
                if (subY > subWorkflowNode.getY()) {
                    subWorkflowNode.adjustNextNodesY(subY - subWorkflowNode.getY());
                    subWorkflowNode.setY(subY);
                }
            }
            subWorkflowNode.previousNodes().add(parent);
            parent.nextNodes().add(subWorkflowNode);
            workflowNodeMap.put(nodeName, subWorkflowNode);
        }
    }

    public List<WorkflowNode> parse() {
        root = document.getRootElement();
        Element startNode = getNode("start");
        if (null != startNode) {
            WorkflowNode start = new WorkflowNode(startNode);
            start.setName("start");
            start.setX(0);
            start.setY(0);
            workflowNodes.add(start);
            genNextWorkflowNodes(start);
        } else {
            System.out.println("The start node was not found.");
        }
        return workflowNodes;
    }
}
view-workflow.html
<!DOCTYPE html>
oozie.wf.draw.js
// Helpers that read an attribute, optionally as an integer.
jQuery.fn.attrn = function (attr) {
    return parseInt($(this).attr(attr), 10);
};
HTMLDivElement.prototype.$$ = function (attr) {
    return $(this).attr(attr);
};
HTMLDivElement.prototype.$$n = function (attr) {
    return parseInt(this.$$(attr), 10);
};

$(function () {
    var $holder = $('.workflow-holder');
    var $nodes = $holder.find('.workflow-node');
    var nodes = [];
    $nodes.each(function (i, el) {
        nodes[i] = {
            id: $(el).attr('id'),
            index: i,
            xx: el.$$n('xx'),
            yy: el.$$n('yy'),
            length: el.$$n('length'),
            $instance: $(el)
        };
    });
    var workflowCfg = {
        id: $holder.attr('id'),
        margin: {
            left: $holder.attrn('margin-left'),
            top: $holder.attrn('margin-top')
        },
        grid: {
            paddingLeft: $holder.attrn('grid-padding-left'),
            paddingTop: $holder.attrn('grid-padding-top'),
            width: $holder.attrn('grid-width'),
            height: $holder.attrn('grid-height')
        },
        nodes: nodes,
        nodesMap: (function () {
            var map = {};
            for (var i = 0; i < nodes.length; i++) {
                var node = nodes[i];
                map[node.$instance.attr('id')] = node;
            }
            return map;
        })()
    };
    console.log(workflowCfg);

    // Stores the connections between nodes
    var connections = [];

    // Handler for the start of a node drag
    var dragger = function () {
        this.ox = this.attr('x');
        this.oy = this.attr('y');
        this.animate({'fill-opacity': .2}, 500);
    };

    // Handler while dragging: move the box, its label, and redraw the arrows
    var move = function (dx, dy) {
        var att = {x: this.ox + dx, y: this.oy + dy};
        this.attr(att);
        $holder.find("#" + this.id).offset({
            top: this.oy + dy + workflowCfg.grid.paddingTop,
            left: this.ox + dx + workflowCfg.grid.paddingLeft
        });
        for (var i = connections.length; i--;) {
            r.drawArr(connections[i]);
        }
    };

    // Handler for the end of a drag
    var up = function () {
        this.animate({'fill-opacity': 0}, 500);
    };

    // Create the drawing object
    var r = Raphael(workflowCfg.id, $(window).width(), $(window).height());

    // Draw the nodes
    var shapes = [];
    var maxRight = 0;
    for (var i = 0, len = workflowCfg.nodes.length; i < len; i++) {
        var node = workflowCfg.nodes[i];
        node.left = workflowCfg.margin.left + node.xx * workflowCfg.grid.width;
        node.top = workflowCfg.margin.top + node.yy * workflowCfg.grid.height;
        node.width = workflowCfg.grid.width * node.length;
        node.height = workflowCfg.grid.height;
        shapes[i] = r.rect(node.left, node.top, node.width, node.height, 4);
        // Position the text on the node
        node.$instance.offset({
            top: node.top + workflowCfg.grid.paddingTop,
            left: node.left + workflowCfg.grid.paddingLeft
        });
        var right = node.$instance.offset().left + node.width;
        maxRight = maxRight > right ? maxRight : right;
    }

    // Widen the SVG canvas if the nodes extend past its right edge
    var $svg = $holder.find('svg');
    var svgWidth = maxRight + workflowCfg.grid.paddingLeft;
    var _svgWidth = $svg.attrn('width');
    $svg.attr('width', svgWidth > _svgWidth ? svgWidth : _svgWidth);

    // Add styles and events to the nodes
    for (var i = 0, ii = shapes.length; i < ii; i++) {
        var color = Raphael.getColor();
        shapes[i].attr({fill: color, stroke: color, 'fill-opacity': 0, 'stroke-width': 2, cursor: 'move'});
        shapes[i].id = workflowCfg.nodes[i].id;
        shapes[i].drag(move, dragger, up);
        shapes[i].dblclick(function () {
            alert(this.id);
        });
    }

    // Connect the nodes. drawArr is a custom Raphael extension that draws an
    // arrow between two shapes; its definition is not shown here.
    for (var i = 0; i < workflowCfg.nodes.length; i++) {
        var node = workflowCfg.nodes[i];
        var nextNodeIds = node.$instance.attr('next-node');
        if (nextNodeIds) {
            var nextNodeIdArr = nextNodeIds.split(',');
            for (var j = 0; j < nextNodeIdArr.length; j++) {
                var nextNodeId = nextNodeIdArr[j];
                var nextNode = workflowCfg.nodesMap[nextNodeId];
                connections.push(r.drawArr({obj1: shapes[node.index], obj2: shapes[nextNode.index]}));
            }
        }
    }
});
The results are as follows: