Ambari source analysis: how the DAG is built

Source: Internet
Author: User

I think one of the most interesting parts of Ambari is how it computes its DAG (directed acyclic graph, i.e. a directed graph with no cycles).

Let's briefly summarize how Ambari determines the execution order:

Based on the cluster's metadata, the Ambari server builds a DAG of stages when an operation is executed. The DAG defines an ordering between the stages, and the stages must run in that order. For each stage, the server then generates the DAG of that stage's commands and sends the commands to the Ambari agents. All commands of the first stage are executed, then those of the second stage, and so on.


Related Classes

There are many more details, but let's first get to know a few classes, ordered by how important they are for understanding the mechanism.

RoleGraphNode

The first class is RoleGraphNode, which can be seen as a vertex of the DAG:

import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class RoleGraphNode {
  public RoleGraphNode(Role role, RoleCommand command) {
    this.role = role;
    this.command = command;
  }
  private Role role;
  private RoleCommand command;
  private int inDegree = 0;
  private List<String> hosts = new ArrayList<String>();
  // Outgoing edges, keyed by the role name of the target node
  private Map<String, RoleGraphNode> edges = new TreeMap<String, RoleGraphNode>();
  public synchronized void addHost(String host) {
    hosts.add(host);
  }
  public synchronized void addEdge(RoleGraphNode rgn) {
    if (edges.containsKey(rgn.getRole().toString())) {
      return;
    }
    edges.put(rgn.getRole().toString(), rgn);
    rgn.incrementInDegree();
  }
  private synchronized void incrementInDegree() { inDegree++; }
  public Role getRole() { return role; }
  public RoleCommand getCommand() { return command; }
  public List<String> getHosts() { return hosts; }
  public int getInDegree() { return inDegree; }
  Collection<RoleGraphNode> getEdges() { return edges.values(); }
  public synchronized void decrementInDegree() { inDegree--; }
  @Override
  public String toString() {
    StringBuilder builder = new StringBuilder();
    builder.append("(" + role + ", " + command + ", " + inDegree + ")");
    return builder.toString();
  }
}

A DAG node is made up of two parts, a Role and a RoleCommand:

(1) Role: defines the components that Ambari can manage. Put simply, a role is a component.
(2) RoleCommand: an enumeration with the following values:

  INSTALL, UNINSTALL, START, STOP, EXECUTE, ABORT, UPGRADE, SERVICE_CHECK,
  /**
   * Represents any custom command
   */
  CUSTOM_COMMAND,
  /**
   * Represents any action
   */
  ACTIONEXECUTE

As mentioned earlier, stages must be executed in sequence, so how does the DAG determine the order?

With the data structure in place, the procedure is a layered topological sort. Since the graph is a directed acyclic graph, first find all nodes whose in-degree is 0; these nodes form the first stage. Then remove them and decrement by 1 the in-degree of every node that their outgoing edges point to. The nodes whose in-degree is now 0 form the second stage, and so on. This determines how many stages there are and the order in which they execute.
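
The layering procedure just described can be sketched in a few lines of plain Java. This is only an illustration under stated assumptions: StageLayering and layers are hypothetical names, nodes are plain strings rather than Ambari's RoleGraphNode objects, and the example edges are made up for the demonstration rather than taken from a real stack definition.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper, not an Ambari class: splits a DAG into ordered layers. */
class StageLayering {

  /**
   * @param edges adjacency list: node -> nodes that must run after it
   * @return layers in execution order; nodes in one layer have no ordering between them
   */
  static List<List<String>> layers(Map<String, List<String>> edges) {
    // Count the incoming edges of every node.
    Map<String, Integer> inDegree = new LinkedHashMap<>();
    for (String node : edges.keySet()) {
      inDegree.put(node, 0);
    }
    for (List<String> targets : edges.values()) {
      for (String target : targets) {
        inDegree.merge(target, 1, Integer::sum);
      }
    }

    List<List<String>> result = new ArrayList<>();
    while (!inDegree.isEmpty()) {
      // All nodes whose in-degree is 0 form the next layer.
      List<String> layer = new ArrayList<>();
      for (Map.Entry<String, Integer> e : inDegree.entrySet()) {
        if (e.getValue() == 0) {
          layer.add(e.getKey());
        }
      }
      if (layer.isEmpty()) {
        throw new IllegalStateException("cycle detected: the graph is not a DAG");
      }
      // Remove the layer, then decrement the in-degree of its successors.
      for (String node : layer) {
        inDegree.remove(node);
      }
      for (String node : layer) {
        for (String target : edges.getOrDefault(node, List.of())) {
          inDegree.merge(target, -1, Integer::sum);
        }
      }
      result.add(layer);
    }
    return result;
  }

  public static void main(String[] args) {
    // Made-up edges, for illustration only.
    Map<String, List<String>> edges = new LinkedHashMap<>();
    edges.put("ZOOKEEPER_SERVER", List.of("NAMENODE", "KAFKA_BROKER"));
    edges.put("NAMENODE", List.of("DATANODE"));
    edges.put("KAFKA_BROKER", List.of());
    edges.put("DATANODE", List.of());
    // prints [[ZOOKEEPER_SERVER], [NAMENODE, KAFKA_BROKER], [DATANODE]]
    System.out.println(layers(edges));
  }
}
```

Each inner list corresponds to one stage: everything in it can be dispatched together, and the next list only starts once the previous one is done.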

How is this reflected in the code?

Each RoleGraphNode can be represented as a (role, command, inDegree) triple, such as (DATANODE, INSTALL, 0).
Each RoleGraphNode also holds its outgoing edges (Map<String, RoleGraphNode> edges).

RoleCommandOrder

The second class is RoleCommandOrder, which can be understood as the set of rules used to generate the DAG.

This class establishes the order between two roles.

The ordering function is implemented as follows:

  /**
   * Returns the dependency order. -1 => rgn1 before rgn2, 0 => they can be
   * parallel, 1 => rgn2 before rgn1
   *
   * @param rgn1 roleGraphNode1
   * @param rgn2 roleGraphNode2
   */
  public int order(RoleGraphNode rgn1, RoleGraphNode rgn2) {
    RoleCommandPair rcp1 = new RoleCommandPair(rgn1.getRole(), rgn1.getCommand());
    RoleCommandPair rcp2 = new RoleCommandPair(rgn2.getRole(), rgn2.getCommand());
    if ((this.dependencies.get(rcp1) != null) && (this.dependencies.get(rcp1).contains(rcp2))) {
      return 1;
    } else if ((this.dependencies.get(rcp2) != null) && (this.dependencies.get(rcp2).contains(rcp1))) {
      return -1;
    } else if (!rgn2.getCommand().equals(rgn1.getCommand())) {
      return compareCommands(rgn1, rgn2);
    }
    return 0;
  }

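Given two nodes, node1 and node2:
1. If node1 -> node2 (node1 must run before node2), return -1;
2. If node1 and node2 have no ordering relationship, return 0;
3. Conversely, if node2 -> node1, return 1.

When no explicit dependency applies and the two commands differ, which one goes first depends on the command type:
INSTALL -> START -> EXECUTE -> SERVICE_CHECK -> STOP
Note that the code tests START, EXECUTE and SERVICE_CHECK in a single branch, so when both commands fall in that group the node passed first is ordered first.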

  private int compareCommands(RoleGraphNode rgn1, RoleGraphNode rgn2) {
    // TODO: add proper order comparison support for RoleCommand.ACTIONEXECUTE
    RoleCommand rc1 = rgn1.getCommand();
    RoleCommand rc2 = rgn2.getCommand();
    if (rc1.equals(rc2)) {
      // Reaching here means the roles have no dependencies.
      return 0;
    }
    if (independentCommands.contains(rc1) && independentCommands.contains(rc2)) {
      return 0;
    }
    if (rc1.equals(RoleCommand.INSTALL)) {
      return -1;
    } else if (rc2.equals(RoleCommand.INSTALL)) {
      return 1;
    } else if (rc1.equals(RoleCommand.START) || rc1.equals(RoleCommand.EXECUTE) || rc1.equals(RoleCommand.SERVICE_CHECK)) {
      return -1;
    } else if (rc2.equals(RoleCommand.START) || rc2.equals(RoleCommand.EXECUTE) || rc2.equals(RoleCommand.SERVICE_CHECK)) {
      return 1;
    } else if (rc1.equals(RoleCommand.STOP)) {
      return -1;
    } else if (rc2.equals(RoleCommand.STOP)) {
      return 1;
    }
    return 0;
  }
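
To see how a pairwise ordering function turns into graph edges, one option is to compare every pair of nodes and add an edge from the earlier node to the later one. The sketch below assumes simplified stand-ins: EdgeBuilderSketch, Cmd, Node, tier and buildEdges are hypothetical names, not the Ambari classes, and tier is a deliberate simplification of compareCommands that treats START, EXECUTE and SERVICE_CHECK as unordered among themselves and ignores explicit dependencies.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified stand-ins for illustration; these are not the Ambari classes. */
class EdgeBuilderSketch {

  enum Cmd { INSTALL, START, EXECUTE, SERVICE_CHECK, STOP }

  static final class Node {
    final String role;
    final Cmd command;
    final List<Node> edges = new ArrayList<>();
    int inDegree = 0;

    Node(String role, Cmd command) {
      this.role = role;
      this.command = command;
    }

    /** Adds an edge this -> later and bumps the in-degree of the target. */
    void addEdge(Node later) {
      edges.add(later);
      later.inDegree++;
    }
  }

  /** Tier of a command: lower tiers run first (an assumed simplification). */
  static int tier(Cmd c) {
    switch (c) {
      case INSTALL: return 0;
      case STOP: return 2;
      default: return 1; // START, EXECUTE, SERVICE_CHECK
    }
  }

  /** Negative => a before b, 0 => no constraint, positive => b before a. */
  static int order(Node a, Node b) {
    return Integer.compare(tier(a.command), tier(b.command));
  }

  /** Compares each unordered pair once and adds an edge from earlier to later. */
  static void buildEdges(List<Node> nodes) {
    for (int i = 0; i < nodes.size(); i++) {
      for (int j = i + 1; j < nodes.size(); j++) {
        Node a = nodes.get(i);
        Node b = nodes.get(j);
        int o = order(a, b);
        if (o < 0) {
          a.addEdge(b);
        } else if (o > 0) {
          b.addEdge(a);
        }
      }
    }
  }

  public static void main(String[] args) {
    Node install = new Node("DATANODE", Cmd.INSTALL);
    Node start = new Node("DATANODE", Cmd.START);
    Node check = new Node("HDFS_SERVICE_CHECK", Cmd.SERVICE_CHECK);
    buildEdges(List.of(install, start, check));
    // prints 0 1 1
    System.out.println(install.inDegree + " " + start.inDegree + " " + check.inDegree);
  }
}
```

After buildEdges, the nodes with in-degree 0 are exactly the ones that can go into the first stage.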

One point to be clear about here:
Besides the rules above, another kind of rule comes from configuration, such as the rules given in the role_command_order.json file. These rules usually specify the ordering between different components. The components can belong to the same service, for example:

"HIVE_SERVER-START": [
  "HIVE_METADATA_DATABASE-START"
],

They can also belong to different services, for example:

"KAFKA_MANAGER-START": [
  "ZOOKEEPER_SERVER-START"
],
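
Rules of this shape map naturally onto the dependencies lookup used by order(): the key is the blocked role-command pair and the values are the pairs that must run first. The following sketch shows that lookup with plain strings. RuleLookupSketch and its members are hypothetical names, the map is filled by hand rather than parsed from the JSON file, and real Ambari uses RoleCommandPair objects instead of strings.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

/** Illustrative only: a string-keyed version of the explicit-rule lookup. */
class RuleLookupSketch {

  // Key: the blocked "ROLE-COMMAND"; value: the pairs that must run first.
  static final Map<String, Set<String>> DEPENDENCIES = new HashMap<>();
  static {
    DEPENDENCIES.put("HIVE_SERVER-START", Set.of("HIVE_METADATA_DATABASE-START"));
    DEPENDENCIES.put("KAFKA_MANAGER-START", Set.of("ZOOKEEPER_SERVER-START"));
  }

  /** -1 => first before second, 0 => no explicit rule, 1 => second before first. */
  static int order(String first, String second) {
    Set<String> blockersOfFirst = DEPENDENCIES.get(first);
    if (blockersOfFirst != null && blockersOfFirst.contains(second)) {
      return 1;
    }
    Set<String> blockersOfSecond = DEPENDENCIES.get(second);
    if (blockersOfSecond != null && blockersOfSecond.contains(first)) {
      return -1;
    }
    return 0;
  }

  public static void main(String[] args) {
    // prints 1: the metadata database must start before the Hive server
    System.out.println(order("HIVE_SERVER-START", "HIVE_METADATA_DATABASE-START"));
    // prints -1: same rule, arguments swapped
    System.out.println(order("ZOOKEEPER_SERVER-START", "KAFKA_MANAGER-START"));
    // prints 0: no explicit rule relates these two pairs
    System.out.println(order("HIVE_SERVER-START", "KAFKA_MANAGER-START"));
  }
}
```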

Another example is a service's metainfo.xml. The rules in this file specify the dependencies between services:

<requiredServices>
    <service>YARN</service>
    <service>HIVE</service>
    <service>HDFS</service>
</requiredServices>


RoleGraph

The third class is RoleGraph, which divides the given nodes into stages. It provides the following methods:

(1) public void build(Stage stage)
Given a stage, builds the DAG.

(2) public List<Stage> getStages()
Given the nodes, builds the stages.

(3) private synchronized void removeZeroInDegreeNode(String role)
Removes a node whose in-degree is 0.

(4) private Stage getStageFromGraphNodes(Stage origStage, List<RoleGraphNode> stageGraphNodes)
Given a stage and some nodes, constructs a new stage.

(5) public String stringifyGraph()
Returns a string representation of the DAG.

One point to be clear about: there are two kinds of DAG.
1. The DAG between stages: built from the nodes, it determines the order of the different stages.
2. The DAG inside a stage: built from the nodes within a given stage, it is the task DAG that orders the different commands.


Call Relationship

This section will describe the call relationships: how these classes are tied together and how they are invoked by other classes.
To be updated ...


The next article will introduce Ambari's scheduling ...
