JStorm and Storm Source Analysis (IV) -- The Even Scheduler, EvenScheduler

Source: Internet
Author: User

EvenScheduler, like DefaultScheduler, implements the IScheduler interface,
as the following code shows:

    (ns backtype.storm.scheduler.EvenScheduler
      (:use [backtype.storm util log config])
      (:require [clojure.set :as set])
      (:import [backtype.storm.scheduler IScheduler Topologies
                Cluster TopologyDetails WorkerSlot ExecutorDetails])
      (:gen-class
        :implements [backtype.storm.scheduler.IScheduler]))

EvenScheduler is a scheduler that distributes resources evenly:

    (defn -prepare [this conf]
      )

    (defn -schedule [this ^Topologies topologies ^Cluster cluster]
      (schedule-topologies-evenly topologies cluster))

Task assignment is carried out by calling the schedule-topologies-evenly function,
which is defined as follows:

    (defn schedule-topologies-evenly [^Topologies topologies ^Cluster cluster]
      ;; Call the needsSchedulingTopologies method of the cluster object to obtain
      ;; all topologies that need task scheduling.
      ;; needsSchedulingTopologies is defined in FN1.
      ;; The criteria for deciding whether a topology needs scheduling are described in FN2.
      (let [needs-scheduling-topologies (.needsSchedulingTopologies cluster topologies)]
        (doseq [^TopologyDetails topology needs-scheduling-topologies
                ;; For each topology that needs scheduling, first get its topology-id.
                :let [topology-id (.getId topology)
                      ;; Call schedule-topology to compute new-assignment,
                      ;; a map of type <executor, node+port>.
                      ;; schedule-topology is defined in FN3.
                      new-assignment (schedule-topology topology cluster)
                      ;; Invert the keys and values of new-assignment to get
                      ;; a map of type <node+port, executors>.
                      node+port->executors (reverse-map new-assignment)]]
          ;; For each entry of the <node+port, executors> map, do the following.
          (doseq [[node+port executors] node+port->executors
                  ;; Construct a WorkerSlot object from the node and port and use it as the slot.
                  :let [^WorkerSlot slot (WorkerSlot. (first node+port) (last node+port))
                        ;; For each item in executors, construct an ExecutorDetails object
                        ;; and rebind executors to the resulting collection.
                        executors (for [[start-task end-task] executors]
                                    (ExecutorDetails. start-task end-task))]]
            ;; Call the cluster's assign method to assign the computed slot
            ;; to the topology's executors.
            (.assign cluster slot topology-id executors)))))
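The inversion performed by reverse-map in the listing above can be illustrated with a small standalone sketch. This Java is not Storm's implementation: the class name ReverseMapSketch and the use of plain strings for executors and slots are assumptions made only for illustration. It groups the keys of an <executor, slot> map under the slot they share.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Storm: mirrors conceptually what reverse-map does.
final class ReverseMapSketch {
    // Inverts key -> value into value -> list of keys, preserving insertion order.
    static <K, V> Map<V, List<K>> reverseMap(Map<K, V> input) {
        Map<V, List<K>> result = new LinkedHashMap<>();
        for (Map.Entry<K, V> entry : input.entrySet()) {
            result.computeIfAbsent(entry.getValue(), v -> new ArrayList<>())
                  .add(entry.getKey());
        }
        return result;
    }
}
```

Given {e1 -> s1, e2 -> s2, e3 -> s1}, the sketch yields {s1 -> [e1, e3], s2 -> [e2]}, which is the shape the inner doseq iterates over.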

FN1:

    /**
     * Gets all the topologies that need to be scheduled and returns them as a list.
     */
    public List<TopologyDetails> needsSchedulingTopologies(Topologies topologies) {
        List<TopologyDetails> ret = new ArrayList<TopologyDetails>();
        for (TopologyDetails topology : topologies.getTopologies()) {
            if (needsScheduling(topology)) {
                ret.add(topology);
            }
        }
        return ret;
    }

FN2:

    /**
     * Whether a topology needs task scheduling is determined by two conditions:
     * 1. The numWorkers configured for the topology is greater than the number
     *    of workers already assigned to it.
     * 2. The number of the topology's executors that have not been assigned
     *    is greater than 0.
     */
    public boolean needsScheduling(TopologyDetails topology) {
        int desiredNumWorkers = topology.getNumWorkers();
        int assignedNumWorkers = this.getAssignedNumWorkers(topology);
        if (desiredNumWorkers > assignedNumWorkers) {
            return true;
        }
        return this.getUnassignedExecutors(topology).size() > 0;
    }
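To make the two conditions concrete, here is a minimal sketch that applies the same decision to plain numbers. The class NeedsSchedulingSketch and its three integer parameters are hypothetical stand-ins for the values FN2 reads from the Storm objects; nothing here is Storm API.

```java
// Hypothetical stand-in for the check in FN2, using plain numbers instead of Storm objects.
final class NeedsSchedulingSketch {
    static boolean needsScheduling(int desiredNumWorkers,
                                   int assignedNumWorkers,
                                   int unassignedExecutors) {
        // Condition 1: fewer workers assigned than the topology asks for.
        if (desiredNumWorkers > assignedNumWorkers) {
            return true;
        }
        // Condition 2: at least one executor still has no slot.
        return unassignedExecutors > 0;
    }
}
```

For example, a topology that wants 4 workers and has 2 needs scheduling even with no unassigned executors, while one that has all its workers and no unassigned executors does not.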

FN3:

    ;; Assigns tasks to the topology based on the resources currently available in the cluster.
    (defn- schedule-topology [^TopologyDetails topology ^Cluster cluster]
      ;; Get the topology-id.
      (let [topology-id (.getId topology)
            ;; Call the cluster's getAvailableSlots method to obtain the slots currently
            ;; available in the cluster, convert them to a collection of [node port] pairs,
            ;; and bind it to available-slots.
            ;; getAvailableSlots is mainly responsible for finding the supervisor ports
            ;; that are not in use in the current cluster.
            available-slots (->> (.getAvailableSlots cluster)
                                 (map #(vector (.getNodeId %) (.getPort %))))
            ;; Call getExecutors to get all executors of the topology, convert them to a
            ;; set of [start-task-id end-task-id] pairs, and bind it to all-executors.
            all-executors (->> topology
                               .getExecutors
                               (map #(vector (.getStartTask %) (.getEndTask %)))
                               set)
            ;; Call get-alive-assigned-node+port->executors (defined in FN3_1) to compute
            ;; the resources already allocated to the topology. It takes the cluster and
            ;; the topology-id and returns a <node+port, executors> map, bound to alive-assigned.
            alive-assigned (get-alive-assigned-node+port->executors cluster topology-id)
            ;; Compute the number of slots the topology may use and bind it to
            ;; total-slots-to-use. It is the smaller of:
            ;; 1. the number of workers configured for the topology;
            ;; 2. the number of available-slots plus the number of alive-assigned slots.
            total-slots-to-use (min (.getNumWorkers topology)
                                    (+ (count available-slots) (count alive-assigned)))
            ;; Sort available-slots and compute the number of slots still to be allocated
            ;; (total-slots-to-use minus the number of alive-assigned slots), then take that
            ;; many slots in order from the sorted collection and bind them to reassign-slots.
            reassign-slots (take (- total-slots-to-use (count alive-assigned))
                                 (sort-slots available-slots))
            ;; Take the difference between all-executors and the executors already assigned
            ;; to get the executors that still need a slot, sorted, as reassign-executors.
            reassign-executors (sort (set/difference all-executors
                                                     (set (apply concat (vals alive-assigned)))))
            ;; Pair reassign-executors with reassign-slots and convert the result to an
            ;; <executor, slot> map bound to reassignment. There are two cases:
            ;; 1. Fewer reassign-executors than reassign-slots: the cluster has spare resources.
            ;;    E.g. reassign-executors is (e1 e2 e3) and reassign-slots is (s1 s2 s3 s4 s5);
            ;;    the result is {[e1 s1] [e2 s2] [e3 s3]}.
            ;; 2. More reassign-executors than reassign-slots: the cluster's available
            ;;    resources are limited, so several executors share one slot.
            ;;    E.g. reassign-executors is (e1 e2 e3 e4 e5 e6) and reassign-slots is (s1 s2);
            ;;    the slots are reused in order, giving
            ;;    {[e1 s1] [e2 s2] [e3 s1] [e4 s2] [e5 s1] [e6 s2]}.
            reassignment (into {}
                               (map vector
                                    reassign-executors
                                    ;; for some reason it goes into infinite loop without limiting the repeat-seq
                                    (repeat-seq (count reassign-executors) reassign-slots)))]
        ;; If reassignment is not empty, log the available slots.
        (when-not (empty? reassignment)
          (log-message "Available slots: " (pr-str available-slots)))
        ;; Return reassignment, a map of type <executor, [node port]>.
        reassignment))
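The arithmetic and the executor-to-slot pairing in FN3 can be sketched outside Storm as follows. This is an illustration under stated assumptions, not the scheduler's code: EvenAssignSketch, slotsToReassign and assignRoundRobin are hypothetical names, executors and slots are plain strings, and repeat-seq is assumed to simply repeat the slot list in order so that slots are reused round-robin.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the slot arithmetic and pairing described in FN3.
final class EvenAssignSketch {
    // min(numWorkers, available + alive) - alive, clamped at zero because
    // taking a negative number of slots yields no slots at all.
    static int slotsToReassign(int numWorkers, int availableSlots, int aliveAssigned) {
        int totalSlotsToUse = Math.min(numWorkers, availableSlots + aliveAssigned);
        return Math.max(0, totalSlotsToUse - aliveAssigned);
    }

    // Pairs executors with slots in order, cycling through the slots when
    // there are more executors than slots. No slots means no assignment.
    static Map<String, String> assignRoundRobin(List<String> executors, List<String> slots) {
        Map<String, String> assignment = new LinkedHashMap<>();
        if (slots.isEmpty()) {
            return assignment;
        }
        for (int i = 0; i < executors.size(); i++) {
            assignment.put(executors.get(i), slots.get(i % slots.size()));
        }
        return assignment;
    }
}
```

With 4 configured workers, 3 free slots and 1 slot already in use, slotsToReassign returns 3. With six executors and two slots, assignRoundRobin alternates between the two slots, so each slot ends up with three executors.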

FN3_1:

    ;; Gets the resources currently allocated to the topology.
    (defn get-alive-assigned-node+port->executors [cluster topology-id]
      ;; Call the cluster's getAssignmentById to get the topology's current assignment.
      (let [existing-assignment (.getAssignmentById cluster topology-id)
            ;; If the current assignment is not null, get its <executor, slot> information;
            ;; otherwise use an empty map.
            executor->slot (if existing-assignment
                             (.getExecutorToSlot existing-assignment)
                             {})
            ;; Convert the <executor, slot> map to an <executor, [node port]> map.
            executor->node+port (into {} (for [[^ExecutorDetails executor ^WorkerSlot slot] executor->slot
                                               :let [executor [(.getStartTask executor) (.getEndTask executor)]
                                                     node+port [(.getNodeId slot) (.getPort slot)]]]
                                           {executor node+port}))
            ;; Invert the <executor, [node port]> map into a <[node port], executors> map.
            alive-assigned (reverse-map executor->node+port)]
        ;; Return the resulting <[node port], executors> map.
        alive-assigned))
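The map returned by get-alive-assigned-node+port->executors is what schedule-topology (FN3) subtracts from the full executor set to find the executors that still need a slot. The sketch below illustrates that subtraction. UnassignedExecutorsSketch is a hypothetical class, and executors and slots are represented as plain strings rather than [start-task end-task] and [node port] pairs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: all executors minus those already listed under some slot.
final class UnassignedExecutorsSketch {
    static List<String> unassigned(Set<String> allExecutors,
                                   Map<String, List<String>> aliveAssigned) {
        // TreeSet keeps the remaining executors sorted, like the sort call in FN3.
        TreeSet<String> remaining = new TreeSet<>(allExecutors);
        for (List<String> assigned : aliveAssigned.values()) {
            remaining.removeAll(assigned);
        }
        return new ArrayList<>(remaining);
    }
}
```

If the topology has executors e1 to e4 and slot s1 already holds e1 and e3, the sketch returns [e2, e4], the executors that would be paired with newly taken slots.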

Note: These notes were compiled while studying "Storm Source Code Analysis" by Li Ming et al. and "Storm Technology Insider and Big Data Practice" by Chen Minmin et al.