How does Storm assign tasks and balance load?

Source: Internet
Author: User

Background

The previous article, "Storm's basic framework analysis", explored the following aspects of Storm:

    1. The relationships between workers, executors, and other components.
    2. The threading model and the messaging system.
    3. The task assignment process.
    4. The path of a topology from submission to execution.

However, it did not clearly explain how Nimbus, the supervisors, parallelism, task assignment, and load balancing relate to one another, and some details were flawed. This article fills those gaps.

Relationships between the underlying components

Some additions:

    1. A worker is a process started by a supervisor. It serves exactly one topology, so a single worker never processes several topologies at the same time.
    2. An executor is a thread started by a worker. It is the physical container that runs tasks; executor to task is a 1-to-N relationship.
    3. A component is the abstraction of a spout, bolt, or acker.
    4. A task is also an abstraction of a spout, bolt, or acker, but after the degree of parallelism has been computed. Component to task is a 1-to-N relationship.

Each supervisor periodically reads from ZooKeeper the topologies, the existing task assignments (assignments), and the various heartbeats; this information is the basis for task assignment.

On each periodic synchronization, the supervisor starts new workers or shuts down old ones according to the latest assignment. This is how it responds to task assignment and load balancing.

Workers learn which other workers they should communicate with by periodically refreshing their connection information.

When a worker starts, it launches one or more executor threads according to the tasks assigned to it. These threads all belong to a single topology.

An executor thread runs the logic of one or more spout or bolt instances; these instances are the tasks.
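The relationships described above can be summarized as a small data model. This is an illustrative Python sketch, not Storm's own code; the class and field names (`Task`, `Executor`, `Worker`, `tasks_per_component`) and the sample values are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Task:
    task_id: int
    component_id: str  # the spout, bolt, or acker this task is an instance of


@dataclass
class Executor:
    # A thread inside a worker; runs one or more tasks (executor : task = 1 : N).
    tasks: List[Task] = field(default_factory=list)


@dataclass
class Worker:
    # A process started by a supervisor; it serves exactly one topology.
    topology_id: str
    node: str
    port: int
    executors: List[Executor] = field(default_factory=list)


def tasks_per_component(worker: Worker) -> Dict[str, int]:
    """Count how many tasks of each component run inside one worker."""
    counts: Dict[str, int] = {}
    for executor in worker.executors:
        for task in executor.tasks:
            counts[task.component_id] = counts.get(task.component_id, 0) + 1
    return counts


# Hypothetical worker: one executor with a spout task, one with two bolt tasks.
worker = Worker("topo-1", "node-a", 6700, [
    Executor([Task(1, "spout")]),
    Executor([Task(2, "bolt"), Task(3, "bolt")]),
])
```

Here one component ("bolt") maps to two tasks, and both tasks share one executor thread.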

Computation of the degree of parallelism: significance of the relevant configuration and parameters

How many workers are started, how many executors there are, and how many tasks each executor runs are all determined by configuration and by the specified parallelism-hint. The specified degree of parallelism does not necessarily equal the number actually running.

1. The topology-workers parameter specifies the number of workers to start for a topology at run time.

2. The parallelism-hint specifies the initial number of executors for a component (such as a spout).

3. topology-tasks is the number of tasks of a component. Its computation is slightly more involved:
(1). If topology-tasks is not specified, it defaults to the initial number of executors.
(2). If topology-max-task-parallelism is set, the resulting value is compared with it and the smaller of the two is used as the actual topology-tasks.

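Expressed in code:

    (defn- component-parallelism [storm-conf component]
      ;; per-component settings override the topology-wide configuration
      (let [storm-conf (merge storm-conf (component-conf component))
            ;; topology-tasks if set, otherwise the initial number of executors
            num-tasks (or (storm-conf TOPOLOGY-TASKS)
                          (num-start-executors component))
            max-parallelism (storm-conf TOPOLOGY-MAX-TASK-PARALLELISM)]
        ;; cap the task count when a maximum is configured
        (if max-parallelism
          (min max-parallelism num-tasks)
          num-tasks)))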
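To make the rule concrete, here is a small Python re-expression of the same logic with a few worked values. It is a sketch for illustration only; the function and parameter names are hypothetical, and `None` stands in for a setting that is not configured.

```python
from typing import Optional


def component_parallelism(topology_tasks: Optional[int],
                          parallelism_hint: int,
                          max_task_parallelism: Optional[int]) -> int:
    """Number of tasks for one component, following the rule described above."""
    # Fall back to the initial executor count when topology-tasks is unset.
    num_tasks = topology_tasks if topology_tasks is not None else parallelism_hint
    # Cap by topology-max-task-parallelism when that is configured.
    if max_task_parallelism is not None:
        return min(max_task_parallelism, num_tasks)
    return num_tasks


# hint only: 4 executors, so 4 tasks
hint_only = component_parallelism(None, 4, None)
# explicit task count of 8, capped at 6 by the maximum
capped = component_parallelism(8, 4, 6)
```

Note that the cap applies in both cases: a parallelism hint of 4 with a maximum of 2 also yields 2 tasks.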

4. For the acker, a special bolt, the degree of parallelism is computed as follows:

(1). If topology-acker-executors is specified, that value is used.
(2). If it is not specified, the degree of parallelism is set to the value of topology-workers, so that there is one acker per worker. This is clearly inappropriate for computation-heavy tasks and large data volumes.

5. If nimbus-slots-per-topology is configured, the total number of workers the topology requires is checked when the topology is submitted to Nimbus; if it exceeds this value, an exception is thrown.

6. If nimbus-executors-per-topology is configured, the total number of executors the topology requires is checked in the same way as in item 5, and an exception is thrown if the limit is exceeded.
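The two submission-time checks in items 5 and 6 can be sketched as follows. This is a minimal Python illustration, not Nimbus code: the exception class, function name, parameters, and messages are all hypothetical, and `None` means the corresponding limit is not configured.

```python
from typing import Optional


class InvalidTopologyError(Exception):
    """Hypothetical stand-in for the exception thrown on submission."""


def validate_topology_size(num_workers: int,
                           num_executors: int,
                           slots_limit: Optional[int] = None,
                           executors_limit: Optional[int] = None) -> None:
    # Item 5: total workers requested must not exceed the per-topology slot limit.
    if slots_limit is not None and num_workers > slots_limit:
        raise InvalidTopologyError(
            f"{num_workers} workers requested, limit is {slots_limit}")
    # Item 6: total executors requested must not exceed the per-topology limit.
    if executors_limit is not None and num_executors > executors_limit:
        raise InvalidTopologyError(
            f"{num_executors} executors requested, limit is {executors_limit}")
```

A request exactly at the limit passes; only a request above it is rejected.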

Note also that in real operation the number of tasks running in parallel may be smaller than the number specified.
The degrees of parallelism above can be changed dynamically by invoking the rebalance or do-rebalance operation of the Nimbus interface.

How the parallelism computation shows up in task assignment

First review the main roles in task assignment:

Then look at some of the important parallelism computation code:

1. Compute the mapping from topology-id to executors for all topologies:

    ;; Compute the mapping from topology-id to executors for all topologies
    (defn- compute-topology->executors [nimbus storm-ids]
      "compute a topology-id -> executors map"
      (into {} (for [tid storm-ids]
                 {tid (set (compute-executors nimbus tid))})))

2. Compute the mapping from a topology-id to its executors:

    ;; Compute the executors of one topology
    (defn- compute-executors [nimbus storm-id]
      (let [conf (:conf nimbus)
            storm-base (.storm-base (:storm-cluster-state nimbus) storm-id nil)
            component->executors (:component->executors storm-base)
            storm-conf (read-storm-conf conf storm-id)
            topology (read-storm-topology conf storm-id)
            task->component (storm-task-info topology storm-conf)]
        (->> (storm-task-info topology storm-conf)
             reverse-map                                ; component -> task ids
             (map-val sort)
             (join-maps component->executors)           ; component -> [num-executors task-ids]
             (map-val (partial apply partition-fixed))  ; split task ids among executors
             (mapcat second)
             (map to-executor-id))))
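The pipeline in compute-executors can be easier to follow with concrete values. The Python sketch below models it under two assumptions that are my reading rather than something stated in this article: that partition-fixed splits a task list into at most n contiguous, near-equal chunks with the larger chunks first, and that an executor id is the pair (first task id, last task id). Components are processed in sorted order here only to make the output deterministic.

```python
from typing import Dict, List, Tuple


def partition_fixed(max_chunks: int, items: List[int]) -> List[List[int]]:
    """Split items into at most max_chunks contiguous chunks of near-equal size."""
    if max_chunks == 0:
        return []
    base, extra = divmod(len(items), max_chunks)
    # `extra` chunks receive one additional element; empty chunks are dropped.
    sizes = [base + 1] * extra + [base] * (max_chunks - extra)
    chunks, start = [], 0
    for size in sizes:
        if size == 0:
            continue
        chunks.append(items[start:start + size])
        start += size
    return chunks


def compute_executors(task_to_component: Dict[int, str],
                      component_to_num_executors: Dict[str, int]
                      ) -> List[Tuple[int, int]]:
    # Reverse the map: component -> list of its task ids.
    component_to_tasks: Dict[str, List[int]] = {}
    for task_id, component in task_to_component.items():
        component_to_tasks.setdefault(component, []).append(task_id)
    executors = []
    for component in sorted(component_to_tasks):
        task_ids = sorted(component_to_tasks[component])
        num_executors = component_to_num_executors[component]
        for chunk in partition_fixed(num_executors, task_ids):
            executors.append((chunk[0], chunk[-1]))  # (first task, last task)
    return executors


# Hypothetical topology: 3 bolt tasks on 2 executors, 2 spout tasks on 1 executor.
executors = compute_executors(
    {1: "bolt", 2: "bolt", 3: "bolt", 4: "spout", 5: "spout"},
    {"bolt": 2, "spout": 1},
)
```

With these inputs the bolt's three tasks are split into executors (1, 2) and (3, 3), and the spout's two tasks share the single executor (4, 5).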

3. Compute the topology's task information (task-info). Here topology-tasks determines the degree of parallelism of each component (spout, bolt), that is, its number of tasks:

    ;; Compute the task-info of a topology
    (defn storm-task-info
      "Returns map from task -> component id"
      [^StormTopology user-topology storm-conf]
      (->> (system-topology! storm-conf user-topology)
           all-components
           ;; get the degree of parallelism (number of tasks) of each component
           (map-val (comp #(get % TOPOLOGY-TASKS) component-conf))
           (sort-by first)
           (mapcat (fn [[c num-tasks]] (repeat num-tasks c)))
           ;; number the tasks consecutively, starting from 1
           (map (fn [id comp] [id comp]) (iterate (comp int inc) (int 1)))
           (into {})))
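The numbering scheme used by storm-task-info (sort the components by id, repeat each one once per task, then number the result from 1) can be shown with a short Python sketch. The function below takes the per-component task counts directly as a dictionary; that input shape and the sample component names are assumptions for illustration, and system components that Storm adds to a real topology are left out.

```python
from typing import Dict


def storm_task_info(component_to_num_tasks: Dict[str, int]) -> Dict[int, str]:
    """Return a map from task id to component id, with ids starting at 1."""
    task_to_component: Dict[int, str] = {}
    next_id = 1
    # Components are visited in sorted order, so ids are stable across calls.
    for component in sorted(component_to_num_tasks):
        for _ in range(component_to_num_tasks[component]):
            task_to_component[next_id] = component
            next_id += 1
    return task_to_component


task_info = storm_task_info({"spout": 2, "bolt": 3})
```

Because "bolt" sorts before "spout", the bolt receives task ids 1 to 3 and the spout receives 4 and 5.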

4. The three code fragments above are all called during Nimbus task assignment. Task assignment is done by the mk-assignments function; the calling process, in pseudo-code, is as follows:

    ;; Nimbus performs task assignment
    mk-assignments
      ;; this step computes the node + port of every executor of each topology
      -> compute-new-topology->executor->node+port
        -> compute-topology->executors
          -> ...

Task assignment by Nimbus

Here we review and supplement the main steps by which Nimbus assigns tasks:

Process of task assignment

1. In Nimbus, a node + port pair is called a worker-slot. The mapping from executors to worker-slots decides on which machine and in which worker process each executor runs; with it, the locations of the spouts, bolts, and ackers are determined as well.

2. Nimbus is the core of the whole cluster. It is responsible for topology submission, monitoring of running status, load balancing, and task assignment.

3. A task assignment made by Nimbus includes the path where the topology code resides (local to Nimbus) and the tasks, executors, and workers information.
A worker is uniquely determined by node + port and the configured number of workers.

The structure of the task assignment information (assignment) is as follows:
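As a rough guide, the assignment information described in item 3 can be modeled like this. The Python sketch is an assumption-laden illustration: the field names follow my understanding of what Storm's assignment record holds (code directory, node-to-host map, executor-to-slot map, executor start times) and are not taken from this article, and all sample values are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Set, Tuple

ExecutorId = Tuple[int, int]  # (first task id, last task id)
WorkerSlot = Tuple[str, int]  # (node id, port)


@dataclass
class Assignment:
    master_code_dir: str                            # topology code path on Nimbus
    node_to_host: Dict[str, str]                    # supervisor node id -> hostname
    executor_to_node_port: Dict[ExecutorId, WorkerSlot]
    executor_to_start_time_secs: Dict[ExecutorId, int]


def workers_of(assignment: Assignment) -> Set[WorkerSlot]:
    """The distinct worker slots (node + port) used by this assignment."""
    return set(assignment.executor_to_node_port.values())


# Hypothetical assignment: three executors spread over two worker slots.
example = Assignment(
    master_code_dir="/nimbus/stormdist/topo-1",
    node_to_host={"node-a": "host-a", "node-b": "host-b"},
    executor_to_node_port={
        (1, 2): ("node-a", 6700),
        (3, 3): ("node-a", 6700),
        (4, 5): ("node-b", 6700),
    },
    executor_to_start_time_secs={(1, 2): 0, (3, 3): 0, (4, 5): 0},
)
```

Two executors sharing the slot ("node-a", 6700) means they run as two threads inside the same worker process.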

4. The supervisor is responsible for actually synchronizing workers. A supervisor is called a node. Synchronizing workers means responding to Nimbus's task assignments by creating, scheduling, and destroying workers.
When an assignment is received and the associated topology code is not present locally, the supervisor downloads the code from Nimbus and writes it to local files.

5. From the node, port, and host information, each worker works out which machines it must communicate with. When load balancing occurs and tasks are reassigned, these machines may change. The worker learns of the changes by periodically calling refresh-connections, which creates new connections and destroys discarded ones.
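The connection refresh described in the last item boils down to a set difference between the connections a worker currently holds and the ones the latest assignment requires. The Python sketch below illustrates that idea only; it is not the refresh-connections implementation, and the function name, parameters, and sample hosts are hypothetical.

```python
from typing import Dict, Iterable, Set, Tuple

Endpoint = Tuple[str, int]  # (host, port) of another worker


def plan_connection_changes(open_connections: Set[Endpoint],
                            outbound_tasks: Iterable[int],
                            task_to_endpoint: Dict[int, Endpoint]
                            ) -> Tuple[Set[Endpoint], Set[Endpoint]]:
    """Return (endpoints to connect to, endpoints to disconnect from)."""
    # Endpoints needed now: wherever the tasks we send to are currently placed.
    needed = {task_to_endpoint[t] for t in outbound_tasks if t in task_to_endpoint}
    to_open = needed - open_connections
    to_close = open_connections - needed
    return to_open, to_close


# After a rebalance, task 4 moved from host-b to host-c.
to_open, to_close = plan_connection_changes(
    open_connections={("host-a", 6700), ("host-b", 6701)},
    outbound_tasks=[3, 4],
    task_to_endpoint={3: ("host-a", 6700), 4: ("host-c", 6700)},
)
```

The connection to host-a is left untouched, which is why only the difference needs to be acted on at each refresh.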

The basis of task assignment

The heartbeats of the supervisors, workers, executors, and other components are synchronized to ZooKeeper. Nimbus periodically reads this information and combines it with the existing task assignments (assignments), the topologies present in the cluster (running and not running), and so on, to perform task assignment.
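One way heartbeats can feed into assignment is by separating executors that are still alive from those that need to be reassigned. The Python sketch below assumes a simple timeout rule (an executor is alive if its last heartbeat is recent enough); the article does not spell out the exact rule, so the function names, the timeout parameter, and the sample values are all hypothetical.

```python
from typing import Dict, Set, Tuple

ExecutorId = Tuple[int, int]  # (first task id, last task id)


def alive_executors(last_heartbeat_secs: Dict[ExecutorId, int],
                    now_secs: int,
                    timeout_secs: int) -> Set[ExecutorId]:
    """Executors whose most recent heartbeat is within the timeout window."""
    return {executor for executor, beat in last_heartbeat_secs.items()
            if now_secs - beat <= timeout_secs}


def executors_to_reassign(all_executors: Set[ExecutorId],
                          alive: Set[ExecutorId]) -> Set[ExecutorId]:
    # Anything that should exist but is not alive becomes a candidate
    # for a new slot in the next round of task assignment.
    return all_executors - alive


heartbeats = {(1, 2): 100, (3, 3): 60}
alive = alive_executors(heartbeats, now_secs=110, timeout_secs=30)
stale = executors_to_reassign({(1, 2), (3, 3), (4, 5)}, alive)
```

In this example executor (3, 3) has been silent for 50 seconds and (4, 5) has never reported, so both are candidates for reassignment.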

Timing of task assignments

1. Load balancing triggered through rebalance or do-rebalance (for example, from a web call) triggers an mk-assignments task assignment.

2. In addition, once the Nimbus process has started, task assignment runs periodically.

3. When a client submits a topology with storm jar ... topology, the Nimbus interface is called through Thrift to submit the topology and launch the new Storm instance, which also triggers task assignment.

Load Balancing

Load balancing and task assignment are closely linked: the key information used in task assignment is produced by load balancing. Since the main roles and processes of task assignment were analyzed above, load balancing is easy to understand. The process and framework are as follows:

The load balancing strategy can be even distribution, machine isolation or topology isolation, round-robin, and so on. This article focuses on Storm's basic framework, and the specific policies vary and are fully customizable. For example, the actual capacity of a machine (CPU, disk, memory, network, and so on) can be abstracted into resource slots, and allocation can be done in terms of those slots.

The details are not expanded on here.

Load balancing produces the new task assignment information (assignments). Nimbus then performs some conversions and synchronizes the information to ZooKeeper, and the supervisors synchronize their workers based on it.
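As one illustration of the even-distribution strategy mentioned above, the following Python sketch spreads executors over the available worker slots in round-robin order. It is a deliberately simple stand-in, not Storm's scheduler; the function name and the sample slots are hypothetical.

```python
from typing import Dict, List, Tuple

ExecutorId = Tuple[int, int]  # (first task id, last task id)
WorkerSlot = Tuple[str, int]  # (node id, port)


def assign_evenly(executors: List[ExecutorId],
                  slots: List[WorkerSlot]) -> Dict[ExecutorId, WorkerSlot]:
    """Round-robin executors over slots so slot loads differ by at most one."""
    if not slots:
        return {}
    return {executor: slots[i % len(slots)]
            for i, executor in enumerate(sorted(executors))}


placement = assign_evenly(
    [(1, 1), (2, 2), (3, 3), (4, 4), (5, 5)],
    [("node-a", 6700), ("node-b", 6700)],
)
```

A policy based on resource slots would replace the round-robin step with a choice weighted by each machine's remaining CPU, memory, or other capacity.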

Conclusion

This article supplements and completes the previous one.

It also completes the answer to the following question:

In a topology we can specify the degree of parallelism of spouts and bolts. When the topology is submitted, how does Storm automatically distribute the spouts and bolts to the servers and control the CPU, disk, and other resources of those servers?

End.

Copyright notice: This is an original article by the blogger and may not be reproduced without the blogger's permission.
