Task scheduling algorithms in the Storm distributed computing system

Source: Internet
Author: User

For a comparison of worker, executor and task in the Storm distributed computing system, see http://www.111cn.net/sys/linux/96715.htm

Overview of the 3 schedulers

EvenScheduler: Distributes the available resources in the system evenly among the topologies that need resources. The distribution is not perfectly uniform, as explained in detail below.
DefaultScheduler: Similar to the EvenScheduler, but it first reclaims the resources that other topologies no longer need and then runs the EvenScheduler.
IsolationScheduler: Users can define the machine resources for particular topologies. Storm assigns these topologies first when allocating, which ensures that the machines assigned to such a topology serve only that topology.

DefaultScheduler

Call the cluster's needsSchedulingTopologies method to get the topologies that need task assignment
Process each topology in turn
Call the cluster's getAvailableSlots method to get the slots currently available in the cluster, returned as a collection of <node,port> pairs, and store it in available-slots
Get the executor information of the current topology, convert it to a collection of <start-task-id,end-task-id> pairs, and store it in all-executors. The executors are computed from the topology by the compute-executors algorithm, which is explained later
Call the EvenScheduler's get-alive-assigned-node+port->executors method to get the resources this topology already holds, returned as a <node+port,executor> collection, and store it in alive-assigned. Why compute the resources assigned to the current topology rather than all assigned resources in the cluster? A guess is that this is useful when rebalancing tasks.
Call slot-can-reassign to check the slots in alive-assigned, and store the slots that can be reassigned in can-reassigned
The resources usable by this topology therefore consist of available-slots plus can-reassigned
Next, compute the total number of slots the current topology can use: total-slots-to-use = min(the topology's configured number of workers, the number of usable slots from the previous step)
If total-slots-to-use is greater than the number of slots currently assigned to the topology, call the bad-slots method to compute the slots that can be freed
Call the cluster's freeSlots method to release the computed bad slots
Finally, call the EvenScheduler's schedule-topologies-evenly to perform the allocation
Continue with the next topology

Summary of the main flow: get the cluster's idle resources -> compute the current topology's executor information (used during allocation) -> compute the resources that can be reassigned or freed -> allocate
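
The slot-budget decision in the flow above can be sketched in a few lines of Python. This is only an illustration, not Storm's own code: the function name plan_slots, its parameters, and the can_reassign predicate are hypothetical, and it assumes that the budget is the smaller of the topology's worker count and the number of usable slots (free slots plus reassignable ones). The computation of the bad slots themselves is not described in this article, so it is left out.

```python
def plan_slots(num_workers, available_slots, alive_assigned, can_reassign):
    """Return (total_slots_to_use, should_free) for one topology.

    available_slots: list of free (node, port) pairs in the cluster
    alive_assigned:  dict mapping (node, port) -> executors already placed there
    can_reassign:    hypothetical predicate deciding whether a slot may be reused
    """
    # Slots the topology already holds and that may be handed out again.
    reassignable = [slot for slot in alive_assigned if can_reassign(slot)]

    # Usable resources = free slots + reassignable slots.
    usable = len(available_slots) + len(reassignable)

    # Never use more slots than the topology asked for, or than exist.
    total_slots_to_use = min(num_workers, usable)

    # Slots are freed for reallocation only when the topology
    # should end up with more slots than it currently holds.
    should_free = total_slots_to_use > len(alive_assigned)
    return total_slots_to_use, should_free


# A topology asking for 3 workers on an empty 16-slot cluster.
free = [(f"S{n}", port) for n in range(1, 5) for port in range(6700, 6704)]
print(plan_slots(3, free, {}, lambda slot: True))
```

With no slots assigned yet, the topology gets its full request of 3 slots. Once it already holds 3 reassignable slots, the budget stays at 3 and nothing needs to be freed.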


EvenScheduler

Compared with the DefaultScheduler, the EvenScheduler skips the step of computing which resources can be reassigned and allocates directly from the supervisors' idle slots, so it is not described in further detail here.


EvenScheduler and DefaultScheduler scheduling examples:

In general these two scheduling mechanisms produce essentially the same results, so they are covered together:

Cluster initial state


Next we submit 3 topologies:

Topology    Workers    Executors    Tasks
T-1         3          8            16
T-2         5          10           10
T-3         3          5            10


1. Submit T-1

The sort-slots algorithm orders the available slots, with the result {[S1 6700] [S2 6700] [S3 6700] [S4 6700] [S1 6701] [S2 6701] [S3 6701] [S4 6701] [S1 6702] [S2 6702] [S3 6702] [S4 6702] [S1 6703] [S2 6703] [S3 6703] [S4 6703]}
After the compute-executors algorithm, the executor list is: {[1 2] [3 4] [5 6] [7 8] [9 10] [11 12] [13 14] [15 16]}. Note: the format is [start-task-id end-task-id]. There are 8 executors in total; the first contains 2 tasks, with start-task-id 1 and end-task-id 2, so it is recorded as [1 2], and so on. The compute-executors algorithm will be detailed in the next blog post.
The 8 executors are distributed over the 3 workers as [3,3,2]
The results of the assignment are:
{[1 2] [3 4] [5 6]}-> [S1 6700]
{[7 8] [9 10] [11 12]}-> [S2 6700]
{[13 14] [15 16]}-> [S3 6700]
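
The T-1 numbers can be reproduced with a small Python sketch. It is not Storm's compute-executors code, which this article does not show. It assumes that tasks are split into contiguous, near-equal ranges and that executors are handed to the first slots in consecutive groups, which is what the results listed here suggest. The helper names chunk_sizes, compute_executors and assign_to_slots are made up for this illustration.

```python
def chunk_sizes(total, parts):
    """Sizes of `parts` near-equal chunks of `total` items, larger chunks first."""
    base, extra = divmod(total, parts)
    return [base + 1 if i < extra else base for i in range(parts)]


def compute_executors(num_tasks, num_executors):
    """Group task ids 1..num_tasks into contiguous (start, end) executor ranges.

    Assumes num_executors <= num_tasks, so every executor gets at least one task.
    """
    ranges, start = [], 1
    for size in chunk_sizes(num_tasks, num_executors):
        ranges.append((start, start + size - 1))
        start += size
    return ranges


def assign_to_slots(executors, slots, num_workers):
    """Give each of the first num_workers slots one consecutive group of executors."""
    assignment, pos = {}, 0
    for slot, size in zip(slots, chunk_sizes(len(executors), num_workers)):
        assignment[slot] = executors[pos:pos + size]
        pos += size
    return assignment


# T-1 from the example: 16 tasks, 8 executors, 3 workers.
executors = compute_executors(16, 8)
slots = [("S1", 6700), ("S2", 6700), ("S3", 6700), ("S4", 6700)]
for slot, group in assign_to_slots(executors, slots, 3).items():
    print(group, "->", slot)
```

The same helpers give the [2,2,2,2,2] and [2,2,1] distributions stated for T-2 and T-3 when called with their worker and executor counts.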

After allocation, the cluster status is:


2. Submit T-2

Available slots after sort-slots: {[S1 6701] [S2 6701] [S3 6701] [S4 6700] [S1 6702] [S2 6702] [S3 6702] [S4 6701] [S1 6703] [S2 6703] [S3 6703] [S4 6702] [S4 6703]}
After compute-executors, the executor list is: {[1 1] [2 2] [3 3] [4 4] [5 5] [6 6] [7 7] [8 8] [9 9] [10 10]}
The 10 executors are distributed over the 5 workers as [2,2,2,2,2]
The results of the assignment are:
{[1 1] [2 2]}-> [S1 6701]
{[3 3] [4 4]}-> [S2 6701]
{[5 5] [6 6]}-> [S3 6701]
{[7 7] [8 8]}-> [S4 6700]
{[9 9] [10 10]}-> [S1 6702]
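
The slot order listed for T-2 is consistent with an interleaving rule: take the first free port of every supervisor, then the second free port of every supervisor, and so on. The Python sketch below implements that rule under this assumption. It is inferred from the example output and is not taken from Storm's sort-slots source; the function name sort_slots is reused here only for readability.

```python
from itertools import zip_longest


def sort_slots(available):
    """Interleave free (node, port) slots across nodes, round by round."""
    # Collect the free ports of each node in ascending order.
    ports_by_node = {}
    for node, port in sorted(available):
        ports_by_node.setdefault(node, []).append(port)

    # One column of slots per node, nodes in name order.
    columns = [[(node, port) for port in ports]
               for node, ports in sorted(ports_by_node.items())]

    # Read the columns row by row; nodes that ran out of ports are skipped.
    ordered = []
    for row in zip_longest(*columns):
        ordered.extend(slot for slot in row if slot is not None)
    return ordered


# Cluster state before T-2: 4 supervisors x 4 ports, minus the 3 slots used by T-1.
all_slots = [(f"S{n}", port) for n in range(1, 5) for port in range(6700, 6704)]
used_by_t1 = {("S1", 6700), ("S2", 6700), ("S3", 6700)}
free = [slot for slot in all_slots if slot not in used_by_t1]
print(sort_slots(free))
```

Because S4 still has all four ports free, its slots fall one port behind the others in each round, and its last slot ends up alone at the tail of the list.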

After allocation, the cluster status is:


3. Submit T-3

After sort-slots, the slot list is: {[S1 6703] [S2 6702] [S3 6702] [S4 6701] [S2 6703] [S3 6703] [S4 6702] [S2 6704] [S3 6704] [S4 6703] [S4 6704]}
After compute-executors, the executor list is: {[1 2] [3 4] [5 6] [7 8] [9 10]}
The 5 executors are distributed over the 3 workers as [2,2,1]
The results of the assignment are:
{[1 2] [3 4]}-> [S1 6703]
{[5 6] [7 8]}-> [S2 6702]
{[9 10]}-> [S3 6702]

After allocation, the cluster status is:


As shown in the figure, this task scheduling method is not perfectly uniform: S1 is already running at full load, while S4 uses only one slot.
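
The imbalance can be checked by tallying the slots that the three assignments above placed on each supervisor. The short Python snippet below only recounts the results already listed in this article; the variable names are chosen for this illustration.

```python
from collections import Counter

# Slots taken by each topology, copied from the assignment results above.
assignments = {
    "T-1": [("S1", 6700), ("S2", 6700), ("S3", 6700)],
    "T-2": [("S1", 6701), ("S2", 6701), ("S3", 6701), ("S4", 6700), ("S1", 6702)],
    "T-3": [("S1", 6703), ("S2", 6702), ("S3", 6702)],
}

# Count how many slots are in use on each supervisor.
used = Counter(node for slots in assignments.values() for node, _port in slots)
for node in sorted(used):
    print(node, used[node])
```

Every new topology starts from the front of the sorted slot list, so the supervisors that sort first keep receiving workers while S4 is reached only when a topology needs at least four slots.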
