Sparrow: A Decentralized, Stateless Distributed Scheduler for Low-Latency Scheduling of Fine-Grained Tasks


Background Introduction

The Sparrow paper appeared at SOSP 2013, and the author's talk slides can also be found online. She previously published "The Case for Tiny Tasks in Compute Clusters"; I did not read that paper carefully, but our group discussed it back when we were studying Mesos's coarse-grained and fine-grained modes. Her GitHub projects show deep research and collaboration ties with Mesos and Spark in the AMPLab. Reading Sparrow together with some understanding of Mesos and YARN gives a fuller picture of resource scheduling, in both theory and practical projects.


Applicable Scenarios

Sparrow is a decentralized, stateless distributed scheduler that provides low-latency scheduling for fine-grained tasks. For a workload consisting of sub-second tasks, the scheduler must make millisecond-latency scheduling decisions for millions of tasks per second while tolerating scheduler failures. Sparrow achieves this mainly through batch sampling + late binding + constraints. Batch sampling and late binding are described below; constraints let the user attach placement conditions to each job, for example that all tasks must run on workers with a GPU, so that those conditions are applied when workers are selected. Constraints make Sparrow friendlier to use, while the real low-latency scheduling comes from the batch sampling + late binding strategy.
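Constraint handling can be sketched as a filter applied before random probing. This is a minimal illustration, assuming a made-up data model (worker id mapped to a set of attribute tags), not Sparrow's actual wire format:

```python
import random

def eligible_workers(workers, constraints):
    """Filter workers by per-job constraints before sampling.

    `workers` maps worker id -> attribute set (e.g. {"gpu", "ssd"});
    `constraints` is the set of attributes every task of the job requires.
    (Illustrative data model, not Sparrow's real representation.)
    """
    return [w for w, attrs in workers.items() if constraints <= attrs]

def sample_eligible(workers, constraints, d):
    """Randomly probe d workers drawn only from the eligible subset."""
    pool = eligible_workers(workers, constraints)
    return random.sample(pool, min(d, len(pool)))
```

The point is that constraints only shrink the candidate pool; the low-latency machinery (sampling and late binding) then operates unchanged on the eligible workers.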


Basics

Sparrow assumes that each worker already runs a long-running executor process for each framework. The most basic scheme is random assignment: the job's tasks are assigned to randomly selected workers. Each worker maintains one or more queues (multiple queues when user isolation or priorities are required).


This basic assignment method derives from the power of two choices in randomized load balancing: the scheduler probes two randomly selected workers and assigns the task to the one with the shorter queue (considering only the number of tasks already queued on the worker). Building on this idea, Sparrow proposes three improvements of its own and gives test data showing the effect of each.

Starting from random allocation, Sparrow's first improvement is per-task sampling: for each task, the scheduler sends probes (a probe is a lightweight RPC) to randomly selected workers and assigns the task to the worker with the shorter queue. The next task repeats the random selection from scratch.
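Per-task sampling is the power-of-two-choices probe repeated independently for every task. A minimal sketch, assuming queue lengths are the only signal available to the scheduler (as the text states):

```python
import random

def per_task_sample(queue_lengths, d=2):
    """Power of two choices: probe d random workers for ONE task and
    pick the one with the shortest queue. Only queue length is known;
    the runtimes of already-queued tasks are not."""
    probed = random.sample(list(queue_lengths), d)
    return min(probed, key=lambda w: queue_lengths[w])

def schedule_job_per_task(queue_lengths, num_tasks, d=2):
    """Per-task sampling: redo the d-way random probe for every task."""
    placements = []
    for _ in range(num_tasks):
        w = per_task_sample(queue_lengths, d)
        queue_lengths[w] += 1          # the task joins that worker's queue
        placements.append(w)
    return placements
```

Note that each task's probe is independent, which is exactly the weakness the next two improvements address.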



Batch Sampling

The above allocation method leaves the tail tasks of a job waiting a long time. Batch sampling improves on probing random workers per task: for a job of m tasks, the scheduler probes m*d workers at once (d > 1), watching many workers for all the tasks together. In the figure below, the scheduler probes four workers simultaneously for two tasks. The performance of batch sampling does not degrade as job parallelism increases.
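The key change versus per-task sampling is that the d*m probes are pooled across the whole job, so the m least-loaded probed workers are chosen jointly. A minimal sketch under the same queue-length-only assumption:

```python
import random

def batch_sample(queue_lengths, m, d=2):
    """Batch sampling: probe d*m workers ONCE for a job of m tasks,
    then place the m tasks on the m probed workers with the shortest
    queues (rather than d probes per task independently)."""
    probed = random.sample(list(queue_lengths), min(d * m, len(queue_lengths)))
    best = sorted(probed, key=lambda w: queue_lengths[w])[:m]
    for w in best:
        queue_lengths[w] += 1          # each chosen worker queues one task
    return best
```

Pooling the probes means one unlucky probe no longer dooms a particular task: any task can take any of the m best slots found.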




Late Binding

As mentioned for the random scheme above, the scheduler places a new task on whichever probed worker has the shortest queue. But this can still cause long waits, because the execution time of the tasks already queued is not taken into account: a task assigned directly into a worker's queue may sit behind long-running tasks.


Late binding is combined with the batch sampling above: the scheduler watches the d*m probed workers, and only when a worker's queue becomes empty is a task sent there for execution. In the figure below, the scheduler batch-samples four workers; when it detects that the first worker's queue is empty, it places one of the job's two tasks on that worker, while for the other task it keeps watching the four workers.
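Late binding can be sketched as probes that place reservations rather than tasks: a worker calls back when it reaches a reservation, and only then is a concrete task bound to it. This is a simplified single-scheduler model for illustration, not Sparrow's actual RPC interface:

```python
from collections import deque

class LateBindingScheduler:
    """Sketch of late binding: probes are RESERVATIONS, not tasks.
    A task is bound only when a probed worker is ready to run it;
    once the job drains, remaining reservations become no-ops."""

    def __init__(self, tasks, probed_workers):
        self.tasks = deque(tasks)             # the job's m tasks
        self.reserved = set(probed_workers)   # d*m outstanding probes

    def worker_ready(self, worker):
        """Called when a probed worker's queue empties and it reaches
        this scheduler's reservation. Returns a task to run, or None
        if the job has no tasks left (reservation cancelled)."""
        if worker not in self.reserved:
            return None
        self.reserved.discard(worker)
        return self.tasks.popleft() if self.tasks else None
```

Whichever probed workers free up first get the tasks, so queued-behind-a-long-task waits are avoided without the scheduler ever knowing task runtimes.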


In terms of performance, the figure below compares the latency of the various methods. The black dashed line is the optimal schedule an omniscient scheduler could make with full knowledge of workers and tasks; the curve closest to optimal is batch sampling + late binding. Reading left to right corresponds to the methods above, demonstrating Sparrow's progress in low-latency scheduling of millions of fine-grained tasks.


Sparrow & Spark

To demonstrate Sparrow's usability, the author added a Sparrow scheduling plugin for Spark that submits each Spark stage as a Sparrow job to the Sparrow schedulers. However, because Sparrow is stateless, if the scheduler the framework is using fails, the failure must be detected and a standby scheduler connected. In the test, the client obtains a list of schedulers and detects failure by sending heartbeats; the response to a failure should be chosen per usage scenario: if the job is abandoned and its tasks reassigned on another scheduler, the tasks must be idempotent. Sparrow also does not handle worker failure events. The project branch address.
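The client-side failover described above can be sketched as walking the scheduler list and resubmitting. The `is_alive` and `submit` callbacks here are hypothetical placeholders, not Sparrow's real client API:

```python
def submit_with_failover(job, schedulers, is_alive, submit):
    """Client-side failover sketch: skip schedulers whose heartbeat
    says they are dead, and resubmit the whole job to the next live
    one. Because a job whose scheduler died mid-flight is resubmitted
    from scratch, its tasks must be idempotent.
    `is_alive(s)` and `submit(s, job)` are caller-supplied callbacks
    (hypothetical names for illustration)."""
    for s in schedulers:
        if not is_alive(s):
            continue                   # heartbeat failed: skip this one
        try:
            return submit(s, job)
        except ConnectionError:
            continue                   # died mid-submit: try the next
    raise RuntimeError("no live scheduler available")
```

This mirrors the statelessness argument: since schedulers keep no job state, any scheduler in the list is an acceptable standby.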

My feeling is that Sparrow's position is somewhat like the LocalTaskSetManager inside Spark's LocalScheduler, but bigger: Sparrow brings its own set of schedulers. Sparrow assumes each worker already has a long-running executor process per framework; that executor could be a process running on a Mesos executor, or the long-running process on each slave in Spark standalone mode. What Sparrow does is say: here is a list of schedulers; use my schedulers to decide how the tasks in your job are dispatched and distributed. The actual execution of the distributed tasks and worker failures are not Sparrow's concern.

As shown in the following illustration, in Spark standalone mode Spark does not use a third-party resource scheduling system such as Mesos or YARN, but its own scheduling module, which has a FIFO pool and a FAIR pool. If it were replaced by Sparrow, tasks would no longer be scheduled in simple first-in-first-out order (inside a FAIR pool it is still FIFO), but dispatched to slaves by the batch sampling + late binding + constraints approach described above.

(The image below is part of a larger diagram I drew while reading the Spark source code. The ClusterScheduler part in the lower right is the Mesos dispatch module, which can use either the coarse-grained or the fine-grained Mesos scheduler backend; the lower left uses Spark's own dispatch module, LocalScheduler. In 0.8, besides the FIFO pool, a fair pool was added, which I also introduced in an earlier article.)


Similarly, Sparrow implements priority queues and queue isolation between different user types; in those cases each worker maintains multiple queues.
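A worker holding one queue per priority level (or per isolated user) can be sketched as follows; the always-drain-higher-priority-first policy here is one simple choice, shown for illustration:

```python
from collections import deque

class WorkerQueues:
    """Sketch of a worker with one FIFO queue per priority level
    (or per isolated user). Lower number = higher priority; dequeue
    drains higher-priority queues first."""

    def __init__(self, levels):
        self.queues = {lvl: deque() for lvl in sorted(levels)}

    def enqueue(self, task, level):
        self.queues[level].append(task)

    def dequeue(self):
        # Scan levels from highest priority to lowest.
        for lvl in sorted(self.queues):
            if self.queues[lvl]:
                return self.queues[lvl].popleft()
        return None                    # worker is idle
```

Within each level the queue is still FIFO; isolation comes from tasks of one user or priority class never sharing a queue with another's.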


(End of full text)


------------------------ Supplement ------------------------

Sparrow vs Mesos/YARN

Through its own schedulers, Sparrow lets a job's tasks be scheduled onto suitable workers with low latency; it is suited to low-latency scheduling of large numbers of fine-grained tasks. This is different from Mesos, YARN, and other schedulers concerned with resource allocation. My understanding is that Sparrow is a task-distribution layer above Mesos/YARN: what it distributes tasks to are slaves that have already started long-running executors and obtained resources.
