Computing systems and frameworks have to serve a wide variety of workloads. The common way to share resources among them is static partitioning, which fragments the cluster between frameworks and leads to low resource utilization.
Basic problems in scheduling design:
Resource heterogeneity: machines differ in capacity (some high-end, some low-end), so resources are abstracted into fine-grained units.
Data locality: move the computation to the data instead of moving the data.
Preemption: whether resources already allocated can be preempted or not.
Allocation granularity: all-or-nothing allocation (MPI) or incremental allocation that satisfies a request gradually (MapReduce).
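The difference between the two allocation granularities can be shown with a minimal sketch. The function names and the idea of counting resources as "slots" are assumptions made for illustration, not the API of MPI or MapReduce.

```python
def allocate_all_or_nothing(free_slots, requested):
    """All-or-nothing: grant the full request or nothing at all.

    Models jobs (such as MPI-style jobs) that cannot start until
    every requested slot is available.
    """
    if requested <= free_slots:
        return requested
    return 0


def allocate_incremental(free_slots, requested):
    """Incremental: grant whatever is free now, up to the request.

    Models jobs (such as MapReduce-style jobs) that can start with a
    partial allocation and receive more slots later.
    """
    return min(free_slots, requested)


if __name__ == "__main__":
    # Hypothetical cluster with 3 free slots and a job asking for 5.
    print(allocate_all_or_nothing(free_slots=3, requested=5))
    print(allocate_incremental(free_slots=3, requested=5))
```

With 3 free slots and a request for 5, the all-or-nothing job waits with nothing, while the incremental job starts with 3 slots.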
The current trend is to treat all resources as a single whole and to abstract a resource scheduling system on top of them.
A common resource scheduling system model:
Each machine that runs jobs has a node manager and can be divided into multiple isolated containers, each of which can execute a task. The node manager has two duties: it reports the machine's local resources to the resource collector, and it places the tasks assigned to the machine into containers for execution.
The scheduler consists of a resource collector and a resource scheduling policy. The collector gathers the resource state reported by the nodes and reflects it in the resource pool; the scheduling policy decides which resources are assigned to which tasks.
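The model above can be sketched in a few lines. All class and function names here (ResourcePool, ResourceCollector, first_fit_policy) are hypothetical, and first-fit is only one possible policy chosen for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class ResourcePool:
    # node name -> number of free containers, as last reported by that node
    free: dict = field(default_factory=dict)


class ResourceCollector:
    """Receives reports from node managers and reflects them in the pool."""

    def __init__(self, pool):
        self.pool = pool

    def report(self, node, free_containers):
        # A later report from the same node overwrites the earlier one.
        self.pool.free[node] = free_containers


def first_fit_policy(pool, tasks):
    """Hypothetical policy: give each task the first node with a free container.

    Tasks that cannot be placed are left out of the result.
    """
    assignments = {}
    for task in tasks:
        for node in sorted(pool.free):
            if pool.free[node] > 0:
                pool.free[node] -= 1
                assignments[task] = node
                break
    return assignments


if __name__ == "__main__":
    pool = ResourcePool()
    collector = ResourceCollector(pool)
    collector.report("node-a", 1)  # node managers report local resources
    collector.report("node-b", 2)
    print(first_fit_policy(pool, ["t1", "t2", "t3", "t4"]))
```

Keeping the collector and the policy separate mirrors the description: the collector only maintains the pool, and the policy can be swapped without touching how nodes report.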
A common scheduler design has two levels. The first level is the central scheduler, which sees all resources and decides which resources are allocated to each computing framework (coarse-grained). The second level is the framework scheduler, which performs further fine-grained scheduling of the resources it received from the central scheduler. (Mesos, YARN)
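A rough sketch of the two levels follows. It is not how Mesos or YARN actually implement scheduling; the weighted split and the function names are assumptions used only to show the coarse-then-fine structure.

```python
def central_schedule(total_containers, framework_shares):
    """Level 1 (coarse): split the cluster's containers among frameworks.

    framework_shares maps a framework name to an integer weight; the split
    is proportional to weight, rounded down (an assumed policy).
    """
    total_weight = sum(framework_shares.values())
    return {
        name: total_containers * weight // total_weight
        for name, weight in framework_shares.items()
    }


def framework_schedule(granted, tasks):
    """Level 2 (fine): a framework maps its own tasks to its granted containers.

    Returns task -> container index; tasks beyond the grant stay unscheduled.
    """
    return {task: slot for slot, task in enumerate(tasks[:granted])}


if __name__ == "__main__":
    grants = central_schedule(10, {"batch": 3, "streaming": 2})
    print(grants)
    print(framework_schedule(grants["streaming"], ["s1", "s2", "s3", "s4", "s5"]))
```

The central scheduler never looks at individual tasks, and each framework never looks at resources outside its grant, which is the division of labor the two-level design describes.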
"Big Data Day Knowledge" Cluster resource management and scheduling notes