Both Mesos and YARN use the Dominant Resource Fairness (DRF) algorithm, which differs from the slot-based Fair Scheduler and Capacity Scheduler in Hadoop. These are reading notes on the paper: Dominant Resource Fairness: Fair Allocation of Multiple Resource Types.
Consider the problem of fair resource allocation in a system with multiple resource types (mainly CPU and memory), where different users have different resource demands. To address it, researchers at UC Berkeley proposed Dominant Resource Fairness (DRF), a generalization of max-min fairness to multiple resource types. DRF was evaluated in the design and implementation of Mesos, where it achieved better throughput than slot-based fair scheduling.
DRF is a general max-min fair allocation policy for multiple resources. The intuition behind DRF is that in a multi-resource environment, a user's allocation should be determined by the user's dominant share: the largest share, taken over all resource types, of any resource that has been allocated to the user. The resource with that largest share is the user's dominant resource. In short, DRF tries to maximize the smallest dominant share across all users.
For example, if User A runs CPU-intensive tasks and User B runs memory-intensive tasks, DRF tries to equalize User A's share of CPU with User B's share of memory. With a single resource type, DRF reduces to ordinary max-min fairness.
DRF has four main properties: sharing incentive, strategy-proofness, Pareto efficiency, and envy-freeness.
DRF provides a sharing incentive: every user is at least as well off as it would be if the system's resources were statically and evenly partitioned among the users, so no user gains by leaving the shared pool. DRF is strategy-proof: a user cannot obtain a better allocation by misrepresenting its resource requirements. DRF is Pareto-efficient: subject to the other properties, it allocates available resources until no user's allocation can be increased without decreasing another user's. Finally, DRF is envy-free: no user prefers another user's allocation to its own.
Consider a system with 9 CPUs and 18 GB of memory shared by two users: each task of User A requests (1 CPU, 4 GB), and each task of User B requests (3 CPUs, 1 GB). How should a fair allocation be constructed in this situation?
For User A, each task consumes <1/9, 4/18> = <1/9, 2/9> of the total resources, so A's dominant resource is memory, and each task adds 2/9 to A's dominant share.
For User B, each task consumes <3/9, 1/18> = <1/3, 1/18> of the total resources, so B's dominant resource is CPU, and each task adds 1/3 to B's dominant share.
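The per-task shares above can be reproduced with a few lines of Python. This is only an illustrative sketch: the names TOTAL, per_task_shares and dominant_resource are made up for this note and do not come from the paper, Mesos, or YARN.

```python
from fractions import Fraction

# Total capacity of the example cluster: 9 CPUs and 18 GB of memory.
TOTAL = {"cpu": 9, "mem": 18}

def per_task_shares(demand, total=TOTAL):
    """Return the fraction of each resource that a single task consumes."""
    return {r: Fraction(demand[r], total[r]) for r in total}

def dominant_resource(demand, total=TOTAL):
    """Return (resource name, share) for the resource with the largest share."""
    shares = per_task_shares(demand, total)
    name = max(shares, key=shares.get)
    return name, shares[name]

demand_a = {"cpu": 1, "mem": 4}  # User A: (1 CPU, 4 GB) per task
demand_b = {"cpu": 3, "mem": 1}  # User B: (3 CPUs, 1 GB) per task

print(dominant_resource(demand_a))  # ('mem', Fraction(2, 9))
print(dominant_resource(demand_b))  # ('cpu', Fraction(1, 3))
```

Exact fractions are used instead of floats so that the shares match the hand calculation digit for digit.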
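Writing the capacity constraints as inequalities and solving them shows that a good choice is to let User A run 3 tasks and User B run 2 tasks. User A then holds (3 CPUs, 12 GB) and User B holds (6 CPUs, 2 GB), so both users end up with a dominant share of 2/3.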
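One way to write down those inequalities is shown below. The symbols x and y (the number of tasks run by User A and User B) are introduced here for illustration, and the equality constraint encodes the DRF goal of equal dominant shares.

```latex
\begin{aligned}
\max_{x,\,y}\quad & (x,\ y) \\
\text{subject to}\quad
  & x + 3y \le 9   && \text{(CPU capacity)} \\
  & 4x + y \le 18  && \text{(memory capacity)} \\
  & \tfrac{2}{9}\,x = \tfrac{1}{3}\,y && \text{(equal dominant shares)}
\end{aligned}
```

Substituting y = 2x/3 into the CPU constraint gives 3x <= 9, so x = 3 and y = 2; the memory constraint (4*3 + 2 = 14 <= 18) is then satisfied with room to spare.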
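In pseudo-code terms, the DRF algorithm works as follows: it tracks the total capacity of each resource, the amount already consumed, and each user's dominant share. At every step it picks the user with the smallest dominant share and, if that user's next task still fits in the remaining capacity, launches the task and updates the user's shares.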
Applying DRF to the example, the allocation proceeds as shown in the following table. Note that each decision depends on the result of the previous allocation: the user whose dominant share is currently the smallest receives the next allocation.
Each iteration chooses one user to allocate resources to. The selection rule is: pick the user whose current s_i is smallest.
s_i: the dominant share of user i, that is, the fraction of the total amount of user i's dominant resource that has already been allocated to user i.
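The selection rule can be sketched as a short Python loop. This is a minimal illustration under simplifying assumptions, not the Mesos or YARN implementation: every user always has another task to run, tasks never finish, and ties between equal s_i values are broken by dictionary order. The function name drf_allocate and its arguments are hypothetical.

```python
from fractions import Fraction

def drf_allocate(total, demands):
    """Progressive-filling sketch of DRF.

    total:   {resource: capacity}
    demands: {user: {resource: per-task demand}}
    Returns {user: number of tasks launched}.
    """
    consumed = {r: 0 for r in total}
    tasks = {u: 0 for u in demands}
    dominant_share = {u: Fraction(0) for u in demands}  # s_i for each user

    while True:
        # Pick the user with the smallest dominant share s_i.
        # Ties go to the first user in dictionary order (an arbitrary choice).
        user = min(dominant_share, key=dominant_share.get)
        demand = demands[user]
        # Stop as soon as the chosen user's next task no longer fits.
        if any(consumed[r] + demand[r] > total[r] for r in total):
            return tasks
        for r in total:
            consumed[r] += demand[r]
        tasks[user] += 1
        # s_i is the largest fraction of any resource held by this user.
        dominant_share[user] = max(
            Fraction(tasks[user] * demand[r], total[r]) for r in total
        )

result = drf_allocate(
    {"cpu": 9, "mem": 18},
    {"A": {"cpu": 1, "mem": 4}, "B": {"cpu": 3, "mem": 1}},
)
print(result)  # {'A': 3, 'B': 2}
```

With this tie-breaking the tasks are granted in the order A, B, A, B, A. A different tie-breaking rule changes the order of the steps but not the final 3-task and 2-task result for this example.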
In this example, each task of User A takes 1/9 of the total CPU and 2/9 of the total memory, while each task of User B takes 1/3 of the total CPU and 1/18 of the total memory, so A's dominant resource is memory and B's dominant resource is CPU. Based on this, DRF maximizes the smaller of A's memory share and B's CPU share, which in effect equalizes the two.
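As a quick check on this example, the sketch below compares the allocation of 3 tasks for A and 2 tasks for B against a static equal split of the cluster (the baseline used by the sharing-incentive property) and computes each user's final dominant share. All names here are hypothetical helpers written for this note.

```python
from fractions import Fraction

TOTAL = {"cpu": 9, "mem": 18}
DEMANDS = {"A": {"cpu": 1, "mem": 4}, "B": {"cpu": 3, "mem": 1}}
DRF_TASKS = {"A": 3, "B": 2}  # allocation from the worked example

def tasks_with_equal_split(demand, total, n_users):
    """Tasks a user could run with a static 1/n slice of every resource."""
    return min(Fraction(total[r], n_users) / demand[r] for r in total)

def dominant_share(demand, n_tasks, total):
    """Largest fraction of any single resource held after n_tasks tasks."""
    return max(Fraction(n_tasks * demand[r], total[r]) for r in total)

for user, demand in DEMANDS.items():
    split = tasks_with_equal_split(demand, TOTAL, len(DEMANDS))
    share = dominant_share(demand, DRF_TASKS[user], TOTAL)
    print(user, "equal split:", split, "DRF:", DRF_TASKS[user], "dominant share:", share)
# A equal split: 9/4 DRF: 3 dominant share: 2/3
# B equal split: 3/2 DRF: 2 dominant share: 2/3
```

Both users run more tasks under DRF than they could with half of every resource, and their dominant shares are equal at 2/3.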
Job scheduling algorithm in YARN: DRF (Dominant Resource Fairness)