This paper presents ioaware, a process-aware task scheduling algorithm. The algorithm evaluates the hardware performance of each compute node and estimates the attributes of tasks while they execute. When subsequent tasks are assigned, tasks of different types are placed on a compute node according to its performance, so that they share the node's disk IO. This shortens the execution time of parallel tasks and increases the throughput of the cluster. The ioaware algorithm has two main characteristics. First, it determines a task's attributes from its disk IO requirements and divides tasks into CPU-bound and IO-bound types; combining tasks of different types reduces the number of tasks performing disk IO at the same time and lowers the likelihood of disk blocking. Second, alongside task attributes, it treats the locality ratio of a task's input data as an important metric: raising this ratio reduces network transfer time and, in turn, task execution time. To verify the feasibility of the ioaware scheduling algorithm, the paper designs and implements an ioaware scheduling module on the Hadoop platform. Experiments on a Hadoop cluster compare ioaware with the FIFO, capacity, and fair scheduling algorithms in four respects: job response time, task data locality ratio, system throughput, and system resource usage. The experiments show that, for the execution time of an individual task, the ioaware scheduling module matches the existing scheduling modules.
For tasks with different attributes, the scheduling module combines tasks of different types, which reduces the number of simultaneous disk operations and shortens the time the CPU spends waiting for disk, improving CPU utilization. In addition, the scheduling module improves the data locality rate of tasks and the throughput of the system.
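
The two ideas described above, mixing CPU-bound and IO-bound tasks on a node and preferring tasks whose input data is local, can be illustrated with a minimal sketch. This is not the paper's implementation and does not use any Hadoop API: the names `Task`, `Node`, `pick_task`, the `IO_THRESHOLD` cut-off, and the per-node `max_io_tasks` limit are all assumptions made for illustration.

```python
from dataclasses import dataclass, field

# Assumed cut-off: a task whose estimated share of runtime spent waiting
# on disk is at or above this value is treated as IO-bound.
IO_THRESHOLD = 0.5


@dataclass
class Task:
    name: str
    io_wait_ratio: float                  # hypothetical estimate from earlier tasks
    input_hosts: frozenset = frozenset()  # nodes that hold the task's input data

    @property
    def kind(self) -> str:
        return "io" if self.io_wait_ratio >= IO_THRESHOLD else "cpu"


@dataclass
class Node:
    name: str
    max_io_tasks: int = 1                 # assumed limit on concurrent IO-bound tasks
    running: list = field(default_factory=list)

    def io_load(self) -> int:
        return sum(1 for t in self.running if t.kind == "io")


def pick_task(node, pending):
    """Pick the next task for a free slot on `node`; return None if none is pending."""
    if not pending:
        return None
    # If the node's disk is already busy, look for CPU-bound work; otherwise IO-bound.
    wanted = "cpu" if node.io_load() >= node.max_io_tasks else "io"

    def rank(task):
        # False sorts before True: matching type first, then data-local input.
        return (task.kind != wanted, node.name not in task.input_hosts)

    best = min(pending, key=rank)
    pending.remove(best)
    node.running.append(best)
    return best
```

With one IO-bound task already running on a node, `pick_task` selects a CPU-bound task next, so at most `max_io_tasks` tasks contend for the disk while a CPU-bound alternative is pending. If no task of the wanted type is available, the sketch falls back to whatever is pending rather than leaving the slot idle; a real scheduler might choose differently.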