Research on Job Scheduling Algorithms on the Hadoop Platform
Zhaoxiaobing of Zhengzhou University
This paper studies improvements to job scheduling algorithms on Hadoop. The LATE algorithm estimates task progress values inaccurately, and the SAMR algorithm does not consider which node should execute a backup task. To address both problems, an improved backup-task scheduling algorithm, Btis, is proposed. Btis uses historical records to calculate task progress accurately and so finds the genuinely slow tasks that need a backup. When selecting a fast node to start a backup for a slow task, it considers both the work node's execution success rate and the node's current load; nodes with a high success rate and a low load are eligible to run the backup. Experiments on a self-built Hadoop cluster verify that Btis can schedule users' jobs and shorten the completion time of the whole job. Some of the experimental data are obtained by averaging. Compared with the LATE and SAMR algorithms, Btis better determines the proportion of each task phase, finds the slow tasks most suitable for a backup, and runs backups more efficiently, which shortens the completion time of the whole job, improves utilization, and optimizes platform performance.
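The two decisions described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the `Node` fields, the thresholds `min_success` and `max_load`, and the ranking formula `success_rate * (1 - load)` are all assumptions chosen for illustration. The remaining-time estimate follows the LATE-style formula (1 - progress) / progress rate, and the phase weights are assumed to come from historical records.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


def progress_score(phase_weights: Sequence[float],
                   phase_index: int,
                   phase_fraction: float) -> float:
    """Weighted progress in [0, 1].

    phase_weights are assumed to be learned from historical records and
    to sum to 1. The score is the weight of all finished phases plus the
    completed fraction of the current phase.
    """
    finished = sum(phase_weights[:phase_index])
    return finished + phase_weights[phase_index] * phase_fraction


def time_to_end(score: float, elapsed_seconds: float) -> float:
    """LATE-style remaining time: (1 - score) / (score / elapsed)."""
    if score <= 0.0:
        return float("inf")  # no progress yet, so no finite estimate
    rate = score / elapsed_seconds
    return (1.0 - score) / rate


@dataclass
class Node:
    name: str
    success_rate: float  # fraction of past tasks that completed successfully
    load: float          # current load, normalised to [0, 1]


def pick_backup_node(nodes: Sequence[Node],
                     min_success: float = 0.8,   # hypothetical threshold
                     max_load: float = 0.7       # hypothetical threshold
                     ) -> Optional[Node]:
    """Return the eligible node with the best success/load trade-off."""
    eligible = [n for n in nodes
                if n.success_rate >= min_success and n.load <= max_load]
    if not eligible:
        return None
    # Hypothetical ranking: favour high success rate and low load.
    return max(eligible, key=lambda n: n.success_rate * (1.0 - n.load))
```

In this sketch a task with the largest `time_to_end` would be the candidate for a backup, and `pick_backup_node` returns `None` when no node is both reliable and lightly loaded, in which case no backup is started.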