MapReduce error handling, task scheduling and shuffle process

Error Handling

Failures fall into three main types:
1. Task failure
2. TaskTracker failure
3. JobTracker failure

Task Failure

1. When the code in a map or reduce task throws an exception, the child JVM reports the error to its parent TaskTracker process before it exits. The TaskTracker marks the task attempt as failed and frees the slot so it can run another task.

2. For a Streaming task, if the Streaming process exits with a non-zero exit code, the task is marked as failed.

3. If the child JVM exits abruptly (for example, because of a JVM bug), the TaskTracker notices that the process has exited and marks the task attempt as failed.

4. When the TaskTracker marks a child task as failed, it decrements its counter of running tasks by one so that it can ask the JobTracker for a new task, and it notifies the JobTracker of the failed task attempt through the heartbeat.

5. When the JobTracker is notified of a task failure, it puts the task back into the scheduling queue and reassigns it to another TaskTracker (avoiding the TaskTracker on which the task already failed). The number of attempts is limited: by default, if the task still has not completed after 4 attempts, it is not retried again (the JobTracker marks it as killed) and the whole job fails.
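For reference, the retry limit is configurable. A minimal sketch for mapred-site.xml in Hadoop 1.x (4 is the usual default for both properties):

<!-- Maximum number of attempts per map task before the job fails (default 4). -->
<property>
    <name>mapred.map.max.attempts</name>
    <value>4</value>
</property>
<!-- Maximum number of attempts per reduce task before the job fails (default 4). -->
<property>
    <name>mapred.reduce.max.attempts</name>
    <value>4</value>
</property>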

TaskTracker Failure

1. Once a TaskTracker fails, it stops sending heartbeats to the JobTracker.

2. The JobTracker then removes this TaskTracker from its pool of TaskTrackers, and the tasks that were running on it are moved to other TaskTracker nodes.

3. If the number of task failures on a TaskTracker is much higher than on the other nodes, the JobTracker puts that TaskTracker on a blacklist.

4. If the TaskTracker node fails, even map tasks that completed successfully must be re-executed on another TaskTracker node, because the reduce tasks can no longer access the intermediate results stored on the failed TaskTracker's local file system.
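The related timeouts and thresholds are also configurable. A hedged sketch for mapred-site.xml in Hadoop 1.x (property names and default values as commonly documented; verify against your distribution):

<!-- How long (ms) the JobTracker waits without a heartbeat before declaring a TaskTracker dead (default 600000 = 10 minutes). -->
<property>
    <name>mapred.tasktracker.expiry.interval</name>
    <value>600000</value>
</property>
<!-- Number of task failures on one TaskTracker, for a single job, before that tracker is blacklisted for the job (default 4). -->
<property>
    <name>mapred.max.tracker.failures</name>
    <value>4</value>
</property>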

JobTracker Failure

1) JobTracker failure is the most serious kind of failure; in Hadoop 1.x the JobTracker is a single point of failure, which makes this particularly severe.

2) It can be mitigated by starting multiple JobTrackers, with only one master JobTracker running at a time and ZooKeeper coordinating which one is the master. However, the probability of failure is still relatively high, so Hadoop 2.x adopted a new architecture, YARN, in which job scheduling and task management are completely separated.

Job scheduling

FIFO Scheduler (the default)
Fair Scheduler (FairScheduler)
Capacity Scheduler (CapacityScheduler)

FIFO (First In, First Out) Scheduler

1. The FIFO Scheduler is the default scheduler in Hadoop. It schedules jobs first by priority and then by the order in which they arrive.

2. One drawback of this default scheduler is that while high-priority, long-running jobs are being processed, low-priority, short jobs may go unscheduled for a long time.

Fair Scheduler (FairScheduler)

A scheduler developed by Facebook.
1. The goal of the FairScheduler is to give every user a fair share of the cluster.
2. Jobs are placed in pools; by default, each user gets their own pool.
3. Preemption is supported: if a pool does not receive its fair share of resources within a certain time, the scheduler kills tasks in pools that are consuming too many resources and gives the freed task slots to the under-served pool.

Capacity Scheduler (CapacityScheduler)

A scheduler developed by Yahoo.
1. It supports multiple queues; each queue can be configured with a certain amount of resources, and each queue uses a FIFO scheduling policy internally.
2. To prevent jobs submitted by one user from monopolizing a queue's resources, the amount of resources that a single user's jobs may occupy in a queue is limited.
3. Features: hierarchical queues, capacity guarantees, security, elasticity, operability, and resource-based scheduling.

Configuring the FairScheduler

1. Modify mapred-site.xml:

<property>
    <name>mapred.jobtracker.taskScheduler</name>
    <value>org.apache.hadoop.mapred.FairScheduler</value>
</property>
<property>
    <name>mapred.fairscheduler.allocation.file</name>
    <value>$HADOOP_HOME/conf/fair-scheduler.xml</value>
</property>

2. Create the allocation file fair-scheduler.xml (see the sketch below).
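A minimal fair-scheduler.xml might look like the following sketch; the pool name "reports" and all numbers are made-up illustrations, and the exact element names should be checked against your Hadoop 1.x version:

<?xml version="1.0"?>
<allocations>
  <!-- A hypothetical pool with guaranteed minimum slots and a higher weight. -->
  <pool name="reports">
    <minMaps>10</minMaps>
    <minReduces>5</minReduces>
    <maxRunningJobs>5</maxRunningJobs>
    <weight>2.0</weight>
    <!-- Preempt tasks from other pools if this pool's minimum share is not met within 60 seconds. -->
    <minSharePreemptionTimeout>60</minSharePreemptionTimeout>
  </pool>
  <!-- Default cap on concurrently running jobs per user. -->
  <userMaxJobsDefault>5</userMaxJobsDefault>
</allocations>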
The above is the configuration for Hadoop 1.x.

For Hadoop 2.x, the corresponding configuration goes in yarn-site.xml; refer to the official documentation:
http://hadoop.apache.org/docs/r2.6.0/hadoop-yarn/hadoop-yarn-site/FairScheduler.html

The default scheduler that ships with Hadoop YARN is the CapacityScheduler.
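On YARN, queues and their capacities are defined in capacity-scheduler.xml. A minimal sketch with two hypothetical queues, "default" and "analytics" (the names and percentages are only illustrative):

<!-- Queues directly under the root queue. -->
<property>
  <name>yarn.scheduler.capacity.root.queues</name>
  <value>default,analytics</value>
</property>
<!-- Percentage of cluster capacity guaranteed to each queue (must sum to 100). -->
<property>
  <name>yarn.scheduler.capacity.root.default.capacity</name>
  <value>70</value>
</property>
<property>
  <name>yarn.scheduler.capacity.root.analytics.capacity</name>
  <value>30</value>
</property>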

Shuffle and sort

Map Side

1. The map side does not simply write intermediate results straight to disk; instead, map output is first written to an in-memory ring buffer.

2. Each map task has its own ring buffer, 100 MB by default; the size can be changed with the io.sort.mb property.

3. Once the buffer reaches the spill threshold (set by io.sort.spill.percent, 80% by default), a new spill file is created and the buffer contents are written to disk.

4. Multiple spill files are eventually merged into a single sorted, partitioned output file on disk, which serves as the input to the reduce tasks.

5. The io.sort.factor property controls the maximum number of spill files (streams) that can be merged at a time.
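Put together, the map-side spill behaviour is tuned with the three properties mentioned above. A sketch for mapred-site.xml in Hadoop 1.x, showing the usual default values:

<!-- Size of the in-memory ring buffer for map output, in MB. -->
<property>
    <name>io.sort.mb</name>
    <value>100</value>
</property>
<!-- Fraction of the buffer that may fill before a spill to disk starts. -->
<property>
    <name>io.sort.spill.percent</name>
    <value>0.80</value>
</property>
<!-- Maximum number of spill files (streams) merged in one pass. -->
<property>
    <name>io.sort.factor</name>
    <value>10</value>
</property>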

Reduce Side

1. The shuffle on the reduce side is divided into three phases: copying the map output, sort-merging, and the reduce processing itself.
2. Because a reduce task receives output from multiple maps, the copied map outputs still have to be sorted and merged locally.

3. Map tasks finish at different times, so as soon as a map task completes, the reduce tasks begin copying its output.

4. Each reduce task has a small number of copier threads so that it can fetch map outputs in parallel (controlled by the mapred.reduce.parallel.copies property).

5. The reduce phase does not wait for all inputs to be merged into one large file before processing; instead, the last round of merging feeds its partially merged results directly to the reduce function.
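The reduce-side copy phase can be tuned similarly. A sketch for mapred-site.xml in Hadoop 1.x with commonly cited defaults; mapred.job.shuffle.merge.percent is included as an assumption worth verifying for your version:

<!-- Number of parallel threads a reduce task uses to copy map outputs (default 5). -->
<property>
    <name>mapred.reduce.parallel.copies</name>
    <value>5</value>
</property>
<!-- Fill level of the in-memory shuffle buffer at which an in-memory merge is started. -->
<property>
    <name>mapred.job.shuffle.merge.percent</name>
    <value>0.66</value>
</property>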
