Hadoop Error Handling Mechanism

Source: Internet
Author: User

1. Hardware faults

Hardware faults are faults of the jobtracker machine or of a tasktracker machine.

Jobtracker is a single point of failure. If it fails, Hadoop cannot yet recover on its own, so jobtracker should run on the most reliable hardware available.

Jobtracker uses a periodic heartbeat signal (one-minute cycle) to check whether a tasktracker is faulty or overloaded.

Jobtracker removes a faulty tasktracker from its list of task nodes.
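The heartbeat check can be illustrated with a small model. This is only a sketch, not Hadoop code: the class name HeartbeatMonitor, the tracker names, and the 180-second expiry (three missed one-minute heartbeats) are assumptions chosen for the example.

```python
class HeartbeatMonitor:
    """Toy model of jobtracker bookkeeping for tasktracker heartbeats."""

    def __init__(self, expiry_seconds):
        # Hypothetical threshold: how long a tracker may stay silent.
        self.expiry_seconds = expiry_seconds
        self.last_seen = {}  # tracker name -> time of last heartbeat

    def heartbeat(self, tracker, now):
        """Record that a tracker reported in at time `now` (seconds)."""
        self.last_seen[tracker] = now

    def expire(self, now):
        """Remove and return the trackers whose last heartbeat is too old."""
        dead = sorted(
            name for name, seen in self.last_seen.items()
            if now - seen > self.expiry_seconds
        )
        for name in dead:
            del self.last_seen[name]  # drop from the task node list
        return dead


monitor = HeartbeatMonitor(expiry_seconds=180)
monitor.heartbeat("tracker-a", now=0)
monitor.heartbeat("tracker-b", now=0)
# tracker-a keeps reporting every minute; tracker-b goes silent.
for t in (60, 120, 180):
    monitor.heartbeat("tracker-a", now=t)
lost = monitor.expire(now=200)
```

Here only tracker-b is removed, because tracker-a reported 20 seconds before the check.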

If the faulty node was executing a map task that had not completed, jobtracker asks another node to re-execute that map task.

If the faulty node had not completed a reduce task, jobtracker asks another node to run the unfinished reduce task.
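The reassignment of unfinished work can be sketched as follows. The function name reassign, the task dictionaries, and the round-robin choice of target nodes are all assumptions made for illustration; the real scheduler chooses nodes differently.

```python
from itertools import cycle


def reassign(tasks, failed_node, healthy_nodes):
    """Map each unfinished task of failed_node to a healthy node.

    tasks: list of dicts with keys "id", "kind", "node", "done".
    Returns {task id: new node}. Round-robin is a simplification.
    """
    targets = cycle(healthy_nodes)
    plan = {}
    for task in tasks:
        if task["node"] == failed_node and not task["done"]:
            plan[task["id"]] = next(targets)
    return plan


tasks = [
    {"id": "m1", "kind": "map", "node": "n1", "done": False},
    {"id": "m2", "kind": "map", "node": "n1", "done": True},
    {"id": "r1", "kind": "reduce", "node": "n1", "done": False},
    {"id": "m3", "kind": "map", "node": "n2", "done": False},
]
plan = reassign(tasks, failed_node="n1", healthy_nodes=["n2", "n3"])
```

Only m1 and r1 appear in the plan: m2 had already finished, and m3 runs on a healthy node.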

2. Task failures

Task failures are caused by code defects or process crashes.

The JVM running the task exits automatically. Before it exits, it sends an error message to its parent tasktracker process, and the error message is also written to the log.

The tasktracker notices that the process has exited, or that the task has sent no progress update for a long time, and marks the task as failed.

After marking the task as failed, the tasktracker decrements its running-task counter by 1 so that it can accept a new task, and it reports the task failure to jobtracker in a heartbeat signal.
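The tasktracker side of this can be modeled in a few lines. This is a sketch under assumptions, not Hadoop's implementation: TaskTrackerModel, its method names, and the 600-second progress timeout are hypothetical values picked for the example.

```python
class TaskTrackerModel:
    """Toy model of how a tasktracker detects and reports failed tasks."""

    def __init__(self, task_timeout):
        self.task_timeout = task_timeout  # assumed limit, in seconds
        self.running = {}   # task id -> time of last progress update
        self.failed = []    # failures waiting for the next heartbeat

    def start(self, task_id, now):
        self.running[task_id] = now

    def progress(self, task_id, now):
        self.running[task_id] = now

    def process_crashed(self, task_id):
        """The child process exited abnormally."""
        self._mark_failed(task_id)

    def check_timeouts(self, now):
        """Fail every task that has been silent for too long."""
        for task_id, seen in list(self.running.items()):
            if now - seen > self.task_timeout:
                self._mark_failed(task_id)

    def _mark_failed(self, task_id):
        del self.running[task_id]    # running-task count drops by one
        self.failed.append(task_id)

    def heartbeat(self):
        """Build the report sent to jobtracker, then clear the backlog."""
        report = {"running": len(self.running), "failed": self.failed}
        self.failed = []
        return report


tt = TaskTrackerModel(task_timeout=600)
tt.start("t1", now=0)
tt.start("t2", now=0)
tt.progress("t2", now=500)
tt.process_crashed("t1")       # t1 fails because its process died
tt.check_timeouts(now=700)     # t2 was heard from 200 s ago, still fine
first = tt.heartbeat()
tt.check_timeouts(now=1200)    # t2 has now been silent for 700 s
second = tt.heartbeat()
```

The first heartbeat reports t1 as failed with one task still running; the second reports t2 after it exceeds the timeout.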

After jobtracker learns that the task has failed, it puts the task back into the scheduling queue and reassigns it for execution.

If a task fails four times (the default, which is configurable), it is not executed again, and the job is declared failed.
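The retry rule can be sketched as below. JobModel and its fields are hypothetical names, and the model assumes the job fails as soon as one task has failed max_attempts times. In Hadoop 1.x the limits are set, as far as I know, by the properties mapred.map.max.attempts and mapred.reduce.max.attempts.

```python
from collections import deque


class JobModel:
    """Toy model of jobtracker retry handling for one job."""

    def __init__(self, task_ids, max_attempts=4):
        self.max_attempts = max_attempts          # assumed default of 4
        self.queue = deque(task_ids)              # tasks awaiting a node
        self.failures = {t: 0 for t in task_ids}  # failed attempts so far
        self.state = "RUNNING"

    def task_failed(self, task_id):
        """Count the failure, then requeue the task or fail the job."""
        self.failures[task_id] += 1
        if self.failures[task_id] >= self.max_attempts:
            self.state = "FAILED"       # no further attempts
        else:
            self.queue.append(task_id)  # back into the scheduling queue


job = JobModel(["m1"])
job.queue.popleft()            # m1 is handed to a tasktracker
states = []
for _ in range(4):
    job.task_failed("m1")
    states.append(job.state)
    if job.queue:
        job.queue.popleft()    # the retry is handed out again
```

The job stays in the RUNNING state through the first three failures and becomes FAILED on the fourth.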
