[Hadoop] TaskTracker Source Analysis (TaskTracker Node Health Monitoring)

Source: Internet
Author: User

The healthStatus object in TaskTracker stores the current node's health state. Its class is org.apache.hadoop.mapred.TaskTrackerStatus.TaskTrackerHealthStatus, defined as follows:

static class TaskTrackerHealthStatus implements Writable {
    private boolean isNodeHealthy; // whether the node is healthy
    private String healthReport;   // health report message
    private long lastReported;     // time of the most recent report
    ...
}

The healthStatus object is a field of the TaskTrackerStatus instance named status, which is sent to the JobTracker along with other node information such as memory capacity. The fields of healthStatus are computed by the NodeHealthCheckerService thread. This thread lets an administrator configure a health monitoring script that detects whether the node is healthy. The one requirement is that when the script finds the node unhealthy, it must print a line beginning with "ERROR" to standard output. The NodeHealthCheckerService thread runs the script periodically and checks its output; if any line of the output begins with "ERROR", the node is marked unhealthy, the JobTracker adds it to the blacklist, and no further tasks are assigned to it.
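The output rule described above can be sketched in a few lines of shell. This is only an illustration of the rule "any line beginning with ERROR means unhealthy"; it is not Hadoop's own implementation (which is in Java), and the function name classify_health_output is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper illustrating the rule from the text:
# read health-script output on stdin and print "unhealthy" if any line
# begins with ERROR, otherwise print "healthy".
classify_health_output() {
    local line
    # The "|| [ -n "$line" ]" part also handles a last line without a newline.
    while IFS= read -r line || [ -n "$line" ]; do
        case "$line" in
            ERROR*) echo "unhealthy"; return 0 ;;
        esac
    done
    echo "healthy"
}
```

For example, piping the two lines "checking" and "ERROR, disk full" into classify_health_output yields "unhealthy", while a line such as "no ERROR here" does not, because ERROR is not at the start of the line.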

Benefits of this mechanism:

1. It can serve as load feedback from the node. For example, when the script detects that the network, I/O, or file system is busy, it can report the node as unhealthy so that fewer tasks are assigned to it.

2. It supports temporary manual maintenance of a TaskTracker. When a TaskTracker fails, an administrator can temporarily stop it from receiving new tasks and, once maintenance is finished, set it back to a state in which it can receive tasks.
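As a rough sketch of the second benefit, a health script could report the node as unhealthy whenever an administrator has created a flag file. The flag path /tmp/tasktracker.maintenance and the function name check_maintenance are assumptions made for this example, not Hadoop conventions.

```shell
#!/bin/bash
# Hypothetical maintenance switch: the flag file path is an assumption.
MAINTENANCE_FLAG="${MAINTENANCE_FLAG:-/tmp/tasktracker.maintenance}"

# Print an ERROR line while the flag file exists, otherwise print OK.
check_maintenance() {
    if [ -e "$MAINTENANCE_FLAG" ]; then
        echo "ERROR, node is under maintenance (flag: $MAINTENANCE_FLAG)"
    else
        echo "OK"
    fi
}

check_maintenance
```

With such a script configured, creating the flag file (for instance with touch) would take the node out of scheduling at the next check, and deleting the file would let it receive tasks again.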

Configuration parameters (name, followed by meaning):

mapred.healthChecker.script.path: The absolute path of the health check script. The NodeHealthCheckerService thread executes this script to determine whether the node is healthy; if the value is empty, the thread is not started.
mapred.healthChecker.interval: How often the NodeHealthCheckerService thread executes the monitoring script, in milliseconds.
mapred.healthChecker.script.timeout: If the monitoring script does not respond within this amount of time, the node is marked unhealthy.
mapred.healthChecker.script.args: Arguments passed to the monitoring script; multiple arguments are separated by commas.
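To show how the four parameters fit together, the sketch below prints them as property entries in the usual Hadoop configuration XML layout, ready to be merged by hand into the TaskTracker's configuration file. The function name emit_health_config and every value passed to it (script path, 60000, 600000, the argument list) are illustrative assumptions, not recommended settings.

```shell
#!/bin/bash
# Print <property> entries for the four health checker parameters.
# Arguments: script path, interval (ms), timeout, comma-separated script args.
emit_health_config() {
    local script_path="$1" interval_ms="$2" timeout="$3" script_args="$4"
    cat <<EOF
<property>
  <name>mapred.healthChecker.script.path</name>
  <value>${script_path}</value>
</property>
<property>
  <name>mapred.healthChecker.interval</name>
  <value>${interval_ms}</value>
</property>
<property>
  <name>mapred.healthChecker.script.timeout</name>
  <value>${timeout}</value>
</property>
<property>
  <name>mapred.healthChecker.script.args</name>
  <value>${script_args}</value>
</property>
EOF
}

# Illustrative values only.
emit_health_config /usr/local/bin/check_node.sh 60000 600000 "arg1,arg2"
```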

Example: print a line beginning with ERROR when the node's free memory is less than 10% of its total memory.

#!/bin/bash
# Report ERROR when free memory drops below MEMORY_RATIO of total memory.
MEMORY_RATIO=0.1
# The second field of each /proc/meminfo line is the numeric value.
freeMem=$(awk '/^MemFree:/ {print $2}' /proc/meminfo)
totalMem=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
limitMem=$(awk -v total="$totalMem" -v ratio="$MEMORY_RATIO" 'BEGIN {print int(total * ratio)}')
if [ "$freeMem" -lt "$limitMem" ]; then
    echo "ERROR, totalMem=$totalMem, freeMem=$freeMem, limitMem=$limitMem"
else
    echo "OK"
fi
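Before deploying a health script, it can help to dry-run it by hand. The sketch below runs a command under a time limit and reports how its result would be interpreted under the rules described in this article (a timeout or an ERROR line means unhealthy). It assumes the GNU coreutils timeout command is available; the function name dry_run_health_script is hypothetical, and this is not how Hadoop itself launches the script.

```shell
#!/bin/bash
# Hypothetical dry-run harness. Usage: dry_run_health_script SECONDS COMMAND [ARGS...]
dry_run_health_script() {
    local timeout_secs="$1"; shift
    local output status
    # GNU timeout exits with status 124 when the command exceeds the limit.
    output="$(timeout "$timeout_secs" "$@" 2>/dev/null)"
    status=$?
    if [ "$status" -eq 124 ]; then
        echo "unhealthy (timed out)"
    elif printf '%s\n' "$output" | grep -q '^ERROR'; then
        echo "unhealthy (reported ERROR)"
    else
        echo "healthy"
    fi
}
```

A call such as dry_run_health_script 10 /path/to/health_script.sh (the path here is a placeholder) prints one of the three verdicts without involving a running TaskTracker.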
