In MapReduce, custom mapper and reducer programs may hit errors and exit partway through execution. The JobTracker tracks the execution of tasks throughout the job, and MapReduce defines a set of rules for handling failed tasks.

The first thing to clarify is how MapReduce decides that a task has failed. A task is considered failed in three cases: it returns a non-zero exit code, it throws a Java exception, or it times out (it shows no sign of progress for a long time). The first case mostly applies to streaming programs: if your mapper or reducer process ends with a non-zero exit code, MapReduce treats the task as failed. The second case mostly applies to MapReduce programs written in Java. The third case is less well known. While a task is running, MapReduce monitors it for progress; for a streaming program this mainly means its output on standard output. If the task neither reads input, writes output, nor updates its status within a certain period (configurable through the mapred.task.timeout option, in milliseconds), MapReduce considers the task failed and kills it. So when writing a MapReduce program, check whether it can run for a long stretch without producing any output. If it can, consider whether it might be killed by mistake.

After a task fails, MapReduce re-executes it. The number of attempts is configurable and is generally four.

Finally, note that MapReduce also has a speculative execution mechanism. Under this mechanism, if a task is running longer than expected (the expectation is based on the execution time of the other tasks), MapReduce starts a duplicate of that task in parallel and, as soon as one copy finishes, kills the copies that are still running. The mechanism mainly guards against a problem in the environment where a reduce task runs, or an anomaly during its execution, delaying the overall job. It can also cause problems in some situations. For example, if running two copies of your reduce program on the same input at the same time causes a conflict, speculative execution is a serious risk for you. In that case it can be disabled, for example by setting mapred.reduce.tasks.speculative.execution to false.
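The two streaming failure signals described above (exit code and timeout) can be illustrated with a minimal Python mapper sketch. It assumes Hadoop Streaming's convention that a line written to standard error in the form reporter:status:<message> counts as a status update, which is one way for a slow task to show it is still alive. The function names, the word-count logic, and the status_every interval are hypothetical and chosen only for illustration.

```python
import sys


def report_status(stderr, message):
    # Hadoop Streaming reads "reporter:status:<message>" lines from stderr
    # as status updates, which count as progress for the task timeout.
    stderr.write("reporter:status:%s\n" % message)
    stderr.flush()


def process(line, stderr, status_every):
    # Hypothetical per-record work: emit (token, 1) pairs and report
    # status every `status_every` tokens so long records stay visible.
    records = []
    for i, token in enumerate(line.split()):
        records.append("%s\t1" % token)
        if (i + 1) % status_every == 0:
            report_status(stderr, "processed %d tokens" % (i + 1))
    return records


def run(stdin, stdout, stderr, status_every=1000):
    # Returns the process exit code: 0 on success, 1 on any error.
    # A non-zero exit code is what marks a streaming task as failed.
    try:
        for line in stdin:
            for record in process(line, stderr, status_every):
                stdout.write(record + "\n")
    except Exception as exc:
        stderr.write("mapper failed: %s\n" % exc)
        return 1
    return 0
```

To use this as an actual streaming mapper, the script would end by calling sys.exit(run(sys.stdin, sys.stdout, sys.stderr)), so that the return value becomes the exit code MapReduce inspects. Whether an explicit status update is needed depends on how long a single record can take relative to mapred.task.timeout.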