Hadoop on Mac with IntelliJ IDEA - 7: Solving the "failed to report status for 600 seconds. Killing!" problem

Source: Internet
Author: User

This article describes how to solve the "failed to report status for 600 seconds. Killing!" error that a job runs into during the reduce stage, after Hadoop 1.2.1 has completed the map stage.

Environment: Mac OS X 10.9.5, IntelliJ IDEA 13.1.4, Hadoop 1.2.1

Hadoop runs in a virtual machine that the host machine connects to through SSH; the IDE and the data files are stored on the host machine. IDEA itself runs on JDK 1.8, while the IDEA project and Hadoop use JDK 1.6.

After the job is submitted to Hadoop, it takes far too long to execute, and the output shows the following behavior:

After reaching 66%, the reduce stage starts over. The output then reports that the task failed to report its status within 10 minutes and was killed. After that, reduce continues.

The timeout may occur because the reducer is busy with a time-consuming computation and does not report task progress to the Hadoop framework. It may also be that the program has used up the Java heap space, or that the garbage collector runs so frequently that the reducer cannot send its status to the JobTracker in time and is therefore killed. Alternatively, one of the reducers may receive too much erroneous data, which makes the program stop responding. The following methods address this problem.

Method 1: Increase the timeout value in mapred-site.xml

<property><name>mapred.task.timeout</name><value>1200000</value></property>

The default timeout value is 600000 milliseconds, that is, 10 minutes.
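
As a quick check on the units: the property value is in milliseconds, so 1200000 corresponds to 20 minutes. The following minimal Java sketch converts a timeout in minutes to that millisecond value and prints the matching property element. The class `TaskTimeout` and its methods are hypothetical helpers written for this illustration; they are not part of Hadoop.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper (not part of Hadoop): builds the mapred.task.timeout
// property text from a timeout expressed in minutes.
final class TaskTimeout {
    static final String PROPERTY_NAME = "mapred.task.timeout";

    // Convert minutes to the millisecond value that the property expects.
    static long toMillis(long minutes) {
        return TimeUnit.MINUTES.toMillis(minutes);
    }

    // Render a <property> element in the same form used in mapred-site.xml.
    static String toPropertyXml(long minutes) {
        return "<property><name>" + PROPERTY_NAME + "</name><value>"
                + toMillis(minutes) + "</value></property>";
    }

    public static void main(String[] args) {
        System.out.println(toPropertyXml(10)); // the default, 600000 ms
        System.out.println(toPropertyXml(20)); // the value used above, 1200000 ms
    }
}
```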

Method 2: Report progress every n records, as shown in the Reducer documentation example.

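public void reduce(K key, Iterator<V> values,
                   OutputCollector<K, V> output, Reporter reporter) throws IOException {
    int noValues = 0;  // number of values processed for this key
    while (values.hasNext()) {
        V value = values.next();
        ++noValues;
        // report progress every 10 values so the framework knows the task is alive
        if ((noValues % 10) == 0) {
            reporter.progress();
        }
        // ...
    }
}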

In addition, you can add a custom counter to the preceding example, such as reporter.incrCounter(NUM_RECORDS, 1). If the preceding methods do not help, consider method 3.
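
The "every n records" pattern from method 2 can also be factored into a small helper so that the counting logic stays out of the reduce loop. The sketch below is self-contained and runs without Hadoop: `ProgressThrottle` is a hypothetical class, and the `Runnable` callback is a stand-in for the place where a real reducer would call reporter.progress(). It uses an anonymous class rather than a lambda because the project in this setup targets JDK 1.6.

```java
// Hypothetical helper (not part of Hadoop): invokes a callback once every
// `interval` records so that a long-running loop keeps signalling liveness.
final class ProgressThrottle {
    private final int interval;
    private final Runnable onProgress;
    private long seen = 0;

    ProgressThrottle(int interval, Runnable onProgress) {
        if (interval <= 0) {
            throw new IllegalArgumentException("interval must be positive");
        }
        this.interval = interval;
        this.onProgress = onProgress;
    }

    // Call once per processed record; fires the callback on every n-th call.
    void recordProcessed() {
        seen++;
        if (seen % interval == 0) {
            onProgress.run();
        }
    }

    long recordsSeen() {
        return seen;
    }

    public static void main(String[] args) {
        final int[] reports = {0};
        // In a real reducer the callback body would be reporter.progress().
        ProgressThrottle throttle = new ProgressThrottle(10, new Runnable() {
            public void run() {
                reports[0]++;
            }
        });
        for (int i = 0; i < 95; i++) {
            throttle.recordProcessed(); // stands in for per-record work
        }
        // prints "95 records, 9 reports"
        System.out.println(throttle.recordsSeen() + " records, " + reports[0] + " reports");
    }
}
```

Choosing the interval is a trade-off: it only needs to be small enough that the time between two reports stays well below the configured task timeout.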

Method 3: Increase the JVM heap size in mapred-site.xml

<property><name>mapred.child.java.opts</name><value>-Xmx2048m</value></property>

For details about how to determine the heap size, refer to "Hadoop on Mac with IntelliJ IDEA - 5: solving the Java heap space problem" and the reference "JVM tuning summary: -Xms -Xmx -Xmn -Xss". At the same time, try to reduce the number of reducers that run in parallel:

<property><name>mapred.tasktracker.reduce.tasks.maximum</name><value>1</value></property>

The default value is 2. The new value should be smaller than the current value.
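
A rough calculation shows why it makes sense to lower the reduce slot count while raising the heap: each occupied task slot on a TaskTracker runs its own child JVM, and mapred.child.java.opts applies to every one of them. The sketch below estimates the worst-case heap demand when all slots are busy. The class `HeapBudget` is hypothetical, and the map slot count of 2 is an assumption (the Hadoop 1.x default for mapred.tasktracker.map.tasks.maximum), not a value taken from this setup.

```java
// Hypothetical helper (not part of Hadoop): estimates the worst-case heap
// that child JVMs on one TaskTracker may request when every slot is busy.
final class HeapBudget {
    // Each occupied slot runs one child JVM with the configured -Xmx.
    static long worstCaseHeapMb(int mapSlots, int reduceSlots, long xmxMb) {
        return (long) (mapSlots + reduceSlots) * xmxMb;
    }

    public static void main(String[] args) {
        long xmxMb = 2048;  // from -Xmx2048m above
        int mapSlots = 2;   // assumed: default mapred.tasktracker.map.tasks.maximum
        // 2 reduce slots: (2 + 2) * 2048 = 8192 MB
        System.out.println("2 reduce slots: " + worstCaseHeapMb(mapSlots, 2, xmxMb) + " MB");
        // 1 reduce slot: (2 + 1) * 2048 = 6144 MB
        System.out.println("1 reduce slot:  " + worstCaseHeapMb(mapSlots, 1, xmxMb) + " MB");
        // Heap limit of the JVM running this code, for comparison with -Xmx.
        System.out.println("this JVM max heap: "
                + Runtime.getRuntime().maxMemory() / (1024 * 1024) + " MB");
    }
}
```

On a small virtual machine this total can exceed the available memory, so it may be worth comparing it against the VM's RAM before settling on a heap size.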

Reference

[1] http://stackoverflow.com/questions/15281307/the-reduce-fails-due-to-task-attempt-failed-to-report-status-for-600-seconds-ki
