1. Uploading and downloading files from HDFS
First error:
exception in thread "main" java.lang.IllegalArgumentException: Wrong FS: hdfs://192.168.1.11:9000/usr/yujing/wordcount, expected: hdfs://master:9000
Many people have encountered this problem. When connecting to the cluster from Ubuntu or Windows, hdfs://192.168.1.11:9000 cannot be used directly; the hostname mapping for 192.168.1.11 has to be added to the hosts file. For those who do not know where the hosts file lives on Windows: it is C:\Windows\System32\drivers\etc\hosts (a hidden file that you can make visible). Add the line 192.168.1.11 master to that file.
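For reference, the entry is a single line in the hosts file mapping the cluster address to the hostname the namenode expects:
192.168.1.11    master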
Second error:
org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hdfs.server.namenode.SafeModeException: Cannot create file /usr/yujing/wordcount. Name node is in safe mode. The ratio of reported blocks 0.0000 has not reached the threshold 0.9990. Safe mode will be turned off automatically.
This error occurs because the client has no permission to operate on HDFS in the cluster. (The message also shows the namenode still in safe mode; as it says, safe mode is lifted automatically once enough blocks have been reported.)
Solution:
(1) Add the following line to your code:
conf.set("dfs.permissions", "false");
(2) Or set it in hdfs-site.xml in the cluster configuration:
<property>
  <name>dfs.permissions</name>
  <value>false</value>
</property>
Then restart the cluster.
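Tying the section together, here is a minimal upload/download sketch using the HDFS FileSystem API, assuming the hosts fix and the permissions setting above; the class name, paths, and file names are only examples:

import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsFileTransfer {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Client-side workaround from above for the permission error.
        conf.set("dfs.permissions", "false");
        // Connect via the hostname mapped in the hosts file, not the raw IP,
        // which avoids the "Wrong FS" error.
        FileSystem fs = FileSystem.get(URI.create("hdfs://master:9000"), conf);
        // Upload: copy a local file onto HDFS.
        fs.copyFromLocalFile(new Path("D:/qq.txt"), new Path("/usr/yujing/wordcount"));
        // Download: copy it back to the local file system.
        fs.copyToLocalFile(new Path("/usr/yujing/wordcount"), new Path("D:/qq-copy.txt"));
        fs.close();
    }
}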
2. The WordCount example shipped with Hadoop
First error:
12/02/10 14:24:59 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 0 time(s). 12/02/10 14:25:01 INFO ipc.Client: Retrying connect to
Even though the cluster's IP address is clearly written in the code, the connection goes to localhost, because MapReduce connects to localhost by default.
Solution:
Conf. Set ("fs. Default. Name", "HDFS :/// master: 9000 ");
Conf. Set ("hadoop. Job. User", "yujing ");
Conf. Set ("mapred. Job. Tracker", "Master: 9001 ");
With these settings, JobClient submits the job to the Hadoop cluster instead of trying to run it locally.
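Put together, a driver sketch against the 0.20-era API this post uses; the class name and paths are illustrative, and the Mapper/Reducer wiring is elided:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCountDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Point the client at the cluster instead of the default localhost.
        conf.set("fs.default.name", "hdfs://master:9000");
        conf.set("hadoop.job.user", "yujing");
        conf.set("mapred.job.tracker", "master:9001");

        Job job = new Job(conf, "wordcount");
        job.setJarByClass(WordCountDriver.class);
        // setMapperClass / setReducerClass / output types as in the stock WordCount...
        FileInputFormat.addInputPath(job, new Path("/user/yujing/qq.txt"));
        FileOutputFormat.setOutputPath(job, new Path("/user/yujing/output"));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}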
Second error:
exception in thread "main" org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: hdfs://master:9000/user/yujing/D:/qq.txt
This error occurs because a job submitted to the cluster must read its input from a path on HDFS, and its output path must be on HDFS as well; here the local Windows path D:/qq.txt was passed in and simply got appended to the HDFS working directory.
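So the file first has to be copied onto HDFS (for example with hadoop fs -put D:/qq.txt /user/yujing/), and the job pointed at the HDFS locations; the output directory name here is just an example:

// Both paths must live on HDFS, not on the local Windows disk.
FileInputFormat.addInputPath(job, new Path("hdfs://master:9000/user/yujing/qq.txt"));
FileOutputFormat.setOutputPath(job, new Path("hdfs://master:9000/user/yujing/output"));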
Third error:
12/02/10 14:52:36 WARN mapred.JobClient: No job jar file set. User classes may not be found. See JobConf(Class) or JobConf#setJar(String). 12/02/10 14:52:36 INFO mapred.JobClient: Cleaning up the staging area hdfs://master:9000/tmp/hadoop-hadoop/mapred/staging/yujing/.staging/job_201202091335_0293
The preceding error occurs because the MapReduce output path already exists; the directory must be deleted before the job is rerun.
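One common way to handle this in the driver, reusing the job's Configuration from above (the output directory is again an example):

FileSystem fs = FileSystem.get(conf);
Path outputPath = new Path("/user/yujing/output");
// Delete a stale output directory so the new job can create it afresh.
if (fs.exists(outputPath)) {
    fs.delete(outputPath, true); // true = recursive delete
}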
Correct running result:
12/02/10 14:59:35 INFO input.FileInputFormat: Total input paths to process : 1
12/02/10 14:59:35 INFO mapred.JobClient: Running job: job_201202091335_0299
12/02/10 14:59:36 INFO mapred.JobClient: map 0% reduce 0%
12/02/10 14:59:48 INFO mapred.JobClient: map 100% reduce 0%
12/02/10 15:00:04 INFO mapred.JobClient: map 100% reduce 100%
12/02/10 15:00:09 INFO mapred.JobClient: Job complete: job_201202091335_0299
12/02/10 15:00:09 INFO mapred.JobClient: Counters: 25
3. Writing your own MapReduce program
First error:
java.lang.RuntimeException: java.lang.ClassNotFoundException: cn.hadoop.InvertedIndex$InvertedIndexMapper
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:866)
    at org.apache.hadoop.mapreduce.JobContext.getMapperClass(JobContext.java:195)
Solution 1:
This is because the jar we submitted does not exist on the cluster, so the cluster has no way of finding the classes that make up our job and throws the ClassNotFoundException above. The jar therefore has to be packaged and submitted to the cluster along with the job.
Solution: first package your program into a jar, put it in the root directory of the project, and add the following to the code:
JobConf conf = new JobConf();
conf.setJar("pr.jar");
This error plagued us for a long time.
Solution 2:
When using the Eclipse plug-in, many people fail to get the expected results at first. This may be related to the version of Eclipse or of the plug-in jar: some Eclipse versions are simply incompatible with the Hadoop plug-in, and the hadoop-eclipse-plugin-0.20.203.0.jar ships missing some packages, so it needs to be patched by hand (methods for doing so can be found online). Getting the plug-in installed in Eclipse is mostly a matter of trying versions until one works; once it does, clicking Run on Hadoop makes the plug-in package your program and submit it to the cluster.
Solution 3:
Package your MapReduce program into a jar from within the program itself and submit the job from code. This is exactly what the Eclipse plug-in implements internally; here we reproduce that plug-in functionality in code, as sketched below.
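In code, the two standard calls for this are setJar on the old API and setJarByClass on the new one; a sketch, with the jar and class names taken from this post's examples:

// Old API (solution 1): name the jar file explicitly.
JobConf jobConf = new JobConf();
jobConf.setJar("pr.jar");

// New API: let Hadoop locate the jar that contains the given class.
job.setJarByClass(InvertedIndex.class);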
Second error:
12/02/10 14:59:35 INFO input.FileInputFormat: Total input paths to process : 1
12/02/10 14:59:35 INFO mapred.JobClient: Running job: job_201202091335_0299
12/02/10 14:59:36 INFO mapred.JobClient: map 0% reduce 0%
12/02/10 14:59:48 INFO mapred.JobClient: map 100% reduce 0%
This error occurs because MapReduce uses a single reducer by default; when the number of map tasks is large, the reduce phase does not make progress. The solution is to set the number of reducers in the code with job.setNumReduceTasks(4);, as shown below.
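The call belongs in the driver before the job is submitted; four reducers is simply this post's choice:

// Replace the default single reducer with four.
job.setNumReduceTasks(4);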
From: http://yu06206.iteye.com/blog/1402084