java.lang.RuntimeException: why java.lang.ClassNotFoundException is thrown when running a MapReduce program
After finishing the distributed Hadoop configuration, I imported hadoop-0.20.1 from the master node into Eclipse, intending to write programs in Eclipse and then compile and run them directly on the Hadoop cluster. Today I discovered that this cannot work, because I had overlooked how MapReduce programs run in Hadoop. When a job runs, the MapReduce framework executes its tasks (the map and reduce functions) on the slave nodes. The resources needed to run the job, including the job JAR file, the configuration file, and the computed input splits, are copied to a directory on HDFS named after the job ID. The job JAR is kept in many copies (a high replication factor) so that the TaskTrackers can access a copy and execute the program when they run the job's tasks.
Symptom:
10/08/16 15:25:48 WARN conf.Configuration: DEPRECATED: hadoop-site.xml found in the classpath. Usage of hadoop-site.xml is deprecated. Instead use core-site.xml, mapred-site.xml and hdfs-site.xml to override properties of core-default.xml, mapred-default.xml and hdfs-default.xml respectively
Output directory wcout already exists, firstly delete it
10/08/16 15:25:49 WARN mapred.JobClient: No job jar file set. User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
10/08/16 15:25:49 INFO input.FileInputFormat: Total input paths to process : 4
10/08/16 15:25:50 INFO mapred.JobClient: Running job: job_201008161439_0004
10/08/16 15:25:51 INFO mapred.JobClient: map 0% reduce 0%
10/08/16 15:26:00 INFO mapred.JobClient: Task Id : attempt_201008161439_0004_m_000000_0, Status : FAILED
java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.hadoop.examples.WordCount2$WordCountMapper
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:808)
    at org.apache.hadoop.mapreduce.JobContext.getMapperClass(JobContext.java:157)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:532)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
    at org.apache.hadoop.mapred.Child.main(Child.java:170)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.examples.WordCount2$WordCountMapper
    at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:247)
    at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:761)
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:806)
    ... 4 more
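The trace shows the mapper class being looked up by name (through Class.forName) inside the task's JVM, with the failed lookup rethrown as a RuntimeException. The following is a minimal sketch of that pattern using only the JDK. It is not Hadoop's code: the class ClassLookupDemo and its methods are hypothetical names made up for illustration, and it assumes the Hadoop examples classes are not on the classpath where it runs.

```java
public class ClassLookupDemo {

    // Look a class up by name and rethrow a failed lookup as an unchecked
    // RuntimeException, the same shape as the exception chain in the trace.
    public static Class<?> getClassByName(String name) {
        try {
            return Class.forName(name);
        } catch (ClassNotFoundException e) {
            throw new RuntimeException(e);
        }
    }

    // Describe the result of a lookup as text instead of letting it throw.
    public static String describeLookup(String name) {
        try {
            return "found " + getClassByName(name).getName();
        } catch (RuntimeException e) {
            return e.toString();
        }
    }

    public static void main(String[] args) {
        // A class every JVM can see.
        System.out.println(describeLookup("java.lang.String"));
        // A class that exists only where its .class file or JAR is available
        // (assumed to be absent from this JVM's classpath).
        System.out.println(describeLookup(
                "org.apache.hadoop.examples.WordCount2$WordCountMapper"));
    }
}
```

When the second class is missing, the text produced has the same shape as the first line of the trace above: a RuntimeException whose message is the wrapped ClassNotFoundException. The lookup succeeds only in a JVM that can actually see the class, which is why it matters what gets shipped to the task nodes.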
Cause analysis:
The program was not run as a JAR, so no JAR was uploaded to HDFS. As a result, nodes other than the one that submitted the job cannot find the map and reduce classes when they execute tasks, and the tasks fail. Running the program without a JAR is also what produces the warning: No job jar file set. User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
The original WordCount program runs because every node already has that class, whereas my new class exists only on the master node.
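To see why a run from Eclipse ends up without a job JAR, it helps to check where a class was actually loaded from. The sketch below uses only the JDK and is an assumption about the general idea, not Hadoop's implementation: JarLocator, jarPathFromUrl, and findContainingJar are hypothetical names, and the sketch skips details such as URL decoding. A class loaded from a plain output directory (as in a typical IDE run) has no containing JAR, so there is nothing to upload.

```java
import java.net.URL;

public class JarLocator {

    // Given the URL of a class file resource (as text), return the path of
    // the JAR that contains it, or null if the resource is not inside a JAR.
    public static String jarPathFromUrl(String resourceUrl) {
        if (!resourceUrl.startsWith("jar:")) {
            return null; // e.g. file:/.../bin/Foo.class from an IDE output directory
        }
        String path = resourceUrl.substring("jar:".length());
        if (path.startsWith("file:")) {
            path = path.substring("file:".length());
        }
        int bang = path.indexOf('!'); // "!" separates the JAR from the entry inside it
        return bang >= 0 ? path.substring(0, bang) : path;
    }

    // Locate the JAR (if any) that the given class was loaded from.
    public static String findContainingJar(Class<?> cls) {
        String classFile = cls.getName().replace('.', '/') + ".class";
        ClassLoader loader = cls.getClassLoader();
        URL url = (loader != null)
                ? loader.getResource(classFile)
                : ClassLoader.getSystemResource(classFile); // bootstrap classes
        return url == null ? null : jarPathFromUrl(url.toString());
    }

    public static void main(String[] args) {
        String jar = findContainingJar(JarLocator.class);
        if (jar == null) {
            System.out.println("Not running from a JAR: there is no job JAR to ship to the cluster");
        } else {
            System.out.println("Running from JAR: " + jar);
        }
    }
}
```

A check like this could be run at the start of a driver program to fail early, before the job is submitted, rather than waiting for a task to fail on a slave node.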
Solution:
(1) Add this class to the examples package on every node, or
(2) package it into a JAR and run the job again from that JAR.
Notes:
In a pseudo-distributed environment (where a single node is both the master and the slave), the program also cannot be run successfully straight from Eclipse. It likewise has to be packaged into a JAR.