After the program could be run from the command line in the Hadoop cluster environment, I set up the corresponding configuration in Eclipse and clicked Run on Hadoop.
The job ran successfully and the results were visible on HDFS, but it was still not actually submitted to the real cluster environment.
I searched for a long time and tried specifying the remote JobTracker address directly in the code, but it still failed.
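For reference, on a Hadoop 2.x cluster running YARN (which is what the later errors point to), "specifying the remote address in the code" usually looks something like the sketch below rather than the old mapred.job.tracker setting. This is my own illustration; the addresses are assumptions taken from the cluster used later in this post.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class RemoteSubmitSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Assumed addresses, matching the NameNode and ResourceManager seen later in this post
        conf.set("fs.defaultFS", "hdfs://192.168.0.7:9000");
        conf.set("mapreduce.framework.name", "yarn");
        conf.set("yarn.resourcemanager.address", "192.168.0.7:8032");
        Job job = Job.getInstance(conf, "word count");
        // ... mapper/reducer/input/output setup as usual, then job.waitForCompletion(true)
    }
}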
So instead I debugged the program in Eclipse until it ran correctly, then packaged it into a jar and uploaded it to the Hadoop cluster to run:
When exporting the jar directly, make sure the Main-Class mapping exists in the jar's META-INF/MANIFEST.MF file:
Main-Class: WordCount
In fact, when the jar is exported directly from Eclipse, this mapping is generated automatically.
Upload the finished jar to the server. Assuming the jar is in the /opt directory, the command is:
hadoop jar /opt/mywordcount.jar WordCount /test_in /output12
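As a side note: when the manifest already specifies Main-Class, hadoop jar does not need the class name on the command line, so the equivalent invocation is:
hadoop jar /opt/mywordcount.jar /test_in /output12
If the class name is still supplied in that case, RunJar passes it through to the program as an extra argument.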
Error:
Exception in thread "main" java.lang.UnsupportedClassVersionError: WordCount : Unsupported major.minor version 52.0
    at java.lang.ClassLoader.defineClass1(Native Method)
    at java.lang.ClassLoader.defineClass(ClassLoader.java:800)
    at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
    at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
    at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:270)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:205)
Searching online suggested that this was caused by a Java version mismatch: Eclipse on Win7 was using Java 1.8, while the server runs Java 1.7.
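To confirm which version each side is actually using, a small helper like the following (not part of the original project, just a check) can be compiled and run on both machines; class-file version 52.0 corresponds to Java 8 and 51.0 to Java 7.

public class VersionCheck {
    public static void main(String[] args) {
        // Runtime JVM version, e.g. 1.7.0_xx or 1.8.0_xx
        System.out.println("java.version       = " + System.getProperty("java.version"));
        // Highest class-file version this JVM accepts: 51.0 = Java 7, 52.0 = Java 8
        System.out.println("java.class.version = " + System.getProperty("java.class.version"));
    }
}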
In Eclipse, under Window--Preferences--Java--Compiler, set the compiler compliance level to 1.7.
Re-export the jar and run it again.
Another error occurred:
14/11/07 10:33:46 INFO ipc.Client: Retrying connect to server: hadoop-05/192.168.0.7:8032. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
14/11/07 10:33:47 INFO ipc.Client: Retrying connect to server: hadoop-05/192.168.0.7:8032. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
14/11/07 10:33:48 INFO ipc.Client: Retrying connect to server: hadoop-05/192.168.0.7:8032. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
14/11/07 10:33:49 INFO ipc.Client: Retrying connect to server: hadoop-05/192.168.0.7:8032. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
14/11/07 10:33:50 INFO ipc.Client: Retrying connect to server: hadoop-05/192.168.0.7:8032. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
14/11/07 10:33:51 INFO ipc.Client: Retrying connect to server: hadoop-05/192.168.0.7:8032. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
14/11/07 10:33:52 INFO ipc.Client: Retrying connect to server: hadoop-05/192.168.0.7:8032. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
The ResourceManager could not be reached. Check that yarn-site.xml is fully configured.
I found that the port numbers being used were inconsistent with the defaults, so I changed the configuration file to the following:
<property>
<name>yarn.resourcemanager.address</name>
<value>localhost:8032</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>localhost:8030</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>localhost:8031</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>192.168.0.7</value>
</property>
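When submitting from a remote client it also helps to check which ResourceManager addresses the job client actually resolves. The sketch below is my own check (it assumes the cluster's yarn-site.xml is on the client classpath); the default port for yarn.resourcemanager.address is 8032.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class PrintRmAddress {
    public static void main(String[] args) {
        // YarnConfiguration loads yarn-default.xml and yarn-site.xml from the classpath
        Configuration conf = new YarnConfiguration();
        System.out.println("yarn.resourcemanager.address           = " + conf.get(YarnConfiguration.RM_ADDRESS));
        System.out.println("yarn.resourcemanager.scheduler.address = " + conf.get(YarnConfiguration.RM_SCHEDULER_ADDRESS));
    }
}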
After rerunning, the same error still occurred, so I also commented out the explicitly specified job.tracker address in the code.
Then a different error appeared:
Usage: wordcount <in> <out>
Checking the code showed that this message is printed when the number of input arguments is not two. I could not find anything wrong with the command, so the only option left was to hard-code the paths in the program and rebuild the jar:
FileInputFormat.addInputPath(job, new Path("hdfs://192.168.0.7:9000/test_in"));
FileOutputFormat.setOutputPath(job, new Path("hdfs://192.168.0.7:9000/out1"));
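For context, the argument check that prints the usage message looks roughly like this in the standard Hadoop WordCount driver (a sketch of the usual example, with the mapper/reducer setup omitted; the hard-coded paths above simply replace otherArgs[0] and otherArgs[1]):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;

public class WordCount {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // GenericOptionsParser strips -D and other generic options; whatever is left
        // must be exactly the input and output paths, or the usage message is printed.
        String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
        if (otherArgs.length != 2) {
            System.err.println("Usage: wordcount <in> <out>");
            System.exit(2);
        }
        Job job = Job.getInstance(conf, "word count");
        job.setJarByClass(WordCount.class);
        // mapper, combiner, reducer and output types would be configured here
        FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
        FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}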
After submitting this jar to the Hadoop cluster, the results finally came out.
But I still have not figured out why passing the paths on the command line does not work. Recording it here for now.