Why I'm sharing this: Writing a whole blog post about a single problem may seem a bit extravagant, but a Baidu search turned up very few related articles, and I only found the solution after a long struggle through the logs.
Problem: A MapReduce program developed on Windows and submitted remotely to the cluster stalls. The job is accepted but never actually starts running.
MapReduce Program
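import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class Test {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // HDFS and YARN endpoints of the remote cluster
        conf.set("fs.defaultFS", "hdfs://master:9000/");
        // Jar that is uploaded to the cluster and contains the mapper and reducer
        conf.set("mapreduce.job.jar", "d:/intelij-workspace/aaron-bigdata/aaorn-mapreduce/target/aaorn-mapreduce-1.0-SNAPSHOT.jar".trim());
        conf.set("mapreduce.framework.name", "yarn");
        conf.set("yarn.resourcemanager.hostname", "master");
        // Required when submitting from Windows to a Linux cluster
        conf.set("mapreduce.app-submission.cross-platform", "true");

        Job job = Job.getInstance(conf);
        job.setMapperClass(WordCountMapper.class);
        job.setReducerClass(WordCountReducer.class);
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(LongWritable.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(LongWritable.class);

        FileInputFormat.setInputPaths(job, "hdfs://master:9000/input/");
        FileOutputFormat.setOutputPath(job, new Path("hdfs://master:9000/output3/"));

        // Submit the job and block until it finishes, printing progress
        job.waitForCompletion(true);
    }
}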
Run results
[QC] INFO [main] org.apache.hadoop.yarn.client.RMProxy.createRMProxy(98) | Connecting to ResourceManager at master/192.168.56.100:8032
[QC] WARN [main] org.apache.hadoop.mapreduce.JobResourceUploader.uploadFiles(64) | Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
[QC] INFO [main] org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(283) | Total input paths to process : 2
[QC] INFO [main] org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(198) | number of splits:2
[QC] INFO [main] org.apache.hadoop.mapreduce.JobSubmitter.printTokens(287) | Submitting tokens for job: job_1496627557122_0004
[QC] INFO [main] org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.submitApplication(273) | Submitted application application_1496627557122_0004
[QC] INFO [main] org.apache.hadoop.mapreduce.Job.submit(1294) | The url to track the job: http://master:8088/proxy/application_1496627557122_0004/
[QC] INFO [main] org.apache.hadoop.mapreduce.Job.monitorAndPrintJob(1339) | Running job: job_1496627557122_0004
Master (NameNode) log
java.io.IOException: Connection reset by peer
    at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
    at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
    at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
    at sun.nio.ch.IOUtil.read(IOUtil.java:197)
    at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380)
    at org.apache.hadoop.ipc.Server.channelRead(Server.java:2603)
    at org.apache.hadoop.ipc.Server.access$2800(Server.java:136)
    at org.apache.hadoop.ipc.Server$Connection.readAndProcess(Server.java:1481)
    at org.apache.hadoop.ipc.Server$Listener.doRead(Server.java:771)
    at org.apache.hadoop.ipc.Server$Listener$Reader.doRunLoop(Server.java:637)
    at org.apache.hadoop.ipc.Server$Listener$Reader.run(Server.java:608)
Slave (DataNode) log exception
2017-06-05 09:49:40,464 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2017-06-05 09:49:41,464 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2017-06-05 09:49:42,465 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2017-06-05 09:49:43,467 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2017-06-05 09:49:44,468 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2017-06-05 09:49:45,470 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2017-06-05 09:49:46,472 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2017-06-05 09:49:47,474 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
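Description
My Hadoop cluster consists of master (NameNode) and three slaves: slave1, slave2, and slave3. The slave log shows retries against 0.0.0.0:8031 rather than master, which suggests the slaves do not know where the ResourceManager is.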
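The retries against 0.0.0.0:8031 in the slave log are the telling symptom: 0.0.0.0 is the default value of yarn.resourcemanager.hostname, and 8031 is the default resource-tracker port, so a node that logs this most likely has no ResourceManager host configured. As a rough sketch (not part of my original setup), the following stdlib-only Java scans log lines for that pattern. The class name WildcardRetryCheck and the method wildcardRetryPorts are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: flags IPC retries aimed at the wildcard address 0.0.0.0.
class WildcardRetryCheck {
    // Matches "Retrying connect to server: <host>/<ip>:<port>" from the Hadoop IPC client.
    private static final Pattern RETRY = Pattern.compile(
            "Retrying connect to server:\\s*([^/\\s]+)/([0-9.]+):(\\d+)");

    /** Returns the port of every retry attempt whose target IP is 0.0.0.0. */
    static List<Integer> wildcardRetryPorts(List<String> logLines) {
        List<Integer> ports = new ArrayList<>();
        for (String line : logLines) {
            Matcher m = RETRY.matcher(line);
            if (m.find() && "0.0.0.0".equals(m.group(2))) {
                ports.add(Integer.parseInt(m.group(3)));
            }
        }
        return ports;
    }

    public static void main(String[] args) {
        // One line taken from the slave log, one made-up healthy line for contrast.
        List<String> sample = Arrays.asList(
                "2017-06-05 09:49:40,464 INFO org.apache.hadoop.ipc.Client: "
                        + "Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 2 time(s)",
                "INFO org.apache.hadoop.ipc.Client: "
                        + "Retrying connect to server: master/192.168.56.100:8031. Already tried 0 time(s)");
        System.out.println("Wildcard retry ports: " + wildcardRetryPorts(sample));
    }
}
```

Any port reported here points at a service address that fell back to its default, so the corresponding yarn-site.xml on that node is the first thing to check.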
Solution
Add the following to yarn-site.xml on every slave machine. I had originally added it only on the master.
<configuration>
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>master</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
</configuration>
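Since the root cause was a yarn-site.xml that differed between master and slaves, it may be worth checking each node's file before submitting a job. The following is a minimal sketch using only the JDK's XML parser, not Hadoop's own Configuration class. The class YarnSiteCheck and its methods are hypothetical names, and the file path passed on the command line is whatever your installation uses.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper: checks that a yarn-site.xml names the ResourceManager host.
class YarnSiteCheck {
    /** Reads every <property> name/value pair from Hadoop-style XML text. */
    static Map<String, String> readProperties(String xmlText) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xmlText.getBytes(StandardCharsets.UTF_8)));
            NodeList props = doc.getElementsByTagName("property");
            Map<String, String> result = new HashMap<>();
            for (int i = 0; i < props.getLength(); i++) {
                Element p = (Element) props.item(i);
                String name = childText(p, "name");
                if (name != null) {
                    result.put(name, childText(p, "value"));
                }
            }
            return result;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse configuration XML", e);
        }
    }

    // Returns the trimmed text of the first child element with the given tag, or null.
    private static String childText(Element parent, String tag) {
        NodeList nodes = parent.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? null : nodes.item(0).getTextContent().trim();
    }

    /** True when the ResourceManager host is set to something other than 0.0.0.0. */
    static boolean hasResourceManagerHost(Map<String, String> props) {
        String host = props.get("yarn.resourcemanager.hostname");
        return host != null && !host.isEmpty() && !"0.0.0.0".equals(host);
    }

    public static void main(String[] args) throws Exception {
        // With an argument, check that file; otherwise fall back to an inline sample.
        String xml = args.length > 0
                ? new String(Files.readAllBytes(Paths.get(args[0])), StandardCharsets.UTF_8)
                : "<configuration><property><name>yarn.resourcemanager.hostname</name>"
                        + "<value>master</value></property></configuration>";
        System.out.println("ResourceManager host configured: "
                + hasResourceManagerHost(readProperties(xml)));
    }
}
```

Running this against the yarn-site.xml of each slave would have flagged the missing yarn.resourcemanager.hostname entry without having to dig through the NodeManager logs.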
Developing a MapReduce program on Windows and submitting it remotely to a Hadoop cluster: YARN scheduler exception