I. Under Win7
(i) Installation environment and installation packages
Win7
JDK 7
eclipse-java-juno-SR2-win32.zip
hadoop-2.2.0.tar.gz
hadoop-eclipse-plugin-2.2.0.jar
hadoop-common-2.2.0-bin.rar
(ii) Installation
The JDK, Eclipse, and Hadoop in pseudo-distributed mode are assumed to be installed already.
1. Copy the hadoop-eclipse-plugin-2.2.0.jar plugin into the plugins subdirectory of the Eclipse installation directory, then restart Eclipse.
2. Set the environment variables.
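As a sketch of this step, these are the two variables the Hadoop command-line tools (and winutils on Windows) rely on. The install path /opt/hadoop-2.2.0 is an assumption; on Windows, set the same HADOOP_HOME and PATH variables through the Environment Variables dialog instead:

```shell
# Assumed install path - replace with wherever hadoop-2.2.0 was extracted.
export HADOOP_HOME=/opt/hadoop-2.2.0
# Put the Hadoop launch scripts (and winutils.exe on Windows) on the PATH.
export PATH="$PATH:$HADOOP_HOME/bin"
echo "$HADOOP_HOME"
```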
3. Configure the Hadoop installation directory in Eclipse.
Extract hadoop-2.2.0.tar.gz.
4. Extract hadoop-common-2.2.0-bin.rar and copy the files inside into the bin folder of the Hadoop installation directory.
(iii) Running MapReduce on YARN under Win7
Create a new project
Click Window -> Show View -> Map/Reduce Locations.
Click New Hadoop location ...
Add the following configuration and click Finish.
From there, you can browse the contents of HDFS.
Writing a MapReduce program
In the src directory, add a file log4j.properties with the following contents:
log4j.rootLogger=debug,appender1
log4j.appender.appender1=org.apache.log4j.ConsoleAppender
log4j.appender.appender1.layout=org.apache.log4j.TTCCLayout
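The WordCount source itself is not listed in this post. Purely as an illustration, its map -> shuffle -> reduce flow can be imitated with a plain shell pipeline (this simulates the semantics and does not use Hadoop; the sample input is an assumption chosen to be consistent with the counters and output of the YARN run shown below):

```shell
# Three input lines, six words, four distinct words.
printf 'hello you\nhello me\nhello hadoop\n' > input.txt

# map: emit one <word, 1> pair per word; shuffle: sort groups equal keys;
# reduce: sum the counts for each key; final sort fixes the output order.
awk '{for (i = 1; i <= NF; i++) print $i "\t1"}' input.txt \
  | sort \
  | awk -F'\t' '{sum[$1] += $2} END {for (w in sum) print w "\t" sum[w]}' \
  | sort > part-r-00000

# prints: hadoop 1, hello 3, me 1, you 1 (tab-separated, one per line)
cat part-r-00000
```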
Run the program and view the results.
II. Under Linux
(i) MapReduce on YARN under Linux
Run
[root@liguodong Documents]# yarn jar test.jar hdfs://liguodong:8020/hello hdfs://liguodong:8020/output
15/05/03 ... INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
15/05/03 ... INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1430648117067_0001
15/05/03 ... INFO impl.YarnClientImpl: Submitted application application_1430648117067_0001 to ResourceManager at /0.0.0.0:8032
15/05/03 ... INFO mapreduce.Job: The url to track the job: http://liguodong:8088/proxy/application_1430648117067_0001/
15/05/03 ... INFO mapreduce.Job: Running job: job_1430648117067_0001
15/05/03 ... INFO mapreduce.Job: Job job_1430648117067_0001 running in uber mode : false
15/05/03 ... INFO mapreduce.Job:  map 0% reduce 0%
15/05/03 ... INFO mapreduce.Job:  map 100% reduce 0%
15/05/03 ... INFO mapreduce.Job:  map 100% reduce 100%
15/05/03 ... INFO mapreduce.Job: Job job_1430648117067_0001 completed successfully
15/05/03 ... INFO mapreduce.Job: Counters:
	File System Counters
		FILE: Number of bytes read=98
		FILE: Number of bytes written=157289
		FILE: Number of read operations=0
		FILE: Number of large read operations=0
		FILE: Number of write operations=0
		HDFS: Number of bytes read=124
		HDFS: Number of read operations=6
		HDFS: Number of large read operations=0
		HDFS: Number of write operations=2
	Job Counters
		Launched map tasks=1
		Launched reduce tasks=1
		Data-local map tasks=1
		Total time spent by all maps in occupied slots (ms)=16924
		Total time spent by all reduces in occupied slots (ms)=3683
	Map-Reduce Framework
		Map input records=3
		Map output records=6
		Map output materialized bytes=98
		Combine input records=0
		Combine output records=0
		Reduce input groups=4
		Reduce shuffle bytes=98
		Reduce input records=6
		Reduce output records=4
		Shuffled Maps=1
		Failed Shuffles=0
		Merged Map outputs=1
		CPU time spent (ms)=12010
		Physical memory (bytes) snapshot=211070976
		Virtual memory (bytes) snapshot=777789440
		Total committed heap usage (bytes)=130879488
	Shuffle Errors
		BAD_ID=0
		CONNECTION=0
		IO_ERROR=0
		WRONG_LENGTH=0
		WRONG_MAP=0
		WRONG_REDUCE=0
View Results
[root@liguodong Documents]# hdfs dfs -ls /
Found 3 items
-rw-r--r--   2 root supergroup         2015-05-03 03:15 /hello
drwxr-xr-x   - root supergroup       0 2015-05-03 03:16 /output
drwx------   - root supergroup       0 2015-05-03 03:16 /tmp
[root@liguodong Documents]# hdfs dfs -ls /output
Found 2 items
-rw-r--r--   2 root supergroup       0 2015-05-03 03:16 /output/_SUCCESS
-rw-r--r--   2 root supergroup         2015-05-03 03:16 /output/part-r-00000
[root@liguodong Documents]# hdfs dfs -text /output/pa*
hadoop	1
hello	3
me	1
you	1
Problems encountered
The job failed with an error along the lines of: File /output/... could only be replicated to 0 nodes instead of minReplication (=1). There are 1 datanode(s) running and no node(s) are excluded in this operation.
I tried many fixes found online, but none of them worked. Going by the literal meaning of the message (the block was replicated to 0 copies, not even the minimum of one), I first set dfs.replication.min to 0; unfortunately, at run time the value must be greater than 0, so I changed it back to 1. I then added a few extra paths to dfs.datanode.data.dir, so that the single machine keeps several data directories, and after that the job succeeded.
To do this, add the following configuration to hdfs-site.xml.
<property>
  <name>dfs.datanode.data.dir</name>
  <value>file://${hadoop.tmp.dir}/dfs/dn,file://${hadoop.tmp.dir}/dfs/dn1,file://${hadoop.tmp.dir}/dfs/dn2</value>
</property>
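The value is a single comma-separated list, and the datanode treats every entry as an independent storage directory. A quick way to see how the list splits (plain shell, Hadoop not required; the paths are the ones from the property above with ${hadoop.tmp.dir} left symbolic):

```shell
dirs='file://${hadoop.tmp.dir}/dfs/dn,file://${hadoop.tmp.dir}/dfs/dn1,file://${hadoop.tmp.dir}/dfs/dn2'
# Split on commas: one storage directory per line.
printf '%s\n' "$dirs" | tr ',' '\n'
```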
(ii) MapReduce in local mode under Linux
Add the following configuration to mapred-site.xml.
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>local</value>
  </property>
</configuration>
In this mode you do not need to start the ResourceManager or the NodeManager.
Run
[root@liguodong Documents]# hadoop jar test.jar hdfs://liguodong:8020/hello hdfs://liguodong:8020/output
III. MapReduce run modes
MapReduce can run in several modes; the mode is selected in mapred-site.xml.
1) Local run mode (default)
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>local</value>
  </property>
</configuration>
2) Running on YARN
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
IV. Uber Mode
Uber mode is an optimization in Hadoop 2.x for small MapReduce jobs: the whole job runs inside the ApplicationMaster's JVM, reusing one JVM instead of launching a separate container per task.
A small job is one whose input data is smaller than one HDFS block (128 MB by default).
Uber mode is disabled by default.
To enable it, add the following to mapred-site.xml:
<property>
  <name>mapreduce.job.ubertask.enable</name>
  <value>true</value>
</property>
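Whether a job actually runs uberized also depends on further thresholds such as mapreduce.job.ubertask.maxmaps, mapreduce.job.ubertask.maxreduces, and mapreduce.job.ubertask.maxbytes (the last is tied to the block size by default). The basic "smaller than one block" test can be sketched without Hadoop:

```shell
# "Small" in the uber-mode sense: input smaller than one HDFS block.
BLOCK_SIZE=$((128 * 1024 * 1024))   # 128 MB, the default dfs.blocksize
printf 'hello hadoop\n' > input.txt
FILE_SIZE=$(wc -c < input.txt)
if [ "$FILE_SIZE" -lt "$BLOCK_SIZE" ]; then
  echo "eligible for uber mode"
fi
```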
Win7: installing the Hadoop 2.x Eclipse plugin and running MapReduce programs on Win7/Linux