This installation is deployed in a development/experimental environment and covers only the YARN global resource management and scheduling system. HDFS remains the first-generation layout; neither HDFS Federation nor HDFS HA is deployed here (they will be added later).
OS: CentOS Linux release 6.0 (Final) x86_64
Machines to deploy:
dev80.hadoop 192.168.7.80
dev81.hadoop 192.168.7.81
dev82.hadoop 192.168.7.82
dev83.hadoop 192.168.7.83
dev80 mainly serves as the ResourceManager, NameNode, and SecondaryNameNode; the slave nodes (DataNode and NodeManager) are dev80, dev81, dev82, and dev83.
First, install the JDK and make sure the master can SSH to each slave node without a password, since the start-up scripts rely on it.
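A minimal sketch of that SSH setup, assuming everything runs as a hadoop user and sshd listens on port 58422 (the port configured in hadoop-env.sh below):

ssh-keygen -t rsa -P "" -f ~/.ssh/id_rsa    # generate a key pair once on dev80
for host in dev80.hadoop dev81.hadoop dev82.hadoop dev83.hadoop; do
    # append the public key to each slave's authorized_keys (password prompted this one time)
    cat ~/.ssh/id_rsa.pub | ssh -p 58422 hadoop@$host 'mkdir -p ~/.ssh && cat >> ~/.ssh/authorized_keys'
done
ssh -p 58422 hadoop@dev81.hadoop hostname   # should print the hostname without a password prompt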
Download the 2.0.5-alpha version from the Hadoop website (the latest packaged release; the beta branch has been cut from trunk but has to be built from source):
wget http://apache.fayea.com/apache-mirror/hadoop/common/hadoop-2.0.5-alpha/hadoop-2.0.5-alpha.tar.gz
tar xzvf hadoop-2.0.5-alpha.tar.gz
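The configuration below assumes the tree ends up under /usr/local/hadoop, so one way to place it (the target path is just this article's convention):

mkdir -p /usr/local/hadoop
mv hadoop-2.0.5-alpha /usr/local/hadoop/
chown -R hadoop:hadoop /usr/local/hadoop/hadoop-2.0.5-alpha
cd /usr/local/hadoop/hadoop-2.0.5-alpha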
After extracting it, you can see that the directory layout has changed a lot compared with Hadoop 1.0 and now closely resembles the Linux root filesystem structure: client commands live under bin, the administrator (server-side) start-up commands are under sbin ("super bin"), and the configuration files are unified under etc/hadoop. On top of the original configs there are now a yarn-site.xml and a yarn-env.sh; YARN can be started with sbin/yarn-daemon.sh and sbin/yarn-daemons.sh (the latter starts the services on multiple slaves):
drwxr-xr-x 2 hadoop hadoop 4096 Aug 16 18:18 bin
drwxr-xr-x 3 hadoop hadoop 4096 Aug 16 10:27 etc
drwxr-xr-x 2 hadoop hadoop 4096 Aug 16 10:27 include
drwxr-xr-x 3 hadoop hadoop 4096 Aug 16 10:27 lib
drwxr-xr-x 2 hadoop hadoop 4096 Aug 16 15:58 libexec
drwxrwxr-x 3 hadoop hadoop 4096 Aug 16 18:15 logs
drwxr-xr-x 2 hadoop hadoop 4096 Aug 16 18:25 sbin
drwxr-xr-x 4 hadoop hadoop 4096 Aug 16 10:27 share
Configuration
Add export HADOOP_HOME=/usr/local/hadoop/hadoop-2.0.5-alpha to the /etc/profile file, so it is loaded into the system environment variables at login.
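A minimal sketch of the relevant /etc/profile lines; extending PATH is an optional convenience, not required by anything below:

export HADOOP_HOME=/usr/local/hadoop/hadoop-2.0.5-alpha
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin

Then reload it for the current shell with source /etc/profile.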
Set JAVA_HOME and the SSH options in hadoop-env.sh (note that ssh takes a lowercase -p for the port):
export JAVA_HOME=/usr/local/jdk
export HADOOP_SSH_OPTS="-p 58422"
The slaves file contains the following nodes:
dev80.hadoop
dev81.hadoop
dev82.hadoop
dev83.hadoop
core-site.xml:
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://dev80.hadoop:8020</value>
    <final>true</final>
  </property>
</configuration>
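As an aside, fs.default.name is deprecated in Hadoop 2.x in favor of fs.defaultFS; the old key still works but logs a deprecation warning, so the equivalent modern form would be:

<property>
  <name>fs.defaultFS</name>
  <value>hdfs://dev80.hadoop:8020</value>
</property>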
hdfs-site.xml sets the directories where the NameNode stores the edit log and fsimage, and the directory where the DataNode stores its blocks:
<configuration>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/data/yarn/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/data/yarn/data</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
</configuration>
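These directories should be writable by the user running the daemons, so one way to pre-create them on every node (a sketch, reusing the hadoop user and SSH port assumed above):

for host in dev80.hadoop dev81.hadoop dev82.hadoop dev83.hadoop; do
    ssh -p 58422 hadoop@$host 'mkdir -p /data/yarn/name /data/yarn/data'
done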
yarn-site.xml: the shuffle part of YARN has been split out into a service that must be started as an auxiliary service when the NodeManager starts. This lets third parties plug in a custom ShuffleProvider and ShuffleConsumer, for example replacing the current HTTP shuffle with an RDMA shuffle, or applying a more appropriate merge strategy for intermediate results, to achieve better performance.
<configuration>
  <!-- Site specific YARN configuration properties -->
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>dev80.hadoop:9080</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>dev80.hadoop:9081</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>dev80.hadoop:9082</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce.shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
</configuration>
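For later reference, once the daemons are running the ResourceManager web UI gives a quick sanity check; assuming the default webapp port (8088 in Hadoop 2.x) has not been overridden:

curl http://dev80.hadoop:8088/cluster    # or open it in a browser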
mapred-site.xml needs mapreduce.framework.name set to yarn so that MR jobs are submitted to the ResourceManager:
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
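Note that the tarball may ship only a template for this file, so it typically has to be created first (an assumption about the 2.0.5-alpha layout):

cp etc/hadoop/mapred-site.xml.template etc/hadoop/mapred-site.xml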
rsync the above configuration files to the same path on each slave node.
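A minimal sketch of that sync, reusing the SSH port from hadoop-env.sh (the loop and paths are assumptions):

for host in dev81.hadoop dev82.hadoop dev83.hadoop; do
    rsync -avz -e "ssh -p 58422" $HADOOP_HOME/etc/hadoop/ hadoop@$host:$HADOOP_HOME/etc/hadoop/
done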
Start the services
Start HDFS first. Format the NameNode:
bin/hdfs namenode -format
After this command executes, /data/yarn/name is formatted.
Start the NameNode:
sbin/hadoop-daemon.sh start namenode
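jps from the JDK is a quick way to confirm the daemon came up, and the remaining daemons follow the same pattern; a sketch of the obvious next steps, not part of the original walkthrough:

jps                                          # should now list a NameNode process
sbin/hadoop-daemons.sh start datanode        # DataNodes on all slaves
sbin/yarn-daemon.sh start resourcemanager    # ResourceManager on dev80
sbin/yarn-daemons.sh start nodemanager       # NodeManagers on all slaves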