Preparatory work such as setting up passwordless SSH between the nodes is omitted here; this post covers only the configuration file contents and the startup process, using the servers 192.168.157.100~105 as an example:
1. core-site.xml:
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://hadoop-kf100.jd.com:8020</value>
</property>
<property>
<name>io.file.buffer.size</name>
<value>131072</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/usr/local/hadoop/tmp/hadoop-${user.name}</value>
</property>
<property>
<name>hadoop.proxyuser.hadoop.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.hadoop.groups</name>
<value>*</value>
</property>
</configuration>
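A quick way to confirm that a change to core-site.xml is actually being picked up is to query the effective value of a key; this sketch assumes the Hadoop binaries are on the PATH:
# print the value of fs.defaultFS as Hadoop actually resolves it
hdfs getconf -confKey fs.defaultFS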
2. hadoop-env.sh:
Add the JDK installation directory: export JAVA_HOME=/export/servers/jdk1.6.0_25
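Before starting any daemons it is worth a quick sanity check that the configured JDK exists and runs (the path below is this cluster's; substitute your own):
# verify the JDK that hadoop-env.sh points at
/export/servers/jdk1.6.0_25/bin/java -version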
3. hdfs-site.xml:
<configuration>
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>hadoop-kf100.jd.com:9001</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>/usr/local/hadoop/dfs/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/usr/local/hadoop/dfs/data</value>
</property>
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
<property>
<name>dfs.permissions</name>
<value>false</value>
</property>
<property>
<name>dfs.webhdfs.enabled</name>
<value>true</value>
</property>
</configuration>
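Since dfs.webhdfs.enabled is true, the REST API gives an easy smoke test once HDFS is running; this sketch assumes the default NameNode HTTP port 50070:
# list the HDFS root directory over WebHDFS
curl -i "http://hadoop-kf100.jd.com:50070/webhdfs/v1/?op=LISTSTATUS"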
4. mapred-site.xml:
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>hadoop-kf100.jd.com:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>hadoop-kf100.jd.com:19888</value>
</property>
<property>
<name>mapreduce.map.java.opts</name>
<value>-Xmx768M</value>
</property>
<property>
<name>mapreduce.reduce.java.opts</name>
<value>-Xmx1024M</value>
</property>
</configuration>
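With mapreduce.framework.name set to yarn, one of the bundled example jobs makes a handy end-to-end check once the cluster is started; the jar path below assumes the 2.2.0 install under /usr/local/hadoop-2.2.0:
# run the pi estimator on YARN; when it finishes, the run should
# appear in the JobHistory web UI at hadoop-kf100.jd.com:19888
hadoop jar /usr/local/hadoop-2.2.0/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar pi 5 10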
5. slaves:
hadoop-kf101.jd.com
hadoop-kf102.jd.com
hadoop-kf103.jd.com
hadoop-kf104.jd.com
hadoop-kf105.jd.com
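The slaves file is what hadoop-daemons.sh and start-yarn.sh iterate over via SSH, so the passwordless logins mentioned at the top must work for every host listed in it; a quick loop to confirm (path assumes the same install directory):
# each line should print the slave's hostname without a password prompt
for h in $(cat /usr/local/hadoop-2.2.0/etc/hadoop/slaves); do ssh "$h" hostname; done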
6. yarn-site.xml:
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>hadoop-kf100.jd.com:8032</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>hadoop-kf100.jd.com:8030</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>hadoop-kf100.jd.com:8031</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address</name>
<value>hadoop-kf100.jd.com:8033</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>hadoop-kf100.jd.com:8088</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.class</name>
<value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
</property>
<property>
<name>yarn.scheduler.fair.allocation.file</name>
<value>/usr/local/hadoop-2.2.0/etc/hadoop/fair-scheduler.xml</value>
</property>
<property>
<name>yarn.log-aggregation-enable</name>
<value>true</value>
</property>
</configuration>
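Once YARN is up, you can confirm that all five NodeManagers registered with the ResourceManager:
# list registered NodeManagers; the same view is on the RM web UI
# at http://hadoop-kf100.jd.com:8088
yarn node -list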
7. fair-scheduler.xml (limits the resources and the number of running applications per queue and per user):
<allocations>
<queue name="erpmerge">
<minResources>671193 mb, 378 vcores</minResources>
<maxResources>851151 mb, 480 vcores</maxResources>
<maxRunningApps>200</maxRunningApps>
<weight>1.0</weight>
<schedulingPolicy>fair</schedulingPolicy>
</queue>
<user name="erpmerge">
<maxRunningApps>200</maxRunningApps>
</user>
<queue name="MART_CFO">
<minResources>671193 mb, 378 vcores</minResources>
<maxResources>851151 mb, 480 vcores</maxResources>
<maxRunningApps>200</maxRunningApps>
<weight>1.0</weight>
<schedulingPolicy>fair</schedulingPolicy>
</queue>
<user name="MART_CFO">
<maxRunningApps>200</maxRunningApps>
</user>
<userMaxAppsDefault>100</userMaxAppsDefault>
<defaultQueueSchedulingPolicy>fair</defaultQueueSchedulingPolicy>
</allocations>
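The Fair Scheduler periodically re-reads this allocation file, so queue changes take effect without restarting the ResourceManager. To exercise a queue directly, a sample job can be submitted to it (the pi example accepts generic options, and the queue name here matches the one defined above):
# submit the pi example to the erpmerge queue
hadoop jar /usr/local/hadoop-2.2.0/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar pi -Dmapreduce.job.queuename=erpmerge 5 10
The running job should then show up under its queue on the scheduler page at http://hadoop-kf100.jd.com:8088/cluster/scheduler.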
The configuration files are identical on every node. The startup process is as follows (on a brand-new cluster, format the NameNode once with hdfs namenode -format before the first start):
NameNode: sh hadoop-daemon.sh start namenode
DataNodes: sh hadoop-daemons.sh start datanode
JobHistory Server: sh mr-jobhistory-daemon.sh start historyserver
ResourceManager and NodeManagers: sh start-yarn.sh
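A quick way to verify the cluster came up cleanly (jps ships with the JDK):
# on the master, jps should show NameNode, ResourceManager, and JobHistoryServer;
# on each slave, DataNode and NodeManager
jps
# HDFS health: should report 5 live datanodes
hdfs dfsadmin -report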