Installing and configuring YARN


This installation is for a development/test environment and covers only the global resource management and scheduling system, YARN. HDFS stays on the first-generation setup; HDFS Federation and HDFS HA are not deployed here and will be added later.

OS: CentOS Linux release 6.0 (Final) x86_64

Machines to deploy:

dev80.hadoop 192.168.7.80

dev81.hadoop 192.168.7.81

dev82.hadoop 192.168.7.82

dev83.hadoop 192.168.7.83

dev80 mainly serves as the ResourceManager, NameNode, and SecondaryNameNode; the slave nodes (running DataNode and NodeManager) are dev80, dev81, dev82, and dev83.

First, install the JDK and make sure the master can SSH to each slave node.
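For reference, passwordless SSH from dev80 to every node might be set up roughly as follows (a minimal sketch; the hadoop user and the non-default SSH port 58422 configured later in hadoop-env.sh are assumptions about this environment):

ssh-keygen -t rsa -P "" -f ~/.ssh/id_rsa
for host in dev80.hadoop dev81.hadoop dev82.hadoop dev83.hadoop; do
    # push the public key to each node over the cluster's SSH port (assumed 58422)
    cat ~/.ssh/id_rsa.pub | ssh -p 58422 hadoop@${host} "mkdir -p ~/.ssh && cat >> ~/.ssh/authorized_keys"
done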

Download the 2.0.5-alpha release from the Hadoop website (the latest packaged release; the beta has already been branched from trunk, but you have to build it yourself).

wget http://apache.fayea.com/apache-mirror/hadoop/common/hadoop-2.0.5-alpha/hadoop-2.0.5-alpha.tar.gz  
tar xzvf hadoop-2.0.5-alpha.tar.gz

After extracting it, you can see that the directory layout has changed a lot compared with Hadoop 1.0 and now closely resembles the Linux root filesystem layout: client startup commands live under bin, the administrator's server-side startup commands live under sbin, and the configuration files are gathered under etc/hadoop. On top of the original configuration there are now yarn-site.xml and yarn-env.sh, and YARN can be started with sbin/yarn-daemon.sh and sbin/yarn-daemons.sh (the latter starts the service on multiple slaves).

drwxr-xr-x 2 hadoop hadoop 4096 Aug 16 18:18 bin
drwxr-xr-x 3 hadoop hadoop 4096 Aug 16 10:27 etc
drwxr-xr-x 2 hadoop hadoop 4096 Aug 16 10:27 include
drwxr-xr-x 3 hadoop hadoop 4096 Aug 16 10:27 lib
drwxr-xr-x 2 hadoop hadoop 4096 Aug 16 15:58 libexec
drwxrwxr-x 3 hadoop hadoop 4096 Aug 16 18:15 logs
drwxr-xr-x 2 hadoop hadoop 4096 Aug 16 18:25 sbin
drwxr-xr-x 4 hadoop hadoop 4096 Aug 16 10:27 share

Configuration

Add export HADOOP_HOME=/usr/local/hadoop/hadoop-2.0.5-alpha to the /etc/profile file so that it is loaded into the system environment variables at startup.
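For example (a minimal sketch; appending bin and sbin to PATH is an extra convenience, not something the original text requires):

# append to /etc/profile
export HADOOP_HOME=/usr/local/hadoop/hadoop-2.0.5-alpha
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin

# reload the profile in the current shell
source /etc/profile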

Set JAVA_HOME and the SSH options in hadoop-env.sh:

export JAVA_HOME=/usr/local/jdk
export HADOOP_SSH_OPTS="-p 58422"

Add the following nodes to the slaves file:

dev80.hadoop
dev81.hadoop  
dev82.hadoop  
dev83.hadoop

core-site.xml

<configuration>
        <property>
                <name>fs.default.name</name>
                <value>hdfs://dev80.hadoop:8020</value>
                <final>true</final>
        </property>
</configuration>

hdfs-site.xml sets the directories where the NameNode stores its edit log and fsimage, and the directory where the DataNode stores blocks:

<configuration>
        <property>
                <name>dfs.namenode.name.dir</name>
                <value>/data/yarn/name</value>
        </property>
        <property>
                <name>dfs.datanode.data.dir</name>
                <value>/data/yarn/data</value>
        </property>
        <property>
                <name>dfs.replication</name>
                <value>3</value>
        </property>
</configuration>

In yarn-site.xml, the shuffle part of YARN has been split out into a service that must be started as an auxiliary service when the NodeManager starts. This makes it possible to plug in a third-party shuffle provider and shuffle consumer, for example replacing the current HTTP shuffle with an RDMA shuffle, or applying a more suitable merge strategy for intermediate results, to achieve better performance.

<configuration>
        <!-- Site specific YARN configuration properties -->
        <property>
                <name>yarn.resourcemanager.address</name>
                <value>dev80.hadoop:9080</value>
        </property>
        <property>
                <name>yarn.resourcemanager.scheduler.address</name>
                <value>dev80.hadoop:9081</value>
        </property>
        <property>
                <name>yarn.resourcemanager.resource-tracker.address</name>
                <value>dev80.hadoop:9082</value>
        </property>
        <property>
                <name>yarn.nodemanager.aux-services</name>
                <value>mapreduce.shuffle</value>
        </property>
        <property>
                <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
                <value>org.apache.hadoop.mapred.ShuffleHandler</value>
        </property>
</configuration>

In mapred-site.xml, mapreduce.framework.name must be set to yarn so that MR jobs are submitted to the ResourceManager:

<configuration>  
        <property>  
                <name>mapreduce.framework.name</name>  
                <value>yarn</value>  
        </property>  
</configuration>

rsync the configuration files above to each slave node.
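For example (a rough sketch; it reuses HADOOP_HOME from /etc/profile and the SSH port from hadoop-env.sh above, and assumes the same directory layout on every node):

for host in dev81.hadoop dev82.hadoop dev83.hadoop; do
    # copy etc/hadoop to the same location on each slave
    rsync -av -e "ssh -p 58422" $HADOOP_HOME/etc/hadoop/ ${host}:$HADOOP_HOME/etc/hadoop/
done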

Start the services

Start HDFS first.

bin/hdfs namenode -format

After this command finishes, the /data/yarn/name directory has been formatted.

Start the NameNode:

sbin/hadoop-daemon.sh start namenode
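The remaining daemons can then be brought up with the scripts mentioned earlier; a rough sketch only, since the exact order and options here are assumptions rather than part of the source:

sbin/hadoop-daemons.sh start datanode        # DataNode on every host in the slaves file
sbin/yarn-daemon.sh start resourcemanager    # ResourceManager on dev80
sbin/yarn-daemons.sh start nodemanager       # NodeManager on every slave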

