Hadoop 2.7.1 high-availability installation and configuration based on QJM


1. Modify the host names and the hosts file

10.205.22.185 nn1 (active)  : namenode, resourcemanager, datanode, zk, hive, sqoop
10.205.22.186 nn2 (standby) : namenode, resourcemanager, datanode, zk
10.205.22.187 dn1           : datanode, zk
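
On systemd-based distributions, for example, the host name can be set with hostnamectl and the mappings appended to /etc/hosts on every node (a minimal sketch; substitute your own addresses):

hostnamectl set-hostname nn1   # run on 10.205.22.185; use nn2 / dn1 on the other two machines
cat >> /etc/hosts <<EOF
10.205.22.185 nn1
10.205.22.186 nn2
10.205.22.187 dn1
EOF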

1.1 Configure passwordless SSH login

The master node must be able to log in to each node without a password. Verify with:

ssh nn1
ssh nn2
ssh dn1
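
One common way to set this up, assuming the root account is used throughout (the fencing key configured later is /root/.ssh/id_rsa), is to generate a key pair on nn1 and copy it to every node:

ssh-keygen -t rsa -P "" -f ~/.ssh/id_rsa
ssh-copy-id root@nn1
ssh-copy-id root@nn2
ssh-copy-id root@dn1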

2. Install JDK 1.8, ZooKeeper, Hive, and Sqoop in advance

2.1 Modify the profile file and configure environment variables

export JAVA_HOME=/usr/java/jdk1.8.0_65
export JRE_HOME=/usr/java/jdk1.8.0_65/jre
export HADOOP_HOME=/app/hadoop-2.7.1
export HIVE_HOME=/app/hive
export SQOOP_HOME=/app/sqoop
export ZOOKEEPER_HOME=/app/zookeeper-3.4.6
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/sbin:$ZOOKEEPER_HOME/bin:$HIVE_HOME/bin:$SQOOP_HOME/bin:$MAVEN_HOME/bin
export CLASSPATH=.:$JAVA_HOME/lib:$JRE_HOME/lib
ulimit -SHn 65536
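
After editing the profile (assuming the variables were placed in /etc/profile), reload it and do a quick sanity check:

source /etc/profile
java -version
echo $HADOOP_HOME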

2.2 Modify the ZooKeeper configuration file zoo.cfg

Add:

server.1=nn1:2888:3888
server.2=nn2:2888:3888
server.3=dn1:2888:3888
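
Each ZooKeeper node also needs a myid file in its dataDir whose number matches its server.N entry above. A sketch, assuming dataDir in zoo.cfg points to /home/hadoop/zookeeper/data (substitute the dataDir you actually configured):

mkdir -p /home/hadoop/zookeeper/data
echo 1 > /home/hadoop/zookeeper/data/myid   # use 2 on nn2 and 3 on dn1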

3. Install hadoop-2.7.1 and modify the configuration files

Create the required directories

mkdir -p /home/hadoop/tmp
mkdir -p /home/hadoop/hdfs/data
mkdir -p /home/hadoop/journal
mkdir -p /home/hadoop/name

Modify the slaves file

nn1
nn2
dn1

Modify the hadoop-env.sh file

export JAVA_HOME=/usr/java/jdk1.8.0_65

3.1 Configure hdfs-site.xml

<configuration>
  <property>
    <name>dfs.nameservices</name>
    <value>masters</value>
  </property>
  <property>
    <name>dfs.ha.namenodes.masters</name>
    <value>nn1,nn2</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.masters.nn1</name>
    <value>nn1:9000</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.masters.nn1</name>
    <value>nn1:50070</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.masters.nn2</name>
    <value>nn2:9000</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.masters.nn2</name>
    <value>nn2:50070</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/home/hadoop/hdfs/data</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/home/hadoop/name</value>
  </property>
  <property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://nn1:8485;nn2:8485;dn1:8485/masters</value>
  </property>
  <property>
    <name>dfs.journalnode.edits.dir</name>
    <value>/home/hadoop/journal</value>
  </property>
  <property>
    <name>dfs.ha.automatic-failover.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>dfs.client.failover.proxy.provider.masters</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>
  <property>
    <name>dfs.ha.fencing.methods</name>
    <value>sshfence</value>
  </property>
  <property>
    <name>dfs.ha.fencing.ssh.private-key-files</name>
    <value>/root/.ssh/id_rsa</value>
  </property>
  <property>
    <name>dfs.ha.fencing.ssh.connect-timeout</name>
    <value>30000</value>
  </property>
</configuration>

3.2 Configure the core-site.xml file

<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://masters</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/hadoop/tmp</value>
  </property>
  <property>
    <name>ha.zookeeper.quorum</name>
    <value>nn1:2181,nn2:2181,dn1:2181</value>
  </property>
  <property>
    <name>io.compression.codecs</name>
    <value>org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.DefaultCodec,com.hadoop.compression.lzo.LzoCodec,com.hadoop.compression.lzo.LzopCodec,org.apache.hadoop.io.compress.BZip2Codec</value>
  </property>
  <property>
    <name>io.compression.codec.lzo.class</name>
    <value>com.hadoop.compression.lzo.LzoCodec</value>
  </property>
</configuration>

3.3 Configure the yarn-site.xml file

<configuration>
  <property>
    <name>yarn.resourcemanager.ha.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.resourcemanager.cluster-id</name>
    <value>rm-cluster</value>
  </property>
  <property>
    <name>yarn.resourcemanager.ha.rm-ids</name>
    <value>rm1,rm2</value>
  </property>
  <property>
    <name>yarn.resourcemanager.ha.automatic-failover.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.resourcemanager.ha.automatic-failover.embedded</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname.rm1</name>
    <value>nn1</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname.rm2</name>
    <value>nn2</value>
  </property>
  <property>
    <name>yarn.resourcemanager.store.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
  </property>
  <property>
    <name>yarn.resourcemanager.zk-address</name>
    <value>nn1:2181,nn2:2181,dn1:2181</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address.rm1</name>
    <value>nn1:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address.rm2</name>
    <value>nn2:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address.rm1</name>
    <value>nn1:8031</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address.rm2</name>
    <value>nn2:8031</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address.rm1</name>
    <value>nn1:8032</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address.rm2</name>
    <value>nn2:8032</value>
  </property>
  <property>
    <name>yarn.resourcemanager.admin.address.rm1</name>
    <value>nn1:8033</value>
  </property>
  <property>
    <name>yarn.resourcemanager.admin.address.rm2</name>
    <value>nn2:8033</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address.rm1</name>
    <value>nn1:8088</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address.rm2</name>
    <value>nn2:8088</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
  <property>
    <name>yarn.client.failover-proxy-provider</name>
    <value>org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider</value>
  </property>
</configuration>

3.4 Configure the mapred-site.xml file

<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>nn1:10020</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>nn2:19888</value>
  </property>
  <property>
    <name>mapred.compress.map.output</name>
    <value>true</value>
  </property>
  <property>
    <name>mapred.map.output.compression.codec</name>
    <value>com.hadoop.compression.lzo.LzoCodec</value>
  </property>
  <property>
    <name>mapred.child.env</name>
    <value>LD_LIBRARY_PATH=/usr/local/lzo/lib</value>
  </property>
</configuration>

3.5 Synchronize the Hadoop installation, with the configuration files above, to every node
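
For example, the installation directory can be copied from nn1 to the other nodes with scp (a minimal sketch; rsync works just as well, and the profile changes from step 2.1 must also exist on every node):

scp -r /app/hadoop-2.7.1 root@nn2:/app/
scp -r /app/hadoop-2.7.1 root@dn1:/app/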

4. Start the services

4.1 Start ZooKeeper on each node and check its status

zkServer.sh start
zkServer.sh status

Format the ZKFC state in ZooKeeper on the master node

hdfs zkfc -formatZK


4.2 Start the JournalNode daemon on each node

hadoop-daemon.sh start journalnode

4.3 Format HDFS on the master namenode node

hadoop namenode -format


4.4 Start the namenode process on the master namenode node

hadoop-daemon.sh start namenode


4.5 Run the following on the standby node. This formats the standby namenode's directory and synchronizes the metadata from the master namenode, then starts its daemons.

hdfs namenode -bootstrapStandby
hadoop-daemon.sh start namenode
yarn-daemon.sh start resourcemanager

4.6 Start the remaining services

start-dfs.sh
start-yarn.sh
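
Once everything is up, jps on each node should roughly show the following daemons (based on the roles assigned in step 1; the exact list depends on which node you inspect):

jps
# nn1 / nn2: NameNode, DFSZKFailoverController, DataNode, NodeManager, JournalNode, ResourceManager, QuorumPeerMain
# dn1:       DataNode, NodeManager, JournalNode, QuorumPeerMain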

4.7 Check the high-availability state

hdfs haadmin -getServiceState nn1      # likewise for nn2; shows the namenode state
yarn rmadmin -getServiceState rm1      # likewise for rm2; shows the resourcemanager state
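
To confirm that automatic failover actually works, one option is to stop the currently active namenode and check that the standby takes over (a sketch; run the stop command on whichever node is active, here assumed to be nn1):

hadoop-daemon.sh stop namenode          # on the active namenode, e.g. nn1
hdfs haadmin -getServiceState nn2       # should now report "active"
hadoop-daemon.sh start namenode         # the restarted namenode rejoins as standby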

4.8 Log on to the web UI to view the status

http://nn1:50070
http://nn1:8088

