Hadoop 2.7.1 High-Availability Installation and Configuration Based on QJM
1. Modify the hostname and the hosts file
10.205.22.185 nn1 (active)  roles: namenode, resourcemanager, datanode, zk, hive, sqoop
10.205.22.186 nn2 (standby) roles: namenode, resourcemanager, datanode, zk
10.205.22.187 dn1           roles: datanode, zk
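Every node must resolve these names consistently; a minimal /etc/hosts sketch (identical on all three machines, using the addresses above):
# /etc/hosts on nn1, nn2 and dn1
10.205.22.185   nn1
10.205.22.186   nn2
10.205.22.187   dn1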
1.1 Configure SSH password-free login
The master node must be able to log on to each slave node without a password:
ssh nn1
ssh nn2
ssh dn1
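If the keys are not set up yet, one way to do it (a sketch; run as the user that will start Hadoop, assumed here to be root since the fencing key configured below is /root/.ssh/id_rsa):
# on nn1: generate a key pair, then push the public key to every node
ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
for h in nn1 nn2 dn1; do ssh-copy-id root@$h; done
Because sshfence is used for HA fencing later, nn2 must also be able to ssh to nn1 without a password, so repeat the ssh-copy-id step from nn2.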
2. Install JDK 1.8, ZooKeeper, Hive, and Sqoop before installing Hadoop
2.1 Modify the profile file and configure environment variables
export JAVA_HOME=/usr/java/jdk1.8.0_65
export JRE_HOME=/usr/java/jdk1.8.0_65/jre
export HADOOP_HOME=/app/hadoop-2.7.1
export HIVE_HOME=/app/hive
export SQOOP_HOME=/app/sqoop
export ZOOKEEPER_HOME=/app/zookeeper-3.4.6
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$ZOOKEEPER_HOME/bin:$HIVE_HOME/bin:$SQOOP_HOME/bin
export CLASSPATH=.:$JAVA_HOME/lib:$JRE_HOME/lib
ulimit -SHn 65536
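Apply the changes in the current shell (assuming the lines above were added to /etc/profile):
source /etc/profile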
2.2 Modify the ZooKeeper configuration file zoo.cfg
Add:
server.1=nn1:2888:3888
server.2=nn2:2888:3888
server.3=dn1:2888:3888
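ZooKeeper also needs a myid file in its data directory on each node, matching the server.N number above (the dataDir path here is an assumption; use whatever dataDir is set to in zoo.cfg):
# on nn1 (write 2 on nn2 and 3 on dn1)
echo 1 > /app/zookeeper-3.4.6/data/myid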
3. Install hadoop-2.7.1 and modify the configuration files
Create the directories:
mkdir -p /home/hadoop/tmp
mkdir -p /home/hadoop/hdfs/data
mkdir -p /home/hadoop/journal
mkdir -p /home/hadoop/name
Modify the slaves file:
nn1
nn2
dn1
Modify the hadoop-env.sh file:
export JAVA_HOME=/usr/java/jdk1.8.0_65
3.1 Configure hdfs-site.xml
<configuration>
  <property>
    <name>dfs.nameservices</name>
    <value>masters</value>
  </property>
  <property>
    <name>dfs.ha.namenodes.masters</name>
    <value>nn1,nn2</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.masters.nn1</name>
    <value>nn1:9000</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.masters.nn1</name>
    <value>nn1:50070</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.masters.nn2</name>
    <value>nn2:9000</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.masters.nn2</name>
    <value>nn2:50070</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/home/hadoop/hdfs/data</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/home/hadoop/name</value>
  </property>
  <property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://nn1:8485;nn2:8485;dn1:8485/masters</value>
  </property>
  <property>
    <name>dfs.journalnode.edits.dir</name>
    <value>/home/hadoop/journal</value>
  </property>
  <property>
    <name>dfs.ha.automatic-failover.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>dfs.client.failover.proxy.provider.masters</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>
  <property>
    <name>dfs.ha.fencing.methods</name>
    <value>sshfence</value>
  </property>
  <property>
    <name>dfs.ha.fencing.ssh.private-key-files</name>
    <value>/root/.ssh/id_rsa</value>
  </property>
  <property>
    <name>dfs.ha.fencing.ssh.connect-timeout</name>
    <value>30000</value>
  </property>
</configuration>
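Note that the nameservice ID masters chosen here is reused in the qjournal URI above and must match the authority of fs.defaultFS in core-site.xml below; the sshfence method likewise relies on the password-free SSH set up in section 1.1.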
3.2 Configure the core-site.xml file
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://masters</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/hadoop/tmp</value>
  </property>
  <property>
    <name>ha.zookeeper.quorum</name>
    <value>nn1:2181,nn2:2181,dn1:2181</value>
  </property>
  <property>
    <name>io.compression.codecs</name>
    <value>org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.DefaultCodec,com.hadoop.compression.lzo.LzoCodec,com.hadoop.compression.lzo.LzopCodec,org.apache.hadoop.io.compress.BZip2Codec</value>
  </property>
  <property>
    <name>io.compression.codec.lzo.class</name>
    <value>com.hadoop.compression.lzo.LzoCodec</value>
  </property>
</configuration>
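The com.hadoop.compression.lzo classes come from the separate hadoop-lzo library, which is not bundled with Hadoop; if it and the native LZO libraries are not installed on every node, drop the two LZO entries, or clients will fail when loading the codec list.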
3.3 Configure the yarn-site.xml file
<configuration>
  <property>
    <name>yarn.resourcemanager.ha.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.resourcemanager.cluster-id</name>
    <value>rm-cluster</value>
  </property>
  <property>
    <name>yarn.resourcemanager.ha.rm-ids</name>
    <value>rm1,rm2</value>
  </property>
  <property>
    <name>yarn.resourcemanager.ha.automatic-failover.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.resourcemanager.ha.automatic-failover.embedded</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname.rm1</name>
    <value>nn1</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname.rm2</name>
    <value>nn2</value>
  </property>
  <property>
    <name>yarn.resourcemanager.store.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
  </property>
  <property>
    <name>yarn.resourcemanager.zk-address</name>
    <value>nn1:2181,nn2:2181,dn1:2181</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address.rm1</name>
    <value>nn1:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address.rm2</name>
    <value>nn2:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address.rm1</name>
    <value>nn1:8031</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address.rm2</name>
    <value>nn2:8031</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address.rm1</name>
    <value>nn1:8032</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address.rm2</name>
    <value>nn2:8032</value>
  </property>
  <property>
    <name>yarn.resourcemanager.admin.address.rm1</name>
    <value>nn1:8033</value>
  </property>
  <property>
    <name>yarn.resourcemanager.admin.address.rm2</name>
    <value>nn2:8033</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address.rm1</name>
    <value>nn1:8088</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address.rm2</name>
    <value>nn2:8088</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
  <property>
    <name>yarn.client.failover-proxy-provider</name>
    <value>org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider</value>
  </property>
</configuration>
3.4 Configure the mapred-site.xml file
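Hadoop 2.7.1 ships only a template for this file, so create it first if it does not already exist (assuming the configuration directory is $HADOOP_HOME/etc/hadoop):
cp $HADOOP_HOME/etc/hadoop/mapred-site.xml.template $HADOOP_HOME/etc/hadoop/mapred-site.xml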
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>nn1:10020</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>nn1:19888</value>
  </property>
  <property>
    <name>mapred.compress.map.output</name>
    <value>true</value>
  </property>
  <property>
    <name>mapred.map.output.compression.codec</name>
    <value>com.hadoop.compression.lzo.LzoCodec</value>
  </property>
  <property>
    <name>mapred.child.env</name>
    <value>LD_LIBRARY_PATH=/usr/local/lzo/lib</value>
  </property>
</configuration>
3.5 Synchronize the Hadoop installation, with the configuration files above, to each node
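One way to push the configured installation out (a sketch; paths match the environment variables set earlier):
# run on nn1
for h in nn2 dn1; do rsync -a /app/hadoop-2.7.1/ $h:/app/hadoop-2.7.1/; done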
4. Start the services
4.1 Start ZooKeeper on each node and check its status
zkServer.sh start
zkServer.sh status
Initialize the HA state in ZooKeeper on the master node:
hdfs zkfc -formatZK
4.2 Start the JournalNode process on each node
hadoop-daemon.sh start journalnode
4.3 Format HDFS on the master namenode node
hdfs namenode -format
4.4 Start the namenode process on the master namenode node
hadoop-daemon.sh start namenode
4.5 Run the following on the standby node. This formats the standby namenode's directory and synchronizes the metadata from the master namenode:
hdfs namenode -bootstrapStandby
hadoop-daemon.sh start namenode
yarn-daemon.sh start resourcemanager
4.6 Start the remaining services
start-dfs.sh
start-yarn.sh
4.7 Check the high-availability status
hdfs haadmin -getServiceState nn1 (or nn2) to check the namenodes
yarn rmadmin -getServiceState rm1 (or rm2) to check the resourcemanagers
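For example, with nn1 currently active, a healthy pair reports one active and one standby:
hdfs haadmin -getServiceState nn1    # prints: active
hdfs haadmin -getServiceState nn2    # prints: standby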
4.8 Log on to the web UIs to check the status
http://nn1:50070
http://nn1:8088
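As an optional sanity check of automatic failover (a sketch, assuming nn1 is currently active): stop the active namenode, confirm the standby takes over, then restart it.
# on nn1
hadoop-daemon.sh stop namenode
# from any node; nn2 should become active once fencing completes
hdfs haadmin -getServiceState nn2
# bring nn1 back; it rejoins as standby
hadoop-daemon.sh start namenode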