I recently installed a Hadoop cluster and configured HDFS HA. CDH4 supported two HA scenarios: quorum-based storage (QJM) and shared storage using NFS. CDH5 supports only the first scenario, QJM-based HA.
For the installation and deployment process of a Hadoop cluster, you can refer to the earlier posts on installing a CDH Hadoop cluster with yum or installing a Hadoop cluster manually.

Cluster Planning
I have installed a cluster of three nodes in total. For the HA scenario, the following services will be installed on the three nodes:

cdh1: hadoop-hdfs-namenode (primary), hadoop-hdfs-journalnode, hadoop-hdfs-zkfc
cdh2: hadoop-hdfs-namenode (standby), hadoop-hdfs-journalnode, hadoop-hdfs-zkfc
cdh3: hadoop-hdfs-journalnode
According to the plan above, install the corresponding services on the corresponding nodes.

Installation Steps

Shut down the cluster
Stop all services on the cluster.
$ sh /opt/cmd.sh 'for x in `ls /etc/init.d/ | grep spark` ; do service $x stop ; done'
$ sh /opt/cmd.sh 'for x in `ls /etc/init.d/ | grep impala` ; do service $x stop ; done'
$ sh /opt/cmd.sh 'for x in `ls /etc/init.d/ | grep hive` ; do service $x stop ; done'
$ sh /opt/cmd.sh 'for x in `ls /etc/init.d/ | grep hbase` ; do service $x stop ; done'
$ sh /opt/cmd.sh 'for x in `ls /etc/init.d/ | grep hadoop` ; do service $x stop ; done'
The content of cmd.sh (/opt/shell/cmd.sh) is described in the earlier post summarizing Hadoop cluster deployment; it runs a given command on every node.
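For reference, a minimal sketch of what such a helper can look like, assuming the three hostnames used in this post (the actual script may differ):

#!/bin/sh
# /opt/shell/cmd.sh - run the given command on every node over SSH (a sketch)
# Usage: sh /opt/cmd.sh '<command>'
for node in cdh1 cdh2 cdh3; do
    echo "==== $node ===="
    ssh "$node" "$1"
done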
Stop the client programs

Stop all client programs that use the cluster, including scheduled tasks.

Back up HDFS metadata
a. Find the locally configured metadata directory (the property named dfs.name.dir, dfs.namenode.name.dir, or hadoop.tmp.dir):
$ grep -C1 hadoop.tmp.dir /etc/hadoop/conf/hdfs-site.xml
# or
$ grep -C1 dfs.namenode.name.dir /etc/hadoop/conf/hdfs-site.xml
With the above command, you can see information similar to the following:
<property>
<name>hadoop.tmp.dir</name>
<value>/data/dfs/nn</value>
</property>
b. Back up the HDFS metadata:
$ cd /data/dfs/nn
$ tar -cvf /root/nn_backup_data.tar .
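Optionally, confirm the archive is readable before touching anything else:

# List the contents of the backup archive as a sanity check
$ tar -tvf /root/nn_backup_data.tar | head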
Install the services
Install hadoop-hdfs-journalnode on cdh1, cdh2, and cdh3:
$ ssh cdh1 'yum install hadoop-hdfs-journalnode -y'
$ ssh cdh2 'yum install hadoop-hdfs-journalnode -y'
$ ssh cdh3 'yum install hadoop-hdfs-journalnode -y'
Install hadoop-hdfs-zkfc on cdh1 and cdh2:
$ ssh cdh1 'yum install hadoop-hdfs-zkfc -y'
$ ssh cdh2 'yum install hadoop-hdfs-zkfc -y'
Modify the configuration files
Modify /etc/hadoop/conf/core-site.xml as follows:
<property>
<name>fs.defaultFS</name>
<value>hdfs://mycluster:8020</value>
</property>
<property>
<name>ha.zookeeper.quorum</name>
<value>cdh1:21088,cdh2:21088,cdh3:21088</value>
</property>
Modify /etc/hadoop/conf/hdfs-site.xml: remove the original NameNode-related configuration and add the following:
<!-- Hadoop HA -->
<property>
<name>dfs.nameservices</name>
<value>mycluster</value>
</property>
<property>
<name>dfs.ha.namenodes.mycluster</name>
<value>nn1,nn2</value>
</property>
<property>
<name>dfs.namenode.rpc-address.mycluster.nn1</name>
<value>cdh1:8020</value>
</property>
<property>
<name>dfs.namenode.rpc-address.mycluster.nn2</name>
<value>cdh2:8020</value>
</property>
<property>
<name>dfs.namenode.http-address.mycluster.nn1</name>
<value>cdh1:50070</value>
</property>
<property>
<name>dfs.namenode.http-address.mycluster.nn2</name>
<value>cdh2:50070</value>
</property>
<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://cdh1:8485;cdh2:8485;cdh3:8485/mycluster</value>
</property>
<property>
<name>dfs.journalnode.edits.dir</name>
<value>/data/dfs/jn</value>
</property>
<property>
<name>dfs.client.failover.proxy.provider.mycluster</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
<name>dfs.ha.fencing.methods</name>
<value>sshfence(hdfs)</value>
</property>
<property>
<name>dfs.ha.fencing.ssh.private-key-files</name>
<value>/var/lib/hadoop-hdfs/.ssh/id_rsa</value>
</property>
<property>
<name>dfs.ha.automatic-failover.enabled</name>
<value>true</value>
</property>
Synchronize the configuration files
To synchronize the configuration files to other nodes in the cluster:
$ sh /opt/syn.sh /etc/hadoop/conf /etc/hadoop/
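syn.sh is a similar helper that pushes a local path to the same location on the other nodes; a minimal sketch, assuming the hostnames above (the actual script may differ):

#!/bin/sh
# /opt/shell/syn.sh - copy a local path to every other node (a sketch)
# Usage: sh /opt/syn.sh <source-path> <destination-directory>
src=$1
dest=$2
for node in cdh2 cdh3; do
    scp -r "$src" "$node:$dest"
done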
Create the edits directory on the three JournalNode nodes:
$ ssh cdh1 'mkdir -p /data/dfs/jn ; chown -R hdfs:hdfs /data/dfs/jn'
$ ssh cdh2 'mkdir -p /data/dfs/jn ; chown -R hdfs:hdfs /data/dfs/jn'
$ ssh cdh3 'mkdir -p /data/dfs/jn ; chown -R hdfs:hdfs /data/dfs/jn'
Configure passwordless login

Configure passwordless login for the hdfs user on the two NameNodes:
On cdh1:
$ passwd hdfs
$ su - hdfs
$ ssh-keygen
$ ssh-copy-id cdh2
On cdh2:
$ passwd hdfs
$ su - hdfs
$ ssh-keygen
$ ssh-copy-id cdh1
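Since the sshfence method configured above depends on this, it is worth verifying that the hdfs user can now log in without a password, e.g. from cdh1:

# Still as the hdfs user; this should print the remote hostname without a password prompt
$ ssh cdh2 hostname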
Start JournalNode

Start the hadoop-hdfs-journalnode service on cdh1, cdh2, and cdh3:
$ ssh cdh1 'service hadoop-hdfs-journalnode start'
$ ssh cdh2 'service hadoop-hdfs-journalnode start'
$ ssh cdh3 'service hadoop-hdfs-journalnode start'
Initialize the shared storage

Initialize the shared storage on the NameNode (if the NameNode has never been formatted, format it first; see the sketch below):
$ sudo -u hdfs hdfs namenode -initializeSharedEdits
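For reference, formatting a brand-new NameNode would look like the following; note that this is destructive and only appropriate on a cluster that holds no data yet:

# WARNING: erases existing HDFS metadata; only for a fresh cluster
$ sudo -u hdfs hdfs namenode -format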
Then start the NameNode:
$ service hadoop-hdfs-namenode start
Synchronize the Standby NameNode

cdh2 acts as the standby NameNode, so install the NameNode service on that node first:
$ yum install hadoop-hdfs-namenode -y
Then run:
$ sudo -u hdfs hadoop namenode -bootstrapStandby
If Kerberos is used, obtain a ticket for the hdfs principal before executing it:
$ kinit -k -t /etc/hadoop/conf/hdfs.keytab hdfs/cdh1@javachem.com
$ hadoop namenode -bootstrapStandby
Then start the standby NameNode:
$ service hadoop-hdfs-namenode start
Configure automatic failover

On the two NameNodes, i.e. cdh1 and cdh2, install hadoop-hdfs-zkfc (if not already installed above):
$ ssh cdh1 'yum install hadoop-hdfs-zkfc -y'
$ ssh cdh2 'yum install hadoop-hdfs-zkfc -y'
On either NameNode, run the following command to create the znode used for automatic failover:
$ hdfs zkfc -formatZK
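To confirm the znode was created, you can check from any ZooKeeper node; -formatZK creates the HA state under the /hadoop-ha parent znode:

# In the ZooKeeper CLI, the nameservice (mycluster) should be listed
$ /usr/lib/zookeeper/bin/zkCli.sh
ls /hadoop-ha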
If you want to secure access to ZooKeeper, refer to the Securing Access to ZooKeeper section of the Enabling HDFS HA documentation.
Then start zkfc on both NameNode nodes:
$ ssh cdh1 'service hadoop-hdfs-zkfc start'
$ ssh cdh2 'service hadoop-hdfs-zkfc start'
Test
Visit http://cdh1:50070/ and http://cdh2:50070/ respectively to see which one is the active NameNode and which is the standby NameNode.
To view the state of a NameNode:
# check cdh1's state
$ sudo -u hdfs hdfs haadmin -getServiceState nn1
active
# check cdh2's state
$ sudo -u hdfs hdfs haadmin -getServiceState nn2
standby
To perform a manual switchover:
$ sudo -u hdfs hdfs haadmin -failover nn1 nn2
Failover to NameNode at cdh2/192.168.56.122:8020 successful
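Automatic failover can be exercised as well by stopping the active NameNode and letting ZKFC promote the standby; a sketch, assuming nn1 on cdh1 is currently active:

# Stop the active NameNode; ZKFC should fail over within seconds
$ ssh cdh1 'service hadoop-hdfs-namenode stop'
# The former standby should now report active
$ sudo -u hdfs hdfs haadmin -getServiceState nn2
# Bring the stopped NameNode back; it rejoins as standby
$ ssh cdh1 'service hadoop-hdfs-namenode start'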
Visit http://cdh1:50070/ and http://cdh2:50070/ again to see which one is now the active NameNode and which is the standby.

Configure HBase HA
First stop HBase, then modify /etc/hbase/conf/hbase-site.xml as follows:
<!-- Configure HBase to use the HA NameNode nameservice -->
<property>
<name>hbase.rootdir</name>
<value>hdfs://mycluster:8020/hbase</value>
</property>
Run /usr/lib/zookeeper/bin/zkCli.sh on a ZooKeeper node and remove the stale splitlog znode:
$ ls /hbase/splitlogs
$ rmr /hbase/splitlogs
Finally, start the HBase service.

Configure Hive HA
Run the following commands to update the root location recorded in the Hive metastore to the HDFS nameservice.
$ /usr/lib/hive/bin/metatool -listFSRoot
Initializing HiveMetaTool..
Listing FS Roots..
hdfs://cdh1:8020/user/hive/warehouse
$ /usr/lib/hive/bin/metatool -updateLocation hdfs://mycluster hdfs://cdh1 -tablePropKey avro.schema.url -serdePropKey schema.url
$ metatool -listFSRoot
Initializing HiveMetaTool..
Listing FS Roots..
hdfs://mycluster:8020/user/hive/warehouse
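As an optional sanity check, the warehouse path from the listing above should now be reachable through the nameservice:

# List the Hive warehouse through the HA nameservice
$ sudo -u hdfs hadoop fs -ls hdfs://mycluster:8020/user/hive/warehouse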
Configure Impala
No changes are needed, but keep in mind that the fs.defaultFS parameter value in core-site.xml must carry the port number, which is 8020 in CDH.

Configure YARN
Not used for now; for details, refer to MapReduce (MRv1) and YARN (MRv2) High Availability.

Configure Hue
Not used for now; for details, refer to Hue High Availability.

Configure Llama
Not used for now; for details, refer to Llama High Availability.
Original article. For reprints, please note: reproduced from JavaChen Blog, author: javachen
Link to this article: http://blog.javachen.com/2014/07/18/install-hdfs-ha-in-cdh.html
This article is published under the Attribution 2.5 China Mainland license. You may reproduce, adapt, or use it for commercial purposes, but you must retain the attribution and the link to this article. For questions or licensing inquiries, please contact me by email.