Introduction: I previously reproduced the NN HA experiment notes written by my teammate "Illidan", and based on his environment I also experimented with NN HA with transparency for clients. This article records in detail the whole process of configuring NN HA myself, as well as the process of testing that HA is transparent to client access. I hope it is helpful to everyone.
Experimental Environment: Hadoop 2.2.0, a 4-node cluster with 3 ZK nodes (an odd number of ZK nodes is preferred). The hosts file and the role assignment of each node are as follows:
hosts:
192.168.66.91 master
192.168.66.92 slave1
192.168.66.93 slave2
192.168.66.94 slave3
Role Assignment:
|        | Active NN | Standby NN | DN | JournalNode | Zookeeper | FailoverController |
| Master |     V     |            |    |      V      |     V     |         V          |
| Slave1 |           |     V      | V  |      V      |     V     |         V          |
| Slave2 |           |            | V  |      V      |     V     |                    |
| Slave3 |           |            | V  |             |           |                    |
Experimental Process:
1. Download the stable version of ZooKeeper from http://mirrors.cnnic.cn/apache/zookeeper/stable/ and extract it to a directory on the Hadoop cluster; I placed it under /home/yarn/.
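For example, the download and extraction can be done roughly as follows (a sketch only; the archive name assumes the 3.4.6 release referenced later in this article, so adjust it to whatever the stable directory actually contains):

cd /home/yarn
# download the stable release tarball (file name assumed)
wget http://mirrors.cnnic.cn/apache/zookeeper/stable/zookeeper-3.4.6.tar.gz
mkdir -p /home/yarn/zookeeper
# extract so that ZooKeeper ends up under /home/yarn/zookeeper/zookeeper-3.4.6
tar -xzf zookeeper-3.4.6.tar.gz -C /home/yarn/zookeeper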
2. Modify the configuration file. The config file is in the conf folder: rename zoo_sample.cfg to zoo.cfg and modify it accordingly. The following is the modified zoo.cfg:

# The number of milliseconds of each tick
# (the heartbeat interval between ZK servers, and between a client and ZK)
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# The directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# example sakes.
# The directory where ZK data is saved; create it yourself and then specify it here
dataDir=/home/yarn/zookeeper/zoodata
# the port at which the clients will connect
clientPort=2181
# the maximum number of client connections.
# increase this if you need to handle more clients
#maxClientCnxns=60
#
# Be sure to read the maintenance section of the
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1
# The directory where ZK logs are saved; create it yourself and then specify it here
dataLogDir=/home/yarn/zookeeper/zoolog
#****** The following settings are used by a distributed ZK cluster ******
# When the ZK cluster is initialized, follower nodes need to communicate with the leader node;
# initLimit is the maximum number of heartbeat intervals the leader waits for during this phase
initLimit=5
# The maximum number of heartbeat intervals to wait when sending messages, requests, and
# responses between the Leader and the Followers
syncLimit=2
#server.A=B:C:D
# A is a number indicating this server's id
# B is the IP or hostname of the current server
# C is the port the Follower uses to exchange information with the Leader
# D is the port used to re-elect a leader when the current leader goes down
server.1=192.168.66.91:2888:3888
server.2=192.168.66.92:2888:3888
server.3=192.168.66.93:2888:3888
# Note: next you need to create a myid file in the dataDir directory on each node, whose content
# is the corresponding A value; that is, the myid file content is different on each ZK node!!!
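For example, the myid files can be created like this (a minimal sketch; the path matches the dataDir configured above, and the numbers match the server.A entries):

# on master (server.1)
echo 1 > /home/yarn/zookeeper/zoodata/myid
# on slave1 (server.2)
echo 2 > /home/yarn/zookeeper/zoodata/myid
# on slave2 (server.3)
echo 3 > /home/yarn/zookeeper/zoodata/myid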
3. Modify the environment variables on each node. In /etc/profile add: export ZOOKEEPER_HOME=/home/yarn/zookeeper/zookeeper-3.4.6 and append $ZOOKEEPER_HOME/bin to PATH. Note: the export ZOOKEEPER_HOME line must come above the PATH line.
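The relevant lines in /etc/profile end up looking roughly like this (run source /etc/profile afterwards so the change takes effect in the current shell):

export ZOOKEEPER_HOME=/home/yarn/zookeeper/zookeeper-3.4.6
export PATH=$PATH:$ZOOKEEPER_HOME/bin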
Next, we start modifying the Hadoop configuration files:
4. Modify core-site.xml

<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://myhadoop</value>
    <description>Note: myhadoop is the logical name of the cluster and must be consistent with dfs.nameservices in hdfs-site.xml.</description>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/yarn/hadoop/hdfs2.0/tmp</value>
  </property>
  <property>
    <name>ha.zookeeper.quorum</name>
    <value>master:2181,slave1:2181,slave2:2181</value>
    <description>The IP/host of each ZK node and the port clients use to connect to ZK; must be consistent with clientPort in zoo.cfg.</description>
  </property>
</configuration>
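With fs.defaultFS pointing at the logical name, clients address the cluster by that name instead of a specific NameNode host, which is what makes NameNode failover transparent to them. For example (illustrative commands only):

hdfs dfs -ls hdfs://myhadoop/
# or simply, since fs.defaultFS already resolves to hdfs://myhadoop:
hdfs dfs -ls /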
5. Modify hdfs-site.xml

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>
  <property>
    <name>dfs.nameservices</name>
    <value>myhadoop</value>
    <description>Comma-separated list of nameservices. Must be the same as fs.defaultFS in core-site.xml.</description>
  </property>

  <property>
    <name>dfs.ha.namenodes.myhadoop</name>
    <value>nn1,nn2</value>
    <description>The prefix for a given nameservice, contains a comma-separated list of namenodes for the given nameservice.</description>
  </property>

  <property>
    <name>dfs.namenode.rpc-address.myhadoop.nn1</name>
    <value>master:8020</value>
    <description>RPC address for namenode1.</description>
  </property>

  <property>
    <name>dfs.namenode.rpc-address.myhadoop.nn2</name>
    <value>slave1:8020</value>
    <description>RPC address for namenode2.</description>
  </property>

  <property>
    <name>dfs.namenode.http-address.myhadoop.nn1</name>
    <value>master:50070</value>
    <description>The address and base port where the DFS namenode1 web UI listens.</description>
  </property>

  <property>
    <name>dfs.namenode.http-address.myhadoop.nn2</name>
    <value>slave1:50070</value>
    <description>The address and base port where the DFS namenode2 web UI listens.</description>
  </property>

  <property>
    <name>dfs.namenode.servicerpc-address.myhadoop.nn1</name>
    <value>master:53310</value>
  </property>

  <property>
    <name>dfs.namenode.servicerpc-address.myhadoop.nn2</name>
    <value>slave1:53310</value>
  </property>

  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:///home/yarn/hadoop/hdfs2.0/name</value>
    <description>Determines where on the local filesystem the DFS name node should store the name table (fsimage). If this is a comma-delimited list of directories, then the name table is replicated in all of the directories, for redundancy.</description>
  </property>

  <property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://slave1:8485;slave2:8485;slave3:8485/hadoop-journal</value>
    <description>A directory on shared storage between the multiple namenodes in an HA cluster. This directory is written by the active and read by the standby in order to keep the namespaces synchronized. This directory does not need to be listed in dfs.namenode.edits.dir above. It should be left empty in a non-HA cluster.</description>
  </property>

  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:///home/yarn/hadoop/hdfs2.0/data</value>
    <description>Determines where on the local filesystem a DFS data node should store its blocks. If this is a comma-delimited list of directories, then data will be stored in all named directories, typically on different devices. Directories that do not exist are ignored.</description>
  </property>

  <property>
    <name>dfs.ha.automatic-failover.enabled</name>
    <value>true</value>
    <description>