The NameNode in an HDFS cluster is a single point of failure (SPOF): in a cluster with only one NameNode, an unexpected outage of the NameNode machine makes the entire cluster unavailable until the NameNode is restarted. HDFS HA provides a hot standby for the NameNode by configuring an Active/Standby pair of NameNodes. If the Active NN goes down, the system switches over to the Standby so that the NameNode service is uninterrupted. HDFS HA depends on ZooKeeper. The following is the test process.
The environment is as follows:
HOST: debugo0[1-3], CentOS 6.5
Hadoop 2.4.1
ZooKeeper 3.4.6
Host     | HDFS                      | ZooKeeper
debugo01 | NN, ZKFC, JournalNode, DN | Server
debugo02 | NN, ZKFC, JournalNode, DN | Server
debugo03 | NN, JournalNode, DN       | Server
1. Start ZooKeeper
Edit the ZooKeeper configuration file:
$ mkdir -p /home/hadoop/zookeeper /home/hadoop/log/zoolog
$ cd $ZOOKEEPER_HOME/conf
$ cp zoo_sample.cfg zoo.cfg
$ vim zoo.cfg
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/home/hadoop/zookeeper
dataLogDir=/home/hadoop/log/zoolog
clientPort=2181
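The zoo.cfg above does not list the ensemble members. For a replicated three-node ensemble, ZooKeeper normally also needs one server.N line per node, where N matches that node's myid; a minimal sketch, assuming the default peer and election ports 2888 and 3888:
server.1=debugo01:2888:3888
server.2=debugo02:2888:3888
server.3=debugo03:2888:3888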
Copy the configuration file to the other two nodes, create a myid file on each, and start ZooKeeper (a sketch of the copy step follows after the output below):
$ echo "1" > /home/hadoop/zookeeper/myid
$ zkServer.sh start
...
$ zkServer.sh status
JMX enabled by default
Using config: /opt/zookeeper/bin/../conf/zoo.cfg
Mode: leader
2. Modify Hadoop Configuration
In core-site.xml, ha.zookeeper.quorum must point to the ZooKeeper server nodes. In addition, fs.defaultFS must be set to the logical service name of HDFS (it must match dfs.nameservices in hdfs-site.xml).
core-site.xml:
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://myhdfs</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/hadoop/tmp</value>
  </property>
  <property>
    <name>hadoop.logfile.size</name>
    <value>104857600</value>
  </property>
  <property>
    <name>hadoop.logfile.count</name>
    <value>10</value>
  </property>
  <property>
    <name>io.file.buffer.size</name>
    <value>131072</value>
  </property>
  <property>
    <name>ha.zookeeper.quorum</name>
    <value>debugo01,debugo02,debugo03</value>
  </property>
</configuration>
hdfs-site.xml needs several additional settings:
dfs.nameservices
-- The logical name of the HDFS nameservice; use the myhdfs set above
dfs.ha.namenodes.myhdfs
-- The list of NameNode IDs under the nameservice myhdfs
dfs.namenode.rpc-address.myhdfs.nn1
-- The RPC address on which the nn1 node of myhdfs serves clients
dfs.namenode.http-address.myhdfs.nn1
-- The HTTP address on which the nn1 node of myhdfs serves its web UI
dfs.namenode.shared.edits.dir
-- The URI of the group of JournalNodes. The active NN writes the edit log to these JournalNodes, while the standby NameNode reads the edit log from them and applies it to its in-memory directory tree. Multiple JournalNodes are separated by semicolons, and the value must follow the format qjournal://host1:port1;host2:port2;host3:port3/journalId
dfs.journalnode.edits.dir
-- A local directory on each JournalNode host, used to store edit logs and other state.
dfs.ha.automatic-failover.enabled
-- Enables automatic failover. Automatic failover depends on the ZooKeeper cluster and on the ZKFailoverController (ZKFC), a ZooKeeper client that monitors the NN state. Every node running a NN must also run a ZKFC. ZKFC provides the following functions:
Health monitoring: ZKFC periodically runs a health-check command against the local NN. If the NN responds correctly, it is considered healthy; otherwise the node is marked as failed.
ZooKeeper session management: while the local NN is healthy, ZKFC keeps a session open in ZooKeeper. If the local NN is the active one, ZKFC also holds an ephemeral znode as a lock. Once the local NN fails, the session expires and the znode is deleted automatically.
ZooKeeper-based election: if the local NN is healthy and ZKFC sees that no other NN currently holds the lock znode, it tries to acquire the lock. If it succeeds, it performs a failover and the local NN becomes active. The failover proceeds in two steps: first, fence the previous active NN if necessary; second, transition the local NN to the active state.
ZKFC can be started manually with hadoop-daemon.sh start zkfc, but it is started automatically by start-dfs.sh, so manual start and stop are normally unnecessary.
dfs.client.failover.proxy.provider.myhdfs
-- The Java class through which clients talk to the active NameNode; the DFS client uses it to find the current active NN. (The suffix must match the nameservice name, here myhdfs.)
dfs.ha.fencing.methods
-- Addresses the split-brain problem of an HA cluster (two NameNodes acting as master at the same time, leaving the system inconsistent). In HDFS HA, the JournalNodes only allow one NameNode to write, so two NameNodes cannot both commit edits.
However, during an active/standby switch the previously active NameNode may still be serving RPC requests from clients, so a fencing mechanism is needed to cut off the previous active NameNode. The common fencing method is sshfence; it requires dfs.ha.fencing.ssh.private-key-files, the private key used for the ssh connection, plus a connection timeout (see the quick ssh check after the configuration below).
hdfs-site.xml:
<configuration>
  <property>
    <name>dfs.nameservices</name>
    <value>myhdfs</value>
  </property>
  <property>
    <name>dfs.ha.namenodes.myhdfs</name>
    <value>nn1,nn2</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.myhdfs.nn1</name>
    <value>debugo01:8020</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.myhdfs.nn2</name>
    <value>debugo02:8020</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.myhdfs.nn1</name>
    <value>debugo01:50070</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.myhdfs.nn2</name>
    <value>debugo02:50070</value>
  </property>
  <property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://debugo01:8485;debugo02:8485;debugo03:8485/hadoop-journal</value>
  </property>
  <property>
    <name>dfs.ha.automatic-failover.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>dfs.journalnode.edits.dir</name>
    <value>/home/hadoop/journal</value>
  </property>
  <property>
    <name>dfs.client.failover.proxy.provider.myhdfs</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>
  <property>
    <name>dfs.ha.fencing.methods</name>
    <value>sshfence</value>
    <description>how to communicate in the switch process</description>
  </property>
  <property>
    <name>dfs.ha.fencing.ssh.private-key-files</name>
    <value>/home/hadoop/.ssh/id_rsa</value>
    <description>the location of the stored ssh key</description>
  </property>
  <property>
    <name>dfs.ha.fencing.ssh.connect-timeout</name>
    <value>5000</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/home/hadoop/data</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/home/hadoop/namenode</value>
  </property>
  <property>
    <name>dfs.namenode.handler.count</name>
    <value>8</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
</configuration>
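Because sshfence logs in to the other NameNode host over ssh to kill the stale process, it is worth confirming beforehand that key-based ssh works between the two NN hosts with the key configured above. A minimal check (the hadoop user is an assumption based on the paths used in this setup):
$ ssh -i /home/hadoop/.ssh/id_rsa hadoop@debugo02 'echo fencing-ssh-ok'
fencing-ssh-ok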
3. Start NameNode HA
Initialize the ZKFC state znode in ZooKeeper:
$ mkdir /home/hadoop/journal /home/hadoop/data /home/hadoop/namenode
$ hdfs zkfc -formatZK
14/09/13 21:17:03 INFO zookeeper.ClientCnxn: Opening socket connection to server debugo02/192.168.46.202:2181. Will not attempt to authenticate using SASL (unknown error)
14/09/13 21:17:03 INFO zookeeper.ClientCnxn: Socket connection established to debugo02/192.168.46.202:2181, initiating session
14/09/13 21:17:03 INFO zookeeper.ClientCnxn: Session establishment complete on server debugo02/192.168.46.202:2181, sessionid = 0x2487208163e0000, negotiated timeout = 5000
14/09/13 21:17:03 INFO ha.ActiveStandbyElector: Successfully created /hadoop-ha/myhdfs in ZK.
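As an optional sanity check (not part of the original procedure), the znode created by -formatZK can be inspected from any node with the ZooKeeper CLI:
$ zkCli.sh -server debugo01:2181
[zk: debugo01:2181(CONNECTED) 0] ls /hadoop-ha
[myhdfs]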
Next, format HDFS for the first time. During formatting, the NameNode communicates with the JournalNodes, so the JournalNodes on all three nodes must be started first:
$ hdfs journalnode
$ hdfs namenode -format
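Note that hdfs journalnode runs in the foreground. An equivalent sketch that starts the JournalNodes as background daemons on all three hosts before formatting on debugo01 (assuming passwordless ssh and $HADOOP_HOME set on each host):
$ for h in debugo01 debugo02 debugo03; do ssh $h '$HADOOP_HOME/sbin/hadoop-daemon.sh start journalnode'; done
$ hdfs namenode -format    # on debugo01 only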
Start all services directly through start-dfs.sh
$ start-dfs.sh
Starting namenodes on [debugo01 debugo02]
debugo01: starting namenode, logging to /opt/hadoop/logs/hadoop-hadoop-namenode-debugo01.out
debugo02: starting namenode, logging to /opt/hadoop/logs/hadoop-hadoop-namenode-debugo02.out
debugo01: starting datanode, logging to /opt/hadoop/logs/hadoop-hadoop-datanode-debugo01.out
debugo02: starting datanode, logging to /opt/hadoop/logs/hadoop-hadoop-datanode-debugo02.out
debugo03: starting datanode, logging to /opt/hadoop/logs/hadoop-hadoop-datanode-debugo03.out
Starting journal nodes [debugo01 debugo02 debugo03]
debugo01: starting journalnode, logging to /opt/hadoop/logs/hadoop-hadoop-journalnode-debugo01.out
debugo03: starting journalnode, logging to /opt/hadoop/logs/hadoop-hadoop-journalnode-debugo03.out
debugo02: starting journalnode, logging to /opt/hadoop/logs/hadoop-hadoop-journalnode-debugo02.out
Starting ZK Failover Controllers on NN hosts [debugo01 debugo02]
debugo01: starting zkfc, logging to /opt/hadoop/logs/hadoop-hadoop-zkfc-debugo01.out
debugo02: starting zkfc, logging to /opt/hadoop/logs/hadoop-hadoop-zkfc-debugo02.out
$ jps
11562 Jps
11031 NameNode
11494 DFSZKFailoverController
11324 JournalNode
11136 DataNode
7657 QuorumPeerMain
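Besides the browser check that follows, the HA state can also be queried from the command line using the NameNode IDs configured above; one should report active, and the other standby (or fail to respond until it has been bootstrapped in the next step):
$ hdfs haadmin -getServiceState nn1
$ hdfs haadmin -getServiceState nn2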
Access debugo01:50070 in a browser; this node is in the active state. The NameNode that starts first becomes active. In the standby NameNode's log you can see periodic scans like the following:
2014-09-13 21:25:46,132 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Starting CacheReplicationMonitor with interval 30000 milliseconds
2014-09-13 21:25:46,132 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Rescanning because of pending operations
2014-09-13 21:25:46,132 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Scanned 0 directive(s) and 0 block(s) in 1 millisecond(s).
...
The metadata then needs to be synchronized once on the standby NameNode:
$ hdfs namenode -bootstrapStandby
......
About to bootstrap Standby ID nn1 from:
  Nameservice ID: myhdfs
  Other Namenode ID: nn2
  Other NN's HTTP address: http://debugo02:50070
  Other NN's IPC address: debugo02/192.168.46.202:8020
  Namespace ID: 863538584
  Block pool ID: BP-351445905-192.168.46.202-1410670136650
  Cluster ID: CID-c98eb846-66b5-4663-9a35-a091eb1718d1
  Layout version: -56
=====================================================
Re-format filesystem in Storage Directory /home/hadoop/namenode ? (Y or N) Y
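If the NameNode process on the bootstrapped host is not running at this point, it can be started again and its state checked; a sketch using the standard Hadoop scripts:
$ hadoop-daemon.sh start namenode
# the bootstrapped NameNode (Standby ID nn1 in the output above) should now report standby:
$ hdfs haadmin -getServiceState nn1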
Access the other NameNode's web UI in a browser; it is in the standby state. Kill the active NN process on debugo01, and the standby NN becomes active.
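A concrete way to run this failover test, reusing the pid reported by jps earlier (kill -9 simulates a crash; which ID ends up active depends on which NameNode was active before the kill):
$ jps | grep NameNode
11031 NameNode
$ kill -9 11031
# after a few seconds, check the surviving NameNode's state:
$ hdfs haadmin -getServiceState nn2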
Note: a manual switchover produces the warning below, so when ZKFC is running no manual switchover is needed.
$ hdfs haadmin -transitionToActive nn1
Automatic failover is enabled for NameNode at debugo01/192.168.46.201:8020
Refusing to manually manage HA state, since it may cause a split-brain scenario or other incorrect state.
If you are very sure you know what you are doing, please specify the forcemanual flag.
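If a controlled switch is ever needed while automatic failover is enabled, the coordinated form goes through the ZKFCs rather than forcing the state directly; to the best of my understanding of the haadmin tool (verify against your Hadoop version):
$ hdfs haadmin -failover nn1 nn2    # asks the ZKFCs to hand the active role from nn1 to nn2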
Reference
http://www.cloudera.com/content/cloudera-content/cloudera-docs/CDH4/latest/CDH4-High-Availability-Guide/cdh4hag_topic_2_3.html
http://blog.csdn.net/u010967382/article/details/30976935
http://blog.csdn.net/chenpingbupt/article/details/7922089