The previous article, "Distributed Configuration of Hadoop 2.2.0 on Ubuntu and CentOS", covered the most basic configuration of Hadoop 2.2.0. Hadoop 2.2.0 also provides an HA (High Availability) feature for HDFS; building on that article, this one describes how to configure HA in Hadoop 2.2.0.
Note:
In the following, the two NameNode machines are named namenode1 and namenode2; namenode1 is the active NameNode and namenode2 is the standby NameNode.
There are three JournalNode machines (at least three are required): journalnode1, journalnode2, and journalnode3. (The number of JournalNode machines must be odd: 3, 5, 7, ...)
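The odd-number requirement comes from the quorum rule: the JournalNode ensemble keeps accepting edits as long as a majority of its members are up, so N nodes tolerate floor((N - 1) / 2) failures. A quick illustration:

```shell
# A quorum of N JournalNodes stays writable while a majority is up,
# so it tolerates floor((N - 1) / 2) failures; an even N adds a machine
# without raising the failure tolerance, which is why odd counts are used.
for N in 3 5 7; do
  echo "$N JournalNodes tolerate $(( (N - 1) / 2 )) failure(s)"
done
```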
In addition, pay attention to keeping the two NameNodes consistent: most operations performed on namenode1 must also be performed on namenode2.
Configuration File
The related configuration of core-site.xml and hdfs-site.xml is as follows:
1. core-site.xml
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://mycluster</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/tmp/hadoop2.2.0</value>
  </property>
</configuration>
2. hdfs-site.xml
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/dfs/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/dfs/data</value>
  </property>
  <property>
    <name>dfs.permissions</name>
    <value>false</value>
  </property>
  <property>
    <name>dfs.nameservices</name>
    <value>mycluster</value>
  </property>
  <property>
    <name>dfs.ha.namenodes.mycluster</name>
    <value>nn1,nn2</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.mycluster.nn1</name>
    <value>namenode1:8020</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.mycluster.nn2</name>
    <value>namenode2:8020</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.mycluster.nn1</name>
    <value>namenode1:50070</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.mycluster.nn2</name>
    <value>namenode2:50070</value>
  </property>
  <property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://journalnode1:8485;journalnode2:8485;journalnode3:8485/mycluster</value>
  </property>
  <property>
    <name>dfs.journalnode.edits.dir</name>
    <value>/home/dfs/journal</value>
  </property>
  <property>
    <name>dfs.client.failover.proxy.provider.mycluster</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>
  <property>
    <name>dfs.ha.fencing.methods</name>
    <value>sshfence</value>
  </property>
  <property>
    <name>dfs.ha.fencing.ssh.private-key-files</name>
    <value>/home/root/.ssh/id_rsa</value>
  </property>
  <property>
    <name>dfs.ha.fencing.ssh.connect-timeout</name>
    <value>6000</value>
  </property>
  <property>
    <name>dfs.ha.automatic-failover.enabled</name>
    <value>false</value>
  </property>
</configuration>
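The dfs.namenode.shared.edits.dir value above follows a fixed pattern: qjournal:// followed by the JournalNode host:port pairs separated by semicolons, then /&lt;nameservice&gt;. As a sketch (8485 is the default JournalNode RPC port, and the hostnames are the ones used in this article), the value can be assembled like this:

```shell
# Build the dfs.namenode.shared.edits.dir value from a JournalNode host list.
NAMESERVICE="mycluster"
JOURNALNODES="journalnode1 journalnode2 journalnode3"
# One host per line -> append :8485 -> join with ';'
HOSTS=$(echo "$JOURNALNODES" | tr ' ' '\n' | sed 's/$/:8485/' | paste -sd';' -)
SHARED_EDITS="qjournal://$HOSTS/$NAMESERVICE"
echo "$SHARED_EDITS"
```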
Restart to make the configuration take effect.
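Since both NameNodes (and the JournalNodes) must see the same configuration, one way to keep all copies consistent is to push the edited files from namenode1 to every other node. A minimal sketch, assuming the configuration lives under etc/hadoop and passwordless scp is set up; DRY_RUN is a guard added here so the loop only prints each command:

```shell
# Push the two edited config files to every other node in the cluster.
# DRY_RUN=1 only prints each scp command; clear it to actually copy.
DRY_RUN=1
CONF_DIR="etc/hadoop"
NODES="namenode2 journalnode1 journalnode2 journalnode3"
for host in $NODES; do
  CMD="scp $CONF_DIR/core-site.xml $CONF_DIR/hdfs-site.xml $host:$CONF_DIR/"
  if [ -n "$DRY_RUN" ]; then echo "$CMD"; else $CMD; fi
done
```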
Startup Process:
1. Start the JournalNode daemon on each JournalNode machine
sbin/hadoop-daemon.sh start journalnode
2. Start the NameNode on the two NameNode machines (namenode1 will be active and namenode2 standby)
A) If this is the first startup, run the format command on namenode1:
bin/hdfs namenode -format
(In Hadoop 2.x, bin/hdfs namenode -format replaces the deprecated bin/hadoop namenode -format.)
B) If the NameNode has already been formatted (i.e., an existing cluster is being converted to HA), initialize the shared edits directory instead by running the following on namenode1:
bin/hdfs namenode -initializeSharedEdits
C) Start the NameNode on namenode1:
sbin/hadoop-daemon.sh start namenode
D) Run the following command on namenode2 to copy the NameNode metadata over from namenode1:
bin/hdfs namenode -bootstrapStandby
If this fails, copy the contents of the dfs.namenode.name.dir directory on namenode1 directly into the dfs.namenode.name.dir directory on namenode2.
Then start the NameNode on namenode2:
sbin/hadoop-daemon.sh start namenode
Now both namenode1 and namenode2 are running, and both are in the "standby" state.
E) Run the following command on namenode1:
bin/hdfs haadmin -transitionToActive nn1
This changes the state of namenode1 to "active".
3. Start the DataNode daemon on each DataNode machine
sbin/hadoop-daemon.sh start datanode
At this point, HDFS in Hadoop 2.2.0 works normally and the HA function is available.
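The whole startup order can be summarized in one script. This is a dry-run sketch, not part of the original article: the hostnames are those used above, datanode1 is a placeholder for each of your DataNode machines, and the run helper prints each ssh command instead of executing it unless DRY_RUN is cleared.

```shell
# Dry-run sketch of the full HA startup order described above.
DRY_RUN=1
run() { if [ -n "$DRY_RUN" ]; then echo "ssh $1 \"$2\""; else ssh "$1" "$2"; fi; }

for jn in journalnode1 journalnode2 journalnode3; do
  run "$jn" "sbin/hadoop-daemon.sh start journalnode"      # step 1
done
run namenode1 "sbin/hadoop-daemon.sh start namenode"        # step 2C
run namenode2 "bin/hdfs namenode -bootstrapStandby"         # step 2D
run namenode2 "sbin/hadoop-daemon.sh start namenode"
run namenode1 "bin/hdfs haadmin -transitionToActive nn1"    # step 2E
run datanode1 "sbin/hadoop-daemon.sh start datanode"        # step 3, per DataNode
```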
Check
You can view the status of the active NameNode (namenode1) and the standby NameNode (namenode2) on the following pages:
http://namenode1:50070/dfshealth.jsp
http://namenode2:50070/dfshealth.jsp
In addition, you can run common HDFS shell commands to test whether HDFS is normal.
HA Failover Test
Stop the NameNode process on namenode1 (to simulate a failure of namenode1); HDFS becomes unavailable.
Run the following command on namenode2:
bin/hdfs haadmin -transitionToActive nn2
After the command is run successfully, the status of namenode2 changes to "active", and HDFS returns to normal.
Alternatively, after restarting the NameNode process on namenode1 (it comes back in the "standby" state), you can perform the switch with a single failover command:
bin/hdfs haadmin -failover nn1 nn2
At this point, namenode2 is "active" and namenode1 is "standby".
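The failover test can likewise be scripted; hdfs haadmin -getServiceState reports each NameNode's current role and is a convenient check after switching. A dry-run sketch (the run helper is an added guard that prints each command instead of executing it):

```shell
# Dry-run sketch of the manual failover test described above.
DRY_RUN=1
run() { if [ -n "$DRY_RUN" ]; then echo "$*"; else "$@"; fi; }

run sbin/hadoop-daemon.sh stop namenode          # on namenode1: simulate a failure
run bin/hdfs haadmin -transitionToActive nn2     # on namenode2: promote the standby
run bin/hdfs haadmin -getServiceState nn2        # expect "active"
run bin/hdfs haadmin -getServiceState nn1        # expect "standby" once namenode1 is restarted
```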
When reprinting, please indicate the source: http://blog.csdn.net/iAm333