Introduction
The NameNode is the heart of a Hadoop cluster: it must never stop working. In the Hadoop 1 era there was only one NameNode; if its data was lost or the process stopped working, the entire cluster could not be recovered. This single point of failure made Hadoop 1 unreliable, as shown in Figure 1. Hadoop 2 solves this problem.
High availability of HDFS in Hadoop 2.2.0 means that two NameNodes can be started at the same time: one of them is in the active (working) state, and the other is on standby. This way, when the active NameNode server goes down, you can switch, manually or automatically, to the other NameNode to keep providing service without losing data.
These NameNodes share data to ensure that their state stays consistent. Data can be shared between the NameNodes either through NFS (Network File System) or through a quorum of JournalNodes. The former is a file system shared at the Linux level and belongs to operating-system configuration; the latter is a Hadoop-level mechanism.
Here we describe how to configure the Quorum Journal Node approach by hand.
When the cluster starts, both NameNodes come up, but only one of them is active while the other is in the standby state. Active means it is providing service; standby means it is dormant, only synchronizing data, always ready to provide service, as shown in Figure 2.
Figure 2
Architecture
In a typical HA cluster, each NameNode runs on a separate server. At any one time, exactly one NameNode is active and the other is in the standby state. The active NameNode is responsible for all client operations; the standby NameNode acts as a subordinate, maintaining its copy of the state and staying ready to take over.
To synchronize their data, the two NameNodes communicate through a group of independent daemons called JournalNodes. Whenever the active NameNode modifies its namespace, it records the change on a majority of the JournalNodes. The standby NameNode can read these changes from the JournalNodes: it continuously monitors the edit log and applies each change to its own namespace. This keeps the standby's namespace fully synchronized, so it can take over when the cluster fails, as shown in Figure 3.
To make failover fast, the standby NameNode must also know the location of all the data blocks in the cluster. To this end, all DataNodes are configured with both NameNode addresses and send block location information and heartbeats to both of them.
For an HA cluster, it is critical that only one NameNode is active at any given time. Otherwise the state of the two NameNodes would diverge, risking data loss or incorrect results. To guarantee this, the JournalNodes only ever allow a single NameNode to write to them at a time.
Hardware Resources
To deploy an HA cluster, you should prepare the following:
* NameNode servers: the machines running the NameNodes should have identical hardware configurations.
* JournalNode servers: the JournalNode process is very lightweight, so it can be deployed alongside other services. Note: at least 3 JournalNodes are required. You can of course run more, but the count must be odd, such as 3, 5, 7, 9, and so on. With N JournalNodes running, the system tolerates at most (N-1)/2 node failures without affecting normal operation.
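The failure-tolerance rule above is plain integer arithmetic; as a quick standalone sanity check (not a Hadoop command):

```shell
# Majority quorum: with N JournalNodes (N odd), a majority must stay
# reachable, so at most (N-1)/2 nodes may fail.
for N in 3 5 7 9; do
  echo "N=$N -> tolerates $(( (N - 1) / 2 )) failed node(s)"
done
```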
In an HA cluster, the standby NameNode also performs the checkpoint operation, so there is no need to configure a Secondary NameNode, CheckpointNode, or BackupNode. In fact, configuring one is an error.
Configuration
An HA cluster uses a nameservice ID to identify a single HDFS cluster. In addition, HA introduces another term, the NameNode ID: different NameNodes in the same cluster are distinguished by different NameNode IDs. So that all NameNodes can share the same configuration file, the relevant configuration parameters are suffixed with the nameservice ID and the NameNode ID.
The HA configuration content is in the file hdfs-site.xml. The key configuration items are described below.
dfs.nameservices — the logical name of the nameservice. If you use HDFS federation, you can configure multiple nameservice names, separated by commas.
<property>
  <name>dfs.nameservices</name>
  <value>mycluster</value>
</property>
dfs.ha.namenodes.[nameservice ID] — unique identifiers for all NameNodes in the nameservice, separated by commas. These names let the DataNodes identify all of the cluster's NameNodes. Currently, at most two NameNodes can be configured per nameservice.
<property>
  <name>dfs.ha.namenodes.mycluster</name>
  <value>nn1,nn2</value>
</property>
dfs.namenode.rpc-address.[nameservice ID].[NameNode ID] — the RPC address each NameNode listens on, as shown below:
<property>
  <name>dfs.namenode.rpc-address.mycluster.nn1</name>
  <value>machine1.example.com:8020</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.mycluster.nn2</name>
  <value>machine2.example.com:8020</value>
</property>
dfs.namenode.http-address.[nameservice ID].[NameNode ID] — the HTTP address each NameNode listens on, as shown below:
<property>
  <name>dfs.namenode.http-address.mycluster.nn1</name>
  <value>machine1.example.com:50070</value>
</property>
<property>
  <name>dfs.namenode.http-address.mycluster.nn2</name>
  <value>machine2.example.com:50070</value>
</property>
If the security policy is enabled, you should also configure an https-address for each NameNode in the same way.
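For example (a sketch: the hostnames follow the earlier examples, and 50470 is assumed here as the HTTPS port, the conventional default in this Hadoop generation):

```xml
<property>
  <name>dfs.namenode.https-address.mycluster.nn1</name>
  <value>machine1.example.com:50470</value>
</property>
<property>
  <name>dfs.namenode.https-address.mycluster.nn2</name>
  <value>machine2.example.com:50470</value>
</property>
```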
dfs.namenode.shared.edits.dir — the URI through which the NameNodes read and write the shared edit log on the JournalNode group. The URI format is qjournal://host1:port1;host2:port2;host3:port3/journalId, where host1, host2, and host3 are the JournalNode addresses (an odd number of nodes, at least 3), and journalId uniquely identifies this nameservice's journal; a single JournalNode group can be shared by multiple federated namespaces. Configured as follows:
<property>
  <name>dfs.namenode.shared.edits.dir</name>
  <value>qjournal://node1.example.com:8485;node2.example.com:8485;node3.example.com:8485/mycluster</value>
</property>
dfs.client.failover.proxy.provider.[nameservice ID] — the Java class that HDFS clients use to find and connect to the active NameNode.
<property>
  <name>dfs.client.failover.proxy.provider.mycluster</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
dfs.ha.fencing.methods — the fencing method applied to the previously active NameNode during a failover. It is generally necessary to ensure the old active process is stopped; the method can be sshfence or shell.
If you use sshfence, configure the following:
<property>
  <name>dfs.ha.fencing.methods</name>
  <value>sshfence</value>
</property>
<property>
  <name>dfs.ha.fencing.ssh.private-key-files</name>
  <value>/home/exampleuser/.ssh/id_rsa</value>
</property>
This method is simple to configure and is recommended.
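If you prefer the shell method mentioned above, the value names a command instead of sshfence; a sketch, where the script path is hypothetical and the script must exit with status 0 to signal that fencing succeeded:

```xml
<property>
  <name>dfs.ha.fencing.methods</name>
  <value>shell(/path/to/my/fence-script.sh)</value>
</property>
```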
fs.defaultFS — the default path prefix used by clients connecting to HDFS. If the nameservice ID configured earlier is mycluster, this can be set to the logical name, which clients then use as the authority portion of HDFS paths.
You can configure the following in core-site.xml:
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://mycluster</value>
</property>
dfs.journalnode.edits.dir — the path where the JournalNode process stores its local state (the edit logs it keeps). This is an absolute path on the server's local file system.
Configured as follows:
<property>
  <name>dfs.journalnode.edits.dir</name>
  <value>/path/to/journal/node/local/data</value>
</property>
Deployment
Once the configuration is complete, you can start the JournalNode processes. Execute the command "hadoop-daemon.sh start journalnode" on each JournalNode machine.
For a brand-new HDFS cluster, first execute the format command "hdfs namenode -format" on one NameNode, then start that NameNode process.
If one NameNode is already formatted and started, its data should be synchronized to the other, unformatted NameNode: execute the command "hdfs namenode -bootstrapStandby" on the unformatted NameNode.
If you are converting a non-HA cluster into an HA cluster, you should run the command "hdfs namenode -initializeSharedEdits", which initializes the JournalNodes with the existing edit-log data.
After doing this, you can start both NameNodes. Once they start successfully, observe them through their web pages: both will be in the standby state.
Next, execute the command "hdfs haadmin -failover --forcefence serviceId serviceId2". The NameNode state will be switched safely, and the latter NameNode becomes active. At this point you can see the correct result through the web pages.
Administrator commands
Executing the command "hdfs haadmin" displays the list of subcommands, as follows:
Usage: DFSHAAdmin [-ns <nameserviceId>]
    [-transitionToActive <serviceId>]
    [-transitionToStandby <serviceId>]
    [-failover [--forcefence] [--forceactive] <serviceId> <serviceId>]
    [-getServiceState <serviceId>]
    [-checkHealth <serviceId>]
    [-help <command>]
To see the detailed usage, run "hdfs haadmin -help". In brief:
transitionToActive and transitionToStandby switch the specified NameNode between states. These commands do not perform any fencing and are therefore rarely used.
failover initiates a failover: the command switches from the first (failed) NameNode to the second.
getServiceState returns the current state (active or standby) of the given NameNode.
checkHealth checks the health of the given NameNode; it returns 0 if healthy and a non-zero value otherwise.
The following is a description of my detailed installation process:
1. Determine the cluster structure
I use 5 virtual machines here, each with 512 MB of memory and a 20 GB hard disk. The role assignment for these 5 machines is as follows:
IP address       Host name   NameNode   JournalNode   DataNode
192.168.80.100   hadoop100   yes        yes           no
192.168.80.101   hadoop101   yes        yes           no
192.168.80.102   hadoop102   no         yes           yes
192.168.80.103   hadoop103   no         no            yes
192.168.80.104   hadoop104   no         no            yes
2. Set up Linux and install the JDK
First install one virtual machine in VMware: set up passwordless SSH login, set the static IP to 192.168.80.100, set the hostname to hadoop100, edit the hosts file, install the JDK, and so on. For the details, refer to my earlier articles; I will not repeat them here.
Then modify Hadoop's configuration files, as described below.
2.1 Edit the file $HADOOP_HOME/etc/hadoop/hadoop-env.sh and modify one line as follows:
export JAVA_HOME=/usr/local/jdk
Change the JAVA_HOME value here to the path of your own installed JDK.
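If you prefer to make this change non-interactively, a sed one-liner can rewrite the line. A sketch that edits a throwaway copy in /tmp (the file name and the /usr/local/jdk value are illustrative, not your real installation):

```shell
# Create a sample hadoop-env.sh line, then rewrite the JAVA_HOME value in place.
echo 'export JAVA_HOME=${JAVA_HOME}' > /tmp/hadoop-env-demo.sh
sed -i 's|^export JAVA_HOME=.*|export JAVA_HOME=/usr/local/jdk|' /tmp/hadoop-env-demo.sh
cat /tmp/hadoop-env-demo.sh
```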
2.2 Edit the file $HADOOP_HOME/etc/hadoop/core-site.xml, as shown below:
<property>
  <name>hadoop.tmp.dir</name>
  <value>/usr/local/hadoop/tmp</value>
</property>
<property>
  <name>fs.default.name</name>
  <value>hdfs://hadoop100:9000</value>
</property>
In the configuration above, the hadoop.tmp.dir value is the directory where Hadoop stores its data, including both NameNode data and DataNode data. The path can be chosen freely, as long as the directory actually exists.
The fs.default.name value (the older name of fs.defaultFS) is the logical name of the HDFS path. Because we will start 2 NameNodes, each at a different location, users would otherwise have to modify their code after every switch, which is very troublesome; with a logical path, users need not worry about path inconsistencies caused by NameNode switches.
2.3 Edit the file $HADOOP_HOME/etc/hadoop/hdfs-site.xml, as shown below:
<property>
  <name>dfs.replication</name>
  <value>2</value>
</property>
<property>
  <name>dfs.nameservices</name>
  <value>cluster1</value>
</property>
<property>
  <name>dfs.ha.namenodes.cluster1</name>
  <value>hadoop100,hadoop101</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.cluster1.hadoop100</name>
  <value>hadoop100:9000</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.cluster1.hadoop101</name>
  <value>hadoop101:9000</value>
</property>
<property>
  <name>dfs.namenode.http-address.cluster1.hadoop100</name>
  <value>hadoop100:50070</value>
</property>
<property>
  <name>dfs.namenode.http-address.cluster1.hadoop101</name>
  <value>hadoop101:50070</value>
</property>
<property>
  <name>dfs.namenode.shared.edits.dir</name>
  <value>qjournal://hadoop100:8485;hadoop101:8485;hadoop102:8485/cluster1</value>
</property>
<property>
  <name>dfs.client.failover.proxy.provider.cluster1</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
  <name>dfs.ha.fencing.methods</name>
  <value>sshfence</value>
</property>
<property>
  <name>dfs.ha.fencing.ssh.private-key-files</name>
  <value>/root/.ssh/id_rsa</value>
</property>
<property>
  <name>dfs.journalnode.edits.dir</name>
  <value>/usr/local/hadoop/tmp/journal</value>
</property>
Each of the configuration items above was explained in the configuration section earlier.
2.4 Edit the file $HADOOP_HOME/etc/hadoop/slaves, as shown below:
hadoop102
hadoop103
hadoop104
This designates the three nodes above as DataNode nodes.
3. Clone the virtual machine to create 4 more virtual machines. Modify each clone's IP address and hostname, and update the hosts file. Make sure passwordless SSH login works among all 5 nodes.
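For reference, after cloning, the hosts file (typically /etc/hosts on Linux) on every node would contain entries matching the table in step 1:

```
192.168.80.100 hadoop100
192.168.80.101 hadoop101
192.168.80.102 hadoop102
192.168.80.103 hadoop103
192.168.80.104 hadoop104
```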
4. Execute commands to start the cluster
The following commands must be executed strictly in order; the order cannot be reversed!
4.1 Start the JournalNode cluster
On hadoop100, hadoop101, and hadoop102, execute the command hadoop-daemon.sh start journalnode
4.2 Format one NameNode
Execute the command hdfs namenode -format on hadoop100
4.3 Start that NameNode
Execute the command hadoop-daemon.sh start namenode on hadoop100
4.4 Format the other NameNode
Execute the command hdfs namenode -bootstrapStandby on hadoop101
4.5 Start the other NameNode
Execute the command hadoop-daemon.sh start namenode on hadoop101
At this point, use a browser to access http://hadoop100:50070 and http://hadoop101:50070. If you can see both pages, the NameNodes started successfully. At this time, both NameNodes are in the standby state.
4.6 Transition one NameNode to active
Execute the command hdfs haadmin -transitionToActive hadoop100 on hadoop100
Visit http://hadoop100:50070 and http://hadoop101:50070 again; you will find that hadoop100 has become active while hadoop101 remains standby.
4.7 Start the DataNodes
On hadoop100, execute the command hadoop-daemons.sh start datanode to start all 3 DataNode nodes.
At this point, the HA cluster is started.
If you want to experiment with NameNode failover, execute the command hdfs haadmin -failover --forceactive hadoop100 hadoop101
Observing the states of hadoop100 and hadoop101 again, you will find that they have changed.
If you then want to upload data, you also need to change the value of fs.default.name in core-site.xml to hdfs://hadoop101:9000.
If any reader's configuration fails, you can use my one-click run script: the Hadoop directory is /usr/local/hadoop, and the one-click script is placed in that directory.