A Complete and Detailed Guide to a Highly Available, Simple Hadoop 2.2.0 Cluster Configuration


Introduction

The NameNode in Hadoop is like the human heart: it is so important that it must never stop working. In the Hadoop 1 era there was only one NameNode. If the NameNode's data was lost or the NameNode stopped working, the entire cluster could not be recovered. This single point of failure made Hadoop 1 unreliable, as shown in Figure 1. Hadoop 2 solves this problem.

The high availability of HDFS in Hadoop 2.2.0 means that two NameNodes can be started at the same time: one is in the active (working) state and the other is on standby. This way, when one NameNode server goes down, you can manually or automatically switch to the other NameNode to keep providing service, without losing data.

These NameNodes share data to ensure that their state stays consistent. Data can be shared between NameNodes through the Network File System (NFS) or through a quorum of JournalNodes. The former is a shared file system configured at the operating-system level; the latter is implemented inside Hadoop itself.

Here we describe how to configure HA by hand using the quorum JournalNode approach.

When the cluster is started, both NameNodes start at the same time, but only one of them is active while the other is in the standby state. Active means it provides service; standby means it is dormant, only synchronizing data, always ready to provide service, as shown in Figure 2.

  

Figure 2

Architecture

In a typical HA cluster, each NameNode is a stand-alone server. At any one time, only one NameNode is active and the other is in the standby state. The active NameNode is responsible for all client operations, while the standby NameNode plays a subordinate role, maintaining the data state and staying ready to take over.

To synchronize data, the two NameNodes communicate with each other through a group of independent processes called JournalNodes. When there is any change to the active NameNode's namespace, a majority of the JournalNode processes are notified. The standby NameNode reads the change information from the JNs and keeps monitoring the edit log for changes, applying them to its own namespace. On a failure, the standby ensures that its namespace state is fully synchronized before taking over, as shown in Figure 3.

  

In order to ensure fast switching, the standby NameNode must know the location of all the data blocks in the cluster. To this end, all DataNodes are configured with both NameNode addresses and send block location information and heartbeats to both of them.

For HA clusters, it is critical to ensure that only one NameNode is active at any time. Otherwise, the data states of the two NameNodes would diverge, possibly losing data or producing wrong results. To guarantee this, the JNs must ensure that only one NameNode can write to them at a time.
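To make the single-writer guarantee concrete, here is a minimal, illustrative sketch of quorum fencing with epoch numbers. This is a simplification of the idea behind QJM, with invented class and method names; it is not Hadoop code, and the real protocol is considerably more involved.

```python
# Illustrative sketch (NOT Hadoop code): each journal node remembers the
# highest writer epoch it has promised, and rejects edits tagged with any
# older epoch, which fences out a stale active NameNode.

class JournalNode:
    def __init__(self):
        self.promised_epoch = 0  # highest epoch accepted so far
        self.edits = []          # accepted edit-log records

    def new_epoch(self, epoch):
        # A would-be writer must claim a strictly higher epoch.
        if epoch <= self.promised_epoch:
            return False
        self.promised_epoch = epoch
        return True

    def write(self, epoch, record):
        # Writes from an older epoch are rejected (fenced).
        if epoch < self.promised_epoch:
            return False
        self.edits.append(record)
        return True

journals = [JournalNode() for _ in range(3)]

# NameNode A becomes the writer with epoch 1 and writes a record.
assert all(j.new_epoch(1) for j in journals)
assert all(j.write(1, "txn-1") for j in journals)

# NameNode B takes over with epoch 2; A's later writes are rejected.
assert all(j.new_epoch(2) for j in journals)
assert not any(j.write(1, "txn-2") for j in journals)
assert all(j.write(2, "txn-2") for j in journals)
```

Because a write succeeds only on journal nodes that have not promised a higher epoch, a NameNode that has been superseded can never get its edits accepted again.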

Hardware Resources

In order to deploy the HA cluster, you should prepare the following things:

* NameNode servers: the servers running the NameNodes should have identical hardware configurations.

* JournalNode servers: the JournalNode process is very lightweight and can be deployed alongside other services. Note: at least 3 nodes are required. You can of course run more, but the count must be odd, such as 3, 5, 7, 9, and so on. With N JournalNodes, the system can tolerate at most (N-1)/2 node failures without affecting normal operation.
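The majority rule above can be checked with a couple of lines of Python (a throwaway illustration, not part of Hadoop):

```python
# With N JournalNodes, edits must reach a majority, so the cluster
# tolerates at most (N - 1) // 2 failed journal nodes.
def journalnode_failure_tolerance(n: int) -> int:
    if n < 3 or n % 2 == 0:
        raise ValueError("use an odd number of JournalNodes, at least 3")
    return (n - 1) // 2

for n in (3, 5, 7, 9):
    print(n, "nodes tolerate", journalnode_failure_tolerance(n), "failures")
```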

In an HA cluster, the standby NameNode also performs the checkpoint operation, so there is no need to configure a Secondary NameNode, CheckpointNode, or BackupNode. Configuring one will in fact produce an error.

Configuration

An HA cluster uses a nameservice ID to identify a single HDFS cluster. In addition, HA introduces another term, the NameNode ID: different NameNodes in the same cluster are distinguished by different NameNode IDs. So that all NameNodes can share the same configuration file, the relevant parameter names are suffixed with the nameservice ID and the NameNode ID.
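This naming scheme can be illustrated with a small helper (hypothetical code, written only to show how the parameter names are composed):

```python
# Compose an HA configuration key: the base key, then the nameservice ID,
# then (for per-NameNode settings) the NameNode ID.
def ha_key(base: str, nameservice: str, namenode: str = None) -> str:
    parts = [base, nameservice]
    if namenode is not None:
        parts.append(namenode)
    return ".".join(parts)

print(ha_key("dfs.ha.namenodes", "mycluster"))
# dfs.ha.namenodes.mycluster
print(ha_key("dfs.namenode.rpc-address", "mycluster", "nn1"))
# dfs.namenode.rpc-address.mycluster.nn1
```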

The HA configuration content is in the file hdfs-site.xml. The key configuration items are described below.

dfs.nameservices: the logical name of the namespace. If you use HDFS Federation, you can configure multiple namespace names, separated by commas.

<property>
  <name>dfs.nameservices</name>
  <value>mycluster</value>
</property>

dfs.ha.namenodes.[nameservice ID]: unique identifiers for all NameNodes in the namespace. You can configure multiple names, separated by commas. These names are how DataNodes learn about every NameNode in the cluster. Currently, at most two NameNodes can be configured per cluster.

<property>
  <name>dfs.ha.namenodes.mycluster</name>
  <value>nn1,nn2</value>
</property>

dfs.namenode.rpc-address.[nameservice ID].[NameNode ID]: the RPC address that each NameNode listens on, as shown below.

<property>
  <name>dfs.namenode.rpc-address.mycluster.nn1</name>
  <value>machine1.example.com:8020</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.mycluster.nn2</name>
  <value>machine2.example.com:8020</value>
</property>

dfs.namenode.http-address.[nameservice ID].[NameNode ID]: the HTTP address that each NameNode listens on, as shown below.

<property>
  <name>dfs.namenode.http-address.mycluster.nn1</name>
  <value>machine1.example.com:50070</value>
</property>
<property>
  <name>dfs.namenode.http-address.mycluster.nn2</name>
  <value>machine2.example.com:50070</value>
</property>

If a security policy is enabled, you should also configure the https-address information for each NameNode in the same way.
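For instance, entries along these lines might be used (a sketch only; 50470 is the conventional default HTTPS port in Hadoop 2.x, and the host names are the example ones from above):

```xml
<property>
  <name>dfs.namenode.https-address.mycluster.nn1</name>
  <value>machine1.example.com:50470</value>
</property>
<property>
  <name>dfs.namenode.https-address.mycluster.nn2</name>
  <value>machine2.example.com:50470</value>
</property>
```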

dfs.namenode.shared.edits.dir: the URI of the JournalNode group that the NameNodes read from and write to. Through this URI the NameNodes read and write edit log content. The URI format is "qjournal://host1:port1;host2:port2;host3:port3/journalId". host1, host2, and host3 are the addresses of the JournalNodes; there must be an odd number of them, at least 3. journalId is the unique identifier for this cluster's journal; with multiple federated namespaces, each nameservice uses its own journal ID. It is configured as follows:

<property>
  <name>dfs.namenode.shared.edits.dir</name>
  <value>qjournal://node1.example.com:8485;node2.example.com:8485;node3.example.com:8485/mycluster</value>
</property>
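As an aside, the structure of that URI can be teased apart with a few lines of Python (an illustration only, not a Hadoop API):

```python
# Split a shared-edits URI of the form
# qjournal://host1:port1;host2:port2;host3:port3/journalId
# into its journal-node addresses and journal identifier.
def parse_qjournal(uri: str):
    prefix = "qjournal://"
    if not uri.startswith(prefix):
        raise ValueError("not a qjournal URI")
    hosts_part, _, journal_id = uri[len(prefix):].partition("/")
    return hosts_part.split(";"), journal_id

hosts, journal_id = parse_qjournal(
    "qjournal://node1.example.com:8485;node2.example.com:8485;"
    "node3.example.com:8485/mycluster")
print(hosts)       # three host:port strings
print(journal_id)  # mycluster
```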

dfs.client.failover.proxy.provider.[nameservice ID]: the Java class that HDFS clients use to locate the active NameNode.

<property>
  <name>dfs.client.failover.proxy.provider.mycluster</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>

dfs.ha.fencing.methods: the fencing method applied when the active NameNode fails. When an active NameNode goes wrong, its process generally needs to be shut down; the method can be sshfence or shell.

If you use sshfence, configure the following:

<property>
  <name>dfs.ha.fencing.methods</name>
  <value>sshfence</value>
</property>
<property>
  <name>dfs.ha.fencing.ssh.private-key-files</name>
  <value>/home/exampleuser/.ssh/id_rsa</value>
</property>

This method is simple to configure and is recommended.

fs.defaultFS: the default path prefix used when a client connects to HDFS. If the nameservice ID configured earlier is mycluster, it can be used as the authority part of the URI.

You can configure the following in core-site.xml:

<property>
  <name>fs.defaultFS</name>
  <value>hdfs://mycluster</value>
</property>

dfs.journalnode.edits.dir: the path where the JournalNode process stores its state. It is an absolute path on the server's local file system.

It is configured as follows:

<property>
  <name>dfs.journalnode.edits.dir</name>
  <value>/path/to/journal/node/local/data</value>
</property>

Deployment

Once the configuration is complete, you can start the JournalNode processes. Execute the command "hadoop-daemon.sh start journalnode" on each JournalNode machine.

For a brand-new HDFS cluster, first execute the format command "hdfs namenode -format", then start the NameNode process.

If one NameNode has already been formatted and started, its data should be synchronized to the other, unformatted NameNode: execute the command "hdfs namenode -bootstrapStandby" on the unformatted NameNode.

If you are converting a non-HA cluster into an HA cluster, you should run the command "hdfs namenode -initializeSharedEdits", which initializes the JournalNodes with the existing edits data.

After doing these things, you can start both NameNodes. After a successful launch, observing the two NameNodes through their web pages shows that both are in the standby state.

Next, execute the command "hdfs haadmin -failover --forcefence serviceId serviceId2". The NameNode state is switched safely and the latter becomes active. At this point you can see the correct result through the web pages.

Administrator command

Executing the command "hdfs haadmin" displays the list of subcommands, as follows:

Usage: DFSHAAdmin [-ns <nameserviceId>]
    [-transitionToActive <serviceId>]
    [-transitionToStandby <serviceId>]
    [-failover [--forcefence] [--forceactive] <serviceId> <serviceId>]
    [-getServiceState <serviceId>]
    [-checkHealth <serviceId>]
    [-help <command>]

If you want to see the specific usage, use "hdfs haadmin -help". Among these:

transitionToActive and transitionToStandby switch a NameNode to the given state. These commands perform no fencing and are therefore rarely used.

failover initiates a failover. The command switches service from one (failed) NameNode to the other.

getServiceState gets the current state of a NameNode.

checkHealth checks the health of a NameNode. It returns 0 if healthy and a non-zero value otherwise.

The following is a description of my detailed installation process.

1. Determining cluster structure

I use 5 virtual machines here, each with 512 MB of memory and a 20 GB hard disk. The roles I assigned to these 5 machines are as follows.

IP address      Hostname   NameNode  JournalNode  DataNode
192.168.80.100  hadoop100  yes       yes          no
192.168.80.101  hadoop101  yes       yes          no
192.168.80.102  hadoop102  no        yes          yes
192.168.80.103  hadoop103  no        no           yes
192.168.80.104  hadoop104  no        no           yes

2. Set up Linux, install JDK

First install a virtual machine in VMware: set up passwordless SSH login, set the static IP to 192.168.80.100, set the hostname to hadoop100, edit the hosts file, install the JDK, and so on. Refer to the author's previous articles; the details are not repeated here.

Then modify the configuration file for Hadoop, which is described below.

2.1 Edit the file $HADOOP_HOME/etc/hadoop/hadoop-env.sh and modify one line as follows:

export JAVA_HOME=/usr/local/jdk

Uncomment the JAVA_HOME line and change the value to the path of your own installed JDK.

2.2 Edit the file $HADOOP_HOME/etc/hadoop/core-site.xml, as shown below:

<property>
  <name>hadoop.tmp.dir</name>
  <value>/usr/local/hadoop/tmp</value>
</property>
<property>
  <name>fs.default.name</name>
  <value>hdfs://hadoop100:9000</value>
</property>

In the above configuration, hadoop.tmp.dir is the directory where Hadoop stores its data, including both NameNode data and DataNode data. The path can be chosen freely, as long as the directory actually exists.

fs.default.name is the logical name of the HDFS path. Because we will start two NameNodes, each at a different address, users would otherwise have to modify their code after every switch-over, which is very troublesome. Using a logical path lets users ignore the path changes caused by NameNode switching.

2.3 Edit the file $HADOOP_HOME/etc/hadoop/hdfs-site.xml, as shown below:

<property>
  <name>dfs.replication</name>
  <value>2</value>
</property>
<property>
  <name>dfs.nameservices</name>
  <value>cluster1</value>
</property>
<property>
  <name>dfs.ha.namenodes.cluster1</name>
  <value>hadoop100,hadoop101</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.cluster1.hadoop100</name>
  <value>hadoop100:9000</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.cluster1.hadoop101</name>
  <value>hadoop101:9000</value>
</property>
<property>
  <name>dfs.namenode.http-address.cluster1.hadoop100</name>
  <value>hadoop100:50070</value>
</property>
<property>
  <name>dfs.namenode.http-address.cluster1.hadoop101</name>
  <value>hadoop101:50070</value>
</property>
<property>
  <name>dfs.namenode.shared.edits.dir</name>
  <value>qjournal://hadoop100:8485;hadoop101:8485;hadoop102:8485/cluster1</value>
</property>
<property>
  <name>dfs.client.failover.proxy.provider.cluster1</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
  <name>dfs.ha.fencing.methods</name>
  <value>sshfence</value>
</property>
<property>
  <name>dfs.ha.fencing.ssh.private-key-files</name>
  <value>/root/.ssh/id_rsa</value>
</property>
<property>
  <name>dfs.journalnode.edits.dir</name>
  <value>/usr/local/hadoop/tmp/journal</value>
</property>

The above configuration information is explained in the previous description.

2.4 Edit the file $HADOOP_HOME/etc/hadoop/slaves, as shown below:

hadoop102

hadoop103

hadoop104

This designates the three nodes above as DataNode nodes.

3. Clone 4 more virtual machines from the first one. Modify their IPs and hostnames, and update the hosts file. Be sure that passwordless SSH login works among all 5 nodes.

4. Execute commands to start the cluster

The following commands must be executed strictly in order; the order cannot be reversed!

4.1 Start Journalnode Cluster

On hadoop100, hadoop101, and hadoop102, execute the command "hadoop-daemon.sh start journalnode".

4.2 Format a Namenode

Execute the command "hdfs namenode -format" on hadoop100.

4.3 Start a Namenode

Execute the command "hadoop-daemon.sh start namenode" on hadoop100.

4.4 Format Another Namenode

Execute the command "hdfs namenode -bootstrapStandby" on hadoop101.

4.5 Start another Namenode

Execute the command "hadoop-daemon.sh start namenode" on hadoop101.

At this point, use a browser to access http://hadoop100:50070 and http://hadoop101:50070. If you can see both pages, the NameNodes started successfully. At this time, both NameNodes are in the standby state.

4.6 Switching to active

Execute the command "hdfs haadmin -transitionToActive hadoop100" on hadoop100.

Using the browser to access http://hadoop100:50070 and http://hadoop101:50070 again, you will find that hadoop100 has become active while hadoop101 is still standby.

4.7 Start Datanodes

On hadoop100, execute the command "hadoop-daemons.sh start datanode", which starts the 3 DataNode nodes.

At this point, the HA cluster is up.

If you want to experiment with NameNode switching, execute the command "hdfs haadmin -failover --forceactive hadoop100 hadoop101".

Observe the states of hadoop100 and hadoop101 again, and you will find that they have changed.

If you then want to upload data, you also need to change the value of fs.default.name in core-site.xml to hdfs://hadoop101:9000, the new active NameNode.

If any readers fail to get the configuration working, you can use my one-click run script: the Hadoop directory is /usr/local/hadoop, and the script is placed in that directory.
