Hadoop 2.0 QJM-mode HA configuration

Source: Internet
Author: User
Tags: failover

Building on the article "Installation and basic configuration of Hadoop 2.0" (see http://www.linuxidc.com/Linux/2014-05/101173.htm), this article introduces the QJM (Quorum Journal Manager) mode of HA configuration in Hadoop 2.0 (the Hadoop 2.0 architecture; the specific version used is Hadoop 2.2.0). This article mainly covers manual failover for HA; automatic failover is introduced in the next article (see http://www.linuxidc.com/Linux/2014-05/101176.htm).

--------------------------------------Split Line--------------------------------------

Related reading:

Build a Hadoop environment on Ubuntu 13.04 http://www.linuxidc.com/Linux/2013-06/86106.htm

Ubuntu 12.10 + Hadoop 1.2.1 cluster configuration http://www.linuxidc.com/Linux/2013-09/90600.htm

Build a Hadoop environment on Ubuntu (standalone mode + pseudo distribution mode) http://www.linuxidc.com/Linux/2013-01/77681.htm

Configuration of the Hadoop environment under Ubuntu http://www.linuxidc.com/Linux/2012-11/74539.htm

A detailed graphical tutorial on a single-node Hadoop environment http://www.linuxidc.com/Linux/2012-02/53927.htm

Build a Hadoop environment (two Ubuntu systems in virtual machines under Windows) http://www.linuxidc.com/Linux/2011-12/48894.htm

--------------------------------------Split Line--------------------------------------

1 Preparation

The machine roles described in this article include two NameNodes:

    • Namenode1
    • Namenode2


Namenode1 is the active NameNode; namenode2 is the standby NameNode.

They are accompanied by three JournalNodes:

    • Journalnode1
    • Journalnode2
    • Journalnode3

The number of JournalNode machines must be odd: 3, 5, 7, ..., 2n+1.

For other machine roles not covered in this article, refer to the Hadoop 2.0 installation and basic configuration article.

2 Configuration

The HA configuration involves only two configuration files, core-site.xml and hdfs-site.xml; for the rest of the configuration, refer to the article "Installation and basic configuration of Hadoop 2.0".

2.1 core-site.xml

<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://mycluster</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/tmp/hadoop2.0</value>
  </property>
</configuration>


2.2 hdfs-site.xml

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/home/dfs/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/home/dfs/data</value>
  </property>
  <property>
    <name>dfs.permissions</name>
    <value>false</value>
  </property>
  <property>
    <name>dfs.nameservices</name>
    <value>mycluster</value>
  </property>
  <property>
    <name>dfs.ha.namenodes.mycluster</name>
    <value>nn1,nn2</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.mycluster.nn1</name>
    <value>namenode1:8020</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.mycluster.nn2</name>
    <value>namenode2:8020</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.mycluster.nn1</name>
    <value>namenode1:50070</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.mycluster.nn2</name>
    <value>namenode2:50070</value>
  </property>
  <property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://journalnode1:8485;journalnode2:8485;journalnode3:8485/mycluster</value>
  </property>
  <property>
    <name>dfs.journalnode.edits.dir</name>
    <value>/home/dfs/journal</value>
  </property>
  <property>
    <name>dfs.client.failover.proxy.provider.mycluster</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>
  <property>
    <name>dfs.ha.fencing.methods</name>
    <value>sshfence</value>
  </property>
  <property>
    <name>dfs.ha.fencing.ssh.private-key-files</name>
    <value>/root/.ssh/id_rsa</value>
  </property>
  <property>
    <name>dfs.ha.fencing.ssh.connect-timeout</name>
    <value>6000</value>
  </property>
  <property>
    <name>dfs.ha.automatic-failover.enabled</name>
    <value>false</value>
  </property>
</configuration>

Some of the parameters above need further explanation.

dfs.ha.automatic-failover.enabled

This turns off automatic active/standby failover, so switching must be done manually. The next article introduces the ZooKeeper configuration that enables automatic active/standby failover.

dfs.ha.namenodes.mycluster

nn1 and nn2 in <value> are the NameNode IDs of the active NameNode and the standby NameNode respectively. You can also choose your own NameNode IDs, as long as they are used consistently across the parameters.

dfs.namenode.shared.edits.dir

This configures a set of JournalNode URIs (3, 5, 7, ..., 2n+1 nodes) through which the active and standby NameNodes write and read edits files (for the principle, see the earlier article "Introduction to Hadoop 2.0 HA"). The mycluster in <value> must be consistent with dfs.nameservices. You can also choose your own nameservice ID, as long as it is used consistently in the parameters.

dfs.journalnode.edits.dir

This is the directory on each JournalNode that holds the edits files shared by the active NameNode and the standby NameNode.

dfs.ha.log-roll.period

The interval at which the active NameNode rolls its edits file. This parameter is not set in the configuration above, so the default of 120 seconds applies. That is, every 120 seconds the standby NameNode asks the active NameNode to roll off a finished edits file, and then synchronizes that file via the JournalNodes.

The active NameNode thus produces a new edits file every 120 seconds and assigns each edits file a number; the newer the edits file, the larger its number.

When a log roll begins, a new in-progress edits file is created (its file name contains "inprogress"), indicating that this log is still being written and the roll is not finished. After 120 seconds the log is complete, and the file is renamed so that its name carries the current maximum number and no longer contains "inprogress". Then a new in-progress edits file is created and the next roll begins.

When an active/standby switch occurs, an edits file roll is triggered, so that the standby NameNode can synchronize the remaining edits files and its metadata is fully up to date when it switches to the active state.

dfs.ha.tail-edits.period


How often the standby NameNode checks for new edits files. It only looks at edits files whose roll has completed and ignores in-progress edits files.
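Neither interval appears in the hdfs-site.xml above, so the defaults apply. If you want to set them explicitly, a sketch of the two properties is shown below; the values are the assumed defaults in seconds (120 for the roll period, as stated above, and 60 for the tail period), so tune them to your own cluster:

```xml
<!-- Sketch: explicit settings for the two intervals discussed above.   -->
<!-- Values shown are the assumed defaults, in seconds; tune as needed. -->
<property>
  <name>dfs.ha.log-roll.period</name>
  <value>120</value>
</property>
<property>
  <name>dfs.ha.tail-edits.period</name>
  <value>60</value>
</property>
```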

dfs.ha.fencing.methods


At any time, only one NameNode in the system may be in the active state. During an active/standby switch, the standby NameNode becomes active, and the original active NameNode must not remain active; if both NameNodes were active at the same time, the so-called "split-brain" problem would occur. Therefore a fencing method is configured to prevent both NameNodes from being active during a failover; it can be a Java class or a script.

There are two built-in fencing methods: sshfence and shell.

The sshfence method logs in to the active NameNode node via SSH and kills the NameNode process, so passwordless SSH login must be set up and the SSH user must have permission to kill the NameNode process.

The shell method runs a shell script/command to prevent the split-brain problem; you have to write the script yourself.

Note that QJM itself provides fencing: it guarantees that only one NameNode can write edits files to the JournalNodes, so strictly speaking no fencing method is needed to prevent the split-brain problem. However, during a failover the original active NameNode may still be serving read requests from clients, so clients may read stale data (the new active NameNode's data is updated in real time). It is therefore recommended to set a fencing method. If you do not want real fencing, you can configure a method that simply returns success (with no fencing effect), such as shell(/bin/true). This exists purely so that the fencing step reports success; it does not need any real fencing effect, and it improves the availability of the system: even if the fencing mechanism fails, availability is maintained.
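For instance, to take the always-succeed route described above, the dfs.ha.fencing.methods property in hdfs-site.xml could be set as follows in place of sshfence (a sketch, not a recommendation for every deployment):

```xml
<!-- Fencing step always reports success; relies on QJM's own fencing. -->
<property>
  <name>dfs.ha.fencing.methods</name>
  <value>shell(/bin/true)</value>
</property>
```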

3 Startup

3.1 First start the JournalNode on each JournalNode machine

$HADOOP_HOME/sbin/hadoop-daemon.sh start journalnode

3.2 Start the NameNodes on the NameNode machines

Namenode1 will be the active NameNode and namenode2 the standby NameNode.

3.2.1

If this is the first boot, run the format command on namenode1:

$HADOOP_HOME/bin/hadoop namenode -format

If this is not the first boot, run the following command on namenode1:

$HADOOP_HOME/bin/hdfs namenode -initializeSharedEdits

The two cases need some explanation.

"First boot" means HA was configured at installation time and HDFS holds no data yet. In this case namenode1 must be formatted with the -format command.

"Non-first boot" means an HDFS cluster without HA is already running and already holds data, and you are now configuring HA and adding a NameNode. In this case namenode1 initializes the JournalNodes with the -initializeSharedEdits command, sharing its edits files to them.

3.2.2

Then start the NameNode on namenode1:

$HADOOP_HOME/sbin/hadoop-daemon.sh start namenode

3.2.3

Run the following command on namenode2:

$HADOOP_HOME/bin/hdfs namenode -bootstrapStandby

This also needs some explanation.

Namenode2 does not need to be formatted.

Namenode2 runs the -bootstrapStandby command to synchronize namenode1's metadata, so that the two NameNodes' metadata stays consistent.

The process is as follows: namenode1 has already shared its edits files to the JournalNodes via the -initializeSharedEdits command, as described above; now namenode2 uses the -bootstrapStandby command to synchronize namenode1's metadata together with the edits files on the JournalNodes, keeping its metadata consistent with namenode1's.

Note that there must be enough edits files on the JournalNodes to guarantee that the metadata on namenode1 and namenode2 stays consistent. If the -bootstrapStandby command fails, you can manually copy namenode1's metadata (that is, everything under its ${dfs.namenode.name.dir} directory) into the ${dfs.namenode.name.dir} directory on namenode2, and then run the -bootstrapStandby command again.

3.2.4

Then start the NameNode on namenode2:

$HADOOP_HOME/sbin/hadoop-daemon.sh start namenode

At this point namenode1 and namenode2 are both started, and both are in the "standby" state.

3.2.5

Run the following command on namenode1:

$HADOOP_HOME/bin/hdfs haadmin -transitionToActive nn1

Namenode1's state then becomes "active".

3.3 Start the DataNode on each DataNode machine

$HADOOP_HOME/sbin/hadoop-daemon.sh start datanode

HDFS can then be used normally, with the HA function enabled.

3.4 Check

The status of the active NameNode (namenode1) and the standby NameNode (namenode2) can be viewed on the following pages:

http://namenode1:50070/dfshealth.jsp

http://namenode2:50070/dfshealth.jsp

Run common HDFS shell commands to test whether HDFS works normally.

4 Testing

Stop the NameNode on namenode1 (simulating a namenode1 crash); you will find that HDFS can no longer be used.

Run the following command on namenode2:

$HADOOP_HOME/bin/hdfs haadmin -transitionToActive nn2

Namenode2's state becomes "active", and HDFS returns to normal.

To perform a failover, run the following command on namenode1:

$HADOOP_HOME/bin/hdfs haadmin -failover nn1 nn2

Namenode2's state then becomes "active" and namenode1's state becomes "standby".

5 Structure diagram of QJM-mode HA

The structure of QJM-mode HA involves the active NameNode, the standby NameNode, the JournalNodes, the DataNodes, and clients; the diagram describes the relationships among them.

6 Practical Tips

    • Most of the configuration and operations done on namenode1 must also be done on namenode2, keeping the two machines consistent.
    • Note the difference between a first boot (HA configured from the start) and a non-first boot (HA added after the cluster has run for some time).
    • Because automatic HA failover is prone to SSH login and permission problems, and there are reports on the Internet of automatic failover occasionally failing, manual failover is recommended for production environments; it is more reliable, and problems can be investigated promptly.

In the next article we will build on this one and introduce the configuration for automatic active/standby failover of HA, which is implemented with ZooKeeper.

Resources

http://hadoop.apache.org/docs/r2.2.0/hadoop-yarn/hadoop-yarn-site/HDFSHighAvailabilityWithQJM.html
