CDH4B1 (hadoop-0.23) NameNode HA Installation and Configuration


The Cloudera CDH4B1 release already includes NameNode HA, and the community has merged the NameNode HA branch HDFS-1623 into trunk. It provides hot standby for a pair of NameNodes, but currently only manual failover is supported, not automatic failover. For the community's progress on automatic failover, see: https://issues.apache.org/jira/browse/HDFS-3042

NameNode HA (CDH4B1) documentation:

https://ccp.cloudera.com/display/CDH4B1/CDH4+Beta+1+High+Availability+Guide

The following describes how to install and configure NameNode HA in CDH4B1:

1. First install Java and configure passwordless SSH. This is the same as in previous versions, so it is omitted here apart from the brief sketch below.
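For completeness, a minimal sketch of the passwordless SSH setup, assuming the same user account exists on every node (the user name and target host below are illustrative):

# Generate a key pair on the namenode (accept the defaults)
ssh-keygen -t rsa
# Copy the public key to every other node, e.g. the second namenode
ssh-copy-id pt@10.250.8.107
# Verify that login now works without a password prompt
ssh pt@10.250.8.107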

2. Download and unpack CDH4B1

Download: https://ccp.cloudera.com/display/SUPPORT/CDH4+Downloadable+Tarballs

Unpack: tar -zxvf hadoop-0.23.0-cdh4b1.tar.gz

3. Set environment variables:

Configure JAVA_HOME in the hadoop-config.sh file under the hadoop-cdh4b1/libexec directory; the export line below is the one you add:

# Attempt to set JAVA_HOME if it is not set

export JAVA_HOME=/home/liuminggang.pt/java_current

if [[ -z $JAVA_HOME ]]; then
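A quick way to confirm the scripts pick up JAVA_HOME is to run a trivial Hadoop command from the hadoop-cdh4b1 directory; if JAVA_HOME is wrong, it fails immediately:

bin/hadoop version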

4. Add configuration items to the Hadoop configuration files:

(The configuration files live directly in the hadoop-cdh4b1/etc/hadoop directory.) I used a total of five machines here:

10.250.8.106 namenode

10.250.8.107 namenode

10.250.8.108 datanode

10.250.8.109 datanode

10.250.8.110 datanode
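The sbin start-up scripts read the datanode list from etc/hadoop/slaves, so with the layout above that file would simply contain the three datanode addresses (assuming IPs rather than hostnames are used throughout):

10.250.8.108
10.250.8.109
10.250.8.110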

4.1 Configure the core-site.xml file:

<property>
  <name>hadoop.tmp.dir</name>
  <value>/home/pt/hadoop-cdh4b1/hadoop_tmp</value>
  <description>A base for other temporary directories.</description>
</property>

<property>
  <name>fs.defaultFS</name>
  <value>hdfs://goon</value> <!-- goon is the nameservice ID; pick your own name here -->
</property>

4.2 Configure the hdfs-site.xml file:

Configure dfs.namenode.name.dir:

<property>
  <name>dfs.namenode.name.dir</name>
  <value>/home/pt/hadoop-cdh4b1/cluster-data</value>
</property>

Configure dfs.federation.nameservices:

<property>
  <name>dfs.federation.nameservices</name>
  <value>goon</value> <!-- goon is the name of the nameservice -->
</property>

Configure dfs.ha.namenodes.[nameservice ID]; goon below is the nameservice ID, and nn1 and nn2 are the service IDs of the two namenodes.

<property>
  <name>dfs.ha.namenodes.goon</name>
  <value>nn1,nn2</value>
</property>

Configure the RPC address, dfs.namenode.rpc-address.[nameservice ID].[namenode ID]:

<property>
  <name>dfs.namenode.rpc-address.goon.nn1</name>
  <value>10.250.8.106:9000</value>
</property>

<property>
  <name>dfs.namenode.rpc-address.goon.nn2</name>
  <value>10.250.8.107:9000</value>
</property>

Configure the HTTP address, dfs.namenode.http-address.[nameservice ID].[namenode ID]:

<property>
  <name>dfs.namenode.http-address.goon.nn1</name>
  <value>10.250.8.106:50070</value>
</property>

<property>
  <name>dfs.namenode.http-address.goon.nn2</name>
  <value>10.250.8.107:50070</value>
</property>

Configure dfs.namenode.shared.edits.dir:

dfs.namenode.shared.edits.dir is the directory shared by the active and standby namenodes; both must be able to read and write it. I use NFS to provide the share (a minimal sketch follows the property below). For NFS setup, see: http://www.vpsee.com/2010/06/install-nfs-4-0-on-centos/

<property>
  <name>dfs.namenode.shared.edits.dir</name>
  <value>/home/pt/data_nfs</value>
</property>
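A minimal NFS sketch for the shared directory, assuming a separate machine exports it and both namenodes mount it at /home/pt/data_nfs; the server name, export path, and export options here are illustrative:

# On the NFS server: export the directory (add this line to /etc/exports), then reload
/export/data_nfs 10.250.8.0/24(rw,sync,no_root_squash)
exportfs -ra

# On each namenode: mount the share at the path configured above
mount -t nfs nfs-server:/export/data_nfs /home/pt/data_nfs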

Configure client failover:

<property>
  <name>dfs.client.failover.proxy.provider.goon</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>

Configure fencing. dfs.ha.fencing.methods supports two built-in methods, sshfence and shell. Here I use sshfence, with dfs.ha.fencing.ssh.private-key-files pointing to the SSH private key file:

<property>
  <name>dfs.ha.fencing.methods</name>
  <value>sshfence</value>
</property>

<property>
  <name>dfs.ha.fencing.ssh.private-key-files</name>
  <value>/home/lpt/.ssh/id_rsa</value>
</property>
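For reference, the shell method mentioned above runs an arbitrary command instead of SSHing into the failed node; a hedged sketch follows (the script path is illustrative, and the command must exit 0 for fencing to be considered successful):

<property>
  <name>dfs.ha.fencing.methods</name>
  <value>shell(/home/pt/bin/fence-namenode.sh)</value>
</property>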

In addition, the datanodes must be configured as follows:

<property>
  <name>dfs.datanode.data.dir</name>
  <value>/home/pt/hadoop-cdh4b1/datanode-data</value>
</property>

4.3 Configure the yarn-site.xml file:

See: http://blog.sina.com.cn/s/blog_4a1f59bf010116rh.html

5. Initialize the namenodes

First format the namenode: bin/hdfs namenode -format. The metadata under dfs.namenode.name.dir must be identical on both namenodes, and dfs.namenode.shared.edits.dir must contain all recent edits files. To achieve this, copy the metadata in the primary namenode's dfs.namenode.name.dir directory to the standby namenode's dfs.namenode.name.dir directory and to the shared dfs.namenode.shared.edits.dir directory (a sketch follows).
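A sketch of that step, run on the primary namenode (10.250.8.106); the paths match the configuration above, while the remote user and the exact copy granularity are assumptions:

# Format the primary namenode
bin/hdfs namenode -format
# Make the standby's name directory identical to the primary's
scp -r /home/pt/hadoop-cdh4b1/cluster-data pt@10.250.8.107:/home/pt/hadoop-cdh4b1/
# Seed the shared edits directory (on NFS) from the same metadata
cp -r /home/pt/hadoop-cdh4b1/cluster-data/* /home/pt/data_nfs/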

6. Start the cluster:

Start HDFS: sbin/start-dfs.sh

Start MapReduce (YARN): sbin/start-yarn.sh

Note: when the cluster is started with the NameNode HA configuration, both namenodes come up in the standby state, and at that point the cluster cannot be used. To bring the service up, use the haadmin command: bin/hdfs haadmin -transitionToActive nn1 makes the nn1 configured above active and leaves nn2 in standby. haadmin supports the following operations:

Usage: DFSHAAdmin [-ns <nameserviceId>]
    [-transitionToActive <serviceId>]
    [-transitionToStandby <serviceId>]
    [-failover [--forcefence] [--forceactive] <serviceId> <serviceId>]
    [-getServiceState <serviceId>]
    [-checkHealth <serviceId>]
    [-help <command>]

For example, bin/hdfs haadmin -failover nn1 nn2 makes nn2 active and nn1 standby. This command may fail with: Failover failed: /10.250.8.107:9000 is not ready to become active. In that case, run bin/hdfs haadmin -failover --forceactive nn1 nn2 to get around the problem. To understand exactly what these options mean, look at the hadoop-0.23 or CDH4B1 source code.
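To confirm which namenode is active after a transition or failover, query each service ID with the getServiceState operation listed above:

bin/hdfs haadmin -getServiceState nn1
bin/hdfs haadmin -getServiceState nn2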
