First, the purpose of the experiment
1. The existing Hadoop cluster has only one NameNode; a second NameNode is now being added.
2. The two NameNodes will together form an HDFS Federation.
3. This is done without restarting the existing cluster and without affecting data access.
Second, the experimental environment
Four CentOS release 6.4 virtual machines with the following IP addresses:
192.168.56.101 master
192.168.56.102 slave1
192.168.56.103 slave2
192.168.56.104 kettle
Among them, kettle is a new "clean" machine that has already been configured for password-free SSH; it will be added as the second NameNode.
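For reference, password-free SSH to a new machine is typically set up as follows (already done for kettle here; the grid user is assumed from the paths used later in this post):
# Run as the grid user on master; generate a key pair if one does not exist yet
ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
# Copy the public key to the new machine so master can log in without a password
ssh-copy-id grid@kettle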
Software versions:
Hadoop 2.7.2
HBase 1.1.4
Hive 2.0.0
Spark 1.5.0
Zookeeper 3.4.8
Kylin 1.5.1
Existing configuration:
master serves as Hadoop's NameNode, SecondaryNameNode, and ResourceManager, and as HBase's HMaster.
slave1 and slave2 serve as Hadoop's DataNodes and NodeManagers, and as HBase's HRegionServers.
In addition, master, slave1, and slave2 together run a three-node ZooKeeper ensemble.
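This role layout can be confirmed with jps on each host; the expected process names below follow from the roles listed above (QuorumPeerMain is the ZooKeeper server process):
ssh master "jps"   # NameNode, SecondaryNameNode, ResourceManager, HMaster, QuorumPeerMain
ssh slave1 "jps"   # DataNode, NodeManager, HRegionServer, QuorumPeerMain
ssh slave2 "jps"   # DataNode, NodeManager, HRegionServer, QuorumPeerMain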
Third, the configuration steps
1. Edit the hdfs-site.xml file on master. The contents of the modified file are as follows.
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/home/grid/hadoop-2.7.2/hdfs/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/home/grid/hadoop-2.7.2/hdfs/data</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
  </property>
  <!-- New properties -->
  <property>
    <name>dfs.nameservices</name>
    <value>ns1,ns2</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.ns1</name>
    <value>master:9000</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.ns1</name>
    <value>master:50070</value>
  </property>
  <property>
    <name>dfs.namenode.secondary.http-address.ns1</name>
    <value>master:9001</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.ns2</name>
    <value>kettle:9000</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.ns2</name>
    <value>kettle:50070</value>
  </property>
  <property>
    <name>dfs.namenode.secondary.http-address.ns2</name>
    <value>kettle:9001</value>
  </property>
</configuration>
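With dfs.nameservices set as above, each NameNode manages its own independent namespace, and clients can address either one explicitly by URI. As a quick sanity check once both NameNodes are running (a minimal sketch; the hostnames and ports are the ones configured above):
# List the root of each namespace by its own URI
$HADOOP_HOME/bin/hdfs dfs -ls hdfs://master:9000/
$HADOOP_HOME/bin/hdfs dfs -ls hdfs://kettle:9000/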
2. Copy the hdfs-site.xml file from master to the other nodes in the cluster.
scp hdfs-site.xml slave1:/home/grid/hadoop-2.7.2/etc/hadoop/
scp hdfs-site.xml slave2:/home/grid/hadoop-2.7.2/etc/hadoop/
3. Copy the Java directory, the Hadoop directory, and the environment variable files from master to kettle.
scp -rp /home/grid/hadoop-2.7.2 kettle:/home/grid/
scp -rp /home/grid/jdk1.7.0_75 kettle:/home/grid/
# Execute as root
scp -p /etc/profile.d/* kettle:/etc/profile.d/
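A quick way to verify the copies landed (a sketch using the paths above):
# Confirm the Hadoop and JDK directories exist on kettle and that Java runs
ssh kettle "ls /home/grid"
ssh kettle "/home/grid/jdk1.7.0_75/bin/java -version"
Note that the recursive copy also brings over master's hdfs/name metadata directory, which is presumably why no separate namenode -format step appears in this walkthrough: the new NameNode starts from a copy of master's formatted metadata, carrying the same cluster ID.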
4. Start the new NameNode and SecondaryNameNode on kettle.
# Execute on kettle
source /etc/profile
ln -s hadoop-2.7.2 hadoop
$HADOOP_HOME/sbin/hadoop-daemon.sh start namenode
$HADOOP_HOME/sbin/hadoop-daemon.sh start secondarynamenode
After execution, the NameNode and SecondaryNameNode processes are running, as shown in Figure 1.
Figure 1
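The same check can be done from the command line with jps (the PIDs below are illustrative):
# On kettle
jps
# Expected output includes something like:
# 2345 NameNode
# 2468 SecondaryNameNode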
5. Refresh the DataNodes so they pick up the newly added NameNode.
# Can be executed on any machine in the cluster
$HADOOP_HOME/bin/hdfs dfsadmin -refreshNamenodes slave1:50020
$HADOOP_HOME/bin/hdfs dfsadmin -refreshNamenodes slave2:50020
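As an optional check, confirm that both DataNodes have registered with the new NameNode; a sketch using the generic -fs option to point the command at a specific NameNode:
# Should report two live datanodes for the ns2 namespace
$HADOOP_HOME/bin/hdfs dfsadmin -fs hdfs://kettle:9000 -report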
At this point, the HDFS Federation configuration is complete. The status of the two NameNodes, viewed from the web UI, is shown in Figures 2 and 3.
Figure 2
Figure 3
Fourth, testing
# Upload a text file to HDFS
hadoop dfs -put /home/grid/hadoop/notice.txt /
# Run wordcount on both NameNode machines
# Execute on master
hadoop jar /home/grid/hadoop-2.7.2/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar wordcount /notice.txt /output
# Execute on kettle
hadoop jar /home/grid/hadoop-2.7.2/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar wordcount /notice.txt /output1
Use the following commands to view the two output results, as shown in Figures 4 and 5.
hadoop dfs -cat /output/part-r-00000
hadoop dfs -cat /output1/part-r-00000
Figure 4
Figure 5
References:
http://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-hdfs/Federation.html
Configuring HDFS Federation for a Hadoop cluster that already exists