Configuring HDFS Federation for an Existing Hadoop Cluster


I. Purpose of the experiment
1. The existing Hadoop cluster has only one NameNode; a second NameNode is now being added.
2. The two NameNodes will form an HDFS Federation.
3. The existing cluster must not be restarted, and data access must not be affected.

II. Experimental environment
Four CentOS release 6.4 virtual machines with the following IP addresses:
192.168.56.101 master
192.168.56.102 slave1
192.168.56.103 slave2
192.168.56.104 kettle

kettle is a new "clean" machine on which password-free SSH has already been configured; it will be added as the second NameNode.
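Password-free SSH is a prerequisite for the Hadoop control scripts. The original does not show how it was set up; a minimal sketch from master to the new node, assuming the grid user and default RSA key paths, would be:

# On master, as the grid user: generate a key pair (if one does not already exist)
ssh-keygen -t rsa
# Copy the public key to kettle so that ssh kettle works without a password
ssh-copy-id grid@kettle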

Software versions:
Hadoop 2.7.2
HBase 1.1.4
Hive 2.0.0
Spark 1.5.0
Zookeeper 3.4.8
Kylin 1.5.1

Existing configuration:
master runs the Hadoop NameNode, SecondaryNameNode, and ResourceManager, as well as the HBase HMaster.
slave1 and slave2 run the Hadoop DataNode and NodeManager, as well as the HBase HRegionServer.
master, slave1, and slave2 also serve as the three ZooKeeper servers.

III. Configuration steps
1. Edit the hdfs-site.xml file on master. The modified file is as follows.
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/home/grid/hadoop-2.7.2/hdfs/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/home/grid/hadoop-2.7.2/hdfs/data</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
  </property>
  <!-- New properties -->
  <property>
    <name>dfs.nameservices</name>
    <value>ns1,ns2</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.ns1</name>
    <value>master:9000</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.ns1</name>
    <value>master:50070</value>
  </property>
  <property>
    <name>dfs.namenode.secondary.http-address.ns1</name>
    <value>master:9001</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.ns2</name>
    <value>kettle:9000</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.ns2</name>
    <value>kettle:50070</value>
  </property>
  <property>
    <name>dfs.namenode.secondary.http-address.ns2</name>
    <value>kettle:9001</value>
  </property>
</configuration>
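The new properties declare two name services, ns1 (served by master) and ns2 (served by kettle), and give each its own RPC, HTTP, and secondary HTTP addresses. Once both NameNodes are running, each namespace can be addressed explicitly from any client; a quick sanity check (a sketch, using the RPC addresses configured above) is:

# List the root of each namespace by addressing its NameNode directly
hadoop fs -ls hdfs://master:9000/
hadoop fs -ls hdfs://kettle:9000/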
2. Copy the hdfs-site.xml file from master to the other nodes in the cluster.
scp hdfs-site.xml slave1:/home/grid/hadoop-2.7.2/etc/hadoop/
scp hdfs-site.xml slave2:/home/grid/hadoop-2.7.2/etc/hadoop/
3. Copy the Java directory, the Hadoop directory, and the environment variable files from master to kettle.
scp -rp /home/grid/hadoop-2.7.2 kettle:/home/grid/
scp -rp /home/grid/jdk1.7.0_75 kettle:/home/grid/
# Execute as root
scp -p /etc/profile.d/* kettle:/etc/profile.d/
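The contents of the profile.d files are not shown in the original; they presumably export the Java and Hadoop environment variables. A hypothetical sketch, with values inferred from the paths and the symlink used in this article:

# /etc/profile.d/hadoop.sh (hypothetical file name and contents)
export JAVA_HOME=/home/grid/jdk1.7.0_75
export HADOOP_HOME=/home/grid/hadoop
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin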
4. Start the new NameNode and SecondaryNameNode on kettle.
# Execute on kettle
source /etc/profile
ln -s hadoop-2.7.2 hadoop
$HADOOP_HOME/sbin/hadoop-daemon.sh start namenode
$HADOOP_HOME/sbin/hadoop-daemon.sh start secondarynamenode
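Note that the Federation guide referenced at the end of this article says a newly added NameNode should normally be formatted with the cluster ID of the existing cluster before its first start. In this experiment the name directory copied from master in step 3 already contains formatted NameNode metadata (including the cluster ID), which is presumably why the start succeeds without formatting. On a truly empty name directory the step would look roughly like this (the cluster ID value is illustrative and must be taken from the existing NameNode):

# Hypothetical: format the new NameNode with the existing cluster ID (example value)
$HADOOP_HOME/bin/hdfs namenode -format -clusterId CID-xxxxxxxx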

The NameNode and SecondaryNameNode processes are running after execution, as shown in Figure 1.


Figure 1
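The check shown in Figure 1 can be reproduced with jps, which lists the running Java processes:

# On kettle: NameNode and SecondaryNameNode should appear in the output
jps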

5. Refresh the DataNodes so that they register with the newly added NameNode.
# Can be executed on any machine in the cluster
$HADOOP_HOME/bin/hdfs dfsadmin -refreshNamenodes slave1:50020
$HADOOP_HOME/bin/hdfs dfsadmin -refreshNamenodes slave2:50020
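Here 50020 is the default DataNode IPC port (dfs.datanode.ipc.address), and -refreshNamenodes makes each DataNode re-read the NameNode list from hdfs-site.xml without a restart, which is what keeps the existing cluster online. To confirm that the DataNodes now report to both namespaces, the cluster report can be requested from each NameNode (a sketch, using the explicit RPC addresses):

# Both reports should list slave1 and slave2 as live DataNodes
$HADOOP_HOME/bin/hdfs dfsadmin -fs hdfs://master:9000 -report
$HADOOP_HOME/bin/hdfs dfsadmin -fs hdfs://kettle:9000 -report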
At this point, the HDFS Federation configuration is complete. The status of the two NameNodes, viewed from the web UI, is shown in Figures 2 and 3, respectively.


Figure 2


Figure 3


IV. Testing
# Upload a text file to HDFS
hadoop dfs -put /home/grid/hadoop/notice.txt /
# Run wordcount on both NameNode nodes
# Execute on master
hadoop jar /home/grid/hadoop-2.7.2/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar wordcount /notice.txt /output
# Execute on kettle
hadoop jar /home/grid/hadoop-2.7.2/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar wordcount /notice.txt /output1
Use the following commands to view the two output results, as shown in Figures 4 and 5, respectively.
hadoop dfs -cat /output/part-r-00000
hadoop dfs -cat /output1/part-r-00000
Figure 4


Figure 5


Reference:
http://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-hdfs/Federation.html
