Separation Experiment of NameNode and SecondaryNameNode Based on Hadoop 0.20.2


When configuring a Hadoop cluster, we often run the NameNode and SecondaryNameNode on the same node. This is actually quite dangerous: if that node crashes, the entire cluster's metadata cannot be recovered. The following describes how to separate the NameNode from the SecondaryNameNode. Of course, there are still many shortcomings and issues to be improved, and you are welcome to offer your advice.

Note: I originally thought that the host name in the masters configuration file referred to the NameNode, but it actually refers to the SecondaryNameNode. The slaves configuration file lists all the nodes that run a DataNode and a TaskTracker (usually the same nodes). Neither file says where the NameNode and JobTracker run (usually both on the namenode node): the NameNode is specified by fs.default.name in core-site.xml, and the JobTracker by mapred.job.tracker in mapred-site.xml, so they do not need to be configured anywhere else.
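For reference, these two properties look like the following (a sketch; the port numbers 9000 and 9001 are assumptions, keep whatever your cluster already uses):

In core-site.xml (specifies the NameNode):
<property>
  <name>fs.default.name</name>
  <value>hdfs://namenode:9000</value>
</property>

In mapred-site.xml (specifies the JobTracker):
<property>
  <name>mapred.job.tracker</name>
  <value>namenode:9001</value>
</property>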

Do not forget to modify the content of the masters file on the namenode node.

Background: this experiment builds on the cluster environment set up in an earlier article.

1. Clone the node where the NameNode runs; that is, create a new node that is identical in every respect: all files, the directory structure, environment variables, and the configuration files under the conf directory (you can refer to the procedure for adding a new node to a cluster; a copy sketch follows the list below). The related configuration is as follows:
Host name: secondary

IP address: 192.168.5.16

Hosts file:

192.168.5.13 namenode

192.168.5.16 secondary

SSH password-free login
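One way to do the clone from the namenode host (a sketch; the user name zhang and the path /home/zhang/hadoop0202 are taken from the fs.checkpoint.dir value used later in this article, so adjust them to your layout):

rsync -av /home/zhang/hadoop0202/ zhang@secondary:/home/zhang/hadoop0202/
# then reproduce the same environment variables (e.g. in ~/.bashrc) on secondary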

Concerning the hosts file and SSH: I think the SecondaryNameNode only communicates with the NameNode, so you only need to establish a password-free connection between the secondary node and the namenode node, and the hosts file only needs entries for the namenode node and the secondary node itself. Note that the hosts file on the namenode node must also be given the secondary node's entry.
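A minimal sketch of the password-free setup (the user name zhang is an assumption; run the same steps in the opposite direction as well if your start-up scripts need it):

On the namenode node:
ssh-keygen -t rsa -P ""        # skip if a key pair already exists
ssh-copy-id zhang@secondary    # append the public key to secondary's authorized_keys
ssh secondary                  # verify: should log in without a password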

2. File configuration

(1) Modify the hdfs-site.xml file on the namenode node:

<property>
  <name>dfs.secondary.http.address</name>
  <value>192.168.5.16:50090</value>
  <description>The NameNode fetches the newest fsimage via dfs.secondary.http.address</description>
</property>

In the masters file on the namenode node, change the content to: secondary
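Equivalently, from the Hadoop installation directory on the namenode node:

echo secondary > conf/masters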


(2) Modify the hdfs-site.xml file on the secondarynamenode node:

<property>
  <name>dfs.http.address</name>
  <value>192.168.5.13:50070</value>
  <description>The SecondaryNameNode fetches fsimage and edits via dfs.http.address</description>
</property>

Modify the core-site.xml file:

<property>
  <name>fs.checkpoint.period</name>
  <value>3600</value>
  <description>The number of seconds between two periodic checkpoints.</description>
</property>

<property>
  <name>fs.checkpoint.size</name>
  <value>67108864</value>
</property>

<property>
  <name>fs.checkpoint.dir</name>
  <value>/home/zhang/hadoop0202/secondaryname</value>
</property>

fs.checkpoint.period and fs.checkpoint.size are the two conditions that trigger a backup on the SecondaryNameNode node; when either condition is met, the SecondaryNameNode starts a checkpoint. The first sets the time interval between checkpoints in seconds (one hour by default), and the second sets the size the edits log may grow to before a checkpoint is forced (67108864 bytes, i.e. 64 MB).

3. Restart Hadoop, or run the following command directly on the secondary node to start the SecondaryNameNode:

hadoop-daemon.sh start secondarynamenode
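For the full-restart option, from the Hadoop installation directory on the namenode node (a sketch):

bin/stop-all.sh
bin/start-all.sh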

After restarting, we can see:

On the namenode node, there is no longer a SecondaryNameNode Java process (sorry, I forgot to capture this before the separation; there was indeed a SecondaryNameNode Java process on the namenode node before the separation).

The SecondaryNameNode Java process now appears on the secondary node.
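This can be checked with the JDK's jps tool on each node:

# on the namenode node
jps    # expect NameNode and JobTracker, but no SecondaryNameNode
# on the secondary node
jps    # expect SecondaryNameNode to be listed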

Verify that an image file exists in the secondaryname directory on the secondary node. (Since the fs.checkpoint.period parameter set in core-site.xml above is 3600, i.e. one hour, you need to lower the parameter to see the effect of the experiment quickly; for how to do this, refer to the article "How to control the frequency of namenode checkpoints".)
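A quick check on the secondary node (a sketch; the current subdirectory is where Hadoop 0.20.x keeps the checkpoint copy, and lowering fs.checkpoint.period to something like 60 is only for testing):

ls -l /home/zhang/hadoop0202/secondaryname/current/
# expect an fsimage file (plus edits and VERSION) once a checkpoint has run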

