Hadoop cluster management: SecondaryNameNode and NameNode


Simply put, it is easy for beginners to assume that the SecondaryNameNode (snn) is a hot standby process for the NameNode (nn). It is not. The snn is an integral part of the HDFS architecture, but it is often misunderstood because of its name. Its real job is to keep a backup of the HDFS metadata held by the namenode and to reduce the namenode's restart time. Some configuration work is still required to run the snn correctly in a Hadoop deployment. In Hadoop's default configuration, the snn process runs on the same machine as the namenode, but if that machine fails, restoring the HDFS file system becomes a serious problem. A better approach is to configure the snn process to run on another machine.

In Hadoop, the namenode is responsible for persisting HDFS metadata and for handling clients' HDFS operations. To keep these interactions fast, the HDFS metadata is loaded into the namenode machine's memory, and it is also written to disk for persistence. So that this persistence does not become a bottleneck for HDFS operations, Hadoop does not persist a fresh snapshot of the file system for every change; instead, the HDFS operations of the most recent period are appended to an EditLog file on the namenode. When the namenode restarts, it loads the fsImage and then replays the operations recorded in the EditLog to restore the final state of HDFS before the restart.
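
As a rough sketch of where this data lives (the directory path below is hypothetical; it corresponds to whatever dfs.name.dir is set to, and the exact layout depends on the Hadoop version), the persisted metadata of a Hadoop 0.20/1.x namenode typically looks like this:

Shell example
  # Hypothetical dfs.name.dir; file names are the typical ones, contents are binary.
  $ ls /data/hadoop/dfs/name/current
  VERSION  edits  fsimage  fstime

Here fsimage is the persisted image of the namespace, edits is the EditLog of operations since the last checkpoint, fstime records when that checkpoint was taken, and VERSION holds the layout version and namespace ID.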

The SecondaryNameNode periodically merges the HDFS operations recorded in the EditLog into a new checkpoint and then truncates the EditLog. A restarted namenode therefore only needs to load the latest checkpoint and replay the operations recorded since that checkpoint, so the EditLog stays relatively small. Without this periodic merge by the snn, every namenode restart would take a long time; the periodic merge keeps restarts fast and also safeguards the integrity of the HDFS metadata.

That is all the SecondaryNameNode does. It does not relieve the namenode of the load of handling interactive HDFS operations. Even so, when the namenode machine goes down or the namenode process fails, an administrator can manually copy the metadata from the snn to restore the HDFS file system.

There are two reasons to run the snn process on a machine other than the namenode:
  1. Scalability: creating a new HDFS checkpoint requires copying all the metadata that the namenode holds in memory, so the operation needs roughly as much memory as the namenode itself uses. Since the memory available to the namenode process effectively limits the size of the HDFS file system, a very large file system may leave the namenode machine without enough memory to run the snn as well.
  2. Fault tolerance: when the snn creates a checkpoint, it writes out an additional copy of the metadata. Running this operation on another machine therefore also adds a measure of fault tolerance to the distributed file system.
Configuring the SecondaryNameNode to run on another machine

A running HDFS instance is started by the $HADOOP_HOME/bin/start-dfs.sh (or start-all.sh) script on the namenode machine. This script starts the namenode process on the machine it is run from and starts a DataNode process on each slave machine; the list of slaves is kept in the conf/slaves file, one machine per line. An snn process is started on the machine(s) specified by the conf/masters file. Note that listing a machine in conf/masters does not mean the jobtracker or namenode process must run there; those processes run on whichever machines launch bin/start-dfs.sh or bin/start-mapred.sh (or start-all.sh). The name of the masters file is therefore quite confusing; it would be better called secondaries.
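
For example, with hypothetical hostnames, the two configuration files might look like this (one machine per line):

Shell example
  $ cat conf/slaves
  datanode1.example.com
  datanode2.example.com
  datanode3.example.com
  $ cat conf/masters
  snn.example.com

To run the snn on a separate machine, follow these steps: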

  1. Write every machine that should run the secondarynamenode process into the masters file, one per line.
  2. On each machine listed in the masters file, modify the conf/hadoop-site.xml file to include the following property:
XML configuration
  <property>
    <name>dfs.http.address</name>
    <value>namenode.hadoop-host.com:50070</value>
  </property>

core-site.xml: two parameters are configurable here, although in general we do not modify them. fs.checkpoint.period controls how often a checkpoint of the HDFS image is taken; the default is one hour (3600 seconds). fs.checkpoint.size is the edit log size that triggers a checkpoint even if the period has not elapsed; the default is 64 MB.

XML configuration
  <property>
    <name>fs.checkpoint.period</name>
    <value>3600</value>
    <description>The number of seconds between two periodic checkpoints.
    </description>
  </property>
  <property>
    <name>fs.checkpoint.size</name>
    <value>67108864</value>
    <description>The size of the current edit log (in bytes) that triggers
    a periodic checkpoint even if fs.checkpoint.period hasn't expired.
    </description>
  </property>
  <property>
    <name>fs.checkpoint.dir</name>
    <value>yourdir</value>
  </property>

3. Check the configuration. After the configuration is complete, verify that it worked by inspecting the file system on the machine that runs the secondarynamenode. Run jps to check whether the SecondaryNameNode process exists; if it does, check whether there are checkpoint records in the corresponding directory. For example:

This directory is usually ${hadoop.tmp.dir}/dfs/namesecondary.
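
A quick check might look like the following sketch (the jps output and the temporary directory are hypothetical; the actual path depends on your hadoop.tmp.dir setting, and the checkpoint files appear only after at least one checkpoint has completed):

Shell example
  # Run on the machine that is supposed to host the SecondaryNameNode
  $ jps
  4321 SecondaryNameNode
  $ ls /tmp/hadoop-hadoop/dfs/namesecondary/current
  VERSION  edits  fsimage  fstime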

IV. Recovery

1. How to restore after the configuration is complete. First, kill the namenode process and delete the data under the hadoop.tmp.dir directory to simulate a failure of the master.

2. Create an empty folder at the location specified by the configuration parameter dfs.name.dir; point the configuration parameter fs.checkpoint.dir at the checkpoint directory;

copy the files from namesecondary into fs.checkpoint.dir;

then run ./hadoop namenode -importCheckpoint;

that is, start the NameNode with the -importCheckpoint option (see hadoop-0.20.2/docs/cn/hdfs_user_guide.html#Secondary+NameNode, which documents this procedure; a command-line sketch follows step 3 below).

3. The hadoop namenode -importCheckpoint command is used when the namenode is started.
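
Putting the recovery steps together, a minimal sketch might look like this (the hostname and directory paths are hypothetical and must match your own dfs.name.dir and fs.checkpoint.dir settings):

Shell example
  # Recreate an empty dfs.name.dir on the namenode machine
  $ mkdir -p /data/hadoop/dfs/name
  # Copy the checkpoint data from the snn machine into fs.checkpoint.dir
  $ scp -r snn.example.com:/tmp/hadoop-hadoop/dfs/namesecondary /data/hadoop/dfs/namesecondary
  # Start the namenode with -importCheckpoint so it loads the checkpoint
  $ bin/hadoop namenode -importCheckpoint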

V. Summary

1. You can configure multiple secondarynamenodes by listing additional machines in the masters file.

2. Remember that you must manually copy the data to the namenode machine to restore it; recovery is not automatic (see the recovery procedure described above).

3. The backup interval can be changed. If you do not want to checkpoint the image once an hour, shorten the fs.checkpoint.period value in core-site.xml.
