Introduction to HA in Hadoop 2.x

Source: Internet
Author: User

This article introduces high availability (HA) in Hadoop 2.0. Before covering HA configuration, it first explains the basic principles of Hadoop 2.0 HA and the two methods of implementing it.

1. Overview

Before Hadoop 2.0, a cluster had only one NameNode, which was a single point of failure. Hadoop 1.0 did have the SecondaryNameNode, CheckpointNode, and BackupNode, but none of them removed that single point of failure. Hadoop 2.0 introduced an HA mechanism, and officially supports two methods of implementing it: one based on NFS (Network File System) and one based on QJM (Quorum Journal Manager).


2. Basic Principles

The HA mechanism of Hadoop 2.0 uses two NameNodes: an active NameNode, in the active state, and a standby NameNode, in the standby state. The two can switch states, but they can never both be active; at most one is active at any time. Only the active NameNode serves external requests; the standby NameNode does not. The active and standby NameNodes synchronize their data through NFS or through JournalNodes (JN, used in QJM mode).

The active NameNode writes the latest operation records to a local edits file and also sends them to NFS or the JournalNodes. The standby NameNode checks periodically, reads the most recent edits from NFS or the JournalNodes, and merges the edits with its fsimage file to produce a new fsimage. After the merge, it notifies the active NameNode to fetch the new fsimage, and the active NameNode then replaces its original fsimage file with the new one.
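The edits/fsimage cycle described above can be sketched as a toy model. This is only an illustration under simplifying assumptions: the class and function names (SharedEditLog, ActiveNameNode, StandbyNameNode, apply_edits) are hypothetical and are not Hadoop classes, the namespace is reduced to a dictionary of paths, and the shared log is an in-memory list standing in for NFS or the JournalNodes.

```python
# Toy model of the edits/fsimage cycle. All names are illustrative
# assumptions, not part of Hadoop.

def apply_edits(fsimage, edits):
    """Return a new namespace snapshot with the edit records applied."""
    image = dict(fsimage)
    for op, path in edits:
        if op == "create":
            image[path] = True
        elif op == "delete":
            image.pop(path, None)
    return image


class SharedEditLog:
    """Stands in for the NFS directory or the set of JournalNodes."""

    def __init__(self):
        self.records = []

    def append(self, record):
        self.records.append(record)

    def read_from(self, offset):
        return self.records[offset:]


class ActiveNameNode:
    def __init__(self, shared):
        self.shared = shared
        self.local_edits = []
        self.fsimage = {}

    def log(self, op, path):
        record = (op, path)
        self.local_edits.append(record)  # local edits file
        self.shared.append(record)       # shared storage (NFS or JN)

    def install_fsimage(self, image):
        # Replace the old fsimage with the one produced by the standby.
        self.fsimage = image


class StandbyNameNode:
    def __init__(self, shared):
        self.shared = shared
        self.fsimage = {}
        self.applied = 0  # number of shared records already merged

    def checkpoint(self, active):
        # Read only the edits that have not been merged yet.
        new_edits = self.shared.read_from(self.applied)
        self.fsimage = apply_edits(self.fsimage, new_edits)
        self.applied += len(new_edits)
        # Hand a copy of the merged image to the active NameNode.
        active.install_fsimage(dict(self.fsimage))


shared = SharedEditLog()
active = ActiveNameNode(shared)
standby = StandbyNameNode(shared)

active.log("create", "/a")
active.log("create", "/b")
active.log("delete", "/a")
standby.checkpoint(active)

print(sorted(active.fsimage))  # ['/b']
```

Because the standby tracks how many records it has already merged, a later checkpoint only needs to read the edits written since the previous one.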

In this way, the data on the active and standby NameNodes is kept in sync in near real time, and the standby NameNode can take over as the active NameNode at any time (for example, when the active NameNode fails). The standby NameNode also takes over the role that the SecondaryNameNode, CheckpointNode, and BackupNode played in Hadoop 1.0: it merges the edits and fsimage files so that the fsimage stays up to date. Therefore, once the HA mechanism of Hadoop 2.0 is enabled, the SecondaryNameNode, CheckpointNode, and BackupNode are no longer needed.


3. NFS Mode

In this mode, the active and standby NameNodes share data through NFS. The active NameNode writes the latest edits to NFS, and the standby NameNode reads them from NFS. The disadvantage of this method is that a network problem between NFS and either the active or the standby NameNode disrupts data synchronization between the two.


4. QJM (Quorum Journal Manager) Mode

QJM solves the fault-tolerance problem of the NFS mode. The active and standby NameNodes share data through a group of JournalNodes. The number of JournalNodes is odd: 3, 5, 7, ..., 2n + 1. The active NameNode writes the latest edits to all 2n + 1 JournalNodes, and the write is considered successful as soon as n + 1 of them (a majority) succeed. The standby NameNode can then read the edits from the JournalNodes. QJM is therefore fault tolerant: it can tolerate the failure of up to n JournalNodes.
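The majority arithmetic above can be written out as a short sketch. The function names here are made up for illustration and do not correspond to any Hadoop API.

```python
# Majority-quorum arithmetic for a group of JournalNodes.
# Function names are illustrative assumptions, not Hadoop APIs.

def quorum_size(num_journalnodes):
    """Smallest number of successful writes that forms a majority."""
    return num_journalnodes // 2 + 1


def tolerated_failures(num_journalnodes):
    """How many JournalNodes may fail while a majority can still be reached."""
    return num_journalnodes - quorum_size(num_journalnodes)


def write_succeeds(acks, num_journalnodes):
    """A write succeeds once a majority of JournalNodes acknowledge it."""
    return acks >= quorum_size(num_journalnodes)


for n in (1, 2, 3):
    total = 2 * n + 1
    print(total, quorum_size(total), tolerated_failures(total))
# 3 2 1
# 5 3 2
# 7 4 3
```

With 2n + 1 JournalNodes the quorum is n + 1 and the tolerated failure count is n, which matches the description above.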


5. Active/standby switchover

The active and standby NameNodes can be switched at any time. When the active NameNode fails, the standby NameNode can be switched to the active state and becomes the new active NameNode. Both manual and automatic switchover are supported. With manual switchover, an administrator changes a NameNode's state, from standby to active or from active to standby, by running the HA administration command (hdfs haadmin). With automatic switchover, when the active NameNode fails, the standby NameNode automatically switches to the active state and replaces the original active NameNode, so HDFS continues to work normally.

Automatic switchover between the active and standby NameNodes requires ZooKeeper. The state of the active and standby NameNodes is recorded in ZooKeeper through a ZKFailoverController (ZKFC) process that runs alongside each NameNode and monitors its health. When the active NameNode goes down, its ZooKeeper session expires, the change is detected, and the standby NameNode is automatically switched to active.
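The idea behind automatic switchover can be illustrated with a toy simulation. This is a sketch under loud assumptions: LockService is a hypothetical in-memory stand-in for ZooKeeper, the NameNode class and its heartbeat method are invented for the example, and a failed node releases the lock explicitly, whereas in a real deployment the lock disappears when the ZooKeeper session expires.

```python
# Toy simulation of lock-based active/standby election.
# LockService and NameNode are illustrative assumptions, not Hadoop
# or ZooKeeper classes.

class LockService:
    """Stands in for ZooKeeper: the lock has at most one owner."""

    def __init__(self):
        self.owner = None

    def try_acquire(self, name):
        if self.owner is None:
            self.owner = name
        return self.owner == name

    def release(self, name):
        if self.owner == name:
            self.owner = None


class NameNode:
    def __init__(self, name, lock):
        self.name = name
        self.lock = lock
        self.healthy = True
        self.state = "standby"

    def heartbeat(self):
        if not self.healthy:
            # Simulates the session expiring after a failure.
            self.lock.release(self.name)
            self.state = "down"
        elif self.lock.try_acquire(self.name):
            self.state = "active"
        else:
            self.state = "standby"


lock = LockService()
nn1 = NameNode("nn1", lock)
nn2 = NameNode("nn2", lock)

for nn in (nn1, nn2):
    nn.heartbeat()
print(nn1.state, nn2.state)  # active standby

nn1.healthy = False  # the active NameNode fails
for nn in (nn1, nn2):
    nn.heartbeat()
print(nn1.state, nn2.state)  # down active
```

Since only the lock owner may be active, the sketch also preserves the rule that at most one NameNode is active at any time.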


6. Practice tips

1): The QJM method has clear advantages. First, it has a built-in fencing function. Second, it makes the system more robust by using multiple JournalNodes. We therefore recommend QJM for production environments.

2): A JournalNode consumes few resources, so no additional machines are needed to run JournalNodes. You can pick several machines from the Hadoop cluster to serve as JournalNodes.


From: http://www.linuxidc.com/Linux/2014-05/101174.htm
