Introduction to Hadoop 2.0 HA (High Availability)

Last Update:2018-07-26 Source: Internet

Author: User


This article introduces the basic principles of Hadoop 2.0 HA (High Availability) and the two ways of implementing it.

1 Overview

Before Hadoop 2.0, a cluster had only one NameNode, which made it a single point of failure. Hadoop 1.0 did have SecondaryNameNode, CheckpointNode, and BackupNode, but none of them removed that single point of failure.

Hadoop 2.0 introduced an HA mechanism. The official documentation describes two ways to set it up: one uses NFS (Network File System), and the other uses QJM (Quorum Journal Manager).

2 Fundamentals

The Hadoop 2.0 HA mechanism uses two NameNodes: an active NameNode, whose state is active, and a standby NameNode, whose state is standby. The two can swap states, but they can never both be active; at most one NameNode is active at any time. Only the active NameNode serves external requests; the standby NameNode does not. Data is synchronized between the active and standby NameNodes through NFS or through JournalNodes (JN, used in QJM mode).

The active NameNode writes its most recent operations to a local edits file and also sends them to NFS or to the JournalNodes.
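The "at most one active" rule can be illustrated with a small Python model. This is only a sketch of the invariant, not Hadoop code: the class NameNodePair, its methods, and the node names nn1 and nn2 are hypothetical names chosen for the example.

```python
from enum import Enum


class State(Enum):
    ACTIVE = "active"
    STANDBY = "standby"


class NameNodePair:
    """Toy model of two NameNodes in which at most one may be active."""

    def __init__(self, names=("nn1", "nn2")):
        # Both nodes start in standby; nothing has been promoted yet.
        self.states = {name: State.STANDBY for name in names}

    def active(self):
        """Return the name of the active node, or None if there is none."""
        for name, state in self.states.items():
            if state is State.ACTIVE:
                return name
        return None

    def transition_to_active(self, name):
        # Refuse the promotion if the other node is still active.
        current = self.active()
        if current is not None and current != name:
            raise RuntimeError(f"{current} is already active")
        self.states[name] = State.ACTIVE

    def transition_to_standby(self, name):
        self.states[name] = State.STANDBY
```

In this model, promoting nn2 while nn1 is active raises an error; nn1 has to be moved to standby first, which mirrors the rule that the two NameNodes are never active together.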

The standby NameNode checks periodically, reads the latest edits from NFS or from the JournalNodes, and merges them with the fsimage file into a new fsimage. When the merge is complete, it notifies the active NameNode, which obtains the new fsimage file and uses it to replace its old one.
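The merge step can be sketched in Python as replaying edits on top of an image. The data layout here is an assumption made for illustration (a dict for the fsimage and tuples for edit entries); the real fsimage and edits files are binary formats, and the function name checkpoint is hypothetical.

```python
def checkpoint(fsimage, edits):
    """Apply edit-log entries newer than the image and return a new image.

    fsimage: {"last_txid": int, "namespace": {path: kind}}
    edits:   list of (txid, op, path), where op is "mkdir", "create" or "delete"
    """
    namespace = dict(fsimage["namespace"])  # copy, so the old image stays intact
    last_txid = fsimage["last_txid"]
    for txid, op, path in sorted(edits):
        if txid <= last_txid:
            continue  # already reflected in the image
        if op == "mkdir":
            namespace[path] = "dir"
        elif op == "create":
            namespace[path] = "file"
        elif op == "delete":
            namespace.pop(path, None)
        else:
            raise ValueError(f"unknown op: {op}")
        last_txid = txid
    return {"last_txid": last_txid, "namespace": namespace}


old_image = {"last_txid": 2, "namespace": {"/a": "dir"}}
edit_log = [(3, "create", "/a/f"), (4, "mkdir", "/b"), (5, "delete", "/b")]
new_image = checkpoint(old_image, edit_log)
```

After the call, new_image covers every transaction up to 5, so edits at or below that transaction ID no longer need to be replayed on restart.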

In this way, the data on the active and standby NameNodes stays synchronized in near real time, so the standby NameNode can take over as the active NameNode at any time (for example, when the active NameNode fails).

This also covers the job that SecondaryNameNode, CheckpointNode, and BackupNode performed in Hadoop 1.0: merging the edits files with the fsimage file so that the fsimage stays up to date. Once the Hadoop 2.0 HA mechanism is enabled, SecondaryNameNode, CheckpointNode, and BackupNode are therefore no longer needed.

3 NFS Mode

NFS acts as shared storage between the active and standby NameNodes. The active NameNode writes its most recent edits to NFS, and the standby NameNode reads them from NFS. The disadvantage of this approach is that if the network between NFS and either the active or the standby NameNode has a problem, data synchronization between the two NameNodes can break.

4 QJM (Quorum Journal Manager) Mode

The QJM approach addresses the weak fault tolerance of the NFS approach. Data is shared between the active and standby NameNodes through a group of JournalNodes. The group has an odd number of members (3, 5, 7, ..., 2n+1). The active NameNode writes its most recent edits to all 2n+1 JournalNodes, and the write is considered successful as long as n+1 of them succeed; the standby NameNode then reads the edits from the JournalNodes. The QJM approach is therefore fault tolerant: it can tolerate the failure of up to n JournalNodes.
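The majority rule described above can be sketched in Python. This is a simplified model, not the QJM protocol itself: the JournalNode class and the quorum_write function are hypothetical names, and a failed node is simulated by raising ConnectionError.

```python
class JournalNode:
    """Toy journal node that stores (txid, record) pairs in memory."""

    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy
        self.log = []

    def write(self, txid, record):
        if not self.healthy:
            raise ConnectionError(f"{self.name} is unreachable")
        self.log.append((txid, record))


def quorum_write(journal_nodes, txid, record):
    """Return True if a majority (n + 1 of 2n + 1) acknowledged the write."""
    acks = 0
    for jn in journal_nodes:
        try:
            jn.write(txid, record)
            acks += 1
        except ConnectionError:
            pass  # a failed node simply does not count toward the quorum
    majority = len(journal_nodes) // 2 + 1
    return acks >= majority


# Three JournalNodes (n = 1): one failure is tolerated, two are not.
nodes = [JournalNode("jn1"), JournalNode("jn2"), JournalNode("jn3", healthy=False)]
ok = quorum_write(nodes, 1, "mkdir /a")
```

With three nodes the majority is two, so the write above succeeds even though jn3 is down; marking a second node unhealthy would make quorum_write return False.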

5 Switching Between the Active and Standby Nodes

The active and standby NameNodes can switch roles at any time, and when the active NameNode fails, the standby NameNode can be promoted to the active state. Switching can be manual or automatic. Manual switching is done by running HA administration commands (the hdfs haadmin tool) that change a NameNode's state from standby to active or from active to standby. With automatic switching, when the active NameNode fails, the standby NameNode moves to the active state on its own and replaces the original as the new active NameNode, so HDFS continues working.

Automatic switching between the active and standby nodes requires ZooKeeper to be configured. The state of the active and standby NameNodes is recorded in ZooKeeper in real time, and their state changes are monitored through it; in Hadoop this work is done by a ZKFailoverController process that runs alongside each NameNode. When the failure of the active NameNode is detected, the standby NameNode is automatically switched to active.
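The idea behind automatic switching can be sketched in Python with a lock that is released when its holder's session ends. This is a deliberately simplified model under stated assumptions: LockService is a stand-in for a coordination service and is not the ZooKeeper API, and elect and the node names are hypothetical.

```python
class LockService:
    """Stand-in for a coordination service; NOT the ZooKeeper API."""

    def __init__(self):
        self.holder = None

    def try_acquire(self, node):
        # The first node to ask gets the lock; the holder keeps it.
        if self.holder is None:
            self.holder = node
        return self.holder == node

    def session_expired(self, node):
        # The lock disappears when its owner's session ends.
        if self.holder == node:
            self.holder = None


def elect(lock, candidates, healthy):
    """Release locks of failed nodes, then return whichever node is active."""
    for node in candidates:
        if not healthy[node]:
            lock.session_expired(node)
    for node in candidates:
        if healthy[node] and lock.try_acquire(node):
            return node
    return None


lock = LockService()
first = elect(lock, ["nn1", "nn2"], {"nn1": True, "nn2": True})
after_failure = elect(lock, ["nn1", "nn2"], {"nn1": False, "nn2": True})
```

Here nn1 wins the first round, and once nn1 is reported unhealthy its lock is released and nn2 takes over. If nn1 later recovers, nn2 stays active in this model because it still holds the lock.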

6 Practical Tips

The QJM approach has two clear advantages: it provides a fencing function, and its multiple JournalNodes make the system more robust. QJM is therefore recommended for production environments. A JournalNode consumes very few resources, so no extra machines need to be dedicated to it; a few existing nodes in the Hadoop cluster can be chosen to run the JournalNodes.

This article is an English version of an article originally written in Chinese on aliyun.com and is provided for information purposes only.
