Introduction to the JobTracker HA Solution in CDH

Author: Dong | Sina Weibo: XI Cheng understands | This article may be reposted, provided the original source, author information, and copyright statement are indicated with a hyperlink.
URL: http://dongxicheng.org/mapreduce/cdh4-jobtracker-ha/

As we all know, the Hadoop JobTracker is a single point of failure, and there has never been a complete open-source solution to this. In practice, JobTracker fault tolerance is often left unaddressed, because the JobTracker is far less likely to fail than the NameNode.

In CDH 4.2.0, the latest release, Cloudera provides a complete JobTracker HA solution. This article introduces it.

Before introducing the CDH solution, here is the basic JobTracker HA workflow, which can be summarized as follows:

(1) The Active JobTracker records job runtime information in logs;

(2) If the Active JobTracker is found to be faulty, the system switches to a Standby JobTracker;

(3) The Standby JobTracker restores the job runtime information from the logs;

(4) The switchover is transparent to JobTracker clients (JobClient, TaskTracker, and web HTTP clients).

In almost all current Hadoop versions, steps (1) and (3) have been solved, while steps (2) and (4) have not.
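Steps (1) and (3) can be illustrated with a minimal sketch. It assumes a simple append-only log of JSON records; the `JobLog` class, the event names, and the job IDs are hypothetical and do not reflect Hadoop's actual log format.

```python
import json


class JobLog:
    """Append-only job log; a hypothetical stand-in for a log on shared storage."""

    def __init__(self):
        self.records = []

    def append(self, event, job_id):
        self.records.append(json.dumps({"event": event, "job": job_id}))


def recover_running_jobs(log):
    """Replay the log and return the jobs that were submitted but never finished."""
    running = []
    for line in log.records:
        record = json.loads(line)
        if record["event"] == "SUBMITTED":
            running.append(record["job"])
        elif record["event"] == "FINISHED" and record["job"] in running:
            running.remove(record["job"])
    return running


log = JobLog()
# Step (1): the Active JobTracker records job events as they happen.
log.append("SUBMITTED", "job_001")
log.append("SUBMITTED", "job_002")
log.append("FINISHED", "job_001")
# Step (3): after a switchover, the Standby JobTracker replays the same log.
recovered = recover_running_jobs(log)
print(recovered)
```

Because the log is the only state shared between the two JobTrackers in this sketch, the Standby can rebuild the list of unfinished jobs without ever talking to the failed Active.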

Cloudera's JobTracker HA solution (shown in an architecture figure in the original post) consists of the following modules:

(1) JobTrackerHADaemon

It runs on the JobTracker side and controls the start and stop of the JobTracker.

(2) JobTrackerHAServiceProtocol

It runs on the JobTracker side and is in effect an RPC server that receives and handles requests from MRHAAdmin (the administrator), such as switching a JobTracker to the Active or Standby state.

(3) MRHAAdmin

A toolkit for the administrator, who can use its commands to control the state of each JobTracker.

(4) JobTrackerProxies

A wrapper around the original RPC client that lets each client transparently send RPC requests to the new Active JobTracker when the Active JobTracker fails.

(5) JobTrackerHAHttpRedirector

Redirects HTTP requests from web clients. When the Active JobTracker fails, all requests addressed to it are redirected to the new Active JobTracker.
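The idea behind a client-side failover wrapper such as JobTrackerProxies can be sketched as follows. This is an illustration only, not Cloudera's implementation: `StubJobTracker`, `FailoverProxy`, and `JobTrackerDown` are hypothetical names, and the round-robin retry policy is an assumption.

```python
class JobTrackerDown(Exception):
    """Raised by a stub JobTracker that is not serving requests."""


class StubJobTracker:
    def __init__(self, name, active):
        self.name = name
        self.active = active

    def submit_job(self, job_id):
        if not self.active:
            raise JobTrackerDown(self.name)
        return f"{job_id} accepted by {self.name}"


class FailoverProxy:
    """Tries the current target first and moves to the next one on failure."""

    def __init__(self, targets):
        self.targets = targets
        self.current = 0

    def submit_job(self, job_id):
        for _ in range(len(self.targets)):
            try:
                return self.targets[self.current].submit_job(job_id)
            except JobTrackerDown:
                # Fail over: point at the next configured JobTracker and retry.
                self.current = (self.current + 1) % len(self.targets)
        raise JobTrackerDown("no active JobTracker")


jt1 = StubJobTracker("jt1", active=True)
jt2 = StubJobTracker("jt2", active=False)
proxy = FailoverProxy([jt1, jt2])
first = proxy.submit_job("job_001")   # served by jt1
jt1.active, jt2.active = False, True  # jt1 fails, jt2 becomes Active
second = proxy.submit_job("job_002")  # the caller's code does not change
print(first)
print(second)
```

The caller only ever holds the proxy, so the switch from `jt1` to `jt2` needs no change on the client side, which is what "transparent" means here.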

To upgrade or switch JobTrackers, the administrator only needs to run a few commands that set the current Active JobTracker to Standby and change a Standby JobTracker to Active. Hadoop's internal handling of this switchover is illustrated by a figure in the original post.
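As a rough illustration of the manual switchover (not Hadoop's internal logic), the sketch below models the two state transitions the administrator triggers. The `HAJobTracker` class and the `manual_failover` helper are hypothetical; the ordering, demote first and then promote, is an assumption chosen so that two JobTrackers are never Active at the same time.

```python
class HAJobTracker:
    """Stub with the two states named in the article: active and standby."""

    def __init__(self, name, state="standby"):
        self.name = name
        self.state = state

    def transition_to_active(self):
        self.state = "active"

    def transition_to_standby(self):
        self.state = "standby"


def manual_failover(current_active, target):
    """Demote the Active JobTracker first, then promote the chosen Standby."""
    if current_active.state != "active":
        raise ValueError(f"{current_active.name} is not active")
    if target.state != "standby":
        raise ValueError(f"{target.name} is not standby")
    current_active.transition_to_standby()
    target.transition_to_active()


jt1 = HAJobTracker("jt1", state="active")
jt2 = HAJobTracker("jt2")
manual_failover(jt1, jt2)
print(jt1.state, jt2.state)
```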

The preceding section describes the JobTracker HA architecture with manually triggered switchover. The architecture for automatic switchover using ZooKeeper is shown in a figure in the original post.

The overall architecture is almost unchanged, except that once ZooKeeper detects that the Active JobTracker is faulty, a new Active JobTracker is selected through an election algorithm and started.
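The article does not say which election algorithm is used. One common approach, assumed here purely for illustration, is a single ephemeral lock: whoever holds it is Active, and the lock vanishes when the holder's session ends. The sketch below simulates this in memory without a real ZooKeeper client; `LockService` and `elect` are hypothetical names.

```python
class LockService:
    """In-memory stand-in for a coordination service holding one ephemeral lock."""

    def __init__(self):
        self.holder = None

    def try_acquire(self, candidate):
        # Succeeds only if the lock is free or already held by this candidate.
        if self.holder is None:
            self.holder = candidate
        return self.holder == candidate

    def session_expired(self, candidate):
        # An ephemeral lock disappears when its owner's session ends.
        if self.holder == candidate:
            self.holder = None


def elect(lock, candidates):
    """Each healthy candidate tries the lock in turn; the winner becomes Active."""
    for name in candidates:
        if lock.try_acquire(name):
            return name
    return None


lock = LockService()
active = elect(lock, ["jt1", "jt2"])  # jt1 takes the lock first
lock.session_expired("jt1")           # jt1 fails and its lock is released
new_active = elect(lock, ["jt2"])     # jt2 is the only healthy candidate left
print(active, new_active)
```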

One obvious disadvantage of CDH's JobTracker HA solution is that its job recovery granularity is too coarse. JobTracker HA can recover jobs at three levels of granularity:

(1) Job level: after the JobTracker restarts, it automatically resubmits the jobs that were running, but every task of those jobs, whether it had completed, was running, or had not yet run before the restart, must be rerun;

(2) Completed-task level: after the JobTracker restarts, the tasks that had already completed are restored, but tasks that were running or had not yet run must be rescheduled;

(3) All-task level: after the JobTracker restarts, all jobs are restored to their previous state, that is, completed and running tasks keep their state, and only the tasks that had not yet run need to be scheduled.

These three levels are increasingly difficult to implement, but also increasingly beneficial. CDH 4.2.0 achieves only job-level recovery, the simplest approach with the smallest benefit.
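The difference between the three levels can be made concrete with a small sketch that computes which tasks must be scheduled again after a restart. The `Granularity` enum, the task state names, and the task IDs are hypothetical and chosen only for illustration.

```python
from enum import Enum


class Granularity(Enum):
    JOB = 1              # resubmit the job; every task runs again
    COMPLETED_TASKS = 2  # keep completed tasks; reschedule the rest
    ALL_TASKS = 3        # keep completed and running tasks


def tasks_to_schedule(task_states, granularity):
    """Return the task IDs that must be scheduled again after a restart."""
    if granularity is Granularity.JOB:
        keep = set()
    elif granularity is Granularity.COMPLETED_TASKS:
        keep = {"completed"}
    else:
        keep = {"completed", "running"}
    return sorted(t for t, state in task_states.items() if state not in keep)


states = {"t1": "completed", "t2": "running", "t3": "pending"}
for level in Granularity:
    print(level.name, tasks_to_schedule(states, level))
```

With job-level recovery all three tasks run again, while all-task recovery only has to schedule the pending one, which is why the finer levels pay off more for long-running jobs.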

References:

(1) CDH JobTracker HA introduction and installation guide:

https://ccp.cloudera.com/display/CDH4DOC/Configuring+High+Availability+for+the+JobTracker+(MRv1)

(2) CDH 4.2.0 source code download: http://archive.cloudera.com/cdh4/cdh/4/

Note that the CDH 4 release contains both MRv1 and MRv2 (YARN). Only MRv1 has a JobTracker HA implementation, but the source code of the libraries it depends on is in the MRv2 package (the MRv1 package ships them only as JAR files).

(3) CDH 4.2.0 MRv2 (YARN) source code download:

http://archive.cloudera.com/cdh4/cdh/4/hadoop-2.0.0-cdh4.2.0.tar.gz

(4) CDH 4.2.0 MRv1 source code download:

http://archive.cloudera.com/cdh4/cdh/4/mr1-2.0.0-mr1-cdh4.2.0.tar.gz

Original article. When reposting, please note: reposted from Dong's blog

Link: http://dongxicheng.org/mapreduce/cdh4-jobtracker-ha/

About the author: http://dongxicheng.org/about/

Copyright © 2013