Hadoop cluster (CDH4) Practice (0) Preface

Source: Internet
Author: User

Directory structure
Hadoop cluster (CDH4) Practice (0) Preface
Hadoop cluster (CDH4) Practice (1) Hadoop (HDFS) Construction
Hadoop cluster (CDH4) Practice (2) HBase & Zookeeper Construction
Hadoop cluster (CDH4) Practice (3) Hive Construction
Hadoop cluster (CDH4) Practice (4) Oozie Construction
Hadoop cluster (CDH4) Practice (5) Sqoop Installation

Content
When I was first learning Hadoop, I wrote a series of introductory articles, beginning with "Hadoop cluster practice (0) complete architecture design".
That series explained a number of Hadoop concepts, mostly in response to questions I had run into myself.
It also included some small hands-on demos that helped deepen my understanding of the various tools.

So why write a new series that seems to repeat the old one?
The main reasons are as follows:
1. The previous series was based on Ubuntu 10.10. It also applies to newer Ubuntu releases, but CentOS is more commonly used in production environments;
In addition, some of Ubuntu's changes are out of step with the wider open-source community, so there is a tendency to be pessimistic about Ubuntu's prospects.
2. With the standardization and rapid growth of EPEL and other extension repositories, CentOS now has a software library as rich as Ubuntu's, and installing and deploying software through YUM is very convenient;
3. The previous series was based on CDH3. As Hadoop has developed, CDH4 has become the mainstream release and offers features that CDH3 lacks. The ones I consider most useful are:
A) NameNode HA: unlike the Secondary NameNode, which only checkpoints metadata and cannot take over on failure, CDH4 provides a high-availability mode that runs two NameNodes;
B) TaskTracker fault tolerance: a mechanism that keeps an error on a single node from causing an entire parallel computation to fail;

Therefore, this series is based on CDH4 running on CentOS 6.4 x86_64.
However, I have not yet finished testing NameNode HA and TaskTracker fault tolerance.
Also, this series does not use YARN. It uses the same MRv1 computing framework as CDH3, so that code developed for the company's existing production environment continues to run correctly.

Next, let's begin the hands-on walkthrough:
Hadoop cluster (CDH4) Practice (1) Hadoop (HDFS) Construction
Hadoop cluster (CDH4) Practice (2) HBase & Zookeeper Construction
Hadoop cluster (CDH4) Practice (3) Hive Construction
Hadoop cluster (CDH4) Practice (4) Oozie Construction
Hadoop cluster (CDH4) Practice (5) Sqoop Installation
