How to learn Hadoop? Hadoop Development

Source: Internet
Author: User

Hadoop is a platform for storing massive amounts of data on distributed server clusters and running distributed analytics applications. Its core components are HDFS and MapReduce. HDFS is a distributed file system that stores data across the machines of a cluster and provides access to it; MapReduce is a computing framework that splits a computing job into tasks and distributes them across the cluster through a task scheduler.
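
The split, distribute, and combine idea behind MapReduce can be sketched in plain Java. This is only an in-memory illustration under stated assumptions: the class name MiniMapReduce and its methods are hypothetical, nothing here uses the Hadoop API, and word counting is chosen simply because it is a common teaching example.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory illustration; this is NOT the Hadoop API.
class MiniMapReduce {

    // Map phase: turn one input line into (word, 1) pairs.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                pairs.add(new AbstractMap.SimpleEntry<>(word.toLowerCase(), 1));
            }
        }
        return pairs;
    }

    // Shuffle phase: group all values by key, with keys kept in sorted order.
    static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
        }
        return grouped;
    }

    // Reduce phase: sum the values collected for one key.
    static int reduce(List<Integer> values) {
        int sum = 0;
        for (int value : values) {
            sum += value;
        }
        return sum;
    }

    // Driver: each string stands in for one input split handled by one map task.
    static Map<String, Integer> run(List<String> splits) {
        List<Map.Entry<String, Integer>> mapped = new ArrayList<>();
        for (String split : splits) {
            mapped.addAll(map(split));
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> group : shuffle(mapped).entrySet()) {
            result.put(group.getKey(), reduce(group.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(run(Arrays.asList("hadoop stores data", "hadoop processes data")));
    }
}
```

Running main prints {data=2, hadoop=2, processes=1, stores=1}. In a real cluster the map and reduce calls would run as separate tasks on different machines, with the input read from HDFS; here everything runs in one process to keep the flow visible.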

Hadoop is an essential framework for big data development, so if you want to learn big data, you have to master Hadoop. So what does learning Hadoop involve?

First, Hadoop Environment Setup

1. Introduction to the Hadoop ecosystem

2. Hadoop's position and role in cloud computing

3. Introduction to Hadoop application cases at home and abroad

4. Hadoop concepts, versions, and history

5. Introduction to Hadoop's core components and the HDFS and MapReduce architecture

6. Hadoop standalone mode installation and testing

7. The cluster structure of Hadoop

8. Detailed installation steps for Hadoop pseudo-distributed mode

9. View Hadoop from the command line and browser

10. Analysis of the Hadoop startup scripts

11. Building a fully distributed Hadoop environment

12. Introduction to Hadoop safe mode and the recycle bin (trash)

Second, HDFS Architecture, Shell, and Java Operations

1. How the HDFS layer works

2. HDFS DataNode and NameNode in detail

3. Single point of failure (SPOF) and high availability (HA)

4. Accessing HDFS via the API

5. Introduction to common compression algorithms, their installation, and use

6. Maven introduction and installation; using Maven in Eclipse to build a local Maven repository

Third, MapReduce Learning

1. Introduction to the four stages of MapReduce

2. Job and Task description

3. Default working mechanism

4. Develop an MR application that finds the highest temperature of the year

5. Run the MR job on Windows

6. Mapper and Reducer

7. Input splits and output splits

8. Shuffle: sort, partitioner, grouping, combiner

9. Debug the program with counters

10. Installing Hadoop on Windows

11. Install the Hadoop plugin in Eclipse and access Hadoop resources

12. Write an Ant script in Eclipse

13. The YARN scheduling framework and its event dispatch mechanism

14. Remote debugging explorer

15. Protocol analysis of Google protobuf, which underlies Hadoop

16. Hadoop's underlying IPC principle and RPC
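
Several of the MapReduce topics above (the highest-temperature job, the partitioner step of the shuffle, and counters) can be previewed with a small plain-Java sketch. Everything here is an assumption made for illustration: the class MaxTemperatureSketch, the "year,temperature" record format, and the malformedRecords field are hypothetical, and the code runs in memory without the Hadoop API. The partition method follows the hash-and-modulo formula that Hadoop's default HashPartitioner is known for.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory sketch; plain Java, NOT the Hadoop API.
class MaxTemperatureSketch {

    // Plays the role of a counter: how many records the map step had to skip.
    static int malformedRecords = 0;

    // Partitioner idea: choose a reducer index from the key's hash.
    // Masking with Integer.MAX_VALUE keeps the result non-negative.
    static int partition(String key, int numReducers) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReducers;
    }

    // Map step: parse "year,temperature" lines (an assumed format).
    // Reduce step: keep the maximum temperature seen for each year.
    static Map<String, Integer> maxPerYear(List<String> lines) {
        malformedRecords = 0;
        Map<String, Integer> maxByYear = new TreeMap<>();
        for (String line : lines) {
            String[] fields = line.split(",");
            if (fields.length != 2) {
                malformedRecords++; // wrong number of fields
                continue;
            }
            try {
                int temperature = Integer.parseInt(fields[1].trim());
                maxByYear.merge(fields[0].trim(), temperature, Math::max);
            } catch (NumberFormatException e) {
                malformedRecords++; // temperature is not a number
            }
        }
        return maxByYear;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("1950,0", "1950,22", "1949,111", "1949,78", "bad-line");
        System.out.println(maxPerYear(input));
        System.out.println("malformed records: " + malformedRecords);
        System.out.println("reducer for 1950: " + partition("1950", 3));
    }
}
```

With the sample input, the result is {1949=111, 1950=22} and one malformed record is counted. In a real job, the counter would be reported by the framework alongside the job output, which is what makes counters useful for debugging bad input.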

Fourth, Hadoop High Availability (HA)

1. Introduction to the Hadoop 2.x cluster architecture

2. Hadoop 2.x cluster construction

3. High Availability (HA) for NameNode

4. HDFS Federation

5. High Availability (HA) for ResourceManager

6. Hadoop Cluster FAQs and workarounds
