Upgraded edition: Hadoop Hands-On Development (cloud storage, MapReduce, HBase, Hive applications, Storm applications)

Source: Internet
Author: User

Hadoop is a distributed system infrastructure developed by the Apache Software Foundation. It lets users write distributed programs without understanding the underlying details of distribution, and use the full capacity of a cluster for high-speed computation and storage. Hadoop implements a distributed file system, the Hadoop Distributed File System (HDFS). HDFS is highly fault-tolerant and is designed to be deployed on low-cost hardware. It provides high-throughput access to application data, which suits applications with very large data sets. HDFS also relaxes some POSIX requirements so that file system data can be accessed as a stream (streaming access).


Hadoop is a software framework for the distributed processing of large amounts of data, and it does this in a way that is reliable, efficient, and scalable. Hadoop is reliable because it assumes that compute elements and storage will fail, so it maintains multiple copies of working data and can redistribute processing away from failed nodes. Hadoop is efficient because it processes data in parallel, which speeds up processing. Hadoop is also scalable and can handle petabytes of data. In addition, Hadoop runs on commodity servers, so it costs less and can be used by anyone.
Hadoop's framework is written in Java, and it is well suited to running on Linux in production. This course is taught on a Linux platform that simulates real-world scenarios.
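The reliability claim above (multiple copies of data so that work can continue when nodes fail) can be illustrated with a small in-memory sketch. It does not use the Hadoop API. The class name `ReplicaPlacementSketch`, the methods `placeBlocks` and `allBlocksReadable`, the node names, and the round-robin placement rule are all assumptions made for illustration; real HDFS placement is rack-aware and more involved.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical toy model of block replication; not part of Hadoop.
public class ReplicaPlacementSketch {

    // Assign each block to `replication` distinct nodes, round-robin.
    // Assumes the node names in `nodes` are distinct.
    static Map<Integer, List<String>> placeBlocks(int blockCount, List<String> nodes, int replication) {
        if (replication < 1 || replication > nodes.size()) {
            throw new IllegalArgumentException("replication must be between 1 and the node count");
        }
        Map<Integer, List<String>> placement = new LinkedHashMap<>();
        for (int block = 0; block < blockCount; block++) {
            List<String> holders = new ArrayList<>();
            for (int r = 0; r < replication; r++) {
                holders.add(nodes.get((block + r) % nodes.size()));
            }
            placement.put(block, holders);
        }
        return placement;
    }

    // A block stays readable as long as at least one of its holders is alive.
    static boolean allBlocksReadable(Map<Integer, List<String>> placement, Set<String> failedNodes) {
        for (List<String> holders : placement.values()) {
            boolean alive = false;
            for (String node : holders) {
                if (!failedNodes.contains(node)) {
                    alive = true;
                    break;
                }
            }
            if (!alive) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<String> nodes = List.of("node1", "node2", "node3", "node4");
        Map<Integer, List<String>> placement = placeBlocks(6, nodes, 3);
        System.out.println(placement);
        // Two failed nodes: every block still has a surviving copy.
        System.out.println(allBlocksReadable(placement, Set.of("node1", "node2")));
        // Three failed nodes: block 0 (held by node1..node3) is lost.
        System.out.println(allBlocksReadable(placement, Set.of("node1", "node2", "node3")));
    }
}
```

With three copies per block, any two nodes in this toy cluster can fail without data loss, while a third failure can make a block unreadable.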

Highlight one: A fully upgraded course

This course is the upgraded edition of two popular earlier courses, "In-Depth Hadoop Real-World Development" and "Hadoop Application Development in Practice". It adds newer Hadoop features such as NameNode HA, HDFS Federation, and YARN, and introduces Storm as new material. The case studies keep and strengthen the classic applications from the earlier courses and add further classic cases.

Highlight two: Comprehensive technical coverage and a complete knowledge system

The course is designed to round out the Hadoop knowledge system. It focuses on the technologies that are most widely used, most in-depth, and most practical in real development, so that you can reach a new level of technical skill and enter the world of cloud computing. On the technical side you will learn: Hadoop cluster basics; HDFS principles; basic HDFS commands; the NameNode working mechanism; basic HDFS configuration and management; MapReduce principles; HBase system architecture; HBase table structure; using MapReduce with HBase; advanced MapReduce programming; a Hive primer; combining Hive with MapReduce; Hadoop cluster installation; NameNode HA; HDFS Federation; and many other topics.

Highlight three: Fundamentals + hands-on practice = application, balancing study and practice

Each stage of the course includes a practical project so that students can quickly learn to apply what was covered. In the first stage, the course builds on HDFS to explain the design of an image server and how to operate on HDFS through the Java API. In the second stage, it uses HBase to implement the features of a microblog project, so that learners can bring together what they have learned. In the third stage, HBase and MapReduce are combined to implement a call record query and statistics system. In the fourth stage, a hands-on Hive data statistics system helps students master advanced Hive usage in the shortest possible time.

Highlight four: A lecturer with extensive cloud platform operations experience at a telecom group

The lecturer, Ming Yi, has extensive experience at a telecommunications group, is currently responsible for all aspects of its cloud platform, and has many years of in-house training experience. The lectures stay close to real enterprise needs rather than theory on paper.

Hadoop version: Hadoop 2.4.1
Hive version: Hive 0.13.1
HBase version: HBase 0.98.6.1
CentOS version: 6.5

01, Course introduction, HDFS architecture and principles, building the CentOS development environment

> Hadoop Background

> HDFS design objectives, application scenarios, and architecture analysis

> Installing a CentOS virtual machine with virtual

> Virtual Machine Environment configuration

02, HDFS installation and configuration for standalone and cluster modes

> Hadoop Standalone installation and configuration

> Hadoop cluster installation and configuration

> Use of Hadoop command line and WebUI

03, HDFS application: cloud storage system (1)

> Cloud storage system introduction and basic architecture

> Building the Eclipse and Maven development environment

> Creating and configuring Struts2 apps with Maven

> Using Bootstrap to build a UI framework

04, HDFS application: cloud storage system (2)

> Installing and Configuring Redis

> User Management Module Development

05, HDFS application: cloud storage system (3)

> Gson introduction and usage examples

> Implementing upload, deletion, and download of ordinary files

06, HDFS application: cloud storage system (4)

> Implementing HDFS-based uploads, downloads, and deletions

> HDFS small-file management methods: SequenceFile and HAR

07, HDFS in depth: NameNode and DataNode

> Introduction to the HDFS architecture

> How HDFS reads and writes files

> FsImage and EditLog

> Rack awareness

> Basic HDFS management

08, HDFS in depth: HDFS Federation

> HDFS node management

> HDFS upgrade and rollback

> HDFS Federation

> How to use ViewFs

09, NameNode HA

> ZooKeeper configuration

> NameNode HA (two-machine) installation and configuration

10, YARN and MapReduce

> Configuring YARN (standalone and cluster)

> How MapReduce works

> First MapReduce program

> YARN command-line tools

11, MapReduce application: search suggestions (1)

> Introduction to the working principles (AJAX)

> Using jQuery's Autocomplete widget to build the UI

12, MapReduce application: search suggestions (2)

> Inherit the MapReduce program

> Using Redis to save intermediate data

> How to compute statistics for incremental and full data

> Introduction to the "potential friend recommendation" algorithm

13, MapReduce sampling tools and partitioning

> How sampling and partitioning work

> RandomSampler, InputSampler, IntervalSampler

> TotalOrderPartitioner (global sort)

14, Map join and reduce join

> Reduce-side join

> Map-side join

> How to define custom data types

> How to use DistributedCache

15, MapReduce application: PageRank

> The PageRank algorithm in detail

> How to implement the PageRank algorithm with MapReduce

16, Introduction to Hive

> Hive's architecture

> Introduction to the CLI, Hive Server, and HWI

> Configuring Hive to use MySQL to store metadata

> Basic use of the CLI

17, Hive application: search suggestions (1)

> Tomcat log parsing

> Using regular expressions to parse the Tomcat log

> Using regular expressions in queries

18, Hive application: search suggestions (2)

> Calling Python scripts in Hive queries for Redis insertions

19, HQL (1)

> HQL basics: DDL and DML

> Data types: primitive types and collections

> TextFile default encoding and custom encoding

20, HQL (2)

> Hive queries

> Regular expressions, basic functions, aggregate functions, table functions

> Nested queries, CASE WHEN statements, LIKE and RLIKE

> GROUP BY, HAVING, etc.

21, Hive custom functions

> How to write custom functions

> Connecting to Redis in custom functions

> Using Cachefile in custom functions

22, Compression in Hadoop

> Introduction to compression in Hadoop

> Using compression in MapReduce and Hive

> Installing and configuring LZO

23, 24, Introduction to HBase

> HBase architecture

> HBase cluster installation

> Using the HBase shell

25, 26, 27, HBase application: call record query

> HBase Java API

> Struts2 and JSP

> jQuery DataTables and Datepicker

28, 29, 30, HBase application: microblog (Weibo)

> Table structure design

> Follow friends

> Tweet

> My homepage

31, 32, Getting started with Storm

> Introduction to Storm architecture and principles

> Installing Storm

> Implementing the first topology

> Storm groupings

33, Queue spout and DRPC

34, 35, Storm application: voice call record billing

> Building the billing topology

> Implementing the queue spout and the bolt that writes to MySQL

> Implementing roaming and long-distance type calculations

> Implementing the billing logic

> Integrating all functions into the topology

> Verifying each function module
