01_hadoop Study Notes: Content Overview

Source: Internet
Author: User
Tags: sqoop

Hadoop Study Notes: Content Overview _00

1. These notes follow the "Enterprise-class Hadoop 1.x Application Development Foundation Course" by Teacher Dream Qi of Cloud sail Big Data, in the version from around April 2014.

2. This blog is adapted from Teacher Dream Qi's notes with my own changes, mainly to make my later review easier. It may also be of some help to peers who need it. Many thanks to Teacher Dream Qi.

3. Everything in this series was tested successfully on CentOS 6.4 + Hadoop 1.2.1.

4. Because I am new to this and have only a basic understanding of Linux, I ran into quite a few odd problems during the experiments. Those are written up in this blog as well.

5. The Cloud sail Big Data official website has published some public video resources that you can use for study.

First Topic

Linux system environment setup and basic command usage: the course uses virtual machines running the CentOS 6.4 64-bit operating system, and you should become familiar with the basic commands. One session.

Second to Fifth Topics (core of the Hadoop 1.x series, fundamentals)

Hadoop local (standalone) mode and pseudo-distributed mode installation: Hadoop 1.x theory, architecture, installation modes, understanding the HDFS file system, running the MapReduce WordCount program, how to view the Hadoop source code, the structure of the Hadoop 1.x packages, and so on. Three sessions.

HDFS architecture, shell operations, Java API usage, and application cases: an in-depth explanation of HDFS, including its architecture and design, its advantages and disadvantages, how it stores files, and how to access the HDFS file system through the HDFS shell command line and the Java API. It also covers some small enterprise cases, such as handling small-file storage and an analysis of a service like Baidu Network disk (using HDFS). Three to four sessions.
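As a rough illustration of why small-file storage gets its own case study, the sketch below counts how many blocks HDFS would need for a file of a given size. It assumes the Hadoop 1.x default block size of 64 MB. The class and method names (`HdfsBlockSketch`, `blockCount`, `totalBlocks`) are made up for this example and are not part of the Hadoop API.

```java
class HdfsBlockSketch {
    // Assumption: the Hadoop 1.x default block size of 64 MB.
    static final long DEFAULT_BLOCK_SIZE = 64L * 1024 * 1024;

    // Blocks needed for one file: ceil(fileSize / blockSize).
    // An empty file needs no blocks.
    static long blockCount(long fileSizeBytes, long blockSizeBytes) {
        if (fileSizeBytes < 0 || blockSizeBytes <= 0) {
            throw new IllegalArgumentException("sizes must be non-negative / positive");
        }
        return (fileSizeBytes + blockSizeBytes - 1) / blockSizeBytes;
    }

    // Total blocks for fileCount files that all have the same size.
    static long totalBlocks(long fileCount, long fileSizeBytes, long blockSizeBytes) {
        return fileCount * blockCount(fileSizeBytes, blockSizeBytes);
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024;
        // One large file is split into a handful of 64 MB blocks.
        System.out.println("one 1000 MB file:   "
                + blockCount(1000 * mb, DEFAULT_BLOCK_SIZE) + " blocks");
        // The same amount of data as small files needs one block per file.
        System.out.println("1000 files of 1 MB: "
                + totalBlocks(1000, mb, DEFAULT_BLOCK_SIZE) + " blocks");
    }
}
```

The NameNode keeps metadata for every file and every block in memory, so the same volume of data stored as many small files costs far more metadata than a few large files.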

Introduction to MapReduce, framework principles, in-depth study, and related MR interview questions: in-depth coverage of MapReduce, its architecture, execution flow, and execution details, plus writing MapReduce programs (WordCount): data types, input and output formats, Combiner, Partitioner, Sort and Group, interleaved with simple enterprise MapReduce use cases. Seven to eight sessions.
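The WordCount flow named above (map, partition, sort and group, reduce) can be sketched in plain Java without a cluster. This is a single-process illustration under stated assumptions, not the course's code and not the Hadoop API: the class `WordCountSketch` and its methods are hypothetical names. The `partition` method uses the same formula as Hadoop's default hash partitioning, `(hashCode & Integer.MAX_VALUE) % numReducers`.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class WordCountSketch {
    // Map phase: emit a (word, 1) pair for every whitespace-separated token.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (String token : line.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                out.add(new SimpleEntry<String, Integer>(token, 1));
            }
        }
        return out;
    }

    // Partition: choose the reducer index for a key (hash partitioning).
    static int partition(String key, int numReducers) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReducers;
    }

    // Sort and group: collect all values per key; TreeMap keeps keys sorted.
    static TreeMap<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        TreeMap<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
        }
        return grouped;
    }

    // Reduce phase: sum the counts collected for one word.
    static int reduce(String key, List<Integer> values) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }

    // Run the three phases in order over a list of input lines.
    static TreeMap<String, Integer> wordCount(List<String> lines) {
        List<Map.Entry<String, Integer>> mapped = new ArrayList<>();
        for (String line : lines) {
            mapped.addAll(map(line));
        }
        TreeMap<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> group : shuffle(mapped).entrySet()) {
            result.put(group.getKey(), reduce(group.getKey(), group.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> lines = new ArrayList<>();
        lines.add("hadoop hdfs");
        lines.add("hadoop mapreduce");
        System.out.println(wordCount(lines)); // prints {hadoop=2, hdfs=1, mapreduce=1}
    }
}
```

In a real job the framework performs the partitioning, sorting, and grouping between the map and reduce tasks. Here they are ordinary method calls so that each step can be read in isolation.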

Hadoop cluster installation and management, NameNode safe mode, and a walkthrough review of Hadoop 1.x: a course aimed at Hadoop operations engineers, covering cluster installation (building on the pseudo-distributed installation), NameNode safe mode, and the Hadoop administrator commands, as well as adding nodes (machines), decommissioning nodes (machines), and monitoring Hadoop clusters. Three sessions.

Sixth to Tenth Topics (Hadoop 1.x ecosystem, HBase and Hive)

Introduction to HBase, storage principles, shell commands, Java API operations, and application cases: HBase is a distributed database (a NoSQL database). It is comparable to an Oracle database but can store billions of rows of data. It supports near-real-time queries and integrates well with MapReduce for computing over and processing the data. Covers architecture, access (shell and API), MapReduce, and management in depth. About four sessions.

ZooKeeper cluster installation, HBase review, and MySQL 5.1 installation and basic use: mainly groundwork for the basic theory of HBase and Hive. ZooKeeper coordinates HBase, and MySQL is used to manage Hive metadata. Two sessions.

Hive installation, metadata configuration, learning HiveQL statements, and application cases.

Walkthrough review of HDFS, MapReduce, HBase, and Hive, plus Sqoop installation and data import and export: an overall walkthrough review of Hadoop, HBase, and Hive, covering how they are used in the enterprise, what to consider, and how to combine the three. Sqoop is used to import and export data, transferring it in both directions between a relational database and HBase and Hive. Three sessions.

Troubleshooting summary, and installing and using the Azkaban task scheduler: troubleshooting for the entire Hadoop 1.x course, a project walkthrough, and an explanation of the task scheduling framework for managing jobs and managing Hive.

11th Topic

Introduction to Hadoop 2.2.0, cluster installation, and commercial versions of Hadoop: an introduction to Hadoop 2.x, using Hadoop 2.4.0 for the basics. Theory: how it differs from Hadoop 1.x and where its advantages lie. Installation: distributed installation, plus testing HDFS and MapReduce programs. Introduction to commercial versions of Hadoop: alongside the open-source Apache Hadoop release, there are distributions from CDH, Hortonworks, Intel, Huawei, and IBM. Two sessions.

12th Topic

Cloudera Hadoop introduction, CM4.8 installation, and deploying CDH4.5: an introduction to CDH, the commercial version of Hadoop, and to installing the management tool CM. Two sessions.
