I. Getting Started with Hadoop: What Hadoop Is
1. The background behind Hadoop's creation
2. Hadoop's place in big data and cloud computing, and how they relate
3. Hadoop application case studies in China and abroad
4. The Hadoop job market in China and an overview of the course outline
5. Overview of distributed systems
6. Introduction to the Hadoop ecosystem and its components
7. A walkthrough of Hadoop's core: a MapReduce example

II. HDFS, the Distributed File System: A Basic Course for Database Administrators
1. Introduction to the distributed file system HDFS
2. The components of HDFS
3. HDFS internals in detail
4. Replica storage policy and routing rules
5. NameNode Federation
6. The command-line interface
7. The Java interface
8. Data flow between the client and HDFS
9. HDFS high availability (HA)
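To make the replica storage policy above concrete, here is a toy Python sketch (not Hadoop code) of how HDFS splits a file into blocks and replicates each block, assuming the common Hadoop 2.x defaults of a 128 MB block size and a replication factor of 3:

```python
import math

BLOCK_SIZE = 128 * 1024 * 1024  # default HDFS block size in Hadoop 2.x (128 MB)
REPLICATION = 3                 # default replication factor

def hdfs_storage(file_size_bytes):
    """Return (number of blocks, total bytes stored across all replicas)."""
    blocks = math.ceil(file_size_bytes / BLOCK_SIZE)
    # Each block is stored REPLICATION times on different DataNodes;
    # only the actual data is replicated, not padding to a full block.
    return blocks, file_size_bytes * REPLICATION

# A 300 MB file occupies 3 blocks (128 + 128 + 44 MB) and 900 MB of raw storage.
blocks, raw_bytes = hdfs_storage(300 * 1024 * 1024)
print(blocks, raw_bytes // (1024 * 1024))  # 3 900
```

The same arithmetic explains why HDFS prefers a small number of large files: every block, however small, costs NameNode metadata and three replicas.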
III. Primary MapReduce: A Basic Course for Hadoop Developers
1. Understanding the map and reduce computation model
2. Analyzing the execution of a MapReduce job in pseudo-distributed mode
3. The YARN model
4. Serialization
5. MapReduce types and formats
6. Setting up a MapReduce development environment
7. MapReduce application development
8. More examples, to become familiar with MapReduce algorithm principles

IV. Advanced MapReduce: A Key Course for Advanced Hadoop Developers
1. Reducing input size with splittable compression
2. Reducing intermediate data with a combiner
3. Writing a partitioner to improve load balancing
4. Customizing the sort order
5. Customizing grouping rules
6. MapReduce optimization
7. Hands-on programming practice
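The map/reduce model, the combiner, and the partitioner from the two sections above can be sketched with a toy in-memory word count. This is plain Python, not the Hadoop Java API; it only shows how the three user-supplied pieces fit together:

```python
from collections import defaultdict

def map_fn(line):
    """Map: emit a (word, 1) pair for every word in the input line."""
    for word in line.split():
        yield word.lower(), 1

def combiner(pairs):
    """Combiner: pre-aggregate map output locally to shrink intermediate data."""
    counts = defaultdict(int)
    for word, n in pairs:
        counts[word] += n
    return counts.items()

def partitioner(word, num_reducers):
    """Partitioner: decide which reducer receives this key (default: by hash)."""
    return hash(word) % num_reducers

def run_job(lines, num_reducers=2):
    # Map phase, with a combiner run per "mapper" (here: per line, for simplicity).
    partitions = [defaultdict(int) for _ in range(num_reducers)]
    for line in lines:
        for word, n in combiner(map_fn(line)):
            partitions[partitioner(word, num_reducers)][word] += n
    # Reduce phase: each partition is reduced independently; merged here for display.
    result = {}
    for part in partitions:
        for word, total in part.items():
            result[word] = total
    return result

print(run_job(["the quick fox", "the lazy dog"]))
```

Note how the combiner shrinks traffic without changing the answer (summing counts is associative), and the partitioner only changes *where* each key is reduced, never the final totals.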
V. Hadoop Cluster Setup and Management: An Advanced Course for Database Administrators
1. Building a Hadoop cluster
2. Monitoring a Hadoop cluster
3. Managing a Hadoop cluster
4. Running MapReduce programs on the cluster

VI. ZooKeeper Basics: Building Blocks for Distributed Systems
1. The ZooKeeper architecture
2. Installing a ZooKeeper cluster
3. Operating ZooKeeper
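One sizing rule worth knowing before installing a ZooKeeper cluster: the ensemble stays available only while a strict majority (a quorum) of its servers are up, which is why ensembles use an odd number of nodes. A quick Python sketch of that arithmetic (not ZooKeeper code):

```python
def quorum(ensemble_size):
    """Smallest majority of an ensemble: more than half the servers."""
    return ensemble_size // 2 + 1

def failures_tolerated(ensemble_size):
    """How many servers may fail while a quorum still survives."""
    return ensemble_size - quorum(ensemble_size)

# 3 servers tolerate 1 failure and 5 tolerate 2, but 4 still tolerate only 1:
# an even-sized ensemble buys no extra fault tolerance over the odd size below it.
for n in (3, 4, 5):
    print(n, failures_tolerated(n))
```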
VII. HBase Basics: A Column-Oriented, Real-Time Distributed Database
1. What HBase is
2. HBase vs. RDBMS
3. The data model
4. System architecture
5. MapReduce on HBase
6. Table design

VIII. HBase Cluster Setup and Management
1. Building the cluster, step by step
2. Monitoring the cluster
3. Managing the cluster
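To clarify the data model listed under HBase basics: a table maps a row key to column families, each family holds column qualifiers, and every cell keeps timestamped versions. A toy in-memory Python sketch of that layout (this is the shape of the model, not the HBase API):

```python
# table[row_key][family][qualifier] -> list of (timestamp, value), newest first
table = {}

def put(row, family, qualifier, value, ts):
    """Write one cell version, keeping versions sorted newest-first."""
    cell = table.setdefault(row, {}).setdefault(family, {}).setdefault(qualifier, [])
    cell.append((ts, value))
    cell.sort(reverse=True)

def get(row, family, qualifier):
    """Return the latest version of a cell, as a default HBase Get would."""
    versions = table.get(row, {}).get(family, {}).get(qualifier, [])
    return versions[0][1] if versions else None

put("user1", "info", "city", "Beijing", ts=1)
put("user1", "info", "city", "Shanghai", ts=2)
print(get("user1", "info", "city"))  # newest version wins: Shanghai
```

This also hints at the table-design point above: rows are sorted by row key, so choosing a good row key is the single most important design decision.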
IX. The HBase Client
1. The HBase shell, with a demo
2. The Java client, with a code demo

X. Pig Basics: Another Computation Framework on Hadoop
1. Pig overview
2. Installing Pig
3. Using Pig to compute mobile-phone traffic statistics
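The phone-traffic job in point 3 is essentially a group-and-sum: in Pig Latin it would be a `GROUP ... BY phone` followed by `FOREACH ... GENERATE SUM(...)`. Here is a plain-Python sketch of the same computation, with a hypothetical record layout (phone number, upload bytes, download bytes) invented for illustration:

```python
from collections import defaultdict

# Hypothetical log records: (phone_number, upload_bytes, download_bytes)
records = [
    ("13700000001", 120, 4800),
    ("13700000002", 50, 900),
    ("13700000001", 300, 1500),
]

def traffic_stats(records):
    """Total upstream and downstream bytes per phone number."""
    totals = defaultdict(lambda: [0, 0])
    for phone, up, down in records:
        totals[phone][0] += up
        totals[phone][1] += down
    return {phone: tuple(t) for phone, t in totals.items()}

print(traffic_stats(records))  # {'13700000001': (420, 6300), '13700000002': (50, 900)}
```

Pig's value is that it compiles exactly this kind of group-and-aggregate script into MapReduce jobs, so you write a few lines instead of a full Java job.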
XI. Hive: SQL-Style Computation on Hadoop
1. Data warehouse fundamentals
2. What Hive is
3. Introduction to the Hive architecture
4. Hive clusters
5. Introduction to the clients
6. What HiveQL is
7. HiveQL vs. SQL
8. Data types
9. Tables and table partitioning
10. Table operations, with a CLI demo
11. Importing data, with a CLI demo
12. Querying data, with a CLI demo
13. Joins, with a CLI demo
14. Developing user-defined functions (UDFs), with a demonstration

XII. Sqoop: Data Transfer Between Hadoop and an RDBMS
1. Configuring Sqoop
2. Using Sqoop to import data from MySQL into HDFS
3. Using Sqoop to export data from HDFS to MySQL

XIII. Storm
1. Storm basics: core concepts, application scenarios, architecture and fundamentals, Storm vs. Hadoop
2. Storm cluster setup: installation in detail and common installation problems
3. Storm components: spouts, bolts, stream groupings, and more
4. Message reliability in Storm: re-sending failed messages
5. Integrating Hadoop 2.0 with Storm: Storm on YARN
6. Hands-on Storm programming
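To illustrate the spout and bolt components in the Storm outline, here is a toy Python sketch of a word-count topology's data flow: a spout emits sentences, a split bolt tokenizes them, and a count bolt keeps running totals. A real topology wires these up through the Storm Java API with stream groupings; this sketch shows only the flow of tuples, not the API:

```python
from collections import defaultdict

def sentence_spout():
    """Spout: the source of the stream; emits tuples (here, sentences)."""
    yield from ["storm on yarn", "storm and hadoop"]

def split_bolt(sentences):
    """Bolt: consumes sentence tuples, emits one tuple per word."""
    for sentence in sentences:
        yield from sentence.split()

def count_bolt(words):
    """Bolt: keeps a running count per word. In a real topology, a fields
    grouping on the word would route every copy of a word to the same task."""
    counts = defaultdict(int)
    for word in words:
        counts[word] += 1
    return dict(counts)

print(count_bolt(split_bolt(sentence_spout())))
```

Unlike a MapReduce job, a Storm topology never terminates on its own: the spout keeps emitting and the bolts keep updating state, which is what makes Storm suitable for continuous, low-latency processing.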
XIV. Forum Log Analysis Project
The project's data comes from the logs of a website forum, prepared specifically for this course, and is well suited to practicing what the Hadoop course teaches. Some students feel more projects should be included, but after working through a few you will find that projects follow the same basic approach and differ only in the business domain. Once you have built this project yourself, you will have a much clearer picture of how the Hadoop frameworks are used in a real project, and of how Hadoop combines with Java EE.