Study this big data technology course with confidence. Let's get started.

Source: Internet
Author: User
Tags: object serialization

If you are confident that you can stick with your studies, you can start now!

I. Big Data Technology Basics

1. Linux operation Basics

  • Introduction and installation of Linux
  • Common Linux commands-File Operations
  • Common Linux commands-user management and permissions
  • Common Linux commands-system management
  • Common Linux commands-passwordless login configuration and network management
  • Install common software on Linux
  • Linux local Yum source configuration and Yum Software Installation
  • Linux Firewall Configuration
  • Linux advanced text processing commands: cut, sed, awk
  • Linux scheduled tasks with crontab

2. shell programming

  • Shell programming-basic syntax
  • Shell programming-Process Control
  • Shell programming-Functions
  • Shell programming-comprehensive case-automated deployment script

3. In-memory database Redis

  • Introduction to Redis and NoSQL
  • Redis client connection
  • Redis string-type data structure operations and application case-object cache
  • Redis list-type data structure operations and application case-task scheduling queue
  • Redis hash and set data structure operations and application case-shopping cart
  • Redis sorted set data structure operations and application case-ranking list

4. Distributed coordination service ZooKeeper

  • Introduction and application scenarios of ZooKeeper
  • ZooKeeper cluster installation and deployment
  • ZooKeeper data nodes and command line operations
  • Java client basic operations and event listening for ZooKeeper
  • ZooKeeper core mechanism and data nodes
  • ZooKeeper application case-distributed shared resource lock
  • ZooKeeper application case-dynamic detection of servers going online/offline
  • Data consistency principle and leader election mechanism of ZooKeeper

5. Java advanced features

  • Java multithreading basics
  • Java synchronization keywords
  • Java concurrency package: thread pools and their application in open source software
  • Java concurrency package: message queues and their application in open source software
  • Java JMS technology
  • Java dynamic proxies and reflection

6. Lightweight RPC framework development

  • RPC principle Learning
  • NIO principle Learning
  • Netty common API Learning
  • Lightweight RPC framework requirement analysis and Principle Analysis
  • Lightweight RPC framework development

II. Offline Computing System

1. Hadoop Quick Start

  • Hadoop background
  • Distributed System Overview
  • Offline data analysis process
  • Cluster creation
  • Cluster usage

2. HDFS Enhancement

  • HDFS concepts and features
  • Shell (command line client) operations of HDFS
  • HDFS Working Mechanism
  • Working mechanism of the NameNode
  • Java API operations
  • Case 1: Develop shell collection scripts

3. MapReduce details

  • Customize the RPC framework of Hadoop
  • MapReduce programming specifications and examples
  • MapReduce program running modes and debugging methods
  • Internal mechanism of MapReduce program running modes
  • Main workflow of the MapReduce computing framework
  • Custom object serialization method
  • MapReduce programming case

4. MapReduce Enhancement

  • MapReduce sorting
  • Custom partitioner
  • MapReduce combiner
  • MapReduce working mechanism

5. MapReduce practice

  • MapTask parallelism mechanism-input file splits
  • MapTask parallelism settings
  • Inverted index
  • Mutual friends
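
The inverted index exercise listed above can be sketched without a cluster. The following is a minimal, hedged illustration in plain Python that imitates the map, shuffle, and reduce phases in memory; it is not Hadoop code, and the function names (`map_phase`, `shuffle`, `reduce_phase`, `build_inverted_index`) and the sample documents are hypothetical, chosen only for this example.

```python
from collections import defaultdict


def map_phase(doc_name, text):
    """Emit one ((word, doc_name), 1) pair per word occurrence."""
    for word in text.lower().split():
        yield (word, doc_name), 1


def shuffle(pairs):
    """Group values by key, imitating what a framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Sum the counts per (word, doc) key and collect them as word -> {doc: count}."""
    index = defaultdict(dict)
    for (word, doc_name), counts in grouped.items():
        index[word][doc_name] = sum(counts)
    return index


def build_inverted_index(docs):
    """Run the three phases over a dict of {document name: text}."""
    pairs = []
    for doc_name, text in docs.items():
        pairs.extend(map_phase(doc_name, text))
    return reduce_phase(shuffle(pairs))


if __name__ == "__main__":
    # Hypothetical sample input, not taken from the course material.
    docs = {"a.txt": "hello hadoop hello", "b.txt": "hello hive"}
    index = build_inverted_index(docs)
    for word in sorted(index):
        print(word, index[word])
```

In a real MapReduce job the shuffle step is performed by the framework, and the per-document counting is often split across two chained jobs; this sketch collapses that into one pass for readability.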

6. Federation introduction and Hive usage

  • Hadoop HA mechanism
  • HA cluster installation and deployment
  • Cluster O&M test-DataNodes dynamically going online/offline
  • Cluster O&M test-NameNode state switch management
  • Cluster O&M test-data block balancing
  • Changes to the HDFS API under HA
  • Hive introduction
  • Hive architecture
  • Install and deploy Hive
  • Initial use of Hive

7. Hive enhancement and Flume introduction

  • HQL-DDL basic syntax
  • HQL-DML basic syntax
  • Hive join
  • Hive parameter configuration
  • Hive user-defined functions and transform
  • Analysis of HQL execution in Hive, with examples
  • Notes on Hive best practices
  • Hive optimization strategies
  • Hive case studies
  • Flume Introduction
  • Install and deploy Flume
  • Case: Collect directories to HDFS
  • Case: collect files to HDFS

III. Stream computing

1. Storm: from getting started to proficiency

  • What is storm?
  • Storm Architecture Analysis
  • Storm programming model, tuple source code, concurrency Analysis
  • Storm wordcount case and common API analysis
  • Storm cluster deployment practices
  • Storm + Kafka + Redis business indicator calculation
  • Storm source code download and compilation
  • Storm cluster startup and source code analysis
  • Storm task submission and source code analysis
  • Storm data sending Process Analysis
  • Storm Communication Mechanism Analysis
  • Storm message Fault Tolerance Mechanism and source code analysis
  • Storm multi-Stream Project Analysis
  • Write your own streaming task execution framework

2. Storm upstream and downstream and architecture Integration

  • What is a message queue?
  • Kafka core components
  • Kafka cluster deployment practices and common commands
  • Overview of Kafka configuration files
  • Kafka Java API
  • Analysis of the Kafka file storage mechanism
  • Redis basics and standalone environment deployment
  • Redis data structures and typical cases
  • Flume Quick Start
  • Flume + Kafka + Storm + Redis integration

IV. In-Memory Computing System Spark

1. Scala Programming

  • Scala Programming
  • Scala-related software installation
  • Scala basic syntax
  • Scala methods and functions
  • Scala functional programming features
  • Scala arrays and collections
  • Scala programming exercise (single-machine wordcount)
  • Scala object-oriented
  • Scala pattern matching
  • Actor Programming
  • Option and partial function
  • Practice: concurrent wordcount with actors
  • Currying
  • Implicit conversion

2. Akka and RPC

  • Akka concurrent programming framework
  • Practice: RPC Programming Practice

3. Spark Quick Start

  • Spark Introduction
  • Spark Environment Construction
  • RDD Introduction
  • RDD transformations and actions
  • Practice: RDD comprehensive exercises
  • Advanced RDD operators
  • Custom partitioner
  • Practice: website visits
  • Broadcast variable
  • Practice: Calculate geographic locations based on IP addresses
  • Custom sorting
  • Use JdbcRDD to import and export data
  • Detailed description of the wordcount execution process

4. RDD details

  • RDD dependency
  • RDD Cache Mechanism
  • RDD checkpoint Mechanism
  • Spark task Execution Process Analysis
  • RDD Stage Division

5. Spark-SQL application

  • Spark-SQL
  • Spark combined with Hive
  • DataFrame
  • Practice: Spark-SQL and DataFrame cases

6. Spark Streaming application practice

  • Introduction to Spark Streaming
  • Spark Streaming programming
  • Practice: stateful wordcount
  • Flume combined with Spark Streaming
  • Kafka combined with Spark Streaming
  • Window functions
  • ELK technology stack introduction
  • Install and use Elasticsearch

7. Spark core source code analysis

  • Spark source code compilation
  • Spark remote debugging
  • Source code analysis of the Spark job submission process
  • Source code analysis of the Spark communication process
  • Source code analysis of the SparkContext creation process
  • Source code analysis of the communication process between DriverActor and ClientActor
  • Source code analysis of how a worker starts an executor
  • Source code analysis of executor registration with DriverActor
  • Source code analysis of executor registration with the driver
  • Source code analysis of DAGScheduler and TaskScheduler
  • Shuffle process source code analysis
  • Task execution process source code analysis

V. Machine Learning Algorithms

1. Python and the NumPy library

  • Machine Learning Overview
  • Machine Learning and Python
  • Python-Quick Start
  • Python-Data Types
  • Python-process control statements
  • Python language-function usage
  • Python-modules and packages
  • Python language-object-oriented programming
  • Python machine learning algorithm library-NumPy
  • Mathematical knowledge required for Machine Learning-Probability Theory

2. Common Algorithm Implementation

  • KNN classification algorithm-algorithm principles
  • KNN classification algorithm-code implementation
  • KNN classification algorithm-case study of handwriting recognition
  • Linear regression classification algorithm-algorithm principle
  • Linear regression classification algorithm-algorithm implementation and demo
  • Naive Bayes classification algorithm-algorithm principle
  • Naive Bayes classification algorithm-Algorithm Implementation
  • Naive Bayes classification algorithm-spam recognition application case
  • K-means clustering algorithm-algorithm principle
  • K-means clustering algorithm-algorithm implementation
  • K-means clustering algorithm-application to geographic location clustering
  • Decision tree classification algorithm-algorithm principles
  • Decision tree classification algorithm-Algorithm Implementation
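
As a taste of the "KNN classification algorithm-code implementation" topic listed above, here is a minimal sketch in plain Python. It assumes Euclidean distance and a simple majority vote among the k nearest samples; the function name `knn_classify` and the toy data are hypothetical and are not taken from the course material.

```python
import math
from collections import Counter


def knn_classify(query, samples, labels, k=3):
    """Return the majority label among the k training samples closest to query."""
    # Pair each training sample's Euclidean distance to the query with its label.
    distances = [
        (math.dist(query, sample), label)
        for sample, label in zip(samples, labels)
    ]
    # Sort by distance only, so labels never take part in the comparison.
    distances.sort(key=lambda pair: pair[0])
    # Count the labels of the k nearest neighbours and take the most common one.
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Hypothetical toy data: two points per class.
    samples = [(1.0, 1.1), (1.0, 1.0), (0.0, 0.0), (0.0, 0.1)]
    labels = ["A", "A", "B", "B"]
    print(knn_classify((0.1, 0.1), samples, labels))
    print(knn_classify((0.9, 1.0), samples, labels))
```

`math.dist` requires Python 3.8 or later. A NumPy version would vectorize the distance computation, which matters once the training set grows beyond a toy example.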

In the near future, the era of intelligent technology will surely enter our daily lives. If you are interested in the future of cutting-edge industries, join us in shaping the future of AI.
