A Spark Learning Roadmap (topics covered at the basic, intermediate, and advanced levels)

Source: Internet
Author: User

Beginners are often confused when they first start learning Spark. Use the outline below as a reference, then look up relevant material for each topic.

1 Spark Basics
1.1 Spark ecosystem and installation/deployment
While installing, make sure you understand the basic steps involved.
Installation and deployment
Introduction to Spark installation
Compiling Spark from source
Spark Standalone installation
Spark Standalone HA installation
Spark application deployment tool: spark-submit
Spark ecosystem
Spark (in-memory computing framework)
Spark Streaming (stream processing framework)
Spark SQL (ad hoc queries)
MLlib (machine learning)
GraphX (graph computation; replaces Bagel)
1.2 Spark runtime architecture and analysis
The runtime architecture of Spark
Basic terminology
Runtime architecture
How Spark runs on Standalone
How Spark runs on YARN
Analysis of running Spark examples
Analysis of a Spark on Standalone example
Analysis of a Spark on YARN example


1.3 Spark's monitoring and tuning
Monitoring of Spark
Spark UI Monitoring, default port is 4040
Ganglia monitoring (an open source monitoring framework used with big data clusters)
Spark Tuning
Fundamentals of tuning methods
1.4 Spark programming model
The programming model of Spark
Analysis of the Spark programming model
The features, operations, and dependencies of RDDs
Configuration of a Spark application
Analysis of Spark programming examples
Log processing
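
As a warm-up for the log processing example, the sketch below counts log lines per level in plain Python, with no Spark dependency. It only mirrors the shape that such a job usually takes with RDD transformations such as map, filter, and reduceByKey. The log format, the sample lines, and the function names (parse_level, count_by_level) are assumptions made for illustration and are not taken from any particular Spark example.

```python
import re
from collections import Counter

# Hypothetical log format, assumed for illustration only:
#   "<date> <time> <LEVEL> <message>"
SAMPLE_LOG = [
    "2015-03-01 10:00:01 INFO Starting job",
    "2015-03-01 10:00:02 WARN Slow task",
    "2015-03-01 10:00:03 ERROR Task failed",
    "2015-03-01 10:00:04 INFO Job finished",
    "garbage line",
]

LEVEL_RE = re.compile(r"^\S+ \S+ (?P<level>[A-Z]+) ")


def parse_level(line):
    """Return the log level of a line, or None if the line is malformed."""
    match = LEVEL_RE.match(line)
    return match.group("level") if match else None


def count_by_level(lines):
    """Count lines per log level, following a map / filter / reduce-by-key shape."""
    # "map" step: turn every line into a (level, 1) pair
    pairs = ((parse_level(line), 1) for line in lines)
    # "filter" step: drop lines that could not be parsed
    pairs = (pair for pair in pairs if pair[0] is not None)
    # "reduce by key" step: sum the ones for each level
    counts = Counter()
    for level, n in pairs:
        counts[level] += n
    return dict(counts)


if __name__ == "__main__":
    print(count_by_level(SAMPLE_LOG))
```

In a real Spark job the three steps would run in parallel across partitions instead of in a single loop, but the per-record logic stays the same.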
1.5 Spark Streaming principle
Spark Streaming architecture
Features of DStreams
Differences between DStream operations and RDD operations
Optimization of Spark Streaming
Analysis of Spark Streaming examples
Common example programs:
Text processing example
Window operations
Network data processing
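
To make the idea of a window operation concrete, here is a minimal plain-Python sketch with no Spark dependency. It treats each list of words as one micro-batch and recomputes word counts over the most recent batches at a fixed slide. The function name windowed_counts and its parameters are hypothetical; in Spark Streaming the window length and slide interval are durations (multiples of the batch interval), whereas here they are simply numbers of batches.

```python
from collections import Counter, deque


def windowed_counts(batches, window_length, slide_interval):
    """Return word counts over the last `window_length` batches,
    computed once every `slide_interval` batches."""
    window = deque(maxlen=window_length)  # old batches fall out automatically
    results = []
    for i, batch in enumerate(batches, start=1):
        window.append(batch)
        if i % slide_interval == 0:
            counts = Counter()
            for past_batch in window:
                counts.update(past_batch)
            results.append(dict(counts))
    return results


if __name__ == "__main__":
    # Four micro-batches of words (made-up data)
    stream = [["a", "b"], ["a"], ["c"], ["a", "c"]]
    # Window of 3 batches, sliding every 2 batches
    for counts in windowed_counts(stream, window_length=3, slide_interval=2):
        print(counts)
```

Recomputing the whole window on every slide is the simplest approach. The DStream API also offers operations such as window and reduceByKeyAndWindow, which are worth studying once the basic idea is clear.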
1.6 Spark SQL principle
Catalyst Optimizer for Spark SQL
Spark SQL core
Spark SQL and Hive
Examples of Spark SQL
Running the Spark SQL demos
Spark SQL programming (you will need to find some resources online)


2 Intermediate Topics
2.1 Multi-language programming with Spark
Scala programming with Spark
Python programming with Spark (it goes without saying that you should already be familiar with Java)
Work through the corresponding application examples to understand the basic processing patterns.


2.2 Spark Machine Learning Primer
The principle of machine learning
Introduction to MLlib and analysis of examples
2.3 GraphX Getting Started
The basis of graph theory
Introduction to GraphX
Analysis of GraphX examples
2.4 Understanding Spark's differences and connections with other projects
Spark and MapReduce, Tez
Spark's derivative projects: BlinkDB, rspark
2.5 Follow the blogs of Spark's authors and the documentation on authoritative websites


3 Advanced Topics
3.1 In-depth understanding of Spark's architecture and processing patterns

3.2 Reading and analyzing the Spark source code
Spark Core, the core module
Master the processing logic of the following core components:
SparkContext
Executor
Deploy
RDD and Storage
Scheduler and Task
Spark Examples
3.3 Think about how to optimize and improve Spark, and get to know its strengths and weaknesses.
Thinking deeply can lead to interesting topics.
