Comprehensive in-depth analysis of Spark 2 -- knowledge points, source code, tuning, the JVM, graph computation, and projects
Course viewing address: http://www.xuetuwuyou.com/course/220
The course comes from the Xuetuwuyou ("study without worry") network: http://www.xuetuwuyou.com
A total of 14 chapters and 316 sections. The course analyzes Spark-related technology from every angle and closes with two hands-on projects, a User Interactive Behavior Analysis System and a DMP User Portrait System, giving a comprehensive, applied treatment of Spark. You could say that with this one set in hand, nothing can stand in your way!
Chapter 1: Scala
Task 1: Java and Scala compared
Task 2: Why learn Scala
Task 3: Installing the Scala compiler
Task 4: Writing the first Scala program
Task 5: Scala tool installation
Task 6: Programming with IDEA
Task 7: Building a jar in IDEA
Task 8: Declaring variables
Task 9: Scala data types
Task 10: if expressions
Task 11: Code blocks
Task 12: Loops: while
Task 13: Loops: for
Task 14: Scala operators
Task 15: Defining methods
Task 16: Defining functions
Task 17: Decorator design
Task 18: Explaining functional programming with Java
Task 19: Knowledge review
Task 20: Fixed-length and variable-length arrays
Task 21: Converting and traversing arrays
Task 22: Common array algorithms
Task 23: Map collections
Task 24: Tuple operations
Task 25: List collection operations
Task 26: Implementing word count in Scala
Task 27: Set collection operations
Task 28: The lazy feature
Task 29: Scala course notes
Task 30: Defining classes
Task 31: Inspecting the compiled class file
Task 32: Primary and auxiliary constructors
Task 33: Morning knowledge review
Task 34: Objects
Task 35: The apply method
Task 36: Traits
Task 37: Applications of extends
Task 38: Inheritance
Task 39: Abstract classes
Task 40: Pattern matching
Task 41: Scala string printing
Task 42: Case classes
Task 43: Option (Some, None)
Task 44: Partial functions
Task 45: Closures
Task 46: Currying
Task 47: Implicit parameters
Task 48: Implicit conversions
Task 49: When implicit conversions are triggered: two case demos
Task 50: Implicit conversion case 1
Task 51: Implicit conversion case 2
Task 52: Upper and lower bounds
Task 53: Upper bound case
Task 54: Lower bound case
Task 55: View bounds
Task 56: Covariance
Task 57: Contravariance
Task 58: Knowledge summary
Task 59: Socket assignment
Task 60: Assignment requirements analysis
Task 61: Assignment code implementation
Task 62: Notes on actors
Task 63: Basic actor concepts
Task 64: Actor case demo
Task 65: Case 2 requirements analysis
Task 66: Case code demo (part 1)
Task 67: Case code demo (part 2)
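As a taste of the chapter's collection tasks, Task 26's word count can be sketched with nothing but the standard Scala collections API (the input lines are invented for illustration):

```scala
// Word count over a couple of hardcoded lines, using only Scala collections.
val lines = List("spark scala spark", "scala kafka")

val counts: Map[String, Int] = lines
  .flatMap(_.split("\\s+"))                      // break every line into words
  .groupBy(identity)                             // word -> all of its occurrences
  .map { case (word, occ) => word -> occ.size }  // word -> occurrence count

// counts == Map("spark" -> 2, "scala" -> 2, "kafka" -> 1)
```

The same flatMap / groupBy / map shape reappears almost verbatim on RDDs in Chapter 2, which is why word count is the course's bridge example.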
Chapter 2: Spark Core
Task 68: How to learn an open-source technology
Task 69: What is Spark
Task 70: The four characteristics of Spark
Task 71: Spark quick start (part 1)
Task 72: Spark quick start (part 2)
Task 73: What is an RDD
Task 74: Demonstrating what an RDD is
Task 75: How a Spark task runs
Task 76: Hadoop cluster setup
Task 77: Spark cluster setup
Task 78: Spark HA cluster setup
Task 79: Developing a Spark program in Scala
Task 80: Developing Spark programs in Java 7
Task 81: Developing Spark programs in Java 8
Task 82: Building a Maven package in IDEA
Task 83: Submitting a task to the Spark cluster
Task 84: How RDDs are created
Task 85: Notes on the Spark scripts
Task 86: How transformations and actions work
Task 87: Broadcast variables
Task 88: Accumulators
Task 89: Shared variables demo
Task 90: persist
Task 91: checkpoint
Task 92: Additional notes on persistence
Task 93: Standalone run mode
Task 94: Spark on YARN
Task 95: Spark on YARN principles
Task 96: HistoryServer configuration
Task 97: map, flatMap, filter
Task 98: sortByKey, reduceByKey
Task 99: join, union, cogroup
Task 100: intersection, distinct, cartesian
Task 101: mapPartitions, repartition, coalesce
Task 102: More on the difference between coalesce and repartition
Task 103: aggregateByKey, mapPartitionsWithIndex
Task 104: The action operators explained
Task 105: The collect operator explained
Task 106: Spark secondary sort
Task 107: Narrow and wide dependencies
Task 108: Examples of narrow and wide dependencies
Task 109: Terminology
Task 110: The stage-splitting algorithm
Task 111: Scheduling of Spark tasks
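The map-side pre-aggregation that makes Task 98's reduceByKey cheaper than groupByKey can be modeled on plain Scala collections, with each inner list standing in for one partition (no Spark dependency; real code would call rdd.reduceByKey(_ + _)):

```scala
// Two hypothetical "partitions" of (key, count) pairs.
val partitions = List(
  List(("a", 1), ("b", 1), ("a", 1)),
  List(("b", 1), ("a", 1))
)

// Step 1: combine within each partition (the map-side combine).
val local = partitions.map { p =>
  p.groupBy(_._1).map { case (k, vs) => k -> vs.map(_._2).sum }
}

// Step 2: merge the per-partition partials (what the reduce side of the
// shuffle does). Only the small partial sums cross the "network".
val merged = local.flatten.groupBy(_._1).map { case (k, vs) => k -> vs.map(_._2).sum }

// merged == Map("a" -> 3, "b" -> 2)
```

groupByKey, by contrast, would ship every individual pair across the shuffle before summing, which is exactly what Chapter 3's tuning advice warns against.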
Chapter 3: Spark Tuning
Task 112: Avoid creating duplicate RDDs
Task 113: Reuse the same RDD wherever possible
Task 114: Persist RDDs that are used multiple times
Task 115: Avoid shuffle operators where possible
Task 116: Prefer shuffle operations with map-side pre-aggregation
Task 117: Use high-performance operators
Task 118: Broadcast large variables
Task 119: Optimize serialization performance with Kryo
Task 120: Optimize data structures
Task 121: Data locality
Task 122: How data skew arises and how to locate it
Task 123: Preprocess data with Hive ETL
Task 124: Filter out the few keys that cause skew
Task 125: Increase the parallelism of shuffle operations
Task 126: Two-phase aggregation (local aggregation + global aggregation)
Task 127: Convert reduce joins to map joins
Task 128: Sample skewed keys and split the join
Task 129: Join using random prefixes and an expanded RDD
Task 130: Combining the solutions
Task 131: The shuffle implementations across versions
Task 132: Shuffle tuning
Task 133: Spark resource tuning
Task 134: The Spark 1.5 memory model
Task 135: The Spark 2 memory model
Task 136: Whole-stage code generation
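Task 126's two-phase aggregation for skewed keys can be sketched in plain Scala: salt the hot key so it spreads across several buckets, aggregate locally, strip the salt, then aggregate globally. The data and the salt width of 4 are invented for illustration (a real job would salt randomly on an RDD):

```scala
// One heavily skewed key ("hot") plus a normal one ("cold").
val records = List.fill(100)(("hot", 1)) ++ List(("cold", 1))

// Phase 1: prefix each key with a small deterministic salt so "hot"
// spreads over 4 buckets instead of landing in a single reduce task.
val salted = records.zipWithIndex.map { case ((k, v), i) => (s"${i % 4}_$k", v) }
val local  = salted.groupBy(_._1).map { case (k, vs) => k -> vs.map(_._2).sum }

// Phase 2: strip the salt and aggregate the (now small) per-salt partials.
val global = local.toList
  .map { case (k, v) => k.split("_", 2)(1) -> v }
  .groupBy(_._1).map { case (k, vs) => k -> vs.map(_._2).sum }

// global == Map("hot" -> 100, "cold" -> 1)
```

The trick trades one skewed aggregation for two balanced ones; the same salting idea underlies the random-prefix join in Task 129.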
Chapter 4: JVM Tuning
Task 137: JVM architecture
Task 138: How the three regions work together
Task 139: Heap structure
Task 140: The JDK 8 memory model
Task 141: Heap memory overflow demo
Task 142: Brief introduction to the MAT tool
Task 143: GC log format
Task 144: Heap memory configuration demo
Task 145: Stack parameter configuration
Task 146: Introduction to garbage collection algorithms
Task 147: Stop-the-world
Task 148: Garbage collection algorithms
Task 149: Introduction to garbage collectors
Task 150: Configuring common collectors, demo
Task 151: The CMS garbage collector
Task 152: Hadoop JVM tuning demo
Task 153: Introduction to garbage collectors (continued)
Task 154: Introduction to performance monitoring tools
Task 155: Large objects go directly to the old generation
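The heap-sizing, CMS, and GC-log topics above come together on a single JDK 8 command line. This is only an illustrative config fragment: the sizes are placeholders, not recommendations, and app.jar is a stand-in name.

```shell
# Illustrative JDK 8 flags covering the chapter's topics
# (heap/young-gen sizing, the CMS collector, GC logging).
java -Xms4g -Xmx4g \
     -Xmn1g \
     -XX:+UseConcMarkSweepGC \
     -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:gc.log \
     -jar app.jar
```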
Chapter 5: Spark Core Source Code Analysis
Task 156: How to find the source code
Task 157: How to attach the source code
Task 158: Master startup process
Task 159: Master and Worker startup process
Task 160: The spark-submit submission process
Task 161: SparkContext initialization
Task 162: Creating the TaskScheduler
Task 163: DAGScheduler initialization
Task 164: TaskSchedulerImpl startup
Task 165: The Master's resource scheduling algorithm
Task 166: TaskSchedulerImpl UML diagram
Task 167: Executor registration
Task 168: Executor startup UML diagram
Task 169: Spark task submission
Task 170: Task execution
Task 171: Spark task submission in detail
Task 172: Spark task submission process, diagram summary
Task 173: BlockManager in-depth analysis
Task 174: CacheManager in-depth analysis
Chapter 6: Spark SQL
Task 175: Notes on the default number of partitions
Task 176: Spark Core official example demo
Task 177: The past and present of Spark
Task 178: Spark release notes
Task 179: What is a DataFrame
Task 180: First taste of DataFrames
Task 181: Converting an RDD to a DataFrame, method 1
Task 182: Converting an RDD to a DataFrame, method 2
Task 183: RDD vs. DataFrame
Task 184: Spark SQL data sources: load
Task 185: Spark SQL data sources: save
Task 186: Spark SQL data sources: JSON and Parquet
Task 187: Spark SQL data sources: JDBC
Task 188: Hive as a Spark data source
Task 189: ThriftServer
Task 190: Spark SQL case demo
Task 191: Integrating Spark SQL with Hive
Task 192: Spark SQL UDFs
Task 193: Spark SQL UDAFs
Task 194: Spark SQL window functions
Task 195: groupBy and agg
Task 196: Knowledge summary
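The window-function and groupBy/agg tasks share one shape: partition the rows, then compute inside each partition. A plain-Scala model of "top product per region" (the rows are invented; in Spark SQL this would be rank() over a window partitioned by region):

```scala
// A tiny fact table of sales rows.
case class Sale(region: String, product: String, amount: Int)

val sales = List(
  Sale("east", "tv", 300), Sale("east", "phone", 500),
  Sale("west", "tv", 200), Sale("west", "phone", 100)
)

// "Partition by region", then keep the top row of each partition.
val topPerRegion: Map[String, String] = sales
  .groupBy(_.region)
  .map { case (region, rows) => region -> rows.maxBy(_.amount).product }

// topPerRegion == Map("east" -> "phone", "west" -> "tv")
```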
Chapter 7: Kafka
Task 197: Why Kafka came about
Task 198: Kafka core concepts
Task 199: Kafka core concepts revisited
Task 200: Introduction to various terms
Task 201: The benefits of a messaging system
Task 202: Messaging system categories and the pull vs. push distinction
Task 203: Kafka cluster architecture
Task 204: Kafka cluster setup
Task 205: Cluster test demo
Task 206: HA for Kafka data
Task 207: Kafka's design
Task 208: Kafka code test
Task 209: Assignments
Task 210: Kafka offsets
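A cluster test like Task 205's typically boils down to three classic CLI calls: create a topic, produce to it, consume from it. This fragment uses the old ZooKeeper-style flags of the Kafka versions contemporary with Spark 2 courses; host names, ports, topic name, and the replication/partition counts are placeholders.

```shell
# Create a topic, then exercise it with the console producer and consumer.
bin/kafka-topics.sh --create --zookeeper node1:2181 \
    --replication-factor 2 --partitions 3 --topic test
bin/kafka-console-producer.sh --broker-list node1:9092 --topic test
bin/kafka-console-consumer.sh --zookeeper node1:2181 --topic test --from-beginning
```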
Chapter 8: Spark Streaming
Task 211: A brief talk about the future of Spark Streaming
Task 212: How Spark Streaming runs
Task 213: DStreams explained with diagrams
Task 214: The flow of a streaming computation
Task 215: Socket streaming demo
Task 216: HDFS DStream demo
Task 217: updateStateByKey demo
Task 218: Blacklist filtering with transform, demo
Task 219: Window operations demo
Task 220: Blacklist filtering with transform, supplement
Task 221: foreachRDD demo
Task 222: Kafka and Spark Streaming integration demo
Task 223: Consuming Kafka data with multiple threads
Task 224: Consuming Kafka data in parallel with a thread pool
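Task 219's window operations can be modeled with plain Scala, each inner list standing in for one batch interval (invented data; real code would use reduceByKeyAndWindow on a DStream):

```scala
// Three "batches" of events, one list per batch interval.
val batches = List(List("a", "b"), List("a"), List("a", "c"))
val windowLength = 2 // window covers the last 2 batches

// Counts recomputed for each sliding window position.
val perWindow = batches.sliding(windowLength).toList.map { window =>
  window.flatten.groupBy(identity).map { case (w, occ) => w -> occ.size }
}

// perWindow.last == Map("a" -> 2, "c" -> 1)   (counts over the last 2 batches)
```

Spark Streaming additionally lets an inverse reduce function subtract the batch that slid out of the window instead of recomputing, which this naive model skips.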
Chapter 9: Streaming Tuning
Task 225: Fault tolerance in Spark Streaming
Task 226: Spark Streaming vs. Storm
Task 227: Integrating Spark Streaming with Kafka (manually controlling offsets)
Task 228: Tuning Spark Streaming: parallelism
Task 229: Tuning Spark Streaming: memory
Task 230: Tuning Spark Streaming: serialization
Task 231: Tuning Spark Streaming: JVM & GC
Task 232: Tuning Spark Streaming: individual slow tasks
Task 233: Tuning Spark Streaming: unstable resources
Task 234: Spark Streaming under data surges
Chapter 10: Streaming Source Code
Task 235: Introduction to reading the Spark Streaming source
Task 236: How Spark Streaming works
Task 237: The Spark Streaming communication model
Task 238: StreamingContext initialization
Task 239: Receiver startup process walkthrough
Task 240: Receiver startup process, UML summary
Task 241: How blocks are generated
Task 242: How blocks are generated and stored
Task 243: The chain-of-responsibility pattern
Task 244: BlockRDD construction and job submission
Task 245: BlockRDD construction and job submission, summary
Chapter 11: Spark GraphX
Task 246: Introduction to graph computation
Task 247: Graph computation case demo
Task 248: The basic components of a graph
Task 249: Graph storage
Task 250: Finding friends, case demo
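The find-friends demo reduces, at its smallest, to intersecting adjacency sets. A toy version on a plain Map (the social graph is invented; GraphX would express this over edge triplets):

```scala
// Adjacency map: each person -> the set of their friends.
val friends: Map[String, Set[String]] = Map(
  "alice" -> Set("bob", "carol", "dave"),
  "bob"   -> Set("alice", "carol"),
  "carol" -> Set("alice", "bob")
)

// Common friends of two people = intersection of their adjacency sets.
val commonFriends = friends("alice") intersect friends("bob")
// commonFriends == Set("carol")
```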
Chapter 12: Spark 2 vs. Spark 1
Task 251: New features in Spark 2
Task 252: RDD & DataFrame & Dataset
Task 253: RDD & DataFrame & Dataset (continued)
Task 254: SparkSession accessing Hive, supplementary notes
Task 255: Merging the DataFrame and Dataset APIs
Chapter 13: Capstone Project: User Interactive Behavior Analysis System
Task 256: Introduction to the project workflow
Task 257: Project overview
Task 258: Data sources for big data projects
Task 259: Project background
Task 260: Common concepts
Task 261: Project requirements
Task 262: Project integration process
Task 263: Design considerations arising from the table schema
Task 264: Getting task parameters
Task 265: Requirement 1: data description
Task 266: Requirement 1: filtering sessions by criteria
Task 267: Requirement 1: a worked example
Task 268: Requirement 1: top-N categories by clicks, orders, and payments (part 1)
Task 269: Requirement 1: top-N categories by clicks, orders, and payments (part 2)
Task 270: Requirement 2: requirements analysis
Task 271: Requirement 2: data description
Task 272: Requirement 2: fetching user behavior data
Task 273: Requirement 2: joining the user table and the info table
Task 274: Requirement 2: further requirements analysis
Task 275: Requirement 2: custom UDF functions
Task 276: Requirement 2: a custom UDAF function
Task 277: Requirement 2: counting product clicks per region
Task 278: Requirement 2: joining the city info table and the product info table
Task 279: Requirement 2: hot products per region
Task 280: Requirement 2: persisting results to the database
Task 281: Requirement 2: summary
Task 282: Requirement 3: requirements analysis
Task 283: Requirement 3: data description
Task 284: Requirement 3: organizing the approach
Task 285: Requirement 3: fetching data from Kafka
Task 286: Requirement 3: blacklist filtering of the data
Task 287: Requirement 3: dynamic blacklist (part 1)
Task 288: Requirement 3: dynamic blacklist (part 2)
Task 289: Requirement 3: real-time ad-click counts per province and city
Task 290: Requirement 3: real-time click traffic per province
Task 291: Requirement 3: real-time ad-click trends
Task 292: Requirement 3: summary
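Once requirement 1's sessions are aggregated, the top-N step itself is a sort-and-take. A skeleton on plain Scala pairs, with an invented combined click/order/pay score per category:

```scala
// Hypothetical (category, combined score) pairs after aggregation.
val categoryScores = List(("cat1", 10L), ("cat2", 7L), ("cat3", 12L), ("cat4", 3L))

// Top-N = sort descending by score, keep the first N.
val topN = categoryScores.sortBy(-_._2).take(2)
// topN == List(("cat3", 12L), ("cat1", 10L))
```

On an RDD the equivalent would be a sortByKey (or takeOrdered) over a composite sort key, which is where Task 106's secondary sort comes back into play.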
Chapter 14: DMP User Portrait System
Task 293: Project background
Task 294: The DSP workflow
Task 295: Project workflow description
Task 296: Developing the utils tool classes
Task 297: Requirement 1: feature development
Task 298: Packaging the code and submitting it to run on the cluster
Task 299: Requirement 2 description
Task 300: Report requirements description
Task 301: Statistics on data distribution across provinces and cities
Task 302: Defining a report statistics function
Task 303: Province and city report statistics
Task 304: App report statistics
Task 305: User portrait requirements
Task 306: Tagging
Task 307: Merging context tags
Task 308: Context tag test run
Task 309: Why we need graph computation
Task 310: Basic graph concepts
Task 311: A simple case demo
Task 312: The idea behind merging context tags
Task 313: Simple case demo, explained
Task 314: Continuing to organize the approach
Task 315: Generating a user relationship table
Task 316: Merging tags
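The tag-merge idea behind Tasks 309 to 316 is, in miniature, connected components: different identifiers observed for the same user are unioned into one component, and all their tags attach to that component. A tiny union-find sketch (identifier values invented; at scale GraphX's connectedComponents plays this role):

```scala
import scala.collection.mutable

// parent(x) points toward x's representative; absent keys are their own root.
val parent = mutable.Map[String, String]()

def find(x: String): String = {
  val p = parent.getOrElse(x, x)
  if (p == x) x
  else {
    val root = find(p)
    parent(x) = root // path compression: point straight at the root
    root
  }
}

def union(a: String, b: String): Unit = parent(find(a)) = find(b)

union("imei:123", "mac:ab") // one device seen under two hardware ids
union("mac:ab", "idfa:xy")  // and under an advertising id as well

// All three ids now share one root, so their tags can be merged:
// find("imei:123") == find("idfa:xy")
```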