This course focuses on Spark, one of the hottest, most popular, and most promising technologies in today's big data world. It progresses from the basics to advanced topics, explaining Spark in depth through a large number of case studies, including real cases extracted from complex, real-world enterprise business requirements. The course covers Scala programming, Spark Core programming, Spark SQL and Spark Streaming, Spark kernel and source code analysis, performance tuning, enterprise case scenarios, and more. Starting completely from scratch, students can master enterprise-level Spark big data development in one place, improve their competitiveness in the workplace, earn a better promotion or job change, or transition from traditional software development (for example, Java EE) to Spark big data development. Developers already working on Hadoop big data can also broaden their skill stack and increase their value.
1. Course Development Environment
Development tools: Eclipse, Scala IDE for Eclipse
Spark: 1.3.0 and 1.5.1; Hadoop: 2.4.1; Hive: 0.13; ZooKeeper: 3.4.5; Kafka: 2.9.2-0.8.1
Other tools: SecureCRT, WinSCP, VirtualBox, etc.
2. Course Content
This course focuses on Scala programming, Hadoop and Spark cluster setup, Spark Core programming, in-depth analysis of the Spark kernel source code, Spark performance tuning, Spark SQL, and Spark Streaming. Its main features include:
1. Code-driven explanation of every Spark technical point (never just theory read off slides).
2. Live, hands-on drawings to explain Spark's principles and source code (not just source listings and slides).
3. Coverage of all Spark function points (Spark RDD, Spark SQL, Spark Streaming, from basic features to advanced ones).
4. Scala taught entirely through cases (nearly a hundred interesting examples).
5. Hands-on Spark case code provided in both Java and Scala versions with tutorials, so you learn to develop Spark in both languages at once.
6. A large number of knowledge points unique to this course: sort-based WordCount, Spark secondary sort, group-by top N, two ways to convert between DataFrame and RDD, Spark SQL built-in functions, window functions, UDFs and UDAFs, the Spark Streaming Kafka direct API, updateStateByKey, transform, sliding windows, foreachRDD performance optimization, integration with Spark SQL, persistence, checkpointing, fault tolerance, and transactions.
7. Multiple complex cases extracted from real enterprise requirements: daily UV and sales statistics, top 3 hot product statistics, daily top 3 hot search words, real-time blacklist filtering of ad billing logs, sliding-window statistics of hot search words, and real-time top 3 popular product statistics.
8. Deep analysis of the Spark Core and Spark Streaming source code, with detailed comments and explanations (the most detailed source walkthrough to date).
9. Comprehensive coverage of Spark, Spark SQL, and Spark Streaming performance tuning, including detailed shuffle performance tuning for each technical point.
10. Coverage of two important Spark versions, 1.3.0 and 1.5.1, keeping up with Spark's latest advanced features.
Major free upgrade announcement!
This upgrade adds 132 lectures, roughly 60 hours of content, nearly doubling the course. Every stage, from getting started through mastery, has been upgraded. The main additions are:
1. Advanced Scala programming: a guide to Scala's higher-level coding skills.
2. Advanced Spark Core programming: the most detailed Spark Core coverage to date, including standalone cluster operation and every detail of spark-submit, supported by a large number of experiments; it explains almost all operator operations and adds many practical cases, plus a comprehensive case analyzing mobile app access traffic logs.
3. Advanced Spark Core principles: exclusive explanations of the internal principles of 10 commonly used Spark operators.
4. Spark SQL hands-on development: advanced content such as the Thrift JDBC/ODBC server, plus a comprehensive case on offline statistics of key metrics for a news website.
5. Spark Streaming hands-on development: advanced content such as the Flume data source, plus a comprehensive case on real-time statistics of key metrics for a news website.
6. Advanced Spark operations and management: full hands-on practice with Spark's operations and management techniques, including ZooKeeper-based and filesystem-based HA with master failover, various job monitoring methods, and exclusive coverage of Spark dynamic resource allocation and the fair scheduler.
Instructor: China Huperzine. Has worked on big data development and architecture at domestic BAT companies and other first-tier internet companies, responsible for the architecture and development of several large-scale big data systems. Proficient in big data technologies such as Hadoop, Storm, and Spark, with extensive experience in internal technology sharing, training, and lectures.
Instructor Q&A QQ: 2310879776
I. Scala Programming in Detail
Lecture 1: Spark's past and present
Lecture 2: Course introduction, features, and value
Lecture 3: Scala programming in detail: basic syntax
Lecture 4: Scala programming in detail: conditionals and loops
Lecture 5: Scala programming in detail: getting started with functions
Lecture 6: Scala programming in detail: default and named function parameters
Lecture 7: Scala programming in detail: variable-length function parameters
Lecture 8: Scala programming in detail: procedures, lazy values, and exceptions
Lecture 9: Scala programming in detail: array operations: Array, ArrayBuffer, and traversing arrays
Lecture 10: Scala programming in detail: array operations: array transformations
Lecture 11: Scala programming in detail: Map and tuples
Lecture 12: Scala programming in detail: object-oriented programming: classes
Lecture 13: Scala programming in detail: object-oriented programming: objects
Lecture 14: Scala programming in detail: object-oriented programming: inheritance
Lecture 15: Scala programming in detail: object-oriented programming: traits
Lecture 16: Scala programming in detail: functional programming
Lecture 17: Scala programming in detail: functional programming: collection operations
Lecture 18: Scala programming in detail: pattern matching
Lecture 19: Scala programming in detail: type parameters
Lecture 20: Scala programming in detail: implicit conversions and implicit parameters
Lecture 21: Scala programming in detail: introduction to actors
(A short Scala sketch of some of these basics follows.)
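As a taste of the material in this section, here is a minimal, self-contained sketch of a few Scala features it covers (default and named parameters, variable-length parameters, pattern matching, collection operations). All names are illustrative and not taken from the course code.

```scala
// Illustrative Scala basics sketch; names are made up for this example.
object ScalaBasicsSketch {

  // Function with a default parameter and a variable-length parameter list
  def greet(name: String, greeting: String = "Hello"): String = s"$greeting, $name"
  def sum(numbers: Int*): Int = numbers.foldLeft(0)(_ + _)

  // Pattern matching over a small sealed hierarchy (object-oriented + functional style)
  sealed trait Shape
  case class Circle(radius: Double) extends Shape
  case class Rectangle(width: Double, height: Double) extends Shape

  def area(shape: Shape): Double = shape match {
    case Circle(r)       => math.Pi * r * r
    case Rectangle(w, h) => w * h
  }

  def main(args: Array[String]): Unit = {
    println(greet("Spark"))                  // Hello, Spark
    println(greet("Spark", greeting = "Hi")) // Hi, Spark (named parameter)
    println(sum(1, 2, 3, 4))                 // 10
    println(List(Circle(1.0), Rectangle(2, 3)).map(area).sum)
  }
}
```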
II. Course Environment Setup
Lecture 22: Course environment setup: CentOS 6.5 cluster
Lecture 23: Course environment setup: Hadoop 2.4.1 cluster
Lecture 24: Course environment setup: Hive 0.13
Lecture 25: Course environment setup: ZooKeeper 3.4.5 cluster
Lecture 26: Course environment setup: Kafka 2.9.2-0.8.1 cluster
Lecture 27: Course environment setup: Spark 1.3.0 cluster
III. Spark Core Programming
Lecture 28: Spark Core programming: Spark's basic working principles and the RDD
Lecture 29: Spark Core programming: developing WordCount in Java, Scala, and spark-shell
Lecture 30: Spark Core programming: in-depth analysis of how the WordCount program works
Lecture 31: Spark Core programming: Spark architecture principles
Lecture 32: Spark Core programming: hands-on RDD creation (collections, local files, HDFS files)
Lecture 33: Spark Core programming: hands-on RDD operations (transformation and action cases)
Lecture 34: Spark Core programming: transformation operation development cases
Lecture 35: Spark Core programming: action operation development cases
Lecture 36: Spark Core programming: RDD persistence in detail
Lecture 37: Spark Core programming: shared variables (broadcast variables and accumulators)
Lecture 38: Spark Core programming (advanced): sort-based WordCount program
Lecture 39: Spark Core programming (advanced): hands-on secondary sort
Lecture 40: Spark Core programming (advanced): top N and group-by top N
(A minimal sort-based WordCount sketch in Scala follows.)
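Below is a minimal sketch of the kind of program this section builds up to: a sort-based WordCount on the RDD API, assuming the Spark Scala API used in the course. The input path, master URL, and object name are illustrative placeholders, not the course's actual files.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Illustrative sort-based WordCount: count words, then sort by count descending.
object SortedWordCount {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("SortedWordCount").setMaster("local[2]")
    val sc = new SparkContext(conf)

    val lines = sc.textFile("hdfs://namenode:9000/input/words.txt") // placeholder path
    val counts = lines
      .flatMap(_.split(" "))
      .map(word => (word, 1))
      .reduceByKey(_ + _)                    // aggregate counts per word

    // Sort-based WordCount: flip to (count, word), sort descending, flip back
    val sorted = counts
      .map { case (word, count) => (count, word) }
      .sortByKey(ascending = false)
      .map { case (count, word) => (word, count) }

    sorted.take(10).foreach(println)
    sc.stop()
  }
}
```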
IV. In-Depth Analysis of the Spark Kernel Source Code
Lecture 41: Spark kernel source analysis: in-depth analysis of the Spark kernel architecture
Lecture 42: Spark kernel source analysis: wide dependencies and narrow dependencies
Lecture 43: Spark kernel source analysis: the two YARN submission modes
Lecture 44: Spark kernel source analysis: SparkContext initialization, principles and source code
Lecture 45: Spark kernel source analysis: Master active/standby switching mechanism, principles and source code
Lecture 46: Spark kernel source analysis: Master registration mechanism, principles and source code
Lecture 47: Spark kernel source analysis: Master state-change handling, principles and source code
Lecture 48: Spark kernel source analysis: Master resource scheduling algorithm, principles and source code
Lecture 49: Spark kernel source analysis: Worker principles and source code
Lecture 50: Spark kernel source analysis: job trigger flow, principles and source code
Lecture 51: Spark kernel source analysis: DAGScheduler principles and source code (stage division algorithm and task best-locality algorithm)
Lecture 52: Spark kernel source analysis: TaskScheduler principles and source code (task allocation algorithm)
Lecture 53: Spark kernel source analysis: Executor principles and source code
Lecture 54: Spark kernel source analysis: Task principles and source code
Lecture 55: Spark kernel source analysis: shuffle principles and source code (normal shuffle and optimized shuffle)
Lecture 56: Spark kernel source analysis: BlockManager principles and source code (Spark's underlying storage mechanism)
Lecture 57: Spark kernel source analysis: CacheManager principles and source code
Lecture 58: Spark kernel source analysis: checkpoint principles and source code
V. Spark Performance Optimization
Lecture 59: Spark performance optimization: overview
Lecture 60: Spark performance optimization: diagnosing memory consumption
Lecture 61: Spark performance optimization: high-performance serialization libraries
Lecture 62: Spark performance optimization: optimizing data structures
Lecture 63: Spark performance optimization: persisting or checkpointing RDDs that are used multiple times
Lecture 64: Spark performance optimization: using serialized persistence levels
Lecture 65: Spark performance optimization: JVM garbage collection tuning
Lecture 66: Spark performance optimization: increasing parallelism
Lecture 67: Spark performance optimization: broadcasting shared data
Lecture 68: Spark performance optimization: data locality
Lecture 69: Spark performance optimization: reduceByKey and groupByKey
Lecture 70: Spark performance optimization: shuffle performance tuning
(A brief code sketch of a few of these techniques follows.)
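As an illustration of a few tuning points from this section, here is a minimal sketch showing Kryo serializer registration, a serialized persistence level, and why reduceByKey is preferred over groupByKey (values are combined map-side before the shuffle). The PageView class and sample data are assumptions made up for this example.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.storage.StorageLevel

// Illustrative tuning sketch; class names and data are placeholders.
object TuningSketch {
  case class PageView(url: String, userId: Long)

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("TuningSketch")
      .setMaster("local[2]")
      .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
      .registerKryoClasses(Array(classOf[PageView]))

    val sc = new SparkContext(conf)
    val views = sc.parallelize(Seq(
      PageView("/home", 1L), PageView("/home", 2L), PageView("/cart", 1L)))

    val pairs = views.map(v => (v.url, 1))

    // Preferred: per-key counts are combined locally before shuffling
    val byReduce = pairs.reduceByKey(_ + _)

    // Works, but ships every individual record across the shuffle
    val byGroup = pairs.groupByKey().mapValues(_.size)

    // Serialized persistence level, another technique from this section
    byReduce.persist(StorageLevel.MEMORY_ONLY_SER)

    byReduce.collect().foreach(println)
    byGroup.collect().foreach(println)
    sc.stop()
  }
}
```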
VI. Spark SQL
Lecture 71: Course environment setup: Spark 1.5.1 new features, source compilation, cluster setup
Lecture 72: Spark SQL: past and present
Lecture 73: Spark SQL: using DataFrame
Lecture 74: Spark SQL: converting an RDD to a DataFrame using reflection
Lecture 75: Spark SQL: converting an RDD to a DataFrame programmatically
Lecture 76: Spark SQL: generic load and save operations for data sources
Lecture 77: Spark SQL: Parquet data source: loading data programmatically
Lecture 78: Spark SQL: Parquet data source: automatic partition inference
Lecture 79: Spark SQL: Parquet data source: merging metadata
Lecture 80: Spark SQL: JSON data source: complex comprehensive case
Lecture 81: Spark SQL: Hive data source: complex comprehensive case
Lecture 82: Spark SQL: JDBC data source: complex comprehensive case
Lecture 83: Spark SQL: built-in functions and the daily UV and sales statistics case
Lecture 84: Spark SQL: window functions and the top 3 sales statistics case
Lecture 85: Spark SQL: hands-on UDF custom functions
Lecture 86: Spark SQL: hands-on UDAF custom aggregate functions
Lecture 87: Spark SQL: working principles and performance optimization
Lecture 87: Spark SQL: daily top 3 hot search words, hands-on with Spark Core
Lecture 87: Spark SQL: in-depth core source analysis (DataFrame laziness, optimizer strategies, etc.)
Lecture 87: Spark SQL: extended topic: Hive on Spark
(A minimal DataFrame sketch follows.)
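Here is a minimal sketch of two techniques from this section, against the Spark 1.3+ DataFrame API used in the course: converting an RDD to a DataFrame via reflection (a case class) and computing daily UV with the built-in countDistinct function. The AccessLog schema, table name, and data are illustrative assumptions.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext
import org.apache.spark.sql.functions._

// Illustrative daily-UV sketch; schema and data are made up for this example.
object DailyUVSketch {
  case class AccessLog(dt: String, userId: Long)

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("DailyUVSketch").setMaster("local[2]")
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)
    import sqlContext.implicits._

    val logsRDD = sc.parallelize(Seq(
      AccessLog("2015-10-01", 1L), AccessLog("2015-10-01", 1L),
      AccessLog("2015-10-01", 2L), AccessLog("2015-10-02", 3L)))

    // Reflection-based RDD -> DataFrame conversion
    val logsDF = logsRDD.toDF()
    logsDF.registerTempTable("access_logs")

    // Daily UV with a built-in aggregate function
    logsDF.groupBy("dt").agg(countDistinct("userId").as("uv")).show()

    // Equivalent SQL formulation against the registered temp table
    sqlContext.sql(
      "SELECT dt, COUNT(DISTINCT userId) AS uv FROM access_logs GROUP BY dt").show()

    sc.stop()
  }
}
```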
VII. Spark Streaming
Lecture 88: Spark Streaming: introduction to real-time big data computation
Lecture 89: Spark Streaming: DStream and basic working principles
Lecture 90: Spark Streaming: comparison with Storm
Lecture 91: Spark Streaming: developing a real-time WordCount program
Lecture 92: Spark Streaming: StreamingContext in detail
Lecture 93: Spark Streaming: input DStreams and receivers in detail
Lecture 94: Spark Streaming: basic input DStream sources and an HDFS-based real-time WordCount case
Lecture 95: Spark Streaming: hands-on Kafka input DStream (receiver-based)
Lecture 96: Spark Streaming: hands-on Kafka input DStream (direct-based)
Lecture 97: Spark Streaming: overview of DStream transformation operations
Lecture 98: Spark Streaming: updateStateByKey and a cache-based real-time WordCount case
Lecture 99: Spark Streaming: transform and the real-time ad-billing-log blacklist filtering case
Lecture 100: Spark Streaming: window sliding windows and the hot-search-word sliding statistics case
Lecture 101: Spark Streaming: DStream output operations and foreachRDD performance optimization in detail
Lecture 102: Spark Streaming: real-time top 3 popular products case, used together with Spark SQL
Lecture 103: Spark Streaming: caching and persistence mechanisms in detail
Lecture 104: Spark Streaming: the checkpoint mechanism in detail (driver high-reliability solution)
Lecture 105: Spark Streaming: deploying, upgrading, and monitoring real-time applications
Lecture 106: Spark Streaming: fault-tolerance mechanisms and transaction semantics in detail
Lecture 107: Spark Streaming: in-depth analysis of the architecture principles
Lecture 108: Spark Streaming: StreamingContext initialization and receiver startup, principles and source code
Lecture 109: Spark Streaming: data reception, principles and source code
Lecture 110: Spark Streaming: data processing, principles and source code (the relationship between blocks and batches analyzed thoroughly)
Lecture 111: Spark Streaming: performance tuning in detail
Lecture 112: Course summary (what have you learned, and what level have you reached?)
(A minimal stateful streaming WordCount sketch follows.)
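The sketch below illustrates the stateful real-time WordCount pattern this section covers (updateStateByKey plus checkpointing). The socket source, host, port, and checkpoint directory are illustrative placeholders, not the course's actual setup.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Illustrative stateful streaming WordCount; source and paths are placeholders.
object StatefulWordCountSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("StatefulWordCount").setMaster("local[2]")
    val ssc = new StreamingContext(conf, Seconds(5))
    ssc.checkpoint("hdfs://namenode:9000/checkpoint/wordcount") // required by updateStateByKey

    val lines = ssc.socketTextStream("localhost", 9999)
    val pairs = lines.flatMap(_.split(" ")).map(word => (word, 1))

    // Carry the running count per word across batches
    val counts = pairs.updateStateByKey[Int] { (newValues: Seq[Int], state: Option[Int]) =>
      Some(state.getOrElse(0) + newValues.sum)
    }

    counts.print()
    ssc.start()
    ssc.awaitTermination()
  }
}
```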
Advanced Spark Development (upgraded content!)
I. Advanced Scala Programming
Lecture 113: Advanced Scala programming: using Scaladoc
Lecture 114: Advanced Scala programming: three ways to break out of a loop
Lecture 115: Advanced Scala programming: multidimensional arrays, implicit conversion of Java arrays to Scala arrays
Lecture 116: Advanced Scala programming: tuple zip operations, implicit conversion of Java Maps to Scala Maps
Lecture 117: Advanced Scala programming: two ways to widen the scope of an inner class, getting a reference to the outer class from an inner class
Lecture 118: Advanced Scala programming: packages and imports in practice
Lecture 119: Advanced Scala programming: early definition of overridden fields, the Scala inheritance hierarchy, object equality
Lecture 120: Advanced Scala programming: file operations in practice
Lecture 121: Advanced Scala programming: partial functions in practice
Lecture 122: Advanced Scala programming: executing external commands
Lecture 123: Advanced Scala programming: regular expression support
Lecture 124: Advanced Scala programming: extractors in practice
Lecture 125: Advanced Scala programming: extractors for case classes
Lecture 126: Advanced Scala programming: extractors with a single argument
Lecture 127: Advanced Scala programming: annotations in practice
Lecture 128: Advanced Scala programming: common annotations
Lecture 129: Advanced Scala programming: basic XML operations
Lecture 130: Advanced Scala programming: embedding Scala code in XML
Lecture 131: Advanced Scala programming: modifying XML elements in practice
Lecture 132: Advanced Scala programming: loading XML and writing external documents
Lecture 133: Advanced Scala programming: collection element operations
Lecture 134: Advanced Scala programming: common collection operations
Lecture 135: Advanced Scala programming: map, flatMap, collect, and foreach in practice
Lecture 136: Advanced Scala programming: reduce and fold in practice
(A short sketch of extractors and partial functions follows.)
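As a small illustration of two advanced features from this section, here is a sketch of a single-argument extractor (unapply) and a partial function used with collect. All names are made up for this example.

```scala
// Illustrative advanced-Scala sketch; names are placeholders.
object AdvancedScalaSketch {

  // Extractor with a single argument: pulls the domain out of an email address
  object EmailDomain {
    def unapply(email: String): Option[String] = {
      val parts = email.split("@")
      if (parts.length == 2) Some(parts(1)) else None
    }
  }

  // Partial function: only defined for even numbers
  val halveEvens: PartialFunction[Int, Int] = {
    case n if n % 2 == 0 => n / 2
  }

  def main(args: Array[String]): Unit = {
    "user@example.com" match {
      case EmailDomain(domain) => println(s"domain = $domain")
      case _                   => println("not an email")
    }

    // collect applies the partial function only where it is defined
    println((1 to 10).collect(halveEvens)) // Vector(1, 2, 3, 4, 5)
  }
}
```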
II. Advanced Spark Core Programming
Lecture 137: Environment setup: CentOS 6.4 virtual machine installation
Lecture 138: Environment setup: Hadoop 2.5 pseudo-distributed cluster
Lecture 139: Environment setup: Spark 1.5 pseudo-distributed cluster
Lecture 140: Introduction to the outline of the first course upgrade and key notes
Lecture 141: Advanced Spark Core programming: Spark cluster architecture overview
Lecture 142: Advanced Spark Core programming: several special notes on the Spark cluster architecture
Lecture 143: Advanced Spark Core programming: core terminology explained
Lecture 144: Advanced Spark Core programming: Spark standalone cluster architecture
Lecture 145: Advanced Spark Core programming: master and worker startup scripts in detail
Lecture 146: Advanced Spark Core programming: experiment: starting the master and worker processes separately and viewing the startup logs
Lecture 147: Advanced Spark Core programming: worker node configuration and spark-env.sh parameters in detail
Lecture 148: Advanced Spark Core programming: experiment: submitting a Spark job in local mode
Lecture 149: Advanced Spark Core programming: experiment: submitting a Spark job in standalone client mode
Lecture 150: Advanced Spark Core programming: experiment: submitting a Spark job in standalone cluster mode
Lecture 151: Advanced Spark Core programming: multi-job resource scheduling in standalone mode
Lecture 152: Advanced Spark Core programming: job monitoring and logging in standalone mode
Lecture 153: Advanced Spark Core programming: experiment: monitoring a running job and printing logs manually
Lecture 154: Advanced Spark Core programming: yarn-client mode principles
Lecture 155: Advanced Spark Core programming: yarn-cluster mode principles
Lecture 156: Advanced Spark Core programming: experiment: submitting a Spark job in yarn-client mode
Lecture 157: Advanced Spark Core programming: viewing logs in YARN mode in detail
Lecture 158: Advanced Spark Core programming: YARN-mode parameters in detail
Lecture 159: Advanced Spark Core programming: packaging a Spark project and spark-submit in detail
Lecture 160: Advanced Spark Core programming: spark-submit examples and basic parameters
Lecture 161: Advanced Spark Core programming: experiment: submitting a Spark job with the simplest spark-submit command
Lecture 162: Advanced Spark Core programming: experiment: passing parameters to the main class via spark-submit
Lecture 163: Advanced Spark Core programming: multiple spark-submit examples and common parameters in detail
Lecture 164: Advanced Spark Core programming: SparkConf, spark-submit, and spark-defaults.conf
Lecture 165: Advanced Spark Core programming: configuring third-party dependencies with spark-submit
Lecture 166: Advanced Spark Core programming: the closure principle of Spark operators in detail
Lecture 167: Advanced Spark Core programming: experiment: demonstrating that accumulating into a closure variable has no effect
Lecture 168: Advanced Spark Core programming: experiment: why data printed inside an operator cannot be seen
Lecture 169: Advanced Spark Core programming: mapPartitions and the student grades query case
Lecture 170: Advanced Spark Core programming: mapPartitionsWithIndex and the class assignment case
Lecture 171: Advanced Spark Core programming: sample and the company annual-party lottery case
Lecture 172: Advanced Spark Core programming: union and the company department merger case
Lecture 173: Advanced Spark Core programming: intersection and the query of employees shared across company projects
Lecture 174: Advanced Spark Core programming: distinct and the website UV statistics case
Lecture 175: Advanced Spark Core programming: aggregateByKey and the word count case
Lecture 176: Advanced Spark Core programming: cartesian and the clothing matching case
Lecture 177: Advanced Spark Core programming: coalesce and the company department consolidation case
Lecture 178: Advanced Spark Core programming: repartition and the company's new departments case
Lecture 179: Advanced Spark Core programming: takeSample and the company annual-party lottery case
Lecture 180: Advanced Spark Core programming: principles of the shuffle operation
Lecture 181: Advanced Spark Core programming: data sorting during the shuffle process
Lecture 182: Advanced Spark Core programming: operators that trigger a shuffle
Lecture 183: Advanced Spark Core programming: the performance cost of shuffle operations in detail
Lecture 184: Advanced Spark Core programming: all shuffle-related parameters and performance tuning
Lecture 185: Advanced Spark Core programming: comprehensive case 1: mobile app access traffic log analysis
Lecture 186: Advanced Spark Core programming: comprehensive case 1: log file format analysis
Lecture 187: Advanced Spark Core programming: comprehensive case 1: reading the log file and creating an RDD
Lecture 188: Advanced Spark Core programming: comprehensive case 1: creating a custom serializable class
Lecture 189: Advanced Spark Core programming: comprehensive case 1: mapping the RDD to key-value format
Lecture 190: Advanced Spark Core programming: comprehensive case 1: aggregation by deviceID
Lecture 191: Advanced Spark Core programming: comprehensive case 1: custom secondary-sort key class
Lecture 192: Advanced Spark Core programming: comprehensive case 1: mapping the secondary-sort key onto the RDD key
Lecture 193: Advanced Spark Core programming: comprehensive case 1: performing the secondary sort and taking the top 10 records
Lecture 194: Advanced Spark Core programming: comprehensive case 1: test run and code debugging
Lecture 195: Advanced Spark Core programming: deploying a second CentOS machine
Lecture 196: Advanced Spark Core programming: deploying a second Hadoop node
Lecture 197: Advanced Spark Core programming: dynamically adding the second Hadoop node to the cluster
Lecture 198: Advanced Spark Core programming: submitting Spark jobs with yarn-client and yarn-cluster
(A short sketch of a few of these operators follows.)
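For a flavor of the operators drilled in this section, here is a minimal sketch using mapPartitionsWithIndex, aggregateByKey (word count), and coalesce. The sample data is made up; this is not the course's case code.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Illustrative operator sketch; data and names are placeholders.
object AdvancedOperatorsSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("AdvancedOperators").setMaster("local[2]")
    val sc = new SparkContext(conf)

    val words = sc.parallelize(Seq("spark", "hadoop", "spark", "kafka", "spark"), numSlices = 4)

    // mapPartitionsWithIndex: tag every element with its partition index
    words.mapPartitionsWithIndex { (index, iter) =>
      iter.map(word => s"partition $index -> $word")
    }.collect().foreach(println)

    // aggregateByKey-based word count: zero value, seqOp within a partition,
    // combOp across partitions
    val counts = words
      .map(word => (word, 1))
      .aggregateByKey(0)(_ + _, _ + _)

    // coalesce: shrink to fewer partitions without a shuffle
    counts.coalesce(1).collect().foreach(println)

    sc.stop()
  }
}
```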
III. Advanced Spark Core Principles
Lecture 199: Spark core principles: internal implementation of the union operator
Lecture 200: Spark core principles: internal implementation of the groupByKey operator
Lecture 201: Spark core principles: internal implementation of the reduceByKey operator
Lecture 202: Spark core principles: internal implementation of the distinct operator
Lecture 203: Spark core principles: internal implementation of the cogroup operator
Lecture 204: Spark core principles: internal implementation of the intersection operator
Lecture 205: Spark core principles: internal implementation of the join operator
Lecture 206: Spark core principles: internal implementation of the sortByKey operator
Lecture 207: Spark core principles: internal implementation of the cartesian operator
Lecture 208: Spark core principles: internal implementation of the coalesce operator
Lecture 209: Spark core principles: internal implementation of the repartition operator
IV. Advanced Spark SQL Hands-On Development
Lecture 210: Spark SQL hands-on development: Hive 0.13 installation and testing
Lecture 211: Spark SQL hands-on development: Thrift JDBC/ODBC server
Lecture 212: Spark SQL hands-on development: using the CLI command line
Lecture 213: Spark SQL hands-on development: comprehensive case 2: offline statistics of key metrics for a news website
Lecture 214: Spark SQL hands-on development: comprehensive case 2: page PV statistics and ranking, and an overview of the enterprise project development process
Lecture 215: Spark SQL hands-on development: comprehensive case 2: page UV statistics and ranking, and a note on the count(distinct) bug
Lecture 216: Spark SQL hands-on development: comprehensive case 2: new-user registration ratio statistics
Lecture 217: Spark SQL hands-on development: comprehensive case 2: user bounce-rate statistics
Lecture 218: Spark SQL hands-on development: comprehensive case 2: section popularity ranking statistics
Lecture 219: Spark SQL hands-on development: comprehensive case 2: testing and debugging
V. Advanced Spark Streaming Hands-On Development
Lecture 220: Spark Streaming hands-on development: Flume installation
Lecture 221: Spark Streaming hands-on development: receiving a real-time Flume data stream: the Flume-style push-based approach
Lecture 222: Spark Streaming hands-on development: receiving a real-time Flume data stream: the poll-based approach with a custom sink
Lecture 223: Spark Streaming hands-on development: advanced technique: custom receivers
Lecture 224: Spark Streaming hands-on development: Kafka installation
Lecture 225: Spark Streaming hands-on development: comprehensive case 3: real-time statistics of key metrics for a news website
Lecture 226: Spark Streaming hands-on development: comprehensive case 3: real-time page PV statistics
Lecture 227: Spark Streaming hands-on development: comprehensive case 3: real-time page UV statistics
Lecture 228: Spark Streaming hands-on development: comprehensive case 3: real-time registered-user statistics
Lecture 229: Spark Streaming hands-on development: comprehensive case 3: real-time user bounce statistics
Lecture 230: Spark Streaming hands-on development: comprehensive case 3: real-time section PV statistics
VI. Advanced Spark Operations and Management
Lecture 231: Spark operations and management: ZooKeeper-based HA and automatic active/standby master switching
Lecture 232: Spark operations and management: experiment: ZooKeeper-based HA and automatic active/standby switching
Lecture 233: Spark operations and management: filesystem-based HA and manual active/standby switching
Lecture 234: Spark operations and management: experiment: filesystem-based HA and manual active/standby switching
Lecture 235: Spark operations and management: job monitoring: experiment: monitoring jobs through the Spark web UI
Lecture 236: Spark operations and management: job monitoring: experiment: viewing the web UI of historical jobs in standalone mode
Lecture 237: Spark operations and management: job monitoring: experiment: starting the HistoryServer to view the web UI of historical jobs
Lecture 238: Spark operations and management: job monitoring: experiment: monitoring jobs with curl and the REST API
Lecture 239: Spark operations and management: job monitoring: experiment: the Spark metrics system and a custom metrics sink
Lecture 240: Spark operations and management: job resource scheduling: static resource allocation principles
Lecture 241: Spark operations and management: job resource scheduling: dynamic resource allocation principles
Lecture 242: Spark operations and management: job resource scheduling: experiment: using dynamic resource allocation in standalone mode
Lecture 243: Spark operations and management: job resource scheduling: experiment: using dynamic resource allocation in YARN mode
Lecture 244: Spark operations and management: job resource scheduling: principles of scheduling resources across multiple jobs
Lecture 245: Spark operations and management: job resource scheduling: using the fair scheduler in detail

Course objectives:
Objective 1: Master the Scala programming language, be able to develop Spark programs in Scala, and be able to read the Spark source code.
Objective 2: Build Hadoop, Spark, Hive, ZooKeeper, and Kafka clusters by hand, from scratch.
Objective 3: Master Spark Core programming and be able to develop a variety of complex offline big data batch programs.
Objective 4: Thoroughly understand the Spark kernel source code, so that when a program fails online you can troubleshoot it, reading the relevant source code from the exception stack trace to resolve the production failure.
Objective 5: Be able to tune common Spark performance problems using a variety of techniques.
Objective 6: Skillfully develop interactive big data query programs with Spark SQL and master common performance optimization techniques.
Objective 7: Skillfully develop real-time big data computing programs with Spark Streaming, understand its principles and source code, and be able to tune its performance.

Course highlights:
Highlight 1: The Spark 1.3.0 / Spark 1.5.1 + Hadoop 2.4.1 combination: in-depth coverage of the epoch-making 1.3.0 release and the latest 1.5.1, keeping the technology at the forefront of the industry.
Highlight 2: Code-driven explanation of every technical point, with live drawings to explain every principle and concept, so you get hands-on practice as well as thorough understanding.
Highlight 3: All function points follow the official outline; every technical point, function point, basic feature, and advanced feature is explained, with full coverage.
Highlight 4: The Scala cases include dozens of interesting examples, and Spark includes several complex cases extracted from real enterprise requirement scenarios.
Highlight 5: Almost all Spark hands-on code and cases are provided in both Java and Scala versions, unique on the web.
Highlight 6: A large number of exclusive advanced knowledge points and techniques, including Spark secondary sort, group-by top N, Spark SQL built-in and window functions, the Spark Streaming driver high-availability solution, and more, unique on the web.
Highlight 7: Live drawings explain the source code, with in-depth analysis of 80% of the core kernel source code and extensive comments on the source, unique on the web.
Highlight 8: Comprehensive coverage of Spark, Spark SQL, and Spark Streaming performance optimization, with live drawings explaining performance tuning and in-depth coverage of shuffle tuning, unique on the web.
(Upgraded) Spark: From Beginner to Proficient (Scala programming, hands-on cases, advanced features, Spark core source code analysis, high-end Hadoop)