It has been about a month since I decided to write a series of articles analyzing the Spark SQL source code, and the series is now nearly complete. Here is a consolidated index to make the articles easier to find, listed in the recommended reading order :)
Article 1: Spark SQL Source Code Analysis: The Core Process
Article 2: Spark SQL Catalyst Source Code Analysis: SqlParser
Article 3: Spark SQL Catalyst Source Code Analysis: Analyzer
Article 4: Spark SQL Catalyst Source Code Analysis: TreeNode Library
Article 5: Spark SQL Catalyst Source Code Analysis: Optimizer
Article 6: Spark SQL Catalyst Source Code Analysis: Physical Plan
Article 7: Spark SQL Source Code Analysis: How the Physical Plan Is Turned into an RDD
Article 8: Spark SQL Catalyst Source Code Analysis: UDF
Article 9: Spark SQL Source Code Analysis: In-Memory Columnar Storage, Cache Table
Article 10: Spark SQL Source Code Analysis: In-Memory Columnar Storage, Query
Article 11: Spark SQL Source Code Analysis: External DataSource
Reading source code is a good habit when learning a framework, and it helps improve your own skills, but the most important part is summarizing what you have learned :)
This is an original article. If you repost it, please credit the source:
Reposted from: OopsOutOfMemory, Shengli's blog. Author: OopsOutOfMemory
Link to this article: http://blog.csdn.net/oopsoom/article/details/38257749
Note: This article is licensed under the Attribution-NonCommercial-NoDerivs 2.5 China (CC BY-NC-ND 2.5 CN) license. You are welcome to repost, share, and comment on it, but please retain the author attribution and the link to the article. For commercial use or other licensing questions, please contact me.
"Spark SQL Source Analysis series articles"