Spark Performance Tuning Series catalog:
General Tuning
Performance tuning in real-world projects: allocating more resources
Performance tuning in real-world projects: adjusting parallelism
Performance tuning in real-world projects: using Kryo serialization
Performance tuning in real-world projects: broadcasting large variables
Performance tuning in real-world projects: using fastutil to optimize data formats
Performance tuning in real-world projects: adjusting the data locality wait time
Performance tuning in real-world projects: refactoring the RDD architecture and persisting RDDs
JVM Tuning
JVM tuning: overview of principles and reducing the memory footprint of cache operations
JVM tuning: adjusting executor off-heap memory and connection time
Shuffle Tuning
Shuffle tuning: overview of principles
Shuffle tuning: merging map-side output files
Shuffle tuning: map-side memory buffer and reduce-side memory ratio
Shuffle tuning: HashShuffleManager and SortShuffleManager
Operator Tuning
Operator tuning: using mapPartitions to improve the performance of map-type operations
Operator tuning: using coalesce after filter to reduce the number of partitions
Operator tuning: using foreachPartition to optimize database write performance
Operator tuning: using repartition to address low parallelism in Spark SQL
Operator tuning: introduction to reduceByKey local aggregation