[Interactive Q&A sharing] Stage 1 of the Spark Asia Pacific Research Institute public welfare lecture hall, "Winning the Cloud Computing Big Data Era"


 

"Winning the Cloud Computing Big Data Era"

Spark Asia Pacific Research Institute public welfare lecture hall, Stage 1 [Stage 1 interactive Q&A sharing]

 

Q1: For Spark shuffle, can SPARK_LOCAL_DIRS be pointed at a solid-state drive to speed up execution?

  • Yes. Shuffle output is written to the directories listed in SPARK_LOCAL_DIRS, so pointing it at a solid-state drive can noticeably speed up shuffle-heavy jobs;

  • To go faster still, specify multiple shuffle output directories (a comma-separated list, ideally on separate disks) so that shuffle reads and writes are spread across the disks in parallel;
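
As a rough illustration of the two points above, the sketch below builds the comma-separated directory list and shows how files might spread across the directories. The mount points, the file naming, and the `pick_dir` helper are all hypothetical, and the hash-based placement is a simplified model rather than Spark's own code.

```python
import zlib


def local_dir_setting(dirs):
    """Build the comma-separated value used for SPARK_LOCAL_DIRS / spark.local.dir."""
    if not dirs:
        raise ValueError("at least one directory is required")
    return ",".join(dirs)


def pick_dir(filename, dirs):
    """Simplified model: hash the file name to choose one of the directories."""
    return dirs[zlib.crc32(filename.encode("utf-8")) % len(dirs)]


# Hypothetical SSD mount points; replace them with the real ones on your nodes.
ssd_dirs = ["/mnt/ssd1/spark", "/mnt/ssd2/spark", "/mnt/ssd3/spark"]
setting = local_dir_setting(ssd_dirs)
print("spark.local.dir =", setting)

# Count how many illustrative shuffle file names land in each directory.
counts = {d: 0 for d in ssd_dirs}
for map_id in range(300):
    for reduce_id in range(10):
        counts[pick_dir(f"shuffle_0_{map_id}_{reduce_id}", ssd_dirs)] += 1
print(counts)
```

The resulting string can be exported as SPARK_LOCAL_DIRS on each worker or set as the `spark.local.dir` property, for example with `SparkConf().set("spark.local.dir", setting)`. On some cluster managers, the manager's own local-directory setting takes precedence over this property.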


Q2: With consolidation = true (spark.shuffle.consolidateFiles), merging only happens on the same machine, right?

  • Yes, with consolidation = true the shuffle files are merged only within the same machine;

  • When merging, the buckets belonging to the same reducer are put into the same file, which greatly reduces the number of shuffle files and improves performance;
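
To give a feel for how much consolidation can reduce the file count, here is a back-of-the-envelope sketch. It assumes the commonly described model: without consolidation every map task writes one file per reducer, and with consolidation each core on a machine keeps one file per reducer that successive map tasks reuse. The function name and the cluster numbers are made up for illustration.

```python
def shuffle_file_counts(map_tasks, reducers, nodes, cores_per_node):
    """Estimate the cluster-wide number of shuffle files under the assumed model."""
    # Without consolidation: one file per (map task, reducer) pair.
    without = map_tasks * reducers
    # With consolidation: one file per (core, reducer) pair on every machine.
    consolidated = nodes * cores_per_node * reducers
    return without, consolidated


# Hypothetical job: 1000 map tasks, 200 reducers, 10 machines with 8 cores each.
without, consolidated = shuffle_file_counts(1000, 200, 10, 8)
print("without consolidation:", without)    # 200000
print("with consolidation:", consolidated)  # 16000
```

Under these assumptions the saving grows with the number of map tasks per core, which is why jobs with many small map tasks tend to benefit the most.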


Q3: Will Spark and Hadoop coexist in the future?

  • Yes. Spark and Hadoop will coexist; Spark + Hadoop is a winning combination;

  • In this coexistence, Hadoop mainly provides data storage through HDFS, while Spark handles integrated and diversified big data computing;


This article is from the Spark Asia Pacific Research Institute blog. Please be sure to keep this source: http://rockyspark.blog.51cto.com/2229525/1565214
