Apache Spark Architecture

Source: Internet
Author: User


1. Driver: Runs the `main()` function of the application and creates the SparkContext.

2. Client: Users submit jobs through the client.

3. Worker: Any node in the cluster that can run application code; each Worker runs one or more Executor processes.

4. Executor: The task executor running on a Worker. An Executor starts a thread pool to run tasks and manages the storage of its data in memory or on disk. Each application requests its own Executors.
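The Executor's thread-pool model can be sketched in plain Python. This is a toy analogy built on the standard library, not Spark's actual JVM implementation; `run_task` and `partitions` are illustrative names:

```python
from concurrent.futures import ThreadPoolExecutor

# Toy stand-in for a task that the TaskScheduler would send to an Executor.
def run_task(partition):
    # Each task processes one partition of the data.
    return sum(partition)

partitions = [[1, 2], [3, 4], [5, 6]]

# Like an Executor, run the tasks concurrently on a pool of worker threads.
with ThreadPoolExecutor(max_workers=2) as pool:
    results = list(pool.map(run_task, partitions))

print(results)  # [3, 7, 11]
```

As in Spark, the unit of parallelism here is the partition: one task per partition, many tasks sharing one pool of threads.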

5. SparkContext: The context of the entire application; it controls the application's life cycle.

6. RDD: Spark's basic unit of computation. A group of RDDs forms a directed acyclic graph (DAG) describing the order of execution, the RDD graph.

7. DAGScheduler: Builds a stage-based DAG from the job and submits each stage to the TaskScheduler.

8. TaskScheduler: Dispatches tasks to Executors for execution.

9. SparkEnv: A thread-level context that stores references to the important components of the runtime.
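Putting the components above together: the driver builds an RDD graph, the DAGScheduler cuts it into stages at shuffle boundaries, and the TaskScheduler dispatches one task per partition to Executor threads. The sketch below is a deliberately simplified pure-Python model of that control flow; every name in it (`ops`, `split_into_stages`, `run_stage`) is illustrative and none is a real Spark API:

```python
from concurrent.futures import ThreadPoolExecutor

# An illustrative job: a chain of operations on an RDD. wide=True marks
# a shuffle dependency, where the DAGScheduler would cut a stage boundary.
ops = [
    {"name": "map",         "wide": False},
    {"name": "filter",      "wide": False},
    {"name": "reduceByKey", "wide": True},   # shuffle -> stage boundary
    {"name": "map",         "wide": False},
]

def split_into_stages(ops):
    """Group consecutive narrow ops; start a new stage after each wide op."""
    stages, current = [], []
    for op in ops:
        current.append(op["name"])
        if op["wide"]:
            stages.append(current)
            current = []
    if current:
        stages.append(current)
    return stages

stages = split_into_stages(ops)
print(stages)  # [['map', 'filter', 'reduceByKey'], ['map']]

# TaskScheduler role: launch one task per partition on executor threads.
def run_stage(stage_ops, num_partitions=3):
    def task(pid):
        return f"{stage_ops} / partition {pid} done"
    with ThreadPoolExecutor(max_workers=num_partitions) as pool:
        return list(pool.map(task, range(num_partitions)))

# Stages run in dependency order, one after the other.
for stage_ops in stages:
    run_stage(stage_ops)
```

The key idea the toy model captures is that narrow dependencies are pipelined inside a stage, while a wide (shuffle) dependency forces a new stage that can only start once the previous one has finished.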
