Spark Kernel Secrets 01: Spark kernel core terminology explained

Source: Internet
Author: User
Tags: shuffle

Application:

An application is a user program built on Spark that creates a SparkContext instance. It includes the driver program.


spark-shell is itself an application, because it creates a SparkContext object named sc when it starts.

Job:

A job corresponds to a Spark action. Each action, such as count or saveAsTextFile, triggers one job instance, and that job consists of many tasks computed in parallel.
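The link between actions and jobs can be illustrated with a toy model in plain Python. This is only a sketch and does not use Spark: the class `ToyDataset` and its `jobs_run` counter are hypothetical names invented here. Transformations such as `map` only record work, while each action (`count`, `collect`) triggers exactly one job.

```python
class ToyDataset:
    """Toy stand-in for a distributed dataset (not Spark's API):
    transformations are lazy, and every action runs exactly one job."""

    jobs_run = 0  # hypothetical counter: how many jobs actions have triggered

    def __init__(self, data, funcs=()):
        self._data = list(data)
        self._funcs = tuple(funcs)  # recorded transformations, not yet applied

    def map(self, f):
        # Transformation: record the function and return a new dataset; no job runs.
        return ToyDataset(self._data, self._funcs + (f,))

    def _run_job(self):
        # Called by every action: one job that applies all recorded transformations.
        ToyDataset.jobs_run += 1
        out = []
        for x in self._data:
            for f in self._funcs:
                x = f(x)
            out.append(x)
        return out

    def count(self):
        # Action: triggers a job and returns the number of elements.
        return len(self._run_job())

    def collect(self):
        # Action: triggers a job and returns all elements.
        return self._run_job()


ds = ToyDataset(range(5)).map(lambda x: x * 2).map(lambda x: x + 1)
jobs_before_actions = ToyDataset.jobs_run  # only transformations so far
n = ds.count()                             # first action, first job
values = ds.collect()                      # second action, second job
```

In this model two actions were called, so two jobs ran, and the chained `map` calls on their own ran none.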

Driver Program:

The program that runs the main function and creates the SparkContext instance.

Cluster Manager:

An external service that manages cluster resources. Spark currently supports three kinds of cluster manager: Standalone, YARN, and Mesos. Spark's own Standalone mode meets the resource management needs of most Spark-only environments; YARN and Mesos are generally worth considering only when several computing frameworks share one cluster.
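In practice the cluster manager is selected through the master URL given to the application. The helper below is a hypothetical illustration, not part of Spark: it maps the commonly documented master URL forms to the cluster manager they select. The host names in the usage lines are made up.

```python
def cluster_manager_for(master: str) -> str:
    """Return which cluster manager a Spark master URL selects.

    Hypothetical helper for illustration; it only inspects the URL prefix.
    """
    if master == "local" or master.startswith("local["):
        return "local"        # no cluster manager: everything runs in one JVM
    if master.startswith("spark://"):
        return "standalone"   # Spark's own cluster manager
    if master.startswith("mesos://"):
        return "mesos"
    if master == "yarn" or master.startswith("yarn-"):
        return "yarn"         # older releases also accepted yarn-client / yarn-cluster
    raise ValueError(f"unrecognized master URL: {master!r}")


# Usage with made-up host names:
standalone = cluster_manager_for("spark://master-host:7077")
mesos = cluster_manager_for("mesos://mesos-host:5050")
yarn = cluster_manager_for("yarn")
local = cluster_manager_for("local[*]")
```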

Worker Node:

A node in the cluster that can run application code, equivalent to a slave node in Hadoop.

Executor:

A process started on a worker node on behalf of an application. Tasks are assigned to run inside this process, and it keeps data in memory or on disk. Note that each executor belongs to exactly one application, and that in Standalone mode an application by default has only one executor on each worker node. The application's tasks run concurrently as multiple threads within the executor.
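The idea of one process running many tasks as threads can be sketched with Python's standard thread pool. This is an analogy only, not how Spark is implemented: the pool stands in for one executor process, `max_workers=2` is an assumed core count, and `run_task` is a hypothetical task that sums one split.

```python
from concurrent.futures import ThreadPoolExecutor
import threading


def run_task(split):
    # A toy "task": sum one split and report which thread ran it.
    return sum(split), threading.current_thread().name


# Four splits of data, so four tasks for this toy executor.
splits = [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10]]

# One pool stands in for one executor process; its threads run tasks concurrently.
with ThreadPoolExecutor(max_workers=2, thread_name_prefix="toy-executor") as executor:
    results = list(executor.map(run_task, splits))  # results keep the order of splits

partial_sums = [total for total, _ in results]
thread_names = {name for _, name in results}
```

Four tasks are handled by at most two threads of the same process, which mirrors how an executor with a fixed number of cores works through the tasks assigned to it.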


Task:

A unit of work that the driver sends to an executor. A task typically processes one split of the data, and each split is typically the size of one block.
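As a rough worked example of the "one task per split, one split per block" rule, the number of tasks for a file can be estimated from its size. The function name is hypothetical and the 128 MiB block size is an assumed value; real input formats may compute splits somewhat differently.

```python
def estimated_task_count(file_size_bytes: int, block_size_bytes: int) -> int:
    """Estimate the number of splits (and so tasks), assuming one split per block."""
    if block_size_bytes <= 0:
        raise ValueError("block size must be positive")
    if file_size_bytes <= 0:
        return 0
    # Ceiling division: a final partial block still needs its own task.
    return (file_size_bytes + block_size_bytes - 1) // block_size_bytes


MIB = 1024 * 1024
BLOCK_SIZE = 128 * MIB  # assumed block size for this example

tasks_for_1_gib = estimated_task_count(1024 * MIB, BLOCK_SIZE)
tasks_for_1_gib_plus_1_byte = estimated_task_count(1024 * MIB + 1, BLOCK_SIZE)
```

Under these assumptions a 1 GiB file yields 8 tasks, and one extra byte adds a ninth task for the final partial block.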


Stage:

A job is split into many tasks, and each group of tasks is called a stage, much like the map tasks and reduce tasks in MapReduce. The division works as follows: a stage usually begins by reading external data or shuffle data, and it usually ends either because a shuffle occurs (for example, a reduceByKey operation) or because the whole job finishes, for example by writing data to a storage system such as HDFS.
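The rule that a shuffle closes one stage and opens the next can be sketched in plain Python. This is a simplified model, not Spark's scheduler: it handles only a linear chain of operation names, `split_into_stages` is a hypothetical function, and the set of shuffle operations is an assumption (a join, for instance, does not always shuffle).

```python
# Assumption for this sketch: operations that always introduce a shuffle.
SHUFFLE_OPS = {"reduceByKey", "groupByKey", "sortByKey", "join"}


def split_into_stages(ops):
    """Split a linear chain of operation names into stages.

    A shuffle operation starts a new stage, because that is where
    shuffle data is read; everything before it belongs to the previous stage.
    """
    stages = [[]]
    for op in ops:
        if op in SHUFFLE_OPS and stages[-1]:
            stages.append([])  # shuffle boundary: close the current stage
        stages[-1].append(op)
    return stages


# A word-count style job: read, transform, shuffle by key, then write out.
word_count_ops = ["textFile", "flatMap", "map", "reduceByKey", "saveAsTextFile"]
word_count_stages = split_into_stages(word_count_ops)
```

Here the first stage starts by reading external data and ends at the shuffle, and the second stage starts by reading shuffle data and ends when the job writes its output.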

