Spark RDD iterator: the role of SparkEnv (video notes)

Source: Internet
Author: User
Tags: shuffle, spark, rdd

SparkEnv is Spark's runtime environment object (not an operating-system environment variable).

1. The cache can be obtained from it.

2. It manages and holds the runtime objects for the master, worker, and driver.

3. executorId: the ID may refer to the driver, or to a regular Executor, which actually processes tasks and has an internal thread pool.

4. ActorSystem: named sparkDriver when running on the driver and sparkExecutor when running on an executor.

5. Serializer: the serializer.

6. CacheManager

7. MapOutputTracker: responsible for saving the location information of shuffle map output.

The data produced in a stage is written to the local file system by shuffle write, and the place where it is stored is recorded by the MapOutputTracker.

It works in master/slave mode: the MapOutputTrackerMaster runs on the driver, and the MapOutputTrackerWorker on each worker reports to the master and fetches information from it.
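The master/worker split can be pictured with a small toy model. This is only a sketch in plain Python: the class names, method names, and location strings are hypothetical and are not Spark's MapOutputTracker API, and the worker-side cache is an assumption added for illustration.

```python
# Toy model of the master/worker split described above. These classes are
# illustrative only and are not Spark's MapOutputTracker API.


class ToyMapOutputTrackerMaster:
    """Lives on the driver; holds the authoritative map output locations."""

    def __init__(self):
        # shuffle_id -> {map_id: location string}
        self._locations = {}

    def register_map_output(self, shuffle_id, map_id, location):
        self._locations.setdefault(shuffle_id, {})[map_id] = location

    def get_map_outputs(self, shuffle_id):
        # Return a copy so callers cannot mutate the master's state.
        return dict(self._locations.get(shuffle_id, {}))


class ToyMapOutputTrackerWorker:
    """Lives on a worker; asks the master on a miss and caches the answer."""

    def __init__(self, master):
        self._master = master
        self._cache = {}  # assumption: workers keep a local copy
        self.fetches = 0  # how many times the master was contacted

    def get_map_outputs(self, shuffle_id):
        if shuffle_id not in self._cache:
            self.fetches += 1
            self._cache[shuffle_id] = self._master.get_map_outputs(shuffle_id)
        return self._cache[shuffle_id]


master = ToyMapOutputTrackerMaster()
master.register_map_output(shuffle_id=0, map_id=0, location="host-a:/tmp/shuffle_0_0")
master.register_map_output(shuffle_id=0, map_id=1, location="host-b:/tmp/shuffle_0_1")

worker = ToyMapOutputTrackerWorker(master)
first = worker.get_map_outputs(0)   # fetched from the master
second = worker.get_map_outputs(0)  # served from the worker's cache
```

A reduce task would use the returned locations to know which hosts to read its shuffle input from.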

8. ShuffleManager

Hash-based

Sort-based

It is pluggable and can be extended with other implementations.
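To make "pluggable" concrete, here is a minimal sketch of two interchangeable shuffle strategies behind one lookup table. Everything in it (function names, the registry, the sample records) is hypothetical and does not mirror Spark's ShuffleManager interface. Integer keys are used so that Python's hash() is deterministic.

```python
# Toy sketch of a pluggable shuffle strategy. The names below are
# hypothetical and do not mirror Spark's ShuffleManager interface.


def hash_shuffle(records, num_partitions):
    """Place each (key, value) record in the bucket chosen by hashing its key."""
    buckets = [[] for _ in range(num_partitions)]
    for key, value in records:
        buckets[hash(key) % num_partitions].append((key, value))
    return buckets


def sort_shuffle(records, num_partitions):
    """Sort records by (partition id, key, value), then cut the run into partitions."""
    tagged = sorted((hash(k) % num_partitions, k, v) for k, v in records)
    buckets = [[] for _ in range(num_partitions)]
    for pid, key, value in tagged:
        buckets[pid].append((key, value))
    return buckets


# Registry that makes the strategy pluggable: adding an entry extends it.
SHUFFLE_MANAGERS = {"hash": hash_shuffle, "sort": sort_shuffle}


def run_shuffle(name, records, num_partitions):
    return SHUFFLE_MANAGERS[name](records, num_partitions)


records = [(3, "c"), (1, "a"), (4, "d"), (2, "b"), (1, "z")]
by_hash = run_shuffle("hash", records, 2)
by_sort = run_shuffle("sort", records, 2)
```

Both strategies send every record to the same partition; they differ only in how the records are ordered within each partition.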

9. BroadcastManager: handles broadcasts.

For example:

When doing a join, a small table can be broadcast to the machines on which the large table resides.

Global information can also be broadcast.

Spark broadcasts data to specific executors, whereas in Hadoop MapReduce the configuration information is recorded each time and reloaded by every task.
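The join case can be sketched in plain Python as follows. This is a stand-in that uses no Spark API; the table contents and the helper name join_partition are made up for illustration.

```python
# Toy map-side join: the small table is copied ("broadcast") to every
# partition of the large table, so the large table never moves.
# Plain Python stand-in; no Spark API is used here.

small_table = {1: "books", 2: "games"}  # category_id -> name (small)

# The large table, already split across two hypothetical machines.
large_partitions = [
    [("order-1", 1), ("order-2", 2)],
    [("order-3", 1), ("order-4", 3)],
]


def join_partition(partition, broadcast_copy):
    """Join one partition locally against its copy of the small table."""
    return [
        (order_id, broadcast_copy[cat_id])
        for order_id, cat_id in partition
        if cat_id in broadcast_copy  # inner join: drop unmatched rows
    ]


# Each "machine" receives its own copy of the small table.
joined = [join_partition(p, dict(small_table)) for p in large_partitions]
```

Because each partition joins against a local copy, no rows of the large table need to be shuffled between machines.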

10. BlockTransferService

It reads shuffle data. Because data sizes differ, different transfer modes are used for different amounts of data. The Netty approach is also NIO-based.

11. BlockManager

Manages memory, disk, and so on; it manages the storage module itself.

12. SecurityManager: the security module.

13. HttpFileServer

A server that provides an HTTP service from which executors download the dependencies and jar packages needed for execution.

14. MetricsSystem

Used to collect statistics.

This includes the status of executors as well as the status of tasks.

It serves monitoring tools.

15. ShuffleMemoryManager

It manages memory during the execution of a shuffle.

It handles requests for, and allocation of, the memory used by shuffle.

Assuming N threads, each thread is guaranteed to obtain at least 1/(2N) of the memory and can request at most 1/N of it.

N changes dynamically as the number of threads changes.
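The 1/(2N) and 1/N bounds can be illustrated with a small non-blocking model. This is a sketch under assumptions, not Spark's ShuffleMemoryManager: the class and method names are hypothetical, and where a real manager would make a thread wait, this toy simply returns None.

```python
# Toy model of the 1/(2N) floor and 1/N cap described above.
# Hypothetical names; a real manager would block instead of returning None.


class ToyShuffleMemoryPool:
    def __init__(self, max_memory):
        self.max_memory = max_memory
        self.used = {}  # thread id -> bytes currently held

    def bounds(self):
        """Return (floor, cap) per thread for the current thread count N."""
        n = max(len(self.used), 1)
        return self.max_memory // (2 * n), self.max_memory // n

    def try_acquire(self, thread_id, num_bytes):
        self.used.setdefault(thread_id, 0)  # a new thread changes N
        floor, cap = self.bounds()
        current = self.used[thread_id]
        free = self.max_memory - sum(self.used.values())
        max_grant = min(num_bytes, max(0, cap - current))
        # Below the floor and not enough free memory to reach it (or to
        # satisfy the request): the caller has to wait for others to release.
        if current < floor and free < min(max_grant, floor - current):
            return None
        granted = min(max_grant, free)
        self.used[thread_id] += granted
        return granted

    def release(self, thread_id, num_bytes):
        self.used[thread_id] -= min(num_bytes, self.used[thread_id])


pool = ToyShuffleMemoryPool(1200)
first = pool.try_acquire("t1", 1000)          # N=1: cap is 1200
blocked = pool.try_acquire("t2", 400)         # N=2: floor 300, only 200 free
pool.release("t1", 500)                       # t1 gives memory back (e.g. spills)
after_release = pool.try_acquire("t2", 400)   # now enough free memory
capped = pool.try_acquire("t2", 400)          # t2 is limited by the cap of 600
```

In this run the second thread cannot reach its floor until the first one releases memory, and afterwards it is held to the 1/N cap.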

16. SparkEnv is created together with SparkContext.
