Big Data IMF-L38: MapReduce internals explained, lecture notes and summary

Source: Internet
Author: User
Tags: hadoop, mapreduce

Morning Course: 6:00am

Hadoop MapReduce internals explained:

  1. MapReduce (MR) architecture explained

  2. Hands-on MR programming in Java

"Accompanying notes":

One: YARN-based MapReduce architecture

1. A MapReduce program is implemented in two phases, mapper and reducer. The mapper phase decomposes a computation into many small tasks that run in parallel; the reducer phase performs the final aggregation.
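To make the two phases concrete, here is a minimal word-count sketch in plain Java. It uses no Hadoop classes; the class name WordCountSketch and its methods are hypothetical, and a parallel stream stands in for the cluster running many small map tasks at once.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Plain-Java illustration of the two phases; no Hadoop classes are used.
class WordCountSketch {

    // Map phase: one small task per input line, emitting (word, 1) pairs.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                pairs.add(Map.entry(word, 1));
            }
        }
        return pairs;
    }

    // Reduce phase: sum the counts emitted for each word.
    static Map<String, Integer> reduce(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> totals = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            totals.merge(pair.getKey(), pair.getValue(), Integer::sum);
        }
        return totals;
    }

    // Runs the map tasks in parallel, then reduces their combined output.
    static Map<String, Integer> run(List<String> lines) {
        List<Map.Entry<String, Integer>> all = lines.parallelStream()
                .map(WordCountSketch::map)
                .flatMap(List::stream)
                .collect(Collectors.toList());
        return reduce(all);
    }

    public static void main(String[] args) {
        // prints {hadoop=2, mapreduce=1, yarn=1}
        System.out.println(run(List.of("hadoop yarn", "hadoop mapreduce")));
    }
}
```

In a real job the map and reduce functions would extend Hadoop's Mapper and Reducer classes, and the framework would handle the grouping of pairs by key between the two phases.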

2. Hadoop is YARN-based starting with version 2.x (the 1.x versions do not involve YARN).

In YARN, the ResourceManager (RM) manages all resources of the cluster (such as memory and CPU). On every node runs a NodeManager, a JVM process that receives requests and packages that node's resources as containers.

[Figure: 1.png, http://s1.51cto.com/wyfs02/M01/7A/D2/wKioL1a6lC-g0w4xAAC1Z_xFRSY653.png]

3. When the ResourceManager receives a program submitted by a client, it chooses a node based on the state of the cluster's resources and instructs the NodeManager on that node to start the program's first container. This container hosts the program's ApplicationMaster, which is responsible for scheduling the program's tasks. The ApplicationMaster then registers itself with the ResourceManager and, once registered, requests the specific container resources it needs from the ResourceManager.

[Figure: 2.png, http://s3.51cto.com/wyfs02/M01/7A/D2/wKioL1a6lDCQu0E_AACRhOZXAaI915.png]
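The startup sequence in point 3 can be sketched as a toy model. Everything here is hypothetical and is not the real YARN API: the class names, the memory-only resource model, and the "node with the most free memory" placement policy are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the startup sequence; hypothetical classes, not the YARN API.
class YarnStartupSketch {

    // A NodeManager tracks its node's free memory and hands it out as containers.
    static class NodeManager {
        final String host;
        int freeMemoryMb;

        NodeManager(String host, int freeMemoryMb) {
            this.host = host;
            this.freeMemoryMb = freeMemoryMb;
        }

        // Returns a container label, or null if the node lacks the memory.
        String startContainer(String purpose, int memoryMb) {
            if (memoryMb > freeMemoryMb) {
                return null;
            }
            freeMemoryMb -= memoryMb;
            return purpose + "@" + host;
        }
    }

    static class ResourceManager {
        final List<NodeManager> nodes;
        final List<String> events = new ArrayList<>();

        ResourceManager(List<NodeManager> nodes) {
            this.nodes = nodes;
        }

        // Assumed policy: place the container on the node with the most free memory.
        String allocate(String purpose, int memoryMb) {
            NodeManager best = null;
            for (NodeManager node : nodes) {
                if (best == null || node.freeMemoryMb > best.freeMemoryMb) {
                    best = node;
                }
            }
            String container = best == null ? null : best.startContainer(purpose, memoryMb);
            if (container != null) {
                events.add("container " + container);
            }
            return container;
        }

        // Returns the task containers granted to the application.
        List<String> submit(String app, int amMemoryMb, int tasks, int taskMemoryMb) {
            // Step 1: the first container hosts the ApplicationMaster.
            if (allocate(app + "-AM", amMemoryMb) == null) {
                throw new IllegalStateException("no node can host the ApplicationMaster");
            }
            // Step 2: the ApplicationMaster registers itself with the ResourceManager.
            events.add("registered " + app + "-AM");
            // Step 3: the ApplicationMaster requests containers for its tasks.
            List<String> granted = new ArrayList<>();
            for (int i = 0; i < tasks; i++) {
                String container = allocate(app + "-task" + i, taskMemoryMb);
                if (container != null) {
                    granted.add(container);
                }
            }
            return granted;
        }
    }

    public static void main(String[] args) {
        ResourceManager rm = new ResourceManager(List.of(
                new NodeManager("node1", 4096), new NodeManager("node2", 2048)));
        // prints [wordcount-task0@node1, wordcount-task1@node1, wordcount-task2@node2]
        System.out.println(rm.submit("wordcount", 1024, 3, 1024));
        rm.events.forEach(System.out::println);
    }
}
```

The real ResourceManager schedules on more than memory and uses pluggable schedulers; the sketch only preserves the order of events described above.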


4. How many containers does the ApplicationMaster actually need for a program?

The program's main method defines the data input and the related configuration; from this information the ApplicationMaster can determine how many containers are needed.

(A container is a unit of computing resources. Based on the client's request, the cluster analyzes the computation job, and the result of that analysis includes the container resources required.)

From the input defined in the main method, the ApplicationMaster knows how many splits (shards) the job has. Each split corresponds to one container, and additional resources are then allocated for other work, such as the shuffle.
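As a rough worked example of this calculation, the sketch below counts splits for a single input file and derives a container total. The split-size rule and the 1.1 slack factor follow Hadoop's FileInputFormat; the container formula (one per split, one per reducer, one for the ApplicationMaster) and the class name ContainerEstimateSketch are assumptions for illustration, and an empty file is simply counted as zero splits here.

```java
// Estimates container demand from input size; a sketch, not Hadoop's own code.
class ContainerEstimateSketch {

    // Slack factor: the last split may be up to 10% larger than the split size.
    static final double SPLIT_SLOP = 1.1;

    // Split size rule: max(minSize, min(maxSize, blockSize)).
    static long splitSize(long blockSize, long minSize, long maxSize) {
        return Math.max(minSize, Math.min(maxSize, blockSize));
    }

    // Number of splits for one file of the given length in bytes.
    static int countSplits(long fileLength, long splitSize) {
        int splits = 0;
        long remaining = fileLength;
        while ((double) remaining / splitSize > SPLIT_SLOP) {
            splits++;
            remaining -= splitSize;
        }
        if (remaining > 0) {
            splits++; // the tail becomes the final split
        }
        return splits;
    }

    // Assumption: one container per split (map task), one per reducer,
    // plus one for the ApplicationMaster itself.
    static int containersNeeded(int splits, int reducers) {
        return 1 + splits + reducers;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long size = splitSize(128 * mb, 1, Long.MAX_VALUE);
        int splits = countSplits(300 * mb, size);
        System.out.println(splits);                      // prints 3
        System.out.println(containersNeeded(splits, 1)); // prints 5
    }
}
```

Note the effect of the slack factor: a 260 MB file with 128 MB splits yields 2 splits rather than 3, because the 132 MB remainder is within 10% of the split size.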


5. Summary of MapReduce running on YARN


Master-slave structure:

Master node, only one: ResourceManager

Control node, one per job: MRAppMaster

Slave nodes, many of them: YarnChild

The ResourceManager is responsible for:

Receiving computation jobs submitted by clients

Handing each job to an MRAppMaster for execution

Monitoring the execution status of the MRAppMaster

The MRAppMaster is responsible for:

Scheduling the tasks of one job

Distributing the job's tasks to YarnChild processes for execution

Monitoring the execution status of the YarnChild processes

A YarnChild is responsible for:

Executing the computation tasks assigned by the MRAppMaster
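The division of responsibilities above can be sketched in a few lines of Java. The three classes only borrow the names of the real components and are hypothetical; a thread pool stands in for the slave nodes, and the rule that an empty task name fails is an assumption added so that monitoring has something to report.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Toy model of the three roles; hypothetical classes, not Hadoop's own.
class MasterSlaveSketch {

    // Slave: executes one task assigned by the MRAppMaster.
    static class YarnChild {
        String execute(String task) {
            if (task.trim().isEmpty()) {
                throw new IllegalArgumentException("empty task"); // assumed failure case
            }
            return task + ":DONE";
        }
    }

    // Per-job controller: distributes the job's tasks and monitors each one.
    static class MRAppMaster {
        Map<String, String> runJob(List<String> tasks) {
            ExecutorService pool = Executors.newFixedThreadPool(2);
            Map<String, String> status = new LinkedHashMap<>();
            try {
                List<Future<String>> futures = new ArrayList<>();
                for (String task : tasks) {
                    Callable<String> work = () -> new YarnChild().execute(task);
                    futures.add(pool.submit(work));
                }
                for (int i = 0; i < tasks.size(); i++) {
                    String state;
                    try {
                        state = futures.get(i).get().endsWith(":DONE") ? "SUCCEEDED" : "FAILED";
                    } catch (Exception e) {
                        state = "FAILED"; // a failed or interrupted task is reported as FAILED
                    }
                    status.put(tasks.get(i), state);
                }
            } finally {
                pool.shutdown();
            }
            return status;
        }
    }

    // Cluster master: accepts the job, hands it to an MRAppMaster, reports the outcome.
    static class ResourceManager {
        String submit(List<String> tasks) {
            Map<String, String> status = new MRAppMaster().runJob(tasks);
            boolean ok = status.values().stream().allMatch("SUCCEEDED"::equals);
            return ok ? "JOB SUCCEEDED" : "JOB FAILED";
        }
    }

    public static void main(String[] args) {
        // prints JOB SUCCEEDED
        System.out.println(new ResourceManager().submit(List.of("map-0", "map-1", "reduce-0")));
    }
}
```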

6. The MRAppMaster in Hadoop MapReduce is equivalent to the Driver in Spark; the YarnChild processes in Hadoop MapReduce correspond to CoarseGrainedExecutorBackend in Spark.

(Compared with Spark, Hadoop MapReduce incurs considerable resource overhead.)





This article is from the "In the Cloud" blog; please retain this source link: http://ymzhang.blog.51cto.com/3126300/1741452
