How the YARN AM communicates with the RM

Last Update: 2014-09-09 Source: Internet

Author: User


The AppMaster requests resources from the RM

The AppMaster sends a heartbeat request to the RM. The heartbeat updates the resource-request structure on the RM and picks up resources that are already sitting in the RM's allocation structure. The actual allocation of containers happens asynchronously in the background and is driven by NM heartbeats.

In MRAppMaster, serviceInit sets up the allocator:

    // service to allocate containers from RM (if non-uber) or to fake it (uber)
    containerAllocator = createContainerAllocator(null, context);
    addIfService(containerAllocator);
    dispatcher.register(ContainerAllocator.EventType.class, containerAllocator);

    protected ContainerAllocator createContainerAllocator(
        final ClientService clientService, final AppContext context) {
      return new ContainerAllocatorRouter(clientService, context);
    }

    private final class ContainerAllocatorRouter extends AbstractService
        implements ContainerAllocator, RMHeartbeatHandler {
      private final ClientService clientService;
      private final AppContext context;
      private ContainerAllocator containerAllocator;
      .....
      @Override
      protected void serviceStart() throws Exception {
        if (job.isUber()) {
          this.containerAllocator = new LocalContainerAllocator(
              this.clientService, this.context, nmHost, nmPort, nmHttpPort, containerID);
        } else {
          this.containerAllocator = new RMContainerAllocator(
              this.clientService, this.context);
        }
        ((Service) this.containerAllocator).init(getConfig());
        ((Service) this.containerAllocator).start();
        super.serviceStart();
      }

The RMContainerAllocator class in org.apache.hadoop.mapreduce.v2.app.rm has this method:

    protected synchronized void heartbeat() throws Exception {
      scheduleStats.updateAndLogIfChanged("Before Scheduling: ");
      // Send a heartbeat to the RM over RPC. The heartbeat may carry no new
      // resource requests: it may only tell the RM that the AM is still alive,
      // or only fetch resources that the RM has already allocated.
      List<Container> allocatedContainers = getResources();
      if (allocatedContainers.size() > 0) {
        // Assign each obtained container to a specific task
        // (presumably reordering them in the process).
        scheduledRequests.assign(allocatedContainers);
      }
      ...
    }

A resource request carries fields such as priority, desired host, and memory size. With the default three-way replication, one task may produce 7 resource requests: 3 node-local, 3 rack-local, and 1 for any node.

The heartbeat thread is started by RMCommunicator, the parent class of RMContainerAllocator:

    protected void startAllocatorThread() {
      allocatorThread = new Thread(new Runnable() {
        @Override
        public void run() {
          while (!stopped.get() && !Thread.currentThread().isInterrupted()) {
            try {
              Thread.sleep(rmPollInterval); // wait one poll interval
              try {
                heartbeat(); // send the heartbeat
              ...

    private List<Container> getResources() throws Exception {
      int headRoom = getAvailableResources() != null
          ? getAvailableResources().getMemory() : 0; // first time it would be null
      AllocateResponse response;
      /*
       * If contact with RM is lost, the AM will wait MR_AM_TO_RM_WAIT_INTERVAL_MS
       * milliseconds before aborting. During this interval, AM will still try
       * to contact the RM.
       */
      try {
        response = makeRemoteRequest(); // the key call
      ...

The key method, makeRemoteRequest, is defined in the parent class RMContainerRequestor:

    protected AllocateResponse makeRemoteRequest() throws IOException {
      ResourceBlacklistRequest blacklistRequest =
          ResourceBlacklistRequest.newInstance(
              new ArrayList<String>(blacklistAdditions),
              new ArrayList<String>(blacklistRemovals));
      AllocateRequest allocateRequest = // build the resource request
          AllocateRequest.newInstance(lastResponseID,
              super.getApplicationProgress(),
              // ask is a collection of ResourceRequest instances. Only a copy
              // is made here; where ask gets filled is traced further down.
              new ArrayList<ResourceRequest>(ask),
              new ArrayList<ContainerId>(release), blacklistRequest);
      AllocateResponse allocateResponse;
      try {
        // The key call that asks for resources. "scheduler" here is not the RM
        // scheduler itself but an ApplicationMasterProtocol proxy, which
        // eventually reaches the RM scheduler.
        allocateResponse = scheduler.allocate(allocateRequest);
      ...

The proxy is created in RMCommunicator:

    protected ApplicationMasterProtocol scheduler;
    ...
    protected void serviceStart() throws Exception {
      scheduler = createSchedulerProxy();
      ..

    protected ApplicationMasterProtocol createSchedulerProxy() {
      final Configuration conf = getConfig();
      try {
        // ApplicationMasterProtocol is the critical protocol: calls on this
        // proxy invoke the methods of ApplicationMasterService remotely.
        return ClientRMProxy.createRMProxy(conf, ApplicationMasterProtocol.class);
      } catch (IOException e) {
        throw new YarnRuntimeException(e);
      }
    }

Next, trace where ask gets filled. Following the callers of the method that adds to ask leads to addContainerReq, which is called from RMContainerAllocator:

    private void addResourceRequestToAsk(ResourceRequest remoteRequest) {
      // because objects inside the resource map can be deleted ask can end up
      // containing an object that matches new resource object but with different
      // numContainers. So existing values must be replaced explicitly
      if (ask.contains(remoteRequest)) {
        ask.remove(remoteRequest);
      }
      ask.add(remoteRequest);
    }

    protected void addContainerReq(ContainerRequest req) {
      // Create resource requests
      for (String host : req.hosts) {
        // Data-local
        if (!isNodeBlacklisted(host)) {
          addResourceRequest(req.priority, host, req.capability);
        }
      }
      // Nothing Rack-local for now
      for (String rack : req.racks) {
        addResourceRequest(req.priority, rack, req.capability);
      }
      // Off-switch
      addResourceRequest(req.priority, ResourceRequest.ANY, req.capability);
    }

The addMap method in RMContainerAllocator:

    void addMap(ContainerRequestEvent event) {
      ContainerRequest request = null;
      if (event.getEarlierAttemptFailed()) {
        earlierFailedMaps.add(event.getAttemptID());
        request = new ContainerRequest(event, PRIORITY_FAST_FAIL_MAP);
        LOG.info("Added " + event.getAttemptID() + " to list of failed maps");
      } else {
        for (String host : event.getHosts()) {
          LinkedList<TaskAttemptId> list = mapsHostMapping.get(host);
          if (list == null) {
            list = new LinkedList<TaskAttemptId>();
            mapsHostMapping.put(host, list);
          }
          list.add(event.getAttemptID());
          if (LOG.isDebugEnabled()) {
            LOG.debug("Added attempt req to host " + host);
          }
        }
        for (String rack : event.getRacks()) {
          LinkedList<TaskAttemptId> list = mapsRackMapping.get(rack);
          if (list == null) {
            list = new LinkedList<TaskAttemptId>();
            mapsRackMapping.put(rack, list);
          }
          list.add(event.getAttemptID());
          if (LOG.isDebugEnabled()) {
            LOG.debug("Added attempt req to rack " + rack);
          }
        }
        request = new ContainerRequest(event, PRIORITY_MAP);
      }
      maps.put(event.getAttemptID(), request);
      addContainerReq(request); // the call into addContainerReq
    }

addMap is in turn called from handleEvent:

    protected synchronized void handleEvent(ContainerAllocatorEvent event) {
      recalculateReduceSchedule = true;
      ..................
      scheduledRequests.addMap(reqEvent); // maps are immediately scheduled

handleEvent is driven by the event-handling thread started in serviceStart:

    protected void serviceStart() throws Exception {
      this.eventHandlingThread = new Thread() {
        @SuppressWarnings("unchecked")
        @Override
        public void run() {
          ContainerAllocatorEvent event;
          while (!stopped.get() && !Thread.currentThread().isInterrupted()) {
            try {
              event = RMContainerAllocator.this.eventQueue.take(); // take an event
            } catch (InterruptedException e) {
              if (!stopped.get()) {
                LOG.error("Returning, interrupted : " + e);
              }
              return;
            }
            try {
              handleEvent(event); // the call into handleEvent
            ...

Events are added in MRAppMaster, and the added events are processed by the method above. The entry point on the MRAppMaster side is:

    public void handle(ContainerAllocatorEvent event) {
      this.containerAllocator.handle(event);
    }
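The way one container request fans out into node-local, rack-local, and any-node entries, and why ask needs an explicit remove before add, can be illustrated with a small standalone sketch. It does not use the Hadoop classes: SketchRequest, AskSketch, countFor, and the "*" constant are hypothetical stand-ins, and the per-location counter is an assumed simplification of the AM's real bookkeeping.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.Set;

/** Hypothetical, simplified stand-in for a YARN ResourceRequest. */
final class SketchRequest {
    final int priority;
    final String location;   // host name, rack name, or "*" for any node
    final int memoryMb;
    final int numContainers;

    SketchRequest(int priority, String location, int memoryMb, int numContainers) {
        this.priority = priority;
        this.location = location;
        this.memoryMb = memoryMb;
        this.numContainers = numContainers;
    }

    // Equality ignores numContainers, so a request with an updated count
    // still "matches" the stale entry already stored in the set.
    @Override
    public boolean equals(Object o) {
        if (!(o instanceof SketchRequest)) {
            return false;
        }
        SketchRequest r = (SketchRequest) o;
        return priority == r.priority && memoryMb == r.memoryMb
                && location.equals(r.location);
    }

    @Override
    public int hashCode() {
        return Objects.hash(priority, location, memoryMb);
    }
}

/** Sketch of the AM-side bookkeeping that fills the ask set. */
final class AskSketch {
    static final String ANY = "*";

    private final Map<SketchRequest, Integer> counts = new HashMap<>();
    private final Set<SketchRequest> ask = new LinkedHashSet<>();

    private void addResourceRequest(int priority, String location, int memoryMb) {
        SketchRequest key = new SketchRequest(priority, location, memoryMb, 0);
        int n = counts.merge(key, 1, Integer::sum);
        SketchRequest updated = new SketchRequest(priority, location, memoryMb, n);
        // Set.add() keeps the old element when an equal one is present,
        // so the stale count has to be removed explicitly first.
        ask.remove(updated);
        ask.add(updated);
    }

    void addContainerReq(int priority, int memoryMb,
                         List<String> hosts, List<String> racks) {
        for (String host : hosts) {
            addResourceRequest(priority, host, memoryMb);   // data-local
        }
        for (String rack : racks) {
            addResourceRequest(priority, rack, memoryMb);   // rack-local
        }
        addResourceRequest(priority, ANY, memoryMb);        // off-switch
    }

    /** Copy of ask, as a heartbeat would send it. */
    List<SketchRequest> snapshot() {
        return new ArrayList<>(ask);
    }

    /** Container count currently requested for the first entry at a location. */
    int countFor(String location) {
        for (SketchRequest r : ask) {
            if (r.location.equals(location)) {
                return r.numContainers;
            }
        }
        return 0;
    }
}
```

In this sketch, a task whose input has replicas on three hosts in one rack yields 5 entries (3 hosts, 1 rack, 1 any-node); replicas spread over three racks would yield the 7 mentioned above. A second task on an already-requested host does not add an entry, it only raises the count of the existing one.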

The RM side accepts the AppMaster heartbeat request

To summarize the AM side: the ApplicationMaster reports its resource needs to the RM through ApplicationMasterProtocol#allocate. On the RM, ApplicationMasterService serves this call and ends up calling the scheduler's allocate, which writes the new resource requests into an in-memory structure and returns the resources that have already been allocated.

    public class ApplicationMasterService extends AbstractService
        implements ApplicationMasterProtocol {

      public AllocateResponse allocate(AllocateRequest request)
          throws YarnException, IOException {
        ..
        // Allow only one thread in AM to do heartbeat at a time.
        synchronized (lastResponse) {
          // Send the status update to the appAttempt.
          this.rmContext.getDispatcher().getEventHandler().handle(
              new RMAppAttemptStatusUpdateEvent(appAttemptId, request.getProgress()));

          // ask and release are the request contents packaged by the AM
          List<ResourceRequest> ask = request.getAskList();
          List<ContainerId> release = request.getReleaseList();

          // Send new requests to appAttempt.
          // This calls the scheduler (rScheduler) on the RM side.
          Allocation allocation = this.rScheduler.allocate(appAttemptId, ask,
              release, blacklistAdditions, blacklistRemovals);
          ..
            allocateResponse.setUpdatedNodes(updatedNodeReports);
          }

          // Package the response and return it
          allocateResponse.setAllocatedContainers(allocation.getContainers());
          allocateResponse.setCompletedContainersStatuses(
              appAttempt.pullJustFinishedContainers());
          allocateResponse.setResponseId(lastResponse.getResponseId() + 1);
          allocateResponse.setAvailableResources(allocation.getResourceLimit());
          allocateResponse.setNumClusterNodes(this.rScheduler.getNumClusterNodes());

          // add preemption to the allocateResponse message (if any)
          allocateResponse.setPreemptionMessage(generatePreemptionMessage(allocation));

          // Adding NMTokens for allocated containers.
          if (!allocation.getContainers().isEmpty()) {
            allocateResponse.setNMTokens(rmContext.getNMTokenSecretManager()
                .createAndGetNMTokens(app.getUser(), appAttemptId,
          ...

The allocate method of the FIFO scheduler:

    ...
    // Update application requests.
    // This writes the resource requests into the application's request
    // structure. When an NM heartbeat later triggers allocation, the result is
    // written into the application's allocation structure. The request
    // structure is:
    //   final Map<Priority, Map<String, ResourceRequest>> requests =
    //       new HashMap<Priority, Map<String, ResourceRequest>>();
    application.updateResourceRequests(ask);
    ...
    return new Allocation(
        // a collection inside the application, read from its allocation structure
        application.pullNewlyAllocatedContainers(),
        application.getHeadroom());

Here application is a FiCaSchedulerApp:

    synchronized public List<Container> pullNewlyAllocatedContainers() {
      List<Container> returnContainerList =
          new ArrayList<Container>(newlyAllocatedContainers.size());
      // Containers are only read from newlyAllocatedContainers. That list is
      // filled with the RMContainers created when assignContainer runs after
      // an NM heartbeat.
      for (RMContainer rmContainer : newlyAllocatedContainers) {
        rmContainer.handle(new RMContainerEvent(
            rmContainer.getContainerId(), RMContainerEventType.ACQUIRED));
        returnContainerList.add(rmContainer.getContainer());
      }
      newlyAllocatedContainers.clear();
      return returnContainerList;
    }

    synchronized public RMContainer allocate(NodeType type, FiCaSchedulerNode node,
        Priority priority, ResourceRequest request, Container container) {
      ....
      // Add it to allContainers list.
      newlyAllocatedContainers.add(rmContainer); // this is where the list is filled

The FIFO scheduler class calls the method above from assignContainer, which is the method that an NM heartbeat ultimately reaches:

    private int assignContainer(FiCaSchedulerNode node, FiCaSchedulerApp application,
        Priority priority, int assignableContainers,
        ResourceRequest request, NodeType type) {
      ....
      // Create the container
      Container container = BuilderUtils.newContainer(containerId, nodeId,
          node.getRMNode().getHttpAddress(), capability, priority, containerToken);

      // Allocate!
      // Inform the application
      RMContainer rmContainer =
          application.allocate(type, node, priority, request, container);

In conclusion: when the AppMaster sends a request to the RM, the resources it gets back are read from the RM's current in-memory structure, so the process is asynchronous. When an NM sends a heartbeat, resources are allocated according to the AppMaster's resource requests and written to the in-memory structure, where the AppMaster picks them up on a later heartbeat. The resource requests sent by the AppMaster therefore have to be saved first; they are kept in the application object, the FiCaSchedulerApp (see FiCaSchedulerApp.showRequests()).
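The asynchronous hand-off described above can be reduced to a toy model: an AM heartbeat only records requests and collects what is already allocated, while an NM heartbeat is what actually produces containers. The sketch below is an illustration under simplifying assumptions, not the scheduler's code: AppStateSketch, nodeHeartbeat, the slot count, and the container naming are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, much-simplified model of the RM's per-application state. */
final class AppStateSketch {
    // Stand-in for the request structure: containers still wanted by the AM.
    private int pendingContainers;
    // Stand-in for the allocation structure: containers assigned but not yet pulled.
    private final List<String> newlyAllocated = new ArrayList<>();
    private int nextId = 1;

    /** AM heartbeat: record new asks, then return whatever was allocated earlier. */
    synchronized List<String> allocate(int newAsks) {
        pendingContainers += newAsks;
        List<String> result = new ArrayList<>(newlyAllocated);
        newlyAllocated.clear();   // each container is handed out only once
        return result;
    }

    /** NM heartbeat: assign up to freeSlots pending containers on that node. */
    synchronized int nodeHeartbeat(String node, int freeSlots) {
        int assigned = Math.min(freeSlots, pendingContainers);
        for (int i = 0; i < assigned; i++) {
            newlyAllocated.add("container_" + (nextId++) + "@" + node);
        }
        pendingContainers -= assigned;
        return assigned;
    }
}
```

In this model the first allocate call that carries a request always comes back empty; containers only show up on an AM heartbeat that follows an NM heartbeat, which is the asynchrony the conclusion describes.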
