Listeners initialize the job, the JobTracker answers TaskTracker heartbeats, and the scheduler assigns tasks: a source-level analysis

Last Update:2017-02-27 Source: Internet

Author: User

Tags: log, split, thread

After the JobTracker and the TaskTrackers have started (see the source-level analyses of the JobTracker startup process and the TaskTracker startup process), each TaskTracker communicates with the JobTracker through heartbeats and receives the tasks assigned to it. When a user submits a job, the JobTracker places it in the corresponding data structures, where it waits to be scheduled. The article "MapReduce job submission process: source-level analysis (III)" covered the final step of job submission, which constructs the JobInProgress for the job, adds it to jobs, and notifies every JobInProgressListener.

The default scheduler creates two listeners, JobQueueJobInProgressListener and EagerTaskInitializationListener. Each user-submitted job is wrapped in a JobInProgress object, and that object is passed to both listeners.
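The notification step can be pictured with a minimal observer sketch. It is not Hadoop code: Job, JobListener, and Tracker are hypothetical stand-ins for JobInProgress, JobInProgressListener, and the JobTracker, and the two lambdas only record which listener saw the job.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for Hadoop's JobInProgress; only the ID is modeled.
final class Job {
    final String id;
    Job(String id) { this.id = id; }
}

// Plays the role of JobInProgressListener: a callback fired on submission.
interface JobListener {
    void jobAdded(Job job);
}

// Hypothetical stand-in for the JobTracker's submission path.
final class Tracker {
    private final List<JobListener> listeners = new ArrayList<>();

    void addListener(JobListener listener) {
        listeners.add(listener);
    }

    // Every registered listener is told about every submitted job.
    void submit(Job job) {
        for (JobListener listener : listeners) {
            listener.jobAdded(job);
        }
    }
}

final class ListenerDemo {
    // Registers two listeners, submits one job, and returns what they saw.
    static List<String> run() {
        List<String> events = new ArrayList<>();
        Tracker tracker = new Tracker();
        tracker.addListener(job -> events.add("queue:" + job.id)); // queueing listener
        tracker.addListener(job -> events.add("init:" + job.id));  // initializing listener
        tracker.submit(new Job("job_0001"));
        return events;
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints [queue:job_0001, init:job_0001]
    }
}
```

The point of the pattern is that the submission path does not need to know what each listener does with the job; one keeps it for scheduling, the other starts initialization.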

First, JobQueueJobInProgressListener.jobAdded(job) puts the JobInProgress into jobQueue, a Map<JobSchedulingInfo, JobInProgress>.
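A keyed, ordered map of this kind might look like the sketch below. The ordering is an assumption: the article only states the map's type, so the comparator here (priority, then start time, then job ID) is illustrative, and SchedulingKey is a hypothetical simplification of JobSchedulingInfo.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical key modeled loosely on JobSchedulingInfo.
final class SchedulingKey {
    final int priority;   // assumption: a smaller number means more urgent
    final long startTime; // submission time in milliseconds
    final String jobId;

    SchedulingKey(int priority, long startTime, String jobId) {
        this.priority = priority;
        this.startTime = startTime;
        this.jobId = jobId;
    }
}

final class JobQueueDemo {
    // Assumed ordering: priority first, then start time, then job ID as a tie-breaker.
    static final Comparator<SchedulingKey> ORDER =
        Comparator.comparingInt((SchedulingKey k) -> k.priority)
                  .thenComparingLong(k -> k.startTime)
                  .thenComparing(k -> k.jobId);

    // Inserts three jobs out of order and returns their IDs in queue order.
    static List<String> order() {
        Map<SchedulingKey, String> jobQueue = new TreeMap<>(ORDER);
        jobQueue.put(new SchedulingKey(2, 100L, "job_0001"), "job_0001");
        jobQueue.put(new SchedulingKey(1, 300L, "job_0003"), "job_0003");
        jobQueue.put(new SchedulingKey(2, 50L, "job_0002"), "job_0002");
        return new ArrayList<>(jobQueue.values());
    }

    public static void main(String[] args) {
        System.out.println(order()); // prints [job_0003, job_0002, job_0001]
    }
}
```

Keeping the ordering in the map's comparator means a scheduler can simply iterate over the map's values to see jobs in the order they should be considered.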

Second, EagerTaskInitializationListener.jobAdded(job) puts the JobInProgress into jobInitQueue, a List<JobInProgress>, calls resortInitQueue() to sort the list by priority and then by start time, and finally calls jobInitQueue.notifyAll() to wake every thread waiting on that object's monitor. EagerTaskInitializationListener.start() has already run by this point, because it is called when the scheduler starts. It creates a thread from JobInitManager, which implements Runnable. The run() method of JobInitManager watches jobInitQueue: when the list is not empty, it takes the first JobInProgress, wraps it in an InitJob (which also implements Runnable), and hands that to the ExecutorService threadPool, which is created in the EagerTaskInitializationListener constructor. The run() method of InitJob is a single statement, ttm.initJob(job). This invokes JobTracker.initJob(job), which initializes the JIP (JobInProgress) by calling JobInProgress.initTasks(). The code of initTasks() is as follows:

  /**
   * Construct the splits, etc.  This is invoked from an async
   * thread so that split-computation doesn't block anyone.
   */
  // Tasks come in two types, MapTask and ReduceTask; both are managed
  // through TaskInProgress objects.
  public synchronized void initTasks()
      throws IOException, KillInterruptedException, UnknownHostException {
    if (tasksInited || isComplete()) {
      return;
    }
    synchronized (jobInitKillStatus) {
      if (jobInitKillStatus.killed || jobInitKillStatus.initStarted) {
        return;
      }
      jobInitKillStatus.initStarted = true;
    }

    LOG.info("Initializing " + jobId);
    final long startTimeFinal = this.startTime;
    // log job info as the user running the job
    try {
      userUGI.doAs(new PrivilegedExceptionAction<Object>() {
        @Override
        public Object run() throws Exception {
          JobHistory.JobInfo.logSubmitted(getJobID(), conf, jobFile,
              startTimeFinal, hasRestarted());
          return null;
        }
      });
    } catch (InterruptedException ie) {
      throw new IOException(ie);
    }

    // log the job priority
    setPriority(this.priority);

    // generate security keys needed by Tasks
    generateAndStoreTokens();

    // read input splits and create a map per a split
    TaskSplitMetaInfo[] splits = createSplits(jobId);
    if (numMapTasks != splits.length) {
      throw new IOException("Number of maps in JobConf doesn't match number of " +
          "recieved splits for job " + jobId + "! " +
          "numMapTasks=" + numMapTasks + ", #splits=" + splits.length);
    }
    // The number of map tasks equals the number of input splits.
    numMapTasks = splits.length;

    // Sanity check the locations so we don't create/initialize unnecessary tasks
    for (TaskSplitMetaInfo split : splits) {
      NetUtils.verifyHostnames(split.getLocations());
    }

    jobtracker.getInstrumentation().addWaitingMaps(getJobID(), numMapTasks);
    jobtracker.getInstrumentation().addWaitingReduces(getJobID(), numReduceTasks);
    this.queueMetrics.addWaitingMaps(getJobID(), numMapTasks);
    this.queueMetrics.addWaitingReduces(getJobID(), numReduceTasks);

    // Create one TaskInProgress per map task; each handles one input split.
    maps = new TaskInProgress[numMapTasks];
    for (int i = 0; i < numMapTasks; ++i) {
      inputLength += splits[i].getInputDataLength();
      // This constructor (it takes a split) creates a map task.
      maps[i] = new TaskInProgress(jobId, jobFile, splits[i],
                                   jobtracker, conf, this, i, numSlotsPerMap);
    }
    LOG.info("Input size for job " + jobId + " = " + inputLength
        + ". Number of splits = " + splits.length);

    // Set localityWaitFactor before creating cache
    localityWaitFactor =
        conf.getFloat(LOCALITY_WAIT_FACTOR, DEFAULT_LOCALITY_WAIT_FACTOR);
    /* Map tasks are put into nonRunningMapCache, a
     * Map<Node, List<TaskInProgress>>. A map task is assigned to the node
     * on which its input split is located; a Node here is a DataNode, a
     * rack, or a data center. The JobTracker uses nonRunningMapCache when
     * it assigns map tasks to TaskTrackers.
     */
    if (numMapTasks > 0) {
      // createCache() builds nonRunningMapCache for the TaskInProgress
      // objects that have not run yet. When a TaskTracker on a slave sends
      // a heartbeat to the master, a task can be taken from this cache.
      nonRunningMapCache = createCache(splits, maxLevel);
    }

    // set the launch time
    this.launchTime = jobtracker.getClock().getTime();

    // Create reduce tasks.
    // Next, JobInProgress creates the monitoring objects for the reduce
    // tasks. This is simpler: the count comes from the number of reduces
    // specified in the JobConf, and by default only 1 reduce task is created.
    // Reduce tasks are also monitored and scheduled through TaskInProgress,
    // but through a different constructor; TaskInProgress creates a concrete
    // MapTask or ReduceTask depending on its parameters. The reduce tasks
    // that have not run yet are recorded in nonRunningReduces.
    this.reduces = new TaskInProgress[numReduceTasks];
    for (int i = 0; i < numReduceTasks; i++) {
      // This constructor (no split) creates a reduce task.
      reduces[i] = new TaskInProgress(jobId, jobFile, numMapTasks, i,
                                      jobtracker, conf, this, numSlotsPerReduce);
      /* The JobTracker uses nonRunningReduces when it assigns reduce tasks
       * to TaskTrackers. */
      nonRunningReduces.add(reduces[i]);
    }

    // Calculate the minimum number of maps to be complete before
    // we should start scheduling reduces
    completedMapsForReduceSlowstart =
        (int) Math.ceil(
            (conf.getFloat("mapred.reduce.slowstart.completed.maps",
                           DEFAULT_COMPLETED_MAPS_PERCENT_FOR_REDUCE_SLOWSTART) *
             numMapTasks));

    // ... use the same for estimating the total output of all maps
    resourceEstimator.setThreshhold(completedMapsForReduceSlowstart);

    // create two cleanup tips, one map and one reduce:
    // one cleans up after the maps, the other after the reduces.
    cleanup = new TaskInProgress[2];

    // cleanup map tip. This map doesn't use any splits. Just assign an empty
    // split.
    TaskSplitMetaInfo emptySplit = JobSplit.EMPTY_TASK_SPLIT;
    cleanup[0] = new TaskInProgress(jobId, jobFile, emptySplit,
                                    jobtracker, conf, this, numMapTasks, 1);
    cleanup[0].setJobCleanupTask();

    // cleanup reduce tip.
    cleanup[1] = new TaskInProgress(jobId, jobFile, numMapTasks,
                                    numReduceTasks, jobtracker, conf, this, 1);
    cleanup[1].setJobCleanupTask();

    // create two setup tips, one map and one reduce:
    // one sets up the map side, the other the reduce side.
    setup = new TaskInProgress[2];

    // setup map tip. This map doesn't use any split. Just assign an empty
    // split.
    setup[0] = new TaskInProgress(jobId, jobFile, emptySplit,
                                  jobtracker, conf, this, numMapTasks + 1, 1);
    setup[0].setJobSetupTask();

    // setup reduce tip.
    setup[1] = new TaskInProgress(jobId, jobFile, numMapTasks,
                                  numReduceTasks + 1, jobtracker, conf, this, 1);
    setup[1].setJobSetupTask();

    synchronized (jobInitKillStatus) {
      jobInitKillStatus.initDone = true;
      if (jobInitKillStatus.killed) {
        throw new KillInterruptedException("Job " + jobId + " killed in init");
      }
    }

    // With the TaskInProgress objects created, JobInProgress marks the job
    // as initialized and calls JobHistory.JobInfo.logInited() to record it
    // in the job history log.
    tasksInited = true;
    JobHistory.JobInfo.logInited(profile.getJobID(), this.launchTime,
                                 numMapTasks, numReduceTasks);

    // Log the number of map and reduce tasks
    LOG.info("Job " + jobId + " initialized successfully with " + numMapTasks
             + " map tasks and " + numReduceTasks + " reduce tasks.");
  }
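Stepping back from initTasks() itself, the path that leads to it (jobAdded() queues the job and calls notifyAll(), a manager thread takes the head of the queue and hands an init task to a thread pool) can be sketched as follows. This is a simplified model rather than the Hadoop classes: jobs are plain strings, sorting by priority is omitted, "initializing" a job only records its ID, and the class name, pool size, and latch are all assumptions made so the example can run and stop on its own.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical, simplified model of the eager-initialization hand-off.
final class EagerInitSketch {
    private final List<String> jobInitQueue = new ArrayList<>();
    private final ExecutorService threadPool = Executors.newFixedThreadPool(2);
    private final List<String> initialized = new CopyOnWriteArrayList<>();
    private final CountDownLatch done; // lets the demo wait for all jobs
    private final Thread manager = new Thread(this::managerLoop, "jobInitManager");

    EagerInitSketch(int expectedJobs) {
        this.done = new CountDownLatch(expectedJobs);
    }

    void start() {
        manager.setDaemon(true);
        manager.start();
    }

    // Listener callback: enqueue the job and wake the waiting manager thread.
    void jobAdded(String jobId) {
        synchronized (jobInitQueue) {
            jobInitQueue.add(jobId);
            jobInitQueue.notifyAll();
        }
    }

    // Waits on the queue's monitor until a job is available, then removes it.
    private String takeNext() throws InterruptedException {
        synchronized (jobInitQueue) {
            while (jobInitQueue.isEmpty()) {
                jobInitQueue.wait();
            }
            return jobInitQueue.remove(0);
        }
    }

    // Manager loop: take the head job and submit an init task to the pool.
    private void managerLoop() {
        try {
            while (true) {
                final String job = takeNext();
                threadPool.execute(() -> {
                    initialized.add(job); // stands in for the real initialization
                    done.countDown();
                });
            }
        } catch (InterruptedException e) {
            // Interrupted by awaitAndStop(): leave the loop.
        }
    }

    // Waits for the expected jobs, shuts down, and returns the sorted job IDs.
    List<String> awaitAndStop() {
        try {
            boolean finished = done.await(5, TimeUnit.SECONDS);
            manager.interrupt();
            threadPool.shutdown();
            threadPool.awaitTermination(5, TimeUnit.SECONDS);
            if (!finished) {
                throw new IllegalStateException("jobs were not initialized in time");
            }
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        List<String> result = new ArrayList<>(initialized);
        Collections.sort(result); // pool threads may finish in any order
        return result;
    }

    static List<String> demo() {
        EagerInitSketch sketch = new EagerInitSketch(3);
        sketch.start();
        sketch.jobAdded("job_0001");
        sketch.jobAdded("job_0002");
        sketch.jobAdded("job_0003");
        return sketch.awaitAndStop();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [job_0001, job_0002, job_0003]
    }
}
```

Separating the manager thread from the pool keeps the listener callback cheap: jobAdded() only touches the queue, while the potentially slow split computation runs on pool threads.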

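The slow-start threshold computed in initTasks() is plain arithmetic: the configured fraction of map tasks, rounded up. The small example below reproduces that expression on its own. The class and method names are hypothetical, and the fraction 0.05 is used only as a sample value, on the assumption that it matches the Hadoop 1.x default for mapred.reduce.slowstart.completed.maps.

```java
// Hypothetical helper that isolates the slow-start arithmetic from the listing.
final class SlowStart {
    // Minimum number of completed maps before reduces may be scheduled:
    // ceil(fraction * numMapTasks), computed in float as in the listing.
    static int threshold(float fraction, int numMapTasks) {
        return (int) Math.ceil(fraction * numMapTasks);
    }

    public static void main(String[] args) {
        System.out.println(threshold(0.05f, 100)); // prints 5
        System.out.println(threshold(0.05f, 10));  // prints 1
        System.out.println(threshold(0.05f, 0));   // prints 0
    }
}
```

Because of the rounding up, any job with at least one map task needs at least one completed map before its reduces are considered, even when the fraction is very small.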
This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
