How a client submits a MapReduce job to YARN (part 1)

Source: Internet
Author: User

In MapReduce v1, JobClient interacts with the JobTracker to submit jobs. The user creates a job, sets its parameters through JobConf, and then submits the job and monitors its progress through JobClient. JobClient holds an internal member variable of type JobSubmissionProtocol, an interface that JobTracker implements; the client communicates with the JobTracker through this interface to complete the job submission.

public void init(JobConf conf) throws IOException {
  String tracker = conf.get("mapred.job.tracker", "local");
  tasklogtimeout = conf.getInt(
    TASKLOG_PULL_TIMEOUT_KEY, DEFAULT_TASKLOG_TIMEOUT);
  this.ugi = UserGroupInformation.getCurrentUser();
  // If mapred.job.tracker is "local", create a LocalJobRunner in the client JVM;
  // otherwise create an RPC proxy to the JobTracker.
  if ("local".equals(tracker)) {
    conf.setNumMapTasks(1);
    this.jobSubmitClient = new LocalJobRunner(conf);
  } else {
    this.jobSubmitClient = createRPCProxy(JobTracker.getAddress(conf), conf);
  }
}
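
The branch in init can be sketched in isolation. This is a minimal, self-contained model and not Hadoop code: JobSubmitter, LocalSubmitter, RpcSubmitter, and the example address are hypothetical stand-ins for JobSubmissionProtocol, LocalJobRunner, and the RPC proxy. Only the property name mapred.job.tracker and its "local" default come from the snippet above.

```java
import java.util.HashMap;
import java.util.Map;

final class SubmitterSelectorDemo {
    // Hypothetical stand-in for JobSubmissionProtocol.
    interface JobSubmitter {
        String describe();
    }

    // Hypothetical stand-in for LocalJobRunner: the job runs in the client JVM.
    static final class LocalSubmitter implements JobSubmitter {
        public String describe() { return "local"; }
    }

    // Hypothetical stand-in for the RPC proxy to a remote JobTracker.
    static final class RpcSubmitter implements JobSubmitter {
        private final String address;
        RpcSubmitter(String address) { this.address = address; }
        public String describe() { return "rpc:" + address; }
    }

    // "local" (also the default when the key is absent) selects the in-process
    // runner; any other value is treated as a JobTracker address.
    static JobSubmitter select(Map<String, String> conf) {
        String tracker = conf.getOrDefault("mapred.job.tracker", "local");
        if ("local".equals(tracker)) {
            return new LocalSubmitter();
        }
        return new RpcSubmitter(tracker);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        System.out.println(select(conf).describe());
        conf.put("mapred.job.tracker", "jt.example.com:9001"); // made-up address
        System.out.println(select(conf).describe());
    }
}
```

Because both implementations sit behind one interface, the rest of the client does not need to know whether the job runs locally or on a cluster.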

The calls are made in this order:
Job.waitForCompletion()
Job.submit()
JobClient.submitJobInternal()
jobSubmitClient.submitJob(jobId, submitJobDir.toString(), jobCopy.getCredentials())
The last call completes the job submission.

In YARN, the job submission protocol is ClientRMProtocol. When an MRv2 job is submitted, a Cluster object, which holds the cluster information, is created first. It has an internal variable, frameworkLoader, that loads the ClientProtocolProvider implementation classes from the configuration file; these are LocalClientProtocolProvider and YarnClientProtocolProvider. In its initialize method, Cluster traverses frameworkLoader and asks each ClientProtocolProvider to create a concrete ClientProtocol. For example, YarnClientProtocolProvider checks whether mapreduce.framework.name in the JobConf is yarn and, if so, creates a YARNRunner.
The create method of YarnClientProtocolProvider:

@Override
public ClientProtocol create(Configuration conf) throws IOException {
  if (MRConfig.YARN_FRAMEWORK_NAME.equals(conf.get(MRConfig.FRAMEWORK_NAME))) {
    return new YARNRunner(conf);
  }
  return null;
}
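
The provider traversal can be sketched as follows. This is a simplified model under stated assumptions, not the Hadoop implementation: Protocol, Provider, NamedProvider, and the error message are hypothetical, and the providers are passed in as a plain list instead of being loaded by frameworkLoader. It only illustrates the pattern of asking each provider in turn and keeping the first non-null result.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;

final class ProviderLookupDemo {
    // Hypothetical stand-in for ClientProtocol.
    interface Protocol {
        String name();
    }

    // Hypothetical stand-in for ClientProtocolProvider: create() returns null
    // when the configured framework is not the one this provider handles.
    interface Provider {
        Protocol create(Map<String, String> conf);
    }

    static final class NamedProvider implements Provider {
        private final String framework;
        NamedProvider(String framework) { this.framework = framework; }
        public Protocol create(Map<String, String> conf) {
            if (framework.equals(conf.get("mapreduce.framework.name"))) {
                return () -> framework + "-runner";
            }
            return null;
        }
    }

    // Ask each provider in turn and keep the first non-null protocol.
    static Protocol initialize(List<Provider> providers, Map<String, String> conf) {
        for (Provider provider : providers) {
            Protocol protocol = provider.create(conf);
            if (protocol != null) {
                return protocol;
            }
        }
        throw new IllegalStateException(
            "No provider accepted mapreduce.framework.name=" + conf.get("mapreduce.framework.name"));
    }

    public static void main(String[] args) {
        List<Provider> providers =
            Arrays.asList(new NamedProvider("local"), new NamedProvider("yarn"));
        Map<String, String> conf =
            Collections.singletonMap("mapreduce.framework.name", "yarn");
        System.out.println(initialize(providers, conf).name());
    }
}
```

Returning null from create, rather than throwing, is what lets the caller keep iterating until some provider recognizes the configured framework.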

ClientProtocol currently has two implementations, YARNRunner and LocalJobRunner. LocalJobRunner (used when mapreduce.framework.name is local) runs MapReduce locally, which makes programs easy to debug. YARNRunner submits the job to YARN.
When YARNRunner is initialized, it establishes an RPC connection to the ResourceManager (port 8032 by default). The actual protocol for communicating with the RM is ClientRMProtocol, and all interactions between the client and the RM, such as killApplication, getNodeReports, and getJobCounters, go through YARNRunner's member variable rmClient (of type ClientRMProtocol).

public synchronized void start() {
  YarnRPC rpc = YarnRPC.create(getConfig());
  this.rmClient = (ClientRMProtocol) rpc.getProxy(
      ClientRMProtocol.class, rmAddress, getConfig());
  if (LOG.isDebugEnabled()) {
    LOG.debug("Connecting to ResourceManager at " + rmAddress);
  }
  super.start();
}
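
To illustrate what a call like rpc.getProxy returns, here is a minimal sketch that uses the JDK's java.lang.reflect.Proxy. It assumes that the client-side stub is a dynamic proxy that turns each interface method call into a message for the server, which is a common design for Java RPC stubs. RmProtocol, its single method, and the address are hypothetical and are not the real ClientRMProtocol signature. No network is involved: the handler only records the call.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

final class RpcProxyDemo {
    // Hypothetical protocol interface, for illustration only.
    interface RmProtocol {
        String killApplication(String applicationId);
    }

    // Records the calls that were "sent"; a real proxy would serialize each
    // call and write it to a connection instead.
    static final List<String> sent = new ArrayList<>();

    static RmProtocol getProxy(final String rmAddress) {
        InvocationHandler handler = (proxy, method, methodArgs) -> {
            sent.add(method.getName() + "@" + rmAddress);
            return "ack:" + methodArgs[0];
        };
        return (RmProtocol) Proxy.newProxyInstance(
            RmProtocol.class.getClassLoader(),
            new Class<?>[] {RmProtocol.class},
            handler);
    }

    public static void main(String[] args) {
        RmProtocol rm = getProxy("rm.example.com:8032"); // made-up host
        System.out.println(rm.killApplication("application_1"));
        System.out.println(sent);
    }
}
```

The caller works against the protocol interface only, so the same code could be pointed at a different transport by swapping the handler.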

After the Cluster class has been initialized, the client requests a new application from the RM (getNewApplication) and receives a GetNewApplicationResponse, which encapsulates the ApplicationId and the minimum and maximum resource capabilities that the RM can provide.

public interface GetNewApplicationResponse {
  public abstract ApplicationId getApplicationId();
  public Resource getMinimumResourceCapability();
  public Resource getMaximumResourceCapability();
  public void setMaximumResourceCapability(Resource capability);
}


Resource describes a set of cluster computing resources, which currently consist only of memory and CPU. CPU here means virtual cores: one physical core can be abstracted into several virtual cores, so the mapping is not necessarily one-to-one.

public abstract class Resource implements Comparable<Resource> {
  public abstract int getMemory();
  public abstract void setMemory(int memory);
  public abstract int getVirtualCores();
  public abstract void setVirtualCores(int vCores);
}
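
As an illustration of how the minimum and maximum capabilities could be used, the sketch below checks whether a requested resource lies within both bounds. It is hypothetical: Res is a simplified stand-in for Resource, memory is assumed to be in MB, and the source does not say where or whether such a check is performed.

```java
final class ResourceDemo {
    // Simplified stand-in for Resource: memory (assumed MB) plus virtual cores.
    static final class Res {
        final int memoryMb;
        final int vcores;
        Res(int memoryMb, int vcores) {
            this.memoryMb = memoryMb;
            this.vcores = vcores;
        }
    }

    // True when the request lies within [min, max] on both dimensions.
    static boolean fits(Res request, Res min, Res max) {
        return request.memoryMb >= min.memoryMb && request.memoryMb <= max.memoryMb
            && request.vcores >= min.vcores && request.vcores <= max.vcores;
    }

    public static void main(String[] args) {
        Res min = new Res(1024, 1);   // example values, not cluster defaults
        Res max = new Res(8192, 4);
        System.out.println(fits(new Res(1536, 1), min, max));
        System.out.println(fits(new Res(16384, 1), min, max));
    }
}
```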

Next, an ApplicationSubmissionContext must be constructed. It contains the information needed to start the MR AM, such as the path of the job's staging directory in HDFS (job.xml, job.split, job.splitmetainfo, libjars, files, archives, and so on), the user's UGI information, and security tokens. Once the context has been constructed, the submitJob method calls resMgrDelegate.submitApplication(appContext).
YARNRunner:

@Override
public JobStatus submitJob(JobID jobId, String jobSubmitDir, Credentials ts)
    throws IOException, InterruptedException {
  // Construct the information needed to start the MR AM
  ApplicationSubmissionContext appContext =
      createApplicationSubmissionContext(conf, jobSubmitDir, ts);

  // Submit to the ResourceManager
  ApplicationId applicationId = resMgrDelegate.submitApplication(appContext);
  ApplicationReport appMaster = resMgrDelegate.getApplicationReport(applicationId);
  String diagnostics =
      (appMaster == null ? "Application is null" : appMaster.getDiagnostics());
  if (appMaster == null
      || appMaster.getYarnApplicationState() == YarnApplicationState.FAILED
      || appMaster.getYarnApplicationState() == YarnApplicationState.KILLED) {
    throw new IOException("Failed to run job: " + diagnostics);
  }
  return clientCache.getClient(jobId).getJobStatus(jobId);
}
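
The post-submission check in submitJob can be modeled on its own. The sketch below is not Hadoop code: AppState is a shortened, hypothetical version of YarnApplicationState, and Report stands in for ApplicationReport. It shows the rule that a missing report, or a FAILED or KILLED state, is turned into an IOException that carries the diagnostics.

```java
import java.io.IOException;

final class SubmitCheckDemo {
    // Shortened, hypothetical stand-in for YarnApplicationState.
    enum AppState { SUBMITTED, ACCEPTED, RUNNING, FINISHED, FAILED, KILLED }

    // Hypothetical stand-in for ApplicationReport.
    static final class Report {
        final AppState state;
        final String diagnostics;
        Report(AppState state, String diagnostics) {
            this.state = state;
            this.diagnostics = diagnostics;
        }
    }

    // Throws when the application is missing or already in a terminal
    // failure state; otherwise returns normally.
    static void checkSubmitted(Report report) throws IOException {
        String diagnostics =
            (report == null) ? "application report is null" : report.diagnostics;
        if (report == null
                || report.state == AppState.FAILED
                || report.state == AppState.KILLED) {
            throw new IOException("Failed to run job: " + diagnostics);
        }
    }

    public static void main(String[] args) {
        try {
            checkSubmitted(new Report(AppState.ACCEPTED, ""));
            System.out.println("accepted application passed the check");
            checkSubmitted(new Report(AppState.KILLED, "killed by user"));
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Computing the diagnostics string before the null test, as the original method does, keeps the error message available for all three failure cases.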

Finally, the job status is obtained through the getJobStatus method, which requests a job report:

org.apache.hadoop.mapreduce.v2.api.records.JobId jobId =
  TypeConverter.toYarn(oldJobID);
GetJobReportRequest request =
    recordFactory.newRecordInstance(GetJobReportRequest.class);
request.setJobId(jobId);
JobReport report = ((GetJobReportResponse) invoke("getJobReport",
    GetJobReportRequest.class, request)).getJobReport();

Author: csdn Blog Lalaguozhe
