In MapReduce v1, JobClient is the class used to interact with the JobTracker to complete job submission: the user creates a job, sets parameters through JobConf, and submits and monitors the job's progress through JobClient. JobClient holds an internal member variable of type JobSubmissionProtocol; JobTracker implements this interface, and the client communicates with the JobTracker through it to complete job submission.
public void init(JobConf conf) throws IOException {
  String tracker = conf.get("mapred.job.tracker", "local");
  tasklogtimeout = conf.getInt(
      TASKLOG_PULL_TIMEOUT_KEY, DEFAULT_TASKLOG_TIMEOUT);
  this.ugi = UserGroupInformation.getCurrentUser();
  // If mapred.job.tracker is set to "local", create an in-process
  // LocalJobRunner; otherwise create an RPC proxy to the JobTracker.
  if ("local".equals(tracker)) {
    conf.setNumMapTasks(1);
    this.jobSubmitClient = new LocalJobRunner(conf);
  } else {
    this.jobSubmitClient = createRPCProxy(JobTracker.getAddress(conf), conf);
  }
}
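The selection logic above (an in-process runner versus a remote proxy, chosen by one configuration key) can be sketched without any Hadoop dependencies. In this sketch, JobRunner, LocalRunner, RemoteRunner, and select() are all hypothetical stand-ins for JobSubmissionProtocol, LocalJobRunner, the RPC proxy, and JobClient.init() respectively:

```java
import java.util.Map;

// Hypothetical stand-in for JobSubmissionProtocol.
interface JobRunner {
    String submit(String jobId);
}

// Plays the role of LocalJobRunner: runs in-process.
class LocalRunner implements JobRunner {
    public String submit(String jobId) { return "local:" + jobId; }
}

// Plays the role of the RPC proxy to a remote JobTracker.
class RemoteRunner implements JobRunner {
    private final String address;
    RemoteRunner(String address) { this.address = address; }
    public String submit(String jobId) { return address + ":" + jobId; }
}

public class RunnerSelection {
    // Mirrors JobClient.init(): "local" selects the in-process runner,
    // anything else is treated as a tracker address for a remote proxy.
    static JobRunner select(Map<String, String> conf) {
        String tracker = conf.getOrDefault("mapred.job.tracker", "local");
        return "local".equals(tracker) ? new LocalRunner() : new RemoteRunner(tracker);
    }

    public static void main(String[] args) {
        System.out.println(select(Map.of()).submit("job_1"));
        System.out.println(select(Map.of("mapred.job.tracker", "jt:9001")).submit("job_2"));
    }
}
```

The useful property of this shape is that everything downstream programs only against the interface, so local debugging and cluster submission share one code path.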
The following methods are then called in sequence:

Job.waitForCompletion()
  -> Job.submit()
  -> JobClient.submitJobInternal()
  -> jobSubmitClient.submitJob(jobId, submitJobDir.toString(), jobCopy.getCredentials())

which completes the job submission.
In YARN, the job submission protocol is ClientRMProtocol. When an MRv2 job is submitted, a cluster-information class Cluster is generated first. Inside it, a frameworkLoader member variable loads the ClientProtocolProvider implementation classes from the configuration files; these are LocalClientProtocolProvider and YARNClientProtocolProvider. In initialize(), the Cluster class iterates over frameworkLoader and asks each ClientProtocolProvider to produce a concrete ClientProtocol. For example, YARNClientProtocolProvider checks whether mapreduce.framework.name in the JobConf is set to yarn, and if so creates a YARNRunner.
YARNClientProtocolProvider's create method:

@Override
public ClientProtocol create(Configuration conf) throws IOException {
  if (MRConfig.YARN_FRAMEWORK_NAME.equals(conf.get(MRConfig.FRAMEWORK_NAME))) {
    return new YARNRunner(conf);
  }
  return null;
}
ClientProtocol currently has two implementations: YARNRunner and LocalJobRunner. LocalJobRunner (mapreduce.framework.name set to local) mainly runs the MapReduce job locally, so the program can be debugged easily; YARNRunner submits the job to YARN.
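The provider-iteration logic described above can be sketched without Hadoop. Here Protocol, Provider, and initialize() are hypothetical stand-ins for ClientProtocol, ClientProtocolProvider, and Cluster.initialize(); each provider returns null when the configured framework is not its own, and the first non-null result wins:

```java
import java.util.List;

// Hypothetical stand-in for ClientProtocol.
interface Protocol { String name(); }

// Hypothetical stand-in for ClientProtocolProvider: returns null when this
// provider does not match the configured framework, like
// YARNClientProtocolProvider.create().
interface Provider {
    Protocol create(String frameworkName);
}

class LocalProvider implements Provider {
    public Protocol create(String fw) {
        return "local".equals(fw) ? () -> "LocalJobRunner" : null;
    }
}

class YarnProvider implements Provider {
    public Protocol create(String fw) {
        return "yarn".equals(fw) ? () -> "YARNRunner" : null;
    }
}

public class ProviderChain {
    // Mirrors Cluster.initialize(): try each loaded provider in turn and
    // keep the first non-null protocol it produces.
    static Protocol initialize(String frameworkName) {
        for (Provider p : List.of(new LocalProvider(), new YarnProvider())) {
            Protocol proto = p.create(frameworkName);
            if (proto != null) return proto;
        }
        throw new IllegalStateException("Cannot initialize Cluster for " + frameworkName);
    }

    public static void main(String[] args) {
        System.out.println(initialize("yarn").name());   // YARNRunner
        System.out.println(initialize("local").name());  // LocalJobRunner
    }
}
```

In the real code the provider list comes from a ServiceLoader rather than a hard-coded List, which is what lets new execution frameworks plug in without modifying Cluster.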
When YARNRunner is initialized, it establishes an RPC connection to the ResourceManager (port 8032 by default). The actual communication protocol with the RM is ClientRMProtocol: all client-to-RM operations go through the member variable rmClient (of type ClientRMProtocol), such as killApplication, getNodeReports, getJobCounters, and so on.
public synchronized void start() {
  YarnRPC rpc = YarnRPC.create(getConfig());
  this.rmClient = (ClientRMProtocol) rpc.getProxy(
      ClientRMProtocol.class, rmAddress, getConfig());
  if (LOG.isDebugEnabled()) {
    LOG.debug("Connecting to ResourceManager at " + rmAddress);
  }
  super.start();
}
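Conceptually, rpc.getProxy() hands back a dynamic proxy that implements the protocol interface and turns every method call into a remote invocation against the configured address. A minimal sketch of that idea with the JDK's java.lang.reflect.Proxy follows; RMProtocol, getProxy(), and the "sent ... to ..." behavior are all hypothetical, standing in for ClientRMProtocol and the real RPC engine:

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

// Hypothetical client-side protocol, standing in for ClientRMProtocol.
interface RMProtocol {
    String getNewApplication();
}

public class ProxyDemo {
    // Mirrors the shape of rpc.getProxy(protocol, address, conf): every call
    // on the returned object is funneled through one handler, which in a real
    // RPC engine would serialize the request and send it to the address.
    @SuppressWarnings("unchecked")
    static <T> T getProxy(Class<T> protocol, String address) {
        InvocationHandler handler = (Object proxy, Method method, Object[] args) ->
            "sent " + method.getName() + " to " + address;   // pretend remote call
        return (T) Proxy.newProxyInstance(
            protocol.getClassLoader(), new Class<?>[] { protocol }, handler);
    }

    public static void main(String[] args) {
        RMProtocol rmClient = getProxy(RMProtocol.class, "rm:8032");
        System.out.println(rmClient.getNewApplication());  // sent getNewApplication to rm:8032
    }
}
```

This is why client code can hold rmClient as a plain ClientRMProtocol reference: the caller never sees the networking, only the interface.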
After the Cluster class finishes initializing, the application is created: the client asks the RM for a new application (getNewApplication) and receives a GetNewApplicationResponse. It encapsulates the ApplicationId, together with the minimum and maximum resource capability the RM can provide.
public interface GetNewApplicationResponse {
  public abstract ApplicationId getApplicationId();
  public Resource getMinimumResourceCapability();
  public Resource getMaximumResourceCapability();
  public void setMaximumResourceCapability(Resource capability);
}
Resource describes a set of computing resources in the cluster; currently it only covers memory and CPU. CPU here means virtual cores: a physical core can be abstracted into multiple virtual cores rather than mapping one-to-one.
public abstract class Resource implements Comparable<Resource> {
  public abstract int getMemory();
  public abstract void setMemory(int memory);
  public abstract int getVirtualCores();
  public abstract void setVirtualCores(int vcores);
}
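One use of the minimum and maximum capabilities returned in GetNewApplicationResponse is to normalize a client's resource request into the range the RM will accept. The sketch below is a hypothetical illustration of that clamping, not Hadoop's actual normalization code; Res and normalize() are invented names:

```java
public class ResourceClamp {
    // Minimal stand-in for YARN's Resource: memory in MB plus virtual cores.
    record Res(int memory, int virtualCores) {}

    // Hypothetical normalization: clamp a request between the minimum and
    // maximum capability the ResourceManager reported.
    static Res normalize(Res ask, Res min, Res max) {
        int mem = Math.max(min.memory(), Math.min(max.memory(), ask.memory()));
        int vcores = Math.max(min.virtualCores(),
                              Math.min(max.virtualCores(), ask.virtualCores()));
        return new Res(mem, vcores);
    }

    public static void main(String[] args) {
        Res min = new Res(1024, 1), max = new Res(8192, 4);
        // A request below the minimum is raised to it...
        System.out.println(normalize(new Res(512, 2), min, max));
        // ...and a request above the maximum is capped.
        System.out.println(normalize(new Res(16384, 8), min, max));
    }
}
```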
Next, an ApplicationSubmissionContext must be constructed. It contains the information needed to start the MR AM, such as the staging-directory path of the submitted job in HDFS (job.xml, job.split, job.splitmetainfo, libjars, files, archives, etc.), the user's UGI information, and security tokens. Once the context is built, YARNRunner's submitJob method calls resMgrDelegate.submitApplication(appContext).
YARNRunner:

@Override
public JobStatus submitJob(JobID jobId, String jobSubmitDir, Credentials ts)
    throws IOException, InterruptedException {
  // Construct the information necessary to start the MR AM
  ApplicationSubmissionContext appContext =
      createApplicationSubmissionContext(conf, jobSubmitDir, ts);
  // Submit to the ResourceManager
  ApplicationId applicationId = resMgrDelegate.submitApplication(appContext);
  ApplicationReport appMaster =
      resMgrDelegate.getApplicationReport(applicationId);
  String diagnostics = (appMaster == null ?
      "application report is null" : appMaster.getDiagnostics());
  if (appMaster == null
      || appMaster.getYarnApplicationState() == YarnApplicationState.FAILED
      || appMaster.getYarnApplicationState() == YarnApplicationState.KILLED) {
    throw new IOException("Failed to run job : " + diagnostics);
  }
  return clientCache.getClient(jobId).getJobStatus(jobId);
}
Finally, the job status information is obtained through the getJobStatus method:
org.apache.hadoop.mapreduce.v2.api.records.JobId jobId =
    TypeConverter.toYarn(oldJobId);
GetJobReportRequest request =
    recordFactory.newRecordInstance(GetJobReportRequest.class);
request.setJobId(jobId);
JobReport report = ((GetJobReportResponse) invoke("getJobReport",
    GetJobReportRequest.class, request)).getJobReport();
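The invoke("getJobReport", ...) call above dispatches the request by method name through reflection on the underlying protocol proxy. A minimal self-contained sketch of that dispatch pattern follows; ReportService, InvokeDemo, and their behavior are hypothetical stand-ins, and the real delegate additionally retries and re-resolves the proxy if the AM has moved:

```java
import java.lang.reflect.Method;

// Hypothetical service standing in for the underlying protocol proxy.
class ReportService {
    public String getJobReport(String request) { return "report for " + request; }
}

public class InvokeDemo {
    private final ReportService realProxy = new ReportService();

    // Reflective dispatch: look the method up by name and argument type,
    // then call it on the current proxy object.
    Object invoke(String methodName, Class<?> argClass, Object arg) throws Exception {
        Method m = ReportService.class.getMethod(methodName, argClass);
        return m.invoke(realProxy, arg);
    }

    public static void main(String[] args) throws Exception {
        InvokeDemo d = new InvokeDemo();
        System.out.println(d.invoke("getJobReport", String.class, "job_42"));
    }
}
```

Routing every call through one invoke() chokepoint is what makes it possible to add uniform retry and failover logic without touching each protocol method.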
Author: CSDN blog, lalaguozhe