1. Entry point of a standard MapReduce job:
// Passing true makes the client monitor the job and print job and task progress
System.exit(job.waitForCompletion(true) ? 0 : 1);
2. Internal implementation of the Job.waitForCompletion(true) method
// Internal implementation of Job.waitForCompletion()
public boolean waitForCompletion(boolean verbose)
    throws IOException, InterruptedException, ClassNotFoundException {
  if (state == JobState.DEFINE) {
    submit(); // the core of this method is submit()
  }
  if (verbose) {
    // Print the detailed progress of the running job when the caller asked for it
    monitorAndPrintJob();
  } else {
    // Get the completion poll interval from the client.
    int completionPollIntervalMillis =
        Job.getCompletionPollInterval(cluster.getConf());
    while (!isComplete()) {
      try {
        Thread.sleep(completionPollIntervalMillis);
      } catch (InterruptedException ie) {
      }
    }
  }
  return isSuccessful();
}
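The non-verbose branch above is a plain polling loop: check for completion, sleep for the poll interval, repeat. A minimal sketch of that loop outside Hadoop follows. PollingSketch, waitUntilComplete and the simulated job are hypothetical names for illustration only, not part of the Job class.

```java
import java.util.function.BooleanSupplier;

class PollingSketch {
    /**
     * Blocks until isComplete reports true, sleeping pollIntervalMillis
     * between checks. Returns how many times the loop went around.
     */
    static int waitUntilComplete(BooleanSupplier isComplete, long pollIntervalMillis) {
        int rounds = 0;
        while (!isComplete.getAsBoolean()) {
            try {
                Thread.sleep(pollIntervalMillis);
            } catch (InterruptedException ie) {
                // Same choice as the quoted code: swallow the interrupt and keep polling.
            }
            rounds++;
        }
        return rounds;
    }

    public static void main(String[] args) {
        int[] checks = {0};
        // Hypothetical job that reports completion on the third status check.
        int rounds = waitUntilComplete(() -> ++checks[0] >= 3, 10L);
        System.out.println("rounds=" + rounds + ", checks=" + checks[0]);
    }
}
```

A shorter poll interval notices completion sooner at the cost of more status requests, which is why the real client reads the interval from configuration.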
3. Internal implementation of the Job class's submit() method
public void submit()
    throws IOException, InterruptedException, ClassNotFoundException {
  ensureState(JobState.DEFINE);
  setUseNewAPI(); // use the new MapReduce API
  // connect() creates the client proxy object Cluster (a member of the Job class),
  // which is used for RPC communication with the server-side RM
  connect();
  final JobSubmitter submitter =
      getJobSubmitter(cluster.getFileSystem(), cluster.getClient());
  status = ugi.doAs(new PrivilegedExceptionAction<JobStatus>() {
    public JobStatus run()
        throws IOException, InterruptedException, ClassNotFoundException {
      // Submit the job
      return submitter.submitJobInternal(Job.this, cluster);
    }
  });
  state = JobState.RUNNING; // set the job state to RUNNING
  LOG.info("The url to track the job: " + getTrackingURL());
}
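submit() is guarded by ensureState(JobState.DEFINE) and ends by moving the job to RUNNING, so a second call on the same job is rejected. The body of ensureState() is not quoted in this walkthrough, so the sketch below only illustrates the idea: JobStateSketch, its two-value enum and the exception type and message are assumptions, not the Hadoop code.

```java
class JobStateSketch {
    enum State { DEFINE, RUNNING }

    private State state = State.DEFINE;

    // Assumed behaviour: fail fast when the job is not in the required state.
    private void ensureState(State expected) {
        if (state != expected) {
            throw new IllegalStateException(
                "Job in state " + state + " instead of " + expected);
        }
    }

    // Stand-in for submit(): allowed once, then the state moves to RUNNING.
    void submit() {
        ensureState(State.DEFINE);
        // The real method connects to the cluster and submits the job here.
        state = State.RUNNING;
    }

    State getState() {
        return state;
    }

    public static void main(String[] args) {
        JobStateSketch job = new JobStateSketch();
        job.submit();
        System.out.println(job.getState());
        try {
            job.submit();
        } catch (IllegalStateException e) {
            System.out.println("second submit rejected: " + e.getMessage());
        }
    }
}
```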
3.1.1. Internal implementation of the connect() method
private synchronized void connect()
    throws IOException, InterruptedException, ClassNotFoundException {
  if (cluster == null) {
    cluster = ugi.doAs(new PrivilegedExceptionAction<Cluster>() {
      public Cluster run()
          throws IOException, InterruptedException, ClassNotFoundException {
        // Create a Cluster object and store it as a member variable of the Job class,
        // that is, the Job class holds a reference to Cluster.
        return new Cluster(getConfiguration());
      }
    });
  }
}
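The combination of synchronized and the null check means the Cluster proxy is created at most once per Job, no matter how often connect() runs. A minimal sketch of that lazy-initialization pattern follows. FakeCluster, the simplified doAs helper and the creation counter are hypothetical stand-ins, and unlike the real connect() this version returns the object so the effect is easy to observe.

```java
import java.util.function.Supplier;

class LazyConnectSketch {
    // Hypothetical stand-in for the Cluster client proxy.
    static final class FakeCluster {}

    private FakeCluster cluster;
    private int creations = 0;

    // Stand-in for ugi.doAs(...): here it simply runs the action as the current user.
    private static <T> T doAs(Supplier<T> action) {
        return action.get();
    }

    // synchronized plus the null check: the proxy is built on the first call only.
    synchronized FakeCluster connect() {
        if (cluster == null) {
            cluster = doAs(() -> {
                creations++;
                return new FakeCluster();
            });
        }
        return cluster;
    }

    int creations() {
        return creations;
    }

    public static void main(String[] args) {
        LazyConnectSketch job = new LazyConnectSketch();
        FakeCluster first = job.connect();
        FakeCluster second = job.connect();
        System.out.println("same instance: " + (first == second));
        System.out.println("creations: " + job.creations());
    }
}
```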
3.1.2. Implementation of new Cluster()
public Cluster(InetSocketAddress jobTrackAddr, Configuration conf)
    throws IOException {
  this.conf = conf;
  this.ugi = UserGroupInformation.getCurrentUser();
  initialize(jobTrackAddr, conf); // the focus is on the internal implementation of this method
}
3.1.3. Instantiation of the client proxy object inside Cluster. There are two ClientProtocolProvider implementations: LocalClientProtocolProvider (local mode) and YarnClientProtocolProvider (YARN mode).
synchronized (frameworkLoader) {
  for (ClientProtocolProvider provider : frameworkLoader) {
    LOG.debug("Trying ClientProtocolProvider : "
        + provider.getClass().getName());
    // ClientProtocol is the RPC protocol between the job client and the server-side RM
    // (not the NN). Following Hadoop's RPC convention, the protocol interface
    // declares a versionID field.
    ClientProtocol clientProtocol = null;
    try {
      if (jobTrackAddr == null) {
        clientProtocol = provider.create(conf);
      } else {
        clientProtocol = provider.create(jobTrackAddr, conf);
      }
      if (clientProtocol != null) {
        // Initialize the Cluster's internal member variables
        clientProtocolProvider = provider;
        client = clientProtocol; // the client proxy object held by the Cluster class
        LOG.debug("Picked " + provider.getClass().getName()
            + " as the ClientProtocolProvider");
        break;
      } else {
        LOG.debug("Cannot pick " + provider.getClass().getName()
            + " as the ClientProtocolProvider - returned null protocol");
      }
    } catch (Exception e) {
      LOG.info("Failed to use " + provider.getClass().getName()
          + " due to error: " + e.getMessage());
    }
  }
}
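The loop above is a first-match selection: each provider is asked to create a client, a null result or an exception means "not this one", and the first non-null client wins. The sketch below reproduces that control flow with plain Java types. Client, Provider, pick and defaultProviders are hypothetical stand-ins, and the rule that each provider answers only for its own framework name ("local" or "yarn") is an assumption about how the two real providers decide.

```java
import java.util.List;

class ProviderSelectionSketch {
    // Hypothetical stand-ins for ClientProtocol and ClientProtocolProvider.
    interface Client { String name(); }
    interface Provider { Client create(String framework); }

    // Returns the first non-null client; a failing provider is skipped, not fatal.
    static Client pick(List<Provider> providers, String framework) {
        for (Provider provider : providers) {
            try {
                Client client = provider.create(framework);
                if (client != null) {
                    return client; // corresponds to "Picked ... as the ClientProtocolProvider"
                }
                // null means this provider does not handle the configured framework
            } catch (RuntimeException e) {
                System.out.println("Failed to use provider due to error: " + e.getMessage());
            }
        }
        return null; // no provider matched
    }

    static List<Provider> defaultProviders() {
        Provider broken = fw -> { throw new IllegalStateException("misconfigured"); };
        Provider local = fw -> {
            if (!"local".equals(fw)) return null;
            return () -> "local client";
        };
        Provider yarn = fw -> {
            if (!"yarn".equals(fw)) return null;
            return () -> "yarn client";
        };
        return List.of(broken, local, yarn);
    }

    public static void main(String[] args) {
        Client client = pick(defaultProviders(), "yarn");
        System.out.println(client == null ? "no provider matched" : client.name());
    }
}
```

Catching the exception per provider is what lets a misconfigured provider on the classpath be skipped instead of aborting the whole submission.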
3.1.4. The versionID field declared in the ClientProtocol interface
// Version 37: More efficient serialization format for framework counters
public static final long versionID = 37L;
3.2.1. Implementation of the submitJobInternal() method in the JobSubmitter class:
MapReduce job submission process