Hadoop 2.x YARN Job Submission (Client Side)

The client submitting a YARN job still uses the RunJar class, just as in MR1; for reference see
http://blog.csdn.net/lihm0_1/article/details/13629375
In 1.x the job is submitted to the JobTracker; in 2.x that role is taken by the ResourceManager, and the client's proxy object changed accordingly, replaced by YARNRunner. The overall flow, however, is similar to 1.x. The main process is concentrated in JobSubmitter.submitJobInternal, which checks the output directory's legality, sets the job submission information (host and user), obtains a job ID, copies the files the job needs (job.jar, job.xml, split files, etc.) to HDFS, and then performs the actual submission. The submission process is described here using WordCount as an example.
public static void main(String[] args) throws Exception {
    // Create a job
    Configuration conf = new Configuration();

    conf.set("mapreduce.job.queuename", "p1");

    @SuppressWarnings("deprecation")
    Job job = new Job(conf, "WordCount");
    job.setJarByClass(WordCount.class);
    job.setJar("/root/wordcount-2.3.0.jar");

    // Set the output key/value types
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);

    // Set the map and reduce classes
    job.setMapperClass(WordCountMapper.class);
    job.setReducerClass(WordCountReduce.class);

    // Set the input and output paths
    FileInputFormat.addInputPath(job, new Path("/tmp/a.txt"));
    FileOutputFormat.setOutputPath(job, new Path("/tmp/output"));

    // Enter the job submission process, then loop monitoring job status
    job.waitForCompletion(true);
}
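The WordCountMapper and WordCountReduce classes referenced above are not part of the original listing; the following is a minimal sketch of what they might look like (standard word-count logic, with class names matching the driver above; this is an illustration, not the original author's code):

import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

public class WordCountMapper
    extends Mapper<LongWritable, Text, Text, IntWritable> {
  private static final IntWritable ONE = new IntWritable(1);
  private final Text word = new Text();

  @Override
  protected void map(LongWritable key, Text value, Context context)
      throws IOException, InterruptedException {
    // Emit (token, 1) for every whitespace-separated token in the line
    for (String token : value.toString().split("\\s+")) {
      if (!token.isEmpty()) {
        word.set(token);
        context.write(word, ONE);
      }
    }
  }
}

class WordCountReduce
    extends Reducer<Text, IntWritable, Text, IntWritable> {
  @Override
  protected void reduce(Text key, Iterable<IntWritable> values,
      Context context) throws IOException, InterruptedException {
    // Sum the counts for each word
    int sum = 0;
    for (IntWritable v : values) {
      sum += v.get();
    }
    context.write(key, new IntWritable(sum));
  }
}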
  
This part is the same as in 1.x: submit the job, then loop monitoring the job status.
public boolean waitForCompletion(boolean verbose
                                 ) throws IOException, InterruptedException,
                                          ClassNotFoundException {
  if (state == JobState.DEFINE) {
    submit(); // Submit the job
  }
  if (verbose) {
    monitorAndPrintJob();
  } else {
    // Get the completion poll interval from the client.
    int completionPollIntervalMillis =
      Job.getCompletionPollInterval(cluster.getConf());
    while (!isComplete()) {
      try {
        Thread.sleep(completionPollIntervalMillis);
      } catch (InterruptedException ie) {
      }
    }
  }
  return isSuccessful();
}
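The completion poll interval read above comes from the client configuration. As a hedged sketch (the property name is taken from the Hadoop 2.x Job class defaults; verify it for your version), it can be tuned before submission:

// Poll the job status every second instead of the 5-second default;
// this is the value returned by Job.getCompletionPollInterval() above.
conf.setInt("mapreduce.client.completion.pollinterval", 1000);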
The main analysis is of the submit() function, to see how the job is submitted; this is where 2.x differs from 1.x. It is still divided into two stages: 1) connect to the master, 2) submit the job.
public void submit()
       throws IOException, InterruptedException, ClassNotFoundException {
  ensureState(JobState.DEFINE);
  setUseNewAPI();
  // Connect to the RM
  connect();
  final JobSubmitter submitter =
      getJobSubmitter(cluster.getFileSystem(), cluster.getClient());
  status = ugi.doAs(new PrivilegedExceptionAction<JobStatus>() {
    public JobStatus run() throws IOException, InterruptedException,
    ClassNotFoundException {
      // Submit the job
      return submitter.submitJobInternal(Job.this, cluster);
    }
  });
  state = JobState.RUNNING;
  LOG.info("The url to track the job: " + getTrackingURL());
}
A Cluster instance is created while connecting to the master. Below is the Cluster constructor; the key part is the initialize() call.
public Cluster(InetSocketAddress jobTrackAddr, Configuration conf)
    throws IOException {
  this.conf = conf;
  this.ugi = UserGroupInformation.getCurrentUser();
  initialize(jobTrackAddr, conf);
}
The client proxy is created in the initialize() phase using java.util.ServiceLoader. Version 2.3.0 currently ships two providers, LocalClientProtocolProvider (local jobs) and YarnClientProtocolProvider (YARN jobs); the corresponding client is created according to the mapreduce.framework.name configuration.
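Before looking at initialize(), here is a minimal, self-contained sketch of the java.util.ServiceLoader mechanism itself (a generic example, not Hadoop code): implementations are registered in a META-INF/services file named after the interface, and the loader iterates over whatever implementations are on the classpath, exactly as Cluster iterates over ClientProtocolProviders below.

import java.util.ServiceLoader;

// Registration: a file META-INF/services/Greeter containing the line
//   EnglishGreeter
interface Greeter {
  String greet();
}

class EnglishGreeter implements Greeter {
  public String greet() { return "hello"; }
}

class GreeterMain {
  public static void main(String[] args) {
    // Iterate all Greeter implementations found on the classpath
    ServiceLoader<Greeter> loader = ServiceLoader.load(Greeter.class);
    for (Greeter g : loader) {
      System.out.println(g.greet());
    }
  }
}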
private void initialize(InetSocketAddress jobTrackAddr, Configuration conf)
    throws IOException {
  synchronized (frameworkLoader) {
    for (ClientProtocolProvider provider : frameworkLoader) {
      LOG.debug("Trying ClientProtocolProvider : "
          + provider.getClass().getName());
      ClientProtocol clientProtocol = null;
      try {
        if (jobTrackAddr == null) {
          // Create the YARNRunner object
          clientProtocol = provider.create(conf);
        } else {
          clientProtocol = provider.create(jobTrackAddr, conf);
        }
        // Initialize the cluster's internal member variables
        if (clientProtocol != null) {
          clientProtocolProvider = provider;
          client = clientProtocol;
          LOG.debug("Picked " + provider.getClass().getName()
              + " as the ClientProtocolProvider");
          break;
        } else {
          LOG.debug("Cannot pick " + provider.getClass().getName()
              + " as the ClientProtocolProvider - returned null protocol");
        }
      } catch (Exception e) {
        LOG.info("Failed to use " + provider.getClass().getName()
            + " due to error: " + e.getMessage());
      }
    }
  }
  // Exception handling: if this throws, the loaded jars are problematic;
  // the jar containing YARNRunner is not on the classpath.
  if (null == clientProtocolProvider || null == client) {
    throw new IOException(
        "Cannot initialize Cluster. Please check your configuration for "
        + MRConfig.FRAMEWORK_NAME
        + " and the correspond server addresses.");
  }
}
The provider's create() actually builds a YARNRunner object, because we are submitting a YARN job rather than a local one:
@Override
public ClientProtocol create(Configuration conf) throws IOException {
  if (MRConfig.YARN_FRAMEWORK_NAME.equals(conf.get(MRConfig.FRAMEWORK_NAME))) {
    return new YARNRunner(conf);
  }
  return null;
}
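For YarnClientProtocolProvider.create() to return a YARNRunner, the client configuration must select the YARN framework; with "local" instead, LocalClientProtocolProvider wins. A minimal sketch (the literal strings match the MRConfig constants referenced in the code above):

Configuration conf = new Configuration();
// Equivalent to setting mapreduce.framework.name=yarn in mapred-site.xml
conf.set("mapreduce.framework.name", "yarn");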
The chain of objects created for the client proxy is:
Cluster -> ClientProtocol (YARNRunner) -> ResourceMgrDelegate -> client (YarnClientImpl) -> rmClient (ApplicationClientProtocol)
The RPC proxy is created in the serviceStart phase of YarnClientImpl; note the protocol class used there.
protected void serviceStart() throws Exception {
  try {
    rmClient = ClientRMProxy.createRMProxy(getConfig(),
          ApplicationClientProtocol.class);
  } catch (IOException e) {
    throw new YarnRuntimeException(e);
  }
  super.serviceStart();
}
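YarnClientImpl is the concrete implementation behind the YarnClient facade. As a hedged usage sketch (standard org.apache.hadoop.yarn.client.api.YarnClient API in 2.x), a client can also be driven directly, which triggers the serviceStart() shown above:

import org.apache.hadoop.yarn.api.records.ApplicationReport;
import org.apache.hadoop.yarn.client.api.YarnClient;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class YarnClientDemo {
  public static void main(String[] args) throws Exception {
    YarnConfiguration conf = new YarnConfiguration();
    YarnClient yarnClient = YarnClient.createYarnClient();
    yarnClient.init(conf);
    // start() runs serviceStart(), creating the ApplicationClientProtocol proxy
    yarnClient.start();
    for (ApplicationReport report : yarnClient.getApplications()) {
      System.out.println(report.getApplicationId() + " "
          + report.getYarnApplicationState());
    }
    yarnClient.stop();
  }
}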
The YARNRunner constructor is as follows:
public YARNRunner(Configuration conf, ResourceMgrDelegate resMgrDelegate,
    ClientCache clientCache) {
  this.conf = conf;
  try {
    this.resMgrDelegate = resMgrDelegate;
    this.clientCache = clientCache;
    this.defaultFileContext = FileContext.getFileContext(this.conf);
  } catch (UnsupportedFileSystemException ufe) {
    throw new RuntimeException("Error in instantiating YarnClient", ufe);
  }
}
Now for the core of the submission, JobSubmitter.submitJobInternal:
JobStatus submitJobInternal(Job job, Cluster cluster)
    throws ClassNotFoundException, InterruptedException, IOException {

  // Check the output directory's legality: whether it already exists, or is not set
  checkSpecs(job);

  Configuration conf = job.getConfiguration();
  addMRFrameworkToDistributedCache(conf);

  // Get the staging area that stores files used while the job executes.
  // Default location: /tmp/hadoop-yarn/staging/root/.staging; the path can
  // be changed via yarn.app.mapreduce.am.staging-dir
  Path jobStagingArea = JobSubmissionFiles.getStagingDir(cluster, conf);

  // Host name and address settings
  InetAddress ip = InetAddress.getLocalHost();
  if (ip != null) {
    submitHostAddress = ip.getHostAddress();
    submitHostName = ip.getHostName();
    conf.set(MRJobConfig.JOB_SUBMITHOST, submitHostName);
    conf.set(MRJobConfig.JOB_SUBMITHOSTADDR, submitHostAddress);
  }

  // Get the new job ID; an RPC call is required here
  JobID jobId = submitClient.getNewJobID();
  job.setJobID(jobId);

  // Get the submit directory, e.g. /tmp/hadoop-yarn/staging/root/.staging/job_1395778831382_0002
  Path submitJobDir = new Path(jobStagingArea, jobId.toString());
  JobStatus status = null;
  try {
    conf.set(MRJobConfig.USER_NAME,
        UserGroupInformation.getCurrentUser().getShortUserName());
    conf.set("hadoop.http.filter.initializers",
        "org.apache.hadoop.yarn.server.webproxy.amfilter.AmFilterInitializer");
    conf.set(MRJobConfig.MAPREDUCE_JOB_DIR, submitJobDir.toString());
    LOG.debug("Configuring job " + jobId + " with " + submitJobDir
        + " as the submit dir");

    // Get delegation token for the dir
    TokenCache.obtainTokensForNamenodes(job.getCredentials(),
        new Path[] { submitJobDir }, conf);

    populateTokenCache(conf, job.getCredentials());

    // Generate a secret to authenticate shuffle transfers
    if (TokenCache.getShuffleSecretKey(job.getCredentials()) == null) {
      KeyGenerator keyGen;
      try {
        keyGen = KeyGenerator.getInstance(SHUFFLE_KEYGEN_ALGORITHM);
        keyGen.init(SHUFFLE_KEY_LENGTH);
      } catch (NoSuchAlgorithmException e) {
        throw new IOException("Error generating shuffle secret key", e);
      }
      SecretKey shuffleKey = keyGen.generateKey();
      TokenCache.setShuffleSecretKey(shuffleKey.getEncoded(),
          job.getCredentials());
    }

    // Copy the required files to the cluster; analyzed separately below, see (1)
    copyAndConfigureFiles(job, submitJobDir);

    Path submitJobFile = JobSubmissionFiles.getJobConfPath(submitJobDir);

    // Write the split files job.split and job.splitmetainfo; the write
    // process is the same as in MR1 and is covered in the earlier article
    LOG.debug("Creating splits at " + jtFs.makeQualified(submitJobDir));
    int maps = writeSplits(job, submitJobDir);
    conf.setInt(MRJobConfig.NUM_MAPS, maps);
    LOG.info("number of splits:" + maps);

    // Write "queue admins of the queue to which job is being submitted"
    // to job file: set the queue name
    String queue = conf.get(MRJobConfig.QUEUE_NAME, JobConf.DEFAULT_QUEUE_NAME);
    AccessControlList acl = submitClient.getQueueAdmins(queue);
    conf.set(toFullPropertyName(queue,
        QueueACL.ADMINISTER_JOBS.getAclName()), acl.getAclString());

    // Removing jobtoken referrals before copying the jobconf to HDFS
    // as the tasks don't need this setting, actually they may break
    // because of it if present, as the referral will point to a different job.
    TokenCache.cleanUpTokenReferral(conf);

    if (conf.getBoolean(
        MRJobConfig.JOB_TOKEN_TRACKING_IDS_ENABLED,
        MRJobConfig.DEFAULT_JOB_TOKEN_TRACKING_IDS_ENABLED)) {
      // Add HDFS tracking ids
      ArrayList<String> trackingIds = new ArrayList<String>();
      for (Token<? extends TokenIdentifier> t :
          job.getCredentials().getAllTokens()) {
        trackingIds.add(t.decodeIdentifier().getTrackingId());
      }
      conf.setStrings(MRJobConfig.JOB_TOKEN_TRACKING_IDS,
          trackingIds.toArray(new String[trackingIds.size()]));
    }

    // Write the job file (job.xml) to the submit dir
    writeConf(conf, submitJobFile);

    // Now, actually submit the job (using the submit name);
    // only here does the real submission begin, see analysis (2) below
    printTokens(jobId, job.getCredentials());
    status = submitClient.submitJob(
        jobId, submitJobDir.toString(), job.getCredentials());
    if (status != null) {
      return status;
    } else {
      throw new IOException("Could not launch job");
    }
  } finally {
    if (status == null) {
      LOG.info("Cleaning up the staging area " + submitJobDir);
      if (jtFs != null && submitJobDir != null)
        jtFs.delete(submitJobDir, true);
    }
  }
}
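As noted in the comments above, the staging area defaults to /tmp/hadoop-yarn/staging/<user>/.staging. A hedged one-liner for relocating it (the property name appears in the code comments above; treat the target path as illustrative):

// Move the per-user .staging directory out of /tmp
conf.set("yarn.app.mapreduce.am.staging-dir", "/user/hadoop/staging");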
(1) The file copy process is as follows; the default replication factor for submitted files is 10:
private void copyAndConfigureFiles(Job job, Path jobSubmitDir)
    throws IOException {
  Configuration conf = job.getConfiguration();
  short replication = (short) conf.getInt(Job.SUBMIT_REPLICATION, 10);
  // Start copying
  copyAndConfigureFiles(job, jobSubmitDir, replication);

  // Set the working directory
  if (job.getWorkingDirectory() == null) {
    job.setWorkingDirectory(jtFs.getWorkingDirectory());
  }
}
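The replication factor read here defaults to 10 so that job.jar, job.xml and the split files are widely replicated for the many node-local reads at task-launch time. On a small cluster it can be lowered, e.g. (the literal string matches the Job.SUBMIT_REPLICATION constant read above):

// Job.SUBMIT_REPLICATION corresponds to this property
conf.setInt("mapreduce.client.submit.file.replication", 3);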


Below is the detailed copy method; its comments are fairly clear.
// Configures -files, -libjars and -archives.
private void copyAndConfigureFiles(Job job, Path submitJobDir,
    short replication) throws IOException {
  Configuration conf = job.getConfiguration();
  if (!(conf.getBoolean(Job.USED_GENERIC_PARSER, false))) {
    LOG.warn("Hadoop command-line option parsing not performed. "
        + "Implement the Tool interface and execute your application "
        + "with ToolRunner to remedy this.");
  }
  // Get all of the command line arguments passed in by the user conf
  String files = conf.get("tmpfiles");
  String libjars = conf.get("tmpjars");
  String archives = conf.get("tmparchives");
  String jobJar = job.getJar();

  // Figure out what fs the JobTracker is using. Copy the job to it,
  // under a temporary name. This allows DFS to work, and under the local
  // fs also provides UNIX-like object loading semantics. (that is, if the
  // job file is deleted right after submission, we can still run the
  // submission to completion)
  //
  // Create a number of filenames in the JobTracker's fs namespace
  LOG.debug("default FileSystem: " + jtFs.getUri());
  if (jtFs.exists(submitJobDir)) {
    throw new IOException("Not submitting job. Job directory " + submitJobDir
        + " already exists!! This is unexpected. Please check what's there in"
        + " that directory");
  }
  submitJobDir = jtFs.makeQualified(submitJobDir);
  submitJobDir = new Path(submitJobDir.toUri().getPath());
  FsPermission mapredSysPerms =
      new FsPermission(JobSubmissionFiles.JOB_DIR_PERMISSION);
  // Create the working directory
  FileSystem.mkdirs(jtFs, submitJobDir, mapredSysPerms);
  Path filesDir = JobSubmissionFiles.getJobDistCacheFiles(submitJobDir);
  Path archivesDir = JobSubmissionFiles.getJobDistCacheArchives(submitJobDir);
  Path libjarsDir = JobSubmissionFiles.getJobDistCacheLibjars(submitJobDir);
  // Add all the command line files/jars and archives: copy them to the
  // jobtracker's filesystem, creating the required directories first.
  if (files != null) {
    FileSystem.mkdirs(jtFs, filesDir, mapredSysPerms);
    String[] fileArr = files.split(",");
    for (String tmpFile : fileArr) {
      URI tmpURI = null;
      try {
        tmpURI = new URI(tmpFile);
      } catch (URISyntaxException e) {
        throw new IllegalArgumentException(e);
      }
      Path tmp = new Path(tmpURI);
      Path newPath = copyRemoteFiles(filesDir, tmp, conf, replication);
      try {
        URI pathURI = getPathURI(newPath, tmpURI.getFragment());
        DistributedCache.addCacheFile(pathURI, conf);
      } catch (URISyntaxException ue) {
        // should not throw a uri exception
        throw new IOException("Failed to create uri for " + tmpFile, ue);
      }
    }
  }
  if (libjars != null) {
    FileSystem.mkdirs(jtFs, libjarsDir, mapredSysPerms);
    String[] libjarsArr = libjars.split(",");
    for (String tmpjars : libjarsArr) {
      Path tmp = new Path(tmpjars);
      Path newPath = copyRemoteFiles(libjarsDir, tmp, conf, replication);
      DistributedCache.addFileToClassPath(
          new Path(newPath.toUri().getPath()), conf);
    }
  }
  if (archives != null) {
    FileSystem.mkdirs(jtFs, archivesDir, mapredSysPerms);
    String[] archivesArr = archives.split(",");
    for (String tmpArchives : archivesArr) {
      URI tmpURI;
      try {
        tmpURI = new URI(tmpArchives);
      } catch (URISyntaxException e) {
        throw new IllegalArgumentException(e);
      }
      Path tmp = new Path(tmpURI);
      Path newPath = copyRemoteFiles(archivesDir, tmp, conf, replication);
      try {
        URI pathURI = getPathURI(newPath, tmpURI.getFragment());
        DistributedCache.addCacheArchive(pathURI, conf);
      } catch (URISyntaxException ue) {
        // should not throw a uri exception
        throw new IOException("Failed to create uri for " + tmpArchives, ue);
      }
    }
  }
  if (jobJar != null) { // copy jar to JobTracker's fs
    // use jar name if job is not named.
    if ("".equals(job.getJobName())) {
      job.setJobName(new Path(jobJar).getName());
    }
    Path jobJarPath = new Path(jobJar);
    URI jobJarURI = jobJarPath.toUri();
    // If the job jar is already in fs, we don't need to copy it from local fs
    if (jobJarURI.getScheme() == null || jobJarURI.getAuthority() == null
        || !(jobJarURI.getScheme().equals(jtFs.getUri().getScheme())
             && jobJarURI.getAuthority().equals(jtFs.getUri().getAuthority()))) {
      // Copy wordcount.jar; note it is renamed to job.jar on the cluster,
      // with the replication factor of 10
      copyJar(jobJarPath, JobSubmissionFiles.getJobJar(submitJobDir),
          replication);
      job.setJar(JobSubmissionFiles.getJobJar(submitJobDir).toString());
    }
  } else {
    LOG.warn("No job jar file set.  User classes may not be found. "
        + "See Job or Job#setJar(String).");
  }
  // Set the timestamps of the archives and files.
  // Set the public/private visibility of the archives and files.
  ClientDistributedCacheManager.determineTimestampsAndCacheVisibilities(conf);
  // Get DelegationToken for each cached file
  ClientDistributedCacheManager.getDelegationTokens(conf,
      job.getCredentials());
}
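The LOG.warn at the top of this method fires when -files/-libjars/-archives were not parsed by GenericOptionsParser. The following is a minimal sketch of a driver that implements Tool so those options populate tmpfiles/tmpjars/tmparchives (class and path names here are illustrative, and WordCountMapper/WordCountReduce refer to the sketch earlier in this article):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class WordCountDriver extends Configured implements Tool {
  @Override
  public int run(String[] args) throws Exception {
    // getConf() already contains whatever -D/-files/-libjars options set
    Job job = Job.getInstance(getConf(), "WordCount");
    job.setJarByClass(WordCountDriver.class);
    job.setMapperClass(WordCountMapper.class);
    job.setReducerClass(WordCountReduce.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    return job.waitForCompletion(true) ? 0 : 1;
  }

  public static void main(String[] args) throws Exception {
    // e.g.: hadoop jar wc.jar WordCountDriver -libjars extra.jar /tmp/a.txt /tmp/output
    System.exit(ToolRunner.run(new Configuration(), new WordCountDriver(), args));
  }
}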
(2) The real job submission:
@Override
public JobStatus submitJob(JobID jobId, String jobSubmitDir, Credentials ts)
    throws IOException, InterruptedException {
  addHistoryToken(ts);

  // Construct the necessary information to start the MR AM
  ApplicationSubmissionContext appContext =
      createApplicationSubmissionContext(conf, jobSubmitDir, ts);

  // Submit to the ResourceManager
  try {
    ApplicationId applicationId =
        resMgrDelegate.submitApplication(appContext);

    ApplicationReport appMaster =
        resMgrDelegate.getApplicationReport(applicationId);
    String diagnostics = (appMaster == null ?
        "application report is null" : appMaster.getDiagnostics());
    if (appMaster == null
        || appMaster.getYarnApplicationState() == YarnApplicationState.FAILED
        || appMaster.getYarnApplicationState() == YarnApplicationState.KILLED) {
      throw new IOException("Failed to run job : " + diagnostics);
    }
    return clientCache.getClient(jobId).getJobStatus(jobId);
  } catch (YarnException e) {
    throw new IOException(e);
  }
}
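After submitApplication() returns, the report check above only catches an immediately FAILED or KILLED application. A hedged sketch of polling until the application leaves the submission states, using the same YarnClient API shown earlier (names and the 1-second interval are assumptions, not the article's code):

import java.util.EnumSet;
import org.apache.hadoop.yarn.api.records.ApplicationId;
import org.apache.hadoop.yarn.api.records.ApplicationReport;
import org.apache.hadoop.yarn.api.records.YarnApplicationState;
import org.apache.hadoop.yarn.client.api.YarnClient;

class AppStatePoller {
  // Blocks until the application is out of NEW/NEW_SAVING/SUBMITTED/ACCEPTED.
  static YarnApplicationState waitForRunning(YarnClient yarnClient,
      ApplicationId appId) throws Exception {
    EnumSet<YarnApplicationState> pending = EnumSet.of(
        YarnApplicationState.NEW, YarnApplicationState.NEW_SAVING,
        YarnApplicationState.SUBMITTED, YarnApplicationState.ACCEPTED);
    ApplicationReport report = yarnClient.getApplicationReport(appId);
    while (pending.contains(report.getYarnApplicationState())) {
      Thread.sleep(1000);
      report = yarnClient.getApplicationReport(appId);
    }
    return report.getYarnApplicationState();
  }
}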
