Hadoop 2.x YARN Job Submission (Client Side)

The client submitting a YARN job still uses the RunJar class, just as in MR1; for reference see
http://blog.csdn.net/lihm0_1/article/details/13629375
In 1.x the job is submitted to the JobTracker; in 2.x that role is taken by the ResourceManager, and the client's proxy object changed accordingly, replaced by YARNRunner. The overall flow, however, is similar to 1.x. The main process is concentrated in JobSubmitter.submitJobInternal, which checks the output directory's legality, sets the job submission information (host and user), obtains a job ID, copies the files the job needs (job.jar, job.xml, split files, etc.) to HDFS, and then performs the actual submission. The submission process is described here using WordCount as an example.
public static void main(String[] args) throws Exception {
    // Create a job
    Configuration conf = new Configuration();

    conf.set("mapreduce.job.queuename", "p1");

    @SuppressWarnings("deprecation")
    Job job = new Job(conf, "WordCount");
    job.setJarByClass(WordCount.class);
    job.setJar("/root/wordcount-2.3.0.jar");

    // Set the output key/value types
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);

    // Set the map and reduce classes
    job.setMapperClass(WordCountMapper.class);
    job.setReducerClass(WordCountReduce.class);

    // Set the input and output paths
    FileInputFormat.addInputPath(job, new Path("/tmp/a.txt"));
    FileOutputFormat.setOutputPath(job, new Path("/tmp/output"));

    // Enter the job submission process, then loop monitoring job status
    job.waitForCompletion(true);
}
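The WordCountMapper and WordCountReduce classes referenced above are not part of the original listing; the following is a minimal sketch of what they might look like (standard word-count logic, with class names matching the driver above; this is an illustration, not the original author's code):

import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

public class WordCountMapper
    extends Mapper<LongWritable, Text, Text, IntWritable> {
  private static final IntWritable ONE = new IntWritable(1);
  private final Text word = new Text();

  @Override
  protected void map(LongWritable key, Text value, Context context)
      throws IOException, InterruptedException {
    // Emit (token, 1) for every whitespace-separated token in the line
    for (String token : value.toString().split("\\s+")) {
      if (!token.isEmpty()) {
        word.set(token);
        context.write(word, ONE);
      }
    }
  }
}

class WordCountReduce
    extends Reducer<Text, IntWritable, Text, IntWritable> {
  @Override
  protected void reduce(Text key, Iterable<IntWritable> values,
      Context context) throws IOException, InterruptedException {
    // Sum the counts for each word
    int sum = 0;
    for (IntWritable v : values) {
      sum += v.get();
    }
    context.write(key, new IntWritable(sum));
  }
}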
  
This part is the same as in 1.x: submit the job, then loop monitoring the job status.
public boolean waitForCompletion(boolean verbose
                                 ) throws IOException, InterruptedException,
                                          ClassNotFoundException {
  if (state == JobState.DEFINE) {
    submit(); // Submit the job
  }
  if (verbose) {
    monitorAndPrintJob();
  } else {
    // Get the completion poll interval from the client.
    int completionPollIntervalMillis =
      Job.getCompletionPollInterval(cluster.getConf());
    while (!isComplete()) {
      try {
        Thread.sleep(completionPollIntervalMillis);
      } catch (InterruptedException ie) {
      }
    }
  }
  return isSuccessful();
}
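The completion poll interval read above comes from the client configuration. As a hedged sketch (the property name is taken from the Hadoop 2.x Job class defaults; verify it for your version), it can be tuned before submission:

// Poll the job status every second instead of the 5-second default;
// this is the value returned by Job.getCompletionPollInterval() above.
conf.setInt("mapreduce.client.completion.pollinterval", 1000);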
The main analysis is of the submit() function, to see how the job is submitted; this is where 2.x differs from 1.x. It is still divided into two stages: 1) connect to the master, 2) submit the job.
public void submit()
       throws IOException, InterruptedException, ClassNotFoundException {
  ensureState(JobState.DEFINE);
  setUseNewAPI();
  // Connect to the RM
  connect();
  final JobSubmitter submitter =
      getJobSubmitter(cluster.getFileSystem(), cluster.getClient());
  status = ugi.doAs(new PrivilegedExceptionAction<JobStatus>() {
    public JobStatus run() throws IOException, InterruptedException,
    ClassNotFoundException {
      // Submit the job
      return submitter.submitJobInternal(Job.this, cluster);
    }
  });
  state = JobState.RUNNING;
  LOG.info("The url to track the job: " + getTrackingURL());
}
A Cluster instance is created while connecting to the master. Below is the Cluster constructor; the key part is the initialize() call.
public Cluster(InetSocketAddress jobTrackAddr, Configuration conf)
    throws IOException {
  this.conf = conf;
  this.ugi = UserGroupInformation.getCurrentUser();
  initialize(jobTrackAddr, conf);
}
The client proxy is created in the initialize() phase using java.util.ServiceLoader. Version 2.3.0 currently ships two providers, LocalClientProtocolProvider (local jobs) and YarnClientProtocolProvider (YARN jobs); the corresponding client is created according to the mapreduce.framework.name configuration.
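Before looking at initialize(), here is a minimal, self-contained sketch of the java.util.ServiceLoader mechanism itself (a generic example, not Hadoop code): implementations are registered in a META-INF/services file named after the interface, and the loader iterates over whatever implementations are on the classpath, exactly as Cluster iterates over ClientProtocolProviders below.

import java.util.ServiceLoader;

// Registration: a file META-INF/services/Greeter containing the line
//   EnglishGreeter
interface Greeter {
  String greet();
}

class EnglishGreeter implements Greeter {
  public String greet() { return "hello"; }
}

class GreeterMain {
  public static void main(String[] args) {
    // Iterate all Greeter implementations found on the classpath
    ServiceLoader<Greeter> loader = ServiceLoader.load(Greeter.class);
    for (Greeter g : loader) {
      System.out.println(g.greet());
    }
  }
}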
private void initialize(InetSocketAddress jobTrackAddr, Configuration conf)
    throws IOException {
  synchronized (frameworkLoader) {
    for (ClientProtocolProvider provider : frameworkLoader) {
      LOG.debug("Trying ClientProtocolProvider : "
          + provider.getClass().getName());
      ClientProtocol clientProtocol = null;
      try {
        if (jobTrackAddr == null) {
          // Create the YARNRunner object
          clientProtocol = provider.create(conf);
        } else {
          clientProtocol = provider.create(jobTrackAddr, conf);
        }
        // Initialize the cluster's internal member variables
        if (clientProtocol != null) {
          clientProtocolProvider = provider;
          client = clientProtocol;
          LOG.debug("Picked " + provider.getClass().getName()
              + " as the ClientProtocolProvider");
          break;
        } else {
          LOG.debug("Cannot pick " + provider.getClass().getName()
              + " as the ClientProtocolProvider - returned null protocol");
        }
      } catch (Exception e) {
        LOG.info("Failed to use " + provider.getClass().getName()
            + " due to error: " + e.getMessage());
      }
    }
  }
  // Exception handling: if this throws, the loaded jars are problematic;
  // the jar containing YARNRunner is not on the classpath.
  if (null == clientProtocolProvider || null == client) {
    throw new IOException(
        "Cannot initialize Cluster. Please check your configuration for "
        + MRConfig.FRAMEWORK_NAME
        + " and the correspond server addresses.");
  }
}
The provider's create() actually builds a YARNRunner object, because we are submitting a YARN job rather than a local one:
@Override
public ClientProtocol create(Configuration conf) throws IOException {
  if (MRConfig.YARN_FRAMEWORK_NAME.equals(conf.get(MRConfig.FRAMEWORK_NAME))) {
    return new YARNRunner(conf);
  }
  return null;
}
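For YarnClientProtocolProvider.create() to return a YARNRunner, the client configuration must select the YARN framework; with "local" instead, LocalClientProtocolProvider wins. A minimal sketch (the literal strings match the MRConfig constants referenced in the code above):

Configuration conf = new Configuration();
// Equivalent to setting mapreduce.framework.name=yarn in mapred-site.xml
conf.set("mapreduce.framework.name", "yarn");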
The chain of objects created for the client proxy is:
Cluster -> ClientProtocol (YARNRunner) -> ResourceMgrDelegate -> client (YarnClientImpl) -> rmClient (ApplicationClientProtocol)
The RPC proxy is created in the serviceStart phase of YarnClientImpl; note the protocol class used there.
protected void serviceStart() throws Exception {
  try {
    rmClient = ClientRMProxy.createRMProxy(getConfig(),
          ApplicationClientProtocol.class);
  } catch (IOException e) {
    throw new YarnRuntimeException(e);
  }
  super.serviceStart();
}
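YarnClientImpl is the concrete implementation behind the YarnClient facade. As a hedged usage sketch (standard org.apache.hadoop.yarn.client.api.YarnClient API in 2.x), a client can also be driven directly, which triggers the serviceStart() shown above:

import org.apache.hadoop.yarn.api.records.ApplicationReport;
import org.apache.hadoop.yarn.client.api.YarnClient;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class YarnClientDemo {
  public static void main(String[] args) throws Exception {
    YarnConfiguration conf = new YarnConfiguration();
    YarnClient yarnClient = YarnClient.createYarnClient();
    yarnClient.init(conf);
    // start() runs serviceStart(), creating the ApplicationClientProtocol proxy
    yarnClient.start();
    for (ApplicationReport report : yarnClient.getApplications()) {
      System.out.println(report.getApplicationId() + " "
          + report.getYarnApplicationState());
    }
    yarnClient.stop();
  }
}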
The YARNRunner constructor is as follows:
public YARNRunner(Configuration conf, ResourceMgrDelegate resMgrDelegate,
    ClientCache clientCache) {
  this.conf = conf;
  try {
    this.resMgrDelegate = resMgrDelegate;
    this.clientCache = clientCache;
    this.defaultFileContext = FileContext.getFileContext(this.conf);
  } catch (UnsupportedFileSystemException ufe) {
    throw new RuntimeException("Error in instantiating YarnClient", ufe);
  }
}
Now for the core of the submission, JobSubmitter.submitJobInternal:
JobStatus submitJobInternal(Job job, Cluster cluster)
    throws ClassNotFoundException, InterruptedException, IOException {

  // Check the output directory's legality: whether it already exists, or is not set
  checkSpecs(job);

  Configuration conf = job.getConfiguration();
  addMRFrameworkToDistributedCache(conf);

  // Get the staging area that stores files used while the job executes.
  // Default location: /tmp/hadoop-yarn/staging/root/.staging; the path can
  // be changed via yarn.app.mapreduce.am.staging-dir
  Path jobStagingArea = JobSubmissionFiles.getStagingDir(cluster, conf);

  // Host name and address settings
  InetAddress ip = InetAddress.getLocalHost();
  if (ip != null) {
    submitHostAddress = ip.getHostAddress();
    submitHostName = ip.getHostName();
    conf.set(MRJobConfig.JOB_SUBMITHOST, submitHostName);
    conf.set(MRJobConfig.JOB_SUBMITHOSTADDR, submitHostAddress);
  }

  // Get the new job ID; an RPC call is required here
  JobID jobId = submitClient.getNewJobID();
  job.setJobID(jobId);

  // Get the submit directory, e.g. /tmp/hadoop-yarn/staging/root/.staging/job_1395778831382_0002
  Path submitJobDir = new Path(jobStagingArea, jobId.toString());
  JobStatus status = null;
  try {
    conf.set(MRJobConfig.USER_NAME,
        UserGroupInformation.getCurrentUser().getShortUserName());
    conf.set("hadoop.http.filter.initializers",
        "org.apache.hadoop.yarn.server.webproxy.amfilter.AmFilterInitializer");
    conf.set(MRJobConfig.MAPREDUCE_JOB_DIR, submitJobDir.toString());
    LOG.debug("Configuring job " + jobId + " with " + submitJobDir
        + " as the submit dir");

    // Get delegation token for the dir
    TokenCache.obtainTokensForNamenodes(job.getCredentials(),
        new Path[] { submitJobDir }, conf);

    populateTokenCache(conf, job.getCredentials());

    // Generate a secret to authenticate shuffle transfers
    if (TokenCache.getShuffleSecretKey(job.getCredentials()) == null) {
      KeyGenerator keyGen;
      try {
        keyGen = KeyGenerator.getInstance(SHUFFLE_KEYGEN_ALGORITHM);
        keyGen.init(SHUFFLE_KEY_LENGTH);
      } catch (NoSuchAlgorithmException e) {
        throw new IOException("Error generating shuffle secret key", e);
      }
      SecretKey shuffleKey = keyGen.generateKey();
      TokenCache.setShuffleSecretKey(shuffleKey.getEncoded(),
          job.getCredentials());
    }

    // Copy the required files to the cluster; analyzed separately below, see (1)
    copyAndConfigureFiles(job, submitJobDir);

    Path submitJobFile = JobSubmissionFiles.getJobConfPath(submitJobDir);

    // Write the split files job.split and job.splitmetainfo; the write
    // process is the same as in MR1 and is covered in the earlier article
    LOG.debug("Creating splits at " + jtFs.makeQualified(submitJobDir));
    int maps = writeSplits(job, submitJobDir);
    conf.setInt(MRJobConfig.NUM_MAPS, maps);
    LOG.info("number of splits:" + maps);

    // Write "queue admins of the queue to which job is being submitted"
    // to job file: set the queue name
    String queue = conf.get(MRJobConfig.QUEUE_NAME, JobConf.DEFAULT_QUEUE_NAME);
    AccessControlList acl = submitClient.getQueueAdmins(queue);
    conf.set(toFullPropertyName(queue,
        QueueACL.ADMINISTER_JOBS.getAclName()), acl.getAclString());

    // Removing jobtoken referrals before copying the jobconf to HDFS
    // as the tasks don't need this setting, actually they may break
    // because of it if present, as the referral will point to a different job.
    TokenCache.cleanUpTokenReferral(conf);

    if (conf.getBoolean(
        MRJobConfig.JOB_TOKEN_TRACKING_IDS_ENABLED,
        MRJobConfig.DEFAULT_JOB_TOKEN_TRACKING_IDS_ENABLED)) {
      // Add HDFS tracking ids
      ArrayList<String> trackingIds = new ArrayList<String>();
      for (Token<? extends TokenIdentifier> t :
          job.getCredentials().getAllTokens()) {
        trackingIds.add(t.decodeIdentifier().getTrackingId());
      }
      conf.setStrings(MRJobConfig.JOB_TOKEN_TRACKING_IDS,
          trackingIds.toArray(new String[trackingIds.size()]));
    }

    // Write the job file (job.xml) to the submit dir
    writeConf(conf, submitJobFile);

    // Now, actually submit the job (using the submit name);
    // only here does the real submission begin, see analysis (2) below
    printTokens(jobId, job.getCredentials());
    status = submitClient.submitJob(
        jobId, submitJobDir.toString(), job.getCredentials());
    if (status != null) {
      return status;
    } else {
      throw new IOException("Could not launch job");
    }
  } finally {
    if (status == null) {
      LOG.info("Cleaning up the staging area " + submitJobDir);
      if (jtFs != null && submitJobDir != null)
        jtFs.delete(submitJobDir, true);
    }
  }
}
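As noted in the comments above, the staging area defaults to /tmp/hadoop-yarn/staging/<user>/.staging. A hedged one-liner for relocating it (the property name appears in the code comments above; treat the target path as illustrative):

// Move the per-user .staging directory out of /tmp
conf.set("yarn.app.mapreduce.am.staging-dir", "/user/hadoop/staging");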
(1) The file copy process is as follows; the default replication factor for submitted files is 10:
private void copyAndConfigureFiles(Job job, Path jobSubmitDir)
    throws IOException {
  Configuration conf = job.getConfiguration();
  short replication = (short) conf.getInt(Job.SUBMIT_REPLICATION, 10);
  // Start copying
  copyAndConfigureFiles(job, jobSubmitDir, replication);

  // Set the working directory
  if (job.getWorkingDirectory() == null) {
    job.setWorkingDirectory(jtFs.getWorkingDirectory());
  }
}
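The replication factor read here defaults to 10 so that job.jar, job.xml and the split files are widely replicated for the many node-local reads at task-launch time. On a small cluster it can be lowered, e.g. (the literal string matches the Job.SUBMIT_REPLICATION constant read above):

// Job.SUBMIT_REPLICATION corresponds to this property
conf.setInt("mapreduce.client.submit.file.replication", 3);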


Below is the detailed copy method; its comments are fairly clear.
// Configures -files, -libjars and -archives.
private void copyAndConfigureFiles(Job job, Path submitJobDir,
    short replication) throws IOException {
  Configuration conf = job.getConfiguration();
  if (!(conf.getBoolean(Job.USED_GENERIC_PARSER, false))) {
    LOG.warn("Hadoop command-line option parsing not performed. "
        + "Implement the Tool interface and execute your application "
        + "with ToolRunner to remedy this.");
  }
  // Get all of the command line arguments passed in by the user conf
  String files = conf.get("tmpfiles");
  String libjars = conf.get("tmpjars");
  String archives = conf.get("tmparchives");
  String jobJar = job.getJar();

  // Figure out what fs the JobTracker is using. Copy the job to it,
  // under a temporary name. This allows DFS to work, and under the local
  // fs also provides UNIX-like object loading semantics. (that is, if the
  // job file is deleted right after submission, we can still run the
  // submission to completion)
  //
  // Create a number of filenames in the JobTracker's fs namespace
  LOG.debug("default FileSystem: " + jtFs.getUri());
  if (jtFs.exists(submitJobDir)) {
    throw new IOException("Not submitting job. Job directory " + submitJobDir
        + " already exists!! This is unexpected. Please check what's there in"
        + " that directory");
  }
  submitJobDir = jtFs.makeQualified(submitJobDir);
  submitJobDir = new Path(submitJobDir.toUri().getPath());
  FsPermission mapredSysPerms =
      new FsPermission(JobSubmissionFiles.JOB_DIR_PERMISSION);
  // Create the working directory
  FileSystem.mkdirs(jtFs, submitJobDir, mapredSysPerms);
  Path filesDir = JobSubmissionFiles.getJobDistCacheFiles(submitJobDir);
  Path archivesDir = JobSubmissionFiles.getJobDistCacheArchives(submitJobDir);
  Path libjarsDir = JobSubmissionFiles.getJobDistCacheLibjars(submitJobDir);
  // Add all the command line files/jars and archives: copy them to the
  // jobtracker's filesystem, creating the required directories first.
  if (files != null) {
    FileSystem.mkdirs(jtFs, filesDir, mapredSysPerms);
    String[] fileArr = files.split(",");
    for (String tmpFile : fileArr) {
      URI tmpURI = null;
      try {
        tmpURI = new URI(tmpFile);
      } catch (URISyntaxException e) {
        throw new IllegalArgumentException(e);
      }
      Path tmp = new Path(tmpURI);
      Path newPath = copyRemoteFiles(filesDir, tmp, conf, replication);
      try {
        URI pathURI = getPathURI(newPath, tmpURI.getFragment());
        DistributedCache.addCacheFile(pathURI, conf);
      } catch (URISyntaxException ue) {
        // should not throw a uri exception
        throw new IOException("Failed to create uri for " + tmpFile, ue);
      }
    }
  }
  if (libjars != null) {
    FileSystem.mkdirs(jtFs, libjarsDir, mapredSysPerms);
    String[] libjarsArr = libjars.split(",");
    for (String tmpjars : libjarsArr) {
      Path tmp = new Path(tmpjars);
      Path newPath = copyRemoteFiles(libjarsDir, tmp, conf, replication);
      DistributedCache.addFileToClassPath(
          new Path(newPath.toUri().getPath()), conf);
    }
  }
  if (archives != null) {
    FileSystem.mkdirs(jtFs, archivesDir, mapredSysPerms);
    String[] archivesArr = archives.split(",");
    for (String tmpArchives : archivesArr) {
      URI tmpURI;
      try {
        tmpURI = new URI(tmpArchives);
      } catch (URISyntaxException e) {
        throw new IllegalArgumentException(e);
      }
      Path tmp = new Path(tmpURI);
      Path newPath = copyRemoteFiles(archivesDir, tmp, conf, replication);
      try {
        URI pathURI = getPathURI(newPath, tmpURI.getFragment());
        DistributedCache.addCacheArchive(pathURI, conf);
      } catch (URISyntaxException ue) {
        // should not throw a uri exception
        throw new IOException("Failed to create uri for " + tmpArchives, ue);
      }
    }
  }
  if (jobJar != null) { // copy jar to JobTracker's fs
    // use jar name if job is not named.
    if ("".equals(job.getJobName())) {
      job.setJobName(new Path(jobJar).getName());
    }
    Path jobJarPath = new Path(jobJar);
    URI jobJarURI = jobJarPath.toUri();
    // If the job jar is already in fs, we don't need to copy it from local fs
    if (jobJarURI.getScheme() == null || jobJarURI.getAuthority() == null
        || !(jobJarURI.getScheme().equals(jtFs.getUri().getScheme())
             && jobJarURI.getAuthority().equals(jtFs.getUri().getAuthority()))) {
      // Copy wordcount.jar; note it is renamed to job.jar on the cluster,
      // with the replication factor of 10
      copyJar(jobJarPath, JobSubmissionFiles.getJobJar(submitJobDir),
          replication);
      job.setJar(JobSubmissionFiles.getJobJar(submitJobDir).toString());
    }
  } else {
    LOG.warn("No job jar file set.  User classes may not be found. "
        + "See Job or Job#setJar(String).");
  }
  // Set the timestamps of the archives and files.
  // Set the public/private visibility of the archives and files.
  ClientDistributedCacheManager.determineTimestampsAndCacheVisibilities(conf);
  // Get DelegationToken for each cached file
  ClientDistributedCacheManager.getDelegationTokens(conf,
      job.getCredentials());
}
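The LOG.warn at the top of this method fires when -files/-libjars/-archives were not parsed by GenericOptionsParser. The following is a minimal sketch of a driver that implements Tool so those options populate tmpfiles/tmpjars/tmparchives (class and path names here are illustrative, and WordCountMapper/WordCountReduce refer to the sketch earlier in this article):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class WordCountDriver extends Configured implements Tool {
  @Override
  public int run(String[] args) throws Exception {
    // getConf() already contains whatever -D/-files/-libjars options set
    Job job = Job.getInstance(getConf(), "WordCount");
    job.setJarByClass(WordCountDriver.class);
    job.setMapperClass(WordCountMapper.class);
    job.setReducerClass(WordCountReduce.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    return job.waitForCompletion(true) ? 0 : 1;
  }

  public static void main(String[] args) throws Exception {
    // e.g.: hadoop jar wc.jar WordCountDriver -libjars extra.jar /tmp/a.txt /tmp/output
    System.exit(ToolRunner.run(new Configuration(), new WordCountDriver(), args));
  }
}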
(2) The real job submission:
@Override
public JobStatus submitJob(JobID jobId, String jobSubmitDir, Credentials ts)
    throws IOException, InterruptedException {
  addHistoryToken(ts);

  // Construct the necessary information to start the MR AM
  ApplicationSubmissionContext appContext =
      createApplicationSubmissionContext(conf, jobSubmitDir, ts);

  // Submit to the ResourceManager
  try {
    ApplicationId applicationId =
        resMgrDelegate.submitApplication(appContext);

    ApplicationReport appMaster =
        resMgrDelegate.getApplicationReport(applicationId);
    String diagnostics = (appMaster == null ?
        "application report is null" : appMaster.getDiagnostics());
    if (appMaster == null
        || appMaster.getYarnApplicationState() == YarnApplicationState.FAILED
        || appMaster.getYarnApplicationState() == YarnApplicationState.KILLED) {
      throw new IOException("Failed to run job : " + diagnostics);
    }
    return clientCache.getClient(jobId).getJobStatus(jobId);
  } catch (YarnException e) {
    throw new IOException(e);
  }
}
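After submitApplication() returns, the report check above only catches an immediately FAILED or KILLED application. A hedged sketch of polling until the application leaves the submission states, using the same YarnClient API shown earlier (names and the 1-second interval are assumptions, not the article's code):

import java.util.EnumSet;
import org.apache.hadoop.yarn.api.records.ApplicationId;
import org.apache.hadoop.yarn.api.records.ApplicationReport;
import org.apache.hadoop.yarn.api.records.YarnApplicationState;
import org.apache.hadoop.yarn.client.api.YarnClient;

class AppStatePoller {
  // Blocks until the application is out of NEW/NEW_SAVING/SUBMITTED/ACCEPTED.
  static YarnApplicationState waitForRunning(YarnClient yarnClient,
      ApplicationId appId) throws Exception {
    EnumSet<YarnApplicationState> pending = EnumSet.of(
        YarnApplicationState.NEW, YarnApplicationState.NEW_SAVING,
        YarnApplicationState.SUBMITTED, YarnApplicationState.ACCEPTED);
    ApplicationReport report = yarnClient.getApplicationReport(appId);
    while (pending.contains(report.getYarnApplicationState())) {
      Thread.sleep(1000);
      report = yarnClient.getApplicationReport(appId);
    }
    return report.getYarnApplicationState();
  }
}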
