Stopping (Killing) a Running MapReduce Task by Its JobID

I. Description

After submitting a MapReduce task we can obtain its ID, typically a string of the form job_**********_xxxx. This article describes how to get the JobID and how to stop a running task by its JobID from another program.

II. The Process

1. Submit the task and get the ID value.

Normally, for a remote submission we use Job.waitForCompletion(true), which submits the task and then monitors its execution remotely (for example, from Eclipse). If you use the Job.submit() method instead, the task is simply handed to the cluster and the client returns immediately, without tracking the task's execution on the remote cluster.

job.submit();
System.out.println(job.getJobID());

With the Job.submit() method, the client's submission returns as soon as the task has been handed to the cluster. However, the job's related data is already encapsulated in the Job object before submission, so we can obtain the ID of the task we just submitted through Job.getJobID().
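Putting this together, here is a minimal remote-submission sketch that captures the JobID. The job name and the input/output paths are hypothetical placeholders, and no mapper/reducer is set, so Hadoop's identity classes are used:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class SubmitAndGetId {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("mapreduce.framework.name", "yarn");

        // Identity mapper/reducer are used when none are set; paths are placeholders.
        Job job = Job.getInstance(conf, "submit-and-get-id");
        job.setJarByClass(SubmitAndGetId.class);
        FileInputFormat.addInputPath(job, new Path("/tmp/in"));
        FileOutputFormat.setOutputPath(job, new Path("/tmp/out"));

        job.submit();                        // returns immediately; no monitoring
        System.out.println(job.getJobID()); // e.g. job_1461739723866_0094
        // Unlike job.waitForCompletion(true), the client exits without tracking progress.
    }
}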

2. Stop the task by JobID

To stop a task by its JobID, you need a handle to the job instance running on the cluster, which is obtained through JobClient. The following code retrieves a running task on the cluster by its ID value:

package mr;

import java.io.IOException;
import java.net.InetSocketAddress;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobStatus;
import org.apache.hadoop.mapred.RunningJob;

public class MRKillJob {

    public static void main(String[] args) {
        MRKillJob test = new MRKillJob();
        test.killJob("job_1461739723866_0094");
    }

    /**
     * Look up a running task on the cluster and print its status.
     *
     * @author Wozipa
     * @date 2016-6-9 15:02
     * @param id the JobID of the task
     */
    public void killJob(String id) {
        Configuration conf = new Configuration();
        conf.set("fs.defaultFS", "hdfs://hadoop1:9000");
        conf.set("yarn.resourcemanager.address", "hadoop1:8032");
        conf.set("yarn.resourcemanager.scheduler.address", "hadoop1:8030");
        conf.set("mapreduce.jobhistory.address", "192.98.12.234:10020");
        conf.set("mapreduce.framework.name", "yarn");
        conf.set("mapreduce.app-submission.cross-platform", "true");
        try {
            // Connect to the ResourceManager and fetch a handle to the running job.
            JobClient client = new JobClient(new InetSocketAddress("192.98.12.234", 8032), conf);
            RunningJob job = client.getJob(id);
            System.out.println(job.setupProgress());
            System.out.println(job.cleanupProgress());
            System.out.println(job.mapProgress());
            System.out.println(job.reduceProgress());
            System.out.println(job.getJobName());
            System.out.println(job.getJobState());
            System.out.println(job.isComplete());
            System.out.println(job.getFailureInfo());
            System.out.println(job.getHistoryUrl());
            System.out.println(job.getID());
            JobStatus status = job.getJobStatus();
            System.out.println(status.toString());
            System.out.println(job.getTrackingURL());
            // To actually kill the job, call job.killJob() (shown below).
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
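Note that the String overload of getJob() is deprecated in this API; if you prefer the non-deprecated overload, the ID string can be parsed first. This is a minor variation on the code above:

RunningJob job = client.getJob(org.apache.hadoop.mapred.JobID.forName(id));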

In Hadoop 2.x there are two ways to execute MapReduce: MRv1 and YARN. Because my cluster here uses the YARN mode (mapreduce.framework.name=yarn), task scheduling is handled by the ResourceManager process in YARN. For the JobClient object to connect to the cluster, it therefore needs to interact with the ResourceManager, so the network location used to create the JobClient object is the IP address and port number of the ResourceManager process.

The location of the ResourceManager process (IP + port) must also be set in the Configuration object, which is why the configuration used for remote task submission is copied here.
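In isolation, the part of the configuration that points the client at the ResourceManager looks like this (host name and ports copied from the example above; substitute your own):

Configuration conf = new Configuration();
conf.set("mapreduce.framework.name", "yarn");
conf.set("yarn.resourcemanager.address", "hadoop1:8032");           // client/submission RPC
conf.set("yarn.resourcemanager.scheduler.address", "hadoop1:8030"); // scheduler RPC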

Once you have the JobClient object, calling

RunningJob job = client.getJob(id);

returns a handle to the corresponding running task, through which you can then operate on the task.

During testing, once the task has completed, the JobClient object automatically fetches the task's execution information from the job history server. If the history server's address was not specified in the conf earlier, JobClient will look for the history server at the local address 0.0.0.0:10020, so the location of the history server needs to be specified in the conf. This is why the IP + port of the job history server is set in the conf above.
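The relevant line from the example above is just:

conf.set("mapreduce.jobhistory.address", "192.98.12.234:10020"); // otherwise JobClient falls back to 0.0.0.0:10020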

If you need to stop the task, calling

job.killJob();

completes the termination of the task.
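For completeness, the newer org.apache.hadoop.mapreduce API can do the same thing through the Cluster class. This is a minimal sketch under the same cluster configuration as above, not the method used in this article:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Cluster;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.JobID;

public class KillJobNewApi {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("mapreduce.framework.name", "yarn");
        conf.set("yarn.resourcemanager.address", "hadoop1:8032"); // same ResourceManager as above

        Cluster cluster = new Cluster(conf);
        Job job = cluster.getJob(JobID.forName("job_1461739723866_0094"));
        if (job != null) {
            job.killJob(); // ask the cluster to terminate the job
        }
        cluster.close();
    }
}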

The RunningJob object also exposes other information about the task's execution, which can be obtained through its methods: for example, setupProgress() returns the completion fraction of the job's setup phase, and mapProgress(), reduceProgress(), and cleanupProgress() have analogous meanings for the other phases. Users can explore the remaining properties through their own tests.
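As an illustration, here is a small polling loop over the same RunningJob handle; this is an assumption of how one might use these methods, not part of the original article:

import java.io.IOException;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.RunningJob;

public class WatchJob {
    // Poll a RunningJob until it finishes, printing phase progress (0.0 to 1.0).
    public static void watch(JobClient client, String id)
            throws IOException, InterruptedException {
        RunningJob job = client.getJob(id);
        while (!job.isComplete()) {
            System.out.printf("setup=%.2f map=%.2f reduce=%.2f cleanup=%.2f%n",
                    job.setupProgress(), job.mapProgress(),
                    job.reduceProgress(), job.cleanupProgress());
            Thread.sleep(5000); // wait five seconds between polls
        }
        System.out.println("final state: " + job.getJobState());
    }
}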

