If a MapReduce workflow needs more than one job, and dependencies must be set between the jobs (for example, job3 depends on both job1 and job2), you can use JobControl, as follows:
JobControl jbcntrl = new JobControl("jbcntrl");
jbcntrl.addJob(job1);
jbcntrl.addJob(job2);
jbcntrl.addJob(job3);
job3.addDependingJob(job1);  // job3 starts only after job1 finishes
job3.addDependingJob(job2);  // job3 starts only after job2 finishes
Thread theController = new Thread(jbcntrl);
theController.start();
while (!jbcntrl.allFinished()) {
    Thread.sleep(500);  // Thread.sleep() requires a millisecond argument
}
jbcntrl.stop();
Each job still needs its own configuration; JobControl only wires the jobs together and manages the order in which they run.
Because JobControl implements the Runnable interface, you run it on its own thread and finally shut it down with the stop() method. If you call run() directly instead of running it on a separate thread, the jobs all finish and their results are written out, but the program never exits, because nothing is left to call stop().
Another problem is that when jobs are chained through JobControl, the per-job counters, such as map input records and reduce input records, are no longer printed to the console; if you need that information, you have to look it up in the job's web UI.
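The pattern above (a controller on its own thread that polls job states and launches a job only once its dependencies have finished) can be illustrated with a minimal self-contained sketch. The class and method names below (MiniJobControl, MiniJob, ready()) are illustrative stand-ins, not Hadoop's actual API; a real MiniJob.run() would submit a MapReduce job instead of just flipping a flag:

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of the JobControl pattern: a Runnable controller
// that starts each job only when all of its dependencies have finished.
public class MiniJobControl implements Runnable {

    public static class MiniJob {
        final String name;
        final List<MiniJob> deps = new ArrayList<>();
        volatile boolean finished = false;

        public MiniJob(String name) { this.name = name; }

        // Mirrors Job.addDependingJob(): this job must wait for dep.
        public void addDependingJob(MiniJob dep) { deps.add(dep); }

        boolean ready() {
            for (MiniJob d : deps) if (!d.finished) return false;
            return true;
        }

        void run() { finished = true; }  // stand-in for running a real job
    }

    private final List<MiniJob> jobs = new ArrayList<>();
    private volatile boolean stopped = false;

    public void addJob(MiniJob j) { jobs.add(j); }

    public boolean allFinished() {
        for (MiniJob j : jobs) if (!j.finished) return false;
        return true;
    }

    public void stop() { stopped = true; }

    @Override
    public void run() {
        // Poll until stopped; this is why run() never returns on its own
        // unless another thread calls stop().
        while (!stopped) {
            for (MiniJob j : jobs) {
                if (!j.finished && j.ready()) j.run();
            }
            try { Thread.sleep(10); } catch (InterruptedException e) { return; }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        MiniJob job1 = new MiniJob("job1");
        MiniJob job2 = new MiniJob("job2");
        MiniJob job3 = new MiniJob("job3");
        job3.addDependingJob(job1);
        job3.addDependingJob(job2);

        MiniJobControl ctrl = new MiniJobControl();
        ctrl.addJob(job1);
        ctrl.addJob(job2);
        ctrl.addJob(job3);

        Thread controller = new Thread(ctrl);
        controller.start();
        while (!ctrl.allFinished()) {
            Thread.sleep(10);
        }
        ctrl.stop();
        controller.join();
        System.out.println("all finished: " + ctrl.allFinished());
    }
}
```

Note that main() mirrors the snippet above: the caller polls allFinished() and then calls stop(), which is what lets the controller thread's run() loop exit.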