TaskTracker Analysis of Hadoop

Last Update:2018-12-03 Source: Internet

Author: User


The responsibilities of the TaskTracker were covered in a previous article. It is mainly responsible for requesting, maintaining, and monitoring tasks, and it communicates with the JobTracker through heartbeats.

Initialization process of the TaskTracker:

1. Read the configuration file and parse parameters

2. Delete the existing user-local files on the TaskTracker and create new directories and files

3. Map<TaskAttemptID, TaskInProgress> tasks = new HashMap<TaskAttemptID, TaskInProgress>(); the task map is cleared

4. this.runningTasks = new LinkedHashMap<TaskAttemptID, TaskInProgress>(); records the running tasks in insertion order
this.runningJobs = new TreeMap<JobID, RunningJob>(); records the running jobs, keyed and sorted by job ID
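The difference between the two map types in step 4 is their iteration order. A minimal sketch, using plain String keys and made-up IDs in place of TaskAttemptID and JobID (MapOrderingSketch is a hypothetical class, not part of Hadoop):

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; the IDs below are invented.
class MapOrderingSketch {

    // LinkedHashMap iterates in the order entries were inserted.
    static List<String> insertionOrder() {
        Map<String, String> runningTasks = new LinkedHashMap<>();
        runningTasks.put("attempt_3", "tip-3");
        runningTasks.put("attempt_1", "tip-1");
        runningTasks.put("attempt_2", "tip-2");
        return new ArrayList<>(runningTasks.keySet());
    }

    // TreeMap iterates in the natural (sorted) order of its keys.
    static List<String> sortedOrder() {
        Map<String, String> runningJobs = new TreeMap<>();
        runningJobs.put("job_3", "rjob-3");
        runningJobs.put("job_1", "rjob-1");
        runningJobs.put("job_2", "rjob-2");
        return new ArrayList<>(runningJobs.keySet());
    }

    public static void main(String[] args) {
        System.out.println(insertionOrder());
        System.out.println(sortedOrder());
    }
}
```

The first list keeps the insertion order (attempt_3, attempt_1, attempt_2), while the second comes back sorted by key.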

5. Initialize the JvmManager:

    mapJvmManager = new JvmManagerForType(tracker.getMaxCurrentMapTasks(),
        true, tracker);
    reduceJvmManager = new JvmManagerForType(tracker.getMaxCurrentReduceTasks(),
        false, tracker);

6. Initialize RPC and obtain the JobTracker client used for heartbeat communication

7. Start a new background thread that listens for map task completion events:

    this.mapEventsFetcher = new MapEventsFetcherThread();
    mapEventsFetcher.setDaemon(true);
    mapEventsFetcher.setName(
                             "Map-events fetcher for all reduce tasks " + "on " +
                             taskTrackerName);
    mapEventsFetcher.start();

The run method of the background thread is as follows:

    while (running) {
      try {
        List<FetchStatus> fList = null;
        synchronized (runningJobs) {
          while (((fList = reducesInShuffle()).size()) == 0) {
            try {
              runningJobs.wait();
            } catch (InterruptedException e) {
              LOG.info("Shutting down: " + this.getName());
              return;
            }
          }
        }
        // now fetch all the map task events for all the reduce tasks
        // possibly belonging to different jobs
        boolean fetchAgain = false; // flag signifying whether we want to fetch
                                    // immediately again.
        for (FetchStatus f : fList) {
          long currentTime = System.currentTimeMillis();
          try {
            // the method below will return true when we have not
            // fetched all available events yet
            if (f.fetchMapCompletionEvents(currentTime)) {
              fetchAgain = true;
            }
          } catch (Exception e) {
            LOG.warn(
                     "Ignoring exception that fetch for map completion" +
                     " events threw for " + f.jobId + " threw: " +
                     StringUtils.stringifyException(e));
          }
          if (!running) {
            break;
          }
        }
        synchronized (waitingOn) {
          try {
            if (!fetchAgain) {
              waitingOn.wait(heartbeatInterval);
            }
          } catch (InterruptedException ie) {
            LOG.info("Shutting down: " + this.getName());
            return;
          }
        }
      } catch (Exception e) {
        LOG.info("Ignoring exception " + e.getMessage());
      }
    }
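The loop above relies on Java's guarded-wait idiom: the thread blocks on a monitor until a condition holds and re-checks that condition after every wakeup. The following is a minimal sketch of the idiom only, not Hadoop code. GuardedWaitWorker, addWork, awaitProcessed, and the plain lists that stand in for reducesInShuffle() are all hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the fetcher thread; not Hadoop code.
class GuardedWaitWorker extends Thread {
    private final Object lock = new Object();   // plays the role of runningJobs
    private final List<String> pending = new ArrayList<>();
    private final List<String> processed = new ArrayList<>();
    private volatile boolean running = true;

    // Producer side: queue an item and wake any waiting thread.
    void addWork(String item) {
        synchronized (lock) {
            pending.add(item);
            lock.notifyAll();
        }
    }

    @Override
    public void run() {
        while (running) {
            synchronized (lock) {
                // Re-check the condition in a loop: wait() may wake spuriously.
                while (pending.isEmpty()) {
                    try {
                        lock.wait();
                    } catch (InterruptedException e) {
                        return;   // interrupted: shut down
                    }
                }
                processed.addAll(pending);   // "handle" the whole batch
                pending.clear();
                lock.notifyAll();            // wake callers of awaitProcessed
            }
        }
    }

    // Timed wait: true once at least n items were processed, false on timeout.
    boolean awaitProcessed(int n, long timeoutMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        synchronized (lock) {
            while (processed.size() < n) {
                long left = deadline - System.currentTimeMillis();
                if (left <= 0) {
                    return false;
                }
                try {
                    lock.wait(left);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return false;
                }
            }
            return true;
        }
    }

    List<String> processedSnapshot() {
        synchronized (lock) {
            return new ArrayList<>(processed);
        }
    }

    // Stops the loop and interrupts a thread blocked in wait().
    void shutdown() {
        running = false;
        interrupt();
    }
}
```

The untimed wait() in run() corresponds to runningJobs.wait() in the excerpt, and the timed wait in awaitProcessed plays a role similar to waitingOn.wait(heartbeatInterval): it bounds how long a thread sleeps before it looks at the state again.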

8. initializeMemoryManagement: initializes the memory settings of the TaskTracker

9. Start the launcher background threads for map and reduce tasks:

    mapLauncher = new TaskLauncher(TaskType.MAP, maxMapSlots);
    reduceLauncher = new TaskLauncher(TaskType.REDUCE, maxReduceSlots);
    mapLauncher.start();
    reduceLauncher.start();

These launchers are used to create child JVMs that execute the map and reduce tasks.

Take a look at the run method of TaskRunner, the thread that prepares and launches the child JVM:

    // before preparing the job localize
    // all the archives
    TaskAttemptID taskid = t.getTaskID();
    final LocalDirAllocator lDirAlloc = new LocalDirAllocator("mapred.local.dir");
    // simply get the location of the workDir and pass it to the child. The
    // child will do the actual dir creation
    final File workDir =
        new File(new Path(localdirs[rand.nextInt(localdirs.length)],
            TaskTracker.getTaskWorkDir(t.getUser(), taskid.getJobID().toString(),
            taskid.toString(),
            t.isTaskCleanupTask())).toString());

    String user = tip.getUGI().getUserName();

    // Set up the child task's configuration. After this call, no localization
    // of files should happen in the TaskTracker's process space. Any changes to
    // the conf object after this will NOT be reflected to the child.
    // setupChildTaskConfiguration(lDirAlloc);

    if (!prepare()) {
      return;
    }

    // Accumulates class paths for child.
    List<String> classPaths = getClassPaths(conf, workDir,
                                            taskDistributedCacheManager);

    long logSize = TaskLog.getTaskLogLength(conf);

    // Build exec child JVM args.
    Vector<String> vargs = getVMArgs(taskid, workDir, classPaths, logSize);

    tracker.addToMemoryManager(t.getTaskID(), t.isMapTask(), conf);

    // set memory limit using ulimit if feasible and necessary ...
    String setup = getVMSetupCmd();
    // Set up the redirection of the task's stdout and stderr streams
    File[] logFiles = prepareLogFiles(taskid, t.isTaskCleanupTask());
    File stdout = logFiles[0];
    File stderr = logFiles[1];
    tracker.getTaskTrackerInstrumentation().reportTaskLaunch(taskid, stdout,
                                                             stderr);

    Map<String, String> env = new HashMap<String, String>();
    errorInfo = getVMEnvironment(errorInfo, user, workDir, conf, env, taskid,
                                 logSize);

    // flatten the env as a set of export commands
    List<String> setupCmds = new ArrayList<String>();
    for (Entry<String, String> entry : env.entrySet()) {
      StringBuffer sb = new StringBuffer();
      sb.append("export ");
      sb.append(entry.getKey());
      sb.append("=\"");
      sb.append(entry.getValue());
      sb.append("\"");
      setupCmds.add(sb.toString());
    }
    setupCmds.add(setup);

    launchJvmAndWait(setupCmds, vargs, stdout, stderr, logSize, workDir);
    tracker.getTaskTrackerInstrumentation().reportTaskEnd(t.getTaskID());
    if (exitCodeSet) {
      if (!killed && exitCode != 0) {
        if (exitCode == 65) {
          tracker.getTaskTrackerInstrumentation().taskFailedPing(t.getTaskID());
        }
        throw new IOException("Task process exit with nonzero status of " +
                              exitCode + ".");
      }
    }

The run method prepares a new child JVM for the current task. It sets up the working directory, the environment, the JVM startup arguments, the setup commands, and other information for the task, and then calls into TaskController to start a new JVM that executes the task.
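Two steps of that method are easy to try in isolation: flattening the environment map into shell export lines, and treating a nonzero exit code as an error unless the task was killed. The sketch below is a simplified stand-in, not Hadoop code. ChildLaunchSketch, flattenEnv, and checkExit are hypothetical names, the environment values are invented, and quotes inside values are not escaped.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper mirroring two steps of the excerpt; not Hadoop code.
class ChildLaunchSketch {

    // Turns each environment entry into a shell line: export KEY="VALUE"
    // Note: values containing double quotes are not escaped in this sketch.
    static List<String> flattenEnv(Map<String, String> env) {
        List<String> cmds = new ArrayList<>();
        for (Map.Entry<String, String> e : env.entrySet()) {
            cmds.add("export " + e.getKey() + "=\"" + e.getValue() + "\"");
        }
        return cmds;
    }

    // A nonzero exit code is an error unless the task was deliberately killed.
    static void checkExit(boolean killed, int exitCode) throws IOException {
        if (!killed && exitCode != 0) {
            throw new IOException(
                "Task process exit with nonzero status of " + exitCode + ".");
        }
    }

    public static void main(String[] args) throws IOException {
        Map<String, String> env = new LinkedHashMap<>();
        env.put("WORK_DIR", "/tmp/work");       // example values only
        env.put("LOG_DIR", "/tmp/work/logs");
        for (String line : flattenEnv(env)) {
            System.out.println(line);
        }
        checkExit(false, 0);   // exit code 0: no exception is thrown
    }
}
```

Writing the environment as export lines lets the setup commands and the JVM command run in a single shell invocation, which appears to be why the original flattens the map rather than passing it to the process directly.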

The class relationship diagram is not reproduced here; the call chain ends with the launchTask method of TaskController.

10. Start the node health monitor: startHealthMonitor(this.fConf);

Now look at the run method of TaskLauncher. It continuously takes new tasks from the TaskTracker's queue and calls the startNewTask method for each of them. The task is then launched by the following logic:

    if (this.taskStatus.getRunState() == TaskStatus.State.UNASSIGNED ||
        this.taskStatus.getRunState() == TaskStatus.State.FAILED_UNCLEAN ||
        this.taskStatus.getRunState() == TaskStatus.State.KILLED_UNCLEAN) {
      localizeTask(task);
      if (this.taskStatus.getRunState() == TaskStatus.State.UNASSIGNED) {
        this.taskStatus.setRunState(TaskStatus.State.RUNNING);
      }
      setTaskRunner(task.createRunner(TaskTracker.this, this, rjob));
      this.runner.start();
      long now = System.currentTimeMillis();
      this.taskStatus.setStartTime(now);
      this.lastProgressReport = now;
    }
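The launcher's job, taking queued tasks in order and starting one only when enough slots are free, can be sketched without threads. This is an assumption-laden simplification rather than the actual TaskLauncher: SlotLauncherSketch and all of its members are hypothetical, and where this version simply returns, a launcher thread would presumably block until slots are released.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical, single-threaded model of a slot-limited launcher.
class SlotLauncherSketch {

    private static final class PendingTask {
        final String id;
        final int slotsRequired;

        PendingTask(String id, int slotsRequired) {
            this.id = id;
            this.slotsRequired = slotsRequired;
        }
    }

    private final Deque<PendingTask> tasksToLaunch = new ArrayDeque<>();
    private int numFreeSlots;

    SlotLauncherSketch(int maxSlots) {
        this.numFreeSlots = maxSlots;
    }

    // Queue a task for launching (FIFO).
    void addToTaskQueue(String id, int slotsRequired) {
        tasksToLaunch.addLast(new PendingTask(id, slotsRequired));
    }

    // Called when a finished task gives its slots back.
    void addFreeSlots(int slots) {
        numFreeSlots += slots;
    }

    int freeSlots() {
        return numFreeSlots;
    }

    // Launches tasks from the head of the queue while slots allow.
    // Stops at the first task that does not fit, preserving FIFO order.
    List<String> launchReadyTasks() {
        List<String> launched = new ArrayList<>();
        while (!tasksToLaunch.isEmpty()
                && tasksToLaunch.peekFirst().slotsRequired <= numFreeSlots) {
            PendingTask t = tasksToLaunch.removeFirst();
            numFreeSlots -= t.slotsRequired;
            launched.add(t.id);   // a real launcher would start the task here
        }
        return launched;
    }
}
```

With two slots and three single-slot tasks queued, the first call to launchReadyTasks() starts two tasks; the third starts only after addFreeSlots(1) returns a slot.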

TaskTracker run method: it maintains the heartbeat with the JobTracker in order to receive new tasks and instructions to kill tasks. The focus here is the heartbeat communication process:

    synchronized (this) {
      askForNewTask =
        ((status.countOccupiedMapSlots() < maxMapSlots ||
          status.countOccupiedReduceSlots() < maxReduceSlots) &&
         acceptNewTasks);
      localMinSpaceStart = minSpaceStart;
    }
    if (askForNewTask) {
      askForNewTask = enoughFreeSpace(localMinSpaceStart);
      long freeDiskSpace = getFreeSpace();
      long totVmem = getTotalVirtualMemoryOnTT();
      long totPmem = getTotalPhysicalMemoryOnTT();
      long availableVmem = getAvailableVirtualMemoryOnTT();
      long availablePmem = getAvailablePhysicalMemoryOnTT();
      long cumuCpuTime = getCumulativeCpuTimeOnTT();
      long cpuFreq = getCpuFrequencyOnTT();
      int numCpu = getNumProcessorsOnTT();
      float cpuUsage = getCpuUsageOnTT();
      status.getResourceStatus().setAvailableSpace(freeDiskSpace);
      status.getResourceStatus().setTotalVirtualMemory(totVmem);
      status.getResourceStatus().setTotalPhysicalMemory(totPmem);
      status.getResourceStatus().setMapSlotMemorySizeOnTT(
          mapSlotMemorySizeOnTT);
      status.getResourceStatus().setReduceSlotMemorySizeOnTT(
          reduceSlotSizeMemoryOnTT);
      status.getResourceStatus().setAvailableVirtualMemory(availableVmem);
      status.getResourceStatus().setAvailablePhysicalMemory(availablePmem);
      status.getResourceStatus().setCumulativeCpuTime(cumuCpuTime);
      status.getResourceStatus().setCpuFrequency(cpuFreq);
      status.getResourceStatus().setNumProcessors(numCpu);
      status.getResourceStatus().setCpuUsage(cpuUsage);
    }
    // add node health information
    TaskTrackerHealthStatus healthStatus = status.getHealthStatus();
    synchronized (this) {
      if (healthChecker != null) {
        healthChecker.setHealthStatus(healthStatus);
      } else {
        healthStatus.setNodeHealthy(true);
        healthStatus.setLastReported(0L);
        healthStatus.setHealthReport("");
      }
    }
    //
    // Xmit the heartbeat
    //
    HeartbeatResponse heartbeatResponse = jobClient.heartbeat(status,
                                                              justStarted,
                                                              justInited,
                                                              askForNewTask,
                                                              heartbeatResponseId);

This method mainly reports the TaskTracker's resource and performance information to the JobTracker by calling its heartbeat method, and then parses the returned result. The heartbeat mechanism is analyzed in detail in the next article.
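The decision to ask for a new task in the heartbeat code can be reduced to a small pure function. The sketch below restates the condition from the excerpt; HeartbeatDecisionSketch and shouldAskForNewTask are hypothetical names, and the disk-space check is an assumed simple threshold comparison standing in for enoughFreeSpace(localMinSpaceStart), whose exact behavior is not shown in this article.

```java
// Hypothetical restatement of the askForNewTask condition; not Hadoop code.
class HeartbeatDecisionSketch {

    static boolean shouldAskForNewTask(int occupiedMapSlots, int maxMapSlots,
                                       int occupiedReduceSlots, int maxReduceSlots,
                                       boolean acceptNewTasks,
                                       long freeDiskSpace, long minSpaceStart) {
        // At least one map or reduce slot must be free.
        boolean hasFreeSlot = occupiedMapSlots < maxMapSlots
                || occupiedReduceSlots < maxReduceSlots;
        boolean ask = hasFreeSlot && acceptNewTasks;
        if (ask) {
            // Assumed stand-in for enoughFreeSpace(localMinSpaceStart).
            ask = freeDiskSpace >= minSpaceStart;
        }
        return ask;
    }

    public static void main(String[] args) {
        // One free map slot, accepting tasks, plenty of disk space.
        System.out.println(shouldAskForNewTask(1, 2, 2, 2, true, 100L, 10L));
        // All slots occupied.
        System.out.println(shouldAskForNewTask(2, 2, 2, 2, true, 100L, 10L));
    }
}
```

Keeping the slot check and the disk-space check separate mirrors the original, where the slot check runs under the lock and the disk check runs afterwards, only if the first check passed.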

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only.