Spark Technology Insider: Worker Source Code and Architecture Analysis

Source: Internet
Author: User

First, let's use a Spark architecture diagram to understand the role and position of the Worker in Spark:


The Worker has the following roles:

1. Receive commands from the master to start or kill an executor.

2. Receive commands from the master to start or kill a driver.

3. Report the status of executors and drivers to the master.

4. Send heartbeats to the master. If the heartbeat times out, the master considers the worker to have crashed and to be unable to work.

5. Report the worker status to the GUI.
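The heartbeat rule in item 4 can be sketched as follows. This is a minimal illustration, not Spark's Master code: `HeartbeatTracker`, its method names, and the timeout value are all hypothetical, and timestamps are passed in explicitly so the logic is easy to follow.

```scala
import scala.collection.mutable

// Hypothetical sketch, not Spark's Master implementation.
object HeartbeatSketch {
  // Tracks the time (in milliseconds) of the last heartbeat seen from each worker.
  final class HeartbeatTracker(timeoutMs: Long) {
    private val lastSeen = mutable.HashMap[String, Long]()

    // Called whenever a heartbeat arrives from a worker.
    def heartbeat(workerId: String, nowMs: Long): Unit = {
      lastSeen(workerId) = nowMs
    }

    // Workers whose last heartbeat is older than the timeout are treated as dead.
    def deadWorkers(nowMs: Long): Set[String] =
      lastSeen.iterator.collect { case (id, t) if nowMs - t > timeoutMs => id }.toSet
  }
}
```

A master-like component would call `heartbeat` on every incoming heartbeat message and periodically call `deadWorkers` to decide which workers to drop.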


Put simply, the Worker is the component that actually does the work in the cluster. First, let's take a look at the Worker's important data structures:

    val executors = new HashMap[String, ExecutorRunner]
    val finishedExecutors = new HashMap[String, ExecutorRunner]
    val drivers = new HashMap[String, DriverRunner]
    val finishedDrivers = new HashMap[String, DriverRunner]

These hash maps store the mapping between a name and the corresponding object, so that an object can be looked up directly by name and then called.
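As a rough illustration of how such name-keyed maps are used, the sketch below registers a runner under an `appId + "/" + execId` key (the key scheme used by the LaunchExecutor handler) and later moves it to the finished map. `Runner`, `register`, and `markFinished` are hypothetical stand-ins, and the move between maps is an assumption about how the two maps relate, not code taken from the Worker.

```scala
import scala.collection.mutable

// Hypothetical sketch; Runner stands in for ExecutorRunner.
object RegistrySketch {
  final case class Runner(appId: String, execId: Int)

  val executors = mutable.HashMap[String, Runner]()
  val finishedExecutors = mutable.HashMap[String, Runner]()

  // Key scheme: appId + "/" + execId.
  def key(appId: String, execId: Int): String = appId + "/" + execId

  def register(r: Runner): Unit = {
    executors(key(r.appId, r.execId)) = r
  }

  // Move a runner from the active map to the finished map, if it is known.
  def markFinished(appId: String, execId: Int): Boolean = {
    val k = key(appId, execId)
    executors.remove(k) match {
      case Some(r) =>
        finishedExecutors(k) = r
        true
      case None =>
        false
    }
  }
}
```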

Next, let's look at how an executor is started:

    case LaunchExecutor(masterUrl, appId, execId, appDesc, cores_, memory_) =>
      if (masterUrl != activeMasterUrl) {
        logWarning("Invalid Master (" + masterUrl + ") attempted to launch executor.")
      } else {
        try {
          logInfo("Asked to launch executor %s/%d for %s".format(appId, execId, appDesc.name))
          val manager = new ExecutorRunner(appId, execId, appDesc, cores_, memory_,
            self, workerId, host,
            appDesc.sparkHome.map(userSparkHome => new File(userSparkHome)).getOrElse(sparkHome),
            workDir, akkaUrl, ExecutorState.RUNNING)
          executors(appId + "/" + execId) = manager  // register under "appId/execId"
          manager.start()
          coresUsed += cores_
          memoryUsed += memory_
          masterLock.synchronized {
            master ! ExecutorStateChanged(appId, execId, manager.state, None, None)
          }
        } catch {
          case e: Exception => {
            logError("Failed to launch executor %s/%d for %s".format(appId, execId, appDesc.name))
            if (executors.contains(appId + "/" + execId)) {
              executors(appId + "/" + execId).kill()  // roll back a partial launch
              executors -= appId + "/" + execId
            }
            masterLock.synchronized {
              master ! ExecutorStateChanged(appId, execId, ExecutorState.FAILED, None, None)
            }
          }
        }
      }


Lines 1 to 3 check whether the command comes from the active master. Lines 7 to 10 create an ExecutorRunner. The Worker does not manage an executor object directly; what it calls an executor is handled through ExecutorRunner, which is an apt name. The new ExecutorRunner is placed in the executors hash map mentioned above (line 11), and line 12 starts it. Lines 13 and 14 update the counts of cores and memory in use. Lines 15 to 17 report the executor status to the master; this is done while holding masterLock.

If an exception is thrown during this process, the handler checks whether the executor has already been added to the hash map. If so, it kills the executor and then removes it from the hash map. It then reports to the master that the executor is in the FAILED state, so that the master can start a new executor.
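The launch-then-roll-back pattern described here can be reduced to a small standalone sketch. Everything in it is hypothetical: `Runner` stands in for ExecutorRunner, `failOnStart` simulates a launch failure, and the returned string stands in for the state that would be reported to the master.

```scala
import scala.collection.mutable

// Hypothetical sketch of the launch/rollback pattern; not the Worker's code.
object LaunchSketch {
  final class Runner(failOnStart: Boolean) {
    var killed = false
    def start(): Unit = {
      if (failOnStart) throw new RuntimeException("launch failed")
    }
    def kill(): Unit = {
      killed = true
    }
  }

  val executors = mutable.HashMap[String, Runner]()
  var coresUsed = 0

  // Returns the state that would be reported to the master.
  def launch(id: String, runner: Runner, cores: Int): String =
    try {
      executors(id) = runner
      runner.start()
      coresUsed += cores // only counted once start() has succeeded
      "RUNNING"
    } catch {
      case e: Exception =>
        // Roll back: kill the runner and remove it if it was registered.
        executors.remove(id).foreach(_.kill())
        "FAILED"
    }
}
```

Because the resource counter is updated only after `start()` returns, a failed launch leaves `coresUsed` untouched and needs no further cleanup beyond removing the map entry.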


Next, let's look at how the drivers hash map is used, taking KillDriver as the example:

    case KillDriver(driverId) => {
      logInfo(s"Asked to kill driver $driverId")
      drivers.get(driverId) match {
        case Some(runner) =>
          runner.kill()
        case None =>
          logError(s"Asked to kill unknown driver $driverId")
      }
    }

The KillDriver command is issued by the master, which in turn receives the kill-driver request from the client. The short handler also shows how concise Scala can be.
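The client-to-master-to-worker chain can be pictured with the sketch below. It is an assumption-laden illustration: the `Master` and `Worker` classes, the `driverToWorker` bookkeeping, and the log strings are all hypothetical and do not reflect how Spark's Master tracks drivers.

```scala
import scala.collection.mutable

// Hypothetical sketch of forwarding a kill request; not Spark's classes.
object KillDriverSketch {
  // Worker side: look the driver up by id and record the outcome.
  final class Worker {
    val drivers = mutable.HashMap[String, String]() // driverId -> description
    val log = mutable.ArrayBuffer[String]()

    def killDriver(driverId: String): Unit = drivers.get(driverId) match {
      case Some(_) => log += s"killed $driverId"
      case None => log += s"unknown driver $driverId"
    }
  }

  // Master side: forward a client's kill request to the worker hosting the driver.
  final class Master(driverToWorker: Map[String, Worker]) {
    def requestKill(driverId: String): Boolean = driverToWorker.get(driverId) match {
      case Some(w) =>
        w.killDriver(driverId)
        true
      case None =>
        false
    }
  }
}
```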


