Spark analysis: the standalone mode operation process

Last Update:2014-08-14 Source: Internet

Author: User


I. Cluster Startup Process: Starting the Master

$SPARK_HOME/sbin/start-master.sh

Key content of the start-master.sh script:

spark-daemon.sh start org.apache.spark.deploy.master.Master 1 --ip $SPARK_MASTER_IP --port $SPARK_MASTER_PORT --webui-port $SPARK_MASTER_WEBUI_PORT

Log location: $SPARK_HOME/logs/

14/07/22 13:41:33 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://[email protected]:7077]
14/07/22 13:41:33 INFO master.Master: Starting Spark master at spark://hadoop000:7077
14/07/22 13:41:33 INFO server.Server: jetty-8.y.z-SNAPSHOT
14/07/22 13:41:33 INFO server.AbstractConnector: Started [email protected]0.0.0.0:8080
14/07/22 13:41:33 INFO ui.MasterWebUI: Started MasterWebUI at http://hadoop000:8080
14/07/22 13:41:33 INFO master.Master: I have been elected leader! New state: ALIVE

II. Cluster Startup Process: Starting the Workers

$SPARK_HOME/sbin/start-slaves.sh

Key content of the start-slaves.sh script:

spark-daemon.sh start org.apache.spark.deploy.worker.Worker master-spark-URL

When a worker starts, it registers with the specified master URL, which in this setup is spark://hadoop000:7077.
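As a small illustration of the master URL format, the sketch below splits a spark://host:port string into its host and port. MasterUrl is a hypothetical helper written for this article, not a Spark class, and it relies only on java.net.URI from the JDK.

```scala
import java.net.URI
import scala.util.Try

object MasterUrl {
  // Parse "spark://host:port" into (host, port).
  // Returns None for malformed input, a different scheme, or a missing host or port.
  def parse(url: String): Option[(String, Int)] =
    Try(new URI(url)).toOption
      .filter(u => u.getScheme == "spark" && u.getHost != null && u.getPort != -1)
      .map(u => (u.getHost, u.getPort))
}
```

For example, MasterUrl.parse("spark://hadoop000:7077") yields the host hadoop000 and the port 7077, while a URL containing stray spaces is rejected.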

After a worker has started, it mainly does two things:
1) Registers itself with the master (RegisterWorker);
2) Periodically sends heartbeat messages to the master.

The worker sends its registration to the master:

Worker.scala
    ==> preStart
        ==> registerWithMaster
            ==> tryRegisterAllMasters
                ==> actor ! RegisterWorker(workerId, host, port, cores, memory, webUi.boundPort, publicAddress)

The master side receives the RegisterWorker message:

Master.scala
    ==> case RegisterWorker(id, workerHost, workerPort, cores, memory, workerUiPort, publicAddress) => {
            val worker = new WorkerInfo(id, workerHost, workerPort, cores, memory, sender, workerUiPort, publicAddress)
            if (registerWorker(worker)) {
                persistenceEngine.addWorker(worker)
                sender ! RegisteredWorker(masterUrl, masterWebUiUrl)  // on successful registration, notify the worker
                schedule()
            }
        }
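The registration exchange can also be modelled without Akka, as a plain function from a request to a reply. The following is a minimal sketch under that assumption: MiniMaster, RegisterWorkerMsg, WorkerRecord and the reply types are hypothetical names chosen for illustration, and the rejection of a duplicate worker ID is an assumed behaviour that the excerpt above does not show.

```scala
import scala.collection.mutable

// Hypothetical, simplified stand-in for the RegisterWorker message.
final case class RegisterWorkerMsg(id: String, host: String, port: Int, cores: Int, memoryMb: Int)

// Possible replies from the master.
sealed trait RegistrationReply
final case class Registered(masterUrl: String) extends RegistrationReply
final case class RegistrationFailed(reason: String) extends RegistrationReply

// What the master remembers about each registered worker.
final case class WorkerRecord(id: String, host: String, port: Int, cores: Int, memoryMb: Int)

class MiniMaster(masterUrl: String) {
  private val idToWorker = mutable.HashMap.empty[String, WorkerRecord]

  // Record the worker and acknowledge it.
  // Rejecting an already registered ID is an assumption made for this sketch.
  def handle(msg: RegisterWorkerMsg): RegistrationReply =
    if (idToWorker.contains(msg.id)) {
      RegistrationFailed("Duplicate worker ID")
    } else {
      idToWorker(msg.id) = WorkerRecord(msg.id, msg.host, msg.port, msg.cores, msg.memoryMb)
      Registered(masterUrl)
    }

  def workerCount: Int = idToWorker.size
}
```

In the real system the reply travels back as an actor message; returning it from a method simply keeps the sketch self-contained.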

After receiving the registration confirmation from the master, the worker periodically sends heartbeat messages to the master.

Worker.scala
    ==> case SendHeartbeat =>
            masterLock.synchronized {
                if (connected) { master ! Heartbeat(workerId) }
            }

The master updates the last heartbeat time after receiving the heartbeat information sent by the worker.

Master.scala
    ==> case Heartbeat(workerId) => {
            idToWorker.get(workerId) match {
                case Some(workerInfo) =>
                    workerInfo.lastHeartbeat = System.currentTimeMillis()
            }
        }

The master periodically checks for, and removes, worker nodes that have not sent a heartbeat within the timeout period.

Master.scala
    ==> preStart
        ==> CheckForWorkerTimeOut
            ==> case CheckForWorkerTimeOut => { timeOutDeadWorkers() }  // Check for, and remove, any timed-out workers
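The heartbeat and timeout logic described above can be sketched in a few lines. HeartbeatTracker is a hypothetical class written for this article; the timeout is whatever value the caller passes in and is not taken from Spark's configuration. Time is passed explicitly so that the behaviour is easy to follow.

```scala
import scala.collection.mutable

// Hypothetical tracker: remembers when each worker was last heard from.
class HeartbeatTracker(timeoutMs: Long) {
  private val lastHeartbeat = mutable.HashMap.empty[String, Long]

  // Called on registration and on every heartbeat.
  def heartbeat(workerId: String, nowMs: Long): Unit =
    lastHeartbeat(workerId) = nowMs

  // Remove and return every worker whose last heartbeat is older than the timeout.
  def timeOutDeadWorkers(nowMs: Long): Set[String] = {
    val dead = lastHeartbeat.iterator.collect {
      case (id, last) if nowMs - last > timeoutMs => id
    }.toSet
    dead.foreach(id => lastHeartbeat.remove(id))
    dead
  }

  def aliveWorkers: Set[String] = lastHeartbeat.keySet.toSet
}
```

With a 60-second timeout, a worker last heard from at time 0 is removed by a check at 70 seconds, while one that sent a heartbeat at 50 seconds stays registered.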

Log location: $SPARK_HOME/logs/

Some master log information:

14/07/22 13:41:36 INFO master.Master: Registering worker hadoop000:48343 with 1 cores, 2.0 GB RAM

Some worker log information:

14/07/22 13:41:35 INFO Worker: Starting Spark worker hadoop000:48343 with 1 cores, 2.0 GB RAM
14/07/22 13:41:35 INFO Worker: Spark home: /home/spark/app/spark-1.0.1-bin-2.3.0-cdh5.0.0
14/07/22 13:41:35 INFO WorkerWebUI: Started WorkerWebUI at http://hadoop000:8081
14/07/22 13:41:35 INFO Worker: Connecting to master spark://hadoop000:7077...
14/07/22 13:41:36 INFO Worker: Successfully registered with master spark://hadoop000:7077

III. Application Submission Process

A. Submit Application

Run spark-shell: $SPARK_HOME/bin/spark-shell --master spark://hadoop000:7077

Log location: $SPARK_HOME/work

spark-shell is itself an application. When it starts, the createTaskScheduler method of SparkContext creates a SparkDeploySchedulerBackend, which creates and starts an AppClient:

client = new AppClient(sc.env.actorSystem, masters, appDesc, this, conf)
client.start()

The AppClient then sends a RegisterApplication request to the master.

AppClient.scala
    ==> preStart
        ==> registerWithMaster
            ==> tryRegisterAllMasters
                ==> actor ! RegisterApplication(appDescription)

B. The master processes the RegisterApplication request

On the master side, the request is handled by the RegisterApplication branch. After receiving the request, the master schedules the application: if workers have already registered, it sends a LaunchExecutor command to the chosen workers.

Master.scala
    ==> case RegisterApplication(description) => {
            logInfo("Registering app " + description.name)
            val app = createApplication(description, sender)
            registerApplication(app)
            logInfo("Registered app " + description.name + " with ID " + app.id)
            persistenceEngine.addApplication(app)
            sender ! RegisteredApplication(app.id, masterUrl)
            schedule()
        }
    ==> schedule
        ==> launchExecutor(worker, exec)
            ==> worker.addExecutor(exec)
                worker.actor ! LaunchExecutor(masterUrl, exec.application.id, exec.id, exec.application.desc, exec.cores, exec.memory)
                exec.application.driver ! ExecutorAdded(exec.id, worker.id, worker.hostPort, exec.cores, exec.memory)
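To make the scheduling step more concrete, here is a deliberately simplified sketch of how executors might be placed on workers. It is not Spark's actual scheduling algorithm: SimpleScheduler, WorkerSlot and ExecutorPlan are hypothetical names, and the policy (walk the workers in order, skip those without enough free memory, grant cores until the request is met) is an assumption made for illustration.

```scala
// Hypothetical view of a worker's free resources.
final case class WorkerSlot(id: String, freeCores: Int, freeMemoryMb: Int)

// One executor to launch: which worker hosts it and how many cores it gets.
final case class ExecutorPlan(workerId: String, cores: Int)

object SimpleScheduler {
  // Walk the workers in order and hand out cores until the request is satisfied.
  // Workers without free cores or without enough memory for one executor are skipped.
  def plan(workers: Seq[WorkerSlot], coresWanted: Int, memoryPerExecutorMb: Int): Seq[ExecutorPlan] = {
    var coresLeft = coresWanted
    val plans = Seq.newBuilder[ExecutorPlan]
    workers.foreach { w =>
      val usable = w.freeCores > 0 && w.freeMemoryMb >= memoryPerExecutorMb
      if (coresLeft > 0 && usable) {
        val granted = math.min(w.freeCores, coresLeft)
        plans += ExecutorPlan(w.id, granted)
        coresLeft -= granted
      }
    }
    plans.result()
  }
}
```

Each ExecutorPlan in the result corresponds to one LaunchExecutor message sent to a worker in the flow above.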

C. Start the executor

After the worker receives the LaunchExecutor command, it starts the executor process.

Worker.scala
    ==> case LaunchExecutor(masterUrl, appId, execId, appDesc, cores_, memory_) =>
            logInfo("Asked to launch executor %s/%d for %s".format(appId, execId, appDesc.name))
            val manager = new ExecutorRunner(appId, execId, appDesc, cores_, memory_,
                self, workerId, host,
                appDesc.sparkHome.map(userSparkHome => new File(userSparkHome)).getOrElse(sparkHome),
                workDir, akkaUrl, ExecutorState.RUNNING)
            executors(appId + "/" + execId) = manager
            manager.start()
            coresUsed += cores_
            memoryUsed += memory_
            masterLock.synchronized {
                master ! ExecutorStateChanged(appId, execId, manager.state, None, None)
            }
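The resource bookkeeping in this handler (the executors map keyed by appId/execId, plus the coresUsed and memoryUsed counters) can be isolated in a small sketch. WorkerAccounting is a hypothetical class, and the finish method that returns resources when an executor exits is an assumed counterpart that the excerpt above does not show.

```scala
import scala.collection.mutable

// Hypothetical bookkeeping that mirrors the counters updated on LaunchExecutor.
class WorkerAccounting(totalCores: Int, totalMemoryMb: Int) {
  // Key is "appId/execId"; value is the (cores, memoryMb) held by that executor.
  private val executors = mutable.HashMap.empty[String, (Int, Int)]
  private var coresUsed = 0
  private var memoryUsedMb = 0

  // Record a newly launched executor and charge its resources to the worker.
  def launch(appId: String, execId: Int, cores: Int, memoryMb: Int): Unit = {
    executors(appId + "/" + execId) = (cores, memoryMb)
    coresUsed += cores
    memoryUsedMb += memoryMb
  }

  // Assumed counterpart: give the resources back when an executor exits.
  // Unknown executors are ignored.
  def finish(appId: String, execId: Int): Unit =
    executors.remove(appId + "/" + execId).foreach { case (cores, memoryMb) =>
      coresUsed -= cores
      memoryUsedMb -= memoryMb
    }

  def coresFree: Int = totalCores - coresUsed
  def memoryFreeMb: Int = totalMemoryMb - memoryUsedMb
}
```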

D. Register the executor

The newly started executor process uses its startup arguments to register itself with the SchedulerBackend in the driver.

SparkDeploySchedulerBackend.scala
    ==> preStart (CoarseGrainedSchedulerBackend)
        ==> case RegisterExecutor(executorId, hostPort, cores) =>
                logInfo("Registered executor: " + sender + " with ID " + executorId)
                sender ! RegisteredExecutor(sparkProperties)
                executorActor(executorId) = sender
                executorHost(executorId) = Utils.parseHostPort(hostPort)._1
                totalCores(executorId) = cores
                freeCores(executorId) = cores
                executorAddress(executorId) = sender.path.address
                addressToExecutorId(sender.path.address) = executorId
                totalCoreCount.addAndGet(cores)
                makeOffers()

CoarseGrainedExecutorBackend.scala
    case RegisteredExecutor(sparkProperties) =>
        logInfo("Successfully registered with driver")
        executor = new Executor(executorId, Utils.parseHostPort(hostPort)._1, sparkProperties, false)
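The driver-side bookkeeping performed on RegisterExecutor can be pictured with a small stand-alone sketch. ExecutorRegistry is a hypothetical class that keeps only a few of the maps shown above, and the host:port split is a simplification of what Utils.parseHostPort does.

```scala
import scala.collection.mutable

// Hypothetical driver-side registry mirroring some of the maps updated on RegisterExecutor.
class ExecutorRegistry {
  private val executorHost = mutable.HashMap.empty[String, String]
  private val totalCores = mutable.HashMap.empty[String, Int]
  private val freeCores = mutable.HashMap.empty[String, Int]
  private var totalCoreCount = 0

  // Keep the host part of "host:port", then record the executor's cores.
  // A new executor starts with all of its cores free.
  def register(executorId: String, hostPort: String, cores: Int): Unit = {
    val host = hostPort.split(":")(0)
    executorHost(executorId) = host
    totalCores(executorId) = cores
    freeCores(executorId) = cores
    totalCoreCount += cores
  }

  def hostOf(executorId: String): Option[String] = executorHost.get(executorId)
  def freeCoresOf(executorId: String): Int = freeCores.getOrElse(executorId, 0)
  def clusterCores: Int = totalCoreCount
}
```

Once an executor is recorded this way, its free cores become available to the next round of resource offers.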

Executor log locations: the console and $SPARK_HOME/logs

E. Run the task

Sample Code:

sc.textFile("hdfs://hadoop000:8020/hello.txt").flatMap(_.split('\t')).map((_,1)).reduceByKey(_+_).collect
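To see what this job computes without a cluster, the same pipeline can be run on an ordinary Scala collection. The sketch below is an illustration only: LocalWordCount is a hypothetical helper, and groupBy followed by a sum stands in for reduceByKey, which exists on RDDs but not on plain collections.

```scala
object LocalWordCount {
  // Same steps as the RDD example, on an in-memory collection:
  // split each line on tabs, pair each word with 1, then sum the counts per word.
  def count(lines: Seq[String]): Map[String, Int] =
    lines
      .flatMap(_.split('\t'))
      .map((_, 1))
      .groupBy(_._1)
      .map { case (word, pairs) => word -> pairs.map(_._2).sum }
}
```

On a cluster, the flatMap and map steps run as tasks on the executors, and the per-word sums are combined across partitions before collect returns them to the driver.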

After the SchedulerBackend receives the executor's registration message, the submitted Spark job is split into multiple tasks, which are dispatched to the executors through LaunchTask messages and actually run there.

CoarseGrainedSchedulerBackend.scala
    def makeOffers() {
        launchTasks(scheduler.resourceOffers(
            executorHost.toArray.map { case (id, host) => new WorkerOffer(id, host, freeCores(id)) }))
    }
    ==> executorActor(task.executorId) ! LaunchTask(new SerializableBuffer(serializedTask))
        ==> CoarseGrainedExecutorBackend
            case LaunchTask(data) =>
                if (executor == null) {
                    logError("Received LaunchTask command but executor was null")
                    System.exit(1)
                } else {
                    val ser = SparkEnv.get.closureSerializer.newInstance()
                    val taskDesc = ser.deserialize[TaskDescription](data.value)
                    logInfo("Got assigned task " + taskDesc.taskId)
                    executor.launchTask(this, taskDesc.taskId, taskDesc.serializedTask)
                }
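The idea behind makeOffers, offering each executor's free cores to the scheduler and launching one task per granted core, can be sketched as follows. This is a simplified model, not the logic of Spark's resourceOffers: OfferRound, Offer and Assignment are hypothetical names, each task is assumed to need exactly one core, and data locality is ignored.

```scala
import scala.collection.mutable

// Hypothetical simplifications of WorkerOffer and of a task-to-executor assignment.
final case class Offer(executorId: String, host: String, cores: Int)
final case class Assignment(taskId: Long, executorId: String)

object OfferRound {
  // Hand pending tasks to executors, one core per task, cycling over the offers
  // until either the tasks or the free cores run out.
  def assign(pendingTaskIds: Seq[Long], offers: Seq[Offer]): Seq[Assignment] = {
    val free = mutable.LinkedHashMap(offers.map(o => o.executorId -> o.cores): _*)
    val queue = mutable.Queue(pendingTaskIds: _*)
    val out = Seq.newBuilder[Assignment]
    var progressed = true
    while (queue.nonEmpty && progressed) {
      progressed = false
      free.keys.toList.foreach { id =>
        if (queue.nonEmpty && free(id) > 0) {
          out += Assignment(queue.dequeue(), id)
          free(id) -= 1 // one core is now busy on this executor
          progressed = true
        }
      }
    }
    out.result()
  }
}
```

Tasks left over when the cores run out stay pending; in the flow above they would be picked up by a later call to makeOffers once cores are freed.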

Some master log information:

14/07/22 15:25:27 INFO master.Master: Registering app Spark shell
14/07/22 15:25:27 INFO master.Master: Registered app Spark shell with ID app-20140722152527-0001
14/07/22 15:25:27 INFO master.Master: Launching executor app-20140722152527-0001/0 on worker worker-20140722134135-hadoop000-48343

Some worker log information:

Spark assembly has been built with Hive, including Datanucleus jars on classpath
14/07/22 15:25:27 INFO Worker: Asked to launch executor app-20140722152527-0001/0 for Spark shell
Spark assembly has been built with Hive, including Datanucleus jars on classpath
14/07/22 15:25:28 INFO ExecutorRunner: Launch command: "java" "-cp" "::/home/spark/app/spark-1.0.1-bin-2.3.0-cdh5.0.0/conf:/home/spark/app/spark-1.0.1-bin-2.3.0-cdh5.0.0/lib/spark-assembly-1.0.1-hadoop2.3.0-cdh5.0.0.jar:/home/spark/app/spark-1.0.1-bin-2.3.0-cdh5.0.0/lib/datanucleus-rdbms-3.2.1.jar:/home/spark/app/spark-1.0.1-bin-2.3.0-cdh5.0.0/lib/datanucleus-core-3.2.2.jar:/home/spark/app/spark-1.0.1-bin-2.3.0-cdh5.0.0/lib/datanucleus-api-jdo-3.2.1.jar" "-XX:MaxPermSize=128m" "-Xms1024M" "-Xmx1024M" "org.apache.spark.executor.CoarseGrainedExecutorBackend" "akka.tcp://[email protected]:50515/user/CoarseGrainedScheduler" "0" "hadoop000" "1" "akka.tcp://[email protected]:48343/user/Worker" "app-20140722152527-0001"

Some log information in the console:

14/07/22 15:25:31 INFO cluster.SparkDeploySchedulerBackend: Registered executor: Actor[akka.tcp://[email protected]:45150/user/Executor#-791712793] with ID 0
14/07/22 15:25:31 INFO CoarseGrainedExecutorBackend: Successfully registered with driver

Each time a new application registers with the master, the master calls its schedule function to assign the application to workers. Each chosen worker starts an ExecutorBackend, and the tasks ultimately run inside that ExecutorBackend.

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
