Spark Analysis: Standalone Mode Operation Process


I. Cluster Startup Process - Starting the Master

$SPARK_HOME/sbin/start-master.sh

Key content of the start-master.sh script:

spark-daemon.sh start org.apache.spark.deploy.master.Master 1 --ip $SPARK_MASTER_IP --port $SPARK_MASTER_PORT --webui-port $SPARK_MASTER_WEBUI_PORT

Log location: $SPARK_HOME/logs/

14/07/22 13:41:33 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkMaster@hadoop000:7077]
14/07/22 13:41:33 INFO master.Master: Starting Spark master at spark://hadoop000:7077
14/07/22 13:41:33 INFO server.Server: jetty-8.y.z-SNAPSHOT
14/07/22 13:41:33 INFO server.AbstractConnector: Started SelectChannelConnector@0.0.0.0:8080
14/07/22 13:41:33 INFO ui.MasterWebUI: Started MasterWebUI at http://hadoop000:8080
14/07/22 13:41:33 INFO master.Master: I have been elected leader! New state: ALIVE

 

II. Cluster Startup Process - Starting the Workers

$SPARK_HOME/sbin/start-slaves.sh

Key content of the start-slaves.sh script:

spark-daemon.sh start org.apache.spark.deploy.worker.Worker master-spark-URL

When a worker starts, it must register with the specified master URL, which here is spark://hadoop000:7077.

After the worker starts, it mainly does two things (see the sketch below):
1) register itself with the master (RegisterWorker);
2) periodically send heartbeat messages to the master.
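
To make the pattern concrete, here is a minimal, self-contained sketch of "register once, then heartbeat on a schedule." It is not Spark source; MasterRef, ToyWorker, and the 15-second interval are illustrative stand-ins for the Akka messaging the real Worker uses.

import java.util.concurrent.{Executors, TimeUnit}

// Illustrative stand-in for the master side; not a Spark API.
trait MasterRef {
  def registerWorker(workerId: String, host: String, cores: Int, memoryMB: Int): Boolean
  def heartbeat(workerId: String): Unit
}

class ToyWorker(master: MasterRef, workerId: String, host: String,
                cores: Int, memoryMB: Int, heartbeatMillis: Long = 15000) {
  private val scheduler = Executors.newSingleThreadScheduledExecutor()

  def start(): Unit = {
    // 1) register with the master once at startup (like RegisterWorker)
    val registered = master.registerWorker(workerId, host, cores, memoryMB)
    // 2) then send heartbeats on a fixed schedule (like SendHeartbeat -> Heartbeat)
    if (registered) {
      scheduler.scheduleAtFixedRate(
        new Runnable { def run(): Unit = master.heartbeat(workerId) },
        0, heartbeatMillis, TimeUnit.MILLISECONDS)
    }
  }

  def stop(): Unit = scheduler.shutdownNow()
}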

Worker sends registration information to the master:

Worker.scala
  ==> preStart
    ==> registerWithMaster
      ==> tryRegisterAllMasters
        ==> actor ! RegisterWorker(workerId, host, port, cores, memory, webUi.boundPort, publicAddress)

The master side receives the RegisterWorker message:

Master.scala
  ==> case RegisterWorker(id, workerHost, workerPort, cores, memory, workerUiPort, publicAddress) => {
        val worker = new WorkerInfo(id, workerHost, workerPort, cores, memory, sender, workerUiPort, publicAddress)
        if (registerWorker(worker)) {
          persistenceEngine.addWorker(worker)
          sender ! RegisteredWorker(masterUrl, masterWebUiUrl)  // reply to the worker after successful registration, then call schedule()
          schedule()
        }
      }

After receiving the successful registration information from the master, the worker periodically sends heartbeat information to the master.

Worker.scala
  ==> case SendHeartbeat =>
        masterLock.synchronized {
          if (connected) { master ! Heartbeat(workerId) }
        }

The master updates the last heartbeat time after receiving the heartbeat information sent by the worker.

Master.scala
  ==> case Heartbeat(workerId) => {
        idToWorker.get(workerId) match {
          case Some(workerInfo) =>
            workerInfo.lastHeartbeat = System.currentTimeMillis()
        }
      }

The master periodically checks for workers that have not sent a heartbeat within the timeout period and removes them.

Master.scala
  ==> preStart
    ==> CheckForWorkerTimeOut
      ==> case CheckForWorkerTimeOut => { timeOutDeadWorkers() }  // check for, and remove, any timed-out workers
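
The essence of timeOutDeadWorkers() is a sweep over the registered workers, dropping any whose last heartbeat is older than the timeout. A simplified, self-contained model of that sweep (not Spark source; WorkerEntry, ToyMaster, and the 60-second default are assumptions for illustration):

import scala.collection.mutable

// Simplified model of the master's timeout sweep; not Spark source.
case class WorkerEntry(id: String, var lastHeartbeat: Long)

class ToyMaster(workerTimeoutMs: Long = 60 * 1000) {
  private val idToWorker = mutable.Map[String, WorkerEntry]()

  def recordHeartbeat(workerId: String): Unit =
    idToWorker.get(workerId).foreach(_.lastHeartbeat = System.currentTimeMillis())

  // called periodically, the way Master handles CheckForWorkerTimeOut
  def timeOutDeadWorkers(): Unit = {
    val deadline = System.currentTimeMillis() - workerTimeoutMs
    val dead = idToWorker.values.filter(_.lastHeartbeat < deadline).toList
    dead.foreach { w =>
      println(s"Removing worker ${w.id}: no heartbeat within ${workerTimeoutMs / 1000}s")
      idToWorker -= w.id
    }
  }
}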

Log location: $SPARK_HOME/logs/

Some master log information:

14/07/22 13:41:36 INFO master.Master: Registering worker hadoop000:48343 with 1 cores, 2.0 GB RAM

Some worker log information:

14/07/22 13:41:35 INFO Worker: Starting Spark worker hadoop000:48343 with 1 cores, 2.0 GB RAM
14/07/22 13:41:35 INFO Worker: Spark home: /home/spark/app/spark-1.0.1-bin-2.3.0-cdh5.0.0
14/07/22 13:41:35 INFO WorkerWebUI: Started WorkerWebUI at http://hadoop000:8081
14/07/22 13:41:35 INFO Worker: Connecting to master spark://hadoop000:7077...
14/07/22 13:41:36 INFO Worker: Successfully registered with master spark://hadoop000:7077

III. Application Submission Process

A. Submit Application

Run spark-shell: $SPARK_HOME/bin/spark-shell --master spark://hadoop000:7077

Log location: $SPARK_HOME/work

Spark-shell is itself an application. When its SparkContext is initialized, createTaskScheduler creates a SparkDeploySchedulerBackend, whose start() creates and starts an AppClient:

client = new AppClient(sc.env.actorSystem, masters, appDesc, this, conf)
client.start()
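
What does appDesc carry? Roughly, the application's name, its resource requirements, and the command the workers will later use to launch the executor backend. A simplified model of that idea (not the real ApplicationDescription; the field names and placeholder arguments are illustrative):

// Simplified model of the application description; not Spark source.
case class ToyCommand(mainClass: String, arguments: Seq[String])
case class ToyAppDescription(name: String,
                             maxCores: Option[Int],
                             memoryPerExecutorMB: Int,
                             command: ToyCommand)

val appDesc = ToyAppDescription(
  name = "Spark shell",
  maxCores = None,                 // take whatever cores the cluster can offer
  memoryPerExecutorMB = 1024,
  command = ToyCommand(
    "org.apache.spark.executor.CoarseGrainedExecutorBackend",
    Seq("<driver-akka-url>", "<executor-id>", "<hostname>", "<cores>")))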

A RegisterApplication request is then sent to the master:

AppClient.scala
  ==> preStart
    ==> registerWithMaster
      ==> tryRegisterAllMasters
        ==> actor ! RegisterApplication(appDescription)

B. The master processes the RegisterApplication request

On the master side, the handler is the RegisterApplication case. After receiving the request, the master registers the application and then schedules it: if workers have already registered, it sends a LaunchExecutor command to the chosen workers.

Master.scala
  ==> case RegisterApplication(description) => {
        logInfo("Registering app " + description.name)
        val app = createApplication(description, sender)
        registerApplication(app)
        logInfo("Registered app " + description.name + " with ID " + app.id)
        persistenceEngine.addApplication(app)
        sender ! RegisteredApplication(app.id, masterUrl)
        schedule()
      }
  ==> schedule
    ==> launchExecutor(worker, exec)
      ==> worker.addExecutor(exec)
          worker.actor ! LaunchExecutor(masterUrl, exec.application.id, exec.id, exec.application.desc, exec.cores, exec.memory)
          exec.application.driver ! ExecutorAdded(exec.id, worker.id, worker.hostPort, exec.cores, exec.memory)
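
How does schedule() decide which workers get executors? By default it "spreads out" an application, handing out one core at a time, round-robin, across workers that still have free cores. A self-contained sketch of that allocation loop (not Spark source; ToyWorkerState and assignCores are illustrative):

// Simplified model of the default "spread out" allocation; not Spark source.
case class ToyWorkerState(id: String, totalCores: Int, coresUsed: Int = 0) {
  def coresFree: Int = totalCores - coresUsed
}

def assignCores(workers: Seq[ToyWorkerState], coresWanted: Int): Map[String, Int] = {
  val usable = workers.filter(_.coresFree > 0).toArray
  val assigned = Array.fill(usable.length)(0)
  var toAssign = math.min(coresWanted, usable.map(_.coresFree).sum)
  var pos = 0
  while (toAssign > 0) {
    if (usable(pos).coresFree - assigned(pos) > 0) {   // this worker can still take one more core
      assigned(pos) += 1
      toAssign -= 1
    }
    pos = (pos + 1) % usable.length                    // round-robin to spread executors out
  }
  usable.map(_.id).zip(assigned).filter(_._2 > 0).toMap
}

// e.g. assignCores(Seq(ToyWorkerState("w1", 4), ToyWorkerState("w2", 4)), 6)
// returns Map(w1 -> 3, w2 -> 3): LaunchExecutor would then go to both workers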

C. Start executor

After the worker receives the LaunchExecutor command, it starts an executor process:

Worker.scala
  ==> case LaunchExecutor(masterUrl, appId, execId, appDesc, cores_, memory_) =>
        logInfo("Asked to launch executor %s/%d for %s".format(appId, execId, appDesc.name))
        val manager = new ExecutorRunner(appId, execId, appDesc, cores_, memory_,
          self, workerId, host,
          appDesc.sparkHome.map(userSparkHome => new File(userSparkHome)).getOrElse(sparkHome),
          workDir, akkaUrl, ExecutorState.RUNNING)
        executors(appId + "/" + execId) = manager
        manager.start()
        coresUsed += cores_
        memoryUsed += memory_
        masterLock.synchronized {
          master ! ExecutorStateChanged(appId, execId, manager.state, None, None)
        }
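
ExecutorRunner ultimately boils down to building a java command line for CoarseGrainedExecutorBackend and spawning it as a separate JVM process (compare the "Launch command" line in the worker log further down). A hedged sketch of that idea using ProcessBuilder (not the real ExecutorRunner code; the parameters are illustrative):

import java.io.File

// Illustrative sketch of what ExecutorRunner amounts to; not Spark source.
def launchExecutorProcess(classpath: String, memoryMB: Int,
                          backendArgs: Seq[String], workDir: File): Process = {
  val command = Seq(
    "java", "-cp", classpath, s"-Xms${memoryMB}M", s"-Xmx${memoryMB}M",
    "org.apache.spark.executor.CoarseGrainedExecutorBackend") ++ backendArgs
  val builder = new ProcessBuilder(command: _*)
  builder.directory(workDir)            // the executor's work directory under $SPARK_HOME/work
  builder.redirectErrorStream(true)     // the real ExecutorRunner redirects stdout/stderr to files there
  builder.start()
}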

D. Register executor

Based on the parameters it was started with, the new executor process registers itself with the SchedulerBackend in the driver.

SparkDeploySchedulerBackend.scala (CoarseGrainedSchedulerBackend)
  ==> preStart
  ==> case RegisterExecutor(executorId, hostPort, cores) =>
        logInfo("Registered executor: " + sender + " with ID " + executorId)
        sender ! RegisteredExecutor(sparkProperties)
        executorActor(executorId) = sender
        executorHost(executorId) = Utils.parseHostPort(hostPort)._1
        totalCores(executorId) = cores
        freeCores(executorId) = cores
        executorAddress(executorId) = sender.path.address
        addressToExecutorId(sender.path.address) = executorId
        totalCoreCount.addAndGet(cores)
        makeOffers()

CoarseGrainedExecutorBackend.scala
  case RegisteredExecutor(sparkProperties) =>
    logInfo("Successfully registered with driver")
    executor = new Executor(executorId, Utils.parseHostPort(hostPort)._1, sparkProperties, false)

Executor log location: console / $SPARK_HOME/logs

E. Run the task

Sample Code:

sc.textFile("hdfs://hadoop000:8020/hello.txt").flatMap(_.split('\t')).map((_,1)).reduceByKey(_+_).collect

After the SchedulerBackend receives the executor's registration message, it splits the submitted Spark job into concrete tasks and then distributes these tasks to the executors for actual execution via the LaunchTask command.

CoarseGrainedSchedulerBackend.scala
  def makeOffers() {
    launchTasks(scheduler.resourceOffers(
      executorHost.toArray.map { case (id, host) => new WorkerOffer(id, host, freeCores(id)) }))
  }
  ==> executorActor(task.executorId) ! LaunchTask(new SerializableBuffer(serializedTask))

CoarseGrainedExecutorBackend.scala
  case LaunchTask(data) =>
    if (executor == null) {
      logError("Received LaunchTask command but executor was null")
      System.exit(1)
    } else {
      val ser = SparkEnv.get.closureSerializer.newInstance()
      val taskDesc = ser.deserialize[TaskDescription](data.value)
      logInfo("Got assigned task " + taskDesc.taskId)
      executor.launchTask(this, taskDesc.taskId, taskDesc.serializedTask)
    }
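
On the executor side, launchTask essentially wraps the serialized task in a runnable, records it by task ID, and hands it to a thread pool. A simplified, self-contained model of that pattern (not the real Executor code; ToyExecutor is illustrative):

import java.nio.ByteBuffer
import java.util.concurrent.{ConcurrentHashMap, Executors}

// Simplified model of Executor.launchTask; not Spark source.
class ToyExecutor {
  private val threadPool = Executors.newCachedThreadPool()
  private val runningTasks = new ConcurrentHashMap[Long, Runnable]()

  def launchTask(taskId: Long, serializedTask: ByteBuffer): Unit = {
    val runner = new Runnable {
      def run(): Unit = {
        // the real TaskRunner deserializes the task, runs it, and sends the result back to the driver
        println(s"running task $taskId (${serializedTask.remaining()} bytes)")
        runningTasks.remove(taskId)
      }
    }
    runningTasks.put(taskId, runner)
    threadPool.execute(runner)
  }
}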

Some master log information:

14/07/22 15:25:27 INFO master.Master: Registering app Spark shell
14/07/22 15:25:27 INFO master.Master: Registered app Spark shell with ID app-20140722152527-0001
14/07/22 15:25:27 INFO master.Master: Launching executor app-20140722152527-0001/0 on worker worker-20140722134135-hadoop000-48343

Some worker log information:

Spark assembly has been built with Hive, including Datanucleus jars on classpath
14/07/22 15:25:27 INFO Worker: Asked to launch executor app-20140722152527-0001/0 for Spark shell
Spark assembly has been built with Hive, including Datanucleus jars on classpath
14/07/22 15:25:28 INFO ExecutorRunner: Launch command: "java" "-cp" "::/home/spark/app/spark-1.0.1-bin-2.3.0-cdh5.0.0/conf:/home/spark/app/spark-1.0.1-bin-2.3.0-cdh5.0.0/lib/spark-assembly-1.0.1-hadoop2.3.0-cdh5.0.0.jar:/home/spark/app/spark-1.0.1-bin-2.3.0-cdh5.0.0/lib/datanucleus-rdbms-3.2.1.jar:/home/spark/app/spark-1.0.1-bin-2.3.0-cdh5.0.0/lib/datanucleus-core-3.2.2.jar:/home/spark/app/spark-1.0.1-bin-2.3.0-cdh5.0.0/lib/datanucleus-api-jdo-3.2.1.jar" "-XX:MaxPermSize=128m" "-Xms1024M" "-Xmx1024M" "org.apache.spark.executor.CoarseGrainedExecutorBackend" "akka.tcp://spark@hadoop000:50515/user/CoarseGrainedScheduler" "0" "hadoop000" "1" "akka.tcp://sparkWorker@hadoop000:48343/user/Worker" "app-20140722152527-0001"

Some log information in the console:

14/07/22 15:25:31 INFO cluster.SparkDeploySchedulerBackend: Registered executor: Actor[akka.tcp://sparkExecutor@hadoop000:45150/user/Executor#-791712793] with ID 0
14/07/22 15:25:31 INFO CoarseGrainedExecutorBackend: Successfully registered with driver

Every time a new application registers with the master, the master calls schedule() to assign the application to suitable workers and starts the corresponding ExecutorBackend on each of them; the tasks ultimately run inside those ExecutorBackend processes.

 
