How Spark Uses Akka to Implement Process and Node Communication


"In-depth understanding of spark: core ideas and source analysis," The preface of the book, please see the link "in-depth understanding of spark: core ideas and source analysis," a book officially published listing

"In-depth understanding of spark: core ideas and source analysis," the first chapter of the content, please see the link to the 1th Chapter environment preparation

"In-depth understanding of spark: core ideas and source analysis," the second chapter of the content, please see the link to the 2nd Chapter Spark design concept and basic structure

"In-depth understanding of spark: core ideas and Source Analysis," chapter III of the first part of the content, please see the link "in-depth understanding of spark: core ideas and source analysis,"--sparkcontext initialization (POST)

"In-depth understanding of spark: core ideas and Source Analysis," chapter III of the second part of the content, please see the link "in-depth understanding of spark: core ideas and source analysis,"--sparkcontext initialization (Zhong article)

"In-depth understanding of spark: core ideas and source analysis," the third chapter of the third part of the content, please see the link "in-depth understanding of spark: core ideas and source analysis,"--sparkcontext initialization (tert-chapter)

"In-depth understanding of spark: core ideas and source analysis," the third chapter of the fourth part of the content, please see the link "in-depth understanding of spark: core ideas and source analysis,"--sparkcontext Initialization (quarterly)

Akka Introduction

Scala considers it bad practice for Java threads to share data and to protect that shared data with locks: locks cause contention, thread context switches add considerable overhead, concurrent performance suffers, and deadlocks can be introduced. In Scala, a custom type only needs to extend Actor and provide an act method, much as a Java class implements the Runnable interface and provides a run method. However, instead of calling the act method directly, data is passed by sending the actor a message (message sends in Scala are asynchronous), for example:

actor ! message
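
To make this concrete, here is a minimal sketch using the classic scala.actors library (since deprecated in favour of Akka); EchoActor and the string message are illustrative names, not taken from the book.

import scala.actors.Actor

// A custom type extends Actor and provides an act method, analogous to
// implementing Runnable and providing run in Java.
class EchoActor extends Actor {
  def act(): Unit = {
    loop {
      react {
        case msg: String => println("received: " + msg)   // handle messages as they arrive
      }
    }
  }
}

val echo = new EchoActor
echo.start()     // the actor must be started before it can process messages
echo ! "hello"   // asynchronous send, the "actor ! message" form shown above
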
Akka is a higher-level library for the actor programming model, and it simplifies concurrent programming for developers in much the same way that the concurrency toolkit introduced with JDK 1.5 did for Java. Akka is a toolkit and runtime for building highly concurrent, distributed, scalable, message-driven applications on the Java Virtual Machine. The code sample below, taken from the Akka website, shows how simple concurrent programming with Akka can be.

case class Greeting(who: String)

class GreetingActor extends Actor with ActorLogging {
  def receive = {
    case Greeting(who) => log.info("Hello " + who)
  }
}

val system = ActorSystem("MySystem")
val greeter = system.actorOf(Props[GreetingActor], name = "greeter")
greeter ! Greeting("Charlie Parker")

Akka also provides a distributed framework, which means users do not have to work out how to implement distributed deployment themselves. The Akka website provides the following example, which shows how to obtain a reference to a remote actor.

// config on all machines
akka {
  actor {
    provider = akka.remote.RemoteActorRefProvider
    deployment {
      /greeter {
        remote = akka.tcp://MySystem@machine1:2552
      }
    }
  }
}

// ------------------------------
// define the greeting actor and the greeting message
case class Greeting(who: String) extends Serializable

class GreetingActor extends Actor with ActorLogging {
  def receive = {
    case Greeting(who) => log.info("Hello " + who)
  }
}

// ------------------------------
// on machine 1: empty system, target for deployment from machine 2
val system = ActorSystem("MySystem")

// ------------------------------
// on machine 2: remote deployment - deploying on machine1
val system = ActorSystem("MySystem")
val greeter = system.actorOf(Props[GreetingActor], name = "greeter")

// ------------------------------
// on machine 3: remote lookup (logical home of "greeter" is machine2,
// remote deployment is transparent)
val system = ActorSystem("MySystem")
val greeter = system.actorSelection("akka.tcp://MySystem@machine2:2552/user/greeter")
greeter ! Greeting("Sonny Rollins")

Actors eventually form a tree, in which a parent actor is responsible for handling the failures of all of its children. Akka gives a simple example of this, shown in the code below.

class Supervisor extends Actor {
  override val supervisorStrategy =
    OneForOneStrategy(maxNrOfRetries = 10, withinTimeRange = 1 minute) {
      case _: ArithmeticException  => Resume
      case _: NullPointerException => Restart
      case _: Exception            => Escalate
    }

  val worker = context.actorOf(Props[Worker])

  def receive = {
    case n: Int => worker forward n
  }
}
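
The snippet above refers to a Worker actor that the example does not define. A hypothetical definition, given only to make the excerpt self-contained (the squaring behaviour is purely illustrative), could be:

// Assumes the same akka.actor imports as the Supervisor snippet above.
class Worker extends Actor {
  def receive = {
    // Because the Supervisor uses "forward", the sender seen here is the original
    // client, so the reply goes straight back to it.
    case n: Int => sender() ! n * n
  }
}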

For more information on Akka, please visit the official website: http://akka.io/

ActorSystem: the Akka-based distributed message system

Spark uses the messaging facilities provided by Akka to implement concurrency: ActorSystem is one of the most fundamental pieces of infrastructure in Spark, which uses it both to send distributed messages and for concurrent programming. Spark chose ActorSystem because of the actor model's lightweight concurrency and message passing, and because ActorSystem supports distributed messaging.
When SparkEnv creates the ActorSystem, it uses the AkkaUtils utility class; the code is as follows.

val (actorSystem, boundPort) = Option(defaultActorSystem) match {
  case Some(as) => (as, port)
  case None =>
    val actorSystemName = if (isDriver) driverActorSystemName else executorActorSystemName
    AkkaUtils.createActorSystem(actorSystemName, hostname, port, conf, securityManager)
}

The AkkaUtils.createActorSystem method, which starts the ActorSystem, is shown below.

def createActorSystem(
    name: String,
    host: String,
    port: Int,
    conf: SparkConf,
    securityManager: SecurityManager): (ActorSystem, Int) = {
  val startService: Int => (ActorSystem, Int) = { actualPort =>
    doCreateActorSystem(name, host, actualPort, conf, securityManager)
  }
  Utils.startServiceOnPort(port, startService, conf, name)
}

AkkaUtils delegates to the static method Utils.startServiceOnPort, which eventually calls back the function startService: Int => (T, Int); here startService is actually doCreateActorSystem. The ActorSystem is really started by the doCreateActorSystem method, whose implementation is described in the AkkaUtils section below. For the implementation of startServiceOnPort, see the article "Brief Introduction to the Common Tool Class Utils in Spark" at http://blog.csdn.net/beliefer/article/details/50904662.
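
To illustrate the callback pattern just described, the following is a simplified, hypothetical sketch of the port-retry behaviour; it is not Spark's actual Utils.startServiceOnPort, and the retry count and exception handling are assumptions.

// Try the requested port and, on a bind failure, retry on successive ports,
// returning whatever (service, boundPort) pair the startService callback produces.
// For the ActorSystem case, that callback is doCreateActorSystem.
def startServiceOnPortSketch[T](
    startPort: Int,
    startService: Int => (T, Int),
    maxRetries: Int = 16): (T, Int) = {
  for (offset <- 0 to maxRetries) {
    val tryPort = if (startPort == 0) startPort else startPort + offset
    try {
      return startService(tryPort)
    } catch {
      case e: Exception if offset < maxRetries =>
        // port already in use; fall through and try the next candidate port
    }
  }
  throw new RuntimeException(s"Failed to start service on port $startPort")
}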

AkkaUtils

AkkaUtils is Spark's wrapper layer over the Akka API. Its commonly used functionality is described here.

(1) doCreateActorSystem

Function description: creates the ActorSystem.

private def doCreateActorSystem(
    name: String,
    host: String,
    port: Int,
    conf: SparkConf,
    securityManager: SecurityManager): (ActorSystem, Int) = {

  val akkaThreads = conf.getInt("spark.akka.threads", 4)
  val akkaBatchSize = conf.getInt("spark.akka.batchSize", 15)
  val akkaTimeout = conf.getInt("spark.akka.timeout", 100)
  val akkaFrameSize = maxFrameSizeBytes(conf)
  val akkaLogLifecycleEvents = conf.getBoolean("spark.akka.logLifecycleEvents", false)
  val lifecycleEvents = if (akkaLogLifecycleEvents) "on" else "off"
  if (!akkaLogLifecycleEvents) {
    Option(Logger.getLogger("akka.remote.EndpointWriter")).map(l => l.setLevel(Level.FATAL))
  }
  val logAkkaConfig = if (conf.getBoolean("spark.akka.logAkkaConfig", false)) "on" else "off"

  val akkaHeartBeatPauses = conf.getInt("spark.akka.heartbeat.pauses", 6000)
  val akkaFailureDetector = conf.getDouble("spark.akka.failure-detector.threshold", 300.0)
  val akkaHeartBeatInterval = conf.getInt("spark.akka.heartbeat.interval", 1000)

  val secretKey = securityManager.getSecretKey()
  val isAuthOn = securityManager.isAuthenticationEnabled()
  if (isAuthOn && secretKey == null) {
    throw new Exception("Secret key is null with authentication on")
  }
  val requireCookie = if (isAuthOn) "on" else "off"
  val secureCookie = if (isAuthOn) secretKey else ""
  logDebug("In createActorSystem, requireCookie is: " + requireCookie)

  val akkaConf = ConfigFactory.parseMap(conf.getAkkaConf.toMap[String, String]).withFallback(
    ConfigFactory.parseString(
      s"""
      |akka.daemonic = on
      |akka.loggers = [""akka.event.slf4j.Slf4jLogger""]
      |akka.stdout-loglevel = "ERROR"
      |akka.jvm-exit-on-fatal-error = off
      |akka.remote.require-cookie = "$requireCookie"
      |akka.remote.secure-cookie = "$secureCookie"
      |akka.remote.transport-failure-detector.heartbeat-interval = $akkaHeartBeatInterval s
      |akka.remote.transport-failure-detector.acceptable-heartbeat-pause = $akkaHeartBeatPauses s
      |akka.remote.transport-failure-detector.threshold = $akkaFailureDetector
      |akka.actor.provider = "akka.remote.RemoteActorRefProvider"
      |akka.remote.netty.tcp.transport-class = "akka.remote.transport.netty.NettyTransport"
      |akka.remote.netty.tcp.hostname = "$host"
      |akka.remote.netty.tcp.port = $port
      |akka.remote.netty.tcp.tcp-nodelay = on
      |akka.remote.netty.tcp.connection-timeout = $akkaTimeout s
      |akka.remote.netty.tcp.maximum-frame-size = ${akkaFrameSize}B
      |akka.remote.netty.tcp.execution-pool-size = $akkaThreads
      |akka.actor.default-dispatcher.throughput = $akkaBatchSize
      |akka.log-config-on-start = $logAkkaConfig
      |akka.remote.log-remote-lifecycle-events = $lifecycleEvents
      |akka.log-dead-letters = $lifecycleEvents
      |akka.log-dead-letters-during-shutdown = $lifecycleEvents
      """.stripMargin))

  val actorSystem = ActorSystem(name, akkaConf)
  val provider = actorSystem.asInstanceOf[ExtendedActorSystem].provider
  val boundPort = provider.getDefaultAddress.port.get
  (actorSystem, boundPort)
}
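
As a usage note, the spark.* properties read by doCreateActorSystem can be set on a SparkConf before the SparkContext is created. The values below are purely illustrative; only the property names come from the code above.

import org.apache.spark.SparkConf

val conf = new SparkConf()
  .set("spark.akka.threads", "8")            // becomes akka.remote.netty.tcp.execution-pool-size
  .set("spark.akka.timeout", "120")          // becomes akka.remote.netty.tcp.connection-timeout (seconds)
  .set("spark.akka.heartbeat.pauses", "6000")
  .set("spark.akka.logAkkaConfig", "true")   // log the resolved Akka configuration at startup
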
(2) makeDriverRef

Function description: looks up an actor that has already been registered with a remote ActorSystem.

def makeDriverRef(name: String, conf: SparkConf, actorSystem: ActorSystem): ActorRef = {
  val driverActorSystemName = SparkEnv.driverActorSystemName
  val driverHost: String = conf.get("spark.driver.host", "localhost")
  val driverPort: Int = conf.getInt("spark.driver.port", 7077)
  Utils.checkHost(driverHost, "Expected hostname")
  val url = s"akka.tcp://$driverActorSystemName@$driverHost:$driverPort/user/$name"
  val timeout = AkkaUtils.lookupTimeout(conf)
  logInfo(s"Connecting to $name: $url")
  Await.result(actorSystem.actorSelection(url).resolveOne(timeout), timeout)
}
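
As a usage sketch, internal Spark code on an executor can resolve a driver-side actor by name with makeDriverRef and then send it messages. The actor name and the message below are illustrative assumptions, not taken from the book.

// Resolve the actor registered on the driver's ActorSystem under /user/<name>
// and send it a fire-and-forget message.
val driverRef = AkkaUtils.makeDriverRef("MapOutputTracker", conf, executorActorSystem)
driverRef ! "hello driver"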

