Lesson 17: Principles of dynamic resource allocation and dynamic consumption-rate control in Spark Streaming

Source: Internet
Author: User

Contents of this lesson:

    • Dynamic resource allocation in Spark Streaming

    • Dynamic control of the consumption rate in Spark Streaming

Why is dynamic adjustment needed?

    • By default Spark allocates resources in a coarse-grained way: resources are allocated first and the computation runs afterwards. A Spark Streaming workload has peaks and troughs that need different amounts of resources, so if resources are sized for the peak, much of them is wasted during the troughs.

    • A Spark Streaming application runs continuously, so resource consumption and management are also factors to consider.

    • Challenges Spark Streaming faces when adjusting resources dynamically:

    • Spark Streaming runs in units of the batch duration. One batch may need a lot of resources while the next needs far fewer, so by the time the resources have been adjusted, the batch they were adjusted for may already be over. In that case the batch interval itself has to be adjusted as well.

Dynamic resource allocation in Spark Streaming

1. Dynamic resource allocation is disabled by default in SparkContext, but it can be enabled manually through SparkConf.

    // Optionally scale number of executors dynamically based on workload. Exposed for testing.
    val dynamicAllocationEnabled = Utils.isDynamicAllocationEnabled(_conf)
    if (!dynamicAllocationEnabled &&
        // Configuration parameter that controls whether dynamic resource allocation is turned on
        _conf.getBoolean("spark.dynamicAllocation.enabled", false)) {
      logWarning("Dynamic Allocation and num executors both set, thus dynamic allocation disabled.")
    }

    _executorAllocationManager =
      if (dynamicAllocationEnabled) {
        Some(new ExecutorAllocationManager(this, listenerBus, _conf))
      } else {
        None
      }
    _executorAllocationManager.foreach(_.start())
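
As a minimal sketch of the manual configuration mentioned in point 1, an application might set properties like the following before creating its SparkContext. The property names are standard Spark settings, but the application name and all numeric values are illustrative assumptions, and the external shuffle service is included on the assumption that the cluster manager requires it for dynamic allocation.

```scala
import org.apache.spark.SparkConf

// Illustrative values only; tune them for the actual workload.
val conf = new SparkConf()
  .setAppName("DynamicAllocationSketch") // hypothetical application name
  .set("spark.dynamicAllocation.enabled", "true")
  // Lets executors be removed without losing the shuffle files they wrote.
  .set("spark.shuffle.service.enabled", "true")
  .set("spark.dynamicAllocation.minExecutors", "2")
  .set("spark.dynamicAllocation.maxExecutors", "20")
  // Executors idle for longer than this become candidates for removal.
  .set("spark.dynamicAllocation.executorIdleTimeout", "60s")
```

As the warning in the SparkContext source indicates, requesting a fixed number of executors at the same time causes dynamic allocation to be disabled.
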
2. ExecutorAllocationManager: a timer continuously scans the state of the executors and of the running stages, and the number of executors is increased or decreased accordingly.

3. The schedule method in ExecutorAllocationManager is triggered periodically to adjust resources dynamically.

    /**
     * This is called at a fixed interval to regulate the number of pending executor requests
     * and number of executors running.
     *
     * First, adjust our requested executors based on the add time and our current needs.
     * Then, if the remove time for an existing executor has expired, kill the executor.
     *
     * This is factored out into its own method for testing.
     */
    private def schedule(): Unit = synchronized {
      val now = clock.getTimeMillis

      updateAndSyncNumExecutorsTarget(now)

      removeTimes.retain { case (executorId, expireTime) =>
        val expired = now >= expireTime
        if (expired) {
          initializing = false
          removeExecutor(executorId)
        }
        !expired
      }
    }
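
The expiry step of schedule can be illustrated in isolation. The following is a simplified sketch, not Spark's implementation: `IdleExpirySketch` and `expireIdle` are hypothetical names, and the real method also updates the executor target and actually kills the executors, which is left out here.

```scala
import scala.collection.mutable

// Hypothetical, simplified stand-in for the idle-executor bookkeeping in schedule().
object IdleExpirySketch {
  /**
   * Drops every executor whose expire time has been reached from `removeTimes`
   * and returns the ids that were dropped, sorted for a stable result.
   */
  def expireIdle(removeTimes: mutable.Map[String, Long], now: Long): Seq[String] = {
    val expired = removeTimes.collect {
      case (executorId, expireTime) if now >= expireTime => executorId
    }.toSeq.sorted
    // Keep only the executors that have not expired yet.
    removeTimes --= expired
    expired
  }
}
```

For example, with `mutable.Map("exec-1" -> 1000L, "exec-2" -> 5000L)` and `now = 2000L`, the call returns `Seq("exec-1")` and leaves only `exec-2` in the map.
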
4. In ExecutorAllocationManager, a timer backed by a thread pool runs schedule repeatedly.

    /**
     * Register for scheduler callbacks to decide when to add and remove executors, and start
     * the scheduling task.
     */
    def start(): Unit = {
      listenerBus.addListener(listener)

      val scheduleTask = new Runnable() {
        override def run(): Unit = {
          try {
            schedule()
          } catch {
            case ct: ControlThrowable =>
              throw ct
            case t: Throwable =>
              logWarning(s"Uncaught exception in thread ${Thread.currentThread().getName}", t)
          }
        }
      }
      // intervalMillis is the period at which the timer fires
      executor.scheduleAtFixedRate(scheduleTask, 0, intervalMillis, TimeUnit.MILLISECONDS)
    }
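
The timer pattern used by start can be tried outside Spark with the JDK's scheduled executor. This is a minimal sketch under stated assumptions: `FixedRateSketch` and `runPeriodically` are hypothetical names, and stopping after a fixed number of runs is added only so the example terminates. Exceptions are caught inside the task because `scheduleAtFixedRate` suppresses all later executions once one run throws.

```scala
import java.util.concurrent.{CountDownLatch, Executors, TimeUnit}
import scala.util.control.NonFatal

// Hypothetical helper illustrating a fixed-rate scheduling loop.
object FixedRateSketch {
  /** Runs `task` every `intervalMillis` milliseconds, exactly `times` times, then stops. */
  def runPeriodically(times: Int, intervalMillis: Long)(task: () => Unit): Unit = {
    val executor = Executors.newSingleThreadScheduledExecutor()
    val remaining = new CountDownLatch(times)
    val scheduleTask = new Runnable {
      override def run(): Unit = {
        // Ignore any extra tick that fires after the requested number of runs.
        if (remaining.getCount > 0) {
          try {
            task()
          } catch {
            // An escaping exception would cancel all future runs, so log it instead.
            case NonFatal(t) => println(s"Uncaught exception in scheduled task: $t")
          } finally {
            remaining.countDown()
          }
        }
      }
    }
    executor.scheduleAtFixedRate(scheduleTask, 0L, intervalMillis, TimeUnit.MILLISECONDS)
    remaining.await()      // block until the task has run `times` times
    executor.shutdownNow() // stop the timer thread
  }
}
```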

Dynamic control of the consumption rate: Spark Streaming provides an elastic mechanism that watches the relationship between the rate at which data flows in and the rate at which it is processed, that is, whether data is being processed in time. If processing cannot keep up, Spark Streaming automatically and dynamically throttles the rate at which data flows in. The mechanism is controlled by the spark.streaming.backpressure.enabled parameter.

The principle behind dynamic control of the consumption rate is described in the paper "Adaptive Stream Processing using Dynamic Batch Sizing".
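
A minimal sketch of turning this mechanism on through SparkConf follows. The two property names are standard Spark settings; the application name and the rate cap are illustrative assumptions.

```scala
import org.apache.spark.SparkConf

val conf = new SparkConf()
  .setAppName("BackpressureSketch") // hypothetical application name
  // Let Spark Streaming adapt the ingestion rate to the observed processing speed.
  .set("spark.streaming.backpressure.enabled", "true")
  // Optional upper bound in records per second for each receiver; illustrative value.
  .set("spark.streaming.receiver.maxRate", "10000")
```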
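
The basic idea of matching ingestion to processing speed can be shown with a deliberately simplified rule. This is neither the algorithm from the paper nor Spark's own rate estimator: `RateSketch`, `nextRate` and the `minRate` floor are hypothetical, and the rule simply sets the next ingestion rate to the throughput observed for the last batch.

```scala
// Hypothetical, simplified throughput-matching rule for illustration only.
object RateSketch {
  /**
   * Suggests the next ingestion rate in records per second.
   *
   * @param numRecords        records contained in the last completed batch
   * @param processingDelayMs time it took to process that batch, in milliseconds
   * @param minRate           assumed lower bound so the stream never stalls completely
   */
  def nextRate(numRecords: Long, processingDelayMs: Long, minRate: Double = 100.0): Double = {
    require(processingDelayMs > 0, "processing delay must be positive")
    // Throughput actually achieved while processing the last batch.
    val processingRate = numRecords.toDouble / processingDelayMs * 1000.0
    math.max(minRate, processingRate)
  }
}
```

With this rule, a batch of 10000 records that took 2000 ms to process yields a suggested rate of 5000 records per second, so a source that had been delivering faster than that would be slowed down.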




This article is from the "Ding Dong" blog; please retain this source link: http://lqding.blog.51cto.com/9123978/1784901
