Stage is the physical unit of execution that Spark schedules. Below is the Stage source from Spark 1.6 (org.apache.spark.scheduler.Stage):
package org.apache.spark.scheduler

import scala.collection.mutable.HashSet

import org.apache.spark._
import org.apache.spark.rdd.RDD
import org.apache.spark.util.CallSite

/**
 * A stage is a set of parallel tasks all computing the same function that need to run as part
 * of a Spark job, where all the tasks have the same shuffle dependencies. Each DAG of tasks run
 * by the scheduler is split up into stages at the boundaries where shuffle occurs, and then the
 * DAGScheduler runs these stages in topological order.
 *
 * A stage consists of a set of identical tasks; the boundary between stages is the shuffle.
 *
 * Each Stage can either be a shuffle map stage, in which case its tasks' results are input for
 * other stage(s), or a result stage, in which case its tasks directly compute a Spark action
 * (e.g. count(), save(), etc.) by running a function on an RDD. For shuffle map stages, we also
 * track the nodes that each output partition is on.
 *
 * Stages are therefore divided into ShuffleMapStage and ResultStage.
 *
 * Each Stage also has a firstJobId, identifying the job that first submitted the stage. When FIFO
 * scheduling is used, this allows stages from earlier jobs to be computed first or recovered
 * faster on failure.
 *
 * A stage can be retried: a single stage can be re-executed in multiple attempts due to fault
 * recovery. In that case, the Stage object will track multiple StageInfo objects to pass to
 * listeners or the web UI. The latest one will be accessible through latestInfo.
 *
 * @param id Unique stage ID
 * @param rdd RDD that this stage runs on: for a shuffle map stage, it's the RDD we run map tasks
 *   on, while for a result stage, it's the target RDD that we ran an action on
 * @param numTasks Total number of tasks in stage; result stages in particular may not need to
 *   compute all partitions, e.g. for first(), lookup(), and take().
 * @param parents List of stages that this stage depends on (through shuffle dependencies).
 * @param firstJobId ID of the first job this stage was part of, for FIFO scheduling.
 * @param callSite Location in the user program associated with this stage: either where the
 *   target RDD was created, for a shuffle map stage, or where the action for a result stage
 *   was called.
 */
private[scheduler] abstract class Stage(
    val id: Int,
    val rdd: RDD[_],
    val numTasks: Int,
    val parents: List[Stage],   // parents is the list of parent stages; it expresses the DAG's connection relationships
    val firstJobId: Int,
    val callSite: CallSite)
  extends Logging {

  val numPartitions = rdd.partitions.length

  /** Set of jobs that this stage belongs to. */
  val jobIds = new HashSet[Int]

  val pendingPartitions = new HashSet[Int]

  /** The ID to use for the next new attempt for this stage. */
  private var nextAttemptId: Int = 0

  val name: String = callSite.shortForm
  val details: String = callSite.longForm

  private var _internalAccumulators: Seq[Accumulator[Long]] = Seq.empty

  /** Internal accumulators shared across all tasks in this stage. */
  def internalAccumulators: Seq[Accumulator[Long]] = _internalAccumulators

  /**
   * Re-initialize the internal accumulators associated with this stage.
   *
   * This is called every time the stage is submitted, *except* when a subset of tasks
   * belonging to this stage has already finished. Otherwise, reinitializing the internal
   * accumulators here again would override the partial values from the finished tasks.
   */
  def resetInternalAccumulators(): Unit = {
    _internalAccumulators = InternalAccumulator.create(rdd.sparkContext)
  }

  /**
   * Pointer to the [StageInfo] object for the most recent attempt. This needs to be initialized
   * here, before any attempts have actually been created, because the DAGScheduler uses this
   * StageInfo to tell SparkListeners when a job starts (which happens before any stage attempts
   * have been created).
   */
  private var _latestInfo: StageInfo = StageInfo.fromStage(this, nextAttemptId)

  /**
   * Set of stage attempt IDs that have failed with a FetchFailure. We keep track of these
   * failures in order to avoid endless retries if a stage keeps failing with a FetchFailure.
   * We keep track of each attempt ID that has failed to avoid recording duplicate failures if
   * multiple tasks from the same stage attempt fail (SPARK-5945).
   */
  private val fetchFailedAttemptIds = new HashSet[Int]

  private[scheduler] def clearFailures(): Unit = {
    fetchFailedAttemptIds.clear()
  }

  /**
   * Check whether we should abort the failedStage due to multiple consecutive fetch failures.
   *
   * This method updates the running set of failed stage attempts and returns
   * true if the number of failures exceeds the allowable number of failures.
   *
   * In other words: check whether the current failed stage should be abandoned.
   */
  private[scheduler] def failedOnFetchAndShouldAbort(stageAttemptId: Int): Boolean = {
    fetchFailedAttemptIds.add(stageAttemptId)
    fetchFailedAttemptIds.size >= Stage.MAX_CONSECUTIVE_FETCH_FAILURES
  }

  /**
   * Creates a new attempt for this stage by creating a new StageInfo with a new attempt ID.
   * This is how a stage is retried.
   */
  def makeNewStageAttempt(
      numPartitionsToCompute: Int,
      taskLocalityPreferences: Seq[Seq[TaskLocation]] = Seq.empty): Unit = {
    _latestInfo = StageInfo.fromStage(
      this, nextAttemptId, Some(numPartitionsToCompute), taskLocalityPreferences)
    nextAttemptId += 1
  }

  /** Returns the StageInfo for the most recent attempt for this stage. */
  def latestInfo: StageInfo = _latestInfo

  override final def hashCode(): Int = id

  override final def equals(other: Any): Boolean = other match {
    case stage: Stage => stage != null && stage.id == id
    case _ => false
  }

  /** Returns the sequence of partition IDs that are missing (i.e. still need to be computed). */
  def findMissingPartitions(): Seq[Int]
}

private[scheduler] object Stage {
  // The number of consecutive failures allowed before a stage is aborted (retry at most 4 times).
  val MAX_CONSECUTIVE_FETCH_FAILURES = 4
}
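The retry bookkeeping above is small enough to model on its own. Below is a minimal standalone sketch (not Spark code; the names SimpleStage and RetryDemo are made up for illustration) that mirrors how makeNewStageAttempt hands out attempt IDs and how failedOnFetchAndShouldAbort accumulates fetch-failed attempt IDs until the MAX_CONSECUTIVE_FETCH_FAILURES threshold aborts the stage:

// Standalone sketch of the attempt/retry logic; SimpleStage and RetryDemo are hypothetical names.
import scala.collection.mutable.HashSet

object SimpleStage {
  // Same threshold as Stage.MAX_CONSECUTIVE_FETCH_FAILURES in the source above.
  val MaxConsecutiveFetchFailures = 4
}

class SimpleStage(val id: Int) {
  private var nextAttemptId: Int = 0
  private val fetchFailedAttemptIds = new HashSet[Int]

  // Mirrors makeNewStageAttempt: every resubmission of the stage gets a fresh attempt ID.
  def makeNewAttempt(): Int = {
    val attempt = nextAttemptId
    nextAttemptId += 1
    attempt
  }

  // Mirrors failedOnFetchAndShouldAbort: record the failed attempt ID (duplicates are ignored
  // by the set) and abort once the number of distinct failed attempts reaches the threshold.
  def failedOnFetchAndShouldAbort(attemptId: Int): Boolean = {
    fetchFailedAttemptIds.add(attemptId)
    fetchFailedAttemptIds.size >= SimpleStage.MaxConsecutiveFetchFailures
  }
}

object RetryDemo extends App {
  val stage = new SimpleStage(id = 0)
  var abort = false
  while (!abort) {
    val attempt = stage.makeNewAttempt()
    // Pretend every attempt hits a fetch failure; the stage aborts after the fourth one.
    abort = stage.failedOnFetchAndShouldAbort(attempt)
    println(s"attempt $attempt failed, abort = $abort")
  }
}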
ShuffleMapStage and ResultStage
As shown in the figure:
ShuffleMapStage is an intermediate stage in a job's execution, while ResultStage is the final stage of the job.
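To make the boundary concrete, here is a small self-contained example against the public RDD API (a hypothetical word-count job run in local mode). reduceByKey introduces a shuffle dependency, so everything up to the map side of the shuffle is scheduled as a ShuffleMapStage, and the work triggered by the collect() action runs as the ResultStage:

import org.apache.spark.{SparkConf, SparkContext}

object StageBoundaryDemo {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("stage-boundary-demo").setMaster("local[2]"))

    val counts = sc.parallelize(Seq("a", "b", "a", "c"))
      .map(word => (word, 1)) // narrow dependency: stays in the ShuffleMapStage
      .reduceByKey(_ + _)     // shuffle dependency: stage boundary here
      .collect()              // action: the post-shuffle work runs as the ResultStage

    println(counts.mkString(", "))
    sc.stop()
  }
}

Running this job and opening the Spark UI shows the two stages: the map side before the shuffle and the final stage that produces the collected result.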