Job, task, and task attempt IDs
In Hadoop 2, MapReduce job IDs are generated from YARN application IDs, which are created by the YARN resource manager.
An application ID is composed of the time at which the resource manager (not the application) started and an incrementing counter maintained by the resource manager to uniquely identify the application to that instance of the resource manager.
So the application with this ID:
application_1410450250506_0003
is the third (0003; application IDs are 1-based) application run by the resource manager, which started at the time represented by the timestamp 1410450250506.
The counter is formatted with leading zeros to make IDs sort nicely (in directory listings, for example).
However, when the counter reaches 10000, it is not reset, resulting in longer application IDs (which don't sort so well).
The corresponding job ID is created simply by replacing the application prefix of an application ID with a job prefix:
job_1410450250506_0003
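The construction described above can be sketched in a few lines of Python. This is only an illustration of the ID format, not Hadoop's own code; the function names `format_application_id` and `job_id_from_application_id` are hypothetical.

```python
def format_application_id(cluster_timestamp: int, counter: int) -> str:
    """Build an application ID from the resource manager start time and a counter.

    The counter is padded to at least four digits; a counter of 10000 or more
    simply produces a longer ID, as described in the text.
    """
    return f"application_{cluster_timestamp}_{counter:04d}"


def job_id_from_application_id(application_id: str) -> str:
    """Derive the job ID by swapping the application prefix for a job prefix."""
    prefix = "application_"
    if not application_id.startswith(prefix):
        raise ValueError(f"not an application ID: {application_id!r}")
    return "job_" + application_id[len(prefix):]


app_id = format_application_id(1410450250506, 3)
job_id = job_id_from_application_id(app_id)
print(app_id)
print(job_id)
```

Run with the values from the example, this prints the application ID and job ID shown above.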
Tasks belong to a job, and their IDs are formed by replacing the job prefix of a job ID with a task prefix and adding a suffix to identify the task within the job. For example:
task_1410450250506_0003_m_000003
is the fourth (000003; task IDs are 0-based) map (m) task of the job with ID job_1410450250506_0003. The task IDs are created for a job when it is initialized, so they do not necessarily dictate the order in which the tasks will be executed. Tasks may be executed more than once, due to failure (see "Task Failure" on page 193) or speculative execution (see "Speculative Execution" on page 204), so to identify different instances of a task execution, task attempts are given unique IDs. For example:
attempt_1410450250506_0003_m_000003_0
is the first (0; attempt IDs are 0-based) attempt at running task
task_1410450250506_0003_m_000003.
Task attempts are allocated during the job run as needed, so their ordering represents the order in which they were created to run.
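To make the nesting of the three ID kinds concrete, here is a minimal Python sketch that splits a task attempt ID into its parts and rebuilds the task and job IDs it belongs to. The names `TaskAttemptId` and `parse_task_attempt_id` are hypothetical, and the pattern assumes the task type is either `m` (map, as in the text) or `r` (reduce).

```python
import re
from typing import NamedTuple

# attempt_<RM start time>_<job counter>_<task type>_<task index>_<attempt index>
_ATTEMPT_RE = re.compile(r"attempt_(\d+)_(\d+)_([mr])_(\d+)_(\d+)")


class TaskAttemptId(NamedTuple):
    cluster_timestamp: int  # resource manager start time
    job_counter: int        # 1-based application/job counter
    task_type: str          # "m" for map, "r" for reduce (assumed set)
    task_index: int         # 0-based task number within the job
    attempt_index: int      # 0-based attempt number for the task

    @property
    def job_id(self) -> str:
        return f"job_{self.cluster_timestamp}_{self.job_counter:04d}"

    @property
    def task_id(self) -> str:
        return (f"task_{self.cluster_timestamp}_{self.job_counter:04d}"
                f"_{self.task_type}_{self.task_index:06d}")


def parse_task_attempt_id(text: str) -> TaskAttemptId:
    match = _ATTEMPT_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"not a task attempt ID: {text!r}")
    timestamp, job, kind, task, attempt = match.groups()
    return TaskAttemptId(int(timestamp), int(job), kind, int(task), int(attempt))


attempt = parse_task_attempt_id("attempt_1410450250506_0003_m_000003_0")
print(attempt.task_id)
print(attempt.job_id)
```

Parsing into integers rather than keeping strings makes the 0-based and 1-based counters easy to compare without depending on the zero padding.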
In short, when the YARN application counter outgrows its four-digit range (that is, when it reaches 10000), YARN does not reset it but simply adds another digit, which expands the ID space. The book itself notes that the resulting longer IDs no longer sort well.
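The sorting problem is easy to reproduce. The sketch below uses made-up counter values on the example timestamp and compares a plain string sort with a sort on the numeric parts; the helper `numeric_key` is hypothetical and assumes well-formed application IDs.

```python
ids = [
    "application_1410450250506_9999",
    "application_1410450250506_10000",
    "application_1410450250506_0003",
]


def numeric_key(application_id: str) -> tuple:
    """Sort key that compares the timestamp and counter as integers."""
    _, timestamp, counter = application_id.split("_")
    return int(timestamp), int(counter)


# Plain string comparison puts 10000 before 9999 because "1" < "9".
lexicographic = sorted(ids)
# Comparing the numeric fields restores creation order.
numeric = sorted(ids, key=numeric_key)

print(lexicographic)
print(numeric)
```

A directory listing behaves like the string sort, which is why five-digit IDs appear out of order there.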
Hadoop: The Definitive Guide: Storage and Analysis at Internet Scale
YARN Application ID Growth reached 10000