Spark notes 4: Apache Hadoop YARN: Yet Another Resource Negotiator

Tags: spark notes

Spark supports YARN as a resource scheduler, so YARN's principles are still worth understanding: http://www.socc2013.org/home/program/a5-vavilapalli.pdf. Overall, though, this is a fairly general paper: its principles are not particularly striking, the numbers it presents are not really comparable, and YARN shows almost no clear advantage in them. My impression from reading it is that YARN's resource allocation does poorly on latency, and actual use seems to confirm that impression.
Abstract. Two key shortcomings: 1) tight coupling of a specific programming model with the resource management infrastructure, forcing developers to abuse the MapReduce programming model, and 2) centralized handling of jobs' control flow, which resulted in endless scalability concerns for the scheduler.
The two drawbacks of the previous resource manager: 1) it is tightly coupled to a single programming model, forcing programmers to overuse the MapReduce pattern, and 2) it centralizes the control flow of jobs, which makes scalability poor. YARN is committed to solving these problems.

Introduction: nothing interesting ...
Hadoop on Demand shortcomings. The requirements we care about:
    1. Scalability
    2. Multi-tenancy
    3. Serviceability
    4. Locality awareness
    5. High cluster utilization
    6. Reliability/availability
    7. Secure and auditable operation
    8. Support for programming model diversity
    9. Flexible resource model
    10. Backward compatibility
3 Architecture
To address the requirements we discussed in Section 2, YARN lifts some functions into a platform layer responsible for resource management, leaving coordination of logical execution plans to a host of framework implementations. Specifically, a per-cluster ResourceManager (RM) tracks resource usage and node liveness, enforces allocation invariants, and arbitrates contention among tenants. By separating these duties from the JobTracker's charter, the central allocator can use an abstract description of tenants' requirements, but remain ignorant of the semantics of each allocation. That responsibility is delegated to an ApplicationMaster (AM), which coordinates the logical plan of a single job by requesting resources from the RM, generating a physical plan from the resources it receives, and coordinating the execution of that plan around faults.

In other words, YARN promotes resource management to the platform level, while coordination of the logical execution plan is left to the framework implementations. The ResourceManager (RM) tracks resource usage and node availability, enforces allocation invariants, and arbitrates resource contention among tenants. The semantics of each allocation are left to the ApplicationMaster (AM), which obtains resources from the RM to coordinate the logical plan, generate a physical plan, execute that plan, and handle errors during execution.

3.1 Overview
The RM runs as a daemon on a dedicated machine, and acts as the central authority arbitrating resources among various competing applications in the cluster. Given this central and global view of the cluster resources, it can enforce rich, familiar properties such as fairness [R10], capacity [R10], and locality [R4] across tenants. Depending on the application demand, scheduling priorities, and resource availability, the RM dynamically allocates leases, called containers, to applications to run on particular nodes. The container is a logical bundle of resources (e.g., 2GB RAM, 1 CPU) bound to a particular node [R4, R9]. In order to enforce and track such assignments, the RM interacts with a special system daemon running on each node called the NodeManager (NM). Communications between the RM and NMs are heartbeat-based for scalability. NMs are responsible for monitoring resource availability, reporting faults, and container lifecycle management (e.g., starting, killing). The RM assembles its global view from these snapshots of NM state.
The RM executes as a daemon on a dedicated machine and is responsible for central arbitration of resources. Depending on application requirements, scheduling priorities, and available resources, the RM dynamically allocates leases (called containers) to applications for execution on specific nodes. A container is a logical bundle of resources bound to a particular node, such as 2GB RAM and 1 CPU. (The paper uses "container" interchangeably for the logical lease on resources and the actual process spawned on the node.) To enforce and track these assignments, the RM interacts with the NodeManager (NM) on each node; communication between the RM and NMs is heartbeat-based, for scalability. The NM monitors resource availability on its node and reports faults and container state to the RM, and the RM consolidates these reports into a global view of cluster resources.

Jobs are submitted to the RM via a public submission protocol and go through an admission-control phase during which security credentials are validated and various operational and administrative checks are performed [R7]. Accepted jobs are passed to the scheduler to be run. Once the scheduler has enough resources, the application is moved from the accepted to the running state. Aside from internal bookkeeping, this involves allocating a container for the AM and spawning it on a node in the cluster. A record of accepted applications is written to persistent storage and recovered in case of RM restart or failure.

Main flow: submit the job; enter the admission-control phase; once admitted, enter the accepted state; once resources are allocated, enter the running state, with a container on one NM used to run the AM. To be able to recover after a failure, accepted applications are logged to persistent storage.

The ApplicationMaster is the "head" of a job, managing all lifecycle aspects including dynamically increasing and decreasing resource consumption, managing the flow of execution (e.g., running reducers against the output of maps), handling faults and computation skew, and performing other local optimizations. In fact, the AM can run arbitrary user code, and can be written in any programming language, since all communication with the RM and NM is encoded using extensible communication protocols (e.g., Protocol Buffers: https://code.google.com/p/protobuf/); as an example, consider the Dryad port discussed in Section 4.2. YARN makes few assumptions about the AM, although in practice most jobs are expected to use a higher-level programming framework (e.g., MapReduce, Dryad, Tez, REEF, etc.). By delegating these functions to AMs, YARN's architecture gains a great deal of scalability [R1], programming-model flexibility [R8], and improved upgrading/testing [R3] (since multiple versions of the same framework can coexist).
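As a concrete illustration of this submission flow, here is a minimal sketch using Hadoop's standard YarnClient API (org.apache.hadoop.yarn.client.api). The application name, resource sizes, and AM launch command are hypothetical placeholders, not anything prescribed by the paper.

import java.util.Collections;
import org.apache.hadoop.yarn.api.records.ApplicationSubmissionContext;
import org.apache.hadoop.yarn.api.records.ContainerLaunchContext;
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.client.api.YarnClient;
import org.apache.hadoop.yarn.client.api.YarnClientApplication;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class SubmitSketch {
    public static void main(String[] args) throws Exception {
        YarnClient yarnClient = YarnClient.createYarnClient();
        yarnClient.init(new YarnConfiguration());
        yarnClient.start();

        // Ask the RM for a new application; admission control happens on the RM side.
        YarnClientApplication app = yarnClient.createApplication();
        ApplicationSubmissionContext ctx = app.getApplicationSubmissionContext();
        ctx.setApplicationName("demo-app");  // hypothetical name

        // The AM's container: a logical bundle of resources, e.g. 2GB RAM and 1 vcore.
        ctx.setResource(Resource.newInstance(2048, 1));

        // Command the NM runs to bootstrap the AM (hypothetical class name).
        ContainerLaunchContext amClc = ContainerLaunchContext.newInstance(
                Collections.emptyMap(),  // local resources
                Collections.emptyMap(),  // environment
                Collections.singletonList("java com.example.MyAppMaster"),
                null, null, null);
        ctx.setAMContainerSpec(amClc);

        // The job enters the accepted state here and moves to running once the
        // scheduler finds a container for the AM.
        yarnClient.submitApplication(ctx);
    }
}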
The AM is the head of the application, responsible for dynamically increasing and decreasing resource consumption, managing the flow of execution, handling faults and computation skew, and performing other local optimizations across the whole lifecycle. In fact, the AM can run code written in any programming language, as long as it speaks the communication protocols used with the RM and NM.

Typically, an AM will need to harness the resources (CPUs, RAM, disks, etc.) available on multiple nodes to complete a job. To obtain containers, the AM issues resource requests to the RM. The form of these requests includes specification of locality preferences and properties of the containers. The RM will attempt to satisfy the resource requests coming from each application according to availability and scheduling policies. When a resource is allocated on behalf of an AM, the RM generates a lease for the resource, which is pulled by a subsequent AM heartbeat. A token-based security mechanism guarantees its authenticity when the AM presents the container lease to the NM [R4]. Once the ApplicationMaster discovers that a container is available for its use, it encodes an application-specific launch request with the lease. In MapReduce, the code running in the container is either a map task or a reduce task. If needed, running containers may communicate directly with the AM through an application-specific protocol to report status and liveness and receive framework-specific commands; YARN neither facilitates nor enforces this communication. Overall, a YARN deployment provides a basic yet robust infrastructure for lifecycle management and monitoring of containers, while application-specific semantics are managed by each framework [R3, R8].
The AM issues a resource request to the RM that includes locality preferences and container properties. The RM allocates resources to the AM based on resource availability and its scheduling policies. When a resource is allocated, the RM generates a lease for a container, which the AM pulls on its next heartbeat. A token-based security mechanism lets the NM verify that the AM genuinely holds a lease for a container on that node. If necessary, a running container can communicate directly with the AM through an application-specific protocol to report its status.
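The launch step described above can be sketched with the standard NMClient API: once a container lease has been pulled on a heartbeat, the AM encodes an application-specific launch request and presents it, together with the lease, to the NM. This is only an illustration; the task command line is hypothetical.

import java.util.Collections;
import org.apache.hadoop.yarn.api.records.Container;
import org.apache.hadoop.yarn.api.records.ContainerLaunchContext;
import org.apache.hadoop.yarn.client.api.NMClient;

public class LaunchSketch {
    // Called by the AM for each container returned by an allocate() heartbeat.
    static void launchTask(NMClient nmClient, Container container) throws Exception {
        ContainerLaunchContext clc = ContainerLaunchContext.newInstance(
                Collections.emptyMap(),  // dependencies (local resources)
                Collections.emptyMap(),  // environment
                Collections.singletonList("java com.example.MapTask"),  // hypothetical task command
                null, null, null);
        // The Container object carries the lease (and its token), which the NM
        // verifies before spawning the process.
        nmClient.startContainer(container, clc);
    }
}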
3.2 Resource Manager (RM)
The RM exposes two public interfaces: 1) towards clients submitting applications, and 2) towards ApplicationMaster(s) dynamically negotiating access to resources. (There is also one internal interface towards NodeManagers, for cluster monitoring and resource access management.)
The format of the AM's resource requests to the RM (a sketch using the YARN client API follows the list):
    1. Number of containers
    2. Resources per container (e.g., 2GB RAM, 1 CPU)
    3. Locality preferences
    4. Priority of requests within the application
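Expressed with the standard AMRMClient API, such a request might look like the following sketch; the container count, node and rack names, and priority are hypothetical values chosen only to show the four fields.

import org.apache.hadoop.yarn.api.records.Priority;
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.client.api.AMRMClient;
import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest;

public class RequestSketch {
    static void askForContainers(AMRMClient<ContainerRequest> amRmClient) {
        Resource perContainer = Resource.newInstance(2048, 1);  // 2. resources per container: 2GB RAM, 1 vcore
        String[] nodes = {"node17.example.com"};                // 3. locality preference: a node...
        String[] racks = {"/rack3"};                            //    ...or its rack (hypothetical names)
        Priority priority = Priority.newInstance(1);            // 4. priority within this application
        for (int i = 0; i < 10; i++) {                          // 1. number of containers: 10
            amRmClient.addContainerRequest(
                    new ContainerRequest(perContainer, nodes, racks, priority));
        }
    }
}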
It is worth pointing out what the ResourceManager is not responsible for. As discussed, it is not responsible for coordinating application execution or task fault-tolerance, but neither is it charged with 1) providing status or metrics for running applications (now part of the ApplicationMaster), nor 2) serving framework-specific reports of completed jobs (now delegated to a per-framework daemon). In short, the RM does not: 1) provide status information or metrics for running applications (that is the AM's job), nor 2) serve framework-specific reports of completed jobs.

3.3 Application Master (AM)
The ApplicationMaster is the process that coordinates the application's execution in the cluster, but it itself runs in the cluster just like any other container. A component of the RM negotiates for the container to spawn this bootstrap process. So the AM coordinates the application's execution, yet it runs in an ordinary container on the cluster, started via the RM. The AM periodically heartbeats to the RM to affirm its liveness and to update the record of its demand; through these periodic heartbeats the AM both confirms it is alive and updates the resources it needs.
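A minimal sketch of that heartbeat, assuming the standard AMRMClient API: each allocate() call both affirms the AM's liveness to the RM and pulls any newly granted leases. The progress value and sleep interval are illustrative only.

import java.util.List;
import org.apache.hadoop.yarn.api.protocolrecords.AllocateResponse;
import org.apache.hadoop.yarn.api.records.Container;
import org.apache.hadoop.yarn.client.api.AMRMClient;
import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest;

public class HeartbeatSketch {
    static void heartbeatLoop(AMRMClient<ContainerRequest> amRmClient, float progress)
            throws Exception {
        while (true) {
            // Heartbeat: reports progress/demand and returns newly allocated leases.
            AllocateResponse response = amRmClient.allocate(progress);
            List<Container> granted = response.getAllocatedContainers();
            // 'granted' holds the container leases pulled on this heartbeat;
            // the AM would hand each one to an NM with a launch request.
            Thread.sleep(1000);  // illustrative heartbeat interval
        }
    }
}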
Allocations to an application are late binding: the process spawned is not bound to the request, but to the lease. The conditions that caused the AM to issue the request may no longer remain true when it receives its resources, but the semantics of the container are fungible and framework-specific.
In other words, resource binding is deferred: the spawned process is bound to the lease rather than to the original request, so by the time the AM receives the resources the conditions that prompted the request may no longer hold; the AM is free to put the container to whatever framework-specific use it needs.
Since the RM does not interpret the container status, the AM determines the semantics of the success or failure of the container exit status reported by the NMs through the RM. And since the AM is itself a container running in a cluster of unreliable hardware, it should be resilient to failure. In short: because the RM does not parse container state, the AM must decide success or failure from the exit status each container returns; and because the AM itself runs in a container, it must also tolerate faults.
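A sketch of what that interpretation might look like inside an AM, using the exit codes the RM relays from the NMs (ContainerStatus / ContainerExitStatus from the YARN records API). The reactions in the branches are hypothetical framework policy, not something YARN itself dictates.

import java.util.List;
import org.apache.hadoop.yarn.api.records.ContainerExitStatus;
import org.apache.hadoop.yarn.api.records.ContainerStatus;

public class ExitStatusSketch {
    // Typically invoked when the RM reports completed containers on a heartbeat
    // (e.g., from an AMRMClientAsync.CallbackHandler).
    static void onContainersCompleted(List<ContainerStatus> statuses) {
        for (ContainerStatus status : statuses) {
            int exit = status.getExitStatus();
            if (exit == ContainerExitStatus.SUCCESS) {
                // Framework-specific meaning: e.g. mark the corresponding task done.
            } else if (exit == ContainerExitStatus.PREEMPTED) {
                // Lease revoked by the RM: a typical policy is to re-request a container.
            } else {
                // Any other code: the AM decides whether to retry the task or fail the job.
            }
        }
    }
}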
3.4 Node Manager (NM)
The NodeManager is the "worker" daemon in YARN. It authenticates container leases, manages containers' dependencies, monitors their execution, and provides a set of services to containers. Operators configure it to report memory, CPUs, and other resources available at this node and allocated for YARN. After registering with the RM, the NM heartbeats its status and receives instructions.
The NM is the worker daemon on each node. It handles container leases and dependencies, monitors container execution, and provides services to containers. It reports the availability of memory, CPU, and other resources to the RM through its heartbeats.
All containers in YARN, including AMs, are described by a Container Launch Context (CLC). This record includes a map of environment variables, dependencies stored in remotely accessible storage, security tokens, payloads for NM services, and the command necessary to create the process.
Every container in YARN is thus described by a CLC: environment variables, remotely stored dependencies, security tokens, payload information for NM services, and the command that creates the process. To launch the container, the NM copies all the necessary dependencies (data files, executables, tarballs) to local storage before starting the process.
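A sketch of building such a CLC with the YARN records API. The HDFS location, file size, timestamp, environment value, and launch command are all hypothetical; in real code the size and timestamp must match the file as stored, or localization fails.

import java.util.Collections;
import java.util.Map;
import org.apache.hadoop.yarn.api.records.ContainerLaunchContext;
import org.apache.hadoop.yarn.api.records.LocalResource;
import org.apache.hadoop.yarn.api.records.LocalResourceType;
import org.apache.hadoop.yarn.api.records.LocalResourceVisibility;
import org.apache.hadoop.yarn.api.records.URL;

public class ClcSketch {
    static ContainerLaunchContext build() {
        // A dependency kept in remotely accessible storage; the NM copies it to
        // local disk before starting the process (path, size, timestamp hypothetical).
        LocalResource appJar = LocalResource.newInstance(
                URL.newInstance("hdfs", "namenode.example.com", 8020, "/apps/demo/app.jar"),
                LocalResourceType.FILE,
                LocalResourceVisibility.APPLICATION,
                1234L,            // file size in bytes
                1700000000000L);  // modification timestamp
        Map<String, LocalResource> localResources = Collections.singletonMap("app.jar", appJar);
        Map<String, String> env = Collections.singletonMap("APP_MODE", "demo");

        return ContainerLaunchContext.newInstance(
                localResources,
                env,
                Collections.singletonList("java -cp app.jar com.example.Task"),  // launch command
                null,   // payloads for NM auxiliary services
                null,   // security tokens (omitted here)
                null);  // ACLs
    }
}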
3.5 YARN framework/application Writers
The responsibilities of a YARN application author (a sketch of steps 2 and 4 using the YARN client API follows the list):
    1. Submitting the application by passing a CLC for the ApplicationMaster to the RM.
    2. When the RM starts the AM, it should register with the RM and periodically advertise its liveness and requirements over the heartbeat protocol.
    3. Once the RM allocates a container, the AM can construct a CLC to launch the container on the corresponding NM. It may also monitor the status of the running container and stop it when the resource should be reclaimed. Monitoring the progress of work done inside the container is strictly the AM's responsibility.
    4. Once the AM is done with its work, it should unregister from the RM and exit cleanly.
    5. Optionally, framework authors may add control flow between their own clients and the AM to report job status and expose a control plane.
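A sketch of steps 2 and 4, assuming the standard AMRMClient API; the host name, tracking URL, and final message are hypothetical. Steps 1 and 3 correspond to the submission and container-launch sketches shown earlier.

import org.apache.hadoop.yarn.api.records.FinalApplicationStatus;
import org.apache.hadoop.yarn.client.api.AMRMClient;
import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class AmLifecycleSketch {
    public static void main(String[] args) throws Exception {
        AMRMClient<ContainerRequest> amRmClient = AMRMClient.createAMRMClient();
        amRmClient.init(new YarnConfiguration());
        amRmClient.start();

        // Step 2: register with the RM; afterwards the AM advertises liveness and
        // demand over the allocate() heartbeat (hypothetical host and tracking URL).
        amRmClient.registerApplicationMaster("am-host.example.com", 0, "");

        // Step 3 would go here: request containers, launch them via CLCs on NMs,
        // and monitor their exit status (see the earlier sketches).

        // Step 4: unregister and exit cleanly once all work is done.
        amRmClient.unregisterApplicationMaster(
                FinalApplicationStatus.SUCCEEDED, "all tasks finished", "");
        amRmClient.stop();
    }
}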
7 Conclusion
Thanks to the decoupling of resource management from the programming framework, YARN provides: 1) greater scalability, 2) higher efficiency (which I find hard to credit), and 3) the ability for many different frameworks to efficiently share the same cluster.


