Hadoop-yarn Application Design Overview

Source: Internet
Author: User



One Overview
An application is the general term for a user-written program that processes data; it requests resources from YARN to carry out its own computation. YARN places no restriction on the application type: it can be a MapReduce job made up of short-lived tasks, or an application that deploys long-running services.
To develop an application on YARN, you generally need to write two components, a client and an ApplicationMaster. The client submits the application to YARN, and interacts with YARN and the ApplicationMaster to query the application's status and carry out commands issued by the user. The ApplicationMaster requests resources from YARN, communicates with the NodeManagers to start each container, monitors the execution status of the individual tasks, and requests resources for a task again when it fails.
The client program submits an application to the YARN ResourceManager through the application submission client. Using "ClientRMProtocol" (the earlier name of ApplicationClientProtocol), the client first obtains a new application ID and then submits the application for execution. The submission includes the information needed to launch the UNIX process in which the ApplicationMaster will run. It also describes the local files the application needs, the jar packages, the actual command to execute, and the UNIX environment settings.
Two Client Design
A YARN application client talks to the ResourceManager through the ApplicationClientProtocol protocol. The protocol provides a set of interfaces for interacting with YARN, including submitting an application, querying its execution status, changing its properties (such as priority), killing it, and so on. The most important of these is the one that submits the application, which usually involves two steps:
Step 1: The client calls the RPC function ApplicationClientProtocol#getNewApplication to obtain a unique application ID from the ResourceManager.
Step 2: The client submits the ApplicationMaster to the RM through the RPC function ApplicationClientProtocol#submitApplication.
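The two-step submission can be sketched with a small in-memory model. This is only an illustration in Python, not the Hadoop Java API: `ToyResourceManager`, `SubmissionContext`, and all field names are hypothetical, and the launch command is a placeholder.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class SubmissionContext:
    """What the client sends in step 2 (hypothetical, simplified)."""
    app_id: int
    am_command: str                       # command that launches the AM process
    local_files: list = field(default_factory=list)
    environment: dict = field(default_factory=dict)


class ToyResourceManager:
    """In-memory stand-in for the RM side of the client protocol."""

    def __init__(self):
        self._ids = count(1)
        self._issued = set()
        self.submitted = {}

    def get_new_application(self) -> int:
        # Step 1: hand out a unique application ID.
        app_id = next(self._ids)
        self._issued.add(app_id)
        return app_id

    def submit_application(self, ctx: SubmissionContext) -> None:
        # Step 2: accept a submission only for an ID issued here, and only once.
        if ctx.app_id not in self._issued:
            raise ValueError(f"unknown application id {ctx.app_id}")
        if ctx.app_id in self.submitted:
            raise ValueError(f"application {ctx.app_id} already submitted")
        self.submitted[ctx.app_id] = ctx


rm = ToyResourceManager()
app_id = rm.get_new_application()
rm.submit_application(SubmissionContext(
    app_id=app_id,
    am_command="java MyApplicationMaster",   # placeholder command
    local_files=["app.jar"],
    environment={"CLASSPATH": "./*"},
))
```

Splitting the call in two lets the client know its application ID before it builds the submission, so the ID can be used in file paths or log names that go into the submission itself.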
A fully functional YARN client must interact not only with the ResourceManager but also with the ApplicationMaster, either to query the application's internal state (the ResourceManager usually holds no application-internal information) or to control tasks within the application (for example, to kill a task; likewise, the ResourceManager holds no detailed task information). The application has to design the communication protocol for this part itself.
Note: To reduce the load on the ResourceManager in practice, once the application's ApplicationMaster has started successfully, the client usually communicates with the ApplicationMaster directly to query its execution state or to control its execution (for example, to kill a task).
Three ApplicationMaster Design
The AM needs to interact with two services, the RM and the NM. By interacting with the RM, the AM obtains the resources its tasks require; by interacting with the NM, the AM starts the tasks and monitors them until they complete.
AM-RM Communication Process
The communication between the AM and the RM involves three steps, as follows:
1. Registration
When the AM starts, it first registers with the RM. The registration information is encapsulated in the Protocol Buffers message RegisterApplicationMasterRequest, which mainly contains the following fields:
a. host: the host of the node on which the AM was started
b. rpc_port: the RPC port number of the AM
c. tracking_url: the tracking web URL that the AM exposes; a client can query the application's running state through this tracking_url.
Upon successful registration, the AM receives the following information:
a. The maximum amount of resources that can be requested for a single container
b. client_to_am_token_master_key: the ClientToAMToken master key
c. application_ACLs: the application access control list
2. Resource Request
The ApplicationMaster requests resources (in the form of containers) from the ResourceManager through the RPC function ApplicationMasterProtocol#allocate. The request mainly contains the following fields:
2.1. ask: the list of resources the ApplicationMaster requests, each represented by a ResourceRequest. The list can be read or set with AllocateRequest#getAskList / AllocateRequest#setAskList. A ResourceRequest includes the following fields:
a. priority: the priority of the request, a positive integer; the lower the value, the higher the priority.
b. resource_name: the node or rack on which the resource is wanted; "*" means that a resource on any node is acceptable.
c. capability: the amount of resources required; both CPU and memory are currently supported.
d. num_containers: the number of containers required that meet the above criteria.
e. relax_locality: whether locality may be relaxed, that is, whether rack-local or other resources may be chosen automatically when node-local resources cannot be provided.
2.2. release: the list of containers the AM releases.
2.3. response_id: the response ID of this communication; it is incremented by 1 on every communication.
2.4. progress: the running progress of the application.
2.5. blacklist_request: the list of nodes to add to or remove from the blacklist.
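The ResourceRequest fields can be made concrete with a deliberately simplified model. This sketch is in Python and is not YARN's scheduler logic: the class, the `node_satisfies` and `order_asks` helpers, and the host and rack names are hypothetical, and real YARN locality handling is more involved than the single check shown here.

```python
from dataclasses import dataclass

ANY = "*"


@dataclass(frozen=True)
class ResourceRequest:
    priority: int            # lower value = higher priority
    resource_name: str       # host name, rack name, or "*"
    memory_mb: int           # capability: memory
    vcores: int              # capability: CPU
    num_containers: int
    relax_locality: bool = True


def node_satisfies(req, host, rack):
    """Return True if a container on (host, rack) may serve this request."""
    if req.resource_name in (ANY, host, rack):
        return True
    # Placement elsewhere is acceptable only when locality may be relaxed.
    return req.relax_locality


def order_asks(asks):
    """Serve requests with the lowest priority value first."""
    return sorted(asks, key=lambda r: r.priority)


asks = [
    ResourceRequest(priority=20, resource_name=ANY, memory_mb=1024,
                    vcores=1, num_containers=4),
    ResourceRequest(priority=5, resource_name="node1", memory_mb=2048,
                    vcores=2, num_containers=1, relax_locality=False),
]
first = order_asks(asks)[0]
```

Here the request pinned to `node1` is served first because its priority value is lower, and with `relax_locality=False` it is never placed on another node.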
Note: Even if the AM does not need any resources, it still has to call the ApplicationMasterProtocol#allocate function periodically to maintain its heartbeat with the ResourceManager. If the RM does not receive a message from the AM for a certain amount of time, it considers the AM dead and removes it from the system or triggers a fault-tolerance mechanism. The AM sends this RPC request once every 1000 milliseconds by default (for example, in the MapReduce AM).
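The heartbeat rule amounts to a timeout check on the RM side. A minimal sketch in Python, assuming a hypothetical `LivenessMonitor` class and illustrative timing values (neither is taken from YARN's code or configuration):

```python
class LivenessMonitor:
    """Tracks the last heartbeat per AM and reports the ones that expired."""

    def __init__(self, expiry_ms):
        self.expiry_ms = expiry_ms
        self._last_seen = {}

    def heartbeat(self, am_id, now_ms):
        # Called on every allocate request, even one that asks for nothing.
        self._last_seen[am_id] = now_ms

    def expired(self, now_ms):
        # An AM is considered dead once it has been silent longer than expiry_ms.
        return sorted(am for am, seen in self._last_seen.items()
                      if now_ms - seen > self.expiry_ms)


monitor = LivenessMonitor(expiry_ms=10_000)    # illustrative timeout
monitor.heartbeat("am-1", now_ms=0)
monitor.heartbeat("am-2", now_ms=0)
for t in range(1_000, 12_001, 1_000):          # am-1 keeps sending empty allocates
    monitor.heartbeat("am-1", now_ms=t)
```

In this run `am-2` goes silent after its first call and is reported as expired, while `am-1` stays alive only because it keeps calling in without requesting anything.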
The response contains the following information:
a_m_command: a command that the ApplicationMaster must carry out. There are currently two main values, AM_RESYNC and AM_SHUTDOWN, which tell the AM to restart (resynchronize) and to shut down, respectively. The RM shuts the AM down when it finds that the AM's node has been blacklisted.
response_id: the response ID of this communication; it is incremented by 1 on every communication.
allocated_containers: the list of containers assigned to the application. The RM wraps each available resource in a container that carries the details of that resource; generally, the ApplicationMaster runs a task in each container it receives.
completed_container_statuses: the statuses of containers that have finished executing. Note that a container in this list may have succeeded, failed, or been killed.
limit: the total amount of resources currently available in the cluster.
updated_nodes: the list of node states in the current cluster (nodes whose state has been updated).
num_cluster_nodes: the total number of available nodes in the current cluster.
3. Program Exit
The AM tells the RM through the RPC function ApplicationMasterProtocol#finishApplicationMaster that the application has finished running, and then exits.
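The three AM-RM steps (register, allocate in a loop, finish) can be walked through with a toy model. This is a Python sketch, not the Hadoop API: `ToyRM`, `run_am`, the dictionary-shaped responses, and the rule of granting one container per call are all assumptions made for illustration.

```python
class ToyRM:
    """Stand-in for the RM side of the AM-RM protocol; grants one container per call."""

    def __init__(self):
        self.registered = False
        self.finished = False
        self.last_response_id = 0
        self._next_container = 1

    def register(self, host, rpc_port, tracking_url):
        self.registered = True
        return {"max_memory_mb": 8192}           # illustrative per-container limit

    def allocate(self, response_id, ask, release, progress):
        if not self.registered:
            # An AM the RM does not know about is told to resynchronize.
            return {"command": "AM_RESYNC", "response_id": response_id, "allocated": []}
        if response_id != self.last_response_id:
            raise ValueError("response_id out of sequence")
        self.last_response_id += 1                # incremented on every exchange
        allocated = []
        if ask > 0:
            allocated.append(f"container_{self._next_container}")
            self._next_container += 1
        return {"command": None, "response_id": self.last_response_id,
                "allocated": allocated}

    def finish(self, final_status):
        self.finished = True
        self.registered = False


def run_am(rm, wanted):
    """Register, poll allocate until `wanted` containers arrive, then finish."""
    rm.register(host="am-host", rpc_port=0, tracking_url="")
    response_id, containers = 0, []
    while len(containers) < wanted:
        resp = rm.allocate(response_id, ask=wanted - len(containers),
                           release=[], progress=len(containers) / wanted)
        response_id = resp["response_id"]
        containers.extend(resp["allocated"])
    rm.finish("SUCCEEDED")
    return containers


rm = ToyRM()
got = run_am(rm, wanted=3)
```

The loop shows why allocation is a polling protocol: a single call may return fewer containers than were asked for, so the AM keeps calling with the remaining count and the latest response_id.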

AM-NM Communication Process
1. The AM performs a second-level allocation, assigning the resources it has obtained to its internal tasks, and communicates with the corresponding NodeManager through the RPC function ContainerManagementProtocol#startContainer to start the container (the request includes the task description, the resource description, and other information). The parameter type of this function is StartContainersRequest.
2. To keep track of each container's execution state, the AM queries the NM for the container's state through an RPC function. Once it finds that a container has failed, the AM can request resources again for the corresponding task.
3. Once a container has finished executing, the AM can release it through the RPC function ContainerManagementProtocol#stopContainer.
Note: YARN is a resource management system that is responsible not only for allocating resources but also for reclaiming them. When a container finishes executing, YARN actively confirms that the container has released its resources; that is, after a container ends, the AM must call the RPC function ContainerManagementProtocol#stopContainer to release the container.
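The start, monitor, and release cycle between the AM and the NM can be sketched as follows. This Python model is an assumption-laden simplification: `ToyNodeManager` and `run_task` are hypothetical, tasks run synchronously inside `start_container`, and "requesting resources again" is simulated by moving on to the next container in a list that was granted in advance.

```python
class ToyNodeManager:
    """Stand-in NM: runs a task callable inside a named container."""

    def __init__(self):
        self.status = {}         # container_id -> "COMPLETE" | "FAILED"
        self.released = set()

    def start_container(self, container_id, task):
        try:
            task()
            self.status[container_id] = "COMPLETE"
        except Exception:
            self.status[container_id] = "FAILED"

    def get_status(self, container_id):
        return self.status[container_id]

    def stop_container(self, container_id):
        self.released.add(container_id)


def run_task(nm, task, container_ids):
    """Try the task in each granted container until one succeeds."""
    for cid in container_ids:
        nm.start_container(cid, task)
        state = nm.get_status(cid)
        nm.stop_container(cid)           # always release, success or failure
        if state == "COMPLETE":
            return cid
    return None


attempts = []


def flaky_task():
    # Fails on the first attempt only, to exercise the retry path.
    attempts.append(1)
    if len(attempts) == 1:
        raise RuntimeError("simulated task failure")


nm = ToyNodeManager()
winner = run_task(nm, flaky_task, ["c1", "c2", "c3"])
```

The point of the sketch is the unconditional `stop_container` call: both the failed and the successful container are released explicitly, and the unused third container is never started.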
Four Summary
When a user wants to write an application that runs on YARN, two components usually have to be implemented, a client and an ApplicationMaster. The client is used mainly to submit and manage the application; the ApplicationMaster is responsible for splitting the application into tasks, scheduling them, monitoring them, and so on.

