Hadoop YARN communication protocols

Source: Internet
Author: User
1 Introduction
RPC protocols are the "main arteries" connecting YARN's components, so understanding which RPC protocol each pair of components uses is a good way to learn the framework. In YARN, any two components that need to communicate share exactly one RPC protocol. For every RPC protocol, one end of the communication is the client and the other end is the server, and the client always initiates the connection to the server. YARN therefore uses a pull-based communication model.
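The pull-based model described above can be illustrated with a minimal sketch. This is a toy model, not Hadoop code: the `Server` and `Client` classes, the `heartbeat` method, and the command string are all hypothetical names chosen for illustration. The point it shows is that the server never opens a connection itself; it can only hand over queued work in the response to a call the client makes.

```python
from collections import deque


class Server:
    """Toy RPC server: it never contacts clients, it only answers their calls."""

    def __init__(self):
        self.pending = {}      # client id -> commands waiting to be picked up
        self.last_status = {}  # client id -> most recent reported status

    def queue_command(self, client_id, command):
        # The server cannot push, so it parks the command until the client calls.
        self.pending.setdefault(client_id, deque()).append(command)

    def heartbeat(self, client_id, status):
        # RPC handler: record the report and return everything queued so far.
        self.last_status[client_id] = status
        queue = self.pending.setdefault(client_id, deque())
        commands = list(queue)
        queue.clear()
        return commands


class Client:
    """Toy RPC client: it always initiates the exchange."""

    def __init__(self, client_id, server):
        self.client_id = client_id
        self.server = server

    def poll(self):
        return self.server.heartbeat(self.client_id, status="ok")


server = Server()
node = Client("node-1", server)
server.queue_command("node-1", "stop container_01")
first = node.poll()   # the command is delivered only when the client asks
second = node.poll()  # nothing new is waiting
print(first, second)
```

The command sits in the server's queue until the client's next call, which is the source of the latency trade-off discussed in the summary.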
Protocol Types
YARN is mainly composed of the following RPC protocols, which connect its components. In the figure, the component an arrow points to is the RPC server, and the component at the tail of the arrow is the RPC client.
Figure: Main communication protocols between YARN components
Protocol between JobClient (the job submission client) and RM (ApplicationClientProtocol): JobClient submits applications and queries application status through this RPC protocol.
Protocol between Admin (the administrator) and RM (ResourceManagerAdministrationProtocol): Admin updates system configuration through this RPC protocol, such as the node blacklist and whitelist and user queue permissions.
Protocol between AM and RM (ApplicationMasterProtocol): AM registers and unregisters itself with RM through this RPC protocol, and requests resources for each task.
Protocol between AM and NM (ContainerManagementProtocol): AM asks NM to start or stop containers through this RPC protocol, and obtains information such as the status of each container.
Protocol between NM and RM (ResourceTracker): NM registers with RM through this RPC protocol, and periodically sends heartbeats to report the resource usage and container running status of the current node.
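The list above can be condensed into a small lookup table keyed by (client, server). This is only an illustrative sketch: the `PROTOCOLS` table and the two helper functions are hypothetical and are not part of any Hadoop API; the protocol names are the ones listed above, written in CamelCase.

```python
# (RPC client, RPC server) -> protocol name, taken from the list above.
PROTOCOLS = {
    ("JobClient", "RM"): "ApplicationClientProtocol",
    ("Admin", "RM"): "ResourceManagerAdministrationProtocol",
    ("AM", "RM"): "ApplicationMasterProtocol",
    ("AM", "NM"): "ContainerManagementProtocol",
    ("NM", "RM"): "ResourceTracker",
}


def protocol_for(client, server):
    """Return the protocol a client uses to call a server, or None if there is none."""
    return PROTOCOLS.get((client, server))


def servers_of(client):
    """List the components that a given client connects to."""
    return sorted(s for (c, s) in PROTOCOLS if c == client)


print(protocol_for("NM", "RM"))  # NM is the client, RM is the server
print(protocol_for("RM", "NM"))  # no entry: RM never initiates a call to NM
print(servers_of("AM"))
```

Because the direction is part of the key, the table also makes the client and server roles explicit: RM appears only as a server, and there is one entry per communicating pair.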
Protocol Buffers
To improve Hadoop's backward compatibility and the compatibility between different versions, YARN uses Google's open-source Protocol Buffers as its serialization framework. In addition to serialization and deserialization, Protocol Buffers provides a way to define RPC functions, and all RPC function parameters in YARN are defined with Protocol Buffers. Compared with the Writable serialization used in MRv1, Protocol Buffers gives YARN a significant step forward in backward compatibility and performance. The following proto file defines the ContainerManager RPC protocol using Protocol Buffers:

    service ContainerManagerService {
      rpc startContainer(StartContainerRequestProto) returns (StartContainerResponseProto);
      rpc stopContainer(StopContainerRequestProto) returns (StopContainerResponseProto);
      rpc getContainerStatus(GetContainerStatusRequestProto) returns (GetContainerStatusResponseProto);
    }
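Much of the compatibility benefit comes from the way Protocol Buffers identifies each field on the wire by a number rather than by position, so a reader can skip fields it does not know. The sketch below hand-rolls that idea for integer (varint) fields only. It is not the protobuf library and not YARN code; the function names and the field numbers 1 and 2 are hypothetical, and a real parser handles several other wire types.

```python
def encode_varint(value):
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def decode_varint(data, pos):
    """Decode a varint starting at pos; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def encode_message(fields):
    """Encode {field_number: int}, using wire type 0 (varint) for every field."""
    out = bytearray()
    for number, value in sorted(fields.items()):
        out += encode_varint((number << 3) | 0)  # key = field number + wire type
        out += encode_varint(value)
    return bytes(out)


def decode_message(data, known_fields):
    """Decode varint fields, keeping only the field numbers this reader knows."""
    pos = 0
    message = {}
    while pos < len(data):
        key, pos = decode_varint(data, pos)
        number, wire_type = key >> 3, key & 0x7
        if wire_type != 0:
            raise ValueError("this sketch handles varint fields only")
        value, pos = decode_varint(data, pos)
        if number in known_fields:
            message[number] = value  # unknown fields are skipped, not an error
    return message


# A newer writer sends fields 1 and 2; an older reader only knows field 1.
wire = encode_message({1: 150, 2: 7})
old_view = decode_message(wire, known_fields={1})
new_view = decode_message(wire, known_fields={1, 2})
print(wire.hex(), old_view, new_view)
```

The older reader still decodes the message and simply ignores field 2, which is why adding a field to a request or response does not have to break existing components.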

In addition, there are the following proto files:
applicationmaster_protocol.proto: defines the protocol between AM and RM (ApplicationMasterProtocol).
applicationclient_protocol.proto: defines the protocol between JobClient (the job submission client) and RM (ApplicationClientProtocol).
containermanagement_protocol.proto: defines the protocol between AM and NM (ContainerManagementProtocol).
resourcemanager_administration_protocol.proto: defines the protocol between Admin (the administrator) and RM (ResourceManagerAdministrationProtocol).
yarn_protos.proto: defines the RPC parameters of each protocol.
ResourceTracker.proto: defines the protocol between NM and RM (ResourceTracker).
In addition to the preceding protocols, YARN also uses Protocol Buffers to redefine the protocols in MapReduce:
MRClientProtocol.proto: defines the protocol between JobClient (the job submission client) and MRAppMaster (MRClientProtocol).
mr_protos.proto: defines the parameters of MRClientProtocol.
Summary
YARN's RPC communication uses the pull-based model rather than the push-based model. The pull-based model is easy to implement, but its communication latency is relatively high. The push-based model is more complicated to implement, but it can reduce the communication latency between components.
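The latency cost of pulling can be estimated with a little arithmetic. The sketch below assumes a client that polls at fixed times 0, T, 2T, ... and messages that become ready at evenly spread moments; the 1000 ms interval is an arbitrary illustrative value, not a YARN default, and the function names are hypothetical. Under those assumptions a message waits about half a polling interval on average, whereas an idealized push would deliver it as soon as it is ready.

```python
def pull_delivery_time(ready_at, interval):
    """Time of the first poll at or after the moment a message is ready.

    Polls are assumed to happen at 0, interval, 2 * interval, ...
    All times are integers (for example milliseconds).
    """
    polls_needed = -(-ready_at // interval)  # integer ceiling division
    return polls_needed * interval


def average_pull_latency(ready_times, interval):
    """Mean wait between a message becoming ready and the poll that fetches it."""
    waits = [pull_delivery_time(t, interval) - t for t in ready_times]
    return sum(waits) / len(waits)


INTERVAL_MS = 1000  # illustrative polling interval, not a YARN default
ready_times = range(INTERVAL_MS)  # one message ready at each millisecond
latency = average_pull_latency(ready_times, INTERVAL_MS)
print(latency)  # roughly half of INTERVAL_MS
```

Shortening the polling interval lowers this wait but increases the number of RPC calls the server has to answer, which is the practical side of the trade-off.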




