Introduction to the RPC Mechanism of Hadoop

Source: Internet
Author: User

 

 

As a distributed computing framework, Hadoop necessarily relies on RPC. Hadoop does not use the RPC technology provided in the JDK; it implements its own RPC mechanism.

The RPC logic of hadoop can be divided into three parts:

 

A. Communication Protocol

B. Server

C. Client

 

The structure is as follows:

 

 

1 Communication Protocol

The communication protocol here is not a network communication protocol but the interface through which the client and the server communicate. Different functions require different interfaces. VersionedProtocol is the superinterface of all communication protocols. It defines only one method, getProtocolVersion, which returns the version of the protocol.
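As a rough illustration of what "protocol as an interface" means, the sketch below declares a stand-in for the version-query superinterface and one business protocol that extends it. All names here (VersionedProtocolSketch, HeartbeatProtocol, HeartbeatService, versionsMatch) are hypothetical and are not Hadoop classes; the real interfaces live in org.apache.hadoop.ipc and may differ in detail.

```java
import java.io.IOException;

// Illustrative sketch only: these types are stand-ins, not Hadoop's own classes.
final class ProtocolSketch {

    // Stand-in for the protocol superinterface: a single version-query method.
    interface VersionedProtocolSketch {
        long getProtocolVersion(String protocol, long clientVersion) throws IOException;
    }

    // Hypothetical business protocol: one interface per functional area.
    interface HeartbeatProtocol extends VersionedProtocolSketch {
        long VERSION_ID = 1L;

        String sendHeartbeat(String nodeId) throws IOException;
    }

    // Server-side implementation of the hypothetical protocol.
    static final class HeartbeatService implements HeartbeatProtocol {
        @Override
        public long getProtocolVersion(String protocol, long clientVersion) throws IOException {
            if (!HeartbeatProtocol.class.getName().equals(protocol)) {
                throw new IOException("Unknown protocol: " + protocol);
            }
            return VERSION_ID;
        }

        @Override
        public String sendHeartbeat(String nodeId) {
            return "ack:" + nodeId;
        }
    }

    // Client-side check: only talk to a server that reports the same version.
    static boolean versionsMatch(VersionedProtocolSketch server, Class<?> protocol, long clientVersion) {
        try {
            return server.getProtocolVersion(protocol.getName(), clientVersion) == clientVersion;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        HeartbeatService service = new HeartbeatService();
        System.out.println(versionsMatch(service, HeartbeatProtocol.class, HeartbeatProtocol.VERSION_ID));
        System.out.println(service.sendHeartbeat("dn-1"));
    }
}
```

The point of the version method is that client and server can detect a mismatch before any business method is called.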

 

2 Server

The server listens for client requests on a socket, reads the method and parameters that the client wants to call, invokes the corresponding method through Java reflection, and returns the result to the client.
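The reflection step can be sketched in isolation as follows. This is a minimal, in-process illustration under the assumption that the method name, parameter types, and arguments have already been read from the socket; Calculator and dispatch are hypothetical names, not part of Hadoop.

```java
import java.lang.reflect.Method;

// Illustrative sketch only: shows reflective dispatch, without any networking.
final class ReflectiveDispatchSketch {

    // Hypothetical service object living on the server.
    static final class Calculator {
        public int add(int a, int b) {
            return a + b;
        }
    }

    // Looks up the requested method by name and parameter types, then invokes it.
    static Object dispatch(Object target, String methodName, Class<?>[] paramTypes, Object[] args) {
        try {
            Method method = target.getClass().getMethod(methodName, paramTypes);
            return method.invoke(target, args);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Call failed: " + methodName, e);
        }
    }

    public static void main(String[] args) {
        Object result = dispatch(new Calculator(), "add",
                new Class<?>[] { int.class, int.class }, new Object[] { 2, 3 });
        System.out.println(result);
    }
}
```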

The following describes the key classes on the server:

2.1 org.apache.hadoop.ipc.Server

This abstract class implements the framework for listening to clients, processing requests, and returning results to clients. The concrete request processing is left to its implementation class.

2.2 org.apache.hadoop.ipc.RPC.Server

This class is the implementation of org.apache.hadoop.ipc.Server and mainly implements the processing of client requests.

 

When the server starts, it launches several kinds of threads to serve client requests.

1) Listener thread

This thread listens for client requests and receives their data, wraps the received data in a Call instance, and places it in the request queue.

2) Handler thread

This thread takes a Call from the request queue and invokes the abstract method

    public abstract Writable call(Class<?> protocol, Writable param, long receiveTime)

to process the request, then returns the result to the client.

3) Responder thread

Response data is normally written back to the client by the handler thread. If some of the data could not be written in full, the responder thread sends the remainder to the client.
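The division of labor between the three threads can be sketched with two in-memory queues. This is a simplified model, not Hadoop's implementation: there are no sockets, a single handler is used, "processing" is just upper-casing the request, and the writeLimit parameter is a made-up stand-in for "how much the handler manages to write before it has to hand over to the responder".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Illustrative sketch only: listener -> request queue -> handler -> (responder for leftovers).
final class ServerThreadsSketch {

    static final class Call {
        final String param;
        final StringBuilder sent = new StringBuilder(); // data "written" to the client so far
        String pending = "";                            // part of the response not yet written

        Call(String param) {
            this.param = param;
        }
    }

    private static final Call STOP = new Call(""); // end-of-work marker for the queues

    static List<String> serve(List<String> requests, int writeLimit) {
        BlockingQueue<Call> requestQueue = new LinkedBlockingQueue<>();
        BlockingQueue<Call> responseQueue = new LinkedBlockingQueue<>();
        List<Call> calls = new ArrayList<>();
        for (String request : requests) {
            calls.add(new Call(request));
        }

        // Listener: turns incoming data into Call instances and queues them.
        Thread listener = new Thread(() -> {
            requestQueue.addAll(calls);
            requestQueue.add(STOP);
        }, "listener");

        // Handler: processes each call and writes as much of the response as it can.
        Thread handler = new Thread(() -> {
            try {
                for (Call c = requestQueue.take(); c != STOP; c = requestQueue.take()) {
                    String response = c.param.toUpperCase();
                    int n = Math.min(writeLimit, response.length());
                    c.sent.append(response, 0, n);
                    if (n < response.length()) {
                        c.pending = response.substring(n);
                        responseQueue.add(c); // leftover data goes to the responder
                    }
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                responseQueue.add(STOP);
            }
        }, "handler");

        // Responder: finishes writing responses the handler could not complete.
        Thread responder = new Thread(() -> {
            try {
                for (Call c = responseQueue.take(); c != STOP; c = responseQueue.take()) {
                    c.sent.append(c.pending);
                    c.pending = "";
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "responder");

        Thread[] threads = { listener, handler, responder };
        for (Thread t : threads) {
            t.start();
        }
        try {
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }

        List<String> delivered = new ArrayList<>();
        for (Call c : calls) {
            delivered.add(c.sent.toString());
        }
        return delivered;
    }

    public static void main(String[] args) {
        System.out.println(serve(List.of("ping", "heartbeat"), 5));
    }
}
```

With a write limit of 5, the short response is completed by the handler alone, while the longer one is finished by the responder.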

 

3 Client

The RPC client code of Hadoop is essentially one class: org.apache.hadoop.ipc.Client. The client side uses Java's dynamic proxy mechanism to generate a proxy for the server's business interface, sends the invoked business method and its parameters to the server over a socket, and waits for the server's response.

The client call sequence is shown below:


 

1) The user of the RPC client first calls RPC.waitForProxy to obtain a dynamic proxy for the remote service interface. For example, when the DataNode calls the NameNode, the code is as follows:

    this.namenode = (DatanodeProtocol)
        RPC.waitForProxy(DatanodeProtocol.class,
                         DatanodeProtocol.versionID,
                         nameNodeAddr,
                         conf);

2) RPC uses Java's dynamic proxy class, java.lang.reflect.Proxy, and calls Proxy.newProxyInstance to create the proxy instance.

    VersionedProtocol proxy =
        (VersionedProtocol) Proxy.newProxyInstance(
            protocol.getClassLoader(), new Class[] { protocol },
            new Invoker(addr, ticket, conf, factory));

3) Once the proxy has been obtained, business methods are called on it. Invoker implements InvocationHandler, so every business method call passes through

    public Object invoke(Object proxy, Method method, Object[] args)

which forwards the call to the client:

    ObjectWritable value = (ObjectWritable)
        client.call(new Invocation(method, args), address,
                    method.getDeclaringClass(), ticket);
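The proxy-plus-handler pattern in steps 2 and 3 can be tried out without Hadoop. In the sketch below, EchoProtocol, RecordingInvoker, and newProxy are hypothetical names; the handler forwards the call to an in-process object where a real client would serialize the method and arguments and send them over a socket.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

// Illustrative sketch only: a dynamic proxy whose handler stands in for the Invoker.
final class DynamicProxySketch {

    // Hypothetical business interface shared by client and server.
    interface EchoProtocol {
        String echo(String message);
    }

    // Every method called on the proxy arrives in invoke().
    static final class RecordingInvoker implements InvocationHandler {
        private final EchoProtocol remote; // stand-in for the network hop to the server
        String lastMethod;                 // name of the most recent method seen

        RecordingInvoker(EchoProtocol remote) {
            this.remote = remote;
        }

        @Override
        public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
            lastMethod = method.getName();
            // A real RPC client would package (method, args) and send them to the server here.
            return method.invoke(remote, args);
        }
    }

    // Creates a proxy that implements EchoProtocol and routes calls to the handler.
    static EchoProtocol newProxy(InvocationHandler handler) {
        return (EchoProtocol) Proxy.newProxyInstance(
                EchoProtocol.class.getClassLoader(),
                new Class<?>[] { EchoProtocol.class },
                handler);
    }

    public static void main(String[] args) {
        RecordingInvoker invoker = new RecordingInvoker(message -> "echo:" + message);
        EchoProtocol proxy = newProxy(invoker);
        System.out.println(proxy.echo("hi"));
        System.out.println(invoker.lastMethod);
    }
}
```

The caller only sees the business interface; the handler is the single place where the call is turned into a request.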

4) Client.call wraps the parameters in a Call instance, obtains a connection to the server, sends the parameters to the server, and waits synchronously for the server to return the result.

    Call call = new Call(param);
    Connection connection = getConnection(addr, protocol, ticket, call);
    connection.sendParam(call);
    ......
    while (!call.done) {
        call.wait();    // wait for the result; the caller must hold the monitor of call
    }
    ......
    return call.value;
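The synchronous wait in step 4 relies on Java's wait/notify. A minimal sketch of that pattern, assuming a hypothetical Call class with setValue and awaitValue methods and a plain thread standing in for the connection that receives the response, could look like this:

```java
// Illustrative sketch only: a caller blocks on a Call until another thread completes it.
final class CallWaitSketch {

    static final class Call {
        private boolean done;
        private String value;

        // Invoked by the thread that receives the server's response.
        synchronized void setValue(String result) {
            value = result;
            done = true;
            notifyAll(); // wake up the waiting caller
        }

        // Invoked by the calling thread; blocks until the response has arrived.
        synchronized String awaitValue() throws InterruptedException {
            while (!done) { // loop guards against spurious wakeups
                wait();
            }
            return value;
        }
    }

    // Sends a "request" and waits for its result; the receiver thread simulates the server reply.
    static String callAndWait(String param) {
        Call call = new Call();
        Thread receiver = new Thread(() -> call.setValue("result-for-" + param), "receiver");
        receiver.start();
        try {
            return call.awaitValue();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(callAndWait("getBlockLocations"));
    }
}
```

Both methods are synchronized on the Call, which is what allows wait and notifyAll to be used; checking the done flag in a loop keeps the caller correct even if the response arrives before it starts waiting.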

 

 
