Hadoop Learning (4): Summary of the RPC Communication Principles of HDFS


This post begins with a summary of my RPC study notes; the learning process is then described in detail below.

RPC (Remote Procedure Call)

the invocation of an object's method between different Java processes.
One party is called the server, and the other is called the client.
The server provides an object for the client to invoke; the invoked method actually executes on the server side.

RPC is the foundation on which the Hadoop framework runs.
What can we learn from this small RPC example?
1. The object provided by the server must implement an interface, and that interface must extend VersionedProtocol.

2. The only methods the client can call on the object are those declared in the object's interface.
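The two takeaways above can be sketched in plain Java, with no Hadoop dependency; all names here (MyVersionedProtocol, Greeting, GreetingServer, connect) are illustrative, not Hadoop's real API:

```java
// A minimal sketch of the VersionedProtocol idea: every protocol interface
// carries a version number, and the client refuses to talk to a server
// whose version does not match its own.
interface MyVersionedProtocol {
    long getProtocolVersion(String protocol, long clientVersion);
}

interface Greeting extends MyVersionedProtocol {
    String hello(String name);
}

class GreetingServer implements Greeting {
    static final long VERSION = 1L;

    public long getProtocolVersion(String protocol, long clientVersion) {
        return VERSION;
    }

    public String hello(String name) {
        return "hello " + name;  // executes on the "server" side
    }
}

public class VersionCheckDemo {
    // mimics the version handshake an RPC client performs before returning a proxy
    static Greeting connect(Greeting server, long clientVersion) {
        long serverVersion =
                server.getProtocolVersion(Greeting.class.getName(), clientVersion);
        if (serverVersion != clientVersion) {
            throw new RuntimeException("version mismatch: client=" + clientVersion
                    + " server=" + serverVersion);
        }
        return server;
    }

    public static void main(String[] args) {
        Greeting proxy = connect(new GreetingServer(), GreetingServer.VERSION);
        System.out.println(proxy.hello("world"));
    }
}
```

Only methods declared in the Greeting interface are visible through the returned reference, which matches takeaway 2 above.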

To view the derived or implementation classes of a base class or interface: place the cursor on the class name and press Ctrl+T.

To view a method's call hierarchy (find all functions that call it): Ctrl+Alt+H (if the Ubuntu system already occupies this shortcut, right-click the name and choose Open Call Hierarchy; the results appear in the view).

To quickly inspect a class: Ctrl+O lists all of its member variables and methods; F3 jumps to the definition of the name under the cursor.


RPC (Remote Procedure Call) remotely invokes a Java object running in another virtual machine. RPC follows the client/server pattern: using it involves server-side code, client code, and the remote procedure object being invoked.

The operation of HDFS is built on this foundation. This article analyzes the operating mechanism of HDFS by implementing a simple RPC program.

1. First, define the interface of the remotely called class. The interface extends VersionedProtocol, Hadoop's base RPC interface; all RPC communication must go through an interface that implements it, so that the client and server can verify they speak the same protocol version. The class called on the server side must implement such an interface.

```java
package com.rpc;

import org.apache.hadoop.ipc.VersionedProtocol;

public interface MyBizable extends VersionedProtocol {
    // the abstract hello method exposed to clients
    public abstract String hello(String name);
}
```
2. Next, write the remotely called class, which implements the MyBizable interface. Two methods are implemented: hello() and getProtocolVersion().

```java
package com.rpc;

import java.io.IOException;

// Implements the MyBizable interface, overriding hello() and getProtocolVersion()
public class MyBiz implements MyBizable {

    public static long BIZ_VERSION = 123456L;

    @Override
    public String hello(String name) {
        System.out.println("I am MyBiz, I have been called.");
        return "hello " + name;
    }

    @Override
    public long getProtocolVersion(String protocol, long clientVersion)
            throws IOException {
        // return BIZ_VERSION, ensuring the server and client versions match
        return BIZ_VERSION;
    }
}
```
3. With the remotely called object in place, we can write the server-side code; the details are described in the code comments.

```java
package com.rpc;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.ipc.RPC;
import org.apache.hadoop.ipc.RPC.Server;

public class MyServer {

    // the server address and port, declared as final constants
    public static final String SERVER_ADDRESS = "localhost";
    public static final int SERVER_PORT = 1234;

    /**
     * RPC: Remote Procedure Call
     */
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // The key call is RPC.getServer(), which takes four parameters:
        // the Java object being exposed, the server address, the server
        // port, and the configuration. After obtaining the server object,
        // we start it, and the server then listens on the given port
        // for client requests.
        final Server server = RPC.getServer(new MyBiz(), SERVER_ADDRESS, SERVER_PORT, conf);
        server.start();
    }
}
```
4. Finally, we can write the client-side code to invoke the server's method; note that the method itself executes on the server.


```java
package com.rpc;

import java.net.InetSocketAddress;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.ipc.RPC;

public class MyClient {

    /**
     * RPC client
     */
    public static void main(String[] args) throws Exception {
        // RPC.getProxy() takes four parameters: the interface of the called
        // class, the client version number, the server address, and the
        // configuration. The returned proxy object stands in for the
        // server-side object and is implemented internally with
        // java.lang.reflect.Proxy.
        final MyBizable proxy = (MyBizable) RPC.getProxy(
                MyBizable.class,
                MyBiz.BIZ_VERSION,
                new InetSocketAddress(MyServer.SERVER_ADDRESS, MyServer.SERVER_PORT),
                new Configuration());
        // call the interface method through the proxy
        // (the original argument was garbled; "world" is used here for illustration)
        final String result = proxy.hello("world");
        // print the returned result, then close the network connection
        System.out.println(result);
        RPC.stopProxy(proxy);
    }
}
```
Note that the interface passed to RPC.getProxy() above is the interface of the called object. The business methods invoked from the client are declared in the business class's interface, and that interface in turn extends VersionedProtocol.
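As mentioned, the proxy returned by RPC.getProxy() is built internally with java.lang.reflect.Proxy. Here is a minimal sketch of that mechanism, with the remote side replaced by an in-process object for simplicity; the Biz interface and the getProxy() helper are hypothetical names, not Hadoop's API:

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

// The proxy implements the interface, and its InvocationHandler forwards
// every call to the target object. In real Hadoop RPC, the handler would
// serialize the method name and arguments, send them over the network,
// and return the server's reply instead of invoking locally.
interface Biz {
    String hello(String name);
}

public class ProxyDemo {
    static Biz getProxy(final Object remoteObject) {
        return (Biz) Proxy.newProxyInstance(
                Biz.class.getClassLoader(),
                new Class<?>[] { Biz.class },
                new InvocationHandler() {
                    public Object invoke(Object proxy, Method method, Object[] args)
                            throws Exception {
                        // a real RPC client would marshal method.getName()
                        // and args here and wait for the server's response
                        return method.invoke(remoteObject, args);
                    }
                });
    }

    public static void main(String[] args) {
        Biz serverSide = new Biz() {
            public String hello(String name) { return "hello " + name; }
        };
        Biz proxy = getProxy(serverSide);
        System.out.println(proxy.hello("world"));  // looks local, dispatched via the handler
    }
}
```

This is why the client only needs the interface: the concrete class never has to exist on the client side.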

When the above is complete, start the server first and then the client, and observe the output on both sides. Before doing so, run the jps command at the command line to look at the Java processes of the running Hadoop node; the output looks like the following:

In the output we can see a Java process named "MyServer", which is the server-side class of the RPC example we just ran. From this we can infer that the 5 Java processes Hadoop starts should also be RPC servers. Looking at the NameNode source code, we can see that NameNode does indeed create an RPC server (in the initialize() method of the NameNode class; for ease of reading I have copied the source below, focusing on the part that creates the server).

```java
  /**
   * Initialize name-node.
   *
   * @param conf the configuration
   */
  private void initialize(Configuration conf) throws IOException {
    InetSocketAddress socAddr = NameNode.getAddress(conf);
    UserGroupInformation.setConfiguration(conf);
    SecurityUtil.login(conf, DFSConfigKeys.DFS_NAMENODE_KEYTAB_FILE_KEY,
        DFSConfigKeys.DFS_NAMENODE_USER_NAME_KEY, socAddr.getHostName());
    int handlerCount = conf.getInt("dfs.namenode.handler.count", 10);

    // set service-level authorization security policy
    if (serviceAuthEnabled =
          conf.getBoolean(
            ServiceAuthorizationManager.SERVICE_AUTHORIZATION_CONFIG, false)) {
      PolicyProvider policyProvider =
        (PolicyProvider) ReflectionUtils.newInstance(
            conf.getClass(PolicyProvider.POLICY_PROVIDER_CONFIG,
                HDFSPolicyProvider.class, PolicyProvider.class), conf);
      ServiceAuthorizationManager.refresh(conf, policyProvider);
    }

    myMetrics = NameNodeInstrumentation.create(conf);
    this.namesystem = new FSNamesystem(this, conf);

    if (UserGroupInformation.isSecurityEnabled()) {
      namesystem.activateSecretManager();
    }

    // create RPC server
    InetSocketAddress dnSocketAddr = getServiceRpcServerAddress(conf);
    if (dnSocketAddr != null) {
      int serviceHandlerCount = conf.getInt(
          DFSConfigKeys.DFS_NAMENODE_SERVICE_HANDLER_COUNT_KEY,
          DFSConfigKeys.DFS_NAMENODE_SERVICE_HANDLER_COUNT_DEFAULT);
      this.serviceRpcServer = RPC.getServer(this, dnSocketAddr.getHostName(),
          dnSocketAddr.getPort(), serviceHandlerCount, false, conf,
          namesystem.getDelegationTokenSecretManager());
      this.serviceRPCAddress = this.serviceRpcServer.getListenerAddress();
      setRpcServiceServerAddress(conf);
    }
    this.server = RPC.getServer(this, socAddr.getHostName(), socAddr.getPort(),
        handlerCount, false, conf,
        namesystem.getDelegationTokenSecretManager());

    // the rpc-server port can be ephemeral... ensure we have the correct info
    this.serverAddress = this.server.getListenerAddress();
    FileSystem.setDefaultUri(conf, getUri(serverAddress));
    LOG.info("Namenode up at: " + this.serverAddress);

    startHttpServer(conf);
    this.server.start();  // start RPC server
    if (serviceRpcServer != null) {
      serviceRpcServer.start();
    }
    startTrashEmptier(conf);
  }
```
From the above we can see that NameNode itself is a Java process. Observing the first parameter of the RPC.getServer() call in Figure 5-2, we find that it is this, which means NameNode itself is the called object on the server side; that is, the methods in NameNode can be invoked by client code. According to the RPC principle above, the methods NameNode exposes to clients must be declared in its interfaces.
Continuing to the interfaces the NameNode class implements, we can see that NameNode implements ClientProtocol, DatanodeProtocol, NamenodeProtocol, and other interfaces.

Below is an analysis of the functions of these common interfaces implemented by NameNode, and of who invokes their implementations:

ClientProtocol, called by DFSClient
This interface is for the client to invoke. The "client" here is not the code we write ourselves but a Hadoop class called DFSClient. DFSClient calls the methods in ClientProtocol to carry out its operations.
Most of the methods in this interface are operations on HDFS, such as create, delete, mkdirs, rename, and so on.
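As a rough illustration of the kind of namespace operations ClientProtocol declares, here is a toy in-memory namespace; this class is purely hypothetical (the real NameNode tracks this state in FSNamesystem and DFSClient reaches it over RPC):

```java
import java.util.HashSet;
import java.util.Set;

// A toy namespace supporting mkdirs/rename/delete, the style of
// operations ClientProtocol exposes. Not real Hadoop code.
public class ToyNamespace {
    private final Set<String> paths = new HashSet<String>();

    public boolean mkdirs(String src) {
        return paths.add(src);
    }

    public boolean rename(String src, String dst) {
        if (!paths.remove(src)) {
            return false;      // source path does not exist
        }
        paths.add(dst);
        return true;
    }

    public boolean delete(String src) {
        return paths.remove(src);
    }

    public boolean exists(String src) {
        return paths.contains(src);
    }
}
```

The point is that every such mutation happens on the server side; the client only ships the method name and arguments across the wire.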


DatanodeProtocol, called by DataNode
This interface is for DataNode to invoke. DataNode calls the methods in this interface to report the status and block information of its node to NameNode.
NameNode cannot initiate messages to a DataNode; it can only pass instructions back to the DataNode through the return values of the methods in this interface.
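This "reply rides on the return value" pattern can be sketched as follows; the class and method names are invented for illustration and are not the real DatanodeProtocol:

```java
import java.util.ArrayDeque;
import java.util.Deque;

// The "NameNode" never contacts a DataNode on its own; it queues a
// command and hands it back the next time that DataNode heartbeats.
public class HeartbeatDemo {
    private final Deque<String> pendingCommands = new ArrayDeque<String>();

    // NameNode side: remember work to be done by the DataNode
    public void queueCommand(String command) {
        pendingCommands.add(command);
    }

    // called by the "DataNode" on every heartbeat;
    // the return value carries any pending command back to it
    public String sendHeartbeat(String datanodeId) {
        String cmd = pendingCommands.poll();
        return cmd != null ? cmd : "NOOP";
    }
}
```

Because the DataNode heartbeats regularly, the NameNode never needs to open a connection in the other direction.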


NamenodeProtocol, called by SecondaryNameNode
This interface is for SecondaryNameNode to invoke. SecondaryNameNode is dedicated to merging the data in NameNode's edits file into the fsimage.

The analysis approach for the interfaces implemented by the DataNode is much the same: for each specific interface, examine the protocol methods it declares. This is the general experience of reading Hadoop source code. The common MyEclipse shortcut keys listed earlier (Ctrl+T, Ctrl+Alt+H, Ctrl+O, F3) apply here as well; you can also right-click a class name to see all available actions.

You can practice the shortcuts above on the DataNode interfaces and analyze their functions for yourself; the two important DataNode interfaces are InterDatanodeProtocol and ClientDatanodeProtocol. I will wrap up here for today.

