Hadoop source code analysis (3) RPC server initialization structure

Source: Internet
Author: User

Statement: This is original work. If you reprint it, please cite the source. I referred to some online and book materials while writing this article. Please let me know if anything is wrong.

This article is a set of notes I wrote while reading the Hadoop 0.20.2 source for the second time. I ran into many problems along the way and eventually solved most of them by various means. The whole Hadoop system is well designed, and its source code is worth reading for anyone studying distributed systems. I will post the rest of the notes one by one, hoping to help you read the source code and avoid detours.

Directory

4 RPC server (org.apache.hadoop.ipc.Server)

4.1 Server initialization

4 RPC server (org.apache.hadoop.ipc.Server)

4.1 Server initialization

The code that creates and starts the RPC server is in the org.apache.hadoop.hdfs.server.namenode.NameNode.initialize method, so RPC server initialization starts from this method:

this.server = RPC.getServer(this, socAddr.getHostName(), socAddr.getPort(),
                            handlerCount, false, conf);
this.server.start();  //start RPC server

The RPC.getServer method is declared to return a Server. The object actually returned is an instance of the RPC.Server class, and this is the RPC server. In fact, RPC.getServer passes its parameters unchanged to the RPC.Server constructor. Before tracing the RPC.Server constructor any further, let's look at the parameters of RPC.getServer:

RPC.getServer parameter | Value | Description
Object instance | this = NameNode object | NameNode implements many protocol interfaces and can serve as the RPC server instance.
String bindAddress | socAddr.getHostName() = localhost | IP address the NameNode RPC server listens on
int port | socAddr.getPort() = 9000 | Port number the NameNode RPC server listens on
int numHandlers | handlerCount = 10 | Number of Handler threads in the RPC server
boolean verbose | false | Whether to log each remote call
Configuration conf | conf | Global configuration

The RPC.Server class adds three data members to those of its parent class: instance, verbose, and authorize. instance is the object that implements the protocol interfaces. verbose indicates whether each remote procedure call is logged. authorize indicates whether a permission check is performed for each remote procedure call. By default, verbose and authorize are both false, meaning that calls are not logged and no permission check is performed. This is one respect in which Hadoop is insecure. The constructor of the RPC.Server class is as follows:

public Server(Object instance, Configuration conf, String bindAddress,
              int port, int numHandlers, boolean verbose) throws IOException {
  super(bindAddress, port, Invocation.class,
        numHandlers, conf, classNameBase(instance.getClass().getName()));
  this.instance = instance;
  this.verbose = verbose;
  this.authorize = conf.getBoolean(
      ServiceAuthorizationManager.SERVICE_AUTHORIZATION_CONFIG, false);
}

In this constructor, besides initializing the three members, the parent class constructor is called. Two new arguments are passed to it. One is Invocation.class, the data type of the remote call parameters; the other is the class name of instance. classNameBase is applied to the class name because Class<?>.getName returns a fully qualified name such as java.lang.String, so only the last part of the name is kept.
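The effect of classNameBase can be sketched as follows. This is a minimal illustration, not the Hadoop source: the class name ClassNameBaseSketch is hypothetical, and the body is only one plausible way to keep the part of the name after the last dot.

```java
// Hypothetical demo class; only the idea of classNameBase comes from the article.
final class ClassNameBaseSketch {

    // Keep only the part of a fully qualified class name after the last dot.
    // A name without any dot is returned unchanged.
    static String classNameBase(String className) {
        int lastDot = className.lastIndexOf('.');
        return lastDot < 0 ? className : className.substring(lastDot + 1);
    }

    public static void main(String[] args) {
        // prints "String"
        System.out.println(classNameBase(String.class.getName()));
        // prints "NameNode"
        System.out.println(
            classNameBase("org.apache.hadoop.hdfs.server.namenode.NameNode"));
    }
}
```

For the NameNode, the short name "NameNode" is therefore what reaches the parent constructor as serverName.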

The org.apache.hadoop.ipc.Server class is the core of the RPC server. Its important data members are as follows:

Member

Description

static ThreadLocal<Server> SERVER

The RPC server object currently running in this thread, that is, the Server.this set in the run method of Listener, Responder, or Handler. (These are inner classes. An inner class uses the form "ClassName.this" to refer to its enclosing object, whereas plain this refers to the inner object itself.)

Hadoop allows different threads to run different RPC servers. Therefore this static member is wrapped in ThreadLocal, so that each thread has its own value instead of all threads sharing one.

static ConcurrentHashMap<String, Class<?>> PROTOCOL_CACHE

Protocol interface cache. The RPC client's connection sends the protocol name as a string, and the server converts it into a protocol class. This map stores the "protocol name -> protocol class" mapping.

Within the Server class, only getProtocolClass performs get and put operations on PROTOCOL_CACHE. That method is not declared synchronized, and its callers do not synchronize on PROTOCOL_CACHE either, so the get-then-put sequence as a whole is not atomic; ConcurrentHashMap only makes each individual get or put safe. In practice, repeated puts do not matter, because two different protocols cannot share the same name, so every put for a given name stores the same class. PROTOCOL_CACHE is accessed by different threads, so ConcurrentHashMap is used to protect the map's internal data.

For more information about ConcurrentHashMap, see 2.3.4.

static ThreadLocal<Call> CurCall

The method call currently being processed; it is used in the run method of the Handler thread. It is a ThreadLocal for the same reason as SERVER: to support multiple threads.

String bindAddress

RPC server listening address, such as localhost

int port

RPC server listening port, such as 9000

int handlerCount

Number of Handler threads

Class<? extends Writable> paramClass

The wrapper type sent by the client, which contains the remote method name, the parameter type list, and the parameter list; usually org.apache.hadoop.ipc.RPC.Invocation

BlockingQueue<Call> callQueue

The Listener reads the method calls from all connections and puts them into callQueue; the Handler threads then take them out for parallel processing. A thread-safe BlockingQueue is therefore used here.

For BlockingQueue, see 2.3.5.

List<Connection> connectionList = Collections.synchronizedList(new LinkedList<Connection>())

Like ConcurrentHashMap, this connectionList is a list whose internal data structure is thread-safe, but you still need to synchronize your own business logic; see 2.3.6.

connectionList is accessed both by the thread running the Server and by the Listener thread. Because multiple threads access a shared resource, a thread-safe collection is required.

In addition, some compound operations on connectionList must be synchronized explicitly. In fact, the code that uses connectionList wraps such operations in synchronized (connectionList).

Listener listener

The Listener thread, which listens for connections from RPC clients and reads their call data

Responder responder

The Responder thread, which sends the return value of a method call back to the client

int numConnections

Number of connections the RPC server currently has

Handler[] handlers

The Handler threads, which execute the remote method calls and handle their return values
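The unsynchronized get-then-put pattern described for PROTOCOL_CACHE can be sketched as follows. This is an illustration under assumptions, not the Hadoop code: the class name ProtocolCacheSketch is hypothetical, and Class.forName stands in for whatever class lookup the real getProtocolClass performs.

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical demo class illustrating a "protocol name -> protocol class" cache.
final class ProtocolCacheSketch {

    private static final ConcurrentHashMap<String, Class<?>> PROTOCOL_CACHE =
        new ConcurrentHashMap<String, Class<?>>();

    // Not synchronized: two threads may both miss the cache and both call put.
    // That is harmless here, because the same name always maps to the same class.
    static Class<?> getProtocolClass(String protocolName) {
        Class<?> protocol = PROTOCOL_CACHE.get(protocolName);
        if (protocol == null) {
            try {
                // Assumption: Class.forName is a stand-in for the real lookup.
                protocol = Class.forName(protocolName);
            } catch (ClassNotFoundException e) {
                throw new IllegalArgumentException("Unknown protocol " + protocolName, e);
            }
            PROTOCOL_CACHE.put(protocolName, protocol);
        }
        return protocol;
    }

    public static void main(String[] args) {
        Class<?> first = getProtocolClass("java.lang.Runnable");
        Class<?> second = getProtocolClass("java.lang.Runnable");
        // prints "true": both lookups yield the same Class object
        System.out.println(first == second);
    }
}
```

ConcurrentHashMap keeps the map itself consistent under concurrent access, while the idempotent put is what makes the missing lock around the whole method acceptable.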

From these data members we can see that a Server holds a number of Server.Connection objects, one Listener thread, one Responder thread, and many Handler threads.

Now let's look at how the org.apache.hadoop.ipc.Server constructor sets these data members and constructs an RPC server. The Server constructor code is as follows:

protected Server(String bindAddress, int port,
                 Class<? extends Writable> paramClass, int handlerCount,
                 Configuration conf, String serverName)
  throws IOException {
  this.bindAddress = bindAddress;
  this.conf = conf;
  this.port = port;
  this.paramClass = paramClass;
  this.handlerCount = handlerCount;
  this.socketSendBufferSize = 0;
  this.maxQueueSize = handlerCount * MAX_QUEUE_SIZE_PER_HANDLER;
  this.callQueue  = new LinkedBlockingQueue<Call>(maxQueueSize);
  this.maxIdleTime = 2*conf.getInt("ipc.client.connection.maxidletime", 1000);
  this.maxConnectionsToNuke = conf.getInt("ipc.client.kill.max", 10);
  this.thresholdIdleConnections = conf.getInt("ipc.client.idlethreshold", 4000);

  // Start the listener here and let it bind to the port
  listener = new Listener();
  this.port = listener.getAddress().getPort();

  this.rpcMetrics = new RpcMetrics(serverName,
                        Integer.toString(this.port), this);
  this.tcpNoDelay = conf.getBoolean("ipc.server.tcpnodelay", false);

  // Create the responder here
  responder = new Responder();
}

The main operations here are constructing the Listener thread, obtaining the this.port value, and constructing the Responder thread. The initialization of the other Server members is self-explanatory.

The Server.Listener class is a thread class that listens for client connections and reads the ConnectionHeader and the remote procedure call data sent by the client. Listener uses NIO to manage socket connections and data transfer, so a single thread in the RPC server is enough to handle all connections and read all client requests. A program that manages connections with a Selector needs only one Selector object. A traditional socket server, by contrast, has one ServerSocket that accepts many connection requests and creates a thread per connection to handle that socket. It is advisable to run a Selector in a single thread: a Selector can handle a large number of socket operations efficiently, whereas spreading multiple selectors across multiple threads adds synchronization burden and thread-switching overhead.

As described above, the Listener is responsible both for accepting socket connections and for reading the remote call data, and it does all of this work itself. This inevitably adds to the Listener's load and can make the Listener thread a bottleneck; in fact, the Hadoop development community has also been discussing this issue.

After the Listener reads a method call, it puts the call into Server.callQueue, and a Handler invokes the actual method. The result is then passed to Server.Responder, which returns it to the client.
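The hand-off from Listener to Handler through callQueue can be sketched as a small producer/consumer program. This is a simplified illustration, not the Hadoop implementation: the class CallQueueSketch, the STOP marker, the queue capacity of 16, and the use of integers as "calls" are all assumptions made for the example, and the calling thread plays the Listener role.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical demo: one producer (the "listener") feeds a bounded queue,
// several consumers (the "handlers") drain it in parallel.
final class CallQueueSketch {

    private static final int STOP = -1; // hypothetical marker telling a handler to exit

    static List<String> process(int numCalls, int handlerCount) {
        final BlockingQueue<Integer> callQueue = new LinkedBlockingQueue<Integer>(16);
        final List<String> responses = Collections.synchronizedList(new ArrayList<String>());

        Thread[] handlers = new Thread[handlerCount];
        for (int i = 0; i < handlerCount; i++) {
            handlers[i] = new Thread(() -> {
                try {
                    while (true) {
                        int callId = callQueue.take(); // blocks while the queue is empty
                        if (callId == STOP) {
                            return;
                        }
                        responses.add("response to call " + callId);
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }, "handler " + i);
            handlers[i].start();
        }

        try {
            for (int callId = 0; callId < numCalls; callId++) {
                callQueue.put(callId); // blocks while the queue is full
            }
            for (int i = 0; i < handlerCount; i++) {
                callQueue.put(STOP); // one stop marker per handler
            }
            for (Thread handler : handlers) {
                handler.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return responses;
    }

    public static void main(String[] args) {
        // prints "20": every queued call was handled exactly once
        System.out.println(process(20, 3).size());
    }
}
```

The order of the responses depends on thread scheduling, which is why the example reports only their count.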

Before describing how the Listener is constructed, let's look at its important data members:

Member

Description

ServerSocketChannel acceptChannel

Server socket

Selector selector

Manages the server socket and the sockets created by the server socket

InetSocketAddress address

The Listener address, which is in fact the RPC server address, localhost:9000

The following is the Listener constructor:

public Listener() throws IOException {
  address = new InetSocketAddress(bindAddress, port);
  // Create a new server socket and set to non blocking mode
  acceptChannel = ServerSocketChannel.open();
  acceptChannel.configureBlocking(false);

  // Bind the server socket to the local host and port
  bind(acceptChannel.socket(), address, backlogLength);
  port = acceptChannel.socket().getLocalPort(); //Could be an ephemeral port
  // create a selector;
  selector= Selector.open();

  // Register accepts on the server socket with the selector.
  acceptChannel.register(selector, SelectionKey.OP_ACCEPT);
  this.setName("IPC Server listener on " + port);
  this.setDaemon(true);
}

The ServerSocketChannel.open method is called to obtain a channel-based server socket, which is assigned to this.acceptChannel. This is the server socket on which the RPC server listens for and accepts client connections. acceptChannel is then bound to the specified RPC server address, such as localhost:9000.

To register a ServerSocketChannel with a Selector, the ServerSocketChannel must be set to non-blocking mode. Likewise, a SocketChannel must be configured as non-blocking before it is registered with a Selector. Non-blocking means that the accept operation of the server socket and the read operation of a socket return immediately, whether or not a connection is pending or data is readable. Non-blocking mode causes no problem when working with a Selector, because the selector waits until an operation is ready and only then notifies the user thread.

Selector.open is called to create a selector, and acceptChannel is then registered with this selector for the SelectionKey.OP_ACCEPT operation. This completes the construction of the Listener.

The Listener constructor binds acceptChannel. After binding, the actual port may differ from the requested one: as the comment in the code notes, it could be an ephemeral port, which is what the operating system assigns when the requested port is 0. Therefore the actual port number is saved back to Server.port.
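The setup steps of the Listener constructor, including the ephemeral port case, can be tried out with plain java.nio. This is a minimal sketch, not the Hadoop code: the class ListenerSetupSketch and the method bindAndRegister are hypothetical names, the loopback address replaces bindAddress, and the channel is closed immediately instead of being served.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;

// Hypothetical demo of the open / non-blocking / bind / register sequence.
final class ListenerSetupSketch {

    // Binds a non-blocking server channel, registers it for OP_ACCEPT,
    // and returns the port that was actually bound.
    static int bindAndRegister(int requestedPort) {
        try (ServerSocketChannel acceptChannel = ServerSocketChannel.open();
             Selector selector = Selector.open()) {
            // Must be non-blocking before it can be registered with a selector.
            acceptChannel.configureBlocking(false);
            acceptChannel.socket().bind(
                new InetSocketAddress(InetAddress.getLoopbackAddress(), requestedPort));
            // Ask the selector to report incoming connections on this channel.
            SelectionKey key = acceptChannel.register(selector, SelectionKey.OP_ACCEPT);
            if (!key.isValid()) {
                throw new IllegalStateException("registration failed");
            }
            // With requestedPort == 0 this is the ephemeral port chosen by the OS.
            return acceptChannel.socket().getLocalPort();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Requesting port 0 yields whatever free port the OS picks.
        System.out.println("bound to port " + bindAndRegister(0));
    }
}
```

Reading the port back from the socket after binding, rather than trusting the requested value, mirrors what the Listener constructor does.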

After constructing the Listener, the Server constructor constructs the Server.Responder object. The job of Server.Responder is to send the return values or error messages of remote procedure calls back to the RPC client. Like Listener, Responder uses NIO, so it needs only one thread to send all responses to the clients.

Server.Responder has only two members: writeSelector, of type Selector, and pending, of type int. writeSelector manages all the output channels that send data to RPC clients. pending is the number of remote procedure calls whose return values are waiting to be processed. The Responder constructor simply calls Selector.open to create a selector, assigns it to writeSelector, and initializes pending to 0.

Let's return to the code at the beginning of this section. The call to getServer returns an RPC.Server object, which is constructed through the process described above. The second line of code calls this.server.start. RPC.Server does not define this method; it is defined in the Server class. The start method is simple: start the Responder thread, start the Listener thread, then create handlerCount Handler threads and start them one by one. The method is declared synchronized, so it is thread-safe.
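The start order just described can be sketched as follows. This is an illustration only: the class StartSketch is hypothetical, the threads run empty placeholder bodies, and the returned list exists purely so that the order can be observed.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo of a synchronized start method that launches
// the responder, the listener, and then the handlers in turn.
final class StartSketch {

    private final int handlerCount;
    private final List<String> startOrder = new ArrayList<String>();

    StartSketch(int handlerCount) {
        this.handlerCount = handlerCount;
    }

    // synchronized: two threads calling start() on the same object cannot interleave.
    synchronized List<String> start() {
        launch("responder");
        launch("listener");
        for (int i = 0; i < handlerCount; i++) {
            launch("handler " + i);
        }
        return new ArrayList<String>(startOrder);
    }

    private void launch(String name) {
        Thread thread = new Thread(() -> { /* placeholder for the thread's real work */ }, name);
        thread.setDaemon(true);
        thread.start();
        startOrder.add(name);
    }

    public static void main(String[] args) {
        // prints "[responder, listener, handler 0, handler 1]"
        System.out.println(new StartSketch(2).start());
    }
}
```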

Once these threads have started, the whole RPC server is up and running. All the work is therefore done in the Responder, Listener, and Handler threads.

For the RPC server, the first to act must be Server.Listener, because a client's socket connection has to be accepted and the remote method call data sent by the client has to be received before anything else can happen. Only then can a Handler thread execute the method call and the Responder send the result back to the client. My source code analysis follows the execution flow as closely as possible, so we will analyze the Listener thread first, then the Handler thread, and finally the Responder thread. That is the content of the next section.

(End)
