RPC in Plain Terms: The In-Depth Part

Source: Internet
Author: User

----------------------------------------------------------------------------------------------

The "In-Depth" part focuses on the functional goals of RPC and on implementation considerations: what functionality should a basic RPC framework provide, what requirements must it meet, and how can it be implemented?

RPC Functional Goals

The primary goal of RPC is to make distributed computing (applications) easier to build, offering powerful remote invocation without losing the semantic simplicity of local calls. To achieve this, the RPC framework needs to provide a transparent calling mechanism so that the user does not have to distinguish explicitly between local calls and remote calls. The previous introductory part gave an implementation structure based on stubs. Below we refine the implementation of that stub structure.

RPC Call Classification

RPC calls are divided into the following two types:

    1. Synchronous invocation
       The client waits for the call to finish executing and return its result.
    2. Asynchronous invocation
       The client does not have to wait for the result after making the call, but can still obtain the result through a callback notification.
       If the client does not care about the result at all, the call becomes a one-way asynchronous call, which returns no result.

The distinction between synchronous and asynchronous calls is whether the client waits for the server to finish executing and return the result.
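The three call styles can be sketched with the JDK's CompletableFuture. This is only an illustration: CallStylesSketch and remoteHi are hypothetical names, and remoteHi runs locally as a stand-in for what a real framework would execute on the server.

```java
import java.util.concurrent.CompletableFuture;

// Hypothetical sketch: one operation exposed in three call styles.
class CallStylesSketch {
    // Stand-in for the remote execution; a real framework would go over the network.
    static String remoteHi(String name) {
        return "hi, " + name;
    }

    // Synchronous: the caller blocks until the result is available.
    static String callSync(String name) {
        return remoteHi(name);
    }

    // Asynchronous: the caller gets a future at once and may attach a callback to it.
    static CompletableFuture<String> callAsync(String name) {
        return CompletableFuture.supplyAsync(() -> remoteHi(name));
    }

    // One-way: fire and forget; no result is ever delivered to the caller.
    static void callOneWay(String name) {
        CompletableFuture.runAsync(() -> remoteHi(name));
    }

    public static void main(String[] args) {
        System.out.println(callSync("sync"));
        // The callback prints the result; join() only keeps the demo alive until it has run.
        callAsync("async").thenAccept(System.out::println).join();
        callOneWay("one-way");
    }
}
```

The caller of callAsync is free to do other work between issuing the call and the callback firing, which is the practical benefit of the asynchronous style.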

RPC Structure Breakdown

The introductory part gave a coarse-grained conceptual structure for an RPC implementation. Here we refine it further and identify the components it should comprise, as shown in the figure.

The RPC server exports the remote interface methods through RpcServer, and the client imports them through RpcClient. The client calls a remote interface method just like a local method: the RPC framework provides a proxy implementation of the interface, and the actual invocation is delegated to the proxy, RpcProxy. The proxy encapsulates the invocation information and hands the call to RpcInvoker for actual execution. The client-side RpcInvoker uses the connector RpcConnector to maintain the channel RpcChannel to the server, uses RpcProtocol to perform protocol encoding (encode), and sends the encoded request message through the channel to the server.

On the server, the acceptor RpcAcceptor receives the call request from the client and uses the same RpcProtocol to perform protocol decoding (decode). The decoded call information is passed to RpcProcessor, which controls how the call is processed, and the call is finally delegated to RpcInvoker, which actually executes it and returns the result.

RPC Component Responsibilities

Having broken the RPC implementation structure down into components, we now describe the responsibilities of each component in detail.

  1. RpcServer
     Responsible for exporting (export) the remote interface
  2. RpcClient
     Responsible for importing (import) the proxy implementation of the remote interface
  3. RpcProxy
     The proxy implementation of the remote interface
  4. RpcInvoker
     Client-side implementation: responsible for encoding the call information, sending the call request to the server, and waiting for the call result to return
     Server-side implementation: responsible for invoking the concrete implementation of the server-side interface and returning the call result
  5. RpcProtocol
     Responsible for protocol encoding and decoding
  6. RpcConnector
     Responsible for maintaining the connection channel between the client and the server and for sending data to the server
  7. RpcAcceptor
     Responsible for receiving client requests and returning the request results
  8. RpcProcessor
     Responsible for controlling the call process on the server, including managing the call thread pool, timeouts, and so on
  9. RpcChannel
     The data transmission channel
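How these responsibilities fit together can be sketched with a few interfaces wired into an in-memory loopback. The component names come from the list above, but every method signature, the "method:argument" text format, and the collapsing of RpcAcceptor and RpcProcessor into a single lambda are assumptions made for illustration.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of the component roles, wired together in one process.
class ComponentsSketch {
    // Encodes and decodes call information; here a trivial "method:argument" text format.
    interface RpcProtocol {
        byte[] encode(String method, String arg);
        String[] decode(byte[] message);
    }

    // Carries request bytes to the other side and returns the reply bytes.
    interface RpcChannel {
        byte[] send(byte[] request);
    }

    // Executes a call: on the client by sending it, on the server by running the implementation.
    interface RpcInvoker {
        String invoke(String method, String arg);
    }

    static final RpcProtocol PROTOCOL = new RpcProtocol() {
        public byte[] encode(String method, String arg) {
            return (method + ":" + arg).getBytes(StandardCharsets.UTF_8);
        }
        public String[] decode(byte[] message) {
            return new String(message, StandardCharsets.UTF_8).split(":", 2);
        }
    };

    static String roundTrip(String method, String arg) {
        // Server side: the invoker runs the real implementation.
        RpcInvoker serverInvoker = (m, a) -> m.equals("hi") ? "hi, " + a : "unknown method";
        // Acceptor and processor collapsed into one step: decode, invoke, return the result bytes.
        RpcChannel channel = request -> {
            String[] call = PROTOCOL.decode(request);
            return serverInvoker.invoke(call[0], call[1]).getBytes(StandardCharsets.UTF_8);
        };
        // Client side: the invoker encodes the call and sends it through the channel.
        RpcInvoker clientInvoker = (m, a) ->
                new String(channel.send(PROTOCOL.encode(m, a)), StandardCharsets.UTF_8);
        return clientInvoker.invoke(method, arg);
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("hi", "Bob")); // prints "hi, Bob"
    }
}
```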

RPC Implementation Analysis

Having broken down the components and divided their responsibilities, we now take an implementation of this conceptual RPC framework model on the Java platform as an example and analyze in detail the factors an implementation needs to consider.

Export Remote Interface

Exporting a remote interface means that only the exported interface can be called remotely, and an interface that is not exported cannot. The code snippet for exporting an interface in Java might look like this:

    DemoService demo = new ...;
    RpcServer server = new ...;
    server.export(DemoService.class, demo, options);

We can export the entire interface, or we can only export some methods in the interface at a finer granularity, such as:

    // Export only the method with the signature hi(String s) in DemoService
    server.export(DemoService.class, demo, "hi", new Class<?>[] { String.class }, options);

Java has one special kind of call, the polymorphic call: an interface may have multiple implementations, so which one does a remote invocation actually call? For a local call this is resolved implicitly by the reference polymorphism the JVM provides, but for RPC, a cross-process call cannot resolve it implicitly. If the DemoService interface above has two implementations, the implementations need to be given distinct tags when they are exported, such as:

    DemoService demo = new ...;
    DemoService demo2 = new ...;
    RpcServer server = new ...;
    server.export(DemoService.class, demo, options);
    server.export("demo2", DemoService.class, demo2, options);

Here demo2 is a second implementation, which we export under the tag "demo2". A remote call must then pass the same tag in order to reach the correct implementation class, which resolves the semantics of polymorphic calls.
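On the server, tagged export can be reduced to a lookup table keyed by interface name and tag. The sketch below assumes that design; ExportRegistrySketch, the "#" key separator, and the lookup method are hypothetical, and the options argument from the snippets above is left out.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: exported implementations keyed by interface name plus tag.
class ExportRegistrySketch {
    interface DemoService {
        String hi(String s);
    }

    // Key is "interfaceName#tag"; the empty tag stands for the default implementation.
    private final Map<String, Object> exported = new ConcurrentHashMap<>();

    <T> void export(Class<T> type, T impl) {
        export("", type, impl);
    }

    <T> void export(String tag, Class<T> type, T impl) {
        exported.put(type.getName() + "#" + tag, impl);
    }

    // Server-side lookup; returns null when nothing was exported under that interface and tag.
    <T> T lookup(String tag, Class<T> type) {
        return type.cast(exported.get(type.getName() + "#" + tag));
    }

    public static void main(String[] args) {
        DemoService demo = s -> "demo: " + s;
        DemoService demo2 = s -> "demo2: " + s;
        ExportRegistrySketch server = new ExportRegistrySketch();
        server.export(DemoService.class, demo);
        server.export("demo2", DemoService.class, demo2);
        System.out.println(server.lookup("", DemoService.class).hi("x"));      // demo: x
        System.out.println(server.lookup("demo2", DemoService.class).hi("x")); // demo2: x
    }
}
```

Because an unexported interface or tag simply has no entry, the same table also enforces the rule that only exported interfaces can be called remotely.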

Importing the Remote Interface and the Client Proxy

Importing is the counterpart of exporting a remote interface: for client code to be able to make a call, it must obtain the method or procedure definitions of the remote interface. At present, most cross-language RPC frameworks use a code generator to generate stub code from an IDL definition, so the actual import is done by the code generator at compile time. The cross-language RPC frameworks I have used, such as CORBA, WebService, ICE, and Thrift, all work this way.

Code generation is the inevitable choice for cross-language RPC frameworks, whereas RPC within a single language platform can be implemented by sharing interface definitions. The code snippet for importing an interface in Java might look like this:

    RpcClient client = new ...;
    DemoService demo = client.refer(DemoService.class);
    demo.hi("How are you?");

In Java, 'import' is a keyword, so the snippet uses refer to express importing an interface. Importing here is essentially also a code generation technique, but the code is generated at run time, which looks more concise to the user than generation at static compile time. Java offers at least two techniques for dynamic code generation: JDK dynamic proxies and bytecode generation. Dynamic proxies are easier to use than bytecode generation, but they perform worse than directly generated bytecode, while bytecode generation is much worse in terms of code readability. Weighing the two, I personally think it is worth sacrificing some performance for better readability and maintainability.
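A minimal sketch of the dynamic proxy approach, using java.lang.reflect.Proxy from the JDK. The refer method here is a hypothetical stand-in for the RpcClient method of the same name, and the handler only describes the call it would send instead of actually encoding and transmitting it.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.Arrays;

// Hypothetical sketch: importing an interface by creating a JDK dynamic proxy at run time.
class ReferSketch {
    interface DemoService {
        String hi(String s);
    }

    @SuppressWarnings("unchecked")
    static <T> T refer(Class<T> type) {
        InvocationHandler handler = (proxy, method, args) -> {
            if (method.getDeclaringClass() == Object.class) {
                // toString, hashCode and equals stay local; they are not remote calls.
                switch (method.getName()) {
                    case "hashCode": return System.identityHashCode(proxy);
                    case "equals":   return proxy == args[0];
                    default:         return "proxy for " + type.getName();
                }
            }
            // Placeholder for the remote call: a real framework would encode and send it here.
            return "would call " + type.getSimpleName() + "." + method.getName()
                    + Arrays.toString(args);
        };
        return (T) Proxy.newProxyInstance(type.getClassLoader(), new Class<?>[] {type}, handler);
    }

    public static void main(String[] args) {
        DemoService demo = refer(DemoService.class);
        System.out.println(demo.hi("How are you?")); // would call DemoService.hi[How are you?]
    }
}
```

The client code sees an ordinary DemoService reference, which is what makes the remote call look like a local one.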

Protocol encoding and decoding

Before initiating a call, the client proxy needs to encode the invocation information. This raises two questions: what information needs to be encoded, and in what format should it be transmitted so that the server can complete the call? For efficiency, the less information encoded the better (less data to transmit), and the simpler the encoding rules the better (faster to execute). Let's start with what needs to be encoded:

    -- Call encoding --
    1. Interface method
       Includes the interface name and the method name
    2. Method parameters
       Includes the parameter types and the parameter values
    3. Call properties
       Includes call property information, such as implicit arguments attached to the call, the call timeout, and so on
    -- Return encoding --
    1. Return result
       The return value defined by the interface method
    2. Return code
       The exception return code
    3. Return exception information
       The exception information for the call
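The items listed above map naturally onto two plain data carriers. The class and field names below are hypothetical, and the convention that return code 0 means success is an assumption; the field set itself follows the list.

```java
import java.util.Map;

// Hypothetical sketch: data carriers for the call and return information listed above.
class CallMessagesSketch {
    static final class RpcRequest {
        final String interfaceName;
        final String methodName;
        final String[] parameterTypes;         // type names, so overloaded methods can be told apart
        final Object[] arguments;
        final Map<String, String> attachments; // implicit arguments attached to the call
        final long timeoutMillis;

        RpcRequest(String interfaceName, String methodName, String[] parameterTypes,
                   Object[] arguments, Map<String, String> attachments, long timeoutMillis) {
            this.interfaceName = interfaceName;
            this.methodName = methodName;
            this.parameterTypes = parameterTypes;
            this.arguments = arguments;
            this.attachments = attachments;
            this.timeoutMillis = timeoutMillis;
        }
    }

    static final class RpcResponse {
        final Object result;       // return value of the interface method, if any
        final int returnCode;      // 0 is assumed to mean success
        final String errorMessage; // exception information when returnCode is not 0

        private RpcResponse(Object result, int returnCode, String errorMessage) {
            this.result = result;
            this.returnCode = returnCode;
            this.errorMessage = errorMessage;
        }

        static RpcResponse ok(Object result) {
            return new RpcResponse(result, 0, null);
        }

        static RpcResponse error(int returnCode, String errorMessage) {
            return new RpcResponse(null, returnCode, errorMessage);
        }

        boolean isError() {
            return returnCode != 0;
        }
    }

    public static void main(String[] args) {
        RpcRequest request = new RpcRequest("DemoService", "hi",
                new String[] {"java.lang.String"}, new Object[] {"Bob"},
                Map.of("traceId", "t-1"), 3000);
        System.out.println(request.interfaceName + "." + request.methodName);
        System.out.println(RpcResponse.error(500, "boom").isError()); // true
    }
}
```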

Besides this necessary invocation information, we may also need some meta-information to make decoding easier and to leave room for future extension. The encoded message therefore has two parts: the meta-information and the information necessary for the call. When designing an RPC protocol message, the meta-information goes in the protocol message header and the necessary information goes in the protocol message body. The following is a conceptual format for an RPC protocol message:


  -- Message header --
  Magic: protocol magic number, used for decoding
  Header size: protocol header length, used for extension
  Version: protocol version, used for compatibility
  ST: serialization type of the message body
  HB: heartbeat message flag, used for transport-layer heartbeats on long-lived connections
  OW: one-way message flag
  RP: response message flag; when the bit is not set, the message is a request
  Status code: status code of a response message
  Reserved: reserved for byte alignment
  Message ID: message ID
  Body size: message body length
  -- Message body --
  Encoded by serialization; the following formats are common:
  XML: such as WebService SOAP
  JSON: such as JSON-RPC
  Binary: such as Thrift, Hessian, Kryo, etc.
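One way to lay such a header out in bytes is sketched below with java.nio.ByteBuffer. The field order follows the conceptual format, but the field widths, the magic value 0xCAFE, the packing of ST, HB, OW and RP into a single flags byte, and the resulting 20-byte header are all assumptions made for illustration.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch: a fixed 20-byte header followed by the serialized body.
class HeaderCodecSketch {
    static final short MAGIC = (short) 0xCAFE; // assumed magic number
    static final int HEADER_SIZE = 20;         // 2 + 2 + 1 + 1 + 1 + 1 + 8 + 4 bytes
    static final int FLAG_HEARTBEAT = 0x08, FLAG_ONE_WAY = 0x10, FLAG_RESPONSE = 0x20;

    static final class Header {
        int headerSize, serialization, status, bodySize;
        boolean heartbeat, oneWay, response;
        long messageId;
    }

    static byte[] encode(int serialization, boolean heartbeat, boolean oneWay, boolean response,
                         int status, long messageId, byte[] body) {
        // The low three bits carry the serialization type (ST); the next bits are HB, OW, RP.
        int flags = (serialization & 0x07)
                | (heartbeat ? FLAG_HEARTBEAT : 0)
                | (oneWay ? FLAG_ONE_WAY : 0)
                | (response ? FLAG_RESPONSE : 0);
        ByteBuffer buf = ByteBuffer.allocate(HEADER_SIZE + body.length);
        buf.putShort(MAGIC);
        buf.putShort((short) HEADER_SIZE);
        buf.put((byte) 1);      // version
        buf.put((byte) flags);
        buf.put((byte) status);
        buf.put((byte) 0);      // reserved
        buf.putLong(messageId);
        buf.putInt(body.length);
        buf.put(body);
        return buf.array();
    }

    static Header decodeHeader(byte[] message) {
        ByteBuffer buf = ByteBuffer.wrap(message);
        if (buf.getShort() != MAGIC) {
            throw new IllegalArgumentException("bad magic number");
        }
        Header h = new Header();
        h.headerSize = buf.getShort() & 0xFFFF; // tells the reader where the body starts
        buf.get();                              // version, not checked in this sketch
        int flags = buf.get();
        h.serialization = flags & 0x07;
        h.heartbeat = (flags & FLAG_HEARTBEAT) != 0;
        h.oneWay = (flags & FLAG_ONE_WAY) != 0;
        h.response = (flags & FLAG_RESPONSE) != 0;
        h.status = buf.get() & 0xFF;
        buf.get();                              // reserved
        h.messageId = buf.getLong();
        h.bodySize = buf.getInt();
        return h;
    }

    public static void main(String[] args) {
        byte[] message = encode(2, false, true, false, 0, 42L, new byte[] {1, 2, 3});
        Header h = decodeHeader(message);
        System.out.println(h.messageId + " " + h.bodySize + " " + h.oneWay); // 42 3 true
    }
}
```

Carrying the header size in the header itself lets a newer version append fields without breaking older readers, which can skip to the body using that length.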

Once the format is fixed, encoding and decoding are straightforward. Because the header length is fixed, what matters more is how the message body is serialized. For serialization we care about three things:

1. The efficiency of serialization and deserialization: the faster the better.
2. The byte length after serialization: the smaller the better.
3. The compatibility of serialization and deserialization: for example, whether things still work when a field is added to an interface parameter object.

These three goals sometimes cannot all be had at once, like the proverbial fish and bear's paw. The trade-offs depend on the implementation details of the specific serialization library and are not analyzed further in this article.

Transport Service

After protocol encoding, the encoded RPC request message naturally has to be transmitted to the server, which executes the call and returns a result message or acknowledgment message to the client. The application scenario of RPC is essentially a reliable request-reply message flow, similar to HTTP, so choosing long-lived TCP connections is more efficient. Unlike HTTP, we define a unique ID for each message at the protocol level, which makes it easier to reuse a connection.
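The role of the per-message ID in connection reuse can be sketched as a table of pending calls: many requests share one connection, and each response is matched back to its caller by ID. PendingCallsSketch and its method names are hypothetical, and the result is simplified to a String.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: matching responses to requests by message ID on a shared connection.
class PendingCallsSketch {
    private final AtomicLong nextId = new AtomicLong();
    private final Map<Long, CompletableFuture<String>> pending = new ConcurrentHashMap<>();

    // Called when a request is sent: allocate a unique ID and remember who is waiting for it.
    long register(CompletableFuture<String> future) {
        long id = nextId.incrementAndGet();
        pending.put(id, future);
        return id;
    }

    // Called by the connection's reader when a response with this ID arrives.
    boolean complete(long id, String result) {
        CompletableFuture<String> future = pending.remove(id);
        return future != null && future.complete(result);
    }

    public static void main(String[] args) {
        PendingCallsSketch calls = new PendingCallsSketch();
        CompletableFuture<String> first = new CompletableFuture<>();
        CompletableFuture<String> second = new CompletableFuture<>();
        long firstId = calls.register(first);
        long secondId = calls.register(second);
        // Responses may come back in any order; the ID routes each one to the right caller.
        calls.complete(secondId, "second result");
        calls.complete(firstId, "first result");
        System.out.println(first.join() + ", " + second.join());
    }
}
```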

With long-lived connections, the first question is how many connections are needed between a client and a server. In use, a single connection and multiple connections make no difference, and for applications with low data volumes a single connection is basically enough. The biggest difference between them is that every connection has its own private send and receive buffers, so a large volume of data can be spread across the buffers of different connections for better throughput. If your data volume is not enough to keep a single connection's buffers saturated, using multiple connections will bring no noticeable improvement and will only add connection management overhead.

The connection is initiated and maintained by the client. If the client and server are connected directly, the connection is generally not interrupted (apart from physical link failures, of course). If the connection passes through intermediate load-balancing or relay devices, those devices may drop it once it has been inactive for a period of time. To keep the connection alive, heartbeat data must be sent periodically on each connection. The heartbeat message is an internal message of the RPC framework library; the protocol header structure above has a dedicated heartbeat bit to mark it, and it is transparent to the business application.
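The heartbeat logic can be sketched as an idle check driven by a timer. HeartbeatSketch is a hypothetical name, the idle threshold and timer period are arbitrary, and the place where a real framework would write a header with the HB bit set is reduced to a counter.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: send a heartbeat only when the connection has been idle for a while.
class HeartbeatSketch {
    private final long idleMillis;
    private volatile long lastActivityMillis;
    private int heartbeatsSent;

    HeartbeatSketch(long idleMillis, long nowMillis) {
        this.idleMillis = idleMillis;
        this.lastActivityMillis = nowMillis;
    }

    // Any business message sent or received counts as activity on the connection.
    void onActivity(long nowMillis) {
        lastActivityMillis = nowMillis;
    }

    // Timer callback: returns true when a heartbeat was due and has been "sent".
    synchronized boolean tick(long nowMillis) {
        if (nowMillis - lastActivityMillis < idleMillis) {
            return false;
        }
        heartbeatsSent++; // placeholder for writing a message whose header has the HB bit set
        lastActivityMillis = nowMillis;
        return true;
    }

    synchronized int heartbeatsSent() {
        return heartbeatsSent;
    }

    public static void main(String[] args) throws InterruptedException {
        HeartbeatSketch heartbeat = new HeartbeatSketch(100, System.currentTimeMillis());
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        timer.scheduleAtFixedRate(() -> {
            if (heartbeat.tick(System.currentTimeMillis())) {
                System.out.println("heartbeat sent");
            }
        }, 50, 50, TimeUnit.MILLISECONDS);
        Thread.sleep(350);
        timer.shutdownNow();
    }
}
```

Skipping the heartbeat while business traffic is flowing avoids sending redundant messages on a connection that is clearly alive.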

Execute call

All the client stub does is encode the message and transmit it to the server; the actual invocation takes place on the server. In the structure breakdown above we split the server stub into two components, RpcProcessor and RpcInvoker: one controls the calling process and the other performs the real call. Here too we take Java implementations of these two components as an example and analyze what they need to do.

In Java, dynamic interface calls are generally made through reflection. Besides the JDK's native reflection, some third-party libraries provide reflective calls with better performance, so RpcInvoker encapsulates the implementation details of the reflective invocation.

What needs to be considered in controlling the calling process, and what call control services does RpcProcessor need to provide? Here are a few points as a starting point:
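A server-side invoker built on the JDK's own reflection might look like the sketch below. ReflectiveInvokerSketch and the shape of its invoke method are assumptions; Class.getMethod and Method.invoke are the standard JDK calls.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

// Hypothetical sketch: resolving and invoking an exported method through JDK reflection.
class ReflectiveInvokerSketch {
    interface DemoService {
        String hi(String s);
    }

    static Object invoke(Object target, Class<?> iface, String methodName,
                         Class<?>[] parameterTypes, Object[] arguments) {
        try {
            // Look the method up on the exported interface, not on the implementation class.
            Method method = iface.getMethod(methodName, parameterTypes);
            return method.invoke(target, arguments);
        } catch (InvocationTargetException e) {
            // The implementation itself threw: surface its exception as the cause.
            throw new RuntimeException("call failed in the implementation", e.getCause());
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("no such exported method: " + methodName, e);
        }
    }

    public static void main(String[] args) {
        DemoService demo = s -> "hi, " + s;
        Object result = invoke(demo, DemoService.class, "hi",
                new Class<?>[] {String.class}, new Object[] {"Bob"});
        System.out.println(result); // hi, Bob
    }
}
```

Resolving the method on the interface keeps the call limited to what was exported, whatever other public methods the implementation class happens to have.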

    1. Efficiency
       Every request should be executed as soon as possible, so we cannot create a new thread for each request; a thread pool service is needed.
    2. Resource isolation
       When several remote interfaces are exported, how do we prevent calls to a single interface from occupying all thread resources and blocking the execution of the other interfaces?
    3. Timeout control
       When an interface executes slowly and the client has already given up waiting, there is no point in the server-side thread continuing to execute.
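All three points can be sketched with the JDK's executor framework: a fixed pool per interface gives pooling and isolation, and Future.get with a timeout plus cancel handles slow calls. ProcessorSketch, its method names, and the string results used to signal timeouts and errors are hypothetical simplifications.

```java
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical sketch: per-interface thread pools with a timeout on every call.
class ProcessorSketch {
    private final Map<String, ExecutorService> pools = new ConcurrentHashMap<>();
    private final int threadsPerInterface;

    ProcessorSketch(int threadsPerInterface) {
        this.threadsPerInterface = threadsPerInterface;
    }

    // Resource isolation: each interface gets its own bounded pool, created on first use.
    private ExecutorService poolFor(String interfaceName) {
        return pools.computeIfAbsent(interfaceName,
                name -> Executors.newFixedThreadPool(threadsPerInterface));
    }

    String process(String interfaceName, Callable<String> call, long timeoutMillis) {
        Future<String> future = poolFor(interfaceName).submit(call);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true); // interrupt the worker so it stops working for nobody
            return "TIMEOUT";
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "INTERRUPTED";
        } catch (ExecutionException e) {
            return "ERROR: " + e.getCause();
        }
    }

    void shutdown() {
        pools.values().forEach(ExecutorService::shutdownNow);
    }

    public static void main(String[] args) {
        ProcessorSketch processor = new ProcessorSketch(2);
        System.out.println(processor.process("DemoService", () -> "ok", 1000));
        System.out.println(processor.process("SlowService", () -> {
            Thread.sleep(2000);
            return "late";
        }, 50));
        processor.shutdown();
    }
}
```

Note that cancel(true) only interrupts the worker thread; a call that ignores interruption will still run to completion, so cooperative cancellation is part of the contract with the service implementation.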

RPC Exception Handling

No matter how hard RPC tries to disguise remote calls as local calls, the two remain very different, and remote calls run into exceptions that never occur locally. Before discussing exception handling, let's compare local calls and RPC calls:

1. A local call is certain to execute, while a remote call is not: the invocation message may never reach the server because of network problems.
2. A local call only throws exceptions declared by the interface, while a remote call may also throw other exceptions raised by the RPC framework at run time.
3. The performance of local and remote calls can differ greatly, depending on how large the inherent overhead of RPC is relative to the work being done.

These differences are why using RPC requires extra care. When a call to a remote interface throws an exception, it may be a business exception, or it may be a run-time exception thrown by the RPC framework (a network interruption, for example). A business exception means the server did execute the call but could not complete it normally for some reason, whereas an RPC run-time exception may mean the server never executed the call at all. The caller's exception handling policy naturally has to distinguish between the two.

The inherent overhead of RPC is several orders of magnitude higher than that of a local call: a local call costs on the order of nanoseconds, while an RPC costs on the order of milliseconds. Computing tasks that are too lightweight are therefore not suited to being exported as a remote interface and served by a separate process. Exporting a service as a remote interface is worthwhile only when the time spent on the computation itself is much greater than the inherent overhead of the RPC.
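One possible caller-side policy that separates the two kinds of failure is sketched below. The exception classes, the Call interface, and the retry policy are all hypothetical; in particular, retrying after a framework exception is only safe when the operation is idempotent, because the request may have been executed even though no response arrived.

```java
// Hypothetical sketch: handling business exceptions and framework exceptions differently.
class RpcExceptionsSketch {
    // Raised by the framework itself, for example on a network failure.
    static class RpcRuntimeException extends RuntimeException {
        RpcRuntimeException(String message) {
            super(message);
        }
    }

    // Raised by the service implementation: the server did execute the call.
    static class BusinessException extends RuntimeException {
        BusinessException(String message) {
            super(message);
        }
    }

    interface Call {
        String run();
    }

    static String callWithPolicy(Call call, int maxAttempts) {
        for (int attempt = 1; ; attempt++) {
            try {
                return call.run();
            } catch (BusinessException e) {
                // The server executed the call and rejected it; retrying would not help.
                return "business failure: " + e.getMessage();
            } catch (RpcRuntimeException e) {
                // Transport-level failure: retry a bounded number of times, then give up.
                if (attempt >= maxAttempts) {
                    return "gave up after " + attempt + " attempts";
                }
            }
        }
    }

    public static void main(String[] args) {
        int[] attempts = {0};
        System.out.println(callWithPolicy(() -> {
            if (++attempts[0] < 3) {
                throw new RpcRuntimeException("network down");
            }
            return "ok";
        }, 3)); // ok, on the third attempt
        System.out.println(callWithPolicy(() -> {
            throw new BusinessException("insufficient balance");
        }, 3)); // business failure: insufficient balance
    }
}
```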

Summary

We have now presented a conceptual framework for an RPC implementation and analyzed in detail some of the implementation issues that need to be considered. However elegant the concept of RPC may be, "there are still a few snakes hidden in the grass", and only a deep understanding of the nature of RPC makes it possible to apply it well.
