1. Overview
After covering message formats and network IO models in detail, and warming up with Java RMI, this article begins another key topic of this blog series: RPC. In the following articles we first explain the basic concepts of RPC and the basic elements that make up a concrete RPC implementation, and then we look in detail at a typical RPC framework: Apache Thrift. Next we discuss service governance and the Dubbo service framework. Finally, we summarize how to choose an appropriate RPC framework in real work.
2. RPC Overview
2-1. What is RPC
RPC (Remote Procedure Call Protocol) is a protocol for invoking procedures remotely. An informal description is that the client calls an object that lives on a remote computer without knowing the details of the call, just as if it were calling an object in the local application. A more formal description is: a protocol for requesting a service from a program on a remote computer over the network, without needing to understand the underlying network technology. We can draw at least the following points out of this description:
RPC is a protocol: since a protocol is just a set of specifications, someone has to implement it. Typical RPC implementations today include Dubbo, Thrift, gRPC, Hetty, and so on. It is worth noting that, as the technology has developed, tools that implement the RPC protocol often bundle other important functions; Dubbo, for example, also includes service management, access control, and other functions.
The network protocol and the network IO model are transparent to the caller: since the RPC client believes it is calling a local object, it does not need to care whether the transport uses TCP, UDP, HTTP, or some other network protocol. And since the network protocol is transparent to it, it does not need to care which network IO model is used during the call either.
The message format is transparent to the caller: in a local application, calling an object means passing some parameters and receiving a result. The caller does not need to care how the called object uses the parameters internally or how the result is computed. For a remote call, the parameters are passed to another computer on the network in some message format, and the caller does not need to care about the details of that format either.
It should work across languages: why? Because the caller does not actually know what language the application on the remote server is written in. So for the caller, the call should succeed regardless of the language used on the server side, and the return value should be described in a form that the caller's programming language can understand.
The description above can be pictured as follows:
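As a minimal sketch of this "call it as if it were local" idea, the Java snippet below uses a JDK dynamic proxy. The names Calculator, TransparentCallSketch, and REMOTE_IMPL are hypothetical and chosen only for illustration, and the "remote" side is simulated inside the same process. A real framework would serialize the call and send it over the network at the point marked in the comment.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.Arrays;

// Hypothetical service contract shared by the caller and the remote side.
interface Calculator {
    int add(int a, int b);
}

class TransparentCallSketch {

    // Stand-in for the implementation on the remote machine (assumption:
    // simulated in-process so that the sketch runs without a network).
    static final Calculator REMOTE_IMPL = (a, b) -> a + b;

    // Builds a client-side object that implements Calculator but only forwards calls.
    static Calculator remoteCalculator() {
        InvocationHandler handler = (proxy, method, args) -> {
            // A real stub would encode the method name and arguments here,
            // send them over some transport, and decode the reply.
            System.out.println("forwarding " + method.getName() + Arrays.toString(args));
            return method.invoke(REMOTE_IMPL, args);
        };
        return (Calculator) Proxy.newProxyInstance(
                Calculator.class.getClassLoader(),
                new Class<?>[] {Calculator.class},
                handler);
    }

    public static void main(String[] args) {
        // The caller only sees the Calculator interface, exactly as for a local object.
        Calculator calculator = remoteCalculator();
        System.out.println(calculator.add(2, 3));
    }
}
```

The calling code in main never mentions a protocol, an IO model, or a message format; everything it would need to know about them is hidden behind the interface.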
2-2. RPC Features
Of course, the above is only what the caller of RPC observes (and in reality the client is more or less aware of some details of the RPC call). Since we are explaining the basic concepts of RPC, we need to be clear about what happens inside an RPC implementation:
Client: the caller of the RPC protocol. As described above, in the ideal case the client initiates a call to the remote service without knowing that an RPC framework exists at all. In reality, however, the client needs to specify some details of the RPC framework.
Server: in the RPC specification, the server is not the module that listens on the RPC server's IP address and port. It is the concrete implementation of the remote service methods (in Java, the implementation of the RPC service interface). Its code is ordinary business code, and the interface implementation class does not even know that it will be called by a remote RPC client.
Stub/Proxy: the RPC proxy lives on the client side. To make the client's use of the RPC framework "transparent", the client cannot manage the message format itself, cannot manage the network transport protocol itself, and cannot determine by itself whether the call failed. On the client side, all of this work is handed to the "proxy" layer of the RPC framework.
Message Protocol: as we have said, a complete client-server interaction requires a message format that both ends have agreed on and can recognize. The message management layer of RPC is dedicated to encoding and decoding the messages carried over the network transport. The prevailing trend is that each RPC implementation has one (or a few) proprietary message formats in order to improve the efficiency of its own framework. For example, the message protocol used by the RMI framework described earlier is JRMP; Thrift, the RPC framework we will explain in detail later, also has its own private message protocols (and of course it also supports some common message formats, such as JSON).
Transfer/Network Protocol: the transport protocol layer manages the network protocol and the network IO model used by the RPC framework. For example, Hessian's transport is based on HTTP (an application-layer protocol), while Thrift's transport is based on TCP (a transport-layer protocol). The transport layer also has to unify the IO model used by the RPC client and the RPC server; the commonly used IO models were explained in detail earlier (see my previous blog post, "Architecture Design: Inter-system Communication (3): IO communication models and Java practice").
Selector/Processor: this lives on the RPC server side. The server-side implementation of an RPC interface is just an implementation (it does not know that it is a service that will be called by third-party systems through RPC), so the RPC framework needs a role that is responsible for executing these interface implementations. Its duties include managing the registration of RPC interfaces, checking the client's request permissions, and controlling how the interface implementations are executed.
IDL: in fact, an IDL (Interface Definition Language) is not required in an RPC implementation. But an RPC framework that needs to work across languages is bound to have an IDL, because it needs a description of the message structures and interface definitions that every language can understand. If your RPC implementation does not need to be cross-language, the IDL part can be left out. Java RMI, for example, is designed for calls between Java programs, so it has no corresponding IDL.
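To make the Message Protocol element more concrete, here is a minimal sketch of encoding and decoding a call. The wire layout (a 4-byte length prefix, then the method name, the argument count, and the int arguments) is an assumption invented for this illustration; it is not the format of JRMP, Thrift, or any other real framework, and the class name MessageCodecSketch is hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;

class MessageCodecSketch {

    // Decoded form of one call: which method, with which int arguments.
    static final class Request {
        final String method;
        final int[] args;

        Request(String method, int[] args) {
            this.method = method;
            this.args = args;
        }
    }

    // Assumed layout: [int bodyLength][UTF methodName][int argCount][int arg]...
    static byte[] encode(String method, int... args) {
        try {
            ByteArrayOutputStream body = new ByteArrayOutputStream();
            DataOutputStream fields = new DataOutputStream(body);
            fields.writeUTF(method);
            fields.writeInt(args.length);
            for (int arg : args) {
                fields.writeInt(arg);
            }
            fields.flush();

            ByteArrayOutputStream frame = new ByteArrayOutputStream();
            DataOutputStream framed = new DataOutputStream(frame);
            framed.writeInt(body.size()); // the length prefix marks where the message ends
            body.writeTo(framed);
            framed.flush();
            return frame.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static Request decode(byte[] frame) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(frame));
            byte[] body = new byte[in.readInt()];
            in.readFully(body); // fails with EOFException if only part of the message arrived
            DataInputStream fields = new DataInputStream(new ByteArrayInputStream(body));
            String method = fields.readUTF();
            int[] args = new int[fields.readInt()];
            for (int i = 0; i < args.length; i++) {
                args[i] = fields.readInt();
            }
            return new Request(method, args);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Request request = decode(encode("add", 2, 3));
        System.out.println(request.method + Arrays.toString(request.args));
    }
}
```

Both ends have to agree on this layout in advance, which is exactly the "common contract" role that the message protocol plays.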
Note that different RPC frameworks differ in their design. For example, stubs are generated in different ways, the IDL languages differ, service registration is managed differently, service implementations are run differently, the message formats differ, and the network protocols used differ. But the basic ideas are the same, and the elements listed above are present in all of them.
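The Selector/Processor role can be sketched in the same spirit. The code below is only an illustration under assumptions: EchoService, PlainEchoService, and ProcessorSketch are hypothetical names, the lookup uses Java reflection, and permission checks and threading are left out. It shows one possible way for a server-side registry to find an interface implementation by name and run it.

```java
import java.lang.reflect.Method;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical service contract.
interface EchoService {
    String echo(String text);
}

// Ordinary business code: nothing in it knows about RPC.
class PlainEchoService implements EchoService {
    @Override
    public String echo(String text) {
        return "echo: " + text;
    }
}

class ProcessorSketch {

    // Pairs a service contract with the object that implements it.
    private static final class Registration {
        final Class<?> contract;
        final Object implementation;

        Registration(Class<?> contract, Object implementation) {
            this.contract = contract;
            this.implementation = implementation;
        }
    }

    private final Map<String, Registration> services = new ConcurrentHashMap<>();

    // Registers an implementation under the name of its interface.
    <T> void register(Class<T> contract, T implementation) {
        services.put(contract.getName(), new Registration(contract, implementation));
    }

    // Finds the registered implementation and runs the requested method on it.
    Object dispatch(String contractName, String methodName, Object... args) {
        Registration registration = services.get(contractName);
        if (registration == null) {
            throw new IllegalArgumentException("unknown service: " + contractName);
        }
        for (Method method : registration.contract.getMethods()) {
            if (method.getName().equals(methodName) && method.getParameterCount() == args.length) {
                try {
                    return method.invoke(registration.implementation, args);
                } catch (ReflectiveOperationException e) {
                    throw new IllegalStateException("call failed: " + methodName, e);
                }
            }
        }
        throw new IllegalArgumentException("unknown method: " + methodName);
    }

    public static void main(String[] args) {
        ProcessorSketch processor = new ProcessorSketch();
        processor.register(EchoService.class, new PlainEchoService());
        System.out.println(processor.dispatch(EchoService.class.getName(), "echo", "hello"));
    }
}
```

In a complete framework the contract name, method name, and arguments would come from a decoded message rather than from a direct method call.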
2-3. Introduction to typical RPC frameworks
Java RMI: do you feel that some of the key concepts we introduced for RMI in the previous article show up again in RPC? They do. An early and influential RPC specification, ONC RPC, came from Sun and was later published and revised through the IETF. RMI is a typical RPC implementation, but RMI does not support cross-language calls, so it has no need for an IDL. RMI performs well, and because there is no IDL, building a complete RPC setup takes fewer steps than with other RPC frameworks, which makes it easier to use. If your business has no cross-language requirements and your main systems are basically all written in Java, RMI is definitely an option worth considering.
gRPC: gRPC is a high-performance, general-purpose, open-source RPC framework developed by Google, designed with mobile applications in mind. It is based on the HTTP/2 standard (note: HTTP/2, not the HTTP/1.1 we use most often; a detailed introduction to HTTP/2 can be found at the official address: https://http2.github.io/), is built on the protobuf (Protocol Buffers) serialization protocol, and supports many programming languages. To support cross-language use, gRPC has its own IDL. However, because gRPC is a Google open-source product and Google mainly promotes its own protobuf for message encoding, gRPC is built around protobuf rather than other message formats (and the efficiency of protobuf is plain to see). For a detailed introduction to using gRPC, see the official address: https://github.com/grpc/grpc
Thrift: Thrift is an open-source project that originated at Facebook and later entered the Apache incubator. Thrift also supports cross-language use, so it has its own IDL. It currently supports almost all mainstream programming languages: C++, Java, Python, PHP, Ruby, Erlang, Perl, Haskell, C#, Cocoa, JavaScript, Node.js, Smalltalk, OCaml, Delphi, and others. Thrift supports several message formats: besides Thrift's private binary encoding and an LVQ message format (similar to TLV), it also supports the regular JSON format. Thrift's network protocol is built on TCP, and it supports both the blocking IO model and the multiplexed IO model. We will explain how to use Apache Thrift in detail later in this series. Thrift is one of the most popular RPC frameworks, and in the various performance tests published online Thrift's performance is among the best. Thrift's official address is: http://thrift.apache.org/
Hetty: Hetty is a high-performance RPC framework built on Netty and Hessian. In previous articles I described in detail the trade-offs between using Netty for network processing and using Java's native IO model directly (see my other blog post, "http://blog.csdn.net/yinwenjie/article/details/48969853"). Hetty's network protocol is based on HTTP, and because it is built on Netty, Hetty supports both the blocking IO model and the multiplexed IO model. Hetty's message format is a private binary stream format.
Dubbo: Dubbo is Alibaba's open-source distributed service framework. Note that I say distributed service framework, not RPC framework; to be more rigorous, it should be called a "service governance framework". Besides integrating an RPC specification, Dubbo builds service-layer, configuration-layer, and service-routing functionality on top of RPC (together with the RPC implementation itself there are 10 layers in total). The principles and use of Dubbo will be explained in detail after we cover "service governance".
Other RPC frameworks: besides the RPC implementations above, there are also WildFly, Hprose, and others. Hprose is an RPC implementation led by Chinese developers, and interested readers can take a look (http://www.hprose.com/). In addition, by the definition of RPC, Web service frameworks such as XFire and CXF also count as RPC: the WSDL file is their IDL, stubs for different programming languages are generated from the WSDL, specific service implementations are run and managed by different Web servers, HTTP is their communication protocol, and XML is their message format.
3. Performance factors of an RPC framework
With the same physical server performance, the following factors have a direct impact on the performance of an RPC framework:
Supported network IO models: your RPC server may support only traditional blocking synchronous IO, or you may improve it to support non-blocking synchronous IO, or implement support for several IO models on the server. The performance of these RPC servers can differ greatly under high concurrency, especially in the memory and CPU usage needed per unit of processing capacity.
Underlying network protocol: in general, you can have your RPC framework use an application-layer protocol, such as HTTP or the HTTP/2 protocol mentioned earlier, or use TCP so that the framework works at the transport layer. The layer at which the framework works has some influence on its performance, but the final impact is not large. At least among the mainstream RPC implementations, none uses UDP as its main transport protocol.
Message encapsulation format: when choosing or defining a message format, the issues to consider include the readability of the message, the size of the message body needed to describe a unit of content, the difficulty of encoding, the difficulty of decoding, and the difficulty of handling the half-packet/sticky-packet problem. Of course, if you are defining a message format used only by your RPC framework, readability may not be the most important consideration. The design of the message encapsulation format is the most important reason for performance differences between RPC frameworks, which is why almost every mainstream RPC framework designs its own private message format.
Service scheduling and management: under highly concurrent requests, how the registered services are scheduled is also a factor in performance. You can have the RPC Selector/Processor run the concrete service implementations on a single thread (which means the next client's request has to wait until the previous client's request has been processed); you can start a separate thread for each execution of a concrete RPC service (multiple requests can then be processed at once, but the operating system limits the maximum number of threads that can run); you can use a thread pool to run the concrete service implementations (so far this looks like a good approach for a single service node); and you can also run the concrete RPC service implementations on multiple service nodes by means of a registry proxy.
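The difference between the single-thread and thread-pool strategies can be sketched with the standard java.util.concurrent executors. This is a simplified illustration, not how any particular framework schedules its services: the class name ServiceExecutionSketch and the method serve are hypothetical, and each "request" does nothing except record which thread handled it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ServiceExecutionSketch {

    // Submits simulated requests to the executor and reports which threads served them.
    static Set<String> serve(ExecutorService executor, int requests) {
        Set<String> workers = ConcurrentHashMap.newKeySet();
        List<Future<?>> pending = new ArrayList<>();
        for (int i = 0; i < requests; i++) {
            pending.add(executor.submit(() -> {
                // A real processor would run the registered service implementation here.
                workers.add(Thread.currentThread().getName());
            }));
        }
        try {
            for (Future<?> future : pending) {
                future.get(); // wait until every request has been handled
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            executor.shutdown();
        }
        return workers;
    }

    public static void main(String[] args) {
        // Single thread: requests are handled strictly one after another.
        System.out.println(serve(Executors.newSingleThreadExecutor(), 8).size());
        // Thread pool: a bounded number of threads share the requests.
        System.out.println(serve(Executors.newFixedThreadPool(4), 8).size());
    }
}
```

A thread pool bounds the number of threads regardless of how many requests arrive, which is what separates it from the thread-per-request strategy and its operating-system limit.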
4. Preview of the following articles
Next, I will spend one article on the use of the Apache Thrift RPC framework and the technical characteristics that distinguish Thrift from other RPC frameworks. Then we will talk about how to manage a large number of RPC services effectively in a large-scale system. We will first work out a solution and try to solve the problem ourselves. Finally, we introduce the Dubbo distributed service framework and see how Dubbo solves this problem.
Copyright notice: reprinting is welcome, but out of respect for my hard work, please cite the source: http://blog.csdn.net/yinwenjie
Architecture Design: Inter-system Communication (Basic Concepts of RPC)