In a distributed service framework, one of the most fundamental questions is how remote services communicate. The Java world offers many technologies for remote communication, such as RMI, MINA, ESB, Burlap, Hessian, SOAP, EJB, and JMS. How do these terms relate to one another, and what principles lie behind them? Understanding this is the foundation for implementing a distributed service framework, and when performance requirements are high, it becomes necessary to understand the underlying mechanisms as well.
1 Fundamentals
To achieve communication between machines on a network, start with the fundamentals of network communication in computer systems. At the lowest level, network communication means transferring a stream of bytes from one computer to another, which is done with a transport protocol and network IO. The best-known transport protocols are TCP and UDP; both are built on the concept of the socket, and each was designed for a particular class of application scenarios. Network IO comes in three main forms: BIO, NIO, and AIO. All distributed application communication is built on this principle. For ease of use, languages usually also provide application-level protocols that are more convenient for applications.
2 Message Mode
In the final analysis, an enterprise application system is about processing data, and for an enterprise application made up of multiple subsystems, the basic support is undoubtedly message processing. Unlike an object, a message is essentially a data structure (although an object can also be seen as a special kind of message). It contains data that both the consumer and the service can recognize, it needs to be passed between different processes (or machines), and it may be consumed by several completely different clients. Messaging appears to be a better fit than file transfer or remote procedure call (RPC) because it is more platform-independent and supports concurrent and asynchronous calls well.
Web services and REST can be seen as derivatives or encapsulations of messaging technology.
2.1 Message Channel Mode
The message pattern that we often use is the message channel mode.
The message channel is a layer of indirection introduced between the client (the consumer) and the service (the producer), and it effectively decouples the two. As long as both parties agree on the message format and on the mechanism and timing for processing messages, the consumer can remain "ignorant" of the producer. In fact, the model can support multiple producers and consumers. For example, several producers can send messages to the same channel; because the consumer knows nothing about the producers, it does not have to consider which producer sent a given message.
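To make the decoupling concrete, here is a minimal in-process sketch in which a java.util.concurrent.BlockingQueue stands in for the message channel. The class name MessageChannelSketch, the Message type, and the sample payloads are assumptions made for illustration; a real channel would be a product such as the message queues discussed in this section.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class MessageChannelSketch {

    /** The agreed message format: the only thing producers and consumers share. */
    public static final class Message {
        final String type;
        final String payload;

        Message(String type, String payload) {
            this.type = type;
            this.payload = payload;
        }
    }

    /** Two producers write to the channel; one consumer drains it afterwards. */
    public static List<String> run() throws InterruptedException {
        BlockingQueue<Message> channel = new LinkedBlockingQueue<>();

        // Hypothetical producers; neither knows who will consume its message.
        Thread orders = new Thread(() -> channel.add(new Message("order", "order-1001")));
        Thread billing = new Thread(() -> channel.add(new Message("invoice", "invoice-77")));
        orders.start();
        billing.start();
        orders.join();
        billing.join();

        // The consumer only looks at the message, never at the producer.
        List<String> consumed = new ArrayList<>();
        Message message;
        while ((message = channel.poll()) != null) {
            consumed.add(message.type + ":" + message.payload);
        }
        return consumed;
    }

    public static void main(String[] args) throws InterruptedException {
        // The order of the two entries depends on thread scheduling.
        System.out.println(run());
    }
}
```

The consumer code never refers to a producer, so producers can be added or removed without touching it.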
Although the message channel decouples the producer from the consumer and lets us add producers and consumers freely, it introduces a dependency of its own: both sides must know where the channel resource is located. To remove this dependency, consider introducing a lookup service that finds the channel resource. In JMS, for example, the message channel Queue can be obtained through JNDI. For full flexibility, the channel-related information can be stored in a configuration file, which the lookup service reads first in order to obtain the channel.
The message channel usually takes the form of a queue, and this first-in, first-out (FIFO) data structure is undoubtedly the best fit for processing messages. Microsoft's MSMQ, IBM MQ, JBoss MQ, and the open source RabbitMQ and Apache ActiveMQ all implement the message channel pattern with queues. When choosing the message channel mode, therefore, it is necessary to analyze and weigh the products that implement it in terms of their quality attributes: how well the channel supports concurrency and performance, whether it handles errors adequately, and whether it supports message security, message persistence, failover, and clustering.
Because the messages passed by the channel are often important business data, once the channel becomes a point of failure or a breach of security, it can have a disastrous impact on the system.
JNDI was mentioned above, so its mechanism is worth a brief explanation. Because JNDI depends on the specific implementation, it is explained here using the JBoss JNDI implementation.
When an object instance is bound to the JBoss JNP server and a remote client obtains it through Context.lookup() and begins calling it, the JBoss JNDI implementation fetches the object instance from the JNP server, serializes it, and sends it back to the client. The client deserializes it locally, and the call then runs against the local class.
From this mechanism it follows that the client must actually have the class of the object instance bound in JBoss, otherwise deserialization will fail. Remote communication, however, means performing an operation remotely and obtaining the result, so JNDI alone cannot achieve remote communication.
JNDI is nevertheless a key technique for implementing a distributed service framework, because it makes remote and local calls transparent, as with EJB, and it is a good way to hide the actual deployment (for example, a DataSource).
2.2 Publisher-Subscriber mode
Once the message channel needs to support multiple consumers, there is a choice between two models: the pull model and the push model. In the pull model, the exchange is initiated by the consumer of the message. The initiative lies in the consumer's hands, and it calls the producer according to its own situation.
Another form of the pull model is one in which the producer notifies the consumer when its state has changed, and the notified consumer then obtains the details by calling back the producer object that was passed in.
In a message-based distributed system, a pull-model consumer usually polls the channel periodically, as a batch job that runs at a predetermined interval. Once a message is found, it is handed to the real processor (which can also be regarded as the consumer), which processes the message and executes the related business logic.
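The polling consumer described above might look like the following sketch. It assumes an in-memory java.util.Queue as the channel and a java.util.function.Consumer as the "real processor"; the class name PullConsumerSketch and the 100 ms interval are illustrative choices, not part of any particular product.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

public class PullConsumerSketch {

    private final Queue<String> channel;
    private final Consumer<String> processor;

    public PullConsumerSketch(Queue<String> channel, Consumer<String> processor) {
        this.channel = channel;
        this.processor = processor;
    }

    /** One polling pass: drain whatever is waiting and hand it to the processor. */
    public int pollOnce() {
        int handled = 0;
        String message;
        while ((message = channel.poll()) != null) {
            processor.accept(message);
            handled++;
        }
        return handled;
    }

    /** Runs pollOnce() repeatedly with a fixed delay, like a background batch job. */
    public ScheduledExecutorService start(long intervalMillis) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleWithFixedDelay(this::pollOnce, 0, intervalMillis, TimeUnit.MILLISECONDS);
        return scheduler;
    }

    public static void main(String[] args) throws InterruptedException {
        Queue<String> channel = new ConcurrentLinkedQueue<>();
        PullConsumerSketch consumer =
                new PullConsumerSketch(channel, msg -> System.out.println("processing " + msg));
        ScheduledExecutorService scheduler = consumer.start(100);
        channel.add("order-1");
        channel.add("order-2");
        Thread.sleep(300); // give the background job time to poll
        scheduler.shutdown();
    }
}
```

Keeping a single polling pass in its own method makes the scheduling policy easy to change without touching the processing logic.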
In the push model, the initiative usually lies with the producer, and consumers passively wait for its notifications, which requires the producer to know about its consumers.
In the push model, consumers do not need to know the producers. When the producer notifies a consumer, what is passed is usually the message (or event), not the producer itself. Producers can also register different consumers for different circumstances, or, inside the encapsulated notification logic, notify different consumers according to different state changes.
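As a rough sketch of the push model, the producer below keeps a registry of consumers per kind of state change and pushes the event itself to them. The class name PushProducerSketch and the state names "created" and "cancelled" are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

public class PushProducerSketch {

    // Consumers registered per kind of state change (state names are made up).
    private final Map<String, List<Consumer<String>>> consumersByState = new HashMap<>();

    public void register(String state, Consumer<String> consumer) {
        consumersByState.computeIfAbsent(state, key -> new ArrayList<>()).add(consumer);
    }

    /** Pushes the event (not the producer) to interested consumers; returns how many were notified. */
    public int stateChanged(String state, String detail) {
        List<Consumer<String>> targets =
                consumersByState.getOrDefault(state, Collections.emptyList());
        for (Consumer<String> consumer : targets) {
            consumer.accept(state + ":" + detail);
        }
        return targets.size();
    }

    public static void main(String[] args) {
        PushProducerSketch producer = new PushProducerSketch();
        producer.register("created", event -> System.out.println("billing got " + event));
        producer.register("cancelled", event -> System.out.println("refunds got " + event));
        producer.stateChanged("created", "order-1001"); // only the first consumer is notified
    }
}
```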
Both models have their advantages. The advantage of the pull model is that it further reduces the consumer's dependency on the channel, since the channel is accessed periodically by a background task. The downside is that a separate service process must be introduced to run on a schedule. In the push model, the message channel is in effect the subject observed by the consumer: once a message arrives, the consumer is notified to process it. Whichever model is used, the message object can be handled with a mechanism similar to the Observer pattern, in which consumers subscribe to producers, so the mechanism is often called the Publisher-Subscriber pattern.
Typically, publishers and subscribers are both registered with the infrastructure used to propagate changes (that is, the message channel). The publisher knows about the message channel so that it can send messages to it. Once the channel receives a message, it actively invokes the subscribers registered with it, which then consume the message content.
Subscribers can process messages in two ways. One is the broadcast mechanism: when a message is dequeued from the channel, the message object is copied and delivered to multiple subscribers. For example, several subsystems may need customer information from the CRM system, and each performs its own processing on the customer information it receives. A message channel used in this way is also known as a propagation channel. The other is the preemption mechanism, which works synchronously: only one subscriber can process a given message at a time. A message channel that implements the Publisher-Subscriber mode in this way selects a single subscriber that is currently idle, dequeues the message, and passes it to that subscriber's message-handling method.
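The two delivery mechanisms can be contrasted in a few lines. This is only a sketch: the class name SubscriberDispatchSketch is made up, and round-robin selection stands in for "pick the subscriber that is currently idle", which a real channel would track itself.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class SubscriberDispatchSketch {

    private final List<Consumer<String>> subscribers = new ArrayList<>();
    private int next = 0; // round-robin cursor, a stand-in for idle-subscriber tracking

    public void subscribe(Consumer<String> subscriber) {
        subscribers.add(subscriber);
    }

    /** Broadcast: every subscriber receives the message. */
    public void broadcast(String message) {
        // Strings are immutable, so sharing one instance is safe here;
        // a mutable message object would have to be copied per subscriber.
        for (Consumer<String> subscriber : subscribers) {
            subscriber.accept(message);
        }
    }

    /** Preemption: exactly one subscriber handles each message. */
    public void deliverToOne(String message) {
        if (subscribers.isEmpty()) {
            return;
        }
        Consumer<String> chosen = subscribers.get(next % subscribers.size());
        next++;
        chosen.accept(message);
    }

    public static void main(String[] args) {
        SubscriberDispatchSketch channel = new SubscriberDispatchSketch();
        channel.subscribe(m -> System.out.println("A got " + m));
        channel.subscribe(m -> System.out.println("B got " + m));
        channel.broadcast("customer-42"); // both A and B receive it
        channel.deliverToOne("job-1");    // only A receives it
        channel.deliverToOne("job-2");    // only B receives it
    }
}
```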
Many message middleware products now support the publisher-subscriber pattern. The JMS interface specification, for example, provides the TopicPublisher and TopicSubscriber interfaces for Topic objects, and RabbitMQ provides its own implementation of the pattern. Microsoft's MSMQ introduces an event mechanism that can trigger events to notify subscribers when the queue receives a message, but this is not an implementation of the publisher-subscriber pattern in the strict sense. NServiceBus, whose main contributor is Microsoft MVP Udi Dahan, wraps MSMQ and WCF in an additional layer and implements the pattern well.
2.3 Message Routing (Message Router) mode
Whether in the message channel mode or the Publisher-Subscriber mode, the queue plays a pivotal role. In enterprise applications, however, as systems grow more complex and performance requirements rise, a system may need to support several queues at once, possibly deployed in a distributed fashion. These queues can be defined to receive different messages, such as order processing messages, log messages, query task messages, and so on. At that point it is no longer appropriate for the producers and consumers of messages to decide the message delivery path. By the single responsibility principle, such an assignment is unreasonable: it hinders the reuse of business logic and couples producers, consumers, and message queues together, which affects the extensibility of the system.
Since none of these three kinds of objects (components) is suited to this duty, a new object must be introduced that is dedicated to selecting the delivery path. This is the Message Router mode.
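A dedicated routing object can be sketched as follows, assuming that the delivery path is selected by a string routing key and that each queue is an in-memory java.util.Queue. The class name MessageRouterSketch and the keys "order" and "log" are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

public class MessageRouterSketch {

    // Routing rules: routing key -> target queue.
    private final Map<String, Queue<String>> routes = new HashMap<>();

    /** Binds a queue to a routing key; a later binding for the same key replaces it. */
    public void bind(String routingKey, Queue<String> queue) {
        routes.put(routingKey, queue);
    }

    /** Delivers the message to the queue bound to the key; false if no rule matches. */
    public boolean route(String routingKey, String message) {
        Queue<String> target = routes.get(routingKey);
        if (target == null) {
            return false;
        }
        return target.offer(message);
    }

    public static void main(String[] args) {
        MessageRouterSketch router = new MessageRouterSketch();
        Queue<String> orders = new ArrayDeque<>();
        Queue<String> logs = new ArrayDeque<>();
        router.bind("order", orders);
        router.bind("log", logs);

        router.route("order", "order-1001");
        router.route("log", "service started");
        System.out.println(orders + " " + logs);
    }
}
```

Producers only supply a key and a message, so queues can be added or rebound by changing the routing rules alone.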
With message routing, we can configure routing rules that specify the delivery path of a message and the specific consumer that corresponds to a given producer. For example, a routing key can be specified that binds a particular queue to a particular producer (or consumer). Routing support makes message passing and processing more flexible and helps improve the message processing capacity of the whole system. At the same time, the routing object encapsulates the logic of finding and matching message paths, acting like a mediator that coordinates messages, queues, and path addressing.
3 Application-level protocols
The goal of remote service communication is to issue a request on one computer, have another machine process the request after receiving it, and return the result to the requester. Requests may be one-way, synchronous, asynchronous, and so on. According to the principles of network communication, the request must be converted into a stream and sent to the remote end over a transport protocol. The remote computer processes the request once it has received the stream, converts the result into a stream, and returns it to the caller over the transport protocol.
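The request-to-stream-to-result cycle can be illustrated without a network. In the sketch below, in-memory byte arrays stand in for the socket streams (with a real connection they would come from Socket.getInputStream() and Socket.getOutputStream()), and the wire layout of operation name plus two ints is an assumption made for this example, as is the class name StreamRequestSketch.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class StreamRequestSketch {

    /** Caller side: convert the request into a byte stream. */
    public static byte[] encodeRequest(String operation, int a, int b) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeUTF(operation);
            out.writeInt(a);
            out.writeInt(b);
        }
        return bytes.toByteArray();
    }

    /** Remote side: read the request stream, process it, write the result as a stream. */
    public static void handle(InputStream requestStream, OutputStream responseStream)
            throws IOException {
        DataInputStream in = new DataInputStream(requestStream);
        DataOutputStream out = new DataOutputStream(responseStream);
        String operation = in.readUTF();
        int a = in.readInt();
        int b = in.readInt();
        int result;
        switch (operation) {
            case "add": result = a + b; break;
            case "multiply": result = a * b; break;
            default: throw new IOException("unknown operation: " + operation);
        }
        out.writeInt(result);
        out.flush();
    }

    /** Full round trip; the byte arrays play the role of the transport. */
    public static int call(String operation, int a, int b) throws IOException {
        byte[] request = encodeRequest(operation, a, b);
        ByteArrayOutputStream response = new ByteArrayOutputStream();
        handle(new ByteArrayInputStream(request), response);
        return new DataInputStream(new ByteArrayInputStream(response.toByteArray())).readInt();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(call("add", 2, 3)); // prints 5
    }
}
```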
That is the principle, but for the convenience of applications the industry has introduced many application-level protocols built on it, so that developers do not have to work directly at such a low level. An application-level remote communication protocol usually provides:
So when learning about an application-level remote communication protocol, we can keep these questions in mind:
Application-level remote communication protocols do not improve much on the transport protocol itself. Their main contribution is in stream handling: they make the way the application layer generates and processes streams fit the language or standard in use more closely, while the transport protocol is usually a matter of choice. Well-known examples in the Java world are RMI, XML-RPC, Binary-RPC, SOAP, CORBA, JMS, and HTTP. The following sections look at these application-level remote communication protocols in detail.
3.1 RMI (Remote Method Invocation)
RMI is a typical remote communication protocol designed specifically for Java. Within a single VM, we communicate by invoking methods on Java object instances directly, so it would be ideal if remote communication could work the same way. This kind of remote communication mechanism is called RPC (Remote Procedure Call), and RMI was created for exactly this purpose.
RMI uses stubs and skeletons to communicate with remote objects. The stub acts as the client-side proxy for the remote object and exposes the same remote interface, so a call to the remote object is actually made by invoking the stub. Through the stub, RMI makes a remote call look like a local one: using TCP/IP, the client directly invokes methods on the server. The advantage is strong typing, so errors can be caught at compile time. The disadvantages are that it works only with the Java language and that the client and server are tightly coupled.
Consider the principle of a complete remote communication process based on RMI:
- The client initiates the request, which is handed to the stub class on the RMI client side;
- The stub class serializes the requested interface, method, parameters, and other information;
- The serialized stream is transmitted to the server over a socket;
- The server receives the stream and forwards it to the corresponding skeleton class;
- The skeleton class deserializes the request information and invokes the actual processing class;
- When the processing class has finished, it returns the result to the skeleton class;
- The skeleton class serializes the result and sends it to the client stub over the socket;
- The stub deserializes the stream it receives and returns the resulting Java object to the caller.
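The steps above can be imitated in a single process. The sketch below is not the java.rmi API: it uses a java.lang.reflect.Proxy as the stub, Java serialization for the request and result, and a direct method call in place of the socket. The names StubSkeletonSketch, Greeter, and GreeterImpl are invented for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

public class StubSkeletonSketch {

    /** The remote interface shared by client and server. */
    public interface Greeter {
        String greet(String name);
    }

    /** The actual processing class on the server side. */
    public static class GreeterImpl implements Greeter {
        public String greet(String name) {
            return "Hello, " + name;
        }
    }

    static byte[] serialize(Object... values) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            for (Object value : values) {
                out.writeObject(value);
            }
        }
        return bytes.toByteArray();
    }

    /** Skeleton: deserialize the request, invoke the target, serialize the result. */
    static byte[] skeleton(Class<?> iface, Object target, byte[] request) throws Exception {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(request))) {
            String methodName = (String) in.readObject();
            Class<?>[] parameterTypes = (Class<?>[]) in.readObject();
            Object[] args = (Object[]) in.readObject();
            Method method = iface.getMethod(methodName, parameterTypes);
            Object result = method.invoke(target, args);
            return serialize(result);
        }
    }

    /** Stub: a client-side proxy that serializes each call and decodes the reply. */
    public static <T> T stub(Class<T> iface, Object target) {
        InvocationHandler handler = (proxy, method, args) -> {
            byte[] request = serialize(method.getName(), method.getParameterTypes(), args);
            // With real RMI the bytes would cross a socket; here the skeleton is called directly.
            byte[] response = skeleton(iface, target, request);
            try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(response))) {
                return in.readObject();
            }
        };
        return iface.cast(
                Proxy.newProxyInstance(iface.getClassLoader(), new Class<?>[] {iface}, handler));
    }

    public static void main(String[] args) {
        Greeter greeter = stub(Greeter.class, new GreeterImpl());
        System.out.println(greeter.greet("RMI")); // prints Hello, RMI
    }
}
```

The caller only ever sees the Greeter interface, which is what gives the stub approach its compile-time type checking.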
Follow the principles to answer a few questions before learning the application-level protocol:
RPC here uses a client/server (C/S) model over HTTP: the client sends a request to the server and waits for the server to return the result. The request consists of a piece of text, usually of the form "classname.methodname", and a set of parameters. The advantage is that it works across languages and platforms, and the client and server are more independent of each other. The disadvantage is that objects are not supported and errors cannot be checked at compile time, only at run time.
XML-RPC is a remote invocation protocol similar to RMI. It differs from RMI in that it describes the request (the requested object, method, parameters, and so on) in a standard XML format. The benefit is that it can be used for communication across languages.
Take a look at a remote communication process for the XML-RPC protocol:
- The client initiates the request and fills in the request information according to the XML-RPC protocol;
- Once filled in, the XML is converted into a stream and sent over the transport protocol;
- The receiving side converts the stream back into XML, extracts the request information according to the XML-RPC protocol, and processes it;
- After processing, the result is written into XML according to the XML-RPC protocol and returned.
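The steps above can be sketched with the JDK's built-in XML parser. The element names follow the XML-RPC layout (methodCall, methodName, params, param, value, int), but the method name Calculator.add, the class name XmlRpcSketch, and the restriction to int parameters are assumptions for the example. No HTTP transport is involved, and the method name is not XML-escaped.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

public class XmlRpcSketch {

    /** Client side: fill in the request according to the XML-RPC layout. */
    public static String buildRequest(String methodName, int... params) {
        StringBuilder xml = new StringBuilder("<?xml version=\"1.0\"?><methodCall>");
        xml.append("<methodName>").append(methodName).append("</methodName><params>");
        for (int param : params) {
            xml.append("<param><value><int>").append(param).append("</int></value></param>");
        }
        return xml.append("</params></methodCall>").toString();
    }

    /** Server side: parse the XML, run the requested method, write the XML response. */
    public static String handle(String requestXml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(requestXml.getBytes(StandardCharsets.UTF_8)));
        String method = doc.getElementsByTagName("methodName").item(0).getTextContent();
        if (!"Calculator.add".equals(method)) { // the only method this sketch knows
            throw new IllegalArgumentException("unknown method: " + method);
        }
        NodeList ints = doc.getElementsByTagName("int");
        int sum = 0;
        for (int i = 0; i < ints.getLength(); i++) {
            sum += Integer.parseInt(ints.item(i).getTextContent().trim());
        }
        return "<?xml version=\"1.0\"?><methodResponse><params><param><value><int>"
                + sum + "</int></value></param></params></methodResponse>";
    }

    public static void main(String[] args) throws Exception {
        String request = buildRequest("Calculator.add", 2, 3);
        System.out.println(request);
        System.out.println(handle(request));
    }
}
```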
To answer the question:
Binary-RPC, as the name suggests, is similar to XML-RPC. The only difference is that the standard transmission format changes from XML to binary.
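To give a feel for the difference in format, the sketch below encodes the same hypothetical call, Calculator.add(2, 3), once as XML-RPC-style text and once in a simple binary layout written with DataOutputStream. The binary layout is an assumption for illustration only and is not the wire format of Hessian or any other Binary-RPC library.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

public class EncodingSizeSketch {

    /** The call as XML-RPC-style text. */
    public static byte[] asXml(String method, int a, int b) {
        String xml = "<methodCall><methodName>" + method + "</methodName><params>"
                + "<param><value><int>" + a + "</int></value></param>"
                + "<param><value><int>" + b + "</int></value></param>"
                + "</params></methodCall>";
        return xml.getBytes(StandardCharsets.UTF_8);
    }

    /** The same call in a made-up binary layout: length-prefixed name, then two 4-byte ints. */
    public static byte[] asBinary(String method, int a, int b) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeUTF(method);
            out.writeInt(a);
            out.writeInt(b);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        System.out.println("XML bytes:    " + asXml("Calculator.add", 2, 3).length);
        System.out.println("binary bytes: " + asBinary("Calculator.add", 2, 3).length); // 24
    }
}
```

The binary form is far smaller for this call, at the cost of not being human-readable.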
To answer the question:
SOAP originally stood for Simple Object Access Protocol. It is a lightweight, XML-based protocol for exchanging information in a distributed environment. SOAP can be thought of as an advanced version of XML-RPC: the principle of the two is identical (HTTP + XML), and the only difference is the XML specification each defines. SOAP is also the service invocation protocol standard adopted by Web Service, so it is not elaborated further here.
A Web Service provides services from a web container, with HTTP as the underlying protocol. It resembles a remote service provider, such as a weather forecast service that supplies forecasts to clients in various locations. It is a request-response mechanism that works across systems and platforms, and the service is exposed through a servlet.
First, the client obtains the Web Service's WSDL from the server and generates a proxy class on the client side, which is responsible for requests to and responses from the Web Service server. When data (in XML format) is wrapped in a SOAP message and sent to the server as a stream, the server creates a processing object that parses the incoming SOAP packet, handles the request, and wraps the result in SOAP when processing ends. The packet is then sent as the response to the client proxy class, which likewise parses the SOAP packet before carrying out subsequent operations. This is how a Web Service runs.
A Web Service can be broadly divided into five layers:
- HTTP as the transport channel;
- XML as the data format;
- SOAP as the encapsulation format;
- WSDL as the description format;
- UDDI, a directory service that enterprises can use to register and search for Web Services.
JMS is another means of implementing remote communication in the Java world. Remote communication based on JMS differs from RPC: although the effect of RPC can be achieved, JMS does not define it at the protocol level, so we do not consider JMS an RPC protocol. It is nevertheless a remote communication protocol. Other language ecosystems have mechanisms similar to JMS, and together they can be called message mechanisms. Message mechanisms are usually the recommended way to communicate in high-concurrency, distributed settings, and the main issue to consider is fault tolerance.
JMS is the Java Message Service, which allows JMS clients to exchange messages asynchronously through a JMS provider. JMS supports two message models: Point-to-Point and Publish/Subscribe (Pub/Sub).
Consider the process of a remote communication in JMS:
- The client converts the request into a JMS-compliant Message;
- The Message is put into a JMS Queue or Topic through the JMS API;
- For a JMS Queue, the Message is sent to the corresponding target Queue; for a Topic, it is sent to the JMS Queues subscribed to that Topic;
- The processing side obtains the message by polling the JMS Queue, then parses and processes it according to the JMS protocol.
To answer the question:
JMS is also one of the common ways of implementing remote asynchronous calls.
4 Differences
4.1 RPC and RMI
RMI transfers serializable Java objects over TCP. It can only be used on Java virtual machines and is bound to the language: both the client and the server must be Java. WebService has no such limitation; it transfers XML text over HTTP, independent of language and platform.
4.4 WebService and JMS
WebService focuses on remote service invocation, and JMS focuses on information exchange.
In most cases WebService is a direct interaction between two systems (Consumer and Producer), whereas JMS is usually a three-party interaction (Consumer, Broker, and Producer). JMS can of course also implement request-response communication, as long as the consumer or the producer also acts as the broker.
JMS can make asynchronous calls that completely isolate the client from the service provider and help absorb traffic peaks. WebService calls are usually synchronous and require complex object conversion. Compared with SOAP, JSON and REST are now good HTTP-based alternatives.
JMS is a messaging specification for the Java platform. A typical JMS message is not XML but a Java object, so JMS clearly does not take heterogeneous systems into account. Frankly, JMS does not consider anything that is not Java. Fortunately, most JMS providers (that is, the various implementations of JMS) solve the heterogeneity problem themselves. This is different from the cross-platform nature of WebService.
5 Optional Implementation Technology
The frameworks and libraries currently available in the Java world for implementing remote communication include JBoss-Remoting, Spring-Remoting, Hessian, Burlap, XFire (Axis), ActiveMQ, Mina, Mule, EJB3, and others. Each is briefly introduced and evaluated here. To build a distributed service framework, you really need a deep understanding of all of them, because a distributed service framework has to solve problems in both the distributed domain and the application layer.
Of course, you can also implement your own communication framework or library based on the principles of remote network communication (transport protocol + network IO).
So what questions should we keep in mind when learning about these remote communication frameworks or libraries?
Spring-Remoting is the remote communication framework for the Java world provided by Spring. It makes it easy to publish an ordinary Spring bean over a remote protocol, and equally easy to configure a Spring bean as a client-side bean that makes remote calls.
Hessian is a remote communication library provided by Caucho, implemented on the basis of Binary-RPC.
Burlap is also provided by Caucho, which differs from Hessian in that it is based on the XML-RPC protocol.
XFire and Axis are WebService implementation frameworks. WebService can be regarded as a complete implementation standard for SOA architecture, so using XFire or Axis also means adopting the WebService approach.
ActiveMQ is an implementation of JMS. Remote communication based on a message mechanism such as JMS is a good choice: the message mechanism itself makes it easy to implement synchronous, asynchronous, and one-way invocation, and it also holds up well in terms of fault tolerance, which is an important basis for Erlang's fault tolerance.
Mina is a communication framework provided by Apache. Network IO has not been discussed so far: the frameworks and libraries mentioned above are basically BIO-based, whereas Mina uses NIO, which delivers a significant performance gain over BIO as concurrency increases. The improvement in Java performance has a good deal to do with how closely its NIO support is integrated with the OS.
Because Mina takes the NIO approach, it is no surprise that it supports asynchronous invocation.
6 Development and status of RPC frameworks
RPC (Remote Procedure Call) is a remote invocation protocol. Put simply, it allows an application to invoke a remote procedure or service as if it were a local method, and it applies to many scenarios, such as distributed services, distributed computing, and remote service invocation. RPC is familiar to everyone, and the industry has many good open source RPC frameworks, such as Dubbo, Thrift, gRPC, and Hprose. The following briefly introduces the characteristics of RPC compared with other common forms of remote invocation, as well as some excellent open source RPC frameworks.
RPC compared with other remote calls: RPC, HTTP, RMI, and Web Service can all complete a remote invocation, but their implementations and focus differ.
6.1 RPC with HTTP
HTTP (Hypertext Transfer Protocol) is an application-layer communication protocol that uses standard semantics to access a specified resource (a picture, an interface, and so on), and intermediary servers in the network can understand the protocol content. HTTP is a resource access protocol: a remote request can be completed, and its result returned, over HTTP.
The advantages of HTTP are that it is simple, easy to use, easy to understand, and language-independent, and it is widely used for remote service invocation, including at Weibo. The disadvantages are that the protocol headers are relatively heavy and that the path from a request to the specific server is usually longer, possibly involving DNS resolution, an Nginx proxy, and so on.
RPC is a protocol specification. HTTP can be regarded as one implementation of RPC, or used as the transport protocol for an RPC framework. RPC frameworks offer highly automated services, strong service governance capabilities, friendlier integration with the programming language, and excellent performance. Compared with HTTP, the disadvantage of RPC is that it is more complex and the learning cost is slightly higher.
6.2 RPC and RMI
RMI (Remote Method Invocation) refers to remote method invocation specific to the Java language. Each method in RMI has a method signature, and the RMI client and server make remote method calls through that signature. RMI can be used only with Java, and it can be regarded as an object-oriented Java RPC.
6.3 RPC with Web Service
Web Service is a web-based architectural approach to publishing, querying, and invoking services, with a focus on the management and use of services. A Web Service typically describes the service with WSDL and uses SOAP over HTTP to invoke it.
RPC is a remote access protocol, whereas Web Service is an architecture. A Web Service can also make service calls through RPC, so Web Service is more appropriately compared with an RPC framework. When an RPC framework provides service discovery and management and uses HTTP as the transport protocol, it is in effect a Web Service.
Compared with Web Service, an RPC framework provides finer-grained governance of services, including traffic control, SLA management, and more, which gives it a greater advantage in microservices and distributed computing.
RPC can be based on HTTP or TCP. Web Service is an HTTP-based RPC: it has good cross-platform support, but its performance is inferior to TCP-based RPC. Two factors directly affect RPC performance: the transport and the serialization.
As is well known, TCP is a transport layer protocol and HTTP is an application layer protocol. The transport layer sits below the application layer, and for data transmission a lower layer carries less overhead, so in general communicating directly over TCP is faster than over HTTP.
7 Summary
Remote communication involves a good many areas of knowledge, for example: communication protocols (Socket/TCP/HTTP/UDP/RMI/XML-RPC, etc.); message mechanisms; network IO (BIO/NIO/AIO); multithreading; making local and remote invocation transparent (involving the Java ClassLoader, dynamic proxies, unit testing, etc.); asynchronous and synchronous calls; network communication handling (automatic reconnection, broadcast, exceptions, pooling, etc.); Java serialization (including the private serialization mechanisms of various protocols); and the implementation principles of the various frameworks (the transmission format, how the transmission format is converted into a stream, how request information is converted into the transmission format, how the stream is received, how the stream is restored to the transmission format, and so on). Which of these to master depends on actual needs, but only with an understanding of the principles can you choose easily among the options, or even design a private remote communication protocol to meet your requirements. For anyone developing a distributed service platform or a large-scale distributed application, I think at least the points mentioned above need to be well understood.
Java Remote communication technology and principle analysis