In today's application architectures, communication between distributed applications and services is a core concern. To benefit from distribution, you must keep several basic principles in mind; otherwise, you can easily run into performance and scalability problems. These problems rarely surface during development, but under load testing or in production you may discover that the chosen software architecture cannot meet the performance and scalability requirements. In this article, we focus on the key points to remember when building distributed applications.
Distributed applications need to interact with each other, in scenarios ranging from simple point-to-point interaction to large-scale cluster architectures and dynamic service-oriented or service-based architectures. Communication across system boundaries is also key to improving the scalability and availability of software systems. Software architecture today treats distribution as a core concept, and the Java platform plays a central role because of its strong API and product support for distributed computing. Application scenarios range from integration with standard software such as SAP to internal or external service integration. SOA offers a way to make services and applications flexible and reusable, so that they can respond quickly to new market demands.

In addition, the trend toward grid computing, virtual machines, and multi-core blade servers has produced more and more clustered applications, driven mainly by the pursuit of high scalability and high availability. The rise of cloud computing indicates that distributed platforms will become even more popular in the future. Systems are also becoming more dynamic to increase their flexibility, for example by adding an application node at runtime. These trends make system structures increasingly complex, and it becomes harder for developers to understand how service calls are actually carried out in a product. This complexity and lack of insight can easily lead to increased resource consumption (CPU, memory, and network) and reduced performance.
The demon behind the mask
Nowadays, remoting technology makes distributed applications easier to implement. The details of the underlying communication and of the server and client infrastructure are transparent to developers. Today, exposing a Java class as a service requires only adding an annotation to the class, and the service can be accessed just as easily through a proxy generated by tooling. As Figure 1 shows, however, this is only the tip of the iceberg.
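The client-side proxy idea can be sketched with the JDK's `java.lang.reflect.Proxy`. This is only a simplified stand-in for tool-generated stubs: the `GreetingService` interface is hypothetical, and the invocation handler fakes the remote round trip locally where a real stub would serialize the call and send it over the network.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

public class ProxyDemo {
    // Hypothetical service interface; real tooling would generate this
    // from a WSDL document or an annotated service class.
    public interface GreetingService {
        String greet(String name);
    }

    // The handler stands in for the remoting layer: a real stub would
    // serialize the call, transmit it over TCP/IP, and deserialize the reply.
    public static GreetingService clientProxy() {
        InvocationHandler handler = (Object proxy, Method method, Object[] args) -> {
            // Fake the remote round trip with a local computation.
            return "Hello, " + args[0];
        };
        return (GreetingService) Proxy.newProxyInstance(
                GreetingService.class.getClassLoader(),
                new Class<?>[] { GreetingService.class },
                handler);
    }

    public static void main(String[] args) {
        // The caller sees only the interface; the remoting details stay hidden.
        System.out.println(clientProxy().greet("world"));
    }
}
```

The transparency is exactly what makes remoting convenient, and exactly why the costs hidden behind the proxy are so easy to overlook.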
Figure 1. Remote protocol upper-layer architecture
The core of the remoting stack is the serialization and wire formatting of objects. Application developers generally do not need to know about this layer, yet it is the cause of many performance problems. Inefficient serialization means that a great deal of unnecessary data is transmitted over the network, and complex object graphs with large payloads drive up CPU and memory usage during serialization and deserialization. The underlying infrastructure and its configuration likewise have a great impact on application performance.

On the client side, connection management and the underlying threading model matter. The guidelines for using connections in distributed applications resemble those for database connections: establishing a connection takes time, though how much depends on the protocol; establishing an HTTPS connection, for example, costs more than a plain TCP/IP connection. Connections are also an important system resource, so using a connection pool is essential. Correct configuration is critical here, because a wrong configuration does more harm than good. The threading model determines how requests are processed; the key question is whether a request is handled synchronously or asynchronously. Synchronous communication blocks a thread until the corresponding response arrives; in asynchronous communication, a callback is invoked when the response is received, freeing the thread for other work in the meantime. On the server side, the number of available worker threads defines the maximum number of service requests processed in parallel.

The network itself is also an important component of distributed applications. It is a bottleneck resource that limits scalability even more than performance, and it is often ignored during development because no real network is involved at that stage.
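The serialization cost described above can be observed directly with Java's built-in serialization. The sketch below, with hypothetical DTO classes, compares the wire size of a verbose object graph against a lean one carrying only what the caller needs; the exact byte counts will vary by JVM, but the gap is the point.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SerializationCost {
    // A deliberately verbose DTO: every remote call drags all fields across the wire.
    static class VerboseCustomer implements Serializable {
        String firstName = "Jane", lastName = "Doe", street = "1 Main St",
               city = "Springfield", country = "US", notes = "a long free-text note";
        int[] orderHistory = new int[100]; // rarely needed by the caller
    }

    // A lean DTO carrying only what the remote caller actually needs.
    static class LeanCustomer implements Serializable {
        String displayName = "Jane Doe";
    }

    // Measure how many bytes an object occupies in serialized form.
    public static int serializedSize(Serializable obj) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        return bytes.size();
    }

    public static void main(String[] args) throws IOException {
        System.out.println("verbose: " + serializedSize(new VerboseCustomer()) + " bytes");
        System.out.println("lean:    " + serializedSize(new LeanCustomer()) + " bytes");
    }
}
```

Trimming the transfer object is often the cheapest remoting optimization available, because it reduces network traffic and serialization CPU cost at the same time.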
The beauty of remote calls lies in...
There are many options: Java provides many technologies for implementing distributed applications, and the choice of remoting technology has an important impact on application architecture, performance, and scalability. RMI is the oldest, and still probably the most widely used, remoting protocol (see Figure 2).
Figure 2. RMI Architecture
RMI is the standard remoting protocol for J2EE applications. As its name implies, it is designed to invoke methods on objects hosted in a remote Java virtual machine. When an object is exposed on the server, a client can call it through a proxy. The same server object is used by multiple threads, with the thread pool managed by the RMI infrastructure. Communication runs over TCP/IP, using either JRMP or IIOP/GIOP (CORBA) as the RMI wire protocol; application servers also provide their own proprietary protocols to optimize performance. Because object references held on the server must be managed, the RMI infrastructure provides a distributed garbage collector (DGC), which itself uses the RMI protocol to manage the lifecycle of server-side objects. Beyond the plain client and server roles, RMI has several other implementations. For more information about RMI and its use, see the earlier article "Using RMI to implement Java-based distributed computing".
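A minimal end-to-end RMI round trip can be run inside a single JVM by starting the registry in-process, which keeps the sketch self-contained; the `Echo` interface and the port 2099 are illustrative choices, and in a real deployment the registry and server would run in separate processes.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;
import java.rmi.server.UnicastRemoteObject;

public class RmiDemo {
    // Remote interface: every method must declare RemoteException.
    public interface Echo extends Remote {
        String echo(String msg) throws RemoteException;
    }

    static class EchoImpl implements Echo {
        public String echo(String msg) { return "echo:" + msg; }
    }

    public static String runRoundTrip() throws Exception {
        // Start a registry inside this JVM (normally rmiregistry is a separate process).
        Registry registry = LocateRegistry.createRegistry(2099); // arbitrary free port
        EchoImpl impl = new EchoImpl();
        Echo stub = (Echo) UnicastRemoteObject.exportObject(impl, 0);
        registry.rebind("echo", stub);

        // Client side: look up the stub and invoke it. The call still travels
        // over JRMP/TCP even though both ends live in one process here.
        Echo client = (Echo) LocateRegistry.getRegistry(2099).lookup("echo");
        String result = client.echo("hi");

        // Unexport everything so the JVM can exit cleanly.
        UnicastRemoteObject.unexportObject(impl, true);
        UnicastRemoteObject.unexportObject(registry, true);
        return result;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(runRoundTrip());
    }
}
```

Note how little of the networking is visible in the calling code: `client.echo("hi")` looks like a local call, which is precisely where the hidden serialization and connection costs discussed earlier come in.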
RMI supports only synchronous communication, with the drawbacks discussed above. In addition, because it is a binary protocol, no low-level caching can be provided for data-driven services. Developers and architects can adjust infrastructure configuration parameters to optimize performance. JMS is the second major remoting protocol on the J2EE platform (see Figure 3).
Figure 3. JMS Architecture
Unlike RMI, JMS is an asynchronous protocol: communication is queue-based, and listeners react to incoming messages. JMS is not a classic remote-call protocol, but it can still support interaction between services; many of the most important ESB implementations in the SOA world use JMS-based middleware to carry messages between services. Because JMS is asynchronous, it avoids some typical problems of synchronous communication. In many systems, the key to high scalability is the ability to release resources (such as threads) quickly, and in many cases asynchronous processing is the only suitable approach. JMS supports many different transport formats; XML is the most common message format, but binary formats are also possible. The design of the message structure is an important part of the application architecture, because it directly affects the application's performance and scalability.
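The queue-plus-listener pattern can be simulated with plain `java.util.concurrent`, which keeps the sketch runnable without a JMS broker. This is explicitly not the JMS API: the `BlockingQueue` stands in for a JMS queue and the listener thread for a `MessageListener`, but the shape of the interaction, fire-and-forget producer and callback-style consumer, is the same.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

public class AsyncQueueDemo {
    public static String sendAndListen() throws Exception {
        // Stand-in for a JMS queue; a real broker would persist and route messages.
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(10);
        CompletableFuture<String> reply = new CompletableFuture<>();

        // Listener thread: reacts to messages as they arrive,
        // analogous to a JMS MessageListener.onMessage callback.
        Thread listener = new Thread(() -> {
            try {
                String msg = queue.poll(5, TimeUnit.SECONDS);
                reply.complete("processed:" + msg); // callback-style completion
            } catch (InterruptedException e) {
                reply.completeExceptionally(e);
            }
        });
        listener.start();

        // Producer: fire-and-forget; the sending thread is free immediately
        // instead of blocking until the message is processed.
        queue.put("order-42");

        return reply.get(5, TimeUnit.SECONDS);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(sendAndListen());
    }
}
```

The producer never waits on the consumer, which is exactly the resource-release property that makes asynchronous designs scale: the sending thread is back in the pool while the message is still in flight.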
SOAP-based web services (see Figure 4) and the related WS-* standards have become increasingly important in the Java enterprise application field.
Figure 4. synchronous and asynchronous SOAP Architecture
SOAP was designed as a replacement for CORBA and received strong industry support from the start. Thanks to the joint efforts of the WS-I, different platforms can interoperate easily. SOAP is an XML-based RPC protocol, so it is easily associated with wasted bandwidth.