Why is Message Queuing (MQ) required?
##########################################################################################
The main reason is that in a high-concurrency environment, synchronous processing cannot keep up and requests pile up. For example, a flood of insert and update requests hitting MySQL directly produces countless row and table locks, and the backlog of requests eventually triggers "too many connections" errors. By using message queuing, we can handle requests asynchronously and relieve the pressure on the system.
##########################################################################################
Leslie Lamport, the American computer scientist and author of LaTeX, said: "A distributed system is one in which the failure of a computer you didn't even know existed can render your own computer unusable." The implication is that developing a distributed system is complex and hard to control. So Martin Fowler stressed that the first law of distributed design is not to distribute. This sounds rather philosophical, but for enterprise application systems the rule is bound to be broken as soon as the overall system evolves into multiple coexisting subsystems, because in today's enterprise applications it is hard to find a scenario that requires no distributed calls at all. Fowler's principle, on the one hand, asks designers to be cautious about distributed calls; on the other hand, it reflects a flaw of distributed systems themselves. Whether CORBA or EJB 2, an RPC platform or a Web Service, distributed components reside in different process spaces, which introduces additional complexity and can negatively affect the efficiency, reliability, and predictability of the system.
However, it is undeniable that in enterprise application systems we constantly face communication and integration between different systems, especially heterogeneous ones, and it is precisely there that distributed calls and messaging show their value in architectural design. Moreover, from the perspective of business analysis and architectural quality, we also want the architecture to reuse services as much as possible and to fully decouple the client from the server by running services in their own processes. This is often the inevitable path of architectural evolution. In the article "Corruption of Architecture" published on InfoQ, my colleague Chen Jinzhou argues that the problem of an architecture being corrupted as the codebase grows can be addressed by "putting separate modules into separate processes".
As network infrastructure has matured, from RPC to Web Services, from the industry's embrace of SOA to later RESTful platforms and the PaaS and SaaS concepts of cloud computing, distributed architectures have taken on different forms in enterprise applications. Their goal, however, remains the same: to return to the era of building the Tower of Babel, where communication between systems is no longer blocked by the barriers of different languages and platforms. As Martin Fowler wrote in "Enterprise Integration Patterns", "Integration is important because independent applications are lifeless. We need a technology that integrates applications that were not designed with interoperation in mind, breaking the gap between them and gaining more benefit than any single application could provide." This is perhaps the main reason distributed architectures exist.
1. Message Patterns in Integration
In the final analysis, an enterprise application system is about processing data, and for an enterprise system made up of multiple subsystems, its basic underpinning is undoubtedly the processing of messages. Unlike an object, a message is essentially a data structure (of course, an object can also be seen as a special message) containing data that both the consumer and the service can recognize; it needs to be passed between different processes (machines) and may be consumed by a number of completely different clients. Among the many distributed technologies, message passing compares favorably with file transfer and remote procedure call (RPC) because of its better platform independence and its support for concurrent and asynchronous calls. Web Services and RESTful services can be seen as derivatives or encapsulations of messaging technology. In "Pattern-Oriented Software Architecture (Volume 4)", the messaging pattern is classified as part of the distributed infrastructure, because the emergence of many message middleware products means that functions developers once had to implement themselves can now be reused directly. This greatly reduces development cost, both design cost and implementation cost. The architect's job therefore shifts from design and implementation to judging business scenarios and functional requirements, so as to make the right architectural decisions, technology selections and pattern applications.
Common Message Patterns
All of the enterprise applications I have participated in adopted a message-based distributed architecture, without exception (or adopted it partially, in some subsystems and modules). What differed was the evidence on which we based our architectural decisions, and that evidence directly affects which message patterns we apply.
Message Channel Pattern
The message pattern we often use is the message channel pattern, as shown in Figure 1.
Figure 1 Message channel pattern (image from Eaipatterns)
The message channel, as an indirection layer introduced between the client (consumer) and the service (producer), effectively reduces the coupling between them. As long as both parties agree on the message format they exchange, and on the mechanism and timing for processing messages, the consumer can remain "ignorant" of the producer. In fact, the pattern supports multiple producers and consumers. For example, we can have multiple producers send messages to the message channel; because consumers know nothing about producers, they do not have to care which producer sent a given message.
While the message channel reduces the coupling between producer and consumer and lets us add producers and consumers at will, it introduces a dependency of both on the channel itself, because they must know where the channel resource is located. To remove this dependency on the channel, consider introducing a lookup service to find the channel resource; in JMS, for example, the message channel (queue) can be obtained through JNDI. For full flexibility, channel-related information can be stored in a configuration file, and the lookup service obtains the channel by first reading that file.
The message channel usually exists as a queue, and this first-in, first-out data structure is undoubtedly best suited to this message-processing scenario. Microsoft's MSMQ, IBM MQ, JBoss MQ, and the open-source RabbitMQ and Apache ActiveMQ all implement the message channel pattern with a queue. When choosing the message channel pattern, therefore, the various products that implement it need to be analyzed and weighed in terms of quality attributes: how well the channel supports concurrency and performance, whether it adequately handles errors, whether it supports message security, message persistence, failover and clustering, and so on. Because the messages passed through a channel are often important business data, a channel that becomes a point of failure or a security breach can have a disastrous impact on the system. In the second part of this article, I will use a practical case to illustrate the architectural factors that should be considered when making such decisions.
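As a minimal, in-process sketch of the idea (not the implementation of any of the products above; all type names are invented), a message channel can be modeled as a FIFO queue that is the only thing producers and consumers share:

using System;
using System.Collections.Concurrent;

// A minimal in-process message channel: producers and consumers only share
// the channel abstraction, never references to each other.
public interface IMessageChannel
{
    void Send(string message);
    bool TryReceive(out string message);
}

public sealed class QueueChannel : IMessageChannel
{
    private readonly ConcurrentQueue<string> _queue = new ConcurrentQueue<string>();

    public void Send(string message) => _queue.Enqueue(message);                  // producer side
    public bool TryReceive(out string message) => _queue.TryDequeue(out message); // consumer side
}

public static class ChannelDemo
{
    public static void Main()
    {
        IMessageChannel channel = new QueueChannel();
        channel.Send("OrderCreated:1001");   // any producer can post here

        if (channel.TryReceive(out var msg)) // any consumer can drain here
            Console.WriteLine("Consumed: " + msg);
    }
}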
Publisher-Subscriber Pattern
Once the message channel needs to support multiple consumers, we face a choice between two models: the pull model and the push model. In the pull model, interaction is initiated by the consumer of the message; the initiative lies with the consumer, which calls the producer according to its own situation, as shown in Figure 2:
Figure 2 Pull Model
Another embodiment of the pull model is that the producer notifies the consumer when its state changes; the notified consumer then obtains further details in callback fashion, by invoking the consumer object that was passed in.
In a message-based distributed system, a pull-model consumer usually listens to the channel periodically, in the form of a batch job running at a predetermined interval. Once it finds that a message has been delivered, it passes the message on to the real processor (which can also be regarded as the consumer) to handle the message and execute the related business. The healthcare system introduced in the second part of this article implements such a batch job by introducing Quartz.NET to process the messages in the message channel.
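A rough sketch of such a pull-model consumer, not the actual healthcare-system code: a timer-driven job polls the channel at a fixed interval and hands any message it finds to the real processor. The types and the interval are assumptions.

using System;
using System.Collections.Concurrent;
using System.Threading;

// Pull model: a scheduled job wakes up on a fixed interval, drains the queue
// and hands each message to the real processor.
public sealed class PollingConsumer : IDisposable
{
    private readonly ConcurrentQueue<string> _channel;
    private readonly Action<string> _processor;
    private readonly Timer _timer;

    public PollingConsumer(ConcurrentQueue<string> channel, Action<string> processor, TimeSpan interval)
    {
        _channel = channel;
        _processor = processor;
        _timer = new Timer(_ => Poll(), null, TimeSpan.Zero, interval);
    }

    private void Poll()
    {
        // Drain everything that has arrived since the last tick.
        while (_channel.TryDequeue(out var message))
            _processor(message);   // e.g. execute the related business logic
    }

    public void Dispose() => _timer.Dispose();
}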
In the push model, the initiative usually lies with the producer, and the consumer passively waits for the producer's notification; this requires the producer to know the relevant information about its consumers, as shown in Figure 3:
Figure 3 Push model
In the push model, consumers do not need to know the producers. When the producer notifies a consumer, what is passed is usually the message (or event), not the producer itself. The producer can also register different consumers for different situations or, when encapsulating the notification logic, notify different consumers according to different state changes.
Both models have their advantages. The advantage of the pull model is that it further reduces the consumer's dependence on the channel, accessing the message channel periodically via a background task; the downside is the need to introduce a separate service process that runs on a schedule. In the push model, the message channel is in effect the subject being observed by the consumer; once a message arrives, the registered consumers are notified to process it. Whether push or pull, the handling of message objects can adopt a mechanism similar to the Observer pattern, realizing the consumer's subscription to the producer; this mechanism is therefore often called the Publisher-Subscriber pattern, as shown in Figure 4:
Figure 4 Publisher-Subscriber pattern (image from Eaipatterns)
Typically, publishers and subscribers register with the infrastructure used to propagate changes, that is, the message channel. The publisher knows the message channel and sends messages to it; once the channel receives a message, it invokes the subscribers registered with it so that they can consume the message content.
For subscribers, there are two delivery mechanisms. One is broadcast: when a message is taken from the message channel, the message object is copied and passed to multiple subscribers. For example, several subsystems may need to obtain customer information from the CRM system and process it in their own way as it is delivered; the message channel in this case is also called a propagation channel. The other is preemptive, which works synchronously: only one subscriber can process a given message. The message channel implementing the Publisher-Subscriber pattern selects a single, currently idle subscriber, dequeues the message and passes it to that subscriber's message-handling method.
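The two delivery mechanisms can be sketched roughly as follows; this is an illustration rather than any product's implementation, and the type names are invented:

using System;
using System.Collections.Generic;
using System.Linq;

// Two ways a channel may deliver a message to its subscribers.
public sealed class SimpleTopic<T>
{
    private readonly List<Action<T>> _subscribers = new List<Action<T>>();

    public void Subscribe(Action<T> handler) => _subscribers.Add(handler);

    // Broadcast: every subscriber receives its own copy of the message.
    public void PublishToAll(T message)
    {
        foreach (var subscriber in _subscribers)
            subscriber(message);
    }

    // Preemptive: exactly one subscriber handles the message
    // (here simply the first one; a real channel would pick an idle one).
    public void PublishToOne(T message)
    {
        var subscriber = _subscribers.FirstOrDefault();
        subscriber?.Invoke(message);
    }
}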
Many message middleware products currently support the publisher-subscriber pattern; for example, the JMS specification provides the MessagePublisher and MessageSubscriber interfaces for Topic objects, and RabbitMQ provides its own implementation of the pattern. Microsoft's MSMQ, although it introduces an event mechanism that can raise events to notify subscribers when a message arrives in the queue, is not a publisher-subscriber implementation in the strict sense. NServiceBus, whose main contributor is Microsoft MVP Udi Dahan, wraps MSMQ and WCF and implements this pattern well.
Message Router Pattern
Whether in the message channel pattern or the publisher-subscriber pattern, the queue plays a pivotal role. In enterprise applications, however, as systems become more complex and performance requirements grow, a system may need to deploy multiple queues at the same time, possibly distributed across machines. These queues can, by definition, receive different kinds of messages, such as order-processing messages, log messages, query-task messages, and so on. At this point it is not appropriate for the producer or the consumer of a message to take on the responsibility of deciding the message's delivery path. In fact, according to the single responsibility principle, such an assignment is unreasonable: it is not conducive to reusing business logic, and it couples the producer, the consumer, and the message queue, which hinders the system's extension.
Since none of these three objects (components) should take on this duty, a new object must be introduced that is specifically responsible for selecting the delivery path. This is the Message Router pattern, shown in Figure 5:
Figure 5 Message Router pattern (image from Eaipatterns)
With a message router, we can configure routing rules to specify the path a message takes and the specific consumer for a given producer. For example, we specify a routing key that binds a specific queue to the designated producer (or consumer). Routing support makes message delivery and processing flexible and improves the message-processing capability of the whole system. At the same time, the routing object effectively encapsulates the logic of finding and matching message paths, acting like a mediator that coordinates messages, queues, and path addressing.
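A minimal sketch of the routing idea, with invented type names: routing rules map a routing key to the queue that should receive the message, so neither producer nor consumer needs to know the queue topology.

using System;
using System.Collections.Generic;

// A message router: routing rules map a routing key to the channel (queue)
// that should receive the message.
public sealed class MessageRouter
{
    private readonly Dictionary<string, Queue<string>> _routes = new Dictionary<string, Queue<string>>();

    public void AddRoute(string routingKey, Queue<string> targetQueue) => _routes[routingKey] = targetQueue;

    public void Route(string routingKey, string message)
    {
        if (_routes.TryGetValue(routingKey, out var queue))
            queue.Enqueue(message);
        else
            throw new InvalidOperationException("No route configured for key: " + routingKey);
    }
}

// Usage sketch:
//   var orders = new Queue<string>(); var logs = new Queue<string>();
//   var router = new MessageRouter();
//   router.AddRoute("order.process", orders);
//   router.AddRoute("log.write", logs);
//   router.Route("order.process", "OrderCreated:1001"); // lands in the orders queue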
Beyond the patterns above, messaging provides a communication infrastructure that lets us integrate independently developed services into a complete system. The Message Translator pattern handles the parsing of messages, allowing different message channels to receive and recognize messages in different formats; introducing such objects also avoids having multiple services become entangled and dependent on one another. The Message Bus pattern provides a service-oriented architecture for the enterprise: it enables message delivery and the adaptation and coordination of services, and requires those services to work together in a unified way.
2. Application Scenarios of Message Patterns
A message-based distributed architecture always revolves around the message. For example, we may encapsulate the message as an object, follow a message specification such as SOAP, or serialize and deserialize an entity object. The sole purpose of these approaches is to put the message into a format that both producers and consumers can understand and that can travel through the message channel.
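As a small, hedged illustration of the serialization approach (the OrderMessage type is invented; the serializer is the same JavaScriptSerializer used later in this article), the producer turns an entity into JSON and the consumer turns it back:

using System.Web.Script.Serialization; // same serializer used later in this article

// A hypothetical message type shared by producer and consumer.
public class OrderMessage
{
    public string OrderId { get; set; }
    public decimal Amount { get; set; }
}

public static class MessageSerialization
{
    // Producer side: entity -> wire format.
    public static string ToJson(OrderMessage message) =>
        new JavaScriptSerializer().Serialize(message);

    // Consumer side: wire format -> entity.
    public static OrderMessage FromJson(string json) =>
        new JavaScriptSerializer().Deserialize<OrderMessage>(json);
}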
Scenario One: Message-based unified service architecture
In a manufacturing CIMS system, we tried to expose all kinds of business to client callers in the form of services, for example by defining an interface like this:
public interface IService
{
    IMessage Execute(IMessage aMessage);
    void SendRequest(IMessage aMessage);
}
We are able to design such a service because we have highly abstracted the business information passed between services in the form of messages. The message is, in effect, the contract between producer and consumer: as long as both sides follow the contract and convert and extract messages in the agreed format, distributed processing is well supported.
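As a hedged illustration (not the actual CIMS code), a concrete service honoring this contract might extract values from the message body and answer with a reply message. It uses the IMessage, IMessageItemSequence and MessageFactory members that appear later in this section; the service name and the keys are invented.

// A hypothetical service honoring the IService contract: it reads the data
// it needs from the incoming message and answers with another message.
public class InventoryService : IService
{
    public IMessage Execute(IMessage aMessage)
    {
        IMessageItemSequence body = aMessage.GetMessageBody();
        string productId = body.GetSubValue("ProductId");   // key agreed on in the message contract

        // ... perform the real business logic here ...

        IMessage reply = new MessageFactory().CreateMessage();
        IMessageItemSequence replyBody = reply.CreateMessageBody();
        replyBody.SetSubValue("ProductId", productId);
        replyBody.SetSubValue("InStock", "true");
        return reply;
    }

    public void SendRequest(IMessage aMessage)
    {
        // Fire-and-forget variant: hand the message to the channel without waiting for a reply.
    }
}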
In this CIMS system we divide a message into ID, Name and Body, and the related properties of the message body can be obtained through the following interface methods:
public interface IMessage : ICloneable
{
    string MessageID { get; set; }
    string MessageName { get; set; }
    IMessageItemSequence CreateMessageBody();
    IMessageItemSequence GetMessageBody();
}
The message class Msg implements the IMessage interface. In this class, the Body of the message is of type IMessageItemSequence, which is used to get and set the contents of the message, namely Value and Item:
public interface IItemValueSetting
{
    string GetSubValue(string name);
    void SetSubValue(string name, string value);
}

public interface IMessageItemSequence : IItemValueSetting, ICloneable
{
    IMessageItem GetMessageItem(string aName);
    IMessageItem CreateMessageItem(string aName);
}
Value is of string type; a Hashtable is used to store the key/value pairs. Item is of type IMessageItem, and in the implementation class of IMessageItemSequence another Hashtable stores the keys and Item values. IMessageItem supports nesting of message bodies: it consists of two parts, SubValue and SubItem, implemented in the same way as IMessageItemSequence. Defining such a nested structure makes the message extensible. The general message structure is as follows:
IMessage -- Name
         -- ID
         -- Body (IMessageItemSequence)
            -- Value
            -- Item (IMessageItem)
               -- SubValue
               -- SubItem (IMessageItem)
               -- ...
The relationship between the individual message objects is shown in Figure 6:
Figure 6 The relationship between message objects
Before implementing communication between service processes, we must define the message format for each service or business. Through the message body, values are set at one end of the service, the message is sent, and those values are read at the other end. For example, the sending side builds the following message body:
IMessageFactory factory = new MessageFactory();
IMessage message = factory.CreateMessage();
message.SetMessageName("Service1");
IMessageItemSequence body = message.CreateMessageBody();
body.SetSubValue("SubName1", "SubValue1");
body.SetSubValue("SubName2", "SubValue2");
IMessageItem item1 = body.CreateMessageItem("Item1");
item1.SetSubValue("SubSubName11", "SubSubValue11");
item1.SetSubValue("SubSubName12", "SubSubValue12");

// Send the request message
MyServiceClient service = new MyServiceClient("Client");
service.SendRequest(message);
We introduced a ServiceLocator object that listens to the message queue through a MessageQueueListener. Once it receives a message, it reads the name carried in the message to locate the corresponding service and then invokes that service's Execute(aMessage) method to perform the related business.
The ServiceLocator's job is to look up services stored in the ServiceContainer. The ServiceContainer reads the configuration file, initializes all distributed services at startup (note that these services are stateless), and manages them. It encapsulates basic information about each service, such as where it is located and how it is deployed, so that callers do not depend directly on the service's details. This lightens the caller's burden and allows services to be extended and migrated.
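A rough sketch of what such a container and locator might look like; the registration mechanism shown here is an assumption (the real system reads a configuration file), and the class shapes are simplified.

using System.Collections.Generic;

// A minimal service container/locator: services are registered by message name
// and looked up when a message arrives from the queue.
public class ServiceContainer
{
    private readonly Dictionary<string, IService> _services = new Dictionary<string, IService>();

    // In the real system this would be driven by the configuration file.
    public void Register(string messageName, IService service) => _services[messageName] = service;

    public IService Find(string messageName) =>
        _services.TryGetValue(messageName, out var service) ? service : null;
}

public class ServiceLocator
{
    private readonly ServiceContainer _container;

    public ServiceLocator(ServiceContainer container) => _container = container;

    // Called by the queue listener whenever a message arrives.
    public IMessage Dispatch(IMessage message)
    {
        IService service = _container.Find(message.MessageName);
        return service != null ? service.Execute(message) : null;
    }
}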
In this system, we mainly introduced the messaging pattern: by defining the IMessage interface we could better abstract the services, and by storing data in a flat format we reduced the coupling between services. As long as the services agree on a common message format, the requester does not depend on the receiver's concrete interface. With the Message object we built a messaging model common to the industry, and a distributed service model on top of it. In fact, on such a framework and platform, the most important activity for developers building manufacturing business was to discuss the format of each business message with the domain experts. Such a domain-oriented message language removed the communication barrier between technical staff and business people, and across the subsystems we only needed to maintain the tables of message interfaces passed between services. The implementation of each service remained completely isolated, which effectively and reasonably encapsulated and separated business knowledge from infrastructure.
For the format and content of messages, we considered introducing the Message Translator pattern, responsible for translating and parsing the message structure defined earlier. To further ease the developers' burden, we could also build a message-object-relational mapping framework on the platform, introducing an entity engine to transform messages into domain entities, so that service developers can build service components in a completely object-oriented way and persist message data by calling the persistence layer. Meanwhile, the message bus (which at this point can be seen as the connector between service components) connects the different services and allows messages to be encoded and passed asynchronously. Such a message-based distributed architecture is shown in Figure 7:
Figure 7 CIMS distributed architecture based on the message bus
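To make the Message Translator idea concrete, a hedged sketch might map the flat message body onto a domain entity and back; the Order entity and the field names are invented.

// A hypothetical message translator: it shields service developers from the
// flat message format by converting it into a domain entity and back.
public class Order
{
    public string Id { get; set; }
    public decimal Amount { get; set; }
}

public class OrderMessageTranslator
{
    public Order ToEntity(IMessage message)
    {
        IMessageItemSequence body = message.GetMessageBody();
        return new Order
        {
            Id = body.GetSubValue("OrderId"),
            Amount = decimal.Parse(body.GetSubValue("Amount"))
        };
    }

    public IMessage ToMessage(Order order)
    {
        IMessage message = new MessageFactory().CreateMessage();
        IMessageItemSequence body = message.CreateMessageBody();
        body.SetSubValue("OrderId", order.Id);
        body.SetSubValue("Amount", order.Amount.ToString());
        return message;
    }
}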
Scenario Two: Architectural decisions for message middleware
In a healthcare system we faced the customer's non-functional requirements for performance and availability. When we started the project, the customer made it clear that they cared especially about these qualities: end users should have a good experience when performing complex bulk replace-and-delete operations, in short, the operations should respond quickly. The problem was that such an operation involves complex business logic and a very large amount of associated data, and in the worst case the whole operation takes several minutes. We could tune database performance by introducing caching, indexing, paging and so on, but the operation as a whole still took too long to meet the customer's requirement. Since the system was built on top of a legacy system, introducing something like map-reduce to handle these operations would have had too great an impact on the architecture, and some components of the existing system could not have been reused; the cost clearly would not have been proportional to the benefit.
By analyzing the requirements, we noticed that the end customer does not need the result in real time, as long as the consistency and completeness of the final result are guaranteed. The key point, in terms of user experience, is that they do not want to sit through a lengthy wait only to be told whether the operation succeeded or failed. This is a typical scenario for asynchronous processing by a background task.
We encounter such scenarios frequently in enterprise applications. In a financial system we once tried to control concurrent access and schedule the work by writing our own background tasks; it turned out that such a design was not effective. For this kind of typical asynchronous processing, a message-based architecture is the best way to solve the problem.
Because message middleware has gradually matured, the architectural work for this kind of problem has shifted from designing an implementation to making product selections and technology decisions. On the .NET platform, for example, the architect needs to decide which message middleware should handle these tasks. This requires identifying the risks of asynchronous processing in the specific business scenario, and then comparing the candidate technologies against those risks to find the most suitable solution.
By analyzing the business scenario and the nature of the customer's business, we found that it has the following characteristics:
- In certain situations, bulk replace-and-delete operations may arrive in bursts, resulting in a high number of concurrent operations; for example, when the FDA calls for a recall of some offending drugs, the drugs' information must be removed from the drug store.
- The results do not need to be real-time, but the operations must be reliable; an operation must not be left undone because of a failure.
- The automated operations are irreversible, so the operation history must be recorded.
- For performance reasons, most operations call the database's stored procedures.
- The operation data requires a degree of security and must not be corrupted by unauthorized users.
- The functions related to these operations are packaged as components, which must remain reusable, extensible and testable.
- The volume of data may grow as the number of end users increases.
In view of these business requirements, we decided to compare the candidate technical solutions on the following aspects:
- Concurrency: the chosen message queue must support concurrent user access well.
- Security: does the message queue provide sufficient security mechanisms?
- Performance and scalability: the message queue must not become the single performance bottleneck of the whole system.
- Deployment: the message queue should be as easy to deploy as possible.
- Disaster recovery: processing data must not be lost because of unexpected errors, failures or other factors.
- API ease of use: the API for processing messages must be simple enough and support testing and extension well.
We examined MSMQ, Resque, ActiveMQ and RabbitMQ, and after studying the relevant material and writing spike code to verify the relevant quality attributes, we finally chose RabbitMQ.
We ruled out MSMQ because it depends heavily on the Windows operating system; although it provides an easy-to-use GUI for installation, deployment and management, it is very difficult to write automated deployment scripts for it. In addition, MSMQ's message capacity cannot exceed 4 MB, which we could not accept. The problem with Resque is that it currently supports only Ruby clients and does not integrate well with the .NET platform; moreover, Resque persists messages by writing them to Redis, which would have required introducing a new store alongside the existing RDBMS. ActiveMQ and RabbitMQ were more attractive to us, but when we wrote test code that sent large messages in a loop to verify the performance and stability of the middleware, we found ActiveMQ's performance unsatisfactory; at the very least, during our spike it occasionally crashed when large messages were sent frequently. By comparison, RabbitMQ suited our architectural requirements better in every respect.
For example, in terms of disaster recovery and stability, RabbitMQ provides durable queues, so that unprocessed messages are persisted to disk and survive a crash of the queue service. To avoid losing a message in the window between sending it and writing it to disk, RabbitMQ introduces the publisher confirm mechanism, which ensures the message has really been written to disk. For clustering it offers two modes, active/passive and active/active; in active/passive mode, for example, once a node fails, the passive node is activated immediately and quickly takes over message delivery from the failed active node, as shown in Figure 8:
Figure 8 Active/passive cluster (image from the RabbitMQ official website)
In terms of concurrency, RabbitMQ is message middleware built on Erlang, and Erlang's innate strength in concurrent processing gives us confidence in RabbitMQ's concurrency. RabbitMQ can be deployed easily on Windows, Linux and other operating systems, and it also deploys well on server clusters. Its queue capacity is effectively unlimited (bounded only by the disk capacity of the machine running RabbitMQ), and its performance in sending and receiving messages is very good. RabbitMQ provides client APIs for Java, .NET, Erlang and C; they are very simple to call and do not drag many third-party libraries into the system. The .NET client, for example, depends on only one assembly.
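As an illustration of the durable-queue and publisher-confirm features mentioned above, here is a minimal publisher sketch using the classic RabbitMQ .NET client; the host name, queue name and exact member names are assumptions and may differ between client versions.

using System.Text;
using RabbitMQ.Client;

public static class DurablePublisher
{
    public static void Publish(string queueName, string payload)
    {
        var factory = new ConnectionFactory { HostName = "localhost" }; // assumed broker location
        using (var connection = factory.CreateConnection())
        using (var channel = connection.CreateModel())
        {
            // Durable queue: the queue definition survives a broker restart.
            channel.QueueDeclare(queue: queueName, durable: true,
                                 exclusive: false, autoDelete: false, arguments: null);

            // Publisher confirms: the broker acknowledges once the message is safely stored.
            channel.ConfirmSelect();

            var properties = channel.CreateBasicProperties();
            properties.Persistent = true;   // persist the message itself to disk

            channel.BasicPublish(exchange: "", routingKey: queueName,
                                 basicProperties: properties,
                                 body: Encoding.UTF8.GetBytes(payload));

            channel.WaitForConfirmsOrDie(); // throws if the broker did not confirm
        }
    }
}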
Even having chosen RabbitMQ, we still need to decouple the system from the specific message middleware, which requires abstracting the producer and consumer of messages, for example with the following interfaces:
public interface IQueueSubscriber
{
    void ListenTo<T>(string queueName, Action<T> action);
    void ListenTo<T>(string queueName, Predicate<T> messageProcessedSuccessfully);
    void ListenTo<T>(string queueName, Predicate<T> messageProcessedSuccessfully, bool requeueFailedMessages);
}

public interface IQueueProvider
{
    T Pop<T>(string queueName);
    T PopAndAwaitAcknowledgement<T>(string queueName, Predicate<T> messageProcessedSuccessfully);
    T PopAndAwaitAcknowledgement<T>(string queueName, Predicate<T> messageProcessedSuccessfully, bool requeueFailedMessages);
    void Push(FunctionalArea functionalArea, string routingKey, object payload);
}
In the implementation classes of these two interfaces we encapsulate the calls to RabbitMQ, for example:
public class RabbitMqSubscriber : IQueueSubscriber
{
    public void ListenTo<T>(string queueName, Action<T> action)
    {
        using (IConnection connection = _factory.OpenConnection())
        using (IModel channel = connection.CreateModel())
        {
            var consumer = new QueueingBasicConsumer(channel);
            string consumerTag = channel.BasicConsume(queueName, acknowledgeImmediately, consumer);
            var response = (BasicDeliverEventArgs)consumer.Queue.Dequeue();

            var serializer = new JavaScriptSerializer();
            string json = Encoding.UTF8.GetString(response.Body);
            var message = serializer.Deserialize<T>(json);
            action(message);
        }
    }
}

public class RabbitMqProvider : IQueueProvider
{
    public T Pop<T>(string queueName)
    {
        var returnVal = default(T);
        const bool acknowledgeImmediately = true;
        using (var connection = _factory.OpenConnection())
        using (var channel = connection.CreateModel())
        {
            var response = channel.BasicGet(queueName, acknowledgeImmediately);
            if (response != null)
            {
                var serializer = new JavaScriptSerializer();
                var json = Encoding.UTF8.GetString(response.Body);
                returnVal = serializer.Deserialize<T>(json);
            }
        }
        return returnVal;
    }
}
We use Quartz.NET to implement the batch job. A job class implementing the IStatefulJob interface listens to the queue in its Execute() method. The job calls the ListenTo() method of the RabbitMqSubscriber class, which in turn dequeues from the queue; when a message arrives, the job picks it up, synchronously pops it off the queue, and passes it as an argument to the Action delegate. In the batch job's Execute() method we can therefore define how the message is handled and call ListenTo() on the RabbitMqSubscriber as follows (note that the message passed here is actually a job ID):
public void Execute(JobExecutionContext context)
{
    string queueName = QueueConfigurer.GetQueueProviders().Queue.Name;
    try
    {
        queueSubscriber.ListenTo<MyJob>(
            queueName,
            job => Request.MakeRequest(job.Id.ToString()));
    }
    catch (Exception err)
    {
        Log.WarnFormat("Unexpected exception while processing queue '{0}', details: {1}", queueName, err);
    }
}
Information about the queue, such as the queue name, is stored in the configuration file. The Execute() method calls the MakeRequest() method of the Request object and passes it the received message (that is, the job ID). That method queries the database for the job's corresponding information by job ID and performs the real business processing.
When making decisions about a message-based processing architecture, many design details beyond the considerations above need to be judged and weighed. For job execution and queue management, for example, the following factors should be considered: monitoring and querying the status of jobs in the queue; managing job priorities; the ability to cancel or terminate a job that runs too long; the ability to set the polling interval; whether jobs can be distributed across machines; handling of failed jobs; support for multiple queues and named queues; the ability to bind worker processes to specific queues; and support for dead messages.
3. The Timing of the Choice
At what point should we choose a message-based distributed architecture? Based on my experience across several enterprise applications, I believe the following conditions point to it: the operations do not need to be real-time, but the tasks to be performed are very time-consuming; heterogeneous systems within the enterprise need to be integrated; and server resources need to be allocated and utilized rationally.
In the first case, we often choose message queuing to handle long-running tasks; the message queue introduced here becomes a buffer for message processing. The asynchronous communication that message queuing brings allows both the sender and the receiver to continue executing subsequent code without waiting for the other side to return success, thereby improving the ability to process data. Especially when traffic and data volumes are large, combining the message queue with background tasks and processing the bulk data off-peak reduces the load the database must handle. The healthcare system mentioned earlier is exactly such a scenario.
Integrating different and heterogeneous systems is precisely the scenario that message patterns handle well. As long as the message format and delivery are agreed upon, communication between different systems can be implemented effectively. When we developed a large system for a car manufacturer, the dealers, acting as .NET clients, needed to pass data to the central administration, where it would be consumed by Oracle's EBS (E-Business Suite). The Dealer Management System (DMS) uses a client/server structure with SQL Server as its database, while the manufacturer's management center runs EBS on Oracle 10g. We needed to transfer data between the two different databases. The solution was to use MSMQ to turn the data into database-independent message data and to deploy MSMQ servers at both ends, establishing message queues to store the message data. The implementation architecture is shown in Figure 9.
Figure 9 Distributed processing architecture implemented with MSMQ
First, the dealer's data is passed via MSMQ to the MSMQ server and inserted into the SQL Server database; FTP is then used to deliver the data to a dedicated file server. The EBS App Server reads the files from the file server and writes them into the Oracle database according to the interface specification, thereby completing the integration of the .NET system with the Oracle system.
Distributed systems can also relieve the pressure on a single server and allocate and utilize server resources efficiently by deploying different business operations and data processing as different services running on different servers. A service deployed on one server can act as a server handling requests from client callers, or as a client that, after finishing its own work, delegates the remaining business requests to other services. Early CORBA systems established a unified Naming Service to manage and dispatch services, and used the Event Service to distribute and process events. However, CORBA uses RPC, which requires designing and deploying services as remote objects and creating proxies for them. If we pass messages through a message channel instead, we can drop this dependency on remote objects and support asynchronous invocation. The CIMS system described above provides the messaging infrastructure through a message bus and establishes a unified message-processing service model, relieving the dependencies between services so that each service can be deployed independently to different servers.
4. Difficulties Faced
Because of the nature of the message pattern itself, we face a number of difficulties when using it to build a message-based distributed architecture.
The first is system integration. Because systems communicate through messages, we must ensure the consistency of the messages and keep the interfaces between systems (mainly the services) stable; once an interface changes, every caller of that interface is affected. Even though the service is abstracted behind an interface, the message carries business data from both sides and, to some extent, violates encapsulation; in other words, both the producer and the consumer are tightly coupled to the message, and a change to the message directly affects the implementation classes of each service interface. Moreover, to keep the interfaces as abstract as possible, the messages we handle are not strongly typed, which makes it hard to catch errors caused by changes to message content at compile time. The car dealer management system mentioned earlier had exactly this problem: my CRM module had to communicate with several subsystems at once, each developed by a different team, and for communication reasons the interface tables were often not synchronized in time between teams. Although the unit and functional tests of each subsystem passed, the integration tests of the CRM revealed a large number of message mismatches, whose root cause was changes to the messages.
The solution is to introduce adequate integration tests, and even regression tests, and to run them promptly to get fast feedback. We can make integration tests part of the validation for committed code, requiring that both the integration tests and the specified regression tests run for every commit; this is what continuous integration embodies. By running integration and regression tests in both local and remote builds, we can effectively ensure that neither the local version nor the integrated version breaks functionality because of message changes; and if they do break, we get feedback, identify the problem, and fix it immediately instead of waiting for a big-bang integration test at the end of the project.
Another problem is that the non-real-time nature of background tasks makes them hard to test. Because a background task processes messages in the queue periodically, the moment it fires is unpredictable. We usually adopt two measures in parallel to solve this. First, we provide a synchronous implementation of the same functionality and introduce a toggle switch in the configuration file, so that we can switch between the synchronous and asynchronous versions at any time. Provided we can guarantee the correctness of the message-queue processing and the background task's execution, we can flip the switch to the synchronous path and thus test the functionality quickly and deterministically, harvesting feedback in time. Second, we can build a dedicated pipeline on the continuous integration server to run the asynchronous, message-based version. The pipeline's tasks can be triggered manually, or a timer can be set on the pipeline to run at a specified time (for example, at two o'clock in the morning, so feedback is ready before work starts the next day). We prepare a dedicated execution environment for this pipeline and shorten the background task's listening and execution intervals to acceptable values. This not only tells us in time whether the functionality is correct, but also verifies that the message-based system as a whole works properly.
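A rough sketch of such a configuration toggle, assuming a simple appSettings flag; the key name, interface and class names are invented for illustration.

using System;
using System.Configuration;

// Feature toggle: read a flag from the configuration file and choose between
// the synchronous implementation and the asynchronous, queue-based one.
public static class ReplacementServiceFactory
{
    public static IReplacementService Create()
    {
        bool useAsync = bool.Parse(ConfigurationManager.AppSettings["UseAsyncProcessing"] ?? "false");
        return useAsync
            ? (IReplacementService)new QueueBackedReplacementService()  // production path
            : new SynchronousReplacementService();                      // deterministic, easy to test
    }
}

public interface IReplacementService { void Replace(string recordId); }
public class SynchronousReplacementService : IReplacementService { public void Replace(string recordId) { /* run inline */ } }
public class QueueBackedReplacementService : IReplacementService { public void Replace(string recordId) { /* push to the message queue */ } }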
Of course, a distributed system also pays a performance cost for parsing messages and transmitting them over the network. The architect must analyze the business scenario carefully and choose the architecture and architectural patterns accordingly. Compared with a local system, a distributed system can also be dramatically harder to maintain. This requires us to take the stability of the architecture into account in our decisions and designs, and to introduce system-wide logging. An even better approach is to add error notification to the log handling, so that whenever message processing fails, the system administrator is notified by email, text message or other means and can deal with the error promptly; only by being able to query the error log at the time of the failure can the problem be located effectively. We can also introduce an error message queue and a dead message queue for the system to handle errors and exceptions.
For distributed systems we also need to consider the consistency of service execution results, especially when a business transaction requires several services to participate in one conversation: once one service fails, the application can be left in an inconsistent state, because the work is considered fully successful only if every participant succeeds. This raises the problem of distributed transactions, where task execution becomes transactional: the task must be atomic, the resulting state must be consistent, state modifications must be isolated from one another while the task is being processed, and successful state modifications must be durable for the whole life of the transaction. These are the ACID (Atomicity, Consistency, Isolation, Durability) properties of a transaction.
One option is to introduce a Distributed Transaction Coordinator (DTC), splitting the transaction into a two-phase or even three-phase commit and requiring all participants in the transaction to vote on whether it succeeds or fails as a whole. Another option is to relax the requirement for result consistency. According to eBay's best practices, given the cost of distributed transactions, immediate consistency across distributed resources is usually neither necessary nor realistic. In his article "Scalability Best Practices: Lessons from eBay", Randy Shoup cites Eric Brewer's CAP theorem: of the three key properties of a distributed system, consistency, availability, and partition tolerance, only two can be satisfied at any one time. We should weigh these three elements according to the application scenario. When immediate consistency is not required, we can partition services sensibly and deploy business operations that belong to the same transaction scope in the same process as far as possible, avoiding distributed deployment. If consistent results really must be maintained across multiple distributed services, we can consider introducing data reconciliation, asynchronous recovery events, or centralized settlement.
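On the .NET platform the DTC-backed option typically surfaces through System.Transactions; the following is a minimal, hedged sketch (connection strings and SQL statements are placeholders), in which the transaction escalates to the Distributed Transaction Coordinator once a second resource manager enlists.

using System.Data.SqlClient;
using System.Transactions;

public static class DistributedTransactionSketch
{
    public static void UpdateBothDatabases()
    {
        using (var scope = new TransactionScope())
        {
            using (var first = new SqlConnection("connection string for database A")) // placeholder
            {
                first.Open(); // enlists in the ambient transaction
                new SqlCommand("UPDATE /* placeholder statement */ TableA SET Flag = 1", first).ExecuteNonQuery();
            }

            using (var second = new SqlConnection("connection string for database B")) // placeholder
            {
                second.Open(); // a second resource manager: the transaction escalates to the DTC
                new SqlCommand("UPDATE /* placeholder statement */ TableB SET Flag = 1", second).ExecuteNonQuery();
            }

            scope.Complete(); // both updates commit, or neither does
        }
    }
}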