The three carriages refer to microservices, message queues, and scheduled tasks. The diagram shows an Internet project architecture driven jointly by these three. Whether the project is large or small, the shape of this template does not change once it has settled: a bigger project simply has more services with more complex call chains, more complex message flows, and more jobs. The architecture scales out without deforming and can run for a long time without major adjustment.
A dotted box in the diagram indicates a module or project with little business logic of its own; it is purely a thin skin (it invokes services but never touches the database directly). Black arrows represent dependencies; the green and red arrows show the directions in which MQ messages are sent and subscribed, respectively. All of this is explained in more detail below.

Microservices
Microservices are not a new concept. I began practicing this architectural style about ten years ago and have fully implemented it in projects at four companies, and I am increasingly convinced it suits Internet projects very well. The point is not that every service must be invoked remotely across physical machines, but that we deliberately design the business to be segmented by domain from the very beginning. This forces us to understand the business more thoroughly and lets us work comfortably on different business modules in later iterations. Development becomes easier in several ways:
1. To do microservices at all, we must go through a fairly thorough product-requirements discussion and domain division, and each service must carefully design the table structures for its own domain. This is a crucial design step and determines whether the technical architecture matches the product architecture. An all-in-one architecture tends to skip this step: wherever the requirement lands is where the code gets written.
2. If the division of services and their responsibilities is clear, then for a new requirement we know exactly where to change the code; there is no copy-paste, and far fewer pitfalls.
3. Most of our business logic is already developed and can be reused directly; a new business feature is often just an aggregation of existing logic. After the PRD review, development may conclude that the new feature only needs to combine the X, Y, and Z methods of services A, B, and C, plus a new branch added to method Z in service C. That refreshing feeling is hard to imagine until you experience it.
4. When there is a significant performance bottleneck, we can scale out specific services by adding machines, and because the services are divided, we understand the system's bottlenecks better. Finding the one problematic line among 10,000 lines of code is hard; but if those 10,000 lines are already split into 10 services, first locating the service with the performance problem and then analyzing that service greatly reduces the complexity of the search.
5. If a business changes substantially or is taken offline, we can be confident the underlying public services will not be eliminated: stop the traffic entry of the aggregation business service for that business, then retire the related parts of the interfaces in the relevant base services. With a good service-governance platform, you may not even have to change code.
This also requires us to follow a few principles:
1. The granularity of services needs to be well controlled. My habit is to divide by domain first, which is hard to get wrong, and then split at finer granularity as the project progresses. For an Internet finance P2P business, for example, a coarse division by domain is a good starting point.
2. Services must be layered, not flat. For example, our services have three levels: public base services, basic business services, and aggregation business services.
I hope this clarifies the matter: deciding how to divide services, and how to divide them into three levels, is an interesting and necessary exercise. After the division, it is best to have a clear document describing each service's responsibility, so that we can locate the service behind a piece of business without reading the API, and the whole complex system becomes straightforward.
3. The underlying data tables behind each service are independent, with no cross-correlation: data structures are never exposed directly, and anything that needs another service's data must go through its access interface. The benefits are the benefits of encapsulation in object-oriented design:
my data is mine to manage, and when I want to refactor or implement some higher-level technical architecture (such as multi-site active-active), it matters that nobody depends on my underlying data directly. The downside, or rather the inconvenience, is that cross-service invocation means a data operation can no longer be completed in a single database transaction. This is not a big problem: first, our splitting method does not make the granularity too fine, so most business logic completes inside one business service; second, as discussed later, cross-service invocations, whether through MQ or direct calls, have compensation to ensure eventual consistency.
4. Consider the significant difference in stability between in-process method calls and cross-machine, cross-process service calls. For in-process calls we must handle exceptions, but we almost never need to consider timeouts, lost requests, or duplicate invocations. For remote service calls, all of these need attention; otherwise the system is only "basically available": fine in the test environment, but constantly in trouble in production. This requires asking a few more questions about each service's availability and invocation semantics, carefully considering whether, in the face of network problems, a method might execute multiple times or only partially.
You may say: with so many services it is hard to consider all these points at implementation time; can I skip distributed transactions, idempotence, and compensation? (It is no exaggeration that we sometimes spend 20% of the time implementing the business logic and 80% implementing this reliability logic around it.) You can, but then the business will be riddled with holes once it runs in production. If a business has low reliability requirements, or is not user-facing and will draw no complaints, you can temporarily ignore these points; but a business that cannot tolerate inconsistency, such as orders, must take them fully into account.
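As a minimal sketch of the idempotence-plus-retry combination discussed above (class and method names are hypothetical, and the "remote" call and lost response are simulated in-process):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Provider side: remembers which idempotency keys it has already processed,
// so a request retried after a timeout or lost response is not executed twice.
class OrderService {
    private final Map<String, Long> processed = new ConcurrentHashMap<>();
    private long executions = 0;

    synchronized long createOrder(String idempotencyKey) {
        Long existing = processed.get(idempotencyKey);
        if (existing != null) {
            return existing; // duplicate call: return the stored result
        }
        executions++;
        long orderId = 1000 + executions; // stands in for inserting a row
        processed.put(idempotencyKey, orderId);
        return orderId;
    }

    synchronized long executions() { return executions; }
}

public class IdempotentCallSketch {

    // Caller side: retry up to maxAttempts. Because the provider is
    // idempotent, retrying after a (simulated) lost response is harmless.
    static long callWithRetry(OrderService svc, String key, int maxAttempts) {
        long result = -1;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            result = svc.createOrder(key);
            boolean responseLost = (attempt == 1); // simulate losing the first reply
            if (!responseLost) {
                return result;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        OrderService svc = new OrderService();
        long id = callWithRetry(svc, "req-42", 3);
        // The business logic ran exactly once despite the retry.
        System.out.println("orderId=" + id + " executions=" + svc.executions());
    }
}
```

A real system would persist the processed keys (a unique constraint on the idempotency key works well) rather than holding them in memory.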
5. Consider the significant difference in data transfer between in-process calls and cross-machine, cross-process calls. For a local method call, if the arguments and return value are objects, then in most languages only a pointer (or a copy of the pointer) to the allocated object on the heap is passed; the data-transfer cost is almost negligible and there is no serialization or deserialization overhead. For cross-process service invocation this cost is often not negligible, and if an interface needs to return a lot of data, its definition often requires special adjustment.
6. This also raises the question of method granularity. For example, we can define a getUserInfo that returns different data combinations depending on the parameters passed in, or we can define fine-grained interfaces such as getUserBasicInfo, getUserVipInfo, getUserInvestData, and so on. The granularity of an interface should be determined by how consumers will use the data: whether a single type or a composite of several types is more likely to be needed in one call, and so on.
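To make the two styles concrete, here is a sketch with hypothetical method names and stubbed data (real signatures would return DTOs over your RPC framework):

```java
import java.util.EnumSet;
import java.util.HashMap;
import java.util.Map;

enum UserPart { BASIC, VIP, INVEST }

public class UserApiSketch {

    // Coarse-grained: one method, the caller selects which parts it wants,
    // so one remote round trip can carry a composite result.
    static Map<String, Object> getUserInfo(long userId, EnumSet<UserPart> parts) {
        Map<String, Object> result = new HashMap<>();
        if (parts.contains(UserPart.BASIC))  result.put("basic",  "name-" + userId);
        if (parts.contains(UserPart.VIP))    result.put("vip",    Boolean.FALSE);
        if (parts.contains(UserPart.INVEST)) result.put("invest", 0L);
        return result;
    }

    // Fine-grained: one method per data type. The contracts are simpler, but
    // a consumer needing all three pays three remote round trips.
    static String getUserBasicInfo(long userId) { return "name-" + userId; }
    static boolean getUserVipInfo(long userId)  { return false; }
    static long getUserInvestData(long userId)  { return 0L; }

    public static void main(String[] args) {
        Map<String, Object> composite =
                getUserInfo(7, EnumSet.of(UserPart.BASIC, UserPart.VIP));
        System.out.println(composite.keySet());
    }
}
```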
7. We also need to consider interface upgrades. Changes are best made backward compatible with the existing interface. If an interface must be retired, make sure every caller has migrated to the new one, confirm the old interface's traffic is zero, observe for a period, and only then remove it from the code. Once a service is open to others, adjusting or retiring an interface definition is no longer easy, so external API design needs to be cautious.
8. Finally, once the whole company adopts microservices, cross-department negotiation over agreed APIs will inevitably produce some wrangling: do I push the data to you, or do you pull it? This data is useless to me, so why should I keep it? Leaving the non-technical side of the matter aside, some of this wrangling can also be resolved by technical means.
You may feel dizzy by this point: why do microservices require so much extra consideration, with complexity suddenly spiking? What I want to say is that we should look at this from a different perspective:
1. We do not need to consider every piece of logic from the outset, only the core flows and core logic. Moreover, because crossing services makes me an explicit provider and consumer, many other people will engage with my service's capabilities and ask all kinds of questions, which is good for designing a reliable method.
2. Even when all the logic is stacked together without cross-service invocation, that does not mean the logic is transactional or tightly implemented; cross-service invocation merely amplifies the likelihood that the problems surface.
3. We also have a service framework. The framework, together with the monitoring/tracing layer and the operations system, provides a lot of integrated functionality that exposes logic otherwise sealed inside in-process methods. In a microservice system with a good monitoring platform, when troubleshooting you often find yourself grateful that a call was a remote service call.
4. The biggest bonus: once we form a layered service system with clear business logic, any requirement can be dissected into a small number of code modifications plus some combination of service invocations, and you know it will not cause problems, because the underlying services A through G have all been tested by history. This refreshing experience is something to enjoy once you have it.
However, if service granularity is divided unreasonably, the layering is unreasonable, underlying data sources overlap, network call failures are not considered, data volume is not considered, interface definitions are unreasonable, or version upgrades are too reckless, the whole system will exhibit all kinds of scaling problems, performance problems, and bugs, which is a real headache. This also requires a solid service framework to help us locate each kind of unreasonableness; a later article on middleware will focus specifically on service governance.

Message Queuing
Using a message queue (MQ) has several benefits; in other words, we tend to introduce MQ for these purposes:
1. Asynchronous processing: a flow like ordering can usually define a core process, the state machine of the core order, which should complete synchronously as fast as possible; around the order, a series of follow-up user-related and inventory-related processing is then derived. None of that needs to block the instant the user clicks "submit order". Placing the order only needs to confirm that the order is legitimate; much of the follow-up can then flow slowly through dozens of modules, and even if that takes five minutes the user does not need to feel it.
2. Traffic peak shaving: a characteristic of Internet projects is occasional ToC promotions that produce traffic spikes. If we introduce message queues between modules as buffers, backend services can passively consume data at their own comfortable pace and will not be crushed by the traffic. Of course, good monitoring is essential, as elaborated below.
3. Module decoupling: as project complexity grows, there will be all kinds of events from inside and outside the project (user registration and login, investment, withdrawal, and so on) that various modules (marketing, activities) continuously care about. Having the core business system call all these external modules directly would obviously tangle the system internally. Decoupling through MQ lets these events flow through the system with loose coupling, the modules never perceiving each other, which is the more appropriate practice.
4. Message broadcast: some messages have multiple receivers, and the number of receivers may be dynamic (something similar in nature to a chain of responsibility is also possible). A one-to-many coupling between upstream and downstream would be troublesome here, so MQ decoupling fits well: upstream just publishes a message saying what happened, and no matter how many downstream parties care about it, upstream is unaware of them.
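The decoupling and broadcast ideas above can be shown with an in-process analogue (a plain listener list, not a real MQ client):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// The publisher only announces that an event happened; it neither knows nor
// cares how many subscribers receive it, or whether any exist at all.
class EventBus {
    private final List<Consumer<String>> subscribers = new ArrayList<>();

    void subscribe(Consumer<String> handler) { subscribers.add(handler); }

    // Broadcast: every subscriber gets its own copy of the event.
    void publish(String event) {
        for (Consumer<String> s : subscribers) s.accept(event);
    }
}

public class EventBusSketch {
    public static void main(String[] args) {
        EventBus bus = new EventBus();
        List<String> marketingLog = new ArrayList<>();
        List<String> activityLog = new ArrayList<>();

        // The marketing and activity modules subscribe on their own; the core
        // business system never calls them directly.
        bus.subscribe(e -> marketingLog.add("marketing saw " + e));
        bus.subscribe(e -> activityLog.add("activity saw " + e));

        bus.publish("user-registered:42");
        System.out.println(marketingLog);
        System.out.println(activityLog);
    }
}
```

A real MQ adds what this sketch lacks: persistence, buffering for peak shaving, and asynchrony (here publish runs the handlers synchronously).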
These needs are ubiquitous in Internet projects, so message queues are a very important architectural tool. There are several points to note in their use:
1. I prefer an independent, dedicated listener project (rather than merging it into the server) to listen for messages. This module carries no real logic: after receiving a specific message, it simply calls the corresponding service API to process it. The listener can be started in multiple copies for load balancing (depending on the MQ product used), but since there is little pressure here, this is not strictly required. Note that not every service needs a matching listener project: a public base service usually has no listener, because it is self-contained and does not need to perceive business events outside itself; for similar reasons, some basic business services need no listener either.
2. For important MQ messages, a compensation line should be in place as a backup, both to catch occasional losses while the MQ cluster is healthy and to step in when the MQ cluster is down. I have used RabbitMQ for systems doing tens of thousands of orders per day; although the QPS was only in the hundreds to low thousands, far below the tens of thousands RabbitMQ can withstand, there was still roughly a one-in-ten-thousand probability of losing a message (I have also used Alibaba's RocketMQ, but because the volume was small I have not observed a similar problem there). Those dropped messages were immediately handled by the compensation line. In the extreme case where the whole RabbitMQ cluster is down and messages sent by service A cannot reach service B, the compensation job kicks in and periodically pulls messages in bulk from service A to service B. Processing becomes batch-oriented, but at least the messages get processed. It is important to have this backup, because we cannot assume the middleware is 100% available.
3. The implementation of compensation contains no business logic; let me spell out how it works. Suppose service A is the message provider and B-listener is the listener; when a message is heard, the concrete method handleXxMessage(XxMessage message) in B-server is invoked to execute the business logic. When MQ stops working, a job (with configurable compensation window and pull batch size) periodically invokes a dedicated method provided by service A, getXxMessages(LocalDateTime from, LocalDateTime to, int batchSize), to pull messages, and then calls B-server's handleXxMessage (possibly concurrently) to process them. This compensation job can be made reusable and configuration-driven, so you do not hand-write a new one for each message type; the only extra work is that service A must provide an interface for pulling messages. You might object that service A now has to maintain a database-backed, pull-based message queue of its own. In fact, the "messages" here are usually just a conversion: data that changed within a time window has necessarily landed in the database, and it only needs to be converted into message objects and served out. Because B-server's handleXxMessage is idempotent, it does not matter if a message is processed repeatedly; this is only a brute-force replay of a recent window of data in an emergency.
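A sketch of this flow, with in-memory stand-ins for service A's landed data and B-server's idempotent handler (the method names follow the article; the storage is simplified to lists and sets):

```java
import java.time.LocalDateTime;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class XxMessage {
    final String id;
    final LocalDateTime createdAt;
    XxMessage(String id, LocalDateTime createdAt) {
        this.id = id;
        this.createdAt = createdAt;
    }
}

// Service A: the "queue" is just data that has already landed in A's database
// within the window, converted to message objects on demand.
class ServiceA {
    final List<XxMessage> landed = new ArrayList<>();

    List<XxMessage> getXxMessages(LocalDateTime from, LocalDateTime to, int batchSize) {
        List<XxMessage> batch = new ArrayList<>();
        for (XxMessage m : landed) {
            if (!m.createdAt.isBefore(from) && m.createdAt.isBefore(to)) {
                batch.add(m);
                if (batch.size() == batchSize) break;
            }
        }
        return batch;
    }
}

// B-server: handleXxMessage is idempotent, so replaying a window of messages
// that may already have arrived via MQ is harmless.
class ServiceB {
    final Set<String> handled = new HashSet<>();

    void handleXxMessage(XxMessage m) {
        if (!handled.add(m.id)) return; // already processed: do nothing
        // ... real business logic would run here ...
    }
}

public class CompensationJobSketch {

    // The reusable compensation job: pull a batch for the time window from A
    // and replay it into B.
    static void compensate(ServiceA a, ServiceB b,
                           LocalDateTime from, LocalDateTime to, int batchSize) {
        for (XxMessage m : a.getXxMessages(from, to, batchSize)) {
            b.handleXxMessage(m);
        }
    }

    public static void main(String[] args) {
        LocalDateTime now = LocalDateTime.now();
        ServiceA a = new ServiceA();
        ServiceB b = new ServiceB();
        a.landed.add(new XxMessage("m1", now.minusMinutes(5)));
        a.landed.add(new XxMessage("m2", now.minusMinutes(3)));
        b.handleXxMessage(a.landed.get(0)); // m1 already arrived via MQ

        compensate(a, b, now.minusMinutes(10), now, 100); // replay the window
        System.out.println("handled=" + b.handled.size()); // m1 not double-processed
    }
}
```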
4. It is best for all message consumers to implement idempotent processing, even if the MQ product claims exactly-once delivery; being idempotent on your own side makes everything easier.
5. Some scenarios need delayed messages or delayed message queues; RabbitMQ and RocketMQ, for example, implement these in different ways.
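As an in-process analogue of the idea (not how RabbitMQ or RocketMQ implement it), java.util.concurrent.DelayQueue holds an element back until its delay elapses:

```java
import java.util.concurrent.DelayQueue;
import java.util.concurrent.Delayed;
import java.util.concurrent.TimeUnit;

// A message that becomes visible to consumers only after its delay elapses.
class DelayedMessage implements Delayed {
    final String body;
    private final long deliverAtNanos;

    DelayedMessage(String body, long delayMillis) {
        this.body = body;
        this.deliverAtNanos = System.nanoTime()
                + TimeUnit.MILLISECONDS.toNanos(delayMillis);
    }

    @Override
    public long getDelay(TimeUnit unit) {
        return unit.convert(deliverAtNanos - System.nanoTime(), TimeUnit.NANOSECONDS);
    }

    @Override
    public int compareTo(Delayed other) {
        return Long.compare(getDelay(TimeUnit.NANOSECONDS),
                other.getDelay(TimeUnit.NANOSECONDS));
    }
}

public class DelayedMessageSketch {
    public static void main(String[] args) throws InterruptedException {
        DelayQueue<DelayedMessage> queue = new DelayQueue<>();
        queue.put(new DelayedMessage("notify-red-envelope-expiry", 150));
        queue.put(new DelayedMessage("close-unpaid-order", 30));

        // take() blocks until the earliest deadline passes, so messages come
        // out in deadline order, not insertion order.
        System.out.println(queue.take().body);
        System.out.println(queue.take().body);
    }
}
```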
6. MQ messages generally come in two types: one that is (preferably) consumed by exactly one consumer exactly once, and one that every subscriber may handle, with no limit on the number. MQ middlewares implement these two forms differently, sometimes via message types, sometimes via different exchanges, sometimes via consumer groups (different groups can each receive the same message); in general, both are supported. When adopting a specific product, be sure to study its documentation and run experiments to confirm that both kinds of messages are handled the way you expect, to avoid strange surprises.
7. Monitor your messages well. The most important thing is to watch for message backlog; when it occurs, downstream processing capacity may need reinforcement (more machines, more threads). Even better, build a heat map of all message flow rates so you can see at a glance which messages are under pressure. You might think that since messages are not lost inside the MQ system, a backlog is harmless. Messages can indeed accumulate moderately, but not massively: if the MQ system hits a storage problem, losing a large backlog is much more trouble, and some business systems treat message handling as time-sensitive, ignoring late-arriving messages as business violations.
8. The figure draws two MQ clusters, one internal and one external. The reason is that permissions on the internal cluster can be relatively loose, while on the external cluster every topic must be accounted for and maintained by designated people; topics must not be deleted from the cluster arbitrarily, or chaos follows. Hard isolation of internal and external messages is also good for performance; isolating the internal and external MQ clusters in the production environment is recommended.

Scheduled Tasks
Scheduled tasks arise from several kinds of requirements:
1. As mentioned earlier, MQ notification across services inevitably has reachability problems, and we need some mechanism to compensate.
2. Some operations are driven by task tables; the design of the task table is described in detail below.
3. Some business is processed on a schedule and does not need to be real-time (notifying users that a red envelope is about to expire, end-of-day reconciliation with the bank, billing users, and so on). The difference from point 2 is that the times and frequencies here vary widely, while point 2 generally runs at a fixed frequency.
Let me explain in detail what task-driven means. We create some task tables in the database and use them to drive the core data-processing system. This passive mode of operation is the most reliable, more reliable than the MQ-driven or service-driven forms, being inherently load-balanced + idempotent + compensated to the end. A task table can include fields such as a status, a retry count, the next scheduled execution time, and creation/update timestamps.
Besides these fields, you can add some of the business's own fields as redundancy, such as order status or user ID. The task table can be archived to keep its volume down. Since the task table plays the role of a message queue, we need monitoring that can alarm on data backlog, imbalanced processing across consumers, dead-letter data, and similar situations. If our flow processes tasks in sequence A, B, C, D, each task's own check interval may waste a little time, so it is not as efficient as real-time chaining through MQ. But consider that task processing is usually bulk fetch + parallel execution, unlike MQ's per-record processing; overall throughput will not differ much, only the latency of a single record. Given the passive stability of task-table-driven execution, for some businesses this is a good option.
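A sketch of one run of a task-table-driven job; the "table" is an in-memory list, and the field names (status, retryCount) are illustrative, not a prescribed schema. In a real system, the batch fetch would be a query along the lines of `status = PENDING AND next_run_at <= now LIMIT batchSize`.

```java
import java.util.ArrayList;
import java.util.List;

enum TaskStatus { PENDING, DONE, DEAD }

class Task {
    final long id;
    TaskStatus status = TaskStatus.PENDING;
    int retryCount = 0;
    final boolean failing; // simulates a task whose processing always fails

    Task(long id, boolean failing) { this.id = id; this.failing = failing; }
}

public class TaskTableSketch {

    static final int MAX_RETRIES = 3;

    // One job run: fetch a batch of pending tasks and process each one.
    // Failing tasks are retried up to MAX_RETRIES and then dead-lettered, so
    // one stuck record cannot block the whole queue.
    static void runJobOnce(List<Task> table, int batchSize) {
        List<Task> batch = new ArrayList<>();
        for (Task t : table) {
            if (t.status == TaskStatus.PENDING) {
                batch.add(t);
                if (batch.size() == batchSize) break;
            }
        }
        for (Task t : batch) {
            if (t.failing) {
                t.retryCount++;
                if (t.retryCount >= MAX_RETRIES) t.status = TaskStatus.DEAD;
            } else {
                t.status = TaskStatus.DONE; // business processing succeeded
            }
        }
    }

    public static void main(String[] args) {
        List<Task> table = new ArrayList<>();
        table.add(new Task(1, false));
        table.add(new Task(2, true));
        table.add(new Task(3, false));

        for (int i = 0; i < 5; i++) runJobOnce(table, 10); // periodic schedule

        for (Task t : table) System.out.println("task " + t.id + ": " + t.status);
    }
}
```

The dead-letter cap here is the same precaution as point 3 in the job principles below: bound the compensation count so poisoned data cannot stall everything behind it.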
Here are some design principles for jobs:
1. Jobs can be driven by various scheduling frameworks, such as ElasticJob or Quartz. They need their own separate project and should not be mixed in with services; deploying them together often causes startup problems. Of course, implementing a task-scheduling framework yourself is not very troublesome either: it decides at execution time which machine runs the job, so cluster resources are used more sensibly. Plainly put, there are two forms: either the job code is deployed in place and the framework merely triggers it there, or the framework distributes the code to a process to run.
2. The job project is just a skin: at most some configuration aggregation, no actual business logic, no touching the database; in most scenarios it simply calls the API of the relevant service. The job project is responsible for configuration and frequency control.
3. For compensation-type jobs, watch the compensation count, to avoid the whole task queue getting stuck on dead-letter data.
With the three carriages covered, let us finally go over the modules of a whole project under this architecture:
Each of these modules can be packaged independently. The projects do not all have to live in one project space; they can be split into 20 projects, with each service's API + server + listener in one project. This actually benefits CI/CD; the drawback is that modifying code may require opening N projects.
As I said at the beginning, this simple architecture scales very well and does not cost much more in complexity or workload than an all-in-one architecture, though you may disagree. In fact, it depends on the team's accumulated experience: if the team is familiar with this architectural system and has played with microservices for years, then many of these issues are handled naturally during coding, much of the design becomes second nature, and after doing it enough you know what goes where and how to divide and group, so there is not much extra time cost. These three carriages constitute a simple and practical architecture solution that I think applies to most Internet projects, though some projects will lean more on one aspect and weaken another. I hope this article is useful to you.