Comparison of the advantages and disadvantages of DDD CQRS architecture and traditional architectures


Tomorrow is New Year's Eve and I am free at home today, so I want to analyze the characteristics of the CQRS architecture and its advantages and disadvantages compared with traditional architectures. First of all, I wish everyone a happy Year of the Monkey, good luck, and good health!

In recent years, in the DDD field, we have often seen the concept of the CQRS architecture. I personally wrote the ENode framework specifically to implement this architecture. The core idea of CQRS is actually very simple: read-write separation, which is easy to understand. It is just like MySQL master-slave replication: data is written to the master and queried from the slave, with master-slave data synchronization handled by MySQL itself; that is read-write separation at the database level. There are already many introductions to the CQRS architecture, which you can find on Baidu or Google yourself. Today I would like to compare the CQRS architecture with traditional architectures (the three-tier architecture and the classic DDD four-tier architecture) in terms of data consistency, extensibility, availability, scalability, and performance, and summarize some advantages and disadvantages to serve as a reference for your architecture selection.

Objective

Because CQRS is only an idea of read-write separation, it can be implemented in many ways. For example, the data storage may not be separated at all, with only the code split into read and write sides; that is already an embodiment of CQRS. Or the data storage may also be separated: the C side is responsible for data modification, the Q side is responsible for data queries, and the Q side synchronizes its data through events published by the C side; this is also a CQRS architecture. The CQRS architecture I discuss today refers to this latter implementation. Another important point: on the C side we also introduce two further architectural ideas, Event Sourcing and In-Memory. I believe these two ideas combine perfectly with CQRS and allow the architecture to deliver its maximum value.

Data consistency

In traditional architectures, data is generally strongly consistent. We usually use database transactions to ensure that all data modifications of one operation happen within a single database transaction, thereby guaranteeing strong consistency. In distributed scenarios, if we also want strong consistency, we have to use distributed transactions. But it is well known that the difficulty and cost of distributed transactions are very high, and a system using them will have lower throughput and lower availability. So most of the time we give up strong consistency and adopt eventual consistency; from the perspective of the CAP theorem, we give up consistency and choose availability.

The CQRS architecture fully embraces eventual consistency. It is based on an important assumption: the data the user sees is always old. For a system operated by many users concurrently, this phenomenon is very common. Take a flash-sale scenario: when you open the page, it may show that items are still in stock, but when you place the order, the system tells you the product is sold out. If we think about it, this is true in general: the data we see on the interface is read from the database, and once displayed on the screen it no longer changes, while others may meanwhile have modified the data in the database. This phenomenon is especially common in high-concurrency web systems.

So, based on this assumption, we know that even if our system achieves strong data consistency, the user is still likely to see old data. This gives us a new idea for designing the architecture: we only need to ensure that all additions, deletions, and modifications are based on up-to-date data; the queried data does not have to be up to date. This naturally leads to the CQRS architecture: the C side keeps its data up to date and strongly consistent, while the Q side's data need not be up to date and is updated asynchronously via the C side's events. Based on this idea, we can then think about how to implement the C and Q sides. At this point you may have another question: why must the C side's data be up to date? This is actually easy to understand: when you modify data, you usually have business rules to check, and if you check them against stale data the judgment is meaningless or inaccurate, so a modification based on old data is pointless.
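The split described above can be sketched in a few lines. This is a minimal illustration with hypothetical names (ProductAggregate, read_view), not any framework's actual API: the C side validates the command against the latest state, while the Q side may serve a stale view.

```python
# Hypothetical sketch: C side validates against up-to-date state,
# Q side may lag behind (eventual consistency).

class ProductAggregate:
    """C-side write model: always the up-to-date source of truth."""
    def __init__(self, product_id, stock):
        self.product_id = product_id
        self.stock = stock

    def place_order(self, quantity):
        # The business rule is checked against current state --
        # this is exactly why the C side must be up to date.
        if quantity > self.stock:
            raise ValueError("sold out")
        self.stock -= quantity
        return {"type": "OrderPlaced",
                "product_id": self.product_id,
                "quantity": quantity}

# Q-side read model: updated asynchronously, so it can be stale.
read_view = {"p1": {"stock": 10}}

agg = ProductAggregate("p1", stock=1)   # real stock is already 1
print(read_view["p1"]["stock"])         # the user still sees 10
try:
    agg.place_order(5)                  # rejected against the real state
except ValueError as e:
    print(e)
```

The point of the sketch: the stale read view is harmless as long as every state change is decided against the C side's current data.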

Extensibility

In traditional architectures, components depend strongly on each other through direct method calls between objects, whereas the CQRS architecture is event-driven. At the micro level of aggregate roots: in the traditional architecture, the application layer coordinates multiple aggregate roots with procedural code to complete a whole business operation in one transaction; in the CQRS architecture, interactions between multiple aggregate roots are ultimately realized in an event-driven way, in the spirit of a saga. In addition, the C and Q sides of the CQRS architecture are themselves an embodiment of the event-driven approach, with data synchronized asynchronously through events. Raised to the architectural level, the former is the idea of SOA, the latter the idea of EDA. In SOA, one service invokes another service to complete their interaction, so the services are tightly coupled; in EDA, one component subscribes to another component's event messages and updates its own state based on the event's information, so in an EDA architecture components do not depend on each other: they are associated only through topics, and the coupling is very low.

The coupling difference between the two architectures is obvious from the above, and an architecture with low coupling necessarily has good extensibility. With the SOA approach, when I want to add a new function I need to modify existing code: for example, service A originally calls services B and C, and if we now also want to call a new service D, we must change the logic of service A. With the EDA architecture, we do not need to touch the existing code at all: B and C already subscribe to the messages produced by A, and now we just need to add a new message subscriber D.
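The A/B/C/D example above can be sketched as a minimal publish-subscribe mechanism (all names hypothetical): the publisher A never changes; adding subscriber D is one new registration.

```python
# Minimal pub/sub sketch contrasting SOA coupling with EDA:
# A only publishes an event; B, C -- and later D -- subscribe.

subscribers = {}

def subscribe(topic, handler):
    subscribers.setdefault(topic, []).append(handler)

def publish(topic, event):
    # A's code: it knows the topic, not who listens.
    for handler in subscribers.get(topic, []):
        handler(event)

handled = []
subscribe("OrderCreated", lambda e: handled.append(("B", e)))
subscribe("OrderCreated", lambda e: handled.append(("C", e)))

# New requirement: service D reacts to the same event.
# No change to A -- just one more subscriber.
subscribe("OrderCreated", lambda e: handled.append(("D", e)))

publish("OrderCreated", {"order_id": 42})   # A publishes once
print([name for name, _ in handled])
```

In the SOA style, the same change would require editing A's call sequence; here A's `publish` line is untouched.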

From the CQRS point of view, another very obvious example is the extensibility of the Q side. Suppose our Q side was originally implemented with only a database, but later the system's traffic increased and the database could no longer keep up with updates or satisfy high-concurrency queries, so we want to add a cache to handle those queries. That is easy in the CQRS architecture: we just add a new event subscriber that updates the cache. In general, we can easily add new kinds of data storage on the Q side: databases, caches, search engines, NoSQL stores, logs, and so on. We can choose the appropriate Q-side data storage for our business scenario and achieve fast queries. This is all because the C side records every event that changes the model, so whenever we add a new view store, we can derive its latest state from those events. This kind of extensibility is hard to achieve in a traditional architecture.
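The "add a new view store later" idea rests on replaying the recorded events. A minimal sketch, with hypothetical event names and an in-memory dict standing in for the cache:

```python
# Because the C side records every state-changing event, a brand-new
# Q-side store (here a plain dict acting as a cache) can be added long
# after the fact and brought up to date by replaying the event stream.

event_store = [
    {"type": "AccountOpened",  "id": "a1", "balance": 100},
    {"type": "MoneyDeposited", "id": "a1", "amount": 50},
]

def apply_to_view(view, event):
    """One event subscriber: projects events into the new view store."""
    if event["type"] == "AccountOpened":
        view[event["id"]] = event["balance"]
    elif event["type"] == "MoneyDeposited":
        view[event["id"]] += event["amount"]
    return view

# The new cache did not exist when the events happened: replay to catch up.
cache = {}
for ev in event_store:
    apply_to_view(cache, ev)
print(cache)
```

The same `apply_to_view` shape works for any store: swap the dict for Redis, Elasticsearch, or a log writer and the C side never notices.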

Availability

Regarding availability, both the traditional and CQRS architectures can be highly available as long as no node in the system is a single point of failure. But by comparison, I think the CQRS architecture gives us more room for maneuver and choice.

In a traditional architecture, reads and writes are not separated, so their availability must be considered together, which is harder. If a system has a large concurrent write load at its peak, say 20,000, and an even larger concurrent read load, say 100,000, then the system must be optimized to support both this high-concurrency writing and querying at the same time, or it will go down at peak time. This is the drawback of a system based on synchronous calls: there is no peak shaving to buffer the excess requests of the moment; the system must handle every request in time no matter how many arrive, or an avalanche effect sets in and the system collapses. But a system is not always at its peak: the peak may last only half an hour or an hour. Yet to ensure the system does not go down at peak time, we must provision enough hardware to support that peak, and most of the time such a high level of hardware resources is not needed, which wastes resources. Therefore we say the implementation cost of a system based on synchronous invocation and SOA thinking is very high.

In the CQRS architecture, because reads and writes are separated, availability can be considered in two isolated parts: how the C side solves write availability, and how the Q side solves read availability. Solving availability on the C side is, I think, easier, because the C side is message-driven. When we want to modify any data, we send a command to a distributed message queue; a back-end consumer then handles the command, generates domain events, persists them, and publishes the events to the distributed message queue, where they are finally consumed by the Q side. This whole chain is message-driven, so its availability is much higher than the direct service method invocation of the traditional architecture: even if the back-end consumer of our commands is temporarily down, the front-end controller can still send commands and remains available. From this perspective, the CQRS architecture is more available for data modification. But you might ask: what if the distributed message queue itself goes down? Yes, that is indeed possible. But a distributed message queue is middleware, and middleware generally has high availability (supporting clustering and master-standby failover), so its availability is much higher than that of our applications. In addition, because commands are first sent to the distributed message queue, we can fully exploit its benefits: asynchrony, the pull model, peak shaving, and queue-based horizontal scaling. These features ensure that even if the front-end controller sends a flood of commands at peak time, the back-end command-processing application will not be overwhelmed, because we pull commands according to our own consumption capacity. This availability advantage of the CQRS C side is, in essence, the advantage of a distributed message queue.
So, from here we can see how valuable the EDA (event-driven architecture) approach is; it also embodies the currently popular idea of reactive programming.
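The pull-model benefit described above can be sketched with an in-process queue (a stand-in for real middleware such as RocketMQ or EQueue; names are hypothetical): the controller enqueues a burst instantly, and the consumer drains it at its own pace.

```python
# Sketch of peak shaving via the pull model: a burst of commands is
# absorbed by the queue, and the consumer pulls small batches it can handle.

from collections import deque

queue = deque()

def send_command(cmd):
    queue.append(cmd)          # cheap append: the front end stays responsive

def consume(batch_size):
    """Consumer pulls only as many commands as its capacity allows."""
    batch = []
    while queue and len(batch) < batch_size:
        batch.append(queue.popleft())
    return batch

for i in range(10):            # peak: 10 commands arrive at once
    send_command({"seq": i})

processed = []
while queue:
    processed.extend(consume(batch_size=3))   # drain in batches of 3
print(len(processed))
```

Nothing is lost and nothing is handled faster than the consumer allows; the queue, not the application, absorbs the peak.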

For the Q side, it should be said that there is no difference from the traditional architecture, because both are about handling high-concurrency queries: we optimize now the same way we optimized before. But as I emphasized under extensibility above, the CQRS architecture makes it easier to provide multiple view stores (databases, caches, search engines, NoSQL), and updates to these stores can proceed in parallel without dragging each other down. The ideal scenario, I think, is this: if your application needs complex queries such as full-text search, you can use a search engine like Elasticsearch on the Q side; if your query scenario can be satisfied by a key-value data structure, you can use a distributed NoSQL cache like Redis on the Q side. In short, I think the CQRS architecture makes query problems easier to solve than traditional architectures do, because we have more choices. But you might say your scenario can only be solved with a relational database and the query concurrency is very high. Then the only way is to spread out the query I/O: shard the database into multiple databases and tables, and set up one master with multiple slaves, routing queries to the slaves. On this point, the solution is the same as in the traditional architecture.

Performance and scalability

I originally wanted to write about performance and scalability separately, but since the two are closely related, I decided to cover them together.

Scalability means that if a system performs well (in throughput and response time) with 100 users, it performs equally well with 1,000,000 users. The pressure on the system from 100 users and from 1,000,000 users is obviously different. If our architecture lets us improve the system's serving capacity simply by adding machines, then we can say the architecture is highly scalable. So let us think about the performance and scalability of traditional architectures and the CQRS architecture.

When it comes to performance, we usually think about where a system's performance bottlenecks are. As long as we solve the performance bottleneck, the system can be scaled horizontally to achieve scalability (setting aside horizontal scaling of the data store itself for now). So we just need to analyze where the bottlenecks of the traditional architecture and the CQRS architecture lie.

In traditional architectures, the bottleneck is usually the underlying database. What we generally do is: for reads, use a cache to absorb most of the queries; for writes, there are many approaches, such as sharding into multiple databases and tables, or using NoSQL, and so on. Alibaba, for example, uses database and table sharding extensively, and in the future may replace the sharding scheme entirely with the impressive OceanBase. Through sharding, if the system must withstand 100,000 concurrent writes, then by spreading the data over 10 database servers, each machine only needs to bear 10,000 writes, and handling 10,000 writes is much easier than handling 100,000. So it should be said that for the traditional architecture, data storage is no longer the bottleneck.
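The sharding arithmetic above can be illustrated with a toy router (hypothetical scheme; real sharding middleware also handles resharding, hot keys, and cross-shard queries):

```python
# Minimal sharding sketch: route writes across 10 "database servers" by
# hashing the key, so each server bears roughly 1/10 of the total load.

NUM_SHARDS = 10
shards = [[] for _ in range(NUM_SHARDS)]   # stand-ins for 10 DB servers

def route(user_id):
    # Simple hash-mod routing; the same key always lands on the same shard.
    return hash(user_id) % NUM_SHARDS

def write(user_id, row):
    shards[route(user_id)].append(row)

# Simulate 100,000 writes spread across the shards.
for uid in range(100_000):
    write(uid, {"user": uid})

print(sorted(len(s) for s in shards))   # load per shard
```

Each shard receives about a tenth of the writes, which is the whole point of the scheme: the per-machine write load drops by the shard count.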

The data-modification steps in the traditional architecture are: 1) load the data from the DB into memory; 2) modify the data in memory; 3) write the updated data back to the DB. Two database I/Os in total.

In the CQRS architecture, the full C-to-Q path can take more time than the traditional architecture. Although it also involves only two database I/Os, 1) persisting the events and 2) updating the read store according to the events, the data on the C and Q sides must additionally be synchronized through MQ, which also takes time.

So we say that if you want to use the CQRS architecture, you must accept eventual consistency of the CQ data: if you count the process as complete only when the read store has been updated, it will likely take more time than the traditional architecture for the same business scenario. However, if we count completion at the end of C-side processing, the CQRS architecture is faster, because the C side only needs to persist the events (and with the In-Memory approach there is no need to load the aggregate root from the database at all). I think this point is important: in the CQRS architecture we care more about the time to finish C-side processing, and it is acceptable for the Q side to be slightly slower, because the Q side only serves our data views (eventual consistency). If we choose the CQRS architecture, we must accept the small delay in Q-side data updates; otherwise we should not use this architecture. So I hope that when you do architecture selection for your business scenario, you are fully aware of this. Those who have studied the ENode framework will know that on the C side, ENode makes many designs that are not easy to achieve with conventional architectures, and through these designs the C side can run faster: for example the In-Memory architecture, eliminating concurrent-update conflicts on aggregate roots by routing commands for the same aggregate to be handled sequentially, and an event store whose workload (big data, highly concurrent writes) can naturally be handled with a key-value NoSQL store such as HBase. All of this serves the performance of the C side.
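The Event Sourcing + In-Memory combination mentioned above can be sketched as follows (hypothetical names, a list standing in for the event store): the C side only appends events, the current state lives in memory, and any state can be rebuilt by replay.

```python
# Event Sourcing + In-Memory sketch: the only persistence I/O is an
# append of the event; the aggregate's state is kept in memory and can
# always be reconstructed by replaying its event stream.

event_store = []   # append-only; maps well to a KV store such as HBase

class Account:
    def __init__(self):
        self.balance = 0

    def apply(self, event):
        """Pure state transition: used both live and during replay."""
        if event["type"] == "Deposited":
            self.balance += event["amount"]
        elif event["type"] == "Withdrawn":
            self.balance -= event["amount"]

    def deposit(self, amount):
        event = {"type": "Deposited", "amount": amount}
        event_store.append(event)   # the single "DB I/O": persist the event
        self.apply(event)           # keep the in-memory state current

acct = Account()
acct.deposit(100)
acct.deposit(40)

# Rebuild from scratch by replay -- the events are the source of truth.
rebuilt = Account()
for ev in event_store:
    rebuilt.apply(ev)
print(acct.balance, rebuilt.balance)
```

Note the contrast with the traditional three-step update: there is no load-from-DB step, because the up-to-date state is already in memory.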

As a result, both architectures can generally overcome their performance bottlenecks, and once the bottleneck is overcome, scalability is not a problem. Both bottlenecks lie in data persistence, but because most traditional-architecture systems store their data in a relational database, they can only use the sharding scheme. In the CQRS architecture, if we focus only on the C-side bottleneck, the C side stores only events, and the event store can be built on NoSQL, so developers need to invest much less here: with the former you must design the sharding scheme yourself, while with the latter you only need to operate the NoSQL store. And as analyzed above, the CQRS read store does not necessarily need a relational database either. So I think this is one of the benefits of separating C and Q.

Conclusion

I think both the traditional architecture and the CQRS architecture are good architectures. The traditional architecture has a low entry barrier and more people know it, and since most projects do not have much concurrent write volume or data volume, for the majority of projects the traditional architecture is fine. But through the analysis in this article, we also know that the traditional architecture does have some shortcomings: in extensibility, availability, and solving performance bottlenecks, it is weaker than the CQRS architecture. If you have other views, feedback is welcome; exchange brings progress. So, if your scenario involves high-concurrency writes, high-concurrency reads, and big data, and you want to do better in extensibility, availability, performance, and scalability, I think you can try the CQRS architecture. There is a catch, however: the entry barrier of the CQRS architecture is very high, and I think it is hard to use without the support of a mature framework. As far as I know, there are not many mature CQRS frameworks in the industry: the Java platform has the Axon Framework and the Jdon Framework; on the .NET platform, the ENode framework is working in this direction. I think that is one reason there are few mature cases of the CQRS architecture. Another reason is that using the CQRS architecture requires the developer to have a certain understanding of DDD, otherwise it is hard to practice, and DDD itself can take years to understand well enough to apply in reality. Yet another reason is that the core of the CQRS architecture relies heavily on high-performance distributed messaging middleware, so selecting one is also a hurdle (the Java platform has RocketMQ); for the .NET platform I personally developed a distributed message queue, EQueue.
In addition, without the support of a mature CQRS framework, the coding complexity is high: Event Sourcing, message retries, idempotent message handling, event ordering, and concurrency control are not easy problems to solve. But with framework support, the framework handles these purely technical issues for us, and developers only need to focus on how to model and implement the domain model, how to update the read store, and how to implement the queries. Then using the CQRS architecture becomes feasible, development can even be easier than with the traditional architecture, and we gain the many benefits of the CQRS architecture.

