This article introduces the goals that the Enode framework aims to achieve, along with an analysis of some of the ideas behind its implementation. Overall, the Enode framework is an application development framework based on the CQRS architecture and driven by messages. Before we talk about implementation ideas, let's take a look at the goals the Enode framework wants to achieve.
Overall objectives of the framework
High throughput, low latency, and high availability;
Make full use of the CPU, that is, allow easy configuration of the number of parallel processing threads, so that the command-processing capacity of a single machine can be improved;
Support both synchronous and asynchronous Command processing: synchronous processing allows the client to catch exceptions, while asynchronous processing allows the client to set a callback function (see the sketch after this list);
The application programming model should be unified, and the framework API should be simple, consistent, easy to use, and easy to understand;
Allow developers to focus only on the business, without caring where data comes from or how it is saved, and without worrying about technical concerns such as concurrency, retries, and timeouts;
Be based on a message-driven architecture; in terms of message delivery, achieve at least once delivery (that is, messages must not be lost), and as far as possible avoid processing the same message more than once, because we cannot always make message handling idempotent;
Be sufficiently extensible: every component in the framework should allow the user to customize or replace it, including the IoC container;
Because this is a CQRS architecture, it is important to ensure that, for a single aggregate root, events are distributed to the query side in exactly the same order in which they occurred; otherwise there would be serious data-inconsistency problems.
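To make the synchronous and asynchronous styles mentioned in the list above concrete, here is a minimal sketch. Enode itself is a C# framework, so the names below (CommandService, sendSync, sendAsync, TransferCommand) are hypothetical Java stand-ins rather than the real API; the point is only the shape of the two calling styles: synchronous processing lets the caller catch exceptions directly, while asynchronous processing lets the caller register a callback.

```java
import java.util.concurrent.CompletableFuture;

// Hypothetical stand-ins for a command and a command service, only to
// illustrate the two calling styles; the real Enode API is a C# library.
public class CommandStyles {

    record TransferCommand(String fromAccount, String toAccount, long amount) {}

    interface CommandService {
        // Synchronous: blocks until the command is handled; failures surface as exceptions.
        void sendSync(TransferCommand command);

        // Asynchronous: returns immediately; the outcome arrives later via the future.
        CompletableFuture<Void> sendAsync(TransferCommand command);
    }

    // A trivial in-process implementation, just so the example runs.
    static class InProcessCommandService implements CommandService {
        @Override
        public void sendSync(TransferCommand command) {
            handle(command);
        }

        @Override
        public CompletableFuture<Void> sendAsync(TransferCommand command) {
            return CompletableFuture.runAsync(() -> handle(command));
        }

        private void handle(TransferCommand command) {
            if (command.amount() <= 0) {
                throw new IllegalArgumentException("amount must be positive");
            }
            System.out.println("Transferred " + command.amount()
                    + " from " + command.fromAccount() + " to " + command.toAccount());
        }
    }

    public static void main(String[] args) {
        CommandService service = new InProcessCommandService();
        TransferCommand command = new TransferCommand("A-001", "A-002", 100);

        // Synchronous processing: the client can catch the exception directly.
        try {
            service.sendSync(command);
        } catch (RuntimeException e) {
            System.err.println("Transfer failed: " + e.getMessage());
        }

        // Asynchronous processing: the client registers a callback instead of blocking.
        service.sendAsync(command)
                .whenComplete((ignored, error) -> {
                    if (error != null) {
                        System.err.println("Transfer failed: " + error.getMessage());
                    } else {
                        System.out.println("Transfer accepted");
                    }
                })
                .join(); // wait only so the demo output appears before the JVM exits
    }
}
```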
Analysis of ideas for achieving high throughput, low latency, and high availability
Concept:
Throughput is the number of requests the system can process per second; latency is the time it takes to process a single request. In general, the performance of a system is constrained by both. For example, if my system can withstand 1 million concurrent requests but its latency is more than 2 minutes, then that 1 million load is meaningless; conversely, very short latency with very low throughput is equally meaningless. A good performance test is therefore bound by both conditions. Experienced readers will know that the two are related: the higher the throughput, the worse the latency tends to be, because when there are too many requests the system becomes too busy and responds more slowly; and the shorter the latency, the higher the throughput that can be supported, because short latency means fast processing, so more requests can be handled. The most fundamental point, then, is to shorten the processing time of a single request as much as possible. In addition, availability refers to the system's mean time between failures: the higher the availability, the longer the system runs without faults. If your system can keep running 24x7, 365 days a year, then it is highly available.
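As a rough back-of-the-envelope illustration (a simplification of Little's Law, assuming a fixed pool of worker threads, and not anything specific to Enode), the two quantities constrain each other roughly as:

throughput ≈ concurrency / latency

For example, 100 worker threads at 10 ms per request give roughly 100 / 0.01 = 10,000 requests per second; if latency degrades to 100 ms, the same pool only sustains about 1,000 requests per second. Likewise, availability is usually quoted as an uptime percentage: 99.9% availability still allows roughly 8.76 hours of downtime per year, while 99.99% allows less than an hour.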
General ideas:
What should we do to achieve high availability? The simple approach is active-standby mode: the same site runs on both a primary and a standby server. While the primary server is healthy, all requests are handled by it; when the primary server goes down, requests automatically switch to the standby server. This guarantees high availability, and we can even set up multiple standby servers to increase availability further. But active-standby mode does not solve the problem of high throughput, because the number of requests one machine can handle is always limited. I think we need the system to support cluster deployment: not just one machine in service, but many machines in service at the same time; these machines together are called a cluster. And to spread the load evenly across the servers in the cluster, so that one server is not overloaded while the others sit idle, we also need load-balancing technology. Of course, true high availability also means there must be no single point of failure: the whole cluster must not go down just because one node in it does, so we must avoid any design in which all data flows through a single point; instead, every tier must scale horizontally. The web application tier (supported by the Enode framework), the in-memory storage and cache tier (supported by Memcached and Redis), and the persistence tier (supported by MongoDB) all need to support clustering and load balancing. So cluster deployment plus load balancing solves high throughput and removes single points of failure, but it does not solve low latency. How can we handle a single user request as quickly as possible? I think the key lies in three things: in memory + IO that is as fast as possible + freedom from contention, that is, an in-memory model plus fast data persistence plus a non-blocking programming model.
In Memory
What does in memory mean? In the Enode framework it mainly means that when we want to fetch a domain aggregate root object to perform some business logic on it, we fetch it from memory rather than from the database. The advantage of this is speed. What problems do we then have to face, such as running out of memory? We use distributed caches such as Memcached and Redis, which are mature key-value NoSQL products. What if the Redis server goes down? It doesn't matter; we can let the framework handle it automatically: when it finds that the aggregate root is not in the memory cache, it automatically fetches all of that aggregate root's events from the Eventstore, rebuilds the aggregate root using the event sourcing (ES) mechanism, then tries to update the cache and returns the aggregate root to the caller. This solves the problem of the cache going down; once the Redis cache server restarts, aggregate roots can again be served from the cache. In practice we should also deploy the Redis servers according to the situation, so that the data can be sharded; this improves the availability of the cache on the one hand, and on the other hand prevents all of the system's cached data from being lost just because one Redis cache server goes down. You may also wonder: where does the data in the Redis cache server come from in the first place? Again from the ES approach: because we store all the events of all aggregate roots in the Eventstore, when the Redis cache server starts we can rebuild the aggregate roots that need to be cached according to the ES pattern.
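The "check the cache first, rebuild from the event store on a miss" behavior described above can be sketched as follows. This is a minimal illustration under my own naming, not Enode's actual implementation (which is in C#); the types AccountAggregate, AggregateCache, EventStore, and Repository are hypothetical stand-ins.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// A minimal sketch of "check the cache first, rebuild from the event store on a
// miss". Enode itself is C#; these types are hypothetical stand-ins.
public class AggregateRepositorySketch {

    // Domain events of a toy "account" aggregate root.
    sealed interface AccountEvent permits AccountOpened, MoneyDeposited {}
    record AccountOpened(String accountId) implements AccountEvent {}
    record MoneyDeposited(String accountId, long amount) implements AccountEvent {}

    // The aggregate root, rebuilt by replaying its events (event sourcing).
    static class AccountAggregate {
        final String id;
        long balance;

        AccountAggregate(String id) { this.id = id; }

        void apply(AccountEvent event) {
            if (event instanceof MoneyDeposited deposit) {
                balance += deposit.amount();
            }
        }

        static AccountAggregate replay(String id, List<AccountEvent> events) {
            AccountAggregate aggregate = new AccountAggregate(id);
            events.forEach(aggregate::apply);
            return aggregate;
        }
    }

    interface AggregateCache {
        Optional<AccountAggregate> get(String id);
        void put(AccountAggregate aggregate);
    }

    interface EventStore {
        List<AccountEvent> loadEvents(String aggregateId);
    }

    static class Repository {
        private final AggregateCache cache;
        private final EventStore eventStore;

        Repository(AggregateCache cache, EventStore eventStore) {
            this.cache = cache;
            this.eventStore = eventStore;
        }

        // Try the in-memory cache first; on a miss (for example after the cache
        // server restarts), replay the aggregate's events from the event store,
        // refresh the cache, and return the rebuilt aggregate root.
        AccountAggregate load(String id) {
            return cache.get(id).orElseGet(() -> {
                AccountAggregate aggregate =
                        AccountAggregate.replay(id, eventStore.loadEvents(id));
                cache.put(aggregate);
                return aggregate;
            });
        }
    }

    public static void main(String[] args) {
        Map<String, AccountAggregate> backing = new HashMap<>();
        AggregateCache cache = new AggregateCache() {
            public Optional<AccountAggregate> get(String id) {
                return Optional.ofNullable(backing.get(id));
            }
            public void put(AccountAggregate aggregate) {
                backing.put(aggregate.id, aggregate);
            }
        };
        // The event store is the source of truth; here it just returns a fixed history.
        EventStore eventStore = id -> List.of(
                new AccountOpened(id), new MoneyDeposited(id, 100), new MoneyDeposited(id, 50));

        Repository repository = new Repository(cache, eventStore);
        System.out.println("Balance: " + repository.load("A-001").balance); // rebuilt from events
        System.out.println("Balance: " + repository.load("A-001").balance); // served from the cache
    }
}
```

The same shape applies regardless of which cache product sits behind AggregateCache: if Redis (or any other cache) is empty, the event store can always reconstruct the aggregate root, which is exactly why the cache going down is not fatal.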