General static data cache mechanism


General static data caching mechanism in a microservice architecture
In distributed systems, and especially in the currently popular microservice architecture, is there a general caching mechanism for static business data, or can one be summarized? This article draws on practical R&D experience to clarify the key problems and explore a common solution.

What is static data?
Static data here refers to data that changes infrequently, such as a vehicle model library, basic user information, and basic vehicle information. The vehicle model library may be updated once a month, and changes to basic user and vehicle information come from user registration and profile edits, so the write frequency is relatively low.

Another characteristic of this type of data is that the requirements for accuracy and freshness are relatively high: no loss, no errors, and no long-stale reads.

Whether data should be classified as static depends on the specific business and on the threshold chosen for change frequency. In the business discussed here, the data types above are classified as static.

Why cache?
In consumer-facing or IoV (Internet of Vehicles) business scenarios, vehicle model information, basic user information, and basic vehicle information are needed widely and at high frequency, and much of the data must be joined together. The purpose of caching is to improve query efficiency. Static data is usually stored in a relational database, whose I/O efficiency is generally not high and which often cannot keep up with highly concurrent queries. A cache can greatly improve read throughput, especially a KV cache: there are no complex relational operations, and lookups are generally O(1). Note that "cache" here means an in-memory cache.

Of course, besides caching, I/O throughput can also be improved by other means, such as read/write splitting and database/table sharding. However, these relational-database-oriented solutions improve read and write efficiency together; for the goal of simply increasing read throughput they are not targeted enough, and they do not make the best use of limited resources.

General Cache Mechanism
Next, I will present and analyze what I consider a general processing mechanism.

For a specific business, the mechanism involves six core programs:

Business service: provides operation interfaces for a certain kind of business data, such as a vehicle service that creates, deletes, updates, and queries basic vehicle information.
Relational database: persists the business data in a number of tables; for example SQL Server, MySQL, or Oracle.
Persistent queue: an independently deployed queue program that supports data persistence, such as RabbitMQ, RocketMQ, or Kafka.
Cache handler: receives data from the queue and writes it to the cache.
Data consistency checker: checks whether the data in the cache database is consistent with the relational database; if not, it refreshes the cache from the relational database.
Cache database (Redis): a cache database that supports persistence. Redis is chosen directly here, as it is basically the industry standard.
And two external roles:

Data producer: the source of the static business data. Think of it as a function or module of a front-end app or web system.
Data consumer: a service or system that uses the static business data. For example, an alarm system needs the user information of a vehicle in order to send an alert.
The mechanism is explained below in question-and-answer form.

Why do we need business services?
Since this is a microservice architecture, services are of course indispensable, and since we are discussing static business data, it is a business service. Still, for clarity, the reason such a service exists is briefly described here.

Today's business often has to serve multiple terminals, such as PCs, mobile phones, and tablets, in both web and app form; moreover, the same data may be needed by several different services. If data operations are scattered across multiple programs, data inconsistency can occur, code redundancy is inevitable, read/write performance is harder to control, and making changes becomes nearly impossible. With a business service, operations on the business data are managed in one place and exposed through interfaces: code is not duplicated, performance can be optimized centrally, inconsistency is brought under some control, and upper-layer applications are more comfortable to write.
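To make the encapsulation idea concrete, here is a minimal sketch of a business service that owns all access to its data. The class and method names are illustrative, and plain dicts stand in for the relational database and Redis:

```python
# Minimal sketch of a business service that encapsulates data access.
# The "database" and "cache" are plain dicts standing in for a real
# relational database and Redis; all names are illustrative.

class VehicleService:
    def __init__(self, db, cache):
        self._db = db        # stand-in for the relational database
        self._cache = cache  # stand-in for Redis

    def update_vehicle(self, vin, info):
        # Write the system of record first, then refresh the cache,
        # so every caller goes through this single code path.
        self._db[vin] = info
        self._cache[vin] = info

    def get_vehicle(self, vin):
        # Consumers only see this interface; they never touch storage.
        if vin in self._cache:
            return self._cache[vin]
        info = self._db.get(vin)
        if info is not None:
            self._cache[vin] = info
        return info

db, cache = {}, {}
svc = VehicleService(db, cache)
svc.update_vehicle("VIN123", {"owner": "alice"})
print(svc.get_vehicle("VIN123"))  # served from the cache
```

Because every read and write funnels through the service, a later change of storage engine or cache strategy touches one program instead of many.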

Why not an in-process cache?
Many languages support in-process caching, and even without a cache package or library it can be implemented with static variables. Data queries are then served directly from the process's memory, which is extremely efficient. However, an in-process cache has two problems:

Cache size: how much a process can cache is limited by the memory available on the machine. If multiple services are deployed on the same machine and one uses too much memory, it can disturb the others, so an in-process cache is not suitable for large amounts of data.
Cache avalanche: when a large number of entries expire at the same time, or the process restarts, massive cache misses occur and too many requests hit the relational database, which may bring it down and cause wider unavailability.
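The restart half of the avalanche problem can be illustrated in a few lines. In this sketch a dict simulates the in-process cache, discarding it simulates a process restart, and the counter shows the burst of database hits that follows; all names are illustrative:

```python
# Illustrates the restart problem of an in-process cache: when the
# process "restarts" (simulated by discarding the dict), every request
# falls through to the database at once. Names are illustrative.

db_hits = 0
DB = {f"user:{i}": {"id": i} for i in range(1000)}

def query_db(key):
    global db_hits
    db_hits += 1
    return DB[key]

cache = {}
def get(key):
    if key not in cache:
        cache[key] = query_db(key)
    return cache[key]

for i in range(1000):
    get(f"user:{i}")      # warm the cache: 1000 DB hits
for i in range(1000):
    get(f"user:{i}")      # all served from memory: no new hits
warm_hits = db_hits

cache = {}                # simulated process restart: cache is gone
for i in range(1000):
    get(f"user:{i}")      # a burst of 1000 requests hits the DB again
print(warm_hits, db_hits)  # 1000 2000
```

With an external cache such as Redis, a restart of the business service would not wipe the cache, so this burst never reaches the database.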
Why Redis?
Redis (and databases like it) solves both problems of the in-process cache:

It is deployed independently and does not affect other services, and it can be clustered, which makes memory scaling easy.
It supports data persistence: even if Redis restarts, the cached data can be restored quickly.
In addition, Redis offers good read/write performance and convenient horizontal scaling, and it supports a variety of common, easy-to-use data structures. It can fairly be called the default choice for a general cache.

Why queue?
The purpose of the queue here is decoupling. Frankly, the solution also works without a queue: the service can update the cache directly after the relational database operation completes. The queue is added because current business development demands clear system splitting, especially under a microservice architecture; to reduce coupling between services, a queue is a common choice, and some development models, such as the actor model, strongly advocate it.

For example, when a new user registers, you may need to grant 300 points and send a registration-success email. If registering the user, granting the points, and sending the email are executed together, several problems arise: first, the registration operation takes longer; second, a failure in any one step raises the overall unavailability rate; third, the program scales poorly. By introducing queues and sending the registration event to a points queue and a notification queue, the points module and the notification module each handle their own work. Coupling between the user, points, and notification modules drops, they affect each other less, and adding another post-registration step is just a matter of adding another queue, so overall scalability improves.

As a common decoupling solution, the queue does not contribute much to caching itself, but besides cache updates there will inevitably be other business processing, so it is kept here as part of a unified mechanism. (If it suits your scenario, keep it; otherwise it can be dropped.)
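The queue-based flow can be sketched as follows. Here `queue.Queue` from the standard library stands in for RabbitMQ, dicts stand in for the database and Redis, and all function and key names are illustrative:

```python
import queue

# Sketch of decoupling the cache write from the business operation via
# a queue. queue.Queue stands in for RabbitMQ; dicts stand in for the
# relational database and Redis. All names are illustrative.

change_events = queue.Queue()
db, cache = {}, {}

def register_user(user_id, profile):
    # The business service only writes the database and publishes a
    # change event; it knows nothing about the cache.
    db[user_id] = profile
    change_events.put({"key": user_id, "value": profile})

def cache_handler_drain():
    # The cache handler subscribes to change events and writes them
    # into the cache database.
    while not change_events.empty():
        event = change_events.get()
        cache[event["key"]] = event["value"]
        change_events.task_done()

register_user("u1", {"name": "alice", "points": 300})
cache_handler_drain()
print(cache["u1"])
```

In a real deployment the cache handler would run as its own process consuming from the broker, so a slow cache write never delays the registration path.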

Why does the queue need to be persistent?
Persistence addresses data loss caused by network jitter or crashes: data can be lost on the way from the service to the queue, inside the queue, or from the queue to the cache handler. Preventing loss requires publisher confirms, persistence in the queue itself, and consumer acknowledgements. Note, however, that the confirmation mechanism can produce duplicate data: a message may be redelivered because a confirmation was lost even though the data was processed normally, so the consumer must tolerate redelivery. Confirmations also reduce queue throughput, but by our definition static business data does not change at high frequency; if high concurrency is nevertheless required, sharding the queue is a good option.

RabbitMQ is recommended here as the persistent queue. Its throughput is not the highest, but it is solid all around, and its concurrency is sufficient.
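Since the confirmation mechanism implies possible redelivery, the cache handler should be idempotent. One common approach, sketched below with a dict standing in for Redis and illustrative message fields, is to attach a version (or last-modified timestamp) to each message and ignore anything not newer than what the cache already holds:

```python
# Publisher confirms and consumer acks can both cause redelivery, so
# the cache handler must be idempotent. Illustrative approach: only
# apply a message whose version is newer than the cached one.

cache = {}  # stand-in for Redis

def apply_update(msg):
    key, version = msg["key"], msg["version"]
    current = cache.get(key)
    if current is not None and current["version"] >= version:
        return False  # duplicate or out-of-order delivery: ignore
    cache[key] = {"version": version, "value": msg["value"]}
    return True

apply_update({"key": "car:1", "version": 1, "value": "v1"})
apply_update({"key": "car:1", "version": 2, "value": "v2"})
# The broker redelivers version 2 because an ack was lost:
applied = apply_update({"key": "car:1", "version": 2, "value": "v2"})
print(applied, cache["car:1"]["value"])  # False v2
```

With this check, redelivered or reordered messages can never overwrite newer data, so at-least-once delivery becomes safe for the cache.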

Why is a data consistency checker needed?
Suppose the business service completes the relational database operation but crashes before sending the data to the queue (or, without a queue, before writing the cache directly); the cache is then never updated. Another case is a Redis failover in which updates on the master have not yet been synchronized to the replica. By introducing a checker that periodically compares the relational database with the cache and refreshes any outdated entries, we get a rescue path for these extreme cases.

How often the checker runs must balance the load it puts on the database against how much stale data can be tolerated: it must neither hammer the database nor let inconsistencies accumulate for too long. You can record the last check time and examine only the rows changed since then (or since the last few runs, to cover data lost in a Redis failover), so that each check handles only incremental changes, which is much more efficient.
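The incremental check described above can be sketched as follows. Dicts stand in for the database and Redis, the overlap window is an illustrative parameter covering a failover losing recent writes, and all names are assumptions:

```python
import time

# Sketch of the incremental consistency check: only rows changed since
# the last run (minus a small overlap window, to cover a Redis failover
# losing recent writes) are compared with the cache and repaired.
# Dicts stand in for the database and Redis; names are illustrative.

OVERLAP = 60  # re-check one extra minute of history each run

def consistency_check(db_rows, cache, last_check):
    now = time.time()
    fixed = 0
    for key, row in db_rows.items():
        if row["updated_at"] < last_check - OVERLAP:
            continue  # unchanged since the previous run: skip
        if cache.get(key) != row["value"]:
            cache[key] = row["value"]  # repair from the system of record
            fixed += 1
    return fixed, now  # caller persists `now` as the next last_check

now = time.time()
db_rows = {
    "u1": {"value": "fresh", "updated_at": now},       # changed; cache stale
    "u2": {"value": "old", "updated_at": now - 3600},  # unchanged; skipped
}
cache = {"u1": "stale", "u2": "old"}
fixed, next_mark = consistency_check(db_rows, cache, last_check=now - 10)
print(fixed, cache["u1"])  # 1 fresh
```

In a real system the time filter would be a `WHERE updated_at > ?` query so that unchanged rows never leave the database at all.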

At the same time, understand that in distributed systems, and under a microservice architecture in particular, data inconsistency is common. We must trade off consistency against availability to minimize the impact, for example by accepting near-real-time updates or eventual consistency.

Is the data consistency checker alone enough?
Without the cache handler, is it enough to synchronize the relational database and the cache periodically? That depends on the business. For vehicle model data, a newly added model did not exist before and is not very time-sensitive, so periodic sync is fine. But when a new user or vehicle is added, data consumers usually want to work with the latest data immediately, the sooner the better, so synchronous or near-synchronous updates are preferable.

Why not use the cache expiration mechanism?
With a cache expiration mechanism, neither the cache handler nor the consistency checker is needed: the service first queries Redis and returns the data if present; on a miss it queries the relational database and fills the cache. This is also a common caching pattern, easy to find online, and used well by many.

However, the expiration time is a problem. A short TTL reduces staleness but increases the probability of cache misses penetrating to the database, and even with randomized TTLs, a Redis restart or failover can still trigger a cache avalanche; when an avalanche occurs, the rush of reloads from the database can leave the service unavailable for quite a while. A long TTL improves the cache hit rate but leaves data stale, while our definition of static data demands accuracy and freshness, so business requirements would not be met. And if writes and queries fluctuate, must a dynamic TTL mechanism then be introduced to balance hit rate against database load? Business needs and the technical solution have to be weighed against each other.
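The randomized-TTL idea mentioned above, which spreads out expirations so that a batch of keys loaded together does not expire together, can be sketched in a few lines. The base TTL and jitter range are illustrative; as noted, this mitigates simultaneous expiry but not a Redis restart or failover:

```python
import random

# Sketch of TTL jitter: spread expiration times around a base value so
# keys cached at the same moment do not all expire at the same moment.
# The base TTL and jitter fraction are illustrative assumptions.

def ttl_with_jitter(base_seconds=3600, jitter_fraction=0.2):
    jitter = base_seconds * jitter_fraction
    return base_seconds + random.uniform(-jitter, jitter)

ttls = [ttl_with_jitter() for _ in range(1000)]
print(min(ttls) >= 2880, max(ttls) <= 4320)  # all within 3600 +/- 20%
```

Each key would be stored with its own jittered TTL (for example via Redis `EXPIRE`), turning one expiration spike into a spread-out trickle.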

Summary
Having walked through these questions and answers, let's restate the general static data caching mechanism for microservice architectures proposed above.

Business services encapsulate data operations. Data consumers need not care whether a relational database or a cache database sits behind them; they only care whether the service offers high-concurrency, real-time query capability.
In distributed systems, queues are commonly used for decoupling. The business service does not write the cache itself; a queue carries data-change events, and the cache handler writes them from the queue into the cache database.
In the vast majority of cases, updating the cache through the queue is nearly as timely as updating it inside the business service, while keeping business operations decoupled from cache writes.
For data inconsistency caused by extreme crashes, the consistency checker acts as a remedy and refreshes the cached data as soon as possible.
Business services can thus serve high-concurrency, near-real-time queries of static data from the Redis cache. Because all data is pushed into the cache, a key absent from the cache simply does not exist, and such queries need not fall through to the database.
For a microservice architecture, this mechanism uses the familiar queue-based decoupling approach and handles cache updates independently through near-real-time updates plus periodic checks. It keeps the cache near-real-time in the normal case and eventually consistent within a short window in extreme cases; cache persistence eliminates penetration and avalanche; and horizontal scaling covers large cached datasets and high read concurrency. It can be considered a broadly applicable caching mechanism for static business data.

This scheme is not always necessary. For example, to cache the list of cities with nationwide traffic restrictions, an in-process cache is enough.

The last consideration is workload: this mechanism adds development and maintenance complexity, so weigh whether the queue is easy to operate, whether the manpower is sufficient, and what the business actually needs.

Postscript
Redis coupling: in the design above, the business service accesses Redis directly. Making Redis completely transparent to the business service is more complicated; an AOP approach could be considered, keeping the same type definitions for the relational database and Redis via the ORM and (de)serialization, though I have not implemented this idea. The cached data is near-real-time; if strict consistency is required, the version read from the relational database should be served instead. Also, to remove the direct dependency on Redis, OpenResty can be used to provide transparent access to the resource. This is not the focus of this article.

Service availability: this is not the focus of this article either. For high availability, each service or program should be deployed in multiple copies, whether behind a load balancer or in a traditional master-slave setup, so that service continues when some deployments are unavailable.

This was written quickly and some understandings may be biased; corrections are welcome.

My independent blog: http://blog.bossma.cn/architecture/microservice-business-static-data-universal-cache-mechanism/
