"58 Shen Jian Architecture Series" cache and database consistency assurance

Source: Internet
Author: User
Tags: serialization

This article mainly discusses the following questions:

(1) Why the data in the database and the cache become inconsistent

(2) Ideas for optimizing away the inconsistency

(3) How to guarantee consistency between the database and the cache

I. The origin of the requirement

The previous article, "A Few Details of Cache Architecture Design" (click to view), sparked a wide-ranging discussion. The conclusion that, when data changes, one should "evict the cache first, then modify the database" was the most debated point.

This conclusion rests on the fact that the cache operation and the database operation are not atomic, so it is quite possible for one of the two steps to fail.


Suppose you write the database first and then evict the cache: if the first step (writing the database) succeeds and the second step (evicting the cache) fails, the DB holds new data while the cache still holds old data, i.e. the data is inconsistent (for example: the DB has the new data, the cache has the old data).


Suppose you evict the cache first and then write the database: if the first step (evicting the cache) succeeds and the second step (writing the database) fails, the only consequence is an extra cache miss (for example: there is no data in the cache, and the old data remains in the DB).

Conclusion: evict the cache first, then write the database.

The point that sparked the heated discussion is this: "after the cache is evicted but before the database write completes, a read request may arrive and load the old data back into the cache, causing inconsistency." That is the topic of this article.

II. Why the data becomes inconsistent

Let us first review the cache and database read and write flows from the previous article.

Write process:

(1) Evict the cache first

(2) Then write the DB

Read process:

(1) Read the cache first; if the data hits, return it

(2) If the data misses, read the DB

(3) Put the data read from the DB into the cache (a sketch of both flows follows this list)
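To make the two flows concrete, here is a minimal sketch in Java. The Cache and Dao interfaces, the BalanceService class, and the "uid:" key format are illustrative assumptions, not from the article:

interface Cache {
    String get(String key);
    void set(String key, String value);
    void delete(String key);
}

interface Dao {
    String read(long uid);
    void write(long uid, String value);
}

class BalanceService {
    private final Cache cache;
    private final Dao dao;

    BalanceService(Cache cache, Dao dao) { this.cache = cache; this.dao = dao; }

    void write(long uid, String newBalance) {
        cache.delete("uid:" + uid);           // step 1: evict the cache first
        dao.write(uid, newBalance);           // step 2: then write the DB
    }

    String read(long uid) {
        String b = cache.get("uid:" + uid);   // step 1: read the cache first
        if (b != null) return b;              // hit: return the cached value
        b = dao.read(uid);                    // step 2: miss, read the DB
        cache.set("uid:" + uid, b);           // step 3: put the DB value into the cache
        return b;
    }
}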

Under what circumstances might the cache be inconsistent with the data in the database?


In a distributed environment, reads and writes of data are concurrent: there are multiple upstream applications, and the service is deployed in multiple copies (it must be, to ensure availability), all reading and writing the same data. At the database level, concurrent reads and writes have no guaranteed completion order, which means a read request issued later may well complete first and read dirty data:

(a) The first step of write request A evicts the cache (step 1 in the figure)

(b) The second step of A writes the database, issuing a modification request (step 2)

(c) The first step of read request B reads the cache and finds it empty (step 3)

(d) The second step of B reads the database, issuing a read request; at this point A's write has not yet completed, so B reads out dirty data and puts it into the cache (step 4)

That is, at the database level request 4 completes before request 2: the dirty data is read out and placed into the cache, and the cache becomes inconsistent with the database (a small simulation of this race follows).
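The race can be reproduced in a single process with a small simulation, sketched below: two plain maps stand in for the database and the cache, and artificial delays widen the window between the cache eviction and the database write. All names and values are illustrative:

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class DirtyCacheRace {
    // plain maps stand in for the real database and cache
    static final Map<String, String> db    = new ConcurrentHashMap<>(Map.of("uid:1", "balance=100"));
    static final Map<String, String> cache = new ConcurrentHashMap<>(Map.of("uid:1", "balance=100"));

    public static void main(String[] args) throws InterruptedException {
        Thread writerA = new Thread(() -> {
            cache.remove("uid:1");              // step 1: evict the cache
            sleep(100);                         // ...the database write is slow to complete
            db.put("uid:1", "balance=200");     // step 2: write the database
        });
        Thread readerB = new Thread(() -> {
            sleep(20);                          // B typically arrives after the eviction, before the write
            String v = cache.get("uid:1");      // step 3: read the cache, find it empty
            if (v == null) {
                v = db.get("uid:1");            // step 4: read the DB, getting the OLD value
                cache.put("uid:1", v);          // the dirty value enters the cache
            }
        });
        writerA.start();
        readerB.start();
        writerA.join();
        readerB.join();
        // typical output: db=balance=200, cache=balance=100 -> inconsistent
        System.out.println("db=" + db.get("uid:1") + ", cache=" + cache.get("uid:1"));
    }

    static void sleep(long ms) {
        try { Thread.sleep(ms); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
    }
}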

III. Ideas for optimizing away the inconsistency

Can we make the request that was issued first also complete first? The common idea is "serialization", and that is what we discuss today.

Let us first take a closer look at how multiple read and write SQL statements are executed within a single service.


The figure above expands the upstream, downstream, and internals of the service layer in detail:

(1) Upstream of the service are multiple business applications, which initiate concurrent read and write requests against the same data; in the example above, a balance modification (write) for uid=1 and a balance query (read) for uid=1 are issued concurrently

(2) Downstream of the service is the database; assume for now that a single DB handles both reads and writes

(3) In the middle is the service layer, which consists of several parts

(3.1) At the top is the task queue

(3.2) In the middle are the worker threads; each worker thread carries out the actual work, and the typical task is to read and write the database through the database connection pool

(3.3) At the bottom is the database connection pool; all SQL statements are sent to the database through the database connection pool for execution

The typical workflow for a worker thread is this:

void work_thread_routine() {
    Task t = taskQueue.pop();                    // get a task from the task queue
    // ... task logic processing, generating the SQL statement ...
    DBConnection c = cpool.getDBConnection();    // get a DB connection from the DB connection pool
    c.execSQL(sql);                              // execute the SQL statement over the DB connection
    cpool.putDBConnection(c);                    // return the DB connection to the DB connection pool
}

Question: The task queue already serializes the tasks; doesn't that ensure tasks are not executed concurrently?

Answer: No, because

(1) one service has multiple worker threads, so tasks popped serially are still executed in parallel

(2) one service has multiple database connections, and each worker thread gets a different database connection, so execution is concurrent at the DB level

Question: Suppose the service is deployed as only one copy; can tasks then be guaranteed not to execute concurrently?

Answer: No, for the same reasons as above.

Question: Suppose each service has only one database connection; can tasks then be guaranteed not to execute concurrently?

Answer: No, because

(1) with only one database connection per service, only the requests on that one server are serialized at the database level

(2) since the service is deployed in a distributed fashion, requests from different service instances may still execute concurrently at the database level

Question: Suppose the service is deployed as only one copy, and that copy has only one database connection; can tasks then be guaranteed not to execute concurrently?

Answer: Yes, but then all requests are executed serially: throughput is very low, and the availability of the service cannot be guaranteed.

It seems hopeless:

(1) the task queue cannot guarantee serialization

(2) a single service with multiple database connections cannot guarantee serialization

(3) multiple services each with a single database connection cannot guarantee serialization

(4) a single service with a single database connection can serialize, but throughput is too low and availability cannot be guaranteed

Is there a solution?

Take a step back: there is no need to serialize all requests globally; it is enough to "serialize accesses to the same data".

Within one service, to "serialize accesses to the same data" we only need to "route accesses to the same data through the same DB connection".

To "route accesses to the same data through the same DB connection", we only need a slight modification at the DB connection pool level: pick the connection by the data. Change

cpool.getDBConnection(), which "returns any available DB connection", to

cpool.getDBConnection(long id), which "returns the DB connection associated with the id by modulo" (a sketch of such a pool follows).
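A minimal sketch of such an id-keyed pool might look like the following. The DBConnection interface and all class and method names are illustrative assumptions, and the pool is simplified to a fixed array of pre-created, long-lived connections:

import java.util.concurrent.atomic.AtomicInteger;

interface DBConnection {
    void execSQL(String sql);
}

class ModuloDBConnectionPool {
    private final DBConnection[] connections;        // pre-created, long-lived connections
    private final AtomicInteger next = new AtomicInteger();

    ModuloDBConnectionPool(DBConnection[] connections) {
        this.connections = connections;
    }

    // original behavior: return any available connection (here, simple round-robin)
    DBConnection getDBConnection() {
        return connections[Math.floorMod(next.getAndIncrement(), connections.length)];
    }

    // modified behavior: the same id always maps to the same connection, so all
    // accesses to that id's data go through one connection and are serialized on it
    DBConnection getDBConnection(long id) {
        return connections[(int) Math.floorMod(id, (long) connections.length)];
    }

    void putDBConnection(DBConnection c) {
        // connections are keyed and long-lived in this sketch, so there is nothing to do here
    }
}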

The benefits of this modification are:

(1) It is simple: only the DB connection pool implementation and the places where DB connections are acquired need to change

(2) The connection pool change knows nothing about the business: the pool does not care what the incoming id means, it simply returns a DB connection according to the id

(3) It applies to many business scenarios: a user data service passes in a user-id to pick the connection, an order data service passes in an order-id to pick the connection

In this way, accesses to the same data (for example, the same uid) are guaranteed to execute serially at the database level.

Wait a moment: the service is deployed in many copies. The scheme above only guarantees that, within one service instance, accesses to the same data execute serially at the DB level. Since the service is deployed in a distributed fashion, accesses are still parallel globally. How do we solve that? Can accesses to the same data be made to fall on the same service instance?

IV. Can accesses to the same data fall on the same service?

Above we analyzed the upstream, downstream, and internal structure of the service layer; now let us look at the upstream, downstream, and internal structure of the business application layer.


The figure above expands the upstream, downstream, and internals of a business application in detail:

(1) Upstream of the business application is uncertain: it may be direct HTTP requests, or calls from an upstream service

(2) Downstream of the business application are multiple service instances

(3) In the middle is the business application itself, which consists of several parts

(3.1) At the top is the task queue (a web server such as Tomcat may already do this for you)

(3.2) In the middle are the worker threads (the web server's worker threads or CGI worker threads may already handle the thread dispatch for you); each worker thread carries out the actual business task, and the typical task is to make RPC calls through the service connection pool

(3.3) At the bottom is the service connection pool; all RPC calls to downstream services go through the service connection pool

The typical workflow for a worker thread is this:

void work_thread_routine() {
    Task t = taskQueue.pop();                              // get a task from the task queue
    // ... task logic processing, composing the network packet and calling the downstream RPC interface ...
    ServiceConnection c = cpool.getServiceConnection();    // get a service connection from the service connection pool
    c.send(packet);                                        // send the packet over the service connection to perform the RPC request
    cpool.putServiceConnection(c);                         // return the service connection to the service connection pool
}

Looks familiar, doesn't it? Yes: only a small change to the service connection pool is needed. Change

cpool.getServiceConnection(), which "returns any available service connection", to

cpool.getServiceConnection(long id), which "returns the service connection associated with the id by modulo".

In this way, requests for the same data (for example, the same uid) are guaranteed to fall on the same service (a sketch follows).
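Applied one layer up, the same id-keyed idea might look like the sketch below: the business application holds one connection per downstream service instance and picks it by id modulo, so requests for the same uid always reach the same instance. The ServiceConnection interface and all names are illustrative assumptions:

interface ServiceConnection {
    void send(byte[] packet);
}

class ModuloServiceConnectionPool {
    private final ServiceConnection[] connections;   // one connection per downstream service instance

    ModuloServiceConnectionPool(ServiceConnection[] connections) {
        this.connections = connections;
    }

    // the same id (e.g. uid) always maps to the connection of the same downstream instance
    ServiceConnection getServiceConnection(long id) {
        return connections[(int) Math.floorMod(id, (long) connections.length)];
    }

    void putServiceConnection(ServiceConnection c) {
        // connections are keyed and long-lived in this sketch, so there is nothing to do here
    }
}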

V. Summary

Because reads and writes are concurrent at the database level, the database and the cache can become inconsistent (essentially, a read request issued later returns first). The problem can be solved with two small modifications:

(1) modify the service connection pool so that the service connection is picked by id modulo, ensuring that reads and writes of the same data fall on the same back-end service

(2) modify the database connection pool so that the DB connection is picked by id modulo, ensuring that reads and writes of the same data are serialized at the database level

VI. Remaining questions

Question: Does picking the service by modulo affect the availability of the service?

Answer: No. When a downstream service goes down, the service connection pool detects the connection's availability and excludes the unusable service connections from the modulo selection.

Question: Does picking the service by modulo, and the DB connection by modulo, affect the load balance across connections?

Answer: No. As long as the ids of the accessed data are evenly distributed, each connection is picked with equal probability overall, so the load stays balanced.

Question: If the database architecture uses master-slave replication with read/write separation (writes go to the master, reads go to the slave), dirty data may still enter the cache, since read and write requests do not hit the same DB and replication has a delay. How is that solved?

Answer: To be shared in the next article.

"The article is reproduced from the public number" architect's Road "

"58 Shen Jian Architecture Series" cache and database consistency assurance

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.