The Application of Multi-Version Concurrency Control (MVCC) in Distributed Systems

Last Update:2014-05-14 Source: Internet

Author: User

Problem

A concurrency control problem in a distributed system came up in recent projects. The problem can be abstracted as follows: a distributed system consists of a data center D and a number of business processing centers L1, L2, ..., Ln. D is essentially a key-value store that exposes an HTTP-based CRUD interface. The business logic of each L can be abstracted into the following three steps:

Read: given a keyset {k1, ..., kn}, fetch the keyvalueset {k1:v1, ..., kn:vn} from D
Do: perform business processing on the keyvalueset and produce the data set to be updated, keyvalueset' {k1':v1', ..., km':vm'} (note: the keyset that was read and the keyset' to be updated may differ)
Update: write keyvalueset' to D (note: D guarantees that updating multiple keys in a single call is atomic)

Without transaction support, concurrent processing by multiple Ls can lead to data consistency problems. For example, consider the following order of execution for L1 and L2:

L1 reads the value of key:123 from D, which is 100
L2 reads the value of key:123 from D, which is also 100
L1 adds 1 to the value and updates key:123 to 100 + 1
L2 adds 2 to the value and updates key:123 to 100 + 2

If L1 and L2 executed serially, the value of key:123 would be 103. In the concurrent execution above, however, the effect of L1 is completely overwritten by L2, and the value of key:123 ends up as 102.
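The lost update can be reproduced with a few lines of Python. This is only a sketch: the dictionary `store` and the helpers `read` and `update` are hypothetical stand-ins for D and its CRUD interface, and the interleaving is replayed by hand rather than with real concurrent clients.

```python
# Hypothetical in-memory stand-in for data center D (a plain key-value store).
store = {"key:123": 100}

def read(key):
    return store[key]

def update(key, value):
    store[key] = value

def run_interleaved():
    """Replay the problematic order: both reads happen before either update."""
    store["key:123"] = 100
    v1 = read("key:123")        # L1 reads 100
    v2 = read("key:123")        # L2 reads 100
    update("key:123", v1 + 1)   # L1 writes 101
    update("key:123", v2 + 2)   # L2 writes 102, overwriting L1's update
    return read("key:123")

def run_serial():
    """L1 runs to completion before L2 starts."""
    store["key:123"] = 100
    update("key:123", read("key:123") + 1)   # L1: 100 -> 101
    update("key:123", read("key:123") + 2)   # L2: 101 -> 103
    return read("key:123")

interleaved_result = run_interleaved()
serial_result = run_serial()
print(interleaved_result, serial_result)  # prints "102 103"
```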

Solution 1: Lock mechanism

To make the processing of L serializable, the most straightforward solution is to add a simple lock-based transaction to D: L locks D before starting its business processing and releases the lock when it finishes. In addition, to guard against an L that holds the lock but for some reason does not commit its transaction for a long time, D also needs a timeout mechanism. When L tries to commit a transaction that has already timed out, it receives an error response.
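A minimal sketch of such a lock with a timeout might look like the following. The class `LockingStore`, its methods, and the token scheme are all hypothetical, chosen only for illustration. The clock is injected so that the timeout can be demonstrated without actually waiting.

```python
import threading
import time

class TransactionTimeout(Exception):
    """Raised when a client uses a lock that has expired or that it does not hold."""

class LockingStore:
    """Hypothetical model of D with one global lock and a lock timeout."""

    def __init__(self, timeout_seconds, clock=time.monotonic):
        self._data = {}
        self._mutex = threading.Lock()   # protects the fields below
        self._owner = None               # token of the current lock holder
        self._expires_at = 0.0
        self._timeout = timeout_seconds
        self._clock = clock
        self._next_token = 0

    def begin(self):
        """Try to lock D. Returns a token, or None if another client holds the lock."""
        with self._mutex:
            now = self._clock()
            if self._owner is not None and now < self._expires_at:
                return None
            self._next_token += 1
            self._owner = self._next_token
            self._expires_at = now + self._timeout
            return self._owner

    def read(self, token, keys):
        with self._mutex:
            self._check(token)
            return {k: self._data.get(k) for k in keys}

    def commit(self, token, updates):
        """Apply all updates in one step and release the lock."""
        with self._mutex:
            self._check(token)
            self._data.update(updates)
            self._owner = None

    def _check(self, token):
        if token != self._owner or self._clock() >= self._expires_at:
            raise TransactionTimeout("lock expired or not held")

now = [0.0]                                   # fake clock, advanced by hand
d = LockingStore(timeout_seconds=5, clock=lambda: now[0])
t1 = d.begin()                                # L1 locks D
blocked = d.begin()                           # L2 gets None: D is locked
d.commit(t1, {"key:123": 101})                # L1 commits and releases the lock
t2 = d.begin()                                # L2 locks D
now[0] = 10.0                                 # L2 stalls past the timeout
try:
    d.commit(t2, {"key:123": 999})
    timed_out = False
except TransactionTimeout:
    timed_out = True                          # the late commit is rejected
```

Note that the whole of D is unavailable to other clients between `begin` and `commit`, which is exactly the granularity problem of this scheme.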

The advantage of this scheme is that it is simple to implement. The disadvantages are that the lock covers the whole data set, so its granularity is too coarse, and that it is held for the whole processing time of L, so its duration is too long. One could lower the lock granularity to the data-item level and lock by key, but this leads to other problems. Because the keyset' to be updated may not be known in advance, it may be impossible to lock all keys at the start of the transaction, and if the required keys are locked in stages, deadlock may occur. Moreover, when locks are contended, locking by key does not solve the problem of locks being held too long. Locking by key therefore still has significant shortcomings.

Solution 2: Multi-version concurrency control

To achieve serializability while avoiding the problems of the lock mechanism, we can adopt a lock-free concurrency mechanism based on multi-version concurrency control (MVCC). Lock-based concurrency control is generally called a pessimistic mechanism, and MVCC-style control an optimistic mechanism. The lock mechanism is preventive: reads block writes and writes block reads, so when the lock granularity is coarse and locks are held for a long time, concurrency performance suffers. MVCC checks after the fact: reads do not block writes, writes do not block reads, and conflicts are detected only at commit time. Because there are no locks, reads and writes do not block each other, which greatly improves concurrency performance.

Source code version control is a useful analogy for MVCC: everyone is free to read and modify code locally without blocking anyone else, and only at commit time does the version control system check for conflicts and prompt for a merge. Oracle, PostgreSQL, and MySQL currently all support MVCC-based concurrency mechanisms, although their implementations differ.

A simple implementation of MVCC is a conditional update based on the CAS (compare-and-swap) idea. An ordinary update takes only a keyvalueset' as its parameter. A conditional update adds a set of update conditions, conditionset {..., data[keyx]=valuex, ...}: the data is updated to keyvalueset' only if D satisfies every condition, and otherwise an error is returned. L therefore follows a try / conditional update / (try again) processing pattern.
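The try / conditional update / (try again) pattern can be sketched as follows. `ConditionalStore`, `ConditionFailed`, and `process` are hypothetical names, and an in-memory dictionary stands in for D. The short internal mutex only makes the check-and-write step atomic inside D; it is not held during the business processing of L.

```python
import threading

class ConditionFailed(Exception):
    """Raised when the data in D no longer matches the caller's conditions."""

class ConditionalStore:
    """Hypothetical model of D with value-based conditional update."""

    def __init__(self, initial=None):
        self._data = dict(initial or {})
        self._mutex = threading.Lock()  # makes check-and-write one atomic step

    def read(self, keys):
        with self._mutex:
            return {k: self._data.get(k) for k in keys}

    def conditional_update(self, updates, conditions):
        with self._mutex:
            for key, expected in conditions.items():
                if self._data.get(key) != expected:
                    raise ConditionFailed(key)
            self._data.update(updates)

def process(store, keys, do, max_attempts=10):
    """Read / do / conditional update, retrying when a condition fails."""
    for _ in range(max_attempts):
        snapshot = store.read(keys)          # Read
        updates = do(snapshot)               # Do
        try:
            store.conditional_update(updates, conditions=snapshot)  # Update
            return updates
        except ConditionFailed:
            continue  # another L updated first: re-read and try again
    raise RuntimeError("gave up after too many conflicts")

def add(n):
    return lambda kv: {"key:123": kv["key:123"] + n}

d = ConditionalStore({"key:123": 100})
threads = [threading.Thread(target=process, args=(d, ["key:123"], add(n)))
           for n in (1, 2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
final_value = d.read(["key:123"])["key:123"]
print(final_value)  # prints 103: neither update is lost
```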

Although no single L is guaranteed to succeed on every update, from the system's point of view some task always makes progress. This scheme uses conditional update to avoid coarse-grained, long-held locks, and it performs well under concurrency when there is little resource contention between businesses. However, a conditional update carries more parameters. If the values in the conditions are long, more data is sent over the network with each request, which degrades performance. This is especially uneconomical when the keyvalueset' to be updated is small but the conditions are large.

To avoid the performance problem caused by large conditions, an integer version number field can be added to each data item. D maintains the version number and increments it every time the item is updated, and L uses the version number in place of the actual value in the conditional update.
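A version-based variant might look like the sketch below. `VersionedStore` and `VersionConflict` are hypothetical names, and the choice that a missing key has version 0 is an assumption of this example rather than something the scheme requires.

```python
import threading

class VersionConflict(Exception):
    """Raised when a key's version in D differs from the version the caller read."""

class VersionedStore:
    """Hypothetical model of D where each item carries an integer version."""

    def __init__(self):
        self._data = {}                  # key -> (value, version)
        self._mutex = threading.Lock()   # makes check-and-write one atomic step

    def read(self, keys):
        """Return {key: (value, version)}; a missing key has version 0."""
        with self._mutex:
            return {k: self._data.get(k, (None, 0)) for k in keys}

    def conditional_update(self, updates, expected_versions):
        with self._mutex:
            # The condition carries only small integers, never the values.
            for key, expected in expected_versions.items():
                if self._data.get(key, (None, 0))[1] != expected:
                    raise VersionConflict(key)
            for key, value in updates.items():
                version = self._data.get(key, (None, 0))[1]
                self._data[key] = (value, version + 1)  # D bumps the version

d = VersionedStore()
d.conditional_update({"key:123": 100}, {"key:123": 0})   # create at version 1

value, version = d.read(["key:123"])["key:123"]          # both L1 and L2 see (100, 1)
d.conditional_update({"key:123": value + 1}, {"key:123": version})  # L1 succeeds
try:
    d.conditional_update({"key:123": value + 2}, {"key:123": version})
    conflict = False
except VersionConflict:
    conflict = True   # L2 holds a stale version and must re-read and retry
```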

Another problem is that the solution above assumes D supports conditional update. What if D is a third-party key-value store that does not? In that case we can add a proxy P between L and D. All CRUD operations must go through P, which performs the condition check, while the actual data operations are still carried out in D. This separates condition checking from data operations but reduces performance, so a cache needs to be added to P. Because P is the only client of D, cache management in P is very simple: unlike in a multi-client scenario, there is no need to worry about cache invalidation. In practice, however, as far as I know both Redis and Amazon SimpleDB already support conditional update.
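The proxy idea can be sketched as below. `PlainStore` stands in for a third-party store that offers only get and multi-key put, and `Proxy` plays the role of P; both classes and their method names are hypothetical. The `get_calls` counter exists only to make the effect of the cache visible.

```python
import threading

class PlainStore:
    """Hypothetical third-party store D: get and put only, no conditional update."""

    def __init__(self):
        self._data = {}
        self.get_calls = 0   # counts round trips to D, for illustration

    def get(self, key):
        self.get_calls += 1
        return self._data.get(key)

    def put_many(self, items):
        self._data.update(items)   # stands in for D's atomic multi-key update

class Proxy:
    """Hypothetical proxy P: checks conditions itself, delegates storage to D."""

    def __init__(self, backend):
        self._backend = backend
        self._cache = {}                 # safe because P is D's only client
        self._mutex = threading.Lock()   # serializes check-and-write inside P

    def _get(self, key):
        if key not in self._cache:
            self._cache[key] = self._backend.get(key)
        return self._cache[key]

    def read(self, keys):
        with self._mutex:
            return {k: self._get(k) for k in keys}

    def conditional_update(self, updates, conditions):
        """Return True if the update was applied, False if a condition failed."""
        with self._mutex:
            if any(self._get(k) != v for k, v in conditions.items()):
                return False
            self._backend.put_many(updates)   # the data operation happens in D
            self._cache.update(updates)       # keep the cache in step with D
            return True

backend = PlainStore()
p = Proxy(backend)
snapshot = p.read(["a"])                                   # one round trip to D
first = p.conditional_update({"a": 1}, conditions=snapshot)
second = p.conditional_update({"a": 2}, conditions=snapshot)  # stale condition
```

Here `first` succeeds and `second` is rejected, and the condition checks are answered from the cache in P instead of going back to D.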

Comparison of the lock mechanism and MVCC

The sections above describe the basic principles of the lock mechanism and MVCC, but not which situations each is suited to or how their strengths and weaknesses show up in different circumstances. Here I give a simple analysis of some typical application scenarios. Note that the analysis below is not specific to distributed systems: both the lock mechanism and MVCC are found at every level, from distributed systems to single-database systems to in-memory variables.

Scenario 1: Reads require fast response

Some systems are updated very frequently and also require fast read responses, for example stock trading systems. Under the lock mechanism writes block reads, so whenever a write is in progress the response time of reads suffers. MVCC has no read-write locks and reads are never blocked, so read responses are faster and more stable.

Scenario 2: Reads far outnumber writes

In many systems reads far outnumber writes, especially in systems with massive numbers of concurrent reads. Under the lock mechanism, while a write holds the lock a large number of reads are blocked, which hurts concurrency performance. MVCC can maintain relatively high and stable read concurrency.

Scenario 3: Write operations conflict frequently

If writes make up a high proportion of the system's operations and conflict frequently, careful evaluation is required. Suppose two conflicting businesses L1 and L2 take times t1 and t2 respectively when executed on their own. Under the lock mechanism, their total time is approximately that of serial execution:

T = t1 + t2

Under MVCC, assuming L1 updates before L2, L2 needs to retry once, and the total time is approximately that of executing L2 twice (assuming both executions of L2 take the same time; in the better case, if the first execution can cache some still-valid partial results, the second execution of L2 is likely to be faster):

T' = 2 * t2

The key is to assess the cost of a retry. If the retry cost is very low, for example incrementing a counter, or if the second execution can be much faster than the first, the MVCC mechanism is more appropriate. Conversely, if the retry cost is very high, for example a report statistics job that takes hours or even a day, the lock mechanism should be used to avoid retries.
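The two estimates can be put side by side in a small cost model. The parameter `retry_factor` is an assumption introduced here to express how much cheaper the second execution of L2 is than the first (1.0 means equally expensive, which reproduces T' = 2 * t2); the numbers are arbitrary illustrations, not measurements.

```python
def lock_total_time(t1, t2):
    """Lock mechanism: L1 and L2 run one after the other, T = t1 + t2."""
    return t1 + t2

def mvcc_total_time(t2, retry_factor=1.0):
    """MVCC with one retry of L2: T' = t2 + retry_factor * t2.

    retry_factor is a hypothetical knob: 1.0 gives T' = 2 * t2,
    smaller values model a retry that reuses cached partial results.
    """
    return t2 + retry_factor * t2

t1, t2 = 2.0, 2.0
print(lock_total_time(t1, t2))                  # prints 4.0
print(mvcc_total_time(t2))                      # prints 4.0, same as locking
print(mvcc_total_time(t2, retry_factor=0.25))   # prints 2.5, a cheap retry favors MVCC
```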

From the analysis above we can conclude that the higher the requirements on read response speed and read concurrency, the better suited MVCC is, and the higher the retry cost, the better suited the lock mechanism is.

Summary

This article introduced conditional update, a method based on the idea of multi-version concurrency control (MVCC), for solving a concurrency control problem in a distributed system. Compared with the lock mechanism, it avoids coarse-grained, long-held locks and is better suited to scenarios that demand fast reads and high read concurrency.

Original address: http://www.kuqin.com/system-analysis/20120319/319108.html

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
