Application of MVCC in Distributed Systems

Problem

Recently, our project ran into a concurrency control problem in a distributed system. The problem can be abstracted as follows: a distributed system consists of a data center D and several business processing centers L1, L2, ... In essence, D is a key-value store that exposes CRUD operations over HTTP. The business logic of each L can be abstracted as the following three steps:

  1. Read: get keyValueSet {k1: v1, ..., kn: vn} from D according to keySet {k1, ..., kn}
  2. Do: perform business processing based on keyValueSet to obtain the updated keyValueSet' {k1': v1', ..., km': vm'} (Note: the keySet that is read may differ from the keySet that is updated)
  3. Update: write keyValueSet' to D (Note: D guarantees that updating multiple keys in a single call is atomic)

Without transaction support, concurrent processing by multiple L may cause data consistency problems. For example, consider the following execution sequence of L1 and L2:

  1. L1 reads the value 100 for key 123 from D
  2. L2 reads the value 100 for key 123 from D
  3. L1 adds 1 to the value and updates key 123 to 100 + 1
  4. L2 adds 2 to the value and updates key 123 to 100 + 2

If L1 and L2 were executed serially, the value for key 123 would be 103. In the concurrent execution above, however, the effect of L1 is completely overwritten by L2, and the actual value for key 123 is 102.
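The lost update can be reproduced in a few lines. This is a minimal sketch that replaces D with an in-memory dictionary; the `read` and `update` helpers are hypothetical stand-ins for D's HTTP interface, and the interleaving is written out by hand rather than produced by real concurrency.

```python
# In-memory stand-in for D (an assumption; the real D is reached over HTTP).
store = {"123": 100}


def read(key):
    return store[key]


def update(key, value):
    store[key] = value


# Interleave the four steps in the order listed above.
v1 = read("123")        # step 1: L1 reads 100
v2 = read("123")        # step 2: L2 reads 100
update("123", v1 + 1)   # step 3: L1 writes 101
update("123", v2 + 2)   # step 4: L2 writes 102 and overwrites L1's update

print(store["123"])     # 102, whereas serial execution would give 103
```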

Solution 1: Lock Mechanism

To make the processing of L serializable, the most direct solution is to add a simple lock-based transaction to D: L locks D before business processing and releases the lock when it finishes. In addition, to prevent an L that holds the lock from failing to commit for a long time for some reason, D also needs a timeout mechanism. When L tries to commit a transaction that has already timed out, it gets an error response.
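A minimal sketch of such a lock with a timeout might look like the following. The class name `LockedStore`, its methods, and the token scheme are all assumptions made for illustration; the article does not specify D's actual interface. The clock is injectable so the timeout can be exercised without sleeping.

```python
import threading
import time
import uuid


class LockedStore:
    """Hypothetical stand-in for D with one global lock and a timeout."""

    def __init__(self, timeout_seconds=5.0, clock=time.monotonic):
        self._data = {}
        self._mutex = threading.Lock()   # protects the fields below
        self._holder = None              # token of the L that holds the lock
        self._expires_at = 0.0
        self._timeout = timeout_seconds
        self._clock = clock

    def begin(self):
        """Try to lock D; return a token, or None if another L holds the lock."""
        with self._mutex:
            now = self._clock()
            if self._holder is not None and now < self._expires_at:
                return None
            # Either D is free or the previous holder timed out.
            self._holder = uuid.uuid4().hex
            self._expires_at = now + self._timeout
            return self._holder

    def read(self, key_set):
        with self._mutex:
            return {k: self._data.get(k) for k in key_set}

    def commit(self, token, key_value_set):
        """Apply the update and release the lock; fail if the lock timed out."""
        with self._mutex:
            if token != self._holder or self._clock() >= self._expires_at:
                if token == self._holder:
                    self._holder = None   # release the expired lock
                raise TimeoutError("transaction timed out or lock not held")
            self._data.update(key_value_set)
            self._holder = None
```

An L would call `begin()`, do its reads and business processing, and then call `commit(token, ...)`; a `TimeoutError` tells it that the transaction expired and must be restarted.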

The advantage of this solution is that it is easy to implement. The disadvantages are that the lock covers the entire dataset, so its granularity is too coarse, and that it is held for the entire processing time of L, so its duration is too long. To address this, you can reduce the lock granularity to the data item level and lock by key, but this brings other problems. Because the keySet to be updated may not be known in advance, it may be impossible to lock all keys when the transaction starts, and if the required keys are locked in stages, a deadlock may occur. In addition, locking by key does not solve the problem of locks being held too long when there is lock contention. Key-based locking therefore still has significant shortcomings.

Solution 2: Multi-version Concurrency Control

To achieve serializability while avoiding the problems of the lock mechanism, we can adopt a lock-free concurrency mechanism based on the idea of multiversion concurrency control (MVCC). Lock-based concurrency control is generally called a pessimistic mechanism, while MVCC and similar mechanisms are called optimistic. The lock mechanism is preventive: reads block writes and writes block reads, so when the lock granularity is coarse and locks are held for a long time, concurrency suffers. MVCC works after the fact: reads do not block writes and writes do not block reads, and conflicts are checked only at commit time. Because there are no locks, reads and writes never block each other, which greatly improves concurrency.

Source code version control is a useful analogy for MVCC. Everyone can read and modify their local code freely without blocking anyone else. Only when the code is committed does the version control system check for conflicts and prompt for a merge. Oracle, PostgreSQL, and MySQL all currently support MVCC-based concurrency mechanisms, although the specific implementations vary.

One simple way to implement MVCC is a Conditional Update based on the idea of CAS (compare-and-swap). An ordinary update takes a single parameter, keyValueSet'; a Conditional Update adds a set of update conditions, conditionSet {..., data[keyx] = valuex, ...}. D applies keyValueSet' only when the current data satisfies every condition; otherwise it returns an error. L therefore follows a Try/Conditional Update/(Try again) pattern: read, process, attempt the Conditional Update, and start again from the read if the update is rejected.
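The Try/Conditional Update/(Try again) pattern can be sketched as follows. This is only an illustration under stated assumptions: `ConditionalStore` is a hypothetical in-memory stand-in for D, `run_with_retry` and `max_attempts` are names invented here, and the conditions are simply the values L read at the start of the attempt.

```python
import threading


class ConditionalStore:
    """Hypothetical in-memory stand-in for D that supports Conditional Update."""

    def __init__(self, initial=None):
        self._data = dict(initial or {})
        # Short internal mutex so that check-and-write is atomic inside D.
        # It is never held by L across its business processing.
        self._mutex = threading.Lock()

    def read(self, key_set):
        with self._mutex:
            return {k: self._data.get(k) for k in key_set}

    def conditional_update(self, key_value_set, condition_set):
        """Apply key_value_set only if data[key] == value for every condition."""
        with self._mutex:
            for key, expected in condition_set.items():
                if self._data.get(key) != expected:
                    return False   # conflict: the data changed since the read
            self._data.update(key_value_set)
            return True


def run_with_retry(store, key_set, do, max_attempts=10):
    """Try / Conditional Update / (Try again)."""
    for _ in range(max_attempts):
        snapshot = store.read(key_set)                    # Read
        updates = do(snapshot)                            # Do
        if store.conditional_update(updates, snapshot):   # Update
            return True
    return False   # gave up after max_attempts conflicts
```

With this sketch, `run_with_retry(d, ["123"], lambda kv: {"123": kv["123"] + 1})` re-reads and recomputes whenever another L has changed key 123 in the meantime, so an increment is never silently overwritten.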

Although a single L cannot guarantee that its update succeeds every time, from the perspective of the whole system some task always makes progress. By using Conditional Update, this scheme avoids coarse-grained, long-held locks, and concurrency is good when there is little contention for resources among the business processes. However, Conditional Update requires more parameters: if the values in the conditions are long, a large amount of data is sent over the network on every call, which degrades performance. This is especially uneconomical when the keyValueSet to be updated is small and the condition set is large.

To avoid the performance problems caused by large conditions, you can add an int version number field to each data item. The version number is maintained by D and is incremented each time the data item is updated. In a Conditional Update, L then sends the version number in place of the concrete value.
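A version-based variant might look like the sketch below. The class `VersionedStore` and the convention that a missing key has version 0 are assumptions made for this example, not part of the original design.

```python
import threading


class VersionedStore:
    """Hypothetical stand-in for D that keeps an int version per data item."""

    def __init__(self):
        self._data = {}                  # key -> (value, version)
        self._mutex = threading.Lock()   # makes check-and-write atomic in D

    def read(self, key_set):
        """Return {key: (value, version)}; a missing key reads as (None, 0)."""
        with self._mutex:
            return {k: self._data.get(k, (None, 0)) for k in key_set}

    def conditional_update(self, key_value_set, version_conditions):
        """Apply the update only if every key is still at the expected version."""
        with self._mutex:
            for key, expected in version_conditions.items():
                if self._data.get(key, (None, 0))[1] != expected:
                    return False
            for key, value in key_value_set.items():
                version = self._data.get(key, (None, 0))[1]
                self._data[key] = (value, version + 1)   # D bumps the version
            return True
```

The condition set now carries one small integer per key regardless of how large the values are, which is the point of the optimization.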

Another problem is that the solution above assumes that D supports Conditional Update. What if D is a third-party key-value store that does not? In that case, we can add a proxy P between L and D. All CRUD operations must pass through P, so P can perform the condition checks while the actual data operations remain in D. This separates condition checking from data operations, but it also reduces performance, so P needs a cache. Because P is the only client of D, cache management in P is very simple: unlike the multi-client case, it does not have to worry about cache invalidation. That said, as far as I know, both Redis and Amazon SimpleDB already support Conditional Update.
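The proxy arrangement can be sketched like this. Both classes are hypothetical: `PlainStore` stands in for a third-party D that offers only plain reads and an atomic multi-key write, and `Proxy` plays the role of P. The sketch assumes a single P process; it does not address what happens if P itself is replicated or restarted.

```python
import threading


class PlainStore:
    """Hypothetical third-party D: plain get/put only, no Conditional Update."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def put_many(self, key_value_set):
        # Assumed to be atomic, as the problem statement says of D.
        self._data.update(key_value_set)


class Proxy:
    """Hypothetical P: the only client of D, so its cache never goes stale."""

    def __init__(self, backend):
        self._backend = backend
        self._cache = {}
        self._mutex = threading.Lock()   # serializes condition check + write

    def _get(self, key):
        if key not in self._cache:
            self._cache[key] = self._backend.get(key)
        return self._cache[key]

    def read(self, key_set):
        with self._mutex:
            return {k: self._get(k) for k in key_set}

    def conditional_update(self, key_value_set, condition_set):
        with self._mutex:
            # The condition check happens in P, against the cache.
            if any(self._get(k) != v for k, v in condition_set.items()):
                return False
            # The data operation itself stays in D.
            self._backend.put_many(key_value_set)
            self._cache.update(key_value_set)
            return True
```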

Comparison between lock mechanism and MVCC

The sections above describe the basic principles of the lock mechanism and MVCC, but they do not make clear the strengths and weaknesses of each or where each applies. Here I analyze some typical application scenarios. Note that the following analysis is not limited to distributed architectures: lock mechanisms and MVCC exist at every level, from distributed systems to single-node databases and even in-memory variables.

Scenario 1: high read response speed requirements

Some systems are updated frequently and also require fast read responses, such as stock trading systems. Under the lock mechanism, writes block reads, so read latency suffers whenever a write is in progress. MVCC has no read/write locks and reads are never blocked, so read responses are faster and more stable.

Scenario 2: read far more than write

For many systems, the proportion of read operations is often far greater than that of write operations, especially for some systems with massive concurrent reads. Under the lock mechanism, when a write operation occupies the lock, a large number of read operations will be blocked, affecting concurrent performance. MVCC can maintain a high and stable read concurrency.

Scenario 3: frequent write operations

If the proportion of write operations in the system is high and conflicts occur frequently, the choice needs careful evaluation. Assume that two conflicting business processes L1 and L2 take t1 and t2 respectively when executed alone. Under the lock mechanism, their total time is approximately the time of serial execution:

T = t1 + t2

Under MVCC, assume that L1 commits its update before L2, so L2 has to retry once. Their total time is then approximately the time of executing L2 twice (here we assume that both executions of L2 take the same time; in a better case, if some valid results from the first execution can be cached, the second execution of L2 may take less time):

T' = 2 * t2

In this case, the key is to evaluate the cost of a retry. If a retry is cheap, for example incrementing a counter, or if the second execution can be much faster than the first, MVCC is the better fit. Conversely, if a retry is expensive, for example a report statistics job that takes several hours or even a day, the lock mechanism should be used to avoid retries.
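The comparison can be made concrete with a small calculation. The two functions below simply restate T = t1 + t2 and T' = 2 * t2; the `retry_cost_ratio` parameter is a hypothetical addition that models the "better case" in which the second execution of L2 reuses cached results, and the sample numbers are arbitrary.

```python
def lock_total(t1, t2):
    """Total time under locking: the two conflicting tasks run back to back."""
    return t1 + t2


def mvcc_total(t2, retry_cost_ratio=1.0):
    """Total time under MVCC when L2 retries once.

    retry_cost_ratio is a hypothetical knob: 1.0 means the retry costs as much
    as the first run (T' = 2 * t2); smaller values model cached partial results.
    """
    return t2 + retry_cost_ratio * t2


t1, t2 = 8.0, 10.0
print(lock_total(t1, t2))        # 18.0
print(mvcc_total(t2))            # 20.0: a full-cost retry is slower than locking
print(mvcc_total(t2, 0.5))       # 15.0: a half-cost retry beats locking
```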

From the analysis above, we can draw a simple conclusion: scenarios that demand high read response speed and concurrency are suited to MVCC, while scenarios with high retry costs are better suited to lock mechanisms.

Summary

This article introduces a Conditional Update method, based on the idea of multiversion concurrency control (MVCC), for solving the concurrency control problem in distributed systems. Compared with the lock mechanism, this method avoids coarse-grained, long-held locks and is better suited to scenarios that demand high read response speed and concurrency.

Reference

Wikipedia-Serializability

Wikipedia-Compare-and-swap

Wikipedia-Multiversion concurrency control

Lock-free algorithms: The try/commit/(try again) pattern

Amazon SimpleDB FAQs-Does Amazon SimpleDB support transactions?

Redis-Transactions

A Quick Survey of MultiVersion Concurrency Algorithms

Application of Non-Blocking Algorithm in the Development of relational database applications
