) Basic database Concurrency Control

Source: Internet
Author: User

For multi-user systems, database operation concurrency issues are common, resulting in errors such as data loss and incorrect data reading. The underlying cause is data inconsistency: A process reads data in the memory and the "same batch" data in the database at a specific time point (the data in the database may have been modified by another process), but the program does not know, as a result, various errors are caused.

The main solution is offline concurrency. Other concurrency problems can be solved through system transactions and simple logic. Offline concurrency is usually related to the business logic. When it is inevitable, we expect business transactions to work like system transactions. However, long business transactions sometimes cannot be placed in a system transaction (not flexible, reducing concurrency), so you have to control the concurrency of such cross-system transactions on your own. When the business logic is complex, we need to weigh between correctness and concurrency. These two aspects are sometimes contradictory and cannot be perfectly solved. For example, some processing may make some operation incorrect, but it has little impact and won't cause loss, but it can improve the concurrency of the system. Such correctness can be sacrificed.

Since the underlying cause is data inconsistency, it is generally necessary to solve the problem from two aspects:

1. Data inconsistency is prohibited.

That is to say, the lock should be used to lock the data in the database, and other processes are not allowed to modify the data. When the business transaction is completed, the lock will be released. Changing resources to exclusive type can avoid data inconsistency. The locks of system transactions are maintained at the underlying layer of the database. This method may cause non-flexibility and reduce concurrency, because when a process executes a business transaction, other processes cannot perform related operations.

The simplest way is to put the entire business transaction in one system transaction for execution (this is not a problem of offline concurrency). The problem is that when a process executes a business transaction, when other processes perform related operations, they enter the unresponsive wait state. If the business transaction lasts for one day, other processes may wait for one day, and may cause deadlocks.

Since a lock is used, it is necessary to prevent deadlocks. Think about it. There are two necessary conditions for a deadlock: 1. When a lock is held on some resources, a new resource is requested; 2. If the requested resource has been locked by others, enter the waiting status. Neither of the two conditions will cause a deadlock.

The better way is to control the lock by yourself. You can create a region in the memory or database of the application server to specifically maintain the lock. For example, to create a lock table in a database or add a lock field to each data entry, mark the lock when you want to lock a data entry, when other processes perform related operations, check whether the data has been locked. If the data has been locked, the execution will not continue. Of course, you can also set a lock for a group of data based on the business logic. This is different from the system transaction. When the requested resource is locked, it does not enter the waiting state, but stops executing the request, improving the flexibility (users can perform other operations ), it can also effectively reduce the possibility of deadlock (so that it does not meet the second condition of the deadlock ). However, this control method is the most troublesome. developers need to figure out the business logic of data, that is, various business transactions and their relationships, and also involve Lock Management and System transactions. It also involves session maintenance, because the user may end Service Execution abnormally (crashes, network interruptions, power outages, and the Earth does not turn around), leaving unreleased locks. The server must do this. If a user has stopped the session abnormally, the lock in the session will be released.

In addition, to prevent deadlocks, you should note that when performing an operation, try to lock all required resources at the beginning, during execution, no other resources are requested (the first condition of the deadlock is not met), and the lock is released after the operation is completed. If you want to be more cautious, you can also add a time limit to the lock. When a lock exceeds a certain period of time, it will automatically expire, and all operations performed during the lock process will become invalid, this is actually the second condition to make it not meet the deadlock.

2. Data inconsistency is allowed, but consistency check is required when data is updated.

Read a batch of initial data from the database and perform a series of operations based on the data. When the operation results are updated to the database, first, check whether the initial data of the "same batch" in the database has changed. If yes, perform some processing based on the business needs. This method features high concurrency, flexibility, and no lock, which avoids a lot of trouble and is easier to implement.

When checking data consistency, you can check whether the data values in the database are the same as those in the memory. However, if the data volume is large, this check may be inefficient. In fact, the consistency check should consider the granularity issue and select the appropriate granularity based on the actual situation. Check whether the value of the specific data changes. This can be said to be the finest granularity. You can use data version control to perform more coarse-grained checks. Version Management is performed based on data records. You can add an integer field to the data table as the version. When you insert a data entry, set an initial version value. When updating data, first check whether the database version value has changed. If there is no change, execute update and add the updated data version value to 1. According to the business logic, sometimes a group of data needs to be managed using one version (with a coarse granularity), which can be achieved using a table with a dedicated storage version. Note that consistency check and data update must be performed in the same system transaction to ensure consistency.

There is no perfect world. The disadvantage of this method is that when an update is submitted, inconsistency is found, so that all operations before submission may become invalid. For example, the user first reads some initial data from the database and then makes a lot of modifications or input work. After two or three hours, the system checks the data inconsistency During the submission, if the result fails to be submitted and all the work is done in vain, the user will be very depressed and will not want to use such a system, which we do not want to see. One way to alleviate this problem is to check the consistency at the end of the commit. You can check the consistency during the business transaction process from time to time, the sooner an inconsistency is found, the sooner it can be processed. This is only a relief and cannot be fundamentally solved.

The first method is called the optimistic concurrency control method, and the second method is called the pessimistic concurrency control method ("everyone" likes to come up with some interesting terms ). In short, offline concurrency is not just a technical issue. First, you must thoroughly analyze the business logic of the system, find out when and where the system may encounter concurrency problems, and then, based on the actual situation and needs, weigh the pros and cons and select an appropriate solution. Long system transactions should be considered first. If the entire business transaction can be placed in one system transaction, the offline concurrency problem will be avoided. If it is unacceptable, consider the optimistic control method. After all, this implementation is relatively simple, but only when the occurrence frequency of concurrency conflicts is low, the possibility is relatively small, this method is suitable for the use of optimistic methods (if the user always fails to submit, it is even worse for such a system ). Then we will consider the pessimistic control method. Although it is troublesome, it can be used in cases where concurrency conflicts occur frequently. In fact, it limits the concurrency of the system.

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.