I. Introduction
Concurrency Control refers to the mechanism for standardizing concurrent operations on databases in a multi-user environment. The objective is to avoid data loss and modification, read dirty data, and non-repeated read, so as to ensure data correctness and consistency. Concurrency control is very important in the multi-user mode, but this is often ignored by some database developers, and because of the rich concurrency control layers and types, sometimes it is confusing to choose, and it is unclear how to determine the concurrency control hierarchy. This article will start with an example and summarize the methods and methods of concurrency control in database applications based on the knowledge of database theory, hope to help readers find reasonable concurrency control methods.
2. Example of a concurrency control failure
To better understand the concept of concurrency control, let's repeat a frequently cited example. First, we will give a simplified data dictionary definition of account information in the Bank database. We will use this table as an example in the future.
Column name column code column type
Account ID (key value column) Char (10)
Owner uname char (10)
Deposit amount mdeposit currency
Expenditure mpayout currency
Deposit Balance mbalance currency
This example is generated during the session between the client program and the server database: a customer's representative withdraws RMB 2,000 from the bank's front desk, the bank cashier's deposit information shows that the bank's deposit balance is 20,000 yuan. At this moment, another bank account transfers the cheque to pay for the Account 5,000 yuan, and the machine query also obtains the current user's deposit 20,000 yuan, at this time, the cashier of the bank saw that the user's deposit had exceeded the amount of the deposit, and paid the customer 2,000 yuan and changed the user's deposit to 18,000 yuan. Then another operator of the bank based on the check, add the imported 5,000 yuan and change the user's balance to 25,000 yuan. Obviously, the bank will lose 2000 yuan, because the change made by another cashier is overwritten.
This is because of the failure to control concurrent operations. Because two concurrent operations are not reasonably isolated, data is reasonably locked, as a result, the client dataset obtained by the cashier query is inconsistent with the database data, resulting in loss of modifications.
Iii. Theoretical Basis of database Concurrency Control
Here, we will explain the frequently used concepts in concurrency control. For details, please refer to your reference. A transaction is an important concept in a database. It is a set of programs that are both done or not done. It is the unit of database concurrency control. If the transaction concurrency is improperly controlled, data inconsistency such as lost modification, dirty data reading, and non-repeated read may occur. However, in applications, such inconsistencies can be tolerated to improve the concurrency. For example, most business logic can tolerate repeated reads after appropriate adjustments. Today's popular relational database systems (such as Oracle and SQL Server) define the goal of concurrency control through the transaction isolation level (transaction isolation level) and blocking mechanism, according to the protocol provided by Alibaba Cloud, we can obtain almost any reasonable concurrency control method. For example, Microsoft SQL Server has four types of locks: Shared locks, exclusive locks, and intention locks (also divided into shared intention locks, exclusive intention locks, and shared intention exclusive locks. There is a definite compatibility relationship between various locks. There are four transaction isolation levels: uncommitted read, committed read, Repeatable read, and serialized read. Different isolation levels define different blocking protocols. This part is rich in content. You can write a book with limited space and will not elaborate on it. For more information, see the database theory teaching materials.
If the blocking type and isolation level are so rich, what principle should we choose? That is, data consistency requirements and concurrency. For example, the data consistency of the four isolation levels increases sequentially, but the concurrency decreases sequentially. Generally, the system's default isolation level is commit read. In general, this can meet the application requirements, but this isolation level cannot avoid repeated reads, that is, during your browsing of database records, the same record read at different times can have different contents. Sometimes it is necessary to dynamically change the blocking status or isolation level through SQL statements.
IV. Implementation of concurrency control technology
There are many ways to implement concurrency control. If DBMS supports it, of course, it is best to use its own concurrency control capabilities. If the system does not provide such a function, you can use the support of development tools, you can also consider adjusting database applications, in addition, you can adjust the working mode to avoid such concurrent operations that will affect the efficiency. The author makes a summary of various strategies, mainly including the following points:
(1) Adjust the working mode and modify the application to avoid unnecessary concurrency.
This is feasible in some cases. For example, it is required that the input personnel can only modify the records they have created, so there will be no errors in concurrent operations, this is because the records that different users can update do not overlap. In this case, you need to add user columns to the database table. When a user browses a record, the user column is used as a filter condition to adjust the SQL statement of the application. However, the role of this policy is limited, because concurrency control is inevitable in many cases.
(2) Use DBMS functions.
All large-scale relational systems have good concurrency control functions. For example, you can use the update cursor, explicit locking, and changing the transaction isolation level. Of course, there are a lot of tips to pay attention to in its use, such as: (1) it is best not to include the customer interaction part in transaction definitions. (2) only when the data consistency requirements are particularly strict, but the concurrency requirements are not high, adopt the isolation level of repeatable and serializable read. (3) In the same transaction, you should change the lock level of the data as needed, but generally do not use tablock as a coarse-grained block. (4) You can explicitly set the isolation level based on the concurrency requirements of different transactions. (5) Use a cursor in an operation that includes customer interaction, and shorten the interaction time as much as possible.
Let's look at an example of using an update cursor in the Informix database. Define update cursor Syntax: declare cursor-name cursor for select-statement for update [of column-list]. Update cursors implicitly lock the current record when browsing and modifying data. Note that update cursors are only valid for updatable views. To improve the concurrency, you often need to use the scroll cursor together. The definition method of the scroll cursor is declare cursorname scroll cursor [with hold] For selectstatement. However, the scroll cursor does not lock the current record.
The following code completes the customer's browsing and modification of the account content. The code is written in esql/C of Informix (using the C language as the host language), showing how to update the cursor:
$ Declare mycurs cursor for select mdeposit, mpayout, mbalance from acount for update; // defines the update cursor $ open mycurs; // open the cursor
For (;;)
{$ Fetch mycurs into $ mdeposit, $ mpayout, $ mbalance; // read records from the cursor
If sqlcode = sqlnotfound then exit; // if the record is obtained, exit the loop.
.... // Display the record content to the user
.... // If the user decides to modify the record, continue to execute
$ Update acount set (mdeposit, mpayout, mbalance) =
($ Mdeposit, $ mpayout, $ mbalance) where current of mycurs; // update the value
}
The cursor statement with for update has the lock function. Based on the above Code, after the fetch operation, the current record pointed to by the cursor is locked by a shared lock. When the user decides to modify the lock, the lock on the record is upgraded to an exclusive lock. At this time, other users cannot update this record. This method has the disadvantage of locking the current record even if the user does not modify the current record, which affects the concurrency. The following method can be used: (1) define a scroll cursor to complete the query; (2) obtain a record of the cursor and display it to the user; (2) view the record until the record to be modified or deleted; (3) when you select to modify a record, define an update cursor for the record you want to modify (4) use the update cursor to retrieve the record and display the locked record again;
(5) update this record. Then the above program can be changed:
$ Declare mycurs scroll cursor for select ID, mdeposit, mpayout, mbalance from acount; // define a scroll cursor $ open mycurs // open a scroll cursor
For (;;)
{$ Fetch mycurs into $ id, $ mdeposit, $ mpayout, $ mbalance; // read the record value from the cursor if sqlcode = sqlnotfound then exit; // If read, exit the loop
.... // Display the record content for the user
.... // If the user decides to modify the record, continue to execute
$ Declare mycurs_update cursor for select mdeposit, mpayout, mbalance
From acount where id = $ ID; // defines the update cursor.
$ Fetch mycurs_update into $ mdeposit, $ mpayout, $ mbalance; // read the value
$ Update acount set (mdeposit, mpayout, mbalance) = ($ mdeposit, $ mpayout, $ mbalance) where current of mycurs_update; // update a value
(3) Use the support of development tools.
Many database development tools have some convenient options or components to support concurrency control, regardless of whether the DBMS supports concurrency control. Let's take a look at the concurrency control methods of Delphi and PowerBuilder.
Delphi is an excellent c/s development tool. The Database Control used to query data is tquery. It can be combined with the tupdatesql control to browse and update database table data. The tquery control has an updatemod attribute, which has three options: (1) upwhereall: As long as someone modifies a column of the record during browsing and modification, no matter whether you have modified this column or not, your modification cannot be successful at the time of submission. (2) upwherechanged: only determines whether your modification is successful based on the key-value column and the column you have modified, if the columns of this record modified by others do not conflict with the columns you modified, your modification is still successful. (3) upwhereonly: determines whether your update is successful based on whether the key value is modified.
The tupdatesql control used with the tquery control automatically generates the required update statement based on the specified modification attribute, which is very convenient. The 2nd mode is the most commonly used modification mode. As long as the modifications made by others to the record do not overlap with their own, the submission is successful, which ensures that no data is lost or overwritten, and has a high degree of concurrency. For example, if the mpayout and mbalance columns in the record are modified after the customer browses the record, the SQL statement generated by the tupdatesql control under the upwherechanged option is:
Update acount set (mpayout, mbalance) = ($ mpayout, $ mbalance)
Where key = key_old and mpayout = mpayout_old and mbalance = mbalance_old;
Here, key_old, mpayout_old, and mbalance_old are the intermediate variables generated by Delphi for the user. They are used to save the old values of the original data record and compare whether the old values are equal to the current values. If they are not equal, this indicates that another user has changed the record. To avoid the loss of the modification, the user's update operation cannot be completed, and vice versa. When the cashier modifies the account, if another user has modified the account, the change is unsuccessful. You must refresh the record to modify the account. In this case, we can avoid bank losses.
Another tool comparable to Delphi is the famous PowerBuilder. In its datawindows design, we choose rows | Update ..., The specify update characteristics setting window appears. In this window, we set the WHERE clause generation in the update statement for concurrency control. Here there are three options: (1) key columns: The generated where clause only compares the values of the primary key columns in the table with those in the initial query to determine the record to be modified. Corresponds to the upwhereonly option in Delphi. (2) Key and updateable columns: The generated where clause compares the values of primary key columns and modifiable columns in the table with those in the initial query. Corresponds to upwhereall of Delphi. (3) Key and modified columns: corresponds to the upwherechanged option of Delphi. The where clause compares the primary key and the column to be modified.
(4) Adjust the application.
Some databases do not provide concurrent control functions, such as FoxPro, and some versions of MySQL do not support transactions. In addition, some development tools (such as some web script editors) do not provide components for concurrency control, so concurrency control should be implemented, we can only adjust the structure of our applications and databases.
The application can be adjusted according to the basic idea of blocking. Add a lock field to the database table for concurrency control. This field can be a Boolean variable. If it is true, the lock field is locked. If it is false, the lock field is idle. The table structure becomes:
Column name column code column type
Account ID (key value column) Char (10)
Owner uname char (10)
Deposit amount mdeposit currency
Expenditure mpayout currency
Deposit Balance mbalance currency
Lock Boolean
If a customer can modify the table record when querying this table in an application, to prevent other customers from modifying this record during the user's editing, when you browse the data of the record, add a lock to the record (change the lock field to true). Release the lock after modification (change the lock field to false ). If other customers want to modify the record of this table, they should first check whether the record has been locked. If the record has been locked, they cannot modify it. If the lock field is idle, the record is first locked and then viewed and edited by the customer. During this period, other customers cannot modify the record. This effectively prevents the loss of modifications.
The above is a common method, but it does not have any disadvantages at all. It may produce a side effect: when a user decides to modify a record, the record is locked, wait for the user to modify, but now the user leaves, the record will be locked until the user submits (may be a few hours later) or the session times out, in this case, other users cannot change this record, resulting in low concurrency. One solution to this problem is to record the browsing records to old_record (custom variable), and then copy the old_record content to a new record new_record (custom variable) the user edits new_record. When new_record is submitted, the old_record is compared with the original record. If it is different, it indicates that the user has modified the original record. At this time, the new_record modification is abandoned; otherwise, submit the content.
The above method also has a small deficiency: when comparing the old and new records, you must compare the entire record, which takes time and is troublesome to write the program. You can consider adding a timestamp column to the original table (you can cancel the lock column at this time), then the table structure becomes:
Column name column code column type
Account ID (key value column) Char (10)
Owner uname char (10)
Deposit amount mdeposit currency
Expenditure mpayout currency
Deposit Balance mbalance currency
Timestamp date datetime
When browsing a record, the timestamp value is recorded in old_date (custom variable), and the record content is recorded in new_record, the user edits new_record, when new_record is submitted, the old_date and the timestamp in the original record are compared. If they are different, the user has modified the original record. At this time, the new_record modification is abandoned; otherwise, submit the content. The update statement is:
Update acount set (mpayout, mbalance, date) = ($ mpayout, $ mbalance, $ date)
Where key = key_old and date = $ date_old;
V. Summary
We generally refer to Concurrency Control in DBMS (Database Management System. Concurrency control methods are rich, far more than that. You can use the capabilities of the database, development tools, and programs. In database applications, concurrency control methods and implementation methods are diverse. In the selection, the basic principle is that data consistency must meet the needs of applications. On this basis, we try to improve concurrency.