Concurrency: access to large data volumes

Source: Internet
Author: User
I came across this problem again today. I have learned a great deal from articles on the Internet, so I am recording this one here to review later.
I ran into this problem at work before. Two users operate on the same record at the same time: user A queries a record, user B deletes it, and user A then saves some values from the queried record to other tables. This bug plagued us for a long time. Because user A's method is particularly complicated and takes a relatively long time to run, the probability of hitting the problem is quite high. Our fix was to check that the record still exists before saving. Handling it in the code logic made things much better, but it treats the symptom rather than the cause. After reading this article, I think there are better solutions.
Part1:
Requests with high concurrency and large data volumes generally fall into three cases:
1. A large number of users simultaneously query and update different functional pages of the system.
2. A large number of users simultaneously query large volumes of data from the same table on the same page of the system.
3. A large number of users simultaneously update the same table on the same page of the system.


In the first case, the general solution is as follows:
I. Server-level processing
1. Adjust the queue length of the IIS 7 application pool
Change it from the default 1000 to 65535.
IIS Manager > Application Pools > Advanced Settings
Queue Length: 65535
2. Adjust the appConcurrentRequestLimit setting of IIS 7
Change it from the default 5000 to 100000.
C:\windows\system32\inetsrv\appcmd.exe set config /section:serverRuntime /appConcurrentRequestLimit:100000
You can view the setting in %systemroot%\System32\inetsrv\config\applicationHost.config.
There are three common types of concurrency control:

Pessimistic concurrency control: a row is unavailable to other users from the time the record is retrieved until it is updated in the database.

Optimistic concurrency control: a row is unavailable to other users only while the data is actually being updated. The update checks the row in the database and determines whether any changes have been made. Attempting to update a record that has already been modified results in a concurrency conflict.

Last update wins: a row is unavailable to other users only while the data is actually being updated. However, the update is not compared with the original record. The record is simply written out, possibly overwriting changes made by other users since the record was last refreshed.

Pessimistic concurrency

Pessimistic concurrency is typically used for two reasons. First, in some situations there is heavy contention for the same records, and the cost of placing locks on the data is lower than the cost of rolling back changes when concurrency conflicts occur.

Second, pessimistic concurrency is useful when a record must not change during the course of a transaction. An inventory application is a good example. Suppose a company representative is checking inventory for a potential customer. You typically want to lock the record until an order is generated, which would generally flag the item with a status of "ordered" and remove it from the available inventory. If no order is generated, the lock is released so that other users checking inventory get an accurate count of what is available.

However, pessimistic concurrency control is not possible in a disconnected architecture. Connections are open only long enough to read or update the data, so locks cannot be held for long periods. Moreover, an application that holds locks for long periods does not scale.

Optimistic concurrency

In optimistic concurrency, locks are set and held only while the database is being accessed. The locks prevent other users from updating the records at the same instant. The data is always available except at the exact moment an update takes place.

When an update is attempted, the original version of the changed row is compared with the existing row in the database. If the two differ, the update fails with a concurrency error. It is then up to you to reconcile the two rows, using business logic that you write.

Last update wins

With "last update wins", no check of the original data is made and the update is simply written to the database. Clearly, the following scenario can occur:

User A fetches a record from the database. User B fetches the same record, modifies it, and writes the updated record back to the database. User A then modifies the "old" record and writes it back to the database.

In this scenario, user A never sees the changes made by user B, and user B's changes are overwritten. If you plan to use the "last update wins" approach to concurrency control, make sure this situation is acceptable.
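The "last update wins" scenario can be reproduced in a few lines. The following is only a minimal sketch: it uses Python's built-in sqlite3 module rather than SQL Server, and the account table, its columns, and the helper functions are hypothetical names chosen for illustration.

```python
import sqlite3

# In-memory database with one illustrative row (hypothetical table and columns).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, name TEXT, phone TEXT)")
conn.execute("INSERT INTO account VALUES (1, 'Alice', '555-0100')")


def read_row(row_id):
    """Return the row as a dict, the way a disconnected client would cache it."""
    cur = conn.execute("SELECT id, name, phone FROM account WHERE id = ?", (row_id,))
    rid, name, phone = cur.fetchone()
    return {"id": rid, "name": name, "phone": phone}


def write_row(row):
    """Last update wins: overwrite every column without checking the original values."""
    conn.execute(
        "UPDATE account SET name = ?, phone = ? WHERE id = ?",
        (row["name"], row["phone"], row["id"]),
    )


copy_a = read_row(1)           # user A fetches the record
copy_b = read_row(1)           # user B fetches the same record
copy_b["phone"] = "555-0199"   # user B changes the phone number
write_row(copy_b)              # ... and writes it back first
copy_a["name"] = "Alicia"      # user A edits the stale copy
write_row(copy_a)              # ... and silently restores the old phone number

final = read_row(1)
print(final)
```

User B's phone number change is gone after user A's write, and nothing reports a conflict. That is the behavior to weigh before choosing this approach.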

Concurrency Control in ADO.NET and Visual Studio .NET

Because the data architecture is based on disconnected data, ADO.NET and Visual Studio .NET use optimistic concurrency. You therefore need to add business logic to resolve optimistic concurrency conflicts.

If you choose optimistic concurrency, there are two common ways to determine whether a change has occurred: the version method (an actual version number or a date-time stamp) and the save-all-values method.

Version Number Method

In the version number method, the record to be updated must have a column that contains a date-time stamp or a version number. When the record is read, the date-time stamp or version number is saved on the client. This value is then made part of the update.

One way to handle concurrency is to update only if the value in the WHERE clause matches the value in the record. The SQL representation of this method is:

UPDATE Table1 SET Column1 = @newvalue1, Column2 = @newvalue2
WHERE DateTimeStamp = @origDateTimeStamp

Alternatively, you can compare the version number:

UPDATE Table1 SET Column1 = @newvalue1, Column2 = @newvalue2
WHERE RowVersion = @origRowVersionValue

If the date-time stamps or version numbers match, the record in the data store has not changed, and it can safely be updated with the new values from the dataset. If they do not match, an error is returned. You can write code to implement this form of concurrency checking in Visual Studio .NET, and you must also write code to respond to any update conflicts. To keep the date-time stamp or version number accurate, set up a trigger on the table that updates it whenever a row changes.
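As a rough illustration of the version check, here is a sketch using Python's sqlite3 module instead of ADO.NET. Two things are assumptions made for the example: the Id column, and the fact that the UPDATE itself increments RowVersion (where the text above suggests a trigger). The update_if_unchanged helper is a hypothetical name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Table1 (Id INTEGER PRIMARY KEY, Column1 TEXT, RowVersion INTEGER NOT NULL)"
)
conn.execute("INSERT INTO Table1 VALUES (1, 'original', 1)")


def update_if_unchanged(row_id, new_value, orig_version):
    """Apply the update only if RowVersion still has the value the client read.

    Returns True when exactly one row was updated, False on a concurrency conflict.
    """
    cur = conn.execute(
        "UPDATE Table1 SET Column1 = ?, RowVersion = RowVersion + 1 "
        "WHERE Id = ? AND RowVersion = ?",
        (new_value, row_id, orig_version),
    )
    return cur.rowcount == 1


# Both users read the record (and its version) before either one writes.
version_a = conn.execute("SELECT RowVersion FROM Table1 WHERE Id = 1").fetchone()[0]
version_b = version_a

b_ok = update_if_unchanged(1, "from B", version_b)  # B writes first and succeeds
a_ok = update_if_unchanged(1, "from A", version_a)  # A's version is stale, so no row matches

stored = conn.execute("SELECT Column1, RowVersion FROM Table1 WHERE Id = 1").fetchone()
print(b_ok, a_ok, stored)
```

When the function returns False, the caller decides how to respond: reload the record and retry, merge the changes, or report the conflict to the user.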

Save-All-Values Method

An alternative to using a date-time stamp or version number is to keep copies of all the fields when the record is read. The DataSet object in ADO.NET maintains two versions of each modified record: the original version (as initially read from the data source) and the modified version (reflecting the user's updates). When you attempt to write the record back to the data source, the original values in the data row are compared with the record in the data source. If they match, the database record has not changed since it was read, and the changed values from the dataset are written to the database.

Each of the data adapter's four commands (DELETE, INSERT, SELECT, and UPDATE) has its own collection of parameters, including parameters for both the original values and the current (modified) values.
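The same idea can be sketched without a DataSet. The example below is an assumption-laden illustration in Python with sqlite3, not the code a data adapter generates: the Id column and the update_all_values helper are hypothetical, and SQLite's IS operator is used so that NULL originals compare correctly. It also covers the case from the introduction where another user deletes the record.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Table1 (Id INTEGER PRIMARY KEY, Column1 TEXT, Column2 TEXT)")
conn.execute("INSERT INTO Table1 VALUES (1, 'a', 'b')")


def update_all_values(row_id, original, current):
    """Write `current` only if every column still equals the `original` copy."""
    cur = conn.execute(
        "UPDATE Table1 SET Column1 = ?, Column2 = ? "
        "WHERE Id = ? AND Column1 IS ? AND Column2 IS ?",
        (current["Column1"], current["Column2"],
         row_id, original["Column1"], original["Column2"]),
    )
    return cur.rowcount == 1


original_a = {"Column1": "a", "Column2": "b"}  # user A's copy of the row as read

# User B changes Column2 after user A has read the row.
conn.execute("UPDATE Table1 SET Column2 = 'b2' WHERE Id = 1")
stale_ok = update_all_values(1, original_a, {"Column1": "a-new", "Column2": "b"})

# User A refreshes the original values and retries.
refreshed = {"Column1": "a", "Column2": "b2"}
retry_ok = update_all_values(1, refreshed, {"Column1": "a-new", "Column2": "b2"})

# User B deletes the record: the next update matches no row at all.
conn.execute("DELETE FROM Table1 WHERE Id = 1")
deleted_ok = update_all_values(
    1, {"Column1": "a-new", "Column2": "b2"}, {"Column1": "x", "Column2": "y"}
)
print(stale_ok, retry_ok, deleted_ok)
```

A zero row count means either a changed or a deleted record, so the application still has to look the record up again to tell the two apart.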

For the second case:

Because this is also a high-concurrency scenario, the methods for the first case apply here as well. In addition, because large volumes of data are being retrieved, query efficiency needs to be considered:

1. Index the table according to the query conditions.

2. Optimize the query statements.

3. Use a cache for data queries.
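For the third point, a query cache can be as simple as a dictionary with an expiry time. The sketch below is one possible shape, not a recommendation of a specific library: QueryCache, slow_query, and the 30-second lifetime are all hypothetical, and the class is not thread-safe as written.

```python
import time


class QueryCache:
    """Cache query results per key for a limited time (TTL)."""

    def __init__(self, loader, ttl_seconds=30.0, clock=time.monotonic):
        self._loader = loader      # function that actually runs the query
        self._ttl = ttl_seconds
        self._clock = clock        # injectable so the example can fake time
        self._entries = {}         # key -> (expires_at, value)

    def get(self, key):
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]          # still fresh: no database round trip
        value = self._loader(key)  # missing or expired: query and remember
        self._entries[key] = (now + self._ttl, value)
        return value


calls = []


def slow_query(province):
    """Stand-in for an expensive database query."""
    calls.append(province)
    return f"rows for {province}"


fake_now = [0.0]
cache = QueryCache(slow_query, ttl_seconds=30.0, clock=lambda: fake_now[0])

first = cache.get("Zhejiang")    # runs the query
second = cache.get("Zhejiang")   # served from the cache
fake_now[0] = 31.0               # pretend 31 seconds have passed
third = cache.get("Zhejiang")    # entry expired, so the query runs again
print(len(calls))
```

The trade-off is staleness: readers may see data up to one TTL old, which has to be acceptable for the pages being cached.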

For the third case:

The methods for the first case apply here as well. In addition, because the same table is being updated, consider the following:

1. Save the data to a cache first, and write it to the database once a certain amount has accumulated.

2. Split the table by a key (table sharding or partitioning). For example, a table that stores information about people across the whole country holds a very large amount of data. If it is split into one table per province, each person's information is stored in the table for their province, and queries and updates are performed per province. This reduces both the concurrency and the data volume that any one table has to handle.
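Both ideas can be combined in a small sketch: writes are buffered in memory, then flushed in one batch into a per-province table. Everything named here is an assumption for illustration (the person_<province> tables, the two provinces, the threshold of 3), and a real system would also need to flush on shutdown and handle failures.

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
PROVINCES = ("zhejiang", "guangdong")  # hypothetical shard keys
for p in PROVINCES:
    conn.execute(f"CREATE TABLE person_{p} (name TEXT, age INTEGER)")


def table_for(province):
    """Route a province to its own table; reject anything not in the whitelist."""
    if province not in PROVINCES:
        raise ValueError(f"unknown province: {province}")
    return f"person_{province}"


class BufferedWriter:
    """Collect rows in memory and write them to the database in batches."""

    def __init__(self, connection, flush_threshold=3):
        self._conn = connection
        self._threshold = flush_threshold
        self._pending = defaultdict(list)  # table name -> rows waiting to be written
        self._count = 0

    def add(self, province, name, age):
        self._pending[table_for(province)].append((name, age))
        self._count += 1
        if self._count >= self._threshold:
            self.flush()

    def flush(self):
        with self._conn:  # one transaction for the whole batch
            for table, rows in self._pending.items():
                self._conn.executemany(f"INSERT INTO {table} VALUES (?, ?)", rows)
        self._pending.clear()
        self._count = 0


def count(province):
    return conn.execute(f"SELECT COUNT(*) FROM {table_for(province)}").fetchone()[0]


writer = BufferedWriter(conn, flush_threshold=3)
writer.add("zhejiang", "Li", 30)
writer.add("guangdong", "Chen", 41)
before_flush = (count("zhejiang"), count("guangdong"))  # nothing written yet
writer.add("zhejiang", "Wang", 25)                       # third row triggers the flush
after_flush = (count("zhejiang"), count("guangdong"))
print(before_flush, after_flush)
```

Buffering trades durability for throughput: rows that are still in memory are lost if the process dies before the next flush.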

Part2:

How to handle massive concurrent data operations: file caching, database caching, SQL optimization, data splitting, horizontal and vertical partitioning of database tables, and optimizing the code structure.

Overview of locks

1. Why locks are needed

When multiple users operate on a database concurrently, the following data inconsistencies can occur:

Lost update: users A and B read and modify the same data, and one user's modification overwrites the other's. A ticket booking system is a typical example.

Dirty read: user A modifies data and user B then reads it, but user A cancels the modification for some reason and the data reverts to its original value. The data user B obtained is now inconsistent with the database.

Non-repeatable read: user A reads data, and user B then reads and modifies it. When user A reads the data again, the two values differ.

The main method of concurrency control is locking. A lock prohibits users from performing certain operations for a period of time in order to avoid data inconsistency.

2. Lock categories

Locks can be categorized in two ways.

(1) From the database system's perspective: exclusive locks, shared locks, and update locks.

SQL Server uses the following resource lock modes:

Shared (S): used for operations that do not change or update data (read-only operations), such as SELECT statements.
Update (U): used on resources that can be updated. Prevents a common form of deadlock that occurs when multiple sessions are reading, locking, and potentially updating resources later.
Exclusive (X): used for data modification operations, such as INSERT, UPDATE, or DELETE. Ensures that multiple updates cannot be made to the same resource at the same time.
Intent: used to establish a lock hierarchy. The intent lock types are intent shared (IS), intent exclusive (IX), and shared with intent exclusive (SIX).
Schema: used when an operation that depends on the table schema is executing. The schema lock types are schema modification (Sch-M) and schema stability (Sch-S).
Bulk update (BU): used when bulk copying data into a table with the TABLOCK hint specified.

Shared locks

Shared (S) locks allow concurrent transactions to read (SELECT) a resource. While shared (S) locks exist on a resource, no other transaction can modify the data. The shared (S) locks on a resource are released as soon as the data has been read, unless the transaction isolation level is set to repeatable read or higher, or a locking hint is used to retain the shared (S) locks for the duration of the transaction.

Update locks

Update (U) locks prevent a common form of deadlock. A typical update pattern consists of a transaction reading a record, acquiring a shared (S) lock on the resource (page or row), and then modifying the row, which requires converting the lock to an exclusive (X) lock. If two transactions acquire shared-mode locks on a resource and then attempt to update data at the same time, one transaction attempts to convert its lock to an exclusive (X) lock. This conversion must wait, because the exclusive lock for one transaction is incompatible with the shared-mode lock of the other. A lock wait occurs. The second transaction then also attempts to acquire an exclusive (X) lock for its update. Because both transactions are converting to exclusive (X) locks, and each is waiting for the other to release its shared-mode lock, a deadlock occurs.

To avoid this potential deadlock, update (U) locks are used. Only one transaction at a time can obtain an update (U) lock on a resource. If the transaction modifies the resource, the update (U) lock is converted to an exclusive (X) lock. Otherwise, the lock is converted to a shared lock.
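The conversion deadlock and the way update locks avoid it can be traced with a small compatibility table. This is only a toy model of the S, U, and X modes described above, written in Python. The can_grant function is a hypothetical helper, and a real lock manager also deals with queues, intent locks, and many more modes.

```python
# (requested mode, mode already held by another transaction) -> compatible?
COMPATIBLE = {
    ("S", "S"): True,  ("S", "U"): True,  ("S", "X"): False,
    ("U", "S"): True,  ("U", "U"): False, ("U", "X"): False,
    ("X", "S"): False, ("X", "U"): False, ("X", "X"): False,
}


def can_grant(requested, held_by_others):
    """A lock is granted only if it is compatible with every lock others hold."""
    return all(COMPATIBLE[(requested, held)] for held in held_by_others)


# Read-then-update using shared locks: both transactions read the row first.
both_hold_s = can_grant("S", ["S"])   # True: T1 and T2 both hold S
t1_upgrade = can_grant("X", ["S"])    # False: T1 waits for T2's S lock
t2_upgrade = can_grant("X", ["S"])    # False: T2 waits for T1's S lock
conversion_deadlock = both_hold_s and not t1_upgrade and not t2_upgrade

# Read-then-update using update locks: the second reader waits up front.
t1_gets_u = can_grant("U", [])        # True: T1 holds the only U lock
t2_gets_u = can_grant("U", ["U"])     # False: T2 waits before reading
t1_converts = can_grant("X", [])      # True: nobody else holds a lock on the row
print(conversion_deadlock, t2_gets_u, t1_converts)
```

With shared locks both transactions get stuck waiting on each other. With update locks the second transaction simply queues behind the first, so the first can convert to X and finish.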
Exclusive locks

Exclusive (X) locks prevent concurrent transactions from accessing a resource. Other transactions cannot read or modify data locked with an exclusive (X) lock.

Intent locks

An intent lock indicates that SQL Server wants to acquire a shared (S) lock or exclusive (X) lock on some of the resources lower down in the hierarchy. For example, a shared intent lock at the table level means that a transaction intends to place shared (S) locks on pages or rows within that table. Setting an intent lock at the table level prevents another transaction from subsequently acquiring an exclusive (X) lock on the table containing that page. Intent locks improve performance because SQL Server examines intent locks only at the table level to determine whether a transaction can safely acquire a lock on that table. It does not have to examine every row or page lock in the table to determine whether a transaction can lock the entire table.

Intent locks include intent shared (IS), intent exclusive (IX), and shared with intent exclusive (SIX):

Intent shared (IS): indicates that the transaction intends to read some (but not all) resources lower in the hierarchy by placing S locks on those individual resources.
Intent exclusive (IX): indicates that the transaction intends to modify some (but not all) resources lower in the hierarchy by placing X locks on those individual resources. IX is a superset of IS.
Shared with intent exclusive (SIX): indicates that the transaction intends to read all of the resources lower in the hierarchy and modify some (but not all) of them by placing IX locks on those individual resources. Concurrent IS locks at the top-level resource are allowed. For example, a SIX lock on a table places a SIX lock on the table (allowing concurrent IS locks), IX locks on the pages being modified, and X locks on the modified rows.
There can be only one SIX lock per resource at a time, which prevents other transactions from updating the resource, although other transactions can still read resources lower in the hierarchy by obtaining IS locks at the table level.

In summary:

Exclusive lock: only the operation that holds the lock may work on the locked object, and no other operation on it is accepted. SQL Server takes exclusive locks automatically when it executes data modification commands. An exclusive lock cannot be placed on an object while other locks exist on it.
Shared lock: the locked resource can be read by other users, but they cannot modify it. SQL Server places shared locks on objects when it executes SELECT.
Update lock: when SQL Server prepares to update data, it first places an update lock on the data object so that the data cannot be modified but can still be read. When SQL Server determines that it will perform the update, it automatically converts the update lock to an exclusive lock. An update lock cannot be granted while another update or exclusive lock exists on the object.

(2) From the programmer's perspective: optimistic locking and pessimistic locking.

Optimistic locking: the application takes no explicit locks and leaves lock management entirely to the database, checking for conflicting changes when it writes.
Pessimistic locking: the programmer explicitly manages locks on the data or objects being worked on.

SQL Server uses locks to implement pessimistic concurrency control among multiple users who modify the database at the same time.

3. Lock granularity

Lock granularity is the size of the locked target. Fine-grained locks give high concurrency but high overhead. Coarse-grained locks give low concurrency but low overhead. SQL Server supports locks at the granularity of rows, pages, keys, key ranges, indexes, tables, and databases:

Resource   Description
RID        Row identifier. Used to lock a single row in a table.
Key        Row lock within an index. Used to protect key ranges in serializable transactions.
Page       8 KB data page or index page.
Extent     Contiguous group of eight data pages or index pages.
Table      Entire table, including all data and indexes.
DB         Database.

4. Lock duration

Locks are held for the length of time needed to protect the resource at the level requested.

The duration of shared locks used to protect reads depends on the transaction isolation level. At the default READ COMMITTED level, a shared lock is held only as long as it takes to read the page. In scans, the lock is held until a lock is acquired on the next page in the scan. If the HOLDLOCK hint is specified, or the transaction isolation level is set to REPEATABLE READ or SERIALIZABLE, the locks are held until the end of the transaction.

Depending on the concurrency options set for a cursor, the cursor may acquire shared-mode scroll locks to protect fetches. Scroll locks are held until the next fetch or until the cursor is closed, whichever comes first. If HOLDLOCK is specified, however, scroll locks are held until the end of the transaction.

Exclusive locks used to protect updates are held until the end of the transaction.

If a connection attempts to acquire a lock that conflicts with a lock held by another connection, the connection attempting to acquire the lock is blocked until one of the following happens:

The conflicting lock is released and the connection acquires the lock it requested.
The time-out interval for the connection expires. By default there is no time-out interval, but some applications set one to prevent an indefinite wait for a lock in SQL Server.

5. Customizing locks

1) Handling deadlocks and setting the deadlock priority

A deadlock arises when multiple users request different locks and each waits for a lock that another holds. SET DEADLOCK_PRIORITY controls how a session reacts when it is involved in a deadlock.
If two processes each hold a lock on some data and each waits for the other to release its lock before releasing its own, a deadlock occurs.

2) Handling time-outs and setting the lock time-out duration

@@LOCK_TIMEOUT returns the current lock time-out setting, in milliseconds, for the current session.

SET LOCK_TIMEOUT lets an application set the maximum time that a statement waits on a blocked resource. When a statement has waited longer than the LOCK_TIMEOUT setting, the blocked statement is canceled automatically and error 1222, "Lock request time out period exceeded", is returned to the application.

The following example sets the lock time-out period to 1,800 milliseconds:

SET LOCK_TIMEOUT 1800

3) Setting the transaction isolation level.

4) Using table-level locking hints with SELECT, INSERT, UPDATE, and DELETE statements.

5) Configuring the locking granularity of an index. You can use the sp_indexoption system stored procedure to set the locking granularity for an index.

6. Viewing lock information

1) Execute EXEC sp_lock to report lock information.
2) Press Ctrl+2 in Query Analyzer to see lock information.

7. Precautions: how to avoid deadlocks

1) When using transactions, keep the transaction logic as short as possible and commit or roll back as early as possible.
2) Set the deadlock time-out parameter to a reasonable range, for example 3 to 10 minutes. When the time-out is exceeded, the operation is abandoned automatically, so the process does not hang.
3) Optimize the program, checking for and avoiding deadlocks.
4) Test all scripts and stored procedures carefully before a release.
5) All stored procedures must handle errors (via @@ERROR).
6) In general, do not change SQL Server's default transaction isolation level, and forcing locks to solve a problem is not recommended.

How to lock a row, a table, or a database
1. Lock a row in a table

SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
SELECT * FROM Table WITH (ROWLOCK) WHERE Id = 1

2. Lock a table in the database

SELECT * FROM Table WITH (HOLDLOCK)

Lock statements:

Sybase:
UPDATE Table SET Col1 = Col1 WHERE 1 = 0;

MSSQL:
SELECT Col1 FROM Table WITH (TABLOCKX) WHERE 1 = 0;

Oracle:
LOCK TABLE Table IN EXCLUSIVE MODE;
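To watch an exclusive lock block a second connection, and a lock time-out cut the wait short, the following sketch uses SQLite through Python's sqlite3 module as a stand-in. It is an analogy rather than SQL Server behavior: SQLite locks the whole database file, the timeout argument of sqlite3.connect plays the role of SET LOCK_TIMEOUT, and the failure surfaces as sqlite3.OperationalError instead of error 1222. The file name and table are hypothetical.

```python
import os
import sqlite3
import tempfile

db_path = os.path.join(tempfile.mkdtemp(), "lock_demo.db")

# Connection 1 is the locking user. isolation_level=None means we issue
# BEGIN and COMMIT ourselves instead of letting the module do it.
holder = sqlite3.connect(db_path, isolation_level=None)
holder.execute("CREATE TABLE Table1 (Id INTEGER PRIMARY KEY, Col1 TEXT)")
holder.execute("INSERT INTO Table1 VALUES (1, 'x')")
holder.execute("BEGIN EXCLUSIVE")  # take an exclusive lock and keep it

# Connection 2 is willing to wait at most 0.2 seconds for a lock.
waiter = sqlite3.connect(db_path, timeout=0.2, isolation_level=None)

try:
    waiter.execute("UPDATE Table1 SET Col1 = 'y' WHERE Id = 1")
    timed_out = False
except sqlite3.OperationalError:   # "database is locked" after the time-out
    timed_out = True

holder.execute("COMMIT")           # the locking user releases the lock

# With the lock released, the same statement now goes through.
waiter.execute("UPDATE Table1 SET Col1 = 'y' WHERE Id = 1")
final_value = waiter.execute("SELECT Col1 FROM Table1 WHERE Id = 1").fetchone()[0]

holder.close()
waiter.close()
print(timed_out, final_value)
```

A time-out only turns an indefinite wait into an error. The application still has to decide whether to retry, back off, or report the failure.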

After the lock is taken, no one else can operate on the table until the locking user releases it with COMMIT or ROLLBACK.

A few examples help to make this concrete. Assume table1 (A, B, C):

A    B    C
a1   b1   c1
a2   b2   c2
a3   b3   c3

1) Exclusive lock

Create two connections. Execute the following statements in the first connection:

BEGIN TRAN
UPDATE table1 SET A = 'A' WHERE B = 'b2'
WAITFOR DELAY '00:00:30' -- wait 30 seconds
COMMIT TRAN

Execute the following statements in the second connection:

BEGIN TRAN
SELECT * FROM table1 WHERE B = 'b2'
COMMIT TRAN

If the two sets of statements are executed at the same time, the SELECT must wait until the UPDATE transaction completes, that is, 30 seconds.

2) Shared lock

Execute the following statements in the first connection:

BEGIN TRAN
SELECT * FROM table1 WITH (HOLDLOCK) -- HOLDLOCK explicitly holds the lock
WHERE B = 'b2'
WAITFOR DELAY '00:00:30' -- wait 30 seconds
COMMIT TRAN

Execute the following statements in the second connection:

BEGIN TRAN
SELECT A, C FROM table1 WHERE B = 'b2'
UPDATE table1 SET A = 'aa' WHERE B = 'b2'
COMMIT TRAN

If the two sets of statements are executed at the same time, the SELECT in the second connection runs immediately, but the UPDATE must wait until the first transaction releases its shared lock before it can acquire an exclusive lock, that is, 30 seconds.

3) Deadlock

Add table2 (D, E):

D    E
d1   e1
d2   e2

Execute the following statements in the first connection:

BEGIN TRAN
UPDATE table1 SET A = 'A' WHERE B = 'b2'
WAITFOR DELAY '00:00:30'
UPDATE table2 SET D = 'd5' WHERE E = 'e1'
COMMIT TRAN

Execute the following statements in the second connection:

BEGIN TRAN
UPDATE table2 SET D = 'd5' WHERE E = 'e1'
WAITFOR DELAY '00:00:10'
UPDATE table1 SET A = 'aa' WHERE B = 'b2'
COMMIT TRAN

If both are executed at the same time, the system detects a deadlock and terminates one of the processes.

In addition, SQL Server supports the following table-level locking hints:

HOLDLOCK: holds shared locks until the whole transaction completes, instead of releasing them as soon as the locked object is no longer needed. Equivalent to the SERIALIZABLE transaction isolation level.
NOLOCK: the statement does not issue shared locks, and dirty reads are allowed. Equivalent to the READ UNCOMMITTED transaction isolation level.
PAGLOCK: uses page locks where a single table lock would otherwise be taken.
READPAST: lets SQL Server skip any locked rows. Applies at the READ COMMITTED transaction isolation level, and skips only RID locks, not page, extent, or table locks.
ROWLOCK: forces the use of row locks.
TABLOCKX: forces an exclusive table-level lock, which prevents any other transaction from using the table for the duration of the transaction.
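Returning to the deadlock in example 3: one common way to detect such a situation is to look for a cycle in a "wait-for" graph, where each blocked transaction points at the transaction holding the lock it needs. The sketch below illustrates that idea in Python under a simplifying assumption (each transaction waits for at most one other). It is not a description of how SQL Server implements its detector, and find_deadlock is a hypothetical helper.

```python
def find_deadlock(waits_for):
    """Return the transactions forming a cycle in the wait-for graph, or None.

    `waits_for` maps a blocked transaction to the transaction it is waiting on.
    """
    for start in waits_for:
        path = [start]
        seen = {start}
        node = waits_for.get(start)
        while node is not None:
            if node == start:
                return path        # we walked back to where we began: a cycle
            if node in seen:
                break              # a cycle exists, but it does not include `start`
            path.append(node)
            seen.add(node)
            node = waits_for.get(node)
    return None


# Example 3: connection 1 holds table1 and wants table2,
# connection 2 holds table2 and wants table1.
deadlocked = find_deadlock({"conn1": "conn2", "conn2": "conn1"})

# Ordinary blocking: connection 2 waits, but connection 1 is not blocked.
just_waiting = find_deadlock({"conn2": "conn1"})
print(deadlocked, just_waiting)
```

Example 3 deadlocks because the two connections update table1 and table2 in opposite order. If both updated the tables in the same order, the second connection would only wait, as in the second call above.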
