Mysql article notes, mysql article

Source: Internet
Author: User

Mysql article notes, mysql article
Mysql glossary/meaning/Reading Notes MVCC (Multiversion concurrency control)

MySQL InnoDB Storage engine implements Multi-Version Concurrency Control protocol-MVCC (Multi-Version Concurrency Control) (Note: Compared with MVCC, it implements lock-based Concurrency Control, lock-Based Concurrency Control ). The biggest benefit of MVCC is that it is also familiar: Read is not locked, and read/write is not conflicted. In OLTP applications with multiple reads and writes, read/write conflicts are very important, greatly increasing the system's concurrency performance. This is why almost all RDBMS support MVCC at this stage.

In MVCC concurrency control, read operations can be divided into two types: snapshot read and current read ). Snapshot reading reads the visible version of the record (which may be a previous version) without locking. The current read reads the latest version of the record, and the record returned by the current read will be locked to ensure that other transactions will not modify this record concurrently.

Snapshot read

A simple select operation is a snapshot read without locking. (Of course, there are exceptions, which will be analyzed below)
Select * from table where ?;

Current read

Current read: Special read operations, insert/update/delete operations, which belong to the current read and need to be locked.
select * from table where ? lock in share mode;S lock
Select * from table where? For update; X lock
Insert into table values (...); X lock
Update table set? Where ?; X lock
Delete from table where ?; X lock

Clustered index

Official documentation:
Every InnoDB table has a special index called the clustered index where the data for the rows is stored. typically, the clustered index is synonymous with the primary key. to get the best performance from queries, inserts, and other database operations, you must understand how InnoDB uses the clustered index to optimize the most common lookup andDMLOperations for each table.

  • If you definePRIMARY KEYOn your table, InnoDB uses it as the clustered index. (If the primary key is defined, InnoDB selects the primary key as the clustered index .)

  • If you do not definePRIMARY KEYFor your table, MySQL picks the firstUNIQUEIndex that has onlyNOT NULLColumns as the primary key andInnoDBUses it as the clustered index. (if there is no primary key, Mysql selects a unique column created and the unique column Not Null as the InnoDB index .)

  • If the table has noPRIMARY KEYOr suitableUNIQUEIndex,InnoDBInternally generates a hidden clustered index on a synthetic column containing rowIDValues. The rows are ordered byIDThatInnoDBAssigns to the rows in such a table. The rowIDIs a 6-byte field that increases monotonically as new rows are inserted. Thus, the rows ordered by the rowIDAre physically in insertion order. (If the table's primary key and unique creation do not exist, InnoDB generates a hidden column number that contains the column number and serves as the clustering index .)

2PL-Two-Phase Locking: Two-Phase Lock

In databases and sessions, 2PL is used to ensure thread security, that is, obtaining and Releasing locks. 2PL has two primitives:

-Expanding phase: locks are acquired and no locks are released;
-Shrinking phase: locks are released and no locks are acquired;
2PL is scheduled to differentiate two locks: Shared locks and Exclusive locks.

See Two-phase locking.

Mysql transaction Isolation level: Isolation level

In database systems, does isolations determine the visibility of transactions to other users and systems? PS: ACID (Atomicty, Consistency, Isolation, Durability ).

Some Glossary in transaction operations

In the two queries of a transaction, the number of data records is inconsistent. For example, one transaction queries several rows of data, while the other transaction inserts several new rows of data at this time, in the subsequent query, the previous transaction will find several rows of data that it did not have before.

Dirty read

A transaction has updated the data, and another transaction reads the same data at this time. For some reason, the previous RollBack operation, the data read by the last transaction is incorrect because the transaction can see the uncommitted results of other transactions.

Non-repeated read

The data in two queries of a transaction is inconsistent. This may be because the original data updated by a transaction is inserted in the middle of the two queries, the reason is that the transaction can see the results of other transaction submissions.

4th isolation level Serializable

This is the highest isolation level. It forces transaction sorting to make it impossible to conflict with each other, thus solving the phantom read problem. In short, it adds a shared lock to each read data row. At this level, there may be a lot of timeout and lock competition.

Repeatable reads

This is the default transaction isolation level of MySQL. It ensures that multiple instances of the same transaction will see the same data rows when reading data concurrently. However, theoretically, this will lead to another tricky problem: Phantom Read ). In short, phantom read refers to when a user reads data in a certain range, another transaction inserts a new row in this range. When the user reads data in this range, there will be a new Phantom line. The InnoDB and Falcon storage engines solve this problem through the Multi-version Concurrency Control (MVCC, Multiversion Concurrency Control) mechanism, the GAP lock.

Read committed

This is the default isolation level for most database systems (but not for MySQL ). It satisfies the simple definition of isolation: a transaction can only see changes made by committed transactions. This isolation level also supports the so-called Nonrepeatable Read, because other instances of the same transaction may have a new commit during the processing of this instance, so the same select may return different results.

Read uncommitted

At this isolation level, all transactions can see the execution results of other uncommitted transactions. This isolation level is rarely used in practical applications, because its performance is no better than other levels. Read uncommitted data, also known as Dirty Read ).

To analyze the locking status of an SQL statement, see he dengcheng's blog
  • SQL: select * from t1 where id = 10;
  • SQL: delete from t1 where id = 10;
    Before answering this question, we need to clarify the following prerequisites:
  • Prerequisite 1: whether the id column is a primary key
  • What is the isolation level of the current database engine?
  • If the Id column is not a primary key, there is no index on the Id column.
  • If there is a secondary index on the Id column, whether the Id is a Unique Key
  • What are the execution plans of two SQL statements? Index scan? Full table scan?
    Another SQL statement uses indexes even after analysis, but the actual execution plan has many complicated other conditions, even if "It looks", it will go through the index, but the whole table scan is performed through the execution plan.
Combination 1: The id column is the primary key and the RC isolation level. Execute delete from t1 where id = 10;
create table t1(    id int(32) not null,    name varchar(50) not null,    primary key(id));

Conclusion: If the id column is a primary key, you only need to add the X lock to the column id = 10.

Combination 2: id is Unique_key, RC isolation level, execute delete from t1 where id = 10;
create table t1(    id int(32) not null,    name varchar(50) not null,    primary key (`name`),    unique key `key_name`(`name`));

Execution diagram (obtained by HE dengcheng's blog ):

In this combination, the id below is a two-phase index. In this case, unlike the combination of a lock, the DB Engine goes first with the Id index of the where condition, apply the X lock to the record with Id = 10 on the corresponding id index, return to the cluster index based on the name value, and apply the X lock to the value with name = d. Why does the X lock also need to be applied to clustered indexes? If the X lock is not applied to the delete operation, if an update t1 set id = 100 where name = 'D', a conflict may occur.
Conclusion: If id is a unique index and name is the primary key, the X lock will be applied to the records with id = 10 on the id index, in addition, the X lock is applied to the records with name = 'd on the name Cluster Index.

Combination 3: The id is a non-unique index, and the RC isolation level is delete from t1 where id = 10;
create table t1(    id int(32) not null ,    name varchar(50) not null ,    primary key (`name`),    key `key_name`(`name`));

Execution diagram (obtained by HE dengcheng's blog ):

As shown in the figure, the X lock will be applied to all records with id = 10 matched by the where condition, and the records corresponding to the index will also be locked.
Conclusion: If the id column has a non-unique index, all corresponding records that meet the SQL query conditions will be locked. At the same time, these records on the primary key index will also be locked.

Combination 4: No index on the id column, RC isolation level delete from t1 where id = 10;
create table t1(    id int(32) not null ,    name varchar(50) not null,    primary key (`name`),);

Execution diagram (obtained by HE dengcheng's blog ):

Conclusion: When mysql uses the where condition, because it cannot quickly confirm the affected rows through the index, it will add an X lock to the record rows of all clustered indexes and then return all records. Mysql was optimized in specific implementation, and the where condition was used again to determine whether the X lock was released through unlock_row for non-satisfied records (against the 2PL Specification );

Combination 5: id is the primary key column, RR isolation level

In this case, the locking mechanism is consistent with the combination.

Combination 6: The id column is a unique index with the RR isolation level.

In this case, the lock is consistent with the combination

Combination 7: The id column is a non-unique index and the RR isolation level (GAP lock)

When Mysql transactions are separated from RC, phantom read is allowed, but the isolation level does not run phantom read when RR is used. How does Mysql achieve RR isolation without phantom reading? This combination is described.
Execution diagram (obtained by HE dengcheng's blog ):

The locking mechanism here is similar to the locking mechanism under RC. The only difference is that a GAP lock is added in RC locking, and the GAP lock is not added to a specific record, instead, it loads a lock between the record and the record.
Let's talk about phantom read: phantom read means that when a select * from t1 where id = 10 for update SQL statement is executed twice in a row, the number of records read before and after the two times is inconsistent (the second time will not return more records than the first time ).
At the RR isolation level, because B + tree indexes are ordered, you must ensure that records with id = 10 cannot be inserted between IDs = [6, 10). For details, refer to [6, c] inserts a value similar to [10, aa] between [10, B] or between [10, B] and [10, d, c] there must be a lock to make these inserts unexecutable (that is, the GAP lock ).

The reason why the GAP lock does not appear in combination 5 and Combination 6 is that the above two combinations ensure the uniqueness of the record, and there is no need to use the GAP lock.

Conclusion: At the Repeatable Read isolation level, the id column has a non-unique index, corresponding to SQL: delete from t1 where id = 10. First, locate the first record that meets the query conditions using the id index, add the X lock on the record, add the GAP lock on the GAP, add the X lock on the record on the primary key clustered index, and then return the lock; then read the next one and repeat it. Until the first record that does not meet the conditions [11, f] is performed, the record X lock is not required, but the GAP lock is still required, and the return ends.

Combination 8: No index on id, RR transaction isolation level

Lock status (obtained by HE dengcheng's blog ):

Conclusion: The locking mechanism is similar to the RC isolation level. The difference is that a GAP lock is added between each record. Any SQL statements related to locking, such as update, modification, and insertion, cannot be executed. We are delighted that Mysql will filter out the pre-release locks that do not meet the where conditions in advance.

Composite 9: Serializable

In the case of Serializable, when delete from t1 where id = 10 is passed through RR, the Gap lock is used to solve phantom read. Serializable affects select * from t1 where id = 10. This SQL statement does not lock the fast illumination under the RR and RC statements, but locks the data in the case of Serializable statements.

Analysis of a complex SQL statement

The SQL example is as follows (obtained by HE dengcheng's blog ):

After analyzing the composition of the SQL where condition, let's take a look at the SQL locking (RR isolation level), as shown in:

It can be seen that, at the Repeatable Read isolation level, the range determined by the Index Key is added with the GAP lock; the condition (userid = 'hdc ') given by the Index Filter lock when to Filter, according to MySQL version, Index Condition Pushdown (ICP) is not supported before MySQL 5.6. Therefore, Index Filter is used at the MySQL Server layer and Index Condition Pushdown is supported after MySQL 5.6, then filter on the index. If you do not support the ICP record and do not meet the Index Filter Record, you also need to add the record X lock. If you support the ICP record, you do not need to add the record X lock (in the figure, the X lock marked by the Red Arrow, whether to add, depends on whether the ICP is supported); and the Filter condition corresponding to Table Filter, after reading in the clustered index, mySQL Server-level filtering, so the clustering index also requires the X lock. Finally, a record meeting the conditions [8, hdc, d, 5, good] is selected, but the number of locks is far greater than the number of records meeting the conditions.

Conclusion: At the Repeatable Read isolation level, the where condition must be extracted for a complex SQL statement. The GAP lock must be applied to the range determined by the Index Key. The Index Filter condition depends on whether the MySQL version supports ICP. If the Index Filter record is not supported, the X lock is not applied, otherwise, the X lock is required. The X lock is required no matter whether the Filter conditions are met or not.

Principles and Analysis of deadlocks

Deadlock 1 (obtained by HE dengcheng's blog ):

Deadlock 2 (obtained by HE dengcheng's blog ):

The two deadlock cases above. The first one is very easy to understand and is also the most common deadlock. Each transaction executes two SQL statements and holds a lock respectively, and then adds another lock to generate a deadlock.

In the second case, although each Session has only one statement, a deadlock still occurs. To analyze the deadlock, you must first use the MySQL locking rules mentioned earlier in this article. For Session 1, the [hdc, 1], [hdc, 6] Read starting from the name index meet the conditions, not only will the record X lock on the name index be added, in addition, the record X lock on the clustered index will be added. The lock sequence is [1, hdc, 100], and then [6, hdc, 10]. Session 2, starting from the pubtime index, [] and [, 1] all meet the filtering conditions. The record X lock on the clustered index will also be added, and the lock order will be [6, hdc, 10], and later [1, hdc, 100]. No. The lock sequence of Session 1 is the opposite. If both sessions hold the first lock and the second lock is requested, the deadlock will occur.

See documentation

He dengcheng article: http://hedengcheng.com /? P = 771
Mysql documentation: http://dev.mysql.com/doc/

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.