Database acid, isolation level, and mvcc__ database

Source: Internet
Author: User
Tags commit serialization
First, you need to define the concept of the transaction: a set of atomic SQL queries that, if the database engine succeeds in applying all the statements of the group query to the database, executes the group statement, or none of the statements are executed.

Transactions have acid four characteristics, namely:

Atomicity: A transaction is an indivisible minimum unit of work in which the operation is either wholly successful or all failed;

Consistency: The database is always converted from one consistency state to another. The so-called consistency state, that is, all the integrity of the database constraints (especially the user-defined constraints) are adhered to, in the case of bank transfers, "the transfer operation will inevitably lead to an account reduction, another account increase amount, and the total amount of these two accounts sum unchanged" is an integrity constraint.

Persistence: Once a transaction commits, its modifications are persisted to the database

Isolation: Isolation is used to define the degree of isolation between transactions, with four isolation levels.

First you need to explain a few conceptual definitions related to isolation:

(1) Dirty reading: Refers to the transaction read dirty data, the so-called dirty data, refers to the incorrect data, such as the transaction during the execution of the modification of a record, and then rollback, if the other transaction read the record in the middle of the modified value, dirty read.

(2) Non-repeatable reads: A transaction reads the same existing record multiple times during execution, and the values are different. Read Commit isolation level there is an issue of non-duplication, transactions 1, 2 concurrent execution, transaction 2 first read record 1, and then transaction 1 modify record 1 and commit, transaction 2 continue to read record 1, the transaction 22 read to the different values.

(3) Phantom reading: Phantom reading is the number of records that may be read when you use a condition to read a batch of records. The difference between phantom reading and Dirty reading, and non repeatable reading, is that dirty reads, non-repeatable reads are for a certain existing record of the value that appears not required (read dirty data or read the different values), while the Phantom read multiple times using the same criteria to query a batch of records, read the number of records are different, that is, dirty read, Non-repeatable reads are caused by multiple transactions executing the update in parallel, while Phantom reading is caused by multiple transactions executing the insert in parallel (the problem caused by the concurrent delete appears to be the case). )。

The four isolation levels are:

(1) Read uncommited: Reading uncommitted, meaning multiple concurrent transactions, any one transaction can read to other transactions have not committed changes:

There is dirty read, not repeatable read, the possibility of phantom reading.

(2) Read commited: Reads have been submitted, meaning multiple concurrent transactions, and any one transaction can only read changes that have been committed by other transactions:

To resolve dirty reads, there is the possibility of not repeating reading and reading.

(3) REPEATABLE READ: Repeatable read, meaning that when multiple transactions are executed concurrently, any one transaction reads the existing record repeatedly, and each read value is the same

Resolves dirty read, not repeatable read, there is the possibility of phantom reading.

(4) Serializable: serialization, meaning that all transactions are executed serially, so there is no case of concurrent execution of the transaction.

Resolves dirty reads, non-repeatable reads, and Phantom reads.

Multi-version concurrency control MVCC

Of the four isolation levels, READ UNCOMMITTED isolation is the worst, and the performance is not much improved and is rarely used, compared to read submissions, and serialization is the best, but poorly performing and rarely used. The default isolation level for a generic database is either Read committed or repeatable (e.g., MySQL's InnoDB engine) or read-committed (for example, Oracle).

If you use row-level read-lock, write-lock to implement read-committed or repeatable-read, the following steps should be:

1, the transaction 1 will modify the line 1, then the row 1 plus write lock, start the transaction;

2. Transaction 2 is a pure read operation, which requires reading Line 1, attempting to add a read lock on row 1, and transaction 1 waits until transaction 2 is completed because transaction 1 has been written lock.

3. If transaction 2 starts first, then transaction 1 also needs to wait for transaction 2 to complete and release the read lock before it can begin execution.

Even if you say, writes to a row block all read operations on the row, and a read operation on a row blocks all writes to the row, and is constrained by locks, regardless of how high the system IO capability is, in the case of a read or write concurrency.

MVCC used to solve this problem to improve the performance of the system, MVCC there is no unified standard, the implementation of each database in different ways to achieve MVCC,INNODB implementation of the following ways:

Preparatory work:

(1) Add row and delete labels to two fields for each line record;

(2) Maintain a global system version number, each start a transaction (note that the Select is also a transaction, read the transaction), add the system version number 1 and as the transaction version number

The row flag for the inserted record is set to this transaction version number, and the deletion flag is empty;

Delete the deletion flag of the record is set to this transaction version number;

Modified processing: Modifies the original record's deleted version number to this transaction version number, inserts a record, contains the original record data and this modification, the row record flag is set to this transaction version number, the deletion flag is empty;

Read processing process:

Read only record rows that meet the following criteria:

(1) The line flag is less than or equal to this transaction version number (equal to the increase that is used to guarantee that the submission within this transaction can be read);

(2) The deletion flag is empty or larger than this transaction version number (excluding records that are equal to guarantee that the deletion of this transaction will not be read);

The equivalent of a snapshot of a system at the beginning of a read transaction, all the data read by the transaction is read from the snapshot, so that it satisfies the repeatable read condition and solves the problem of Phantom reading, and does not read the same query condition. Problem with number of records first read in transaction greater than second read (caused by concurrent deletion)

From the above, the use of MVCC, most of the reading will no longer need to read locks, so read no longer blocking write, writing is no longer blocking read. Read operations are only subject to system IO capabilities.


Where did you go yesterday? Network interview, Old Zhou and Lao Zhao asked a lot of questions, most of the details, including the transaction isolation level and MVCC, due to inadequate preparation, so today specially to verify.

Among the isolation levels, it is difficult to understand the repeatable read REPEATABLE read, and serializable serial read, the following sequence of experiments to see the difference between each other.


Serializable isolation Level:

Session 1 Session 2

Mysql> Show variables like '%iso% ';

+---------------+--------------+

| variable_name | Value |

+---------------+--------------+

| tx_isolation | SERIALIZABLE |

+---------------+--------------+

1 row in Set (0.00 sec)

Mysql> begin;

Query OK, 0 rows Affected (0.00 sec)

Mysql> select * from T;

+---+

| I |

+---+

| 1 |

| 2 |

| 3 |

| 4 |

| 5 |

+---+

5 rows in Set (0.00 sec)

Mysql> Show variables like '%iso% ';

+---------------+--------------+

| variable_name | Value |

+---------------+--------------+

| tx_isolation | SERIALIZABLE |

+---------------+--------------+

1 row in Set (0.00 sec)

Mysql> begin;

Query OK, 0 rows Affected (0.00 sec)

Mysql> select * from T;

+---+

| I |

+---+

| 1 |

| 2 |

| 3 |

| 4 |

| 5 |

+---+

5 rows in Set (0.00 sec)

Mysql> INSERT INTO T values (6);

ERROR 1205 (HY000): Lock wait timeout exceeded; Try restarting transaction



Mysql> INSERT INTO T values (10);

ERROR 1205 (HY000): Lock wait timeout exceeded; Try restarting transaction

The above phenomenon shows: when the isolation level is serializable, incompatible MVCC, strictly follow the lock mechanism, when Session1 and Session2 are all full table query, the two sessions will be full table read lock, because read lock and read lock compatible, So at this point, no one session can modify, insert data, will enter the waiting.


REPEATABLE READ Isolation Level:


Session 1 Session 2

Mysql> Show variables like '%iso% ';

+---------------+-----------------+

| variable_name | Value |

+---------------+-----------------+

| tx_isolation | Repeatable-read |

+---------------+-----------------+

1 row in Set (0.00 sec)

Mysql> begin;

Query OK, 0 rows Affected (0.00 sec)

Mysql> select * from T;

+---+

| I |

+---+

| 1 |

| 2 |

| 3 |

| 4 |

| 5 |

+---+

5 rows in Set (0.00 sec)

Mysql> Show variables like '%iso% ';

+---------------+-----------------+

| variable_name | Value |

+---------------+-----------------+

| tx_isolation | Repeatable-read |

+---------------+-----------------+

1 row in Set (0.00 sec)

Mysql> begin;

Query OK, 0 rows Affected (0.00 sec)

Mysql> select * from T;

+---+

| I |

+---+

| 1 |

| 2 |

| 3 |

| 4 |

| 5 |

+---+

5 rows in Set (0.00 sec)

Mysql> INSERT INTO T values (6);

Query OK, 1 row Affected (0.00 sec)


Mysql> select * from T;

+---+

| I |

+---+

| 1 |

| 2 |

| 3 |

| 4 |

| 5 |

| 6 |

+---+

6 rows in Set (0.00 sec)

Mysql> select * from T;

+---+

| I |

+---+

| 1 |

| 2 |

| 3 |

| 4 |

| 5 |

+---+

5 rows in Set (0.00 sec)


Mysql> INSERT INTO T values (6);

ERROR 1205 (HY000): Lock wait timeout exceeded; Try restarting transaction

The above description: When the isolation level is repeatable read, compatible with MVCC (use undo), when two things read data to their respective undo, things are independent, but different transactions for the same row of data modification will be based on the order of exclusive locks. Session 2, above, inserts a data because session 1 already exists and locks the row of data, and at the moment it appears at the repeatable read isolation level, which is unique to the Phantom read phenomenon (which is not available in this conversation but still cannot be inserted).


Summarize:

MVCC: Multi-version control, multiple uncommitted transactions see the data are their own, different from each other, in the overall view of the client as if multiple versions of the database.

MVCC is compatible only with isolation level read-committed and Repeatable-read, MACC is not locked between reads and writes for the same line of different things, and locks the same line of different transactions.

MVCC is incompatible with read-uncommitted and serializable, where the serializable is controlled by a lock and all transactions conform to the lock feature.


Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.