MySQL multi-version concurrency control analysis

Source: Internet
Author: User

    • READ_UNCOMMITTED
      The read transaction reads the master record directly, regardless of whether the update transaction has completed.
    • READ_COMMITTED
      Read committed. On every read, the transaction checks whether the master record is locked: if it is not, it reads the master record; if it is, it reads the most recent version in the undo log. Every read therefore returns the latest committed data, so two reads of the same field may return different values (a non-repeatable read).
    • REPEATABLE_READ
      Repeatable read. On the first read, the transaction checks the master record and reads it if it is not locked, or reads the most recent version in the undo log if it is. My guess is that an update creates a new record that becomes the current master record, while the original master record becomes the newest undo log entry. Every later read returns the version seen by the first read, so repeated reads are consistent, but the latest data may not be read.
    • SERIALIZABLE
      Reads take locks as well, so reads and writes block each other; rarely used.


How does MySQL implement MVCC? Many people ask this question, but a Google search turns up no answer. This article tries to find the answer in the MySQL source.

In MySQL, MVCC is supported by the InnoDB storage engine, which adds three hidden fields to every row:

    • 6-byte transaction ID (DB_TRX_ID)
    • 7-byte rollback pointer (DB_ROLL_PTR)
    • Hidden row ID (DB_ROW_ID)
The 6-byte transaction ID identifies the transaction that last modified the row. Understanding the 7-byte rollback pointer requires an understanding of the transaction model in InnoDB.
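To make the field widths concrete, here is a minimal Python sketch that packs and unpacks the three hidden fields. It is an illustration only: the function names, the field order, and the big-endian encoding are assumptions, not InnoDB's actual on-disk record format. The 6-byte width of the hidden row ID comes from the InnoDB documentation rather than from this article.

```python
# Illustrative only: field order and byte order are assumptions,
# not the real InnoDB record layout.
ROW_ID_BYTES = 6    # hidden row ID (DB_ROW_ID)
TRX_ID_BYTES = 6    # transaction ID (DB_TRX_ID)
ROLL_PTR_BYTES = 7  # rollback pointer (DB_ROLL_PTR)


def pack_hidden_fields(row_id: int, trx_id: int, roll_ptr: int) -> bytes:
    """Serialize the three hidden fields into a fixed 19-byte header."""
    return (
        row_id.to_bytes(ROW_ID_BYTES, "big")
        + trx_id.to_bytes(TRX_ID_BYTES, "big")
        + roll_ptr.to_bytes(ROLL_PTR_BYTES, "big")
    )


def unpack_hidden_fields(header: bytes) -> tuple:
    """Split a 19-byte header back into (row_id, trx_id, roll_ptr)."""
    row_id = int.from_bytes(header[:ROW_ID_BYTES], "big")
    trx_end = ROW_ID_BYTES + TRX_ID_BYTES
    trx_id = int.from_bytes(header[ROW_ID_BYTES:trx_end], "big")
    roll_ptr = int.from_bytes(header[trx_end:], "big")
    return row_id, trx_id, roll_ptr


header = pack_hidden_fields(row_id=1, trx_id=42, roll_ptr=123456)
```

The fixed widths mean every row carries 19 bytes of version bookkeeping in this sketch, regardless of how many user columns it has.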

1. InnoDB transaction-related concepts
To support transactions, InnoDB introduces the following concepts:
  • Redo Log
    The redo log records the changes made by executed SQL statements in a dedicated log file (as changes to data pages rather than the SQL text itself); during recovery, MySQL replays the redo log records. As the client executes each statement (for example an UPDATE), redo log records are first written to the log buffer, and when the client issues COMMIT, the contents of the log buffer are flushed to disk as appropriate (governed by the innodb_flush_log_at_trx_commit setting). The redo log exists as standalone files on disk, namely the InnoDB log files.
  • Undo Log
    In contrast to the redo log, the undo log is used for rollback. Before a transaction modifies rows, their prior contents are copied to the undo buffer, and the contents of the undo buffer are flushed to disk at an appropriate time. Like the redo buffer, the undo buffer is a ring buffer, and its contents are also flushed to disk when the buffer fills up. Unlike the redo log, there is no separate undo log file on disk: all undo log records are stored in the main data file (the system tablespace), even if the client configures one data file per table. (Later MySQL versions can place undo logs in separate undo tablespaces.)
  • Rollback segment
    The concept of a rollback segment comes from Oracle's transaction model. In InnoDB, the undo log is divided into segments, and the undo log for a particular row is stored in one of them, called a rollback segment. For the purposes of this article, undo log and rollback segment can be treated as meaning the same thing.
  • Lock
    InnoDB provides row-level locks. If the number of rows is very large, the number of locks under high concurrency may also be large, but according to the InnoDB documentation, InnoDB stores locks space-efficiently, so even high concurrency does not exhaust memory.
    There are two types of row locks: exclusive and shared. Shared locks are for reads and exclusive locks are for writes, which is exactly equivalent to the concept of read and write locks. If a transaction is updating a row (holding an exclusive lock), other transactions must wait to read or write that row. If a transaction is reading a row (holding a shared lock), other readers do not wait, while writers do. With shared locks, multiple readers never wait for one another, but whether locks are taken at all depends on the MySQL transaction isolation level.
  • Isolation level
    The isolation level limits how much transactions can interfere with one another. The industry standard defines the following levels:
    - READ_UNCOMMITTED: dirty read
    - READ_COMMITTED: read committed
    - REPEATABLE_READ: repeatable read
    - SERIALIZABLE: serializable
    InnoDB supports all four levels. Dirty read and serializable are rarely used in practice, while read committed and repeatable read are widely used; the following sections explain how they are implemented.
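The shared/exclusive rules described under Lock above can be sketched as a small compatibility check. This is a simplified model for illustration: the class and method names are hypothetical, and a real lock manager would queue waiters rather than simply return False.

```python
from enum import Enum


class LockMode(Enum):
    SHARED = "S"      # taken by readers
    EXCLUSIVE = "X"   # taken by writers


def compatible(held: LockMode, requested: LockMode) -> bool:
    """Only two shared locks can coexist on the same row."""
    return held is LockMode.SHARED and requested is LockMode.SHARED


class RowLock:
    """Hypothetical per-row lock table: transaction ID -> lock mode held."""

    def __init__(self):
        self.holders = {}

    def try_acquire(self, trx_id: int, mode: LockMode) -> bool:
        # A transaction that already holds the exclusive lock needs nothing more.
        if self.holders.get(trx_id) is LockMode.EXCLUSIVE:
            return True
        for other, held in self.holders.items():
            if other != trx_id and not compatible(held, mode):
                return False  # the caller would have to wait
        self.holders[trx_id] = mode
        return True

    def release(self, trx_id: int) -> None:
        self.holders.pop(trx_id, None)


lock = RowLock()
first_reader = lock.try_acquire(1, LockMode.SHARED)
second_reader = lock.try_acquire(2, LockMode.SHARED)
writer = lock.try_acquire(3, LockMode.EXCLUSIVE)
```

Here both readers obtain the lock, while the writer is refused until the readers release it.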

2. Row update process
The following demonstrates how a transaction updates a row:
1. Initial data row
F1 to F6 are the column names of the row, and 1 to 6 are the corresponding values. The three hidden fields that follow hold the row's hidden ID, transaction ID, and rollback pointer. If the row has just been inserted, the ID can be taken to be 1 and the other two fields are empty.
2. Transaction 1 changes the values of the row
When transaction 1 changes the values of the row, the following actions are performed:
    • Lock the row with an exclusive lock
    • Write the redo log
    • Copy the values of the row before the change to the undo log
    • Modify the values of the current row, fill in the transaction ID, and point the rollback pointer at the pre-modification row in the undo log
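The four steps above can be sketched in Python as follows. This is a simplified in-memory model under stated assumptions: the Version and Row classes and their methods are hypothetical, the redo log is just a list, and the old version object itself plays the role of the undo log record instead of being copied into a separate buffer.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Version:
    values: dict                    # column values of this version
    trx_id: Optional[int]           # transaction that wrote this version
    roll_ptr: Optional["Version"]   # previous version in the undo log


class Row:
    def __init__(self, values: dict):
        self.current = Version(dict(values), trx_id=None, roll_ptr=None)
        self.locked_by = None

    def update(self, trx_id: int, changes: dict, redo_log: list) -> None:
        if self.locked_by not in (None, trx_id):
            raise RuntimeError("row is locked by another transaction")
        self.locked_by = trx_id                    # step 1: exclusive lock
        redo_log.append((trx_id, dict(changes)))   # step 2: write redo log
        old = self.current                         # step 3: old values become undo
        # step 4: new current values, transaction ID, rollback pointer to old
        self.current = Version({**old.values, **changes}, trx_id, roll_ptr=old)

    def commit(self, trx_id: int) -> None:
        self.locked_by = None   # commit needs no work on the versions

    def rollback(self, trx_id: int) -> None:
        # Follow rollback pointers until we leave this transaction's versions.
        while self.current.trx_id == trx_id:
            self.current = self.current.roll_ptr
        self.locked_by = None


redo = []
row = Row({"f1": 1, "f2": 2})
row.update(1, {"f1": 10}, redo)
row.commit(1)
```

After the update, row.current holds the new values with transaction ID 1, and row.current.roll_ptr still holds the original values, which is what a rollback or an older reader would use.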
3. Transaction 2 modifies the values of the row
The steps are the same as for transaction 1. At this point the undo log holds two records for the row, linked together by rollback pointers. If undo log records were never deleted, following the current record's rollback pointer would eventually lead back to the row's initial content at the time it was created. Fortunately, InnoDB has a purge thread that looks for undo log records older than the oldest active transaction and deletes them, so the undo log does not grow indefinitely.
4. Transaction commit
When a transaction commits, InnoDB only needs to change the transaction's state to committed; no additional work is required. Rollback is slightly more complex: starting from the current rollback pointer, InnoDB must find the version that preceded the transaction's modification in the undo log and restore it. If a transaction affects a very large number of rows, rollback can be slow; InnoDB is most efficient when a transaction affects roughly 1,000 to 10,000 rows. Clearly, InnoDB is a storage engine in which commit is cheaper than rollback. The PostgreSQL implementation is said to be exactly the opposite.
5. Insert undo log
The procedure described above applies to UPDATE. In fact, undo logs are divided into insert undo logs and update undo logs. Because a rolled-back insert has no original data to restore, an insert undo log can simply be discarded once the transaction commits, whereas an update undo log must follow the procedure above.

3. Transaction levels
As is well known, updates (UPDATE, INSERT, DELETE) are transactional. In InnoDB a query is also a transaction, namely a read-only one. When a read transaction and a write transaction access the same row concurrently, what the reader sees depends on the isolation level:
    • READ_UNCOMMITTED
      The read transaction reads the master record directly, regardless of whether the update transaction has completed.
    • READ_COMMITTED
      Read committed. The read transaction reads the most recent committed version each time (taken from the undo log if the row is being modified), so reading the same field twice may return different data (a non-repeatable read), but every read is guaranteed to return the latest committed data.
    • REPEATABLE_READ
      Repeatable read. The read transaction reads the same fixed version every time, so repeated reads return the same data, but the latest data may not be read.
    • SERIALIZABLE
      Reads take locks as well, so reads and writes block each other; rarely used.
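The difference between the first three levels can be sketched with a toy version chain. This is a deliberately simplified model, not InnoDB's read-view algorithm: the Database and Reader classes are hypothetical, a single value stands in for a row, and visibility is decided only by which transactions had committed when the snapshot was taken. SERIALIZABLE is not modeled because it relies on locking rather than on version selection.

```python
class Database:
    """One value with a version chain, newest first: (trx_id, value)."""

    def __init__(self, initial):
        self.versions = [(0, initial)]
        self.committed = {0}   # IDs of committed transactions

    def write(self, trx_id, value):
        self.versions.insert(0, (trx_id, value))

    def commit(self, trx_id):
        self.committed.add(trx_id)


class Reader:
    def __init__(self, db, level):
        self.db = db
        self.level = level
        self.snapshot = None   # used only by REPEATABLE_READ

    def read(self):
        if self.level == "READ_UNCOMMITTED":
            return self.db.versions[0][1]       # newest version, committed or not
        if self.level == "READ_COMMITTED":
            visible = set(self.db.committed)    # fresh snapshot on every read
        elif self.level == "REPEATABLE_READ":
            if self.snapshot is None:           # snapshot fixed at the first read
                self.snapshot = set(self.db.committed)
            visible = self.snapshot
        else:
            raise ValueError("level not modeled: " + self.level)
        # Walk from newest to oldest until a visible version is found.
        for trx_id, value in self.db.versions:
            if trx_id in visible:
                return value


db = Database("a")
ru = Reader(db, "READ_UNCOMMITTED")
rc = Reader(db, "READ_COMMITTED")
rr = Reader(db, "REPEATABLE_READ")
before = (ru.read(), rc.read(), rr.read())

db.write(1, "b")                                   # transaction 1 updates, not yet committed
during = (ru.read(), rc.read(), rr.read())

db.commit(1)
after = (ru.read(), rc.read(), rr.read())
```

In this model only the READ_UNCOMMITTED reader sees the uncommitted value, the READ_COMMITTED reader switches to it after the commit, and the REPEATABLE_READ reader keeps returning the value from its first read.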
Read transactions are typically triggered by a SELECT statement, which InnoDB guarantees to be non-blocking. The exception is SELECT ... FOR UPDATE, which takes an exclusive lock on the row and therefore waits for any updating transaction to complete before reading the latest content. The overall design goal of InnoDB is to provide efficient, non-blocking queries.

4. MVCC
The mechanism described above, in which an undo log record is created before each update and reads are served without blocking according to the isolation-level policies, is MVCC; the undo log records are the multiple versions in MVCC. This may differ considerably from the usual understanding of MVCC, which is generally thought to have the following features:
    • Each row of data has a version, which changes every time the data is updated
    • To modify data, a transaction copies out the current version and changes the copy freely, with no interference between transactions
    • On save, the version numbers are compared: on success the original record is overwritten (commit), on failure the copy is discarded (rollback)
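The three features above amount to an optimistic version check, which can be sketched as follows. The VersionedStore class and its methods are hypothetical names used only for illustration; nothing here is taken from InnoDB.

```python
class VersionedStore:
    """Each key maps to (value, version); saving succeeds only if the version is unchanged."""

    def __init__(self):
        self._data = {}

    def put_initial(self, key, value):
        self._data[key] = (value, 0)

    def read(self, key):
        """Return a private copy of the value together with the version it was read at."""
        value, version = self._data[key]
        return value, version

    def commit(self, key, new_value, expected_version) -> bool:
        _, current_version = self._data[key]
        if current_version != expected_version:
            return False   # someone else saved first: discard the copy
        self._data[key] = (new_value, current_version + 1)
        return True


store = VersionedStore()
store.put_initial("counter", 10)

# Two transactions read the same version and both try to save.
value_a, version_a = store.read("counter")
value_b, version_b = store.read("counter")
saved_a = store.commit("counter", value_a + 1, version_a)
saved_b = store.commit("counter", value_b + 1, version_b)
```

The first save succeeds and bumps the version; the second is rejected because the version it read is stale, and that transaction would have to re-read and retry.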
That is, each row has a version number, and whether a save succeeds depends on that version number, which has the flavor of optimistic locking. InnoDB, by contrast, is implemented in the following way:
    • A transaction modifies the original data under an exclusive lock
    • The pre-modification data is stored in the undo log and linked to the master data through the rollback pointer
    • If the modification succeeds (commit), nothing more is done; if it fails, the data is restored from the undo log (rollback)
The most essential difference between the two is whether modifying data takes an exclusive lock. If it locks, can it still be considered MVCC?
InnoDB's implementation is not really MVCC, because it lacks the core property of multiple versions coexisting: the undo log only records the serialized results of successive transactions, which is not the same as versions coexisting. But ideal MVCC is hard to implement. When a transaction modifies only one row, the ideal MVCC pattern works without problems, and rollback can be done by comparing version numbers; when a transaction affects multiple rows, however, ideal MVCC is powerless.
For example, suppose Transaction1 runs under ideal MVCC, modifies Row1 successfully, and then fails to modify Row2, so Row1 must be rolled back. Because Row1 was not locked, Transaction2 may have modified it in the meantime; rolling back Row1 now would destroy Transaction2's result, causing Transaction2 to violate ACID.
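The scenario above can be replayed step by step in a few lines. This is only a toy simulation under assumptions: rows are plain dictionary entries, there are no locks at all, and the simulate function and its variable names are hypothetical.

```python
def simulate():
    """Replay the Row1/Row2 scenario without any row locks."""
    rows = {"row1": "original", "row2": "original"}
    t1_undo = {}

    # Transaction1 modifies Row1 successfully, remembering the old value.
    t1_undo["row1"] = rows["row1"]
    rows["row1"] = "written by T1"

    # Row1 is not locked, so Transaction2 can modify it and commit.
    rows["row1"] = "written by T2"
    t2_committed_value = rows["row1"]

    # Transaction1 now fails on Row2 and rolls back everything it touched.
    for key, old_value in t1_undo.items():
        rows[key] = old_value

    return rows, t2_committed_value


final_rows, t2_value = simulate()
lost_update = final_rows["row1"] != t2_value
```

After the rollback, Row1 holds its original value again and Transaction2's committed change has silently disappeared, which is the durability violation the example describes.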
The fundamental reason ideal MVCC is hard to achieve is that it tries to replace two-phase commit with optimistic locking. Modifying two rows while guaranteeing consistency is no different in essence from modifying data in two distributed systems, and two-phase commit is currently the only means of guaranteeing consistency in that scenario. The essence of two-phase commit is locking, whereas the essence of optimistic locking is eliminating locks. The two contradict each other, so ideal MVCC is hard to apply in practice; InnoDB merely borrows the name MVCC and provides non-blocking reads only.

5. Summary
This is not to say that MVCC is useless. It can be valuable in scenarios that do not require strict consistency and that operate on a single data item, such as multiple transactions concurrently changing the count of online users: if a transaction's update fails, it recalculates and retries until it succeeds. Used this way, MVCC can greatly increase concurrency and eliminate locks.

6. References
    • MySQL official website
    • Http://blog.chinaunix.net/link.php?url=http://forge.mysql.com%2Fwiki%2FMySQL_Internals
    • Understanding MySQL Internals
