B+Tree and MVCC in InnoDB

Source: Internet
Author: User

I previously gave a talk on InnoDB, mainly covering the structure of the B+Tree in InnoDB and the implementation of MVCC.

Slides: BpTree_MVCC

Below I summarize the content of the slides.

First is the B+Tree, as it is structured in InnoDB.

It has the following features:

  1. The number of disk seeks per lookup is fixed and small (because the tree is relatively shallow), which matters because seeks on a hard disk are very time-consuming.
  2. Data is stored contiguously. Non-leaf nodes store only keys and pointers, while the data itself is stored in the leaf nodes, so the index is easy to cache.
  3. The leaf nodes are linked together in a doubly linked list, so range queries are fast.
  4. Because data is stored in the leaf nodes, queries are fast (no additional seek is needed), but insertion is slow (splitting or merging pages moves more data). By comparison, MyISAM leaf nodes store only pointers, so insertion is fast and queries are slow (more seeks).
  5. Although the contents of each leaf node occupy contiguous disk space, the leaf nodes themselves are not necessarily stored contiguously. After a long period of operation the tree becomes fragmented, which hurts range query efficiency. MySQL does provide ways to optimize this, such as OPTIMIZE TABLE.
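
The "fixed and small number of seeks" in point 1 follows from how slowly the tree height grows. The sketch below is only a back-of-the-envelope estimate: the 16 KB page size is InnoDB's default, but the row size, key size, and per-entry pointer size are illustrative assumptions, and page headers and fill factor are ignored. The function name `estimate_height` is hypothetical.

```python
import math

PAGE_SIZE = 16 * 1024  # InnoDB's default page size, in bytes


def estimate_height(row_count, row_bytes, key_bytes, pointer_bytes=6):
    """Roughly estimate the number of B+Tree levels, leaf level included.

    row_bytes, key_bytes and pointer_bytes are illustrative assumptions;
    page headers and partially filled pages are ignored.
    """
    rows_per_leaf = max(1, PAGE_SIZE // row_bytes)            # rows stored in one leaf page
    fanout = max(2, PAGE_SIZE // (key_bytes + pointer_bytes))  # children per non-leaf page

    pages = math.ceil(row_count / rows_per_leaf)  # number of leaf pages
    height = 1
    while pages > 1:
        # Each upper level needs one entry per page of the level below it.
        pages = math.ceil(pages / fanout)
        height += 1
    return height


if __name__ == "__main__":
    # Ten million rows of about 200 bytes with an 8-byte key.
    print(estimate_height(10_000_000, row_bytes=200, key_bytes=8))
```

Under these assumptions ten million rows fit in three levels, so a lookup touches at most three pages, and the upper levels are small enough to stay cached.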

One article is strongly recommended here: it describes the implementation structure of the B+Tree in InnoDB in detail.

Then there is MVCC.

MVCC is multi-version concurrency control, which replaces plain read/write locks when implementing transactions. A plain read/write locking scheme locks all data that is read: while data is being read it cannot be written, and while it is being written it cannot be read.

Since the goal is to resolve read/write conflicts, the key question is when writes are allowed and when reads are allowed. This is where the concept of "isolation level" comes from: it defines the situations in which reads and writes are permitted.

InnoDB supports four isolation levels: READ UNCOMMITTED, READ COMMITTED, REPEATABLE READ, and SERIALIZABLE. The most commonly used are READ COMMITTED and REPEATABLE READ.

  1. READ COMMITTED: if the same SELECT statement is executed twice in one transaction, and another transaction modifies the queried data and commits in between, the two reads return different data. In other words, a transaction "reads" whatever has been "committed".
  2. REPEATABLE READ: within a transaction, the visibility of all data is fixed as of the start of the transaction. Even if other transactions commit afterwards, their changes are invisible to the current transaction. In other words, the same content can be "read" "repeatedly".

Note that, regardless of the isolation level, once a transaction runs UPDATE, DELETE, or SELECT ... FOR UPDATE on a record, that is, once it holds an X lock on the record, other transactions cannot update that record (acquire their own X lock on it) until the locking transaction commits.

How does InnoDB implement multi-version transactions? In my talk I also drew on the slides by He Dengcheng of NetEase.

Address: InnoDB Transaction Lock and MVCC (Micro disk address, SlideShare address)

These slides describe the implementation of MVCC in detail, including the lock-related parts. Below I briefly summarize the key points.

InnoDB implements the isolation levels above through a ReadView. A ReadView records the state at the moment it is created:

  1. The smallest transaction ID among the active transactions (transaction IDs are globally unique and auto-incrementing)
  2. The ID of the current transaction
  3. A linked list of the IDs of all active transactions

At the same time, whenever a transaction modifies a record, the new value is stamped with the ID of the current transaction, and the old data together with its old transaction ID is placed in the rollback segment.
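
The two pieces described above, the ReadView and the stamped versions kept in the rollback segment, can be pictured with a small in-memory model. This is a minimal sketch, not InnoDB's actual layout: the class and field names (`ReadView`, `Version`, `Row`, `older`) are hypothetical, and the rollback segment is reduced to a linked chain of older versions.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ReadView:
    """The three items a ReadView records, as listed above."""
    min_active_trx_id: int
    creator_trx_id: int
    active_trx_ids: List[int] = field(default_factory=list)


@dataclass
class Version:
    """One version of a record, stamped with the ID of the writing transaction."""
    value: str
    trx_id: int
    older: Optional["Version"] = None  # stands in for the rollback segment link


class Row:
    def __init__(self, value: str, trx_id: int):
        self.latest = Version(value, trx_id)

    def update(self, new_value: str, trx_id: int) -> None:
        # The new value carries the current transaction ID; the old value and
        # its old transaction ID stay reachable through the `older` link.
        self.latest = Version(new_value, trx_id, older=self.latest)

    def history(self):
        """Return (trx_id, value) pairs from newest to oldest."""
        out = []
        version = self.latest
        while version is not None:
            out.append((version.trx_id, version.value))
            version = version.older
        return out


if __name__ == "__main__":
    row = Row("a", trx_id=10)
    row.update("b", trx_id=12)
    print(row.history())
    print(ReadView(min_active_trx_id=11, creator_trx_id=13, active_trx_ids=[11, 12]))
```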

With these two mechanisms in place, the role of the ReadView becomes clear. A SELECT statement reads:

  1. The version of the data written by the transaction with the largest ID among the transactions that were no longer active when the ReadView was created. That ID may exceed the minimum active transaction ID, as long as the transaction is not in the active list.
  2. Put another way: use the ReadView to find the largest inactive transaction ID, then look for the data carrying that transaction ID in the table or in its rollback segment.
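
The read rule above can be sketched as a walk over a record's versions from newest to oldest. Two details here are assumptions that the text does not state: a version written by the reading transaction itself is treated as visible, and the view also carries an upper bound (`next_trx_id`, the next ID to be assigned when the view was created) so that transactions started later are not mistaken for inactive ones. All names are hypothetical.

```python
from typing import Iterable, Optional, Set, Tuple


def is_visible(trx_id: int, creator_trx_id: int,
               active_trx_ids: Set[int], next_trx_id: int) -> bool:
    """Decide whether a version stamped with trx_id is visible to the view."""
    if trx_id == creator_trx_id:
        return True   # assumption: a transaction sees its own changes
    if trx_id >= next_trx_id:
        return False  # assumption: the writer started after the view was created
    return trx_id not in active_trx_ids  # inactive means already committed


def read(versions: Iterable[Tuple[int, str]], creator_trx_id: int,
         active_trx_ids: Set[int], next_trx_id: int) -> Optional[str]:
    """versions is ordered newest first: the table row, then the rollback chain."""
    for trx_id, value in versions:
        if is_visible(trx_id, creator_trx_id, active_trx_ids, next_trx_id):
            return value
    return None  # no version is visible to this view


if __name__ == "__main__":
    versions = [(15, "d"), (12, "c"), (11, "b"), (8, "a")]
    # Transactions 11 and 12 were still active when transaction 13 took its view.
    print(read(versions, creator_trx_id=13, active_trx_ids={11, 12}, next_trx_id=14))
    # Only 11 was active: 12 is above the minimum active ID but already committed.
    print(read(versions, creator_trx_id=13, active_trx_ids={11}, next_trx_id=14))
```

Because the chain is ordered newest first, the first visible version found is the one with the largest inactive transaction ID.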

At the same time, old versions that are older than the minimum active transaction ID and have been superseded by another version below that ID can be reclaimed, because they will never be read again.
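
The reclamation rule can be sketched as follows. This is a simplified model, not InnoDB's purge implementation: the function name `purge` is hypothetical, and it assumes `min_active_trx_id` is the smallest active transaction ID across every ReadView still open. The newest version below that ID must be kept, because readers may still need it; anything older is unreachable.

```python
from typing import List, Tuple


def purge(versions: List[Tuple[int, str]], min_active_trx_id: int) -> List[Tuple[int, str]]:
    """Drop versions that no ReadView can read any more.

    versions is ordered newest first. min_active_trx_id is assumed to be the
    smallest active transaction ID over all open ReadViews.
    """
    kept = []
    kept_one_below = False
    for trx_id, value in versions:
        if trx_id >= min_active_trx_id:
            kept.append((trx_id, value))  # may be invisible to some view, keep it
        elif not kept_one_below:
            kept.append((trx_id, value))  # newest version every view can fall back to
            kept_one_below = True
        # Anything older than that fallback version is never read again.
    return kept


if __name__ == "__main__":
    chain = [(15, "d"), (12, "c"), (8, "b"), (5, "a")]
    print(purge(chain, min_active_trx_id=11))
```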

Therefore, the difference between READ COMMITTED and REPEATABLE READ lies in when the ReadView is created. The former creates a ReadView at the beginning of each statement and drops it when the statement ends. The latter creates it at the beginning of the transaction and drops it after the transaction commits. That is how each level gets its behavior.
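
The effect of the ReadView's creation time can be shown with a toy single-record engine. Everything here is a hypothetical sketch (`Engine`, `Transaction`, `demo`), not InnoDB code. For simplicity the REPEATABLE READ view is created at the transaction's first SELECT, which in this demo is equivalent to creating it at transaction start, and the view carries an upper bound on transaction IDs as an extra assumption.

```python
class Engine:
    """A toy store holding the version chain of a single record."""

    def __init__(self):
        self.next_trx_id = 1
        self.active = set()
        self.versions = []  # (trx_id, value), newest first

    def begin(self):
        trx_id = self.next_trx_id
        self.next_trx_id += 1
        self.active.add(trx_id)
        return trx_id

    def write(self, trx_id, value):
        self.versions.insert(0, (trx_id, value))

    def commit(self, trx_id):
        self.active.discard(trx_id)

    def snapshot(self, creator):
        # The ReadView: creator ID, active IDs, and (assumption) an upper bound.
        return (creator, frozenset(self.active), self.next_trx_id)

    def read(self, view):
        creator, active, upper = view
        for trx_id, value in self.versions:
            if trx_id == creator or (trx_id < upper and trx_id not in active):
                return value
        return None


class Transaction:
    def __init__(self, engine, isolation):
        self.engine = engine
        self.isolation = isolation
        self.trx_id = engine.begin()
        self.view = None

    def select(self):
        # READ COMMITTED: a fresh view for every statement.
        # REPEATABLE READ: one view, reused for the whole transaction.
        if self.isolation == "READ COMMITTED" or self.view is None:
            self.view = self.engine.snapshot(self.trx_id)
        return self.engine.read(self.view)


def demo(isolation):
    engine = Engine()
    setup = engine.begin()
    engine.write(setup, "v1")
    engine.commit(setup)

    reader = Transaction(engine, isolation)
    first = reader.select()

    writer = engine.begin()  # another transaction modifies the record and commits
    engine.write(writer, "v2")
    engine.commit(writer)

    second = reader.select()
    return first, second


if __name__ == "__main__":
    print(demo("READ COMMITTED"))
    print(demo("REPEATABLE READ"))
```

With READ COMMITTED the second SELECT builds a new view and sees the committed "v2"; with REPEATABLE READ the original view is reused, so both reads return "v1".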

Note that even at the READ COMMITTED level, if another transaction commits while a statement is executing, that commit is still invisible to the running SELECT statement (an extreme case).

For the storage structure of ReadView and a more in-depth treatment, see the slides above; I will not repeat them here.

In fact, I also shared some material on rollback segments and rollback methods, MySQL's XA two-phase commit, and some operations on the B+Tree, but it feels a little thin when written down. Besides, the blogs and slides by Jeremy Cole and He Dengcheng are detailed and elegant. I recommend having a look at them.

Original article: B+Tree and MVCC in InnoDB. Thanks to the original author for sharing.
