MySQL Technology insider InnoDB Storage Engine (second edition) read the notes

Last Update:2017-06-18 Source: Internet

Author: User

Tags rollback

Developer on Alibaba Coud: Build your first app with APIs, SDKs, and tutorials on the Alibaba Cloud. Read more ＞

First, MySQL architecture

MySQL is a single-process multithreaded architecture database.

Second, the storage engine

InnoDB:

Support Transactions
Row lock
Read Operation no lock
4 isolation levels, default to repeatable
Adaptive Hash Index
The storage of each table is stored in the order of the primary key.
Support for full-text indexing (innodb1.2.x-mysql5.6)
Supports MVCC (multiple version concurrency control) for high concurrency

MyISAM:

Transaction not supported
Table lock
Support Full-Text indexing

Third, INNODB system structure

1. Background thread

Master Thread
- Responsible for asynchronously flushing data from the buffer pool to disk to ensure consistency of data
IO Thread
- callback Handling for IO requests
Purge Thread
- Reclaims an undo page that has been used and allocated (after a transaction is committed, the Undolog it uses is no longer needed)

2. Memory Pool

Buffer pool (one area of memory)
- InnoDB based on disk storage, managing records as a page (due to disk-based, slow, so you need to introduce a buffer pool to improve performance)
- Read page: First fetched from buffer pool, no buffer pool to get from disk
- Modify page: Write redo log buffer, modify the page in the buffer pool, and then flush to disk at a certain frequency (checkpoint mechanism), the page is called dirty page before it is flushed to disk
- Innodb_buffer_pool_size Setting size
- Store objects: index pages, data pages, adaptive hash indexes, and lock information
- The buffer pool can be configured with multiple (innodb_buffer_pool_instances), each of which is evenly distributed to different buffer pool instances based on the hash value, and is used to reduce intra-database resource contention
LRU List
- Put the most recent page in the front of the queue, the least recently used at the tail end, when the buffer pool is not enough, the tail end of the page is deleted out of the buffer pool (if this page is dirty page, it will be flushed to disk first). InnoDB uses midpoint technology for LRU, see MySQL Technology insider InnoDB storage Engine
Flush List
- Dirty Page List
Redo Log Redolog buffering
- To prevent dirty pages from going down when they are flushed to disk, you must first Redolog and then modify the page;
- Data recovery through Redolog (ACID-D persistence ) when a database outage occurs
- Default size 8M, via innodb_log_buffer_size
- The time to flush the redolog buffer to the Redolog file
  - Master flushes the Redolog buffer to the redolog file every 1s
  - Every thing submitted
  - Redolog buffer pool remaining space less than 1/2
Checkpoint
- Flush a dirty page to disk when the buffer pool is not sufficient
- When the database is down, you only need to redo the log after checkpoint to shorten the recovery time of the database.
- Flushing dirty pages to disk when Redolog is not available

Iv. InnoDB Logical Storage structure

1. Table Space

By default, only one tablespace is ibdata1, and all data is stored in this space
If Innodb_file_per_table is enabled, the data within each table can be placed in a single table space
- Each tablespace holds only data, indexes, and Insertbuffer bitmap pages, and other data is still in ibdata1

2, Segment section (InnoDB engine control)

Data segment: leaf node of B + Tree
Index segment: Non-leaf node of B + Tree
Rollback segment

3, Extent District

The size of each area is 1M, the page size is 16KB, that is, there are 64 consecutive pages in a zone (the size of the zone is not adjustable, the page can be)

4. Page

InnoDB Minimum unit of Disk Management
The default per page size is 16KB and can be set by Innodb_page_size (4/8/16k)
Up to 7992 rows of data per page

5, Row line

Five, index

1. Hash index

Locating data requires only one search, O (1)
Adaptive Hash Index: INNODB will monitor queries on each index page on the table, and if it is observed that a hash index can lead to a speed increase, a hash index (that is, InnoDB automatically establishes a hash index for some hotspot pages based on the frequency and mode of access)
The default is open
Available only for equivalent queries, not for range queries

2. B + Tree Index

The height of the tree is generally four layers, need to query (100w and 1000w rows of data, if the B + tree are 3 layers, then the query efficiency is the same)
B + Tree Index can find the page where the data row is located
Include clustered index and secondary index

3. Clustered index

That is, the primary key index
The leaf node holds the page where the row record data resides, and each row in the page is a complete row (the leaf node is also called the data page)
Also faster for range queries

Clustered index Graph:

Where the root node portion of the key:80000001 represents the primary key to 1;pointer:0004 represents the page number (that is, page 4th) pointing to the data page;

The pageoffset:0004 of the data page node represents page 4th, where the data stored is the complete row.

4. Secondary index

The leaf node is also the page where the row record data resides, but it is not a full row in the page, but is just a pair of key-value and a pointer to the primary key of the clustered index of the corresponding row data
Assuming that the secondary index tree is 3 layers high and the clustered index tree is 3 layers, then finding the data based on the secondary index requires 3 IO to find the primary key, and then 3 IO to find the data page of the row.
Insert and update operations for secondary indexes: If the secondary index page is inserted in the buffer pool, if it is not, then the point is placed in the Insertbuffer object, followed by a merge of the Insertbuffer and the secondary index page child nodes at a certain leveling rate

Secondary index graphs:

Among them, Idx_c said column C was indexed; The KEY:7FFFFFFF in Idx_c represents a value of column C, in fact, the pointer:80000001 in -1;idx_c represents the primary key of the row is 80000001, or 1; The following is the clustered index section.

5. Federated index (Multi-column index)

The left matching principle (if the index is (a, b), the where a=x can be used for the index, but b=x is not used, if the overwrite index is likely to be used)

6. Overlay Index

Fetching records directly from the secondary index
For statistical operations, such as COUNT (1), it is possible to federate the index, and the right side will match (the optimizer will do it itself), because the count (1) operation does not need to fetch the entire row of detailed data, so do not need to go to the clustered index of the leaf node to get the data, directly in the secondary index tree to complete
Select username from xxx where username= ' lisi ', if username is a secondary index, then the entire query can be completed on the secondary index tree, because the secondary index tree does not save the full row, but saves the < Username,lisi> this key-value right; If select username, age from xxx where username= ' Lisi ', then go to the clustered index

Six, lock

1, latch

Ensuring the correctness of critical resources for concurrent threading operations
Spin lock, after the specified number of spins, if the lock has not been acquired, then enter the waiting state, waiting to be awakened

2. Lock

Transaction locks, which can be locked by a table, page, or row
Release point: Transaction commit or Rollback
Two standard row-level locks
- Shared Lock: S lock, transaction T1 acquires S lock of R row, transaction T2 can also get S lock of R row
- Exclusive lock: x Lock, Transaction T1 acquires S lock of R row, transaction T2 cannot get X lock of R row, transaction T1 acquires X lock of R row, transaction T2 cannot get x/s lock of R row

Vii. Business

1. Isolation level

Read NOT submit
Read and submit
- Avoid dirty reads: One transaction reads data that is not committed by another transaction, and if another transaction is rolled back, the data that the first transaction reads is junk data
REPEATABLE READ
- There will be a phantom read, InnoDB through Next-key lock solved the
  - Phantom read: Two times the same SELECT statement will have different results, the second read will add a row of data, and did not say that the two executions are in the same transaction. You can avoid using table locks.
- Avoid non-repeatable reads: A comparison of the execution results of two identical SELECT statements in the same transaction. If the result is the same as before and after, it is repeatable, and if the results can be different before and after, it is not repeatable to read. Typically, an update has occurred. You can avoid this by increasing the shared lock on read (no modification).
- Default transaction ISOLATION LEVEL
Serialization of

MySQL Technology insider InnoDB Storage Engine (second edition) read the notes

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

Get Started for Free

Sales Support

1 on 1 presale consultation

Chat Contact Sales
After-Sales Support

24/7 Technical Support 6 Free Tickets per Quarter Faster Response

Open a Ticket
Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.

Learn More