MySQL InnoDB Storage Engine

Last Update:2016-01-25 Source: Internet

Author: User

Tags bulk insert flushes mysql in

Developer on Alibaba Coud: Build your first app with APIs, SDKs, and tutorials on the Alibaba Cloud. Read more ＞

MySQL corresponds to InnoDB version

MySQL 5.1 "InnoDB 1.0.X

MySQL 5.5 "InnoDB 1.1.X

MySQL 5.6 "InnoDB 1.2.X

Background thread

1.Master Thread

Responsible for asynchronously flushing data from the buffer pool to disk to ensure consistency of data, including flushing dirty pages, merging insert buffers, and recovering undo pages.

2.IO Thread

The InnoDB storage engine uses AIO (Async IO) to handle write IO requests to improve the concurrency of the database, with four types of IO threads: Insert buffer thread, log thread, read thread, write Thread Where the read thread and the write thread each have four threads, which can be configured by Innodb_read_io_threads and Innodb_write_io_threads.

' innodb_%io_threads ' or show ENGINE INNODB STATUS \g;

3.Purge Thread Threads

Purge thread threads are used to reclaim the undo page of a transaction after it has been committed, which is turned on by default and can be configured with Innodb_purge_threads=1 multiple purge thread threads.

' innodb_purge_threads ';


innodb_purge_threads=2

4.Page Cleaner Thread

Used to reclaim dirty pages generated by the delete and update operations in the multi-versioning feature to perform flushing of dirty pages to disk.

5.Binlog Dump Thread

When replication is configured, a binlog dump is generated on the primary server to read the binary modification records.

Memory

do not understand that there is only buffer pool in memory, but also include redo log buffer, additional memory pool (currently not known as join buffer, order buffer, key buffer, table cache Whether buffer is inside the buffer pool or independent of the buffer pool in memory)

1. Cache Pool

Cached data mainly have data page, index page, redo log page (Undolog), node information, system data, insert buffer, Adaptive Hash Index, data dictionary, lock information, etc., do not mistakenly understand that Undolog is dirty page, undolog refers to redo log file data, Dirty pages refer to data or index pages.

' innodb_buffer_pool_size ';

The default InnoDB has 8 buffer pools, which can be configured by innodb_buffer_pool_instances
Inquire
Show engine InnoDB status \g;
Or
SELECT * from Information_schema.innodb_buffer_pool_status;

Read operation:

Data is a page storage unit, cache a lot of data pages in the buffer pool, when the first read the page from disk to the cache pool, the next time to read the same data page if it is also in the cache pool directly from the buffer pool read, and do not need to go to disk read, The best way to do this is to cache all the disk data in the buffer pool, but that's enough memory.

Modify Operation

The InnoDB storage engine modifies data by modifying the data page in the buffer pool (if it exists) and then refreshing the data file to disk at a certain frequency, which involves the checkpoint mechanism,

Insert operation (Insert buffer)

Because the data is in the order of the clustered index, all inserts for the clustered index are generally very fast, and the non-clustered index inserts are not necessarily sequential, this time requires a discrete access to the nonclustered index page, the performance of the insert is often poor, one case may be the exception is the nonclustered index time field, The time is often sequential, this situation will be faster, for the non-clustered index in this case introduced the insertion buffer.

Insertion buffer (insert buffer) is introduced in the INNODB, insert buffers are not unique nonclustered indexes, insert and update operations for nonclustered indexes are not inserted into the index file each time, but rather first determine if the inserted nonclustered index page exists in the buffer pool. If it exists, it is inserted directly into the nonclustered index file of the buffer pool, otherwise it is put into an insert buffer object, but it is inserted into the index file, but it is not actually, and then inserted into the index file at a certain frequency, In this process, if there are multiple inserts of the same index page, insert will be merged, greatly improving the performance of the non-clustered index insert,

Because each insert is inserted into the buffer pool without looking up the index page to determine the uniqueness of the record, because to make the decision to go to the discrete lookup, the insertion buffer is not unique to the nonclustered index.

In the case of dense write operations, the insertion buffer consumes too much buffer pool memory, the default maximum can be 50%, the source code of ibuf_pool_size_per_max_size=2, if it is modified to 3, the maximum can use only 1/3 of the buffer pool memory.

2.LRU list, free list, Flush list

The default size of the page in the InnoDB buffer pool is 16KB, and the buffer pool is managed by the LRU (Latest recent used least recently used) algorithm, placing the most frequently used pages in the front of the LRU list, while the most recently used pages are placed at the tail end, When there is not enough space in the cache pool, the end page is first removed to free up space. The LRU has a midpoint location, default at 37% of the LRU, the left side represents the old list, the right side represents the new list (hotspot data), and the pages in the newly inserted buffer pool are placed first in the midpoint position. If a newly inserted page moves to the new list, it may cause some of the activities in the new list to be removed to the old list, such as when a table scan operation may require access to a large number of data pages at a time, which may be rarely used in the future, and when the newly inserted page will be placed in the new list. To solve this problem InnoDB introduced the Innodb_old_blocks_time parameter, which controls how long the newly inserted data page will be added to the new list after the mid position.

' innodb_old_blocks_pct '

Query The Innodb_old_blocks_time value, in milliseconds, by default 1000 milliseconds or 1 seconds
Show variables like ' Innodb_old_blocks_time '

View information about all pages in the buffer pool, including free pages, all of the data pages *16kb in fact the total size of the buffer pool. Select  from information_schema. innodb_buffer_page; View information for the Lur list, including the new list and old list but excluding the free list, the fields in the table record information about the current data page, including the buffer pool ID, the type of page ( data page, index page, undo Log,other), table name, index name, whether it is a page of the old list, whether it belongs to a compressed page (you can compress the original 16K page to 1K, 2K, 4K, 8K), compress the size of the page, whether it belongs to the dirty page. Select Pool_id,lru_position,space,page_type,flush_type,newest_modification,oldest_modification,index_name, Data_size,compressed_size,compressed,is_old from INFORMATION_SCHEMA. INNODB_BUFFER_PAGE_LRU;

Oldest_modification>0 indicates the number of dirty pages (modified DB pages)
Is_old= ' YES ' on behalf of the old list page
Compressed<>0 on behalf of compressed pages

Flush List: The value of the dirty page in the LRU, Flust list exists in the new list, that is, oldest_modification>0 (modified DB pages)

3. Redo log buffers (redo log buffer)

' innodb_log_buffer_size% ';

The InnoDB storage engine first places the redo log information into the redo log buffer and then flushes it to the Redo log file at a certain frequency. The default buffer size is 8m,8m basic can meet the requirements, do not need to configure too large redo log buffers.

Refresh mechanism:

1.Master Thread Flushes the redo log buffer to the redo log file every second;

2. The redo log buffers are flushed to the redo log file when each transaction commits;

3. When the redo log buffer pool has less than 1/2 remaining space

4. Additional memory

Normally our server MySQL process uses more memory than the configured InnoDB buffer pool, which is because MySQL in addition to the buffer pool cache memory additional need for a portion of memory to control the buffer pool internal resource information, such as LRU, lock resources, wait and so on.

Checkpoint mechanism

In order to solve the problem of CPU and disk direct speed with buffer pool, so the operation of the data is done in the buffer pool, the data page in the buffer pool is often newer than the data page on disk, we will be modified in the buffer pool but not applied to the disk data page called "Dirty page", the data page will eventually need to update to disk , the middle will involve the checkpoint mechanism.

At the same time, in order to resolve the problem that the buffer pool has not yet been flushed to the disk by a sudden server outage, the redo log file is added (the Redo log file is configured by default to 2, the default name is Ib_logfile, the redo log file default size is 48M, Two redo log files to take the form of circular write, when the transaction commits the first write redo log, when a server outage can be done by redo the log to complete the recovery (after the server restarts itself by default recovery), so make sure redo log files have the remaining space, the default mechanism is when redo log file space reached 75%-90% Flush a portion of the dirty page to the disk while emptying the corresponding redo log space.

How many pages to disk are refreshed each time:

Sharp Checkpoint: When the database is closed, all dirty pages are flushed back to disk, by default, parameters: Innodb_fast_shutdown=1

Fuzzy Checkpoint: Refresh part of the dirty page, specifically divided into the following four cases

1.Master Thread Checkpoint

Master thread flushes dirty pages back to disk every few seconds from the buffer pool

2.flush_lru_list CheckPoint

After the 5.6 release, you need to ensure that LRU has 1024 available pages by default, and if the available pages are less than 1024 pages, refresh some dirty pages back to disk, configured by the parameter "Innodb_lru_scan_dapth".

3.async/sync Flush Checkpoint

refers to the synchronous or asynchronous flushing of dirty pages to disk due to insufficient redo log file space, triggering an asynchronous flush when the redo log space has been used for 75%-90%, and triggering a synchronous flush if more than 90% is triggered, typically not triggering a synchronous flush operation unless the redo log file is too small and load The bulk insert operation for data.

4.Dirty Page Too much

The proportion of dirty pages in the buffer pool is guaranteed, and when the dirty page ratio in the buffer pool reaches 75%, the dirty page action is triggered by the parameter "innodb_max_dirty_pages_pct" configuration.

MySQL InnoDB Storage Engine

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

Get Started for Free

Sales Support

1 on 1 presale consultation

Chat Contact Sales
After-Sales Support

24/7 Technical Support 6 Free Tickets per Quarter Faster Response

Open a Ticket
Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.

Learn More