About the InnoDB index page structure, insert buffer, and adaptive hash index
Physical Structure of an InnoDB Index
All InnoDB indexes are B-tree indexes, and the index records are stored in the leaf pages. The default index page size is 16 KB. When new records are inserted, InnoDB tries to leave 1/16 of the page free for future insert and update operations.
If records are inserted in sequential index-key order, index pages end up roughly 15/16 full. If the insertion order is completely random, pages typically end up between 1/2 and 15/16 full. If the fill factor of an index drops below 1/2, InnoDB tries to rebuild the B-tree.
Since MySQL 5.6, the innodb_page_size parameter sets the page size for all InnoDB data in an instance. Once the instance has been initialized, the page size cannot be changed. The usual recommended values are 16K, 8K, or 4K. Also note that if a MySQL instance is configured with an innodb_page_size different from the default, its data files cannot be used on instances with a different page size (for example, for physical backup and recovery).
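You can verify the page size of a running instance with a quick query; a minimal sketch, assuming MySQL 5.6 or later:

-- check the page size of the current instance (reported in bytes)
SHOW VARIABLES LIKE 'innodb_page_size';
-- a non-default size must be set in my.cnf (for example innodb_page_size=8192)
-- before the data directory is initialized; it cannot be changed afterwards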
Insert Buffering
Database applications usually insert rows in primary key order. In that case, because the clustered index stores rows in the same order as the primary key values, inserts avoid a great deal of random I/O.
Secondary indexes, on the other hand, are generally not unique, and rows arrive in a secondary index in relatively random order. Likewise, delete and update operations touch secondary index pages that are scattered far apart in the index. All of this causes a large amount of random I/O, as the hypothetical example below illustrates.
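A minimal sketch with a made-up table (the table and column names are illustrative only):

-- rows arrive in ascending id order, so the clustered index (the primary key)
-- is filled sequentially
CREATE TABLE orders (
  id BIGINT AUTO_INCREMENT PRIMARY KEY,
  customer_email VARCHAR(100),
  created_at DATETIME,
  KEY idx_customer_email (customer_email)  -- non-unique secondary index
) ENGINE=InnoDB;

-- successive inserts land on scattered leaf pages of idx_customer_email,
-- because the email values do not arrive in sorted order
INSERT INTO orders (customer_email, created_at) VALUES ('zoe@example.com', NOW());
INSERT INTO orders (customer_email, created_at) VALUES ('adam@example.com', NOW());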
When a record is inserted into or deleted from a non-unique secondary index, InnoDB first checks whether the affected secondary index page is already in the buffer pool. If it is, InnoDB modifies the index page directly in memory. If it is not, InnoDB records the change in the insert buffer instead. The insert buffer is usually small enough to fit entirely in the buffer pool, and it is updated very frequently. This mechanism is called change buffering (originally it applied only to insert operations, so it is also called insert buffering, and the underlying data structure is still called the insert buffer).
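Which operations are buffered is controlled by the innodb_change_buffering variable; a minimal sketch of checking and adjusting it (the default in recent versions is 'all'):

-- see which operations are currently buffered
SHOW VARIABLES LIKE 'innodb_change_buffering';
-- restrict buffering to inserts only, or switch it off entirely
SET GLOBAL innodb_change_buffering = 'inserts';
SET GLOBAL innodb_change_buffering = 'none';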
Disk I/O for Flushing the Insert Buffer
So how does the insert buffer reduce random I/O? Periodically, InnoDB merges the buffered changes into the non-unique secondary indexes. Because many modifications to the same index page of the same B-tree can be merged into a single operation, this saves a great deal of I/O. Tests have shown that the insert buffer can speed up inserts by up to 15 times.
Insert buffer merging may continue after a transaction has committed. Therefore, if the database is restarted after an abnormal shutdown and the recovery phase has many secondary index changes to apply, merging the insert buffer can take a long time, even several hours. During this phase disk I/O increases, which can cause a noticeable performance drop for disk-bound queries.
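If you want to see how large the change buffer is and how actively it is being merged, you can inspect it from SQL; a sketch assuming MySQL 5.6 or later (some counters may first need to be enabled with innodb_monitor_enable):

-- the "INSERT BUFFER AND ADAPTIVE HASH INDEX" section of the output shows
-- the current ibuf size and merge statistics
SHOW ENGINE INNODB STATUS\G
-- change-buffer related counters (names beginning with 'ibuf')
SELECT NAME, COUNT
  FROM information_schema.INNODB_METRICS
 WHERE NAME LIKE 'ibuf%';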
Adaptive Hash Indexes
With a suitable workload and enough memory for the buffer pool, the adaptive hash index (AHI) lets InnoDB behave more like an in-memory database, without sacrificing any transactional features or reliability. The feature is controlled by the innodb_adaptive_hash_index parameter, a dynamic parameter that is on by default, meaning the adaptive hash index is enabled. When AHI is disabled, the built-in hash table is emptied immediately and normal operations simply continue by accessing the B-tree indexes directly. When AHI is enabled again, the hash table is rebuilt.
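Because the parameter is dynamic, AHI can be toggled without a restart; a minimal sketch:

-- check whether the adaptive hash index is currently enabled
SHOW VARIABLES LIKE 'innodb_adaptive_hash_index';
-- disable it (the internal hash table is dropped immediately)
SET GLOBAL innodb_adaptive_hash_index = OFF;
-- re-enable it (the hash table is rebuilt as pages are accessed again)
SET GLOBAL innodb_adaptive_hash_index = ON;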
By observing the pattern of searches, InnoDB builds hash index entries using a prefix of the index key. The prefix can be of any length, and the hash index may cover only some of the values in the B-tree rather than the entire B-tree. Hash index entries are built for the index pages that are accessed frequently.
If a table fits almost entirely in the buffer pool, a hash index can speed up equality lookups by turning an indexed value into something close to a direct pointer to the record. InnoDB monitors index searches, and if it notices that queries could be sped up by building a hash index, it builds one automatically; that is why it is called "adaptive".
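Only equality lookups can be served from the hash index; the queries below, against the hypothetical orders table defined earlier, show the contrast (a sketch, not a benchmark):

-- repeated equality lookups like this one can be answered from the adaptive hash index
SELECT * FROM orders WHERE customer_email = 'adam@example.com';
-- a range predicate on the same column still has to traverse the B-tree
SELECT * FROM orders WHERE customer_email BETWEEN 'a' AND 'c';
-- a leading-wildcard LIKE cannot use the hash index (or the B-tree efficiently)
SELECT * FROM orders WHERE customer_email LIKE '%example.com';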
Under some workloads, the speedup from hash index lookups greatly outweighs the extra cost of monitoring index searches and maintaining the hash table. However, under heavy load the read/write latch protecting the adaptive hash index can itself become a point of contention, for example with highly concurrent joins. LIKE operators with % wildcards also gain nothing from AHI. If your workload does not benefit, it is better to disable AHI and avoid the unnecessary overhead. Because it is hard to predict in advance whether AHI suits a particular situation, the recommendation is to stress-test your actual workload both with and without AHI. In MySQL 5.6 and later, the expectation is that more and more workloads are better off disabling the adaptive hash index, even though it is still enabled by default.
An adaptive hash index is always built on top of an existing B-tree index. Based on the searches it observes against the B-tree, InnoDB builds the hash index from a B-tree index key prefix of any length, and the hash index can be partial, covering only the most frequently accessed pages of the B-tree index.
You can tell whether the adaptive hash index is hurting you by looking at the SEMAPHORES section of the SHOW ENGINE INNODB STATUS output. If you see many threads waiting on an RW-latch created in the btr0sea.c file (the adaptive hash index source file), it is usually worth disabling the adaptive hash index. I have run into exactly this case: under a typical high-concurrency workload, AHI caused the contention and we had to disable it.
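A sketch of that diagnostic workflow:

-- examine the SEMAPHORES section; a long list of threads waiting on an
-- RW-latch created in btr0sea.c points to adaptive hash index contention
SHOW ENGINE INNODB STATUS\G
-- if that is the bottleneck, turn the feature off at runtime
SET GLOBAL innodb_adaptive_hash_index = OFF;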
That covers the InnoDB index page structure, the insert buffer, and the adaptive hash index, which is all I wanted to share with you here. I hope it serves as a useful reference, and thank you for your support.