Indexes in the MySQL InnoDB storage engine: study notes

Source: Internet
Author: User

InnoDB has two types of indexes: B+ tree indexes and hash indexes. The hash index is adaptive: the storage engine creates it automatically based on how the table is used, and the user cannot intervene in how it is built.

1. Clustered Index

Clustered index: an InnoDB table is an index-organized table, that is, the data in the table is stored in primary key order. The clustered index is a B+ tree built on the table's primary key, and its leaf nodes store the full row records of the table, so the leaf nodes are also the data pages. As in any B+ tree, the data pages are connected by a doubly linked list, which speeds up scanning in key order. Because the actual data pages can be stored in only one B+ tree, a table can have only one clustered index. In most cases, the query optimizer prefers the clustered index.

Lab

Create a table t and fill it so that each page can hold only two rows.

create table t (a int not null primary key, b varchar(8000));

Insert three rows:

insert into test.t select 2, repeat('a', 7000);
insert into test.t select 3, repeat('a', 7000);
insert into test.t select 4, repeat('a', 7000);

Then analyze the table with the py_innodb_info.py tool, following the analysis in the book.

The inserts above produce the B+ tree of the clustered index.

The clustered index does not store data in physically contiguous order; the pages are only logically contiguous.

The benefits of a clustered index: sorting and searching on the primary key are very fast. For example, to find the 10 most recently registered users in a user table, InnoDB can go to the last data page and take 10 records. Range searches are also fast: to query data in a specific primary key range, InnoDB finds the page range through the intermediate nodes above the leaf level and then reads the data pages.
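A minimal sketch of the two access patterns just described, assuming a hypothetical users table (not part of the original lab) whose auto-increment primary key grows in registration order:

```sql
-- Hypothetical table for illustration only; the names are assumptions.
create table users (
  id   int unsigned not null auto_increment primary key,
  name varchar(50)  not null
);

-- Ten most recently registered users: the clustered index can be read
-- backward from its last leaf page, so no separate sort is needed.
select * from users order by id desc limit 10;

-- Primary key range: the upper levels of the B+ tree locate the first
-- matching leaf page, then the linked leaf pages are read in order.
select * from users where id between 1000 and 2000;
```

Prefixing either select with explain shows whether the server actually resolves it through the primary key.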

2. Non-clustered Index

In a non-clustered index, the leaf level does not contain all the data of a row. In addition to the key value, each index entry at the leaf level contains a bookmark that tells the storage engine where to find the row data corresponding to that entry. Because an InnoDB table is an index-organized table, the bookmark is the clustered index key of the corresponding row.

Unlike the clustered index, a table can have multiple non-clustered indexes. When data is queried through a non-clustered index, InnoDB traverses the non-clustered index, obtains the primary key (clustered index key) from the leaf node, and then finds the corresponding row through the primary key index. This requires more I/O than going through the clustered index directly.

Continuing the experiment above, add an int column c to table t.

Then run: update t set c = 0 - a;

Add an index: alter table t add key idx_c (c);

The table now has both a clustered index tree and a non-clustered index tree.
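The steps above can be put together as the following sketch. The add column statement is an assumption, since the text does not show how column c was added. The explain is there only so you can check which index the optimizer picks on your own server.

```sql
-- Assumed statement for adding the int column c (not shown in the original).
alter table t add column c int;

-- Fill c with the negated primary key, then index it.
update t set c = 0 - a;
alter table t add key idx_c (c);

-- If the optimizer chooses idx_c, InnoDB finds c = -2 in the secondary
-- index, reads the primary key stored in that entry, and then fetches
-- the full row from the clustered index.
explain select * from t where c = -2;
select * from t where c = -2;
```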

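You can also index only the beginning (a prefix) of a column: alter table t add key idx_b (b(100));

You can also add a joint index: alter table t add key idx_a_b (a, b); Because b is declared as varchar(8000), some MySQL versions reject a full-length key on b as too long; in that case, index a prefix instead, for example (a, b(100)).

You can use drop index idx_a_b on t; to delete an index.

You can use show index from t; to view the indexes on a table.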

Use of B+ tree indexes

A B+ tree index makes sense when a query accesses only a small fraction of the rows in the table. For a gender column that holds only male and female, an index is completely unnecessary. A B+ tree index is best suited to columns with a wide range of values, such as names.
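One rough way to judge whether a column is worth indexing is its selectivity, the number of distinct values divided by the number of rows. The sketch below assumes a hypothetical member table that already contains data; the table and column names are not from the original text.

```sql
-- Hypothetical table for illustration only; the names are assumptions.
create table member (
  id     int unsigned not null auto_increment primary key,
  name   varchar(50)  not null,
  gender char(1)      not null
);

-- Selectivity = distinct values / total rows. Values near 1 (names)
-- suit a B+ tree index; values near 0 (gender) do not.
select count(distinct gender) / count(*) as gender_selectivity,
       count(distinct name)   / count(*) as name_selectivity
from member;

-- For an existing index, the Cardinality column gives an estimate of
-- the number of distinct values.
show index from member;
```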

3. Sequential read, random read, and pre-read

Sequential reading means reading blocks on disk in order. Random reading means accessing blocks on disk out of order, which requires the disk head to move continually. In the database, a sequential read means that the required rows can be read in order from the leaf nodes of an index. A random read means that both the non-clustered index and the clustered index must be accessed to read the required data.

To improve read performance, InnoDB introduced pre-read (read-ahead): multiple pages are read into the buffer pool with a single I/O request. There are two methods, random pre-read and linear pre-read. In actual tests, however, pre-reading was not very effective. Starting with InnoDB Plugin 1.0.4, random pre-read was dropped and only linear pre-read was kept, and the innodb_read_ahead_threshold parameter was added.
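As a small illustration, the parameter mentioned above can be inspected and changed at runtime. The value 32 is an arbitrary example, not a recommendation, and changing a global variable requires the appropriate privilege.

```sql
-- Show the current linear pre-read threshold.
show variables like 'innodb_read_ahead_threshold';

-- Lower the threshold so that pre-read of the next extent is triggered
-- after fewer sequentially accessed pages in the current extent.
set global innodb_read_ahead_threshold = 32;
```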

4. Secondary index (non-clustered index) Optimization

Because the leaf pages of a secondary index contain the primary key but not the complete row, InnoDB always checks first whether the required data can be obtained from the secondary index alone.

Create a table:

create table t2 (a int not null, b varchar(20), primary key (a), key (b));

Insert four rows of data:

insert into t2 select 1, 'Hello';
insert into t2 select 2, 'World';
insert into t2 select 3, 'What';
insert into t2 select 4, 'Areyou';

Then run select * from t2; the query result is:

4 Areyou
1 Hello
3 What
2 World

The rows come back in the order of b, not a. This is because the secondary index on b also contains the value of the primary key a, so scanning the secondary index on b alone yields all the data in the table. A secondary index page generally holds more records than a primary key page, so the optimizer chooses the secondary index.

Running explain on this statement:

explain select * from t2;

shows that index b is used. To get the rows sorted by a, use order by or force the primary key:

explain select * from t2 order by a; or select * from t2 force index (primary);

With order by a, the primary key is used.

explain select * from t2 force index (primary); In my test, however, this statement did not actually use the primary key.

5. Joint Index

A joint index is an index on multiple columns.

alter table t add index idx_a_b (a, b);

Benefits of a joint index: 1) A query such as select * from t where a = 1 and b = 'aaaaaaaaaaaaaaaaaaaaaaa...' can use the (a, b) index, and so can select * from t where a = 1, but a query that filters only on b cannot use it. 2) The second key column is already sorted within each value of the first.
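The two points above can be tried out with the following sketch. It uses a hypothetical joint_demo table rather than table t, so that the leading index column is not also the primary key; all names here are assumptions. No expected plan is shown, because the optimizer's choice depends on the server version and the data.

```sql
-- Hypothetical table for illustration only; the names are assumptions.
create table joint_demo (
  id   int unsigned not null auto_increment primary key,
  x    int not null,
  y    int not null,
  note varchar(20),
  key idx_x_y (x, y)
);

-- Both columns: the query can seek in idx_x_y.
explain select * from joint_demo where x = 1 and y = 2;

-- Leading column only: the query can still seek in idx_x_y.
explain select * from joint_demo where x = 1;

-- Second column only: y is not sorted across the whole index, so in
-- general idx_x_y cannot be used to seek to the matching entries.
explain select * from joint_demo where y = 2;

-- Sorting on y for a fixed x can read idx_x_y in index order.
explain select * from joint_demo where x = 1 order by y;
```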

Lab

Create a table

create table buy_log (userid int unsigned not null, buy_date date);

Insert data

insert into buy_log values (1, '2009-01-01');
insert into buy_log values (2, '2009-01-01');
insert into buy_log values (3, '2009-01-01');
insert into buy_log values (1, '2009-02-01');
insert into buy_log values (2, '2009-02-01');
insert into buy_log values (3, '2009-03-01');
insert into buy_log values (1, '2009-04-01');

  

Add two indexes: alter table buy_log add key (userid); alter table buy_log add key (userid, buy_date); Since no names are given, MySQL names them userid and userid_2.

For a query on userid only, run explain select * from buy_log where userid = 2;

On my computer, userid_2 is used. I suspect this is related to the operating system (probably Windows 7 in my case); the behavior is similar to the secondary index example above.

The same statement was also run on Linux (Ubuntu); the output is not preserved here.

For the query explain select * from buy_log where userid = 1 order by buy_date limit 3;

userid_2 is also used, because in that joint index buy_date is already sorted for each userid, so no extra sort is needed.
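To see the effect of the already-sorted second column, the plan can be compared with one that is forced onto the single-column index. This is a sketch that relies on the auto-generated index name userid. No output is shown, because it varies by server version.

```sql
-- Default choice: note the key and Extra columns of the plan.
explain select * from buy_log
where userid = 1 order by buy_date limit 3;

-- Forcing the single-column index: rows for userid = 1 are found, but
-- buy_date is not ordered in that index, so Extra may report
-- "Using filesort".
explain select * from buy_log force index (userid)
where userid = 1 order by buy_date limit 3;
```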
