MySQL Indexes, Part One (B-tree)




One: Types of indexes

Two: Advantages of indexes

Three: High-performance indexing strategies

Four: Index case studies



1.1 Introduction to index types

There are many types of indexes, each designed to perform well in different scenarios. In MySQL, indexes are implemented at the storage engine layer rather than at the server layer, so there is no unified indexing standard. Indexes work differently in different storage engines, and not every storage engine supports every type of index. Even when several storage engines support the same index type, the underlying implementations may differ.

1.2 B-tree Index

Storage engines use B-tree indexes in different ways, each with its own performance characteristics and trade-offs.

For example, MyISAM uses prefix compression to make its indexes smaller, whereas InnoDB stores values in their original, uncompressed format.

MyISAM indexes refer to the indexed rows by their physical storage location, whereas InnoDB indexes refer to them by primary key value.



1.2.1 B-tree index data structure

[Figure: structure of a B-tree index]




1.2.2 Understanding the data structure

Suppose you have the following table:

CREATE TABLE people (
    last_name  VARCHAR(50) NOT NULL,   -- length 50 is an assumption; the source omitted it
    first_name VARCHAR(50) NOT NULL,   -- length 50 is an assumption; the source omitted it
    dob        DATE NOT NULL,
    gender     ENUM('m', 'f') NOT NULL,
    KEY (last_name, first_name, dob)
);

For each row in the table, the index contains the values of the last_name, first_name, and dob columns.

[Figure: sample entries in the index on (last_name, first_name, dob)]

Note: The index sorts values according to the order of the columns given in the index definition in the CREATE TABLE statement. In the last two entries, the two people have the same last name and first name, so they are ordered by date of birth.
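The ordering rule described above can be illustrated with a small Python sketch. It is only a model: the sample rows are hypothetical, and a sorted list of tuples stands in for the index, because Python compares tuples column by column from left to right in the same way a multi-column index orders its entries.

```python
from datetime import date

# Hypothetical sample rows: (last_name, first_name, dob, gender).
rows = [
    ("Barrymore", "Julia", date(2000, 5, 16), "f"),
    ("Allen", "Cuba", date(1960, 1, 1), "m"),
    ("Astaire", "Angelina", date(1980, 3, 4), "f"),
    ("Basinger", "Viven", date(1979, 1, 24), "f"),
    ("Basinger", "Viven", date(1976, 12, 8), "f"),
]

# The index holds only the indexed columns, kept in sorted order.
# Tuples compare on the first element, then the second, then the third,
# which mirrors KEY (last_name, first_name, dob).
index_entries = sorted((last, first, dob) for last, first, dob, _ in rows)

for entry in index_entries:
    print(entry)
```

The last two entries printed share the same last name and first name, so only the date of birth decides their relative order.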




1.2.3 Query types for which B-tree indexes are useful

Match the full value: a condition is given for every column in the index. For example, the index created earlier can be used to find a person named Cuba Allen who was born on 1960-01-01.

Match the leftmost prefix: the index created earlier can be used to find all people with the last name Allen, using only the first column of the index.

Match a column prefix: the index can also match only the beginning of a column's value. For example, it can be used to find all people whose last name starts with J.

Match a range of values: the index created earlier can be used to find people whose last name is between Allen and Barrymore. (Only the first column of the index is used here.)

Match one column exactly and a range on another: the index created earlier can be used to find everyone whose last name is Allen and whose first name starts with the letter K (for example, Kim). Here the first column, last_name, is matched exactly, and the second column, first_name, is matched by range.

Index-only queries: B-tree indexes can often support queries that need to access only the index, without reading the data rows.
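The lookup types listed above all reduce to finding one contiguous run of entries in sorted order. The following Python sketch models that with the standard `bisect` module. The sample data, the `index_range` helper, and the `HIGH` sentinel are assumptions made for illustration; this is not how MySQL implements its B-trees.

```python
from bisect import bisect_left, bisect_right

# Hypothetical index entries on (last_name, first_name, dob), kept sorted.
# ISO date strings sort chronologically, so plain strings are enough here.
index_entries = sorted([
    ("Akroyd", "Christian", "1958-12-07"),
    ("Allen", "Cuba", "1960-01-01"),
    ("Allen", "Kim", "1930-07-12"),
    ("Astaire", "Angelina", "1980-03-04"),
    ("Barrymore", "Julia", "2000-05-16"),
    ("Basinger", "Viven", "1976-12-08"),
])

HIGH = "\uffff"  # sentinel assumed to sort after every character in the sample data


def index_range(entries, low, high):
    """Return the contiguous slice of entries with low <= entry <= high."""
    return entries[bisect_left(entries, low):bisect_right(entries, high)]


# Full value match: every indexed column is given.
key = ("Allen", "Cuba", "1960-01-01")
full_match = index_range(index_entries, key, key)

# Leftmost prefix: only last_name is given.
leftmost = index_range(index_entries, ("Allen",), ("Allen", HIGH))

# Column prefix: last names starting with "B".
column_prefix = index_range(index_entries, ("B",), ("B" + HIGH,))

# Range of values: last names from Allen through Barrymore.
value_range = index_range(index_entries, ("Allen",), ("Barrymore", HIGH))

# Exact match on last_name plus a range on first_name (starts with "K").
exact_plus_range = index_range(index_entries, ("Allen", "K"), ("Allen", "K" + HIGH))

print(full_match, leftmost, column_prefix, value_range, exact_plus_range, sep="\n")
```

Each lookup needs only two binary searches because the matching entries sit next to each other in the sorted order. That contiguity is the property these query types rely on.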



Because the nodes of the index tree are sorted, the index can be used for ORDER BY as well as for lookups by value. In general, if a B-tree can find values in a particular way, it can also sort them in that way. So if an ORDER BY clause matches one of the query types listed above, the index can also satisfy the ordering requirement.




1.2.4 Limitations of B-tree indexes

1.2.4.1 The index cannot be used if the lookup does not start from the leftmost column of the index. For example, the index above cannot be used to find all people named Kim, or all people born on a specific date, because neither column is the leftmost column. Likewise, it cannot be used to find people whose last name ends with a particular letter.

1.2.4.2 Columns in the index cannot be skipped. That is, the index above cannot be fully used to find people whose last name is Allen and who were born on a specific date. If first_name is not specified, MySQL can use only the first column of the index.

1.2.4.3 If the query contains a range condition on a column, none of the columns to its right can use the index to optimize the lookup. For example, in the query WHERE last_name = 'Allen' AND first_name LIKE 'K%' AND dob = '1930-07-12', only the first two columns of the index can be used, because LIKE is a range condition here (although the server can still use the remaining column for other purposes). If the range column takes only a limited number of values, you can replace the range condition with multiple equality conditions. Try to place the column with the range condition last in the index, because columns after it cannot use the index.
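The limitations above can be made concrete by counting how many index entries a lookup has to examine. This Python sketch again models the index as a sorted list of tuples. The data and both helper functions are hypothetical, and the counts describe only this model, not MySQL's actual execution.

```python
from bisect import bisect_left, bisect_right

# Hypothetical index entries on (last_name, first_name, dob), kept sorted.
index_entries = sorted([
    ("Akroyd", "Christian", "1958-12-07"),
    ("Allen", "Cuba", "1960-01-01"),
    ("Allen", "Karl", "1945-02-02"),
    ("Allen", "Kim", "1930-07-12"),
    ("Astaire", "Angelina", "1980-03-04"),
    ("Barrymore", "Kim", "2000-05-16"),
])

HIGH = "\uffff"  # sentinel assumed to sort after every character in the sample data


def scan_by_first_name(entries, first_name):
    """No leftmost column is given, so every entry has to be examined."""
    matches = [e for e in entries if e[1] == first_name]
    return matches, len(entries)


def seek_then_filter(entries, last_name, first_prefix, dob):
    """Equality on column 1 and a range on column 2 narrow the search.

    The dob condition sits to the right of the range, so it cannot narrow
    the search any further and is applied as a filter on the candidates.
    """
    lo = bisect_left(entries, (last_name, first_prefix))
    hi = bisect_right(entries, (last_name, first_prefix + HIGH))
    candidates = entries[lo:hi]
    matches = [e for e in candidates if e[2] == dob]
    return matches, len(candidates)


kims, examined_scan = scan_by_first_name(index_entries, "Kim")
allen_k, examined_seek = seek_then_filter(index_entries, "Allen", "K", "1930-07-12")

print(kims, examined_scan)
print(allen_k, examined_seek)
```

Searching by first_name alone touches all six entries, because people named Kim are scattered through the sorted order. The second lookup examines only the two Allen entries whose first name starts with K, then checks dob on each of them.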




2.1 Benefits of indexes

Indexes let the server navigate quickly to the desired position in a table. But that is not their only purpose. As we have seen so far, indexes have additional benefits that depend on the data structure used to build them.

The most common type, the B-tree index, stores data in sorted order, so MySQL can use it for ORDER BY and GROUP BY operations. Because the data is sorted, a B-tree also stores related column values close together. Finally, because the index stores the actual column values, some queries can be answered from the index alone. These properties give indexes three main advantages:

1. Indexes greatly reduce the amount of data the server has to scan.

2. Indexes help the server avoid sorting and temporary tables.

3. Indexes turn random I/O into sequential I/O.




Lahdenmäki and Leach proposed a "three-star system" for evaluating how well an index suits a query:

A. The index earns one star if it places related records next to each other;

B. It earns a second star if the order of the data in the index matches the sort order the query needs;

C. It earns a third star if the index contains all the columns the query needs.




3. High-performance indexing strategies

Creating the right indexes and using them correctly is the foundation of high-performance queries. The following sections discuss how to take full advantage of indexes.

3.1 Isolated columns

It is common to see queries that use indexes improperly or that prevent MySQL from using the indexes that exist. If the columns in a query are not isolated, MySQL cannot use the index on them. An "isolated column" means that the indexed column is not part of an expression and is not passed as an argument to a function.

Example 1: The following query cannot use the index on the user_id column

mysql> SELECT user_id FROM tb1 WHERE user_id + 1 = 5;

A human can easily see that the WHERE expression is equivalent to user_id = 4, but MySQL cannot solve the equation for user_id on its own; that is up to the user. Get into the habit of simplifying WHERE conditions so that the indexed column always stands alone on one side of the comparison operator.



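Example 2: The following query cannot use the index on the date_col column

mysql> SELECT ... WHERE TO_DAYS(CURRENT_DATE) - TO_DAYS(date_col) <= 10;

Workaround: in the application, compute the date 10 days before the current date, store it in last_date_val, and compare the bare column against that value. The original condition holds when date_col is no more than 10 days in the past, so the comparison is >=:

mysql> SELECT ... WHERE date_col >= last_date_val;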
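As a minimal sketch of the application-side step, the Python below computes the cutoff date and checks that comparing the bare column against it selects the same rows as the original expression. The function names and the fixed "current date" are assumptions for illustration; `date.toordinal()` stands in for TO_DAYS here. The condition TO_DAYS(CURRENT_DATE) - TO_DAYS(date_col) <= 10 is algebraically equivalent to date_col >= (current date minus 10 days).

```python
from datetime import date, timedelta


def cutoff_date(today, days=10):
    """Return the date `days` before `today`; rows match when date_col >= this."""
    return today - timedelta(days=days)


def original_predicate(today, date_col, days=10):
    # Mirrors TO_DAYS(CURRENT_DATE) - TO_DAYS(date_col) <= 10:
    # the indexed column is buried inside an expression.
    return today.toordinal() - date_col.toordinal() <= days


def rewritten_predicate(today, date_col, days=10):
    # Bare column on one side, precomputed constant on the other.
    return date_col >= cutoff_date(today, days)


today = date(2015, 7, 1)             # fixed, hypothetical "current date"
last_date_val = cutoff_date(today)   # value the application would pass to the query
print(last_date_val)                 # 2015-06-21

# The two predicates agree for every date in a window around the cutoff.
sample_dates = [today + timedelta(days=offset) for offset in range(-30, 31)]
agree = all(
    original_predicate(today, d) == rewritten_predicate(today, d)
    for d in sample_dates
)
print(agree)
```

Because the rewritten form leaves date_col alone on one side of the comparison, an index on that column can be used for a range lookup.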



3.2 Prefix indexes and index selectivity

Sometimes you need to index very long character columns, which makes the index large and slow. You can usually index just the first few characters instead, which saves a great deal of index space and therefore makes the index more efficient. However, this also reduces the selectivity of the index.


Index selectivity is the ratio of the number of distinct indexed values (the cardinality) to the total number of rows in the table (#T), and ranges from 1/#T to 1. The more selective an index is, the more efficient queries are, because a highly selective index lets MySQL filter out more rows during a lookup. A unique index has a selectivity of 1, which is the best possible selectivity and gives the best performance.

Understanding selectivity: every row has an index value. If the prefix is too short, duplicate index values become more likely, so a lookup matches more rows and filters out fewer. That is why a reasonable prefix length has to be found.

The trick is to choose a prefix that is long enough to give good selectivity but not so long that it wastes space. The prefix should be long enough that the selectivity of the prefix index is close to that of an index on the whole column. In other words, the cardinality of the prefix should be close to the cardinality of the full column.
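The definition of selectivity above translates directly into a few lines of Python. This is a sketch on made-up data: the `city_names` list and both function names are hypothetical, and the numbers say nothing about any real table.

```python
def selectivity(values):
    """Distinct values divided by total rows (assumes at least one row)."""
    values = list(values)
    return len(set(values)) / len(values)


def prefix_selectivity(values, length):
    """Selectivity obtained when only the first `length` characters are indexed."""
    return selectivity(v[:length] for v in values)


# Hypothetical column contents.
city_names = [
    "London", "Londrina", "Lonavala",
    "Santiago", "Santa Fe", "San Jose",
    "Paris", "Paris",
]

print(selectivity(city_names))            # 0.875 (7 distinct values in 8 rows)
print(prefix_selectivity(city_names, 3))  # 0.375 (only "Lon", "San", "Par")
print(prefix_selectivity(city_names, 5))  # 0.875 (matches the full column)
```

In this sample a three-character prefix collapses most names into the same index value, while a five-character prefix is already as selective as the whole column.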



3.2.1 How to choose a reasonable prefix length

mysql> CREATE TABLE city (city_name VARCHAR(50) NOT NULL);  -- length 50 is an assumption; the source omitted it
Query OK, 0 rows affected (0.26 sec)

mysql> SELECT COUNT(*) AS cnt, city_name FROM city GROUP BY city_name ORDER BY cnt DESC LIMIT 10;

[Figure: output of the query above, listing the ten most frequent city names]


As the output shows, each of these values appears 5 to 7 times. Now find the most frequent city-name prefixes, starting with three-letter prefixes:



mysql> SELECT COUNT(*) AS cnt, LEFT(city_name, 3) AS pref FROM city GROUP BY pref ORDER BY cnt DESC LIMIT 10;

[Figure: output of the query above, listing the ten most frequent three-letter prefixes]

Each prefix occurs more often than any full city name, so there are far fewer unique prefixes than unique city names. The idea is to keep increasing the prefix length until the selectivity of the prefix approaches that of the full column.
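The same frequency comparison can be tried outside the database. The sketch below uses `collections.Counter` to play the role of GROUP BY ... ORDER BY cnt DESC LIMIT n. The data and the `top_counts` helper are hypothetical.

```python
from collections import Counter

# Hypothetical column contents.
city_names = [
    "London", "Londrina", "Lonavala",
    "Santiago", "Santa Fe", "San Jose",
    "Paris", "Paris",
]


def top_counts(values, prefix_length=None, limit=10):
    """Most frequent values, or most frequent prefixes when a length is given."""
    keys = values if prefix_length is None else [v[:prefix_length] for v in values]
    return Counter(keys).most_common(limit)


print(top_counts(city_names, limit=3))                   # full names
print(top_counts(city_names, prefix_length=3, limit=3))  # three-letter prefixes
```

In this sample the most common full name occurs twice, while the most common three-letter prefix occurs three times. That is the same effect the two queries above show on a larger scale.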


mysql> SELECT COUNT(*) AS cnt, LEFT(city_name, 7) AS pref FROM city GROUP BY pref ORDER BY cnt DESC LIMIT 10;

[Figure: output of the query above, listing the ten most frequent seven-letter prefixes]



Another way to find a suitable prefix length is to compute the selectivity of the full column and then choose a prefix whose selectivity is close to it. The selectivity of the full column is computed as follows:


mysql> SELECT COUNT(DISTINCT city_name) / COUNT(*) FROM city;

[Figure: output of the query above]

In this example, a prefix whose selectivity is close to 0.4317 is generally good enough (although there are exceptions). The selectivity of several prefix lengths can be computed in a single query.

[Figure: query computing the selectivity of several prefix lengths, with its output]

The output shows that once the prefix length reaches 7, increasing it further improves selectivity very little, if at all.
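The search for the point where longer prefixes stop helping can be automated. The following Python sketch returns the shortest prefix whose selectivity is within a chosen tolerance of the full-column selectivity. The tolerance value, the function names, and the sample data are all assumptions, and a real decision should also weigh index size and the worst-case prefix frequencies.

```python
def selectivity(values):
    """Distinct values divided by total rows (assumes at least one row)."""
    return len(set(values)) / len(values)


def shortest_good_prefix(values, tolerance=0.05):
    """Return (length, selectivity) for the shortest prefix whose selectivity
    is within `tolerance` of the full-column selectivity."""
    full = selectivity(values)
    longest = max(len(v) for v in values)
    for length in range(1, longest + 1):
        sel = selectivity([v[:length] for v in values])
        if full - sel <= tolerance:
            return length, sel
    return longest, full


# Hypothetical column contents.
city_names = [
    "London", "Londrina", "Lonavala",
    "Santiago", "Santa Fe", "San Jose",
    "Paris", "Paris",
]

print(shortest_good_prefix(city_names))  # (5, 0.875)
```

For this sample, prefixes of length 1 to 3 have selectivity 0.375, length 4 reaches 0.625, and length 5 reaches the full-column value of 0.875, so 5 is returned.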




3.2.2 Creating a prefix index

mysql> ALTER TABLE city ADD INDEX city_name_index (city_name(7));

(KEY can be written in place of INDEX.) Prefix indexes are an effective way to make indexes smaller and faster, but they have drawbacks: MySQL cannot use a prefix index for ORDER BY or GROUP BY, and it cannot use one as a covering index.




3.3 Multi-column indexes

Multi-column indexes are often poorly understood. Common mistakes are to create a separate index for each column, or to create a multi-column index with the columns in the wrong order.

CREATE TABLE t (c1 INT, c2 INT, c3 INT, KEY(c1), KEY(c2), KEY(c3));

This indexing strategy usually comes from vague advice such as "index the columns that appear in the WHERE condition". That advice is badly wrong. At best it yields a "one-star" index, whose performance may be several orders of magnitude worse than that of a truly optimal index. Sometimes, when a "three-star" index cannot be designed, it is better to ignore the WHERE clause and concentrate on optimizing the column order of the index, or to create a fully covering index.

Separate single-column indexes on multiple columns do not improve MySQL query performance in most cases. MySQL 5.0 and later versions introduced a strategy called "index merge", which can, to some extent, use several single-column indexes on a table to locate the desired rows.

Earlier versions of MySQL could use only one of the single-column indexes, and in that situation no single index is very effective. For example, the film_actor table has a single-column index on each of film_id and actor_id, but neither index is a good choice for the WHERE condition of the following query:


mysql> SELECT film_id, actor_id FROM film_actor WHERE actor_id = 1 OR film_id = 1;

In older versions, MySQL uses a full table scan for this query unless it is rewritten as a UNION of two queries:

mysql> SELECT film_id, actor_id FROM film_actor WHERE actor_id = 1 UNION ALL SELECT film_id, actor_id FROM film_actor WHERE film_id = 1 AND actor_id <> 1;
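To see why the UNION ALL rewrite returns the same rows as the OR query, the Python sketch below models each single-column index as a dictionary from value to row positions. The rows and the `build_index` helper are hypothetical, and the model ignores how MySQL actually stores and merges indexes.

```python
from collections import defaultdict

# Hypothetical film_actor rows: (actor_id, film_id).
rows = [(1, 1), (1, 23), (2, 1), (2, 3), (3, 17), (1, 25)]


def build_index(table, column):
    """Map each value of one column to the positions of the rows holding it."""
    index = defaultdict(list)
    for row_id, row in enumerate(table):
        index[row[column]].append(row_id)
    return index


actor_index = build_index(rows, 0)  # single-column index on actor_id
film_index = build_index(rows, 1)   # single-column index on film_id

# Branch 1: WHERE actor_id = 1, answered from the actor_id index.
branch1 = [rows[i] for i in actor_index.get(1, [])]

# Branch 2: WHERE film_id = 1 AND actor_id <> 1, answered from the film_id
# index. The extra condition keeps rows already found by branch 1 out of
# the result, which is why UNION ALL needs no duplicate removal.
branch2 = [rows[i] for i in film_index.get(1, []) if rows[i][0] != 1]

union_all = branch1 + branch2

# Reference result: the OR condition evaluated by scanning every row.
full_scan = [r for r in rows if r[0] == 1 or r[1] == 1]

print(sorted(union_all) == sorted(full_scan))
```

Each branch touches only the rows its own index points to, whereas the reference result has to look at every row in the table.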




3.4 Choosing a good column order

3.4.1 Example analysis

mysql> SELECT * FROM payment WHERE staff_id = 2 AND customer_id = 584;

Should the index be on (staff_id, customer_id), or should the column order be reversed? You can run some queries to examine the distribution of values in the table and determine which column is more selective.

mysql> SELECT SUM(staff_id = 2), SUM(customer_id = 584) FROM payment;

[Figure: output of the query above]

mysql> SELECT COUNT(DISTINCT staff_id) / COUNT(*) AS staff_id_selectivity, COUNT(DISTINCT customer_id) / COUNT(*) AS customer_id_selectivity, COUNT(*) FROM payment;

[Figure: output of the query above]
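The two measurements above can be reproduced on a toy data set to show how they feed the column-order decision. In this Python sketch the payment rows and helper names are hypothetical, and putting the most selective column first is only a rule of thumb. The right order for a real table also depends on the queries that will use the index.

```python
# Hypothetical payment rows: (staff_id, customer_id).
payments = [
    (1, 584), (2, 584), (2, 10), (2, 11),
    (1, 12), (2, 13), (1, 584), (2, 14),
]
column_names = ["staff_id", "customer_id"]


def matching_rows(table, column, value):
    """Python counterpart of SUM(col = value): rows matching one condition."""
    return sum(1 for row in table if row[column] == value)


def column_selectivity(table, column):
    """Python counterpart of COUNT(DISTINCT col) / COUNT(*)."""
    values = [row[column] for row in table]
    return len(set(values)) / len(values)


staff_matches = matching_rows(payments, 0, 2)       # rows with staff_id = 2
customer_matches = matching_rows(payments, 1, 584)  # rows with customer_id = 584

# Rule of thumb used here: order columns from most to least selective.
order = sorted(
    range(len(column_names)),
    key=lambda c: column_selectivity(payments, c),
    reverse=True,
)
suggested_index = [column_names[c] for c in order]

print(staff_matches, customer_matches)
print(suggested_index)
```

In this sample customer_id is the more selective column and the condition customer_id = 584 also matches fewer rows, so both measurements point to (customer_id, staff_id).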







[Figure: MySQL index]













This article is from the "Everyman" blog, please make sure to keep this source http://caoyt.blog.51cto.com/9978141/1669369
