SQL Learning Note----------MySQL Index optimization all in all

Source: Internet
Author: User
Tags bulk insert mysql index

An index is a particular sort of algorithm for a particular MySQL field, such as a two-tree algorithm and a hashing algorithm, which is created by creating eigenvalues and then quickly finding them based on eigenvalues. And the most used, and is the MySQL default is the binary tree algorithm BTREE, through the BTREE algorithm to build indexed fields, such as scanning 20 lines can get unused BTREE before scanning the results of the 2^20 line, the specific implementation of this blog will be an algorithm topic there will be specific analysis and discussion ;

Explain optimized query detection

Explain can help developers analyze SQL problems, explain shows how MySQL uses indexes to process SELECT statements and join tables to help select better indexes and write more optimized query statements.

Using the method, add explain to the SELECT statement:

Explain select * from blog where false;

Before executing a query, MySQL parses each SQL issued, deciding whether to use an index or a full table scan if sending a select * from blog where Falsemysql does not perform a query operation. Because after the analysis of the SQL parser, MySQL has been clear that there will be no statement to comply with the operation;

Example

mysql> EXPLAIN Select ' Birday ' from ' user ' WHERE ' birthday ' < ' 1990/2/2 '; --Results: id:1 select_type:simple--Query type (simple query, federated query, subquery) Table:user--Shows the data of this row is about which table Type:range--Interval index (in the number less than 1990/2/2 interval This is an important column that shows what type of connection is used. The best to worst connection types are system > Const > EQ_REF > Ref > Fulltext > Ref_or_null > Index_merge > Unique_subquery &G T Index_subquery > Range > Index > All,const represents a single hit, and all represents a scan of the full table before the result is determined. In general, it is best to ensure that the query reaches at least the range level, preferably ref. Possible_keys:birthday--Indicates which index MySQL can use to find rows in the table. If it is empty, there is no index associated with it.  To improve performance, you can examine where clauses to see if some fields are referenced, or check that the fields are not appropriate for the index. Key:birthday--The index that is actually used. If NULL, the index is not used. If it is primary, it means that the primary key is used. Key_len:4--the longest index width. If the key is null, the length is null.  With no loss of accuracy, the shorter the length, the better Ref:const--shows which field or constant is used together with the key. Rows:1-This number indicates how much data MySQL will traverse to find and is inaccurate on InnoDB. Extra:using where; Using index-The execution status description, the bad examples you can see here are using temporary and using  

Select_type

    1. Simple select (Do not use union or subquery)
    2. Primary the outermost Select
    3. The second or subsequent SELECT statement in a Union union
    4. Dependent the second or subsequent SELECT statement in the Union Union, depending on the outside query
    5. The result of Union result union.
    6. Subquery The first select in a subquery
    7. Dependent the first select in a subquery subquery, depending on the outside query
    8. Derived the select of the exported table (subquery FROM clause)

Extra and type detailed description

  1. Distinct: Once MySQL finds a row that matches a row, it no longer searches for
  2. Not Exists:mysql optimizes the left join, and once it finds a row that matches the left join standard, it no longer searches for
  3. Range checked for each Record (index map:#): No ideal index was found, so for every combination of rows from the preceding table, MySQL examines which index to use and uses it to return rows from the table. This is one of the slowest connections to use the index
  4. Using Filesort: When you see this, the query needs to be optimized. MySQL requires additional steps to find out how to sort the rows that are returned. It sorts all rows based on the connection type and the row pointers for all rows that store the sort key values and matching criteria.
  5. Using index: Column data is returned from a table that uses only the information in the index and does not read the actual action, which occurs when all the request columns of the table are part of the same index
  6. Using temporary When you see this, the query needs to be optimized. Here, MySQL needs to create a temporary table to store the results, which usually occurs on an order by on a different set of columns, rather than on the group by
  7. Where used uses a WHERE clause to restrict which rows will match the next table or return to the user. If you do not want to return all the rows in the table, and the connection type all or index, this occurs, or the query has a problem the interpretation of different connection types (sorted in order of efficiency)
  8. The system table has only one row: the system table. This is a special case of the const connection type
  9. Const: The maximum value of a record in a table can match this query (the index can be a primary key or a unique index). Because there is only one row, this value is actually a constant, because MySQL reads the value first and treats it as a constant.
  10. Eq_ref: In a connection, when MySQL queries, from the previous table, the union of each record reads a record from the table, which is used when the query uses the index as the primary key or the unique key.
  11. Ref: This connection type occurs only if the query uses a key that is not a unique or primary key or is part of these types (for example, using the leftmost prefix). For each row union of the previous table, all records are read from the table. This type is heavily dependent on how many records are matched against the index-the less the better +
  12. Range: This connection type uses the index to return rows in a range, such as what happens when you use > or < to find something +
  13. Index: This connection type is fully scanned for each record in the previous table (better than all because the index is generally less than the table data) +
  14. All: This connection type is fully scanned for each previous record, which is generally bad and should be avoided as much as possible.

Where type:

    1. If it is only index, this means that information is retrieved only from the information in the index tree, which is faster than scanning the entire table.
    2. If it is a where used, the where limit is used.
    3. If it is impossible where means no where, it is generally not found out what.
    4. If this information shows the using Filesort or using temporary, then the where and order by indexes are often out of balance, and if the index is determined by where, then the order by will inevitably cause the using Filesort, it depends on whether to filter and reorder the cost, or first sort and then filter the cost.

Index

Type of index

Unique Unique index

Can not appear the same value, can have null value

Index Normal indexes

Allow the same indexed content to appear

PRIMARY Key PRIMARY Key index

The same value is not allowed and cannot be a null value, and a table can have only one Primary_key index

Fulltext Index full-text indexing

All three of these indexes work on the value of the column, but the full-text index can target a word in a value, such as a word in an article, but there is no egg, because only MyISAM and English support, and efficiency is not flattering, But this need can be done with third-party applications such as Coreseek and Xunsearch.

Curd of the Index

Creation of indexes

ALTER TABLE

Applies when the table is created and then added

ALTER table name ADD index type (unique,primary key,fulltext,index) [index name] (field name)

ALTER TABLE ' table_name ' ADD index ' index_name ' (' column_list ')--the index name, can not be, if not, the current index name is the field name; ALTER TABLE ' table_name ' Add UNIQUE (' column_list ') ALTER TABLE ' table_name ' Add PRIMARY KEY (' column_list ') ALTER TABLE ' TABLE_NAME ' ADD fulltext KEY (' column_list ')

CREATE INDEX

CREATE Index to add a normal or unique index to a table

--example, you can only add these two indexes; CREATE INDEX index_name on table_name (column_list) Create UNIQUE INDEX index_name on table_name (column_list)

In addition, you can add a table when you build it

Create TABLE ' test1 ' (   ' id ' smallint (5) UNSIGNED auto_increment not NULL,--note that the primary key index is created below, there is no need to create   ' username ' var char (+) NOT null COMMENT ' user name ',   ' nickname ' varchar (not null COMMENT ' nickname/Name ',   ' intro ' text,   PRIMARY KEY (' Id '),    UNIQUE key ' unique1 ' (' username '),--index name, can not, not just as column name   KEY ' index1 ' (' nickname '),   fulltext key ' Intro ' (' intro ')) Engine=myisam auto_increment=4 DEFAULT Charset=utf8 comment= ' backend user table ';

Deletion of indexes

Drop index ' index_name ' on ' talbe_name '  ALTER TABLE ' table_name ' DROP index ' index_name '-both are equivalent and are deleted table_name The index in the index_name; ALTER TABLE ' table_name ' drop PRIMARY key--Delete primary key index, note that primary key index can only be deleted in this way

View of the Index

Show index from TableName \g;

Changes to the index

Change the yarn and remove the rebuilt one to

Tips for creating indexes

1. Creating indexes on columns with high dimensions

The number of distinct values in the data column, the higher the dimension, the greater the

If the data table has 8 rows of data A, B, c,d,a,b,c,d the dimension of this table is 4

To create indexes for columns with high dimensions, such as gender and age, the dimension of the age is higher than the gender

Columns such as gender are not suitable for creating indexes because the dimensions are too low

2. Use indexes on columns that appear in Where,on,group By,order by

3. Use indexes on smaller data columns, which makes the index file smaller and allows more index keys to be loaded in memory

4. Use the prefix index for longer strings

5. Do not create indexes too much, except for additional disk space, which has a significant impact on the speed of DML operations because they have to be re-indexed every time they are incremented

6. Using a composite index, you can reduce the file index size and use it faster than multiple single-column indexes

Combined index and prefix index

Note that these two types of titles are a kind of salutation to the indexing technique, not the type of the index;

Combined index

What is the difference between a MySQL single-column index and a composite index?

To visually compare the two, first build a table:

CREATE TABLE ' Myindex ' (   ' i_testid ' INT not null auto_increment,    ' Vc_name ' VARCHAR (a) NOT null,    ' vc_city ' VA Rchar (a) is not null,    ' i_age ' int is not NULL,    ' i_schoolid ' int is not NULL,    PRIMARY KEY (' I_testid ')  );

Assuming that the table already has 1000 data, in these 10,000 records 7 8 in the field distribution of 5 vc_name= "Erquan" records, but the combination of City,age,school are different. Look at this T-sql:

SELECT ' I_testid ' from ' myindex ' WHERE ' vc_name ' = ' Erquan ' and ' vc_city ' = ' zhengzhou ' and ' i_age ' = 25; --Relevance search;

First consider building a MySQL single-column index:

An index was established on the Vc_name column. When executing T-SQL, MYSQL quickly locks the target on the 5 records of Vc_name=erquan and takes it out to a middle result set. In this result set, the first rule out vc_city not equal to "Zhengzhou" record, and then exclude i_age not equal to 25 of the record, and finally filtered out the only qualified records. Although the index is built on the vc_name, MySQL does not have to scan the whole table when querying, but the efficiency is improved, but there is a certain distance from our request. Similarly, the MySQL single-column indexes established separately in vc_city and i_age are similar in efficiency.

To further extract the efficiency of MySQL, it is necessary to consider building a composite index. is to build the Vc_name,vc_city,i_age into an index:

ALTER TABLE ' myindex ' ADD INDEX ' name_city_age ' (Vc_name (Ten), vc_city,i_age);

When the table is built, the length of the vc_name is 50, why is it used in 10? This is the prefix index that is mentioned below, because in general, the length of the name does not exceed 10, which speeds up the index query, reduces the size of the index file, and increases the update speed of the INSERT.

When executing T-SQL, MySQL does not need to scan any records to find a unique record!!

If you set up a single-column index on Vc_name,vc_city,i_age, so that the table has 3 single-column indexes, is the query as efficient as the combined index above? The answer is much different, much lower than our combined index. Although there are three indexes at this point, MySQL can only use the one that it considers to be the most efficient single-column index, the other two are not used, that is, a full-table scan process.

The establishment of such a composite index is actually equivalent to establishing a separate

    • Vc_name,vc_city,i_age
    • Vc_name,vc_city
    • Vc_name

Such a three combination index! Why is there no such combination index as vc_city,i_age? This is because MySQL combines the results of the "leftmost prefix" of the index. The simple understanding is only from the left to the beginning of the combination. It is not just that the combined index is used for queries that contain these three columns, and several of the following T-SQL is used:

SELECT * from Myindex whree vc_name= "Erquan" and vc_city= "Zhengzhou" select * from Myindex whree vc_name= "Erquan"

And the next few are not used:

SELECT * from Myindex whree i_age=20 and vc_city= "Zhengzhou" select * from Myindex whree vc_city= "Zhengzhou"

That is, Name_city_age (Vc_name (), vc_city,i_age) is indexed from left to right, and if there is no left front index MySQL does not perform an index query

Prefix index

If the index column length is too long, this column index will produce a large index file, not easy to operate, you can use the prefix index indexing prefix index should be controlled at a suitable point, control at 0.31 gold value (greater than this value can be created)

SELECT COUNT (DISTINCT ("title"))/count (*) from Arctic; -This value is greater than 0.31 to create the prefix index, distinct to repeat ALTER TABLE ' user ' ADD index ' uname ' (title (10)); -Increase prefix index SQL to set the index of person names at 10, which reduces index file size and speeds up index queries

What kind of SQL does not go index

To try to avoid these non-indexed SQL

Select ' sname ' from ' Stu ' where ' age ' +10=30;--does not use the index because all indexed columns participate in the calculation of SELECT ' sname ' from ' Stu ' where left (' Date ', 4) <1990 ; --Do not use the index because the function operation is used, the same principle as the above select * from ' Houdunwang ' where ' uname ' like ' backing% '-Go to index SELECT * from ' houdunwang ' where ' un Ame ' like '% backing% '--do not go index--the regular expression does not use the index, which should be well understood, so why it is difficult to see the reason for the REGEXP keyword in SQL--the string and the numeric comparison do not use the index; CREATE TABLE ' A ' (' A ' char (10)); EXPLAIN select * from ' a ' where ' a ' = ' 1 '-Go to index EXPLAIN select * from ' a ' where ' a ' = 1--Do not go index SELECT * FROM dept WHERE Dnam e= ' xxx ' or loc= ' xx ' or deptno=45--if there is or in the condition, it will not be used even if there is a conditional index. In other words, all the fields required for use must be indexed, and we recommend that you try to avoid using the OR keyword--if MySQL estimates that using a full table scan is faster than using an index, do not use the index

Index efficiency for multi-table associations

    • SELECT ' sname ' from ' Stu ' WHERE Left (' Date ', 4) <1990; -The index is not used because the function operation is used, and the principle is the same as above
    • SELECT * from ' Houdunwang ' WHERE ' uname ' like ' backing% '-Walk index
    • SELECT * from ' Houdunwang ' WHERE ' uname ' like '% backing% '-Do not go index

As can be seen, all tables are of type all, indicating full table index, that is 6 6 6, the total traversal of the query 216 times;

In addition to the first one representing the full table index (which must be associated with other tables), the remainder is a range (indexed interval), which is 6+1+1+1, which iterates through the query 9 times;

So we recommend that you try to join as few tables as possible, because it is a horror scan of a Cartesian product, and we also recommend that you use a LEFT join as much as possible with less association. Because the first table is the one that must be scanned with the join, Fewer connections can reduce the number of scans.

Disadvantages of indexing

Instead of blindly creating indexes, creating indexes on columns that are frequently queried, creating indexes can make query operations faster, but it can slow the increase, deletion, and update operations, because the index files are reordered or updated at the same time as these operations are performed;

However, in the Internet application, the query statement is much larger than the DML statement, can even account for 80%~90%, so do not care too much, just in the Big data import, you can delete the index, then bulk insert the data, and then add the index.

This paper draws on http://www.codeceo.com/article/mysql-index.html

SQL Learning Note----------MySQL Index optimization all in all

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.