MySQL High-Performance Indexing Strategies

Source: Internet
Author: User
Tags percona percona server

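Isolated (independent) index columns:

An isolated column means that the indexed column must stand alone on one side of the comparison: it cannot be part of an expression or be passed as a function argument. Otherwise MySQL cannot use the index on that column.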

  Example 1:

SELECT actor_id FROM actor WHERE actor_id + 1 = 5; -- the index on actor_id cannot be used here, because the column is part of an expression

  Example 2:

SELECT ... WHERE TO_DAYS(CURRENT_DATE) - TO_DAYS(date_col) <= 10; -- also a bad pattern: the syntax is valid, but date_col is wrapped in a function, so an index on it cannot be used
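Both examples can usually be fixed by moving the arithmetic to the other side of the comparison so that the column stands alone. The following is a minimal Python sketch, not MySQL itself: it uses date.toordinal() as a stand-in for TO_DAYS() and checks that the original predicate and an isolated rewrite select the same rows. The function names and the rewritten SQL strings are illustrative assumptions, not taken from the source.

```python
from datetime import date, timedelta

# Hypothetical rewrites that leave the indexed column alone on one side.
REWRITTEN_EXAMPLE_1 = "SELECT actor_id FROM actor WHERE actor_id = 4"
REWRITTEN_EXAMPLE_2 = (
    "SELECT ... WHERE date_col >= DATE_SUB(CURRENT_DATE, INTERVAL 10 DAY)"
)


def original_predicate(date_col: date, today: date) -> bool:
    """Mirrors TO_DAYS(CURRENT_DATE) - TO_DAYS(date_col) <= 10."""
    return today.toordinal() - date_col.toordinal() <= 10


def isolated_predicate(date_col: date, today: date) -> bool:
    """Mirrors date_col >= DATE_SUB(CURRENT_DATE, INTERVAL 10 DAY)."""
    return date_col >= today - timedelta(days=10)


def predicates_agree(today: date, span_days: int = 60) -> bool:
    """Check both predicates on every date within span_days of today."""
    return all(
        original_predicate(today + timedelta(days=offset), today)
        == isolated_predicate(today + timedelta(days=offset), today)
        for offset in range(-span_days, span_days + 1)
    )


print(predicates_agree(date(2017, 3, 1)))
```

The rewrite keeps the same result set; only the isolated form lets the optimizer compare index entries on date_col directly against a constant.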

 

Multi-column indexes (composite indexes) and choosing a good column order:

A multiple-column index is also known as a composite index.

When should I use multi-column indexes?
  • When the server intersects multiple indexes (usually for multiple AND conditions), it usually means you need a single multi-column index containing all the relevant columns, rather than several independent single-column indexes.
  • When the server unions multiple indexes (usually for multiple OR conditions), the buffering, sorting, and merging steps of the algorithm usually consume a lot of CPU and memory, especially when some of the indexes are not very selective and the merge has to scan and return a large amount of data.
Rules for applying multiple-column indexes:

Take an index on (a, b, c) as an example. The entries are sorted by a; within each value of a they are sorted by b; and within each value of (a, b) they are sorted by c. A multi-column index is therefore used from left to right, one column after another. If a column in the middle is not used, the columns before that breakpoint take effect and the columns after it do not.

Note: the multi-column index (a, b, c) is different from (a, c, b), because the column order, and therefore the sort order, is different.

The following practical examples illustrate the rule (all assume an index on (a, b, c)):

 

(0) select * from mytable where a = 3 and b = 5 and c = 4; -- all three columns a, b, c appear in the WHERE clause, and all three parts of the index are used

(1) select * from mytable where c = 4 and b = 6 and a = 3; -- listed to show that MySQL is not that naive: the optimizer reorders the conditions in WHERE before running the query, so the effect is the same as in the previous statement

(2) select * from mytable where a = 3 and c = 7; -- a uses the index; b is not in the query, so c cannot use the index

(3) select * from mytable where a = 3 and b > 7 and c = 3; -- a and b use the index, c does not. Here b is a range condition and therefore the breakpoint, but b itself still uses the index

(4) select * from mytable where b = 3 and c = 4; -- a is not in the query, so b and c cannot use the index

(5) select * from mytable where a > 4 and b = 7 and c = 9; -- a uses the index as a range condition, so b and c do not

(6) select * from mytable where a = 3 order by b; -- a uses the index, and b also benefits from the index for sorting, because b is sorted within each value of a

(7) select * from mytable where a = 3 order by c; -- a uses the index, but c cannot be used for sorting because b is skipped in between; EXPLAIN shows Using filesort

(8) select * from mytable where b = 3 order by a; -- b cannot use the index, and a does not help with sorting either
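The lookup part of these examples can be summarized as a small model. The Python sketch below is an illustration under simplifying assumptions, not a description of the MySQL optimizer: the function name and the "eq"/"range" labels are hypothetical, and it ignores ORDER BY, index condition pushdown, and cost-based decisions.

```python
def usable_index_columns(index_columns, predicates):
    """Return the leading index columns that can narrow the lookup.

    index_columns: column names in index order, e.g. ("a", "b", "c").
    predicates: maps a column name to "eq" (equality) or "range".
    """
    used = []
    for column in index_columns:
        kind = predicates.get(column)
        if kind is None:
            break  # gap in the prefix: later columns cannot be used
        used.append(column)
        if kind == "range":
            break  # a range condition is the last usable column
    return used


index = ("a", "b", "c")
print(usable_index_columns(index, {"a": "eq", "b": "eq", "c": "eq"}))     # example (0)
print(usable_index_columns(index, {"a": "eq", "c": "eq"}))                # example (2)
print(usable_index_columns(index, {"a": "eq", "b": "range", "c": "eq"}))  # example (3)
print(usable_index_columns(index, {"b": "eq", "c": "eq"}))                # example (4)
print(usable_index_columns(index, {"a": "range", "b": "eq", "c": "eq"}))  # example (5)
```

Because the predicates are passed as a mapping, the order in which conditions are written does not matter, which matches example (1).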

There is a rule of thumb for choosing the column order of an index: put the most selective column first. (See ①.)

When sorting and grouping do not need to be considered, putting the most selective column first is usually a good choice. In that case the index only serves to optimize lookups for the WHERE conditions.

 

Prefix index and index selectivity:

A prefix index can effectively reduce the size of the index file and speed up index lookups. However, prefix indexes also have disadvantages:

1. MySQL cannot use a prefix index for ORDER BY or GROUP BY;

2. A prefix index cannot be used as a covering index.

Syntax for creating prefix indexes:

ALTER TABLE table_name ADD KEY (column_name(prefix_length));

Example:

ALTER TABLE city ADD KEY (cityname(7));

What is index selectivity? ①

Index selectivity is the ratio of the number of distinct index values (also called the cardinality) to the number of rows in the table (#T):

Selectivity = Cardinality / #T

Clearly, selectivity falls in the range (0, 1]. The higher the selectivity, the more valuable the index.

SELECT count(DISTINCT(title))/count(*) AS Selectivity FROM employees.titles;
+-------------+
| Selectivity |
+-------------+
|      0.0379 |
+-------------+
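The same ratio can be computed outside the database. The following Python sketch is only an illustration of the formula Selectivity = Cardinality / #T on an in-memory list; the function name and the sample values are made up for the example.

```python
def selectivity(values):
    """Return distinct-value count divided by row count for one column."""
    rows = list(values)
    if not rows:
        raise ValueError("selectivity is undefined for an empty table")
    return len(set(rows)) / len(rows)


# Hypothetical column contents: 3 distinct titles across 4 rows.
titles = ["Engineer", "Engineer", "Staff", "Manager"]
print(selectivity(titles))
```

A unique column gives a selectivity of exactly 1, while a column with a single repeated value gives 1 / #T, which is why the range is (0, 1].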

 

For example, the employees table has only one index <emp_no>. If we want to search for a person by name, we can only scan the entire table:

EXPLAIN SELECT * FROM employees.employees WHERE first_name='Eric' AND last_name='Anido';

+----+-------------+-----------+------+---------------+------+---------+------+--------+-------------+
| id | select_type | table     | type | possible_keys | key  | key_len | ref  | rows   | Extra       |
+----+-------------+-----------+------+---------------+------+---------+------+--------+-------------+
|  1 | SIMPLE      | employees | ALL  | NULL          | NULL | NULL    | NULL | 300024 | Using where |
+----+-------------+-----------+------+---------------+------+---------+------+--------+-------------+

 

A full table scan like this is very inefficient, so we want an index on the name. There are two options: <first_name> or <first_name, last_name>. Let's look at the selectivity of the two indexes:

SELECT count(DISTINCT(first_name))/count(*) AS Selectivity FROM employees;
+-------------+
| Selectivity |
+-------------+
|      0.0042 |
+-------------+

SELECT count(DISTINCT(concat(first_name, last_name)))/count(*) AS Selectivity FROM employees;
+-------------+
| Selectivity |
+-------------+
|      0.9313 |
+-------------+

 

Clearly, the selectivity of <first_name> is too low, while that of <first_name, last_name> is good. However, first_name and last_name together have a length of 30. Is there a way to balance length and selectivity? One option is to index first_name plus the first few characters of last_name, for example <first_name, left(last_name, 3)>. Check its selectivity:

SELECT count(DISTINCT(concat(first_name, left(last_name, 3))))/count(*) AS Selectivity FROM employees;
+-------------+
| Selectivity |
+-------------+
|      0.7879 |
+-------------+

 

The selectivity is decent but still some distance from 0.9313, so increase the last_name prefix to 4:

SELECT count(DISTINCT(concat(first_name, left(last_name, 4))))/count(*) AS Selectivity FROM employees;
+-------------+
| Selectivity |
+-------------+
|      0.9007 |
+-------------+
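The trial-and-error search for a prefix length can also be automated. The sketch below is a hypothetical Python helper working on in-memory (first_name, last_name) pairs, not on the employees database; the function names, the target_ratio parameter, and the sample rows are assumptions for illustration. It compares tuples rather than concatenated strings, which avoids ambiguous concatenations.

```python
def prefix_selectivity(rows, prefix_length):
    """Selectivity of (first_name, left(last_name, prefix_length))."""
    keys = {(first, last[:prefix_length]) for first, last in rows}
    return len(keys) / len(rows)


def shortest_prefix(rows, target_ratio=0.95, max_length=16):
    """Smallest prefix length whose selectivity reaches target_ratio
    of the full (first_name, last_name) selectivity."""
    full = len(set(rows)) / len(rows)
    for length in range(1, max_length + 1):
        if prefix_selectivity(rows, length) >= target_ratio * full:
            return length
    return max_length


# Made-up sample data, only to exercise the functions.
sample = [
    ("Eric", "Anido"), ("Eric", "Anders"), ("Eric", "Angel"),
    ("Mary", "Smith"), ("Mary", "Smythe"), ("John", "Smith"),
]
print(prefix_selectivity(sample, 1), shortest_prefix(sample, target_ratio=1.0))
```

On real data the threshold is a judgment call: a lower target_ratio gives a shorter index at the cost of more rows to filter after the index lookup.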

 

Now the selectivity is very good, and the index length is only 18, nearly half the length of <first_name, last_name>. Create this prefix index:

ALTER TABLE employees
ADD INDEX `first_name_last_name4` (first_name, last_name(4));

 

Run the query by name again and compare the result with the one from before the index was created:

SHOW PROFILES;
+----------+------------+---------------------------------------------------------------------------------+
| Query_ID | Duration   | Query                                                                           |
+----------+------------+---------------------------------------------------------------------------------+
|       87 | 0.11941700 | SELECT * FROM employees.employees WHERE first_name='Eric' AND last_name='Anido' |
|       90 | 0.00092400 | SELECT * FROM employees.employees WHERE first_name='Eric' AND last_name='Anido' |
+----------+------------+---------------------------------------------------------------------------------+

The performance improvement is significant: the query is more than 120 times faster (0.11941700 s versus 0.00092400 s).

Clustered indexes, covering indexes, redundant indexes, and unused indexes

Unused indexes should be dropped. Two tools can help locate them:

1. In Percona Server or MariaDB, enable the userstat server variable (disabled by default) and let the server run normally for a while. Then query INFORMATION_SCHEMA.INDEX_STATISTICS to see how often each index is used.

2. In Percona Toolkit, the pt-index-usage tool reads the query log, runs EXPLAIN on each query in the log, and then prints a report on indexes and queries.

Indexes and locking

InnoDB locks rows only when it accesses them, and an index can reduce the number of rows InnoDB accesses, which in turn reduces the number of locks.

InnoDB can place shared (read) locks on secondary indexes, but exclusive (write) locks require access to the primary key index. This rules out using a covering index and can make SELECT FOR UPDATE much slower than LOCK IN SHARE MODE or a non-locking query.

InnoDB primary key selection and insert optimization

When using the InnoDB storage engine, unless there is a special need, always use an auto-increment column unrelated to the business data as the primary key. Why?

Because InnoDB uses a clustered index, data records are stored in the leaf nodes of the primary index (a B+Tree). This requires the data records within the same leaf node (whose size is one memory page or disk page) to be stored in primary key order. Therefore, whenever a new record is inserted, MySQL places it in the appropriate node and position according to its primary key. If the page reaches the load factor (15/16 by default in InnoDB), a new page (node) is allocated.

If the table uses an auto-increment primary key, each new record is appended right after the last record of the current index node. When a page is full, a new page is allocated automatically.

 

This produces a compact index structure that is filled in approximately sequential order. Because existing data does not need to be moved on each insert, inserts are very efficient and index maintenance adds little extra overhead.

If a non-auto-increment primary key is used (such as a national ID number or a student number), the primary key values arrive in what is effectively random order, so each new record has to be inserted somewhere in the middle of an existing index page.

In this case, MySQL has to move data in order to insert the new record at the right position. The target page may even have been flushed to disk and evicted from the cache, in which case it must be read back from disk. This adds a lot of overhead, and the frequent moves and page splits cause heavy fragmentation, leaving an index structure that is not compact. Later, OPTIMIZE TABLE has to be run to rebuild the table and optimize page fill.

Therefore, whenever possible, use an auto-increment column as the primary key on InnoDB.

References:

[1] Baron Schwartz; translated by Ning Haiyuan; High Performance MySQL (3rd edition); Electronic Industry Press, 2013

[2] Zhang Yang blog, http://blog.codinglabs.org/articles/theory-of-mysql-index.html

[3] anonymous blog, http://www.cnblogs.com/codeAB/p/6387148.html
