The basic content of the index is described in the previous article, and we continue to introduce the index optimization combat. Before introducing the index optimization practice, first introduce the two important concepts related to indexes, which are essential for index optimization.
This article is used to test the user table structure:
Important conceptual cardinality related to indexes
The number of single-column unique keys (distict_keys) is called cardinality.
SELECT COUNT(DISTINCT name),COUNT(DISTINCT gender) FROM user;
The total number of rows in the user table is the cardinality of the 5,gender column is 2, indicating that there are a large number of duplicate values in the Gender column, the Name column has a cardinality equal to the total number of rows, indicating that the name column does not have duplicate values.
Returns the scale of the data:
There are 5 data in the user table:
SELECT * FROM user;
Query the number of records that meet gender 0 (male):
Then the scale number of the returned record is:
Similarly, query the number of records with Name ' SWJ ':
The number of scales to return records is:
Now that's the problem, assuming that both the name and gender columns have indexes, SELECT * FROM user WHERE gender = 0; SELECT * FROM user WHERE name = ‘swj‘; can you hit the index?
Index details for user table:
SELECT * FROM user WHERE gender = 0;Without a hit index, note that the value of filtered is the number of scales of the return record we computed above.
SELECT * FROM user WHERE name = ‘swj‘;Hit the index index_name, because walk the index can find the record to query directly, so the value of filtered is 100
Conclusion:
The data in the returned table in 30% is indexed, and the full table scan is used to return more than 30% data. Of course, this conclusion is too absolute, is not absolute 30%, just a general range.
Back to Table
When an index is created on a column, the index contains the key value of the column and the rowid of the corresponding row for the key value. The table is called back by accessing the data in the table by rowID recorded in the index. Too many back tables can seriously affect SQL performance, if the number of tables too many, you should not go to the index scan, should go straight to the full table scan.
Explain the result of the command means that the Using Index table is not returned and the main data can be obtained by indexing. Using WhereIt means that you need to return the data to the table.
Index optimization Combat
There are times when the database is indexed, but it is not selected by the optimizer.
We can SHOW STATUS LIKE ‘Handler_read%‘; view the usage of the index by:
Handler_read_key: If the index is working, the value of Handler_read_key will be high.
Handler_read_rnd_next: The number of requests in the data file that reads the next line, if a large number of table scans are in progress, the value will be higher, indicating that the index is not used optimally.
Index optimization Rules :
If MySQL estimates that using an index is slower than a full table scan, the index is not used
The scale of the returned data is an important indicator, and the lower the scale, the more easily the index is hit. Keep in mind that the range value is--30%, and what is said later is based on the ratio of the returned data to less than 30%.
Leading fuzzy query cannot hit index
The Name column creates a normal index:
A leading fuzzy query cannot hit the index:
EXPLAIN SELECT * FROM user WHERE name LIKE ‘%s%‘;
Non-leading fuzzy queries can use indexes that can be optimized to use non-leading fuzzy queries:
EXPLAIN SELECT * FROM user WHERE name LIKE ‘s%‘;
An implicit conversion of the data type does not hit the index, especially if the column type is a string, be sure to enclose the character constant value in quotation marks
EXPLAIN SELECT * FROM user WHERE name=1;
EXPLAIN SELECT * FROM user WHERE name=‘1‘;
?
In the case of a composite index, the query condition does not contain the leftmost part of the indexed column (which does not satisfy the leftmost principle) and does not hit the eligible index
Name,age,status column to create a composite index:
ALTER TABLE user ADD INDEX index_name (name,age,status);
User Table Index Details:
SHOW INDEX FROM user;
Based on the leftmost principle, you can hit the composite index index_name:
EXPLAIN SELECT * FROM user WHERE name=‘swj‘ AND status=1;
Note that the left-most principle is not the order of the query criteria:
EXPLAIN SELECT * FROM user WHERE status=1 AND name=‘swj‘;
Instead, the leftmost column field of the index is included in the query condition:
EXPLAIN SELECT * FROM user WHERE status=2 ;
The Union, in, or both are capable of hitting the index and are recommended for use in.
Union:
EXPLAIN SELECT * FROM user WHERE status = 1
UNION ALL
SELECT * FROM user WHERE status = 2;
in:
EXPLAIN SELECT * FROM user WHERE status IN (1,2);
or:
EXPLAIN SELECT * FROM user WHERE status=1 OR status=2;
CPU consumption of the query: or > in >union
The condition that is separated by or, if the condition in the or before the index, and the subsequent column does not have an index, then the index involved will not be used
EXPLAIN SELECT * FROM payment WHERE customer_id = 203 OR amount = 3.96;
Because there is no index in the condition column after or, the subsequent query must take a full table scan, and in the case of a full table scan, there is no need for more than one index scan to increase IO access.
-
Negative criteria queries cannot use indexes and can be optimized for in queries.
Negative conditions are:! =, <>, not in, not exists, not like, and so on. The
Status column creates an index:
ALTER TABLE User ADD index Index_status (status);
User table Index details:
SHOW index from user;
Negative condition cannot hit cache:
EXPLAIN SELECT * from user WHERE status!=1 and Status! = 2;
The
can be optimized for in queries, but only if the sensitivity is high and the ratio of the returned data is less than 30%:
EXPLAIN SELECT * from user WHERE status in (0,3,4);
Scope criteria query can hit index
Scope conditions include:<, <=, >, >=, between, etc.
Status,age columns to create the index separately:
ALTER TABLE user ADD INDEX index_status (status);
ALTER TABLE user ADD INDEX index_age (age);
User Table Index Details:
SHOW INDEX FROM user;
The scope criteria query can hit the index:
EXPLAIN SELECT * FROM user WHERE status>5;
The range column can be used for indexes (the federated index must be the leftmost prefix), but the columns following the range column cannot be indexed, and the index is used up to one range column, if there are two range columns in the query condition that cannot be fully used in the index:
EXPLAIN SELECT * FROM user WHERE status>5 AND age<24;
If the range query and the equivalent query exist at the same time, the index of the equivalent query column is preferred:
EXPLAIN SELECT * FROM user WHERE status>5 AND age=24;
Database execution calculation does not hit index
EXPLAIN SELECT * FROM user WHERE age > 24;
EXPLAIN SELECT * FROM user WHERE age+1 > 24;
The computational logic should be placed as far as possible into the business layer processing, saving the CPU of the database while maximizing the hit index.
Querying with an overlay index, avoiding back tables
The queried column, the data can be obtained from the index, rather than through the row locator row-locator to the row, that is "queried column to be built by the index overlay", which can speed up the query.
Index details for user table:
Because the Status field is an indexed column, you can get the value directly from the index without having to return to the table query:
Using IndexRepresents the query from the index
EXPLAIN SELECT status FROM user where status=1;
When querying other columns, you need to return a table query, which is one of the reasons why you should avoid it SELECT * :
EXPLAIN SELECT * FROM user where status=1;
Indexed columns, not allowed to null
A single-column index does not have null values, the composite index does not have all null values, and if the column is allowed to be null, you might get a result set that does not meet expectations, so use a NOT NULL constraint and a default value.
The remark column builds the index:
ALTER TABLE user ADD INDEX index_remark (remark);
Is null to hit the index:
EXPLAIN SELECT * FROM user WHERE remark IS NULL;
is not NULL cannot hit index:
EXPLAIN SELECT * FROM user WHERE remark IS NOT NULL;
Although is null can hit the index, but NULL is itself not a good database design, should use not NULL constraints and default values
It is not appropriate to index on fields that are updated very frequently
Because the update operation changes the B + Tree, the index is rebuilt. This process is very expensive for database performance.
It is not appropriate to index on a field with a small degree of differentiation
Similar to gender-sensitive fields, indexing is not very meaningful. Performance is equivalent to full table scan because data cannot be effectively filtered. In addition, the optimizer does not choose to use the index in cases where the scale of the returned data is outside 30%.
A field with unique attributes on the business, even if it is a combination of multiple fields, must be a unique index
Although the unique index affects the insert speed, it is very obvious that the speed of the query is improved. In addition, even in the application layer to do a very good check control, as long as there is no unique index, in the case of concurrency, there are still dirty data generated.
When you associate a multi-table, ensure that the associated field must have an index
Avoid the following misconceptions when creating indexes
The more indexes the better, the more you think a query requires an index.
Ningquewulan that the index consumes space, severely slows updates, and adds speed.
Resist the unique index, that the uniqueness of the business needs to be resolved in the application layer through the "first check and then plug" approach.
Optimization is premature and can be optimized without knowing the system.
Summarize
For your own SQL query statements, try to use the explain command to analyze, do a SQL performance with the pursuit of programmers. The SQL capability is an important indicator of whether a programmer is reliable or not. As a back-end programmer, it's a deep thought.
Reference
The thing about the code.
Public Number:
Source: http://songwenjie.cnblogs.com/
Statement: This article for Bo Master Learning sentiment Summary, the level is limited, if improper, welcome correction. If you think it's good, just click on the "recommend" button below, thanks for your support. Reprint and quote please specify the source.
mysql--Index Optimization Combat