Prefix Index
(1) If multiple fields form a composite index, such as (A, B, C), whether a query can use the index depends on whether its conditions match a leftmost prefix of the index: queries on A, on A/B, or on A/B/C can use it, but queries on B, on B/C, or on C alone cannot, because they do not start from the leftmost column. Keep this in mind when writing SQL statements.
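The leftmost-prefix rule can be observed directly. The sketch below uses SQLite's EXPLAIN QUERY PLAN as a convenient stand-in for MySQL's EXPLAIN (the planner rule for composite B-Tree indexes is the same); the table t, columns a/b/c, and the index name idx_abc are invented for the demonstration.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (a INT, b INT, c INT)")
con.execute("CREATE INDEX idx_abc ON t (a, b, c)")   # composite index (A, B, C)

def plan(sql):
    # Join the EXPLAIN QUERY PLAN detail strings for a query.
    return " ".join(row[3] for row in con.execute("EXPLAIN QUERY PLAN " + sql))

# Conditions on a leftmost prefix (A, or A and B) can use the index:
assert "idx_abc" in plan("SELECT * FROM t WHERE a = 1 AND b = 2")
# A condition on B alone skips the leftmost column, so the index is unusable:
assert "idx_abc" not in plan("SELECT * FROM t WHERE b = 2")
```

In MySQL you would check the key column of EXPLAIN for the same effect.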
(2) When a long string column needs to be indexed, you can often use just its first n characters (a prefix) as the key instead of the whole string. This greatly improves retrieval efficiency and reduces index maintenance overhead (storage space, updates, etc.), at the cost of some precision: extra rows may match the prefix and must be filtered afterwards.
Suppose there are a million URLs, with statistics as follows:
SELECT COUNT(*) AS cnt, url FROM iurl GROUP BY url ORDER BY cnt DESC LIMIT 10;
+-----+-------------------+
| cnt | url               |
+-----+-------------------+
| 380 | http://www.f.com  |
| 354 | http://www.r.com  |
| 340 | http://www.h.com  |
| 336 | http://www.u.com  |
| 336 | http://www.e.com  |
| 336 | http://www.i.com  |
| 332 | http://www.k.com  |
| 330 | http://www.t.com  |
| 330 | http://www.w.com  |
| 326 | http://www.l.com  |
+-----+-------------------+
10 rows in set (3.04 sec)
Take the first n characters as the prefix so that the grouped counts come close to the full-column statistics. Here n must reach beyond the shared http://www. prefix, since up to that point every URL looks the same and nothing can be distinguished.
SELECT COUNT(*) AS cnt, LEFT(url, 11) FROM iurl GROUP BY LEFT(url, 11) ORDER BY cnt DESC LIMIT 10;  -- LEFT(url, 11) takes the leftmost 11 characters
You can also compute the selectivity of the full column and of candidate prefix lengths, and pick the shortest prefix whose selectivity comes close to the full column's.
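A sketch of that selectivity calculation, using SQLite with its substr() standing in for MySQL's LEFT(); the sample URLs are made up. Selectivity here is COUNT(DISTINCT prefix) / COUNT(*), and a good prefix length is the shortest one whose selectivity approaches the full column's.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE iurl (url TEXT)")
con.executemany(
    "INSERT INTO iurl VALUES (?)",
    [("http://www.f.com/a",), ("http://www.f.com/b",),
     ("http://www.r.com/a",), ("http://www.h.com/x",)],
)

def selectivity(n=None):
    # COUNT(DISTINCT prefix) / COUNT(*); n=None means the full column.
    expr = "url" if n is None else f"substr(url, 1, {n})"
    distinct, total = con.execute(
        f"SELECT COUNT(DISTINCT {expr}) * 1.0, COUNT(*) FROM iurl"
    ).fetchone()
    return distinct / total

# An 11-character prefix is exactly "http://www." here, so it tells us nothing;
# a longer prefix climbs back toward the full column's selectivity.
assert selectivity(11) < selectivity(16) <= selectivity()
```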
Add the prefix index:
ALTER TABLE iurl ADD KEY (url(11));
Prefix index advantages: 1. a smaller index 2. faster lookups. There are also deficiencies: a prefix index cannot be used for ORDER BY / GROUP BY and cannot serve as a covering index.
Clustered Index
It is not a separate index type but a way of storing data: InnoDB stores the data rows in the leaf pages of its B-Tree index, so the index and the rows sit together. Each table can have only one clustered index; InnoDB clusters the data by primary key.
Clustered index advantages:
1. Related data is stored close together
2. Data access is fast, since the index and the rows live in the same B-Tree
3. Queries that need only the primary key values held in the leaf nodes can be answered as covering index queries
Disadvantages:
1. Clustering brings the biggest gains on I/O-bound workloads; if the data fits entirely in memory, access order no longer matters and clustering buys little
2. Insert speed depends heavily on insert order (inserting in primary key order is fastest)
3. Updating clustered index columns is expensive
4. Inserts and updates may cause page splits, which consume more disk space
5. A full scan of a clustered table can be slower than a full scan of a plain (heap) table
Covering Index
The leaf nodes of the index already contain the data being queried, so the query can be satisfied from the index alone without reading the rows.
Advantages:
1. Index records are usually much smaller than full rows, so reading only the index greatly reduces the amount of data accessed (MyISAM also compresses its indexes, which makes them easier to keep in memory)
2. Indexes are sorted by index value, so I/O-bound range access is much faster than fetching each row from a random disk location
3. Most storage engines cache indexes better than they cache data
4. Covering indexes are particularly useful for InnoDB because of its clustered index
Note:
1. Selecting all columns (SELECT *) defeats a covering index
2. The storage engine cannot perform a LIKE operation inside the index
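A quick way to see a covering index in action. The sketch uses SQLite, whose EXPLAIN QUERY PLAN literally says USING COVERING INDEX (MySQL reports "Using index" in the Extra column of EXPLAIN instead); the table and index names are invented.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE products (prod_id INT, actor TEXT, title TEXT)")
con.execute("CREATE INDEX idx_actor ON products (actor, prod_id)")

def plan(sql):
    # Join the EXPLAIN QUERY PLAN detail strings for a query.
    return " ".join(row[3] for row in con.execute("EXPLAIN QUERY PLAN " + sql))

# Both selected columns live in the index, so the rows are never touched:
assert "COVERING" in plan("SELECT prod_id FROM products WHERE actor = 'kejie'")
# SELECT * needs title too, so the engine must read the full rows:
assert "COVERING" not in plan("SELECT * FROM products WHERE actor = 'kejie'")
```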
A two-step query (a deferred join) can extend the reach of a covering index:
SELECT * FROM products
JOIN (
    SELECT prod_id FROM products
    WHERE actor = 'kejie' AND title LIKE '%ABC%'  -- the first stage is covered by the index
) AS t1 ON (t1.prod_id = products.prod_id);
However, if the result set is very large or very small, the gain may be negligible, and the straightforward query may even be better. Compare against your actual workload to decide whether this optimization is worthwhile!
Note: the following sorts cannot use an index:
1. Two different sort directions in one ORDER BY
2. The ORDER BY references a column that is not in the index
3. The ORDER BY columns do not form a leftmost prefix of the index
4. The first column carries a range condition:
... WHERE al_date > '2017-05-25'
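Point 1 (mixed sort directions) can be observed directly. This sketch uses SQLite's EXPLAIN QUERY PLAN, where a sort the index cannot deliver shows up as a USE TEMP B-TREE step (MySQL's EXPLAIN shows "Using filesort" instead); the table and index names are invented.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (a INT, b INT)")
con.execute("CREATE INDEX idx_ab ON t (a, b)")   # both columns ascending

def plan(sql):
    # Join the EXPLAIN QUERY PLAN detail strings for a query.
    return " ".join(row[3] for row in con.execute("EXPLAIN QUERY PLAN " + sql))

# Matching directions: the index scan returns rows already in order.
assert "TEMP B-TREE" not in plan("SELECT a, b FROM t ORDER BY a, b")
# Mixed directions: an extra sort step is needed.
assert "TEMP B-TREE" in plan("SELECT a, b FROM t ORDER BY a ASC, b DESC")
```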
Compressed Indexes
MyISAM uses prefix compression to reduce index size so that more of the index fits in memory, which can greatly improve performance in some cases.
There are drawbacks, though: with compressed blocks, index lookups are several times slower on CPU-bound workloads, and reverse scans are slower still.
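To make the trade-off concrete, here is an illustrative sketch of the idea behind prefix compression (not MyISAM's actual on-disk format): each key in a sorted block is stored as the length of the prefix it shares with the previous key, plus its own suffix.

```python
import os

def compress(sorted_keys):
    out, prev = [], ""
    for k in sorted_keys:
        n = len(os.path.commonprefix([prev, k]))  # shared-prefix length
        out.append((n, k[n:]))                    # store (length, suffix) only
        prev = k
    return out

def decompress(entries):
    keys, prev = [], ""
    for n, suffix in entries:
        prev = prev[:n] + suffix  # each key is rebuilt from its predecessor,
        keys.append(prev)         # which is why reverse scans are slow: a key
    return keys                   # cannot be decoded without the ones before it

keys = ["perform", "performance", "performing", "perfume"]
packed = compress(keys)
assert decompress(packed) == keys
assert sum(len(s) for _, s in packed) < sum(map(len, keys))  # fewer bytes stored
```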
Redundant and duplicate Indexes
CREATE TABLE test (
    id INT NOT NULL PRIMARY KEY,
    UNIQUE (id),
    INDEX (id)
);
MySQL implements both the UNIQUE and PRIMARY KEY constraints with indexes, so the table above creates three indexes on the same column!
Inserting new rows into a heavily indexed table is much slower; every added index noticeably hurts INSERT, UPDATE, and DELETE performance, especially once the indexes no longer fit in memory. Therefore:
1. Give the database server enough memory
2. Estimate memory needs from the total data and index size (try to keep all indexes in memory)
Index policy summary
1. Examine the most common queries
2. Avoid creating indexes before you know which queries will use them
3. Create the index, then verify it is optimal
1) check the response time
2) wherever possible, extend an existing index instead of adding a new one
Common techniques for high-performance queries
1. Use additional (redundant) indexes
2. Add redundant fields
3. Use cache tables and summary tables to speed up reads
High-performance Query Optimization
1. Optimize Data Access
1) Reduce unnecessary data requests to the server -> avoids extra server load, network overhead, memory consumption, and CPU usage
2) Use LIMIT to cap the number of returned records
3) Stay skeptical of SELECT *: is it really necessary? It also prevents covering indexes from being used
2. Measure what MySQL is doing
1) execution time
2) number of rows examined
3) number of rows returned -> check the access type in EXPLAIN: full table scan, index scan, range scan, unique index lookup, constant
--> All three measures are written to the slow query log, so browsing that log is the best way to find queries that examine too much data.
Slow query log
To enable the slow query log, configure it in my.cnf or start mysqld with the --log-slow-queries[=file_name] option; mysqld then writes a log file containing every SQL statement whose execution time exceeded long_query_time seconds.
Listing 1. Enabling MySQL slow query logs
[mysqld]
; enable the slow query log, default 10 seconds
log-slow-queries
; log queries taking longer than 5 seconds
long_query_time = 5
; log queries that don't use indexes even if they take less than long_query_time
; MySQL 4.1 and newer only
log-queries-not-using-indexes

If a query examines a large amount of data but produces only a few rows, you can try more invasive changes:
1) use a covering index so that only some columns are read instead of complete rows
2) change the schema
3) rewrite the complex query so that the optimizer can execute it well

Refactoring Queries
1. Split a complex query into multiple simple queries -> note: this does not mean splitting a query that reads 10 rows into 10 separate reads.
2. Shorten the query: process a small batch at a time to limit the number of affected rows.
3. Decompose joins:
SELECT * FROM tag
JOIN tag_post ON tag_post.tag_id = tag.id
JOIN post ON tag_post.post_id = post.id
WHERE tag.tag = 'mysql';
can be replaced with:
SELECT * FROM tag WHERE tag = 'mysql';
SELECT * FROM tag_post WHERE tag_id = 1234;
SELECT * FROM post WHERE post.id IN (123, 456, 909);
This decomposition has several advantages:
1) better cache efficiency
2) one query per table uses table locks more efficiently and never holds a lock for long
3) joining on the application side makes it easier to scale the database out
4) fewer redundant rows are accessed
5) the queries are more efficient: with IN, MySQL can sort the ids and fetch the rows in order
So when is joining on the application side more efficient?
1) a large amount of data from earlier queries can be cached
2) many MyISAM tables are involved
3) the data is distributed across different servers
4) IN replaces a join against a large table
5) one join references the same table multiple times

Correlated Subqueries
1. MySQL optimizes subqueries poorly. The worst case is IN in the WHERE clause:
SELECT * FROM sakila.film WHERE film_id IN (
    SELECT film_id FROM sakila.film_actor WHERE actor_id = 1
);
mysql> EXPLAIN SELECT * FROM sakila.film WHERE film_id IN (SELECT film_id FROM sakila.film_actor WHERE actor_id = 1);
+----+--------------------+------------+--------+------------------------+---------+---------+------------+------+-------------+
| id | select_type        | table      | type   | possible_keys          | key     | key_len | ref        | rows | Extra       |
+----+--------------------+------------+--------+------------------------+---------+---------+------------+------+-------------+
|  1 | PRIMARY            | film       | ALL    | NULL                   | NULL    | NULL    | NULL       | 1000 | Using where |
|  2 | DEPENDENT SUBQUERY | film_actor | eq_ref | PRIMARY,idx_fk_film_id | PRIMARY | 4       | const,func |    1 | Using index |
+----+--------------------+------------+--------+------------------------+---------+---------+------------+------+-------------+
You can also rewrite it with EXISTS:
SELECT * FROM sakila.film WHERE EXISTS (
    SELECT film_id FROM sakila.film_actor
    WHERE actor_id = 1 AND film_actor.film_id = film.film_id
);
mysql> EXPLAIN SELECT * FROM sakila.film WHERE EXISTS (SELECT film_id FROM sakila.film_actor WHERE actor_id = 1 AND film_actor.film_id = film.film_id);
+----+--------------------+------------+--------+------------------------+---------+---------+---------------------------+------+-------------+
| id | select_type        | table      | type   | possible_keys          | key     | key_len | ref                       | rows | Extra       |
+----+--------------------+------------+--------+------------------------+---------+---------+---------------------------+------+-------------+
|  1 | PRIMARY            | film       | ALL    | NULL                   | NULL    | NULL    | NULL                      | 1000 | Using where |
|  2 | DEPENDENT SUBQUERY | film_actor | eq_ref | PRIMARY,idx_fk_film_id | PRIMARY | 4       | const,sakila.film.film_id |    1 | Using index |
+----+--------------------+------------+--------+------------------------+---------+---------+---------------------------+------+-------------+
Rewritten as a join:
SELECT film.* FROM sakila.film
INNER JOIN sakila.film_actor USING (film_id)
WHERE actor_id = 1;
mysql> EXPLAIN SELECT film.* FROM sakila.film INNER JOIN sakila.film_actor USING (film_id) WHERE actor_id = 1;
+----+-------------+------------+--------+------------------------+---------+---------+---------------------------+------+-------------+
| id | select_type | table      | type   | possible_keys          | key     | key_len | ref                       | rows | Extra       |
+----+-------------+------------+--------+------------------------+---------+---------+---------------------------+------+-------------+
|  1 | SIMPLE      | film_actor | ref    | PRIMARY,idx_fk_film_id | PRIMARY | 2       | const                     |   19 | Using index |
|  1 | SIMPLE      | film       | eq_ref | PRIMARY                | PRIMARY | 2       | sakila.film_actor.film_id |    1 |             |
+----+-------------+------------+--------+------------------------+---------+---------+---------------------------+------+-------------+
The IN/EXISTS subqueries return the same results, but performance-wise they scan 1000 + 1 rows, while the INNER JOIN scans only 19 + 1!
In an mysqlslap test with 100 concurrent queries, the subquery form took 0.448 s and the join form only 0.285 s, nearly twice as fast. This is not an absolute rule, though, so let measured data make the case rather than assuming!

SELECT and UPDATE on the same table
MySQL cannot SELECT from a table while UPDATEing it in the same statement, for example:
UPDATE tbl AS outer_tbl
SET cnt = (
    SELECT COUNT(*) FROM tbl AS inner_tbl
    WHERE inner_tbl.type = outer_tbl.type
);
One workaround is to go through a derived (temporary) table:
UPDATE tbl
INNER JOIN (
    SELECT type, COUNT(*) AS cnt FROM tbl
    GROUP BY type
) AS der USING (type)
SET tbl.cnt = der.cnt;

Optimizing specific queries
1. COUNT
1) COUNT(expr) only counts rows where the expression is non-NULL! Use COUNT(*) rather than COUNT(column) to count rows.
2) When counting a large range, you can sometimes count the small complement and subtract it from the total:
SELECT COUNT(*) FROM world.City WHERE id > 5;
can be rewritten as:
SELECT (SELECT COUNT(*) FROM world.City) - COUNT(*) FROM world.City WHERE id <= 5;
2. Optimizing joins
1) Make sure the columns in ON or USING are indexed.
2) Make sure GROUP BY or ORDER BY only refers to columns of a single table, so MySQL can use an index for the operation.
3. Optimizing subqueries
1) Prefer rewriting them as joins!
2) The temporary table created for a subquery has no indexes.
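The two COUNT points can be checked with a runnable sketch; it uses SQLite and a tiny invented stand-in for world.City (300 rows with ids 1..300).

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE city (id INTEGER PRIMARY KEY)")
con.executemany("INSERT INTO city VALUES (?)", [(i,) for i in range(1, 301)])

# Counting the small complement and subtracting it from the cheap total
# count gives the same answer as counting the large range directly.
(direct,) = con.execute("SELECT COUNT(*) FROM city WHERE id > 5").fetchone()
(rewritten,) = con.execute(
    "SELECT (SELECT COUNT(*) FROM city) - COUNT(*) FROM city WHERE id <= 5"
).fetchone()
assert direct == rewritten == 295

# COUNT(column) skips NULLs; COUNT(*) counts every row.
con.execute("CREATE TABLE t (x INT)")
con.executemany("INSERT INTO t VALUES (?)", [(1,), (None,), (3,)])
c_star, c_col = con.execute("SELECT COUNT(*), COUNT(x) FROM t").fetchone()
assert (c_star, c_col) == (3, 2)
```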