High Performance MySQL (11): Optimizing common query statements bitsCN.com
This article summarizes optimization techniques for common query statements.
1 COUNT
1. What COUNT counts
· COUNT(table.field) returns the number of rows in which that field is not NULL.
· COUNT(*), or COUNT of a NOT NULL field, returns the number of rows in the table.
When counting all rows of a table, COUNT(*) is slightly more efficient than COUNT(not-nullable-field).
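The NULL-skipping behavior of COUNT(column) versus COUNT(*) can be checked with a small script. This sketch uses SQLite through Python's sqlite3 module purely for portability; the COUNT semantics shown are the same in MySQL, and the table `t` is invented for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (val TEXT)")
conn.executemany("INSERT INTO t (val) VALUES (?)",
                 [("a",), ("b",), (None,)])

# COUNT(*) counts every row; COUNT(val) skips rows where val is NULL.
total_rows = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
non_null = conn.execute("SELECT COUNT(val) FROM t").fetchone()[0]
print(total_rows, non_null)  # 3 2
```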
2. COUNT on MyISAM
A COUNT usually has to scan a large number of rows. The MyISAM engine, however, stores the exact row count of each table, so COUNT(*) is very fast there, provided the query has no WHERE clause.
3. If you need COUNT results frequently, consider maintaining a summary table.
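A minimal sketch of the summary-table idea, using SQLite through Python's sqlite3 module for portability (the `student` and `student_counts` tables are invented for this example): every insert also bumps a per-area counter row, so reads never have to scan the big table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, area TEXT)")
conn.execute("CREATE TABLE student_counts (area TEXT PRIMARY KEY, cnt INTEGER)")

def add_student(area):
    # Keep the summary row in sync on every insert.
    conn.execute("INSERT INTO student (area) VALUES (?)", (area,))
    conn.execute("INSERT OR IGNORE INTO student_counts VALUES (?, 0)", (area,))
    conn.execute("UPDATE student_counts SET cnt = cnt + 1 WHERE area = ?",
                 (area,))

for a in ("sh", "sh", "BJ"):
    add_student(a)

# The count is a single-row primary-key lookup, not a table scan.
sh_count = conn.execute(
    "SELECT cnt FROM student_counts WHERE area = 'sh'").fetchone()[0]
```

In a real system the counter update would live in the same transaction (or a trigger) as the insert, so the summary can never drift from the base table.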
4. Small optimization examples
A trick for reducing the number of rows scanned by a range COUNT in MyISAM: when most rows match the range, count the complement and subtract it from the stored total.
Before optimization: SELECT COUNT(*) FROM dictionary WHERE id > 5;
After optimization: SELECT (SELECT COUNT(*) FROM dictionary) - COUNT(*) FROM dictionary WHERE id <= 5;
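The two forms can be sanity-checked against a toy table. A sketch using SQLite via sqlite3 (the `dictionary` table is invented; note that the speed-up itself relies on MyISAM's stored row count and does not apply to SQLite, so only the equivalence of the two queries is demonstrated here):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dictionary (id INTEGER PRIMARY KEY, mean TEXT)")
conn.executemany("INSERT INTO dictionary (id) VALUES (?)",
                 [(i,) for i in range(1, 101)])

# Direct range count: scans all matching rows (id > 5).
direct = conn.execute(
    "SELECT COUNT(*) FROM dictionary WHERE id > 5").fetchone()[0]

# Rewritten: total minus the (much smaller) complement (id <= 5).
rewritten = conn.execute(
    "SELECT (SELECT COUNT(*) FROM dictionary) - COUNT(*) "
    "FROM dictionary WHERE id <= 5").fetchone()[0]
print(direct, rewritten)  # 95 95
```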
Reducing the number of queries
Before optimization, two statements are needed:
SELECT COUNT(*) FROM student WHERE area = 'sh';
SELECT COUNT(*) FROM student WHERE area = 'BJ';
After optimization, they merge into one. Note that COUNT(expr) counts every non-NULL value, and a false comparison yields 0 rather than NULL, so use SUM(expr) or COUNT(expr OR NULL):
SELECT SUM(area = 'sh') AS shcount, SUM(area = 'BJ') AS bjcount FROM student;
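The merged form can be verified with a toy table. This SQLite sketch (tables invented) works because a comparison such as `area = 'sh'` evaluates to 1 or 0 in both MySQL and SQLite, so SUM simply adds up the matches:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, area TEXT)")
conn.executemany("INSERT INTO student (area) VALUES (?)",
                 [("sh",), ("sh",), ("BJ",), ("gz",)])

# One pass over the table produces both counts at once.
shcount, bjcount = conn.execute(
    "SELECT SUM(area = 'sh'), SUM(area = 'BJ') FROM student").fetchone()
print(shcount, bjcount)  # 2 1
```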
2 Optimizing joins
1. Make sure the columns in ON or USING clauses are indexed.
2. Generally, you only need to create the index on the corresponding column of the second table in the join order.
3. Try to make GROUP BY and ORDER BY expressions refer to columns from only one table, so that MySQL has a chance to use an index for them.
3 Optimizing subqueries
Where possible, rewrite subqueries as joins.
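A small illustration of the rewrite, using SQLite via sqlite3 (the `film`/`film_actor` tables echo the familiar sakila-style sample schema and are not from this article): an IN subquery and the equivalent join return the same rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE film (film_id INTEGER PRIMARY KEY, title TEXT)")
conn.execute("CREATE TABLE film_actor (film_id INTEGER, actor_id INTEGER)")
conn.executemany("INSERT INTO film VALUES (?, ?)",
                 [(1, "A"), (2, "B"), (3, "C")])
conn.executemany("INSERT INTO film_actor VALUES (?, ?)", [(1, 7), (3, 7)])

# Subquery form.
sub = conn.execute(
    "SELECT title FROM film WHERE film_id IN "
    "(SELECT film_id FROM film_actor WHERE actor_id = 7) "
    "ORDER BY title").fetchall()

# Join form; DISTINCT guards against a film matching several join rows.
joined = conn.execute(
    "SELECT DISTINCT title FROM film "
    "INNER JOIN film_actor USING (film_id) "
    "WHERE actor_id = 7 ORDER BY title").fetchall()
```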
4 Optimizing GROUP BY and DISTINCT
1. When grouping the result of a join, it is more efficient to use the identifier column of the driving table as the grouping condition.
2. Columns not named in the GROUP BY clause cannot normally be selected directly; this can be worked around with an inner join against the grouped subquery (MySQL requires an alias on the derived table):
SELECT firstname, lastname
FROM actor
INNER JOIN (SELECT actor_id, COUNT(*) AS cnt FROM actor GROUP BY actor_id) AS grouped
USING (actor_id);
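A runnable sketch of this compensation technique, using SQLite via sqlite3. It assumes a hypothetical `film_actor` table holding one row per film appearance, so the grouped count is meaningful; joining back to `actor` then recovers the non-grouped name columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE actor "
             "(actor_id INTEGER PRIMARY KEY, firstname TEXT, lastname TEXT)")
conn.execute("CREATE TABLE film_actor (film_id INTEGER, actor_id INTEGER)")
conn.executemany("INSERT INTO actor VALUES (?, ?, ?)",
                 [(1, "Ann", "Lee"), (2, "Bob", "Wu")])
conn.executemany("INSERT INTO film_actor VALUES (?, ?)",
                 [(10, 1), (11, 1), (12, 2)])

# Group in a subquery, then join back to fetch the non-grouped name columns.
rows = conn.execute(
    "SELECT firstname, lastname, cnt FROM actor "
    "INNER JOIN (SELECT actor_id, COUNT(*) AS cnt "
    "FROM film_actor GROUP BY actor_id) AS grouped "
    "USING (actor_id) ORDER BY actor_id").fetchall()
print(rows)  # [('Ann', 'Lee', 2), ('Bob', 'Wu', 1)]
```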
3. By default, GROUP BY sorts the query result, which can be costly for large data sets. If you do not care about the order of the results, add ORDER BY NULL to skip the sort. (MySQL 8.0 no longer sorts GROUP BY results implicitly.)
5 LIMIT pagination
Paged queries often fetch data with statements like SELECT * FROM table1 LIMIT 100, 20. To process this, MySQL reads 120 rows, discards the first 100 (the offset), and returns the last 20 to the client. A very large offset therefore hurts efficiency; in that case, try the following.
1. Rewrite the SQL as a covering-index scan plus an inner join:
SELECT field1, field2, field3
FROM table1
INNER JOIN (SELECT id FROM table1 LIMIT 100, 20) AS temp
USING (id);
2. If you can compute explicit start and end points, convert the query to BETWEEN ... AND, which scans only the specified rows and is more efficient:
SELECT * FROM table1 WHERE id BETWEEN 100 AND 120;
3. Use a position marker to reduce the number of rows scanned, e.g. to continue from a known position:
SELECT * FROM table1 WHERE id > 100 LIMIT 20;
Of the three methods, the last two are the most efficient, because they scan only the rows that are actually returned.
When paginating, you often also need the total number of records in order to generate page links. That total is usually obtained with COUNT or a full-table query, which again reads every record only to discard most of them. Two strategies avoid this waste:
· Replace numbered pages with a "next page" link, so that each request retrieves only a fixed number of rows.
· Fetch 1,000 records at a time; when they are used up, offer a "get more results" action that fetches the next 1,000.
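The offset-versus-position-marker difference can be demonstrated directly. This SQLite sketch (table invented) shows both queries returning the same page; in MySQL, the second form avoids reading and discarding the offset rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (id INTEGER PRIMARY KEY, field1 TEXT)")
conn.executemany("INSERT INTO table1 (id) VALUES (?)",
                 [(i,) for i in range(1, 201)])

# Offset pagination: the engine walks past the first 100 rows.
offset_page = conn.execute(
    "SELECT id FROM table1 ORDER BY id LIMIT 100, 20").fetchall()

# Position-marker ("keyset") pagination: seek straight to the start point.
last_seen_id = 100
keyset_page = conn.execute(
    "SELECT id FROM table1 WHERE id > ? ORDER BY id LIMIT 20",
    (last_seen_id,)).fetchall()

print(offset_page == keyset_page)  # True
```

The keyset form requires the client to remember the last id it saw, which fits the "next page" strategy above naturally.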
6 UNION
· MySQL does not push outer-query optimizations down into the parts of a UNION, so push WHERE conditions, LIMIT clauses, and so on into each subquery yourself (see http://blog.csdn.net/eric_sunah/article/details/17290641).
· UNION applies a DISTINCT pass to the combined result, which is often unnecessary; use UNION ALL to avoid that cost.
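The behavioral difference is easy to demonstrate. This SQLite sketch (tables invented) shows UNION deduplicating the combined result while UNION ALL keeps every row:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a (x INTEGER)")
conn.execute("CREATE TABLE b (x INTEGER)")
conn.executemany("INSERT INTO a VALUES (?)", [(1,), (2,)])
conn.executemany("INSERT INTO b VALUES (?)", [(2,), (3,)])

# UNION deduplicates (an extra DISTINCT-style pass over the result).
union_rows = conn.execute(
    "SELECT x FROM a UNION SELECT x FROM b ORDER BY x").fetchall()

# UNION ALL simply concatenates, skipping that pass.
union_all_rows = conn.execute(
    "SELECT x FROM a UNION ALL SELECT x FROM b ORDER BY x").fetchall()

print(union_rows)      # [(1,), (2,), (3,)]
print(union_all_rows)  # [(1,), (2,), (2,), (3,)]
```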
7 Custom variables
Used sensibly and flexibly, user-defined variables can bring unexpected performance gains, though at the cost of compatibility with other database systems.
Some examples of using custom variables:
· Row numbering
mysql> SET @rownumber := 0;
mysql> SELECT mean, @rownumber := @rownumber + 1 FROM dictionary LIMIT 10;
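The user-variable trick above is MySQL-specific. On MySQL 8.0+ the standard ROW_NUMBER() window function does the same job; a sketch using SQLite (3.25 or newer, as shipped with modern Python) via sqlite3, with an invented `dictionary` table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dictionary (id INTEGER PRIMARY KEY, mean TEXT)")
conn.executemany("INSERT INTO dictionary (mean) VALUES (?)",
                 [("alpha",), ("beta",), ("gamma",)])

# Standard SQL row numbering, no session variables needed.
rows = conn.execute(
    "SELECT mean, ROW_NUMBER() OVER (ORDER BY id) FROM dictionary LIMIT 3"
).fetchall()
print(rows)  # [('alpha', 1), ('beta', 2), ('gamma', 3)]
```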
· Avoiding a repeated query for just-updated data
After updating a record, a SELECT is often issued just to re-read the updated value.
A variable can capture the value during the UPDATE instead:
mysql> SET @updaterow := NULL;
mysql> UPDATE dictionary SET mean = 'update get variable' WHERE id = 100 AND @updaterow := NOW();
mysql> SELECT @updaterow;
· Counting updates and inserts
mysql> SET @x := 0; -- define a variable
mysql> INSERT INTO dictionary (id, mean) VALUES (3, 'duplicate')
       ON DUPLICATE KEY UPDATE mean = VALUES(mean) + (0 * (@x := @x + 1)); -- multiplying by 0 leaves the assigned value untouched while incrementing the counter on every duplicate-key update
mysql> SELECT @x; -- @x now holds the number of duplicate-key updates
8 Static analysis tools
A dedicated query-analysis tool can sometimes spot problems for you, e.g. pt-query-advisor (http://www.percona.com/doc/percona-toolkit/2.1/pt-query-advisor.html).