7.2.2 Estimating Query Performance
In most cases, you can estimate query performance by counting disk seeks. For small tables, you can usually find a row in one disk seek (because the index is probably cached). For larger tables that use B-tree indexes, you can estimate the number of seeks needed to find a row as follows:
log(row_count) / log(index_block_length / 3 * 2 / (index_length + data_pointer_length)) + 1
In MySQL, an index block is usually 1,024 bytes and the data pointer is usually 4 bytes. For a table with 500,000 rows and an index length of 3 bytes (the size of MEDIUMINT), the formula gives:
log(500,000) / log(1024 / 3 * 2 / (3 + 4)) + 1
which is approximately 4 seeks.
The index of this table requires about 500,000 * 7 * 3/2 = 5.2 MB of storage (assuming a typical index buffer fill ratio of 2/3). Much of the index is therefore likely to be in memory, and only one or two read calls are needed to find the row.
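As a quick check of the arithmetic above, the same formula can be evaluated in MySQL itself. This is only a sketch: the column aliases are made up for illustration, and LOG() with a single argument is the natural logarithm (the base cancels out in the ratio).

```sql
-- Estimated seeks for 500,000 rows, 1024-byte index blocks,
-- a 3-byte index value and a 4-byte data pointer.
SELECT LOG(500000) / LOG(1024 / 3 * 2 / (3 + 4)) + 1 AS estimated_seeks;

-- Approximate index size in bytes: rows * (index + pointer) * 3/2.
SELECT 500000 * 7 * 3 / 2 AS index_bytes;
```

The first query returns a value a little under 4, and the second returns 5,250,000 bytes, which is the storage figure of roughly 5.2 MB.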
For writes, it takes about four seeks (or more) to find where to place a new index value, and usually two more seeks to update the index and write the row.
Note that the preceding discussion does not mean that application performance slowly degrades by log N. As long as everything is cached by the operating system or the MySQL server, things become only marginally slower as the table grows. Once the data is too large to be cached, things become much slower until the application is bound only by disk seeks (which increase by log N). To avoid this, increase the key cache size as the data grows. For MyISAM tables, the key cache size is controlled by the key_buffer_size system variable. For details, see "7.5.2 Tuning Server Parameters".
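The following sketch shows one way to inspect and enlarge the MyISAM key cache at runtime. The 256 MB figure is an arbitrary example value, not a recommendation, and SET GLOBAL needs an account with sufficient privileges.

```sql
-- Current key cache size in bytes.
SHOW VARIABLES LIKE 'key_buffer_size';

-- Key_reads (reads from disk) compared with Key_read_requests (all
-- requests) gives a rough idea of how often the key cache misses.
SHOW STATUS LIKE 'Key_read%';

-- Example value only (256 MB); choose a size that fits the server's memory.
SET GLOBAL key_buffer_size = 256 * 1024 * 1024;
```

A value set this way lasts only until the server restarts; a permanent setting belongs in the server option file.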
7.2.3 Speed of SELECT Queries
In general, to make a slow SELECT ... WHERE query faster, the first thing to check is whether you can add an index. References between different tables should usually be resolved with indexes. You can use the EXPLAIN statement to determine which indexes a SELECT uses. For details, see "7.4.5 How MySQL Uses Indexes" and "7.2.1 EXPLAIN Syntax (Get Information About a SELECT)".
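A minimal sketch of that workflow follows. The table orders, its column customer_id, and the index name idx_customer are hypothetical names used only for illustration.

```sql
-- Check which index (if any) the query uses; a NULL in the key column
-- of the output means that no index was chosen.
EXPLAIN SELECT * FROM orders WHERE customer_id = 42;

-- Add an index on the column used in the WHERE clause.
ALTER TABLE orders ADD INDEX idx_customer (customer_id);

-- Run EXPLAIN again and compare the type, key, and rows columns.
EXPLAIN SELECT * FROM orders WHERE customer_id = 42;
```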
The following tips can speed up queries on MyISAM tables:
- To help MySQL optimize queries better, run ANALYZE TABLE or myisamchk --analyze on a table after it has been fully loaded with data. This updates a value for each index part that indicates the average number of rows that share the same value (for a unique index, this value is always 1). MySQL uses this value to decide which index to use when you join two tables on a non-constant expression. To check the result, run SHOW INDEX FROM tbl_name and examine the Cardinality value. myisamchk --description --verbose shows index distribution information.
- To sort an index and its data according to an index, run myisamchk --sort-index --sort-records=1 (to sort on index 1). This is a good way to speed up queries if you have a unique index and want to read all rows in index order. However, sorting a large table this way can take a long time the first time.
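The first tip can be carried out from the SQL client as sketched below, with tbl_name standing in for a real table name as elsewhere in this section.

```sql
-- Refresh the key distribution statistics after a bulk load.
ANALYZE TABLE tbl_name;

-- Inspect the Cardinality column, which the optimizer uses when
-- choosing an index for a join.
SHOW INDEX FROM tbl_name;
```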
7.2.4 How MySQL Optimizes WHERE Clauses
This section describes the optimizations that MySQL can apply when processing WHERE clauses. The examples use SELECT statements, but the same optimizations apply to WHERE clauses in DELETE and UPDATE statements.
Note that work on the MySQL optimizer is ongoing, so this section is incomplete. MySQL performs many optimizations, not all of which are documented here.
Some of the optimizations that MySQL performs are as follows:
- Removal of unnecessary parentheses:
((a AND b) AND c OR (((a AND b) AND (c AND d)))) -> (a AND b AND c) OR (a AND b AND c AND d)
- Constant folding:
(a<b AND b=c) AND a=5 -> b>5 AND b=c AND a=5
- Constant condition removal (needed because of constant folding):
(B>=5 AND B=5) OR (B=6 AND 5=5) OR (B=7 AND 5=6) -> B=5 OR B=6
- Constant expressions used by indexes are evaluated only once.
- COUNT(*) on a single table without a WHERE clause is retrieved directly from the table information for MyISAM and HEAP tables. The same applies to any NOT NULL expression when it is used with only one table.
- Invalid constant expressions are detected early. MySQL quickly detects that some SELECT statements are impossible and returns no rows.
- HAVING is merged with WHERE if you do not use GROUP BY or grouping functions (COUNT(), MIN(), and so on).
- A simpler WHERE is constructed for each table in a join, to get a faster WHERE evaluation for that table and to skip rows as soon as possible.
- All constant tables are read first, before any other tables in the query. A constant table is any of the following:
  - An empty table or a table with only one row.
  - A table used with a WHERE clause on a UNIQUE index or a PRIMARY KEY, where all index parts are compared with constant expressions and are defined as NOT NULL.
All of the following tables are used as constant tables:
SELECT * FROM t WHERE primary_key=1;
SELECT * FROM t1,t2 WHERE t1.primary_key=1 AND t2.primary_key=t1.id;
- MySQL tries to find the best join combination for joining the tables. If all columns in the ORDER BY and GROUP BY clauses come from the same table, that table is preferred first when joining.
- If there is an ORDER BY clause and a different GROUP BY clause, or if the ORDER BY or GROUP BY contains columns from tables other than the first table in the join order, a temporary table is created.
- If you use SQL_SMALL_RESULT, MySQL uses an in-memory temporary table.
- Each table index is queried, and the best index is used unless the optimizer believes that a full table scan is more efficient. At one time, a scan was chosen when the best index spanned more than 30% of the table. The optimizer is now more complex and bases its estimate on additional factors such as table size, number of rows, and I/O block size, so a fixed percentage no longer determines the choice between an index and a table scan.
- In some cases, MySQL can read rows directly from the index without consulting the data file. If all columns used from the index are numeric, only the index tree is needed to resolve the query.
- Before each row is output, rows that do not match the HAVING clause are skipped.
The following queries are very fast:
SELECT COUNT(*) FROM tbl_name;
SELECT MIN(key_part1),MAX(key_part1) FROM tbl_name;
SELECT MAX(key_part2) FROM tbl_name WHERE key_part1=constant;
SELECT ... FROM tbl_name ORDER BY key_part1,key_part2,... LIMIT 10;
SELECT ... FROM tbl_name ORDER BY key_part1 DESC, key_part2 DESC, ... LIMIT 10;
MySQL resolves the following queries using only the index tree, assuming that the indexed columns are numeric:
SELECT key_part1,key_part2 FROM tbl_name WHERE key_part1=val;
SELECT COUNT(*) FROM tbl_name WHERE key_part1=val1 AND key_part2=val2;
SELECT key_part2 FROM tbl_name GROUP BY key_part1;
The following queries use indexes to retrieve rows in sorted order without a separate sorting pass:
SELECT ... FROM tbl_name ORDER BY key_part1,key_part2,... ;
SELECT ... FROM tbl_name ORDER BY key_part1 DESC, key_part2 DESC, ... ;
7.2.5 How MySQL Optimizes OR Clauses
The Index Merge method retrieves rows with several ref, ref_or_null, or range scans and merges their results into one. This method is used when the table condition is a disjunction (OR) of conditions for which ref, ref_or_null, or range could be used with different keys.
This "join" type optimization is new in MySQL 5.0.0 and represents a significant change in optimizer behavior with regard to indexes, because under the old rule the server could use at most one index for each referenced table.
In EXPLAIN output, this method appears as index_merge in the type column. In this case, the key column contains a list of the indexes used, and the key_len column contains a list of the longest key parts for those indexes.
For example:
SELECT * FROM tbl_name WHERE key_part1 = 10 OR key_part2 = 20;
SELECT * FROM tbl_name WHERE (key_part1 = 10 OR key_part2 = 20) AND non_key_part=30;
SELECT * FROM t1,t2 WHERE (t1.key1 IN (1,2) OR t1.key2 LIKE 'value%') AND t2.key1=t1.some_col;
SELECT * FROM t1,t2 WHERE t1.key1=1 AND (t2.key1=t1.some_col OR t2.key2=t1.some_col2);
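To see whether Index Merge is chosen for a query of this shape, EXPLAIN can be run against a small test table. The sketch below assumes a hypothetical table t_merge with two separate single-column indexes; whether the optimizer actually picks index_merge depends on the server version and on the data in the table.

```sql
-- Hypothetical test table with two independent indexes.
CREATE TABLE t_merge (
  id   INT NOT NULL PRIMARY KEY,
  key1 INT,
  key2 INT,
  INDEX idx_key1 (key1),
  INDEX idx_key2 (key2)
) ENGINE=MyISAM;

-- When Index Merge is used, the type column shows index_merge and
-- the key column lists both idx_key1 and idx_key2.
EXPLAIN SELECT * FROM t_merge WHERE key1 = 10 OR key2 = 20;
```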
7.2.6 How MySQL Optimizes IS NULL
MySQL can perform the same optimization on col_name IS NULL that it uses for col_name = constant_value. For example, MySQL can use indexes and ranges to search for NULL with IS NULL.
SELECT * FROM tbl_name WHERE key_col IS NULL;
SELECT * FROM tbl_name WHERE key_col <=> NULL;
SELECT * FROM tbl_name WHERE key_col=const1 OR key_col=const2 OR key_col IS NULL;
If a WHERE clause includes a col_name IS NULL condition for a column that is declared NOT NULL, that expression is optimized away. This optimization does not occur when the column might produce NULL anyway, for example, when it comes from a table on the right side of a LEFT JOIN.
MySQL 4.1.1 and later can also optimize the combination col_name = expr OR col_name IS NULL, a form that is common in resolved subqueries. EXPLAIN shows ref_or_null when this optimization is used.
This optimization can handle one IS NULL for any key part.
The following queries are examples that are optimized, assuming that there is an index on columns a and b of table t2:
SELECT * FROM t1 WHERE t1.a=expr OR t1.a IS NULL;
SELECT * FROM t1,t2 WHERE t1.a=t2.a OR t2.a IS NULL;
SELECT * FROM t1,t2 WHERE (t1.a=t2.a OR t2.a IS NULL) AND t2.b=t1.b;
SELECT * FROM t1,t2 WHERE t1.a=t2.a AND (t2.b=t1.b OR t2.b IS NULL);
SELECT * FROM t1,t2 WHERE (t1.a=t2.a AND t2.a IS NULL AND ...) OR (t1.a=t2.a AND t2.a IS NULL AND ...);
ref_or_null works by first doing a read on the reference key and then a separate search for rows with a NULL key value.
Note that the optimization can handle only one IS NULL level. In the following query, MySQL uses key lookups only on the expression (t1.a=t2.a AND t2.a IS NULL) and cannot use the key part on b:
SELECT * FROM t1,t2 WHERE (t1.a=t2.a AND t2.a IS NULL) OR (t1.b=t2.b AND t2.b IS NULL);
7.2.7 How MySQL Optimizes DISTINCT
In many cases, DISTINCT combined with ORDER BY requires a temporary table.
Note that because DISTINCT may use GROUP BY, you should be aware of how MySQL handles columns in ORDER BY or HAVING clauses that are not part of the selected columns. For details, see "13.9.3 GROUP BY with Hidden Fields".
When LIMIT row_count is combined with DISTINCT, MySQL stops as soon as it finds row_count distinct rows.
If you do not use columns from all tables named in a query, MySQL stops scanning the unused tables as soon as it finds the first match. In the following example, assuming that t1 is used before t2 (which you can check with EXPLAIN), MySQL stops reading from t2 (for any particular row in t1) when it finds the first matching row in t2:
SELECT DISTINCT t1.a FROM t1,t2 WHERE t1.a=t2.a;
7.2.8 How MySQL Optimizes LEFT JOIN and RIGHT JOIN
A LEFT JOIN B join_condition is implemented in MySQL as follows:
- Table B is set to depend on table A and on all tables on which A depends.
- Table A is set to depend on all tables (except B) that are used in the LEFT JOIN condition.
- The LEFT JOIN condition is used to decide how to retrieve rows from table B. (In other words, any condition in the WHERE clause is not used.)
- All standard join optimizations are performed, except that a table is always read after all tables on which it depends. If there is a circular dependency, MySQL issues an error.
- All standard WHERE optimizations are performed.
- If there is a row in A that matches the WHERE clause but no row in B that matches the ON condition, an extra B row is generated with all columns set to NULL.
- If you use LEFT JOIN to find rows that do not exist in some table, and the WHERE clause contains the test col_name IS NULL where col_name is a column declared NOT NULL, MySQL stops searching for more rows (for a particular key combination) after it has found one row that matches the LEFT JOIN condition.
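The last case in the list above is the common "find rows with no match" pattern. The sketch below uses hypothetical tables customers and orders, where orders.id is declared NOT NULL and orders.customer_id is indexed.

```sql
-- Customers that have no orders at all. Because orders.id is NOT NULL,
-- it can only be NULL in the extra row generated for a missing match,
-- so MySQL can stop scanning orders for a customer after the first hit.
SELECT customers.id
FROM customers
LEFT JOIN orders ON orders.customer_id = customers.id
WHERE orders.id IS NULL;
```

Running EXPLAIN on such a query typically shows "Not exists" in the Extra column when this optimization applies.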
The implementation of RIGHT JOIN is analogous to that of LEFT JOIN, with the roles of the tables reversed.
The join optimizer calculates the order in which tables should be joined. The table read order forced by LEFT JOIN and STRAIGHT_JOIN helps the join optimizer do its work much more quickly, because there are fewer table permutations to check. Note that this means that if you run a query of the following type, MySQL does a full table scan on b, because the LEFT JOIN forces it to be read before d:
SELECT * FROM a,b LEFT JOIN c ON (c.key=a.key) LEFT JOIN d ON (d.key=a.key) WHERE b.key=d.key;
To fix this, rewrite the query with the order of a and b reversed in the FROM clause:
SELECT * FROM b,a LEFT JOIN c ON (c.key=a.key) LEFT JOIN d ON (d.key=a.key) WHERE b.key=d.key;
Starting from 4.0.14, MySQL performs the following LEFT JOIN optimization: if the WHERE condition is always false for the generated NULL row, the LEFT JOIN is changed to a normal join.
For example, in the following query the WHERE clause would be false if t2.column1 were NULL:
SELECT * FROM t1 LEFT JOIN t2 ON (column1) WHERE t2.column2=5;
Therefore, it is safe to convert the query to a normal join:
SELECT * FROM t1,t2 WHERE t2.column2=5 AND t1.column1=t2.column1;
This can be faster, because MySQL can now use table t2 before table t1 if that results in a better query plan. To force a specific table order, use STRAIGHT_JOIN.