When you issue a query that selects rows, MySQL analyzes it to see whether any optimizations can make it run faster. This article looks at how the query optimizer works. For more information, see "Getting Maximum Performance from MySQL" in the MySQL Reference Guide, which describes the various optimizations that MySQL performs. (The online MySQL reference guide at http://www.mysql.com/ is updated constantly.)
The MySQL query optimizer takes advantage of indexes, of course, but it also uses other information. For example, if you issue the following query, MySQL executes it very quickly no matter how large the table is:
SELECT * FROM tbl_name WHERE 1 = 0
In this case, MySQL looks at the WHERE clause, recognizes that no rows can possibly satisfy the query, and does not bother to search the table. You can see this by using the EXPLAIN statement, which tells MySQL to display information about how it would execute a SELECT query without actually executing it. To use EXPLAIN, put the word EXPLAIN in front of the SELECT statement, as shown below:
EXPLAIN SELECT * FROM tbl_name WHERE 1 = 0
[Figure: EXPLAIN output for the preceding query (image not available)]
Normally, EXPLAIN returns more information than this, including the indexes that will be used to scan the tables, the join types that will be used, and an estimate of the number of rows that will need to be scanned in each table.
1. How the Optimizer Works
The MySQL query optimizer has several goals, but its primary aim is to use indexes whenever possible and to use the most restrictive index to eliminate as many rows as possible as early as possible. This may sound backward, because the purpose of issuing a SELECT statement is to find rows, not to reject them. The optimizer works this way because the faster rows can be eliminated from consideration, the faster the rows that do match the criteria can be found. Queries can be processed more quickly if the most restrictive tests are performed first. Suppose a query tests two columns, each of which has an index:
WHERE col1 = "some value" AND col2 = "some other value"
Suppose also that the test on col1 matches 900 rows, the test on col2 matches 300 rows, and both tests succeed on 30 rows. If you test col1 first, you must examine 900 rows to find the 30 that also match the col2 value; 870 of those tests fail. If you test col2 first, you need to examine only 300 rows to find the 30 that also match the col1 value; only 270 tests fail, which means less computation and less disk I/O. Follow these guidelines to help the optimizer take advantage of indexes:
1. Compare columns that have the same type. When you use indexed columns in comparisons, use columns of the same type. For example, CHAR(10) is considered the same as CHAR(10) or VARCHAR(10), but different from CHAR(12) or VARCHAR(12). INT is different from BIGINT. Before MySQL 3.23, using columns of the same type was a requirement; otherwise, the indexes on the columns were not used. As of version 3.23 this is no longer strictly required, but columns of the same type still give better performance than columns of dissimilar types. If the two columns you are comparing have different types, you can use ALTER TABLE to modify one of them so that the types match.
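As a minimal sketch of the ALTER TABLE fix, assume two hypothetical tables, customers and orders, that are joined on a customer ID stored as BIGINT in one table and as INT in the other. The table and column names are illustrative assumptions, not part of the original article.

```sql
-- Hypothetical schema: the join columns start out with different types.
CREATE TABLE customers (
    id        BIGINT NOT NULL PRIMARY KEY,
    last_name VARCHAR(40) NOT NULL
);
CREATE TABLE orders (
    order_id    INT NOT NULL PRIMARY KEY,
    customer_id INT NOT NULL,
    order_date  DATE NOT NULL,
    INDEX (customer_id)
);

-- Change orders.customer_id to BIGINT so that it matches customers.id.
ALTER TABLE orders MODIFY customer_id BIGINT NOT NULL;

-- The join now compares two columns of the same type.
SELECT customers.last_name, orders.order_date
FROM customers, orders
WHERE customers.id = orders.customer_id;
```

Which of the two columns to change is a design choice; widening the narrower column, as here, avoids losing values that already exist in the wider one.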
2. Try to make indexed columns stand alone in comparisons. If you use a column in a function call or in an arithmetic expression, MySQL cannot use an index on it because it must compute the value of the expression for every row. Sometimes this is unavoidable, but often you can rewrite the query so that the indexed column appears by itself. The WHERE clauses below illustrate how this works. In the first line, the optimizer simplifies the expression 4/2 to the value 2 and then uses the index on my_col to quickly find values less than 2. In the second line, MySQL must retrieve the value of my_col for each row, multiply it by 2, and then compare the result to 4. No index can be used, because every value in the column must be retrieved so that the expression on the left side of the comparison can be evaluated:
WHERE my_col < 4/2
WHERE my_col * 2 < 4
Consider another example. Suppose there is an indexed column date_col. If you issue the following query, the index is not used:
SELECT * FROM my_tbl WHERE YEAR(date_col) < 1990
The expression does not compare the indexed column to 1990; it compares a value calculated from the column value, and that value must be computed for every row. As a result, the index on date_col cannot be used. What is the fix? Use a literal date instead, and the index on date_col will be used:
WHERE date_col < "1990-01-01"
Suppose, however, that you do not have a specific date in mind. You might instead want to find records whose date lies within a certain number of days of today. There are several ways to write such a query, but not all of them are equally good. Three possibilities are as follows:
[Figure: three alternative WHERE clauses for this comparison (image not available)]
In the first line, no index can be used, because the column must be retrieved for each row so that the value of TO_DAYS(date_col) can be computed. The second line is better. Both cutoff and TO_DAYS(CURRENT_DATE) are constants, so the optimizer can calculate the right side of the comparison once before processing the query, rather than once per row. However, date_col still appears in a function call, so the index is still not used. The third line is the best of the three. Again, the right side of the comparison can be computed once as a constant before the query is executed, but now the value is a date. That value can be compared directly with date_col values, which no longer need to be converted to days, so the index can be used.
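The figure with the three forms did not survive in this copy. The sketch below is a reconstruction that is consistent with the description above, not the original listing. The table definition and the cutoff value of 30 days are assumptions made so that the statements can be run as written.

```sql
-- Hypothetical table with an indexed DATE column.
CREATE TABLE my_tbl (
    id       INT NOT NULL PRIMARY KEY,
    date_col DATE NOT NULL,
    INDEX (date_col)
);

-- Form 1: the column is wrapped in TO_DAYS() and used in arithmetic,
-- so the left side must be computed for every row; no index is used.
SELECT * FROM my_tbl
WHERE TO_DAYS(date_col) - TO_DAYS(CURRENT_DATE) < 30;

-- Form 2: the right side is a constant that can be computed once,
-- but date_col is still inside a function call; no index is used.
SELECT * FROM my_tbl
WHERE TO_DAYS(date_col) < 30 + TO_DAYS(CURRENT_DATE);

-- Form 3: the right side is a constant date and date_col stands alone,
-- so the index on date_col can be used.
SELECT * FROM my_tbl
WHERE date_col < DATE_ADD(CURRENT_DATE, INTERVAL 30 DAY);
```

Prefixing each SELECT with EXPLAIN is one way to check which of the forms actually uses the index on a given server.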
■ Do not use wildcards at the beginning of a LIKE pattern. Sometimes people search for strings using a WHERE clause of the following form:
WHERE col_name LIKE "%string%"
That is the correct thing to do if you want to find string no matter where it occurs in the column. But do not put "%" on both sides of the string simply out of habit. If you are really looking for the string only when it occurs at the beginning of the column, leave out the first "%". For example, if you are looking in a column of last names for names that begin with "Mac", write the WHERE clause like this:
WHERE last_name LIKE "Mac%"
The optimizer looks at the literal initial part of the pattern and uses the index to find matching rows, as though you had written the following expression, which is in a form that allows the index on last_name to be used:
WHERE last_name >= "Mac" AND last_name < "Mad"
This optimization does not apply to pattern matches that use the REGEXP operator.
■ Help the optimizer make better estimates about index effectiveness. By default, when you compare values in indexed columns with a constant, the optimizer assumes that key values are distributed evenly within the index. The optimizer also does a quick check of the index to estimate how many entries will be used when deciding whether the index should be used for a constant comparison. You can give the optimizer better information by using the --analyze option of myisamchk or isamchk to analyze the distribution of key values. Use myisamchk for MyISAM tables and isamchk for ISAM tables. To perform key value analysis, you must be able to log on to the MySQL server host and you must have write access to the table files.
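As an aside that goes beyond the original text: later MySQL versions also provide an ANALYZE TABLE statement, which stores the key distribution for a table from a client connection and so does not require a login on the server host. The sketch below assumes a server that supports the statement and uses a hypothetical table.

```sql
-- Hypothetical table with an indexed column.
CREATE TABLE my_tbl (
    id        INT NOT NULL PRIMARY KEY,
    last_name VARCHAR(40) NOT NULL,
    INDEX (last_name)
);

-- Analyze and store the key distribution for the table.
ANALYZE TABLE my_tbl;

-- The Cardinality column of the output gives the estimated number
-- of distinct values in each index.
SHOW INDEX FROM my_tbl;
```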
■ Use EXPLAIN to verify optimizer operation. Check whether the indexes used for the query reject rows quickly. If they do not, you might try using STRAIGHT_JOIN to force a join to be done using tables in a particular order. How a query will be executed is not always obvious; MySQL may have good reasons not to use indexes in the order you think is best.
■ Test alternate forms of queries, and run them more than once. When you test alternate forms of a query, run each form several times. If you run a query only once in each of two different ways, the second is often faster simply because information from the first query is still in the disk cache and does not need to be read from disk. You should also try to run queries when the system load is relatively stable, to avoid effects caused by other activity on the system.
2. Overriding Optimization
It may sound odd, but there are situations in which you will want to defeat MySQL's optimization behavior:
To force MySQL to delete table contents slowly. When you need to empty a table completely, it is fastest to delete the entire table contents using a DELETE statement with no WHERE clause, as shown below:
DELETE FROM tbl_name
MySQL optimizes this special case of DELETE. It uses the table description in the table information file to create empty data and index files from scratch. This optimization makes the DELETE operation extremely fast, because MySQL does not need to delete each row individually. However, it has some side effects that may be undesirable in certain circumstances:
■ MySQL reports the number of rows affected as zero, even when the table was not empty. Often this does not matter (although it can be puzzling if you are not expecting it), but it is not appropriate for applications that really need to know the true number of rows.
■ If the table contains an AUTO_INCREMENT column, sequence numbering for the column starts over at 1. This is true even after the improvements to AUTO_INCREMENT handling that were made in MySQL 3.23. For a description of these improvements, see the section on using sequences in Chapter 2.
You can "unoptimize" a DELETE statement by adding a WHERE 1 > 0 clause:
DELETE FROM tbl_name WHERE 1 > 0
This forces MySQL to delete rows one at a time. The query executes much more slowly, but it returns the true number of rows deleted. It also preserves the current AUTO_INCREMENT sequence number, though only for MyISAM tables (available as of MySQL 3.23). For ISAM tables, the sequence is still reset.
To prevent an update loop that never terminates. If you update an indexed column, the update may never terminate if that column is used in the WHERE clause and the update moves the index value into the part of the range that has not yet been processed. Suppose the table my_tbl has an indexed integer column key_col. Queries such as the following can cause problems:
[Figure: UPDATE statement on my_tbl that can loop (image not available)]
The solution to this problem is to use key_col in an expression in the WHERE clause, so that MySQL cannot use the index:
[Figure: the same UPDATE statement with key_col used in an expression (image not available)]
Alternatively, you can upgrade to MySQL 3.23.2 or later, in which this problem has been fixed.
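The two statements shown in the figures are not available in this copy. The following is a sketch of what such a pair might look like, based only on the surrounding description; the specific SET expression and the WHERE condition are assumptions.

```sql
-- Hypothetical table matching the description: an indexed integer column.
CREATE TABLE my_tbl (
    key_col INT NOT NULL,
    INDEX (key_col)
);

-- Problem form (on servers older than 3.23.2): the index on key_col is
-- used to find rows, and each update moves a row further along the same
-- index range, where it can be found and updated again.
UPDATE my_tbl SET key_col = key_col + 1 WHERE key_col > 0;

-- Workaround: key_col appears inside an expression in the WHERE clause,
-- so the index is not used to locate the rows.
UPDATE my_tbl SET key_col = key_col + 1 WHERE key_col + 0 > 0;
```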
To retrieve results in random order. As of MySQL 3.23.3, you can use ORDER BY RAND() to sort results randomly. Another technique, which is useful for older versions of MySQL, is to select a column of random numbers and sort on that column. However, if you write the query as follows, the optimizer defeats your intent:
[Figure: query that selects a column of random numbers and sorts on it (image not available)]
The problem here is that MySQL sees that the column is a function call, concludes that its value will be a constant, and optimizes the ORDER BY clause right out of the query. You can fool the optimizer by referring to a table column in the expression. For example, if the table has a column named age, you can write the query as follows:
[Figure: the same query with the age column referenced in the random expression (image not available)]
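The original listings for the two queries are missing from this copy. The sketch below shows one way the technique described above might be written. The table, the column list, the alias rand_col, and the exact form of the expression are assumptions, not the author's text.

```sql
-- Hypothetical table with the age column mentioned above.
CREATE TABLE my_tbl (
    name VARCHAR(40) NOT NULL,
    age  INT NOT NULL
);

-- Older servers: the sort column is a bare function call, which the
-- optimizer may treat as a constant and drop from ORDER BY.
SELECT name, age, RAND() AS rand_col FROM my_tbl ORDER BY rand_col;

-- Workaround: age*0 contributes nothing to the value, but it makes the
-- expression depend on a table column.
SELECT name, age, age*0 + RAND() AS rand_col FROM my_tbl ORDER BY rand_col;

-- MySQL 3.23.3 and later: sort on RAND() directly.
SELECT name, age FROM my_tbl ORDER BY RAND();
```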
To override the optimizer's table join order. Use STRAIGHT_JOIN to force the optimizer to use tables in a particular order. If you do this, order the tables so that the first table is the one from which the smallest number of rows will be chosen. (If you are not sure which table this is, put the table with the greatest number of rows first.) In other words, try to order the tables so that the most restrictive selection comes first. The sooner candidate rows are eliminated, the faster the query executes. Be sure to test the query both with and without STRAIGHT_JOIN. There may be some reason the optimizer is not joining the tables the way you think it should, and STRAIGHT_JOIN may not actually help.
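A minimal sketch of such a comparison follows, assuming two hypothetical tables, small_tbl and big_tbl. The names, columns, and the literal value are illustrative only.

```sql
-- Hypothetical tables: a small lookup table and a large detail table.
CREATE TABLE small_tbl (
    id   INT NOT NULL PRIMARY KEY,
    name VARCHAR(40) NOT NULL
);
CREATE TABLE big_tbl (
    id       INT NOT NULL PRIMARY KEY,
    small_id INT NOT NULL,
    INDEX (small_id)
);

-- Without STRAIGHT_JOIN: the optimizer chooses the join order.
EXPLAIN SELECT small_tbl.name, big_tbl.id
FROM small_tbl, big_tbl
WHERE small_tbl.id = big_tbl.small_id AND small_tbl.name = 'Mac';

-- With STRAIGHT_JOIN: the tables are joined in the order they are
-- listed in the FROM clause, small_tbl first.
EXPLAIN SELECT STRAIGHT_JOIN small_tbl.name, big_tbl.id
FROM small_tbl, big_tbl
WHERE small_tbl.id = big_tbl.small_id AND small_tbl.name = 'Mac';
```

Comparing the two EXPLAIN results, and then timing both queries several times, shows whether forcing the order changes anything on a particular data set.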