The MySQL Query Optimizer



4.2 The MySQL Query Optimizer
When you issue a query that selects rows, MySQL analyzes it to see whether it can be optimized to run faster. In this section, we'll look at how the query optimizer works. For more information, refer to "Getting Maximum Performance from MySQL" in the MySQL Reference Manual, which describes the various optimizations MySQL uses.
The information in that chapter changes frequently because the MySQL developers keep improving the optimizer, so it is worth revisiting the chapter from time to time to see whether there are any new tricks to exploit. (The online MySQL Reference Manual at http://www.mysql.com/ is updated continually.)
The MySQL query optimizer takes advantage of indexes, of course, but it also uses other information. For example, if you issue the following query, MySQL executes it very quickly, no matter how large the table is:
SELECT * FROM tbl_name WHERE 1 = 0
In this case, MySQL examines the WHERE clause, realizes that no row can possibly satisfy the query, and does not bother to search the table. You can see this by using the EXPLAIN statement, which asks MySQL to display information about how it would execute a SELECT query without actually executing it. To use EXPLAIN, just put the word EXPLAIN in front of the SELECT statement, as follows:
EXPLAIN SELECT * FROM tbl_name WHERE 1 = 0

Typically, EXPLAIN returns more information than it does for this query, including the indexes that will be used to scan tables, the join types that will be used, and an estimate of the number of rows that need to be scanned in each table.
4.2.1 How the Optimizer Works
The MySQL query optimizer has several goals, but its primary aim is to use indexes whenever possible and to use the most restrictive index so that as many rows as possible are eliminated as early as possible. This may sound backward, because your goal in issuing a SELECT statement is to find rows, not to reject them. The optimizer works this way because the faster rows can be eliminated from consideration, the more quickly the rows that do match your criteria can be found. Queries can be processed more quickly if the most restrictive tests are performed first. Suppose you have a query that tests two columns, each of which has an index on it:
WHERE col1 = "some value" AND col2 = "some other value"
Suppose also that the test on col1 matches 900 rows, the test on col2 matches 300 rows, and both tests succeed on 30 rows. If you test col1 first, you must examine 900 rows to find the 30 that also match the col2 value; 870 of those tests fail. If you test col2 first, you examine only 300 rows to find the 30 that also match the col1 value; only 270 tests fail, which means less computation and less disk I/O. Follow these guidelines to help the optimizer take advantage of indexes:
Compare columns that have the same type. When you use indexed columns in comparisons, use columns of the same type. For example, CHAR(10) is considered the same as CHAR(10) or VARCHAR(10), but different from CHAR(12) or VARCHAR(12). INT is different from BIGINT. Before MySQL 3.23, the columns had to be of the same type, otherwise the indexes on them would not be used. Since version 3.23 this is not strictly necessary, but identical column types still give better performance than dissimilar types. If the two columns you are comparing have different types, you can use ALTER TABLE to modify one of them so that the types match.
Try to make indexed columns stand alone in comparison expressions. If you use a column in a function call or as part of an arithmetic expression, MySQL cannot use the index because it must compute the value of the expression for every row. Sometimes this is unavoidable, but in many cases you can rewrite a query so that the indexed column appears by itself. The following WHERE clauses illustrate how this works. In the first line, the optimizer simplifies the expression 4/2 to the value 2 and then uses an index on my_col to quickly find values less than 2. In the second line, MySQL must retrieve the value of my_col for each row, multiply it by 2, and compare the result with 4. No index can be used, because every value in the column must be retrieved so that the expression on the left side of the comparison can be evaluated:
WHERE my_col < 4/2
WHERE my_col * 2 < 4
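The difference between the two WHERE clauses above can be imitated outside MySQL. The following Python sketch is only a rough model, not MySQL's implementation: it assumes that an index behaves like a sorted list of column values, and the function names and sample values are made up for illustration. A bare comparison against a constant can be answered with one binary search, whereas an expression on the column has to be evaluated for every row.

```python
import bisect

def rows_below_using_index(sorted_values, limit):
    """Model of `my_col < limit`: one binary search finds the cut-off point."""
    cut = bisect.bisect_left(sorted_values, limit)
    return sorted_values[:cut]

def rows_below_full_scan(values, limit):
    """Model of `my_col * 2 < limit`: the expression is computed for every row."""
    evaluated = 0
    matches = []
    for v in values:
        evaluated += 1          # one expression evaluation per row
        if v * 2 < limit:
            matches.append(v)
    return matches, evaluated

# Hypothetical column values, kept sorted the way an index would keep them.
my_col_index = sorted([5, -3, 1, 0, 7, 2, 1, 9])

via_index = rows_below_using_index(my_col_index, 4 / 2)   # constant folded once
via_scan, evaluated = rows_below_full_scan(my_col_index, 4)
print(via_index, via_scan, evaluated)
```

Both functions return the same rows; the second one simply has to touch every value to get there.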
Let's consider another example. Suppose you have an indexed column date_col. If you issue a query such as the following, the index is not used:
SELECT * FROM my_tbl WHERE YEAR(date_col) < 1990
The expression does not compare the indexed column to 1990; it compares a value calculated from the column value, and that value must be computed for every row. As a result, the index on date_col cannot be used. What's the fix? Use a literal date, and the index on date_col will be used:
WHERE date_col < "1990-01-01"
But suppose you don't have a specific date. You might instead be interested in finding records that have a date lying within a certain number of days from today. There are several ways to write such a query, and not all of them are equally good. Three possibilities are as follows:
WHERE TO_DAYS(date_col) - TO_DAYS(CURRENT_DATE) < cutoff
WHERE TO_DAYS(date_col) < cutoff + TO_DAYS(CURRENT_DATE)
WHERE date_col < DATE_ADD(CURRENT_DATE, INTERVAL cutoff DAY)
For the first line, no index can be used because the column must be retrieved for every row so that the value of TO_DAYS(date_col) can be computed. The second line is better. Both cutoff and TO_DAYS(CURRENT_DATE) are constants, so the right side of the comparison can be calculated by the optimizer once before the query is processed, rather than once per row. However, the date_col column still appears in a function call, so the index is not used. The third line is best of all. Again, the right side of the comparison can be computed once as a constant before the query executes, but now the value is a date. That value can be compared directly to date_col values, which no longer need to be converted to days, so the index can be used.
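To see that the three forms select the same rows and differ only in where the work is done, here is a small Python sketch. It is an illustration under assumptions, not MySQL code: date.toordinal() stands in for TO_DAYS() (both are day counts, so the different starting points cancel out), and the fixed value of today, the cutoff, and the sample dates are hypothetical.

```python
from datetime import date, timedelta

def filter_form1(dates, today, cutoff):
    # The column value is converted and the subtraction is done for every row.
    return [d for d in dates if d.toordinal() - today.toordinal() < cutoff]

def filter_form2(dates, today, cutoff):
    limit_days = cutoff + today.toordinal()      # right side computed once
    # The column value is still passed through a function for every row.
    return [d for d in dates if d.toordinal() < limit_days]

def filter_form3(dates, today, cutoff):
    limit_date = today + timedelta(days=cutoff)  # computed once, and it is a date
    # The column value is compared as is, which is what lets an index be used.
    return [d for d in dates if d < limit_date]

today = date(2000, 1, 15)      # hypothetical stand-in for CURRENT_DATE
cutoff = 30
date_col = [date(1999, 12, 31), date(2000, 2, 13), date(2000, 2, 14), date(2000, 3, 1)]

r1 = filter_form1(date_col, today, cutoff)
r2 = filter_form2(date_col, today, cutoff)
r3 = filter_form3(date_col, today, cutoff)
print(r1 == r2 == r3, r3)
```

Only the third function leaves the column value untouched on the left side of the comparison, which mirrors the form recommended above.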
Don't use wildcards at the beginning of a LIKE pattern. Sometimes people search for strings using a WHERE clause of the following form:
WHERE col_name LIKE "%string%"
That is the correct thing to do if you want to find string no matter where it occurs in the column. But don't put "%" on both sides of the string simply out of habit. If you are really looking for the string only when it occurs at the beginning of the column, leave out the first "%". For example, to look for last names that begin with "Mac" in a column that contains last names, write the WHERE clause as follows:
WHERE last_name LIKE "Mac%"
The optimizer looks at the literal initial part of the pattern and uses the index to find rows that match it, as though you had written the following expression, which is in a form that allows the index on last_name to be used:
WHERE last_name >= "Mac" AND last_name < "Mad"
This optimization does not apply to pattern matches that use the REGEXP operator.
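The step from the prefix "Mac" to the range bounds "Mac" and "Mad" can be sketched in a few lines of Python. This is an illustration only: prefix_range is a hypothetical helper, the sample names are invented, and Python compares strings by code point and is case sensitive, whereas MySQL's string comparison depends on the column type and character set.

```python
def prefix_range(prefix):
    """Return (low, high) so that low <= s < high holds for strings starting with prefix.

    The upper bound is the prefix with its last character replaced by the next
    code point. The sketch does not handle a prefix that ends in the highest
    possible code point.
    """
    if not prefix:
        raise ValueError("prefix must not be empty")
    high = prefix[:-1] + chr(ord(prefix[-1]) + 1)
    return prefix, high

last_names = ["MacDonald", "Mack", "Madison", "Mabry", "Mac", "mac"]  # made-up data

low, high = prefix_range("Mac")
in_range = [n for n in last_names if low <= n < high]
by_prefix = [n for n in last_names if n.startswith("Mac")]
print(low, high, in_range)
```

Because both bounds are constants, the range test can be answered from a sorted index without examining names that fall outside it.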
Help the optimizer make better estimates about index effectiveness. By default, when you compare values in indexed columns to a constant, the optimizer assumes that key values are distributed evenly within the index. The optimizer also does a quick check of the index to estimate how many entries will be used when determining whether the index should be used for constant comparisons. You can supply the optimizer with better information by using the --analyze option of myisamchk or isamchk to analyze the key value distribution. myisamchk is used for MyISAM tables and isamchk for ISAM tables. To perform key value analysis, you must be able to log in to the MySQL server host and you must have write access to the table files.
Use EXPLAIN to verify optimizer operation. Check whether the indexes used by your query reject rows quickly. If not, you might try using STRAIGHT_JOIN to force a join to be done using tables in a particular order. How a query will be executed is not always obvious; MySQL may have good reasons not to use indexes in the order you think is best.
Test alternative forms of queries, and run them more than once. When you test alternative forms of a query, run each form several times. If you run each of two forms only once, you will often find that the second query is faster simply because information from the first query is still in the disk cache and does not actually need to be read from disk. You should also try to run queries when the system load is relatively stable, to avoid effects due to other activity on the system.
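One way to follow the last guideline is to time each form of a query several times and look at the first run separately from the later ones. The sketch below is a minimal harness under stated assumptions: run_query is a hypothetical callable that sends one query through whatever client library you use, and the lambda at the bottom is only a placeholder workload so that the example runs on its own.

```python
import time

def time_runs(run_query, repeats=5):
    """Call run_query `repeats` times and return the elapsed seconds of each call."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_query()
        timings.append(time.perf_counter() - start)
    return timings

def summarize(timings):
    """Report the first run (possibly served from a cold cache) apart from the rest."""
    later = timings[1:]
    return {"first": timings[0], "best_later": min(later) if later else None}

# Placeholder workload; replace with a function that actually issues the query.
timings = time_runs(lambda: sum(range(100000)), repeats=5)
print(summarize(timings))
```

Comparing the later runs of each form gives a fairer picture than comparing a single run of one form with a single run of the other.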
4.2.2 Overriding Optimization
It may sound odd, but there are times when you will want to defeat MySQL's optimization behavior:
To force MySQL to delete table contents slowly. When you need to empty a table completely, it is fastest to delete its entire contents with a DELETE statement that has no WHERE clause:
DELETE FROM tbl_name
MySQL optimizes this special case of DELETE: it uses the table description in the table information file to create empty data and index files from scratch. This optimization makes the DELETE operation extremely fast because MySQL does not need to delete each row individually. However, it has some effects that may be undesirable under certain circumstances:
MySQL reports the number of rows affected as zero, even when the table was not empty. Most of the time that doesn't matter (although it can be puzzling if you don't expect it), but it is not appropriate for applications that really need to know the true number of rows deleted.
If the table contains an AUTO_INCREMENT column, sequence numbering for the column starts over at 1. This is true even after the improvements in AUTO_INCREMENT handling made in MySQL 3.23. For an introduction to those improvements, see the "Use Sequence" section in Chapter 2. You can "unoptimize" a DELETE statement by adding a WHERE 1 > 0 clause:
DELETE FROM tbl_name WHERE 1 > 0
This forces MySQL to do a row-by-row deletion. The query executes much more slowly, but it returns the true number of rows deleted. It also preserves the current AUTO_INCREMENT sequence number, although only for MyISAM tables (which are available as of MySQL 3.23). For ISAM tables, the sequence is still reset.
To avoid an update loop that never terminates. If you update a column that is indexed, it is possible for rows to be updated endlessly if the column is used in the WHERE clause and the update moves the index value into the part of the range that has not been processed yet. Suppose the table my_tbl has an indexed integer column key_col. A query such as the following can cause problems:
UPDATE my_tbl SET key_col = key_col+1 WHERE key_col > 0
The solution is to use key_col in an expression in the WHERE clause so that MySQL cannot use the index:
UPDATE my_tbl SET key_col = key_col+1 WHERE key_col+0 > 0
Actually, there is another way around the problem: upgrade to MySQL 3.23.2 or later, in which this problem has been fixed.
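Why an update driven by the index on key_col can revisit rows is easier to see in a toy model. The Python sketch below is a deliberately simplified simulation, not a description of MySQL's storage code: it assumes the index is a sorted list of (key, row) pairs, that the scan walks it from low to high, and that after each update the scan resumes at the first entry greater than the one just processed. The function names and the max_updates safety limit are inventions for the example.

```python
import bisect

def update_scanning_index(keys, max_updates):
    """Add 1 to every key > 0 while walking the sorted index from low to high.

    A row whose key is increased is re-inserted ahead of the scan position,
    so the scan finds it again. max_updates stops the otherwise endless loop.
    """
    index = sorted((key, row) for row, key in enumerate(keys))
    times_updated = [0] * len(keys)
    pos = bisect.bisect_left(index, (1, -1))          # first entry with an integer key > 0
    done = 0
    while pos < len(index) and done < max_updates:
        key, row = index.pop(pos)
        bisect.insort(index, (key + 1, row))          # updated entry moves forward
        times_updated[row] += 1
        done += 1
        pos = bisect.bisect_right(index, (key, row))  # resume after the old entry
    return times_updated

def update_scanning_rows(keys):
    """Model of the key_col+0 form: every row is visited exactly once, in table order."""
    new_keys = list(keys)
    times_updated = [0] * len(keys)
    for row, key in enumerate(keys):
        if key + 0 > 0:
            new_keys[row] = key + 1
            times_updated[row] += 1
    return new_keys, times_updated

looping = update_scanning_index([1, 2, 3], max_updates=6)
new_keys, once_each = update_scanning_rows([1, 2, 3])
print(looping, new_keys, once_each)
```

In the first function the only thing that ends the loop is the artificial limit; in the second, each qualifying row is updated once and the scan ends when the rows run out.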
To retrieve results in random order. As of MySQL 3.23.3, you can sort results randomly by using ORDER BY RAND(). Another technique, useful for older versions of MySQL, is to select a column of random numbers and then sort on that column. However, if you write the query as follows, the optimizer defeats your intent:
SELECT ..., RAND() AS rand_col FROM ... ORDER BY rand_col
The problem is that MySQL sees that the column is a function call, concludes that its value is a constant, and optimizes the ORDER BY clause out of the query. You can fool the optimizer by referring to a table column in the expression. For example, if your table has a column named age, you can write the query as follows:
SELECT ..., age*0+RAND() AS rand_col FROM ... ORDER BY rand_col
To override the optimizer's table join order. Use STRAIGHT_JOIN to force the optimizer to use tables in a particular order. If you do this, you should order the tables so that the first table is the one from which the smallest number of rows will be chosen. (If you are not sure which table this is, put the table with the most rows first.) In other words, try to order the tables so that the most restrictive selection comes first. The earlier you can narrow the set of candidate rows, the faster the query executes. Be sure to test the query both ways; there may be some reason the optimizer does not join the tables the way you think it should, and STRAIGHT_JOIN may not actually help.
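The effect of putting the most restrictive selection first can be illustrated with a toy nested-loop join in Python. Everything here is hypothetical: the customers and orders tables, their columns, and the row counts are invented, and the dictionary-based index and lookup counter are a simplification rather than a model of how MySQL actually executes joins.

```python
from collections import defaultdict

def build_index(rows, column):
    """Map each value of `column` to the rows that contain it."""
    index = defaultdict(list)
    for row in rows:
        index[row[column]].append(row)
    return index

def join(first_rows, first_col, second_index, keep):
    """Nested-loop join: one index lookup in the second table per row of the first."""
    lookups = 0
    matched = 0
    for a in first_rows:
        lookups += 1
        for b in second_index.get(a[first_col], []):
            if keep(a, b):
                matched += 1
    return matched, lookups

# 100 customers, 10 of them in the "north" region; 10 orders per customer.
customers = [{"id": i, "region": "north" if i % 10 == 0 else "south"} for i in range(100)]
orders = [{"order_id": n, "customer_id": n % 100} for n in range(1000)]

# Restrictive selection first: only the "north" customers drive the lookups.
north = [c for c in customers if c["region"] == "north"]
small_first = join(north, "id", build_index(orders, "customer_id"), lambda c, o: True)

# Large table first: every order drives a lookup, and most are rejected afterward.
large_first = join(orders, "customer_id", build_index(customers, "id"),
                   lambda o, c: c["region"] == "north")

print(small_first, large_first)
```

Both orders produce the same number of joined rows, but the second performs a lookup for every order before discarding most of them, which is the kind of wasted work a good join order avoids.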


