MySQL Query Optimization Lecture Series: The Query Optimizer (1)

When you submit a query, MySQL analyzes it to see whether any optimizations can be applied to process it faster. This section describes how the query optimizer works. For details on the specific optimization methods MySQL uses, see the MySQL Reference Manual.

The MySQL query optimizer takes advantage of indexes, of course, but it also uses other information. For example, if you submit the following query, MySQL executes it very quickly no matter how large the table is:

SELECT * FROM tbl_name WHERE 0;

In this example, MySQL looks at the WHERE clause, recognizes that no rows can possibly satisfy the condition, and does not bother to search the table at all. You can see this by issuing an EXPLAIN statement, which tells MySQL to display information about how it would execute a SELECT query without actually executing it. To use EXPLAIN, just put the word EXPLAIN in front of the SELECT statement:

mysql> EXPLAIN SELECT * FROM tbl_name WHERE 0\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: NULL
type: NULL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: NULL
Extra: Impossible WHERE

Normally, EXPLAIN returns more information than this, including non-NULL values describing the indexes that will be used to scan tables, the type of join that will be used, and an estimate of the number of rows that must be examined in each table.

How the optimizer works

The MySQL query optimizer has several goals, but its primary aim is to use indexes whenever possible and to use the most restrictive index in order to eliminate as many rows as possible as early as possible. That may sound backward, because your goal in submitting a SELECT statement is to find rows, not to exclude them. The optimizer works this way because the faster it can eliminate rows from consideration, the faster it finds the rows that do match your criteria. Queries can be processed more quickly if the most restrictive tests are done first. Suppose that you have a query that tests two columns, each of which has an index:

SELECT col3 FROM mytable
WHERE col1 = 'some value' AND col2 = 'some other value';

Suppose that the test on col1 matches 900 rows, the test on col2 matches 300 rows, and both tests together succeed on only 30 rows. If the optimizer tests col1 first, it must examine 900 rows to find the 30 that also match the col2 value, so 870 of those checks fail. If it tests col2 first, it must examine only 300 rows to find the 30 that also match the col1 value, so only 270 checks fail. That requires less computation and less disk I/O. As a result, the optimizer tests col2 first because doing so is less expensive.
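To see where numbers like these come from on your own data, you can count how many rows each test matches by itself and how many match both. The following is only a sketch: it reuses the mytable, col1, and col2 names from the query above, the output column aliases are illustrative, and it relies on MySQL evaluating a comparison to 1 or 0 so that SUM() counts the matching rows.

```sql
-- Count rows matched by each condition alone and by both together.
-- In MySQL a comparison yields 1 (true), 0 (false), or NULL,
-- so SUM() adds up the rows for which the comparison is true.
SELECT
  SUM(col1 = 'some value')                               AS col1_matches,
  SUM(col2 = 'some other value')                         AS col2_matches,
  SUM(col1 = 'some value' AND col2 = 'some other value') AS both_match
FROM mytable;

-- Ask the optimizer which index it would choose for the real query.
EXPLAIN SELECT col3 FROM mytable
WHERE col1 = 'some value' AND col2 = 'some other value';
```

The key column of the EXPLAIN output names the index the optimizer chose. Keep in mind that the optimizer works from estimates rather than exact counts, so its choice may not always agree with the figures from the first query.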

You can help the optimizer take advantage of indexes by following these guidelines:

Try to compare columns that have the same data type. When you use indexed columns in comparisons, use columns that are of the same type. Comparisons between identical types perform somewhat better than comparisons between dissimilar types. For example, INT is different from BIGINT. CHAR(10) is considered the same as CHAR(10) or VARCHAR(10), but different from CHAR(12) or VARCHAR(12). If the columns you are comparing have different types, you can use ALTER TABLE to modify one of them so that the types match.
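As a sketch of the ALTER TABLE fix, assume two hypothetical tables, orders and customers (these names and columns are not from the original text), joined on a customer id that is INT in one table and BIGINT in the other. Changing one column so that both are BIGINT gives the comparison matching types:

```sql
-- Hypothetical schema: customers.id is BIGINT, orders.customer_id is INT.
-- Widen orders.customer_id so both sides of the join have the same type.
ALTER TABLE orders MODIFY customer_id BIGINT NOT NULL;

-- The join now compares two BIGINT columns.
SELECT o.*
FROM orders AS o
JOIN customers AS c ON o.customer_id = c.id;
```

MODIFY restates the full column definition, so repeat any attributes the column already has, such as NOT NULL or a DEFAULT value.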

Try to make indexed columns stand alone in comparison expressions. If you use a column in a function call or as part of a more complex term in an arithmetic expression, MySQL cannot use the index because it must compute the value of the expression for every row. Sometimes this is unavoidable, but in many cases you can rewrite a query so that the indexed column appears by itself.

The following WHERE clauses illustrate this. They are equivalent in effect, but they differ considerably as far as optimization is concerned:

WHERE mycol < 4 / 2
WHERE mycol * 2 < 4

For the first line, the optimizer simplifies the expression 4/2 to the value 2, and then uses an index on mycol to quickly find values less than 2. For the second expression, MySQL must retrieve the value of mycol for each row, multiply it by 2, and then compare the result to 4. In this case, no index can be used, because each value in the column must be retrieved so that the expression on the left side of the comparison can be evaluated.

Let's consider another example. Suppose that you have an indexed column date_col. If you issue a query such as the following, the index is not used:
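One way to check the difference yourself is to run EXPLAIN on both forms. The sketch below assumes a hypothetical table mytbl_demo with an index named idx_mycol; neither name comes from the original text.

```sql
-- Hypothetical table with an index on mycol.
CREATE TABLE mytbl_demo (
  id    INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  mycol INT NOT NULL,
  note  VARCHAR(20),
  INDEX idx_mycol (mycol)
);

-- Column stands alone: the optimizer can consider a range scan on idx_mycol.
EXPLAIN SELECT * FROM mytbl_demo WHERE mycol < 4 / 2;

-- Column is inside an expression: idx_mycol cannot be used to narrow the search.
EXPLAIN SELECT * FROM mytbl_demo WHERE mycol * 2 < 4;
```

Compare the possible_keys and key columns of the two results. On an empty or very small table the optimizer may prefer a full scan even for the first form, so load a realistic amount of data before drawing conclusions.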

SELECT * FROM mytbl WHERE YEAR(date_col) < 1990;

The expression does not compare 1990 to the indexed column itself. It compares 1990 to a value calculated from the column, and that value must be computed for each row. As a result, the index on date_col is not used and the query requires a full table scan. What is the fix? Just use a literal date, and then the index on date_col can be used to find matching values in the column:

WHERE date_col < '1990-01-01'

But suppose that you do not have a specific date. You might be interested instead in finding records that have a date lying within a certain number of days from today. There are several ways to express a comparison of this type, and they are not equally efficient. Here are three possibilities:
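Written out in full, the rewritten query might look like the first statement below. The second statement is an extension that does not appear in the original text: it applies the same idea to a test for a single year, replacing YEAR(date_col) = 1990 with a half-open date range so that the column still stands alone.

```sql
-- Rewrite of YEAR(date_col) < 1990 with the column standing alone.
SELECT * FROM mytbl WHERE date_col < '1990-01-01';

-- Same idea for one year: a half-open range instead of YEAR(date_col) = 1990.
SELECT * FROM mytbl
WHERE date_col >= '1990-01-01' AND date_col < '1991-01-01';
```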

WHERE TO_DAYS(date_col) - TO_DAYS(CURDATE()) < cutoff
WHERE TO_DAYS(date_col) < cutoff + TO_DAYS(CURDATE())
WHERE date_col < DATE_ADD(CURDATE(), INTERVAL cutoff DAY)

For the first line, no index is used because the column must be retrieved for each row so that the value of TO_DAYS(date_col) can be computed. The second line is better. Both cutoff and TO_DAYS(CURDATE()) are constants, so the right side of the comparison can be calculated by the optimizer once before processing the query, rather than once per row. However, the date_col column still appears in a function call, which prevents use of the index. The third line is the best of the three. Again, the right side of the comparison can be computed once as a constant before executing the query, but now the value is a date. That value can be compared directly to date_col values, which no longer need to be converted to days. In this case, the index can be used.
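As a sketch, the third form can be tried as a complete statement. The value 30 stands in for cutoff and is only an assumption for illustration; mytbl and date_col are the names used in the earlier example.

```sql
-- 30 stands in for cutoff; the value is an assumption for illustration.
EXPLAIN SELECT * FROM mytbl
WHERE date_col < DATE_ADD(CURDATE(), INTERVAL 30 DAY);

-- Equivalent shorthand using date arithmetic with the + operator.
EXPLAIN SELECT * FROM mytbl
WHERE date_col < CURDATE() + INTERVAL 30 DAY;
```

In both statements date_col appears by itself on the left side, so the optimizer is free to consider the index on it.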
