<High Performance MySQL> Note 2


When reprinting, please credit: TheViper http://www.cnblogs.com/TheViper

This article covers MySQL query optimization.

  • Is the query requesting data it does not need?

Typical cases: fetching rows that are not needed, returning all columns when joining multiple tables, always selecting every column, and querying the same data repeatedly.

  • Is the query scanning more rows than it needs?

The simplest measures of query cost are response time, the number of rows scanned, and the number of rows returned.

  •  Access type

When evaluating query cost, consider what it costs to find a single row in the table. MySQL has many ways to find and return a row. Some access methods scan many rows to return one row; others return the result without scanning at all.

In EXPLAIN output, the type column shows the access type. Access types range from full table scan through index scan, range scan, and unique index lookup to constant reference. This list runs from slowest to fastest, and the number of rows scanned goes from most to fewest.

Therefore, avoid statements that scan the entire table wherever possible.

If a query does not get a good access type, the best fix is usually to add a suitable index, as discussed in the previous article. An index lets MySQL find the required rows in the most efficient way, scanning the fewest rows.
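The effect of adding an index on the access type can be sketched as follows. This is only an illustration: it uses SQLite through Python's sqlite3 module so that it runs standalone, and SQLite's EXPLAIN QUERY PLAN is a rough analogue of the type column in MySQL's EXPLAIN, not the same output. The actor table and index name are assumptions.

```python
import sqlite3

def plan(conn, sql, params=()):
    """Return the plan descriptions SQLite reports for a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]  # the last column holds the description

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE actor ("
    " actor_id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT)"
)
conn.executemany(
    "INSERT INTO actor (first_name, last_name) VALUES (?, ?)",
    [("F%d" % i, "L%d" % (i % 50)) for i in range(1000)],
)

query = "SELECT * FROM actor WHERE last_name = ?"

# Without an index on last_name, every row has to be examined.
before = plan(conn, query, ("L7",))

# With a suitable index, the engine looks up matching rows directly.
conn.execute("CREATE INDEX idx_actor_last_name ON actor (last_name)")
after = plan(conn, query, ("L7",))

print("before:", before)
print("after: ", after)
```

In MySQL the same experiment would be run with EXPLAIN, where the type column would be expected to change from ALL to ref once the index exists.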

In general, MySQL can apply a WHERE condition in three ways, from best to worst:

  • Apply the condition in the index to filter out non-matching records. This happens in the storage engine layer.
  • Use a covering index scan (Using index appears in the Extra column) to filter out unwanted records directly from the index and return the hits. This happens in the MySQL server layer, but no lookup back to the table is needed.
  • Read rows from the table and then filter out those that do not match (Using where appears in the Extra column). This happens in the MySQL server layer; MySQL has to read each record from the table before it can filter it.

If a query scans a large amount of data but returns only a few rows, the following techniques can help:

  • Use a covering index: put all the columns the query needs in the index, so the storage engine can return results without going back to the table for the rows.
  • Change the table structure, for example by using a separate summary table.
  • Rewrite the complex query so that the MySQL optimizer can execute it more efficiently.
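The covering index idea from the first item can be sketched like this. SQLite (via Python's sqlite3) stands in for MySQL so the example is runnable; its plan text says COVERING INDEX where MySQL's EXPLAIN would show Using index in the Extra column. The table, the last_update column, and the index name are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE actor ("
    " actor_id INTEGER PRIMARY KEY, first_name TEXT,"
    " last_name TEXT, last_update TEXT)"
)
conn.execute("CREATE INDEX idx_last_first ON actor (last_name, first_name)")

def plan(sql):
    """Join the plan descriptions for a query into one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)

# Every selected column is in the index, so the table itself is never read.
covered = plan(
    "SELECT last_name, first_name FROM actor WHERE last_name = 'ALLEN'"
)

# last_update is not in the index, so each hit needs a lookup in the table.
not_covered = plan("SELECT * FROM actor WHERE last_name = 'ALLEN'")

print(covered)
print(not_covered)
```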

 

  • Ways to restructure queries

One complex query or several simple queries?

Traditional designs emphasize doing as much work as possible in the database layer. The reasoning is that network communication, query parsing, and optimization have always been considered expensive.

However, this reasoning does not apply well to MySQL. MySQL is designed so that connecting and disconnecting are lightweight, and it returns small query results efficiently. Networks are also much faster than they used to be, in both bandwidth and latency. Some versions of MySQL can run more than 100,000 queries per second even on commodity servers, and a client on a gigabit NIC can easily sustain more than 2,000 queries per second.

 

Splitting a query

This is divide and conquer: one large query is split into small queries that each do the same thing and each handle only a small part of the result.

Purging old data is a good example. If one large statement periodically cleans up a large amount of data, it may lock many rows at once, fill the transaction log, exhaust system resources, and block many small but important queries.

Therefore, split a large DELETE into many smaller ones, each removing a limited batch of rows, and repeat until nothing is left to delete.
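A minimal sketch of such a batched purge, assuming a hypothetical messages table with an id and a created column. It runs against SQLite through Python's sqlite3 module so that it is self-contained; in MySQL the batch could be written more directly as DELETE ... WHERE created < ... LIMIT n. The batch size and pause are illustrative values only.

```python
import sqlite3
import time

def purge_old_rows(conn, cutoff, batch_size=1000, pause=0.0):
    """Delete rows older than `cutoff` in small batches; return the batch count."""
    batches = 0
    while True:
        cur = conn.execute(
            "DELETE FROM messages WHERE id IN ("
            " SELECT id FROM messages WHERE created < ? LIMIT ?)",
            (cutoff, batch_size),
        )
        conn.commit()          # one short transaction per batch
        if cur.rowcount == 0:  # nothing left to delete
            break
        batches += 1
        time.sleep(pause)      # give other queries a chance to run
    return batches

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages (id INTEGER PRIMARY KEY, created INTEGER)")
conn.executemany(
    "INSERT INTO messages (created) VALUES (?)", [(i,) for i in range(5000)]
)
conn.commit()

batches = purge_old_rows(conn, cutoff=3500, batch_size=1000)
remaining = conn.execute(
    "SELECT COUNT(*), MIN(created) FROM messages"
).fetchone()
print(batches, remaining)
```

Each batch holds its locks only briefly, and the pause between batches spreads the load over time.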

 

 

 

Join decomposition

Simply put, run a single-table query against each table and combine the results in the application, so that one multi-table join is replaced by several single-table queries.

At first glance this brings no benefit. In fact it has several advantages: caching individual results becomes more effective, each query holds locks for less time, the tables can be placed on different servers, and the application avoids fetching the same row repeatedly.
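A sketch of join decomposition under assumed tag, tag_post, and post tables (the names and data are illustrative). SQLite through Python's sqlite3 module is used only so the example runs on its own; the SQL is plain enough to apply to MySQL as well.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tag (id INTEGER PRIMARY KEY, tag TEXT);
CREATE TABLE post (id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE tag_post (tag_id INTEGER, post_id INTEGER);
INSERT INTO tag VALUES (1, 'mysql'), (2, 'redis');
INSERT INTO post VALUES (10, 'Indexes'), (11, 'Caching'), (12, 'Joins');
INSERT INTO tag_post VALUES (1, 10), (1, 12), (2, 11);
""")

# One three-table join.
joined = conn.execute(
    "SELECT post.id, post.title FROM tag"
    " JOIN tag_post ON tag_post.tag_id = tag.id"
    " JOIN post ON tag_post.post_id = post.id"
    " WHERE tag.tag = 'mysql' ORDER BY post.id"
).fetchall()

def placeholders(values):
    """Build '?,?,?' for an IN () list (assumes the list is non-empty)."""
    return ",".join("?" * len(values))

# The same work as three single-table queries, combined in the application.
tag_ids = [r[0] for r in conn.execute(
    "SELECT id FROM tag WHERE tag = ?", ("mysql",))]
post_ids = [r[0] for r in conn.execute(
    f"SELECT post_id FROM tag_post WHERE tag_id IN ({placeholders(tag_ids)})",
    tag_ids)]
decomposed = conn.execute(
    f"SELECT id, title FROM post WHERE id IN ({placeholders(post_ids)})"
    " ORDER BY id",
    post_ids).fetchall()

print(joined)
print(decomposed)
```

Real code would also need to handle an empty id list before building the IN () clause.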

  • How MySQL executes joins

MySQL uses the term "join" more broadly than is usual. In general, MySQL treats every query as a join, not only queries that match rows from two tables. Every query and query fragment (including subqueries, and even a SELECT against a single table) may therefore be a join.

The following describes how MySQL executes joins.

Consider UNION first. MySQL puts the results of the individual queries into a temporary table and then reads the temporary table back to complete the UNION. In MySQL's terms every query is a join, so reading the result from the temporary table is a join too.

MySQL uses a nested-loop strategy for every join: it loops over one table fetching a row at a time, then runs a nested loop over the next table looking for matching rows, and so on until matching rows have been found in every table. It then builds the returned row from the columns the query requests in each matched row.
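The nested-loop strategy can be pictured with a small Python sketch. This is a simplified model, not MySQL's implementation: it scans the inner table for every outer row, whereas MySQL would normally use an index lookup on the inner table. The rows and column names are made up for illustration.

```python
def nested_loop_join(outer_rows, inner_rows, outer_key, inner_key):
    """Yield merged rows for every outer/inner pair whose keys match."""
    for outer in outer_rows:          # outer loop: one row at a time
        for inner in inner_rows:      # nested loop: look for matches
            if outer[outer_key] == inner[inner_key]:
                yield {**outer, **inner}

actors = [{"actor_id": 1, "name": "A"}, {"actor_id": 2, "name": "B"}]
film_actor = [
    {"actor_id": 1, "film_id": 10},
    {"actor_id": 1, "film_id": 11},
    {"actor_id": 2, "film_id": 10},
]
films = [{"film_id": 10, "title": "X"}, {"film_id": 11, "title": "Y"}]

# A three-table join is two nested loops chained together.
step1 = nested_loop_join(actors, film_actor, "actor_id", "actor_id")
rows = list(nested_loop_join(step1, films, "film_id", "film_id"))

# Only the columns the query asks for are returned.
result = [(r["name"], r["title"]) for r in rows]
print(result)
```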

 

In the example, execution starts from the actor table, an order chosen automatically by MySQL's join optimizer. Adding the STRAIGHT_JOIN keyword prevents MySQL from reordering the join, so the two plans can be compared.

The optimizer's order is the reverse of the order written in the query. With the reversed order, the first table in the join needs far fewer rows scanned, and the second and third tables are looked up through indexes, which is fast.

Finally, make sure that any GROUP BY or ORDER BY expression refers only to columns from a single table, so that MySQL can use an index to optimize the operation.

  • Sorting Optimization

Sorting is a costly operation. From a performance perspective, avoid sorting where possible, or at least avoid sorting large amounts of data.

The previous article explains how to sort by index. When an index cannot produce the sorted result, MySQL has to sort the rows itself: in memory if the data is small, on disk if it is large. MySQL calls this process filesort in either case.

When a join has to be sorted, MySQL handles the filesort in one of two ways.

1. If all columns in the ORDER BY clause come from the first table in the join, MySQL performs the filesort while processing that first table. In this case, the Extra column of the EXPLAIN result shows Using filesort.

2. In all other cases, MySQL first stores the join result in a temporary table and performs the filesort after the whole join is complete. In this case, the Extra column of the EXPLAIN result shows Using temporary; Using filesort. If the query has a LIMIT, it is applied after sorting, so even when only a few rows are returned, the temporary table and the amount of data to sort can still be very large.

MySQL 5.6 made important improvements here. When only part of the sorted result is needed, for example with a LIMIT clause, MySQL no longer sorts the whole result set; depending on the situation, it first discards rows that cannot be in the result and then sorts the rest.

  •  Correlated subqueries

MySQL implements subqueries poorly. The worst case is an IN() subquery in the WHERE clause.

MySQL has a special optimization for a literal list of values in IN(), so it is natural to assume that it executes the subquery first and then uses the resulting list. Unfortunately, it does the opposite: it pushes the outer table into the subquery. For example, take a query that selects rows from film whose film_id is IN() a subquery.

MySQL rewrites this into a correlated EXISTS subquery that references the outer film_id.

As a result, MySQL does a full table scan on film and then executes the subquery once for each film_id it returns. If the outer table is very large, performance is very bad. The query is easy to rewrite, though: use a join directly.

Another optimization is to use the GROUP_CONCAT() function to build a comma-separated list for IN().

In addition, it is often worth replacing an IN() subquery with the equivalent EXISTS().
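The three forms discussed here (IN() subquery, join, and EXISTS) can be compared on a toy data set. This is a sketch under assumptions: the film and film_actor tables are cut down to a few columns, the data is invented, and SQLite through Python's sqlite3 module is used only to show that the rewrites return the same rows. It says nothing about how MySQL plans each form.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE film (film_id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE film_actor (actor_id INTEGER, film_id INTEGER,
                         PRIMARY KEY (actor_id, film_id));
INSERT INTO film VALUES (1, 'A'), (2, 'B'), (3, 'C');
INSERT INTO film_actor VALUES (1, 1), (1, 3), (2, 2), (2, 1);
""")

# Original form: IN() with a subquery.
in_rows = conn.execute(
    "SELECT film_id, title FROM film WHERE film_id IN ("
    " SELECT film_id FROM film_actor WHERE actor_id = 1)"
    " ORDER BY film_id"
).fetchall()

# Rewrite 1: a plain join.
join_rows = conn.execute(
    "SELECT film.film_id, film.title FROM film"
    " INNER JOIN film_actor ON film_actor.film_id = film.film_id"
    " WHERE film_actor.actor_id = 1 ORDER BY film.film_id"
).fetchall()

# Rewrite 2: the equivalent correlated EXISTS.
exists_rows = conn.execute(
    "SELECT film_id, title FROM film WHERE EXISTS ("
    " SELECT 1 FROM film_actor"
    " WHERE film_actor.actor_id = 1"
    " AND film_actor.film_id = film.film_id)"
    " ORDER BY film_id"
).fetchall()

print(in_rows, join_rows, exists_rows)
```

On a real MySQL server, running EXPLAIN on each form is the way to see which one the optimizer handles best.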

  •  How to make good use of correlated subqueries

Not all correlated subqueries perform poorly. Test and judge for yourself; sometimes a subquery is faster. One such case is when the result only needs columns from one of the joined tables. Suppose we want all films that have actors: because a film usually has many actors, a join returns duplicate rows.

DISTINCT or GROUP BY can remove the duplicates.

With EXISTS, neither DISTINCT nor GROUP BY is needed and no duplicate rows are produced. DISTINCT and GROUP BY usually require a temporary table during execution.

Benchmark to see which approach is faster.

In this case, the subquery is faster.
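A small sketch of the duplicate-row issue, using invented film and film_actor data in SQLite through Python's sqlite3 module. It shows only that the join produces duplicates while DISTINCT and EXISTS do not; it does not measure speed, which has to be benchmarked on the real server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE film (film_id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE film_actor (actor_id INTEGER, film_id INTEGER);
INSERT INTO film VALUES (1, 'A'), (2, 'B'), (3, 'C');
-- film 1 has two actors, film 2 has one, film 3 has none
INSERT INTO film_actor VALUES (1, 1), (2, 1), (1, 2);
""")

def ids(sql):
    """Run a query and return the first column as a list."""
    return [row[0] for row in conn.execute(sql)]

# A plain join returns one row per (film, actor) pair.
with_duplicates = ids(
    "SELECT film.film_id FROM film"
    " INNER JOIN film_actor ON film_actor.film_id = film.film_id"
    " ORDER BY film.film_id"
)

# DISTINCT removes the duplicates after the join.
with_distinct = ids(
    "SELECT DISTINCT film.film_id FROM film"
    " INNER JOIN film_actor ON film_actor.film_id = film.film_id"
    " ORDER BY film.film_id"
)

# EXISTS never produces the duplicates in the first place.
with_exists = ids(
    "SELECT film_id FROM film WHERE EXISTS ("
    " SELECT 1 FROM film_actor WHERE film_actor.film_id = film.film_id)"
    " ORDER BY film_id"
)

print(with_duplicates, with_distinct, with_exists)
```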

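  •  Optimizing MIN() and MAX()

MySQL does not optimize MIN() and MAX() well. A typical example is taking MIN() of the primary key with a WHERE condition on a column that has no index.

MySQL performs a full table scan here. It does not recognize that reading rows in primary key order would let it stop at the first match, because the primary key is ascending. The query can be rewritten with LIMIT.

This lets MySQL scan as few rows as possible.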
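A sketch of the LIMIT rewrite, assuming an actor table whose first_name column has no index. SQLite through Python's sqlite3 module is used to keep the example runnable, and an explicit ORDER BY makes the result deterministic there; in MySQL the rewrite is typically written with a hint such as USE INDEX(PRIMARY) so that rows are read in primary key order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE actor (actor_id INTEGER PRIMARY KEY, first_name TEXT)"
)
conn.executemany(
    "INSERT INTO actor (first_name) VALUES (?)",
    [("NICK",), ("PENELOPE",), ("ED",), ("PENELOPE",)],
)

# Aggregate form: conceptually has to consider every matching row.
min_form = conn.execute(
    "SELECT MIN(actor_id) FROM actor WHERE first_name = ?", ("PENELOPE",)
).fetchone()[0]

# LIMIT form: walk the primary key in ascending order and stop at the
# first row that matches the condition.
limit_form = conn.execute(
    "SELECT actor_id FROM actor WHERE first_name = ?"
    " ORDER BY actor_id LIMIT 1",
    ("PENELOPE",),
).fetchone()[0]

print(min_form, limit_form)
```

The rewrite trades clarity for speed: the query no longer says MIN(), so a comment explaining the intent is worth adding.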

  •  Optimize GROUP BY and DISTINCT

Both can be optimized with indexes, which is also the most effective approach. When no index is usable, GROUP BY uses one of two strategies: a temporary table or a filesort.

When grouping a join, grouping by a table's identifier column is usually more efficient than grouping by other columns. For example, a query that counts films per actor can group by the actor's first and last name.

Grouping by the actor id instead is more efficient.

This rewrite relies on the actor's name depending directly on the id, so the result is not affected.

If the server's SQL_MODE setting does not allow this, MIN() or MAX() can be used to bypass the restriction. This is valid only when the non-grouped columns in the SELECT list depend directly on the grouping column and have a single value within each group.

For a strictly correct form, the grouping can be done in a subquery that is then joined to the actor table.

However, this costs a little more: the subquery has to create and fill a temporary table, and that temporary table has no indexes.
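The three GROUP BY variants described above can be written out on a toy data set. The actor and film_actor tables and their contents are assumptions, and SQLite through Python's sqlite3 module is used only to confirm that the variants return the same result when names are unique per id.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE actor (actor_id INTEGER PRIMARY KEY,
                    first_name TEXT, last_name TEXT);
CREATE TABLE film_actor (actor_id INTEGER, film_id INTEGER);
INSERT INTO actor VALUES (1, 'PENELOPE', 'GUINESS'), (2, 'NICK', 'WAHLBERG');
INSERT INTO film_actor VALUES (1, 10), (1, 11), (2, 10);
""")

JOIN = (" FROM film_actor INNER JOIN actor"
        " ON actor.actor_id = film_actor.actor_id")

# Variant 1: group by the name columns.
by_name = conn.execute(
    "SELECT actor.first_name, actor.last_name, COUNT(*)" + JOIN +
    " GROUP BY actor.first_name, actor.last_name"
    " ORDER BY actor.last_name"
).fetchall()

# Variant 2: group by the identifier; MIN() keeps the query valid when
# non-grouped columns are not allowed in the SELECT list.
by_id = conn.execute(
    "SELECT MIN(actor.first_name), MIN(actor.last_name), COUNT(*)" + JOIN +
    " GROUP BY film_actor.actor_id"
    " ORDER BY MIN(actor.last_name)"
).fetchall()

# Variant 3: aggregate in a derived table, then join to actor.
derived = conn.execute(
    "SELECT actor.first_name, actor.last_name, c.cnt FROM actor"
    " INNER JOIN (SELECT actor_id, COUNT(*) AS cnt"
    "             FROM film_actor GROUP BY actor_id) AS c"
    " ON c.actor_id = actor.actor_id"
    " ORDER BY actor.last_name"
).fetchall()

print(by_name, by_id, derived)
```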

  •  Optimize LIMIT Paging

The simplest approach is to use a covering index scan where possible instead of selecting all columns, and then join back to the table as needed to fetch the required columns.

If the table is very large, it is better to rewrite the query this way.

This "deferred join" greatly improves efficiency: MySQL scans as few pages as possible in the index to find the rows it needs, and then joins back to the original table on the join column to fetch the columns the query requires. The same technique can be used to optimize LIMIT in join queries.
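A sketch of a deferred join, assuming a film table with an index on title. SQLite through Python's sqlite3 module keeps it runnable; the LIMIT offset, count form used here is accepted by both SQLite and MySQL. The data is generated, and the outer ORDER BY is added so the comparison is deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE film ("
    " film_id INTEGER PRIMARY KEY, title TEXT, description TEXT)"
)
conn.execute("CREATE INDEX idx_film_title ON film (title)")

rows = []
for film_id in range(1, 101):
    # Unique titles that are deliberately not in film_id order.
    title = "Title %03d" % (film_id * 37 % 100)
    rows.append((film_id, title, "about " + title))
conn.executemany("INSERT INTO film VALUES (?, ?, ?)", rows)

# Plain paging: every skipped row is read in full before being discarded.
plain = conn.execute(
    "SELECT film_id, description FROM film ORDER BY title LIMIT 50, 5"
).fetchall()

# Deferred join: page through the narrow index first, then fetch the
# wide columns only for the five rows that survive.
deferred = conn.execute(
    "SELECT film.film_id, film.description FROM film"
    " INNER JOIN (SELECT film_id FROM film ORDER BY title LIMIT 50, 5)"
    " AS lim ON lim.film_id = film.film_id"
    " ORDER BY film.title"
).fetchall()

print(plain)
print(deferred)
```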

Sometimes a LIMIT query can be converted into a query on a known position, so that MySQL can get the result with a range scan.

This requires an indexed column whose boundary values are calculated in advance.

The underlying problem with LIMIT and OFFSET is the offset itself: it makes MySQL scan a large number of rows it does not need and then discard them, as in a SELECT ... LIMIT query with a large offset and a page size of 20.

A workaround is available. Suppose the application pages through rental records, starting with the first page.

Because rental_id is ascending and the listing starts from the most recent records, each following page can be fetched with a condition on the last rental_id already seen, instead of an offset.
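A sketch of this kind of paging, assuming a rental table with an ascending rental_id. The fetch_page helper is hypothetical, and SQLite through Python's sqlite3 module is used only to make the example self-contained.

```python
import sqlite3

def fetch_page(conn, page_size=20, before_id=None):
    """Return one page of rental ids, newest first.

    `before_id` is the smallest rental_id seen on the previous page;
    None means the first page.
    """
    if before_id is None:
        sql = "SELECT rental_id FROM rental ORDER BY rental_id DESC LIMIT ?"
        params = (page_size,)
    else:
        sql = ("SELECT rental_id FROM rental WHERE rental_id < ?"
               " ORDER BY rental_id DESC LIMIT ?")
        params = (before_id, page_size)
    return [row[0] for row in conn.execute(sql, params)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE rental (rental_id INTEGER PRIMARY KEY)")
conn.executemany(
    "INSERT INTO rental VALUES (?)", [(i,) for i in range(1, 101)]
)

page1 = fetch_page(conn)
page2 = fetch_page(conn, before_id=page1[-1])

# The offset-based equivalent, for comparison only.
offset_page2 = [row[0] for row in conn.execute(
    "SELECT rental_id FROM rental ORDER BY rental_id DESC LIMIT 20, 20")]

print(page1[:3], page2[:3])
```

The cost of each page stays roughly constant however deep the user pages, because the range condition lets the engine start reading at the right place in the index.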

 
