Optimization analysis of Mysql group by statements

Source: Internet
Author: User

BY default, MySQL sorts all group by col1, col2,.... The query method is as follows: specify order by col1, col2,... in the query ,.... If the statement explicitly contains an order by clause containing the same columns, MySQL can optimize it without slowing down, even though it still performs sorting.

If the query includes group by but you want to avoid consumption of sorting results, you can specify order by null to prohibit sorting. For example:

The code is as follows: Copy code

Insert into foo

SELECT a, COUNT (*) FROM bar group by a order by null;


The most common method for executing the group by clause: first scan the entire table and then create a new temporary table. All rows in each GROUP in the table should be continuous, finally, use the temporary table to locate the group and apply the clustering function (if there is a clustering function ). In some cases, MySQL can obtain results by accessing the index without creating a temporary table. The EXPLAIN output of this type of query shows that the value of the Extra column is Using index for group-.

I. Loose index scanning

1. Conditions met

Query a table.

Group by uses the leftmost prefix of the index.
Only MIN () and MAX () clustering functions can be used, and they all point to the same column.
2. Example

Table t1 (c1, c2, c3, c4) has an index idx (c1, c2, c3 ):

The code is as follows: Copy code

SELECT c1, c2 FROM t1 group by c1, c2;

Select distinct c1, c2 FROM t1;

SELECT c1, MIN (c2) FROM t1 group by c1;

SELECT c1, c2 FROM t1 WHERE c1 <const group by c1, c2;

Select max (c3), MIN (c3), c1, c2 FROM t1 WHERE c2> const group by c1, c2;

SELECT c2 FROM t1 WHERE c1 <const group by c1, c2;

SELECT c1, c2 FROM t1 WHERE c3 = const group by c1, c2;

Example of a condition not met:

1. In addition to MIN () or MAX (), there are other cumulative functions, such:

The code is as follows: Copy code

SELECT c1, SUM (c2) FROM t1 group by c1;

2. The fields in the group by clause do not reference the beginning of the index. For example:

 

The code is as follows: Copy code
SELECT c1, c2 FROM t1 group by c2, c3;

3. The query references a part of the keyword after the group by part, and there is no equality equal to the constant, for example:

The code is as follows: Copy code

SELECT c1, c3 FROM t1 group by c1, c2;

II. Compact index scan

If the loose index scan condition is not met, you do not need to create a temporary table if you execute group. If the WHERE clause has a range condition, this method only reads the keywords that meet these conditions.

Otherwise, index scanning is performed. This method reads the range defined by the WHERE clause.

1. a vulnerability exists in group by, but has been overwritten BY condition c2 = 'a.

The code is as follows: Copy code

SELECT c1, c2, c3 FROM t1 WHERE c2 = 'A' group by c1, c3;

2. group by is not a constant that satisfies the leftmost prefix, but has a condition that provides the element:

The code is as follows: Copy code

SELECT c1, c2, c3 FROM t1 WHERE c1 = 'A' group by c2, c3;


Which of the following conditions can I use loose index scan:

1. Single table query

In 2 Group by, only the leftmost prefix column exists, and no other columns exist.

3 only supports max and min aggregation, and the column to be aggregated must be the index of the column in group.

4. The other part of the index that is not referenced by group by must be a constant)

5. Prefix indexes are not supported.

Assume that the table t1 (c1, c2, c3, c4) has an index that includes the c1, c2, and c3 columns. All of the following queries can be performed by loose index scan:

The code is as follows: Copy code

SELECT c1, c2 FROM t1 group by c1, c2;
Select distinct c1, c2 FROM t1;
SELECT c1, MIN (c2) FROM t1 group by c1;
SELECT c1, c2 FROM t1 WHERE c1 <const group by c1, c2;
Select max (c3), MIN (c3), c1, c2 FROM t1 WHERE c2> const group by c1, c2;
SELECT c2 FROM t1 WHERE c1 <const group by c1, c2;
SELECT c1, c2 FROM t1 WHERE c3 = const group by c1, c2;

These are all copied official documents, and the correctness is not fully verified.

None of the following SQL statements can use loose index scan:


-- Because the aggregate function is not max or min

The code is as follows: Copy code
SELECT c1, SUM (c2) FROM t1 group by c1;

-- Because it does not conform to the leftmost prefix principle

The code is as follows: Copy code
SELECT c1, c2 FROM t1 group by c2, c3;

-- The query involves a part of the index, followed by the columns in group by, but there is no constant equivalent statement. Just add WHERE c3 = const.

The code is as follows: Copy code

SELECT c1, c3 FROM t1 group by c1, c2;

Other aggregate functions can also use loose index scan, such as AVG (DISTINCT), SUM (DISTINCT), and COUNT (DISTINCT)

The following statements can also be used:

The code is as follows: Copy code

Select count (DISTINCT c1), SUM (DISTINCT c1) FROM t1;

Select count (DISTINCT c1, c2), COUNT (DISTINCT c2, c1) FROM t1;

Many statements are excerpted from 5.7 documents, and the correctness is not verified. Although the official documents are authoritative, it is better to believe in books than to have no books, and to get a glimpse of them on paper, I know that this is a must. I will test it one by one tomorrow weekend and try again later.

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.