Using GROUP BY in MySQL, combining it with ORDER BY, and optimizing it

Let's start with an example. Try it yourself.
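
To follow along, the sample table can be recreated with a script along these lines. This is only a sketch: the column types are assumptions (the article shows the data but not the table definition), and the date of row 9 is missing from the listing, so it is left as NULL here rather than guessed.

```sql
-- Hypothetical definition of tb_clothes; the column types are assumed,
-- only the column names and the data come from the article.
CREATE TABLE tb_clothes (
    id      INT PRIMARY KEY,
    name    VARCHAR(32) NOT NULL,
    price   INT NOT NULL,
    total   INT NOT NULL,
    `date`  DATE NULL,
    address VARCHAR(32) NOT NULL
);

INSERT INTO tb_clothes (id, name, price, total, `date`, address) VALUES
    (1,  'shirt',            45,  89, '2010-01-09', 'Henan'),
    (2,  'glove',             7,  45, '2010-02-09', 'Shandong'),
    (3,  'shirt',            56,  34, '2010-01-09', 'Henan'),
    (4,  'glove',             7,  12, '2010-02-09', 'Shandong'),
    (5,  'trousers',         34,  56, '2010-04-12', 'Jilin'),
    (6,  'sweater',          25,  22, '2010-08-09', 'Jilin'),
    (7,  'basketball shoes', 125, 23, '2010-09-12', 'Henan'),
    (8,  'sweater',          25,  67, '2010-08-09', 'Hubei'),
    (9,  'hat',               8,  14, NULL,         'Jilin'),   -- date missing in the listing
    (10, 'hat',               8,  54, '2010-10-09', 'Shaanxi');
```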

mysql> select * from tb_clothes;
+----+------------------+-------+-------+------------+----------+
| id | name             | price | total | date       | address  |
+----+------------------+-------+-------+------------+----------+
|  1 | shirt            |    45 |    89 | 2010-01-09 | Henan    |
|  2 | glove            |     7 |    45 | 2010-02-09 | Shandong |
|  3 | shirt            |    56 |    34 | 2010-01-09 | Henan    |
|  4 | glove            |     7 |    12 | 2010-02-09 | Shandong |
|  5 | trousers         |    34 |    56 | 2010-04-12 | Jilin    |
|  6 | sweater          |    25 |    22 | 2010-08-09 | Jilin    |
|  7 | basketball shoes |   125 |    23 | 2010-09-12 | Henan    |
|  8 | sweater          |    25 |    67 | 2010-08-09 | Hubei    |
|  9 | hat              |     8 |    14 |            | Jilin    |
| 10 | hat              |     8 |    54 | 2010-10-09 | Shaanxi  |
+----+------------------+-------+-------+------------+----------+
10 rows in set

mysql> select * from tb_clothes group by address;
+----+----------+-------+-------+------------+----------+
| id | name     | price | total | date       | address  |
+----+----------+-------+-------+------------+----------+
|  5 | trousers |    34 |    56 | 2010-04-12 | Jilin    |
|  2 | glove    |     7 |    45 | 2010-02-09 | Shandong |
|  1 | shirt    |    45 |    89 | 2010-01-09 | Henan    |
|  8 | sweater  |    25 |    67 | 2010-08-09 | Hubei    |
| 10 | hat      |     8 |    54 | 2010-10-09 | Shaanxi  |
+----+----------+-------+-------+------------+----------+
5 rows in set

mysql> select price from tb_clothes group by address;
+-------+
| price |
+-------+
|    34 |
|     7 |
|    45 |
|    25 |
|     8 |
+-------+
5 rows in set

mysql> select sum(price) from tb_clothes group by address;
+------------+
| sum(price) |
+------------+
|         67 |
|         14 |
|        226 |
|         25 |
|          8 |
+------------+
5 rows in set

mysql> select sum(price) as sumprice, address from tb_clothes group by address;
+----------+----------+
| sumprice | address  |
+----------+----------+
|       67 | Jilin    |
|       14 | Shandong |
|      226 | Henan    |
|       25 | Hubei    |
|        8 | Shaanxi  |
+----------+----------+
5 rows in set

Combining GROUP BY with sorting (ORDER BY):

mysql> select sum(price) as sumprice, address from tb_clothes group by address order by sumprice;
+----------+----------+
| sumprice | address  |
+----------+----------+
|        8 | Shaanxi  |
|       14 | Shandong |
|       25 | Hubei    |
|       67 | Jilin    |
|      226 | Henan    |
+----------+----------+
5 rows in set
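
One caveat about the earlier "select * ... group by address" form: it relies on MySQL returning the columns of an arbitrary row from each group. On servers where the ONLY_FULL_GROUP_BY SQL mode is enabled (the default since MySQL 5.7.5), such a query is rejected. A minimal sketch of two alternatives, assuming the tb_clothes table shown above; the aliases first_id and some_name are made up for illustration:

```sql
-- Option 1: aggregate every selected column that is not listed in GROUP BY.
SELECT address,
       MIN(id)    AS first_id,   -- lowest id within each address group
       SUM(price) AS sumprice
FROM tb_clothes
GROUP BY address
ORDER BY sumprice DESC;          -- largest total first

-- Option 2 (MySQL 5.7.5 and later): ANY_VALUE() states explicitly that
-- the value of any row in the group is acceptable.
SELECT address,
       ANY_VALUE(name) AS some_name,
       SUM(price)      AS sumprice
FROM tb_clothes
GROUP BY address
ORDER BY sumprice;
```

Option 1 is deterministic. Option 2 reproduces the "pick some row" behavior of the original queries while remaining valid under the stricter mode.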

Optimizing GROUP BY:

The most general way to satisfy a GROUP BY clause is to scan the whole table and create a new temporary table in which all rows from each group are consecutive, and then use this temporary table to discover the groups and apply aggregate functions (if any). In some cases, MySQL can do much better than that and avoid creating temporary tables by using index access.

The most important preconditions for using an index for GROUP BY are that all GROUP BY columns reference attributes of the same index, and that the index stores its keys in order (for example, a B-tree index rather than a hash index). Whether index access can replace the temporary table also depends on which parts of the index are used in the query, the conditions specified for those parts, and the aggregate functions selected.

There are two ways to execute a GROUP BY query through index access, as described in the following sections. In the first method, the grouping operation is applied together with all range predicates (if any). The second method first performs a range scan and then groups the resulting tuples.
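
EXPLAIN is the usual way to see which strategy the server picks. The following sketch assumes the tb_clothes table from the example at the top; the index name idx_address_price is hypothetical. The optimizer is cost-based, so the plan on a ten-row table may differ from what a large table would show.

```sql
-- Without a suitable index, the Extra column commonly includes
-- "Using temporary", meaning a temporary table is built for the grouping.
EXPLAIN SELECT address, SUM(price) FROM tb_clothes GROUP BY address;

-- Hypothetical index that starts with the GROUP BY column and also
-- contains price, so the query can be answered from the index alone.
CREATE INDEX idx_address_price ON tb_clothes (address, price);

-- With the index in place, the plan can read the keys in address order
-- and group them on the fly; check whether "Using temporary" is gone.
EXPLAIN SELECT address, SUM(price) FROM tb_clothes GROUP BY address;
```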

7.2.13.1. Loose Index Scan

The most efficient way to process GROUP BY is to use an index to retrieve the grouping columns directly. With this access method, MySQL relies on the property of some index types (such as B-tree) that the keys are stored in order. This property makes it possible to look up groups in the index without examining every key that satisfies the WHERE conditions. Because this access method considers only a fraction of the keys in the index, it is called a loose index scan. When there is no WHERE clause, a loose index scan reads as many keys as there are groups, which may be far fewer than the total number of keys. If the WHERE clause contains range predicates (for the range join type, see section 7.2.1, "EXPLAIN syntax (get information about SELECT)"), a loose index scan looks up the first key of each group that satisfies the range conditions, and again reads the smallest possible number of keys. This is possible under the following conditions:

· The query is over a single table.

· The GROUP BY names only columns that form a leftmost prefix of the index, and no other columns. (If the query has a DISTINCT clause instead of GROUP BY, all distinct attributes must refer to columns that form a leftmost prefix of the index.)

· The only aggregate functions used (if any) are MIN() and MAX(), and all of them refer to the same column.

· Any parts of the index other than those from the GROUP BY that are referenced in the query must be constants (that is, they must be referenced in equalities with constants), except for the argument of the MIN() or MAX() functions.

The EXPLAIN output for such queries shows Using index for group-by in the Extra column.

The following queries fall into this category, assuming that table t1(c1,c2,c3,c4) has an index idx(c1,c2,c3); const stands for a constant value:

     SELECT c1, c2 FROM t1 GROUP BY c1, c2;
     SELECT DISTINCT c1, c2 FROM t1;
     SELECT c1, MIN(c2) FROM t1 GROUP BY c1;
     SELECT c1, c2 FROM t1 WHERE c1 < const GROUP BY c1, c2;
     SELECT MAX(c3), MIN(c3), c1, c2 FROM t1 WHERE c2 > const GROUP BY c1, c2;
     SELECT c2 FROM t1 WHERE c1 < const GROUP BY c1, c2;
     SELECT c1, c2 FROM t1 WHERE c3 = const GROUP BY c1, c2;

For the reasons given above, the following queries cannot be executed with this quick select method:

1. There are aggregate functions other than MIN() or MAX(), for example:

     SELECT c1, SUM(c2) FROM t1 GROUP BY c1;

2. The columns in the GROUP BY clause do not form a leftmost prefix of the index, for example:

     SELECT c1,c2 FROM t1 GROUP BY c2, c3;

3. The query refers to a part of the key that comes after the GROUP BY part, and there is no equality with a constant for it, for example:

     SELECT c1,c3 FROM t1 GROUP BY c1, c2;
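
A sketch for checking these rules on a real server. The column types of t1 are assumptions (the text only gives the column and index names). Because the optimizer is cost-based, it may not choose a loose index scan on an empty or very small table even when the query qualifies.

```sql
-- Hypothetical definition of t1; only the names t1, c1..c4 and idx
-- come from the text, the VARCHAR(10) types are assumed.
CREATE TABLE IF NOT EXISTS t1 (
    c1 VARCHAR(10) NOT NULL,
    c2 VARCHAR(10) NOT NULL,
    c3 VARCHAR(10) NOT NULL,
    c4 VARCHAR(10) NOT NULL,
    INDEX idx (c1, c2, c3)
);

-- Qualifies: GROUP BY uses the leftmost index prefix and the only
-- aggregate is MIN(). Look for "Using index for group-by" in Extra.
EXPLAIN SELECT c1, MIN(c2) FROM t1 GROUP BY c1;

-- Does not qualify: COUNT() is an aggregate other than MIN()/MAX(),
-- so every key in each group has to be read.
EXPLAIN SELECT c1, COUNT(c2) FROM t1 GROUP BY c1;
```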

7.2.13.2. Tight Index Scan

A tight index scan may be either a full index scan or a range index scan, depending on the query conditions.

When the conditions for a loose index scan are not met, it may still be possible to avoid creating a temporary table for a GROUP BY query. If the WHERE clause has range conditions, this method reads only the keys that satisfy those conditions. Otherwise, it performs an index scan. Because this method reads all keys in each range defined by the WHERE clause, or scans the whole index if there are no range conditions, it is called a tight index scan. Note that with a tight index scan, the grouping operation is performed only after all keys that satisfy the range conditions have been found.

For this method to work, it is sufficient that there is a constant equality condition for every column in the query that refers to a part of the key coming before, or in between, the parts of the GROUP BY key. The constants from the equality conditions fill in the "gaps" in the search keys so that complete prefixes of the index can be formed. These index prefixes can then be used for index lookups. If the GROUP BY result must be sorted, and search keys that are prefixes of the index can be formed, MySQL also avoids extra sorting operations, because searching with prefixes in an ordered index already retrieves all the keys in order.

The following queries do not work with the loose index scan method, but they do work with the tight index scan method (assuming the index idx on table t1 mentioned earlier):

· There is a gap in the GROUP BY, but it is covered by the condition c2 = 'a':

     SELECT c1,c2,c3 FROM t1 WHERE c2 = 'a' GROUP BY c1,c3;

· The GROUP BY does not begin with the first part of the key, but there is a condition that provides a constant for that part:

     SELECT c1,c2,c3 FROM t1 WHERE c1 = 'a' GROUP BY c2,c3;
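
The two cases can be inspected with EXPLAIN as sketched below. The table definition is a hypothetical one (the column types are assumed), and the exact Extra text varies by MySQL version and data, so treat the comments as things to look for rather than guaranteed output.

```sql
-- Hypothetical definition of t1; types are assumed, names are from the text.
CREATE TABLE IF NOT EXISTS t1 (
    c1 VARCHAR(10) NOT NULL,
    c2 VARCHAR(10) NOT NULL,
    c3 VARCHAR(10) NOT NULL,
    c4 VARCHAR(10) NOT NULL,
    INDEX idx (c1, c2, c3)
);

-- Gap in the GROUP BY (c2 is skipped) filled by the equality c2 = 'a':
-- the keys (c1, 'a', c3) are read in index order, so no temporary
-- table should be needed ("Using temporary" absent from Extra).
EXPLAIN SELECT c1, c2, c3 FROM t1 WHERE c2 = 'a' GROUP BY c1, c3;

-- GROUP BY starts at the second key part; c1 = 'a' supplies the first,
-- so the scan is limited to the range of keys beginning with 'a'.
EXPLAIN SELECT c1, c2, c3 FROM t1 WHERE c1 = 'a' GROUP BY c2, c3;
```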

7.2.14. How MySQL Optimizes LIMIT

In some cases, MySQL handles a query differently when you use LIMIT row_count without HAVING:

· If you select only a few rows with LIMIT, MySQL uses indexes in some cases where it would normally prefer a full table scan.

· If you use LIMIT row_count with ORDER BY, MySQL ends the sorting as soon as it has found the first row_count rows of the sorted result, rather than sorting the entire result. If the ordering is done by using an index, this is very fast. If a filesort must be done, all rows that match the query without the LIMIT clause must be selected, and most or all of them must be sorted, before it can be determined that the first row_count rows have been found. In either case, once the initial rows have been found, there is no need to sort the remainder of the result set, and MySQL does not do so.

· When LIMIT row_count is combined with DISTINCT, MySQL stops as soon as it has found row_count unique rows.

· In some cases, a GROUP BY can be resolved by reading the key in order (or by sorting on the key) and then calculating summaries until the key value changes. In this case, LIMIT row_count does not calculate any unnecessary GROUP BY values.

· As soon as MySQL has sent the required number of rows to the client, it aborts the query, unless you are using SQL_CALC_FOUND_ROWS.

· LIMIT 0 always quickly returns an empty set. This is useful for checking the validity of a query. When using one of the MySQL APIs, it can also be used to obtain the types of the result columns. (This trick does not work in the mysql monitor, which merely displays Empty set; use SHOW COLUMNS or DESCRIBE instead.)

· When the server uses a temporary table to resolve the query, it uses the LIMIT row_count clause to calculate how much space is required.
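
A few of these cases, sketched against the tb_clothes table from the example at the top. It is assumed that id is the primary key and that there is no index on price; neither is stated in the article.

```sql
-- ORDER BY on the (assumed) primary key can be satisfied in index order,
-- so the server can stop after the first 3 rows.
SELECT id, name, price FROM tb_clothes ORDER BY id LIMIT 3;

-- ORDER BY on an unindexed column needs a filesort: all matching rows
-- are read, but only the top 3 need to be kept in final order.
SELECT id, name, price FROM tb_clothes ORDER BY price DESC LIMIT 3;

-- DISTINCT with LIMIT: the server can stop after 2 unique addresses.
SELECT DISTINCT address FROM tb_clothes LIMIT 2;

-- LIMIT 0: checks that the query is valid without returning rows.
SELECT * FROM tb_clothes LIMIT 0;

-- SQL_CALC_FOUND_ROWS forces the full result to be counted even though
-- only 3 rows are returned; FOUND_ROWS() then reports that count.
-- Both are deprecated as of MySQL 8.0.17.
SELECT SQL_CALC_FOUND_ROWS id, name FROM tb_clothes ORDER BY id LIMIT 3;
SELECT FOUND_ROWS();
```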

7.2.15. How to Avoid Table Scans

The EXPLAIN output shows ALL in the type column when MySQL uses a table scan to resolve a query. This usually happens under the following conditions:

· The table is so small that a table scan is faster than a key lookup. This is common for tables with fewer than 10 rows and a short row length.

· The ON or WHERE clause has no usable restrictions on indexed columns.

· You are comparing indexed columns with constant values, and MySQL has calculated (based on the index tree) that the constants cover too large a part of the table and that a table scan would be faster. See section 7.2.4, "How MySQL Optimizes WHERE Clauses".

· You are using a key with low cardinality (many rows match the key value) through another column. In this case, MySQL assumes that using the key would probably require many key lookups and that a table scan would be faster.

For small tables, a table scan is often appropriate. For large tables, try the following techniques to prevent the optimizer from incorrectly choosing a table scan:

· Use ANALYZE TABLE tbl_name to update the key distributions for the scanned table. See section 13.5.2.1, "ANALYZE TABLE Syntax".

· Use FORCE INDEX for the scanned table to tell MySQL that table scans are very expensive compared with using the given index. See section 13.2.7, "SELECT Syntax".

     SELECT * FROM t1, t2 FORCE INDEX (index_for_column)
         WHERE t1.col_name=t2.col_name;

· Start mysqld with the --max-seeks-for-key=1000 option, or use SET max_seeks_for_key=1000, to tell the optimizer to assume that no key scan causes more than 1,000 key seeks.
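
Put together, a troubleshooting session might look like the sketch below. It reuses a hypothetical t1 with index idx (the column types are assumed), and the value 1000 is simply the one from the text, not a tuning recommendation.

```sql
-- Hypothetical definition of t1; types are assumed, names are from the text.
CREATE TABLE IF NOT EXISTS t1 (
    c1 VARCHAR(10) NOT NULL,
    c2 VARCHAR(10) NOT NULL,
    c3 VARCHAR(10) NOT NULL,
    c4 VARCHAR(10) NOT NULL,
    INDEX idx (c1, c2, c3)
);

-- Step 1: check the plan; type = ALL means a full table scan.
EXPLAIN SELECT * FROM t1 WHERE c1 = 'a';

-- Step 2: refresh the key distribution statistics the optimizer relies on.
ANALYZE TABLE t1;

-- Step 3: if the scan persists, tell the optimizer (for this session only)
-- to assume that no key scan needs more than 1000 seeks.
SET SESSION max_seeks_for_key = 1000;

-- Step 4: as a last resort, force the index for this one query.
EXPLAIN SELECT * FROM t1 FORCE INDEX (idx) WHERE c1 = 'a';
```

Rerunning EXPLAIN after each step shows whether the type column has moved from ALL to an index-based access type such as ref or range.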
