In many cases, a temporary table is required for distinct combined with order.
Note that because distinct may use group by, you must be clear about how the mysql tutorial uses columns in the order by or having clause part of the selected column. Mysql extends the use of group by. Therefore, you can use columns or operations not listed in the select list in the group by statement. This indicates "any possible value for this group ". You can achieve better performance by avoiding sorting and grouping unnecessary items. For example, you do not need to group customer. name in the following inquiries:
Mysql> select order. custid, customer. name, max (payments)
-> From order, customer
-> Where order. custid = customer. custid
-> Group by order. custid;
In standard SQL, you must add customer. name to the group by clause. In mysql, if you are not running in ansi mode, this name is redundant.
If the columns you omit from the group by section are not unique in this group, do not use this function! You will get unexpected results.
In some cases, you can use min () and max () to obtain a special column value, even if it is not unique. The values in the column containing the minimum values in the sorting column are given below:
Substr (min (concat (rpad (sort, 6, ''), column), 7)
See 3.6.4, "rows with the maximum value between groups of a field ".
Note: If you are attempting to follow standard SQL, you cannot use the expression in the group by or order by clause. You can bypass this restriction by using the alias of the expression:
Mysql> select id, floor (value/100) as val
-> From tbl_name
-> Group by id, val order by val;
However, mysql allows you to use expressions in the group by and order by clauses. For example:
Mysql> select id, floor (values/100) from tbl_name order by rand ();
, "Group by with hidden fields ".
In most cases, the distinct clause can be considered a special case of group. For example, the following two queries are equivalent:
Select distinct c1, c2, c3 from t1 where c1> const; select c1, c2, c3 from t1 where c1> const group by c1, c2, c3; due to this equivalence, optimization for group by queries is also applicable to queries with distinct clauses. In this way, for more details about the optimization of distinct queries,
, "Mysql how to optimize group ".
The most common method to satisfy the group by clause is to scan the entire table and create a new temporary table. All rows in each group in the table should be continuous, then use the temporary table to locate the group and apply the accumulate function (if any ). In some cases, mysql can do better by accessing indexes instead of creating temporary tables.
The most important prerequisite for using an index for group by is that all group by columns reference the same index attribute and the index stores its keywords in order (for example, this is a B-tree index, instead of hash indexes ). Whether to use index access instead of temporary tables depends on which part of the index is used in the query, the conditions specified for this part, and the selected cumulative function.
There are two ways to perform group by queries through index access, as described in the following chapter. In the 1st methods, the combined operation is used in conjunction with all range values (if any ). The 2nd methods first perform a range scan and then combine the result tuples.
7.2.13.1. Loose index Scanning
The most effective way to use indexes is to directly search for group domains. Using this access method, mysql uses the attributes of index types (such as B-tree) sorted by certain keywords. This attribute allows you to use the search group in the index without considering all the keywords in the index that meet all where conditions. Since this access method only takes into account a small part of the keywords in the index, it is called loose index scanning. Without the where clause, the number of keywords read by the loose index scan is as large as the number of groups, which can be much smaller than the number of all keywords. If the where clause contains a range limit (for more information about the range join type, see section 7.2.1, "explain syntax (get information about select )"), the loose index scan searches for 1st keywords in each group that meet the range conditions, and reads as few keywords as possible again. The following conditions are acceptable:
· Query a single table.
· Group by includes 1st consecutive parts of the index (if a distinct clause is provided for the group by query, all explicit attributes point to the beginning of the index ).
· Only use the cumulative functions (if any) min () and max (), and they all point to the same column.
· Any other part of the index (except those from group by referenced in the query) must be a constant (that is, they must be referenced by the number of constants), but min () or max () function parameter exceptions.
The explain output of this type of query shows the using indexforgroup-by of the extra column.
The following query provides several examples of this class. Assume that the table t1 (c1, c2, c3, c4) has an index idx (c1, c2, c3 ):
Select c1, c2 from t1 group by c1, c2;
Select distinct c1, c2 from t1;
Select c1, min (c2) from t1 group by c1;
Select c1, c2 from t1 where c1 <const group by c1, c2;
Select max (c3), min (c3), c1, c2 from t1 where c2> const group by c1, c2;
Select c2 from t1 where c1 <const group by c1, c2;
Select c1, c2 from t1 where c3 = const group by c1, c2;
For the above reason, you cannot use this quick selection method to execute the following query:
1. In addition to min () or max (), there are other cumulative functions, such:
Select c1, sum (c2) from t1 group by c1; 2. the fields in the group by clause do not start with an index, as shown below:
Select c1, c2 from t1 group by c2, c3; 3. The query references a part of the key word after the group by part and does not have an equality equal to a constant. For example:
Select c1, c3 from t1 group by c1, c2; 7.2.13.2. compact index Scan
A compact index scan can be an index scan or a range index scan, depending on the query conditions.
If you do not meet the loose index scanning conditions, you do not need to create a temporary table for group by queries. If the where clause has a range condition, this method only reads the keywords that meet these conditions. Otherwise, index scanning is performed. This method reads all the keywords in each range defined by the where clause, or scans the entire index without a range condition. We define it as a compact index scan. Note that for compact index scanning, you can only combine all keywords that meet the range conditions.
For this method to work, it is sufficient to have a constant equality condition for all columns in the query that reference the elements of the group by keyword and the intermediate keyword. Constants in the Equality condition fill in the "gap" in the search keyword to form a complete index prefix. These index prefixes can be used for index search. If you want to sort the group by results and search keywords with index prefixes, mysql can also avoid additional sorting operations, because all the keywords have been searched in order by using the prefix of the ordered index.
The first method is not suitable for the following queries, but the 2nd index access methods can work (assuming we have mentioned index idx of table t1 ):
· There is a gap in group by, but it has been overwritten by the condition c2 = 'A.
Select c1, c2, c3 from t1 where c2 = 'A' group by c1, c3; · group by does not start with the 1st elements of the keyword, however, there is a condition that provides constants for this element:
Select c1, c2, c3 from t1 where c1 = 'A' group by c2, c3;
After combining limit row_count and distinct, mysql finds a unique row_count row and stops immediately.
If you do not use the columns of all tables named in the query, mysql will immediately stop scanning unused tables after 1st matches are found. In the following cases, assuming that t1 is used before t2 (you can use the explain check), mysql no longer reads t2 (any row in t1) after 1st rows in t2:
Select distinct t1.a from t1, t2 where t1.a = t2.a;