When asked about the implementation of groupby, I found that there may be misunderstandings. Example: CREATETABLE 'tb' ('C' int (11) DEFAULTNULL, 'D' int (4) DEFAULTNULL, 'E' varchar (1000) DEFAULTNULL) engineinnodbdefadefacharsetutf8;
When asked about the implementation of group by, I found that there may be misunderstandings. For example, create table 'tb' ('C' int (11) default null, 'D' int (4) default null, 'E' varchar (1000) default null) ENGINE = InnoDB default charset = utf8;-Insert into tb values (2, 20, B); Inser
When asked about the implementation of group by, I found that there may be misunderstandings.
Example
Create table 'tb '(
'C' int (11) default null,
'D' int (4) default null,
'E' varchar (1000) DEFAULT NULL
) ENGINE = InnoDB default charset = utf8;
-
Insert into tb values (2, 20, 'B ');
Insert into tb values (1, 10, 'A ');
Insert into tb values (2, 20, 'B ');
Insert into tb values (1, 10, 'A ');
Insert into tb values (3,30, 'C ');
Query statement and Result
The explain results of this statement include Using temporary; Using filesort, which requires sorting.
Therefore, some people think that the implementation of group by is "sort first and then group ".
Usage Analysis
First read the instructions in this manual: "By default, MySQL sorts all group by col1, col2 ,... Queries as if you specified order by col1, col2 ,... In the query as well. if you include an order by clause explicitly that contains the same column list, MySQL optimizes it away without any speed penalty, although the sorting still occurs. if a query has des group by but you want to avoid the overhead of sorting the result, you can suppress sorting by specifying order by null ".
If order by null is added to our query, the result is as follows:
There is no Using filesort in the Explain.
Therefore, we can imagine that sorting is performed only after grouping.
Algorithm Analysis
In fact, if you want to sort and group data first, the time complexity is O (nlog (n). In the example, if you only need to group data, it can be faster. The practice in MySQL is described as follows:
1. Create an empty temporary table with three fields: Index column, c, and count (*)
The "index column" here is the calculation result of the column after group by. Imagine if the statement is group by 1/c or something. Of course, in our example, the value is c.
2. read data from a row in the original table and calculate the value key of the index column first. Use the key to search in the temporary table. If the key row exists, update. Otherwise, insert.
In this example, if you read the Row c = 2 for the first time, insert a row 2, 2, and 1 into the temporary table. The second line that encounters c = 2 is changed to 2, 2, and 2.
3. After all the original tables are traversed, the group is finished. Therefore, we can see that the order c in the query result with order by null is 2, 1, 3. If order by null is not found, the order is sorted based on this.
The query in the temporary table is a simple hash query. We can see that the time complexity of the grouping process of this algorithm is O (n ).