Analysis of GROUP BY in MySQL: grouped aggregation and subdivided aggregation

Source: Internet
Author: User

1. Aggregate functions

The official MySQL 5.7 documentation lists the available aggregate functions.

See https://dev.mysql.com/doc/refman/5.7/en/group-by-functions.html for details.

Unless otherwise stated, aggregate functions ignore NULL values.
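A minimal sketch of the NULL-handling rule, assuming Python's built-in sqlite3 module as a stand-in for MySQL and a hypothetical three-row table t (neither comes from the article). SUM, AVG and COUNT(fee) skip the NULL row, while COUNT(*) counts rows rather than values.

```python
import sqlite3

# Hypothetical table with one NULL fee; SQLite stands in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (fee INTEGER)")
conn.executemany("INSERT INTO t (fee) VALUES (?)", [(10,), (None,), (30,)])

# SUM, AVG and COUNT(fee) ignore the NULL row; COUNT(*) counts every row.
total, average, non_null, all_rows = conn.execute(
    "SELECT SUM(fee), AVG(fee), COUNT(fee), COUNT(*) FROM t"
).fetchone()

print(total, average, non_null, all_rows)  # -> 40 20.0 2 3
```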

2. Use of aggregate functions

Aggregate functions are usually applied to each group produced by a GROUP BY clause. If a query does not use GROUP BY to group the result set, the aggregate function operates on all rows of the result set as a single group. To illustrate the use of aggregate functions, a test table named member is created; its structure and data are as follows (output of SELECT * FROM member):

  

1) Using aggregate functions directly on the result set

For example, to calculate the total membership fees of all members with the aggregate function SUM(), you can use:

SELECT SUM(fee) AS total_fee FROM member    # total membership fees of all members

The query results are:

  

The SUM function adds up the fee column over all rows. In the same way, you can compute the average (AVG), the maximum (MAX), and so on.

In addition, you can use a WHERE clause to restrict the rows being aggregated. For example, to query the total membership fees of the members whose country is China:

SELECT SUM(m.fee) AS total_fee FROM member AS m WHERE m.country = 'China'

The results are shown below:
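The two queries in this subsection can be tried with the sketch below. It assumes Python's sqlite3 module in place of MySQL, and the seven rows are hypothetical sample data, not the article's original table; only the column names (id, name, sex, country, fee) follow the text.

```python
import sqlite3

# Hypothetical sample rows: (id, name, sex, country, fee).
ROWS = [
    (1, "Li", "male", "China", 8000),
    (2, "Wang", "female", "China", 4000),
    (3, "Zhang", "male", "China", 3000),
    (4, "Alice", "female", "US", 3000),
    (5, "Bob", "male", "US", 2500),
    (6, "Carol", "female", "UK", 2000),
    (7, "Dave", "male", "UK", 1000),
]

conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL here
conn.execute(
    "CREATE TABLE member (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE, "
    "sex TEXT, country TEXT, fee INTEGER)"
)
conn.executemany("INSERT INTO member VALUES (?, ?, ?, ?, ?)", ROWS)

# No GROUP BY: the aggregate runs over every row of the result set.
all_total = conn.execute("SELECT SUM(fee) FROM member").fetchone()[0]

# WHERE narrows the rows first, then SUM runs over what is left.
china_total = conn.execute(
    "SELECT SUM(fee) FROM member WHERE country = 'China'"
).fetchone()[0]

print(all_total, china_total)  # -> 23500 15000
```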

  

2) Using aggregate functions after grouping the result set with GROUP BY

What if you want the total membership fees for each country? This requires first grouping all rows by the country field, so that rows with the same country value fall into the same group, and then summing within each group. The number of rows in the result therefore equals the number of distinct values in the grouping field. The members come from three countries (China, US, UK), so the grouped aggregate query returns 3 rows:

SELECT m.country, SUM(m.fee) AS total_fee FROM member AS m GROUP BY country    # total membership fees for each country

The query computes the total membership fees for each country and returns one row per country:
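A minimal sketch of grouping by one field, assuming Python's sqlite3 module in place of MySQL and hypothetical sample rows (not the article's data). It checks that the number of result rows equals the number of distinct country values.

```python
import sqlite3

# Hypothetical sample rows: (id, name, sex, country, fee).
ROWS = [
    (1, "Li", "male", "China", 8000),
    (2, "Wang", "female", "China", 4000),
    (3, "Zhang", "male", "China", 3000),
    (4, "Alice", "female", "US", 3000),
    (5, "Bob", "male", "US", 2500),
    (6, "Carol", "female", "UK", 2000),
    (7, "Dave", "male", "UK", 1000),
]

conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL here
conn.execute(
    "CREATE TABLE member (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE, "
    "sex TEXT, country TEXT, fee INTEGER)"
)
conn.executemany("INSERT INTO member VALUES (?, ?, ?, ?, ?)", ROWS)

# One result row per distinct country, each carrying that group's SUM(fee).
rows = conn.execute(
    "SELECT country, SUM(fee) FROM member GROUP BY country"
).fetchall()
totals = dict(rows)  # row order is not guaranteed without ORDER BY

distinct_countries = conn.execute(
    "SELECT COUNT(DISTINCT country) FROM member"
).fetchone()[0]

print(totals["China"], totals["US"], totals["UK"])  # -> 15000 5500 3000
print(len(rows) == distinct_countries)  # -> True
```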

  

Both standard SQL and MySQL provide the HAVING clause, which filters the groups produced by GROUP BY and yields a new result set. For example, the query in 1) above for the total membership fees of Chinese members can also be written with HAVING:

SELECT m.country, SUM(m.fee) AS total_fee FROM member AS m GROUP BY m.country HAVING m.country = 'China'    # total fees of Chinese members using HAVING

The result is the same as above:

  

Compared with filtering directly with WHERE as in 1), this approach is somewhat redundant. Why? Because country is the grouping column here, filtering on it again with HAVING gains nothing: the same rows can be selected beforehand with WHERE. Then can a nonaggregated column that is not a grouping column be used as a HAVING condition? The answer is that it is not only meaningless but also not allowed. A nonaggregated column here means a column of the queried table that is neither wrapped in an aggregate function nor listed in GROUP BY. The result of a GROUP BY query does not contain such columns at all, so the column cannot be resolved when the SQL statement is parsed. For example, to get the total fees of the male members in each country, you might try to filter on sex with HAVING in the statement above, like this:

SELECT m.country, SUM(m.fee) AS total_fee FROM member AS m GROUP BY m.country HAVING m.sex = 'male'    # error: tries to filter on a nonaggregated column with HAVING

Execution fails with error 1054:

[Err] 1054 - Unknown column 'm.sex' in 'having clause'

The message reports that the column m.sex is unknown, even though it is qualified with the table alias m. So how do you get the total fees of the male members in each country? Use a WHERE clause to filter the rows before grouping with GROUP BY:

SELECT m.country, SUM(m.fee) AS total_fee FROM member AS m WHERE m.sex = 'male' GROUP BY country    # filter with WHERE before grouping

Produces the following result:
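A minimal sketch of filtering with WHERE before grouping, assuming Python's sqlite3 module in place of MySQL; the rows and the value 'male' are hypothetical sample data, not the article's table.

```python
import sqlite3

# Hypothetical sample rows: (id, name, sex, country, fee).
ROWS = [
    (1, "Li", "male", "China", 8000),
    (2, "Wang", "female", "China", 4000),
    (3, "Zhang", "male", "China", 3000),
    (4, "Alice", "female", "US", 3000),
    (5, "Bob", "male", "US", 2500),
    (6, "Carol", "female", "UK", 2000),
    (7, "Dave", "male", "UK", 1000),
]

conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL here
conn.execute(
    "CREATE TABLE member (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE, "
    "sex TEXT, country TEXT, fee INTEGER)"
)
conn.executemany("INSERT INTO member VALUES (?, ?, ?, ?, ?)", ROWS)

# WHERE drops the non-matching rows first; GROUP BY then sees only male members.
male_totals = dict(
    conn.execute(
        "SELECT country, SUM(fee) FROM member WHERE sex = 'male' GROUP BY country"
    ).fetchall()
)

print(male_totals["China"], male_totals["US"], male_totals["UK"])  # -> 11000 2500 1000
```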

   

Therefore, HAVING does not act on the rows before grouping; it filters the result of the grouping, and its condition is normally an aggregated column, that is, a column produced by an aggregate function in the select list, such as SUM(fee) here. Filtering on an aggregated column in HAVING returns the groups whose aggregate value satisfies the condition. For example, the following statement computes the total membership fees for each country and keeps only the countries whose total exceeds 10000:

SELECT m.country, SUM(m.fee) FROM member AS m GROUP BY m.country HAVING SUM(m.fee) > 10000    # countries whose total membership fees exceed 10000

The result contains only China:
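A minimal sketch of filtering groups on an aggregate value with HAVING, assuming Python's sqlite3 module in place of MySQL and hypothetical sample rows chosen so that only China exceeds 10000. It uses the standard form HAVING SUM(fee) > 10000 rather than an alias.

```python
import sqlite3

# Hypothetical sample rows: (id, name, sex, country, fee).
ROWS = [
    (1, "Li", "male", "China", 8000),
    (2, "Wang", "female", "China", 4000),
    (3, "Zhang", "male", "China", 3000),
    (4, "Alice", "female", "US", 3000),
    (5, "Bob", "male", "US", 2500),
    (6, "Carol", "female", "UK", 2000),
    (7, "Dave", "male", "UK", 1000),
]

conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL here
conn.execute(
    "CREATE TABLE member (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE, "
    "sex TEXT, country TEXT, fee INTEGER)"
)
conn.executemany("INSERT INTO member VALUES (?, ?, ?, ?, ?)", ROWS)

# Groups are formed first; HAVING then keeps only groups whose SUM passes.
big_groups = conn.execute(
    "SELECT country, SUM(fee) FROM member "
    "GROUP BY country HAVING SUM(fee) > 10000"
).fetchall()

print(big_groups)  # -> [('China', 15000)]
```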

  

This is the syntax of standard SQL. MySQL extends HAVING so that it also accepts the alias of an aggregated column in the condition. The query above can then be written with an alias:

SELECT m.country, SUM(m.fee) AS total_fee FROM member AS m GROUP BY m.country HAVING total_fee > 10000    # using an alias in HAVING

The result is still:

  

3) Using aggregate functions after grouping by multiple fields with GROUP BY

If grouping by one field produces as many result rows as that field has distinct values, what happens when several grouping fields are used? For example, building on the query above, to get the total membership fees of the male and female members in each country, you can use the following statement:

SELECT m.country, m.sex, SUM(m.fee) AS total_fee FROM member AS m GROUP BY country, sex    # total membership fees of male and female members in each country

The results are as follows:

  

As the results show, the male members in China have the highest total fees, while the male members in the UK have the lowest. There are three countries, so grouping by country alone gives only three rows. When the groups are further subdivided by sex, each country group is split according to the distinct sex values. Since sex has only two values, each country is split into two subgroups, and the three countries yield 6 groups in total (6 = 3 x 2), so the result has 6 rows.

To see how many rows each subdivided group contains, add the aggregate function COUNT(*) at the end of the select list:

SELECT m.country, m.sex, SUM(m.fee) AS total_fee, COUNT(*) AS member_count FROM member AS m GROUP BY country, sex    # group by multiple fields and count the rows in each group

The results are as follows:
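A minimal sketch of grouping by two fields with a per-group COUNT(*), assuming Python's sqlite3 module in place of MySQL and hypothetical sample rows (not the article's table). With three countries and two sex values that all occur together, the query returns 3 x 2 = 6 groups.

```python
import sqlite3

# Hypothetical sample rows: (id, name, sex, country, fee).
ROWS = [
    (1, "Li", "male", "China", 8000),
    (2, "Wang", "female", "China", 4000),
    (3, "Zhang", "male", "China", 3000),
    (4, "Alice", "female", "US", 3000),
    (5, "Bob", "male", "US", 2500),
    (6, "Carol", "female", "UK", 2000),
    (7, "Dave", "male", "UK", 1000),
]

conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL here
conn.execute(
    "CREATE TABLE member (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE, "
    "sex TEXT, country TEXT, fee INTEGER)"
)
conn.executemany("INSERT INTO member VALUES (?, ?, ?, ?, ?)", ROWS)

# Each distinct (country, sex) pair forms one group with its own SUM and COUNT.
rows = conn.execute(
    "SELECT country, sex, SUM(fee), COUNT(*) FROM member GROUP BY country, sex"
).fetchall()
groups = {(country, sex): (total, n) for country, sex, total, n in rows}

print(len(groups))                # -> 6
print(groups[("China", "male")])  # -> (11000, 2)
print(groups[("UK", "male")])     # -> (1000, 1)
```

The per-group counts add up to the number of rows in the table, since every row belongs to exactly one group.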

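In MySQL 5.7 the results above are sorted implicitly by the GROUP BY columns (MySQL 8.0 no longer sorts GROUP BY results implicitly). To specify the sort order yourself, use ORDER BY; for example, to sort the results above by sex:

SELECT m.country, m.sex, SUM(m.fee) AS total_fee, COUNT(*) AS member_count FROM member AS m GROUP BY country, sex ORDER BY sex    # sort the grouped results by sex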

The results are as follows:
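A minimal sketch of sorting grouped results explicitly, assuming Python's sqlite3 module in place of MySQL and hypothetical sample rows. The second sort key, country, is an addition of this sketch so that the output order is fully determined.

```python
import sqlite3

# Hypothetical sample rows: (id, name, sex, country, fee).
ROWS = [
    (1, "Li", "male", "China", 8000),
    (2, "Wang", "female", "China", 4000),
    (3, "Zhang", "male", "China", 3000),
    (4, "Alice", "female", "US", 3000),
    (5, "Bob", "male", "US", 2500),
    (6, "Carol", "female", "UK", 2000),
    (7, "Dave", "male", "UK", 1000),
]

conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL here
conn.execute(
    "CREATE TABLE member (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE, "
    "sex TEXT, country TEXT, fee INTEGER)"
)
conn.executemany("INSERT INTO member VALUES (?, ?, ?, ?, ?)", ROWS)

# ORDER BY is applied to the grouped rows, after aggregation.
ordered = conn.execute(
    "SELECT country, sex, SUM(fee) FROM member "
    "GROUP BY country, sex ORDER BY sex, country"
).fetchall()

for row in ordered:
    print(row)
```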

What if the results are subdivided further by other fields? The principle is the same as with two fields: the deeper the grouping, the more rows the result has. In any case, you will find that every row of the grouped result is distinct. This is a property of grouping: GROUP BY itself merges the rows that share the same grouping values, and each aggregate column is computed over the merged group, so each group yields exactly one row.

So what happens when you group by a table's primary key or by a non-null unique field? In this test table, id is the primary key and name is a non-null field with a unique constraint. The following MySQL statements group by id and by name, respectively:

SELECT SUM(m.fee) AS total_fee, COUNT(*) AS member_count FROM member AS m GROUP BY id    # group by the primary key id

The results are as follows:

  

SELECT SUM(m.fee) AS total_fee, COUNT(*) AS member_count FROM member AS m GROUP BY name    # group by the non-null unique field name

The results are as follows:

  

Clearly, the aggregate results are identical in the two groupings: every group contains the same number of rows, namely 1. Grouping by the primary key or by a non-null unique field therefore gives the same result, with one result row for every row of the table. Doing so may not be very useful, but it helps in understanding how GROUP BY works.
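A minimal sketch of grouping by a key column, assuming Python's sqlite3 module in place of MySQL and hypothetical sample rows in which id is the primary key and name is NOT NULL UNIQUE, as the article describes. Every group then holds exactly one row.

```python
import sqlite3

# Hypothetical sample rows: (id, name, sex, country, fee).
ROWS = [
    (1, "Li", "male", "China", 8000),
    (2, "Wang", "female", "China", 4000),
    (3, "Zhang", "male", "China", 3000),
    (4, "Alice", "female", "US", 3000),
    (5, "Bob", "male", "US", 2500),
    (6, "Carol", "female", "UK", 2000),
    (7, "Dave", "male", "UK", 1000),
]

conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL here
conn.execute(
    "CREATE TABLE member (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE, "
    "sex TEXT, country TEXT, fee INTEGER)"
)
conn.executemany("INSERT INTO member VALUES (?, ?, ?, ?, ?)", ROWS)

by_id = conn.execute(
    "SELECT id, SUM(fee), COUNT(*) FROM member GROUP BY id"
).fetchall()
by_name = conn.execute(
    "SELECT name, SUM(fee), COUNT(*) FROM member GROUP BY name"
).fetchall()

# Unique grouping values mean one group per table row, each with COUNT(*) = 1.
print(len(by_id), len(by_name))           # -> 7 7
print({count for _, _, count in by_id})   # -> {1}
```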

3. Summary

1) You can apply aggregate functions directly to a field, or apply them after filtering the rows with a WHERE clause;

2) HAVING is meant for filtering groups; in a query without GROUP BY the whole result set forms a single group, so HAVING has little to filter there;

3) The result of a GROUP BY query can be filtered with HAVING, whose condition should preferably be an aggregate column (one in the select list or another aggregate function);

4) Use WHERE to filter the rows before GROUP BY, and use HAVING to filter on aggregate columns after grouping;

5) You can use ORDER BY to sort the results by a field (including a GROUP BY field) or by an aggregate column;

6) When grouping by the primary key or by a non-null unique field, the result has one row for every row of the table.

4. References

[1] MySQL official documentation. URL: https://dev.mysql.com/doc/refman/5.7/en/group-by-functions.html
