Original Address http://hi.baidu.com/sdunus/blog/item/333c1a29305041fe99250aad.html
Before we introduce the group BY and HAVING clause, we must first talk about a special function in the SQL language: clustering functions such as SUM, COUNT, MAX, AVG, and so on. The fundamental difference between these functions and other functions is that they generally function on more than one record.
SELECT SUM (population) from BBC
The SUM function here is on the population field of all returned records, and the result is that the query returns only one result, that is, the total population of all countries.
By using the GROUP BY clause, you can have the sum and COUNT functions work on data that belongs to a group. When you specify group by regions, a set of data that belongs to the same region will return only one row of values, that is, all fields except region (regions) in the table can only be returned with a value after the SUM, count, and other aggregate function operations.
The HAVING clause allows us to filter groups of data, where clauses filter records before aggregation. That is, before the GROUP BY clause and the HAVING clause, and the HAVING clause filters groups of records after aggregation.
Let's take a concrete example to understand the group BY and having clauses.
SQL instance:
First, show the total population and area of each region:
SELECT region, sum (population), sum (area)
From BBC
GROUP by region
The return record is divided into groups with region first, which is the literal meaning of group by. After the group is finished, the different fields (one or more records) in each group are calculated with an aggregate function.
Second, show the total population and area of each area. Only those areas with more than 1000000 area are shown.
SELECT region, sum (population), sum (area)
From BBC
GROUP by region
Have SUM (area) >1000000
Here, we cannot use where to filter more than 1000000 of the area, because there is no such record in the table.
Instead, the HAVING clause allows us to filter groups of data after the group.
In a query statement with a GROUP BY clause, the column specified in the select list is either the column specified in the group by sentence or contains a clustered function . such as select Max (SAL), the job from the EMP group by job
The Select, group by, and having words of a query statement are the only places where the cluster function appears, and the clustering function cannot occur in the WHERE clause. Such as:
Select Deptno,sum (SAL) from EMP
where sal>1200
GROUP BY DEPTNO have sum (SAL) >8500
Order BY Deptno;
When the HAVING clause is used in the GROPU by clause, only the groups that satisfy the having condition are returned in the query results. There can be a WHERE clause and a HAVING clause in an SQL statement. The having is similar to the WHERE clause, and is used to set the qualification conditions.
The purpose of the WHERE clause is to remove rows that do not conform to the where condition before grouping the results of the query, that is, to filter the data before grouping, the condition cannot contain a clustering function, and the Where condition is used to display a particular row.
having The purpose of the clause is to filter the groups that meet the criteria, that is, to filter the data after grouping, often including clustering functions, to display specific groups using the having condition, or to group by using multiple grouping criteria.
Note: Each column name specified in the SELECT statement is also mentioned in the GROUP by clause. The column names not mentioned in these two places produce an error.
Query the number of employees for each position in each department
Select Deptno,job,count (*) from the EMP Group by Deptno,job
Considerations for using SQL aggregation functions (clustering functions)