Document directory
- 8.2 application of Aggregate functions
- 8.2.3 Maximum/minimum functions-max ()/min ()
- 8.2.4 mean function -- AVG ()
- 8.2.5 processing of aggregated analysis values
- 8.2.6 combination of Aggregate functions
8.2 application of Aggregate functions
Aggregate functions are widely used in the query and Analysis of database data. This section describes the application of Aggregate functions.
8.2.1 sum function -- sum ()
Sum () is used to sum data and return the sum of all values in the selected result set. The syntax is as follows.
Select sum (column_name)
From table_name
Note: The sum () function can only act on numeric data, that is, the data in column column_name must be numeric.
Use of instance 1 sum function
Query the total salaries of all male teachers from the teacher table. For the structure and data of the teacher table, see Table 5-1 in section 5.2.1. Instance code:
Select sum (SAL) as boysal
From teacher
Where tsex = 'male'
The running result is 8.1.
Figure 8.1 total salaries of all male teachers in the teacher table
Processing of null values by instance 2 sum function
Query the total salaries of teachers over 40 years old from the teacher table. Instance code:
Select sum (SAL) as oldsal
From teacher
Where age> = 40
The running result is 8.2.
Figure 8.2 Total salaries of all instructors older than 40 in the teacher table
When a column is summed, if the column has a null value, the sum function ignores this value.
8.2.2 counting function -- count ()
The count () function is used to calculate the number of records in a table or the number of values in a column. The calculation content is specified by the SELECT statement. When using the count function, you must specify a column name or use an asterisk. An asterisk indicates that all records in a table are calculated. The two methods are as follows.
Count (*) is used to calculate the total number of rows in the Table. Even if the row data in the table is null, it is included.
Count (column) is used to calculate the number of rows contained in the column. If the data of a row in this column is null, this row is not counted in the total number of statistics.
1. Use the count (*) function to count the number of rows in the table
The count (*) function returns a function that meets the search criteria in the WHERE clause of the SELECT statement.
Use of instance 3 count (*) Function
Queries the number of rows of all records in the teacher table. Instance code:
Select count (*) as totalitem
From teacher
The running result is 8.3.
Figure 8.3 count the number of rows in a table using the count (*) Function
In this example, if the SELECT statement does not contain the WHERE clause, all rows in the table are considered to meet the SELECT statement. Therefore, the SELECT statement returns the count of all rows in the table, the results are consistent with the data in the teacher table listed in table 5-1 in section 5.2.1.
If the DBMS stores the number of rows in the Table in its system table, count (*) will return the number of rows in the table quickly, because at this time, the DBMS does not have to read the table from start to end, it also counts rows in the physical table and extracts the row count directly from the system table. If the DBMS does not store the number of rows in the system table, it uses the column with the not null constraint as the parameter. Using the count () function, it may count the rows in the table more quickly.
Note: |
The count (*) function accurately returns the total number of rows in the table, and returns the correct row count only when the parameter column of the count () function does not have a null value, therefore, you can use the count () function to replace the count (*) function only when a column restricted by not null is used as a parameter. |
2. Use the count () function to count data in a column
The count () function can be used to count data values in a column. Unlike the count (*) function that ignores all columns, the count () function checks values in one or more columns one by one and counts rows whose values are not null.
Instance 4: query the number of rows of all records in multiple columns
Query the total number of data rows in the TNO column, tname column, and Sal column of the teacher table. Instance code:
Select count (TNO) as total_tno, count (tname) as total_tname,
Count (SAL) as total_sal
From teacher
The running result is 8.4.
Figure 8.4 use the count () function to count data in a column
It can be seen that the TNO and tname columns do not contain null values, so the count is consistent with the result of using the count (*) function for the record count in the teacher table, because the data of two rows in the Sal column is null, these two columns are not included, and the count result is 8.
3. Use the count () function to count data in multiple columns
The count () function can be used not only to count data values in a column, but also to count data values in multiple columns. If you count multiple columns, you need to connect the multiple columns to be counted using a connector and use it as a parameter of the count () function. The following describes how to use an instance with multiple column counts.
Description |
For more information about how to use connectors to connect multiple columns, see section 7.2. |
Instance 5 uses the count () function to count data in multiple columns
Count the number of data rows in the TNO, tname, and Sal columns of the teacher table, as well as the number of data rows in the TNO, tname, tname, and Sal columns. Instance code:
Select count (TNO) as total_tno, count (tname) as total_tname,
Count (SAL) as total_sal,
Count (cast (TNO as varchar (5) + tname) as t_noname,
Count (tname + Cast (SAL as varchar (5) as t_namesal
From teacher
The running result is 8.5.
Figure 8.5 use the count () function to count data in multiple columns
When two columns are connected, they are converted to the same data type using the cast expression because their data types are inconsistent.
As mentioned in section 7.2.1, if any column in the connected column has a null value, the join result is null, and the column is not counted by the count () function.
Note: |
The count () function only counts rows that are passed to the function and are not null. |
4. Use the count function to count records that meet certain conditions.
You can also add some clause constraints in the SELECT statement to specify the number of returned records.
Instance 6: Use the count function to count records that meet certain conditions.
Query the number of female records in the teacher table. Instance code:
Select count (*) as totalwomen
From teacher
Where tsex = 'female'
The running result is 8.6.
Figure 8.6 use the count function to count records that meet certain conditions
The result is 6 instead of 10 for all the preceding records. The where clause can be used to define the counting condition of the count () function, which is inseparable from the execution sequence of each clause in the SELECT statement. As mentioned earlier, DBMS first executes the from clause, then the WHERE clause, and finally the select clause. Therefore, the count () function can only be used for records that meet the query conditions defined by the WHERE clause. Records that are not included in the query results of the WHERE clause do not conform to the count () function.
8.2.3 Maximum/minimum functions-max ()/min ()
You can use the max () function to understand the maximum values in a column. Similarly, you can use the min () function to understand the minimum values in a column. The syntax is as follows.
Select max (column_name)/min (column_name)
From table_name
Description: The data in column_name can be a value, string, or datetime data type. The max ()/min () function returns a single value of the same data type as the passed column.
Use of instance 7 max () function
Query the maximum age of the instructor in the teacher table. Instance code:
Select max (AGE) as maxage
From teacher
The running result is 8.7.
Figure 8.7 maximum instructor age in the teacher table
However, it is not particularly useful to obtain this result in practical use, because it is often necessary to obtain information such as the instructor ID, name, and gender of the largest age of teachers.
However, SQL does not support the following select statements.
Select tname, dname, tsex, max (AGE)
From teacher
In this example, the max function treats the entire teacher table as a group, while the tname, dname, and tsex data are not grouped, therefore, the SELECT statement has no logical meaning. In the same way, the following code is invalid.
Select tname, dname, tsex, Sal, age
From teacher
Where age = max (AGE)
The solution to this problem is to use the subquery in the where clause to return the maximum value, and then query the relevant information based on the maximum value returned.
Instance 8 returns the maximum value using a subquery in the where clause.
Query the number, name, gender, and other information of the oldest instructors in the teacher table.
Instance code:
Select tname, dname, tsex, Sal, age
From teacher
Where age = (select max (AGE) from teacher)
The running result is 8.8.
Figure 8.8 use a subquery in the where clause to return the maximum value
The max () and min () functions can act on both numeric and datetime data types.
Instance 9 max () function for balanced data
The following code is used:
Select max (tname) as maxname
From teacher
The running result is 8.9.
Figure 8.9 result of using max in string data type
It can be seen that the maximum value of a string can also be obtained.
Description |
The maximum value of the string type data is from A ~ Z is ordered. The greater the value, the closer the result is. Of course, Chinese characters are arranged in full pinyin. If the first character is the same, the next character is compared, and so on. |
Of course, for data of the date and time type, you can also calculate its Maximum/minimum value. Its size is arranged in the morning and evening of the date and time. The earlier it is considered that its value is smaller, the more instances shown below.
Instance 10 max () and min () functions are used for time data.
Query the test time of the oldest and oldest test courses in the Course table. For the structure and data of the Course table, see Table 6-1 in section 6.1 of this book. Instance code:
Select min (ctest) as early_date,
Max (ctest) as late_date
From Course
The running result is 8.10.
Figure 8.10 test time of the oldest and oldest test courses in the Course table
The data type of the returned result is the same as that defined in the column.
Note: |
When determining the maximum (minimum) value in a column, the max () (min () function ignores NULL values. However, if the values of all rows in this column are null, the max ()/min () function returns NULL. |
8.2.4 mean function -- AVG ()
The AVG () function is used to calculate the average value of data values in a column. The syntax is as follows.
Select AVG (column_name)
From table_name
Note: the execution process of the AVG () function is to add up the values in a column and divide them by the number of non-null values. Therefore, like the sum () function, the AVG () function can only act on numeric data, that is, the data in column_name must be numeric.
Application of the AVG () function of instance 11
Query the average age of all instructors from the teacher table. Instance code:
Select AVG (AGE) as avg_age
From teacher
The running result is 8.11.
Figure 8.11 average age of all instructors in the teacher table
When the average value is calculated, the AVG () function ignores NULL values. Therefore, if the column to calculate the average value has a null value, pay special attention to it when calculating the average value.
Processing of null values by instance 12 AVG () function
Query the average salaries of all instructors from the teacher table. Instance code:
Select AVG (SAL) as avg_age1, sum (SAL)/count (*) as avg_age2,
Sum (SAL)/count (SAL) as avg_age3
From teacher
The running result is 8.12.
Figure 8.12 average salaries of all instructors in the teacher table
Different results are displayed. In fact, the "AVG (SAL)" and "sum (SAL)/count (SAL)" statements are equivalent. Because the execution process of the AVG (SAL) statement is to add the values in the Sal column and then sum them (equivalent to sum (SAL )) divide by the number of non-null values (equivalent to count (SAL )). The statement "sum (SAL)/count (*)" is not true, because count (*) returns the number of all records in the table, regardless of whether the value in the Sal column is null.
Note: |
When calculating the average value of a column, the AVG () function ignores NULL values. However, if the values of all rows in this column are null, the AVG () function returns NULL. |
If you do not want to average all values in the column, you can use the search condition in the WHERE clause to limit the rows used to calculate the mean.
Instance 13 uses search conditions in the WHERE clause to limit the rows used to calculate the mean.
Query the average age of all computer teachers from the teacher table. Instance code:
Select AVG (AGE) as avgcomputer_age
From teacher
Where dname = 'computer'
The running result is 8.13.
Figure 8.13 average age of all computer teachers in the teacher table
When a SELECT statement is executed, DBMS evaluates the Search Condition "dname = 'put'" in the WHERE clause for each row in the table. The age value in the row is passed to the AVG (AGE) function only when the search condition is true.
Of course, in addition to displaying the average value of a column in the table, the AVG () function can also be used as part of the WHERE clause. Like the max () function described earlier, the where clause cannot be used directly. It must be in the form of a subquery.
Instance 14 AVG () function as part of the search condition in the WHERE clause
Query Information of all instructors of higher age than the average age from the teacher table. Instance code:
Select *
From teacher
Where age> = (select AVG (AGE) from teacher)
Order by age
The running result is 8.14.
Figure 8.14 information of all instructors older than the average age in the teacher table
8.2.5 processing of aggregated analysis values
The five Aggregate functions described above can act on all data in the selected column (no matter whether the data in the column is reset or not), or only process the non-value values in the column, that is, duplicate values are retrieved for aggregation analysis only once. Of course, for the max ()/min () function, the value processing is of little significance.
You can use the all keyword to indicate that all data in the selected column is processed, and the distinct keyword is used to indicate that non-Valued Data in the selected column is processed. Take the AVG () function as an example. The syntax is as follows.
Select AVG ([All/distinct] column_name)
From table_name
Note: [All/distinct] by default, the All keyword is used to process all data regardless of whether there are values. Other Aggregate functions are used in the same way.
Note: |
Microsoft Access database does not support the distinct keyword in aggregate functions. |
Instance 15 aggregation analysis value processing
Query the number of all records in the salary Sal column from the teacher table. Instance code:
Select count (all Sal) as allsal_count
From teacher
The running result is 8.15.
Figure 8.15 Number of all records in the salary Sal column in the teacher table
Of course, you can get the same result by removing the all keyword in the code. If you query the number of different records in the salary Sal column from the teacher table, you can use the following code.
Select count (distinct Sal) as distinctsal_count
From teacher
The running result is 8.16.
Figure 8.16 number of different records in the Sal column in the teacher table
After the distinct keyword is used, the values in the salary Sal column are not included in the statistical range. In addition, among all the five Aggregate functions, except the count (*) function, other functions ignore null values during calculation, that is, the null value row is excluded and not analyzed.
8.2.6 combination of Aggregate functions
In the previous example, Aggregate functions are used independently. Aggregate functions can also be used in combination, that is, multiple Aggregate functions can be used in a select statement.
Instance 16 uses multiple Aggregate functions
The following code:
Select count (*) as num_items,
Max (SAL) as max_sal,
Min (AGE) as min_age,
Sum (SAL)/count (SAL) as avg_sal,
AVG (distinct Sal) as disavg_sal
From teacher
The running result is 8.17.
Figure 8.17 combined application of Aggregate functions
In this example, almost all Aggregate functions are used in a select statement. Num_items indicates the entries of all records in the teacher table, max_sal indicates the highest wage recorded in the teacher table, min_age indicates the minimum age recorded in the teacher table, and avg_sal indicates the average value of the wage records in all teacher tables, disavg_sal is the average value of all different wage records in the teacher table.