1. Set operation
SQL functions in the Union, intersect, and except (minus instead of except) operations in the relationship, corresponding to the intersection and difference operations in the mathematical set theory. Each of the three operations automatically removes duplicates.
eg
Find all courses in the fall semester of 2009 but not in spring 2010
(Select course_id
FROM clause
where semester= ' Fall ' and year=2009)
Except
(Select course_id
FROM clause
where semester= ' Spring ' and year=2010);
2, the problem of NULL value
Null values bring special problems to relational operations, including arithmetic operations, comparison operations, and set operations.
If any of the inputs in an arithmetic expression are empty, the result of the arithmetic expression (involving such things as +-*/) is empty. For example, R.a+5, and R.A is empty, the result of the expression is also empty.
In the comparison operation, consider the comparison "1<null", then do not know what the result is good. SQL treats the result of any comparison operation involving null values as unknown, which creates a third logical value other than true and false. The result of "1<null" is unknown.
Because the predicates in the WHERE clause can use Boolean operations such as and, or, and not for comparison results, these definitions can also be extended to handle unknown.
The result of And:true and unknown is unknown,false and unknown result is false,unknown and unknown result is unknown.
The result of Or:true or unknown is true,false and unknown result is unknown,unknown and unknown result is unknown.
The result of not:not unknown is unknown.
* If the WHERE clause predicate evaluates to a tuple false or unknown, then the tuple is not written to the result set.
SQL uses the special predicate NULL to test for null values.
Eg: Find all teachers with null values in the instructor relationship salary. can be written
Select Name
From instructor
where salary is null;
The question of null values in the aggregation function is explained in detail below.
3. Aggregation function
A clustered function is a function that returns a single value in a collection (set or multiset) as input. SQL provides 5 intrinsic aggregation functions:
Average: Avg
Minimum value: Min
Maximum value: Max
Sum: Sum
Count: Count
1) Basic Aggregation
Consider "finding out the average salary of computer SCI teachers". We write the query as follows:
Select AVG (Salary) as avg_salary/* One of the most common occasions for renaming operations */
From instructor
where dept_name= ' Comp.sci ';
The aggregate function retains duplicate records by default, and it is important to keep repeating tuples in the calculation of averages. But there are cases where we want to remove duplicates, you can use distinct in a clustered expression
Consider "finding out the total number of teachers who teach a course in the spring semester of 2010", in this case, whether a teacher teaches a few course segments, he should only be counted once, and the query is as follows:
Select COUNT (DISTINCT ID)
From teaches
where semester= ' Spring ' and year=2010;
We often use count to calculate the total number of tuples in a relationship. The wording is as follows:
Select COUNT (*)
From R;
Note: SQL does not allow distinct to be used with count (*), while using distinct in Max and Min is legal, although the use is the same as not using the result.
2) Grouping aggregation: GROUP BY clause
One or more properties given in the GROUP BY clause are used to construct the grouping. Tuples that have the same value in all the attributes in the GROUP BY clause are divided into the same group.
Consider the query "find the average wage for each department", which is written as follows:
Select Dept_name,avg (Salary) as Avg_salary
From instructor
Group BY Dept_name;
Note: When SQL uses grouping aggregation, it is important to ensure that properties that appear in the SELECT clause but not clustered can only be those properties that appear in the GROUP BY clause. That is, the property that appears in the GROUP BY clause must be a nonclustered property that appears in the SELECT clause. The following example is an error query:
Select Dept_name,id,avg (Salary)
From instructor
Group BY Dept_name; /* This statement is grouped by dept_name and the ID cannot appear in the SELECT clause */
3) HAVING clause
Sometimes, it is more useful to group-qualified conditions than to tuple-qualified conditions. For example, we may be interested only in a department with a teacher's average salary of more than $42000. This condition is not for a single tuple but for a tuple of GROUP BY clauses, and a HAVING clause is required.
It must be noted that theHAVING clause cannot be used alone and must be used with the GROUP BY clause.
The example queries are as follows:
Select Dept_name,avg (Salary) as Avg_salary
From instructor
GROUP BY Dept_name
Having avg (Salary) >42000;
4) Aggregation of NULL values and Boolean values
Suppose some tuples in a instructor relationship take a null value on the salary. Consider the following statement:
Select SUM (Salary)
from instructor;
As previously mentioned, there will be a problem, that is, the value of the above query to be summed contains null values. Therefore, in this operation, SQL needs to ignore the null value in it.
In the SQL standard, all aggregate functions ignore null values in the input collection except for count (*).
SQL Learning Note 3--set operations, null values, and aggregation functions