The first time I saw such an SQL statement, I couldn't understand it, and I used the following uncommonly used
Clustering function: GROUPING
The ROLLUP operator used to summarize data
SELECT
Case grouping (o. customerid) WHEN 0 THEN o. customerid ELSE '(Total)' END
AS AllCustomersSummary,
Case grouping (od. orderid) WHEN 0 THEN od. orderid ELSE-1 END
AS IndividualCustomerSummary,
SUM (od. quantity * od. unitprice) AS price
FROM Orders o, [Order Details] od
WHERE Year (o. orderdate) = 1998 AND od. orderid = o. orderid
Group by o. customerid, od. orderid WITH ROLLUP
Order by AllCustomersSummary
Check the help of SQL Server to find out, amazing, there is such a powerful thing, can not help but think of the previous Crystal Report, the original SQL Server can achieve such a function.
1. Use CUBE to summarize data
The result set generated by the CUBE operator is a multi-dimensional dataset. A multi-dimensional dataset is an extension of fact data. Fact data is the data that records individual events. The extension is created on the columns that the user intends to analyze. These columns are called dimensions. A multi-dimensional dataset is a result set that contains all possible combinations of cross tables for each dimension.
The CUBE operator is specified in the group by clause of the SELECT statement. The statement selection list should contain dimension columns and aggregate function expressions. The dimension column and keyword with cube should be specified for group. The result set contains all possible combinations of values in the dimension column, and the aggregated values in the basic rows that match the combination of these dimension values.
For example, a simple tableInventoryIncluding:
Item Color Quantity -------------------- -------------------- -------------------------- Table Blue 124 Table Red 223 Chair Blue 101 Chair Red 210
The returned results of the following queries includeItemAndColorAll possible combinationsQuantitySubtotal:
SELECT Item, Color, SUM(Quantity) AS QtySumFROM InventoryGROUP BY Item, Color WITH CUBE
The following is the result set:
Item Color QtySum -------------------- -------------------- -------------------------- Chair Blue 101.00 Chair Red 210.00 Chair (null) 311.00 Table Blue 124.00 Table Red 223.00 Table (null) 347.00 (null) (null) 658.00 (null) Blue 225.00 (null) Red 433.00
We focus on the following:
Chair (null) 311.00
This line reportsItemThe value in the dimension is the subtotal of all rows of Chair. PairColorThe dimension returns a NULL value, indicating that the aggregation reported by this row includesColorThe row whose dimension is any value.
Table (null) 347.00
This line is similar, but the report isItemThe mean value of a dimension is the subtotal of all rows in the Table.
(null) (null) 658.00
This row reports the total number of multidimensional datasets.ItemAndColorThe dimension value is NULL, indicating that all values in the two dimensions are aggregated in the row.
(null) Blue 225.00 (null) Red 433.00
The two rows reportedColorSubtotal of a dimension. In the two rowsItemAll dimension values are NULL, indicating that the aggregated data comes fromItemThe row whose dimension is any value.
Use GROUPING to differentiate null values
The NULL value generated by the CUBE operation brings about a problem: how to distinguish between the NULL value generated by the CUBE operation and the NULL value returned from the actual data? This problem can be solved by the GROUPING function. If the value in the column comes from the fact data, the GROUPING function returns 0. If the value in the column is NULL generated by the CUBE operation, 1 is returned. In the CUBE operation, the generated NULL represents the whole value. You can use the 'select' statement to replace the generated NULL with the string 'all. Because NULL in the fact data indicates that the data value is UNKNOWN, the SELECT statement can also be decoded as the returned string UNKNOWN to replace NULL from the fact data. For example:
SELECT CASE WHEN (GROUPING(Item) = 1) THEN 'ALL' ELSE ISNULL(Item, 'UNKNOWN') END AS Item, CASE WHEN (GROUPING(Color) = 1) THEN 'ALL' ELSE ISNULL(Color, 'UNKNOWN') END AS Color, SUM(Quantity) AS QtySumFROM InventoryGROUP BY Item, Color WITH CUBE
Multi-dimensional dataset
The CUBE operator can be used to generateNDimension, that is, a cube with any number of dimensions. Only one dimension cube can be used to generate a total. For example:
SELECT CASE WHEN (GROUPING(Item) = 1) THEN 'ALL' ELSE ISNULL(Item, 'UNKNOWN') END AS Item, SUM(Quantity) AS QtySumFROM InventoryGROUP BY Item WITH CUBEGO
The result set returned by this SELECT statement is displayedItemReturns the subtotal of each value inItemTotal of all values in:
Item QtySum -------------------- -------------------------- Chair 311.00 Table 347.00 ALL 658.00
SELECT statements containing cubes with many dimensions may generate large result sets because these statements generate rows for all combinations of values in all dimensions. These large result sets may contain too much data and are not easy to read and understand. One solution to this problem is to place the SELECT statement in the View:
CREATE VIEW InvCube ASSELECT CASE WHEN (GROUPING(Item) = 1) THEN 'ALL' ELSE ISNULL(Item, 'UNKNOWN') END AS Item, CASE WHEN (GROUPING(Color) = 1) THEN 'ALL' ELSE ISNULL(Color, 'UNKNOWN') END AS Color, SUM(Quantity) AS QtySumFROM InventoryGROUP BY Item, Color WITH CUBE
Then you can use this view to query only the dimension values you are interested in:
SELECT *FROM InvCubeWHERE Item = 'Chair' AND Color = 'ALL'Item Color QtySum -------------------- -------------------- -------------------------- Chair ALL 311.00 (1 row(s) affected)
2. Use ROLLUP to summarize data
The ROLLUP operator is useful when generating reports that contain subtotal and aggregate. The result set generated by the ROLLUP operator is similar to the result set generated by the CUBE operator. For more information.
The difference between CUBE and ROLLUP is:
- The result set generated by CUBE displays the aggregation of all the combinations of the values in the selected column.
- The result set generated by ROLLUP displays the aggregation of a certain hierarchy of values in the selected column.
For example, a simple tableInventoryIncluding:
Item Color Quantity -------------------- -------------------- -------------------------- Table Blue 124 Table Red 223 Chair Blue 101 Chair Red 210
The following query generates a subtotal report:
SELECT CASE WHEN (GROUPING(Item) = 1) THEN 'ALL' ELSE ISNULL(Item, 'UNKNOWN') END AS Item, CASE WHEN (GROUPING(Color) = 1) THEN 'ALL' ELSE ISNULL(Color, 'UNKNOWN') END AS Color, SUM(Quantity) AS QtySumFROM InventoryGROUP BY Item, Color WITH ROLLUPItem Color QtySum -------------------- -------------------- -------------------------- Chair Blue 101.00 Chair Red 210.00 Chair ALL 311.00 Table Blue 124.00 Table Red 223.00 Table ALL 347.00 ALL ALL 658.00 (7 row(s) affected)
If the ROLLUP keyword in the query is changed to CUBE, the CUBE result set is the same as the preceding result, but the following two rows are returned at the end of the result set:
ALL Blue 225.00 ALL Red 433.00
CUBE operation isItemAndColorThe median value may be combined to generate rows. For example, CUBE not only reports andItemValue Chair combinationColorAll possible combinations of values (Red, Blue, and Red + Blue), and reportColorValue RedItemAll possible combinations of values (Chair, Table, and Chair + Table ).
For each value in the right column of the group by clause, the ROLLUP operation does not report all possible combinations of values in the left column (or left column. For example, ROLLUP does notColorValue reportItemAll possible combinations of values.
The result set of The ROLLUP operation is similar to the result set returned by compute by. However, ROLLUP has the following advantages:
- ROLLUP returns a single result set; compute by returns multiple result sets, while multiple result sets increase the complexity of application code.
- ROLLUP can be used in server cursors; compute by cannot.
- Sometimes, the execution plan generated BY the query optimizer for ROLLUP is more efficient than that generated by compute.
3. GROUPING
Is an aggregate function that generates an additional column. When a row is added using the CUBE or ROLLUP operator, the output value of the appended column is 1, when the added row is not produced by CUBE or ROLLUP, the value of the appended column is 0.
Grouping is allowed only in the selection list associated with the group by clause that contains the CUBE or ROLLUP operator.
Syntax
GROUPING ( column_name )
Parameters
Column_name
Is the column in the group by clause used to check the null values of CUBE or ROLLUP.
Return type
Int
Note
Grouping is used to distinguish between the null values returned by CUBE and ROLLUP and the standard null values. A special application that returns NULL as the result of a CUBE or ROLLUP operation. It acts as a placeholder for a column in the result set, meaning "all ".
Example
In the following exampleRoyaltyGroup and aggregateAdvance. Apply the GROUPING functionRoyaltyColumn.
USE pubsSELECT royalty, SUM(advance) 'total advance', GROUPING(royalty) 'grp'
FROM titles GROUP BY royalty WITH ROLLUP
Result set inRoyaltyTwo null values are displayed. The first NULL represents the NULL group obtained from this column in the table. The second NULL is in the summary row added by the ROLLUP operation. The summary row displays allRoyaltyGroupAdvanceAndGrp1 is used in the column.
The following is the result set:
royalty total advance grp --------- --------------------- ---NULL NULL 0 10 57000.0000 0 12 2275.0000 0 14 4000.0000 0 16 7000.0000 0 24 25125.0000 0 NULL 95400.0000 1
3. GROUPING
Is an aggregate function that generates an additional column. When a row is added using the CUBE or ROLLUP operator, the output value of the appended column is 1, when the added row is not produced by CUBE or ROLLUP, the value of the appended column is 0.
Grouping is allowed only in the selection list associated with the group by clause that contains the CUBE or ROLLUP operator.
Syntax
GROUPING ( column_name )
Parameters
Column_name
Is the column in the group by clause used to check the null values of CUBE or ROLLUP.
Return type
Int
Note
Grouping is used to distinguish between the null values returned by CUBE and ROLLUP and the standard null values. A special application that returns NULL as the result of a CUBE or ROLLUP operation. It acts as a placeholder for a column in the result set, meaning "all ".
Example
In the following exampleRoyaltyGroup and aggregateAdvance. Apply the GROUPING functionRoyaltyColumn.
USE pubsSELECT royalty, SUM(advance) 'total advance', GROUPING(royalty) 'grp'
FROM titles GROUP BY royalty WITH ROLLUP
Result set inRoyaltyTwo null values are displayed. The first NULL represents the NULL group obtained from this column in the table. The second NULL is in the summary row added by the ROLLUP operation. The summary row displays allRoyaltyGroupAdvanceAndGrp1 is used in the column.
The following is the result set:
royalty total advance grp --------- --------------------- ---NULL NULL 0 10 57000.0000 0 12 2275.0000 0 14 4000.0000 0 16 7000.0000 0 24 25125.0000 0 NULL 95400.0000 1
3. GROUPING
Is an aggregate function that generates an additional column. When a row is added using the CUBE or ROLLUP operator, the output value of the appended column is 1, when the added row is not produced by CUBE or ROLLUP, the value of the appended column is 0.
Grouping is allowed only in the selection list associated with the group by clause that contains the CUBE or ROLLUP operator.
Syntax
GROUPING ( column_name )
Parameters
Column_name
Is the column in the group by clause used to check the null values of CUBE or ROLLUP.
Return type
Int
Note
Grouping is used to distinguish between the null values returned by CUBE and ROLLUP and the standard null values. A special application that returns NULL as the result of a CUBE or ROLLUP operation. It acts as a placeholder for a column in the result set, meaning "all ".
Example
In the following exampleRoyaltyGroup and aggregateAdvance. Apply the GROUPING functionRoyaltyColumn.
USE pubsSELECT royalty, SUM(advance) 'total advance', GROUPING(royalty) 'grp'
FROM titles GROUP BY royalty WITH ROLLUP
Result set inRoyaltyTwo null values are displayed. The first NULL represents the NULL group obtained from this column in the table. The second NULL is in the summary row added by the ROLLUP operation. The summary row displays allRoyaltyGroupAdvanceAndGrp1 is used in the column.
The following is the result set:
royalty total advance grp --------- --------------------- ---NULL NULL 0 10 57000.0000 0 12 2275.0000 0 14 4000.0000 0 16 7000.0000 0 24 25125.0000 0 NULL 95400.0000 1
The introduction to GROUPING, ROLLUP, and CUBE comes from the help of the Chinese version of SQL Server2000.