GROUPING, ROLLUP, and CUBE

Source: Internet
Author: User

The first time I saw such an SQL statement, I couldn't understand it, and I used the following uncommonly used

Clustering function: GROUPING

The ROLLUP operator used to summarize data

SELECT

Case grouping (o. customerid) WHEN 0 THEN o. customerid ELSE '(Total)' END

AS AllCustomersSummary,

Case grouping (od. orderid) WHEN 0 THEN od. orderid ELSE-1 END

AS IndividualCustomerSummary,

SUM (od. quantity * od. unitprice) AS price

FROM Orders o, [Order Details] od

WHERE Year (o. orderdate) = 1998 AND od. orderid = o. orderid

Group by o. customerid, od. orderid WITH ROLLUP

Order by AllCustomersSummary

Check the help of SQL Server to find out, amazing, there is such a powerful thing, can not help but think of the previous Crystal Report, the original SQL Server can achieve such a function.

1. Use CUBE to summarize data

The result set generated by the CUBE operator is a multi-dimensional dataset. A multi-dimensional dataset is an extension of fact data. Fact data is the data that records individual events. The extension is created on the columns that the user intends to analyze. These columns are called dimensions. A multi-dimensional dataset is a result set that contains all possible combinations of cross tables for each dimension.

The CUBE operator is specified in the group by clause of the SELECT statement. The statement selection list should contain dimension columns and aggregate function expressions. The dimension column and keyword with cube should be specified for group. The result set contains all possible combinations of values in the dimension column, and the aggregated values in the basic rows that match the combination of these dimension values.

For example, a simple tableInventoryIncluding:

 

Item         Color        Quantity          -------------------- -------------------- -------------------------- Table        Blue         124            Table        Red         223            Chair        Blue         101            Chair        Red         210            

The returned results of the following queries includeItemAndColorAll possible combinationsQuantitySubtotal:

 

SELECT Item, Color, SUM(Quantity) AS QtySumFROM InventoryGROUP BY Item, Color WITH CUBE

The following is the result set:

 

Item         Color        QtySum           -------------------- -------------------- -------------------------- Chair        Blue         101.00           Chair        Red         210.00           Chair        (null)        311.00           Table        Blue         124.00           Table        Red         223.00           Table        (null)        347.00           (null)        (null)        658.00           (null)        Blue         225.00           (null)        Red         433.00           

We focus on the following:

 

Chair        (null)        311.00           

This line reportsItemThe value in the dimension is the subtotal of all rows of Chair. PairColorThe dimension returns a NULL value, indicating that the aggregation reported by this row includesColorThe row whose dimension is any value.

 

Table        (null)        347.00           

This line is similar, but the report isItemThe mean value of a dimension is the subtotal of all rows in the Table.

 

(null)        (null)        658.00           

This row reports the total number of multidimensional datasets.ItemAndColorThe dimension value is NULL, indicating that all values in the two dimensions are aggregated in the row.

 

(null)        Blue         225.00           (null)        Red         433.00           

The two rows reportedColorSubtotal of a dimension. In the two rowsItemAll dimension values are NULL, indicating that the aggregated data comes fromItemThe row whose dimension is any value.

Use GROUPING to differentiate null values

The NULL value generated by the CUBE operation brings about a problem: how to distinguish between the NULL value generated by the CUBE operation and the NULL value returned from the actual data? This problem can be solved by the GROUPING function. If the value in the column comes from the fact data, the GROUPING function returns 0. If the value in the column is NULL generated by the CUBE operation, 1 is returned. In the CUBE operation, the generated NULL represents the whole value. You can use the 'select' statement to replace the generated NULL with the string 'all. Because NULL in the fact data indicates that the data value is UNKNOWN, the SELECT statement can also be decoded as the returned string UNKNOWN to replace NULL from the fact data. For example:

 

SELECT CASE WHEN (GROUPING(Item) = 1) THEN 'ALL'      ELSE ISNULL(Item, 'UNKNOWN')    END AS Item,    CASE WHEN (GROUPING(Color) = 1) THEN 'ALL'      ELSE ISNULL(Color, 'UNKNOWN')    END AS Color,    SUM(Quantity) AS QtySumFROM InventoryGROUP BY Item, Color WITH CUBE
Multi-dimensional dataset

The CUBE operator can be used to generateNDimension, that is, a cube with any number of dimensions. Only one dimension cube can be used to generate a total. For example:

 

SELECT CASE WHEN (GROUPING(Item) = 1) THEN 'ALL'      ELSE ISNULL(Item, 'UNKNOWN')    END AS Item,    SUM(Quantity) AS QtySumFROM InventoryGROUP BY Item WITH CUBEGO

The result set returned by this SELECT statement is displayedItemReturns the subtotal of each value inItemTotal of all values in:

 

Item         QtySum           -------------------- -------------------------- Chair        311.00           Table        347.00           ALL         658.00           

SELECT statements containing cubes with many dimensions may generate large result sets because these statements generate rows for all combinations of values in all dimensions. These large result sets may contain too much data and are not easy to read and understand. One solution to this problem is to place the SELECT statement in the View:

 

CREATE VIEW InvCube ASSELECT CASE WHEN (GROUPING(Item) = 1) THEN 'ALL'      ELSE ISNULL(Item, 'UNKNOWN')    END AS Item,    CASE WHEN (GROUPING(Color) = 1) THEN 'ALL'      ELSE ISNULL(Color, 'UNKNOWN')    END AS Color,    SUM(Quantity) AS QtySumFROM InventoryGROUP BY Item, Color WITH CUBE

Then you can use this view to query only the dimension values you are interested in:

 

SELECT *FROM InvCubeWHERE Item = 'Chair' AND Color = 'ALL'Item         Color        QtySum           -------------------- -------------------- -------------------------- Chair        ALL         311.00           (1 row(s) affected)



2. Use ROLLUP to summarize data

The ROLLUP operator is useful when generating reports that contain subtotal and aggregate. The result set generated by the ROLLUP operator is similar to the result set generated by the CUBE operator. For more information.

The difference between CUBE and ROLLUP is:

  • The result set generated by CUBE displays the aggregation of all the combinations of the values in the selected column.

  • The result set generated by ROLLUP displays the aggregation of a certain hierarchy of values in the selected column.

For example, a simple tableInventoryIncluding:

 

Item         Color        Quantity          -------------------- -------------------- -------------------------- Table        Blue         124            Table        Red         223            Chair        Blue         101            Chair        Red         210            

The following query generates a subtotal report:

 

SELECT CASE WHEN (GROUPING(Item) = 1) THEN 'ALL'      ELSE ISNULL(Item, 'UNKNOWN')    END AS Item,    CASE WHEN (GROUPING(Color) = 1) THEN 'ALL'      ELSE ISNULL(Color, 'UNKNOWN')    END AS Color,    SUM(Quantity) AS QtySumFROM InventoryGROUP BY Item, Color WITH ROLLUPItem         Color        QtySum           -------------------- -------------------- -------------------------- Chair        Blue         101.00           Chair        Red         210.00           Chair        ALL         311.00           Table        Blue         124.00           Table        Red         223.00           Table        ALL         347.00           ALL         ALL         658.00           (7 row(s) affected)

If the ROLLUP keyword in the query is changed to CUBE, the CUBE result set is the same as the preceding result, but the following two rows are returned at the end of the result set:

 

ALL         Blue         225.00           ALL         Red         433.00           

CUBE operation isItemAndColorThe median value may be combined to generate rows. For example, CUBE not only reports andItemValue Chair combinationColorAll possible combinations of values (Red, Blue, and Red + Blue), and reportColorValue RedItemAll possible combinations of values (Chair, Table, and Chair + Table ).

For each value in the right column of the group by clause, the ROLLUP operation does not report all possible combinations of values in the left column (or left column. For example, ROLLUP does notColorValue reportItemAll possible combinations of values.

The result set of The ROLLUP operation is similar to the result set returned by compute by. However, ROLLUP has the following advantages:

  • ROLLUP returns a single result set; compute by returns multiple result sets, while multiple result sets increase the complexity of application code.

  • ROLLUP can be used in server cursors; compute by cannot.

  • Sometimes, the execution plan generated BY the query optimizer for ROLLUP is more efficient than that generated by compute.

 

 

3. GROUPING

Is an aggregate function that generates an additional column. When a row is added using the CUBE or ROLLUP operator, the output value of the appended column is 1, when the added row is not produced by CUBE or ROLLUP, the value of the appended column is 0.

Grouping is allowed only in the selection list associated with the group by clause that contains the CUBE or ROLLUP operator.

Syntax

GROUPING ( column_name )

Parameters

Column_name

Is the column in the group by clause used to check the null values of CUBE or ROLLUP.

Return type

Int

Note

Grouping is used to distinguish between the null values returned by CUBE and ROLLUP and the standard null values. A special application that returns NULL as the result of a CUBE or ROLLUP operation. It acts as a placeholder for a column in the result set, meaning "all ".

Example

In the following exampleRoyaltyGroup and aggregateAdvance. Apply the GROUPING functionRoyaltyColumn.

 

USE pubsSELECT royalty, SUM(advance) 'total advance',  GROUPING(royalty) 'grp'  FROM titles  GROUP BY royalty WITH ROLLUP

Result set inRoyaltyTwo null values are displayed. The first NULL represents the NULL group obtained from this column in the table. The second NULL is in the summary row added by the ROLLUP operation. The summary row displays allRoyaltyGroupAdvanceAndGrp1 is used in the column.

The following is the result set:

 

royalty    total advance       grp ---------   ---------------------  ---NULL      NULL           0 10       57000.0000        0 12       2275.0000        0 14       4000.0000        0 16       7000.0000        0 24       25125.0000        0 NULL      95400.0000        1 


 

3. GROUPING

Is an aggregate function that generates an additional column. When a row is added using the CUBE or ROLLUP operator, the output value of the appended column is 1, when the added row is not produced by CUBE or ROLLUP, the value of the appended column is 0.

Grouping is allowed only in the selection list associated with the group by clause that contains the CUBE or ROLLUP operator.

Syntax

GROUPING ( column_name )

Parameters

Column_name

Is the column in the group by clause used to check the null values of CUBE or ROLLUP.

Return type

Int

Note

Grouping is used to distinguish between the null values returned by CUBE and ROLLUP and the standard null values. A special application that returns NULL as the result of a CUBE or ROLLUP operation. It acts as a placeholder for a column in the result set, meaning "all ".

Example

In the following exampleRoyaltyGroup and aggregateAdvance. Apply the GROUPING functionRoyaltyColumn.

 

USE pubsSELECT royalty, SUM(advance) 'total advance',  GROUPING(royalty) 'grp'  FROM titles  GROUP BY royalty WITH ROLLUP

Result set inRoyaltyTwo null values are displayed. The first NULL represents the NULL group obtained from this column in the table. The second NULL is in the summary row added by the ROLLUP operation. The summary row displays allRoyaltyGroupAdvanceAndGrp1 is used in the column.

The following is the result set:

 

royalty    total advance       grp ---------   ---------------------  ---NULL      NULL           0 10       57000.0000        0 12       2275.0000        0 14       4000.0000        0 16       7000.0000        0 24       25125.0000        0 NULL      95400.0000        1 


 

3. GROUPING

Is an aggregate function that generates an additional column. When a row is added using the CUBE or ROLLUP operator, the output value of the appended column is 1, when the added row is not produced by CUBE or ROLLUP, the value of the appended column is 0.

Grouping is allowed only in the selection list associated with the group by clause that contains the CUBE or ROLLUP operator.

Syntax

GROUPING ( column_name )

Parameters

Column_name

Is the column in the group by clause used to check the null values of CUBE or ROLLUP.

Return type

Int

Note

Grouping is used to distinguish between the null values returned by CUBE and ROLLUP and the standard null values. A special application that returns NULL as the result of a CUBE or ROLLUP operation. It acts as a placeholder for a column in the result set, meaning "all ".

Example

In the following exampleRoyaltyGroup and aggregateAdvance. Apply the GROUPING functionRoyaltyColumn.

 

USE pubsSELECT royalty, SUM(advance) 'total advance',  GROUPING(royalty) 'grp'  FROM titles  GROUP BY royalty WITH ROLLUP

Result set inRoyaltyTwo null values are displayed. The first NULL represents the NULL group obtained from this column in the table. The second NULL is in the summary row added by the ROLLUP operation. The summary row displays allRoyaltyGroupAdvanceAndGrp1 is used in the column.

The following is the result set:

 

royalty    total advance       grp ---------   ---------------------  ---NULL      NULL           0 10       57000.0000        0 12       2275.0000        0 14       4000.0000        0 16       7000.0000        0 24       25125.0000        0 NULL      95400.0000        1 

The introduction to GROUPING, ROLLUP, and CUBE comes from the help of the Chinese version of SQL Server2000.

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.