Union, cube, rollup, and cumpute operators in SQL Server

Source: Internet
Author: User

The Union operator combines two or more query results into a single result set.
This result set contains all rows of all queries in the Union query. This is different from using join to combine columns in two tables.

There are two basic rules for query result sets using union:

1. the columns and columns in all queries must be in the same order.

2. Data Types must be compatible

A. The Union result set column names are the same as those in the result set in the first select statement. The result set column names of other select statements are ignored.

B. By default, the Union operator deletes duplicate rows from the result set. If the all keyword is used, the result set contains all rows without deleting duplicate rows.

C. The SQL statements that contain the Union operator are valued from left to right. The order of evaluation can be changed using parentheses.

-- Example:

Select * From tablea

Union all

(Select * From tableb

Union all

Select * From tablec)

In this way, you can first merge tableb and tablec, and then merge tablea

D. If you want to save the merged result set to a new data table, the into statement must be added to the first select statement.

E. The order by and compute clauses can only be used in the last SELECT statement, which affects the sorting and counting summary of the final merge results.

The F, Group by, and having clauses can be used in a single SELECT query without affecting the final result.

-- Example:

Select name as name, class as class, grade

Into # students

From stud87

Union all

Select * From stud88

Union all

Select * From stud89

Order by grade

Insert the merge results of the three classes (sorted by grade) to the temporary table # students

Aggregate Data with cube

The result set generated by the cube operator is a multi-dimensional dataset. A multi-dimensional dataset is an extension of fact data. Fact data is the data that records individual events. The extension is created on the columns that the user intends to analyze. These columns are called dimensions. A multi-dimensional dataset is a result set that contains all possible combinations of cross tables for each dimension.

The cube operator is specified in the group by clause of the SELECT statement. The statement selection list should contain dimension columns and aggregate function expressions. The dimension column and keyword with cube should be specified for group. The result set contains all possible combinations of values in the dimension column, and the aggregated values in the basic rows that match the combination of these dimension values.

For example, a simple table inventory contains:

Item color Quantity
--------------------------------------------------
Table blue 124
Table red223
Chair blue 101
Chair red210

The following query returns a subtotal of all possible combinations of item and color:

Select item, color, sum (Quantity) as qtysum
From inventory
Group by item, color with cube

The following is the result set:

Item color qtysum
-------------------------------------------------
Chair blue 101.00
Chair red210.00
Chair (null) 311.00
Table blue 124.00
Table red223.00
Table (null) 347.00
(Null) 658.00
(Null) Blue 225.00
(Null) red433.00

We focus on the following:

Chair (null) 311.00

This row reports the subtotal of all rows whose values are chair in the item dimension. A null value is returned for the color dimension, indicating that the aggregation reported by this row includes rows with any color dimension value.

Table (null) 347.00

This row is similar, but the report shows that the value of the item dimension is the subtotal of all rows in the table.

(Null) 658.00

This row reports the total number of multidimensional datasets. The values of item and color are both null, indicating that all values in both dimensions are aggregated in this row.

(Null) Blue 225.00
(Null) red433.00

The two rows report the subtotal of the color dimension. The item dimension values in both rows are null, indicating that the aggregated data comes from the row where the item dimension is any value.

Use grouping to differentiate null values
The null value generated by the cube operation brings about a problem: how to distinguish between the null value generated by the cube operation and the null value returned from the actual data? This problem can be solved by the grouping function. If the value in the column comes from the fact data, the grouping function returns 0. If the value in the column is null generated by the cube operation, 1 is returned. In the cube operation, the generated null represents the whole value. You can use the 'select' statement to replace the generated null with the string 'all. Because null in the fact data indicates that the data value is unknown, the SELECT statement can also be decoded as the returned string unknown to replace null from the fact data. For example:

Select case when (grouping (item) = 1) Then ''all''
Else isnull (item, ''unknown '')
End as item,
Case when (grouping (color) = 1) Then ''all''
Else isnull (color, ''unknown '')
End as color,
Sum (Quantity) as qtysum

Select statements containing cubes with many dimensions may generate large result sets because these statements generate rows for all combinations of values in all dimensions. These large result sets may contain too much data and are not easy to read and understand. One solution to this problem is to place the SELECT statement in the View:

Create view invcube
Select case when (grouping (item) = 1) Then ''all''
Else isnull (item, ''unknown '')
End as item,
Case when (grouping (color) = 1) Then ''all''
Else isnull (color, ''unknown '')
End as color,
Sum (Quantity) as qtysum
From inventory
Group by item, color with cube

Then you can use this view to query only the dimension values you are interested in:

Select *
From invcube
Where item = ''chair''
And color = ''all''

Item color qtysum
-------------------------------------------------
Chair all 311.00

(1 row (s) affected)

The rollup operator is useful when collecting data using rollup to generate reports that contain subtotal and aggregate. The result set generated by the rollup operator is similar to the result set generated by the cube operator. For more information, see aggregate data with cube. The difference between cube and rollup is that the result set generated by cube displays the aggregation of all the combinations of the values in the selected column. The result set generated by rollup displays the aggregation of a certain hierarchy of values in the selected column. For example, a simple table inventory contains: item color Quantity
----------------------------------------------
Table blue 124
Table red223
Chair blue 101
Chair red210
The following query generates a subtotal report:
Select case when (grouping (item) = 1) Then ''all''
Else isnull (item, ''unknown '')
End as item,
Case when (grouping (color) = 1) Then ''all''
Else isnull (color, ''unknown '')
End as color,
Sum (Quantity) as qtysum
From inventory
Group by item, color with Rollup

Item color qtysum

----------------------------------------------
Chair blue 101.00
Chair red210.00
Chair all 311.00
Table blue 124.00
Table red223.00
Table all 347.00
All all 658.00

(7 row (s) affected)
If the rollup keyword in the query is changed to cube, the cube result set is the same as the preceding result, but the following two rows are returned at the end of the result set:

Full blue 225.00
All red433.00
The cube operation may combine item and color values to generate rows. For example, Cube not only reports all possible combinations of color values (red, blue, and red + blue) combined with item value chair ), it also reports all possible combinations of item values (Chair, table, and Chair + Table) combined with the color value red ). For each value in the right column of the Group by clause, the rollup operation does not report all possible combinations of values in the left column (or left column. For example, rollup does not report all possible combinations of item values for each color value. The result set of The rollup operation is similar to the result set returned by compute by. However, rollup has the following advantages: rollup returns a single result set; compute by returns multiple result sets, multiple result sets increase the complexity of application code. Rollup can be used in server cursors; compute by cannot. Sometimes, the execution plan generated by the query optimizer for rollup is more efficient than that generated by compute.

Use compute and compute by to summarize data
Compute and compute by are provided for backward compatibility. Use the following components instead:

Microsoft SQL Server 2000 Analysis Services is used together with OLE DB for analysis services or Microsoft ActiveX Data Objects (multidimensional) (ADO MD. For more information, see Microsoft SQL Server 2000 Analysis Services.
The compute by clause allows you to use the same SELECT statement to view both the detailed row and the summary row. You can calculate the summary value of the sub-group or the summary value of the entire result set.

The compute clause requires the following information:

Optional by keyword. This keyword can be used to calculate the specified row aggregation for a column.

Name of the row aggregate function, for example, sum, AVG, Min, Max, or count.

The column for which the row aggregate function is to be executed.
Result set generated by compute
The summary value generated by compute is displayed as a separate result set in the query results. The results of queries including the compute clause are similar to reports that control interruptions, that is, reports whose summary values are controlled by a specified group (or interrupt. You can generate a summary value for each group or compute multiple Aggregate functions for the same group.

When compute has an optional by clause, each group that meets the select condition has two result sets:

The first result set of each group is a detailed row set, which contains the selection list of the group.

The second result set of each group has a row, which contains the subtotal of the aggregate function specified in the compute clause of the group.
When compute does not include an optional by clause, the SELECT statement has two result sets:

The first result set of each group is all the lines that contain the selected list information.

The second result set contains a row containing the total of Aggregate functions specified in the compute clause.
Compute usage example
The following SELECT statement uses a simple compute clause to generate the sum of price and advance in the titles table:

Use pubs
Select Type, price, advance
From titles
Order by type
Compute sum (price), sum (advance)

The following query adds an optional by keyword to the compute clause to generate a subtotal for each group:

Use pubs
Select Type, price, advance
From titles
Order by type
Compute sum (price), sum (advance) by type

The results of this select statement are returned with 12 result sets. Each of the six groups has two result sets. The first result set of each group is a row set that contains the information requested in the selection list. The second result set of each group contains the subtotal of the two sum functions in the compute clause.

This section describes some utility tools (such as osql) that display multiple subtotal or aggregate summary methods may make the user think that each subtotal is a separate row in the result set. This is because the utility sets the output format. A single row is used for Subtotal or aggregate return. Other applications, such as the SQL query analyzer, set multiple aggregates to the same row.

Comparison between compute and group
The differences between compute and group by are summarized as follows:

Group by generates a single result set. Each group has a row that only contains the group's aggregate functions based on the column and display the group's sub-aggregation. The selection list can only contain groups based on columns and Aggregate functions.

Compute generates multiple result sets. A type of result set contains the detailed rows of each group, including the expressions in the selection list. Another type of result set includes the sub-aggregation of the group, or the total aggregation of select statements. The selection list can contain expressions other than grouping based on columns or aggregate functions. The aggregate function is specified in the compute clause, rather than in the selection list.
The following query uses the group by and Aggregate functions. This query returns a result set with one row in each group. This row contains the aggregate subtotal of this group:

Use pubs
Select Type, sum (price), sum (advance)
From titles
Group by type

The Data Types of ntext, text, or image cannot be included in the compute or compute by clause.

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.