The power of grouping sets in SQL Server

Source: Internet
Author: User

Do you want to run aggregations across multiple columns/dimensions in SQL Server without licensing SSAS (SQL Server Analysis Services)? And I'm not talking about using the Developer Edition in production or installing a pirated copy of SQL Server.

Mission impossible? Not necessarily, because SQL Server's so-called grouping sets make it possible. In this article I will give you an overview of grouping sets, what kinds of queries you can implement with them, and what their performance benefits are.

Aggregations using grouping sets

Suppose you have an orders table and you want to run T-SQL aggregate queries across multiple groupings. In the context of the Sales.SalesOrderHeader table of the AdventureWorks2012 database, these groupings could look like the following:

    • No grouping columns (a grand total across all rows)
    • GROUP BY SalesPersonID, YEAR(OrderDate)
    • GROUP BY CustomerID, YEAR(OrderDate)
    • GROUP BY CustomerID, SalesPersonID, YEAR(OrderDate)

When you want to perform these individual groupings with a traditional T-SQL query, you need multiple SELECT statements combined with UNION ALL into one record set. Let's look at such a query:

    SELECT * FROM
    (
        -- 1st Grouping Set
        SELECT
            NULL AS 'CustomerID',
            NULL AS 'SalesPersonID',
            NULL AS 'OrderYear',
            SUM(TotalDue) AS 'TotalDue'
        FROM Sales.SalesOrderHeader
        WHERE SalesPersonID IS NOT NULL

        UNION ALL

        -- 2nd Grouping Set
        SELECT
            NULL AS 'CustomerID',
            SalesPersonID,
            YEAR(OrderDate) AS 'OrderYear',
            SUM(TotalDue) AS 'TotalDue'
        FROM Sales.SalesOrderHeader
        WHERE SalesPersonID IS NOT NULL
        GROUP BY SalesPersonID, YEAR(OrderDate)

        UNION ALL

        -- 3rd Grouping Set
        SELECT
            CustomerID,
            NULL AS 'SalesPersonID',
            YEAR(OrderDate) AS 'OrderYear',
            SUM(TotalDue) AS 'TotalDue'
        FROM Sales.SalesOrderHeader
        WHERE SalesPersonID IS NOT NULL
        GROUP BY CustomerID, YEAR(OrderDate)

        UNION ALL

        -- 4th Grouping Set
        SELECT
            CustomerID,
            SalesPersonID,
            YEAR(OrderDate) AS 'OrderYear',
            SUM(TotalDue) AS 'TotalDue'
        FROM Sales.SalesOrderHeader
        WHERE SalesPersonID IS NOT NULL
        GROUP BY CustomerID, SalesPersonID, YEAR(OrderDate)
    ) AS T
    ORDER BY CustomerID, SalesPersonID, OrderYear
    GO

This approach to the T-SQL statement has several drawbacks:

    • The T-SQL statement itself is huge, because every individual grouping is a separate query.
    • The Sales.SalesOrderHeader table has to be accessed 4 times, once for every individual query.
    • In the execution plan you can see that SQL Server performs 4 Index Seek (NonClustered) operations, one for every query.
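
The repeated table access listed above can be checked with SET STATISTICS IO. The following is a minimal sketch, assuming the AdventureWorks2012 sample database is installed; it uses a cut-down query with only two of the four branches to keep it short.

```sql
-- Assumption: the AdventureWorks2012 sample database is available on this instance.
USE AdventureWorks2012;
GO

-- Report the I/O performed per table for every following statement.
SET STATISTICS IO ON;
GO

-- Cut-down version of the UNION ALL approach: two branches, two accesses.
SELECT
    NULL AS SalesPersonID,
    NULL AS OrderYear,
    SUM(TotalDue) AS TotalDue
FROM Sales.SalesOrderHeader
WHERE SalesPersonID IS NOT NULL

UNION ALL

SELECT
    SalesPersonID,
    YEAR(OrderDate) AS OrderYear,
    SUM(TotalDue) AS TotalDue
FROM Sales.SalesOrderHeader
WHERE SalesPersonID IS NOT NULL
GROUP BY SalesPersonID, YEAR(OrderDate);
GO

SET STATISTICS IO OFF;
GO
```

In SQL Server Management Studio the figures appear on the Messages tab. The scan count and logical reads reported for SalesOrderHeader are the numbers to compare between query variants; the exact values depend on your copy of the database and its indexes.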

If you use the grouping sets functionality, available since SQL Server 2008, you can dramatically simplify the T-SQL code you need. The following code shows the same query, this time implemented with grouping sets.

    SELECT
        CustomerID,
        SalesPersonID,
        YEAR(OrderDate) AS 'OrderYear',
        SUM(TotalDue) AS 'TotalDue'
    FROM Sales.SalesOrderHeader
    WHERE SalesPersonID IS NOT NULL
    GROUP BY GROUPING SETS
    (
        -- Our 4 different grouping sets
        (CustomerID, SalesPersonID, YEAR(OrderDate)),
        (CustomerID, YEAR(OrderDate)),
        (SalesPersonID, YEAR(OrderDate)),
        ()
    )
    GO

As you can see from the code itself, you only specify the required grouping sets inside the GROUP BY GROUPING SETS clause; everything else is done by SQL Server. The empty parentheses specify the so-called empty grouping set, which aggregates across the whole table. When you look at the STATISTICS IO output, you will see that Sales.SalesOrderHeader was accessed only once! That is a huge difference from the manual implementation shown earlier.
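
To see what each grouping set contributes, it can help to run the same clause against a handful of rows. The sketch below is only an illustration: the @Orders table variable and its values are made up and are not part of AdventureWorks. It also uses the built-in GROUPING() function, which the article does not cover, to mark the columns that were aggregated away in each result row.

```sql
-- Hypothetical miniature order table; names and values are invented for this sketch.
DECLARE @Orders TABLE
(
    CustomerID    INT   NOT NULL,
    SalesPersonID INT   NOT NULL,
    OrderYear     INT   NOT NULL,
    TotalDue      MONEY NOT NULL
);

INSERT INTO @Orders (CustomerID, SalesPersonID, OrderYear, TotalDue)
VALUES (1, 10, 2011, 100.00),
       (1, 10, 2012, 150.00),
       (2, 10, 2012, 200.00),
       (2, 20, 2012,  50.00);

SELECT
    CustomerID,
    SalesPersonID,
    OrderYear,
    SUM(TotalDue) AS TotalDue,
    -- GROUPING() returns 1 when the column is not part of the
    -- grouping set that produced this row, otherwise 0.
    GROUPING(CustomerID)    AS CustomerAggregated,
    GROUPING(SalesPersonID) AS SalesPersonAggregated,
    GROUPING(OrderYear)     AS OrderYearAggregated
FROM @Orders
GROUP BY GROUPING SETS
(
    (CustomerID, SalesPersonID, OrderYear),
    (CustomerID, OrderYear),
    (SalesPersonID, OrderYear),
    ()
)
ORDER BY CustomerID, SalesPersonID, OrderYear;
```

The row produced by the empty grouping set has NULL in all three grouping columns and a TotalDue of 500.00, the sum of the four sample rows. The GROUPING() flags are useful when the grouping columns themselves can contain NULL, because a NULL in the data and a NULL that means "aggregated" otherwise look the same.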

In the execution plan, SQL Server uses a Table Spool operator, which temporarily stores the retrieved data in tempdb. The data from this worktable in tempdb is then used in the second branch of the execution plan. As a result, the table is not rescanned for every grouping, which gives the overall execution plan better performance.

When you look at the execution plan, you will find that it contains 3 Stream Aggregate operators (highlighted in red, blue, and green). These 3 operators calculate the individual grouping sets:

    • The blue highlighted operator calculates the grouping set for CustomerID, SalesPersonID, YEAR(OrderDate).
    • The red highlighted operator calculates the grouping set for SalesPersonID, YEAR(OrderDate). In addition, the grouping set across all rows (the empty grouping set) is calculated there.
    • The green highlighted operator calculates the grouping set for CustomerID, YEAR(OrderDate).

The idea behind the 2 consecutive Stream Aggregate operators is to calculate so-called super aggregates: aggregates of aggregates.
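
One way to picture a super aggregate is to write the two aggregation steps by hand. The following sketch is only an analogy, not the plan the optimizer produces: the @Orders table variable and its values are made up, and the Fine CTE name is hypothetical. It first aggregates at the finest level and then aggregates that result again at a coarser level, which works because SUM can be computed from partial sums.

```sql
-- Hypothetical miniature order table; names and values are invented for this sketch.
DECLARE @Orders TABLE
(
    CustomerID    INT   NOT NULL,
    SalesPersonID INT   NOT NULL,
    OrderYear     INT   NOT NULL,
    TotalDue      MONEY NOT NULL
);

INSERT INTO @Orders (CustomerID, SalesPersonID, OrderYear, TotalDue)
VALUES (1, 10, 2011, 100.00),
       (1, 10, 2012, 150.00),
       (2, 10, 2012, 200.00),
       (2, 20, 2012,  50.00);

-- Step 1: the fine-grained aggregate (CustomerID, SalesPersonID, OrderYear).
WITH Fine AS
(
    SELECT
        CustomerID,
        SalesPersonID,
        OrderYear,
        SUM(TotalDue) AS TotalDue
    FROM @Orders
    GROUP BY CustomerID, SalesPersonID, OrderYear
)
-- Step 2: the super aggregate (CustomerID, OrderYear), computed from the
-- already aggregated rows instead of from the base table.
SELECT
    CustomerID,
    OrderYear,
    SUM(TotalDue) AS TotalDue
FROM Fine
GROUP BY CustomerID, OrderYear
ORDER BY CustomerID, OrderYear;
```

With the sample rows, customer 2 in 2012 ends up with 250.00, the sum of the two finer rows for sales persons 10 and 20.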

Summary

In today's article I introduced you to grouping sets, a T-SQL enhancement available since SQL Server 2008. As you have seen, grouping sets have 2 great advantages: they simplify your code, and they improve query performance by accessing the data only once.

I hope you now have a good understanding of grouping sets. If you are able to use this functionality in your own databases, please leave a comment. Thank you very much!

Thanks for your attention!
