SQL Server Data Summary five methods to easily deal with, SQL Server
This article will discuss how to use the GROUPBY clause to summarize data.
Use separate column grouping
The group by clause sets grouping conditions to summarize data. In the first example, I GROUP data on a column in the table Sales. SalesOrderDetail. In the database AdventureWork2012. This example and other examples use the database AdventureWorks2012. If you want to use it to run my code, you can click to download it.
The following is the source code of the first example. Use the group by clause on the CarrierTrackingNumber column to group data.
USE AdventureWorks2012; GO SELECT CarrierTrackingNumber ,SUM(LineTotal) AS SummarizedLineTotal FROM AdventureWorks2012.Sales.SalesOrderDetail GROUP BY CarrierTrackingNumber;
After running this code, I will get 3807 records. Below are the first five values in this huge result set:
CarrierTrackingNumber LineTotal -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- 6E46-440A-B5 2133.170700 B501-448E-96 4085.409800 8551-4CDF-A1 72616.524200 B65C-4867-86 714.704300 99CE-4ADA-B1 16185.429200
In the above example, I use the group by clause to select which columns will be used as the clustering dventureWorks2012.Sales. salesOrderDetai: Conditions for data in a data table. In this example, I use CarrierTrackingNumber to summarize data. When you group data, only the columns that appear in the group By clause are valid in the selection list. In my example, I used the aggregate function SUM to calculate LineTotal. For convenience, I set the alias SummarizedLineTotal for it.
If I want to obtain a clustering set whose CarrierTrackingNumber meets certain conditions, I can restrict the query in the Where clause, as I do below:
USE AdventureWorks2012; GO SELECT CarrierTrackingNumber ,SUM(LineTotal) AS SummarizedLineTotal FROM AdventureWorks2012.Sales.SalesOrderDetail WHERE CarrierTrackingNumber = '48F2-4141-9A' GROUP BY CarrierTrackingNumber;
Here, I added a limit in the where clause based on the original query. I set my query to only return results with CarrierTrackingNumber equal to a specific value. After running this code, I will get the number of rows in the record where CarrierTrackingNumber is 48F2-4141-9A. The filter action of the Where clause takes effect before data is aggregated.
Grouping by multiple columns
Sometimes you may need to use multiple columns for Data grouping. below is the sample code for grouping multiple columns.
SELECT D.ProductID , H.OrderDate , SUM(LineTotal) AS SummarizedLineTotal FROM AdventureWorks2012.Sales.SalesOrderDetail D JOIN AdventureWorks2012.Sales.SalesOrderHeader H ON D.SalesOrderId = H.SalesOrderID GROUP BY ProductID, OrderDate;
The query returns 26878 rows of data. This is part of the result returned by the preceding query:
ProductID OrderDate LineTotal -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- 714 2008-05-21 00:00:00.000 99.980000 859 2007-11-03 00:00:00.000 48.980000 923 2007-11-23 00:00:00.000 14.970000 712 2007-12-22 00:00:00.000 62.930000 795 2007-10-14 00:00:00.000 2443.350000 950 2007-07-01 00:00:00.000 2462.304000 795 2007-11-06 00:00:00.000 2443.350000 877 2007-11-19 00:00:00.000 15.900000 713 2007-10-01 00:00:00.000 99.980000 860 2008-05-31 00:00:00.000 48.980000 961 2008-05-01 00:00:00.000 36242.120880
In the preceding example, the ProductID and OrderDate columns are used in the Group by clause. SQL Server returns the LineTotal value based on the uniqueness of the combination of ProductID and OrderDate, and sets its alias SummarizedLineTotal. If you view the program output, you will find that after SQL Server groups the data, the returned results do not have a specific order. If you need to sort the returned results in a certain order, you need to use the order by clause, as shown in the following code.
SELECT D.ProductID , H.OrderDate , SUM(LineTotal) AS SummarizedLineTotal FROM AdventureWorks2012.Sales.SalesOrderDetail D JOIN AdventureWorks2012.Sales.SalesOrderHeader H ON D.SalesOrderId = H.SalesOrderID GROUP BY ProductID, OrderDate ORDER BY SummarizedLineTotal DESC;
In the above code, I sorted the result set in descending order of SummorizedLineTotal. The values of this column are grouped BY the Group BY clause and obtained BY using the aggregate function Sum for LineTotal. I sorted the results in descending order according to the values of SummorizedLineTotal. If you run this program, you can obtain the highest number of LineTotal ProductID and OrderDate.
Group data without any value
Sometimes you need to group data that contains null values in some records. When you perform this operation on SQL Server, it automatically assumes that all NULL values are equal. Let me look at the example program in detail.
CREATE TABLE NullGroupBy (OrderDate date, Amount Int); INSERT INTO NullGroupBy values (NULL,100), ('10-30-2014',100), ('10-31-2014',100), (NULL,100); SELECT OrderDate, SUM(Amount) as TotalAmount FROM NullGroupBy GROUP BY OrderDate; DROP TABLE NullGroupBy; When I run this code I get the following output: OrderDate TotalAmount -- -- -- -- -- -- -- -- -- -- - NULL 200 2014-10-30 100 2014-10-31 100
In the above program, I first created and filled a NullGroupBy table. in this table, I placed four different rows. The value of orderDate in the first and last rows is NULL, and the value of orderDate in the other two columns is different. As shown in the preceding output, SQL Server aggregates the two rows whose OrderDate is NULL into one row during grouping.
Use expression in Group BY clause
Sometimes you need to use an expression in the Group by clause, instead of a specific column. SQL Server allows you to specify an expression in the Group By clause, as shown in the following code:
SELECT CONVERT(CHAR(7),H.OrderDate,120) AS [YYYY-MM] , SUM(LineTotal) AS SummarizedLineTotal FROM AdventureWorks2012.Sales.SalesOrderDetail D JOIN AdventureWorks2012.Sales.SalesOrderHeader H ON D.SalesOrderId = H.SalesOrderID GROUP BY CONVERT(CHAR(7),H.OrderDate,120) ORDER BY SummarizedLineTotal DESC;
The above Code uses the year and month data in the OrderDate column for grouping, and uses the expression CONVERT (CHAR (7), H. orderDate, 120), I told the SQL Server to intercept the first seven characters in the standard Date Format of OrderDate ODBC, that is, the OrderDate yyyy-mm section. Based on this expression, I can find the total SummarizeLineTotal value for a specific year, use the expression in the Group By clause, and sort the LineTotal value, I can find out which year or month the SummarizeLineTotal is the largest or the smallest.
Use HAVING clause to filter data
Having is another important clause that can be used with the Group BY clause. Using Having clause, you can filter out data rows that do not conform to the expressions of Having clause, when I use the where clause on it, the filtering behavior will be generated before clustering. The Having clause allows you to filter aggregate rows based on certain criteria. For more information about the Having clause, see the following code:
SELECT D.ProductID , H.OrderDate , SUM(LineTotal) AS SummarizedLineTotal FROM AdventureWorks2012.Sales.SalesOrderDetail D JOIN AdventureWorks2012.Sales.SalesOrderHeader H ON D.SalesOrderId = H.SalesOrderID GROUP BY ProductID, OrderDate HAVING SUM(LineTotal) > 200000 ORDER BY SummarizedLineTotal DESC;
In the preceding code, the Having clause has the following constraints: SUM (LineTotal)> 200000. The having clause ensures that the aggregate value (SummarizedLineTotal) of LineTotal in the final result is greater than 200000. by using the having clause, My query returns only one row of data, and its SummarizedLineTotal is greater than 200000. the having clause allows SQL Server to return only data rows whose aggregate results meet the Having clause restrictions.
Summary
Many applications require that data be summarized before being displayed. The group by clause is a mechanism provided by SQL Server. The group by clause allows you to use the HAVING clause to filter the summarized data. We hope that you can achieve the goal more easily next time you need to summarize some data.