Summary SQL Server window functions

Source: Internet
Author: User
I have always liked using window functions of SQL server2005/2008, especially the ranking function row_number. This evening, I checked the relevant documents of SQL Server development, sorted out my favorites and found two articles that have been added to my favorites for a long time. After reading them, I learned a little bit about them. By the way, I will summarize them.
I. Starting from a familiar example We are familiar with the paging query of databases. Let's take this article as an example. To query the persons in the person table by page, you can write the SQL statement as follows:
 
With record as (selectrow_number () over (order by id desc) as recordnumber, ID, firstname, lastname, height, weightfromperson (nolock) Select recordnumber, (select count (0) from Record) as totalcount, ID, firstname, lastname, height, weightfrom recordwhere recordnumber between 1 and 10
Where, Row_number () Is the ranking function, followed Over () A function is a window function. Are you still using the secondary top paging query? You can consider using the ranking function with CTE to implement paging. Ii. Window functions This document introduces window functions. The following student renewal table is used as an example:
 
Create Table [studentscore] ([ID] [int] Identity (1,1) not null, [studentid] [int] not null constraint [df_studentscore_studentid] default (0 )), [classid] [int] not null constraint [df_studentscore_classid] default (0), [courseid] [int] not null constraint [df_studentscore_courseid] default (0 )), [score] [float] not null constraint [df_studentscore_score] default (0), [createdate] [datetime] not null constraint [df_studentscore_createdate] default (getdate ())) on [primary]
Here, ID is an auto-incremental ID, and createdate is the input time, Studentid, classid, courseid, score . Enter the following test data:
-- Courseid 2: Language 4: Mathematics 8: English -- insert into studentscore (studentid, classid, courseid, score) values (,) insert into studentscore (studentid, classid, courseid, score) values (95.5, 2,) insert into studentscore (studentid, classid, courseid, score) values (,) insert into studentscore (studentid, classid, courseid, score) values (,) insert into studentscore (studentid, classid, courseid, score) values (,) insert into studentscore (studentid, classid, courseid, score) values (,) insert into studentscore (studentid, classid, courseid, score) values (,) insert into studentscore (studentid, classid, courseid, score) values, 8, 75.5) insert into studentscore (studentid, classid, courseid, score) values (3, 1, 8, 77) -- insert into studentscore (studentid, classid, courseid, score) values (,) insert into studentscore (studentid, classid, courseid, score) values (,) insert into studentscore (studentid, classid, courseid, score) values,) insert into studentscore (studentid, classid, courseid, score) values (,) insert into studentscore (studentid, classid, courseid, score) values) insert into studentscore (studentid, classid, courseid, score) values (,) insert into studentscore (studentid, classid, courseid, score) values) insert into studentscore (studentid, classid, courseid, score) values (4,100,) insert into studentscore (studentid, classid, courseid, score) values) insert into studentscore (studentid, classid, courseid, score) values (,) insert into studentscore (studentid, classid, courseid, score) values) insert into studentscore (studentid, classid, courseid, score) values (78.5, 8,) -- grade 3: insert into studentscore (studentid, classid, courseid, score) values (,) insert into studentscore (studentid, classid, courseid, score) values (,) insert into studentscore (studentid, classid, courseid, score) values,) insert into studentscore (studentid, classid, courseid, score) values (,) insert into studentscore (studentid, classid, courseid, score) values) insert into studentscore (studentid, classid, courseid, score) values (,) insert into studentscore (studentid, classid, courseid, score) values) insert into studentscore (studentid, classid, courseid, score) values (,) insert into studentscore (studentid, classid, courseid, score) values)
Window functions are newly added to SQL server2005. Next we will talk about its basic concepts: 1. Functions of window functions Window functions operate on a group of values without grouping data using the group by clause. They can also return columns and aggregate columns of the basic row in the same row. For example, we want to obtain the average score of all the students in all the classes of a grade. According to the traditional method, we must use the AVG aggregate function to calculate the average score. The "disadvantage" is that we cannot easily return the columns (classes, students, and other columns) of the basic row, but can only get aggregate columns. The main point of the aggregate function is to aggregate a group of values and use the group by query as the context of the Operation. Since the Group by operation groups the data, query returns only one row of data for each group. Therefore, you must restrict all expressions to return only one value for each group. By using window functions, you can easily query basic columns and aggregate columns. 2. Basic syntax Over ([partition by value_expression,... [N] <order by by_clause>)
Window functions are implemented using over functions. Over functions are divided into two types: parameter and parameter. The optional partition by parameter is used to group data by specific fields. 3. Simple Example Query the basic columns in the student renewal table and the average Chinese score of all students in all classes:
Select -- id, -- createdate, studentid, classid, courseid, score, cast (AVG (score) over () as decimal (5, 2) as 'average Chinese level' fromstudentscorewhere courseid = 2
The result is as follows: 4. partition If we need to query the average Chinese score of each class, we can group the data by partion:
 
Selectid, createdate, studentid, classid, courseid, score, cast (AVG (score) over (partition by classid) as decimal (5, 2) as 'average Chinese level' fromstudentscorewhere courseid = 2
The query results are as follows: the figure may not be clear. The average Chinese score of the three classes is different. Here, you may have realized the advantages of using the over function: A and over clauses can aggregate the basic columns in the same row while returning them.
B. You can mix basic columns and aggregate columns in expressions. If we use a traditional group by grouping query, it is not that simple SQL statement to directly obtain basic columns and aggregate columns. As you know, many Aggregate functions, such as sum, AVG, Max, and Min, all support window function operations. 2. Easy-to-use ranking Functions SQL Server provides four ranking functions: row_number (), rank (), dense_rank (), and ntile (). The following example focuses on the use of these four functions. 1. row_number () Returns the serial number of an expert in the result Set Partition. The first row of each partition starts from 1. The order by clause determines the order in which a unique row_number is allocated for rows in a specific partition. The following queries are sorted in reverse order based on the mathematical score:
 
Selectid, -- createdate, row_number () over (order by score DESC) as 'sequence number ', studentid, classid, courseid, scorefromstudentscorewhere courseid = 8
The results are as follows: As far as I know, this function has been widely used in SQL Server paging queries. Good job. 2. rank () and dense_rank () (1 ), Rank () function Returns the rank of each row in the partition of the result set. The row ranking is the plus one ranking before the related row. If two or more rows are associated with a ranking, each associated row gets the same ranking.
Selectid, -- createdate, rank () over (order by score DESC) as 'sequence number ', studentid, classid, courseid, scorefromstudentscorewhere courseid = 8
The result is as follows: note that it has similarities and differences with row_number (). You should know that a, rank, and row_number functions are similar. They are used to sort the results.
B. The difference is that the row_number function generates a unique sequence number for each value, while the rank function generates the same sequence number for the same value.
Medium, the numbers of the two 86-point students are 3 And then the serial numbers placed below them directly become 5 . (2 ), Dense_rank () function Returns the row ranking in the result set partition, without any interruption in the ranking. The row ranking is equal to the number of all the rankings before the discussed row plus one. If two or more rows are subject to the ranking constraints in the same partition, each row will receive the same ranking.
Selectid, -- createdate, dense_rank () over (order by score DESC) as 'sequence number ', studentid, classid, courseid, scorefromstudentscorewhere courseid = 8
The query result is as follows: 3 And the serial numbers below them are 4 (That is to say, the number of the dense_rank () function query is as continuous as row_number (), but the same sequence number is generated for rows with the same value. In this regard, for queries with the same query conditions and sorting conditions, the result set of the row_number () function query is a subset of the results of the dense_rank () function query ). This is the biggest difference between rank and dense_rank. 3. ntile () The ntile function associates rows in the result to a group and assigns each row a group number from the beginning. For each row, ntile returns the group ID of the row.
If the number of rows in a partition cannot be divisible by integer_expression, a Member may have two groups of different sizes. In the order specified by the over clause, a large group is placed before a small group.
Selectid, -- createdate, ntile (6) over (order by classid DESC) as 'group number', studentid, classid, courseid, scorefromstudentscorewhere courseid = 8
The query results are as follows: the introduction and examples in this article are very basic. However, using window functions can indeed help us optimize many complex queries. The preceding SQL statements seem to be simple, but the current simplicity hides the complexity behind them. It should be noted that, although the concept of grouping is very important, you must master it. With the skillful application of window functions, your SQL query will become even more powerful. Finally, I have been worried about the performance of SQL Server for massive data. Because recent development happens to encounter massive data queries, with a maximum of over 50 million data records and a minimum data volume of over million data records, I wonder if the performance of Partitioned Tables has significantly improved. Reference Article : Http://msdn.microsoft.com/zh-cn/library/ms189461.aspxhttp://www.cnblogs.com/aierong/archive/2008/08/26/1273890.htmlhttp://www.cnblogs.com/aierong/archive/2008/08/18/1269407.html
Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.