SQL Server is grouped and then takes top N

Last Update:2016-04-21 Source: Internet

Author: User

Developer on Alibaba Coud: Build your first app with APIs, SDKs, and tutorials on the Alibaba Cloud. Read more ＞

SQL Server Group after top N (GO)

Recently, the work of Tuyu a demand: To Group a data table, and then take out each group in accordance with a certain sequence of the first n data. At first glance, this is an ordinary inquiry, no difficulty. Can write a pen, after all, is trapped in the author for a while. Think about it, check the network, and never thought it was a classic topic in the SQL world. Today, I would like to make a list of some of the methods I have come up with in order to discuss with them.

The sample table structure is described before the text.

　　　　　　　　　　　　　　　　　　　 table Sectiontransactionlog, a log table used to record various activities of each department 　　　　　　　　　　　　　　　　　　　　　 SectionID, Department ID sectiontransactiontype, activity type Totaltransactionvalue, activity spending transactiondate, event time

We set the scene to select the last two events held by each department (SectionID).

I used to test the data in the Sectiontransactionlog table is over 3,000,000.

One, nested subqueries query mode

1SELECT*From Sectiontransactionlog mlog2 where 3 (select count (from Sectiontransactionlog sublog4 where Sublog.sectionid Span style= "color: #808080;" >= mlog.sectionid and sublog.transactiondate >= mlog.transactiondate) <=2< Span style= "color: #008080;" >5 order by SectionID, transactiondate desc

Running time: 34 seconds

The principle of this method is relatively simple, only in the subquery to determine whether the record is one of the newly occurred in its section 2.

1SELECT*FromSectiontransactionlog Mlog2where mlog.idInch3 (select top 2 Id 4 from Sectiontransactionlog sublog5 where Sublog.sectionid = Mlog.sectionid6 Span style= "color: #0000ff;" >order by transactiondate desc ) 7 order by SectionID, transactiondate desc

Running time: 1 minutes 25 seconds

Use the TransactionDate sort in the subquery to take top 2. and apply the In keyword to determine whether the record conforms to the subquery.

Second, the self-connection mode

1Select Mlog.*FromSectiontransactionlog Mlog2InnerJoin3 (SELECT Rankleft.id,COUNT (*)As RanknumFromSectiontransactionlog Rankleft4InnerJoinSectiontransactionlog Rankright5On Rankleft.sectionid= Rankright.sectionidand rankleft.transactiondate <= Rankright.transactiondate6 group by Rankleft.id7  Having count (*) <= Span style= "color: #800000; Font-weight:bold; " >2) sublog on mlog.id =8 order by Mlog.sectionid, mlog.transactiondate desc

Running time: 56 seconds

This is a clever implementation, but it is slightly more complex than the previous method. The Sublog section, which is based on the Sectiontransactionlog table self-join, calculates the sort ranknum (by Time transactiondate) of each activity (identified by ID).

In the self-join condition Rankleft.sectionid = Rankright.sectionid and Rankleft.transactiondate <= Rankright.transactiondate filter, the query results are associated with an activity (identified by ID) with only one section and later or at the same time as the activity (including itself, of course). Active self-joins for id=1:

It can be seen from a glance that the count calculation based on this result is the Id=1 activity in section 9022 of the row ranknum.

Then have COUNT (*) <= 2 To select the row within 2, and then do a join select the required information.

Third, apply Row_number () (SQL SERVER 2005 and later)

1Select*From2(3Select*, Row_number ()over (Partition by SectionID order by transactiondate desc) as Rownum4 from Sectiontransactionlog5 ) Ranked6 where Ranked.rownum <= 27 order by ranked. SectionID, ranked. TransactionDate desc

Running time: 20 seconds

This is by far the most efficient way to achieve this. Row_number () over (partition by SectionID ORDER BY transactiondate Desc) completes the entire process of grouping, sorting, and fetching line numbers.

Efficiency considerations

Below we make an efficient statistic on the above 4 methods.

Method	Time-consuming (seconds)	Ranking
Application Row_number ()	20	1
Nested subqueries 1	34	2
Self-Join mode	56	3
Nested subqueries 2	85	4

Of the 4 methods, the nesting of subqueries 2 is the longest, and where is the efficiency loss? Did you really use the IN keyword? For its execution plan (execute plan):

, we can see that the optimizer will parse in to the left Semi Join, and its loss is very low. The majority of the query's performance is consumed by the subquery's order by (Top N Sort). Sure enough, if you delete the ORDER BY transactiondate DESC clause in the subquery (which, of course, is incorrect), it takes only 8 seconds.

Adding a valid index can improve the performance of the Query method.

SQL Server is grouped and then takes top N

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

Get Started for Free

Sales Support

1 on 1 presale consultation

Chat Contact Sales
After-Sales Support

24/7 Technical Support 6 Free Tickets per Quarter Faster Response

Open a Ticket
Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.

Learn More

SQL Server is grouped and then takes top N

Contact Us

What's Trending

Top 10 Tags

Top 10 Keywords

A Free Trial That Lets You Build Big!

Sales Support

After-Sales Support

SQL Server is grouped and then takes top N

Contact Us

What's Trending

Top 10 Tags

Top 10 Keywords

Trending Topic

A Free Trial That Lets You Build Big!

Sales Support

After-Sales Support