How to select the first row/last row/top row of each group in SQL

Source: Internet
Author: User

Reprint Please specify: theviper http://www.cnblogs.com/TheViper

Partially translated from How to select the First/least/max row per group in SQL

Some common SQL problems have similar workarounds, such as finding the most recent logs for each program, finding the most popular items in each category, and finding the top 5 points that each player plays. These problems can be summed up in selecting top N from each group.

Fruits table

Select the lowest-priced row in each category

Step: 1. Find the required value for the price. 2. Populating other fields

Method 1: Self-connect

Group By type and select the line with the lowest price

Select min  as Minprice  from Fruits Group  by type;

Use the self-connection to merge the remaining lines with the rows above, as the above query has been grouped, and the remaining fields are connected to the table without grouping with a subquery.

Select F.type, f.variety, F.price  from (   Selectmin as minprice   fromGroup  by  asinnerjoinas on= and = X.minprice;

In fact, this method can be grouped directly with group, and I don't know what the author thinks.

SELECT MIN  as Minprice  from Fruits GROUP  by TYPE;

Method 2 Dependent subquery

This method is inefficient, but very clear.

Select type, variety, price  from Fruits where = (Selectminfrom aswhere= fruits.type);

Select top N rows for each group

Selecttype, variety, price fromFruitswherePrice=(Select min(Price) fromFruits asFwhereF.type=Fruits.type)orPrice=(Select min(Price) fromFruits asFwhereF.type=Fruits.type andPrice>(Select min(Price) fromFruits asF2whereF2.type=Fruits.type));

As you can see, first select the lowest price line, and then select the second low price line, two are connected with or.

This can also be used for self-connected writing, but more complex. As you can see, if you need to elect TOP3,TOP4,... , this approach gets worse.

There's a better way.

Select type, variety, price  from Fruits where (   Selectcount(* from as F     where=and<=<=2;

This can be understood as traversing the outside of the fruits rows, if the same classification, there are other rows <= the row and the number of such rows <=2, that line meets the requirements, take out.

You can see that this approach is elegant, because changing N to another value does not require rewriting. But this method is essentially the same as the previous method, and it uses the subquery. Some query optimizer does not perform well on subqueries.

Using Union

If there is an index on (type, price), and the index can filter many rows, then you can use limit for each category. Then the union merges them.

(Select *  fromFruitswhereType= 'Apple' Order  byPrice limit2)Union  All(Select *  fromFruitswhereType= 'Orange' Order  byPrice limit2)Union  All(Select *  fromFruitswhereType= 'Pear' Order  byPrice limit2)Union  All(Select *  fromFruitswhereType= 'Cherry' Order  byPrice limit2)

Note that this is union all, not union. Doing so prevents the result from being sorted to remove duplicate rows before returning the result. There are no duplicate rows in this scenario, so let's tell the database not to sort the weight.

About union see Using union to implement loose index scan in MySQL

Use user variable (variables) for MySQL only

The above union is a good way to do this when the number of rows is small and the index can be used for sorting. The methods described below are only valid for MySQL. Before introducing this method, please see my other article how to number rows in MySQL.

In short, the article increments the number of rows in the same category

The method described below is based on this.

Set @num:= 0,@type:= "';Selecttype, variety, price from (   Selecttype, variety, price,@num:= if(@type =Type@num + 1,1) asRow_number,@type:=Type as Dummy   fromFruitsOrder  bytype, price) asXwhereX.row_number<= 2;

The subquery creates a temporary table and fills it with Row_number,dummy, which is an operation. Then choose the row_number<=2 line, which is another operation. Although there are two operations, its complexity is still O (n), which is only related to the size of the table, which is much better than the complexity O (N2) of the correlated subquery. The N of the correlated subquery is the number of categories, and if there are many classifications, the performance will be bad.

Finish

How to select the first row/last row/top row of each group in SQL

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.