Iv. Ranking
Sorting and ranking is our most commonly used statistical method, such as the class of students according to the ranking of members, or according to the score of the students divided into several echelon: for example, the best results of the 10 students belong to the first echelon, after 10 divided into the second echelon, and so on. Key words designed by ranking include:row_number (), RANK (), Dense_rank (), NTILE (). let's introduce how they are used and how they differ from each other.
1. 1. Row_number ()
Seeing Row_number (), I think most people will think of Oracle's rownum. They are similar in that they represent the index at which a record is located. Row_number () is more powerful than Oracle's rownum that it can specify a sort column by the over statement, for example: Row_number () Over (ORDER by CustomerID).
Let's take a look at an example: Sort Sales.SalesOrderHeader by CustomerID and display the row number for each record.
SELECT Salesorderid,customerid,row_number () over (ORDER by CustomerID) as RowNum
From Sales.SalesOrderHeader
Here are the results of the query:
We found that the final results were sorted according to Cutomerid, with rownum starting from 1 and each record (whether or not having the same CustomerID) had a different rownum.
Referring to the sort, we have to mention order by, what happens if we add an order by and specify a different sort field?
SELECT Salesorderid,customerid,row_number () over (ORDER by CustomerID) as RowNum
From Sales.SalesOrderHeader
ORDER by SalesOrderID
The results obtained by the query are:
As you can see, the final result is sorted by the SalesOrderID specified in the order by, but the values row_number () are sorted based on Custmerid.
Because Row_number () is based on the position of a DataRow after a certain field is sorted, it cannot be used directly into the aggregate column. For example, the following SQL is not legal:
SELECT Customerid,count (*) as Ordercount,row_number () over (ORDER by OrderCount)
From Sales.SalesOrderHeader
GROUP by CustomerID
If you want to follow OrderCount, you can use the CTE described in the first section:
With Cte_order (Customerid,ordercount)
As
(
SELECT Customerid,count (*) as OrderCount
From Sales.SalesOrderHeader
GROUP by CustomerID
)
SELECT Customerid,ordercount,row_number () over (ORDER by OrderCount)
From Cte_order
2. RANK ()
The use of RANK () is similar to Row_number (). However, it differs from row_number () in that it is the same as the return value for the field that is specified as ordered, with the same worth of row corresponding. Like what:
SELECT Salesorderid,customerid,rank () over (ORDER by CustomerID) as RowNum
From Sales.SalesOrderHeader
The following are the corresponding query results:
For rank (), it is also important to note that its return value is not continuous, such as the fifth record of the row_num is 5 instead of 2. If you want to implement this requirement, you need to use the following Function:dense_rank ().
3. 3. Dense_rank ()
Dense_rank () implements a continuous ranking. For example, the following SQL:
SELECT Salesorderid,customerid,dense_rank () over (ORDER by CustomerID) as RowNum
From Sales.SalesOrderHeader
To produce the following query results:
4. NTILE ()
Above we talk about dividing echelon, such a problem can be achieved by ntile () function. For example, we now sort by CustomerID, the CustomerID for 1 and 2 into the 3 echelon:
SELECT Salesorderid,customerid,ntile (3) over (ORDER by CustomerID) as RowNum
From Sales.SalesOrderHeader
WHERE CustomerID <3
Its query results are:
We can see that altogether 12 records, divided into 3 groups, averaged down each group of 4 records.
5. 5.PARTITION by
All the ranking mentioned above are based on the whole result base. In some cases, we need to group the result set according to a column to make the ranking based on groups. This will require partition by. PARTITION by is placed in the over clause, and order by peer.
For example, the following SQL groups order records according to CustomerID, outputting rankings in each group (an OrderDate sort):
SELECT Salesorderid,customerid,rank () over (PARTITION by CustomerID ORDER by OrderDate) as RowNum
From Sales.SalesOrderHeader
The corresponding query results:
SQL Server Syntax