Introduction to Oracle Common analytic functions (ranking function + window function)November 30, 2014 ? database 3903 words ? No comments? Read 7,772 times Rating function
The common rating functions are as follows:
- Rank (): Returns the rank of the data item in the grouping, leaving a vacancy in the rank when ranked equal, resulting in a discontinuous ranking.
- Dense_rank (): The same return data items are ranked in the group, but do not leave fame empty when ranked equal.
- Cume_dist (): Returns the position of a specific value relative to a set of values, which is a shorthand for the cumulative distribution (cumulative distribution).
- Percent_rank (): Returns the percentage rank of a value relative to a set of values.
- NTILE (): Returns the value after the N shard, such as three shards, four shards, and so on.
- Row_number (): Returns a number for each grouped record, noting that it differs from the rownum pseudo-column.
RANK () and Dense_rank ()
Both the rank () and dense_rank () functions can be used to calculate the rank of a data item in a grouping (all data is a group when not using partition by). The difference is that rank () is ranked equal, such as when there are 3 1th place, the next ranking is 4th, No 2, 3, and Dense_rank () in 3 1th, the next ranking is the 2nd name. That is, rank () appears at the rank interval, and Dense_rank () does not appear in the rank interval.
These two functions are used more in the SELECT clause, without the use of the partition BY clause without grouping. Use for example, find out the company's payroll rankings:
Select Ename,
Rank () Over (order by Sal Desc) rank,
Dense_rank () Over (order by Sal Desc) Dense_rank
from EMP;
As you can see from the statement, the rank () function needs to have the keyword over and order by. and rank () is a single-valued function, not an aggregate function. If you need to figure out the top wage for each job in the top wage of all jobs:
Select Job,
Rank () over (order by Max (SAL) desc) rank,
Dense_rank () over (order by Max (SAL) desc) Dense_rank
From EMP
Group by Job;
In the rankings, there will be a null value in the previous question, which can be controlled after the ORDER BY clause using the keyword nulls first/last.
PARTITION BY clause
You need to use the partition BY clause when you need to rank within groups after getting grouped. It differs from group by's grouping, which does not "merge aggregation", it is equivalent to grouping values after the calculation, and then repeating each value.
The most common examples include the name (name), Class (Class), and score (score) three fields in the table tables, with the first three names, classes, and fractions in each class, and the SQL statement:
Select Name,class,score
From (select name,
Class
Score,
Rank () Over (partition by class ORDER by score desc) rank
From table)
where rank <= 3;
In the Scott user test, ask for the name, department, job and salary of the top 3 people in each department, such as:
SELECT *
From (select Ename,
Deptno
Job
Sal
Dense_rank () over (partition by deptno ORDER BY sal Desc) rank
From EMP)
where rank <= 3;
Row_number ()
Row_number returns a number for each row, which is more commonly used in groupings (RowNum is commonly used in non-groups). For example, sort each job wage in the EMP table from high to Low:
Select Ename,job,sal,row_number () over (partition by job ORDER by Sal Desc) from EMP;
window functions (cumulative sum, moving average, etc.)
Window functions can be used to calculate cumulative sum, moving average, and center mean, as follows:
Calculate cumulative and
To query the cumulative sales from January 2003 to December, the SQL statement is as follows:
SELECT month,
SUM (amount) as Month_amount,
SUM (sum (amount)) over (ORDER by month ROWS between unbounded preceding and current ROW) as Cumulative_amount
From All_sales
Where year = 2003
GROUP by month
ORDER by month;
For the cumulative partial sum (SUM (amount)) over (ORDER by month ROWS between unbounded preceding and current ROW), the resolution is as follows:
- The sum (amount) inside the sum (sum (amount)) is used to calculate the sum of monthly sales, and the external sum () is used to calculate the cumulative sales.
- Order BY month sorts the records that the query reads by month.
- Rows between unbounded preceding and current row define the start and end point of the window, starting with unbounded preceding, meaning that the starting point is the first row of a fixed query result set; Row indicates the end point is the current row that processes the result set. When the external SUM function evaluates to return the current cumulative sales, the end of the window moves down one line. Preceding is the cumulative number, if the unbounded is replaced by a number such as 1, it is to accumulate with the previous record, but also backward, using the keyword following, you can specify that the backward cumulative number only need to add a number before the keyword, The number is the number of rows that are accumulated backwards (the importance of sorting can also be seen here).
Such as:
To calculate the cumulative sales for a given month, such as June through December, you only need to add the condition in the WHERE clause to month between 6 and 12.
To calculate the cumulative sales for the 3 months before the current month, the window statement:
SUM (sum (amount)) over (ORDER by month ROWS between 3 preceding and current ROW) as Cumulative_amount
Calculate cumulative sales for the first one months and one months, window statements:
SUM (sum (amount)) over (ORDER by month ROWS between 1 preceding and 1 following) as Cumulative_amount
Calculate moving Averages
Calculates the moving average of sales between this month and the first 3 months, with the following SQL statement:
SELECT month,
SUM (amount) as Month_amount,
AVG (SUM (amount)) over (ORDER by month ROWS between 3 preceding and current ROW) as Moving_average
From All_sales
Where year = 2003
GROUP by month
ORDER by month;
For the Moving Average part avg (SUM (amount)) over (ORDER by month ROWS between 3 preceding and current ROW) resolves as follows:
- The sum (amount) inside the AVG (sum (amount)) calculates the monthly sales and the external AVG () calculates the average.
- Order BY month sorts the records that the query reads by month (this is necessary because only sorting can be done to accumulate or to average before and after).
- ROWS between 3 preceding and current row defines the start of the window as the first 3 records of the current record, and the end of the window as the active record.
Computing Center Average
Calculates the moving average of sales for the one months before and after the current month, with the following SQL statement:
SELECT month,
SUM (amount) as Month_amount,
AVG (SUM (amount)) over (ORDER by month ROWS between 1 preceding and 1 following) as Moving_average
From All_sales
Where year = 2003
GROUP by month
ORDER by month;
For the center mean part avg (SUM (amount)) over (ORDER by month ROWS between 1 preceding and 1 following) is resolved as follows:
- The sum (amount) inside the AVG (sum (amount)) calculates the monthly sales and the external AVG () calculates the average.
- Order BY month sorts the records that the query reads by month (this is necessary because only sorting can be done to accumulate or to average before and after).
- ROWS between 1 preceding and 1 following defines the beginning of the window as the record before the current record, and the end of the window is the record after the current record.
Window first and last record
The First_value () and Last_value () functions can be used to get the first and last rows of data in a window, for example, to get sales for the one-month and one-month period before the current month:
SELECT month,
SUM (amount) as Month_amount,
First_value (SUM (amount)) over (ORDER by month ROWS between 1 preceding and 1 following) as Pre_month_amount,
Last_value (SUM (amount)) over (ORDER by month ROWS between 1 preceding and 1 following) as Next_month_amount
From All_sales
Where year = 2003
GROUP by month
ORDER by month;
Among them, the window defines the starting point for the first one months after the end of one months, so first_value (sum (amount)) for the first one months sales and last_value () for the last one months sales.
Analytic functions in Oracle