My colleague asked a very simple question: how can I retrieve the minimum value of another column in each partition?
create table t1 (int c1, int c2);
If the partition is based on C2, 0-10, 10-20, 20-30, 30-40, 40-50
insert into t1 values(101, 1);insert into t1 values(102, 2);insert into t1 values(111, 11);insert into t1 values(112, 12);insert into t1 values(121, 21);insert into t1 values(122, 22);insert into t1 values(131, 31);insert into t1 values(132, 32);insert into t1 values(133, 32);insert into t1 values(132, 33);insert into t1 values(141, 41);insert into t1 values(142, 42);insert into t1 values(142, 43);
This was a very simple problem, but I couldn't think of it today. I went around and looked at rank OLAP functions. After learning for a long time, I found that I couldn't solve it.
In fact, you only need to take an integer according to the partition table range.
select cast(c2/10 as integer), min(c1),max(c1) from t1group by cast(c2/10 as integer);
输出如下:
1 2 3 - --- --- 0101102 1111112 2121122 3131133 4141142
However, I have reviewed rank functions and summarized them briefly:
Rank () ranking function. The return value is discontinuous. If there are two identical first places, the third value is 3.
Dense_rank
() Olympic champion ranking function. The return value is continuous. You can join the first place, then the second place, and the third place.
Row_number () continuous value, which can be considered as rownum in Oracle
Over () can be used to add a partition column or column expression. In this way, if the value of max (C1) is output in a partition, all values are the same.
SELECT c1,c2,CAST(C2/10 AS INTEGER),MAX(C1) OVER( PARTITION BY CAST(C2/10 AS INTEGER)) FROM T1;
The output is as follows:
C1 C2 3 4 --- -- - --- 101 10102 102 20102 111111112 112121112 121212122 122222122 131313133 132333133 133323133 132323133 141414142 142434142 142424142
In the over clause, you can add order by to specify the Column Used for sorting.
Another very useful function is that window functions can specify rows or range in over () to specify a window with the current behavior range. In this window, Aggregate functions are calculated.
For example, I want to see the price of a product on the current date and the average price of the previous 30 days:
SELECT c1 as price,c2 as days,avg(C1) OVER(order by c2 range 30 PRECEDING ) as avg_priceFROM T1;
The result is as follows:
PRICE DAYS AVG_PRICE ----- ---- --------- 101 1 101 102 2 101 111 11 104 112 12 106 121 21 109 122 22 111 131 31 114 132 32 120 133 32 120 132 33 124 141 41 126 142 42 129 142 43 132
In my opinion, the most useful function of window functions is to check the current stock price and the average price of the previous period of time.
This is a brief introduction to the rank series functions of OLAP. With this knowledge, you can quickly write more useful SQL statements. I think the rank series functions are very powerful when processing data in each row and holding group by for a certain column. However, please note that any function in this section will cause a large amount of data to be scanned, so the performance of such SQL statements will not be very good. Please be careful when using online programs.
Rank functions from simple requirements to OLAP