Hive There are a number of analytic functions available to complete the responsible statistical analysis.
This article first describes the sum , AVG , MIN , the four functions of Max .
Environmental information:
Hive version number is Apache-hive-0.14.0-bin
Hadoop version number is hadoop-2.6.0
Tez version number is tez-0.7.0
To construct the data:
p088888888888,2016-02-10,1
p088888888888,2016-02-11,3
p088888888888,2016-02-12,1
p088888888888,2016-02-13,9
p088888888888,2016-02-14,3
p088888888888,2016-02-15,12
p088888888888,2016-02-16,3
To create a table:
Hive (hiveinaction) > CREATE TABLE Windows_func
> (
> Polno String,
> Createtime String,
> Pnum int
>)
>row FORMAT Delimited
>fields TERMINATED by ', '
>stored as Textfile;
Import the data into the table:
Load data local inpath '/home/hadoop/testhivedata/windows_func.txt ' into table windows_func;
Test:
SELECT Polno,
Createtime,
Pnum,
SUM (Pnum) over (PARTITION by Polno Createtime) as PNUM1,-- from the beginning to the current line
SUM (Pnum) over (PARTITION by Polno to createtime ROWS between unbounded preceding and current ROW) as Pnum2 ,-- from the starting point to the current row
SUM (Pnum) over (PARTITION by Polno) ASpnum3,-- all rows within a group
SUM (Pnum) over (PARTITION by Polno Createtime rows between 3 preceding and current row) as pnum4, -- present line + forward 3 line ( The value of the current row + The value of the front three rows
SUM (Pnum) over (PARTITION by Polno createtime ROWS between 3 preceding and 1 following) as PNUM5,-- Current Line + forward 3 rows + 1 rows
SUM (Pnum) over (PARTITION by Polno the createtime ROWS between current ROW and unbounded following) as Pnum 6--- current row + all future rows
From Windows_func;
Results:
Polno |
createtime |
Pnum |
Pnum1 |
pnum2 |
pnum3 |
pnum4 |
pnum5 |
PNUM6 |
P088888888888 |
2016/2/10 |
1 |
1 |
1 |
32 |
1 |
4 |
32 |
P088888888888 |
2016/2/11 |
3 |
4 |
4 |
32 |
4 |
5 |
31 |
P088888888888 |
2016/2/12 |
1 |
5 |
5 |
32 |
5 |
14 |
28 |
P088888888888 |
2016/2/13 |
9 |
14 |
14 |
32 |
14 |
17 |
27 |
P088888888888 |
2016/2/14 |
3 |
17 |
17 |
32 |
16 |
28 |
18 |
P088888888888 |
2016/2/15 |
12 |
29 |
29 |
32 |
25 |
28 |
15 |
P088888888888 |
2016/2/16 |
3 |
32 |
32 |
32 |
27 |
27 |
3 |
Gaze:
1. Assuming that no rows between is specified, the default is from the starting point to the current line ;
2. Assuming that no ORDERBy is specified, all the values in the group are incremented ;
UnderstandingROWS betweenmeaning,also calledWINDOWclause:
Preceding: Forward
Following: Future
Current ROW: Current Line
Unbounded: Starting point,unbounded precedingrepresents the starting point from the front,unbounded following: Indicates to the back end
otherAVG,MIN. MAX. and theSUMuse the same method.
Demo AVG Environment:
SELECT Polno,
Createtime,
Pnum,
AVG (Pnum) over (PARTITION by Polno ORDER by Createtime) as PNUM1,-- from the beginning to the current line
AVG (Pnum) over (PARTITION by Polno ORDER by createtime ROWS between unbounded preceding and current ROW) as Pnum 2,-- from the starting point to the current row
AVG (Pnum) over (PARTITION by Polno) as PNUM3,-- all rows within a group
AVG (Pnum) over (PARTITION by Polno ORDER by Createtime ROWS between 3 preceding and current ROW) as PNUM4,-- Current Line + forward 3 rows ( value of current row + value of previous three rows ) )
AVG (Pnum) over (PARTITION Bypolno ORDER by Createtime ROWS between 3 preceding and 1 following) as PNUM5,-- Current Line + forward 3 rows + 1 rows
AVG (Pnum) over (PARTITION Bypolno ORDER by Createtime ROWS between current ROW and unbounded following) as Pnum 6--- current row + all future rows
From Windows_func;
Results:
Polno |
Createtime |
Pnum |
Pnum1 |
Pnum2 |
Pnum3 |
Pnum4 |
Pnum5 |
Pnum6 |
P088888888888 |
2016/2/10 |
1 |
1 |
1 |
4.57142857 |
1 |
2 |
4.5714286 |
P088888888888 |
2016/2/11 |
3 |
2 |
2 |
4.57142857 |
2 |
1.666667 |
5.1666667 |
P088888888888 |
2016/2/12 |
1 |
1.66667 |
1.6667 |
4.57142857 |
1.666667 |
3.5 |
5.6 |
P088888888888 |
2016/2/13 |
9 |
3.5 |
3.5 |
4.57142857 |
3.5 |
3.4 |
6.75 |
P088888888888 |
2016/2/14 |
3 |
3.4 |
3.4 |
4.57142857 |
4 |
5.6 |
6 |
P088888888888 |
2016/2/15 |
12 |
4.83333 |
4.8333 |
4.57142857 |
6.25 |
5.6 |
7.5 |
P088888888888 |
2016/2/16 |
3 |
4.57143 |
4.5714 |
4.57142857 |
6.75 |
6.75 |
3 |
Other similar functions are not examples.
Sum,avg,min and Max for hive analytic form functions