Input data: user, date, and consumption amount for that day.
001 2013-1-10 100
002 200
001 50
002 80
001 300
Desired result:
User, date, consumption amount for that day, cumulative consumption amount
001 2013-1-10 100 100
002 200 200
001 50 150
002 80 280
001 300 450

Implementation method 1: MySQL supports a correlated subquery in the select list:

Select field1, field2, field3,
(Select sum(field3)
From test_aa
Where field1 = A.field1
And field2 <= A.field2
) As field4
From test_aa A

Hive, however, does not support this.

MySQL implementation method 2 (provided by QQ user "watermelon Little Prince", 365742944):

Select A.f1, A.f2, A.f3, sum(B.f3)
From (Select f1, f2, f3 From test_a) A
Join (Select f1, f2, f3 From test_a) B
On (A.f1 = B.f1 And A.f2 >= B.f2)
Group by A.f1, A.f2, A.f3

Hive does not support non-equality conditions in the join's On clause, so this cannot be implemented in Hive either. Another method is as follows:

Select A.ID, A.Date, A.Num As nu, sum(if(A.Date >= B.Date, B.Num, 0))
From
(Select ID, Date, Num From cost) A
Join
(Select ID, Date, Num From cost) B
On (A.ID = B.ID)
Group by A.ID, A.Date, A.Num
Little Prince watermelon 18:18:59
I tried it in Hive. No problem.
Little Prince watermelon 18:20:33
However, count(*) × count(*) rows of intermediate data are created in memory.
After testing, the query runs correctly, but because the intermediate data is on the order of a full self-join, Hive executes it very slowly.
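
The equality-join plus conditional-sum approach can be checked on the sample data with a small Python sketch. It is only an illustration, not the Hive job itself: it uses Python's built-in sqlite3 instead of Hive, and CASE WHEN stands in for Hive's if(). Only the first date (2013-1-10) appears in the data above; the other dates are hypothetical, chosen to keep the same row order, and all dates are zero-padded so they compare correctly as strings.

```python
import sqlite3

# Sample rows from the post. Only the first date is given in the source;
# the remaining dates are hypothetical and keep the original row order.
rows = [
    ("001", "2013-01-10", 100),
    ("002", "2013-01-11", 200),
    ("001", "2013-01-12", 50),
    ("002", "2013-01-13", 80),
    ("001", "2013-01-14", 300),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cost (id TEXT, date TEXT, num INTEGER)")
conn.executemany("INSERT INTO cost VALUES (?, ?, ?)", rows)

# Equality-only join on id; the date comparison moves into a conditional sum.
# CASE WHEN is used in place of Hive's if().
running_total_sql = """
SELECT a.id, a.date, a.num,
       SUM(CASE WHEN a.date >= b.date THEN b.num ELSE 0 END) AS total
FROM cost a
JOIN cost b ON a.id = b.id
GROUP BY a.id, a.date, a.num
ORDER BY a.date
"""
result = conn.execute(running_total_sql).fetchall()

# Rows the join produces before grouping: for each user with n rows, n * n.
joined_rows = conn.execute(
    "SELECT COUNT(*) FROM cost a JOIN cost b ON a.id = b.id"
).fetchone()[0]

for row in result:
    print(row)
print("joined rows:", joined_rows)
```

With these five input rows (three for user 001, two for user 002) the join builds 3 × 3 + 2 × 2 = 13 intermediate rows before grouping. That quadratic growth per user is why the query becomes slow on large tables.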