"Double count(distinct)" here refers to a statement like the following:
select day, count(distinct session_id), count(distinct user_id) from log a group by day;
To execute such a statement, you must first set this parameter: set hive.groupby.skewindata=true;
We can solve the problem by trading space for time:
select day,
       count(case when type = 'session' then 1 else null end) as session_cnt,
       count(case when type = 'user' then 1 else null end) as user_cnt
from (
    -- deduplicate: one row per (day, id, type)
    select day, session_id, type from (
        select day, session_id, 'session' as type from log
        union all
        -- alias user_id so both branches of the union expose the same column names
        select day, user_id as session_id, 'user' as type from log
    ) t0
    group by day, session_id, type
) t1
group by day;
The type field here is entirely user-defined. Its purpose is to spend extra space so that the "look up the value", "deduplicate", and "add 1" operations are spread across different MapReduce tasks, which speeds up the query.
Note that the number of values taken by type must match the number of count(distinct) expressions in the original statement: here there are two, one for session_id and one for user_id.
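
The equivalence of the two statements can be checked on a small table. The following is only a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in for Hive. SQLite does not run MapReduce, so this checks only that the results match, not the speed-up. The sample rows are made up for illustration, and the SQL is adapted slightly (user_id is aliased to session_id so that both branches of the union share one column name).

```python
import sqlite3

# Hypothetical sample data: (day, session_id, user_id).
rows = [
    ("2024-01-01", "s1", "u1"),
    ("2024-01-01", "s1", "u1"),  # exact duplicate row
    ("2024-01-01", "s2", "u1"),
    ("2024-01-01", "s3", "u2"),
    ("2024-01-02", "s4", "u1"),
    ("2024-01-02", "s5", "u3"),
    ("2024-01-02", "s5", "u3"),  # exact duplicate row
]

conn = sqlite3.connect(":memory:")
conn.execute("create table log (day text, session_id text, user_id text)")
conn.executemany("insert into log values (?, ?, ?)", rows)

# The original statement with two count(distinct) expressions.
direct_sql = """
select day, count(distinct session_id), count(distinct user_id)
from log
group by day
order by day
"""

# The rewrite: tag each id with a type, deduplicate, then count per type.
rewritten_sql = """
select day,
       count(case when type = 'session' then 1 else null end) as session_cnt,
       count(case when type = 'user' then 1 else null end) as user_cnt
from (
    select day, session_id, type from (
        select day, session_id, 'session' as type from log
        union all
        select day, user_id as session_id, 'user' as type from log
    ) t0
    group by day, session_id, type
) t1
group by day
order by day
"""

direct = conn.execute(direct_sql).fetchall()
rewritten = conn.execute(rewritten_sql).fetchall()

print(direct)
print(rewritten)
print("same result:", direct == rewritten)
```

On this sample both queries return 3 sessions and 2 users for the first day, and 2 sessions and 2 users for the second.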