Hive
For Hive, I use collect_set() + concat_ws(), documented at https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF.
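For reference, a minimal sketch of the whole aggregation, assuming a table test(uid INT, tag INT) like the one used in the transcripts below; note that collect_set() always de-duplicates, while collect_list() (available since Hive 0.13) keeps duplicates:

```sql
-- One comma separated tag_list per uid; collect_set() drops duplicate tags.
SELECT uid, concat_ws(',', collect_set(CAST(tag AS STRING))) AS tag_list
FROM test
GROUP BY uid;

-- On Hive 0.13+, collect_list() keeps duplicates (assumes a new enough Hive).
SELECT uid, concat_ws(',', collect_list(CAST(tag AS STRING))) AS tag_list
FROM test
GROUP BY uid;
```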
But if you want to keep duplicated elements, writing your own UDF seems to be the only choice for now, since collect_set() always de-duplicates.

hive> SELECT uid, concat_ws(',', collect_set(tag)) FROM test GROUP BY uid;
FAILED: SemanticException [Error 10016]: Line 1:27 Argument type mismatch 'tag': Argument 2 of function CONCAT_WS must be "string or array<string>", but "array<int>" was found.

hive> SELECT uid, concat_ws(',', collect_set(CAST(tag AS STRING))) FROM test GROUP BY uid;
...
Job 0: Map: 3  Reduce: 1  Cumulative CPU: 8.43 sec  HDFS Read: 890  HDFS Write: 18  SUCCESS
Total MapReduce CPU Time Spent: 8 seconds 430 msec
OK
1    2,1,3
2    1,4
3    5

Impala
Impala also has a group_concat(), but it differs from MySQL's. From the Impala documentation:
group_concat(string s [, string sep])
Purpose: Returns a single string representing the argument value concatenated together for each row of the result set. If the optional separator string is specified, the separator is added between each pair of concatenated values.
Return type: STRING
Usage notes: concat() and concat_ws() are appropriate for concatenating the values of multiple columns within the same row, while group_concat() joins together values from different rows.
By default, it returns a single string covering the whole result set. To include other columns or values in the result set, or to produce multiple concatenated strings for subsets of rows, include a GROUP BY clause in the query.
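The within-row versus across-rows distinction is easy to see side by side; a minimal sketch, with the table and column names (people, first_name, last_name) made up for illustration:

```sql
-- concat_ws() joins values from multiple columns of the SAME row:
SELECT concat_ws(' ', first_name, last_name) AS full_name FROM people;

-- group_concat() joins values from DIFFERENT rows, one string per group
-- (or one string for the whole result set without a GROUP BY):
SELECT group_concat(first_name, ',') AS all_names FROM people;
```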
group_concat(string s [, string sep]) is used in conjunction with GROUP BY, as group_concat(field, separator), as in the following example:
[hadoop4.xxx.com:21000] > select uid, group_concat(cast(tag as string), ',') as tag_list from test3 group by uid;
Query: select uid, group_concat(cast(tag as string), ',') as tag_list from test3 group by uid
+-----+----------+
| uid | tag_list |
+-----+----------+
| 3   | 5        |
| 2   | 1,4      |
| 1   | 1,2,3    |
+-----+----------+
Returned 3 row(s) in 0.68s

Rows to Columns

from:

+------+------+------+
| uid  | tag  | val  |
+------+------+------+
| 1    | 1    | 1    |
| 1    | 2    | 0    |
| 1    | 3    | 1    |
| 2    | 1    | 1    |
| 2    | 4    | 0    |
| 3    | 5    | 1    |
+------+------+------+

to:

+------+----------+----------+----------+----------+----------+
| uid  | tag1_val | tag2_val | tag3_val | tag4_val | tag5_val |
+------+----------+----------+----------+----------+----------+
| 1    | 1        | 0        | 1        | 0        | 0        |
| 2    | 1        | 0        | 0        | 0        | 0        |
| 3    | 0        | 0        | 0        | 0        | 1        |
+------+----------+----------+----------+----------+----------+

One max(case when ... then ... else ... end) expression per tag does the pivot:

[hadoop4.xxx.com:21000] > select
                        >     uid,
                        >     max(case when tag = 1 then val else 0 end) as tag1_val,
                        >     max(case when tag = 2 then val else 0 end) as tag2_val,
                        >     max(case when tag = 3 then val else 0 end) as tag3_val,
                        >     max(case when tag = 4 then val else 0 end) as tag4_val,
                        >     max(case when tag = 5 then val else 0 end) as tag5_val
                        > from test2
                        > group by uid;
Query: select uid, max(case when tag = 1 then val else 0 end) as tag1_val, max(case when tag = 2 then val else 0 end) as tag2_val, max(case when tag = 3 then val else 0 end) as tag3_val, max(case when tag = 4 then val else 0 end) as tag4_val, max(case when tag = 5 then val else 0 end) as tag5_val from test2 group by uid
+-----+----------+----------+----------+----------+----------+
| uid | tag1_val | tag2_val | tag3_val | tag4_val | tag5_val |
+-----+----------+----------+----------+----------+----------+
| 3   | 0        | 0        | 0        | 0        | 1        |
| 2   | 1        | 0        | 0        | 0        | 0        |
| 1   | 1        | 0        | 1        | 0        | 0        |
+-----+----------+----------+----------+----------+----------+
Returned 3 row(s) in 0.99s

Columns to Rows

Comma separated string to rows

from:

+-----+----------+
| uid | tag_list |
+-----+----------+
| 1   | 1,2,3    |
| 2   | 1,4      |
| 3   | 5        |
+-----+----------+

to:

+-----+-----+
| uid | tag |
+-----+-----+
| 1   | 1   |
| 1   | 2   |
| 1   | 3   |
| 2   | 1   |
| 2   | 4   |
| 3   | 5   |
+-----+-----+
UNION [ALL] SELECT seems to be a solution.

Mysql
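One UDF-free workaround in MySQL joins the comma separated string against a small numbers table and picks each element out with SUBSTRING_INDEX(); a sketch, reusing the test4(uid, tag_list) layout from the Hive example and assuming a helper table nums(n) holding 1, 2, 3, ... up to the maximum list length:

```sql
-- nums.n enumerates element positions; the join condition counts the
-- separators in tag_list to decide how many elements each row has.
SELECT t.uid,
       SUBSTRING_INDEX(SUBSTRING_INDEX(t.tag_list, ',', nums.n), ',', -1) AS tag
FROM test4 t
JOIN nums
  ON nums.n <= 1 + LENGTH(t.tag_list) - LENGTH(REPLACE(t.tag_list, ',', ''))
ORDER BY t.uid, nums.n;
```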
And ... a stored procedure or a UDF?

Hive
Lateral View is awesome!
I tried explode(), which splits an array into rows, together with split(), which splits a string into an array.

hive> SELECT uid, tag FROM test4 LATERAL VIEW explode(split(tag_list, ',')) tag_table AS tag;
...
Job 0: Map: 1  Cumulative CPU: 1.69 sec  HDFS Read: 293  HDFS Write: 24  SUCCESS
Total MapReduce CPU Time Spent: 1 seconds 690 msec
OK
1    1
1    2
1    3
2    1
2    4
3    5
Time taken: 12.894 seconds
hive>

Presto
Not figured out.

Impala
Not figured out.

Columns to Rows

from:

+------+----------+----------+----------+----------+----------+
| uid  | tag1_val | tag2_val | tag3_val | tag4_val | tag5_val |
+------+----------+----------+----------+----------+----------+
| 1    | 1        | 0        | 1        | 0        | 0        |
| 2    | 1        | 0        | 0        | 0        | 0        |
| 3    | 0        | 0        | 0        | 0        | 1        |
+------+----------+----------+----------+----------+----------+

to:

+------+------+------+
| uid  | tag  | val  |
+------+------+------+
| 1    | 1    | 1    |
| 1    | 2    | 0    |
| 1    | 3    | 1    |
| 2    | 1    | 1    |
| 2    | 4    | 0    |
| 3    | 5    | 1    |
+------+------+------+
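The UNION [ALL] SELECT idea mentioned earlier, sketched for this wide table (assuming it is named test2_wide — a made-up name; the same pattern works in MySQL, Hive, and Impala):

```sql
-- One SELECT per tagN_val column, stacked with UNION ALL to unpivot
-- the wide table back into (uid, tag, val) rows.
SELECT uid, 1 AS tag, tag1_val AS val FROM test2_wide
UNION ALL
SELECT uid, 2 AS tag, tag2_val AS val FROM test2_wide
UNION ALL
SELECT uid, 3 AS tag, tag3_val AS val FROM test2_wide
UNION ALL
SELECT uid, 4 AS tag, tag4_val AS val FROM test2_wide
UNION ALL
SELECT uid, 5 AS tag, tag5_val AS val FROM test2_wide;
```

Plain UNION would also drop duplicate (uid, tag, val) rows; UNION ALL keeps every row and is cheaper.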