1. When Hive creates a table, the regular expression is the same as that in java code. \ is used in RegexSerDe to represent the anti-transcoding.
Create table inst_cn_3 (
Ip string,
Time string,
Mac string,
Lan string,
Ver string,
Lc string,
Pn string,
Reg string,
Vrf string,
Line string)
Row format serde 'org. apache. Hadoop. hive. contrib. serde2.regexserde'
With serdeproperties (
"Input. regex "="-* ([\ d \.] +) * \ [([\ d] {2}/[\ w] +/[\ d] {4 }: [\ d] {2}: [\ d] {2}: [\ d] {2} \ s + \ + [\ d] +) \] * GET */mx3/inst/([0-9a-f] {12})/ver = ([\\. \ d] +), lan = (0x [\ w] + )(? :, Lc = ([\ w] + ))(? :, Pn = ([\ w] + ))(? :, Reg = ([0-1]) (? :, Vrf = ([\ w] + ))?. *"
)
Stored as textfile;
Alter table inst_cn_3 add columns (line string );
Alter table inst_cn_3 set serdeproperties (
"Input. regex "="-([\ d \.] +) \ [([\ d] {2}/[\ w] +/[\ d] {4}: [\ d] {2 }: [\ d] {2}: [\ d] {2} \ s + \ + [\ d] +) \] GET/mx3/inst/([0-9a-f] {12})/ver = ([\\. \ d] +), lan = (0x [\ w] + )(? :, Lc = ([\ w] + ))(? :, Pn = ([\ w] + ))(? :, Reg = ([0-1]) (? :, Vrf = ([\ w] + ))?. * | (.*)"
);
Select * from inst_cn_3 limit 100;
Select
Line
From inst_cn_3
Where
1 = 1
And mac is null
And line is not null
And! (Line rlike '. * unknowuser00 .*')
;
Hadoop fs-cp/mnt/nfs/log/statcn/inst/inst_cn_3. *. txt/hive/warehouse/inst_cn_3/
2011.06.09 hive Time Processing
1. select
From_unixtime (unix_timestamp ('02/May/2011: 00: 00: 00 + 000000', 'dd/MMMMM/yyyy: HH: mm: ss Z '), 'yyyy-MM-dd HH: mm: ss '),
From_unixtime (unix_timestamp ('02/May/2011: 23: 59: 59 + 0800 ', 'dd/MMMMM/yyyy: HH: mm: ss Z '), 'yyyy-MM-dd HH: mm: ss '),
From_unixtime (unix_timestamp ('03/May/2011: 00: 00: 00 + 100', 'dd/MMMMM/yyyy: HH: mm: ss Z '), 'yyyy-MM-dd HH: mm: ss '),
From_unixtime (unix_timestamp ('03/May/2011: 23: 59: 59 + 0800 ', 'dd/MMMMM/yyyy: HH: mm: ss Z '), 'yyyy-MM-dd HH: mm: ss ')
From
Stat_c_log
Where
1 = 1
And partkey = '000000'
And logType = 'inst _ cn_3'
And url rlike '/mx3/inst /.*'
Limit
10
;
2.
Select
From_unixtime (unix_timestamp ('02/May/2011: 00: 00: 00 + 000000', 'dd/MMMMM/yyyy: HH: mm: ss Z '), 'yyyy-MM-dd HH: mm: ss '),
From_unixtime (unix_timestamp ('02/May/2011: 23: 59: 59 + 0800 ', 'dd/MMMMM/yyyy: HH: mm: ss Z '), 'yyyy-MM-dd HH: mm: ss '),
From_unixtime (unix_timestamp ('03/May/2011: 00: 00: 00 + 100', 'dd/MMMMM/yyyy: HH: mm: ss Z '), 'yyyy-MM-dd HH: mm: ss '),
From_unixtime (unix_timestamp ('03/May/2011: 23: 59: 59 + 0800 ', 'dd/MMMMM/yyyy: HH: mm: ss Z '), 'yyyy-MM-dd HH: mm: ss '),
Round (unix_timestamp ('02/May/2011: 00: 00: 00 + 100', 'dd/MMMMM/yyyy: HH: mm: ss Z ') -4*3600)/(3600*24) * (3600*24*1000 ),
Round (unix_timestamp ('02/May/2011: 23: 59: 59 + 0800 ', 'dd/MMMMM/yyyy: HH: mm: ss Z ') -4*3600)/(3600*24) * (3600*24*1000)
From
Stat_c_log
Where
1 = 1
And partkey = '000000'
And logType = 'inst _ cn_3'
And url rlike '/mx3/inst /.*'
Limit
10
;
2012.03.01
1. Hive method registration class FunctionRegistry
2012.06.14
1. set hive. cli. print. header = true; you can set the output of hive shell.
2012.06.26
1. After hive cdh4b2 uses arichive to archive a table and uses select line to query the archived partition, The FileNotFoundException exception is reported.
The https://issues.apache.org/jira/browse/MAPREDUCE-2704 is because CombineFileInputFormat constructs new Path objects by converting an existing path to a URI, and then only pulling out the "path" part of it. this drops the scheme and host, which makes CombineFileInputFormat fail if the paths are on a filesystem other than the default one.
2012.07.16
1. explain extended hive_query; view the run.
2012.07.29
1. describe formatted mock; displays columns, location, params, and so on.
2./src/ql/src/test/queries/clientpositive/