Partitioning a table by date

Source: Internet
Author: User

An Internet statistics zoning problem using SpagoBI and hive?

1 originally based on the date of the partition and then splicing SQL will be very complex, need to judge in where conditions whether cross-year, cross-month, cross-day and other issues.

2 Hive's Custom function cannot return a string condition after a where condition for data filtering, but can return a Boolean for data filtering.

For example: Custom Function Date_where (StartTime, EndTime), the return format according to the start date and end date is:

Year=2015 and month=08 and day > 1 and Day < 10 stitching into where date_where ("2015-08-01", "2015-08-10") is not supported.


Second, based on the above problem, the partition is established by the date (DT)

Step1 Creating a Database

1 Create tvlog_test database tvlog_test;


STEP2 Creating a data table

1 Creating TVLOG_TCL data tables

CREATE table if not exists TVLOG_TEST.TVLOG_TCL (

ID string,

UserID String,

Channelid String,

ChannelName string,

Region string,

Channelcode String,

IP String,

StartTime String,

Endtime String,

Fromchannel String,

Tochannel String,

Mac String,

DeviceID String,

Dnum string

)

Partitioned by (DT string)

stored as ORC;

2 Creating Epg_wiki_info data tables

CREATE table if not exists tvlog_test.epg_wiki_info (

ID string,

Name String,

StartTime String,

Endtime String,

Wikiscreenshots Array<string>

Wikicover map<string, String>

Wikititle String,

tags array<string>

Wikiid String,

Channelcode String,

ChannelName string,

Timestamp string

)

Partitioned by (DT string)

stored as ORC;


STEP3 enable hive dynamic partition insertion

Set hive.exec.dynamic.partition=true;

Set hive.exec.dynamic.partition.mode=nonstrict;

Set hive.exec.max.dynamic.partitions.pernode=1000;


STEP4 Inserting data (2015-09-01 ~ 2015-09-05)

1 inserting data into the TVLOG_TEST.TVLOG_TCL table

Insert Overwrite table TVLOG_TEST.TVLOG_TCL

Partition (DT)

Select ID, userid, Channelid, ChannelName, Region,

Channelcode, IP, StartTime, Endtime, Fromchannel,

Tochannel, Mac, DeviceID, dnum, DT

From TVLOG.TVLOG_TCL

Where year = + and month = 9 and (day between 1 and 5);


2 inserting data into the Tvlog_test.epg_wiki_info table

Insert Overwrite table Tvlog_test.epg_wiki_info

Partition (DT)

Select ID, name, StartTime, Endtime, Wikiscreenshots, Wikicover,

Wikititle, tags, wikiid, channelcode, channelname, timestamp, DT

From Tvlog.epg_wiki_info

where dt between ' 2015-09-01 ' and ' 2015-09-05 ';


Partitioning a table by date

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.