Hive External Table Partitioning

Source: Internet
Author: User

I used to focus mainly on Oracle and MySQL and thought of Hive as just another data warehouse framework with few interesting alternatives, so most of my study went into Hadoop, HBase, Sqoop, and Mahout. Recently I took a closer look at Hive, and it turned out to be more useful than I expected: it integrates cleanly with Hadoop, and for ad-hoc query analysis and regular HiveQL-generated reports it is very convenient; you can skip writing MapReduce jobs and produce the reports directly with Hive. Hive provides two kinds of tables: internal tables (managed tables) and external tables.
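To make the difference concrete, here is a minimal sketch (the demo_managed and demo_external table names and the /data/demo_external path are made up for illustration, not taken from this article): dropping a managed table removes its data files, while dropping an external table removes only the metadata.

-- Managed (internal) table: Hive owns the files under its warehouse directory.
create table demo_managed (id string, msg string)
row format delimited fields terminated by '\t'
stored as textfile;

-- External table: Hive only records metadata; the files stay where they already are.
create external table demo_external (id string, msg string)
row format delimited fields terminated by '\t'
stored as textfile
location '/data/demo_external/';

-- drop table demo_managed;   -- deletes the table definition AND its data directory
-- drop table demo_external;  -- deletes only the definition; /data/demo_external/ is left untouched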
The main difference between the two lies in where the data is stored and who manages it. If the data will only ever be used from Hive, an internal table is the simpler choice. If other jobs also need to share the same batch of data, use an external table, which is exactly my situation. I lean towards external tables anyway, because the feature I care about most is flexibility: an external table can absorb changes in how the data is used later on, so external tables are what I generally reach for.

Partitioning an external table, using a simple case.

Data sample. Each log line carries the following tab-separated fields: channel, zone, role name, unique identifier, level, country, VIP level, UID, registration time, last login time, remaining recharge gold, remaining system gold, recharge gold consumed, system gold consumed, grain consumed, wrought iron consumed, silver coins consumed, remaining credits, and date:

178  xxx_260  Triumph  178_263_1809  33  3  0  34191904  2014-12-05 15:33:23  2014-12-05 17:18:58  22  5  0  0  0  0  0  0  2015-06-07
178  xxx_262  Triumph  178_264_801  33  2  0  34191904  2014-12-06 12:13:03  2014-12-06 14:12:51  237  0  0  0  0  0  0  0  2015-06-07
178  xxx_274  "Fetish"  178_276_2131  11  1  0  34308396  2014-12-18 15:39:34  2014-12-18 21:28:38  0  100  0  0  0  0  0  0  2015-06-07
178  xxx_286  Seven Rain 1  178_288_859  2  0  34016769  2014-12-30 12:46:18  2014-12-30 19:27:21  100  0  0  0  0  0  0  2015-06-07
178  XXX_323  丿 know  178_324_1780  29  2  0  34645311  2015-02-04 13:42:59  2015-02-04 20:18:12  0  120  0  0  0  0  0  0  2015-06-07

Create the database:

create database gcld;
use gcld;

Create the partitioned external table (partitioned by day):

create external table gcldlog (
    coopid string,
    zoneid string,
    rolename string,
    uniquecode string,
    level string,
    country string,
    viplevel string,
    uid string,
    regdate string,
    lastlogintime string,
    remain_reg_coin bigint,
    remain_sys_coin bigint,
    cost_reg_coin bigint,
    cost_sys_coin bigint,
    cost_food bigint,
    cost_iron bigint,
    cost_silvercoin bigint,
    remain_coupon bigint,
    datetime string
)
partitioned by (dt string)
row format delimited
    fields terminated by '\t'
    collection items terminated by '\002'
    map keys terminated by '\003'
    lines terminated by '\n'
stored as textfile
location '/source/gcld/';

Add partitions to the external partitioned table:

alter table gcldlog add partition (dt='2015-06-07') location '2015-06-07';
alter table gcldlog add partition (dt='2015-06-08') location '2015-06-08';
alter table gcldlog add partition (dt='2015-06-09') location '2015-06-09';

To upload the data to HDFS, the HDFS directory structure should look like this:

/source/gcld/2015-06-07/xxx.log
/source/gcld/2015-06-08/xxx.log
/source/gcld/2015-06-09/xxx.log

Create the HDFS directory corresponding to each partition:

hadoop fs -mkdir -p /source/gcld/2015-06-07
hadoop fs -mkdir -p /source/gcld/2015-06-08
hadoop fs -mkdir -p /source/gcld/2015-06-09

Upload the log files into the HDFS directories:

hadoop fs -put local_log_dir/2015-06-07.log /source/gcld/2015-06-07/
hadoop fs -put local_log_dir/2015-06-08.log /source/gcld/2015-06-08/
hadoop fs -put local_log_dir/2015-06-09.log /source/gcld/2015-06-09/
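Before querying, it is worth checking that Hive actually sees the partitions that were just added; a quick sketch using standard Hive commands (output omitted):

-- Should list dt=2015-06-07, dt=2015-06-08 and dt=2015-06-09:
show partitions gcldlog;

-- Show the HDFS location and other details of a single partition:
describe formatted gcldlog partition (dt='2015-06-07');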
Data query (querying by partition):

Logging initialized using configuration in file:/etc/hive/conf.dist/hive-log4j.properties
hive> show databases;
OK
default
gcld
opensdkdb
tmp
Time taken: 1.754 seconds, Fetched: 4 row(s)
hive> use gcld;
OK
Time taken: 0.126 seconds
hive> show tables;
OK
gcldlog
Time taken: 0.114 seconds, Fetched: 1 row(s)
hive> select * from gcldlog t where t.dt='2015-06-08' limit 10;
OK
07073sy  xxxx_220  Traitor  07073sy_228_25641  PNS  3  0  140958  2015-04-30 09:08:54  2015-05-19 19:27:30  590  0  0  0  0  0  0  2015-06-08  2015-06-08
07073sy  xxxx_268  Platform promotion  07073sy_268_7017  18  1  0  134  2014-12-27 16:38:51  2014-12-29 11:48:21  0  0  0  0  0  0  0  2015-06-08  2015-06-08
07073sy  xxxx_274  Teijin  07073sy_276_147  6  2  0  6037  2014-12-18 10:59:35  2014-12-18 10:59:35  0  150  0  0  0  0  0  0  2015-06-08  2015-06-08
07073sy  xxxx_274  youth ℡  07073sy_276_7459  22  2  0  10297  2014-12-19 09:01:54  2014-12-19 11:44:40  100  0  0  0  0  0  0  2015-06-08  2015-06-08
07073sy  xxxx_274  General  07073sy_276_8065  2  2  0  11287  2014-12-19 20:01:06  2014-12-19 20:01:06  100  0  0  0  0  0  0  2015-06-08  2015-06-08
07073sy  xxxx_275  Su Xiaofan  07073sy_277_226  2  0  108  2014-12-19 11:07:14  2014-12-19 11:07:14  100  0  0  0  0  0  0  2015-06-08  2015-06-08
07073sy  xxxx_275  Ning Xiao  07073sy_277_571  6  3  0  11132  2014-12-19 11:48:46  2014-12-19 11:48:46  0  150  0  0  0  0  0  0  2015-06-08  2015-06-08
07073sy  xxxx_278  Jie  07073sy_278_7297  36  1  1  11885  2014-12-21 01:54:01  2014-12-21 11:57:22  2015-06-08  2015-06-08
07073sy  xxxx_273  NN  07073sy_279_2403  18  3  0  12094  2014-12-21 15:21:43  2014-12-22 17:50:17  100  0  0  0  0  0  0  2015-06-08  2015-06-08
07073sy  xxxx_273  Zhuge, invincible  07073sy_279_3152  9  1  0  12137  2014-12-21 17:00:53  2014-12-21 19:09:52  0  0  0  0  0  0  0  2015-06-08  2015-06-08
Time taken: 2.428 seconds, Fetched: 10 row(s)
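Because dt is a partition column, a filter on dt lets Hive read only the matching HDFS directories instead of the whole table, which is what makes this layout cheap for daily reports. A sketch of a typical report query over this table (the aggregated columns chosen here are only an example):

-- Row count and total remaining recharge gold per day, scanning just three partitions:
select dt,
       count(*)             as role_rows,
       sum(remain_reg_coin) as total_remain_reg_coin
from gcldlog
where dt between '2015-06-07' and '2015-06-09'
group by dt;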
