Hive 7, Hive's inner table, appearance, partition (22)

Source: Internet
Author: User

1. Hive's inner table

The inner table of Hive is the normally created table, which is already mentioned in http://www.cnblogs.com/raphael5200/p/5208437.html;

2, the appearance of hive

To create a hive's appearance, you need to use the keyword External:

CREATE EXTERNAL TABLE [IF not EXISTS] [db_name.] TABLE_NAME      [(col_name data_type [COMMENT col_comment], ...)]  [COMMENT Table_comment]  [Partitioned by (Col_name data_type [COMMENT col_comment], ...)]  [CLUSTERED by (Col_name, col_name, ...) [SORTED by (Col_name [asc| DESC], ...)] into num_buckets buckets]  [skewed by (Col_name, Col_name, ...)                      On (Col_value, Col_value, ...), (Col_value, Col_value, ...), ...)     [STORED as directories]  [   

Let's look at an example:

--Load data
Load data local inpath '/opt/food.txt ' overwrite into table food_ex;
SELECT * from FOOD_EX;

These two, the left side is the appearance, the right side is the inner table from the general point of view seems to be no different, but his main difference is the deletion operation:

  Delete table or partition metadata and data in the table is deleted.

  Appearance Delete Table metadata Delete, data retention

The following two statements are executed separately:

drop table food; drop table food_ex;

After executing these two statements, two tables were deleted, but the results were different, accessing the Namenode 50070 port:

As can be seen, although the table Delete statements are executed, the internal table is deleted after the metadata and data are deleted, and the appearance has only deleted the metadata (table information) but the real data has been preserved;

3. Hive Partition partition

Partition must be created when the table is defined

A, single partition to build the table statement:

CREATE TABLE day_table (id int, content string)
Partitioned by (DT string);

Single partition table, partitioned by day, Id,content,dt three columns exist in the table structure. Differentiate with DT for folders

Cases:

# Below is the table structure information for log_info, the partition has been created
Hive> desc log_info;okip string times String # Partition Information # col_name data_ Type comment times string time taken:0.077 seconds, Fetched:7 row (s)

b, two-Partition build Table statement:


Partitioned by (DT string, hour string);

The two-partition table, partitioned by day and hour, adds new DT and hour two columns in the table structure. Start with the DT folder, then the hour sub-folder to differentiate

# Below is the table structure information for Log_info2, the partition has been created
Hive> desc log_info2;okip                      String                                      days                    string                                      hours                   String                                                # Partition Information          # Col_name                data_type               comment                       days                    string                                      hours                   string time                                      taken:0.08 seconds, Fetched:9 Row (s)

C, hive Add partition table syntax (table created, add partition on this basis):

ALTER TABLE table_name Addpartition_spec [location ' location1 ']partition_spec [location ' Location2 '] ... ALTER TABLE day_table addpartition (dt= ' 2008-08-08 ', hour= ') location '/path/pv1.txt '

D, hive Delete partition syntax:

ALTER TABLE table_name DROP PARTITION partition_spec, Partition_spec,...

The user can delete the partition with ALTER TABLE drop PARTITION. The metadata and data for the partition will be deleted together. Cases:

ALTER TABLE day_hour_table DROP PARTITION (dt= ' 2008-08-08 ', hour= ' 09 ');
ALTER TABLE Log_info drop partition (times= ' 20160222 ');

E, the hive data is loaded into the partition table syntax:

Cases:

Single partition Data loading

Load data local inpath '/opt/log ' overwrite into table  log_info partition (times= ' 20160223 ');
Load data local inpath '/opt/log2 ' overwrite into table log_info partition (times= ' 20160222 ');

Hive> select * from Log_info;
Ok
23.45.66.77 20160222
45.66.11.8 20160222
2.3.4.5 20160223
4.56.77.31 20160223
34.55.6.77 20160223
34.66.11.6 20160223
Time taken:0.125 seconds, Fetched:6 row (s)

A new two partition directory is created in hive based on the name of the partition

Dual partition Data loading

Load data local inpath '/opt/log3 ' overwrite into table Log_info2 partition (days= ' All ', hours= ' 12 ');
Hive> SELECT * from log_info2;ok12.3.33.66    1223.44.56.6    1212.22.33.4    128.78.99.4    12233.23.211.2    12Time taken:0.069 seconds, Fetched:5 row (s)

When the data is loaded into the table, no transformations are made to the data. The load operation simply copies the data to the location corresponding to the Hive table. When data is loaded, a directory-based partitioned query is automatically created under the table:

SELECT day_table.* from day_table WHERE day_table.dt>= ' 2008-08-08 ';

F, Hive view Partition statement:





dt=2008-08-09/hour=09
Hive> Show Partitions log_info;oktimes=20160222times=20160223time taken:0.06 seconds, Fetched:2 row (s)

Hive 7, Hive's inner table, appearance, partition (22)

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.